Pillar guide · Speech-Language Pathology · 18 min read

Narrative Assessment for School-Age SLPs: A 2026 Clinical Reference

Narrative assessment is the school-age SLP’s most useful single piece of language sample data, because it sits at the exact intersection of expressive language, syntax, vocabulary, and the literacy outcomes the IEP team is held accountable for. This pillar walks through the six problems a defensible narrative assessment has to solve — elicitation paradigm, story grammar scoring, NSS scoring, the literacy connection, IEP goal drafting, and the school timeline — with the two free deterministic calculators on this site as the scoring backbone.

1. Why narrative assessment is the highest-value LSA activity for school-age SLPs

Narrative assessment occupies a unique position in the school-age SLP’s assessment toolkit because it is the single language activity that simultaneously taxes every linguistic system the IEP team cares about: receptive comprehension of the prompt, expressive vocabulary for character and event description, morphosyntactic grammar for clause construction, discourse-level macrostructure for episode organisation, theory of mind for character motivation, and the working memory needed to hold a multi-clause story across thirty seconds of speech. A child who can produce a complete, well-organised, grammatically intact narrative on a familiar topic is demonstrating linguistic competence at a level that no isolated standardised subtest can measure. A child who cannot is signalling a clinical concern that connects directly to the literacy and academic outcomes the IEP team is held accountable for in grades K through eight.

The school-age SLP’s diagnostic interest in narrative is therefore both broader and sharper than the early-elementary SLP’s interest in conversational LSA. Conversational sampling is the gold standard for kindergarten and pre-K, where the goal is to capture the child’s spontaneous grammatical system in a low-demand interaction with a familiar adult. Once the child enters the academic grades, the assessment question shifts from "can this child produce age-appropriate grammar" to "can this child organise a multi-event story the way a peer can," because the second question is the one that maps onto the third-grade reading-comprehension trajectory and the IEP-team accountability framework. Narrative assessment is the LSA activity that answers the second question, and that is why every published school-age LSA protocol from the 1980s onward has narrative as a load-bearing component.

The 2026 honest framing for the rest of this pillar is that narrative assessment is the highest-leverage single piece of LSA data a school SLP can collect on a grades K-8 caseload, because the same fifty-utterance sample yields conversational metrics (MLU, NDW, PGU), narrative microstructure metrics (story grammar units, sentence-level clause counts), and narrative macrostructure metrics (the seven NSS components), and the resulting data set drives the eligibility decision, the present-levels paragraph, the IEP goals, and the literacy referral all from a single elicitation. The two free deterministic calculators on this site — the Narrative Scoring Scheme Calculator and the Story Grammar Scorer — are the scoring backbone that makes this multi-output workflow affordable on a working school caseload, and they are the tools every section of this pillar will refer back to.

The 2026 honest line

Narrative assessment is the highest-value LSA activity for grades K-8 because a single fifty-utterance sample yields conversational metrics, microstructure metrics, macrostructure metrics, and a literacy-referral signal in a single elicitation. The two free deterministic calculators on this site are the scoring backbone that makes the multi-output workflow affordable on a working school caseload.

2. Problem 1: Which elicitation paradigm — story retell, story generation, or personal narrative

The first methodological decision in any narrative assessment is which elicitation paradigm to use, and the answer matters more than school SLPs typically realise because the three paradigms tax different cognitive systems and produce different metric distributions. Story retell, in which the clinician tells or reads a story to the child and the child immediately re-tells it, taxes auditory comprehension, working memory, and expressive reproduction; it is the paradigm with the smallest range of clinical variation because the macrostructure is supplied by the model story and the child only has to reproduce it. Story generation, in which the child is given a wordless picture book or a single picture stimulus and asked to tell a story about it, taxes expressive language and macrostructure organisation simultaneously and is the paradigm with the largest range of clinical variation. Personal narrative, in which the child is asked to tell a real story about something that happened to them, taxes autobiographical memory and topic management on top of the linguistic systems and is the paradigm closest to the academic discourse demands of the upper-elementary grades.

The 2026 best-practice consensus among school-age narrative researchers is that a defensible narrative assessment uses TWO of the three paradigms, not one in isolation, because each paradigm leaves its own diagnostic blind spot when used alone. Story retell on its own under-counts macrostructure problems because the child borrows the model story’s organisation; story generation on its own confounds linguistic difficulty with picture-book familiarity; personal narrative on its own confounds linguistic ability with the child’s willingness to share. The combination most school-age narrative researchers recommend is one wordless-picture-book story generation (Frog, Where Are You? is the canonical stimulus, but contemporary alternatives work) plus one personal narrative on a familiar topic, with story retell available as a fallback when the child is too anxious to generate from a picture book. The two-paradigm protocol takes 15 to 20 minutes of elicitation time and produces the data the rest of this pillar is built around.

The final elicitation decision is the prompt and the wait-time protocol. The bilingual research literature has shown that elicitation depth is bounded by the eliciter’s framing and patience, and the same finding applies to monolingual narrative assessment: a clinician who interrupts the child or supplies the next event word produces a narrative that under-represents the child’s independent macrostructure. The right protocol is to use a neutral prompt ("Tell me everything that happened in this story" or "Tell me everything you remember about that day"), wait at least five seconds before re-prompting, and re-prompt only with the same neutral phrasing. The clinician’s job is to be a transparent recording medium, not a narrative co-author, and a clinician who finds themselves filling in event words is producing a sample that the calculators on this site cannot score reliably because the data is contaminated with adult contributions.

  • Use TWO elicitation paradigms, not one in isolation — each paradigm has its own diagnostic blind spot.
  • The 2026 standard pairing: one wordless-picture-book story generation + one personal narrative.
  • Story retell is the fallback when the child is too anxious for independent generation.
  • Use a neutral prompt and wait at least five seconds before re-prompting.
  • Never supply the next event word — the clinician’s job is transparent recording, not narrative co-authorship.
  • Total elicitation time: 15 to 20 minutes for the two-paradigm protocol.

3. Problem 2: Story grammar scoring without losing inter-rater reliability

Story grammar is the oldest published macrostructure scoring framework in narrative assessment, dating to Stein and Glenn’s 1979 episode-structure analysis and refined by every subsequent generation of school-age narrative researchers. The framework decomposes a story into a fixed set of episode components — setting, initiating event, internal response, internal plan, attempt, direct consequence, and reaction — and scores the child’s narrative on the count of complete episodes (those containing all the load-bearing components) and partial episodes (those missing one or more). The clinical appeal is that the framework is theoretically grounded, the categories are well-defined, and the count maps onto a developmental trajectory that the literature has documented across grades K through eight.

The clinical problem with story grammar scoring is the one every published reliability study has identified: inter-rater reliability is mediocre when clinicians score by hand without a deterministic decision rule for the borderline cases, because the question of whether a particular utterance constitutes an "internal response" or a "reaction" is genuinely ambiguous in a meaningful fraction of school-age transcripts. The published reliability values for hand-scored story grammar hover in the 0.65 to 0.80 range, which is tolerable for research but not great for an eligibility decision that has to survive due-process review. The corrective the school-age narrative literature has converged on is to use a deterministic computational scorer with explicit decision rules for the borderline cases, and to document the rule application in the report so the reader can audit the score.

The Story Grammar Scorer on this site is the deterministic implementation that solves this problem. The scorer takes a transcript in the standard utterance-per-line format the LSA calculators use, applies the published Stein and Glenn category definitions with explicit borderline rules, and returns the count of complete episodes, the count of partial episodes, and the per-component breakdown for each episode. The same transcript produces the same score every time, which is the property that makes the scorer defensible in due-process review and which is the property the hand-scoring approach cannot guarantee. The clinician still owns the judgement call about what the score means clinically — the scorer reports the data, not the eligibility recommendation — but the scoring step is no longer the source of inter-rater disagreement.

  • Story grammar decomposes a narrative into setting, initiating event, internal response, internal plan, attempt, direct consequence, and reaction.
  • Hand-scored inter-rater reliability hovers in the 0.65-0.80 range — tolerable for research, not great for due-process eligibility.
  • The corrective is a deterministic computational scorer with explicit decision rules for borderline utterances.
  • The Story Grammar Scorer on this site applies the published Stein & Glenn definitions with explicit borderline rules.
  • The scorer reports complete episodes, partial episodes, and per-component breakdown — the clinician owns the clinical interpretation.
  • Same transcript, same score, every time — the property that makes the score defensible in due-process review.

The deterministic scoring rule

Story grammar scoring is only defensible in due-process review when the scoring step is reproducible — same transcript, same score, every time. Hand-scoring fails this test because borderline utterances are genuinely ambiguous. The Story Grammar Scorer on this site applies the published Stein and Glenn category definitions with explicit borderline rules and produces a reproducible score that the clinician can audit and defend.
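The reproducibility property is easy to see in miniature. The sketch below assumes utterances have already been tagged to Stein and Glenn categories (the genuinely hard step the scorer's borderline rules handle); the load-bearing component set mirrors the complete-episode definition used in the goal templates later in this pillar, and the function names are illustrative, not the Story Grammar Scorer's actual interface.

```python
# Illustrative sketch of the deterministic episode count. Category names
# follow Stein and Glenn (1979); the load-bearing set mirrors the
# complete-episode definition used in this pillar's goal templates.
# How utterances get tagged to categories is the hard step the scorer's
# borderline rules handle -- here the tags are assumed as input.

COMPONENTS = {
    "setting", "initiating_event", "internal_response", "internal_plan",
    "attempt", "direct_consequence", "reaction",
}
LOAD_BEARING = {"setting", "initiating_event", "attempt", "direct_consequence"}

def score_episodes(episodes):
    """Count complete and partial episodes from per-episode component tags."""
    complete = sum(1 for ep in episodes if LOAD_BEARING <= ep)
    partial = sum(1 for ep in episodes if ep and not LOAD_BEARING <= ep)
    return {
        "complete": complete,
        "partial": partial,
        "breakdown": [sorted(ep & COMPONENTS) for ep in episodes],
    }

# Same input, same output, every time -- the reproducibility property.
sample = [
    {"setting", "initiating_event", "attempt", "direct_consequence", "reaction"},
    {"setting", "initiating_event", "internal_response"},  # no attempt/consequence
]
result = score_episodes(sample)  # result["complete"] == 1, result["partial"] == 1
```

The design point is the determinism: given the same tagged transcript, the counts cannot drift between raters, which is the property the callout above identifies as the due-process requirement.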

4. Problem 3: The Narrative Scoring Scheme (NSS) and the seven macrostructure components

The Narrative Scoring Scheme (NSS) is the macrostructure scoring framework the school-age narrative research community has converged on as the dominant published reference in 2026, because it solves two problems story grammar scoring leaves open. First, NSS scores narratives on a 0-to-5 ordinal scale across each of seven explicit macrostructure dimensions — introduction, character development, mental states, referencing, conflict resolution, cohesion, and conclusion — which gives the clinician a richer multi-dimensional picture than the binary complete/partial-episode count. Second, NSS has published age-banded normative data from Heilmann and colleagues for school-age children in grades K through six, which is the normative reference school SLPs need for the eligibility decision and which the older story grammar literature does not provide in the same standardised form.

The seven NSS components are deliberate. Introduction captures whether the child sets up the characters, setting, and time frame at the start of the narrative. Character development captures whether the child gives the characters distinguishing features and consistent identities throughout. Mental states captures whether the child references the characters’ thoughts, feelings, and motivations — the theory-of-mind dimension that maps onto the social-pragmatic literature. Referencing captures whether the child uses pronouns and definite/indefinite articles in a way that lets the listener track which character is the subject of which event. Conflict resolution captures whether the child establishes a conflict and resolves it within the narrative arc. Cohesion captures whether the events flow in a logical sequence with appropriate connectives. Conclusion captures whether the child ends the narrative deliberately rather than just trailing off.

The clinical implementation of NSS scoring is the deterministic Narrative Scoring Scheme Calculator on this site, which takes the transcript in the same utterance-per-line format and walks the clinician through the seven components with explicit anchor descriptions for each ordinal score level. The calculator returns the seven component scores and a total NSS score that can be compared against the Heilmann age-banded norms. The clinician still makes the ordinal-level judgement call — the scorer cannot mechanically distinguish a "3" from a "4" on character development without a human reading the transcript — but the calculator structures the judgement into a reproducible workflow with explicit anchors, which is the property that takes NSS hand-scoring from a 0.70 reliability into the 0.85 range that the published reliability studies report for trained scorers using the explicit anchor rubric.

  • NSS scores narratives on a 0-to-5 ordinal scale across seven macrostructure dimensions.
  • The seven dimensions: introduction, character development, mental states, referencing, conflict resolution, cohesion, conclusion.
  • NSS has published Heilmann age-banded norms for grades K through six — the normative reference school SLPs need.
  • The mental-states dimension maps onto the social-pragmatic theory-of-mind literature.
  • The Narrative Scoring Scheme Calculator on this site walks the clinician through each component with explicit anchor descriptions.
  • Trained-scorer reliability with explicit anchors hits 0.85, vs 0.70 for unstructured hand-scoring.
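The arithmetic behind the total score is a straightforward tally, sketched below under stated assumptions: the seven component names come from the published scheme, the 0-5 bound check follows the ordinal anchors, and the example scores are invented for illustration — the ordinal-level judgement itself stays with the clinician, as this section notes.

```python
# Illustrative NSS tally. The seven component names come from the
# published scheme; each 0-5 ordinal score is the clinician's judgement
# call against the anchor descriptions. The example scores below are
# invented for illustration, not real child data.

NSS_COMPONENTS = [
    "introduction", "character_development", "mental_states",
    "referencing", "conflict_resolution", "cohesion", "conclusion",
]

def nss_total(scores):
    """Sum the seven 0-5 component scores into the NSS total (0 to 35)."""
    missing = [c for c in NSS_COMPONENTS if c not in scores]
    if missing:
        raise ValueError(f"missing component scores: {missing}")
    for c in NSS_COMPONENTS:
        if not 0 <= scores[c] <= 5:
            raise ValueError(f"{c}: {scores[c]} is outside the 0-5 ordinal scale")
    return sum(scores[c] for c in NSS_COMPONENTS)

child = {c: 3 for c in NSS_COMPONENTS}
child.update({"referencing": 2, "conclusion": 1})
total = nss_total(child)  # five components at 3, plus 2 and 1 -> 18
```

The structured part the calculator supplies is the validation and the totalling; the human part is each component's 0-to-5 call, made against the explicit anchors rather than from gut feel.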

5. Problem 4: How narrative metrics connect to literacy outcomes the IEP team cares about

The reason narrative assessment is the highest-leverage LSA activity for school-age children is that the narrative metrics connect more directly to the literacy outcomes the IEP team is held accountable for than any other LSA metric does, and the connection is both empirically documented and mechanistically understood. The published longitudinal evidence is that kindergarten and first-grade narrative metrics — particularly NSS total scores and story grammar episode counts — predict third-grade and fourth-grade reading comprehension at correlations in the 0.55 to 0.70 range, which is the highest single-predictor correlation in the school-age literacy prediction literature. A child whose kindergarten narrative is below age-banded norms is at meaningfully elevated risk of failing the third-grade reading comprehension standard, and the IEP team needs that signal at the kindergarten assessment, not at the third-grade failure.

The mechanistic explanation is that narrative discourse and reading comprehension share the same underlying cognitive substrate: both require holding a multi-event sequence in working memory, tracking referents across the sequence, inferring character motivations from limited explicit cues, and integrating new information with prior context as the discourse unfolds. A child who cannot organise their own oral narrative is a child whose underlying narrative comprehension and production system is not yet operating at age-appropriate levels, and a child whose narrative system is not operating at age-appropriate levels will struggle with the third-grade transition from "learning to read" to "reading to learn" because the latter is reading-as-narrative-comprehension. The narrative LSA is therefore not just a language assessment; it is an early literacy screening with linguistic specificity that the standard reading-readiness measures do not provide.

The clinical implication is that the school SLP’s narrative assessment report is one of the documents the literacy team should be reading at the kindergarten and first-grade IEP meetings, not just the language eligibility document. A defensible narrative report names the literacy connection explicitly — "this child’s NSS total score of 18 places them in the 10th percentile for grade 1 against Heilmann norms, which the published longitudinal data identifies as a meaningfully elevated risk of failing the grade 3 reading comprehension standard" — and recommends a literacy consult or co-treatment as part of the IEP. This is the framing that elevates the narrative LSA from an internal SLP document into a multi-disciplinary referral driver, and it is the framing the school-age narrative research community has been pushing for the last decade.

  • Kindergarten and first-grade NSS scores predict grade 3-4 reading comprehension at correlations of 0.55-0.70.
  • Narrative discourse and reading comprehension share the same underlying cognitive substrate.
  • A child whose kindergarten narrative is below norms is at meaningfully elevated risk for failing the grade 3 reading standard.
  • The narrative LSA is an early literacy screening with linguistic specificity that reading-readiness measures lack.
  • A defensible narrative report names the literacy connection explicitly and recommends a literacy consult or co-treatment.
  • This framing turns the narrative LSA into a multi-disciplinary referral driver, not just an internal SLP document.

The kindergarten-to-third-grade prediction

The published longitudinal data is unambiguous: a kindergartener whose NSS total score is below the Heilmann age-banded norms is at meaningfully elevated risk of failing the third-grade reading comprehension standard. The IEP team needs that signal at the kindergarten assessment, not at the third-grade failure. A defensible narrative report names this connection explicitly and recommends the literacy consult.

6. Problem 5: Writing IEP goals from a narrative language sample

When a narrative assessment identifies a clinically meaningful weakness — below-age-banded scores on NSS components, missing story grammar episodes, or both — the next step is writing IEP goals that target the identified weakness with the right specificity for a school caseload. The bilingual pillar in this cluster covered the language-specification problem; the narrative case has its own specification problem, which is that narrative goals have to choose among targeting a specific NSS component, a story grammar component, or a global narrative metric, and each choice has implications for measurement and intervention. The 2026 best practice is to write goals at the component level, not the global level, because component-level goals give the SLP a measurable target that intervention sessions can be designed around, whereas global goals collapse into "tell better stories," which is not a measurable target and not a defensible IEP goal.

A component-level NSS goal looks like this: "Marcus will produce a personal narrative containing all seven NSS components scored at level 3 or higher, as measured by a 50-utterance personal-narrative sample scored against Heilmann age-banded norms across three consecutive sessions." A component-level story grammar goal looks like this: "Marcus will produce a wordless-picture-book narrative containing at least two complete episodes (each containing setting, initiating event, attempt, and direct consequence) across three consecutive sessions, as measured by the Story Grammar Scorer." Both goals are SMART, both reference a deterministic measurement protocol, and both can be tracked across the school year on a quarterly progress-monitoring schedule.

The IEP Goal Generator on this site supports the narrative goal templates as a standard option, with the deterministic calculator outputs as the measurement criterion. The clinician selects the narrative goal type, enters the baseline score from the initial narrative LSA, sets the target score, and the generator produces the SMART goal sentence in the format the school district’s IEP template expects. The clinician owns the judgement call about whether the goal is appropriate for the child — the generator does not pre-empt the clinical decision — but the structured-in prose-out drafting collapses the goal-writing step from twenty minutes to two minutes per goal, which is the time saving that makes the methodologically rigorous narrative assessment protocol affordable on a school caseload.

  • Write narrative IEP goals at the COMPONENT level, not the global level — component goals are measurable, global goals are not.
  • NSS component goals reference Heilmann age-banded norms and a fixed component-score target.
  • Story grammar component goals reference complete-episode counts measured by the Story Grammar Scorer.
  • Both goal types are SMART, both reference deterministic measurement, both fit a quarterly progress-monitoring schedule.
  • The IEP Goal Generator on this site supports narrative goal templates with the deterministic calculator outputs as measurement.
  • Structured-in prose-out drafting collapses goal-writing from 20 minutes to 2 minutes per goal.
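The structured-in prose-out pattern can be sketched in a few lines. The sentence template below mirrors the story grammar goal example in this section; the function name, parameters, and defaults are illustrative assumptions, not the IEP Goal Generator's real interface.

```python
# Illustrative structured-in / prose-out drafting. The sentence template
# mirrors the story grammar goal example in this section; the function
# name, parameters, and defaults are assumptions for illustration, not
# the IEP Goal Generator's real interface.

def story_grammar_goal(student, episodes, sessions=3):
    """Render a component-level story grammar goal as a SMART sentence."""
    components = "setting, initiating event, attempt, and direct consequence"
    return (
        f"{student} will produce a wordless-picture-book narrative containing "
        f"at least {episodes} complete episodes (each containing {components}) "
        f"across {sessions} consecutive sessions, as measured by the "
        f"Story Grammar Scorer."
    )

goal = story_grammar_goal("Marcus", episodes=2)
```

The point of the pattern is that the clinician supplies the structured inputs — student, baseline, target, measurement window — and the template guarantees the prose is SMART-formatted every time, which is where the twenty-minutes-to-two-minutes saving comes from.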

7. Problem 6: Fitting a defensible narrative assessment into a 90-minute school timeline

The last problem a defensible narrative assessment has to solve is the school timeline. School SLPs do not have ninety minutes of uninterrupted face time with one child, plus another two hours of scoring time, plus another hour of writing time, plus another hour of IEP-team meeting time, on a working caseload that has thirty to fifty children on it. The methodologically rigorous narrative protocol described in the previous five sections has to fit inside the realistic school assessment window, or it does not get adopted, and a protocol that does not get adopted does not improve eligibility decisions or IEP goal quality. The 2026 honest framing is that the narrative LSA fits inside a 90-minute total clinician-time window when the scoring step is collapsed by the deterministic calculators on this site and the drafting step is collapsed by the IEP Goal Generator, and it does not fit otherwise.

The realistic timeline looks like this. Step one is the elicitation: 15 to 20 minutes of face time with the child, covering one wordless-picture-book story generation and one personal narrative. Step two is the transcription: the audio file is uploaded to ConductSpeech (or transcribed by hand if the SLP prefers, in which case this step takes another 30 to 60 minutes), and the transcript is placed into the standard utterance-per-line format the calculators use. Step three is the scoring: the transcript is run through the Narrative Scoring Scheme Calculator and the Story Grammar Scorer on this site, which together take roughly 10 minutes of clinician time including the borderline-utterance judgement calls. Step four is the cross-reference: the NSS total and component scores are compared against the Heilmann age-banded norms, the story grammar episode counts are compared against the published age trajectories, and the conversational microstructure metrics from the same sample are checked against the Pavelko and Owens values (the SUGAR Norms Lookup on this site is the access point for those norms).

Step five is the clinical reasoning step: the SLP looks at the multi-metric output — NSS components, story grammar episodes, conversational metrics from the same 50-utterance sample — and identifies the pattern of strengths and weaknesses that drives the eligibility recommendation. Step six is the present-levels paragraph drafting, which takes 5 to 10 minutes once the SLP has the structured calculator output to paste into the template. Step seven is the IEP goal drafting, which takes another 5 minutes per goal using the IEP Goal Generator. Step eight is the report finalisation, which takes 10 to 15 minutes for the polish and the literacy-referral language. Total clinician time, end to end: 75 to 90 minutes for a defensible school-age narrative LSA on a working caseload, which is the timeline that makes this protocol practical to adopt.

  • Step 1 (15-20 min): elicitation — one wordless-picture-book story generation + one personal narrative.
  • Step 2 (5-60 min): transcription — roughly 5 min with ConductSpeech, 30-60 min by hand.
  • Step 3 (10 min): scoring — NSS Calculator + Story Grammar Scorer with borderline-utterance judgement calls.
  • Step 4 (5 min): cross-reference — NSS scores against Heilmann norms, story grammar against published age trajectories.
  • Step 5 (10 min): clinical reasoning — strengths/weaknesses pattern across all the metrics.
  • Steps 6-8 (20-30 min): present-levels drafting, IEP goal generation, report finalisation.
  • Total clinician time, end to end: 75 to 90 minutes for a defensible school-age narrative LSA.

8. Common mistakes in school-age narrative assessment (and how to avoid them)

The published reliability and validity literature on school-age narrative assessment, plus a decade of clinical implementation experience in the school-age narrative research community, has identified a small set of recurring mistakes that show up in real district narrative reports and that lead to eligibility decisions that do not survive due-process review. The good news is that all of these mistakes are avoidable with explicit protocol decisions, and the explicit protocol decisions are the ones the previous six sections have walked through. This section lists the mistakes directly so a school SLP doing a self-audit on a recent narrative report can check their own work against the failure modes the literature has documented.

The first mistake is using only one elicitation paradigm. A narrative assessment based on a single story retell systematically under-counts macrostructure problems because the model story supplies the organisation; a narrative assessment based on a single picture book confounds linguistic difficulty with picture-book familiarity. The corrective is the two-paradigm protocol from section two. The second mistake is hand-scoring story grammar without explicit borderline rules. The published reliability values for hand-scored story grammar hover in the 0.65 to 0.80 range, which is not defensible for an eligibility decision; the corrective is the deterministic Story Grammar Scorer from section three. The third mistake is reporting NSS scores without the Heilmann age-banded comparison. NSS scores in isolation do not drive an eligibility decision; the comparison against age-banded norms is the load-bearing piece of the report.

The fourth mistake is failing to name the literacy connection in the report. A narrative LSA report that does not connect the narrative metrics to the literacy outcomes the IEP team cares about is leaving the highest-value piece of clinical reasoning on the table; the corrective is the explicit literacy-referral language from section four. The fifth mistake is writing global narrative IEP goals instead of component-level goals. "Marcus will tell better stories" is not a measurable target and not a defensible IEP goal; the corrective is the component-level NSS or story grammar goal templates from section five. The sixth mistake is treating narrative LSA as a separate assessment activity from conversational LSA, when the same fifty-utterance sample yields both data sets in a single elicitation. The corrective is to treat narrative LSA as the school-age default elicitation, with conversational metrics extracted as a free byproduct.

  • Mistake 1: Using only one elicitation paradigm. Fix: the two-paradigm protocol.
  • Mistake 2: Hand-scoring story grammar without explicit borderline rules. Fix: deterministic Story Grammar Scorer.
  • Mistake 3: Reporting NSS scores without Heilmann age-banded comparison. Fix: explicit norm comparison.
  • Mistake 4: Failing to name the literacy connection in the report. Fix: explicit literacy-referral language.
  • Mistake 5: Writing global narrative IEP goals instead of component-level goals. Fix: component-level NSS or story grammar templates.
  • Mistake 6: Treating narrative LSA as separate from conversational LSA. Fix: treat narrative LSA as the school-age default elicitation.

9. Where ConductSpeech fits on the narrative LSA workflow

ConductSpeech is built to support the school-age narrative assessment workflow described in this pillar in the same way it supports the other LSA workflows on the rest of the SLP caseload: HIPAA-compliant transcription of the elicited audio, deterministic scoring through the Narrative Scoring Scheme Calculator and the Story Grammar Scorer on this site, and structured-in prose-out drafting of the present-levels paragraph and IEP goals. The narrative-specific extensions are deliberate. The transcription pipeline preserves utterance boundaries in the standard format the calculators expect, which means the transcript can flow directly from the audio file into the scoring step without manual reformatting. The scoring pipeline surfaces the seven NSS components and the story grammar episode counts side by side, which is the multi-metric view the clinical reasoning step needs. The drafting pipeline supports the narrative goal templates with the deterministic calculator outputs as the measurement criterion.

The positioning matches the honest framing of every other pillar in this cluster exactly. ConductSpeech does not produce eligibility recommendations — the multi-metric pattern interpretation is a clinical judgement the school SLP makes from the data, and the tool surfaces the data without pre-empting the decision. ConductSpeech does not replace the elicitation step — the wordless-picture-book story generation and the personal narrative require a clinician sitting with the child, and a tool cannot substitute for that. ConductSpeech does not replace the literacy referral conversation — the connection between narrative metrics and reading-comprehension outcomes is a clinical recommendation the SLP makes to the IEP team, and the tool can draft the language but cannot replace the meeting. What ConductSpeech does is collapse the transcription, scoring, and first-draft paperwork steps into a workflow that takes 30 minutes total instead of two to three hours, which is the time saving that makes the 75-to-90-minute defensible narrative LSA protocol practical to adopt on a real school caseload.

For a school SLP evaluating ConductSpeech on a narrative-heavy caseload, the diagnostic questions are the same as the other LSA cases plus three narrative-specific ones: (1) Does the transcription pipeline preserve utterance boundaries in the format the Narrative Scoring Scheme Calculator and Story Grammar Scorer expect? (2) Does the scoring pipeline surface the seven NSS components and the story grammar episode counts in a multi-metric view? (3) Does the IEP Goal Generator support component-level NSS and story grammar goal templates with deterministic calculator outputs as the measurement criterion? ConductSpeech answers yes to all three today. The honest framing for the narrative case is the same as the honest framing for every other case in this cluster: the clinician owns the judgement call, the calculator owns the math, and the AI saves the clinician the hours that would otherwise be spent on transcription and first-draft paperwork.

Free tools and reference pages

Every link below stays on conductscience.com. The free calculators are deterministic — paste the same transcript, get the same score every time. For a school-age narrative LSA, run the same 50-utterance transcript through the Narrative Scoring Scheme Calculator, the Story Grammar Scorer, and the conversational LSA calculators (MLU, NDW, PGU) for a complete multi-metric data set from a single elicitation.

Free tools

Narrative Scoring Scheme Calculator

Heilmann et al. (2010) seven-component NSS scoring — the deterministic macrostructure calculator the school-age narrative literature has converged on.

Open

Story Grammar Scorer

Stein & Glenn (1979) episode-structure scoring — deterministic complete and partial episode counts with explicit borderline rules.

Open

MLU Calculator

Deterministic MLU-m and MLU-w computation — conversational microstructure metric extracted from the same 50-utterance narrative sample.

Open

Lexical Diversity Calculator

NDW, TTR, MATTR, and vocd-D — vocabulary metrics from the narrative sample, useful as a corroborating cross-metric for eligibility.

Open

PGU Calculator

Percent Grammatical Utterances — the school-age grammaticality metric computed on the same narrative transcript.

Open

DSS Calculator

Developmental Sentence Score — sentence-level syntactic complexity metric for the narrative microstructure layer.

Open

IPSyn Calculator

Scarborough (1990) productive syntax inventory — second syntactic complexity option for school-age narrative samples.

Open

SUGAR Norms Lookup

Pavelko & Owens age-banded mean and SD for MLU, NDW, and TPU — the access point for school-age narrative microstructure norms.

Open

Brown's Stages Lookup

Maps an MLU value onto Brown’s five stages — useful upstream context for early-elementary narrative samples.

Open

Language Sample Worksheet

Printable elicitation prompts and tally sheet — includes the picture-book story-generation prompts used in the narrative protocol.

Open

IEP Goal Generator

Drafts SMART goals from the narrative metrics — supports component-level NSS and story grammar goal templates.

Open

Conversation Turn Analyzer

Quantifies speaker balance and turn length — useful upstream check before a personal-narrative elicitation.

Open

Speech-Language Milestones Checker

Age-banded developmental milestones — contextual frame for whether narrative competence is developmentally on track.

Open

Early Intervention Eligibility Calculator

State-specific eligibility cutoffs — the upstream check before narrative LSA on a Birth-to-3 transition caseload.

Open

Caseload Workload Calculator

Quantifies the workload of school-age narrative assessment — the case-mix input for district staffing decisions.

Open

Therapy Frequency Recommender

Recommends a weekly therapy dose — the clinician’s judgement call after the narrative eligibility decision.

Open

Sister pillar articles

Pillar #1 — The Complete Guide to Language Sample Analysis

The hub article covering the full LSA workflow — the conceptual frame this narrative pillar specialises.

Open

Pillar #2 — MLU: Calculation, Norms, and Clinical Use

The MLU pillar — the microstructure metric extracted from narrative transcripts as a free byproduct.

Open

Pillar #3 — Brown’s 14 Morphemes: A Full Clinical Reference

The Brown’s morpheme reference — the morphology layer of the narrative microstructure analysis.

Open

Pillar #4 — SALT vs Free Alternatives: A 2026 Comparison

The tool-comparison pillar — the deterministic calculators reviewed there are the load-bearing scoring step for narrative LSA.

Open

Pillar #5 — How to Conduct a Language Sample (Protocol)

The elicitation-and-transcription protocol — the upstream step that narrative LSA specialises with picture-book and personal-narrative paradigms.

Open

Pillar #6 — Writing IEP Goals from Language Sample Data

The IEP goal pillar — the downstream step where narrative LSA results turn into component-level NSS and story grammar SMART goals.

Open

Pillar #7 — AI in Speech Therapy: What Actually Works

The AI-for-SLP pillar — the same deterministic calculator + AI drafting workflow applied to narrative LSA collapses 90 minutes to 30.

Open

Pillar #8 — Bilingual Language Sample Analysis

The bilingual LSA pillar — narrative assessment on a bilingual caseload uses one narrative per language plus the conceptual NDW cross-reference.

Open

Frequently asked questions

Which elicitation paradigm should I use for school-age narrative assessment — story retell, story generation, or personal narrative?
Use TWO of the three paradigms, not one in isolation. The 2026 best-practice consensus is one wordless-picture-book story generation (Frog, Where Are You? is the canonical stimulus, but contemporary alternatives work) plus one personal narrative on a familiar topic. Story retell is the fallback when the child is too anxious for independent generation. Each paradigm has its own diagnostic blind spot when used alone, and the two-paradigm protocol takes 15 to 20 minutes of elicitation time and produces the data the rest of the assessment is built around.
Why is hand-scoring story grammar not defensible for an eligibility decision?
Because hand-scored inter-rater reliability hovers in the 0.65 to 0.80 range, which is tolerable for research but inadequate for an eligibility decision that has to survive due-process review. The borderline-utterance problem is real — whether a particular utterance is an "internal response" or a "reaction" is genuinely ambiguous in a meaningful fraction of school-age transcripts. The corrective is a deterministic computational scorer with explicit borderline rules. The Story Grammar Scorer on this site applies the published Stein and Glenn category definitions reproducibly: same transcript, same score, every time.
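To illustrate what "deterministic with explicit borderline rules" means in practice, here is a minimal sketch of an episode-completeness check. The category labels, the completeness rule (initiating event, attempt, and direct consequence all present, following the episode definition this pillar uses), and the function names are assumptions made for this sketch; they are not the Story Grammar Scorer's actual interface.

```python
# Illustrative sketch of deterministic episode scoring, not the actual
# Story Grammar Scorer implementation. Each utterance is assumed to be
# pre-labelled with a Stein & Glenn story grammar category.

REQUIRED = {"initiating_event", "attempt", "direct_consequence"}

def score_episode(categories):
    """Classify one episode's labelled utterances.

    complete: all three required components present
    partial:  exactly two of the three present
    none:     fewer than two present
    """
    present = REQUIRED & set(categories)
    if present == REQUIRED:
        return "complete"
    if len(present) == 2:
        return "partial"
    return "none"

def count_episodes(episodes):
    """Tally complete and partial episodes across a transcript."""
    tally = {"complete": 0, "partial": 0, "none": 0}
    for episode in episodes:
        tally[score_episode(episode)] += 1
    return tally
```

The point of the sketch is reproducibility: every borderline case falls on a defined side of an explicit rule, so the same labelled transcript always yields the same episode counts.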
What are the seven Narrative Scoring Scheme (NSS) components?
Introduction (does the child set up characters, setting, and time frame), character development (distinguishing features and consistent identities), mental states (theory-of-mind references to thoughts, feelings, and motivations), referencing (pronouns and articles that let the listener track which character is which), conflict resolution (establishing and resolving a conflict within the narrative arc), cohesion (logical event sequence with appropriate connectives), and conclusion (deliberate ending rather than trailing off). Each component is scored on a 0-to-5 ordinal scale with explicit anchors, and the Narrative Scoring Scheme Calculator on this site walks the clinician through the rubric.
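The component-sum arithmetic behind the NSS total can be sketched as follows. The function and component names are illustrative, not the Narrative Scoring Scheme Calculator's actual code, and the 0-to-5 range per component follows the rubric described above (giving a total in the 0-to-35 range).

```python
# Illustrative NSS total computation (hypothetical helper, not the
# Narrative Scoring Scheme Calculator's actual implementation).
# Seven components, each scored 0-5 on the ordinal rubric.

NSS_COMPONENTS = (
    "introduction", "character_development", "mental_states",
    "referencing", "conflict_resolution", "cohesion", "conclusion",
)

def nss_total(scores):
    """Validate the seven component scores and return their sum (0-35)."""
    missing = [c for c in NSS_COMPONENTS if c not in scores]
    if missing:
        raise ValueError(f"missing components: {missing}")
    for c in NSS_COMPONENTS:
        if not 0 <= scores[c] <= 5:
            raise ValueError(f"{c} must be 0-5, got {scores[c]}")
    return sum(scores[c] for c in NSS_COMPONENTS)
```

A child scoring 3 on every component totals 21 of 35; the clinical interpretation of that total, as the next question covers, depends entirely on the age-banded normative comparison.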
What normative reference should I use for NSS scores?
The Heilmann and colleagues age-banded NSS norms for school-age children in grades K through six are the dominant published reference. NSS scores in isolation do not drive an eligibility decision; the comparison against age-banded norms is the load-bearing piece of the report. The SUGAR Norms Lookup on this site is the access point for the published norms data, and the Narrative Scoring Scheme Calculator returns scores in the format the comparison expects.
How do narrative metrics connect to literacy outcomes?
Kindergarten and first-grade NSS total scores predict third-grade and fourth-grade reading comprehension at correlations in the 0.55 to 0.70 range, which is the highest single-predictor correlation in the school-age literacy prediction literature. Narrative discourse and reading comprehension share the same underlying cognitive substrate — holding a multi-event sequence in working memory, tracking referents, inferring motivations, integrating new information with prior context. A child whose kindergarten narrative is below age-banded norms is at meaningfully elevated risk of failing the third-grade reading comprehension standard, and a defensible narrative report names this connection explicitly and recommends a literacy consult.
Should I write narrative IEP goals at the global level or the component level?
At the COMPONENT level, not the global level. Global goals collapse into "tell better stories" which is not a measurable target and not a defensible IEP goal. Component-level goals reference a specific NSS dimension (e.g., character development scored at level 3 or higher) or a specific story grammar component (e.g., at least two complete episodes containing setting, initiating event, attempt, and direct consequence). Both give the SLP a measurable target that intervention sessions can be designed around and that can be tracked across the school year on a quarterly progress-monitoring schedule. The IEP Goal Generator on this site supports both narrative goal templates.
How long does a defensible school-age narrative LSA take, end to end?
75 to 90 minutes of total clinician time when the scoring step is collapsed by the deterministic calculators on this site and the drafting step is collapsed by the IEP Goal Generator. The breakdown: 15-20 minutes elicitation, 5 minutes transcription with ConductSpeech (or 30-60 by hand), 10 minutes scoring with the Narrative Scoring Scheme Calculator and Story Grammar Scorer, 5 minutes cross-reference against Heilmann norms, 10 minutes clinical reasoning, and 20-30 minutes drafting and finalisation. Without the deterministic calculators and the structured-in prose-out drafting, the same protocol takes three to four hours of clinician time, which is the timeline that prevents adoption on a real school caseload.
Can I extract conversational LSA metrics from a narrative sample?
Yes — the same 50-utterance narrative transcript yields MLU-m, MLU-w, NDW, and PGU as a free byproduct of the narrative scoring workflow, and the calculators on this site (MLU Calculator, Lexical Diversity Calculator, PGU Calculator) accept the same transcript format as the Narrative Scoring Scheme Calculator. This is one of the reasons narrative LSA is the highest-leverage school-age elicitation: a single 50-utterance sample produces conversational microstructure metrics, narrative microstructure metrics (DSS, IPSyn), and narrative macrostructure metrics (NSS, story grammar) all from one data set. Note that PGU values are typically lower on narrative samples than on conversational samples for the same child because narrative discourse is grammatically more demanding.
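The byproduct extraction is simple enough to sketch. Assuming one utterance per line and whitespace tokenisation (a deliberate simplification of real morpheme-level transcription conventions, which the calculators on this site handle properly), illustrative MLU-w and NDW computations look like this:

```python
# Illustrative microstructure extraction from a narrative transcript
# (hypothetical helpers, not this site's calculator code). Assumes one
# utterance per line; tokens are whitespace-split and lower-cased, which
# ignores morpheme segmentation and so yields MLU-w, not MLU-m.

def tokenize(transcript):
    """One utterance per non-blank line, split into lower-cased words."""
    return [line.lower().split()
            for line in transcript.strip().splitlines() if line.strip()]

def mlu_words(utterances):
    """Mean length of utterance in words (MLU-w)."""
    return sum(len(u) for u in utterances) / len(utterances)

def ndw(utterances):
    """Number of different words across the whole sample."""
    return len({word for utt in utterances for word in utt})
```

The same tokenised utterance list then feeds grammaticality judgements for PGU, which is why a single 50-utterance narrative sample can carry the whole multi-metric workup.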
What are the most common mistakes in school-age narrative assessment?
The published literature has identified six recurring failure modes: (1) using only one elicitation paradigm, (2) hand-scoring story grammar without explicit borderline rules, (3) reporting NSS scores without the Heilmann age-banded comparison, (4) failing to name the literacy connection in the report, (5) writing global narrative goals instead of component-level ones, and (6) treating narrative LSA as a separate activity from conversational LSA when the same sample yields both. Each mistake has an explicit corrective in this pillar, and the corrective protocols are the ones the school-age narrative research community has been pushing for the last decade.
Does narrative assessment work on a bilingual caseload?
Yes, with the same adaptations the bilingual LSA pillar in this cluster covers. Run one narrative per language under appropriate elicitation conditions, score each narrative against language-specific norms when they exist, use the conceptual NDW as a corroborating cross-language metric, and apply the "evident in both languages" rule to differentiate language difference from language disorder. The Narrative Scoring Scheme rubric is broadly applicable across languages because the seven dimensions are macrostructure features that exist in any narrative tradition, but the age-banded normative comparison is language-specific. See Pillar #8 in this cluster for the full bilingual LSA protocol.

References

  1. Stein, N. L., & Glenn, C. G. (1979). An analysis of story comprehension in elementary school children. In R. O. Freedle (Ed.), New Directions in Discourse Processing (Vol. 2, pp. 53-120). Norwood, NJ: Ablex.
  2. Heilmann, J., Miller, J. F., Nockerts, A., & Dunaway, C. (2010). Properties of the Narrative Scoring Scheme using narrative retells in young school-age children. American Journal of Speech-Language Pathology, 19(2), 154-166.
  3. Petersen, D. B., Gillam, S. L., & Gillam, R. B. (2008). Emerging procedures in narrative assessment: The Index of Narrative Complexity. Topics in Language Disorders, 28(2), 115-130.
  4. Justice, L. M., Bowles, R. P., Kaderavek, J. N., Ukrainetz, T. A., Eisenberg, S. L., & Gillam, R. B. (2006). The index of narrative microstructure: A clinical tool for analyzing school-age children’s narrative performances. American Journal of Speech-Language Pathology, 15(2), 177-191.
  5. Pavelko, S. L., & Owens, R. E. (2017). Sampling Utterances and Grammatical Analysis Revised (SUGAR): New normative values for language sample analysis measures. Language, Speech, and Hearing Services in Schools, 48(3), 197-215.
  6. Mayer, M. (1969). Frog, Where Are You? New York: Dial Press. [The canonical wordless picture-book stimulus for school-age narrative elicitation.]
  7. Westby, C. E. (2005). Assessing and remediating text comprehension problems. In H. W. Catts & A. G. Kamhi (Eds.), Language and Reading Disabilities (2nd ed., pp. 157-232). Boston: Allyn & Bacon.
  8. Catts, H. W., Fey, M. E., Tomblin, J. B., & Zhang, X. (2002). A longitudinal investigation of reading outcomes in children with language impairments. Journal of Speech, Language, and Hearing Research, 45(6), 1142-1157.
  9. Gillam, R. B., & Pearson, N. A. (2017). Test of Narrative Language, Second Edition (TNL-2). Austin, TX: PRO-ED.
  10. Bishop, D. V. M., & Edmundson, A. (1987). Language-impaired 4-year-olds: Distinguishing transient from persistent impairment. Journal of Speech and Hearing Disorders, 52(2), 156-173.
  11. American Speech-Language-Hearing Association (2024). Spoken Language Disorders — Practice Portal. ASHA.
  12. Individuals with Disabilities Education Act (IDEA), 34 CFR § 300.304 — Evaluation procedures. U.S. Department of Education.
  13. Scarborough, H. S. (1990). Index of Productive Syntax. Applied Psycholinguistics, 11(1), 1-22.

This article is a clinical-workflow reference, not legal or regulatory advice. School-age narrative eligibility decisions are subject to jurisdiction-specific procedural requirements; consult your district’s special-education compliance officer and the literacy specialist on your multidisciplinary team before finalising any eligibility recommendation grounded in this protocol.

Get the full analysis

School-age narrative LSA, done defensibly, on a school timeline

ConductSpeech runs the school-age narrative LSA workflow end to end — HIPAA-compliant transcription, deterministic Narrative Scoring Scheme and Story Grammar scoring, and structured-in prose-out drafting of the present-levels paragraph and component-level narrative IEP goals. Free trial; no installation.

Get the full analysis with ConductSpeech