1. What is Language Sample Analysis?
Language Sample Analysis (LSA) is a descriptive assessment in which a clinician records a child speaking in a naturalistic context, transcribes the speech into utterances, and then computes metrics that summarise the grammar, vocabulary, fluency, and discourse organisation of the sample. Unlike a norm-referenced test, an LSA produces a portrait of the language the child actually uses — not their performance on a standardised stimulus that the child may never encounter outside the testing room.
The technique was formalised by Roger Brown, whose A First Language (1973) introduced both the morpheme-counting rules that still underpin Mean Length of Utterance (MLU) and the five-stage developmental framework that bears his name. Five decades later, LSA remains the assessment that ASHA, IDEA evaluation teams, and the Council for Exceptional Children consistently recommend as ecologically valid evidence of expressive language ability.
A modern LSA produces three families of numbers: structural metrics (MLU in morphemes, MLU in words, T-units, subordination index), lexical metrics (Number of Different Words, Type-Token Ratio, Moving-Average TTR, vocd-D), and grammatical-accuracy metrics (Percent Grammatical Utterances, Developmental Sentence Score, Index of Productive Syntax). Each family answers a different clinical question, and you almost never report only one of them.
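Under the hood, the structural and lexical side of that battery reduces to counting over a cleaned list of utterances. The Python sketch below, with invented example utterances, shows how MLU in words, NDW, and TTR all fall out of the same token counts; it is an illustration of the arithmetic, not a clinical scoring tool.

```python
# Minimal sketch: MLU-words, NDW, and TTR from a list of cleaned utterances.
# Assumes mazes and unintelligible segments have already been removed;
# the utterances here are invented for illustration.

def tokenize(utterance: str) -> list[str]:
    """Lower-case, strip trailing punctuation, split on whitespace."""
    return [w.strip(".,!?").lower() for w in utterance.split() if w.strip(".,!?")]

def mlu_words(utterances: list[str]) -> float:
    """Mean Length of Utterance in words: total words / total utterances."""
    counts = [len(tokenize(u)) for u in utterances]
    return sum(counts) / len(counts)

def ndw(utterances: list[str]) -> int:
    """Number of Different Words: unique word types in the whole sample."""
    return len({w for u in utterances for w in tokenize(u)})

def ttr(utterances: list[str]) -> float:
    """Type-Token Ratio: unique types divided by total tokens."""
    tokens = [w for u in utterances for w in tokenize(u)]
    return len(set(tokens)) / len(tokens)

sample = ["the doggie is running", "he wants the ball", "look at him go"]
print(f"MLU-words {mlu_words(sample):.2f}  NDW {ndw(sample)}  TTR {ttr(sample):.2f}")
```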
2. Why LSA matters for assessment and IEP eligibility
Language Sample Analysis is the only routine clinical procedure that yields functional grammar evidence in the child’s own dialect, in real time, against published age expectations. Standardised tests like the CELF, PLS, or OWLS sample what a child can do on a 30-minute stimulus battery; an LSA samples what the child does when the cognitive load looks like real conversation. For dual-language learners, children whose home dialect differs from General American English, and any student whose standardised score does not match the teacher's report, an LSA is often the deciding evaluation.
School evaluations conducted under IDEA (34 CFR §300.304) require multiple sources of data and prohibit the use of any single test as the sole criterion for an eligibility decision. Most state implementing regulations explicitly name "language sample" as one of the corroborating sources districts may use. In practice, school-based SLPs use the language sample to support eligibility under speech-language impairment (SLI), specific learning disability (SLD), or developmental delay, and to write present-levels paragraphs that can survive a due-process hearing.
Outside of schools, the same logic holds in early intervention, traumatic-brain-injury rehabilitation, and bilingual evaluation. If you ever have to defend a clinical decision in a deposition, the language sample is the piece of evidence that comes from the child rather than from a test publisher.
- IDEA Part B requires multiple measures — the language sample is the most defensible second source.
- CELF/PLS/OWLS sample structured stimulus performance; LSA samples spontaneous use.
- A 50-utterance sample meets the 25-50-utterance minimums recommended in Owens (2014), Pavelko & Owens (2017), and Eisenberg & Guo (2013).
3. The 6-step LSA protocol
The protocol below is the consensus workflow synthesised from Owens (2014), Miller & Iglesias (2008), Pavelko & Owens (2017), and Heilmann et al. (2010). It produces a clinically usable sample in 25-40 minutes of clock time and a transcript that supports MLU, NDW, TTR, percent-grammatical, and morpheme-by-morpheme analysis.
- Step 1 — Elicit. Use age-appropriate prompts (conversation for ages 2-3, story-retell for 4-7, expository for 8-12). Keep adult turns to roughly 30% of all turns; the goal is the child’s speech, not the clinician’s.
- Step 2 — Record. Lapel mic or phone camera, quiet room, and at least 50 utterances (100 is better). Avoid background music and overlapping adult speech.
- Step 3 — Transcribe. One utterance per line; terminate at falling intonation or a clear grammatical boundary. Mark mazes, abandoned utterances, and unintelligible segments in brackets so the metric script can exclude them (see the maze-stripping sketch after this list).
- Step 4 — Segment. Split into communication units (C-units) for school-age children or T-units for written-language analogues. Use Loban (1976) C-unit rules.
- Step 5 — Score. Run the standard battery: MLU-morphemes, MLU-words, total utterances, NDW, TTR, percent-grammatical-utterances, and Brown’s stage. Optional: DSS, IPSyn, subordination index.
- Step 6 — Compare and write. Place each metric next to its age band, write a 4-6 sentence present-levels paragraph, and pull at least one IEP goal directly from the metric gap.
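The maze convention in Step 3 is what makes the later scoring steps automatable. Below is a minimal Python sketch that assumes mazes are wrapped in parentheses and unintelligible stretches are typed as xxx; both conventions and the example lines are illustrative, not a SALT specification.

```python
import re

# Sketch of the Step 3 convention: mazes (filled pauses, restarts) are wrapped
# in parentheses and unintelligible segments are transcribed as "xxx", so a
# scoring script can strip them before computing any metric.

MAZE = re.compile(r"\([^)]*\)")   # anything in parentheses is treated as a maze
UNINTELLIGIBLE = "xxx"

def clean_utterance(line: str) -> str | None:
    """Remove mazes and xxx; return None if nothing analysable is left."""
    text = MAZE.sub(" ", line)
    words = [w for w in text.split() if w != UNINTELLIGIBLE]
    return " ".join(words) if words else None

raw = [
    "(um) I (I I) went to the park",
    "(uh) xxx",                          # fully unintelligible -> excluded
    "he was (he was) running xxx fast",
]
cleaned = [c for c in (clean_utterance(l) for l in raw) if c]
print(cleaned)   # ['I went to the park', 'he was running fast']
```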
Time budget for a school-based LSA
Most school SLPs can keep total time to about 60 minutes per child: 25 minutes elicitation, 20 minutes transcription, 10 minutes scoring, 5 minutes writing. The transcription step is the bottleneck — it is the step ConductSpeech automates entirely.
4. The metric battery — what each number actually tells you
Every LSA report contains the same shortlist of numbers, but it is depressingly common to see a clinician report MLU and stop there. A defensible report covers structural, lexical, and accuracy dimensions because each one fails differently in different disorders. A child with Developmental Language Disorder (DLD) often presents with low MLU and low Percent Grammatical Utterances (PGU); a child with a word-finding problem may present with normal MLU but low NDW; a child with autism may present with normal MLU and NDW but a flat subordination index.
- MLU-morphemes — average morphemes per utterance, the canonical Brown (1973) metric. Strongly correlated with grammatical age in children below MLU 4.0.
- MLU-words — average words per utterance, less sensitive but easier to compute and required for most school-age comparisons.
- NDW — Number of Different Words, the lexical breadth metric. Best computed on a fixed sample size (e.g., first 50 utterances) to avoid sample-length bias.
- TTR / MATTR / vocd-D — lexical-diversity metrics. Plain TTR drops as samples get longer; MATTR and vocd-D correct for length (a MATTR sketch follows this list).
- PGU — Percent Grammatical Utterances, the accuracy metric. Below ~80% on conversational language is a clinical red flag for school-age children.
- Subordination Index, T-units, IPSyn — advanced syntactic complexity metrics for school-age and adolescent samples.
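Because plain TTR shrinks as the sample grows, MATTR is usually the safer lexical-diversity number to report. The sketch below computes MATTR the standard sliding-window way; the 50-token window is a common default, not a mandated value, and the demo call shrinks the window only so the toy sample is long enough.

```python
# MATTR sketch: compute TTR inside a fixed-length sliding window and average
# across all windows, which removes the sample-length dependence of plain TTR.

def mattr(tokens: list[str], window: int = 50) -> float:
    if len(tokens) < window:                    # too short for a full window:
        return len(set(tokens)) / len(tokens)   # fall back to plain TTR
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)

tokens = "the doggie is running after the ball in the yard".split()
print(f"MATTR {mattr(tokens, window=5):.2f}")   # tiny window for the tiny demo
```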
Automate this workflow
The free tools below cover the basic case. ConductSpeech ties them together and starts from raw audio so you do not have to type the transcript.
5. Normative comparison — interpreting the numbers
A metric in isolation is a number. A metric next to an age-banded reference range is a clinical decision. The two normative datasets that dominate the field are Brown (1973) for MLU stages I through V and Pavelko & Owens (2017, "SUGAR" — Sampling Utterances and Grammatical Analysis Revised) for MLU, NDW, and PGU norms by year of age from 3 to 7. The Heilmann et al. (2010) reference set extends NDW and TTR up through age 11.
The trick is to apply the right reference to the right child. Brown’s stages were derived from a tiny longitudinal sample (Adam, Eve, and Sarah) and overestimate MLU growth past Stage V. SUGAR was derived from 250 typically-developing US children speaking General American English and is the right comparison set for school-age General American English speakers; it should not be used unmodified for African American English, Spanish-English bilingual learners, or speakers of other dialects.
For children outside the SUGAR sampling frame, the field generally recommends a within-child comparison: collect a baseline sample, intervene, collect a probe sample 6-8 weeks later, and look for change relative to the child’s own starting point. This is the same logic used in dynamic assessment.
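In practice the within-child comparison is just the same metric battery run twice and reported as change. The sketch below uses invented placeholder numbers, not published norms, to show the shape of that report.

```python
# Within-child comparison: the same metric battery is run on a baseline sample
# and on a probe sample collected 6-8 weeks later, and the report shows the
# change rather than a comparison to norms that may not apply to the child.
# The values below are invented placeholders, not published data.

baseline = {"MLU-w": 3.1, "NDW": 62, "PGU": 0.64}
probe    = {"MLU-w": 3.6, "NDW": 78, "PGU": 0.71}

for metric in baseline:
    change = probe[metric] - baseline[metric]
    print(f"{metric:6s} baseline {baseline[metric]:>6}  probe {probe[metric]:>6}  change {change:+.2f}")
```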
6. From sample to IEP — writing goals from LSA data
The mistake school-based SLPs make most often is treating the LSA as evaluation evidence and then writing the IEP goals from a separate goal bank. The whole point of the language sample is that it tells you which structures are missing or unstable for this child. Use the metric gaps to drive the goal targets.
A defensible workflow: identify the metric where the child sits more than one age band below expectation, find the corresponding linguistic structure (e.g., low MLU + missing past-tense -ed = morpheme target), draft a measurable goal that names the structure, sets a 6-month criterion, and specifies the conditions under which the data will be collected. The IEP team then has a goal that came directly from real spoken language evidence and can be measured by re-running the same LSA at quarterly progress monitoring.
- Articulation goals from PCC and phonological-processes data, not from a stimulus probe.
- Expressive-language goals from MLU, NDW, and missing morphemes — not from a Boardmaker template.
- Pragmatics goals from conversation-turn analysis on the same sample.
- Fluency goals from %SS in the conversational sample.
7. Free tools vs paid tools — the honest comparison
For 25 years the LSA toolchain meant SALT (Systematic Analysis of Language Transcripts) from the University of Wisconsin. SALT is an excellent piece of software with a 700-child reference database, but it costs roughly $295 per single-user license and has an idiosyncratic transcription syntax that takes hours to learn. CLAN (the Computerised Language Analysis suite that ships with the CHILDES corpus) is free, more powerful, and almost completely opaque to a clinician who has not taken a graduate-level seminar in it.
The middle ground the ConductScience SLP tool suite offers is a battery of single-purpose, browser-based calculators that compute one metric at a time, accept plain-text utterances (no special transcription syntax), and come paired with the published norm bands. They are free, run client-side, and never see the child’s data; they are also intentionally limited to the basic case, so clinicians who need full corpus management still go to SALT or CHILDES. ConductSpeech, the paid product, sits one layer above these tools: it runs the same metric battery but starts from raw audio, so the clinician can skip the transcription step entirely.
When to use what
Use the free tools below for a quick clinic-day analysis of a single sample, use SALT or CLAN for grant-funded research with longitudinal corpora, and use ConductSpeech when the bottleneck is transcription time on a 50-student caseload.
8. Five common mistakes that wreck an LSA
Most failed language samples fail at one of the same five points. Read this before your next clinic day; it will save you a re-test.
- Talking too much. Adult turns above 30% of total turns suppress child MLU and NDW — the sample then describes the clinician, not the child.
- Stopping at 25 utterances. Below the 50-utterance floor, MLU is unstable and NDW is biased downward by sample length.
- Counting morphemes by ear. Brown’s rules are not intuitive (regular past -ed adds a morpheme, so walked = 2 morphemes, while gonna = 1 morpheme and doggie = 1 morpheme). Use a calculator that implements the rules in code; a simplified sketch follows this list.
- Ignoring mazes. Filled pauses, restarts, and abandoned utterances inflate word counts and depress MLU. Bracket them in transcription and exclude them from metric scripts.
- Comparing to the wrong norms. Applying SUGAR-English norms to a Spanish-dominant bilingual child is the single most common report-writing error in the field.
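To make the morpheme-counting point concrete, here is a deliberately simplified Python sketch of a few of Brown's rules. It handles only a handful of inflections and exceptions, will still misfire on words like glass or thing, and exists only to show why counting "by ear" fails; a clinical implementation needs the full rule set.

```python
# Simplified sketch of a few of Brown's (1973) counting rules: bound
# inflections (plural -s, possessive 's, third-person -s, regular past -ed,
# progressive -ing) each add a morpheme, while catenatives (gonna, wanna,
# hafta), diminutives (doggie), and irregular pasts (went, got) count as one.
# The exception list and suffix check here are illustrative, not exhaustive.

ONE_MORPHEME = {"gonna", "wanna", "hafta", "doggie", "mommy", "went", "got"}
INFLECTIONS = ("'s", "ing", "ed", "s")

def count_morphemes(word: str) -> int:
    w = word.lower()
    if w in ONE_MORPHEME:
        return 1
    for suffix in INFLECTIONS:
        if len(w) > len(suffix) + 2 and w.endswith(suffix):
            return 2          # stem + one bound inflection
    return 1

for w in ["walked", "gonna", "doggie", "running", "cats"]:
    print(w, count_morphemes(w))   # 2, 1, 1, 2, 2
```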