Pillar guide · Speech-Language Pathology · 14 min read

The Complete Guide to Language Sample Analysis (LSA) for SLPs

Language Sample Analysis is the closest thing the field has to a gold-standard descriptor of a child’s real grammar. This guide walks through every step — elicitation, transcription, segmentation, metric calculation, normative comparison, and IEP application — with the free tools and references you need to do it without paying for SALT.

1. What is Language Sample Analysis?

Language Sample Analysis (LSA) is a descriptive assessment in which a clinician records a child speaking in a naturalistic context, transcribes the speech into utterances, and then computes metrics that summarise the grammar, vocabulary, fluency, and discourse organisation of the sample. Unlike a norm-referenced test, an LSA produces a portrait of the language the child actually uses — not their performance on a standardised stimulus that the child may never encounter outside the testing room.

The technique was formalised by Roger Brown, whose A First Language (1973) introduced both the morpheme-counting rules that still underpin Mean Length of Utterance (MLU) and the five-stage developmental framework that bears his name. Five decades later, LSA remains the assessment that ASHA, IDEA evaluation teams, and the Council for Exceptional Children consistently recommend as ecologically valid evidence of expressive language ability.

A modern LSA produces three families of numbers: structural metrics (MLU in morphemes, MLU in words, T-units, subordination index), lexical metrics (Number of Different Words, Type-Token Ratio, Moving-Average TTR, vocd-D), and grammatical-accuracy metrics (Percent Grammatical Utterances, Developmental Sentence Score, Index of Productive Syntax). Each family answers a different clinical question, and you almost never report only one of them.

2. Why LSA matters for assessment and IEP eligibility

Language Sample Analysis is the only routine clinical procedure that yields functional grammar evidence in the child’s own dialect, in real time, against published age expectations. Standardised tests like the CELF, PLS, or OWLS sample what a child can do on a 30-minute stimulus battery; an LSA samples what the child does when the cognitive load looks like real conversation. For dual-language learners, for children whose home dialect differs from General American English, and for any student whose standardised score does not match teacher report, an LSA is often the deciding evaluation.

IDEA-eligible school evaluations under 34 CFR §300.304 require multiple sources of data and prohibit the use of a single test for eligibility decisions. Most state implementing regulations explicitly name "language sample" as one of the corroborating sources districts may use. In practice, school-based SLPs use the language sample to support eligibility under speech-language impairment (SLI), specific learning disability (SLD), or developmental delay, and to write present levels paragraphs that can survive a due-process hearing.

Outside of schools, the same logic holds in early intervention, traumatic-brain-injury rehabilitation, and bilingual evaluation. If you ever have to defend a clinical decision in a deposition, the language sample is the piece of evidence that comes from the child rather than from a test publisher.

  • IDEA Part B requires multiple measures — the language sample is the most defensible second source.
  • CELF/PLS/OWLS sample structured stimulus performance; LSA samples spontaneous use.
  • A 50-utterance sample meets or exceeds the 25-50-utterance minimums recommended in Owens (2014), Pavelko & Owens (2017), and Eisenberg & Guo (2013).

3. The 6-step LSA protocol

The protocol below is the consensus workflow synthesised from Owens (2014), Miller & Iglesias (2008), Pavelko & Owens (2017), and Heilmann et al. (2010). It produces a clinically usable sample in 25-40 minutes of clock time and a transcript that supports MLU, NDW, TTR, percent-grammatical, and morpheme-by-morpheme analysis.

  • Step 1 — Elicit. Use age-appropriate prompts (conversation for ages 2-3, story-retell for 4-7, expository for 8-12). Keep adult turns to roughly 30% of the total; the goal is the child’s speech, not the clinician’s.
  • Step 2 — Record. Lapel mic or a phone in a quiet room, 50-100 utterances minimum. Avoid background music and simultaneous adult talkers.
  • Step 3 — Transcribe. One utterance per line, terminating at falling intonation or a clear grammatical boundary. Mark mazes, abandoned utterances, and unintelligible segments in brackets so the metric script can exclude them.
  • Step 4 — Segment. Split into communication units (C-units) for school-age children or T-units for written-language analogues. Use Loban (1976) C-unit rules.
  • Step 5 — Score. Run the standard battery: MLU-morphemes, MLU-words, total utterances, NDW, TTR, percent-grammatical-utterances, and Brown’s stage. Optional: DSS, IPSyn, subordination index.
  • Step 6 — Compare and write. Place each metric next to its age band, write a 4-6 sentence present-levels paragraph, and pull at least one IEP goal directly from the metric gap.
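Steps 3 and 4 above are easy to mechanise once the transcription convention is fixed. The sketch below assumes one utterance per line with mazes and unintelligible segments marked in square brackets (an illustrative convention, not SALT or CHAT syntax); the function names are hypothetical, not part of any published tool.

```python
import re

def clean_utterance(line: str) -> str:
    """Strip bracketed mazes and unintelligible segments from one
    transcribed utterance, e.g. '[um] he [he] went [xxx] home'."""
    no_mazes = re.sub(r"\[[^\]]*\]", " ", line)
    return re.sub(r"\s+", " ", no_mazes).strip()

def load_sample(transcript: str) -> list[str]:
    """One utterance per line; drop lines that are empty once mazes
    are removed (fully unintelligible or abandoned utterances)."""
    utterances = []
    for line in transcript.splitlines():
        cleaned = clean_utterance(line)
        if cleaned:
            utterances.append(cleaned)
    return utterances
```

Because the maze markup is stripped before any counting happens, every downstream metric automatically ignores filled pauses and restarts, which is exactly the behaviour Step 3 asks for.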

Time budget for a school-based LSA

Most school SLPs can keep total time under 60 minutes per child: 25 minutes elicitation, 20 minutes transcription, 10 minutes scoring, 5 minutes writing. The transcription step is the bottleneck — it is the step ConductSpeech automates entirely.

4. The metric battery — what each number actually tells you

Every LSA report contains the same shortlist of numbers, but it is depressingly common to see a clinician report MLU and stop there. A defensible report covers structural, lexical, and accuracy dimensions because each one fails differently in different disorders. A child with Developmental Language Disorder (DLD) often presents with low MLU and low Percent Grammatical Utterances (PGU); a child with a word-finding problem may present with normal MLU but low NDW; a child with autism may present with normal MLU and NDW but a flat subordination index.

  • MLU-morphemes — average morphemes per utterance, the canonical Brown (1973) metric. Strongly correlated with grammatical age in children below MLU 4.0.
  • MLU-words — average words per utterance, less sensitive but easier to compute and required for most school-age comparisons.
  • NDW — Number of Different Words, the lexical breadth metric. Best computed on a fixed sample size (e.g., first 50 utterances) to avoid sample-length bias.
  • TTR / MATTR / vocd-D — lexical-diversity metrics. Plain TTR drops as samples get longer; MATTR and vocd-D correct for length.
  • PGU — Percent Grammatical Utterances, the accuracy metric. Below ~80% on conversational language is a clinical red flag for school-age children.
  • Subordination Index, T-units, IPSyn — advanced syntactic complexity metrics for school-age and adolescent samples.
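The word-based metrics in this battery reduce to a few lines of code once the sample is a list of clean utterance strings. A minimal sketch follows; the function names are illustrative, and the MATTR window of 50 words is one common choice, not a mandated value.

```python
def mlu_words(utterances: list[str]) -> float:
    """Mean Length of Utterance in words: mean word count per utterance."""
    counts = [len(u.split()) for u in utterances]
    return sum(counts) / len(counts)

def ndw(utterances: list[str], first_n: int = 50) -> int:
    """Number of Different Words on a fixed slice (here the first 50
    utterances) to control for sample-length bias."""
    words = [w.lower() for u in utterances[:first_n] for w in u.split()]
    return len(set(words))

def ttr(utterances: list[str]) -> float:
    """Plain Type-Token Ratio: unique words / total words."""
    words = [w.lower() for u in utterances for w in u.split()]
    return len(set(words)) / len(words)

def mattr(utterances: list[str], window: int = 50) -> float:
    """Moving-Average TTR: mean TTR over a sliding word window, which
    corrects the length dependence of plain TTR."""
    words = [w.lower() for u in utterances for w in u.split()]
    if len(words) < window:
        return len(set(words)) / len(words)  # short sample: fall back to plain TTR
    ratios = [len(set(words[i:i + window])) / window
              for i in range(len(words) - window + 1)]
    return sum(ratios) / len(ratios)
```

Note how `ndw` fixes the slice while `ttr` does not: on a long transcript plain TTR will drift downward as tokens accumulate, which is why MATTR or a fixed-size NDW is the defensible choice for anything over a few dozen utterances.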

5. Normative comparison — interpreting the numbers

A metric in isolation is a number. A metric next to an age-banded reference range is a clinical decision. The two normative datasets that dominate the field are Brown (1973) for MLU stages I through V and Pavelko & Owens (2017, "SUGAR" — Sampling Utterances and Grammatical Analysis Revised) for MLU, NDW, and PGU norms by year of age from 3 to 7. The Heilmann et al. (2010) reference set extends NDW and TTR up through age 11.

The trick is to apply the right reference to the right child. Brown’s stages were derived from a tiny longitudinal sample (Adam, Eve, and Sarah), and MLU becomes an unreliable index of grammatical growth past Stage V. SUGAR was derived from 250 typically developing US children speaking General American English and is the right comparison set for school-age General American English speakers; it should not be used unmodified for African American English speakers, Spanish-English bilingual learners, or speakers of other dialects.

For children outside the SUGAR sampling frame, the field generally recommends a within-child comparison: collect a baseline sample, intervene, collect a probe sample 6-8 weeks later, and look for change relative to the child’s own starting point. This is the same logic used in dynamic assessment.
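Both comparison modes reduce to the same small pattern: place a measured value against a reference, whether the reference is a published age band or the child's own baseline. A sketch, with hypothetical function names; note that no norm values are hard-coded here, because the band must always come from the published table the clinician selects.

```python
def compare_to_band(value: float, band: tuple[float, float]) -> str:
    """Place a metric against a clinician-supplied reference band
    (low, high) taken from a published table such as SUGAR."""
    low, high = band
    if value < low:
        return "below expectation"
    if value > high:
        return "above expectation"
    return "within expectation"

def percent_change(baseline: float, probe: float) -> float:
    """Within-child comparison: change from baseline sample to probe
    sample, as a percentage of the baseline."""
    return (probe - baseline) / baseline * 100
```

Keeping the band as an explicit argument makes the report auditable: the present-levels paragraph can cite the exact published range that was used, and the same code serves the within-child route when no valid norms exist.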

6. From sample to IEP — writing goals from LSA data

The mistake school-based SLPs make most often is treating the LSA as evaluation evidence and then writing the IEP goals from a separate goal bank. The whole point of the language sample is that it tells you which structures are missing or unstable for this child. Use the metric gaps to drive the goal targets.

A defensible workflow: identify the metric where the child sits more than one age band below expectation, find the corresponding linguistic structure (e.g., low MLU + missing past-tense -ed = morpheme target), draft a measurable goal that names the structure, sets a 6-month criterion, and specifies the conditions under which the data will be collected. The IEP team then has a goal that came directly from real spoken language evidence and can be measured by re-running the same LSA at quarterly progress monitoring.

  • Articulation goals from PCC and phonological-processes data, not from a stimulus probe.
  • Expressive-language goals from MLU, NDW, and missing morphemes — not from a Boardmaker template.
  • Pragmatics goals from conversation-turn analysis on the same sample.
  • Fluency goals from %SS in the conversational sample.

7. Free tools vs paid tools — the honest comparison

For 25 years the LSA toolchain meant SALT (Systematic Analysis of Language Transcripts) from the University of Wisconsin. SALT is an excellent piece of software with a 700-child reference database, but it costs roughly $295 per single-user licence and uses an idiosyncratic transcription syntax that takes hours to learn. CLAN (Computerized Language ANalysis), the suite that ships with the CHILDES corpus, is free, more powerful, and almost completely opaque to a clinician who has not taken a graduate-level seminar in it.

The middle ground the ConductScience SLP tool suite occupies is a battery of single-purpose, browser-based calculators that compute one metric at a time, accept plain-text utterances (no special transcription syntax), and come paired with the published norm bands. They are free and run entirely client-side, so the child’s data never leaves the browser. They are also intentionally limited to the basic case, so clinicians who need full corpus management still go to SALT or CHILDES. ConductSpeech, the paid product, sits one layer above these tools: it runs the same metric battery but starts from raw audio, so the clinician can skip the transcription step entirely.

When to use what

Use the free tools below for a quick clinic-day analysis of a single sample, use SALT or CLAN for grant-funded research with longitudinal corpora, and use ConductSpeech when the bottleneck is transcription time on a 50-student caseload.

8. Five common mistakes that wreck an LSA

Most failed language samples fail at one of the same five points. Read this before your next clinic day; it will save you a re-test.

  • Talking too much. Adult turns above 30% of total turns suppress child MLU and NDW — the sample then describes the clinician, not the child.
  • Stopping at 25 utterances. Below the 50-utterance floor, MLU is unstable and NDW is biased downward by sample length.
  • Counting morphemes by ear. Brown’s rules are not intuitive (walked = 2 morphemes, but catenatives like gonna = 1 morpheme and diminutives like doggie = 1 morpheme). Use a calculator that implements the rules in code.
  • Ignoring mazes. Filled pauses, restarts, and abandoned utterances inflate word counts and depress MLU. Bracket them in transcription and exclude them from metric scripts.
  • Comparing to the wrong norms. Applying SUGAR-English norms to a Spanish-dominant bilingual child is the single most common report-writing error in the field.
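Counting morphemes in code is only reliable when the transcription marks bound morphemes explicitly; guessing suffixes from spelling misfires constantly (red is not re + -ed). The sketch below assumes a convention where the transcriber marks each bound morpheme with a slash, e.g. 'walk/ed', 'dog/s' (a convention in the spirit of SALT-style coding, shown here for illustration; the counting rules themselves come from Brown 1973, and the function names are hypothetical).

```python
# Brown (1973): catenatives count as a single morpheme.
SINGLE_MORPHEME = {"gonna", "wanna", "hafta"}

def morphemes_in(utterance: str) -> int:
    """Count morphemes in one utterance whose bound morphemes are
    pre-marked with '/', e.g. 'he walk/ed two dog/s home'."""
    total = 0
    for token in utterance.lower().split():
        if token in SINGLE_MORPHEME:
            total += 1
        else:
            # one free morpheme plus one per marked bound morpheme
            total += token.count("/") + 1
    return total

def mlu_morphemes(utterances: list[str]) -> float:
    """Mean Length of Utterance in morphemes over the whole sample."""
    return sum(morphemes_in(u) for u in utterances) / len(utterances)
```

The design point is that the counting logic stays trivial because the linguistic judgement is made once, during transcription, where the clinician can see the context; the script only tallies what the transcriber marked.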

Free tools and reference pages

Every link in this guide stays on conductscience.com. Open any tool in a new tab and come back here for context.

Frequently asked questions

How long does a clinically valid language sample need to be?
50-100 utterances is the consensus minimum across Owens (2014), Pavelko & Owens (2017), and Heilmann et al. (2010). Below 50 utterances, MLU is unstable and NDW is biased downward by sample length. School clinicians targeting 50 utterances usually need 20-25 minutes of recorded conversation.
Do I need SALT to do Language Sample Analysis?
No. SALT is the most full-featured corpus tool, but it is not required for a defensible LSA. The ConductScience MLU calculator, lexical diversity calculator, PGU calculator, and DSS calculator together cover the metric battery used in 90% of school-based reports, and all four are free and run in the browser.
Can I use LSA for bilingual children?
Yes, but not with English-only norms. Collect a sample in each language, score each separately, and compare against language-matched norms (SUGAR for English, the SUGAR Spanish supplement for Spanish, or within-child progress monitoring for languages without published norms). Pillar #8 in this series covers bilingual LSA in detail.
What is the difference between MLU in morphemes and MLU in words?
MLU-morphemes counts every meaningful unit (so 'walked' = 2 morphemes: walk + -ed past), and is more sensitive to early grammatical development. MLU-words counts whitespace-separated words and is easier to compute. For children below MLU 4.0, report MLU-morphemes; above MLU 4.0 the two metrics correlate at >0.95 and either is fine.
How often should I re-run an LSA on the same student?
Quarterly progress monitoring is standard for IDEA-eligible students with active expressive-language goals. A 25-utterance probe is enough for progress monitoring once a baseline 50-utterance sample is on file. Re-run the full 50-utterance battery at the annual IEP review.
What does ConductSpeech actually do that the free tools do not?
ConductSpeech automates the transcription step. The free calculators on this site assume you already have a typed transcript; ConductSpeech accepts the raw audio recording and emits the transcript, the metric battery, and a draft present-levels paragraph in one pass. The free tools remain free and unchanged — ConductSpeech is the time-saver for clinicians on a 50-student caseload.

References

  1. Brown, R. (1973). A First Language: The Early Stages. Cambridge, MA: Harvard University Press.
  2. Owens, R. E. (2014). Language Disorders: A Functional Approach to Assessment and Intervention (6th ed.). Boston, MA: Pearson.
  3. Pavelko, S. L., & Owens, R. E. (2017). Sampling Utterances and Grammatical Analysis Revised (SUGAR): New normative values for language sample analysis measures. Language, Speech, and Hearing Services in Schools, 48(3), 197-215.
  4. Heilmann, J., Nockerts, A., & Miller, J. F. (2010). Language sampling: Does the length of the transcript matter? Language, Speech, and Hearing Services in Schools, 41(4), 393-404.
  5. Miller, J. F., & Iglesias, A. (2008). Systematic Analysis of Language Transcripts (SALT) [Computer software]. SALT Software, LLC.
  6. Eisenberg, S. L., & Guo, L. (2013). Differentiating children with and without language impairment based on grammaticality. Language, Speech, and Hearing Services in Schools, 44(1), 20-31.
  7. Loban, W. (1976). Language development: Kindergarten through grade twelve (NCTE Research Report No. 18). Urbana, IL: National Council of Teachers of English.
  8. Scarborough, H. S. (1990). Index of Productive Syntax. Applied Psycholinguistics, 11(1), 1-22.
  9. Lee, L. L. (1974). Developmental Sentence Analysis. Evanston, IL: Northwestern University Press.
  10. U.S. Department of Education. (2017). Individuals with Disabilities Education Act, 34 CFR §300.304 — Evaluation procedures.

This article is a clinical reference, not a substitute for individual clinical judgement. Clinicians must adapt every recommendation to the individual student and to the current edition of any cited instrument manual.

Automate this workflow

Run a complete Language Sample Analysis from raw audio

ConductSpeech automates transcription, scoring, and report drafting so you can spend the saved time on therapy. Built for SLPs who already know how to do an LSA by hand.

Automate this with ConductSpeech