Moving-Average Type-Token Ratio (MATTR)
MATTR computes TTR across a sliding window and removes the sample-length confound that makes raw TTR unreliable.
What MATTR measures
Moving-Average Type-Token Ratio, introduced by Covington and McFall in 2010, computes TTR across a sliding window of fixed size (typically 50 or 100 tokens) and then averages across all windows in the sample. Because every window is the same length, the resulting average does not depend on how long the overall sample is, so the single biggest criticism of traditional TTR disappears. MATTR is now widely used in automated language sample analysis (LSA) tools and in corpus-linguistic research.
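The windowed averaging described above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation from Covington and McFall: the function name mattr, its window parameter, and the fall-back to plain TTR for samples shorter than one window are all assumptions made for this example. It expects a list of tokens that have already been transcribed and normalised; lower-casing and root-word handling are left to your own transcription conventions.

```python
from collections import Counter


def mattr(tokens, window=50):
    """Mean TTR over every length-`window` slice, advancing one token at a time."""
    if window <= 0:
        raise ValueError("window must be positive")
    n = len(tokens)
    if n == 0:
        raise ValueError("need at least one token")
    if n < window:
        # Assumption for this sketch: a sample shorter than one window
        # falls back to plain TTR rather than raising an error.
        return len(set(tokens)) / n

    # Count the types in the first window, then slide one token at a time,
    # updating the counts instead of rebuilding a set for every window.
    counts = Counter(tokens[:window])
    total = len(counts) / window
    for i in range(window, n):
        leaving = tokens[i - window]
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]  # keep len(counts) equal to the number of types
        counts[tokens[i]] += 1
        total += len(counts) / window

    # A sample of n tokens contains n - window + 1 windows.
    return total / (n - window + 1)
```

For example, mattr("a b a b a c".split(), window=3) averages the four window TTRs 2/3, 2/3, 2/3 and 1, which gives 0.75.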
Formula
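MATTR = mean of TTR_i over all length-W sliding windows, where W is the window size and TTR_i is the type-token ratio of the i-th window. The window advances one token at a time, so a sample of N tokens yields N - W + 1 windows.
Normative ranges and benchmarks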
- W = 50, age 5;0 — mean 0.72 (typical range 0.68 – 0.78)
- W = 50, age 8;0 — mean 0.77 (typical range 0.72 – 0.82)
- W = 50, age 12;0 — mean 0.80 (typical range 0.76 – 0.84)
- Window sizes below 25 tokens give noisy estimates; above 100 tokens give diminishing returns
- MATTR is stable across samples of 100 to 1000 tokens, making it ideal for cross-child comparison
Normative bands are central estimates drawn from the cited literature. Individual variation is wide — always cross-reference against the source paper and your assessment's own manual before quoting a cut-score in a report.
Clinical use
MATTR earns its place in a report on the days you have samples of wildly different lengths: one child produced 80 utterances at a picnic, the next produced 280 at the art table, and you still want to compare their lexical diversity honestly. Because MATTR is averaged across same-size windows, it behaves itself even on uneven samples. For day-to-day clinical practice the interpretation is the same as TTR (higher is more diverse), but a score can be placed alongside published values computed at the same window size without the "compared to what?" asterisk. Clinicians should set their window to 50 tokens for typical clinic-length samples and to 100 tokens for corpus-level data.
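The uneven-sample point can be illustrated with a small Python sketch. Everything here is synthetic and assumed for the demonstration: the 200-word vocabulary, the fixed random seed, and the helper names ttr and mattr are not taken from any published tool. Words are drawn uniformly at random, which is not a model of child language; it simply keeps the "speaker" constant while the sample length changes.

```python
import random


def ttr(tokens):
    """Plain type-token ratio: distinct tokens divided by total tokens."""
    return len(set(tokens)) / len(tokens)


def mattr(tokens, window=50):
    """Mean TTR over all length-`window` slices (needs len(tokens) >= window)."""
    starts = range(len(tokens) - window + 1)
    return sum(len(set(tokens[i:i + window])) / window for i in starts) / len(starts)


# Synthetic "speaker": the same 200-word vocabulary, sampled for 600 tokens.
rng = random.Random(0)
vocab = [f"word{i}" for i in range(200)]
long_sample = rng.choices(vocab, k=600)
short_sample = long_sample[:150]  # the same speaker, stopped early

for name, sample in [("short", short_sample), ("long", long_sample)]:
    print(f"{name:5s} n={len(sample):3d}  TTR={ttr(sample):.2f}  MATTR={mattr(sample):.2f}")
```

On data like this, raw TTR drops sharply as the sample grows, because new word types become rarer the longer the speaker talks, while the two MATTR values stay close together.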
“MATTR is what happens when a linguist does the statistics homework TTR skipped. For about forty lines of Python you get a number you can actually put into a report without the committee asking about sample length.”
Free tools that compute MATTR
Lexical Diversity Calculator
Paste a language sample and get type-token ratio (TTR), number of different words in the first 100 tokens (NDW-100, Miller 1981), and NDW per 50 utterances (NDW-50, SUGAR). Implements the standard SALT/SUGAR tokenisation rules and runs entirely in your browser.
MLU Calculator
Paste a language sample and get Mean Length of Utterance in morphemes and words, total utterances, total morphemes, and the matching Brown's stage. Implements Brown (1973) morpheme counting rules and runs entirely in your browser.
Language Sample Worksheet
Free printable and fillable language sample analysis worksheet for speech-language pathologists. Five columns (utterance #, transcription, morpheme count, grammatical Y/N, notes), configurable row count up to 100 utterances, browser print produces a clean PDF, and an inline running summary tracks total utterances, total morphemes, and rolling MLU as you fill it in.
Related LSA metrics
Type-Token Ratio (TTR)
TTR divides unique word roots by total words to index lexical diversity — quick to compute but highly sensitive to sample length.
Number of Different Words (NDW)
NDW counts unique word roots across a fixed sample length and is the most stable lexical-diversity measure for school-age children.
Vocabulary Diversity (VOCD / D-measure)
VOCD fits a mathematical curve to TTR values at many sub-sample sizes to produce a single length-invariant diversity score known as D.
References
- Covington, M. A., & McFall, J. D. (2010). Cutting the Gordian knot: The moving-average type-token ratio (MATTR). Journal of Quantitative Linguistics, 17(2), 94–100.
- Fergadiotis, G., Wright, H. H., & West, T. M. (2013). Measuring lexical diversity in narrative discourse. American Journal of Speech-Language Pathology, 22(2), S397–S408.
- Kyle, K. (2016). Measuring syntactic development in L2 writing: Fine grained indices of syntactic complexity and usage-based indices of syntactic sophistication. Georgia State University dissertation.