CAPE-V Voice Rating.

Rate the six perceptual parameters of the Consensus Auditory-Perceptual Evaluation of Voice — Overall Severity, Roughness, Breathiness, Strain, Pitch, and Loudness — on the 100 mm visual analog scale as published by Kempster et al. (2009), with instant classification against the Kempster anchor bands (WNL, mild MI, moderate MO, severe SE) and copy-paste chart note output. Built for voice clinic intake, voice therapy progress visits, pre- and post-laryngology surgical follow-up, and SLP graduate training.

Private: Data stays in your browser
Live: No sign-up required
Validated: 2026-04-06
Citable: Methods and citation included

Calculator

Rate six CAPE-V perceptual parameters on the 100 mm visual analog scale

The Consensus Auditory-Perceptual Evaluation of Voice (Kempster et al. 2009) rates six perceptual parameters — Overall Severity, Roughness, Breathiness, Strain, Pitch, and Loudness — on a 100 mm visual analog scale. Slide each parameter to the clinician-rated mm position, mark consistent (C) or intermittent (I), and — for pitch and loudness — indicate too high or too low. The tool classifies each parameter against the Kempster anchor bands and returns a copy-paste clinical verdict line.

  1. Overall Severity

    Global, integrated impression of voice deviation: the single clinician judgment that anchors the overall CAPE-V verdict.

    Scale anchors: 0 (none) · 35 (MI ceiling) · 60 (MO ceiling) · 100 (SE)
  2. Roughness

    Perceived irregularity in the voicing source: a raspy, grating, or aperiodic quality most audible on sustained vowels.

    Scale anchors: 0 (none) · 35 (MI ceiling) · 60 (MO ceiling) · 100 (SE)
  3. Breathiness

    Audible escape of air through an incompletely adducted glottis: a noisy, whispered, or hypofunctional quality.

    Scale anchors: 0 (none) · 35 (MI ceiling) · 60 (MO ceiling) · 100 (SE)
  4. Strain

    Perceived effort: a squeezed, hyperfunctional, or pressed quality suggestive of elevated intrinsic laryngeal musculature tension.

    Scale anchors: 0 (none) · 35 (MI ceiling) · 60 (MO ceiling) · 100 (SE)
  5. Pitch

    Perceived deviation of fundamental frequency from age-, sex-, and culture-expected norms. Mark direction (too high or too low) as well as magnitude.

    Scale anchors: 0 (none) · 35 (MI ceiling) · 60 (MO ceiling) · 100 (SE)
  6. Loudness

    Perceived deviation of vocal intensity from culturally expected norms. Mark direction (too high or too low) as well as magnitude.

    Scale anchors: 0 (none) · 35 (MI ceiling) · 60 (MO ceiling) · 100 (SE)
Slide each parameter to the clinician-rated VAS position to see the CAPE-V severity verdict and the per-parameter bands.
CAPE-V severity bands (Kempster et al. 2009)

| Band | VAS range | Clinical note |
| --- | --- | --- |
| Severe (SE) | 61 - 100 mm | Severe deviation from normal voice quality on the CAPE-V visual analog scale. The perceptual feature is consistently and markedly atypical in connected speech and sustained vowels. Expect functional, occupational, and social voice-use impact. Prioritise laryngology consultation and standard voice therapy; pair with a stroboscopic laryngeal examination and an acoustic / aerodynamic battery. |
| Moderate (MO) | 36 - 60 mm | Moderate deviation from normal voice quality. Regular perceptual abnormality heard in connected speech, with measurable impact on listener perception and on the patient-reported voice handicap. Voice therapy is indicated; pair with the VHI-10, stroboscopic laryngeal examination, and acoustic (jitter, shimmer, CPP) and aerodynamic (MPT, s/z ratio) measures. |
| Mild (MI) | 10 - 35 mm | Mild deviation from normal voice quality. The perceptual feature is detectable but does not dominate the percept. Voice therapy is appropriate especially for occupational voice users; re-rate with the CAPE-V at every progress visit and anchor against the patient-reported VHI-10 and a stroboscopic exam. |
| Within normal limits (WNL) | 0 - 9 mm | Within normal limits on the CAPE-V visual analog scale. No clinically meaningful deviation for this parameter. Document the score and re-rate at the next progress visit; a score at or near 0 mm on every parameter is the intended post-therapy discharge target for most voice rehabilitation cases. |

Boundary rule: 0 - 9 mm is within normal limits. 10 - 35 mm is mildly deviant (MI). 36 - 60 mm is moderately deviant (MO). 61 - 100 mm is severely deviant (SE). Source: Kempster GB, Gerratt BR, Verdolini Abbott K, Barkmeier-Kraemer J, Hillman RE (2009) Consensus Auditory-Perceptual Evaluation of Voice: Development of a Standardized Clinical Protocol. American Journal of Speech-Language Pathology 18(2):124-132.
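The boundary rule above can be sketched as a small lookup. This is a minimal Python illustration, not the calculator's actual code: the function name `cape_v_band` is hypothetical, and assigning a fractional mark that falls between two whole-millimetre bands (for example 9.5 mm) to the lower band is an assumption the source does not specify.

```python
def cape_v_band(mm: float) -> str:
    """Map a 0-100 mm VAS rating to the band labels used on this page."""
    if not 0 <= mm <= 100:
        raise ValueError("CAPE-V ratings must lie between 0 and 100 mm")
    # Assumption: marks between whole-mm bands fall into the lower band.
    if mm < 10:
        return "WNL"  # 0 - 9 mm, within normal limits
    if mm < 36:
        return "MI"   # 10 - 35 mm, mildly deviant
    if mm < 61:
        return "MO"   # 36 - 60 mm, moderately deviant
    return "SE"       # 61 - 100 mm, severely deviant


if __name__ == "__main__":
    for score in (0, 9, 10, 35, 36, 60, 61, 100):
        print(score, cape_v_band(score))
```

Checking the values on either side of each cut-off, as the loop does, is a quick way to confirm that an implementation agrees with the published ranges.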


When to use

  • Voice clinic intake — rate the CAPE-V alongside the VHI-10, stroboscopic exam, and acoustic / aerodynamic battery
  • Voice therapy progress visits — re-rate every 2 - 4 weeks and track Overall Severity change across visits
  • Pre- and post-surgical laryngology follow-up — compare CAPE-V scores before and after vocal-fold lesion excision, injection laryngoplasty, or thyroplasty
  • Occupational voice user screening — teachers, singers, broadcasters, attorneys, clergy, fitness instructors
  • Parkinson disease hypophonia baseline and LSVT LOUD progress tracking
  • Gender-affirming voice therapy — rate the CAPE-V pitch and loudness parameters across treatment
  • SLP graduate training in voice evaluation — practise CAPE-V rating against expert anchor cases
  • Dysphonia outcome research — the CAPE-V is the U.S. standard perceptual outcome measure
  • Second opinion — re-rate the CAPE-V to confirm a prior rating before revising the treatment plan
  • Chart-review audit — re-rate archived voice samples to verify inter-rater agreement across a clinic team

Do not use for

  • As a substitute for a stroboscopic laryngeal examination — perceptual rating cannot identify structural lesions
  • As the only outcome measure — always pair with the VHI-10 and acoustic / aerodynamic measures
  • On single isolated stimuli — the Kempster protocol requires the full stimulus set of sustained vowels, six CAPE-V sentences, and 20 seconds of spontaneous speech
  • On children under roughly age 5 without age-appropriate stimulus adaptation — the standard CAPE-V sentences are calibrated to school-age and adult speakers
  • For languages other than English without linguistic adaptation — Spanish, French, Portuguese, and Arabic CAPE-V adaptations exist in the literature and should be used for speakers of those languages

Overall Severity is not the mean of the other five parameters

Overall Severity is a separate integrated clinician impression that captures "something is off" cues that do not map cleanly onto any single feature. Expect Overall Severity to run slightly higher than the mean of Roughness, Breathiness, Strain, Pitch, and Loudness in most clinical cases. Rate it last, after the other five, as a deliberate integrated summary judgment.

Rate the C / I qualifier for every non-zero parameter

Consistent (C) vs. intermittent (I) does not change the mm score but does change the clinical interpretation. Intermittent mild breathiness often reflects phonotrauma and responds quickly to voice therapy. Consistent moderate breathiness often reflects a structural lesion and warrants laryngoscopy first. Mark C or I for every parameter with a non-zero rating.

Pitch and loudness need a direction qualifier

Pitch can be too high (puberphonia, anxiety) or too low (Reinke oedema, mass lesion). Loudness can be too high (hyperfunction) or too low (Parkinson hypophonia). The direction changes the diagnostic differential completely, so always mark the direction alongside the mm score.
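The two qualifier rules above (C / I for every non-zero parameter, and a direction for pitch and loudness) lend themselves to a simple completeness check before a rating is charted. The sketch below is illustrative only. The `ParameterRating` class, its field names, and the message wording are hypothetical, and requiring a direction only when the pitch or loudness rating is non-zero is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional

DIRECTIONAL = {"Pitch", "Loudness"}  # parameters that carry a high / low direction


@dataclass(frozen=True)
class ParameterRating:
    name: str
    mm: int
    consistency: Optional[str] = None  # "C" (consistent) or "I" (intermittent)
    direction: Optional[str] = None    # "high" or "low"

    def problems(self) -> List[str]:
        """Return a list of missing or invalid qualifiers (empty when complete)."""
        issues = []
        if not 0 <= self.mm <= 100:
            issues.append(f"{self.name}: rating must be 0 to 100 mm")
        if self.mm > 0 and self.consistency not in ("C", "I"):
            issues.append(f"{self.name}: mark consistent (C) or intermittent (I)")
        if self.name in DIRECTIONAL:
            # Assumption: a direction is only required for a non-zero rating.
            if self.mm > 0 and self.direction not in ("high", "low"):
                issues.append(f"{self.name}: mark direction as too high or too low")
        elif self.direction is not None:
            issues.append(f"{self.name}: direction applies only to Pitch and Loudness")
        return issues


if __name__ == "__main__":
    print(ParameterRating("Pitch", 40, "C").problems())
    print(ParameterRating("Loudness", 30, "I", "low").problems())
```

Returning a list of problems rather than raising on the first one lets a form show every missing qualifier at once.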

The smallest meaningful change is about 10 mm

Zempleni et al. (2018) validated the smallest clinically meaningful change on the CAPE-V Overall Severity VAS at approximately 10 mm. Drops smaller than 10 mm between consecutive progress visits should be interpreted as measurement noise rather than as real therapy progress. A 10 mm or larger drop is a real change and supports continued therapy.
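As a rough illustration of the 10 mm rule, the comparison between two consecutive Overall Severity ratings might look like the sketch below. The function name and the returned labels are hypothetical, and treating a rise of 10 mm or more as a real worsening is a symmetric assumption; the text above only discusses drops.

```python
MEANINGFUL_CHANGE_MM = 10  # approximate threshold quoted in the text above


def interpret_change(previous_mm: float, current_mm: float,
                     threshold: float = MEANINGFUL_CHANGE_MM) -> str:
    """Classify the change in Overall Severity between two consecutive visits."""
    delta = current_mm - previous_mm
    if delta <= -threshold:
        return "improved"  # a drop of at least the threshold counts as real change
    if delta >= threshold:
        return "worsened"  # assumption: the threshold is applied symmetrically
    return "within measurement noise"


if __name__ == "__main__":
    print(interpret_change(45, 35))  # a 10 mm drop
    print(interpret_change(45, 38))  # a 7 mm drop
```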

Rate the full stimulus set, not the worst moment

The clinician rating is an integrated judgment across the three stimulus types — sustained vowels, six CAPE-V sentences, and 20 seconds of spontaneous speech. Do not anchor the mm score to the single worst or single best moment. Rate the overall percept across the full Kempster stimulus set.

Method

The calculator implements the Consensus Auditory-Perceptual Evaluation of Voice as published in Kempster GB, Gerratt BR, Verdolini Abbott K, Barkmeier-Kraemer J, Hillman RE (2009) "Consensus Auditory-Perceptual Evaluation of Voice: Development of a Standardized Clinical Protocol" American Journal of Speech-Language Pathology 18(2):124-132. Six perceptual parameters are rated on a continuous 100 mm visual analog scale: Overall Severity, Roughness, Breathiness, Strain, Pitch, and Loudness. Each parameter is classified against the Kempster anchor bands published on the printed CAPE-V form (0 - 9 mm within normal limits, 10 - 35 mm mildly deviant MI, 36 - 60 mm moderately deviant MO, 61 - 100 mm severely deviant SE). These boundaries have been validated against expert-rater judgment and acoustic measures (CPP, jitter, shimmer) by Helou et al. (2010) American Journal of Speech-Language Pathology 19(3):248-258 and Zempleni et al. (2018) Folia Phoniatrica et Logopaedica 70(1):1-13. Consistent (C) / intermittent (I) qualifiers and pitch / loudness direction (high / low) are captured on the printed Kempster form and are retained in the UI here. The Overall Severity band is reported as the primary CAPE-V verdict; the mean of the six parameters is reported as a secondary cross-check. For the full clinical report-generation workflow that writes the voice clinic report text itself from uploaded audio, ConductSpeech is the companion platform linked from this page.
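The reporting logic described in the Method paragraph (Overall Severity band as the primary verdict, six-parameter mean as a secondary cross-check) could be sketched as follows. This is an assumption-laden illustration rather than the tool's real output: the function names, the dictionary layout, and the wording of the summary line are all hypothetical.

```python
from statistics import mean

PARAMETERS = ("Overall Severity", "Roughness", "Breathiness",
              "Strain", "Pitch", "Loudness")


def band(mm: float) -> str:
    """Band lookup using the ranges quoted in the Method paragraph."""
    if mm < 10:
        return "WNL"
    if mm < 36:
        return "MI"
    if mm < 61:
        return "MO"
    return "SE"


def summarise(ratings: dict) -> str:
    """Build a one-line summary: primary verdict plus six-parameter mean."""
    missing = [p for p in PARAMETERS if p not in ratings]
    if missing:
        raise ValueError(f"missing ratings: {', '.join(missing)}")
    overall = ratings["Overall Severity"]
    six_mean = mean(ratings[p] for p in PARAMETERS)
    # Hypothetical chart-note wording; the real tool's format is not shown in the source.
    return (f"CAPE-V Overall Severity {overall} mm ({band(overall)}); "
            f"six-parameter mean {six_mean:.1f} mm ({band(six_mean)}).")


if __name__ == "__main__":
    print(summarise({"Overall Severity": 42, "Roughness": 38, "Breathiness": 20,
                     "Strain": 45, "Pitch": 5, "Loudness": 12}))
```

In this example the primary verdict and the mean land in different bands, which is the kind of discrepancy a secondary cross-check is meant to surface.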

Validated

Last validated 2026-04-06. Calculations are designed for planning and documentation support; verify procurement decisions against manufacturer specifications or institutional SOPs.

How to cite

ConductScience CAPE-V Voice Rating (v1.0). ConductScience, Inc. 2026. Available at: https://conductscience.com/tools/cape-v-voice-rating

Kempster GB, Gerratt BR, Verdolini Abbott K, Barkmeier-Kraemer J, Hillman RE. Consensus Auditory-Perceptual Evaluation of Voice: Development of a Standardized Clinical Protocol. American Journal of Speech-Language Pathology. 2009;18(2):124-132. doi:10.1044/1058-0360(2008/08-0017)

Helou LB, Solomon NP, Henry LR, Coppit GL, Howard RS, Stojadinovic A. The role of listener experience on the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) ratings of postthyroidectomy voice. American Journal of Speech-Language Pathology. 2010;19(3):248-258. doi:10.1044/1058-0360(2010/09-0012)

Zempleni MZ, Zumbach J, Schober J, Neuschaefer-Rube C. Smallest detectable change and minimal clinically important change for the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V). Folia Phoniatrica et Logopaedica. 2018;70(1):1-13. doi:10.1159/000488244

Nemr K, Simões-Zenari M, Cordeiro GF, Tsuji D, Ogawa AI, Ubrig MT, Menezes MHM. GRBAS and Cape-V scales: high reliability and consensus when applied at different times. Journal of Voice. 2012;26(6):812.e17-812.e22. doi:10.1016/j.jvoice.2012.03.005

Rosen CA, Lee AS, Osborne J, Zullo T, Murry T. Development and validation of the Voice Handicap Index-10. Laryngoscope. 2004;114(9):1549-1556. doi:10.1097/00005537-200409000-00009

Hirano M. Clinical Examination of Voice. Vienna: Springer-Verlag; 1981.

Maryn Y, Corthals P, Van Cauwenberge P, Roy N, De Bodt M. Toward improved ecological validity in the acoustic measurement of overall voice quality: combining continuous speech and sustained vowels. Journal of Voice. 2010;24(5):540-555. doi:10.1016/j.jvoice.2008.12.014

American Speech-Language-Hearing Association. Practice Portal: Voice Disorders. 2024. Available at: https://www.asha.org/practice-portal/

Why Perceptual Rating Still Matters in Voice Evaluation

Voice quality is a perceptual phenomenon. The listener's ear is the final arbiter of whether a patient sounds rough, breathy, strained, too high, too low, too loud, or too soft, and the acoustic and aerodynamic measures that clinicians use alongside the perceptual rating (jitter, shimmer, CPP, MPT, s/z ratio, electroglottography) are ultimately validated against the perceptual judgment of expert clinicians. The American Speech-Language-Hearing Association Special Interest Group 3 (Voice and Voice Disorders) convened the consensus conference that published the CAPE-V because the earlier GRBAS scale (Hirano 1981) had documented test-retest and inter-rater reliability problems, an ordinal scale with only four discrete levels, and no standardised stimulus set. The CAPE-V was designed to fix each of those problems.

Continuous scale. The 100 mm visual analog scale is a continuous measurement — the clinician can mark anywhere along the line — which means the CAPE-V captures smaller changes than an ordinal scale. Zempleni et al. (2018) and Helou et al. (2010) showed the smallest clinically meaningful change on CAPE-V Overall Severity is approximately 10 mm, which is roughly a 0.3 - 0.5 point change on the older GRBAS 0 - 3 scale and is therefore below the GRBAS resolution limit.
Standardised stimulus set. The Kempster (2009) protocol specifies sustained vowels, six CAPE-V sentences, and 20 seconds of spontaneous speech. Every clinician rates the same stimulus set so the ratings are directly comparable across visits, across clinicians, and across clinics.
Explicit anchor bands. The MI (mildly deviant, 10 - 35 mm), MO (moderately deviant, 36 - 60 mm), and SE (severely deviant, 61 - 100 mm) labels on the printed form give every clinician the same mental reference frame for every mm position.

Interpreting the Six CAPE-V Parameters

Overall Severity. The integrated clinician judgment of how dysphonic the voice sounds as a whole. This single parameter anchors the overall verdict of the CAPE-V. It is not the arithmetic mean of the other five parameters — it is a separate integrated impression that the clinician forms across the sustained vowels, the six sentences, and the spontaneous speech. Many clinicians find that Overall Severity is often slightly higher than the mean of the other five parameters because it captures "something is off" cues that do not map cleanly onto any single feature.
Roughness. Perceived irregularity in the voicing source — a raspy, grating, aperiodic quality most audible on sustained vowels. Roughness is acoustically correlated with elevated jitter, elevated shimmer, and reduced cepstral peak prominence (CPP). Common drivers include vocal-fold mass lesions (nodules, polyps, cysts, leukoplakia), surface scarring, and Reinke oedema.
Breathiness. Audible escape of air through an incompletely adducted glottis — a noisy, whispered, or hypofunctional quality. Breathiness is acoustically correlated with elevated noise-to-harmonic ratio and reduced H1-H2. Common drivers include vocal-fold paresis or paralysis, age-related glottal insufficiency (presbyphonia), post-surgical glottal gap, and psychogenic or functional aphonia.
Strain. Perceived effort — a squeezed, hyperfunctional, pressed quality. Strain is the hallmark perceptual feature of primary muscle tension dysphonia and is the parameter most responsive to voice therapy with resonant voice, flow-phonation, and manual laryngeal release techniques. Strain is also elevated in adductor spasmodic dysphonia, but on a CAPE-V rating it has a distinctive "squeeze-and-release" pattern rather than the consistent elevation of muscle tension dysphonia.
Pitch. Perceived deviation of fundamental frequency from age-, sex-, and culture-expected norms. Mark direction (too high or too low) on the printed form. Pitch too high is common in puberphonia, falsetto-register fixation, and anxiety-related hyperfunction. Pitch too low is common in Reinke oedema, vocal-fold mass lesions, and testosterone-related pitch lowering in gender-affirming care.
Loudness. Perceived deviation of vocal intensity from culturally expected norms. Mark direction on the printed form. Loudness too high is common in hyperfunctional dysphonia, occupational voice over-use, and hearing-impaired speakers. Loudness too low is the hallmark perceptual feature of Parkinson disease hypophonia and responds robustly to Lee Silverman Voice Treatment (LSVT LOUD).

Pair CAPE-V with the Rest of the Voice Battery

The CAPE-V is one component of a complete voice evaluation. The Kempster (2009) paper and the ASHA Practice Portal on voice disorders both stress that the perceptual rating should be interpreted alongside the patient-reported voice handicap, the laryngoscopic exam, and a set of acoustic and aerodynamic measures.

Patient-reported voice handicap (VHI-10 or VHI-30). Rate the perceptual CAPE-V alongside the patient-reported Voice Handicap Index-10 (Rosen et al. 2004) at every voice clinic intake and at every progress visit. The two measures answer different questions: the CAPE-V asks how the voice sounds to a trained listener, while the VHI-10 asks how the voice problem affects the patient. They can diverge, especially in occupational voice users (teachers, singers, broadcasters, attorneys, clergy, fitness instructors) who report high handicap on mildly deviant voices and in patients with long-standing dysphonia who have habituated to a severe voice and report low handicap.
Stroboscopic laryngeal examination. Every patient with a CAPE-V Overall Severity above the within-normal-limits band should have a stroboscopic laryngeal examination at intake. Moderate and severe CAPE-V scores are typically associated with visible structural or neurogenic findings — nodules, polyps, cysts, Reinke oedema, scarring, paresis, or paralysis — and the laryngoscopic exam is the single most important diagnostic step for ruling out lesions that require surgery before therapy.
Acoustic battery. Jitter, shimmer, cepstral peak prominence (CPP), noise-to-harmonic ratio (NHR), and the Acoustic Voice Quality Index (AVQI, Maryn et al. 2010) all correlate with specific CAPE-V parameters and provide an objective cross-check on the perceptual rating.
Aerodynamic battery. Maximum phonation time, s/z ratio, mean phonatory airflow, and subglottal pressure estimates complete the workup. A short MPT (< 15 seconds in adults) paired with an elevated CAPE-V breathiness score suggests glottal insufficiency; a short MPT paired with an elevated strain score suggests hyperfunction against a hypoadducted glottis.

Tracking Voice Therapy Progress with Serial CAPE-V

The CAPE-V is the single best way to track voice therapy progress in a voice clinic. Rate the CAPE-V at intake and at every progress visit, typically every 2 - 4 weeks, and plot the Overall Severity score across visits. The clinically meaningful change (Zempleni et al. 2018) is approximately 10 mm on the Overall Severity VAS, so a 10 mm or larger drop between consecutive visits is a real change; smaller drops should be interpreted as noise.

Discharge criteria. Most U.S. voice clinics use a discharge threshold of CAPE-V Overall Severity below 10 mm (within normal limits) alongside a VHI-10 below 11 and a patient-reported return to occupational voice use. Discharge criteria should be individualised for occupational voice users (teachers, broadcasters, singers) who require a CAPE-V Overall Severity below 9 mm for the voice to meet the demands of their job.
Non-responders. A patient whose CAPE-V Overall Severity fails to drop by at least 10 mm across four progress visits is a non-responder. Re-evaluate the diagnosis, re-image the larynx, refer to laryngology for a second opinion, and consider whether the primary driver is structural (requires surgery), neurogenic (requires a different therapy approach such as LSVT LOUD for Parkinson disease hypophonia), or psychogenic (requires a referral to the voice clinic psychologist).
Documentation. Paste the CAPE-V mm scores and the Kempster anchor band into every voice-therapy progress note. The chart note should reference the CAPE-V Overall Severity score explicitly, the direction and magnitude of change since the previous visit, and the continued or revised voice therapy plan.
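The discharge and non-responder rules above can be expressed as two small checks over a series of Overall Severity ratings. This is a sketch under stated assumptions, not a clinical decision tool. The function names are hypothetical, the series is assumed to start with the intake rating followed by progress visits in order, and "fails to drop by at least 10 mm across four progress visits" is read here as comparing intake with the fourth progress visit.

```python
from typing import Sequence


def meets_discharge(overall_mm: float, vhi10: int,
                    returned_to_voice_use: bool) -> bool:
    """Thresholds quoted above: Overall Severity < 10 mm and VHI-10 < 11."""
    return overall_mm < 10 and vhi10 < 11 and returned_to_voice_use


def is_non_responder(overall_series: Sequence[float],
                     visits: int = 4, min_drop: float = 10) -> bool:
    """overall_series[0] is intake; later entries are progress visits in order."""
    if len(overall_series) < visits + 1:
        return False  # too few progress visits to judge yet
    baseline = overall_series[0]
    at_cutoff = overall_series[visits]  # rating at the fourth progress visit
    # Assumption: the drop is measured from intake to the cutoff visit.
    return baseline - at_cutoff < min_drop


if __name__ == "__main__":
    print(is_non_responder([55, 52, 50, 51, 49]))  # 6 mm drop over four visits
    print(is_non_responder([55, 50, 44, 40, 38]))  # 17 mm drop over four visits
    print(meets_discharge(8, 9, True))
```

A clinic with a stricter target for occupational voice users could pass a different threshold rather than rely on the defaults shown here.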
