A Sophisticated Suite of Tests, Validated by Industry Studies
The Versant suite of tests is a set of sophisticated, reliable scoring methods. Test validations are well documented, so users can be confident that the technology is accurate, specific, and detailed.
To measure the accuracy of Versant’s automated scoring, studies have compared automatically generated scores with scores assigned by professional human raters. A high correlation between the two sets of scores suggests that machine scoring closely matches the scores that human experts produce.
Using human transcriptions and human ratings of tests, researchers analyzed and scored test-taker responses without automatic speech processing technologies. They then compared the human-generated scores with the machine-generated scores to determine the accuracy of the automatic scoring.
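For readers who want to see what such a comparison involves, here is a minimal sketch of a Pearson correlation between human and machine scores. The source does not name the correlation statistic used in the studies, so Pearson's r is an assumption, and the function name and score values are hypothetical, not Versant data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative scores for six test-takers (assumption, not real data).
human_scores = [42, 55, 61, 48, 70, 66]
machine_scores = [44, 53, 63, 47, 72, 64]

r = pearson_r(human_scores, machine_scores)
print(f"human vs. machine correlation: r = {r:.2f}")
```

A value of r close to 1 means the machine scores rise and fall with the human scores. It does not by itself show that the two sets of scores are on the same scale.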
Scoring from the Versant Assessment Test is free of the bias and fatigue that can affect human judgment. The automated scoring process is fair and virtually error-free, delivering accurate, reliable results.
Data Points Point To More Accurate Scoring
Scatter plots of the final machine-generated scores against the human-generated scores show the correlation for the final score and for each sub-score. The data points follow a clear linear trend, indicating that across the evaluated cases the machine-generated scores closely align with those from trained transcribers and expert raters.
Proven Reliability and Integrity
Aside from reliability, the other indicators of a test’s quality are the correlations among its scores and the validity of the interpretations derived from those scores. For VAET specifically, the pronunciation sub-score may correlate highly with the fluency sub-score, although in some instances it may not correlate as highly with the vocabulary, comprehension, and interactions sub-scores.
Pronunciation and fluency are both related to acoustic characteristics of the test-taker’s responses, whereas vocabulary, comprehension, and interactions are more related to the content aspect of the test-taker’s responses. Moreover, vocabulary is important to produce appropriate and informative content, as measured in interactions.
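The pattern described above can be illustrated with a small sketch that computes pairwise correlations among sub-scores. The sub-score values are made up so that pronunciation and fluency move together while vocabulary does not. They are not Versant data, and the use of Pearson's r is an assumption.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up sub-scores for six hypothetical test-takers (assumption, not real data).
sub_scores = {
    "pronunciation": [40, 52, 60, 47, 71, 65],
    "fluency": [42, 50, 62, 45, 73, 63],
    "vocabulary": [55, 48, 66, 60, 58, 70],
}

# Correlate every pair of sub-scores once.
pairwise = {
    (a, b): pearson_r(sub_scores[a], sub_scores[b])
    for a, b in combinations(sub_scores, 2)
}

for (a, b), r in pairwise.items():
    print(f"{a} vs. {b}: r = {r:.2f}")
```

In this toy data the two acoustic sub-scores correlate more strongly with each other than either does with the content-related sub-score, which is the kind of pattern the passage describes.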
Overall, based on very strong correlations not just for the main scores but also for each of the sub-scores, validation studies have demonstrated the reliability and integrity of Versant tests.