Reliability of English Learners’ Test Scores

The Every Student Succeeds Act requires that English Learners (ELs) be included in annual state testing (grades 3–8 and once in high school) and that their results be reported in each state’s accountability system, disaggregated by subgroup. These requirements are intended to ensure that ELs receive the support they need to learn English, participate fully in their educational experience, and graduate ready for college or career.

As more states adopt the ACT test as part of their accountability systems, one concern for educators and policymakers is whether ELs’ scores are valid and reliable indicators of their actual academic achievement. This research brief addresses the following research questions:

  1. How does the reliability of ELs’ ACT scores compare to that of non-ELs?
  2. How does the reliability of ACT scores for ELs compare to the reliability of other standardized assessment scores?
  3. How do classification consistency and differential item functioning (DIF) analyses provide additional evaluative information about score validity?

Limited English proficiency can be a source of construct-irrelevant variance (measurement error): ELs’ performance on a test may suffer because they have trouble comprehending the test content in the language in which it is presented. As a result, their scores may not reflect their true ability level, particularly on tests with a heavy reading load. This can manifest as lower scores as well as lower reliability estimates.

Limited English proficiency can also be a source of construct-relevant variance if English proficiency is part of the construct being measured (e.g., English grammar), resulting in lower scores that do accurately reflect students’ (lower) proficiency level.

Reliability is a measure of the extent to which test scores are consistent across testing conditions, such as different test items or retesting. Cronbach’s alpha is a common measure of internal consistency reliability (i.e., the extent to which students respond consistently to items sampled from the construct being measured).
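To make the internal-consistency idea concrete, Cronbach’s alpha can be computed from its standard formula, α = (k / (k − 1)) × (1 − Σ item variances / total-score variance). The sketch below is a generic illustration of that formula, not ACT’s scoring code, and the example score matrix is hypothetical:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for an (examinees x items) score matrix."""
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of examinees' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 0/1 item scores for four examinees on three items.
scores = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 1],
                   [0, 0, 0]], dtype=float)
print(cronbach_alpha(scores))  # -> 0.6
```

When examinees respond consistently across items (high inter-item covariance relative to item variances), alpha approaches 1; inconsistent responding, such as that produced by comprehension difficulties, pushes it down.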

A student who has mastered a construct should be able to consistently answer questions correctly, whereas a student who has not mastered the construct would be expected to consistently answer questions incorrectly. In contrast, an EL who knows the correct answer but is unable to comprehend the item content may produce an incorrect response that does not reflect their true knowledge.