Chapter 3

Classical Test Theory

At the heart of CTT is the idea of a true score for the construct we are measuring (e.g., math achievement) and its relationship to the observed score on a test. Three quantities are involved:

  • the observed test score
  • true score for an individual
  • random variability in the observed score caused by factors other than true ability (fatigue, stress)

The framework can't be falsified b/c the true score and the error are never observed directly: the relationship can be understood conceptually but can't be formally tested using observed data.

Error is central to the framework and comes in two types:

  1. random error - specific to a particular time, place, examinee, or assessment; it is balanced (cancels out) across these four factors
  2. systematic error - consistent across one or more of time, place, examinee, and assessment; leads to biased (upward or downward) observed scores
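
The difference between the two error types can be seen in a small simulation. This is only a sketch under stated assumptions: the true score of 80, the normally distributed random error, its spread of 5 points, and the 4-point downward bias are all made up for illustration and do not come from the text.

```python
import random
import statistics


def simulate_scores(true_score, n, bias=0.0, noise_sd=5.0, seed=0):
    """Return n observed scores: true score + systematic bias + random error.

    Assumption: random error is drawn from a normal distribution with mean 0.
    """
    rng = random.Random(seed)
    return [true_score + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]


true_score = 80.0  # hypothetical true score, unknowable in a real setting

# Random error only: individual scores scatter, but the scatter is balanced.
random_only = simulate_scores(true_score, n=10_000)

# Random error plus a hypothetical systematic error of -4 points.
with_bias = simulate_scores(true_score, n=10_000, bias=-4.0)

mean_random = statistics.fmean(random_only)
mean_biased = statistics.fmean(with_bias)

# Distance of each average from the true score.
print(round(mean_random - true_score, 2))
print(round(mean_biased - true_score, 2))
```

With random error alone the average lands close to the true score; with the added bias the average stays about 4 points too low no matter how many scores are collected.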

True score and its components

The core of CTT is the equation

X = T + E

Where
X = the observed score on the scale
T = the true score on the scale
E = error

(Equation 3.1, p. 30)

Example
  • When we obtain a score on a math test (Xi), we are really interested in the true score (Ti).
  • Ti is the mean of the theoretical distribution of observed scores that would result from assessing the same person on the same test, independently, an infinite number of times.
  • we infer Ti from Xi because we cannot test over and over again.
  • T is the expected value (population mean) of X, where the population is the set of all scores the student could theoretically obtain.
  • if a student scores 91 on a math test, the score (Xi) shows how much the student knows, as measured by an imperfect test.
    • the score is used to infer the student's understanding of math, Ti, from a single draw from the population of theoretically possible scores.
    • ... the teacher needs to rely on the next best thing, the observed score
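
The idea of Ti as the mean of many hypothetical retests can be sketched in code. Everything numeric here is an assumption for illustration: a true score of 88, normally distributed error with a spread of 4 points, and the sample sizes. In practice only a single observed score is available and Ti is never known.

```python
import random
import statistics

rng = random.Random(42)

TRUE_SCORE = 88.0  # hypothetical Ti; never observable in practice
ERROR_SD = 4.0     # hypothetical spread of the random error


def observe(rng):
    """One administration of the test: X = T + E, with E drawn at random."""
    return TRUE_SCORE + rng.gauss(0.0, ERROR_SD)


# What the teacher actually has: one observed score Xi.
single = observe(rng)
print("single observed score:", round(single, 1))

# What the definition of Ti imagines: more and more independent retests.
means_by_n = {}
for n in (10, 1_000, 100_000):
    scores = [observe(rng) for _ in range(n)]
    means_by_n[n] = statistics.fmean(scores)
    print(n, "retests, mean observed score:", round(means_by_n[n], 2))
```

As the number of imagined retests grows, the mean of the observed scores settles near the true score, which is why a single Xi is treated as an imperfect stand-in for Ti.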

Error

What constitutes an assessment?

  • an assessment can be anything from a full scale to a single item b/c all the models derived from Eq 3.1 allow us to gain a deeper understanding of the nature of the trait and of the people being assessed.

Example
  • multiple assessments of a construct on the same individuals.
  • If we have J different assessments, Eq 3.1 becomes

Xij = Tij + Eij

where

  • Xij = observed score for individual i on assessment j
  • Tij = true score for individual i on assessment j
  • Eij = error for individual i on assessment j

(Eq 3.2, p. 35)
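
A minimal sketch of Eq 3.2 as a table of scores, with one row per individual i and one column per assessment j. The true scores, the error spread, and the function name simulate_assessments are all hypothetical; the normal error distribution is an assumption, not something the text specifies.

```python
import random


def simulate_assessments(true_scores, error_sd=3.0, seed=1):
    """Build observed scores from true scores, one cell at a time.

    true_scores[i][j] is Tij. Returns (observed, errors) where
    observed[i][j] = Tij + Eij and errors[i][j] = Eij.
    Assumption: each Eij is an independent normal draw with mean 0.
    """
    rng = random.Random(seed)
    errors = [[rng.gauss(0.0, error_sd) for _ in row] for row in true_scores]
    observed = [
        [t + e for t, e in zip(t_row, e_row)]
        for t_row, e_row in zip(true_scores, errors)
    ]
    return observed, errors


# Hypothetical true scores: 2 individuals (rows) on J = 3 assessments (columns).
T = [
    [70.0, 72.0, 68.0],
    [85.0, 88.0, 90.0],
]

X, E = simulate_assessments(T)

for i, row in enumerate(X):
    for j, x_ij in enumerate(row):
        print(f"i={i} j={j} X={x_ij:.1f} T={T[i][j]:.1f} E={E[i][j]:+.1f}")
```

Each printed line is one instance of Xij = Tij + Eij. Because the true score carries the j index, it is allowed to differ from one assessment to the next for the same individual.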