Administration
Assessment of Learning
RELIABILITY AND VALIDITY
Robert B. Supernaw 2001
RELIABILITY
Reliability - the desired consistency or reproducibility of a test score; in short, the consistency of z-scores under repeated testing. What should a test's reliability be? All reliability scores are correlations, so the value is reported as a decimal rather than a percentage, with 1 indicating a perfect correlation. I suggest that we strive for a test reliability approaching 0.7. That is, a 0.7 is very, very good; a 0.6 is getting there. Scores in this range indicate that if we repeated the test, we would get similar results.
- What makes a test unreliable?
- Random errors vs. systematic errors (random errors are not repeated on subsequent tests)
- Anxiety
- Health
- Reading errors
- Translation errors
- Marking errors
- Content sampling
- Guessing
- Distractions
- Administration errors
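The difference between random and systematic error can be sketched with a small calculation. The scores and error values below are hypothetical, invented only for illustration, and `pearson` is a helper written for this sketch. A systematic error that shifts every examinee by the same amount leaves the test-retest correlation unchanged, while random errors that differ from examinee to examinee pull it down.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical first-administration scores for five examinees
first = [55, 62, 70, 78, 85]

# Systematic error: every examinee is shifted by the same amount on the retest
systematic_retest = [s + 5 for s in first]

# Random error: hypothetical shifts that differ for each examinee
random_shifts = [8, -9, 6, -7, 3]
random_retest = [s + e for s, e in zip(first, random_shifts)]

r_systematic = pearson(first, systematic_retest)
r_random = pearson(first, random_retest)

print(f"systematic error only: r = {r_systematic:.2f}")
print(f"random error:          r = {r_random:.2f}")
```

In this sketch the constant shift leaves the correlation at 1, which is why the items listed above matter for reliability mainly when they vary unpredictably from one sitting to the next.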
Other sources of error: small sample, incomplete sample
Reliability Coefficient - Correlation between scores on parallel tests (between -1 and +1)
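As a rough illustration of the definition above, the reliability coefficient can be computed as a Pearson correlation between paired scores on two parallel forms. This is a minimal sketch: the function name `reliability_coefficient` and the scores are hypothetical, and each position in the two lists is assumed to belong to the same examinee.

```python
import math

def reliability_coefficient(form_a, form_b):
    """Pearson correlation between paired scores on two parallel test forms."""
    if len(form_a) != len(form_b) or len(form_a) < 2:
        raise ValueError("need two equally long score lists with at least 2 examinees")
    n = len(form_a)
    mean_a = sum(form_a) / n
    mean_b = sum(form_b) / n
    # Sum of cross-products of deviations from each form's mean
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(form_a, form_b))
    var_a = sum((a - mean_a) ** 2 for a in form_a)
    var_b = sum((b - mean_b) ** 2 for b in form_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary on both forms")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical scores for five examinees on two parallel forms
form_a = [62, 70, 75, 81, 90]
form_b = [60, 73, 72, 85, 88]

r = reliability_coefficient(form_a, form_b)
print(f"reliability coefficient: {r:.2f}")
```

The result always falls between -1 and +1; values near +1 mean examinees keep roughly the same relative standing on both forms.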
VALIDITY
Validity - drawing a correct inference from reliable results
Content validity - an inference drawn from a score and projected onto a larger domain of similar items
Criterion-related validity - an inference drawn from a score and projected onto performance on some real behavior of practical importance
Construct validity - an inference drawn from a score and projected onto a quality that cannot itself be adequately measured