Evaluating uncertainty: the impact of the sampling and assessment design on statistical inference in the context of ILSA
Abstract

This paper informs users of data collected in international large-scale assessments (ILSA) by presenting arguments underlining the importance of considering two design features employed in these studies. We examine a common misconception stating that the uncertainty arising from the assessment design is negligible compared with that arising from the sampling design. This misconception can lead to the erroneous conclusion that there is always a relatively low risk in ignoring the uncertainty arising from the assessment design when reporting estimates of population parameters. We use the design effect framework to assess the impact that the sampling and the assessment design have on the estimation.
We first evaluate the loss in efficiency in the estimation of a population parameter attributable to each of the designs. We then examine whether knowledge about the effect of one design feature can justify any belief about the effect of the other design feature. We repeat this examination across different parameters characterizing the achievement distribution in a population. We provide empirical results using data collected for PIRLS 2016.
Our empirical results can be summarized in two general findings. First, when estimating mean achievement, the effect of the sampling design is often substantially larger than that of the assessment design. This finding might explain the misconception we try to address. However, we show that this is not true in all instances, and the magnitude of the difference between both design effects is context dependent and hence not generalizable.
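To make the design effect concrete, here is a minimal sketch with entirely synthetic data (not PIRLS data, and not the paper's actual estimator). It contrasts the variance of a mean under a hypothetical two-stage school sample with the variance one would assume under simple random sampling; their ratio is the design effect. All sample sizes and variance components are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical two-stage sample: 25 schools, 30 students each (synthetic data).
n_schools, n_students = 25, 30
school_effect = rng.normal(0, 40, size=(n_schools, 1))        # between-school variation
scores = 500 + school_effect + rng.normal(0, 80, size=(n_schools, n_students))
n = scores.size

# Variance of the mean under the cluster design: with equal cluster sizes,
# the between-school variance of school means divided by the number of schools.
school_means = scores.mean(axis=1)
var_cluster = school_means.var(ddof=1) / n_schools

# Variance of the mean under (counterfactual) simple random sampling of n students.
var_srs = scores.var(ddof=1) / n

deff = var_cluster / var_srs
print(f"design effect: {deff:.2f}")
```

With a nonzero intraclass correlation, as simulated here, the design effect exceeds 1: clustering makes the estimate of the mean less efficient than an SRS of the same size.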
Second, differences in design effects become less predictable when estimating other parameters, e.g. the proportion of students reaching a certain threshold in the achievement scale (i.e., benchmarks), or an association estimated using linear regression. This contribution underlines that accounting for all sources of uncertainty in the estimation is of paramount importance to obtain credible inferences. We conclude that it is difficult to justify a priori the belief that the effect of the sampling design in the estimation is always greater than that of the assessment design.
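One way to see why neither source of uncertainty can be ignored is to sketch how the two are typically combined when achievement is reported through plausible values, as in PIRLS. The sketch below uses Rubin's rules on synthetic data: sampling variance is averaged across plausible values, and the assessment-design (imputation) variance comes from the spread between plausible-value means. The sample structure, number of plausible values, and variance components are all illustrative assumptions, not figures from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: 30 schools, 20 students each, M = 5 plausible values (synthetic).
n_schools, n_students, M = 30, 20, 5
school_effect = rng.normal(0, 30, size=(n_schools, 1, 1))
ability = 500 + school_effect + rng.normal(0, 80, size=(n_schools, n_students, 1))
pvs = ability + rng.normal(0, 25, size=(n_schools, n_students, M))  # measurement error

# Point estimate: average of the M plausible-value means.
pv_means = pvs.reshape(-1, M).mean(axis=0)
theta_hat = pv_means.mean()

# Sampling variance: between-school variance of school means for each plausible
# value, averaged over plausible values (the "U-bar" term in Rubin's rules).
school_means = pvs.mean(axis=1)                     # shape (n_schools, M)
U = (school_means.var(axis=0, ddof=1) / n_schools).mean()

# Imputation variance: between-plausible-value variance, with the usual correction.
B = pv_means.var(ddof=1)
V_imp = (1 + 1 / M) * B

total_var = U + V_imp
print(f"sampling variance:   {U:.3f}")
print(f"imputation variance: {V_imp:.3f}")
print(f"total variance:      {total_var:.3f}")
```

The relative size of the two terms depends on the context (clustering strength, number of plausible values, measurement precision), which is exactly why, as the paper argues, no general ordering between the two design effects can be assumed a priori.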