This paper examines the properties of two nonexperimental study designs that can be used in educational evaluation: the comparative interrupted time series (CITS) design and the difference-in-difference (DD) design. The paper looks at the internal validity and precision of these two designs, using the example of the federal Reading First program as implemented in a midwestern state.
This paper presents a conceptual framework for designing and interpreting research on variation in program effects. The framework categorizes the sources of program effect variation and helps researchers integrate the study of variation in program effectiveness and program implementation.
No universal guideline exists for judging the practical importance of a standardized effect size, a measure of the magnitude of an intervention’s effects. This working paper argues that effect sizes should be interpreted using empirical benchmarks — and presents three types in the context of education research.