Where to Start? Four Questions for State Leaders Selecting Assessments for Young Learners
A growing number of states have adopted policies that encourage or require the use of child assessments to measure what young children know and can do before they reach third grade.[1] Today, most Quality Improvement Systems for evaluating early childhood education programs recognize child assessments, alongside other programmatic features such as curricular resources and training for teachers, as important indicators of program quality.[2] As children move up to kindergarten, many states and districts require Kindergarten Entry Assessments (KEAs) to measure children’s readiness for kindergarten curricula and to help educators understand children’s needs.[3] Having a clear understanding of children’s progress can inform strategic decision-making in classrooms, programs, and systems.
Despite the utility of child assessment data, it can be challenging for policymakers to identify appropriate assessments for their state. The field offers a wide array of tools—targeting different age ranges, covering varying skills, and involving diverse data collection methods. Moreover, because it is difficult for any one tool to meet every assessment need, it is often necessary to bundle multiple tools or create new, state-specific assessments. Asking the right questions about assessment use, content, implementation, and data outputs can help state policymakers and their teams narrow down the options and find the best fit.
This issue focus highlights four questions intended to guide state policymakers in developing a coherent and effective approach to early childhood assessment. These questions are grounded in recent conversations, hosted by the Council of Chief State School Officers (CCSSO), in which early learning decision-makers, research leaders, and assessment developers discussed how states might build child assessment policies aligned with state needs and the latest research on assessments for young children.[4] This piece is a starting point to help state leaders set priorities as they evaluate child assessment tools to use in their states.
Question 1: How will your state use child assessment data?
To ensure that child assessments yield meaningful data, begin with a clear vision for how the assessment data will be used. Child assessments are often grouped into categories based on their intended use(s), such as formative, summative, screener, or diagnostic.[5] Whereas formative assessments are meant to provide ongoing insights into students’ understanding to inform the work of practitioners, summative assessments evaluate whether students, classrooms, and programs are meeting set standards at a point in time. Screeners are used to identify children at risk of potential difficulties, with diagnostics providing deeper insight into whether children require specialized services. Each assessment type serves a distinct purpose, and conclusions drawn from one type of assessment may not be appropriate for another. Although individual assessments may have multiple use cases, using assessments for purposes they were not intended to support can lead to inaccurate conclusions.
The expanded use of reading screeners across states provides a good example of potential misalignment or misinterpretation of assessment data. Reading screeners are typically used to identify children who might be at risk for reading difficulties so that they can be referred for additional diagnostic testing or specialized help. Yet, the criteria for an “at risk” designation may not accurately predict how well a child will perform on an end-of-year (summative) reading assessment: a child who is not identified as at risk on a screener might still struggle on a summative assessment and vice versa.[6] Additionally, results from different reading screeners may not always align.[7] While most reading screeners evaluate similar foundational reading skills, the specific content included in each screener can differ. Further, the methodology for producing and reporting results also varies across screeners. These factors all affect how assessment results can be used and interpreted.
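For teams that want to check this kind of mismatch in their own data, the comparison between screener flags and summative results can be sketched as a simple two-by-two tally. The example below is a minimal illustration, not a recommended methodology: the record fields (`flagged_at_risk`, `below_benchmark`) and the sample children are hypothetical, and real analyses would need the cut scores and definitions documented in each tool's technical manual.

```python
from dataclasses import dataclass


@dataclass
class ChildRecord:
    child_id: str
    flagged_at_risk: bool   # hypothetical screener result
    below_benchmark: bool   # hypothetical end-of-year summative result


def screener_agreement(records):
    """Tally how often screener flags agree with summative outcomes."""
    tp = sum(r.flagged_at_risk and r.below_benchmark for r in records)
    fp = sum(r.flagged_at_risk and not r.below_benchmark for r in records)
    fn = sum(not r.flagged_at_risk and r.below_benchmark for r in records)
    tn = sum(not r.flagged_at_risk and not r.below_benchmark for r in records)
    return {
        "flagged_and_struggled": tp,
        "flagged_but_met_benchmark": fp,
        "not_flagged_but_struggled": fn,
        "not_flagged_and_met_benchmark": tn,
        # Share of children who struggled that the screener had flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Share of children who met the benchmark that were not flagged.
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Made-up records for illustration only.
sample_records = [
    ChildRecord("A", True, True),
    ChildRecord("B", True, False),
    ChildRecord("C", False, True),
    ChildRecord("D", False, False),
    ChildRecord("E", False, False),
]
agreement = screener_agreement(sample_records)
```

In this made-up sample, child C is the case the text describes: not flagged by the screener but below benchmark on the summative assessment. A large count in that cell would suggest the screener's "at risk" criteria should not be read as a forecast of summative performance.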
Starting with a clear vision for the types of conclusions that will be made from assessment data in your state can guide the selection of tools that are appropriate for those purposes and help avoid the common pitfall of using assessment data in unintended ways. Integrating the perspectives of all assessment stakeholders in this process—including those at the classroom, program, and state levels—can ensure that all potential assessment uses are considered. Equipped with this comprehensive vision, assessment decision-makers can then identify suitable tools by exploring measure repositories, examining technical manuals for individual tools, and asking assessment developers questions about appropriate uses of their tools during procurement processes.[8] Once tools are selected, developing accessible guidelines for how the data from those tools should be used and interpreted by various stakeholders can help avoid misleading conclusions.
Question 2: What skills do assessments need to cover?
A next step in identifying assessment options is determining the range of skills assessments should capture. Early childhood is an exciting period of rapid growth across a wide array of content areas. Young children develop foundational skills in academic domains such as literacy and math, while they also build nonacademic competencies, such as collaboration and self-regulation, needed to successfully navigate group-based learning settings.
Given this range of skills being developed and supported by educators in the early years, it can be costly and time-consuming to assess them all in a comprehensive way. Oftentimes certain areas must be prioritized, and knowing how assessment data will be used, as discussed above, can aid in prioritization. However, a common pitfall in making these choices is focusing only on a narrow set of early academic skills, overlooking other foundational areas such as approaches to learning and scientific reasoning. Over time, this approach can lead to system-wide shifts in the perceived importance of different skill areas: assessments can act as streetlights, bringing into focus the specific content areas that are measured and deemphasizing those that are not.[9] Therefore, when deciding to prioritize assessment of certain content areas, it is important to consider which areas are not being assessed and the potential consequences of these decisions.
Identifying the subject area(s) to assess is a starting point for choosing appropriate assessments, but further probing of assessments’ content coverage is needed. Choosing the best tool requires not only taking stock of an assessment’s stated domain focus, but also looking under the hood to understand how the assessment covers subdomains and specific skills, as many available assessments do not reflect the depth of skill coverage specified in state early learning standards.[10] For example, while pre-K language standards often include four subdomains—receptive language, expressive language, social language, and vocabulary[11]—many pre-K language assessments tend to concentrate solely on vocabulary, leaving the other subdomains less thoroughly represented. To evaluate assessments for alignment of breadth, depth, and areas of focus with learning standards, many states look to technical advisors with expertise in specific developmental domains. It may also be helpful to confirm that tools have undergone external content vetting, wherein experts evaluate assessment items to ensure agreement with developmental expectations.
New initiatives and evolving educational approaches can also inform content goals, motivating transitions from assessments currently in use to different or modified assessments within the same domain to ensure alignment with instructional goals. The passage of laws by 40 states and the District of Columbia promoting instruction aligned with an evidence-based approach to literacy education known as the science of reading has inspired some states to make this kind of change.[12] Alabama, following the 2019 Alabama Literacy Act, adapted its second-grade literacy summative assessment to better align its content with the five skill areas outlined in the science of reading: phonemic awareness, phonics, fluency, vocabulary, and comprehension.[13] This content modification has allowed the state to determine whether children are making progress toward Alabama’s specific early literacy goals.
Question 3: What kinds of support will educators need for data collection?
Assessment data on young children’s skills are typically collected through one of three methods:[14]
- Observing children’s behaviors or abilities in classrooms
- Reviewing work samples that children create
- Conducting direct assessments of children’s performance
Each of these methods relies on teachers gathering data quickly and accurately, which can be challenging. For example, a survey of pre-K educators found that two-thirds of them were using personal time to complete assessment tasks.[15] Additionally, teachers may struggle to assess children’s skills accurately. Research suggests that teachers are often unable to differentiate between the abilities of children in their classroom or to assess individual children’s skill levels in different content areas.[16]
To pave the way for successful data collection, it is important to analyze how new assessments will affect teacher workloads and look for ways to help teachers collect data more efficiently and accurately. In this analysis, consider that the burden of implementing assessments can vary significantly by domain and assessment type, so inferences based on assessments currently in use may not be accurate. Pilot tests of promising assessment tools can provide particularly valuable insights from smaller samples of educators for future scaling efforts. Specifically, educators piloting the tools can provide direct feedback on the burden and utility of the new tools through focus groups and surveys, and assessment data collected during the pilot can be analyzed for completeness and reliability.
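As one illustration of analyzing pilot data for completeness, the sketch below computes the share of expected item scores that were actually recorded in each classroom. It is a minimal example under stated assumptions: the field names (`classroom_id`, `letter_naming`, `counting`) and the records are hypothetical, and a missing score is assumed to be stored as `None` or left out of the record entirely.

```python
def completeness_by_classroom(records, items):
    """Return the fraction of expected item scores recorded per classroom."""
    if not items:
        raise ValueError("items must list at least one assessment item")
    totals = {}
    for rec in records:
        recorded, expected = totals.get(rec["classroom_id"], (0, 0))
        # An item counts as recorded only if a score is present.
        recorded += sum(rec.get(item) is not None for item in items)
        expected += len(items)
        totals[rec["classroom_id"]] = (recorded, expected)
    return {cid: rec_n / exp_n for cid, (rec_n, exp_n) in totals.items()}


# Made-up pilot records for illustration only.
pilot_items = ["letter_naming", "counting"]
pilot_records = [
    {"classroom_id": "Room A", "letter_naming": 12, "counting": 8},
    {"classroom_id": "Room A", "letter_naming": 9, "counting": None},
    {"classroom_id": "Room B", "letter_naming": None},
]
completeness = completeness_by_classroom(pilot_records, pilot_items)
```

Classrooms with low completeness rates may point to items that are hard to administer or to time constraints, which can then be explored in the focus groups and surveys with piloting educators.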
Leaders may also consider novel technology-based assessments that have the potential to ease the burden on teachers and reduce the potential for bias in scoring.[17] These tools often involve children completing short activities that directly gather information about their skills without necessitating manual data collection, entry, or interpretation by educators. Several new technology-based tools for pre-K are currently being developed by interdisciplinary teams in partnership with educators, families, and state leaders through the Measures for Early Success Initiative led by MDRC.
Question 4: How will assessment data be integrated and analyzed with other data sources?
Finally, although the data from a single assessment instrument can be meaningful (for example, when identifying children who may need more help in a certain area or tracking trends in kindergarten readiness), consider how data from a new instrument will be used in relation to other data sources your state already collects. States often rely on multiple child assessment tools with different purposes and content focuses to comprehensively understand children’s development. Moreover, states typically collect other types of data—including those on programmatic quality and the early childhood workforce—that can be used to predict and understand variation in children’s skills.
Drafting clear research questions that your state plans to address with a new child assessment (such as “Do KEA scores predict children’s abilities in later grades?” or “Does children’s literacy skill growth vary by pre-K setting quality?”) is an important step for ensuring that data from that tool can be used to answer them. Verify that data from the assessment are linkable to other relevant sources, such as through consistent child, classroom, and program identifiers. Evaluating content alignment between data sources can also ensure that the correct information is available to address research questions. For example, if the goal of a new KEA is to predict third-grade achievement, it is important to determine whether the KEA measures precursor skills to those assessed on a later achievement test. It is unlikely that an assessment focused only on evaluating basic shape and number recognition, which most children tend to master quickly, will strongly predict later math scores from an assessment requiring complex problem solving.[18]
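A linkability check of the kind described above can be sketched in a few lines. The example below is illustrative only: the identifier name (`child_id`), the score fields, and the sample rows are hypothetical, and it assumes each child appears at most once in each data source.

```python
def link_by_child_id(kea_rows, grade3_rows, key="child_id"):
    """Join KEA rows to third-grade rows on a shared child identifier."""
    grade3_by_id = {row[key]: row for row in grade3_rows}
    linked, unmatched = [], []
    for row in kea_rows:
        match = grade3_by_id.get(row[key])
        if match is None:
            # No later record carries this identifier.
            unmatched.append(row[key])
        else:
            linked.append({**row, **match})
    match_rate = len(linked) / len(kea_rows) if kea_rows else 0.0
    return linked, unmatched, match_rate


# Made-up rows for illustration only.
kea_rows = [
    {"child_id": "C001", "kea_score": 42},
    {"child_id": "C002", "kea_score": 35},
    {"child_id": "C003", "kea_score": 50},
]
grade3_rows = [
    {"child_id": "C001", "grade3_math": 310},
    {"child_id": "C003", "grade3_math": 345},
    {"child_id": "C004", "grade3_math": 298},
]
linked, unmatched, match_rate = link_by_child_id(kea_rows, grade3_rows)
```

A low match rate, or unmatched identifiers concentrated in particular programs, would suggest that identifiers are not assigned consistently across systems and that a question such as “Do KEA scores predict children’s abilities in later grades?” cannot yet be answered for all children.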
Conclusion
Children’s early years are an important period for developing foundational skills and addressing any developmental challenges through early intervention. Child assessments can play a crucial role during this period, helping educators evaluate children’s abilities and identify appropriate kinds of educational support. However, when selected and implemented without careful consideration, assessments may yield inaccurate or insufficient information and create more burden for schools, teachers, and students. Reflecting on the four questions raised here is a starting point for systems leaders choosing the best tools to provide accurate, actionable insights tailored to specific state needs.
The preparation of this issue focus was funded by the Gates Foundation. The conclusions contained within are those of the authors and do not necessarily reflect the positions or policies of the Gates Foundation.
[1] A child assessment is an assessment of an individual child’s skills and abilities, in contrast to a classroom assessment, which captures what is occurring more generally in an early learning setting.
[2] Quality Compendium, “Create a Report” (website: https://qualitycompendium.org/create-a-report, n.d., accessed May 30, 2025).
[3] Jacqueline M. Nowicki, Sarah Kaczmarek, and A. Nicole Clowers, “K-12 Education: State and Selected Teachers’ Use of Kindergarten Readiness Information,” Q&A Report to Congressional Requesters, GAO-24-106552 (U.S. Government Accountability Office, 2024).
[4] This included a session entitled “Measures for Early Success: Engaging Decision-Makers, Families, and Educators to Reimagine Early Childhood Assessments” at the CCSSO National Conference on Student Assessment in June 2024 and a conversation in September 2024 on early childhood assessments at the CCSSO Early Learning Collaborative meeting.
[5] Joanne Jensen, Jessica Goldstein, and Matt Brunetti, “K–2 Assessment Systems Enable Early Intervention to Foster Student Success” (WestEd, 2021).
[6] Mariann Lemke, Dan Murphy, Aaron Soo Ping Chow, Hayley Spencer, and Angela Zhang, A First Look at Early Literacy Performance in Massachusetts: Results of Initial Analysis Based on State Grantee Literacy Screening Assessments (WestEd, 2023); Mariann Lemke, Dan Murphy, Aaron Soo Ping Chow, and Angela Acuña, Early Literacy Performance in Massachusetts: Results of Ongoing Analysis of Literacy Screening Assessments (WestEd, 2024).
[7] Matthew Brunetti, Meredith Langie, and Sarah Quesen, “Are We on the Same Page? A Discussion on the Use and Misuse of Early Literacy Assessments,” paper presented at the National Council on Measurement in Education (NCME) Annual Meeting, Denver, Colorado (April 26, 2025).
[8] Institute for Child Success, “Find Measures” (website: https://ecmeasures.instituteforchildsuccess.org/measures/, n.d., accessed July 16, 2025).
[9] Dana Charles McCoy and Terri Sabol, “Overcoming the Streetlight Effect: Shining Light on the Foundations of Learning and Development in Early Childhood,” American Psychologist 80, 2 (2025): 135–147.
[10] Emily C. Hanno, Ximena A. Portilla, and JoAnn Hsueh, “Designing Equity-Centered Early Learning Assessments for Today’s Young Children,” Child Development Perspectives 19, 2 (2025): 92–98; Meghan McCormick and Shira Mattera, Learning More by Measuring More: Building Better Evidence on Pre-K Programs by Assessing the Full Range of Children’s Skills (MDRC, 2022).
[11] Ximena A. Portilla, Brenna Healy, and Emily C. Hanno, Language Content Blueprint: Developing Assessments for All Pre-K Children (MDRC, forthcoming in 2025).
[12] Just Right Reader, “Which States Have Science of Reading Laws in 2024?” (website: https://justrightreader.com/blogs/news/which-states-have-science-of-reading-laws-in-2024?srsltid=AfmBOopjbS3WoT_6WeH6hUO_pyaA1G3EmPPK9LUujIWb6IHvIFFR41Ja, n.d., accessed June 13, 2025); Sarah Schwartz, “The ‘Science of Reading’ in 2024: 5 State Initiatives to Watch,” Education Week (website: https://www.edweek.org/teaching-learning/the-science-of-reading-in-2024-5-state-initiatives-to-watch/2024/01, January 25, 2024).
[13] Alabama Literacy Act (Act 2019-523, §1).
[14] Debra J. Ackerman, “Early Childhood Care and Education Workforce Issues in Implementing Assessment Policies,” in Christopher P. Brown, Mary Benson McMullen, and Nancy File (eds.), The Wiley Handbook of Early Childhood Care and Education (Wiley, 2019).
[15] Claire E. Cameron, Sabrina Kenny, and Q. H. Chen, “How Head Start Professionals Use and Perceive Teaching Strategies Gold: Associations with Individual Characteristics Including Assessment Conceptions,” Teaching and Teacher Education 121 (2023): 103931.
[16] Claire E. Cameron, Meghan M. McClelland, Tammy Kwan, Krystal Starke, and Tanya Lewis-Jones, “HTKS-Kids: A Tablet-Based Self-Regulation Measure to Equitably Assess Young Children’s School Readiness,” Frontiers in Psychology 14 (2024); Jaclyn M. Russo, Amanda P. Williford, Anna J. Markowitz, Virginia E. Vitiello, and Daphna Bassok, “Examining the Validity of a Widely-Used School Readiness Assessment: Implications for Teachers and Early Childhood Programs,” Early Childhood Research Quarterly 48 (2019): 14–25.
[17] Hanno, Portilla, and Hsueh (2025).
[18] McCormick and Mattera (2022).