Investigating student achievement at Key Stage 4


Andrew Lyth has been working at Cambridge CEM since 1999 as a Research Associate. Andrew previously worked on the mathematical modelling of trends in Household Demography. His first degree was in Maths, and he has a Master’s degree in Statistics. Andrew specialises in predictions and value-added methodologies. His role includes explaining the statistics and methodologies that underpin students’ results to schools.

His current research project will use data from the National Pupil Database to investigate relationships between performances on CEM’s assessments and attainment at Key Stage 4.


Q. Can you tell us about the research project you are currently working on?

A. Last year, I was really pleased to make a successful application to the National Pupil Database for access to the Key Stage 4 data from 2018 and 2019.

I had asked for the MidYIS and Yellis results of students reaching GCSE in these two years to be matched to the National Pupil Database Key Stage 4 data. This dataset holds a variety of Key Stage 4 data, including grades achieved in individual subjects, and other measures such as Attainment 8 and the English Baccalaureate.


Q. How difficult is it to gain access to data from the National Pupil Database?

A. The National Pupil Database (NPD) holds a variety of data on state school students in England covering children from reception into university. In England, it is one of the most valuable and important sources of research data on education.

Access to the NPD is strictly controlled by the Department for Education. There are a lot of forms to complete, a lot of information to collate and provide, and there is training required too. It’s a demanding process.

Applicants must set out in detail the type and purpose of the research they want to carry out. They must demonstrate that there is a public benefit from the research and that they have a legitimate interest in doing that research. They must also show that they have the training and facilities in place to ensure the security of the data, from analysis to the publication of results.

Also, anyone accessing the NPD must be accredited under the Office for National Statistics (ONS) Accredited Researcher Scheme.

Lastly, applicants must make their research results publicly available. Once our work on the NPD data is complete, we will publish the research paper on our public website.


Q. What are the benefits to CEM and its customers of having access to that data?

A. Having access to this data will serve two main purposes:

Firstly, the data will help us to ensure that our predictions are as accurate and representative as we can make them. It also means that we can provide national predictions for as many GCSE subjects as possible.

The data has already fed into the regression lines that we used for the National GCSE value-added last summer (2022) and the national baseline predictions for all secondary school students yet to reach GCSE.

Secondly, the data will also allow us to investigate in detail the relationships between students’ performances in the different sections of the MidYIS and Yellis assessments and their later performances at Key Stage 4.

Having this knowledge will then help us make further improvements to the MidYIS and Yellis assessments.
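
For readers curious about what a regression line and a value-added score look like in practice, here is a minimal sketch. It is not CEM's actual methodology: it assumes a single straight-line fit of GCSE grade on baseline score, and the function names and all of the numbers are invented for illustration. The idea is that the line gives a predicted grade for a given baseline score, and value-added is the gap between the grade a student achieved and that prediction.

```python
from statistics import mean


def fit_line(baseline_scores, grades):
    """Ordinary least-squares fit of grade = intercept + slope * baseline."""
    x_bar, y_bar = mean(baseline_scores), mean(grades)
    sxx = sum((x - x_bar) ** 2 for x in baseline_scores)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(baseline_scores, grades))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def value_added(baseline, grade, intercept, slope):
    """Achieved grade minus the grade predicted from the baseline score."""
    predicted = intercept + slope * baseline
    return grade - predicted


# Invented illustrative data, NOT real MidYIS, Yellis or NPD results.
national_baseline = [85, 95, 100, 105, 115]
national_grades = [3, 4, 5, 6, 7]

intercept, slope = fit_line(national_baseline, national_grades)

# A hypothetical student with a baseline score of 110 who achieved a grade 7.
student_va = value_added(110, 7, intercept, slope)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"value-added={student_va:.2f}")
```

A positive value-added score means the student did better than the line predicted for their baseline score, and a negative score means they did worse. Across the data used to fit the line, the scores average out to zero.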

