This post was inspired by Ben Goldacre's article DIY statistical analysis and the experience and thrill of touching real data.
Using internet resources produced by the Public Health Observatories and publicly available data on GCSE performance by pupil characteristic, I have produced a funnel plot of the performance of local education authorities (LEAs). The measure used is the percentage of pupils gaining at least 5 'good' GCSEs including English and mathematics.
The variation in performance shown by the plot is not unexpected, as it is normal to see some natural variation in performance between LEAs. As such, some of the questions we need to ask include:
- Are the variations in performance more than we would expect?
- Are some LEAs genuine outliers, which require further analysis?
- What explains the differences in performance?
A commonly used benchmark is the average level of performance, which in this case is just over 60%. We can then estimate the expected level of variation around the benchmark. In this example the limits have been set at 2SD and 3SD from the mean, roughly the 95% and 99.8% confidence limits.
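To make the idea of limits around a benchmark concrete, here is a minimal sketch of how such limits could be calculated. It assumes the usual normal approximation to the binomial for a proportion, which may differ from the exact method used in the Public Health Observatories' tools. The function name `funnel_limits`, the benchmark of 0.60 and the cohort sizes are illustrative assumptions, not values taken from the data.

```python
import math

def funnel_limits(benchmark, pupils, z):
    """Lower and upper limits for a proportion at z standard deviations.

    Assumes a normal approximation to the binomial distribution.
    """
    # Standard error of a proportion for a cohort of this size
    se = math.sqrt(benchmark * (1 - benchmark) / pupils)
    lower = max(0.0, benchmark - z * se)
    upper = min(1.0, benchmark + z * se)
    return lower, upper

benchmark = 0.60  # illustrative value, close to the average quoted above

# Hypothetical cohort sizes, chosen only to show how the limits change
for pupils in (600, 2400, 9600):
    low2, high2 = funnel_limits(benchmark, pupils, 2)
    low3, high3 = funnel_limits(benchmark, pupils, 3)
    print(f"{pupils:>5} pupils: 2SD {low2:.1%} to {high2:.1%}, "
          f"3SD {low3:.1%} to {high3:.1%}")
```

Plotting these limits against the number of pupils gives the funnel shape, since the standard error shrinks as the cohort grows.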
As we can see, a large number of LEAs are performing either above or below the expected level of variation. The first step in any further analysis would be to consider the socio-economic profile of each LEA, as the work of Chris Cook suggests a link between pupil deprivation and performance.
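As a rough illustration of how an individual LEA might be placed relative to the limits, the sketch below converts its result into a number of standard deviations from the benchmark. The function name `classify`, the benchmark of 0.60 and the three example LEAs are all made up for illustration, and the normal approximation is an assumption rather than the method behind the plot.

```python
import math

def classify(passes, pupils, benchmark):
    """Place an LEA's pass rate relative to the 2SD and 3SD limits."""
    rate = passes / pupils
    # Standard error of a proportion under the normal approximation
    se = math.sqrt(benchmark * (1 - benchmark) / pupils)
    z = (rate - benchmark) / se
    if abs(z) > 3:
        return "outside 3SD"
    if abs(z) > 2:
        return "between 2SD and 3SD"
    return "within 2SD"

# Hypothetical LEAs: (name, pupils gaining the measure, total pupils)
examples = [("LEA A", 1440, 2400), ("LEA B", 1500, 2400), ("LEA C", 1320, 2400)]

for name, passes, pupils in examples:
    print(name, classify(passes, pupils, benchmark=0.60))
```

An LEA falling outside the 3SD limit is a candidate for further analysis, not proof of a real difference in performance.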
The funnel plot also suggests that as the number of pupils within an LEA increases, the performance of the LEA is more likely to be within the expected levels of variation of performance, with pupils in the larger LEAs being more likely to reflect the characteristics of the population as a whole.
As I mentioned at the beginning of this post, this is very much DIY statistical analysis, and as such I am loath to draw too many conclusions. However, if I were to draw one conclusion, it is that undertaking this activity has given me a far better appreciation of the uses of Statistical Process Control, which allows me to make far better judgements about data and what it means. Hopefully, this will help me to work more effectively with my colleagues.