Evidence Based Educational Leadership: April 2014

Tuesday, 29 April 2014

General Further Education Colleges : The Halo Effect and other delusions that deceive managers (and maybe inspectors)

This post seeks to examine some of the issues associated with using national benchmark data as a means of evaluating the performance of general further education colleges and consider whether Phil Rosenzweig's book The Halo Effect .... and the eight other business delusions that deceive managers can help our thinking about college performance.

As in previous posts I have used Statistical Process Control techniques to produce a funnel plot of general further education colleges' 16-18 long course success rates for the 2012-13 academic year.

Using ±3SD from the mean as the best estimate of expected variation, we can see that there are a large number of colleges whose performance could be deemed to illustrate 'special-cause variation'.

However, funnel plots have their limits. With approximately 50% of colleges falling outside the control limits, the plot suggests 'over-dispersion'. This level of dispersion indicates that there is a range of factors at work, which need to be taken into account when seeking to understand variations in college performance.
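One rough way to put a number on over-dispersion is to average the squared z-scores of each college against the benchmark and, where that average is well above 1, widen the control limits accordingly. What follows is a minimal sketch under stated assumptions, not the method used to produce the plot: the cohort figures and the 0.85 benchmark are hypothetical, the limits use a normal approximation to the binomial, and the function names are purely illustrative.

```python
import math

def overdispersion_factor(successes, totals, benchmark):
    """Return the mean squared z-score of the colleges' rates against the benchmark."""
    z_squared = []
    for s, n in zip(successes, totals):
        se = math.sqrt(benchmark * (1 - benchmark) / n)  # binomial standard error
        z = (s / n - benchmark) / se
        z_squared.append(z * z)
    return sum(z_squared) / len(z_squared)

def inflated_limits(n, benchmark, phi, z=3.0):
    """Control limits for a cohort of size n, widened by sqrt(phi) when phi > 1."""
    se = math.sqrt(benchmark * (1 - benchmark) / n)
    half_width = z * math.sqrt(max(phi, 1.0)) * se
    # Clip to the valid range for a proportion.
    return max(0.0, benchmark - half_width), min(1.0, benchmark + half_width)

# Hypothetical figures for three colleges (not taken from the published data).
successes = [800, 1900, 450]
totals = [1000, 2000, 500]

phi = overdispersion_factor(successes, totals, benchmark=0.85)
print(f"over-dispersion factor: {phi:.1f}")
print("widened 3SD limits for a cohort of 1000:", inflated_limits(1000, 0.85, phi))
```

A factor close to 1 would be consistent with chance variation alone; a factor far above 1 is another way of saying that something other than chance is driving the spread between colleges.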

Furthermore, this level of over-dispersion also suggests that we need to avoid overly simplistic explanations of differences in performance between colleges. Phil Rosenzweig, in his book The Halo Effect .... and the eight other business delusions that deceive managers, identifies a range of errors of logic or flawed thinking which distort our understanding of company (college) performance. The most common delusion is the Halo Effect: when a company's sales and profits (a college's retention, achievement and success rates) are up, people (inspectors or other significant stakeholders) may conclude that this is the result of brilliant leadership and strategy or a strong and coherent corporate (college) culture. When performance falters (success rates or position in league tables fall), they conclude it is the result of weak leadership and management (the Principal), and that the company was arrogant or complacent (the college was coasting). The reality, however, may be that little has changed, and that the college/company performance creates a HALO that shapes the way judgements are made about outcomes for learners, teaching, learning and assessment, and leadership and management.

Rosenzweig identifies eight other delusions, which I would argue are often seen in the FE sector and education at large. In the next section I have illustrated each of the delusions with an educational example.

The delusion of correlation and causality : two things may be correlated, but that does not mean that one causes the other. For example, does a coherent college culture lead to an outstanding college, or is a coherent college culture the product of success?

The delusion of the single explanation : there may be multiple explanations for improvement (a clear vision, a focus on teaching and learning strategies, better leadership) and many of these factors are highly correlated, so focussing on one particular explanation may lead to other possible or multi-causal explanations being ignored.

The delusion of connecting the winning dots : if we look at successful colleges to try to see what they have in common, that does not mean the commonality explains success.

The delusion of rigorous research : Is the data available for making judgements about performance as good as we think it is? Consider, for example, lesson grade profiles, which we know are probably neither reliable nor valid.

The delusion of lasting success : Many once successful colleges will decline over time. What worked in the past may not work now, and certainly may not work for the six years between inspections for outstanding colleges.

The delusion of absolute performance : College performance is relative, not absolute. A college can improve and still fall behind other colleges at the same time.

The delusion of the wrong end of the stick : Some colleges may have pursued a highly focused strategy, e.g. sixth form colleges, but that does not necessarily lead to success (see previous post).

The delusion of organisational physics : College performance is a complex process, with feedback loops etc., and changes often lead to unexpected consequences. Changes in success rates may be the product of changes in the nature of external assessment regimes rather than any specific organisational change.

Given the increasing pervasiveness of business management conventions within education, it would be surprising if the FE sector had escaped the delusions that have deceived businesses and their managers.

Future posts will look in more detail at particular aspects of management conventions and seek to compare them with the research literature to see if we can generate non-delusional management practice which is firmly based upon the research evidence.

Monday, 21 April 2014

Funnel plots and DIY statistical analysis - Part 2 - Sixth Form Colleges and Inspection Outcomes

Using the same approach and resources (produced by the Public Health Observatories) as described in a previous post I have produced a funnel plot for England's SFCs using the 2012-13, 16-18 long course success rates as the measure.

A commonly used benchmark is the average level of performance, which in this case is a 16-18 long course success rate of 86.6%. We can then estimate the expected level of variation around the benchmark, which in this example has been set at 2SD and 3SD from the mean, roughly 95% and 99.8% confidence limits. Initial analysis of the data would suggest there are over 40 possible outliers, performing at levels above or below what could be expected from 'normal' variations in performance.
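To make the idea of an 'outlier' concrete, the sketch below places a college above, within or below the 3SD limits around the 86.6% benchmark. It is illustrative only: it assumes a simple normal approximation to the binomial, which may differ from the calculation inside the Public Health Observatories' tool, and the cohort size and success rates in the example are hypothetical.

```python
import math

BENCHMARK = 0.866  # average 16-18 long course success rate quoted in the post

def control_limits(n, p0=BENCHMARK, z=3.0):
    """Lower and upper limits for a cohort of n learners (normal approximation)."""
    se = math.sqrt(p0 * (1 - p0) / n)
    return p0 - z * se, p0 + z * se

def classify(rate, n, p0=BENCHMARK, z=3.0):
    """Say whether a college's success rate is above, within or below the funnel."""
    lower, upper = control_limits(n, p0, z)
    if rate > upper:
        return "above"
    if rate < lower:
        return "below"
    return "within"

# Hypothetical colleges, each with 1,000 long course starts.
for rate in (0.90, 0.87, 0.80):
    print(f"success rate {rate:.1%}: {classify(rate, 1000)} the 3SD limits")
```

Passing z=2.0 gives the narrower 2SD limits instead.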

In recent months there has been considerable discussion about ensuring Ofsted inspection judgements are informed by an increased level of statistical sophistication, for example, the Policy Exchange's recent report on the future of school inspections. As such, it seemed reasonable to compare recent SFC inspection judgements with the funnel plot in order to provide a statistical perspective on the inspection grades.

Approximately a quarter of all SFCs have been inspected during the current academic year, which gives us a range of inspection grades for outcomes for learners, summarised below:

2 colleges' success rates were above the benchmark + 3SD limit (2 good grades for outcomes for learners)
9 colleges' success rates were within ± 3SD of the benchmark (7 good and 2 requires improvement grades)
12 colleges' success rates were below the benchmark - 3SD limit (3 good, 8 requires improvement, and 1 inadequate)
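The counts above can be tallied to show how the share of 'good' grades changes across the three bands. This is simple arithmetic on the figures listed, offered as a sketch; the dictionary layout and function name are my own, and with so few colleges the percentages should be read with caution.

```python
# Inspection grades for outcomes for learners, grouped by funnel plot position
# (counts as listed in the post).
grades_by_band = {
    "above +3SD": {"good": 2},
    "within ±3SD": {"good": 7, "requires improvement": 2},
    "below -3SD": {"good": 3, "requires improvement": 8, "inadequate": 1},
}

def share_good(grades):
    """Proportion of colleges in a band graded good for outcomes for learners."""
    return grades.get("good", 0) / sum(grades.values())

total_inspected = sum(sum(g.values()) for g in grades_by_band.values())
print(f"colleges inspected: {total_inspected}")

for band, grades in grades_by_band.items():
    count = sum(grades.values())
    print(f"{band}: {count} colleges, {share_good(grades):.0%} graded good")
```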

Given the partial nature of the data available and the recognition that inspection judgements are informed by more than one piece of statistical data, it is difficult to draw any specific conclusions. On the other hand, it should be possible to pose a range of questions which are worthy of further investigation.

Allowing for the specialist nature of SFCs, what further work is required to understand the over-dispersion of performance, with approximately 45% of colleges' success rates being above or below the funnel?
Given this over-dispersion, how useful are national averages in benchmarking performance?
Are too many good grades being awarded for outcomes for learners?
Are there more inadequate SFCs than has previously been thought?
What would be the implications for inspection grades if robust and statistically sound approaches to data analysis were used?

Finally, all of the above should be seen as highly provisional and as a small contribution to increasing the use of evidence based educational management.

Sunday, 13 April 2014

Funnel plots and DIY Statistical Analysis

This post was inspired by Ben Goldacre's article DIY statistical analysis and the experience and thrill of touching real data.

Using internet resources produced by the Public Health Observatories and publicly available data on GCSE performance by pupil characteristic, I have produced a funnel plot of the performance of local education authorities, using the percentage of pupils gaining at least 5 'good' GCSEs including English and mathematics as the measure.

This variation in performance is not unexpected, as it is normal to expect some natural variation in performance between LEAs. As such, some of the questions we need to ask include:

Are the variations in performance more than we would expect?
Are some LEAs genuine outliers, which require further analysis?
What explains the differences in performance?

A commonly used benchmark is the average level of performance, which in this case is just over 60%. We can then estimate the expected level of variation around the benchmark, which in this example has been set at 2SD and 3SD from the mean, roughly 95% and 99.8% confidence limits.
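For readers who want to check the 'roughly 95% and 99.8%' figures, and to see why the limits trace out a funnel, here is a minimal sketch. It assumes a normal approximation to the binomial and rounds the benchmark to 0.60; the cohort sizes are hypothetical and the function names are illustrative rather than anything taken from the Public Health Observatories' resources.

```python
import math

def coverage(k):
    """Share of a normal distribution lying within ±k standard deviations."""
    return math.erf(k / math.sqrt(2))

def half_width(n, k, benchmark=0.60):
    """Distance from the benchmark to the k-SD limit for a cohort of n pupils."""
    return k * math.sqrt(benchmark * (1 - benchmark) / n)

for k in (2, 3):
    print(f"±{k}SD covers about {coverage(k):.2%} of expected variation")

# Hypothetical cohort sizes: each fourfold increase in pupils halves the width.
for n in (500, 2000, 8000):
    print(f"n={n}: 2SD ±{half_width(n, 2):.3f}, 3SD ±{half_width(n, 3):.3f}")
```

Because the width of the limits shrinks with the square root of the cohort size, small LEAs are given wide limits and large LEAs narrow ones, which is what produces the funnel shape.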

As we can see, there are a large number of LEAs performing either above or below the expected level of variation. The first step in any further analysis would be a consideration of the socio-economic profile of the LEA, for as we know from the work of Chris Cook, there would appear to be a link between pupil deprivation and performance.

The funnel plot also suggests that as the number of pupils within an LEA increases, the performance of the LEA is more likely to be within the expected levels of variation of performance, with pupils in the larger LEAs being more likely to reflect the characteristics of the population as a whole.

As I mentioned at the beginning of this post, this is very much DIY statistical analysis, and as such I am loath to draw too many conclusions. However, if I were to draw one conclusion, it is that as a result of undertaking this activity I have a far better appreciation of the uses of Statistical Process Control, which allows me to make far better judgements about data and what it means. Hopefully, this will contribute to being able to work more effectively with my colleagues.