If you are an evidence-based practitioner, school research lead or headteacher interested in being able to interpret
effect sizes, then this post is for you. In this post I will: first, briefly show how effect sizes are calculated; second, identify some of the
most common interpretations of the size of effect sizes; third, summarise
Robert Slavin's recent post on what we mean by a large effect size; and finally,
consider the implications of the discussion for those interested in supporting evidence-based
practice within schools and for the work of John Hattie.
What do we mean by effect size?
Put quite simply, effect size is a way of quantifying
the difference between two groups and is calculated using the following
formula.
Effect Size = (Mean of experimental group - Mean of control group) / Standard Deviation
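As a rough illustration of the calculation above, here is a minimal Python sketch. The formula does not say which standard deviation to use, so the sketch assumes a pooled standard deviation of the two groups (one common choice, not necessarily the one intended here). The function name and the test scores are hypothetical and invented purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def effect_size(experimental, control):
    """Standardised mean difference between two groups.

    Assumption: the 'Standard Deviation' in the formula is taken to be
    the pooled standard deviation of the two groups.
    """
    n_e, n_c = len(experimental), len(control)
    s_e, s_c = stdev(experimental), stdev(control)
    pooled_sd = sqrt(
        ((n_e - 1) * s_e ** 2 + (n_c - 1) * s_c ** 2) / (n_e + n_c - 2)
    )
    return (mean(experimental) - mean(control)) / pooled_sd


# Hypothetical test scores, not taken from any real study
experimental_scores = [58, 62, 65, 70, 75]
control_scores = [55, 60, 63, 67, 70]

d = effect_size(experimental_scores, control_scores)
print(round(d, 2))
```

With groups this small the estimate would be very imprecise, which is one reason sample size matters so much in the discussion that follows.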
How can we interpret effect sizes?
The best-known interpretation has been put
forward by John Hattie in Visible Learning who, having reviewed over 800
meta-analyses, argues that the average effect size across a range of educational
strategies is 0.4. On the basis of this, Hattie argues that teachers
should select those strategies that have an above-average effect on pupil and
student outcomes.
Secondly, we could turn to the EEF's DIY Evaluation Guide,
written by Rob Coe and Stuart Kime, where on pages 17 and 18 they provide some
guidance on the interpretation of effect sizes (-0.01 to 0.18 low; 0.19 to 0.44
moderate; 0.45 to 0.69 high; 0.70 and above very high), with effect sizes also being
converted into months of progress.
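The bands quoted above can be written as a simple lookup. This is only a sketch of the ranges as stated in this post, not an official EEF tool. The function name is hypothetical, and the label returned for values below -0.01 is an assumption, since no band is quoted for them.

```python
def eef_band(effect_size):
    """Map an effect size onto the bands quoted in this post."""
    es = round(effect_size, 2)  # the quoted bands are given to two decimal places
    if es < -0.01:
        # Assumption: no label is quoted for values below -0.01
        return "below the quoted bands"
    if es <= 0.18:
        return "low"
    if es <= 0.44:
        return "moderate"
    if es <= 0.69:
        return "high"
    return "very high"


for value in (0.11, 0.22, 0.4, 0.6, 0.75):
    print(value, eef_band(value))
```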
Alternatively, if you are interested in the
relationship between effect sizes and GCSE grades, you could turn to Coe (2002),
where he notes that the distributions of GCSE grades in compulsory subjects
(i.e. Maths and English) have standard deviations of between 1.5 and 1.8 grades.
As such, an improvement of one GCSE grade represents an effect size
of roughly 0.5 to 0.7. So a teaching intervention which led to an
effect size of 0.6 would, on average, improve each pupil's result by approximately one GCSE
grade.
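The arithmetic behind the GCSE conversion is just a division by the standard deviation. The sketch below uses only the 1.5 and 1.8 grade figures quoted above; the function names are hypothetical.

```python
def grades_to_effect_size(grade_gain, sd_in_grades):
    """Express a gain in GCSE grades as an effect size."""
    return grade_gain / sd_in_grades


def effect_size_to_grades(effect_size, sd_in_grades):
    """Express an effect size as an average gain in GCSE grades."""
    return effect_size * sd_in_grades


# One grade of improvement, at the two ends of the quoted SD range
es_when_sd_is_wide = grades_to_effect_size(1, 1.8)    # a little over 0.5
es_when_sd_is_narrow = grades_to_effect_size(1, 1.5)  # a little under 0.7

# An effect size of 0.6 expressed in grades, at each end of the range
gain_low = effect_size_to_grades(0.6, 1.5)
gain_high = effect_size_to_grades(0.6, 1.8)

print(round(es_when_sd_is_wide, 2), round(es_when_sd_is_narrow, 2))
print(round(gain_low, 2), round(gain_high, 2))
```

So an effect size of 0.6 corresponds to an average gain of somewhere around 0.9 to 1.1 grades, which is where the "approximately one GCSE grade" figure comes from.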
How large is an effect size? - a recent analysis
Slavin (2016) recently published an analysis of effect
sizes which challenges these interpretations of what counts as a large effect size.
Slavin argues that what counts as a large effect size depends on two factors: sample
size and how students were assigned to treatment or control groups
(randomly or through a matching process). This conclusion is based on a review
of twelve meta-analyses and the 611 studies which met the rigorous standards
necessary for inclusion in the Johns Hopkins University School of Education Best Evidence Encyclopedia. The results of this analysis are as follows:
| | Small | Large |
| Matched | +0.32 | +0.17 |
| N (studies) | (215) | (209) |
| Random | +0.22 | +0.11 |
| N (studies) | (100) | (87) |
One way of interpreting the above table is to say that
if we take matched samples (424 studies in total), the average effect size for
studies with fewer than 250 participants (0.32) is nearly twice the
effect size in studies of 250 or more participants (0.17). Similarly, small studies using random assignment generate an average effect size (0.22) which is twice that of larger studies (0.11).
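The comparisons above can be checked directly from the figures in the table. The sketch below simply re-enters those figures; the variable and function names are hypothetical.

```python
# Average effect sizes and study counts as reported in the table above
slavin_table = {
    ("matched", "small"): {"mean_es": 0.32, "n_studies": 215},
    ("matched", "large"): {"mean_es": 0.17, "n_studies": 209},
    ("random", "small"): {"mean_es": 0.22, "n_studies": 100},
    ("random", "large"): {"mean_es": 0.11, "n_studies": 87},
}

total_studies = sum(cell["n_studies"] for cell in slavin_table.values())
matched_studies = sum(
    cell["n_studies"]
    for (assignment, _size), cell in slavin_table.items()
    if assignment == "matched"
)


def small_to_large_ratio(assignment):
    """How many times larger the small-study average is than the large-study one."""
    small = slavin_table[(assignment, "small")]["mean_es"]
    large = slavin_table[(assignment, "large")]["mean_es"]
    return small / large


print(total_studies, matched_studies)
print(round(small_to_large_ratio("matched"), 2))
print(round(small_to_large_ratio("random"), 2))
```

The study counts add up to the 611 studies mentioned earlier, and in both the matched and the random rows the small-study average is roughly double the large-study average.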
So what are the implications of Slavin's analysis for
the evidence-based practitioner?
First, Slavin argues that Hattie's
Visible Learning includes a large number of studies which do not meet the
requirements of the Best Evidence Encyclopedia and which should not be included in
any calculation of the effectiveness or otherwise of educational interventions.
Second, once insufficiently rigorous studies have been removed
from the calculation of Hattie's league table of effect sizes, this
league table should be sub-divided into four separate tables, depending
upon the size of the sample (large or small) and the nature of the sample (random or matched).
Third, the 0.4 hinge point which Hattie suggests teachers and headteachers use to identify those strategies with proven effectiveness is in all likelihood incorrect and should not be used as a screening mechanism for identifying strategies to introduce into a school. Indeed, Slavin's work suggests the need for multiple hinge points.
Fourth, the EEF table used for the interpretation of
effect sizes needs to be re-calibrated to reflect the impact of sample size and of random assignment or matching on average effect size. In other words, what counts as a large effect size
would now appear to be smaller, as effect sizes are unlikely to be as large as
anticipated, particularly in large multi-school studies involving more than 250
pupils. This is particularly urgent, as there are likely to be a number of schools currently using the EEF DIY evaluation tool-kit as a guide to practice, and the current guidelines for interpreting effect sizes may lead to some interventions being mis-classified as having relatively small effect sizes.
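To make the idea of multiple hinge points concrete, here is one possible reading, sketched in Python. It compares a study's effect size with the average reported by Slavin for studies of the same size and design, rather than with a single 0.4 threshold. Using the category averages as benchmarks is my own assumption for illustration, not something Slavin or the EEF prescribes, and the function names are hypothetical.

```python
SMALL_STUDY_CUTOFF = 250  # fewer than 250 participants counts as a small study

# Average effect sizes from Slavin's analysis, used here as illustrative benchmarks
BENCHMARKS = {
    ("matched", "small"): 0.32,
    ("matched", "large"): 0.17,
    ("random", "small"): 0.22,
    ("random", "large"): 0.11,
}


def benchmark_for(assignment, n_participants):
    """Pick the benchmark that matches a study's design and size."""
    size = "small" if n_participants < SMALL_STUDY_CUTOFF else "large"
    return BENCHMARKS[(assignment, size)]


def difference_from_benchmark(effect_size, assignment, n_participants):
    """Positive means above the average for comparable studies."""
    return effect_size - benchmark_for(assignment, n_participants)


# The same effect size of 0.20 reads very differently in two kinds of study
print(round(difference_from_benchmark(0.20, "random", 1000), 2))
print(round(difference_from_benchmark(0.20, "matched", 100), 2))
```

An effect size of 0.20 is above average for a large randomised study but below average for a small matched one, which is exactly why a single hinge point is misleading.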
And finally ...
Where does this leave us, particularly with regard to John Hattie and Visible Learning? For me, it would be difficult to justify using Hattie's league table of effective strategies within a school to determine either changes in teaching strategies or the introduction of school-wide interventions. What I think you can do is use Visible Learning to demonstrate the challenges and limitations associated with research-based teaching. In other words, the benefit of using Hattie's work critically within a school lies in building professional capital rather than in providing a tool for prioritising interventions. If anything, the difficulties arising from Hattie's work suggest an even greater need for teachers to become effective evidence-based practitioners who are able to combine the different sources of evidence (research, school data, stakeholder views and practitioner expertise) to make decisions which will hopefully lead to improved pupil outcomes and staff well-being.
Note
I have not deconstructed Hattie's use of effect sizes - this has been more than ably done by Ollie Orange.
John Hattie has made a laudable contribution to the cause of evidence-informed education policy and practice. Not least, he's popularised the use of Effect Size and the more general idea of testing the impact of an intervention. But there's some collateral damage, and the concept of a 0.4 ES being a threshold is definitely part of it. Cohen himself (author of the famous Cohen's d effect size) used a similar scale, calling 0.2 'small' and 0.5 'medium'. In fact we know from an accumulating body of work on both sides of the Atlantic (and you quote some of it) that large-scale, properly randomised classroom-based evaluations rarely produce ESs above 0.2.
It's taken a decade or so to get evidence on the agenda of thinking school leaders, but a lot of them have bought the 0.4 threshold. It will probably take another decade to re-educate them that, to quote a more contemporary guru, Ben Goldacre, "I think you'll find it's a bit more complicated than that".
Paul Crisp
www.curee.co.uk