Using baseline data for early intervention

Something I think about often is how to choose which pupils to give interventions to. Here, “intervention” means extra support: not simply extra scaffolding, additional feedback, or differentiated worksheets, but real one-to-one help. It is aimed at pupils who are struggling in some way, but it takes a huge amount of resource, so you need to choose wisely where to invest that time.

Traditional intervention

In a typical scenario, a teacher will focus on the lowest-scoring pupils in a class (sometimes called weaker pupils, sometimes less able pupils; neither name is particularly nice). Those pupils are performing the worst, so they clearly need the most intervention. Furthermore, by reducing the standard deviation of the class’s outcomes (i.e., by narrowing the range of grades), the teacher makes their own job easier: less adaptive teaching (née “differentiation”) is required. A win-win!

However, this system is inherently unfair on less able pupils. Suppose I have two pupils, let’s call them Pupil A and Pupil B, both of whom are scoring a ‘5’. In the standard system, the teacher will hit both pupils with the same intervention because they are both producing the same level of output.

Now let’s look at the baseline data of these two pupils. In my scenario, Pupil A is predicted to score a ‘4’ and Pupil B is predicted to score a ‘6’. With his ‘5’, Pupil A is outperforming his expectations and is a superstar. However, Pupil B’s ‘5’ indicates underperformance: She is not yet at the level that she should be (‘6’). Using the same intervention on both pupils seems unreasonable, even nonsensical!

Let’s assume that I am Pupil A. I do not really “get” it, but I try my best, and I feel like I am achieving because my grades are better than I think they should be. But wait: my teacher is pulling me to the side after every lesson to point out my mistakes, which, no matter how well intended, makes me feel admonished. This makes me less motivated for the subject, which in turn makes me upset, and I carry this feeling into my next lesson (a subject I really like, but not today). Remind me, why do we expect the same of both pupils moving forwards?

With the above in mind, is there a different way to identify pupils requiring intervention?

Individualised interventions based on baseline data

Rather than focusing on outcomes as compared to other pupils, I compare pupils to their own individual baseline data. What does this mean? It means for each individual pupil I compare their current level of achievement to where they are predicted to be. I then compare this across the class.

Here is an example class:

| Pupil | Current test score | Baseline prediction | Difference (current minus prediction) |
| --- | --- | --- | --- |
| Pupil A | 7 | 7.7 | −0.7 |
| Pupil B | 7 | 8.0 | −1.0 |
| Pupil C | 6 | 8.9 | −2.9 |
| Pupil D | 5 | 6.1 | −1.1 |
| Average | 6.3 | 7.7 | −1.4 |

Table: Example data for fictitious pupils in my invented class.

To add meat to the scenario, the class is in Year 10 which explains why they are all underperforming as compared to their baseline predictions – they still have a year to get to that level!

Okay, as a teacher, how do I interpret the data? My first observation is that the class is currently averaging 1.4 below their baseline prediction. However, this is not a huge concern because historical data for my subject (Physics) tells me to expect an uplift of +1 between the Year 10 January Mock and the Year 10 Summer Exam, and another +1 between the Year 10 Summer Exam and the final GCSEs. This means that the class is on track to score, on average, +0.6 above their baseline prediction in their GCSE. A positive value add, I get to keep my job, and happiness all round.

What else does the data tell me? In the “traditional” model of intervention outlined at the start of this post, Pupil D clearly needs extra help because they are only scoring a 5, which is the lowest score in the class and is significantly lower than the class average of 6.3. However, the traditional model completely misses the fact that, with an expected uplift of +2 between now and the GCSE, Pupil D is on track to score a 7 in his finals and deliver a healthy value add of +0.9. I am giving someone an intervention who almost definitely does not need it. Whilst it would be great to put extra effort into every pupil, the reality is that my time is limited. Can my efforts be better directed elsewhere?

In my model of individualised intervention, I am not interested in the current test score of a pupil as compared to the class. Instead, I am interested in the difference between a pupil’s current test score and their individual baseline prediction, i.e., I care most about the final column in the table above. Take a look at Pupil C. Pupil C scored a 6 in her Year 10 January Mock. This absolutely did not flag her for my attention in the traditional method of identifying pupils needing help. However, in my method, she is predicted 8.9 so is currently 2.9 below where she should be. With a +2 expected grade uplift between now and her GCSE, she will end with a negative value add of -0.9. Comparing the pupil to her own baseline data is clearly indicating a problem.

Next, looking across the class, the other pupils all have an average “current” value add of around -1 (actually -0.93, the average of -0.7, -1.0, and -1.1), i.e., they are fairly homogenous with my measure. Pupil C’s score of -2.9 sticks out like a sore thumb. She is currently on a trajectory to return negative value add but she is placed in a class on track to achieve positive value add at the end of next year. Comparing the pupil’s current performance versus predicted performance to that of the class reinforces the problem that her individual data originally indicated.
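The arithmetic above is simple enough to sketch in a few lines of Python. This is a toy illustration using the invented class data from the table; the +2 expected uplift is the figure assumed earlier for my subject, not a universal constant:

```python
# Invented class data: (current mock score, baseline prediction) per pupil.
pupils = {
    "Pupil A": (7, 7.7),
    "Pupil B": (7, 8.0),
    "Pupil C": (6, 8.9),
    "Pupil D": (5, 6.1),
}

# Assumed uplift: +1 to the Year 10 Summer Exam, +1 again to the final GCSE.
EXPECTED_UPLIFT = 2.0

# Projected GCSE value add = current score + expected uplift - baseline prediction.
projected_value_add = {
    name: round(current + EXPECTED_UPLIFT - prediction, 1)
    for name, (current, prediction) in pupils.items()
}

# Flag anyone on track for a negative value add.
targets = [name for name, v in projected_value_add.items() if v < 0]

print(projected_value_add)  # Pupil C comes out at -0.9; everyone else is positive
print(targets)              # ['Pupil C']
```

Only Pupil C projects to a negative value add, matching the analysis above: the traditional method flags Pupil D, this method flags Pupil C.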

My overall belief is that pupil outcomes come down to three basic things (in reality many more, but predominantly three):

  1. Inherent ability. The “nature” part of nature versus nurture. This is what baseline testing aims to parameterise.
  2. Aspiration and motivation. Aspiration mainly comes from the family background, whereas motivation can come from family, friends, teachers, etc.
  3. Teaching. The specific match between a pupil and their teacher/teaching style.

With this in mind, Pupil C has inherent ability (a baseline prediction of 8.9 tells us that!), so her current trajectory is likely being hindered by her aspiration and motivation, and/or teaching. As her teacher, I have a lot of control over her motivation and essentially all of the control over her teaching, so I am the best key to unlock that inherent ability. If I am the best key to unlock her potential, she is surely where I need to focus my intervention. I think I should be putting extra effort into Pupil C with my one-to-one intervention.

Ultimately, I focus on the most underperforming pupils in the class until they are no longer the most underperforming pupils. Then I move on to the “new” most underperforming pupils, and repeat. This makes it a system of continuous improvement: I always focus on the pupils where I think the easiest gains are to be made. They are low-hanging fruit, ripe to be picked.
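That cycle of re-ranking can be sketched as a tiny helper (again a toy model with the invented data; the “recovery” score below is purely hypothetical):

```python
def intervention_target(scores, predictions):
    """Return the pupil currently furthest below their own baseline prediction."""
    return min(scores, key=lambda name: scores[name] - predictions[name])

scores = {"Pupil A": 7, "Pupil B": 7, "Pupil C": 6, "Pupil D": 5}
predictions = {"Pupil A": 7.7, "Pupil B": 8.0, "Pupil C": 8.9, "Pupil D": 6.1}

print(intervention_target(scores, predictions))  # Pupil C (-2.9 below prediction)

# Hypothetically, after a term of intervention Pupil C recovers to an 8.
# Re-running the ranking now hands the focus to the next-worst underperformer.
scores["Pupil C"] = 8
print(intervention_target(scores, predictions))  # Pupil D (-1.1 below prediction)
```

The point of the sketch is that the target is always relative to each pupil’s own baseline, so the focus naturally rotates as pupils close their individual gaps.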

What is baseline data?

Throughout the above argument I have referred to baseline data. By this, I mean national testing that all pupils do at regular intervals throughout their education. In the UK, this would be MidYIS, ALIS, and YELLIS tests and the subject-specific grade predictions that these give. With my Year 10 example above, I would be using grade predictions from MidYIS data.

Of course, there can be issues with baseline data. They describe large data sets well but can struggle for accuracy at the individual pupil level. For every pupil that scores +1 as compared to their baseline prediction, there is another pupil that scores -1 for the system to average out. Also, a predicted grade of, say, 7.3, has an error bar on it which depends on the subject (typically around 0.5). I would be upset if this pupil scored a 7 and I would be happy if they scored an 8.

Ethical concerns with my method

At the start of this post I discussed my ethical objections to the traditional method of choosing which pupils need interventions. I then went on to discuss how I like to do it. However, I am not out of the woods from an ethical standpoint. Imagine two pupils in a class with the same test score of a 7. The first pupil is predicted a 7, so would not get an intervention from my method. The second pupil is predicted an 8, so would very possibly get an intervention. This means I have two pupils in the same class both scoring the same on tests, but I only choose to give extra help to one of them. Just as with the traditional method, there are clearly ethical issues with this method too. (Imagine being the pupil not getting the extra help when, on the face of it, you considered yourself to be in the same boat as the other pupil!)

My method also appears to go against common wisdom. In the current pedagogical paradigm, it is unacceptable for a teacher to pigeonhole pupils based on their ability, because how could they ever escape that label? However, that is not what I am doing. All of my pupils have access to adaptive teaching resources in every lesson, and all can achieve the same outcomes. With my one-to-one intervention, I am talking about which one or two of the two hundred pupils I teach should receive my very limited extra time.

My method may also put too much weight on baseline data. What is the point in taking GCSEs if the baseline data is already good enough to mark out a pupil’s entire future? It is indicative at best.

A hybrid method?

I think my method of choosing who needs intervention makes more sense than the traditional approach. I think it focusses more on the softer side of teaching (motivation, etc), rather than the harder side (knowledge and understanding), which has the potential to give more gains. However, it clearly does a different job to the traditional method.

Perhaps the way forward is to use both methods? A hybrid approach between the traditional method and my method may be best. In fact, in one school where I worked, we did such a thing. Teachers focussed on traditional methods (not by policy, but by tradition/culture), and heads of year focussed on the individualised approach I have outlined here. This meant that weaker pupils got extra attention on a regular basis (e.g., every lesson) and underperformers got extra attention on a roughly termly basis. I like the sound of hybridisation.

On a final note, everywhere I have worked to date, my classes have been Set or Streamed, so the pupils in any given class all score roughly similar grades to each other. It is very possible for an assessment to return fifteen pupils on a ‘6’ and fifteen on a ‘7’. The traditional approach of choosing who to give an intervention to would do nothing to help me. Perhaps this is why I had to find another way. Maybe in schools that do not Set or Stream, the traditional approach is better.

Weaker versus underperformer. If you only had time to give one-to-one interventions to a single pupil, how would you choose?
