Friday, 21 April 2017

Why are EPL clubs preferring to recruit foreign managers?

A few weeks ago I read a very interesting article by Sean Ingle in the Guardian comparing the performance of foreign managers in the EPL with their British & Irish counterparts. Using data compiled at the University of Bangor in Wales, the headline statistic was that foreign managers have averaged 1.66 points per game since 1992/93, compared to just 1.29 for those from the UK & Ireland. As Ingle points out: this amounts to a whopping 14 extra points over the course of the season.

My first thought was that these results might be misleading because of a potential selection bias. If overseas managers have tended to work for the top EPL clubs then of course they would have a higher average points per game, simply because a larger proportion of them managed big clubs than their domestic counterparts. In that case it’s not a fair comparison. In the first part of the blog I will look at this in more detail: will the result reported in the Guardian stand up to further scrutiny?

Nevertheless, with only seven of the twenty EPL clubs starting this season with a British or Irish manager, it’s clear that clubs are showing a preference for recruiting from overseas. In the second part of this blog I’ll discuss one of the factors that may be motivating EPL clubs to hire foreign managers.

The rise of the foreign manager.


Figure 1 shows the breakdown of managers by nationality for each EPL season since 1992/93 (ignoring caretaker managers[1]). The red region represents English managers, blue the rest of the UK (Scotland, Wales and Northern Ireland), green the Republic of Ireland, and grey the rest of the world. The results are presented cumulatively: for example, this season 28% of EPL managers (7) have been English and 12% (3) from Scotland and Wales; the remaining 60% of managers in the EPL this season have been from continental Europe (13), South America (1) or the US (1).

Figure 1: Stacked line chart showing the proportion of EPL managers by nationality in each season since 1992/93. Current season represented up to the 1st March 2017. The proportion of managers that are English managers has fallen from two-thirds to one-third over the past 24 years.
The figure shows a clear trend: the number of English managers has significantly declined over the last 24 years. Back in 1992, over two-thirds of managers in the EPL were English and 93% were from the UK as a whole. Since then, the proportion of English managers has more than halved, replaced by managers from continental Europe and, more recently, South America[2].

Is the trend towards foreign managers driven by supremacy over their domestic rivals? 


The table below compares some basic statistics for UK & Irish managers with those of managers from elsewhere. Excluding caretaker managers, there have been 283 managerial appointments in the EPL era, of which over three-quarters have been from the Home Nations or the Republic of Ireland. Of the 66 foreign EPL appointments, nearly half were at one of the following Big6 clubs: Man United, Arsenal, Chelsea, Liverpool, Spurs and Man City[3]. However, only 12% of British or Irish managerial appointments have been at one of these clubs. This is the selection bias I mentioned at the beginning – the top clubs are far more heavily weighted in one sample than the other.


At first glance, foreign managers have performed better: collecting 1.66 points/game compared to 1.29 for their UK & Irish counterparts (reproducing the results published in the Guardian article). However, this difference is entirely driven by the Big6. If you look at performance excluding these clubs it’s a dead heat – foreign managers have performed no better than domestic ones, both averaging 1.2 points per game.

At the Big6 clubs, foreign managers have collected 0.2 points/game more than their UK counterparts. This difference is almost entirely driven by Chelsea and Man City, where foreign managers have collected 0.8 and 0.7 points per game more than UK & Irish managers[4].  But since Abramovich enriched Chelsea in 2003, they have not hired a single British or Irish manager[5]. A similar story at Man City: in only one and a half of the nine seasons since the oil money started to flow into Manchester have they had a British manager (Mark Hughes). Both clubs had very different horizons before and after their respective cash injections, and they have hired exclusively from abroad since[6].

So it seems that, when you look closely, you find little convincing evidence that foreign managers have performed better than domestic managers in the EPL era. Why then do clubs prefer to look beyond these shores?

Access to foreign markets


Previous success is clearly a key criterion in manager recruitment, but I wonder if there are specific attributes that give foreign managers an edge over English candidates. In particular, foreign managers have local knowledge and contacts that might give a club the edge over domestic rivals in signing overseas talent. You could argue that Wenger’s initial success at Arsenal was influenced by his ability to identify and sign top French players at a time when France was dominating international football. Rafael Benitez certainly mined his knowledge of Spanish football to successfully bring a number of players to Liverpool.

In hiring foreign managers, do clubs improve their access to transfer markets overseas? As the table above shows, foreign managers sign players from abroad at roughly twice the rate of domestic managers -- an average of 5 per season compared to 2.6 per season for their British or Irish counterparts. The result does not change significantly if you exclude the Big6 clubs, or if you only look at players signed in the last 15 years.

This doesn’t prove the hypothesis that clubs sign foreign managers to improve access to foreign players, but it does support it. Of course, being British isn’t necessarily a barrier to signing top overseas talent; after all, Dennis Bergkamp, arguably Arsenal’s greatest ever import, was bought by Bruce Rioch. But in an era in which English players come at a premium, it makes sense for clubs to hire managers that will enable them to lure high-quality players from the continent.

------------------------

Thanks to David Shaw and Tom Orford for comments.

[1] I define a caretaker manager as one that remained in post for less than 60 days.
[2] The proportion of managers from Scotland, Wales and Northern Ireland has generally remained stable at about 25% (although very recently it has fallen).
[3] The first five are the five best finishers in the EPL era, on average. I decided Man City warranted inclusion because of their two EPL titles.
[4] Of the others, Wenger and Ferguson largely cancel each other out and foreign managers have performed only marginally better at Spurs and Liverpool.
[5] Indeed, you have to go all the way back to Glenn Hoddle’s departure in 1996 to find Chelsea’s last British or Irish manager.
[6] Mark Hughes was appointed before the Abu Dhabi group bought Man City. 


Wednesday, 29 March 2017

Keep calm and let them carry on: are mid-season sackings worth it?

It’s February and your club is in trouble. Following a run of poor results, they are hovering just above the bottom three. Fans and pundits alike are writing them off. The remainder of the season is destined to be a grim struggle for the points: a few snatched draws, the odd scrappy win, but mostly meek surrender to mid-table and above teams.

The board panics and fires the manager; it seems the only remaining option. Granted, he did well last season – brought in some good players and promoted others, got them playing attractive football. But now the team needs defibrillation: a new manager with fresh ideas, inspiring players keen to prove themselves to him. A five-game honeymoon period and, come spring, everything will be rosy again. After all, it worked so well for Sunderland last season.

This story seems to play out several times each season, but does it actually make any sense to fire a manager mid-season? A few years ago, Dutch economist Dr Bas ter Weel compared points-per-game won immediately before and after a manager has been fired in the Eredivisie. He demonstrated that, while there does tend to be an uptick in results in the following six or so games, it has nothing to do with the change in manager -- it's just mean reversion. Analogous to having rolled 6 ones in a row, the results were very likely to improve in the next 6 matches (or rolls) irrespective of whether the manager was fired or not.
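Ter Weel's point about mean reversion can be demonstrated with a toy simulation. The sketch below uses made-up win/draw/loss probabilities (not his data): it conditions on a poor six-game spell and shows that results in the following six games bounce back towards the team's underlying average, with no manager change involved.

```python
import random

random.seed(42)

# Each "team" draws match results from a fixed distribution:
# win (3 pts) 40%, draw (1 pt) 30%, loss (0 pts) 30%.
# Expected points per game is therefore 3*0.4 + 1*0.3 = 1.5.
def simulate_season(n_games=38):
    return [random.choices([3, 1, 0], weights=[0.4, 0.3, 0.3])[0]
            for _ in range(n_games)]

after_bad_run = []
for _ in range(20000):
    season = simulate_season()
    # Find a poor 6-game spell (4 points or fewer)...
    for i in range(len(season) - 12):
        if sum(season[i:i+6]) <= 4:
            # ...and record points per game over the next 6 games.
            after_bad_run.append(sum(season[i+6:i+12]) / 6)
            break  # one spell per season is enough

print(round(sum(after_bad_run) / len(after_bad_run), 2))
# Close to the true mean of 1.5: results "improve" after a bad
# run even though nothing about the team has changed.
```

The team's quality is constant throughout, so the post-slump uptick is pure regression to the mean, exactly the effect ter Weel attributed to post-sacking "bounces".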

In this blog I’m going to focus more on the longer term. Specifically, I’ll look at league rankings, comparing each team’s position at the end of the season against its position at the point when the manager was fired. In the harsh light of data, is there any evidence that clubs that sack their manager before April perform better over the remainder of the season than their closest competitors?

Mid-season sackings in the EPL and Championship


To answer this question, I identified every in-season manager change that has occurred in the EPL since the 1996/97 season, and in the Championship since 2006/07. Discarding outgoing caretaker managers (which I define as managers in position for four weeks or less) gave me a sample of 259 changes: 117 changes in the EPL and 142 in the Championship.

I then classified each manager departure into one of three categories: sacked, resigned, and mutual consent. For example, of the 117 in-season manager departures in the EPL over the last 20 seasons, 71 were fired, 36 resigned and 10 left by mutual consent. In this analysis we’re only interested in those that were forced out, which I will define as either sacked or leaving by mutual consent (the latter typically being a nice way of saying that he was sacked).

Managerial changes occur throughout the season; however, I’m going to focus on those that occur in the middle portion of the season, from November through to March. Manager firings that occur early in the season can be due to reasons other than the team’s recent performance. Likewise, those that occur late in the season tend to be made with an eye to the summer and the following season. Retaining only the mid-season sackings left me with a final sample of 111, just over half of them at EPL clubs.

Finally, I also identified a sample of clubs that were in a similar league position to those that fired their manager (within 3 points on the date it was announced) but retained the same manager for the entire season. We’ll compare this baseline sample with the manager-change sample and see if the latter did any better.
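The baseline construction can be sketched as a simple matching rule. The snippet below is illustrative only (club names and points are hypothetical), and in practice the sample is also restricted to clubs that kept their manager for the whole season:

```python
def baseline_matches(sacking_club_points, candidates):
    """Return clubs whose points total on the sacking date is
    within 3 points of the club that changed manager."""
    return sorted(club for club, pts in candidates.items()
                  if abs(pts - sacking_club_points) <= 3)

# Hypothetical league-table snapshot on the date of a sacking,
# for a club sitting on 20 points:
table = {"Albion": 19, "Rovers": 23, "United": 31, "Wanderers": 17}
print(baseline_matches(20, table))  # → ['Albion', 'Rovers', 'Wanderers']
```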

Results


Figure 1 plots the league position on the date the manager was removed (x-axis) against league position at the end of the season (y-axis), for each team in the manager-change sample. The black circles represent EPL clubs; the blue triangles Championship clubs. The red diagonal line indicates the same league position at departure and season end. The shaded regions above and below the line encompass teams that finished 3, 6 or 9 places higher or lower than their position when the manager was sacked.

It’s clear that the majority of mid-season manager firings occur at clubs in the bottom half of the table. Of the EPL firings, 89% were at teams below 10th, and 66% were at teams in the bottom five places. Likewise, in the Championship 82% of sackings were at teams below 12th, and 51% at teams in the bottom 6.  Of the 6 sackings that occurred at EPL teams in the top-half of the table, 4 were at Chelsea[1].

Figure 1: the league position of EPL and Championship teams on the date their manager was fired (x-axis) against their league position at the end of the season (y-axis). The black circles represent EPL clubs, the blue triangles Championship clubs. The red diagonal line indicates the same position at departure and season end; the shaded regions above and below encompass teams that finished 3, 6 or 9 places higher or lower than their position when the manager was sacked.

There is no evidence that teams gain any kind of advantage by sacking their manager. The median position change is zero, i.e. no change. Specifically: 30% of teams end in a lower position than when the manager was sacked, 23% see no change in their position and 48% see an improvement. If we compare this to the baseline sample -- clubs in similar positions in the table that retained the same manager for the entire season -- we find roughly the same proportions: 38% ended the season in a lower position, 17% saw no change in their position and 45% improved their position.

We can be more specific and look at clubs that were in the relegation zone when the manager departed. As the table below shows, of those that fired their manager, 34% survived; of those that did not, 39% survived. There is no evidence that firing the manager helps avoid relegation.


But what about Leicester?


Leicester fired Ranieri more than a month ago and have not lost since. They’re currently 2 places above their league position after his last game and seem likely to continue their recovery up the table. Didn’t they benefit from firing their manager?

While Figure 1 demonstrates that, on average, a club’s league position is not expected to improve after their manager is sacked, some individual clubs clearly did go on to significantly improve their league position. For instance, when Brendan Rodgers was fired by Reading in 2009/10 they were in 21st position; under his replacement, Brian McDermott, they went on to finish in 9th. Crystal Palace sacked Neil Warnock just after Christmas in 2014 when they were in 18th position; by the end of the season Alan Pardew had guided them to 10th.

On the other hand, clubs that do not switch manager also undergo miraculous recoveries. In the 2001/02 season Blackburn Rovers rose from 18th place in mid-March to 10th place by the end of the season. In late November 2008, Doncaster Rovers were rooted at the bottom of the Championship in 24th place; an eight match unbeaten run lifted them up to mid-table and they finished in a respectable 14th place. Both teams retained the same manager for the entire season: Graeme Souness and Sean O'Driscoll, respectively.

There are clearly circumstances that might necessitate a managerial firing in the middle of the season -- Leicester may be an example of this. But to pull the trigger without a clear diagnosis of what has gone wrong is a sign of desperation and poor decision-making. Indeed, over the last twenty seasons, EPL managers appointed during the summer months have, on average, lasted over 100 days longer in their jobs than those appointed during the season. Coupled with the large compensation payments that are often necessary to remove a manager, mid-season changes may actually end up harming the long-term prospects of a club.



--------------------------
[1] Specifically: Gullit in 97/98, Scolari in 08/09, Villas-Boas in 11/12 and Di Matteo in 12/13.

Saturday, 11 February 2017

The Wisdom of Crowds: A Census of EPL Forecasts

Introduction


We're nearly two-thirds of the way through the 2016/17 EPL season, which seems a good time to try to predict what might happen. Chelsea’s nine-point cushion and relentless form make them clear favourites for the title; not since Newcastle in 1996 has a team blown such a lead. Just five points separate second from sixth as the remaining superpowers battle for Champions League places: who will miss out? Perhaps the mantra ‘most competitive EPL season ever’ is best reserved for the relegation fight, though. Six teams, two points and an ever-changing landscape. Amongst them: last season’s heroes, Leicester. Too good to go down?

Most TV pundits are definitive in their predictions, indeed they are typically paid to be so. Others prefer to let the numbers do the talking. Football analysts around the world build mathematical models to measure team strength and calculate the probability of match outcomes. Rather than saying “team A are likely to beat team B”, they'll say “I estimate that there is an 85% probability that team A will win”.

There is no agreed method for designing a forecast model for football. Consequently, predictions vary from one model to another. However, there is also strength in diversity. Rather than comparing and contrasting predictions, we can also collect and combine them to form a consensus opinion.

Last January, Constantinos Chappas did just that. Following gameweek 20, he collected 15 sets of predictions, averaging them to produce a ‘consensus forecast’ for the outcome of the 2015/16 EPL season. His article was published on StatsBomb here; we’ll return to the success of last year’s predictions at the end. First, I’m going to repeat the exercise for the 2016/17 EPL season. What do the combined predictions say this time around?

Participants


In total there were 15 participants this year, many of whom offered up their predictions in response to my twitter appeal. A big thank you goes out to (in no particular order):

@8Yards8Feet, @petermckeever, @goalprojection, @11tegen11, @cchappas, @SteMc74, @fussbALEXperte, @SquawkaGaming, @EuroClubIndex, @opisthokonta and Sky Sports (via @harrydcarr)

To these, I added forecasts from the FT and FiveThirtyEight; I haven’t been in contact with them personally, but their forecasts are publicly available. I also added a bookmaker’s average, calculated by collecting the odds published on oddschecker.com and averaging the implied probabilities. That’s 14 - the final participant was myself (@eightyfivepoint).
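The bookmaker's average requires converting each set of odds into probabilities first. A minimal sketch of that step, with hypothetical decimal odds, using simple normalisation to strip out the bookmaker's margin (other margin-removal schemes exist):

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to probabilities, removing the
    bookmaker's margin (overround) by simple normalisation."""
    raw = [1.0 / o for o in decimal_odds]
    total = sum(raw)  # > 1 because of the bookmaker's margin
    return [p / total for p in raw]

# Hypothetical home/draw/away odds for a single match:
probs = implied_probabilities([1.50, 4.33, 6.50])
print([round(p, 3) for p in probs])  # → [0.634, 0.22, 0.146]
```

Averaging these normalised probabilities across bookmakers gives the kind of consensus figure used here.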

The Predictions


Before we get into the results, a little bit about how they’ll be presented. I’ve followed last year’s article and presented the forecasts as box-plots. These are a simple graphical representation of the distribution of forecasts for a particular outcome. The height of the shaded area represents the interquartile range: the 25th to 75th percentiles. By definition, half the forecasts lie within this range -- it provides a decent estimate of the variability of the predictions. The black horizontal line in the middle is the median (50th percentile); I’ll sometimes refer to this as the consensus forecast. The ‘whiskers’ extending out vertically from each box show the 5th to 95th percentiles. All but the highest and lowest forecasts for a given outcome will lie within this range.
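These summary statistics are easy to compute directly. A minimal sketch using Python's standard library, with a hypothetical set of 15 forecasts for a single outcome (in percent):

```python
import statistics

# Hypothetical forecasts from 15 participants for one outcome,
# e.g. the probability that a given team wins the league:
forecasts = [80, 82, 84, 85, 86, 87, 87, 88, 88, 89, 90, 91, 92, 92, 93]

median = statistics.median(forecasts)     # the "consensus" forecast
q = statistics.quantiles(forecasts, n=4)  # quartiles
p = statistics.quantiles(forecasts, n=20) # 5%, 10%, ..., 95%

print(median)       # black line in the box-plot
print(q[0], q[2])   # box: 25th and 75th percentiles
print(p[0], p[-1])  # whiskers: 5th and 95th percentiles
```

With `n=20`, `statistics.quantiles` returns the 5th through 95th percentiles in 5% steps, so the first and last elements give the whisker positions.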

On each plot I've also plotted the individual predictions as coloured points. They are identified by the legend on the right.

So, without further ado, here are the forecasts for this 16/17 EPL season.

The Champions



Not surprisingly, Chelsea are the clear favourites: the median forecast gives them an 88% chance of winning the league, as do the bookmakers. There’s not a huge amount of variability either, with the forecasts ranging from 80% to 93%. If Chelsea do suffer some kind of meltdown then it’s probably Spurs or City that would catch them, with median predictions of 5% and 4%, respectively. Liverpool and Arsenal are rank outsiders and any of the other teams finishing top would be an enormous surprise.

The Top Four



Now this is where things get a bit more interesting. Chelsea seem almost guaranteed to finish in the Champions League places, which leaves five teams fighting it out for the remaining three. Tottenham and Man City are heavily favoured: both have a median probability of at least 80% and the whiskers on their box-plots do not overlap with those of the next team, Liverpool.

The real fight is between Klopp and Wenger. Statistically they are almost neck-and-neck, with their box-plots indicating that the individual predictions are broadly distributed. Look closely and you see an interesting negative correlation between them: those that are above average for Liverpool tend to be below average for Arsenal (and vice-versa). You can see this more clearly in the scatter plot below. The reason must be methodological; to understand it we’d have to delve into how the individual models assess the teams' relative strength. Note that the bookies are sitting on the fence - they've assigned both Arsenal and Liverpool a 53% chance of finishing in the top four.


Man United are outsiders, but the consensus forecast still gives them about a 1 in 3 chance of sneaking in. Interestingly, the bookmakers’ odds – which imply a 44% chance of United finishing in the Champions League positions – are way above the other predictions. Perhaps their odds are being moved by heavy betting?

The Relegation Candidates



Two weeks ago it looked like Sunderland and Hull were very likely to go down. Since then, the relegation battle has been blown wide open. The bottom six teams seem set for a nervous run-in and neither Bournemouth nor Burnley will feel safe.

The principal candidates for the drop are Sunderland, Hull and Palace, all of whom have a median predicted relegation probability greater than 50%. There is clearly a lot of variability in the predictions though, with the Eagles in particular ranging from 38% to 74%. You can certainly envisage any one of them managing to escape.

The next three clubs - Middlesbrough, Swansea and Leicester - are all currently level on 21 points, yet the median predictions imply that Middlesbrough (42%) are nearly twice as likely to go down as Leicester (22%). I suspect that this is because some models are still being influenced by last season’s results (for instance, Leicester's forecasts appear to bunch around either 15% or 30%). The amount of weight, or importance, placed on recent results by each model is likely to be a key driver of variation between the predictions.

What about <insert team’s name here>?


The grid below shows the average probability of every EPL team finishing in each league position. Note that some of the models (such as FiveThirtyEight, Sky Sports and the bookmakers) are excluded from the plot as I wasn’t able to obtain a full probability grid for them. Blank places indicate that the probability of the team finishing in that position is significantly below 1%.

An obvious feature is that Everton seem likely to finish in 7th place. The distribution gets very broad for the mid-table teams: Southampton could conceivably finish anywhere between 7th and 18th.


Last year’s predictions.


So how did last year’s predictions pan out? Leicester won the league, but the median forecast predicted only a 4% chance of this happening (compared, for example, to a 40% chance that they would finish outside the Champions League places). However, the top four teams were correctly predicted, with a high probability of finishing there having been assigned to each of Leicester, Arsenal, City and Spurs.

Down at the bottom, both Newcastle and Villa were strongly expected to go down and they did. Sunderland were predicted to have only a 15% chance of staying up, yet the Black Cats escaped again. Instead, Norwich went down in their place having been 91% to stay up. Other surprises were Southampton (7 places higher than expected), Swansea (5 higher) and Crystal Palace (down 7).

How good were last year’s forecasts, overall? This is a tricky question and requires a technical answer. The specific question we should ask is: how likely was the final outcome (the league table) given the predictions that were made? If it was improbable, you could argue that it happened to be just that – an outlier. However, it could also be evidence that the predictions, and the models underlying them, were not particularly consistent with the final table.

We can attempt to answer this question using last season’s prediction grid to calculate something called the log-likelihood function: the sum of the logarithms of the probabilities of each team finishing in their final position. The result you obtain is quite low: simulations indicate that only about 10% of the various outcomes (final rankings) allowed by the predictions would have a lower likelihood. It is certainly not low enough to say that they were bad, it just implies that the final league table was somewhat unlikely given the forecasts. A similar result this time round would provide more evidence that something is missing from the predictions (or perhaps that they are too precise).
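To make the calculation concrete, here is a toy version with a hypothetical three-team prediction grid (the real calculation uses the full twenty-team grid):

```python
import math

# Toy example: grid[team][pos] is the forecast probability that
# the team finishes in that position (hypothetical numbers; each
# row sums to 1).
grid = {
    "A": [0.70, 0.20, 0.10],
    "B": [0.20, 0.50, 0.30],
    "C": [0.10, 0.30, 0.60],
}

# Suppose the season ends A 1st, B 2nd, C 3rd (0-indexed positions):
final_table = {"A": 0, "B": 1, "C": 2}

# Log-likelihood: sum of the logs of the probabilities of each
# team finishing where it actually did.
log_likelihood = sum(math.log(grid[team][pos])
                     for team, pos in final_table.items())
print(round(log_likelihood, 3))  # → -1.561
```

Repeating this for many final tables simulated from the grid gives the distribution against which the actual season's value can be compared, which is how the "about 10%" figure above is obtained.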

A final caveat.


Having said that – models are only aware of what you tell them. There are plenty of events – injuries, suspensions, and managerial changes – of which they are blissfully unaware but could play a decisive role in determining the outcome of the season. Identifying what information is relevant – and what is just noise – is probably the biggest challenge in making such predictions.

I will continue to collect, compare, combine and publicize forecasts as the season progresses: follow me on twitter (@eightyfivepoint) if you'd like to see how they evolve.


(This is a piece that I wrote for StatsBomb; I've copied it here.)



Wednesday, 18 January 2017

Poor FA Cup crowds erode home advantage

I was struck by the poor attendances at some of the FA Cup 3rd round matches this month. 17,632 turned up to watch Sunderland vs Burnley, less than half Sunderland’s average home gate this season. It was a similar story at Cardiff vs Fulham, Norwich vs Southampton and Hull City vs Swansea, all of which saw crowds below 50% of their league average this season.

An interesting statistic was recently posted on Twitter by Omar Chaudhuri, of 21st Club (@OmarChaudhuri). If you take all 181 FA Cup ties that involved two EPL teams (ignoring replays and matches at a neutral venue) since the 2000/01 season, you find that the home team won 46% of the matches and the away team 30%. However, if you look at the equivalent league match between the teams in the same season, you find that the home team won 52% of the matches and the away team 22%. Although the sample size is small, the implication is that home advantage is less important in cup matches.

Lower FA Cup crowds and diminished home advantage - are the two connected? This seems a reasonable hypothesis, but I’ve never seen it demonstrated explicitly. I aim to do so in this post.

Cup Matches vs League Matches


To answer the question I’ll look specifically at cup ties that involved teams from the same division, from League 2 to the EPL, and compare the outcomes to the equivalent matches in the league. This approach isolates the influence of any changes in circumstance between the two games – including lower or higher attendance.

I identified every FA Cup tie, from the third round onwards, that involved two teams from the same division since 2000/01[1], along with the corresponding league match. I then removed all matches at a neutral venue[2]. This left me with a sample of 357 cup matches, and the same number in the league.

I then measured what I’ll refer to as the home team’s attendance ratio -- their average home-tie FA Cup attendance divided by their average home league attendance -- in each of the last 16 seasons. Season-averaged attendance statistics for both league and FA Cup games (3rd round onwards) for every team were taken from www.worldfootball.net. Ideally, you would directly compare the attendance of each FA Cup tie with that of the equivalent league game. However, I don’t have the data for individual games, so instead I used each team’s season averages for cup and league as a proxy (but if anyone has this data and is willing to share it, please let me know!).

I used the attendance ratio to divide my sample of matches into three sub-samples: well-attended, mediocre and poorly-attended matches. Well-attended matches are defined as cup ties in which the crowd was greater than 90% of the home team’s league average; a mediocre attendance as between 70% and 90% of the league average; and a poorly-attended match as less than 70%. For each group, we’ll look at differences in the fraction of home wins, away wins and draws between the FA Cup ties and league matches.
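The sub-sample classification can be written down as a simple rule. A sketch with hypothetical attendance figures:

```python
def attendance_bin(cup_attendance, league_average):
    """Classify a cup tie by the home team's attendance ratio:
    cup crowd divided by the (season-average) league crowd."""
    ratio = cup_attendance / league_average
    if ratio > 0.9:
        return "well-attended"
    elif ratio > 0.7:
        return "mediocre"
    return "poorly-attended"

# Hypothetical examples:
print(attendance_bin(17632, 41000))  # → poorly-attended
print(attendance_bin(30000, 32000))  # → well-attended
```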

Table 1 summarizes the results. Let’s look at the first three lines - these give outcomes for cup ties in which the attendance was at least 90% of the league average. There have been 148 such matches in the last 16 seasons: the home team won 56%, the away team 23% and 21% were draws. In the corresponding league matches, the home team won 51%, the away team 24%, and it was a draw in 26%. So, there was a small increase in the proportion of home wins relative to the league outcomes, with correspondingly fewer draws. In about a third of these ties the attendance was greater than their league average: the home side may have benefited from a more vociferous support.

Table 1

The next set of lines in Table 1 shows the results for the FA Cup matches that had a mediocre attendance – those in which the attendance ratio was between 70% and 90% of the home side’s league average. The home team won 44% of these matches, slightly below the home win rate in the corresponding league matches. There is again a fall in the number of draws, but this time the away team benefits, winning 6% more often than in the league matches. The differences are small, but there is some evidence that away teams were benefitting from the below-average attendance.

However, the increase in away wins becomes much more striking when we look at poorly-attended cup matches: those in which the attendance was less than 70% of the home team's league average. The home team won only 34% of these ties, 14% below the corresponding league fixtures. The away win percentage increases to 42% and is 19% above the league outcome. Indeed, the away team has won poorly-attended cup matches more frequently than the home team. This is despite the home team winning roughly twice as often as the away team in the corresponding league fixtures (48% to 23%). The implication is very clear: when the fans don’t show up for an FA Cup tie, the team is more likely to lose. I don’t think I’ve seen any direct evidence for this before[3].

In all three sub-samples, it's worth noting that draws are down 5% relative to the corresponding league outcomes (although the beneficiary depends on the attendance). Presumably this is down to the nature of a cup tie: teams are willing to risk pushing for a win in order to avoid having to play a troublesome replay (or a penalty shoot-out during a replay).

So why are some fans not showing up? One obvious explanation is that they are simply unwilling to shell out more money beyond the cost of a season ticket. Maybe clubs should lower their prices for FA Cup matches; I’d be curious to know if any do. There could even be an element of self-fulfilling prophecy: the fans believe that their team have no real chance of winning the cup and so choose not to attend, to the detriment of their team. Perhaps the fans are aware that the cup is simply not a priority – their club may be involved in a relegation battle, for example – and that they are likely to field a weakened team.

The bottom line seems clear enough, though: if clubs want to improve their chances of progressing in the FA Cup they should ensure that they fill their stadium.


--------------------
Thanks to David Shaw, Jim Ebdon and Omar Chaudhuri for comments.

[1] Data was only available for all-Championship ties from 02/03, 08/09 for L1 and 09/10 for L2.
[2] Replays were retained, although the outcome of penalty kicks was ignored (i.e., a draw at the end of extra-time was scored as a draw). There are 64 replays in the sample in total, of which 8 went to penalties.
[3] One caveat is that the sample size is pretty small: this analysis could do with being repeated on a larger sample of games (and with the specific match attendances, rather than season averages). However, the increase in the away percentage in the smallest sample (attendance ratio < 0.7) is still highly significant. 

Tuesday, 10 January 2017

The Frequency of Winning Streaks

Thirteen – an unlucky number for some. So it proved for Chelsea: just one win shy of equalling Arsenal’s record, their thirteen-match winning streak was finally ended by an in-form Spurs side. While there may be some temporary disappointment amongst Chelsea fans at having failed to set a new record, their winning run has almost certainly propelled them into the Champions League next season and made them clear favourites for the title.

Sir Alex Ferguson would often refer to momentum as being instrumental to success. A winning streak can sweep teams to the title or snatch survival from the jaws of relegation. What constitutes a good streak is clearly dependent on the team, though.  Manchester United are currently on a five-match winning run: such form would certainly be outstanding for a relegation-threatened team, but is it common for a Champions League contender? This question is itself part of a broader one: what is form and how should we measure it?

In this blog I’m going to take a look at some of the statistics of winning streaks, investigating the characteristic length of winning runs in the EPL and how it varies for teams from the top to the bottom of the table.

How well do teams streak?


I started by taking every completed EPL season since 2000/01 and dividing the teams into bins based on their points total at the end of each season (0-40 points, 40-50, 50-60, and so on)[1]. For each bin, I measured the proportion of the teams in that bin that completed a winning streak, varying the length of the streaks from 2 to 10 matches.  For example, of the 54 sides that have finished on between 50 and 60 points since the 2000/01 season, 17 (31%) completed a winning run of at least 4 matches.  Runs were only measured within a single season – they do not bridge successive seasons[2]. The results are summarized in Table 1.
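Measuring the longest winning run in a season's sequence of results is straightforward. Here's a minimal sketch; the W/D/L string encoding is my own choice for illustration, not taken from the original data.

```python
def longest_streak(results, targets=frozenset("W")):
    """Length of the longest unbroken run of results in `targets`.

    `results` is one team's season of match outcomes in order,
    e.g. "WWDLW..." (W = win, D = draw, L = loss).
    """
    best = current = 0
    for r in results:
        current = current + 1 if r in targets else 0
        best = max(best, current)
    return best

# Runs are inclusive, as in footnote [2]: a team with a 3-match streak
# has also, by definition, achieved a 2-match streak.
assert longest_streak("WWLWWWDW") == 3
```

Passing `targets=frozenset("WD")` measures unbeaten runs instead, which is how the equivalent table at the end of this post could be built.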


Table 1: The proportion of teams that complete winning runs of two games or longer in the EPL. Teams are divided into bins based on their final points total in a season, from 0-40 points (top row) to >80 points (bottom row).

The top row gives the results for teams that finished on less than 40 points. The columns show the percentage that managed a winning streak, with the length of the streaks increasing from 2 (left column) to >10 matches (right). Three quarters of the teams in this points bin put together a winning streak of at least two games. However, the proportion drops very rapidly for longer runs: only 14% completed a 3-match winning streak and only 7% a 4-match streak. The only team to complete a 5-match winning streak was Newcastle early in 2014/15 (and this was half of the total number of games they won that season).

As you'd expect, the percentage of teams that achieve a winning streak of a given length increases as you move up through the points bins. Every team that has finished with 60 points or more has completed a 3-match winning streak. However, fewer than a quarter of those that finished with less than 70 points completed a 5-match winning streak. In general, the proportion of teams that achieve a winning streak drops off very rapidly as the length of the streak increases.

The exception is the title-challenging teams (the bottom row in Table 1): the percentage in this bin falls away more slowly as the length of the winning streak is increased. 27 of the 29 teams that finished with at least 80 points put together a 5-match winning streak, 13 completed an 8-match streak and 5 completed a 10-match winning streak. This is the success-generating momentum that Ferguson habitually referred to.

In his final 13 seasons (from 2000/01 to 2012/13), Man United put together 14 winning streaks lasting 6 matches or more; in the same period Arsenal managed only 5. United won 7 titles to Arsenal’s 2. For both teams, the majority of these streaks occurred in title-winning seasons. The same applies to Chelsea and, more recently, Man City. Only two title-winning teams have failed to complete a 5-match winning streak: Man United in 2010/11 and Chelsea in 2014/15. The median length of winning streak for the champions is between 7 and 8 games.

Leicester’s 4-match winning streak at the end of the 2014/15 season saved them from relegation. It was also an unusually long run for a team finishing on around 40 points - only four other teams have managed it. Was this a harbinger of things to come? A year later, during their title-winning season, their 5-match winning streak in March/April pushed them over the line.

The implications for form


Only the best teams put together extended winning runs: 40% of EPL teams fail to manage even a three-game winning streak, and 64% fail to win 4 consecutive games. Perhaps momentum - and the belief and confidence it affords - is only really relevant to the top teams? Does the fixture list throw too many obstacles in the path of the smaller teams? Every 3 or 4 games a smaller team will play one of the top-5 sides, a game that they are likely to lose. This may make it more difficult for them to build up a head of steam.

On the other hand, perhaps smaller teams are able to shrug off their defeats away to Arsenal or Liverpool and continue as before. In that case, should we discard games against the ‘big teams’ when attempting to measure their form? And to what extent do draws interrupt, or in some cases boost, a team's momentum? These are all questions that I intend to return to in future blogs.

Unbeaten Runs


Finally, I’ll leave you with the equivalent table for unbeaten runs. While the typical length of an unbeaten run in each bin is about twice that of a winning run, most of the conclusions above still apply.

Table 2: The proportion of teams that complete an unbeaten run of length 2 or longer in the EPL. Teams are divided into bins based on their final points total in a season, from less than 40 points (top row) to more than 80 (bottom).

---------------

Thanks to David Shaw for comments.

[1] The total number of teams across all bins was 320: 16 seasons with 20 teams per season.
[2] Note that the runs are inclusive - if a team achieves a 3-match streak it will also have achieved a 2-match streak.




Tuesday, 20 December 2016

Does January transfer spending improve results?

Last week the Sunderland chief executive, Martin Bain, warned that only "very limited" funds will be made available to David Moyes in the January transfer window (see here, here and here). Bain said that Sunderland are “not going to be able to spend to get out of trouble” and that "we have reached a point where there has to be a time where you don’t have that short-term hit to plug the holes in the dam".

The implication is that Sunderland have put their long-term financial health at risk in previous seasons by spending substantial sums in January in a last-ditch effort to retain their EPL status. While they have indeed survived their recent flirtations with relegation, is there any compelling evidence that winter spending actually improves results in the second half of the season? By out-spending their rivals, are troubled teams boosting their chances of staying up, or are they just using up precious financial resources that could be invested more carefully in their future? In this blog I’ll try to investigate these questions.

January spending and results improvement.


The goal is to establish whether there is any relationship between January transfer spending and an improvement in results in the latter half of the season. For each of the last six seasons, I calculated the gross January expenditure of every EPL team using data taken from transferleague.co.uk[1].  To measure the improvement in results for each team, I calculated the average number of points per game they collected in matches played either before or after January 1st in each season and took the difference (second half of the season minus the first).
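The points-per-game split can be sketched as follows. The `(date, points)` representation is an assumption for illustration (3 points for a win, 1 for a draw, 0 for a loss), and the mini-season below is hypothetical.

```python
from datetime import date

def ppg_change(matches):
    """Change in points per game after 1st January.

    `matches` is one team's season as (match_date, points) pairs.
    """
    def ppg(games):
        return sum(pts for _, pts in games) / len(games)

    # EPL seasons run August-May, so months 7-12 fall before the
    # January window and months 1-6 after it.
    first = [(d, p) for d, p in matches if d.month > 6]
    second = [(d, p) for d, p in matches if d.month <= 6]
    return ppg(second) - ppg(first)

# Hypothetical mini-season: 1.0 ppg before January, 2.0 ppg after.
season = [(date(2015, 8, 15), 3), (date(2015, 9, 12), 0),
          (date(2015, 12, 28), 0), (date(2016, 1, 16), 3),
          (date(2016, 2, 6), 3), (date(2016, 5, 7), 0)]
```

For this toy season, `ppg_change(season)` gives +1.0: an improvement of one point per game in the second half.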

Figure 1 below plots the change in points-per-game versus gross January expenditure for all EPL teams in each of the 2010/11 to 2015/16 seasons (each point represents a team in one of those six seasons). On average, just under two thirds of EPL teams spent more than £1m in (disclosed) transfer fees in any given January window, with just over a third spending more than £5m and a fifth spending more than £10m. On four occasions a club spent more than £30m in a single window: Chelsea in 2010/11 and 2013/14, Liverpool in 2010/11 and Man United in 2013/14. The average change in points/game between the two halves of the season is close to zero[2] and there is no significant correlation with the level of spending.


Figure 1: Change in the average points-per-game measured before and after 1st January against total spending in the January transfer window for all EPL teams in each of the last six seasons. 

Not all teams will be looking for an immediate return on their investment in January. Some will be buying back-up to their first team or young players for the future. The teams that will certainly be looking for an immediate impact are those embroiled in the fight to remain in the EPL. In Figure 2 I’ve highlighted the relegation-threatened teams in each season. Specifically, this includes all teams that were in the bottom 6 positions in the table on January 1st, plus those that went on to be relegated at the end of the season (as you’d expect, most relegated teams were also in the bottom 6 in January)[3]. Teams that were relegated are coloured red; those that survived are blue. 

Figure 2: Change in the average points-per-game measured before and after 1st January against total spending in the January transfer window for all EPL teams (grey crosses) in each of the last six seasons. Teams marked by a square were in the bottom six of the table on 1st January; those in red were relegated, those in blue survived.

There are a couple of interesting things about this plot. First -- the majority of relegation-threatened teams see an improvement in their results in the second half of the season. I think this is just mean reversion: teams that underperform in the first half of the season are likely to do better in the second half. For example, over the last six seasons, teams in the bottom half of the table collected an average of 0.2 points/game more in the second half of the season than the first. The opposite is true of teams in the top half of the table: they tended to be an average of 0.2 points/game worse off in the second half of the season.

Second -- there is no significant correlation between spending and improvement in results for relegation-threatened teams. If we split them into two groups, those that spent more than £5m in January and those that spent less, we find that 38% (6/16) of the high spenders and 55% (12/22) of the low spenders were relegated. Given the sample sizes, this difference is unlikely to be statistically significant. Raising the stakes higher – of the four relegation-threatened teams that spent more than £20m in January, three were relegated: Newcastle & Norwich last year, and QPR in 2012/13.
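The 2x2 comparison above (6 of 16 high spenders relegated vs 12 of 22 low spenders) can be checked with Fisher's exact test. The post doesn't specify a test, so this is just one reasonable choice; a minimal stdlib sketch:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, c1)

    def prob(k):  # hypergeometric probability of k in the top-left cell
        return comb(r1, k) * comb(r2, c1 - k) / denom

    p_obs = prob(a)
    # Sum over all tables with the same margins that are at most as likely
    return sum(prob(k) for k in range(max(0, c1 - r2), min(r1, c1) + 1)
               if prob(k) <= p_obs * (1 + 1e-9))

# Relegated/survived counts for high spenders (>£5m) and low spenders.
p = fisher_exact_two_sided(6, 10, 12, 10)
```

The resulting p-value is far above any conventional significance threshold, which backs up the eyeball judgement in the text.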

It seems reasonable to conclude that teams should resist the temptation to try to spend their way out of trouble: there is little evidence that it will pay off. It looks like Bain is being prudent in tightening the purse strings.

-----

[1] Note that for some teams it will be an underestimate as the transfer fee was never disclosed.
[2] This doesn’t have to be the case. For instance, there could be more draws in the first or second half of the season.
[3] The results don't change significantly if we selected relegation-threatened teams as being those within a fixed number of points from the relegation zone.

Friday, 2 December 2016

Playing in Europe does affect domestic results in the EPL

There’s recently been a bit of discussion in the media (e.g: Sky, Guardian) on whether participation in European competitions has a negative impact on an EPL club’s domestic performance. This is partly motivated by the significant improvements shown by Liverpool and Chelsea this season: after 13 games they are 10 and 17 points better off than at the same stage last season, respectively. Neither are playing in Europe this year. Leicester are demonstrating a similar trait, albeit in the opposite direction: they are now 15 points worse off than last season. For them, the Champions League seems to have been a significant distraction.

Numerous studies have demonstrated that there is no ‘hangover’ effect (see here and here) from playing in Europe. There is no evidence that EPL teams consistently perform worse in league matches that immediately follow a midweek European fixture. But what about the longer-term impact? Perhaps the mental and physical exertion of playing against the best teams in Europe manifests itself gradually over a season, rather than in the immediate aftermath of European games. If this is the case, we should be able to relate variations in an EPL team’s points haul from season-to-season to the difference in the number of European fixtures it played.

It turns out that there is indeed evidence for a longer-term impact. The scatter plot below shows the difference in the number of European games played by EPL teams in successive seasons against the change in their final points total, over the last 10 seasons. Each point represents a single club over successive seasons. For instance, the right-most point shows Fulham FC from the 08/09 to 09/10 season: in 09/10 they played 15 games in the Europa League (having not played in Europe in 08/09) and collected 7 fewer points in the EPL. Teams are only included in the plot if they played in European competitions in one or both of two successive seasons[1]. The green points indicate the results for this season relative to last (up to game week 13); the potential impact of European football (or lack of it) on Chelsea, Liverpool, Southampton and Leicester is evident. Chelsea's league performance from 2014/15 to 2015/16 is a clear outlier: they played the same number of Champions League games but ended last season 37 points worse off.

Effect of participation in European competitions on a team's points total in the EPL over successive seasons. Green diamonds show the latest results for this season compared to the same stage last season. Blue dashed line shows results of a linear regression.

The blue dashed line shows the results of a simple linear regression. Although the relationship is not particularly strong – the r-squared statistic is 0.2 – it’s certainly statistically significant[2]. The slope coefficient of the regression implies that, for each extra game a team plays in Europe, it can expect to lose half a point relative to the previous season. So a team that plays 12 more games will, on average, finish 6 points worse off than the previous season.
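The regression itself is just ordinary least squares on (change in European games, change in points) pairs. A minimal sketch follows; the data below is hypothetical, constructed to lie exactly on a line with the reported slope of -0.5, since the real per-team values aren't reproduced in the post.

```python
def ols_slope(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Hypothetical (delta games, delta points) pairs on the line y = -0.5x:
# e.g. 12 extra European games -> 6 points worse off.
delta_games = [-6, 0, 4, 8, 12]
delta_points = [3.0, 0.0, -2.0, -4.0, -6.0]
a, b = ols_slope(delta_games, delta_points)
```

Because the toy data is exactly linear, the fit recovers an intercept of 0 and a slope of -0.5; on the real scatter the residuals are large, which is why the r-squared is only 0.2.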

It’s worth noting that the CIES Football Observatory performed a similar analysis in a comprehensive report on this topic published earlier this year. They found no relationship between domestic form and European participation over successive seasons. However, their analysis combined results from 15 different leagues across Europe, so perhaps the effect is more pronounced in the EPL than elsewhere. This recent article in the Guardian, citing work by Omar Chaudhuri, suggests that the effects of playing in Europe may be stronger in highly competitive divisions. The lack of a winter break may also be a factor: while teams in Italy, Spain and Germany enjoy several weeks' rest, EPL teams play four league matches over the Christmas period.

Finally, an obvious question is whether we are simply measuring the effects of playing more games across a season. To test this, we should apply the same analysis to progress in domestic cup competitions. However, I’ll leave that to the next blog.


----------------------

[1] The points along x=0 are teams that played the same number of European games in successive seasons (and did play in Europe in both seasons). The only two teams that are omitted are Wigan and Birmingham City, both of whom played in the Europa League while in the Championship. Matches played in preliminary rounds are not counted.
[2] The null hypothesis of no correlation is resoundingly rejected.