Calculating Expected Goal Differential 1.0
The basic premise of expected goal differential is to assess how dangerous a team's shots are, and how dangerous its opponent's shots are. A team that gets a lot of dangerous shots inside the box, but doesn't give up such shots on defense, is likely doing something right, whether through tactics or skill, and is likely to be able to reproduce those results.
The challenge in creating expected goal differential (xGD), then, is to obtain data that measures the difficulty of each shot all season long. Our xGD 1.0 used six zones on the field to separate the dangerous shots from those less so. Soon, we will create xGD 2.0, in which shots are sorted not only by location, but also by body part (head vs. foot) and by run of play (typical vs. free kick or penalty). Obviously, kicked shots are more dangerous than headed shots, and penalty kicks are more dangerous than other shots from zone two, the location just behind the six-yard box.
So now, for the calculations.
Across the entire league, for all 8,291 shots taken in 2013, we calculate the proportion of shots from each zone that were finished (scored):
Location | Goals | Shots | Finish% |
One | 129 | 415 | 31.1% |
Two | 451 | 2547 | 17.7% |
Three | 100 | 1401 | 7.1% |
Four | 85 | 1596 | 5.3% |
Five | 51 | 2190 | 2.3% |
Six | 5 | 142 | 3.5% |
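The finishing rates in the table are simply goals divided by shots for each zone. Here is a minimal Python sketch of that step, using the league totals above; the names `LEAGUE_TOTALS` and `finish_rates` are made up for illustration and are not part of any existing tool.

```python
# League-wide 2013 (goals, shots) by zone, copied from the table above.
LEAGUE_TOTALS = {
    "One": (129, 415),
    "Two": (451, 2547),
    "Three": (100, 1401),
    "Four": (85, 1596),
    "Five": (51, 2190),
    "Six": (5, 142),
}


def finish_rates(totals):
    """Return the share of shots scored (goals / shots) for each zone."""
    return {zone: goals / shots for zone, (goals, shots) in totals.items()}


rates = finish_rates(LEAGUE_TOTALS)
for zone, rate in rates.items():
    # Print each rate as a percentage with one decimal, as in the table.
    print(f"{zone:<6}{rate:6.1%}")
```

The shot counts in the dictionary sum to the 8,291 shots mentioned above, which is a quick sanity check on the inputs.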
We see that shots from zones one and two are the most dangerous, while shots from farther out or from wider angles are less dangerous. To calculate a team's offensive "dangerousness," we count the number of shots the team attempted from each zone, and then multiply each count by the league's finishing rate for that zone. As an example, here are Sporting Kansas City's offensive totals (the Finish% column shows the league rate for each zone; the Total row shows the team's actual overall rate):
Location | Goals | Attempts | Finish% | ExpGoals |
One | 5 | 18 | 31.1% | 5.6 |
Two | 29 | 160 | 17.7% | 28.3 |
Three | 5 | 78 | 7.1% | 5.6 |
Four | 3 | 97 | 5.3% | 5.2 |
Five | 2 | 120 | 2.3% | 2.8 |
Six | 1 | 17 | 3.5% | 0.6 |
Total | 45 | 490 | 9.2% | 48.1 |
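The ExpGoals column can be reproduced by multiplying SKC's attempts in each zone by the league finishing rate for that zone and adding up the results. The sketch below assumes the league totals from the first table; the names `SKC_ATTEMPTS_FOR` and `expected_goals` are hypothetical.

```python
# League-wide 2013 (goals, shots) by zone, from the first table.
LEAGUE_TOTALS = {
    "One": (129, 415),
    "Two": (451, 2547),
    "Three": (100, 1401),
    "Four": (85, 1596),
    "Five": (51, 2190),
    "Six": (5, 142),
}

# SKC's shot attempts by zone, from the offensive table above.
SKC_ATTEMPTS_FOR = {
    "One": 18, "Two": 160, "Three": 78,
    "Four": 97, "Five": 120, "Six": 17,
}


def expected_goals(attempts, league_totals):
    """Sum of attempts in each zone times the league finishing rate there."""
    total = 0.0
    for zone, n_attempts in attempts.items():
        goals, shots = league_totals[zone]
        total += n_attempts * goals / shots
    return total


xg_for = expected_goals(SKC_ATTEMPTS_FOR, LEAGUE_TOTALS)
print(f"SKC expected goals for: {xg_for:.1f}")
```

Because this uses unrounded finishing rates, the individual zone values can differ from the table in the last decimal place, but the total agrees with the table to one decimal.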
Offensively, if SKC had finished at the league-average rate from each zone, it would have scored about 48 goals, compared with the 45 it actually scored. Now let's focus on SKC's defensive shot totals:
Location | Goals | Attempts | Finish% | ExpGoals |
One | 4 | 13 | 31.1% | 4.0 |
Two | 17 | 95 | 17.7% | 16.8 |
Three | 4 | 54 | 7.1% | 3.9 |
Four | 4 | 56 | 5.3% | 3.0 |
Five | 1 | 84 | 2.3% | 2.0 |
Six | 0 | 4 | 3.5% | 0.1 |
Total | 30 | 306 | 9.8% | 29.8 |
Defensively, had SKC's opponents finished at the league-average rate from each zone, SKC would have allowed about 30 goals (incidentally, that's exactly what it did allow, ignoring own goals).
Subtracting expected goals against from expected goals for, we get a team's expected goal differential. For SKC, that is 48.1 minus 29.8, or about +18.3. Expected goal differential works well as a predictor because a team's ability to get good (or bad) shots for itself, and to allow good (or bad) shots to its opponents, is more repeatable than its finishing rate. An extreme game in which a team finishes a high percentage of its shots won't sway that team's xGD, nor that of its opponent, making xGD a better indicator of "true talent" at the team level.
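To make the subtraction concrete, here is a short sketch that computes expected goals for and against from SKC's attempt counts in the two tables and takes the difference. The function and variable names are assumptions chosen for this example.

```python
# League-wide 2013 (goals, shots) by zone, from the first table.
LEAGUE_TOTALS = {
    "One": (129, 415),
    "Two": (451, 2547),
    "Three": (100, 1401),
    "Four": (85, 1596),
    "Five": (51, 2190),
    "Six": (5, 142),
}

# SKC's attempts (offense) and attempts allowed (defense), from the tables.
SKC_ATTEMPTS_FOR = {
    "One": 18, "Two": 160, "Three": 78,
    "Four": 97, "Five": 120, "Six": 17,
}
SKC_ATTEMPTS_AGAINST = {
    "One": 13, "Two": 95, "Three": 54,
    "Four": 56, "Five": 84, "Six": 4,
}


def expected_goals(attempts, league_totals):
    """Attempts in each zone times the league finishing rate, summed."""
    return sum(
        n * league_totals[zone][0] / league_totals[zone][1]
        for zone, n in attempts.items()
    )


def expected_goal_differential(attempts_for, attempts_against, league_totals):
    """xGD = expected goals for minus expected goals against."""
    xg_for = expected_goals(attempts_for, league_totals)
    xg_against = expected_goals(attempts_against, league_totals)
    return xg_for - xg_against


xgd = expected_goal_differential(
    SKC_ATTEMPTS_FOR, SKC_ATTEMPTS_AGAINST, LEAGUE_TOTALS
)
print(f"SKC xGD: {xgd:+.1f}")
```

Note that the team's actual goals never enter the calculation; only shot counts and league-wide rates do, which is why a single hot or cold finishing game cannot move the number.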
As for xGD 2.0, coming soon to a laptop near you, the main difference is that there will be additional shot types to consider. Instead of just six zones, there will be six zones broken down by headed and kicked shots (12 total zones), in addition to free kick (and possibly even penalty kick) opportunities, adding at most four more shot types. As with xGD 1.0, a team's attempts for each type of shot will be multiplied by the league's average finishing rates, and then those totals will be summed to find expected goals for and expected goals against.
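One way the same calculation might extend to more shot types is to key the finishing rates on a combination of zone, body part, and situation rather than on zone alone. The sketch below is only an illustration of that idea: the record layout, the category labels, and the handful of shot records are all made up for the example and are not real league data or the actual xGD 2.0 design.

```python
from collections import defaultdict

# Made-up toy records, NOT real data: (zone, body_part, situation, scored).
TOY_LEAGUE_SHOTS = [
    ("Two", "foot", "open play", True),
    ("Two", "foot", "open play", False),
    ("Two", "head", "open play", True),
    ("Two", "head", "open play", False),
    ("Two", "head", "open play", False),
    ("Two", "head", "open play", False),
    ("Five", "foot", "free kick", False),
    ("Five", "foot", "free kick", False),
]

# Made-up team attempts, described by shot type only (no outcome needed).
TOY_TEAM_SHOTS = [
    ("Two", "foot", "open play"),
    ("Two", "head", "open play"),
    ("Two", "head", "open play"),
]


def finish_rates_by_type(shots):
    """League finishing rate for each (zone, body_part, situation) type."""
    tallies = defaultdict(lambda: [0, 0])  # shot type -> [goals, shots]
    for zone, body_part, situation, scored in shots:
        tally = tallies[(zone, body_part, situation)]
        tally[0] += int(scored)
        tally[1] += 1
    return {key: goals / n for key, (goals, n) in tallies.items()}


def expected_goals(team_shots, rates):
    """Add up the league finishing rate of every shot the team attempted."""
    return sum(rates[shot_type] for shot_type in team_shots)


rates = finish_rates_by_type(TOY_LEAGUE_SHOTS)
xg = expected_goals(TOY_TEAM_SHOTS, rates)
print(f"Toy expected goals: {xg:.2f}")
```

Summing a rate per shot is equivalent to multiplying attempt counts by rates and summing, so the xGD 1.0 calculation is the special case where the key is the zone alone.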