Reep Revisited

By Dave Laidig (@davelaidig)

I recently created a decent set of MLS possession data while working on another project, and I was curious whether the patterns of the famous Reep analysis would hold for MLS. So I attempted to replicate his result, and perhaps offer a couple of new perspectives on the data.

I was first introduced to the legacy of Charles Reep while reading The Numbers Game (by Chris Anderson & David Sally). Reep was an early advocate for applying statistics to soccer, and was famous for tracking game events by hand over many seasons. According to his data, most goals were scored from possessions with three passes or fewer. This was taken as empirical justification for playing directly: minimizing touches and using longer passes in order to improve results.

Although Reep’s status as a pioneer in the sport is secure, many still debate the results and interpretation. Some critiques assert the underlying data was misinterpreted: highlighting a simple majority of goals may not be the best analysis when most possessions had three or fewer passes anyway. Others suggest the structure of the analysis confuses correlation with causation, leading to misapplication of the results. In short, one can’t tell whether the results were caused by the number of passes or whether other factors have causal roles. As I attempt to recreate the analysis, it’s worth stating that the same criticisms apply to this replication effort as well.

Possession Defined

Possession is a surprisingly slippery concept, and analysts apply different definitions. I apply the same possession definition I use for calculating player values, with a small update since my last article. To recap, a possession begins with a completed pass, dribble, shot, GK claim, GK pickup, or a successful GK sweeper action. In addition, incomplete passes during free kicks, corner kicks, and throw-ins can start a possession.

And once a possession starts, the chain continues until the team takes a shot, or the opponent has an offensive action (shot, dribble, pass), or the half ends. Note, an opponent can interrupt a possession without creating a possession of their own.

The number of passes, as used here, represents a simple count of the number of pass actions in a possession chain. No effort was made to track the actual number of unique players involved in the possession. And “zero passes” means a player dribbled and lost the ball, or took a shot without passing.
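The definition above can be read as a small state machine. The sketch below is one possible reading, not the code used for this article: the Event fields, the event-type labels, and the choice to count every pass action by the possessing team (complete or not) are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical event labels; real event feeds use their own vocabularies.
OFFENSIVE = {"pass", "dribble", "shot"}
STARTERS = {"pass", "dribble", "shot", "gk_claim", "gk_pickup", "gk_sweeper"}


@dataclass
class Event:
    team: str
    kind: str
    success: bool = True
    set_piece: bool = False  # free kick, corner kick, or throw-in
    half: int = 1


def can_start(ev):
    """Return True if this event may open a possession chain."""
    if ev.kind not in STARTERS:
        return False
    if ev.kind == "pass" and not ev.success:
        return ev.set_piece  # incomplete passes only count on set pieces
    if ev.kind == "gk_sweeper":
        return ev.success
    return True  # assumption: dribbles, shots, GK claims/pickups always qualify


def segment(events):
    """Split an ordered event list into chains with a pass count each."""
    chains, current = [], None
    for ev in events:
        if current is not None:
            half_over = ev.half != current["half"]
            opponent_attacks = ev.team != current["team"] and ev.kind in OFFENSIVE
            if half_over or opponent_attacks:
                chains.append(current)
                current = None
        if current is None:
            if not can_start(ev):
                continue  # an interruption that creates no new possession
            current = {"team": ev.team, "half": ev.half, "passes": 0, "shot": False}
        if ev.team == current["team"]:
            if ev.kind == "pass":
                current["passes"] += 1
            elif ev.kind == "shot":
                current["shot"] = True
                chains.append(current)  # a shot always closes the chain
                current = None
    if current is not None:
        chains.append(current)
    return chains


demo = [
    Event("A", "pass"), Event("A", "pass"), Event("A", "shot"),
    Event("B", "gk_pickup"), Event("B", "pass"),
    Event("A", "pass", success=False),  # interrupts B, starts nothing
    Event("A", "shot"),                 # zero-pass possession
]
for chain in segment(demo):
    print(chain["team"], chain["passes"], chain["shot"])
```

In the demo, team A's incomplete open-play pass ends team B's chain without opening one for A, and the shot that follows becomes a zero-pass possession, which matches the two special cases described above.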

Overall Possession Results

For this review, I examined the 2015 through 2018 MLS seasons and 264,862 possessions. Over that period, 1.6% of possessions resulted in a goal. And as one would expect, the average xG per possession was 0.016.

In addition, I tracked ball advancement as intermediate measures of success. Overall, 70.8% of possessions entered the attacking half; and 49.6% of possessions made it into the final third of the field.

Passes Possessions (n) Goals
0 7,994 597
1 39,838 467
2 53,642 501
3 36,554 459
4 26,531 398
5 20,115 293
6 15,799 234
7 12,609 224
8 10,033 163
9 8,091 148
10 6,487 135
11 5,463 102
12 4,364 94
13 3,331 57
14 2,753 45
15 2,225 44
16 1,724 29
17 1,354 27
18 1,153 37
19 959 20
20 717 13
21+ 3,126 67

In the seasons examined, MLS data seem to mirror the Reep results. Over 48% of the goals resulted from possessions with three passes or fewer. Of course, the bulk of possessions had three passes or fewer, about 52% of them in fact.
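As a quick check of those two shares, the arithmetic can be reproduced from the table. This is only a sketch: the variable names are mine, and the total of 4,154 goals is not stated in the text but is the sum of the Goals column.

```python
# Rows for 0-3 passes, copied from the table: passes -> (possessions, goals).
short_rows = {
    0: (7_994, 597),
    1: (39_838, 467),
    2: (53_642, 501),
    3: (36_554, 459),
}

TOTAL_POSSESSIONS = 264_862  # stated in the text
TOTAL_GOALS = 4_154          # derived: sum of the Goals column in the table

short_possessions = sum(poss for poss, _ in short_rows.values())
short_goals = sum(goals for _, goals in short_rows.values())

share_of_possessions = short_possessions / TOTAL_POSSESSIONS
share_of_goals = short_goals / TOTAL_GOALS

print(f"possessions with 0-3 passes: {share_of_possessions:.1%}")
print(f"goals from 0-3 passes:       {share_of_goals:.1%}")
```

The point of putting the two shares side by side is that short possessions produce a slightly smaller share of goals than their share of possessions, which is what motivates adjusting for possession counts.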

But if we adjust for the number of possessions, do the results hold? Are fewer passes associated with better results? To address these questions, I plotted the possession results, adjusted for the number of possessions, against the length of the possession (i.e., the number of passes).

In this chart, the red line represents results in terms of average xG per possession; blue is the percentage of possessions with goals; and the green line is the MLS average for all possessions. Setting aside the zero-pass possessions, the percentage of goals and the average xG per possession both generally increase with the number of passes. Possessions with four or fewer passes have below-average results, while possessions with six or more passes have above-average results. The sample sizes drop below 1,000 for possessions with 19 passes and higher, and the variability seems to increase as well.
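The goal-rate series can be rebuilt from the possession table with a few lines of arithmetic. The sketch below covers goals only, since per-row xG is not listed in the table; the names are illustrative, and the key 21 stands in for the "21+" bucket.

```python
# (passes, possessions, goals) copied from the overall table; 21 means "21+".
rows = [
    (0, 7994, 597), (1, 39838, 467), (2, 53642, 501), (3, 36554, 459),
    (4, 26531, 398), (5, 20115, 293), (6, 15799, 234), (7, 12609, 224),
    (8, 10033, 163), (9, 8091, 148), (10, 6487, 135), (11, 5463, 102),
    (12, 4364, 94), (13, 3331, 57), (14, 2753, 45), (15, 2225, 44),
    (16, 1724, 29), (17, 1354, 27), (18, 1153, 37), (19, 959, 20),
    (20, 717, 13), (21, 3126, 67),
]

# League-wide goals per possession (about 1.6%, as stated in the text).
league_rate = sum(g for _, _, g in rows) / sum(n for _, n, _ in rows)

# Goals per possession for each pass count.
goal_rate = {passes: goals / poss for passes, poss, goals in rows}

below_league = [p for p, rate in goal_rate.items() if rate < league_rate]

for passes, rate in goal_rate.items():
    flag = "below" if rate < league_rate else "above"
    print(f"{passes:>2} passes: {rate:.2%} ({flag} league {league_rate:.2%})")
```

On goal rate alone, the rows for five and six passes (roughly 1.46% and 1.48%) land just under the league rate, so where exactly the crossover falls depends on which measure is used. The zero-pass row stands out at about 7.5%.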

Zero-pass possessions break the trend, possibly due to how I define a possession. Because I focus on a player controlling the ball, each shot attempt ends a possession by definition. In other words, the player ceases to control the ball while attempting to score. It’s possible the defensive team will end up controlling the ball next (via the keeper catching the shot, or a poor shot travelling out of bounds). But it’s also possible for the same team to start the next possession by recovering a GK deflection, or winning a corner after a block. In these situations, a player may find themselves “starting” a possession in the opponent’s penalty area, and immediately shoot at goal. These situations, along with forwards pressing a keeper or center back into a blunder, likely account for the unusual zero-pass possession results.

In addition to possession results, we can look at ball advancement as an indirect measure of success. The chart below shows the percentage of possessions that enter the attacking half, and the final third (along with the MLS average for all possessions).

Here, the shape of the curves for ball advancement roughly mirrors that of possession results. Possessions with three or fewer passes advance less frequently than the league average. In contrast, possessions with four passes enter the attacking half 73.8% of the time and the final third 50.2% of the time. Both are slightly higher than the MLS averages (70.8% attacking half, 49.6% final third).

Combining all the measures, a few generalizations can be made. Possessions with three or fewer passes are associated with below-average performance. Performance tends to be about average around four or five passes, and slowly increases until the number of passes gets to 20 or so. Also, the zero-pass condition remains an outlier on all measures.

In sum, longer possessions are associated with better results across several measures. Possessions with five or more passes tend to have above average results on the whole.

Effect of Possession Start Location

However, not all possessions start in the same location, which affects likely possession results. For example, almost half of all possessions begin in the defensive third. And these possessions have one-fourth the likelihood of scoring as the small percentage of possessions (10%) beginning in the attacking third.

Zone Number of Possessions Started % of Overall Possessions Avg xG Result
Defensive Third 127,778 48.2% 0.010
D – Left 19,746 7.5% 0.008
D – Center 88,570 33.4% 0.011
D – Right 19,462 7.3% 0.008
Central Third 110,899 41.9% 0.015
C – Left 25,316 9.6% 0.013
C – Center 59,924 22.6% 0.016
C – Right 25,659 9.7% 0.013
Attacking Third 26,467 10.0% 0.044
A – Left 7,943 3.0% 0.025
A – Center 9,736 3.7% 0.075
A – Right 8,788 3.3% 0.026
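One way to sanity-check the table is to confirm that each third's average xG is the possession-weighted mean of its left, center, and right sub-zones. The sketch below does that with the published (rounded) figures; the dictionary layout and function name are my own.

```python
# Sub-zone rows from the table: (possessions started, average xG result).
sub_zones = {
    "Defensive Third": [(19_746, 0.008), (88_570, 0.011), (19_462, 0.008)],
    "Central Third":   [(25_316, 0.013), (59_924, 0.016), (25_659, 0.013)],
    "Attacking Third": [(7_943, 0.025), (9_736, 0.075), (8_788, 0.026)],
}


def weighted_xg(rows):
    """Possession-weighted mean of the sub-zone xG values."""
    total_possessions = sum(n for n, _ in rows)
    return sum(n * xg for n, xg in rows) / total_possessions


third_xg = {zone: weighted_xg(rows) for zone, rows in sub_zones.items()}

for zone, xg in third_xg.items():
    print(f"{zone}: {xg:.3f}")

# Ratio behind the "one-fourth the likelihood" comparison in the text.
ratio = third_xg["Defensive Third"] / third_xg["Attacking Third"]
print(f"defensive / attacking: {ratio:.2f}")
```

The weighted means round to the 0.010, 0.015, and 0.044 shown for the three thirds, and the defensive-to-attacking ratio comes out a little under one-fourth.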

It seems fairly intuitive that winning the ball close to goal means a better chance at scoring, and perhaps the team would need fewer passes to get a good shot opportunity. As a result, considering the starting location of possessions in a Reep-style analysis may provide some insight.

Possession Results and Passes for Selected Field Locations

To investigate, I used four areas as a sampling of potential possession-length associations: (A) the defensive penalty area, (B) off to a side in the defensive half, (C) the defensive center circle, and (D) the middle of the field just into the attacking third.

Zone A – Defensive Penalty Area

Zone A begins in the defensive penalty area. The average possession result from this area is 0.009 xG/poss and 0.009 goals/poss.  Of possessions beginning in this area, 57% enter the attacking half while 35% enter the final third. The possession chain results are in the table below (cells that are above the location’s average are bolded).

Passes Possessions (n) Avg Result (xG) Percent in Attacking Half Percent in Final Third Percent Resulting in Goals
0 1,069 0.001 1.2% 1.0% 0.0%
1 8,626 0.001 17.3% 6.3% 0.1%
2 9,724 0.003 53.0% 17.0% 0.4%
3 5,649 0.008 51.0% 28.8% 1.0%
4 4,298 0.011 59.1% 37.2% 1.2%
5 3,359 0.014 69.1% 45.7% 1.1%
6 2,632 0.014 74.9% 52.5% 1.3%
7 2,161 0.017 80.1% 55.8% 1.6%
8 1,840 0.018 84.1% 62.0% 1.6%
9 1,434 0.019 88.1% 66.5% 2.2%
10 1,168 0.019 90.6% 69.3% 2.1%
11 1,018 0.018 93.0% 72.5% 1.8%
12 846 0.018 94.8% 77.3% 2.2%
13-14 1,078 0.024 94.9% 77.1% 1.9%
15-17 1,084 0.020 96.9% 81.6% 2.0%
18-20 520 0.022 98.8% 85.6% 1.7%
21-29 493 0.025 99.6% 88.4% 2.2%
30+ 127 0.026 100% 94.5% 4.7%

From deep in a team’s own zone, shorter possession chains are associated with below-average results. It is not until a possession has four or more passes that the typical results exceed the average. And here, the zero-pass condition does not seem like an outlier. The chances of scoring steadily increase until they plateau a bit around nine or ten passes. All in all, this table tends to support the idea that continued possession means better results.
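The crossover point for Zone A can be located directly from the table by comparing each row with the zone averages quoted above. This is a rough sketch with made-up field names; it uses the rounded published values, and only the rows for zero through six passes are included because every later row is above average on all four measures.

```python
# Zone A averages from the text (0.009 goals/poss is written here as 0.9%).
ZONE_A_AVG = {"xg": 0.009, "att_half": 57.0, "final_third": 35.0, "goal_pct": 0.9}

# Rows for 0-6 passes from the Zone A table.
ZONE_A_ROWS = {
    0: {"xg": 0.001, "att_half": 1.2, "final_third": 1.0, "goal_pct": 0.0},
    1: {"xg": 0.001, "att_half": 17.3, "final_third": 6.3, "goal_pct": 0.1},
    2: {"xg": 0.003, "att_half": 53.0, "final_third": 17.0, "goal_pct": 0.4},
    3: {"xg": 0.008, "att_half": 51.0, "final_third": 28.8, "goal_pct": 1.0},
    4: {"xg": 0.011, "att_half": 59.1, "final_third": 37.2, "goal_pct": 1.2},
    5: {"xg": 0.014, "att_half": 69.1, "final_third": 45.7, "goal_pct": 1.1},
    6: {"xg": 0.014, "att_half": 74.9, "final_third": 52.5, "goal_pct": 1.3},
}


def first_above_average(rows, averages):
    """For each measure, return the lowest pass count that beats the average."""
    result = {}
    for measure, avg in averages.items():
        result[measure] = next(
            (passes for passes in sorted(rows) if rows[passes][measure] > avg),
            None,  # no row exceeds the average
        )
    return result


crossover = first_above_average(ZONE_A_ROWS, ZONE_A_AVG)
print(crossover)
```

With these figures, xG and both advancement measures first exceed the zone average at four passes, while the goal percentage edges above it one step earlier, at three passes (1.0% against 0.9%).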

Zone B – Left Side in Defensive Half

Zone B begins on the left side of the field, with some parts in the defensive third and some in the central third. Beginning a possession in this area had an average result of 0.011 xG/poss and 0.010 goals/poss.  Overall, 58% of possessions enter the attacking half while 36% enter the final third. The possession chain results are in the table below (cells that are above the location’s average are bolded).

Passes Possessions (n) Avg Result in xG Percent in Attacking Half Percent in Final Third Percent Resulting in Goals
0 248 0.002 17.3% 6.9% 0.0%
1 2,833 0.003 23.0% 8.4% 0.3%
2 3,702 0.005 35.8% 13.7% 0.4%
3 2,725 0.008 49.8% 25.1% 0.7%
4 1,819 0.014 62.1% 37.0% 1.5%
5 1,474 0.013 72.2% 46.6% 1.0%
6 1,161 0.015 76.7% 52.4% 1.5%
7 858 0.019 83.2% 58.7% 1.6%
8 718 0.017 85.9% 64.5% 1.9%
9 609 0.017 89.8% 67.5% 1.5%
10 513 0.017 90.8% 71.2% 1.2%
11 404 0.020 94.6% 72.8% 2.0%
12 320 0.025 95.3% 76.9% 1.6%
13-15 579 0.020 95.9% 81.2% 2.1%
16-20 460 0.020 97.8% 85.0% 1.1%
21+ 253 0.027 99.6% 93.7% 1.2%

Similar to the results from the defensive penalty area, this zone also shows below-average results when possessions have three or fewer passes. And the possessions with zero passes have the lowest rates for xG, advancement, and goals. This table also supports the idea that continued possession is associated with better results.

Zone C – Defensive Center Circle

Zone C begins on the defensive side of the center circle. Beginning a possession in this area had an average result of 0.013 xG/poss and 0.013 goals/poss.  And 83% of possessions enter the attacking half while 49% enter the final third. The possession chain results are in the table below (cells that are above the location’s average are bolded).

Passes Possessions (n) Avg Result in xG Percent in Attacking Half Percent in Final Third Percent Resulting in Goals
0 88 0.009 39.8% 20.5% 1.1%
1 644 0.013 80.6% 28.7% 1.6%
2 1,572 0.010 66.3% 28.2% 1.1%
3 1,406 0.011 79.2% 36.1% 0.9%
4 1,124 0.009 83.6% 45.9% 0.5%
5 849 0.014 84.5% 49.1% 1.5%
6 757 0.012 90.0% 57.5% 0.7%
7 557 0.014 88.9% 55.7% 1.8%
8 481 0.015 93.1% 63.4% 0.8%
9 343 0.023 94.8% 66.8% 2.3%
10 313 0.018 93.6% 68.1% 2.2%
11 242 0.028 96.3% 71.5% 2.5%
12-13 319 0.017 96.9% 77.4% 2.8%
14-16 312 0.013 97.8% 82.1% 1.0%
17+ 310 0.024 99.7% 88.7% 2.9%

Here, the possessions with fewer passes also tend to have below-average results, but the relationship is less consistent. Possessions with four or fewer passes mostly had below-average xG and goals per possession; one-pass possessions are the exception, matching the zone average on xG (0.013) and exceeding it on goals (1.6%). The zero-pass condition is not much below the possessions with a couple of passes in them. But in spite of some variance in the results, the trend remains that longer possessions are associated with better results across performance measures.

Zone D – Attacking Third, Middle of the Field

Zone D begins just inside the attacking third, in the center of the field. Beginning a possession in this area had an average result of 0.032 xG/poss and 0.040 goals/poss. The possession chain results are in the table below (cells that are above the location’s average are bolded).

Passes Possessions (n) Avg Result in xG Percent in Attacking Half Percent in Final Third Percent Resulting in Goals
0 414 0.030 100% 100% 3.6%
1 354 0.046 100% 100% 7.9%
2 421 0.033 100% 100% 3.6%
3 236 0.026 100% 100% 2.1%
4 163 0.020 100% 100% 3.1%
5 133 0.036 100% 100% 3.8%
6 93 0.031 100% 100% 3.2%
7 81 0.037 100% 100% 4.9%
8-9 104 0.023 100% 100% 1.9%
10-14 100 0.027 100% 100% 3.0%
15+ 47 0.035 100% 100% 2.1%

Because the possession already starts in the attacking third, the ball advancement stats are not useful. And it is here, when a possession begins relatively close to goal, that the number of passes does not appear to be correlated with results. The highlighted cells are above average, and there is no obvious pattern with the number of passes.
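To put a rough number on "no obvious pattern," one option is a rank correlation between the row order (fewest to most passes) and the average xG in each row. Spearman's coefficient is my choice here, not something used in the original analysis, and the sketch treats every row as equally weighted regardless of sample size or bucket width. Zone A is included for contrast.

```python
# Average xG by row, in table order (fewest to most passes).
ZONE_A_XG = [0.001, 0.001, 0.003, 0.008, 0.011, 0.014, 0.014, 0.017, 0.018,
             0.019, 0.019, 0.018, 0.018, 0.024, 0.020, 0.022, 0.025, 0.026]
ZONE_D_XG = [0.030, 0.046, 0.033, 0.026, 0.020, 0.036, 0.031, 0.037,
             0.023, 0.027, 0.035]


def average_ranks(values):
    """Rank values from 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman_vs_row_order(values):
    """Spearman correlation between row position and the row's value."""
    positions = list(range(1, len(values) + 1))
    return pearson(positions, average_ranks(values))


rho_a = spearman_vs_row_order(ZONE_A_XG)
rho_d = spearman_vs_row_order(ZONE_D_XG)
print(f"Zone A: {rho_a:.2f}   Zone D: {rho_d:.2f}")
```

On these rows the coefficient is about 0.97 for Zone A and about -0.08 for Zone D, which is consistent with a strong monotonic trend from the defensive penalty area and essentially none from the edge of the attacking third. With only 11 rows in Zone D and small samples in several of them, this is a descriptive check rather than a formal test.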

And the results seem intuitive. Possessions starting in this area (about 30 to 35 yards from goal) can include opportunities to shoot with little advancement, as well as set pieces. Thus, it is reasonable that some quick possessions lead to good results from this area. And of course, longer possessions may allow time to find a gap in a bunkering defense.

Reviewing the different zones, it appears that longer possessions are associated with better results when possessions begin in the defensive half. But when the ball is won close to goal, there is no discernible relationship between possession results and the number of passes.

Circling back to Reep, these results are not sufficient by themselves to dictate a particular tactical plan. Players are likely trying to score in all circumstances, and react to the opportunities as they see them. We cannot say for sure whether a different instruction would have led to a different possession length or result. Perhaps the possessions listed with fewer passes would have continued if not for defensive efforts. Or conversely, the longer possessions may have been shorter if a scoring opportunity presented itself.

In short, correlation is not causation. An association between possessions with more passes and scoring does not mean that additional passes directly cause increased scoring. However, the relationship between possession length and results does exist, especially when one examines areas individually. And teams, with direct knowledge of tactical instructions and no need to infer player intent, can obtain better evidence than that used here. Overall, the MLS data does justify a deeper dive into the contours of the relationship between maintaining possession and the probability of scoring.