Possession Confusion
Consider every conversation ever had about soccer tactics. I would bet 99.9% of them touched on one specific subject: possession. Whether it’s the men’s league team you play for, or the club team you cheer for, isn’t more possession always a good thing? I can’t answer that question confidently, but I will explore it. The first obstacle to analyzing and discussing possession in MLS is the data itself. We get our data from Opta, and this is how Opta defines possession:
During the game, the passes for each team are totaled up, and then each team's total is divided by the game total to produce a percentage figure which shows the percentage of the game that each team has accrued in possession of the ball.
“Possession” in Opta’s data is thus a measure of the proportion of completed passes in a match for each team, not a proportion of time. A lot of short, quick passes will accrue possession for a team that may only have the ball for a matter of seconds. This isn’t necessarily bad or good. It is what it is, and we’ll work with it.
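To make the definition concrete, here is a minimal sketch of pass-share possession as described above. The function name and the pass counts are made up for illustration; this is not Opta's code, just the arithmetic the definition implies.

```python
def possession_share(passes_a, passes_b):
    """Return each team's share of the match's total passes.

    This follows the pass-count definition quoted above: a team's
    possession is its pass total divided by the combined pass total.
    Time on the ball does not enter into it.
    """
    total = passes_a + passes_b
    if total == 0:
        raise ValueError("no passes recorded, share is undefined")
    return passes_a / total, passes_b / total


# Hypothetical match: one team plays 450 passes, the other 300.
share_a, share_b = possession_share(450, 300)
print(f"Team A: {share_a:.0%}, Team B: {share_b:.0%}")
```

Note that a team could rack up those 450 passes in very little clock time, which is exactly the quirk described above.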
Not all passes are created equal (or better put, not all teams' passes average out to be equally effective), but for a moment let’s suppose that they are. It’s hard to gather data on the value of each pass, and hard to then weight teams’ passes accordingly. So let’s just stick with the assumption that all teams' passes are equally effective. Perhaps someday we can sit around drinking beer and punching holes in that assumption. Today is not that day.
Under that assumption of equal passes, a team that completes a higher proportion of passes than its opponent will likely have strung together effective buildup more often than its opponent. Having created more effective buildup, that team will likely have earned more scoring opportunities than its opponent. Having earned more scoring opportunities than its opponent, that team will be more likely to score goals and nab points. So this sort of possession should really imply sunshine and rainbows for the participating team. Seems like fair logic to me, but of course, I’m the one writing.
Looking at the tables (tables that were created with Opta’s version of possession, remember), we don’t see a strong correlation between possession and results. Four of the top five teams (by points per match) have 50% possession or less, but overall there is still a weakly positive correlation. We start to get significant results when we assess the correlations between teams’ possession and Attempt Ratios (0.60*), and again with Shots on Goal Ratios (0.55*). Those positive correlations imply that more possession coincides with more scoring chances. Of course, there is not necessarily a causal link.
Let’s take a look at this from another perspective. If we look at the relationships game-by-game—rather than team-by-team—the correlation between possession and scoring chances is still positive. The team that possesses the ball for a majority of passes (Opta’s definition) during any given match also tends to earn more scoring attempts than its opponent.
So far I’ve bored you with support for conventional wisdom: possession coincides with more scoring opportunities, and thus probably with better results.
But then I control for a few variables and shit goes haywire.
When I control for each individual team and whether or not they were playing at home, the relationship between possession and results is decidedly negative. In fact, a team that possesses the ball an additional 10% in any given match is expected to lose half of a goal on average, equivalent to about half of a point. For example’s sake, consider the Seattle Sounders. Over Seattle’s top four matches in terms of possession, it has earned just one point. However, during Seattle’s bottom four matches in terms of possession, it has earned eight points. Seattle is an extreme case, but a good example of what my model is picking up. Most teams individually seem to do worse when their possession is higher.
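If it seems odd that a relationship can be positive across teams and negative within them, here is a small sketch of how that happens. Everything in it is invented for illustration: two hypothetical teams, three matches each, with made-up possession and goal-difference numbers. It controls for team only (by subtracting each team's own averages), not for home field, so it is a simplified stand-in for the model described above, not the model itself.

```python
from collections import defaultdict


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def demean_by_group(groups, values):
    """Subtract each group's own mean from its values."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for g, v in zip(groups, values):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for g, v in zip(groups, values)]


# Invented data: (team, possession %, goal difference) per match.
matches = [
    ("A", 40, 1), ("A", 45, 0), ("A", 50, -1),
    ("B", 55, 3), ("B", 60, 2), ("B", 65, 1),
]
teams = [m[0] for m in matches]
poss = [m[1] for m in matches]
goal_diff = [m[2] for m in matches]

# Ignoring teams: higher possession goes with better results.
pooled_slope = ols_slope(poss, goal_diff)

# Controlling for team: compare each match to that team's own average.
within_slope = ols_slope(
    demean_by_group(teams, poss),
    demean_by_group(teams, goal_diff),
)

print(f"pooled slope: {pooled_slope:+.3f} goals per possession point")
print(f"within-team slope: {within_slope:+.3f} goals per possession point")
```

In this toy data, Team B both possesses more and wins by more than Team A, which drives the pooled slope positive. But each team does worse in its own higher-possession matches, so the within-team slope is negative.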
So more possession seems to correlate with more shots, and more shots seem to correlate with more goals, but for some reason more possession does not share a significant relationship with more goals. There is some missing information screwing with me, and I don’t have a definitive explanation for this strange paradox, but I will share a theory.
Each team has a style. Whether or not that style works is probably mostly a product of how well the players fit in, and how good those players are in the first place. Perhaps, in general, a style that focuses more on stringing short passes together tends to produce more shots than a high-risk/high-reward style, but this type of possession is not a necessary condition for success. Once each team develops its style, a certain amount of possession is required to optimize that style. For Montreal, it may be 49% possession, and for Portland, it might be 57%. This would explain the mild positive correlations between possession and shots across teams.
But why is it that, across games, more possession seems to correspond to fewer goals and worse results?
In a given game, if a team generates more possession—more passing by Opta’s definition—then perhaps that is indicative more of the opponent’s defense than of the desire of the team in question to possess. In other words, an excellent defense may not necessarily kill possession, but rather, push possession to less dangerous parts of the pitch. In this way, more possession is simply indicative of a frustrated team, not a team in control doing what it wants to do.
Without being able to conclude this thought exercise satisfyingly, I will propose a few things. First, that by charting each shot’s point of origin, we can begin to assess the quality of a team’s shots. And second, that possession data should be gathered from the distinct areas on the pitch. Possession in the attacking third is likely more valuable than possession in the defensive third. Some combination of these two measurements could very well help to explain the paradox we’re seeing with passing possession and team success.
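As a rough sketch of what possession by area might look like, the snippet below splits each team's passes into thirds of the pitch and computes pass share within each third. The 0-to-100 coordinate scale, the function names, and the sample numbers are all assumptions for illustration. Each team's passes are measured from its own perspective, so "attacking" compares Team A's passes in its attacking third against Team B's passes in its attacking third; that is one possible design choice, not the only one.

```python
ZONES = ("defensive", "middle", "attacking")


def third_of_pitch(x, pitch_length=100.0):
    """Classify a pass origin by thirds, where x = 0 is the passing
    team's own goal line and x = pitch_length is the opponent's."""
    if not 0 <= x <= pitch_length:
        raise ValueError("x is off the pitch")
    if x < pitch_length / 3:
        return "defensive"
    if x < 2 * pitch_length / 3:
        return "middle"
    return "attacking"


def possession_by_third(passes_a, passes_b):
    """Team A's pass share in each third (None if no passes there).

    passes_a and passes_b are lists of pass-origin x coordinates,
    each measured from that team's own goal line.
    """
    shares = {}
    for zone in ZONES:
        count_a = sum(1 for x in passes_a if third_of_pitch(x) == zone)
        count_b = sum(1 for x in passes_b if third_of_pitch(x) == zone)
        total = count_a + count_b
        shares[zone] = count_a / total if total else None
    return shares


# Invented pass origins for two hypothetical teams.
shares = possession_by_third([10, 50, 80, 90], [20, 30, 85])
for zone in ZONES:
    print(zone, shares[zone])
```

A team could sit at well over half of the passes overall while trailing in the attacking third, which is the kind of split the overall number hides.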
*A perfect positive correlation would be 1.0.