State of MLS Analytics: March 2021

This is an update to last summer’s article on the state of analytics at MLS clubs. The last few months have been big ones for MLS analytics signings. Harvard’s Laurie Shaw was hired by City Football Group, former Opta and SportLogiq employee Sam Gregory took his talents to Ft. Lauderdale, Cory Jez transferred from the Utah Jazz to Austin FC, Nikos Overheul moved to Vancouver after working for StatsBomb and Smartodds, and American Soccer Analysis’ own Sam Goldberg and Kevin Minkus were hired by New York Red Bulls and the Chicago Fire, respectively. Given this, teams are polarizing into the haves and have-nots. In this update I’ve dropped Tier 1.5, “Definitely Know What xG Is,” as the teams in that tier moved up.

Last year, Kevin Minkus wrote Soccer Analytics 101 over at MLSsoccer.com, where he defined analytics as “using data and statistics to better understand something.” For the purposes of deciding which MLS teams have an analytics staff member, the “something” is player recruitment and tactical analysis. I’m talking about using numbers and mathematical models (e.g. xG, xA, g+) to help evaluate transfer targets and team and player performance.

2020 Season Preview: Montreal Impact

The Montreal Impact go into 2020 with some excitement - Thierry Henry is the manager! Tempered by a lot of question marks - How does Henry want to play? Who scores the goals? Can the defense hold up? In light of those questions, 2020 will probably be a rebuilding year. Montreal return about 65% of their minutes from 2019, the 8th fewest in the league, but have so far brought in only one or maybe two actual starters. They clearly need a few more pieces to get them close to playoff contention. It’s not a terrible strategy to let Henry work with what he has, and then figure out in the summer and next winter what’s missing. But it could make for a long 2020.

A Tale of Two Central Defensive Midfielders

Michael Bradley and Wil Trapp share several obvious qualities. They are both captains for club and country. They are both smooth passing defensive midfielders, and they both possess excellent heads of hair. Another similarity is that they rarely shoot or score goals, each collecting only one goal over the last three seasons. Coincidentally, both of those goals are what we could enthusiastically describe as "wonder-goals." Bradley's was a long-distance chip for the US national team in a World Cup qualifier against Mexico at the Azteca (a goal not remembered as fondly as it deserves due to the rest of qualifying), and Trapp's won a match for the Crew in stoppage time against Orlando City this past summer. However, one difference between these two players was how each responded to the confidence boost that came after scoring a once-in-a-career goal.

Evaluating Defensive Prospects for the Expansion Draft

FC Cincinnati will face many difficult decisions over the next three months, as they build their expansion team to be ready for the 2019 season. Their next set of choices takes place today, in the Expansion Draft. What the team decides there may not make or break their season, but they do have the opportunity to add important pieces for their inaugural year.

One strategy, the path I’ll discuss here, is for Cincinnati to grab cheap, young players. The hope is that, while they weren’t key contributors for their former teams, those players will continue to develop. A team with enough of these works in progress, and with a sufficient capacity to develop them, might reasonably hope for a few to pan out into full-time starters.

Postseason Preview: Real Salt Lake

After an encouraging 2017 featuring the emergence of a handful of exciting young talents, Real Salt Lake seemed poised to take a step forward in 2018. Technically they did, by making the playoffs on the last day of the regular season, courtesy of an epic collapse from the LA Galaxy. RSL found themselves in that precarious position thanks to a lot of inconsistency. The team’s stretch run, for example, featured a 6-2 dismantling of the Galaxy followed by a home draw to Minnesota, and a 1-1 tie at Kansas City (maybe RSL’s best performance of the year given the context) followed by two blowout losses to Portland. Those painful ups and downs are what happens when you build a team on still-developing stars - it’s just a part of the process. Here it is in graphical form, with their 4-game rolling xGD:

How Long Does It Take a Team to Mesh?

By Kevin Minkus (@kevinminkus)

While beginning a season 0-3-0 does not a happy fan base make, Sunday's win over Philadelphia has some Chicago Fire fans feeling at least a little better about the team's rebuilding process. Throughout the beginning of the season, coach Frank Yallop has frequently stressed that the team needs time to adjust to each other. After all, they brought in three new designated players during the off-season, and are returning players who accounted for only 63% of last year's minutes (the league average over the last four seasons is around 71%). It should take a while for all of those new pieces to mesh from the somewhat disjointed side we've seen into a coherent whole. But, given the Fire's level of roster turnover, how long should we expect the meshing process to take?

The term “meshing” is a slippery one, and can be defined in any number of ways.  Is it when a team's roster turnover no longer informs its results? Is it when a team's results sufficiently indicate its performance for the rest of the season? Is it when a team reaches the level of performance it will remain at throughout the rest of the season (if, in fact, a team can ever be expected to do so)?

Each of these definitions could be argued as valid, and I'm sure there are many other possible definitions not considered here. As it stands, though, these are the three I will analyze, using MLS data since 2011, in hopes of arriving at an answer to the question of how long it takes a team to mesh.

Let's start with the first definition: meshing defined as the number of games in which roster turnover still directly informs a team's results.

This graph shows the correlation between points after x number of games and the percentage of a team's field minutes returned from the previous season. 

A positive correlation suggests that as roster stability increases, so do points earned. Correlations below the red line are not statistically different from zero (at 90% confidence). Note that the correlations in general aren't huge, but they do exist. As you can see, the correlation between roster stability and points peaks at game three, and remains statistically significant until game five (after which it remains insignificant until close to the end of the season).
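For readers who want to reproduce this kind of check, here is a minimal sketch in Python. It computes a Pearson correlation and a rough cutoff for "statistically different from zero" at 90% confidence. The cutoff uses the Fisher z approximation with a two-sided test, which is an assumption on my part and may not be the test behind the red line in the graph. The data are hypothetical toy numbers, not MLS results.

```python
from math import sqrt, tanh
from statistics import NormalDist, mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def critical_r(n, confidence=0.90):
    """Approximate smallest |r| distinguishable from zero for n team-seasons.

    Assumption: Fisher z approximation, two-sided test. The article does
    not say which test produced its significance line.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return tanh(z / sqrt(n - 3))


# Hypothetical toy data: share of minutes returned, points after three games.
minutes_returned = [0.55, 0.63, 0.71, 0.78, 0.82, 0.90]
points_after_3 = [1, 3, 4, 4, 6, 7]

r = pearson_r(minutes_returned, points_after_3)
threshold = critical_r(len(minutes_returned))
print(f"r = {r:.3f}, threshold = {threshold:.3f}, significant = {abs(r) > threshold}")
```

Repeating the calculation with points after one game, two games, and so on would give one correlation per game number, which is the shape of the graph described above. Note that the cutoff shrinks as the number of team-seasons grows.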

A similar pattern exists if we look at defensive stability, though the correlation doesn't become insignificant until after eight games:

These two graphs, then, suggest (though perhaps not convincingly) that it may take as few as three or four games for a team in general to mesh, while it may take as many as eight for a defensive unit to come together.

Now let's take a look at the second definition: meshing defined as the point at which a team's results through some number of games “sufficiently” indicate what its results will look like for the rest of the season.

To do this, I've split teams into two groups: those with “high” roster turnover (in the top 50%), and those with “low” roster turnover (in the bottom 50%). I then regressed each team's final points total on its points total after x games, separately for each of the two groups. The R-squared values for each of these regressions are graphed below, with the linear models from the set of all teams included as well. So essentially what we are looking at is how well we can predict how a team will finish the season, based on what they've done after a given number of games.
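As a rough illustration of that setup, the sketch below splits teams at the median turnover and computes the R-squared of a one-predictor least-squares fit for each half and for all teams together. The function names, the tuple layout, and the numbers are hypothetical; with a single predictor, R-squared is simply the squared correlation between early points and final points.

```python
from statistics import mean, median


def r_squared(xs, ys):
    """R-squared of a one-predictor least-squares fit of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def r_squared_by_turnover(teams):
    """Split teams at the median turnover and fit each group separately.

    Each team is a (turnover, points_after_x, final_points) tuple.
    """
    cutoff = median(t[0] for t in teams)
    groups = {
        "high": [t for t in teams if t[0] > cutoff],
        "low": [t for t in teams if t[0] <= cutoff],
        "all": list(teams),
    }
    return {
        name: r_squared([t[1] for t in rows], [t[2] for t in rows])
        for name, rows in groups.items()
    }


# Hypothetical toy data: (turnover, points after x games, final points).
teams = [
    (0.45, 10, 40), (0.40, 14, 52), (0.38, 6, 30), (0.35, 12, 47),
    (0.25, 9, 44), (0.22, 15, 50), (0.20, 7, 41), (0.15, 11, 39),
]

for name, value in r_squared_by_turnover(teams).items():
    print(f"{name}: R-squared = {value:.3f}")
```

Running this once per value of x (points after one game, after two games, and so on) would produce one R-squared curve per group.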

Through six games, each game is about as predictive for each group, meaning that how well a team with high roster turnover does through six games is just as indicative of how that team will finish as how well a team with low roster turnover does through six games. That is to say, we don't gain any extra predictive power by knowing a team's level of roster turnover.

By game seven, though, high turnover teams begin to outpace low turnover teams: from that point on we have a better idea of how high turnover teams will finish the season than low turnover teams.

By game nine, the R² value for high turnover teams is at 0.546, which is pretty high. We would expect predictions made using this nine game point total to be on average only about seven points off the final season total. That gets us pretty close for being barely a quarter of the way into the season.
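One way to connect an R² value to a typical prediction error is the standard relationship for a least-squares fit: the spread of the leftover errors is roughly the spread of final points multiplied by the square root of (1 - R²). The sketch below is only that textbook relationship. The spread of final points is not given in this article, so it is left as an input, and the "about seven points" figure may have been computed differently (for example as a mean absolute error).

```python
from math import sqrt


def residual_spread(sd_final_points, r2):
    """Approximate standard deviation of prediction errors after the fit.

    sd_final_points is the standard deviation of final points totals in
    the group being modeled (a hypothetical input, not stated in the text).
    """
    return sd_final_points * sqrt(1 - r2)


# Share of the original spread that remains unexplained at R-squared = 0.546.
print(f"{sqrt(1 - 0.546):.3f}")
```

At an R² of 0.546 the remaining spread is about two thirds of the original, so the nine game point total removes roughly a third of the uncertainty about where a high turnover team will finish.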

Though it's a normative statement, not a positive one, and you could really draw the line anywhere, I would probably suggest that nine games is as good a place as any to set the limit on meshing based on our second definition. At the very least, we can say that after nine games we should have a decent idea of whether the rebuilding process will be successful in year one.

Finally, let's turn our attention to the third definition: meshing as the point at which a team reaches its consistent level of performance.

Let's investigate this phenomenon a little bit. 

Here's a graph of the three game rolling expected goal difference (at x = 4, the value on the y axis is the xGD from games two, three, and four, for example) for Sporting Kansas City last season, a decently representative mid-table team. Expected goal difference provides a pretty reasonable statistic for gauging how good a team is.
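The rolling window described above can be sketched as follows. I'm assuming the rolling value is the sum of single-game xGD over the window (an average would just divide by three), and the per-game numbers are hypothetical, not Sporting Kansas City's actual figures.

```python
def rolling_sum(values, window=3):
    """Return (game_number, total) pairs for each full trailing window.

    Game numbers are 1-indexed, so with window=3 the first pair is for
    game 3, and the pair for game 4 covers games two, three, and four.
    """
    return [
        (end, sum(values[end - window:end]))
        for end in range(window, len(values) + 1)
    ]


# Hypothetical single-game xGD values for the first six games of a season.
xgd_by_game = [0.4, -0.7, 1.1, 0.2, -1.3, 0.6]

for game, total in rolling_sum(xgd_by_game):
    print(f"game {game}: three game rolling xGD = {total:+.2f}")
```

The same function applied to points earned per game, with each total divided by the window size, gives a three game rolling points per game series.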

It's pretty much all over the place. 

A three game rolling points per game graph of another mid-table team from last year, the Vancouver Whitecaps, tells a similar story:

These graphs point to something which I think is important (though perhaps obvious): it's mostly unreasonable to expect game by game measures of a team's strength to converge over the course of a season. (Metrics like xGR (expected goal ratio), TSR (total shot ratio), and points per game will converge, but usually only when they're being calculated on aggregate.) There are a lot of reasons for this. Injuries, international call-ups, strength of schedule, and mid-season transfers are all factors which affect a team's consistency of performance. Teams, save maybe the very dominant and the very bad ones, just go through peaks and valleys throughout the year. They have good games and bad games.

What does this mean for meshing, then?

Well, we've already seen that how a team performs at the start of the year can be predictive of where it finishes, particularly for teams with high turnover. The point above, though, suggests that how a team starts the year isn't necessarily indicative of how it will perform throughout the year. 

For teams that haven't quite come together yet, then, there is certainly still hope of righting the ship. Given the above analysis, I would expect the effects of having new players brought into the system to begin to wear off by game four or five (though this may take a bit longer this season because of international call-ups). By game nine or ten, a team should have a decent idea of how well it has done in rebuilding its roster. If things remain bleak at that point, there is still the possibility of finding some success, but it may come only in limited doses.