The Ty Webb Award: MLS’ tallest team and other insights about height

Judge Smails: Ty, what did you shoot today?
Ty Webb: Oh, Judge, I don't keep score.
Judge Smails: Then how do you measure yourself with other golfers?
Ty Webb: By height.

-Caddyshack

By Jared Young (@JaredEYoung)

Ty Webb’s classic line is funny in part because no one measures the result of a sporting event based on how tall the participants are. But height is actually no laughing matter in soccer. An article written in 2011 by Chris Anderson revealed that, at least for national soccer teams, taller teams do perform better. The relationship produced a robust 0.53 r-squared. A recent trend showing that MLS homegrown players are smaller than players taken in the SuperDraft got me wondering more about the importance of tall players. How tall or short is MLS when it comes to height? And who in fact is the tallest current team in MLS, and therefore winner of the Ty Webb Award?

First, here is a look at how MLS stacks up against the top European leagues. This data is from mlssoccer.com rosters and an annual analysis of height in soccer managed by the CIES Football Observatory.

Interestingly, Germany and England clear nearly two centimeters per player over any of the other leagues in the analysis. That is pretty significant. MLS is the tallest league outside of those two countries, so they should not back down from any fights. It gets interesting when you look by position.

MLS stacks up quite consistently with Europe in all positions but midfield. MLS midfielders are 1.6 centimeters shorter on average than in Europe. Only Orlando City, Seattle and DC United have midfields that are taller than the European average. The Houston Dynamo, Sporting KC and the Vancouver Whitecaps all sport midfields four centimeters below that European average.

I dug into the midfield “issue” a bit. Players born in the US are just one centimeter shorter than the European average, but only 34 percent of US-born players are midfielders, compared to the league average of 40 percent. That means that foreign-born players compensate for the lack of US midfielders, and they must be even shorter than average. Sure enough, Central and South America are the regions that supply more than their share of shorter midfielders in MLS. In fact, 48 percent of players born in those regions who play in MLS play in the midfield, and they are much shorter than average at just 175.1 centimeters tall.

The Latin American players in MLS are also significantly shorter than the players overseas. They average 178.1 centimeters in MLS (when you include forwards, goalkeepers and defenders), and yet players from those regions are 181 centimeters over in Europe.

Players originally from the US are actually of slightly above average height compared to Europe, but MLS attracts foreign players that are below average height for soccer players. This is perhaps worth some future investigation.

Okay, onto the big trophy. This season’s Ty Webb Award for tallest team goes to the Colorado Rapids!

Here is a chart that shows all the MLS teams in order by average height.

I’ll leave you with a few observations:

*Colorado leads the league in defender height and also has a tall midfield, so their height is very much in the back. Colorado were tied for 5th last season in set piece goals.

*Sporting KC and Real Salt Lake, the two franchises most noted for winning the most with the least amount of salary, have the shortest teams.

*The expansion teams NYCFC and Orlando City SC are starting out tall.

*There does not appear to be any relationship between preseason favorites and the height of the team.

Despite MLS targeting shorter homegrown players, folks like Ty Webb and soccer analysts will continue to question if that’s the right move. The latest understanding is that taller is better. That doesn’t mean that the Colorado Rapids are preseason favorites. It just means they may have a slight advantage over other teams they will face.

In the next installment, we analyze which MLS team was most likely caught putting with the daughter of the Dean. Just putting, at night.

Do expected goals models lack style?

By Jared Young (@JaredEYoung)

Expected goals models are hip in the land of soccer statistics. If you have developed one, you are no doubt sporting some serious soccer knowledge. But it seems to be consistent across time and geography that the smart kids always lack a bit of style.

If you are reading this post you are probably at least reasonably aware of what an expected goals model is. It tells you how many goals a team should have scored given the shots they took. Analysts can then compare the goals actually scored with the goals a team was expected to score and use that insight to better understand players and teams and their abilities.
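For readers who like to see the arithmetic, here is a minimal sketch of the idea. It assumes an expected goals total is simply the sum of each shot's estimated scoring probability; the probabilities are invented for illustration and do not come from the ASA model or from Michael Caley's.

```python
# Hypothetical per-shot scoring probabilities for one team in one match.
# These values are invented for illustration, not output from any real model.
shot_probabilities = [0.05, 0.12, 0.31, 0.08, 0.76]
goals_scored = 2


def expected_goals(probabilities):
    """Sum per-shot scoring probabilities into an expected goals total."""
    return sum(probabilities)


xg = expected_goals(shot_probabilities)
# A positive difference means the team scored more than its shots suggested.
g_minus_xg = goals_scored - xg

print(f"xG = {xg:.2f}, G - xG = {g_minus_xg:+.2f}")
```

The goals minus expected goals figure (G - xG) is the quantity that gets compared across teams and styles later in this post.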

The best expected goals models incorporate almost everything imaginable about the shot. What body part did the shooter connect with? What were the exact X,Y coordinates of the shooter? What was the position of the goalie? Did the player receive a pass beforehand? Was it a set piece? All of these factors are part of the model. Like I said, they are really cool.

But as with all models of the real world, there is room for improvement. For example, expected goals models aren’t great at factoring in the number of defenders between the shooter and the goal. More defenders could force a higher number of blocked shots or just force the shooter to take a more difficult shot than perhaps they would like to. On the opposite end of that spectrum, perhaps a shooter was wide open on a counterattack; the models would likely not recognize that situation and would undervalue the likelihood of a goal being scored. But I may have found something that will help in these instances.

I recently created a score that attempted to numerically define extreme styles of play. On the one end of the score are extreme counterattacking teams (score of 1) and on the other end are extreme possession-oriented teams (score of 7). The question is, if I overlay this score on top of expected goals models, will I find any opportunities like those mentioned above? It appears there are indeed places where looking at style will help.

I have only scored one full MLS season with the Proactive Score (PScore) so I’ll start with MLS in 2014, where I found two expected goals models with sufficient data. There is the model managed here by the American Soccer Analysis team (us!) and there is the publicly available data compiled by Michael Caley (@MC_of_A). Here is a chart of the full season’s average PScore and the difference between goals scored and expected goals scored for the ASA model and Michael Caley’s model.

Both models are pretty similar. If you were to draw a straight-line regression through this data you would find nothing in particular. But allowing a polynomial curve to find a best fit reveals an interesting pattern in both charts. When the PScores are below 3, indicating strong counterattacking play, the two models consistently underpredict the number of goals scored. This makes sense given what I mentioned above; teams committed to the counterattack should find more space when shooting and should have a better chance of making their shots. Michael Caley’s model does a better job handling it, but there is still room for improvement.
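To show why a straight line can find nothing while a curve finds a pattern, here is a small sketch. The data points are invented and deliberately symmetric so the effect is obvious; they are not the 2014 MLS figures. The `polyfit` helper is my own plain-Python least-squares routine, not the tool used for the charts.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients [c0, c1, ...] for c0 + c1*x + c2*x**2 + ...
    """
    n = degree + 1
    # Build the augmented normal-equation matrix [A | b].
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


# Invented season averages: PScore and goals minus expected goals.
pscores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
g_minus_xg = [5.25, 1.25, -0.75, -0.75, 1.25, 5.25]

line = polyfit(pscores, g_minus_xg, 1)
curve = polyfit(pscores, g_minus_xg, 2)
print("linear slope:", round(line[1], 3))
print("quadratic coefficients:", [round(c, 3) for c in curve])
```

With a U-shaped pattern like this one, the linear slope comes out at essentially zero, while the squared term in the quadratic fit is clearly positive.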

It’s worth pointing out that teams that rely on the counterattack tend to be teams that consider themselves to be less talented (I repeat, tend to be). But you would think that less-talented teams would also be teams that would have shooters that are worse than average. The fact that counterattacking teams outperform the model indicates they might also be overcoming a talent gap to do so.

On the other hand, when the PScore is greater than 4, the models also underpredict the actual performance. This, however, might be for a different reason. Usually possession-oriented teams are facing more defenders when shooting. The bias here may be a result of the fact that teams that can outpossess their opponent to that level may also have the shooting talent to outperform the model.

Notice also where most teams reside, between 3 and 4. This appears to be no man’s land; a place where the uncommitted or incapable teams underperform.

Looking at teams in aggregate, however, comes with its share of bias, most notably the hypothesis I suggested for possession-oriented teams. To remove that bias, I looked at each game played in MLS in 2014, home and away, and plotted those same metrics. I did not have Michael Caley’s data by game, so I only looked at the ASA model.

For both home and away games there does appear to be a consistent bias against counterattacking teams. In games where teams produce strong counterattacking PScores of 1 or 2, we see them also typically outperforming expected goals (G - xG). Given that xG models are somewhat blind to defensive density, it would make perfect sense that counterattacking teams shoot better than expected. By design they should have more open shots than teams that play possession soccer. It definitely appears to me that xG models should somehow factor in teams that are playing counterattacking soccer or they will underestimate goals for those teams.

What’s interesting is that the same bias does not reveal itself as clearly at the other end of the spectrum, like we saw in the first graph. When looking at the high-possession teams, the sixes and sevens, the teams' efficiencies become murkier. If anything, it appears that being more proactive to an extreme is detrimental to efficiency (G - xG), especially for away teams. The best fit line doesn’t quite do the situation justice. When away teams are very possession-oriented with a PScore of 6 or 7, they actually underperform the ASA xG model by an average of 0.3 goals per game. That seems meaningful, and might suggest that gamestates are playing a role in confusing us. With larger sample sizes this phenomenon could be explored further, but for now it's safe to say that when a team plays a counterattacking game, it tends to outperform its expected goals.
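The per-game comparison boils down to grouping games by venue and PScore and averaging G - xG within each group. A minimal sketch, using a handful of invented game records rather than the real 2014 data, with a record layout that is purely my assumption:

```python
# Invented per-game records: (venue, PScore, goals, expected goals).
games = [
    ("away", 1, 2, 1.1),
    ("away", 2, 1, 0.8),
    ("away", 6, 0, 1.2),
    ("away", 7, 1, 0.7),
    ("home", 1, 3, 1.9),
    ("home", 6, 2, 1.6),
]


def mean_g_minus_xg(records, venue, pscores):
    """Average goals minus expected goals for one venue and a set of PScores."""
    diffs = [g - xg for v, p, g, xg in records if v == venue and p in pscores]
    # Return None when no game matches, rather than dividing by zero.
    return sum(diffs) / len(diffs) if diffs else None


away_possession = mean_g_minus_xg(games, "away", {6, 7})
away_counter = mean_g_minus_xg(games, "away", {1, 2})
print(f"away, PScore 6-7: {away_possession:+.2f}")
print(f"away, PScore 1-2: {away_counter:+.2f}")
```

With so few games in the extreme buckets, averages like these swing a lot from one result to the next, which is why larger samples would help.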

Focusing on home teams with high possession over the course of the season, we saw an uptick in goals minus expected goals. But it doesn’t appear to be the case that possession-oriented teams shoot better due to possession itself, based on the trends we saw from game to game. It seems that possession-oriented teams play that way because they have the talent to, and it’s the talent on the team that is driving them to outperform their expected goals.

So should xG models make adjustments for styles of play? It really depends on the goal of the model. If the goal is to be supremely accurate then I would say that the xG models should look at the style of play and make adjustments. However, style is not specific to one shot; it describes an entire game. Will modelers want to overlay macro conditions on their models rather than solely focus on the unique conditions of each shot?

Perhaps the model should allow this bias to continue. After all, it could reveal that counterattacking teams have an advantage in scoring as one would expect.

If the xG models look to isolate shots based on certain characteristics, perhaps they should strive to add data to each particular moment. Perhaps an aggregate overlay on counterattacks would be counterproductive as it would take the foot off the pedal of collecting better data for each shot taken. Perhaps this serves as inspiration to keep digging, keep mining for the data that helps fix this apparent bias. Perhaps it’s the impetus to shed the sweater vest and find an old worn-in pair of boots. Something a little more hip to match the intellect.

Visualizing MLS Salaries Compared to Other U.S. Leagues

By Drew Olsen (@drewjolsen)

With the MLS season rapidly approaching and players talking tougher about their demands for the new collective bargaining agreement (CBA), it helps to give some context for where MLS players stand compared to other leagues. Data is hard to come by for foreign leagues because they don't disclose much about salaries, but the other major American sports leagues are more forthcoming. By looking at the other major US leagues, we can examine how MLS wages match up against their fellow pro athletes.

There are many caveats to this comparison. To start, an increase in the minimum salary doesn’t even seem to be the MLS players’ major priority; that would be free agency. Second, MLS is famously secretive, and does not release the exact terms of any player contracts. We get these numbers from the Major League Soccer Players Union (MLSPU), which publicizes the cost against the salary cap of each MLS player a few times a season. That number does not necessarily match the player's actual salary, as teams often use allocation money and other magic to limit their cap hit (a good example is David Villa, who made much more than the $60,000 he was listed at most recently). Indeed, league officials can be counted on to claim the inaccuracy of these salary releases each time they are published. But until they provide proof, claiming foul while also refusing to release the “real” numbers reminds me of when I ask my baby cousin how he knows unicorns live in his backyard. He usually replies “because they do!”

With those limitations aside, below are presented the five major US sports with a variety of arbitrary bits of salary information, as determined to the best of my ability using public data. We've also got all MLS player salary information in a sortable table here.

League | Average | Median | Ave-Med | Ave/Med | Min Salary | Max Salary | Max/Min | Top salary as % of league total | Top salary > X lowest salaries combined
MLS | $226,454 | $91,827 | $134,627 | 2.47 | $36,500 | $7,167,500 | 196.37 | 5.53 | 160
NFL | $1,900,000 | $770,000 | $1,130,000 | 2.47 | $420,000 | $22,000,000 | 52.38 | 0.68* | ‡
MLB | $3,818,923 | $987,500† | $2,831,423 | 3.87 | $507,500 | $30,714,286 | 60.52 | 0.88 | ‡
NHL | $2,696,069 | $2,000,000 | $696,069 | 1.35 | $550,000 | $14,000,000 | 25.45 | 0.72 | 24
NBA | $4,153,249 | $2,245,886 | $1,907,364 | 1.85 | $507,336 | $20,644,400 | 40.69 | 1.08 | 77
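The derived columns in the table are plain arithmetic on the average, median, minimum and maximum salaries. A quick sketch that recomputes them from the listed figures (the dictionary layout and function name are just my choices for illustration):

```python
# Average, median, minimum and maximum salaries as listed in the table.
leagues = {
    "MLS": (226_454, 91_827, 36_500, 7_167_500),
    "NFL": (1_900_000, 770_000, 420_000, 22_000_000),
    "MLB": (3_818_923, 987_500, 507_500, 30_714_286),
    "NHL": (2_696_069, 2_000_000, 550_000, 14_000_000),
    "NBA": (4_153_249, 2_245_886, 507_336, 20_644_400),
}


def derived_columns(average, median, minimum, maximum):
    """Return (Ave-Med, Ave/Med, Max/Min) for one league."""
    return average - median, round(average / median, 2), round(maximum / minimum, 2)


for name, figures in leagues.items():
    gap, skew, spread = derived_columns(*figures)
    print(f"{name}: Ave-Med={gap:,} Ave/Med={skew} Max/Min={spread}")
```

An Ave/Med ratio well above 1 means a few large salaries are pulling the average up, and Max/Min is the gap between the very top and the very bottom of the pay scale.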

There are a few things to note when looking at these numbers. MLS has a much smaller average wage and a much smaller total wage sum than all the other leagues; the $226,454 average wage is less than the minimum salary of all the other leagues. Conversely, MLS has no official maximum wage, which differs from leagues like the NHL, NBA or NFL, where there is a max possible salary and a harder and harsher salary cap. Furthermore, the other four leagues are the apex of talent and competition in the world for these sports, while the most generous MLS fan would be hard pressed to argue that the league is in the top five soccer leagues in the world. Lastly, MLS is a league that continues to grow, entering 2015 with 20 teams. The NBA, NHL, and MLB all have 30 teams, and the NFL has 32. Perhaps MLS' comparatively small number of teams contributes to some of the difference. We could go on here, but these all essentially boil down to this: MLS doesn’t compare perfectly to the other American sports. But you already knew that.

That said, these are still the five major professional sports leagues in the US, and these are the leagues that we MLS fans are constantly measuring ourselves against. It’s also why this is still an interesting and valuable exercise.

In some areas, MLS falls in right among the other major sports, while in others it is a clear outlier. For instance, when dividing the average salary by the median salary across leagues, MLS is on par with the NFL and more equitable than MLB. This is one way to say that MLS salaries as a whole are not as skewed upwards as those of MLB. MLS’s major differences come in the income disparity at the extremes – those with the biggest salaries are simply making much more than those at the bottom, especially when compared to other leagues.

When an unheralded young player gets his first callup to the Majors, he is usually making around $500k, which is about 1/60th of the $30 million Clayton Kershaw will make in 2015. Similarly, the NBA has the highest average salary of any league in the world (beating out the Indian Premier Cricket League. Who would have guessed?), and Kobe Bryant makes about 40 times the rookie minimum. While MLS has the shortest average career of the major sports at only about 2.5 years, the NFL is second shortest, and presumably comes with more severe wear and tear on the body. In that league, Aaron Rodgers’ salary is 52 times that of the league's lowest paid player. Kaka, the flashiest name on the roster of the newly promoted Orlando City, makes about $7 million per season, which is almost 200 times what players like Dylan Remick and Bradford Jamieson IV made last year. Indeed, Kaka alone made more than the 160 lowest paid players in MLS… combined, and he accounts for more than 5.5% of total league wages; next closest is the NBA where Bryant is at 1%.
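The last two table columns can be computed from a list of every salary in a league. Here is a sketch of one way to do it, run on an invented toy league rather than the real MLSPU release; the function names are mine.

```python
def lowest_salaries_covered(salaries):
    """Count how many of the lowest salaries sum to less than the top salary."""
    ordered = sorted(salaries)
    top = ordered[-1]
    running_total = 0
    count = 0
    # Walk up from the bottom, leaving out the top earner himself.
    for salary in ordered[:-1]:
        if running_total + salary >= top:
            break
        running_total += salary
        count += 1
    return count


def top_share(salaries):
    """Top salary as a percentage of the league's total wages."""
    return 100 * max(salaries) / sum(salaries)


# Invented toy league, not real MLS figures.
toy_league = [36_500, 36_500, 48_500, 60_000, 91_000, 150_000, 250_000]

print(lowest_salaries_covered(toy_league))
print(f"{top_share(toy_league):.2f}%")
```

Run on the full MLS salary list, the first function is the kind of calculation that would produce the 160 figure quoted above, assuming the list is complete.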

*NBA and NHL Salaries were prorated linearly to fit on the graph. X-axis is reverse-ordered rank of individual player's salary.

To put this disparity in perspective, if all 145 players that made less than $50,000 last season were given a modest wage increase to $50k, it would cost MLS about $900,000. That sum is less than 1/7th what players like Defoe, Dempsey, and Bradley made last year, and pales in comparison to the $200 million that is about to be spent to build a new stadium for DC United.
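The cost of a salary floor is just the sum of the gaps between the floor and every salary beneath it. A small sketch with invented salaries (the real calculation would use all 145 players below $50,000):

```python
def cost_to_raise_floor(salaries, floor):
    """Total extra wages needed to lift every salary below `floor` up to it."""
    return sum(floor - s for s in salaries if s < floor)


# Invented sample of salaries, not the real MLSPU list.
sample = [36_500, 36_500, 48_500, 48_825, 60_000, 91_827]
extra = cost_to_raise_floor(sample, 50_000)
print(f"${extra:,}")

# Using the figures quoted above: about $900,000 spread over 145 players.
average_raise = 900_000 / 145
print(f"${average_raise:,.0f} per player")
```

Spread across 145 players, the roughly $900,000 total works out to an average raise of a little over $6,000 each.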

These numbers help us paint a picture of what this small portion of the CBA means. They help explain why it just doesn’t ring true when Don Garber and the MLS ownership group claim poverty, even after they’ve signed a record-breaking TV deal and brought in the likes of Kaka, Dempsey, Bradley, Altidore, Villa, Giovinco, and Lampard for many millions of dollars each. Without a doubt, MLS is in a different place in its history, operates under different rules, and competes in a different market compared to the other four major American pro sports leagues. But it is still the apex of pro soccer talent in the USA and Canada, and so when the MLSPU asks for a raise in the minimum salary, it might be time for the league to listen.

*I wasn't able to find the number of players that are measured to create the published average NFL salary, but each team has 53 players, and there are 32 teams, so I multiplied by the average salary to estimate the sum of all NFL salaries (53*32*1900000=3,222,400,000).
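For anyone who wants to check that estimate, and the 0.68 figure it feeds in the table, the arithmetic is short. The constant names are mine; the numbers are the ones quoted in this post.

```python
ROSTER_SIZE = 53            # players per NFL team
TEAMS = 32                  # NFL teams
AVERAGE_SALARY = 1_900_000  # published NFL average salary
TOP_SALARY = 22_000_000     # NFL maximum salary from the table

# Estimated sum of all NFL salaries.
estimated_total = ROSTER_SIZE * TEAMS * AVERAGE_SALARY
# Top salary as a percentage of that estimated total.
top_share_pct = 100 * TOP_SALARY / estimated_total

print(f"{estimated_total:,}")
print(f"{top_share_pct:.2f}%")
```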

† Because of the majors/minors aspect of MLB, median is a bit hard to judge. This article states the average salary and the number of players (910) that played last year. The median would be the 455th player, which I found here, a list that only goes through 468. This obviously presumes that the 400 or so players not listed made less than those who are listed. In other words, this estimate may not be exact but it's probably pretty close.

‡ I couldn't find a comprehensive list of all NFL and MLB contracts and their sums, and so was unable to estimate how many of the lowest salaries were roughly equivalent to the biggest in those leagues.

ASA Series on Caponomics, Part Three: Midfielders

This is part three of a four-part series examining market inefficiency in Major League Soccer. The portion on forwards will be published next Friday. We recommend you first read the portions on goalkeepers and defenders.

By Tom Worville (@worville)

This week, we continue our evaluation of salary inefficiency in MLS with a look at midfielders. By comparing Whoscored performance ratings to the salaries released by the MLS Players' Union, we can evaluate how well teams are spending their allotted salary cap space. According to Whoscored, midfielders are tied with goalkeepers as the lowest rated players in MLS, with an average 6.75 rating. They are also the second most expensive position, with the median salary for a midfielder being $207,338.24. Just as midfielders play in the middle of the field, their ratings place them in the middle of our metrics.

2014 Whoscored rating and salary by position
Position Average WS Rating Median Salary
GK 6.75 $132,478.56
D 6.94 $152,419.91
M 6.75 $207,338.24
F 6.77 $221,506.11
Note: Median wages used due to the top wages for DPs skewing the average significantly.

From grouping like-players in certain baskets I have found that you can compare players with similar attributes (in this case performance) to their price. Taking this idea and applying it to different player types, it means that I can compare Designated Players and see which teams have allocated their DP slots effectively. The table below shows a list of the current MLS DPs that play in the midfield.

Midfield Designated Players
Player 2014 Team WS Rating Salary
Diego Valeri POR 7.45 500,000
Graham Zusi SKC 7.32 631,388
Osvaldo Alonso SEA 7.31 400,000
Jermaine Jones NE 7.24 3,252,500
Javier Morales RSL 7.24 300,000
Matías Laba VAN 7.17 300,000
Maurice Edu PHI 7.11 113,000
Pedro Morales VAN 7.03 1,410,900
Mauro Diaz DAL 6.96 411,000
Michael Bradley TOR 6.94 6,500,000
Cristian Maidana PHI 6.92 131,666
Tim Cahill NY 6.67 3,625,000
Alexander López HOU 6.3 110,000

Diego Valeri is the best midfield DP in the league by a small but significant margin, as he had a WS rating of 7.45 vs Graham Zusi’s 7.32 (9% increase). Valeri ($500,000) also cost significantly less than Zusi ($631,388), who was 26% more expensive. This is a brilliant example of a DP slot being allocated effectively. Portland are not paying Valeri extortionate wages, nor are they using the slot for a player who is performing at a league average level. That is not to say that Zusi is a poor player - he was the second best DP midfielder in the league last year and only slightly more expensive than Valeri - just that Valeri is better value for money. In fact, you could argue that Osvaldo Alonso is a better use of the DP slot than Valeri. Costing $400,000 (20% less), Alonso had a WS rating just 10% worse than Valeri. His 20% discount on wages shows effective cap management from the Sounders. Alonso is also a different sort of player to Zusi and Valeri - he’s more of a midfield enforcer than a creative, attack-minded midfielder. With their other two DP slots taken up by forwards (Clint Dempsey and Obafemi Martins), the choice by Seattle to use one of their DP slots for a more defensive player highlights their recognition that a balanced side is important. They could quite easily have used this third DP slot for an attacking midfielder, but instead used it on a defensive counter-weight to even the team out with a solid performer.

From the table above you can also see how much Jermaine Jones cost vs Javier Morales - both getting a WS rating of 7.24 last season. Despite both being DPs, Jones costs significantly more than Morales ($3,252,500 vs $300,000). Clearly many DPs are signed for more than just on-field performances, which justifies the league bringing in Jones off the back of his great World Cup for the USMNT. In terms of cap management, Morales’s performances alone indicate he is a great use of a DP slot. Performing at a 35% increase on the average MLS midfielder is definitely a good investment. For Jones, as long as he increases revenue generation for the Revs and helps retain a larger set of fans who keep coming to games and spend money, it’s a good use of a DP slot and of the excess salary paid to him.

Top 10 Overall Midfielders
Player 2014 Team WS Rating Salary
Diego Valeri POR 7.45 500,000
Lee Nguyen NE 7.4 193,750
Brad Davis HOU 7.34 392,162
Graham Zusi SKC 7.32 631,388
Osvaldo Alonso SEA 7.31 400,000
Benny Feilhaber SKC 7.29 337,187
Robbie Rogers LA 7.27 167,500
Jermaine Jones NE 7.24 3,252,500
Javier Morales RSL 7.24 300,000
Marcelo Sarvas LA 7.24 192,500
Michel DAL 7.24 141,500
Darlington Nagbe POR 7.23 260,000
Juninho LA 7.2 325,000

Moving away from DPs, we have Lee Nguyen as the best non-DP midfielder in the league. Costing just $193,750 (6.5% below the midfielder median) and having a WS rating of 7.4 (46% above average), it’s easy to see how he was in the running for MVP last year. I openly backed him in the MVP race last year, and felt he lost unjustly to Robbie Keane, who comes from a team full of quality attack-minded players (Gyasi Zardes, Landon Donovan, Juninho, etc.). I would be unsurprised if New England gave him a new contract and made him a DP in due course.

My question would be whether Nguyen can maintain the level of play he showed last season, or whether it was just a lucky season and he will perform at a lower level in the coming months. He scored 35% of New England's goals in 2014, and led MLS in game-winning goals. He was certainly an invaluable player for the Revs last season, and their continued success is likely to center on his form in 2015. On the other hand, it would also not come as a surprise if he left the league in search of a new challenge in Europe, as no doubt there are clubs interested in him. If New England are to cash in on his excellent 2014, they will need to heavily invest in both a playmaker and a goalscorer - a rare breed of player that they currently have with Nguyen.

In the top 10 midfielders table, the most surprising inclusion is that of Michel of FC Dallas. His wage is also the lowest of the top 10 players in the league ($141,500) despite being ranked the eighth best - joint with JJ, Javier Morales and Marcelo Sarvas. One of the main reasons for his ranking is that he is one of the key set piece takers for FC Dallas - scoring seven penalties in the 2014 season. Nevertheless, this is an important trait to have within a squad, and considering he costs 32% less than the average MLS midfielder, it’s a worthwhile investment for a player that can be depended on in set-piece situations.

Robbie Rogers is another solid inclusion. Apart from being a great role model for all young gay sportsmen and women, he’s a good player to match. His ability to play either in defense or midfield is a useful addition to any squad, adding much-needed depth and quality in those positions. The seventh best midfielder in the league last year performed 37% above average, with a WS rating of 7.27, and cost just $167,500, which is a 20% decrease on the average midfielder salary. He’s unlikely to be able to command a DP salary just yet, but this is probably a good thing for LA as they are getting a very versatile player who is a top 10 midfielder for a low wage. I’m hoping Rogers has another good season next year and manages to get into the USMNT squad for the upcoming Gold Cup.

Similar to Rogers, there are a few players who fit into the Midfielder/Defender basket nicely. These players are already great to have within a squad as they are capable of playing a couple of positions well, so even if they are performing at only a league average level (or maybe even slightly below average) their versatility makes up for it. In this basket there are the following players:

Defender/Midfielder Utility players
Player 2014 Team WS Rating Salary
Robbie Rogers LA 7.27 167,500
Michel DAL 7.24 141,500
Jorge Villafaña POR 7.19 74,431
Lovel Palmer CHI 7.06 87,000
Chris Tierney NE 6.98 103,333
Rodney Wallace POR 6.89 175,000
Jordan Stewart SJ 6.73 140,000

Jorge Villafaña was one of the most undervalued players in the league. He was the 11th best midfielder in the league with a Whoscored rating of 7.19 - 31% better than the average MLS midfielder. He was also the 20th cheapest in the league and cost 64% less than the median of $207,338.24. The Portland Timbers were right in swiftly placing Villafaña on their protected list during the Expansion Draft late last year. While he mostly played in the defense for them, his versatility to move forward into the midfield only increases his value. Also notable from this table, with a WS rating 22% above average and costing just $87,000 (58% below average), Lovel Palmer was a steal for the Chicago Fire last season. His WS rating puts him as the 17th best midfielder in the league too, which is great for such little money. It’s signings like this that allow the Chicago Fire to freely spend in other areas - which can be seen by their signing of two DP strikers this offseason in David Accam and Kennedy Igboananike. As said previously, all of these players are excellent ways of creating value within the salary cap. Even Jordan Stewart, the worst performer of this set, is only 1% worse than the league average WS rating and yet costs 32% less than average.

Now that we've looked at DPs and hybrid Defenders/Midfielders, let's take a look at the full list of players who played at least 10 games last season. The full list of qualifying midfielders can be found here.

A few things of note jump out. First is how Luis Gil and Sebastian Velasquez both performed at the same WS rating level last season (6.5, 18% below the midfielder average) for Real Salt Lake. It is true that Gil played a lot more minutes than Velasquez, but the major difference between the players is that Gil cost $315,083 last season, whereas Velasquez cost just $48,825. Versus the league average that’s 52% more for Gil and 76% less for Velasquez. For Velasquez, his performances can probably be excused considering the amount he is being paid is well below the league average. For Gil, however, his contract is higher than that of some DPs (teammate Javier Morales is one example) and his appearances are extremely poor in comparison. Velasquez’s move to NYCFC is a great one for him as a player, as he is likely to get more minutes at the new franchise. Gil, on the other hand, poses a problem: if he does not progress this season and start playing well, his salary is being wasted on what could become another DP for Real Salt Lake.

Finally I’m going to focus on three more defensive midfielders: Matías Laba, Tony Tchani and Diego Chara. All three of these players played over 2500 minutes for their clubs last season, representing a key component of the midfield for their respective teams. They had Whoscored ratings of 7.16 (Tchani), 7.17 (Laba) and 7.18 (Chara), all roughly 30% better than the average MLS midfielder. Tchani and Chara also have very similar wages ($175,000 vs $170,000), which makes Chara the better midfielder out of the three in terms of cost and performance. Once again Chara highlights the impressive front office management by the Portland Timbers to allocate the salary cap effectively. He’s another example of a player in a position where you cannot expect to do better than what you already have in place without taking a gamble, and likely spending more money. Similarly, Matías Laba cost $300,000 and was a DP for Vancouver last season. For me this indicates good cap management still, as Vancouver haven’t broken the bank to fill their DP slot and have also filled it with an effective and useful starter. Had Toronto not gone on a big spending spree last offseason, I doubt Laba would have been forced to move away, although he may not have gotten the minutes he did at Vancouver.

Clearly, there are a variety of methods that have been used to identify and play midfielders in MLS, and some teams seem to be better at finding value for their dollar than others. Check in next Friday for our final installment in Caponomics, where I discuss forwards.

ASA SERIES ON CAPONOMICS, PART TWO: DEFENDERS

This is part two of a four-part series examining market inefficiency in Major League Soccer. The portions on midfielders and forwards will be published on the next two Fridays. Part One: Introduction And Goalkeeping Application, can be found here.

By Tom Worville (@worville)

Following from the previous Caponomics post, which gave an introduction to what this is and how it can be applied, this post is the second in the series where I look at defenders. Defenders are the second-most affordable group of players in the league, costing teams about $150,000 per season, and they have the highest average Whoscored rating out of the four player positions.

2014 Whoscored rating and salary by position
Position Average WS Rating Median Salary
GK 6.75 $132,478.56
D 6.94 $152,419.91
M 6.75 $207,338.24
F 6.77 $221,506.11
Note: Median wages are used because the top wages for DPs skew the average significantly.

Let's begin by looking at the top 10 and bottom 10 defenders in MLS in 2014, according to Whoscored.

Top 10 defenders in 2014 according to Whoscored
Player 2014 Team WS Rating Salary
Kendall Waston VAN 7.8 201,242
Omar Gonzalez LA 7.59 1,250,000
Chad Marshall SEA 7.54 286,666
Norberto Paparatto POR 7.44 100,000
Aurelien Collin KC 7.4 281,250
Matt Hedges DAL 7.35 120,000
Drew Moor COL 7.34 247,000
Alvas Powell POR 7.31 48,828
DaMarcus Beasley HOU 7.3 779,166
Clarence Goodson SJ 7.27 342,000
Bottom 10 defenders in 2014 according to Whoscored
Player 2014 Team WS Rating Salary
Kofi Sarkodie HOU 6.61 195,500
Corey Ashe HOU 6.57 174,705
Heath Pearce MTL 6.54 100,000
Krzysztof Król MTL 6.52 153,000
Dylan Remick SEA 6.49 36,500
Stephen Keel DAL 6.48 48,825
Richard Eckersley NY 6.46 373,333
Bradley Orr TOR 6.46 75,000
Thomas Piermayr COL 6.46 74,429
Maxim Tissot MTL 6.35 48,500

 Click here for the list of all qualifying defenders.

Last season, Kendall Waston of the Vancouver Whitecaps was the best defender in the league, at least by WS Rating. He averaged a WS Rating of 7.8 and only cost $50,000 more than the median defender ($201,242). Waston gave performances 14 percent better than the highest-paid MLS defender last year, Omar Gonzalez (7.59, 2nd best in the league), but at an 84-percent discount on his salary. That these two played for two of the stingiest defenses in the league is no surprise: LA finished first and Vancouver third overall in expected goals allowed.

Another extremely valuable player in the league is Alvas Powell. Though the Portland defense was poor on the whole in 2014, it vastly improved in the second half of the season, which coincided with Powell taking over the starting position on the right side of the defense. He was the eighth-best defender in the league last year with a WS Rating of 7.31 and the fifth-cheapest defender in the league, costing only $48,828. Versus the median league defender, Powell cost 68 percent less, while performing 25 percent better than average.

The Timbers also pulled off another solid defensive signing in the form of Norberto Paparatto. The 31-year-old Argentine had a salary of only $100,000, 34.4 percent below the median wage. Much like his teammate Powell, Paparatto performed exceptionally well last season, despite starting slow. After losing his starting role early in the season while the team struggled, he eventually earned it back and turned his performances around. Paparatto's overall Whoscored rating of 7.44 was the fourth-best in the league, performing 34.5 percent above average. That said, I very much doubt Nat Borchers, their new defensive signing from Real Salt Lake, will be able to match a similarly efficient season. With a wage last year of $236,968 (55 percent above median) and a Whoscored rating of 6.92 (a marginal 1.4 percent below average), Borchers was one of the least efficient defenders in the league. Defense was Portland's greatest weakness in 2014 (and 2012 and 2011), but a full season of Powell and Paparatto together may help them turn it around. How Borchers fits in remains to be seen.

A similarly valuable player was Karl Ouimette, who was the fourth-cheapest defender in the league last year and was the 15th-best defender, despite playing for the porous Montreal defense. Once again comparing Ouimette to the average/median defender, he cost 68 percent less and produced performances that were 15 percent better. He is a very cheap and useful option for any side, and a talent that may be overlooked because he plays for a team that allowed nearly two goals against per game.

In the previous Caponomics release I mentioned a couple of Orlando City’s signings: Donovan Ricketts and Tally Hall, who both were among the worst and most expensive 'keepers in the league last year. In defense the new MLS franchise has done better, signing Aurelien Collin and Amobi Okugo. Okugo cost $101,994 last season at the Philadelphia Union - 33% below average - and produced performances 7.6% below average (6.83 Whoscored rating). At only 23 years old, he’s young and still learning his craft. Being paired with the veteran Collin seems a wise choice by the management. Collin was the 5th best defender in the league (7.4 Whoscored rating), costing $281,250 - 84.5% above average. Despite reservations about the goalkeeping situation in Orlando, they've done well with their defensive additions.

Considering there are two new expansion teams joining the league this year, I feel that it is only fair that I look at the defensive additions at New York City FC. George John and Kwame Watson-Siriboe didn't play enough minutes last season to be considered in this analysis, although Jason Hernandez, Chris Wingert and Josh Williams all did. Hernandez was the most expensive player of the three, costing $213,333 (40% above average). Wingert cost $170,590 (12% above average) and Williams $125,000 (18% below average). Their Whoscored ratings followed in the same order - 6.86 for Hernandez, 6.8 for Wingert and 6.65 for Williams - with all three below the average defender performance in MLS. It seems Orlando have outdone their fellow league-newcomers defensively, both in cost and talent. Okugo (6.83) is cheaper than all three and, performance-wise, better than or nearly level with each of them.

By grouping players who have the same Whoscored rating together, it is easy to identify those that have performed at the same level as others but at a lower salary. For example, there is a group that contains the players who have a Whoscored rating of 7.15 (14% greater than the average performance): Jamison Olave, Fabinho, Johnny Leveron and Nick Hagglund. The most expensive of these is Olave, costing $290,000. The cheapest was Hagglund, costing just $48,500. The reasons for this are obvious - age and experience. Hagglund was only drafted at the start of the 2014 season, while Olave is 33 and has six years' experience in MLS. The real value in these four comes from Fabinho and Johnny Leveron, who cost $100,500 and $91,187 respectively. The recent move by RSL to take Olave back to Salt Lake from New York looks like an expensive one, considering they could have had both Leveron and Fabinho for the same money. Leveron and Fabinho are both also a lot younger than Olave (24 and 29 respectively), meaning they probably have more fruitful careers ahead of them - and more performances.
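The grouping idea above can be sketched in a few lines of Python. This is only an illustration using the four players and salaries quoted in the paragraph; the function name `group_by_rating` is made up for this sketch and is not part of the original analysis.

```python
from collections import defaultdict

# (player, WS rating, 2014 salary in USD) as quoted in the paragraph above
defenders = [
    ("Jamison Olave", 7.15, 290_000),
    ("Fabinho", 7.15, 100_500),
    ("Johnny Leveron", 7.15, 91_187),
    ("Nick Hagglund", 7.15, 48_500),
]

def group_by_rating(players):
    """Group players by identical rating, cheapest first within each group."""
    groups = defaultdict(list)
    for name, rating, salary in players:
        groups[rating].append((salary, name))
    return {rating: sorted(members) for rating, members in groups.items()}

groups = group_by_rating(defenders)
cheapest_salary, cheapest_name = groups[7.15][0]
priciest_salary, priciest_name = groups[7.15][-1]

# Combined cost of Fabinho and Leveron versus Olave alone
fabinho_plus_leveron = 100_500 + 91_187
```

With these figures, Fabinho and Leveron together cost $191,687, which is below Olave's $290,000 on its own.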

Richard Eckersley, on the other hand, provides very poor value for money, costing 145% more than the average ($373,333) with performances 70% less than the average defender (6.46 WS rating). If I were the GM for the Red Bulls I would cut Eckersley straight away. He represents a large portion of the Red Bulls salary cap, and for a player who performs very poorly for his side, that money could be used more sensibly in other positions.

Another poor performer last year was a newcomer to the league. Bradley Orr moved to Toronto FC on loan from Blackburn Rovers and cost the club $75,000, a relatively low wage which was 51% below the average for a defender. Sadly for Toronto, Orr only produced a Whoscored rating of 6.46 - the same as Richard Eckersley. The reason that I have pointed out this move is that it was made by Toronto FC - a club whose GM, Tim Bezbatchenko, has previously been referred to as a ‘Capologist’.

For me the idea behind Caponomics (or being a ‘Capologist’) is to make the rules of the salary cap work for you and your team: get maximum value out of the cap while, where possible, incurring minimum loss or risk. This move for Orr could be seen as good cap management compared to Eckersley - paying 20% of the cost and getting the same level of performance. On the other hand, there are plenty of players who cost less than Orr but produced much better performances. I’ll mention Bezbatchenko in the next article about midfielders, but I feel this was a poor move from the Toronto FC GM, and it is one of several.

Check back in next Friday (1/23) when I cover Midfielders.

ASA Series on Caponomics, Part one: Introduction and Goalkeeping Application

This is part one of a four-part series examining market inefficiency in Major League Soccer. The portions on defenders, midfielders, and forwards will be published on Fridays for the next three weeks.

By Tom Worville (@worville)

In this series of articles I’m going to explore further my OPTA Pro conference submission, which can be read here. The data I’m going to be using is not going to be as in depth as suggested in that submission. Instead I’m going to use Whoscored ratings (thanks to @JaredEYoung for pointing me to this article) and MLS salary data taken from the MLS Players Union here. For an explanation of how Whoscored ratings work, click here.

The aim of my presentation is to display how the market for players is inefficient. In MLS, looking at players who have more than 10 appearances (starts and substitutions) and including players from all teams, the correlation between a player's Whoscored Rating and their salary is 0.25.

This indicates a low correlation between the salary a player is paid and how well he performs; i.e. paying a player a huge salary doesn’t mean you’re going to get a high performance level, and vice versa. The market inefficiency shown through this means there are gains to be made when looking for player performance (which is what you are trying to get the most of in a team).

Obviously soccer is a team game, and this article is looking at player performance from an individual point of view. It is important to note that at this time, it’s difficult to quantify team chemistry with analytics, which is an important part of football. Purely looking at player performance also discounts the off-field and intangible aspects of signing a player: leadership, revenue generation through merchandising and ticket sales, coaching commitments, club loyalty, manager-player relationships, work rate, etc.

Evidently, data on these factors would paint a picture of which players are signed for what reasons and allow us to allocate a specific amount of money toward each specific reason. For example, you could break down the salary of Thierry Henry into his marketability, performances on the pitch, influence on other players in training and so on. Sadly, the data is not available, and I doubt that, as an outside analyst (by this I mean someone not working within a club), I will ever have access to this kind of information.

But even without that data, analysis can focus primarily on the performance of a player vs his salary. I've broken down the players into four distinct baskets focusing on the position that they play in: Goalkeeper (GK), Defender (D), Midfielder (M), and Forward (F). The average wages and ratings for those are given below:

Position Average WS Rating Median Salary
GK 6.75 $132,478.56
D 6.94 $152,419.91
M 6.75 $207,338.24
F 6.77 $221,506.11

Note: Median wages are used because the top wages for DPs skew the average significantly.

Forwards are typically the highest paid players in the league, with goalkeepers being the lowest paid. This makes sense in the current footballing climate - the more flashy, attacking players are valued higher (and therefore paid more) than their more defensive team mates. This does mean that a great goalkeeper is likely to cost less than a decent striker but is as, if not more, integral to the team.

The Numbers Game famously highlights inefficiencies in the transfer market. One of the chapters within the book concludes that a clean sheet is more valuable for a team than scoring a goal. At the end of the day this is football 101 - you start with a point at 0-0 and you can either try and preserve that point or attack and try and score more than your opponent to get all 3 points.

In this series of four articles I’m going to show my findings for each of the four positions, starting with the goalkeeper. The sortable table below shows the salary of all 22 goalkeepers that had more than 10 appearances in 2014, sorted by their Whoscored Rating (WS Rating). The correlation between WS Rating and Salary for goalkeepers specifically was actually negative in 2014, indicating a potentially greater inefficiency in the goalkeeping market. 

Name 2014 Team WS Rating Salary
Bill Hamid DCU 7.12 $114,750
Jeff Attinella RSL 7.11 $48,825
Andy Gruenebaum SKC 6.94 $85,000
Steve Clark CLB 6.91 $138,333
Jon Busch SJ 6.9 $184,583
Jaime Penedo LA 6.87 $138,562
Nick Rimando RSL 6.86 $235,833
Bobby Shuttleworth NE 6.79 $100,000
Chris Seitz FCD 6.78 $105,000
Donovan Ricketts POR 6.77 $260,000
David Ousted VAN 6.76 $266,156
Raul Fernandez FCD 6.74 $247,500
Luis Robles NYRB 6.73 $125,000
Troy Perkins MTL 6.69 $271,833
Sean Johnson CHI 6.69 $253,000
Stefan Frei SEA 6.66 $150,000
Clint Irwin COL 6.62 $87,000
Joe Bendik TFC 6.61 $147,375
Zac MacMath COL 6.59 $51,325
Eric Kronberg SKC 6.58 $120,000
Tyler Deric HOU 6.58 $97,667
Evan Bush MTL 6.49 $48,825
Tally Hall HOU 6.37 $213,500
Dan Kennedy CHV 6.33 $213,417
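For readers who want to check the sign of the goalkeeper correlation themselves, here is a minimal sketch that computes a Pearson correlation over the rows of the table exactly as listed. The `pearson` helper is written out by hand for illustration, and I am assuming a plain Pearson coefficient; the article does not say which correlation measure was used.

```python
import math

# (WS rating, salary in USD) for each row of the goalkeeper table, top to bottom
goalkeepers = [
    (7.12, 114_750), (7.11, 48_825), (6.94, 85_000), (6.91, 138_333),
    (6.90, 184_583), (6.87, 138_562), (6.86, 235_833), (6.79, 100_000),
    (6.78, 105_000), (6.77, 260_000), (6.76, 266_156), (6.74, 247_500),
    (6.73, 125_000), (6.69, 271_833), (6.69, 253_000), (6.66, 150_000),
    (6.62, 87_000), (6.61, 147_375), (6.59, 51_325), (6.58, 120_000),
    (6.58, 97_667), (6.49, 48_825), (6.37, 213_500), (6.33, 213_417),
]

def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)

r = pearson(goalkeepers)
print(f"rating vs salary correlation: {r:.2f}")
```

On these rows the coefficient comes out negative, which matches the direction described in the text.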


As stated previously, goalkeepers are the least valued players in the league, with a median salary of $132,478.56. The highest-paid goalkeeper last year was Troy Perkins of Montreal Impact, who was paid $271,833, about $50k more than the median forward salary of $221,506.11. Goalkeepers also have the lowest average rating of all positions in the league, tied with midfielders at an average Whoscored rating of 6.75. By comparing the salary of a player to his Whoscored rating, you can get some sense of value.

Jeff Attinella of Real Salt Lake was the cheapest goalkeeper in the league last year (tied with Evan Bush) costing just $48,825 but was the 2nd best keeper in the league with a Whoscored rating of 7.11. The best goalkeeper in the league was Bill Hamid of DC United. He had a Whoscored rating of 7.12 and cost $114,750, the 9th cheapest goalkeeper in the league. The 3rd best goalkeeper in the league was also the 4th cheapest - Andy Gruenebaum who cost $85,000.

The two worst goalkeepers in the league last term were Dan Kennedy, now a member of FC Dallas, and Tally Hall of Houston Dynamo, who had Whoscored ratings of 6.33 and 6.37 respectively. Similarly, they were also among the higher paid - coming in at 8th and 7th in salaries. This doesn't look good for FC Dallas, who may be hoping that Kennedy is an improvement on Chris Seitz, or the new Orlando City franchise, who recently signed Hall from Houston for allocation money. Their other new goalkeeping addition, Donovan Ricketts, fared slightly better. He was the 10th best goalkeeper in the league last year with a Whoscored rating of 6.77, but had the 3rd highest salary of $260,000. When Hall is fit to return from knee surgery there looks to be competition for the top GK role at quite an expense.

Taking averages for both salary and performance into account, from last season's figures you would expect a goalkeeper who produces a performance rating of 6.75 to cost about $132,500. Perkins cost 105% more than the average goalkeeper but produced performances 8% worse, with a Whoscored rating of 6.69. In comparison, Attinella comes at a discount of 63% compared to the average goalkeeper, but his performance of 7.11 is 48% above average. This big decrease in cost and increase in performance shows how much of an undervalued keeper Attinella is. Perkins represents poor budget management; he takes up a sizable chunk of the salary cap and produces poor performances.
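The salary side of these comparisons is simple arithmetic against the median goalkeeper salary, sketched below. The function name `pct_vs_median` is made up for illustration. The sketch deliberately covers salary only, because the article does not spell out how the performance percentages are scaled.

```python
MEDIAN_GK_SALARY = 132_478.56  # median goalkeeper salary from the position table

def pct_vs_median(salary, median=MEDIAN_GK_SALARY):
    """Percentage by which a salary sits above (+) or below (-) the median."""
    return (salary - median) / median * 100

perkins = pct_vs_median(271_833)    # Troy Perkins
attinella = pct_vs_median(48_825)   # Jeff Attinella

print(f"Perkins: {perkins:+.0f}%  Attinella: {attinella:+.0f}%")
```

Rounded to whole percentages, this reproduces the 105% premium for Perkins and the 63% discount for Attinella quoted above.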

Tally Hall, similar to Perkins, cost 61% more than average ($213,500) but his performances come at a 51% reduction on the average goalkeeper (6.37). You’d expect that, for the money spent, Hall would produce at least an average performance, not one less than half that of the average ‘keeper. Ricketts had a performance rating 2.7% above average (6.77) but cost the Portland Timbers 96% more than the average GK ($260,000). As we have seen from the beginning of this article, there is not a strong correlation between salary and performance, but for an investment of 96% over the average GK salary you’d hope for a higher return than a performance 2.7% above average. It’s good to have strength in depth of course - but having two relatively under-performing (and well-paid) goalkeepers is not exactly strength.

Bill Hamid cost DC United 13.4% less than the GK average and produced a Whoscored rating 50% above average: not as good as Attinella, but still great value for money. Gruenebaum cost 35.8% less than the average goalkeeper and produced a performance rating 25.33% above average, once again a valuable prospect.

Luis Robles of New York Red Bulls is potentially the most average goalkeeper in the league. His salary was $125,000 which was 5.6% less than the average goalkeeper and he produced a performance rating of 6.73, only 3% less than that of the average goalkeeper - pretty good going and probably the most steady ‘keeper in the league in terms of wage and performance output. This is good for the Red Bulls as it is not really worth them gambling on a goalkeeper who costs more in the hope of gaining a better performance rating. It also means that they can concentrate on getting good players in other positions from the money saved from not over-investing.

From this analysis it’s evident that having the most expensive goalkeeper in the league does not guarantee good performance. It will be interesting to see in the forthcoming 2015 season the performance levels of Ricketts and Hall for Orlando City. Both Gruenebaum and Attinella were available at a lower salary and rated better than them. San Jose drafted Gruenebaum in the Re-Entry draft earlier this month, in my opinion a wise move. We’ll see in 2015 whether his form continues. Attinella on the other hand looks like a good replacement for the aging Rimando, although I’d look to swoop early and take this solid and cheap keeper from Real Salt Lake. As for Hamid I’m excited to see how he progresses through the course of next season for both club and country.

Editor's note: Our own goalkeeper ratings here at ASA are correlated strongly with the WS Ratings with a coefficient of 0.76, and reaffirm the apparent market inefficiency with MLS goalkeepers. 

Why Do MLS Teams Suck at Drafting Goalkeepers?

By Bill Reno (@letsallsoccer)

Last year I interviewed John McCarthy, who at the time I was sure would be an MLS SuperDraft pick. McCarthy had just graduated from LaSalle University, and while his school didn't make that year's NCAA tournament, it was obvious he was good enough to play professionally. Still, the SuperDraft came and went, and McCarthy went untouched in the four rounds.

Andre Blake was the heralded newcomer, but the other three selections---along with other MLS combine invitees---were largely unknown. McCarthy responded to the setback by signing with the Rochester Rhinos, where he unseated an MLS-loaned player on his way to being named both USL Goalkeeper and Rookie of the Year. So how did MLS miss this one?

Every January, MLS teams draft collegiate players in the aptly named MLS SuperDraft. But for being in a country that is renowned for producing elite goalkeepers, MLS has a miserable time of identifying the talent. Consistently good collegiate goalkeepers take an unnecessarily long road to get to MLS while clubs select goalkeepers that never make an appearance. Here is a list of every drafted goalkeeper since 2006.

[Image: MLSgks.png, table of every goalkeeper drafted since 2006]

Names highlighted blue were invited to MLS combine
GS - Games started, orange numbers have at least five starts a year since draft
LG.2, LG.4 - If GK continually stayed in MLS two, four years later
TM.2, TM.4 - If GK continually stayed with team two, four years later
Y1, Y2, Y3, Y4 - Status of GK in first four years

The list looks at the first four years of each drafted goalkeeper. In the nine years covered, 60 goalkeepers were taken in either the SuperDraft or the following Supplemental Draft. Of those 60, only 31 (52%) finished the first year with the team that drafted them. After four years, two thirds of the goalkeepers weren't even in the league anymore. And with Billy Knutsen and Luis Soffner's contracts not being extended, it's looking like only four of the ten goalkeepers drafted in the past two years will still be with their original team.

What's even more bizarre, goalkeepers drafted after the 40th pick have a higher collective number of starts than those drafted before it.

Drafted Appearances # GKs # GKs (1+ GS)
1 - 40 408 20 12
41+ 501 40 12

Even though the number of post-40 draft picks is double the number drafted earlier, when you remove the goalkeepers that never started, both groups have twelve goalkeepers. Half of the goalkeepers selected in the first two rounds didn't get more than ten starts. Not only are goalkeepers being poorly selected in the draft, but it doesn't matter all that much when they were drafted or in what round. Late or early, they have surprisingly low chances of ever starting in MLS. The inefficiencies of the SuperDraft continue when you survey the current goalkeeper pool.
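To put the two draft groups on the same footing, the table's totals can be divided by the number of goalkeepers in each group. This is a small sketch using only the four numbers per row from the table above. The dictionary keys and helper names are made up for illustration, and "appearances" is the table's own column label.

```python
# Totals copied from the draft-position table above
draft_groups = {
    "1-40": {"appearances": 408, "gks": 20, "gks_with_start": 12},
    "41+": {"appearances": 501, "gks": 40, "gks_with_start": 12},
}

def per_drafted_keeper(group):
    """Average appearances across every goalkeeper drafted in the group."""
    return group["appearances"] / group["gks"]

def per_keeper_who_started(group):
    """Average appearances across only those with at least one start."""
    return group["appearances"] / group["gks_with_start"]

for label, group in draft_groups.items():
    print(label,
          round(per_drafted_keeper(group), 2),
          round(per_keeper_who_started(group), 2))
```

Per drafted goalkeeper the early picks still come out ahead, but among goalkeepers who actually started a game, the later picks average more appearances.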

Orange names are starters
Blue numbers were in the first 40 draft picks

Sixty goalkeepers were on an MLS roster this past year. Excluding the 18 goalkeepers that were not able to be selected in a collegiate draft (including Marcus Hahnemann, who graduated two years before MLS started), 16 current MLS goalkeepers went completely undrafted and entered the league later, including six current starters. Another three starters couldn't, or wouldn't, agree to a contract with an MLS side.

What if more than a quarter of all NFL quarterbacks were originally undrafted and represented 30 percent of last weekend's starters? That's ten Kurt Warners! It's not that there aren't enough rounds in the draft for goalkeepers to be drafted, it's that the SuperDraft is incredibly ineffective in scouting MLS talent. And the talent is definitely there. Fourteen of the 19 starting MLS goalkeepers came from NCAA, and 75 percent of all goalkeepers in MLS played college ball, including Canadians. Even on the international scene, NCAA has served the USMNT as well.

USMNT Caps Player USMNT Years College
104 Tim Howard 2002–2014 Did not attend college
102 Kasey Keller 1990–2007 University of Portland
100 Tony Meola 1988–2002 University of Virginia
82 Brad Friedel 1992–2004 UCLA
28 Brad Guzan 2006–2014 South Carolina
16 Nick Rimando 2002–2014 UCLA
15 Mark Dodd 1988–1998 Duke
9 Marcus Hahnemann 1994–2011 Seattle Pacific University
8 Juergen Sommer 1994–1998 Indiana University
8 Zach Thornton 1994–2001 Loyola University Maryland
7 Troy Perkins 2009–2010 South Florida
5 Kevin Hartman 1999–2006 UCLA
4 Sean Johnson 2011–2013 University of Central Florida
3 Jonny Walker 2004 Louisville
2 Joe Cannon 2003–2005 Santa Clara University
2 Bill Hamid 2012–2014 Did not attend college
2 Matt Reis 2006–2007 UCLA
1 Jon Busch 2005 Charlotte
1 Tom Presthus 1999 Southern Methodist University
1 Zach Wells 2006 UCLA
1 David Yelldell 2010 Did not attend college
1 Luis Robles 2009 University of Portland

Drafted players regularly don't work out in any league. There are only so many spots on rosters so not everyone is going to make it. But good goalkeepers are consistently coming out of the NCAA---ones good enough to play for the national team---yet MLS still hasn't figured out who they are.

There are too many goalkeepers getting invited to the combine, but then not drafted. There are too many goalkeepers that go undrafted and yet eventually do make it. MLS clubs are making bizarre trades for eventual starting goalkeepers. Teams are overpaying aging goalkeepers. Teams overstock on goalkeepers they can't unload. The league's entire approach to goalkeeping is mind-boggling, and few are getting it right. With the expansion of MLS-USL affiliations, goalkeepers are getting more secured playing time, but it doesn't matter much if MLS continues to pass on top goalkeepers and mishandle the ones they currently have.

Plus-minus stats in MLS

Most analytically inclined sports fans are aware of plus-minus metrics and what they mean. For those new to the concept, plus-minus metrics combine the individual and the team. In soccer, a player's plus-minus is the goal differential for his team only while he was playing. I was going to provide a plus-minus chart for players in MLS going back to 2011, but as I'm going to explain, it's mostly useless. Even when using expected goal differential (xGD) instead of actual goal differential to create xPlusMinus, not much can be gleaned from the metric.

Here is a chart of the top 25 MLS players in 2014 by xPlusMinus. It is followed by a brief discussion on its current lack of meaningfulness, and then some foreshadowing as to what adjustments will be made in the future to create a worthwhile plus-minus metric. These metrics are on a per-96-minute scale.

Player Team Starts Appearances Minutes xPlusMinus PlusMinus
Todd Dunivant LA 5 7 390 1.58 1.72
Alan Gordon LA 5 14 566 1.53 1.70
Omar Gonzalez LA 22 22 2014 1.12 1.10
Robbie Rogers LA 15 19 1466 1.05 0.98
Marcelo Sarvas LA 25 28 2338 1.04 1.07
Gyasi Zardes LA 26 32 2541 1.02 1.21
Landon Donovan LA 30 31 2870 0.95 1.07
Jose Leonardo Ribeiro da Silva LA 20 24 1949 0.95 1.28
Dan Gargan LA 27 29 2580 0.94 1.19
Chance Myers SKC 7 7 609 0.87 1.26
Vitor Gomes Pereira Junior LA 31 34 2906 0.86 1.06
Baggio Husidic LA 26 34 2298 0.84 1.17
Robbie Keane LA 28 29 2696 0.78 0.85
Jaime Penedo LA 29 29 2768 0.77 0.90
A.J. DeLaGarza LA 28 29 2642 0.76 0.80
Oriol Rosell Argerich SKC 7 7 641 0.76 0.45
Helbert (Fred) da Silva PHI 3 11 389 0.72 0.99
Clint Dempsey SEA 23 26 2268 0.70 0.42
Zach Scott SEA 16 16 1478 0.69 0.71
Rob Friend LA 4 10 408 0.67 -0.24
Stefan Ishizaki LA 22 30 1994 0.65 0.91
Obafemi Martins SEA 29 31 2811 0.58 0.55
DeAndre Yedlin SEA 25 25 2362 0.57 0.33
John Berner COL 5 5 479 0.55 0.20
Chad Marshall SEA 31 31 2967 0.50 0.49

Here's an example of how to interpret the chart. During Omar Gonzalez's 2,014 minutes on the field, the LA Galaxy recorded an xGD of +1.12. Since the Galaxy as a team recorded a league-best xGD of 0.88, one could come to the conclusion that Omar Gonzalez is one of the best players on the best team. That particular conclusion probably isn't too far from the truth, but what about the rest of the table?
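As a rough sketch of the per-96-minute scaling used in the chart, the on-field goal differential is multiplied by 96 and divided by the minutes played. The function name `per_96` and the +23 differential below are illustrative inputs only; the article does not publish the raw on-field totals.

```python
def per_96(goal_diff, minutes):
    """Scale a raw on-field goal differential to a per-96-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return goal_diff * 96 / minutes

# Hypothetical input: a +23 on-field goal differential over 2,014 minutes
rate = per_96(23, 2014)
print(round(rate, 2))
```

The same function works for expected goals: pass an on-field expected goal differential instead of actual goals to get an xPlusMinus-style figure.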

Control for a player's team

Basically, this statistic reiterates that the LA Galaxy were the best team in the league. Sixteen of the top 25 players are from LA, while five of the remaining nine play for the Seattle Sounders. There are fewer lineup combinations in soccer than in basketball and hockey because of the restrictions on substitutions, and this leads to a mostly redundant plus-minus metric. However, by looking at this table, we can start to figure out how to make it better, and the first thing we need to do is control for a player's teammates.

Control for strength of opposition

If we focus just on Galaxy players, we still see some weird things. Todd Dunivant and Alan Gordon are not LA's best players, so why did they end up on top? It's very possible that when these two were on the field, it also independently happened to be when LA's other players were playing exceptionally well. Neither player has a large sample size of minutes with LA, and that can lead to additional random variance in any metric. 

But there is a more concrete factor that likely influenced their plus-minus metrics: an individual's strength of schedule. Weighted by minutes, Dunivant and Gordon played against teams with expected goal differentials of -0.23 and -0.15, respectively. Compare that to Landon Donovan and Robbie Keane's -0.05. Those discrepancies don't make up the whole difference in xPlusMinus, but show that even players on the same team may face different opponents. 
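A minutes-weighted strength of schedule can be sketched as a weighted average of each opponent's expected goal differential, with the player's minutes against that opponent as the weight. The function name and the sample appearances below are hypothetical and are not taken from the ASA data.

```python
def weighted_opponent_xgd(appearances):
    """Minutes-weighted average of opponent xGD.

    `appearances` is a list of (minutes_played, opponent_xgd) pairs.
    """
    total_minutes = sum(minutes for minutes, _ in appearances)
    if total_minutes == 0:
        raise ValueError("player has no minutes")
    weighted_sum = sum(minutes * xgd for minutes, xgd in appearances)
    return weighted_sum / total_minutes

# Hypothetical player: three appearances against opponents of varying quality
sample = [(90, -0.30), (45, 0.10), (60, -0.20)]
schedule_strength = weighted_opponent_xgd(sample)
print(round(schedule_strength, 3))
```

A more negative result means the player's minutes came disproportionately against weaker opponents, which would flatter a raw plus-minus figure.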

Control for home/away and gamestate

Two other key components to adjust for are whether or not a player's team is at home, and the gamestate in which he typically plays. Robbie Rogers is probably not more valuable than Donovan or Keane, but he did play 54% of his minutes at home and 69% of his minutes while tied or losing. That compares to Donovan's 50% and 63% figures and Keane's 49% and 63%. I know that was a smattering of strange numbers, but the point is that Rogers was given both a home-field advantage and a gamestate advantage relative to many of his teammates. We must also adjust for these factors.

We hope to create a refined plus-minus statistic such as the one ESPN uses for NBA players that controls for specific lineups at any given time. For now, I have put up some of the aforementioned plus-minus metrics in our Plus-minus tab in the upper right. Or just click this link. Proceed with caution.

Shots: Confusion in correlations

By Matthias Kullowatz (@MattyAnselmo)

Much of the research I do for this site revolves around predictive analysis. I like to know which individual and team skills can be measured with stable metrics, metrics that hold true month after month. However, it's still worthwhile doing what I call explanatory analysis. Explanatory analysis involves finding the variables that explain an outcome which has already happened, even if these variables may fluctuate randomly in the future.

I have shown before that shot quality and quantity correlate well to future outcomes. But with that in mind, it is somewhat confusing that the same shot information doesn't correlate so well to the outcomes of the very same games from which the data were gathered. Here are some interesting facts about shots. 

Over the past four seasons in Major League Soccer, home teams averaged more shots in games they lost than in games they won (14.5 to 14.2). Conversely, away teams averaged more shots in wins than in losses (11.9 to 11.1). When the data are combined, the correlation between shot differential and goal differential within a match is virtually zero (CI: 0.02, 0.13). Superficially, this information seems more confusing than helpful.

This finding has led some to reason that shots are a less important metric when it comes to team evaluation. The fact that shot information is predictive is enough to convince most people (including me) that it is useful information to have. But how can it be that something predictive is not also explanatory? How can shots help to predict future outcomes, and yet not be able to explain the outcomes of those games in which they occurred?

You've probably already spotted the subtle differences between explaining and predicting, but let me take a shot (I promise that was an accident). Within a match, correlations are confusing due to all kinds of confounding variables. The answer to this question would clear some things up: "Who was winning the game when all these shots were happening?" Let's explore.

Typically, home teams outshoot away teams 14.3 to 11.3 per 96 minutes of play, and 14.1 to 10.7 in even gamestates. But when they're winning, home teams lose the shots battle 13.0 to 12.2, likely more content to sit on their leads. When they're losing and desperate for points, home teams outshoot the visitors by a huge margin, 17.2 to 9.3. So I would argue that the goal differential (gamestate) influences the shot differential as much as the shot differential influences the goal differential.
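One way to produce splits like these is to cut each match into stretches of constant gamestate and then total minutes and shots per state. The sketch below assumes a hypothetical record layout of (gamestate, minutes, home shots, away shots), and the sample numbers are invented purely to show the mechanics.

```python
from collections import defaultdict

def shots_per_96_by_gamestate(segments):
    """Return {gamestate: (home shots per 96 min, away shots per 96 min)}."""
    totals = defaultdict(lambda: [0.0, 0, 0])  # minutes, home shots, away shots
    for state, minutes, home_shots, away_shots in segments:
        totals[state][0] += minutes
        totals[state][1] += home_shots
        totals[state][2] += away_shots
    return {
        state: (home * 96 / minutes, away * 96 / minutes)
        for state, (minutes, home, away) in totals.items()
        if minutes > 0
    }

# Invented stretches of play, labeled from the home team's point of view
segments = [
    ("tied", 48, 7, 5),
    ("home_leading", 48, 6, 7),
    ("tied", 48, 8, 6),
]
rates = shots_per_96_by_gamestate(segments)
print(rates)
```

Splitting by gamestate before computing rates is what separates "teams that shoot more take the lead" from "teams with the lead shoot less".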

It's no wonder that in-game correlations between shots and goals are non-existent. Early on in games, the team that gets more shots tends to take the lead. But once they have the lead, those teams tend to  ease up on shots. Thus whenever a team "holds on" to win a game, it very likely had a shot advantage at some point, and then relinquished that shot advantage in attempting to preserve the lead. Without taking into account the gamestates, a superficial analysis would suggest that shots do not correlate to wins. 

I have done nothing with shot quality here, but that wasn't really the point. The point was to show that in-game correlations have to be treated with a lot of care if you want to come to any conclusions about causation. But for the curious, the in-game correlation between Expected Goal Differential 3.0 and final goal differential was 0.37 (0.32, 0.42). Though gamestates are still an issue, shot quality is able to account for the fact that the losing team will be taking lower quality shots, and we get something sort of intuitive.

Do goals stimulate goals?

By Matthias Kullowatz (@MattyAnselmo)

I've heard it said before that a soccer team is most vulnerable after a goal has been scored. My coaches often said this, anyway. Perhaps it was just to keep us focused after we'd experienced the euphoria of scoring or the letdown of conceding. It turns out, there is some support for this notion from the 2014 MLS season. To the results! 

1) A goal is more likely to be scored in a five-minute segment if a goal was just scored during the last five-minute segment.

First, I should say that I controlled for the teams' abilities using season expected goals data, and I controlled for the gamestate as well, since there are fewer goals typically scored in zero gamestates. Once controls were in place, I found that if a goal had been scored by either team in the last five-minute segment, the chances of another goal being scored in the next five-minute segment increased from about 15% to nearly 18%. Put another way, after a recent goal the average goals scored in the next five-minute segment increased by nearly 0.04, equivalent to about 0.70 goals over a whole game.
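
To see how the probability change and the goals-per-game figure line up, here is a back-of-the-envelope check. It assumes goals within a segment follow a Poisson distribution and that a match lasts 96 minutes; neither assumption is stated in this post, and the model actually used may differ. The 15% and 18% inputs are the rounded figures quoted above.

```python
import math

def goal_rate(p_goal):
    """Expected goals in a segment, given the chance of at least one goal.

    Assumes a Poisson count, so P(at least one goal) = 1 - exp(-rate).
    """
    return -math.log(1.0 - p_goal)

SEGMENT_MINUTES = 5
MATCH_MINUTES = 96  # assumption: average match length including stoppage time

baseline = goal_rate(0.15)    # no goal in the previous segment
after_goal = goal_rate(0.18)  # a goal in the previous segment

per_segment = after_goal - baseline
per_match = per_segment * MATCH_MINUTES / SEGMENT_MINUTES

print(f"extra goals per segment: {per_segment:.3f}")
print(f"extra goals per match:   {per_match:.2f}")
```

Under those assumptions the increase works out to roughly 0.036 goals per segment and roughly 0.69 per match, which is in line with the "nearly 0.04" and "about 0.70" figures.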

This isn't an obvious uptick in scoring. You probably wouldn't notice even if you watched a lot of games, but the effects of a recent goal are also not nothing. The game appears to open up a bit on average after a recent goal. 

2) The team that most recently scored is more likely to score again (than its typical scoring rate would suggest).

Breaking the first hypothesis down further, we actually see that the team most likely to score in the next five-minute segment is the team that just scored.* The chances of a team scoring in the next five minutes--whether it be the away team or the home team--are increased by 3 or 4 percentage points if that team scored recently. Chances increase from 6% to 9% for the away team and from 9% to 13% for the home team.

Typical sports fans may say "duh" because the existence of momentum in sports is a common belief. However, momentum is still very much a point of controversy among statisticians across all sports. I would say about the only thing we agree on is that the effects of psychological momentum are smaller than the common fan would believe, and perhaps even negligible in many scenarios. That's why this finding surprised me, especially when we consider that the team that just scored must then relinquish possession.

These results may not apply to a youth soccer team, or even a professional team from another league. But in MLS, there is an average effect on scoring, which is not necessarily negligible, that comes from a recent goal being scored.

Comments are welcome, especially if you can think of a way to further control the scenarios and weed out any biases in the observational data.

*Of course, the team that just scored is probably the better team. But that's why a control was put in place for overall team ability. What isn't controlled for is team ability on that day (due to injuries and whatnot).