GAME OF THE WEEK: Los Angeles VS. Seattle

Since the weekend was filled with barbecues, families, and time away from the pseudo grind of the world, we decided to skip out on our weekly podcast. But we all love our "Game of the Week" contest so much that we decided to still preview tonight’s game of the week between Seattle and LA. This is what we do for you, America. This is our service.

DREW:

The LA Galaxy are playing a soccer game? ESPN, you know what to do... broadcast it at a time when everyone east of Utah will be asleep! After last week's Galaxy v Red Bulls snoozefest took 89 minutes for anything to happen, ESPN has decided to go double or nothing and show the slumping Galaxy against a Seattle team on a roll. That roll has largely come because Lamar Neagle (no, seriously) has either found out how to use those neon jerseys to blind defenders, or finally decided he's an MLS-quality striker. After Seattle started the season unable to score goals, the Sounders are now getting them in bunches. Or as I like to put it: they're regressing to the mean with a vengeance!

As for the Galaxy, their dependence on Juninho was exposed last week after a hard tackle from his namesake forced him to leave the game early. Los Angeles never got back into sync with him off the field, and New York dominated the rest of the game. As of this writing, his status is still up in the air, but if the Galaxy are going to keep Ozzie Alonso in check they'll need Juninho to keep him occupied. Should Garcia (or anyone else) get the start in Juninho's place, then Alonso will get more forward than he otherwise would, freeing up Neagle, Martins, and family to attack the net. Couple that with the fact that Carlo Cudicini has looked as good in goal this season as Jimmy Nielsen looks in jorts, and the Galaxy could be in for a hurtin'.
All that said, the Galaxy have been very solid at home this season, and the Fishing Village to the North have found their scoring touch at home, but still struggle to get goals on the road. My prediction: if Juninho plays, the Galaxy will pull this out 2-1. If not, it will be a 1-1 draw.
MATTHIAS:

The Sounders have come on strong recently, recording 13 points in their last five matches. I checked for a recent dip in Seattle's strength of schedule, but there was no such dip to be found. Three of those last five matches were on the road, and the run includes a road win in Kansas City and a home win over Dallas.

The Sounders' win at Colorado shouldn't be overlooked either. I will be coming out with a strength of schedule index soon, but my beta version* suggests that the Rapids have played the toughest schedule to this point (along with New England). That's not to mention that, as the away team, Seattle was giving up an estimated third-of-a-goal in an uphill battle. Impressive stuff.

But after saying all those wonderful things about Seattle, my three points this week go to the Galaxy in a one-goal victory. Though the Sounders find themselves second in the tables in goal differential, they are second to the Galaxy. Though the Sounders have an impressive 1.21 Shots-on-goal Ratio, the Galaxy have outdone them again at 1.37. Though the Sounders' strength of schedule has been difficult recently, over the course of the season it's the Galaxy that have faced seemingly tougher opponents. The final nail in the coffin is that the game will be played in Los Angeles, and that third-of-a-goal advantage will lie with the Galaxy. LA drew more than 20,000 fans to its last home match on May 5th, and you can bet they'll show up for the red-hot Sounders.
 
*Strength of schedule is currently based on opponents' goal differentials and shots-on-goal ratios.
HARRISON:

I write a lot about the Sounders over the course of the week, so let me make this simple. They were taking shots; at first they weren't going in and then, recently, they all started going in. Somewhere in between these two truths lies the median of this organization. They aren't as good or as lucky as they've been cumulatively over the past 2 1/2 weeks. But they certainly weren't as bad as they were to start the season. It's a bit difficult to gauge the true talent level of this squad because of how frequently the parts are moving about.

Unfortunately for the Sounders, Ozzie Alonso is suffering from a groin strain that will probably prevent him from making an appearance, and Steve Zakuani is still not able to go this weekend, which will force the Sounders to work with an inopportune 18 and an even less conducive starting XI. This isn't something new to them this year, but I imagine that it's still going to be tough for them to deal with due to how Los Angeles works the ball through the middle of the field with Marcelo Sarvas.

However, the Galaxy are also dealing with injuries to their central midfield---specifically to Juninho, who, as Drew mentioned above, was taken out, ironically enough, by a rough tackle from New York's opposing Juninho. Los Angeles uses an assortment of means to move the ball up the pitch. They average more shots than their opponents, more possessions and longer ones by the standard of TFS. Despite that, they've managed an impressive 17 points in 11 games and are still considered one of the more unlucky teams in all the league.

Add to their attack the athletic Robbie Rogers and a Landon Donovan---who has something to prove to Jurgen Klinsmann---and all of a sudden you have a club that is very dangerous and probably one of the better ones in the league. Add that to the likelihood of the Sounders' shot-to-goal ratio coming back to earth and the absence of Ozzie Alonso, and you end up with a very likely Galaxy win at home. I don't think it's going to be anywhere along the lines of the Sounders' defeat from the playoffs, but a 2-1 victory wouldn't surprise me.
----
Current Standings (as best as I can remember them):
Drew 0 - 3 ; Prediction: LA (if Juninho plays)
Matthias 2 - 3 ; Prediction: LA
Harrison 1 - 4 ; Prediction: LA

Game of the Week: (A Rather Late) #LAGvsNYRB Review

We've talked quite a bit about game states on the blog over the last few weeks, both linking certain articles as well as talking about it on the podcast. The ability to take specific events and associate context with them to provide a better understanding of the match results is helpful. However, there are times when I think Game States need to be refined based upon the situation. Take for instance our "game of the week" selection, New York Red Bulls at home against the potent Los Angeles Galaxy. There is a lot I could say about leaving Mike Magee behind in LA and losing Juninho just 10 minutes into the match. Attempting to use the typical goal game state doesn't really work, simply because the lone goal was scored at the 91-minute mark.

If we were looking at this in a season-long context and we wanted to see how good a team was in the "even goal state," or maybe how long they played in an even goal state, the 90+ minutes of data from this match would go towards that game state and presumably help speak to each team's ability. The problem is that on an individual game basis, sometimes there is a need for another way to really apply context to the game.

Naturally, with the injury to Juninho the first thought is to apply game states to substitutions rather than goals. The problem with that---omitting Juninho's substitution---is that substitutions take place in bunches in the second half, toward the end of the game. It becomes difficult to separate where exactly there was a specific difference maker.

So I kind of abandoned the thought of single game states in this scenario and instead looked more for another pattern.

LAGalaxy

Above is a bit from the MLS site chalkboard. Events on the timeline have been taken from each team, and each has a corresponding event associated with it on the map of the pitch. I specifically used offensive-associated filters to help give me an idea of the effectiveness of each team and how often it was involved.

The specific filters used were: Through balls, Crosses (both successful and unsuccessful), Key Passes, Shots on target, Shots off target and, lastly, Blocked shots. These are all decidedly aggressive actions that reflect a team's ability to drive towards the opposing goal. I'm not exactly sure what to make of it all, but there are almost distinctive time blocks that belong to each team as they would hold the ball and look for their own attempts on goal.

You can see that each team had a couple of chances in the last 10 minutes and it came down to a bit of luck in the circumstances of the lone goal. The timeline itself looks almost like a heartbeat rhythm between each team and their respective attempts towards the opposing goal. This is kind of the pattern I was looking to find, but I don't exactly know what to do with it.

In summation of the actual game, you could make some Carlo Cudicini references---see: Matthew Doyle for snark---and put a nice little bow on it. Yes, I do agree that LA's Italian keeper should have come out of his goal to clear the attempt, but I happen to also think that this single game came down to a rather random occurrence. A simple mistake from a goalkeeper who has been in residence at some prestigious clubs.

The league-average team finishes a shot roughly once every 10 attempts. The New York Red Bulls scored on what was their 10th attempt at goal, while LA was stuck at 9. I know it's not popular, but I believe that sometimes it's not necessarily about strategy or anything deep tactically. Instead, maybe it's about fighting for 90 minutes, putting up as many (good) shots as possible and hoping one of them goes in. That sounds a bit Charles Reep-ish, I know, but sometimes it's true. Sometimes the ball just finds its way into the back of the net.
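
To put a rough number on that, here is a minimal sketch of the "one in ten" idea. It assumes every attempt is an independent trial with the same 10% chance of going in, which is a simplification (shots are not all equal), and the function name is made up for illustration.

```python
def prob_at_least_one_goal(attempts: int, conversion_rate: float = 0.10) -> float:
    """Chance of scoring at least once, treating attempts as independent trials."""
    # Probability that every single attempt fails, subtracted from 1.
    return 1.0 - (1.0 - conversion_rate) ** attempts


new_york = prob_at_least_one_goal(10)    # New York finished with 10 attempts
los_angeles = prob_at_least_one_goal(9)  # LA finished with 9
print(f"New York: {new_york:.1%}, LA: {los_angeles:.1%}")
```

Under those assumptions the two teams' chances of finding at least one goal are only a few percentage points apart, which fits the notion that the result was close to a coin flip.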

Humans make mistakes, and even the best goalkeepers do, too.

ASA Podcast: Episode VI

Everyone, here we are with American Soccer Analysis Podcast Episode 6! We talk about Juan Agudelo, shots and finishing (skill vs. "luck"), grass pitches vs. artificial turf, Kei Kamara and his return to KC, and then some about bowel movements. LISTEN NOW!!! [audio http://americansocceranalysis.files.wordpress.com/2013/05/asa-episode-6-a-treatise-on-turf-kamara-and-bowel-movements.mp3]

Possession Confusion

Consider every conversation ever had about soccer tactics. I would bet 99.9% of them touched on one specific subject: possession. Whether it’s the men’s league team you play for, or the club team you cheer for, isn’t more possession always a good thing? I can’t answer that question confidently, but I will explore it. The first obstacle to analyzing and discussing possession in MLS is the data itself. We get our data from Opta, and this is what Opta defines as possession:

During the game, the passes for each team are totaled up, and then each team's total is divided by the game total to produce a percentage figure which shows the percentage of the game that each team has accrued in possession of the ball.

“Possession” in Opta’s data is thus a measure of the proportion of completed passes in a match for each team, not a proportion of time. A lot of short, quick passes will accrue possession for a team that may only have the ball for a matter of seconds. This isn’t necessarily bad or good. It is what it is, and we’ll work with it.
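
As a minimal sketch of that definition, pass-share possession is just each team's pass count divided by the match total. The function name and the pass counts below are invented for illustration; they are not Opta's code or real match data.

```python
def pass_share_possession(home_passes: int, away_passes: int) -> tuple[float, float]:
    """Return (home %, away %) possession as each team's share of total passes."""
    total = home_passes + away_passes
    if total == 0:
        raise ValueError("no passes recorded for either team")
    home_pct = 100.0 * home_passes / total
    return home_pct, 100.0 - home_pct


# Hypothetical match: 480 passes for the home side, 320 for the visitors.
home_pct, away_pct = pass_share_possession(480, 320)
print(home_pct, away_pct)
```

Note that nothing in this calculation involves the clock: a team stringing together many quick passes racks up "possession" regardless of how many seconds it actually held the ball.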

Not all passes are created equally---or better put, not all teams' passes average out to be equally effective---but for a moment let’s suppose that they are. It’s hard to gather data on the value of each pass, and hard to then weight teams’ passes accordingly. So let’s just stick with the assumption that all teams' passes are equally effective. Perhaps someday we can sit around drinking beer and punching holes in that assumption. Today is not that day.

Under that assumption of equal passes, a team that completes a higher proportion of passes than its opponent will likely have strung together effective buildup more often than its opponent. Having created more effective build up, that team will likely have earned more scoring opportunities than its opponent. Having earned more scoring opportunities than its opponent, that team will be more likely to score goals and nab points. So this sort of possession should really imply sunshine and rainbows for the participating team. Seems like fair logic to me, but of course, I’m the one writing.

Looking at the tables (tables that were created with Opta’s version of possession, remember), we don’t see a strong correlation between possession and results. Four of the top five teams (by points per match) have 50% possession or less, but overall there is still a weakly positive correlation. We start to get significant results when we assess the correlations between teams’ possession and Attempt Ratios (0.60*), and again with Shots on Goal Ratios (0.55*). Those positive correlations imply that more possession coincides with more scoring chances. Of course, there is not necessarily a causal link.
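
For anyone who wants to see what goes into a correlation like those, here is a small sketch of a Pearson correlation coefficient computed by hand. The five team values below are made up purely to show the mechanics; they are not the league data behind the 0.60 and 0.55 figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical team-level values: possession % and attempt ratio.
possession = [45.0, 48.0, 50.0, 53.0, 57.0]
attempt_ratio = [0.85, 1.05, 0.95, 1.10, 1.20]

r = pearson_r(possession, attempt_ratio)
print(round(r, 2))
```

A value near 1.0 means the two measures rise together almost in lockstep, a value near 0 means no linear relationship, and a negative value means one tends to fall as the other rises.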

Let’s take a look at this from another perspective. If we look at the relationships game-by-game—rather than team-by-team—the correlation between possession and scoring chances is still positive. The team that possesses the ball for a majority of passes (Opta’s definition) during any given match also tends to earn more scoring attempts than its opponent.

So far I’ve bored you with support for conventional wisdom: possession coincides with more scoring opportunities, and thus probably with better results.

But then I control for a few variables and shit goes haywire.

When I control for each individual team and whether or not they were playing at home, the relationship between possession and results is decidedly negative. In fact, a team that possesses the ball an additional 10% in any given match is expected to lose half of a goal on average, equivalent to about half of a point. For example’s sake, consider the Seattle Flounders (er, Sounders). Over Seattle’s top four matches in terms of possession, it has earned just one point. However, during Seattle’s bottom four matches in terms of possession, it has earned eight points. Seattle is an extreme case, but a good example of what my model is picking up. Most teams individually seem to do worse when their possession is higher.
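
One way to picture how the sign can flip is to compare a pooled slope with a within-team slope. The sketch below is not the model described above (it ignores home field, for one); it swaps in a simple "subtract each team's own average" approach, and every number in it is invented to show the mechanics.

```python
from collections import defaultdict


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def within_team_slope(matches):
    """Slope after removing each team's own average possession and goal differential."""
    by_team = defaultdict(list)
    for team, possession, goal_diff in matches:
        by_team[team].append((possession, goal_diff))
    xs, ys = [], []
    for rows in by_team.values():
        mean_p = sum(p for p, _ in rows) / len(rows)
        mean_g = sum(g for _, g in rows) / len(rows)
        xs.extend(p - mean_p for p, _ in rows)
        ys.extend(g - mean_g for _, g in rows)
    return ols_slope(xs, ys)


# Invented matches: (team, possession %, goal differential).
matches = [
    ("A", 42.0, 0.0), ("A", 46.0, -0.5), ("A", 50.0, -1.0),
    ("B", 54.0, 2.0), ("B", 58.0, 1.5), ("B", 62.0, 1.0),
]

pooled = ols_slope([p for _, p, _ in matches], [g for _, _, g in matches])
within = within_team_slope(matches)
print(f"pooled slope: {pooled:+.3f}, within-team slope: {within:+.3f}")
```

In this toy data the high-possession team is simply better, so the pooled slope is positive, yet each team does worse in its own higher-possession matches, so the within-team slope is negative.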

So more possession seems to correlate with more shots, and more shots seem to correlate with more goals, but for some reason more possession does not share a significant relationship with more goals. There is some missing information screwing with me, and I don’t have a definitive explanation for this strange paradox, but I will share a theory.

Each team has a style. Whether or not that style works is probably mostly a product of how well the players fit in, and how good those players are in the first place. Perhaps, in general, a style that focuses more on stringing short passes together tends to produce more shots than a high-risk/high-reward style, but this type of possession is not a necessary condition for success. Once each team develops its style, a certain amount of possession is required to optimize that style. For Montreal, it may be 49% possession, and for Portland, it might be 57%. This would explain the mild positive correlations between possession and shots across teams.

But why is it that, across games, more possession seems to correspond to fewer goals and worse results?

In a given game, if a team generates more possession—more passing by Opta’s definition—then perhaps that is indicative more of the opponent’s defense than of the desire of the team in question to possess. In other words, an excellent defense may not necessarily kill possession, but rather, push possession to less dangerous parts of the pitch. In this way, more possession is simply indicative of a frustrated team, not a team in control doing what it wants to do.

Without being able to conclude this thought exercise satisfyingly, I will propose a few things. First, that by charting each shot’s point of origin, we can begin to assess the quality of a team’s shots. And second, that possession data should be gathered from the distinct areas on the pitch. Possession in the attacking third is likely more valuable than possession in the defensive third. Some combination of these two measurements could very well help to explain the paradox we’re seeing with passing possession and team success.

*A perfect positive correlation would be 1.0.

New England Revolution acquires Juan Agudelo: What does that mean?

First things first, before I make fun of the Revolution (and I will). Their defense has been---excluding the New York outlier---borderline elite this season. That's possibly one of the few reasons they're still afloat and maybe the only reason to watch them (sorry, Lee Nguyen).

Tempo-free soccer has the Revs ranked 6th in dAG (defensive attempts on goals), which is how many times an opposing team has made any attempt at their goal. Add to that that we have them ranked 2nd (6.2%) in Opposing Finishing%, which is how often a team's opponents successfully convert attempts into actual goals. They're better than every team outside of Montreal in that category.

This has all culminated in only 6 goals allowed in 9 games, something that would be overlooked if it weren't for their horrible attack and the need for at least some positive mention.
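
For reference, the two defensive measures described above boil down to simple ratios. The sketch below uses the 6 goals and 9 games from this post; the attempts-allowed total is an assumption, back-solved so that the finishing rate lands near the quoted 6.2%, and the function names are made up.

```python
def opposing_finishing_pct(goals_allowed: int, attempts_allowed: int) -> float:
    """Share of opponents' attempts that ended up as goals, in percent."""
    if attempts_allowed <= 0:
        raise ValueError("attempts_allowed must be positive")
    return 100.0 * goals_allowed / attempts_allowed


def attempts_allowed_per_game(attempts_allowed: int, games: int) -> float:
    """Opponents' attempts on goal per match."""
    return attempts_allowed / games


GOALS_ALLOWED = 6      # from the post
GAMES = 9              # from the post
ATTEMPTS_ALLOWED = 97  # assumption, not a reported figure

pct = opposing_finishing_pct(GOALS_ALLOWED, ATTEMPTS_ALLOWED)
per_game = attempts_allowed_per_game(ATTEMPTS_ALLOWED, GAMES)
print(f"Opposing Finishing%: {pct:.1f}, attempts allowed per game: {per_game:.1f}")
```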

But now the Revs have added the young (former starlet?) Juan Agudelo, someone who saw time with the US National team only 6 months ago in Russia and didn't look awful by any stretch. To be fair, he's someone that has actually come out looking very strong for Chivas earlier this season, but he's been hampered the last few weeks with hamstring issues.

It was thought that he had mended a brewing off-season situation between himself and Chivas USA head coach, El Chelis. But of late, Chelis has given a lot of credit to his now former striker. He told MLSSoccer:

"I didn’t know what I had in Agudelo, but by having him, what I asked for doesn’t matter because Agudelo is a model. He is the natural on this team. He’s a player that has many technical qualities. He’s very involved in working to improve others."

And now he's shipped off to the greater Boston area. The full details of the acquisition have yet to be released, but in exchange the Goats received allocation money: the spice of life and magic dust that no one talks about and everyone wants. Of course, for us this isn't about the details at this point.

What Juan Agudelo will bring is spectacular things and then altogether frustrating things. He averages about 16 shots on goal per 1500 minutes, a minutes total he has yet to reach at either of his stops with Chivas or New York. A team averages a goal on 9.4% of its attempts this season, and on 26.9% of its shots on goal. Using that, there's a possibility that he adds a few additional goals to the line-up, assuming he is just average at finishing.

That said, Agudelo has beaten the average ratio over his 3,000 minutes, scoring 11 goals in 36 shots on goal (30.6%). Scoring goals is a skill, and though we don't know how much is luck vs. his ability, I think it's very possible that he will continue to beat the league average conversion rates.
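
The arithmetic behind those rates can be sketched in a few lines. This is a back-of-the-envelope product of shots on goal and conversion rate using the numbers quoted above; it is only an illustration, and any fuller projection may rest on other inputs.

```python
def conversion_rate(goals: int, shots_on_goal: int) -> float:
    """Fraction of shots on goal that became goals."""
    return goals / shots_on_goal


def projected_goals(shots_on_goal: float, rate: float) -> float:
    """Goals implied by a shots-on-goal total and a conversion rate."""
    return shots_on_goal * rate


agudelo_rate = conversion_rate(11, 36)  # 11 goals on 36 shots on goal
LEAGUE_RATE = 0.269                     # league-wide goals per shot on goal
SHOTS_ON_GOAL = 16                      # his pace per 1500 minutes

at_league_rate = projected_goals(SHOTS_ON_GOAL, LEAGUE_RATE)
at_own_rate = projected_goals(SHOTS_ON_GOAL, agudelo_rate)
print(f"career rate: {agudelo_rate:.1%}")
print(f"goals at league rate: {at_league_rate:.1f}, at his own rate: {at_own_rate:.1f}")
```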

Looking at Chris Wondolowski, Kenny Cooper and Álvaro Saborío--the top 3 scorers for 2012--they all combined to beat the league average by scoring a goal on 43.9% of their shots on target. So we can safely call scoring goals on shots on target a skill; the only problem is trying to account for luck. That's a little difficult at this stage, and so for now, we'll just mention it.

But assuming that Agudelo is consistent, continues scoring at a high rate and reaches 1500 minutes, I have him for about 6 goals this season. Right now, considering their goal conversion and their already abysmal offense, the Revs are on pace for 36 goals total to end the season. Considering their ability to suppress their opponents' talent and ability to score goals, I have them for 26 goals allowed, assuming they continue their defensive supremacy.

Using SoccerMetricsPythagorean this comes out at about 51 points... given the asinine goal difference. Add in the additional 6 goals that Juan Agudelo brings and that brings them to a total of 56. Basically almost a full point for each goal.
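
The "almost a full point for each goal" remark is just the change in projected points divided by the goals added. A tiny sketch using the figures above (the Pythagorean projections themselves are taken as given here, not recomputed):

```python
POINTS_WITHOUT_AGUDELO = 51  # projection quoted above
POINTS_WITH_AGUDELO = 56     # projection quoted above
ADDED_GOALS = 6

# Extra projected points bought by each additional goal scored.
points_per_added_goal = (POINTS_WITH_AGUDELO - POINTS_WITHOUT_AGUDELO) / ADDED_GOALS
print(f"{points_per_added_goal:.2f} points per added goal")
```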

Now, I'm not about to say that the Revolution have a shot at 60 points, not in your life. But in the last 3 years the only teams to have a plus goal differential AND not make the playoffs were the 2011 Chicago Fire and the 2010 Kansas City Wizards. Considering a team-wide return to a league-average ability to score goals AND adding Juan Agudelo, it's very possible that New England just moved themselves within striking distance of the 4th or 5th spot in the East.

Columbus and Philly, beware.

Game of the Week Recap: Dynamo at Galaxy

The Los Angeles Galaxy’s Landon Donovan blew a PK in the 25th minute, and the Houston Dynamo managed a goal in the 56th, stealing three points in LA. Definitely not what our expert panel of misfits projected on Saturday’s podcast. The Galaxy controlled possession (59.3%), won more duels (55%), and earned more attempts (19 to 14), but earned nothing in the standings for its work. Here’s a chronological summary of Houston’s shots:

Houston Dynamo shot times - May 5 2013

There are actually two shots taken in succession there before Houston’s opponent missed a PK, but the disproportionate bulk of Houston’s shots, including its goal, still came after it almost went down 1-0. That’s probably just random, but interesting nonetheless. Here’s something that’s almost surely not random. Though the Galaxy won the attempts battle, many of those attempts were blocked, and many of those blocked attempts occurred after Houston took the lead. Observe:

LA Galaxy's blocked shot times - May 5 2013

It seems the Galaxy began to get desperate, and this brings up the concept of game states which we discussed in the above podcast. Teams are likely to employ different strategies depending on the score of the game. While the Galaxy out-attempted the Dynamo during the game, many of those attempts appear to have been of the low-probability type.

The Dynamo jumped up to third in points per game, and the updated tables can now be seen here!

*Thanks to Opta and MLS Soccer for the sweet images!

ASA Podcast: Episode V

Hello loyal readers/listeners! My apologies for not having this up yesterday, but it's here now. A 55-min. bit on Markov Chains, Game States, and a preview of the Galaxy vs. Dynamo, while also introducing a new member of the American Soccer Analysis team. Hope you enjoy! [audio http://americansocceranalysis.files.wordpress.com/2013/05/asa5.mp3]

Game States: An Attempt at an Introduction

I'm no mathematician. Matty maybe, but I am not. So when approaching something like Game States, which is rather ominous and a bit intimidating, I felt it best to at least attempt an introduction. Most--if not all--of the information provided is taken from sources who are smarter than I am. That's really what this blog is all about: finding people who know and understand the principles we are trying to learn, then centralizing the material and keeping it in a tidy location where people who are new---not just to the sport, but also to the concept of analytics---can go to find information and grow their knowledge.

The idea of game states is that a match will consist of a sequence of states, where each state is defined by a combination or series of events that culminate in creating a new state. Those events give details and help break down the match. They provide context or meaning to the data that we record. Game states, as I understand them, are based upon the idea of Markov chains.

...[A] mathematical system that undergoes transitions from one state to another, between a finite or countable number of possible states. It is a random process usually characterized as memoryless: the next state depends only on the current state and not on the sequence of events that preceded it.

Let's apply this idea to a sport...

In football, if a team is in a certain situation, what happened previously has no effect on what will happen next. For example, if we have a 1st-and-10 from our own 20, it does not matter if the previous play was a kickoff for a touchback or a 10-yard gain for a first down after a 3rd-and-10 from the 10-yardline. Either way, we now have a new situation that will only directly affect the next play.

To put it all into soccer terminology, if a team has possession of the ball and is progressing past the halfway line and into the attacking third, it doesn't matter if they got it on an interception or a goal kick. Regardless of how it happened, they now have the ball going into the attacking third and will have an opportunity to threaten and score. I struggle with this thought because tactically this may not be true---an individual getting the ball on a break is different than someone participating in a slow buildup of play toward the opposition's net. But coming from a statistical and memoryless position, we simply want the facts.
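
To make the memoryless idea a bit more concrete, here is a minimal sketch of a possession modeled as a Markov chain over zones of the pitch. The zones, the two end states ("shot" and "turnover") and every probability are invented for illustration; none of it is estimated from real data.

```python
import random

# Hypothetical transition probabilities; each row sums to 1.
TRANSITIONS = {
    "defensive third": {"defensive third": 0.5, "middle third": 0.4, "turnover": 0.1},
    "middle third": {"defensive third": 0.2, "middle third": 0.4,
                     "attacking third": 0.25, "turnover": 0.15},
    "attacking third": {"middle third": 0.3, "attacking third": 0.3,
                        "shot": 0.2, "turnover": 0.2},
}
END_STATES = {"shot", "turnover"}  # the possession is over once it reaches one of these


def next_state(current: str, rng: random.Random) -> str:
    """Pick the next state using only the current state (the Markov property)."""
    options = TRANSITIONS[current]
    return rng.choices(list(options), weights=list(options.values()))[0]


def simulate_possession(start: str, rng: random.Random, max_steps: int = 50) -> list:
    """Walk the chain from `start` until the possession ends or max_steps is hit."""
    path = [start]
    while path[-1] not in END_STATES and len(path) <= max_steps:
        path.append(next_state(path[-1], rng))
    return path


rng = random.Random(0)  # seeded so the walk is repeatable
print(simulate_possession("middle third", rng))
```

Notice that `next_state` never looks at how the ball arrived in a zone, which is exactly the tactical information (break vs. slow buildup) that the memoryless view throws away.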

Let's go to another sport and put this into baseball terms because of the advanced progress that the sport in general has made in analytics. Baseball analytics breaks game states down into four basic concepts: the score, inning, base runners and outs. This would be the way you could calculate basic run/win expectancy and ratios.

If you have the average number of runs expected to score in an inning after any game state, you can figure out how many runs a stolen base is worth, or a triple, or a strikeout. The game state essentially allows us to relate everything that happens on the diamond back to the major currencies of baseball: winning and runs.
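
As a rough sketch of how that works, the value of an event is the run expectancy after it, minus the run expectancy before it, plus any runs that scored on the play. The run expectancy numbers below are illustrative placeholders, not figures from any published table, and the state labels are made up.

```python
# (bases occupied, outs) -> expected runs for the rest of the inning.
# Placeholder values for illustration only.
RUN_EXPECTANCY = {
    ("1--", 0): 0.85,  # runner on first, nobody out
    ("-2-", 0): 1.10,  # runner on second, nobody out
    ("---", 1): 0.25,  # bases empty, one out
}


def event_run_value(before, after, runs_scored: float = 0.0) -> float:
    """Change in expected runs caused by an event, plus runs scored on the play."""
    return RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before] + runs_scored


stolen_base = event_run_value(("1--", 0), ("-2-", 0))
caught_stealing = event_run_value(("1--", 0), ("---", 1))
print(f"stolen base: {stolen_base:+.2f} runs, caught stealing: {caught_stealing:+.2f} runs")
```

The same bookkeeping is what a soccer version would need: an expected-goals figure for every game state, so that a cross or a tackle can be valued by how much it moves that figure.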

We're not talking about baseball. We're talking about soccer, or for you euro snobs, 'Fútbol'. However, taking the concepts of an already practically applied matrix, such as the baseball one that Tom Tango has already developed (see below), can give us ideas on how we can attempt to create a corresponding one in soccer. Soccer has wins too---though I see more people associate points---but our true currency in this sport is goals. Everything leads back to the price or value of a goal. Whether it be a cross or a tackle, the ultimate result of what we want is to be able to understand how things work together to produce goals.

Comparing the possible game states: soccer, just like baseball, has a score line that can help give us an idea of the transitions between states. A team being up one goal, or on the other side of the coin being down one goal, can give us a definition of a game state. It's simplistic, but it works. It can also show us why sometimes (looking at you, Alex Ferguson) teams take fewer shots than at other times. It would make sense just for the purpose of proper possession.
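
Treating the score line as the game state, one simple thing to compute is how many minutes a team spent in each state. The sketch below is an assumption-laden illustration: it ignores stoppage time beyond a fixed match length, and the function name and event format are made up. The example uses the lone 56th-minute Houston goal from the Dynamo at Galaxy match.

```python
def minutes_by_goal_state(goal_events, match_length: int = 90) -> dict:
    """Minutes spent at each goal differential, from one team's point of view.

    goal_events: list of (minute, side) where side is "for" or "against".
    """
    minutes = {}
    state = 0        # current goal differential
    last_minute = 0  # minute at which the current state began
    for minute, side in sorted(goal_events):
        minute = min(minute, match_length)  # fold stoppage-time goals into the final minute
        minutes[state] = minutes.get(state, 0) + (minute - last_minute)
        state += 1 if side == "for" else -1
        last_minute = minute
    # Time from the last goal (or kickoff) to the final whistle.
    minutes[state] = minutes.get(state, 0) + (match_length - last_minute)
    return minutes


houston = minutes_by_goal_state([(56, "for")])
galaxy = minutes_by_goal_state([(56, "against")])
print(houston)  # keys are goal differentials, values are minutes
print(galaxy)
```

Summed over a season, those per-match tallies are what would let you ask how a team performs when level, ahead or behind.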

We also have a measurement for the length of the game, in that each team will play against one another for a total of 90 minutes. In soccer, we have time intervals. The largest problem is that because the game is constantly moving rather than sitting in a set static state, such as an inning, it creates a lot of various probabilities and chains. However, if you wanted to mitigate that to an extent you could instead just revert to the basics of using the first half vs. the second half of a match. While these are two big time intervals, when used in conjunction with other specific game states they could continue to help us develop a better understanding of the game.

The next one listed is base runners, but I'm going to pass on that and move on to the concept of outs. This is something that at very best I can say is a difficult correlation, but if you wanted to attempt one I might try changes in possession. This is a rather poor concept and I have no idea how or if you want to use this... probably not. Baseball limits a team to three outs per half-inning rather than to three specific possessions or attempts to score. In the first 45 minutes of a match you could have a team with anywhere between 12 and 40 possessions alone, depending on who the opponent was and how the team executed its attack.

Back to base runners, and this is one of the easier things to mimic, though I have no way of proving how closely they are related. Shots on goal conceivably could give us a baseline on the probability of a goal being scored and, more importantly, the points that are associated with winning or drawing.

One additional game state that hasn't been mentioned, and doesn't really correlate with anything that baseball has, is yellow and red cards. While baseball has ejections, they don't necessarily affect the dimensions of the game. In soccer, however, a red card puts the team down a man, rather than allowing it to just sub in a replacement.

These are all elements which you would consider context. How do teams perform during these situations? Do their possessions last longer? Do they take more shots? What is the quality of those shots? This is the information and really the purpose behind gathering the data points. Are the Seattle Sounders just as likely to score in a "-1" goal situation as they are in a tie game?

This is all information that we are seeking. The context can provide further details as well as specific entry into how certain aspects and statistics can be properly correlated to goals, and then to points toward the table.

I feel like there is more to write about here... and that's because it's true. There is a lot more to write about. But this is primarily to cover the basics of game states. We'll talk more about it in our podcast tomorrow, and we'll have a follow up post to everything when we start preparing to post MLS game state information.

A thought on Big Data and Club Analysis

I talked a bit about Big vs. Small data last week--in case you missed it, go back and check it out--and we kind of talked about how you don't have to necessarily rely on the revolution of big data. There is a need to make do with what is currently available. However, while that big data is sometimes available, there are other encumbrances to deal with:

Beyond the complexity and time constraints placed on the analysis, another major obstacle faced in the job – like that faced by so many people entrusted with big data within an organisation, football club or otherwise – is to make data useful, accessible and engaging to colleagues who have little interest or experience in dealing with numbers.

This is from a recent interview with Ben Smith of the development performance systems at Chelsea FC, a club that is often quoted as one of the "big-4" in the English Premier League. It's important to understand that, while many of these clubs have information at their disposal, few (if any) know or understand the practicality of implementing the information into their planning and preparation phase.

In fact, reading back on the 'Counter Attack' blog by Richard Whittall, some clubs--i.e. most--don't pay their club analysts. That should give you a brief, if not altogether insulting, view of how much they respect the value of the service provided. I'm not saying they don't see it as useful in "some capacity"; I just think that, in terms of how much they pay the rest of the staff, they could afford to have a full-time analyst, especially for what an analyst has the potential to provide.

But it is not, of course, just the coaching and scouting staff that benefit from the big data analytics being carried out at the club, the players are also reaping the rewards of the work across the club. He says: “Every one of Chelsea’s Academy players from the age of nine has a personalised development programme."

There is some interesting stuff here to think about. It sounds a lot like how Ravi Ramineni has started helping out David Tenney, Sounders FC fitness coach, over the past 6-8 months.

By the way, h/t goes to Ravi, who linked the article from his Twitter.