Predicting Goals Scored using the Binomial Distribution

Much is made of the use of the Poisson distribution to predict game outcomes in soccer. Much less attention is paid to the use of the binomial distribution. The reason is a matter of convenience. To predict goals using a Poisson distribution, “all” that is needed is the expected goals scored (lambda). To use the binomial distribution, you would need to know both the number of shots taken (n) and the rate at which those shots are turned into goals (p). But if you have sufficient data, it may be a better way to analyze certain tactical decisions in a match. First, let’s examine whether the binomial distribution is actually dependable as a model framework. Here is the chart that shows how frequently a certain number of shots were taken in an MLS match.

source data: AmericanSoccerAnalysis

The chart resembles a binomial distribution with right skew with the exception of the big bite taken out of the chart starting with 14 shots. How many shots are taken in a game is a function of many things, not the least of which are tactical decisions made by the club. For example it would be difficult to take 27 shots unless the opposing team were sitting back and defending and not looking to possess the ball. Deliberate counterattacking strategies may very well result in few shots taken but the strategy is supposed to provide chances in a more open field.

Out of curiosity, let’s look at the average shot location by shots taken to see if there are any clues about the influence of tactics. To estimate this, I looked at expected goals per shot for each shot total. This does not have any direct influence on the binomial analysis but could come in useful when we look for applications.

source: AmericanSoccerAnalysis

The average MLS finishing rate was just over 10 percent in 2013. You can see that, at more than 10 shots per game, the expected finishing rate stays constant right at that 10-percent rate. This indicates that above 10 shots, the location distribution of those shots is typical of MLS games. However, at fewer than 10 shots you can see that the expected goal scoring rate dips consistently below 10%. This indicates that teams that take fewer shots in a game also take those shots from worse locations on average.

The next element in the binomial distribution is the actual finishing rate by number of shots taken.

 source: AmericanSoccerAnalysis

Here it’s plain that the number of shots taken has a dramatic impact on the accuracy rate of each shot. This speaks to the tactics and pace of play involved in taking different shot amounts. A team able to squeeze off more than 20 shots is likely facing a packed box and a defense less interested in ball possession. What’s fascinating then is that teams that take few shots in a game have a significantly higher rate of success despite the fact that they are taking shots from farther out. This indicates that those teams are taking shots with significantly less pressure. This could indicate shots taken during a counterattack where the field of play is more wide open.

Combining the finishing accuracy model curve with number of shots we can project expected goals per game based on number of shots taken.

ExpGoalsbyShotsTaken

What’s interesting here is that the expected number of goals scored plateaus at about 18 shots and begins to decline after 23 shots. This, of course, must be a function of the intensity of the defense they are facing for those shots because we know their shot location is not significantly different. This model is the basis by which I will simulate tactical decisions throughout a game in Part II of this post.

Now we have the two key pieces to see if the binomial distribution is a good predictor of goals scored using total shots taken and finishing rate by number of shots taken. As a refresher, since most of us haven’t taken a stat class in a while, the probability mass function of the binomial distribution looks like the following:

$$\Pr(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$$

source: wikipedia

Where:

n is the number of shots

p is the probability of success in each shot

k is the number of successful shots
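As a rough illustration of that formula, here is a minimal Python sketch of the binomial probability mass function: the chance of exactly k goals from n shots is C(n, k) · p^k · (1 − p)^(n − k). The function name `binomial_pmf` and the constants are chosen for illustration, and the inputs (13 shots at a 10.05% finishing rate) are simply the figures used in the comparison that follows. This is not the spreadsheet or code behind the charts.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k goals from n shots, each scored with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative inputs taken from the text: 13 shots, 10.05% finishing rate.
N_SHOTS = 13
FINISH_RATE = 0.1005

# Probability of every possible goal count, from 0 up to N_SHOTS.
goal_probs = [binomial_pmf(k, N_SHOTS, FINISH_RATE) for k in range(N_SHOTS + 1)]

for goals, prob in enumerate(goal_probs[:5]):
    print(f"{goals} goals: {prob:.3f}")
```

The probabilities sum to one across all goal counts, and their mean is n · p, which is why the shot count and the finishing rate are the only two inputs the model needs.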

Below I compare the actual distribution to the binomial distribution using 13 shots (since 13 is the most common number of shots in 2013’s data set), assuming a 10.05% finishing rate.

source data: AmericanSoccerAnalysis, Finishing Rate model

The binomial distribution underpredicts scoring 2 goals and overpredicts all other outcomes. Overall the expected goals are close (1.369 actual to 1.362 binomial). The Poisson is similar to the binomial, but the average error of the binomial is 12% lower than that of the Poisson.
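To see how close the two models sit, the sketch below puts a binomial with 13 shots next to a Poisson with the same mean. The inputs are the 13 shots and 10.05% rate quoted earlier, and the function names are illustrative. The actual goal frequencies from the data set are not reproduced here, so this compares only the two theoretical curves and does not recompute the error figures in the text.

```python
from math import comb, exp, factorial

N_SHOTS = 13
FINISH_RATE = 0.1005
# The Poisson lambda is matched to the binomial mean, n * p.
MEAN_GOALS = N_SHOTS * FINISH_RATE


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k goals from n independent shots."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when goals arrive at average rate lam."""
    return lam**k * exp(-lam) / factorial(k)


rows = []
for goals in range(6):
    b = binomial_pmf(goals, N_SHOTS, FINISH_RATE)
    q = poisson_pmf(goals, MEAN_GOALS)
    rows.append((goals, b, q))
    print(f"{goals} goals  binomial {b:.3f}  poisson {q:.3f}")
```

With a matched mean, the binomial is the less spread-out of the two (its variance is n · p · (1 − p) rather than lambda), so it puts slightly less weight on zero goals and slightly more on exactly one.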

If we take the average of these distributions between 8 and 13 shots (where the sample size is greater than 40) the bumps smooth out.

source data: AmericanSoccerAnalysis, Finishing Rate model

The binomial distribution seems to do well to project the actual number of goals scored in a game, and the average binomial error is 23% lower than with the Poisson. When individually looking at shots taken 7 to 16 the binomial has 19% lower error if we just observe goal outcomes 0 and 1. But so what? Isn’t it near impossible to predict the number of shots a team will take in the game? It is. But there may be tactical decisions like counterattacking where we can look at shots taken and determine if the strategy was correct or not. And a model where the final stage of estimation is governed by the binomial distribution appears to be a compelling model for that analysis. In part II I will explore some possible applications of the model.

Jared Young writes for Brotherly Game, SB Nation's Philadelphia Union blog. This is his first post for American Soccer Analysis, and we're excited to have him!

MLS Prediction Contest - We Have a Winner!

After two weeks of Major League Soccer wins, losses, and, this week, mostly draws, the best predictors were...

MLSAtheist and timbertyler tied for first place with 13 correct answers each (out of 20). Normally, we would have gone to the tiebreaker to determine the grand prize winner, but MLSAtheist, a valued contributor to American Soccer Analysis, graciously decided to withdraw his prize eligibility. That leaves timbertyler as the winner of a subscription to MLS Live 2013!

Congratulations to timbertyler; maybe Portland will follow his lead and start amassing some wins of their own.

ASA Fantasy League Update Round 2: A Terrible Case of the Nagbe's

This is your weekly reminder that you're doing MLS fantasy, and if you're taking part in our league you should probably set your rosters so you have an opportunity to win something TBD. And really, since you're probably not doing any work with the NCAA tournament going on, you have some time to make sure your lineup is good to go this week. If you aren't in our league yet, and for some reason you feel the strong need to join, you can do so by figuring out how to use this code: 9593-1668. We grade on a pass/fail scale. If you get in you passed. Here is the current week's worth of data. It's in a jpeg format because, frankly, tables show up for crap on our site and we'll be moving soon enough to this other site that... well, we'll tell you more when we're at that stage.

[Image: week2MLSFANTASY]

Here are the main takeaways for this week.

- Stop making Darlington Nagbe your Captain.

- Will Bruin continues to make me look stupid.

- I'm average, and if you are below me, you are not doing yourself any favors.

- I'm ahead of both Matthias and Drew, so while I'm the idiot of the podcast I've so far shown myself to be the better fantasy player.

- I totally lucked out with Zack MacMath this week.

Now the below image is for the week 2 "dream team" which is basically how you could have gotten the most points last week. Interesting that no one from our league sported a 3-5-2 formation this week and that three main formations were kind of cycled through for everyone.

DreamTeam-week2

Good luck to you all, and we'll see if we can ever catch up to either Bazzo, Cris Pannullo or Chris Gluck. They look poised to possibly run away with this thing. Hopefully this week will set them back so the rest of us can feel better about ourselves.

xGD in CONCACAF Champions League

Understanding that not everything has to mean something, we still try to provide meaning to things. Deriving meaning becomes infinitely harder when sample sizes are small: what size sample is important when considering a specific set of data? We don't always know, but I present you the CONCACAF Champions League data anyway. Below is the Expected Goals 1.0 data from the group stage of the CCL that I've compiled in the last couple of days.

| Team | xGF | xGA | xGD |
| --- | --- | --- | --- |
| Cruz Azul | 8.578 | 4.112 | 4.466 |
| Toluca | 7.528 | 3.488 | 4.04 |
| Tijuana | 6.617 | 3.018 | 3.599 |
| America | 6.975 | 4.017 | 2.958 |
| Dynamo | 5.683 | 3.417 | 2.266 |
| LA Galaxy | 7.052 | 4.95 | 2.102 |
| Sporting KC | 4.785 | 2.699 | 2.086 |
| SJ Earthquakes | 4.768 | 2.962 | 1.806 |
| Montreal Impact | 3.816 | 8.796 | -4.98 |
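For anyone who wants to rework the numbers, xGD is simply expected goals for minus expected goals against. The sketch below recomputes it from the xGF and xGA figures in the table; the dictionary layout and the Liga MX grouping are my own arrangement for illustration.

```python
# (xGF, xGA) per team, copied from the table above.
ccl_group_stage = {
    "Cruz Azul": (8.578, 4.112),
    "Toluca": (7.528, 3.488),
    "Tijuana": (6.617, 3.018),
    "America": (6.975, 4.017),
    "Dynamo": (5.683, 3.417),
    "LA Galaxy": (7.052, 4.95),
    "Sporting KC": (4.785, 2.699),
    "SJ Earthquakes": (4.768, 2.962),
    "Montreal Impact": (3.816, 8.796),
}

LIGA_MX = {"Cruz Azul", "Toluca", "Tijuana", "America"}

# Expected goal differential: chances created minus chances conceded.
xgd = {team: round(xgf - xga, 3) for team, (xgf, xga) in ccl_group_stage.items()}

for team, diff in sorted(xgd.items(), key=lambda item: item[1], reverse=True):
    print(f"{team:<16} {diff:+.3f}")

worst_liga_mx = min(diff for team, diff in xgd.items() if team in LIGA_MX)
best_mls = max(diff for team, diff in xgd.items() if team not in LIGA_MX)
print(f"Lowest Liga MX xGD: {worst_liga_mx}, highest MLS xGD: {best_mls}")
```

Sorting by xGD puts all four Liga MX sides ahead of every MLS side over these four games.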

To be honest, this is my inventive way to present this information to you. I wanted to do an article about various things concerning CCL, but the problem always kind of leads back to sample size. Four games just isn't that much. The thing is, while you may not be able to draw any solid conclusions from this, it does give us a rough assessment for how Liga MX compares to MLS at this juncture, and it tells us that for the most part, MLS and Liga MX teams are better than the competition.

Mind you, teams have changed between when they qualified for CCL 2013-14 and now. This San Jose Earthquakes squad, for example, has quite a few new faces. Houston has also added a couple of pieces and underwent some changes in the defensive rotation scheme.

xGD wasn't going to tell us too much about the semi-final matches that were played the last two nights. We knew that it was improbable that even two clubs were going to move forward. Furthermore it seems awkward to even consider that San Jose was the closest to advancing--and had it not been for a bad call, it probably would have.

What xGD did tell us is that all four Mexican clubs performed better in that short period than any of the MLS sides. Sure, a "duh" statement is in order, but this clarifies that point further than a cute 1990s radio morning drive show with catchy sound effects could. Cruz Azul seemed a superior team, for example, as they were more than two expected goals better than any MLS side (4.466 against Houston's 2.266). In a short tournament that says something stronger than their actual goal tallies.

Yes, I realize the whole sample size thing, and really it's funny submitting qualifying statements, but it's even more silly to consider that we qualify them despite the fact that we don't actually know if we need to. For all we know xGD stabilizes as a metric at six games or maybe even four. We'll get Matty on that...

Mexico's teams were better, and judging from everything going down on Twitter, the fragile psyche of the average US Soccer fan seems almost devastated by this fact. But the reality is that MLS is better than it has been. The league has grown so much, and considering the issues that still limit organizations from competing against Mexico, it's surprising how well we really do in this competition.

Now American teams aren't on the "elite" level yet. But they are still very good and are nearing the imaginary line of being able to compete on a greater level with Mexico. As the budgets of MLS increase, and the depth charts along with the academies grow deeper, you're only going to see MLS teams get better. Stating that an MLS team will never win the CCL is one of those hyperbolic statements that is just crazy to me. I think it's an eventuality at this point that some club somewhere will knock Mexico off its perch...sooner rather than later.

ASA Podcast: XLI An MLS Week 2 Review

Okay, here is a podcast. This is our weekly podcast that is just simply a podcast. A podcast that stars myself and Drew. A podcast that I edited to make sound not bad and somewhat interesting. I encourage you to listen to it. Drew makes lots of good points, I make some too. [audio http://americansocceranalysis.files.wordpress.com/2014/03/asa-podcast-xli-the-do-over.mp3]

Passing: An oddity in how it's measured in Soccer (Part II)

If you read my initial article, "Passing - An oddity in how it's measured in Soccer Part I", I hope you find this article of value as well, as the onion gets peeled back a bit further to focus on Crosses. To begin, please consider the different definitions of passing identified in Part I and then take some time to review these two additional articles (Football Basics - Crossing) & (Football Basics - The Passing Checklist) published by Leo Chan - Football Performance Analysis, adding context to two books written by Charles Hughes in 1987 (Soccer Tactics and Skills) and 1990 (The Winning Formula). My thanks to Sean McAuley, Assistant Head Coach for the Portland Timbers, for providing these insightful references.

When I asked John Galas, Head Coach of newly formed Lane United FC in Eugene, Oregon, here's what he had to offer:

"If a cross isn’t a pass, should we omit any long ball passing stats? To suggest a cross is not a pass [is] ridiculous, it is without a doubt a pass, successful or not - just ask Manchester United, they ‘passed’ the ball a record 81 times from the flank against Fulham a few weeks back."

When I asked Jamie Clark, Head Coach for Soccer at the University of Washington, these were his thoughts...

"It's criminal that crosses aren't considered passing statistically speaking. Any coach or player knows the art and skill of passing and realizes the importance of crossing as it's often the final pass leading to a goal. If anything, successful passes should count and unsuccessful shouldn't as it's more like a shot in many ways that has, I'm guessing, little chance of being successful statistically speaking yet necessary and incredibly important."

Once you've taken the time to read through those articles, and mulled over the additional thoughts from John Galas and Jamie Clark, consider this table.

| Stat | Golazo/MLS STATS | Squawka | Whoscored | MLS Chalkboard | My approach | Different (Yes/No)? |
| --- | --- | --- | --- | --- | --- | --- |
| Total Passes | 369 | 356 | 412 | 309 + 125 = 434 | 309 + 125 + 9 = 443 | Yes |
| Total Successful Passes | 277 | 270 | 305 | 309 | 309 + 9 = 318 | Yes |
| Passing Accuracy | 75% | 76% | 74% | NOT OFFERED | 71.78% | Yes |
| Possession Percentage | 55.30% | 53% | 55% | NOT OFFERED | 55.93% | Yes |
| Final Third Passes | 141 | NOT OFFERED | NOT OFFERED | FILTER TO CREATE | 140 | Yes |
| Final Third Passing Accuracy | 89/141 = 63.12% | NOT OFFERED | NOT OFFERED | FILTER TO CREATE | 92/140 = 65.71% | Yes |
| Total Crosses | 35 vs 26 (MLS Stats) | NOT OFFERED | 35 | 35 | 35 | No |
| Successful Crosses | 35 * .257 = 9 | NOT OFFERED | 9 | 9 | 9 | No |
| Key Passes | NOT OFFERED | 7 | 9 | 6 | 6 | Yes |
 

* NOTE: MLS Chalkboard includes unsuccessful crosses as part of their unsuccessful passes total but does not include successful crosses as part of their total successful passes; it must be done manually.
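To make the "My approach" column concrete, here is a small sketch of the arithmetic, assuming the MLS Chalkboard counts shown in the table: successful crosses are added to both the successful and the total pass counts, while unsuccessful crosses are already inside the Chalkboard's unsuccessful passes (per the note above). The function and variable names are illustrative only.

```python
def passing_accuracy(successful: int, total: int) -> float:
    """Share of attempted passes that were completed, as a percentage."""
    return 100 * successful / total


# MLS Chalkboard counts from the table above.
successful_passes = 309
unsuccessful_passes = 125  # per the note, already includes unsuccessful crosses
successful_crosses = 9     # not counted in the Chalkboard's successful passes

# Treat every successful cross as a completed pass.
total_with_crosses = successful_passes + unsuccessful_passes + successful_crosses
successful_with_crosses = successful_passes + successful_crosses

overall = passing_accuracy(successful_with_crosses, total_with_crosses)
final_third = passing_accuracy(92, 140)  # final-third counts from the table

print(f"Total passes: {total_with_crosses}")        # 443
print(f"Successful passes: {successful_with_crosses}")  # 318
print(f"Passing accuracy: {overall:.2f}%")          # 71.78%
print(f"Final third accuracy: {final_third:.2f}%")  # 65.71%
```

Counting the nine successful crosses moves the Chalkboard total from 434 to 443 passes, which is where the 71.78% figure comes from.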

For many, these differences might not mean very much, but if you are looking for correlations and considering R-squared values that go to four significant digits, these variations in the data might present an issue.

I don't track individual players but Harrison and  Matthias do, as does Colin Trainor, who offered up a great comment in the Part I series that may help others figure out where good individual data sources might come from.

What's next?

My intent here is not to simply offer up a problem without a solution; I have a few thoughts on a way forward but before getting there I wanted to offer up what OPTA responded with first:

I (OPTA representative) have has (had) a word with our editorial team who handle the different variables that we collect. There is no overlay from crosses to passes as you mentions, they are completely different data variables. This is a decision made as it fits in with the football industry more. Crosses are discussed and analysed as separate to passes in this sense. We have 16 different types of passes on our F24 feed in addition to the cross variable.

So OPTA doesn't consider a cross a pass - they consider it a 'variable'?!?

Well I agree that it is a variable as well and can (and should) be tracked separately for other reasons; but for me it's subservient to a pass first and therefore should be counted in the overall passing category that directly influences a team's percentage of possession. Put another way: it's a cross, but first and foremost it's a pass.

(Perhaps?) OPTA (PERFORM GROUP now) and others in the soccer statistics industry may reconsider how they track passes?

I am also hopeful that OPTA might create a 'hot button' on the MLS Chalkboard that allows analysts the ability to filter the final third consistently, from game to game to game, as an improvement over the already useful 'filter cross-hairs'...

In closing...

My intent is not to call out any statistical organizations but to offer up for others, who have a passion for soccer analyses, that there are differences in how some statistics can be presented, interpreted and offered up for consideration.  In my own Possession with Purpose analysis every ball movement from one player to another is considered in calculating team passing data.

Perhaps this comparison is misplaced, but would we expect the NFL to call a 'screen pass' a non-pass and a variation of a pass that isn't counted in the overall totals for a Team and Quarterback's completion rating?

Here's a great example of how Possession Percentage is being interpreted that might indicate a trend.

Ben has done some great research and sourced MLS Stats (as appropriate) in providing his data - he's also offered up that calculating possession is an issue in the analytical field of soccer as well.

In peeling back the data provided by MLS Stats he is absolutely correct that the trend is what it is... When adding crosses and other passing activities excluded by MLS Stats the picture is quite different and lends credence to what Bradley offers.

For example--when adding crosses and other passing activities not included by MLS Stats--the possession percentages for teams change, and the R-squared between points in the league table comes out as 0.353, with only 7 of 8 possession-based teams making the playoffs. New York, with most points, New England and Colorado all had possession percentages last year that fell below 50%, and only one team in MLS last year that didn't make the playoffs finished with the worst record (16 points) DC United.

For me, that was superb research - a great conclusion that was statistically supported. Yet, when viewed with a different lens on what events are counted as passes, the results are completely different.

All the best,

Chris

You can follow me on twitter @chrisgluckpwp

 

How It Happened: Week Two

I'll be frank: either week two of the MLS season was much less exciting than week one, or I did a poor job of picking games to watch and analyze this week. My bet is that both are true. Anyway, onto the show in which I take a look at three games from the weekend and pick a stat or Opta chalkboard image for each team that tells the story of how they played (last week's version is here if you missed it):

Sporting Kansas City 1 - 1 FC Dallas

Stat that told the story for Dallas: outpassed 418-213, including 103-41 in the game's first half hour

A thought occurred to me when watching this game: Sporting Kansas City has to look a lot like a prototype of what Oscar Pareja wants out of his teams. From the formation to the high-pressing, KC has long made their money by manhandling opponents as soon as they get on the ball and not letting them get comfortable. In this game, Sporting came out fired up at home and simply punched Dallas in the mouth (not even completely a figure of speech - this game was brutally physical). The high-pressing from KC's entire team had FCD out of sorts for most of the first half, particularly the first 30 minutes, when they mustered only 41 completed passes.

But the Hoops managed a road draw against the defending champs, so the game wasn't completely a story of getting worked over. As the game wore on and Sporting found it difficult to keep up the constant pressure, Dallas was able to grow into the game a bit. They certainly were never dominant, but another very good game from Mauro Diaz and some smart counter-attacks allowed Pareja's team to stem the tide for the majority of the game. In the end, it was fitting that the slugfest of a game saw just two goals, both from set pieces, but Dallas should feel good about how they played as the game progressed and were able to steal a point.

Stat that told the story for Kansas City: lack of production from forward line: 15 offensive actions in attacking third

kc2

Sporting KC won MLS Cup last year and has unquestionably been one of the league's best teams for the last few seasons. But few would argue that this success is built on a very strong defense and midfield. The forward line has often been sort of an Achilles' heel for this squad, especially now that Kei Kamara has moved on. In this game, Graham Zusi was held out so he could stay fresh for CONCACAF Champions League action, and DP forward Claudio Bieler only came on for the last 13 minutes. But the five players who saw time at a forward spot for KC (Bieler, Dom Dwyer, Sal Zizzo, CJ Sapong and Jacob Peterson) combined to register 15 offensive actions in the attacking third. 

To be clear, that 'offensive actions' stat that's illustrated above might have been made up by me just now, but it encompasses successful passes, dribbles, and all shot attempts. Too often on Saturday, and really for the last few years, Kansas City has dominated the game until the last thirty yards of the field, where they lack ideas. Getting Zusi back will likely help, as would playing Claudio Bieler for a full 90 minutes, but Sporting will need some more creativity and production from their forwards if they hope to lift another trophy this season.

Chivas USA 1 - 1 Vancouver Whitecaps

Stat that told the story for Vancouver: only 53 passes in the offensive third (23 of which were after Kekuta Manneh came on in the 60th minute)

I tuned in for the Chivas-Vancouver matchup excited to see an offensive battle between two sides that combined for 7 goals in week one. Instead, I saw an early red card to the Goats' Agustin Pelletieri followed by a lot of dull possession for Vancouver against a surprisingly organized team in red and white stripes. After looking so deadly in attack against New York, the Whitecaps looked completely lost for ideas on Sunday, with the only forays into the offensive third seeming to come from chips over the top from the superb Pedro Morales. That all changed when Kekuta Manneh came on, as he attacked the Chivas defense with and without the ball, causing fits for Eric Avila and eventually scoring the equalizer for the 'Caps. Still, after playing 87 minutes against 10 men, Vancouver has to be rightfully disappointed at only earning a point.

Stat that told the story for Chivas: Mauro Rosales turning back the clock: 151 actions

chv2

The Seattle Sounders traded Mauro Rosales to Chivas this offseason because he was too expensive and too old to fit into the club's plans for 2014. Nobody even really argued with the decision, though Rosales is undeniably a classy player and won the league's Newcomer of the Year award in 2011. So far in 2014, playing in the red and white of the Goat Zombies, Rosales has looked a lot like the 2011 playmaker that Sounders fans knew and loved. Playing down a man, Rosales was everything you could hope from a smart, skilled veteran; he hoofed it up field when in trouble so his team could get organized, he led smart counter-attacks and he kept the ball when possible (with the help of Erick Torres, who also played very well). All in all, he registered 151 actions in Opta's chalkboard, 12 more than any other player and a whopping 47 more than his nearest teammate. Not bad for a washed-up 33-year-old.

Houston Dynamo 1 - 0 Montreal Impact

Stat that told the story for Montreal: Marco Di Vaio's non-existent heat map

mtl2

I've watched about 120 minutes of Montreal Impact soccer in the season's first two weeks, and just about every one of those minutes has been more impressive than I expected from the Impact this season. Despite having zero points from their first two games (both on the road), they've actually looked pretty good on the field. Justin Mapp is doing Justin Mapp things (like this awesome run & assist from week 1), Hernan Bernardello and Patrice Bernier are pinging beautiful balls to open up space, and Felipe and Andrew Wenger are getting in pretty good goal-scoring spots. So what's the reason behind the zero points? Well, not putting chances away against the Dynamo killed Montreal. ASA's shot numbers had their xGF at 1.15 this week, but there were plenty of other times that they wasted dangerous opportunities (one particular Wenger near-breakaway early in the first half stands out). If All-Star Italian striker Marco Di Vaio hadn't been suspended, I have a hard time believing the Impact would have been shut out last week.

Stat that told the story for Houston: 8 fouls conceded in the defensive third

This was another game where what I ended up watching did not line up with the expectations I had going in. After an open, attack-filled opening game with New England, Houston came out and didn't really do much offensively against Montreal. It was actually sort of a gameplan of old-school Dom Kinnear, as the Dynamo got an early goal thanks to a deflected Will Bruin shot, and then packed it in and made themselves hard to beat. They sat in two organized banks of four so that only the perfect ball from Montreal would be enough to beat them, and when it looked like they might get beaten, they did the professional thing and took a foul. Eight of Houston's 14 fouls conceded were in their defensive third, and while I can't offer much perspective on whether that's a high proportion compared to league average, I can tell you that many of them occurred when Montreal players were breaking away and getting ready to provide a scoring chance.

Agree with my assessments? Think I'm an idiot? I always enjoy feedback. @MLSAtheist or MLSAtheist@gmail.com

MLS Week 2: Expected Goals and Attacking Passes

Truth be told, last week was kind of a failure on my part. I trusted the data and information that was supplied by Golazo, and I'm not sure it really worked out as intended. A few mistakes have been pointed out to me, and while in general those could have been avoided by double checking the MLS chalkboard, I'm not sure that I really wanted to double check their work. This week I went straight to the Chalkboard for the data and then verified the totals against the MLS soccer numbers. The totals this week were a bit surprising.

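| Team | shot1 | shot2 | shot3 | shot4 | shot5 | shot6 | Total | xGF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| San Jose | 0 | 15 | 1 | 8 | 2 | 1 | 27 | 3.231 |
| Colorado | 1 | 8 | 4 | 3 | 1 | 1 | 18 | 2.228 |
| Portland | 2 | 5 | 6 | 3 | 4 | 1 | 21 | 2.219 |
| New York | 1 | 7 | 1 | 0 | 2 | 0 | 11 | 1.667 |
| Sporting KC | 1 | 4 | 4 | 4 | 3 | 2 | 18 | 1.654 |
| Philadelphia | 2 | 2 | 4 | 3 | 2 | 0 | 13 | 1.465 |
| Chicago | 2 | 2 | 2 | 4 | 2 | 2 | 14 | 1.446 |
| Chivas | 2 | 1 | 2 | 6 | 4 | 0 | 15 | 1.351 |
| Seattle | 1 | 4 | 1 | 0 | 6 | 1 | 13 | 1.263 |
| Houston | 1 | 2 | 4 | 3 | 4 | 0 | 14 | 1.2 |
| Montreal | 1 | 2 | 2 | 3 | 8 | 0 | 16 | 1.15 |
| RSL | 0 | 3 | 3 | 2 | 4 | 0 | 12 | 0.942 |
| Toronto | 0 | 2 | 2 | 1 | 3 | 1 | 9 | 0.653 |
| New England | 1 | 1 | 1 | 1 | 1 | 0 | 5 | 0.635 |
| Vancouver | 0 | 2 | 1 | 1 | 3 | 1 | 8 | 0.582 |
| FC Dallas | 0 | 2 | 1 | 2 | 2 | 0 | 7 | 0.577 |
| Total | | | | | | | | 22.26 |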

*Expected Goals 1.0 used for this table.
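The xGF column amounts to multiplying the number of shots in each column by a conversion rate for that column and adding the results. The post does not list those rates. The six values in the sketch below are an assumption on my part: they were back-solved from the rows of the table and reproduce each printed xGF to three decimals, so treat them as inferred rather than official. The names `ZONE_RATES` and `expected_goals` are illustrative.

```python
# Conversion rate for each of the six shot columns (shot1..shot6).
# ASSUMPTION: not given in the post; back-solved from the table rows.
ZONE_RATES = (0.311, 0.177, 0.071, 0.053, 0.023, 0.035)

# Shots per column and the xGF printed in the table.
SHOTS_BY_ZONE = {
    "San Jose": ((0, 15, 1, 8, 2, 1), 3.231),
    "Colorado": ((1, 8, 4, 3, 1, 1), 2.228),
    "Portland": ((2, 5, 6, 3, 4, 1), 2.219),
    "New York": ((1, 7, 1, 0, 2, 0), 1.667),
    "Sporting KC": ((1, 4, 4, 4, 3, 2), 1.654),
    "Philadelphia": ((2, 2, 4, 3, 2, 0), 1.465),
    "Chicago": ((2, 2, 2, 4, 2, 2), 1.446),
    "Chivas": ((2, 1, 2, 6, 4, 0), 1.351),
    "Seattle": ((1, 4, 1, 0, 6, 1), 1.263),
    "Houston": ((1, 2, 4, 3, 4, 0), 1.2),
    "Montreal": ((1, 2, 2, 3, 8, 0), 1.15),
    "RSL": ((0, 3, 3, 2, 4, 0), 0.942),
    "Toronto": ((0, 2, 2, 1, 3, 1), 0.653),
    "New England": ((1, 1, 1, 1, 1, 0), 0.635),
    "Vancouver": ((0, 2, 1, 1, 3, 1), 0.582),
    "FC Dallas": ((0, 2, 1, 2, 2, 0), 0.577),
}


def expected_goals(shots):
    """Sum of (shots in a column) x (conversion rate for that column)."""
    return sum(count * rate for count, rate in zip(shots, ZONE_RATES))


for team, (shots, printed_xgf) in SHOTS_BY_ZONE.items():
    print(f"{team:<13} shots={sum(shots):>2}  xGF={expected_goals(shots):.3f}  (table: {printed_xgf})")

league_total = sum(expected_goals(shots) for shots, _ in SHOTS_BY_ZONE.values())
print(f"League total xGF: {league_total:.2f}")
```

Written this way, it is easy to see why New England can out-project Vancouver and FC Dallas on fewer shots: one shot in a high-rate column is worth several in a low-rate one.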

It's weird: over the last couple of games (counting the midweek CCL match against Toluca), San Jose has done an incredible job of generating shots against talented opposition. First, against a very talented Deportivo Toluca side that currently sits second in the Clausura 2014 table, the Quakes managed to put together 20 shots. Liga MX isn't what it once was relative to MLS, but this is a very efficient showing. Even with that, they barely squeaked by with a draw. This weekend was a much different story as they put the pedal to the floor and crashed through Real Salt Lake to draw a game they really had no business even being in to that point.

Portland is another team that stood out, but more for bad reasons than good. As Chris already alluded to this morning (he stole my thunder!), they've had an incredible number of shots blocked before they even get to the keeper. They're obviously getting into advantageous locations and taking shots, but their opponents are getting out in front and deterring those attempts. If you were going to deploy a method for stopping the Timbers' offense, that would seem to be it. Stay in front of them and prevent as many shots from occurring as possible. Portland has shown itself to be a terribly direct team.

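| Team | xGF | Goals |
| --- | --- | --- |
| San Jose | 3 | 3 |
| Colorado | 2 | 1 |
| Portland | 2 | 1 |
| New York | 2 | 1 |
| Sporting KC | 2 | 1 |
| Philadelphia | 1 | 1 |
| Chicago | 1 | 1 |
| Chivas | 1 | 1 |
| Seattle | 1 | 1 |
| Houston | 1 | 1 |
| Montreal | 1 | 0 |
| RSL | 1 | 3 |
| Toronto | 1 | 2 |
| New England | 1 | 0 |
| Vancouver | 1 | 1 |
| FC Dallas | 1 | 1 |
| Total | 22 | 19 |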

As you saw last week, our metric underpredicted the total number of goals scored, and this week we were actually over. Again this speaks to the strength of long-term averages, and you're frequently going to be bouncing around the actual total. But the important thing is that we're close, and that we understand where we came up short and where we went over. New England, Vancouver and FC Dallas are all clubs that were lucky to even make the "50%" cut off because they just barely projected for a goal. But that was because we round to the nearest whole number.
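The rounding step can be sketched in a few lines. The xGF values and goals below are the week's figures from the two tables in this post; the helper `round_half_up` is an illustrative name, used instead of Python's built-in `round` so that anything at or above .5 always goes up.

```python
# (xGF, goals scored) per team, taken from the tables in this post.
WEEK_2 = {
    "San Jose": (3.231, 3),
    "Colorado": (2.228, 1),
    "Portland": (2.219, 1),
    "New York": (1.667, 1),
    "Sporting KC": (1.654, 1),
    "Philadelphia": (1.465, 1),
    "Chicago": (1.446, 1),
    "Chivas": (1.351, 1),
    "Seattle": (1.263, 1),
    "Houston": (1.2, 1),
    "Montreal": (1.15, 0),
    "RSL": (0.942, 3),
    "Toronto": (0.653, 2),
    "New England": (0.635, 0),
    "Vancouver": (0.582, 1),
    "FC Dallas": (0.577, 1),
}


def round_half_up(value: float) -> int:
    """Round a non-negative value to the nearest whole number (0.5 rounds up)."""
    return int(value + 0.5)


predicted = {team: round_half_up(xgf) for team, (xgf, _) in WEEK_2.items()}
total_predicted = sum(predicted.values())
total_scored = sum(goals for _, goals in WEEK_2.values())

for team, (xgf, goals) in WEEK_2.items():
    print(f"{team:<13} xGF {xgf:.3f} -> {predicted[team]}  scored {goals}")
print(f"Predicted {total_predicted}, scored {total_scored}")
```

Any team projected between 0.5 and 1.0 gets credited with a full goal, which is how the three clubs at the bottom of the table make the cut.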

New England was surprisingly the highest of the three clubs. I say surprising because they tallied the fewest shots. Despite that they managed a couple of better shot locations.

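| Team | Comp. Passes | Inc. Passes | Total | Pass% | KeyP |
| --- | --- | --- | --- | --- | --- |
| Philadelphia | 76 | 35 | 111 | 68.47% | 5 |
| New England | 44 | 22 | 66 | 66.67% | 1 |
| New York | 53 | 38 | 91 | 58.24% | 6 |
| Colorado | 26 | 20 | 46 | 56.52% | 5 |
| Seattle | 59 | 54 | 113 | 52.21% | 6 |
| Toronto | 15 | 19 | 34 | 51.72% | 2 |
| Sporting KC | 38 | 29 | 67 | 56.72% | 5 |
| FC Dallas | 26 | 11 | 37 | 70.27% | 4 |
| Houston | 40 | 26 | 66 | 60.61% | 8 |
| Montreal | 49 | 25 | 74 | 66.22% | 8 |
| San Jose | 54 | 36 | 90 | 60.00% | 10 |
| RSL | 50 | 15 | 65 | 76.92% | 3 |
| Portland | 46 | 41 | 87 | 52.87% | 5 |
| Chicago | 31 | 30 | 61 | 50.82% | 7 |
| Chivas | 48 | 33 | 81 | 59.26% | 8 |
| Vancouver | 31 | 22 | 53 | 58.49% | 2 |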

Lastly we have attacking third passing data. As you see, there were only two clubs over the "100" mark this week. Seattle and Philly both collected a large percentage of the total possession, which as we have talked about previously isn't necessarily what's important. It's about WHERE you possess the ball. Well, for Philadelphia it worked out well as they pretty much dominated New England. Pushing the ball into the attacking third, the Zolos limited the total touches of New England in dangerous locations and created plenty of opportunities for themselves.

However, Seattle is a different story. As shown in PWP, they dominated a lot of the raw numbers and even managed to finally produce a goal despite shot frustrations. But Toronto preyed on the counter attack and mental mistakes by Marco Pappa. They didn't need many chances, but in the future we'll have to see if they can continue to finish as efficiently as they did on Saturday. Toronto had the fewest attacking touches in all of MLS with only 34, and while that obviously doesn't correlate 100% to goals scored, the more opportunities you have, the more likely you are to find the back of the net.
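For readers who want to rework the attacking-third table, Pass% is completed passes divided by total passes. The sketch below recomputes it from the printed counts and lists any row where the result differs from the printed percentage. On the figures as printed, every row reproduces except Toronto's, where 15 of 34 works out to 44.12% (51.72% would correspond to 15 of 29); the table alone does not settle which of those numbers is off. Names in the code are illustrative.

```python
# (completed, incomplete, printed Pass%) per team, from the attacking-third table.
ATTACKING_THIRD = {
    "Philadelphia": (76, 35, 68.47),
    "New England": (44, 22, 66.67),
    "New York": (53, 38, 58.24),
    "Colorado": (26, 20, 56.52),
    "Seattle": (59, 54, 52.21),
    "Toronto": (15, 19, 51.72),
    "Sporting KC": (38, 29, 56.72),
    "FC Dallas": (26, 11, 70.27),
    "Houston": (40, 26, 60.61),
    "Montreal": (49, 25, 66.22),
    "San Jose": (54, 36, 60.00),
    "RSL": (50, 15, 76.92),
    "Portland": (46, 41, 52.87),
    "Chicago": (31, 30, 50.82),
    "Chivas": (48, 33, 59.26),
    "Vancouver": (31, 22, 53 and 58.49),
}


def pass_pct(completed: int, incomplete: int) -> float:
    """Completed passes as a percentage of all attempted passes."""
    return round(100 * completed / (completed + incomplete), 2)


mismatches = []
for team, (completed, incomplete, printed) in ATTACKING_THIRD.items():
    computed = pass_pct(completed, incomplete)
    if abs(computed - printed) > 0.006:
        mismatches.append(team)
    print(f"{team:<13} total={completed + incomplete:>3}  computed={computed:.2f}%  printed={printed:.2f}%")

print("Rows that do not reproduce:", mismatches)
```

A check like this is cheap to run whenever counts are copied by hand from the Chalkboard.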

MLS Possession with Purpose Week 2: The best (and worst) performances

In case you missed it, I will be offering up a series of related articles throughout this season focusing on my Possession with Purpose analysis - a drive towards developing a simplified, yet systematic, statistically-based rating approach on Strategic Team performance (attacking, defending and cumulative) in executing the Six Steps of PWP. Part of this effort also includes highlighting individual players who have had a significant role in how a team performed that week. I don't claim to say that 'the player' selected is the best player on the team but it is intended to show how one player's activities help influence a team outcome. If you haven't read the introduction and explanations to PWP click here for more details.

For this week my article focuses strictly on team performance for Week 2; an additional article may be offered up later this week that covers the cumulative PWP Indices for the first two weeks; if I get it posted I'll paste a link here.

The top PWP Strategic Attacking team for week 2 was Real Salt Lake; that may come as a bit of a surprise given some other outcomes this past week - more later on that; but a good thing to remember is that high strategic ratings for RSL are not unusual given their penchant for possession and some pretty good goal scoring ratios based off shots taken and shots on goal.

PWP STRATEGIC ATTACKING INDEX WEEK 2 2014

Here's the breakout on how they performed in each of the six steps of PWP:

REAL SALT LAKE PWP STRATEGIC ATTACKING PROCESS WEEK 2 2014

The RSL Attacking Team Player of the week is Joao Plata; here are some highlighted individual statistics that helped influence team performance...

REAL SALT LAKE PWP INDIVIDUAL ATTACKER OF THE WEEK 2 2014

The bottom feeder in Strategic attack this week was the Montreal Impact (1.9808).   Some internal key indicators used to develop that rating included being 5th lowest in Total Passes (389); 5th lowest in Passing Accuracy at ~69%; 8th lowest in Passes within their Attacking Third; 10th highest # of Passes completed in the Final Third (62); 5th best with 16 Shots Taken with 5 Shots on Goal (tied for 6th best) yet no goals scored.

Other teams were less productive in some cases but the summation of all those indicators pointed to Montreal as being the least effective and efficient as a Team in Attack.

Now how about Jermain Defoe and Toronto FC? He had two goals in a blinding win for the Reds visiting Seattle. Is there a reason why Toronto didn't get the best PWP Attacking team this week? A good question, and here's why they missed out.

Recall from last year that one of the top attacking teams in MLS was Vancouver - yet on the defending side they were not quite so fortunate. Also note that both Chicago and Dallas had, on average (and in total), more goals scored than 3 other teams making the Playoffs.

In reviewing the intent of PWP; it's not intended to mirror outputs that directly match goals scored; if it was then the PWP Composite Index for last year would have been 70% accurate as opposed to 90% accurate.   For now let's just say that Toronto did a great job in taking 3 points in Seattle - it's a long season with many games yet to be played.   So the analysis doesn't snub Toronto - it simply attempts to better recognize that Real Salt Lake had a more comprehensive team attack than Toronto.

Unlike last week, the top Defending team performance did not come from the top attacking team; recall RSL gave up 3 goals-against in their draw with San Jose.

The top Defending team this past week were the Houston Dynamo.  Given Montreal were the bottom feeder in Attack it only makes sense that the most effective and efficient team in Defense... was... Houston; part of that rests with an impotent attack by Montreal but part of it also rests with a very active defending team unit of Houston.

It should be noted that last week Houston were number two in attack and defense; and while they only scored one goal this week they did, like last week, come away with a clean sheet. Is this an early sign that the Dynamo are indeed a force to be reckoned with in the East?

Here's how they compared to all other teams in Week 2:

PWP STRATEGIC DEFENDING INDEX OF THE WEEK 2 2014

Here are their Defending percentages for the six steps of the PWP Strategic Defending Process:

HOUSTON DYNAMO PWP STRATEGIC DEFENDING PROCESS OF THE WEEK 2 2014

And the PWP Defending Player of the Week award goes to Corey Ashe:

HOUSTON DYNAMO PWP INDIVIDUAL DEFENDING PLAYER OF THE WEEK 2 2014

Some could offer that David Horst or another defender might have nailed this award - for me the number of touches and passing accuracy speak to a comprehensive impact in the game, and while David did a great job in the box, especially with clearances, I thought Corey Ashe played the most comprehensive game on both sides of the pitch.

Finally, before offering up some additional observations, here's the complete picture on the PWP Composite Strategic Index for Week 2:

PWP COMPOSITE STRATEGIC INDEX FOR WEEK 2 2014

Observations:

An interesting output is how well Toronto showed against other teams this week; they took three points in their away match to Seattle yet fell below zero in their cumulative total. Part of that outcome has much to do with their on-field strategy - play the counter and allow Seattle the better part of possession in hopes of capitalizing on mistakes to generate goals.

In looking at the Seattle statistical indicators for that game they were obnoxiously potent in possession, passing, and penetration (like some other teams so far this year) but simply couldn't put quality shots on goal or another goal past Cesar.

All told Seattle offered up 643 passes {HUGE} (this includes crosses, throw-ins, etc where the intent is to move the ball from one player to another), a 79% passing accuracy with 68% possession, 61% passing accuracy within the Attacking Third, (95 passes successfully completed), yet they only tallied 13 shots taken with just 2 of them on goal.   Seattle controlled the game only up to the point of setting the stage for shots and shots on goal - two of the most critical steps in Possession with Purpose.
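To make that drop-off concrete, here is a minimal Python sketch that walks Seattle's quoted figures from passes down to shots on goal. The variable names are my own, the counts backed out of the rounded percentages are approximations, and this is only simple arithmetic on the numbers above, not the PWP index calculation itself.

```python
# Figures quoted above for Seattle in Week 2.
total_passes = 643
passing_accuracy = 0.79        # overall accuracy as quoted (rounded)
final_third_completed = 95     # passes completed in the Attacking Third
final_third_accuracy = 0.61    # Attacking Third accuracy as quoted (rounded)
shots_taken = 13
shots_on_goal = 2

# Back out approximate counts from the rounded percentages.
completed_passes = round(total_passes * passing_accuracy)
final_third_attempted = round(final_third_completed / final_third_accuracy)

# How much of the passing volume turned into shots, and shots into shots on goal.
shots_per_100_passes = 100 * shots_taken / total_passes
on_goal_share = shots_on_goal / shots_taken

print(f"Completed passes (approx.):        {completed_passes}")
print(f"Attacking Third attempts (approx.): {final_third_attempted}")
print(f"Shots per 100 passes:               {shots_per_100_passes:.2f}")
print(f"Share of shots on goal:             {on_goal_share:.1%}")
```

On these numbers Seattle produced roughly two shots for every 100 passes attempted, and fewer than one in six of those shots was on goal.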

As the year unfolds the counter-attacking style of Toronto, and others, while ceding possession, may be much more clear and additional tendencies should pop up to validate other teams taking this approach.  For now I'll call it an outlier but don't expect it to be an outlier later this year as more patterns develop.

Like last year, Portland is finding itself near the top in overall PWP.   As noted in my match analysis there is a potential weakness with Portland this year in telegraphing shots.  With 35 total shots taken this year 17 of them have been blocked before reaching the keeper - a trend to continue to watch for sure!

FC Dallas now have Pareja running the team and, if his team performs like the Rapids did last year, it is likely we continue to see them in the top half of the Index. Last year Dallas faltered around the midway point and a good indicator then was a drop in defensive performance. Should be interesting to see if that drop-off manifests itself after week 17 or so if their attack continues to stay aggressive.

Philadelphia have added Edu this year and their attack is considerably different given a more possession based approach. Jared Young offered last week that Okugo provided some very solid defensive play against Portland - we'll be sure to watch how he and Edu and others look to improve the Union results this year.

All for now; you can follow me on twitter @chrisgluckpwp

Best, Chris

Passing: An oddity in how it's measured in Soccer (Part I)

In my passion to better understand how soccer is statistically tracked I've come across what I would call an oddity about the general characterization of "passing" in the world’s greatest sport. Here's the deal - go to Squawka.com, whoscored.com, reference the "Stats" tab on mlssoccer.com, or review Golazo information, and you'll notice they all provide passing information.

My intent is not to dig deep into passing details – not yet, anyway. We’ll get there in another article to follow after I get permission from OPTA to reference their F-24 definitions within their Appendices. For now here's a simple question I have as a statistical person working on soccer analysis.

What is the number of passes I should use for teams and which denominator is the right number for total passes by both teams to help determine possession percentages?

In the MLS Chalkboard you can clearly see and count passes - here's an example from a game this past week.

An important filter to note - the major term 'Distribution' is not to be clicked in creating this filter - all that is clicked is 'successful pass and unsuccessful pass'; note also that some details are provided on the types of passes  - we’ll get there in another article.

Bottom line is that the MLS Chalkboard identifies 309 successful passes and 125 unsuccessful passes for a total of 434 passes attempted.

On the MLS Stat sheet (one tab over, but linked here) the number of passes for Chivas is 369; that number doesn't match the Chalkboard's total, successful, or unsuccessful counts.

For Golazo, for that same game here's their total: 369 Passes total with 75% accuracy meaning the total successful passes was 277 and unsuccessful passes totaled 92.  Not the same either.
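Because Golazo shows accuracy as a whole-number percentage, the 277 figure is an estimate rather than a logged count. The short sketch below, which assumes the 75% was rounded to the nearest whole percent (an assumption on my part, not something Golazo states), lists every successful-pass count consistent with what the site displays.

```python
total_passes = 369
reported_accuracy_pct = 75  # whole-number percentage as displayed (assumed rounded)

# Every successful-pass count that would display as 75% after rounding.
candidates = [
    completed
    for completed in range(total_passes + 1)
    if round(100 * completed / total_passes) == reported_accuracy_pct
]

# The single best guess used in the text, and the matching unsuccessful count.
point_estimate = round(total_passes * reported_accuracy_pct / 100)
unsuccessful_estimate = total_passes - point_estimate

print(f"Possible successful counts: {candidates[0]} to {candidates[-1]}")
print(f"Point estimate: {point_estimate} successful, {unsuccessful_estimate} unsuccessful")
```

Under that rounding assumption anywhere from 275 to 278 successful passes would show up as 75%, and none of those values reaches the 309 successful passes on the Chalkboard.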

For Squawka.com here's their total:

Successful = 270: headers (8), throughballs (2), passes (239), long balls (21) and supposedly crosses (0)
Unsuccessful = 86: passes (52), headers (14), long balls (20); no unsuccessful crosses or throughballs logged here?!

Yet the MLS chalkboard indicates 26 unsuccessful crosses! All told that is 356 passes; those figures don't match the other data sources.

For whoscored.com here's their total: Short ball = 323, Long ball = 52, Through ball = 2, Cross = 35, for a total of 412 passes - again that figure doesn't match the other data sources.

So what's the right total?  Here’s a table to compare showing the source of data and the total passes submitted for statistical folks like us to leverage in our analysis.

MLS Chalkboard 434
MLS Statistics 369
Golazo (same as MLS Stats) 369
Squawka 356
Whoscored 412
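As a quick cross-check, the sketch below rebuilds each total from the components quoted above and measures the gap between the highest and lowest source. The dictionary layout and names are illustrative only; the numbers are the ones already listed for this one game.

```python
# Pass counts for the same team and game, broken down as each source reports them.
sources = {
    "MLS Chalkboard": {"successful": 309, "unsuccessful": 125},
    "MLS Statistics": {"total": 369},
    "Golazo": {"total": 369},
    "Squawka": {"successful": 270, "unsuccessful": 86},
    "Whoscored": {"short": 323, "long": 52, "through": 2, "cross": 35},
}

# Sum each source's components to get its total passes.
totals = {name: sum(parts.values()) for name, parts in sources.items()}

lowest = min(totals, key=totals.get)
highest = max(totals, key=totals.get)
spread = totals[highest] - totals[lowest]
spread_pct_of_lowest = 100 * spread / totals[lowest]

for name, total in totals.items():
    print(f"{name:<15} {total}")
print(f"Spread: {spread} passes ({spread_pct_of_lowest:.1f}% of the {lowest} total)")
```

The gap between the Chalkboard and Squawka is 78 passes, about 22% of the Squawka total, which is large enough to move any possession percentage built on pass counts.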

Observations:

I have no idea what 'right' looks like here but here's what I've done to work through this issue.

I chose one source, the MLS Chalkboard, to gather and analyze statistics on passing and possession and all other things available from that data source - where other information is not offered there I reference the MLS Stats tab and Formation tab.

Why did I choose the Chalkboard?  Because it provides additional detail that shows more clarity on all the other types of passes that occur in a game.

For example; if you scroll down on the Chalkboard link and select Set-Pieces you’ll see that Throw-ins are included in the successful passing totals – by definition a Throw-in is a pass as it travels from one player to another.

So my recommendation, if interested, is to track Major League Soccer statistics using the MLS Chalkboard first - it's harder but seems to be the best one at this time.

I'm not sure why the MLS Chalkboard, Golazo, Whoscored and Squawka all had different team passing statistics; given that, it is likely they all have different individual player statistics as well. I asked a representative from OPTA about it, and their response is below:

“The difference between the different websites could be down to a few things. Either they take different levels of data from us, or they take the same feed but only use a chosen set of information from each feed to display their own take on each game.”

By the way – I did try to find a reasonable definition of a pass in soccer; here’s some of that information before final thoughts. Note that the definitions all differ, and Wikipedia’s shows why it’s a pretty useless source for information: for them a pass in soccer must travel on the ground – no kidding – here’s their definition up front:

“Passing the ball is a key part of association football. The purpose of passing is to keep possession of the ball by maneuvering it on the ground between different players and to advance it up the playing field.”

Other definitions get pretty detailed – it is what it is apparently – complicated…

Passing Definition: About.com World Soccer.

When the player in possession kicks the ball to a teammate. Passes can be long or short but must remain within the field of play.

Soccer Dictionary: Note there are numerous definitions provided in this link so offering up a specific link is troublesome so I will cut and paste those definitions below:

Cross, diagonal: Usually applied in the attacking third of the field to a pass played well infield from the touch-line and diagonally forward from right to left or left to right.
Cross, far-post: A pass made to the area, usually beyond the post, farthest from the point from which the ball was kicked.
Cross, flank (wing): A pass made from near to a touch-line, in the attacking third of the field, to an area near to the goal.
Cross, headers: 64% of all goals from crosses are scored by headers.
Cross, mid-goal: A pass made to the area directly in front of the goal and some six to twelve yards from the goal-line.
Pass, chip: A pass made by a stabbing action of the kicking foot to the bottom part of the ball to achieve a steep trajectory and vicious back spin on the ball.
Pass, flick: A pass made by an outward rotation of the kicking foot, contact on the ball being made with the outside of the foot.
Pass, half-volley: A pass made by the kicking foot making contact with the ball at the moment the ball touches the ground.
Pass, push: A pass made with the inside of the kicking foot.
Pass, swerve: A pass made by imparting spin to the ball, thereby causing it to swerve from either right to left or left to right. Which way the ball swerves depends on whether contact with the ball is made with the outside or the inside of the kicking foot.
Pass, volley: A pass made before the ball touches the ground.
Passing: When a player kicks the ball to his teammate.
Through pass: A pass sent to a teammate to get him/her the ball behind his defender; used to penetrate a line of defenders. This pass has to be made with perfect pace and accuracy so it beats the defense and allows attackers to collect it before the goalkeeper.

Ducksters.com offers up a Glossary and Terms for Soccer; here’s how they define a pass. This one is geared more towards teaching players about the various types of passes and the skills needed to execute them.

Direct Passes - The first type of soccer pass you learn is the direct pass. This is when you pass the ball directly to a teammate. A strong firm pass directly at the player's feet is best. You want to make it easy for your teammate to handle, but not take too long to get there.

Passes to Open Spaces - Passing into space is an important concept in making passes in soccer. This is when you pass the ball to an area where a teammate is running. You must anticipate both the direction and speed of your teammate as well as the opponents. Good communication and practice is key to good passes into space.

Wall Passes (One-Twos) - Now we are getting into more complex passing. You can think of a wall pass as bouncing a ball off of a wall to yourself. Except in this case the wall is a teammate. In wall pass you pass the ball to a teammate who immediately passes the ball back to you into open space. This helps to keep the defense off balance. This is a difficult maneuver and takes a lot of practice, but the results will make it worth the effort.

Long Passes - Sometimes you will have the opportunity to get the ball up the field quickly to an open teammate. A long pass can be used. On a long pass you kick the ball differently than with other shorter passes. You use an instep kick where you kick the soccer ball with your instep or on the shoelaces. To do this you plant your non-kicking foot a few inches from the ball. Then, with your kicking leg swinging back and bending at the knee, snap your foot forward with your toe pointed down and kick the ball with the instep of your foot.

Backward Pass - Sometimes you will need to pass the ball backward. This is done all the time in professional soccer. There is nothing wrong with passing the ball back in order to get your offense set up and maintain control of the ball.

Now that's probably not 'every' definition available but they pretty much say the same thing apart from ‘on-the-ground’ by Wikipedia – a pass is a transfer of the ball from one player to another…

In closing… 

As noted earlier – I’m not really sure what right looks like but I remain convinced that all these organizations are well-intentioned in offering up free statistics for others to use, be it for analysis, fantasy league or simply to check it out.

In my own effort to develop more comprehensive measurements and indicators a standardized source of data for the MLS would be beneficial – if the intent for MLS is to endorse OPTA then there remains a conflict as Golazo clearly does not use the same data filters as the Chalkboard.

My vote is, and will remain: keep the Chalkboard and then, MLS, consider ways, as OPTA (Perform Group) is now, to improve it for more beneficial analysis.

Here is Part II  - where I peel back a wee bit more - consider these phrases, successful crosses, launches, key passes, through-balls, throw-ins and more, as ASA continues its venture into Soccer Analysis in America.

Here’s a few paraphrased thoughts from other folks who offer up articles on ASA about this issue on passing statistics:

Jared Young – The massive difference in pass data between sites is troubling and disturbing;   I’ve been primarily using whoscored.com and golazo for my numbers so I may have to explore other options.

Cris Pannullo – Major League Soccer should take an initiative and define what pass means in their league; it is surprising that they haven’t given how popular things like fantasy sports are; people eat statistics up in this country.

All the best, Chris

You can follow me on twitter @chrisgluckpwp