Currently, most team research focuses on statistics aggregated over an entire game and over the entire pitch. With our shot locations data, we broke down the pitch, grouping shots by their point of origin in an attempt to quantify the difficulty of each shot. There is still work to be done there, but in this project we focus not only on pitch location, but also on score and time. It's fair to assume that a team's tactics often change with the score and the time remaining in the match. A team that is ahead with a few minutes to go isn't likely to push for scoring opportunities; its opponent, conversely, is probably looking for any and all of them. Data aggregated over the whole game ignores these contingencies, these gamestates, completely.
We have now completed the collection of all shots taken in 2013, and we're keeping up with 2014, as well. Each shot can be stratified not only by which player took it, but also by the score and time of the shot. The data can tell us something like this very quickly:
When Portland traveled up to Seattle on March 16th of the 2013 season, Seattle spent more than 80 minutes with a one-goal lead. During that time, Seattle generated just six shot attempts, while Portland was able to muster 12, including Rodney Wallace's equalizer in stoppage time. However, during the second half that shot ratio was a less-extreme six-to-four in Portland's favor.
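As a rough illustration of that kind of stratification, here is a minimal Python sketch that tallies shots by team and by the goal differential at the moment of each shot. The record layout and the sample rows are invented for illustration; they are not our actual data format or real match data.

```python
from collections import Counter

# Hypothetical shot records: (minute, shooting_team, goal_diff).
# goal_diff is from the shooting team's perspective when the shot is taken.
# These rows are made up for illustration, not real match data.
shots = [
    (12, "HOME", 0),
    (30, "AWAY", -1),
    (41, "HOME", 1),
    (55, "AWAY", -1),
    (70, "AWAY", -1),
    (88, "HOME", 1),
]


def shots_by_gamestate(shots):
    """Count shots for each (team, goal differential) pair."""
    counts = Counter()
    for _minute, team, goal_diff in shots:
        counts[(team, goal_diff)] += 1
    return counts


counts = shots_by_gamestate(shots)
```

With the counts keyed this way, a shot ratio for any gamestate is just one team's tally divided by the other's for the matching differential (for example, the leading side at +1 against the trailing side at -1).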
Moving to more macro examples, with this data set we will be able to say things like:
Based on S season(s)...
- When the home team is up (or down) by G goals in the Nth minute, playing M-on-M men, it can expect to win with P probability. A summary of this can be found here.
- Teams that have generated R1 shot ratios during periods of D goal differential could be expected to beat teams that generated R2 shot ratios during periods of D goal differential with P probability.
- Players X and Y finish chances from zone two with an efficiency head and shoulders above the rest. Their goals have also been worth an average win expectancy added of +W percent.
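The first statement above boils down to an empirical conditional probability. One simple way to estimate it, sketched below in Python, is to group historical game snapshots by goal differential and minute and take the share that the home team went on to win. The snapshot format, the function names, and the sample rows are all assumptions made for illustration; this is not necessarily how our published numbers are computed, and it ignores the man-advantage dimension for brevity.

```python
from collections import defaultdict

# Hypothetical snapshots: (home_goal_diff, minute, home_won).
# Each row is one match observed at a given minute; values are invented.
snapshots = [
    (1, 80, True),
    (1, 80, True),
    (1, 80, False),
    (0, 80, True),
    (0, 80, False),
    (-1, 80, False),
]


def win_expectancy(snapshots):
    """Estimate P(home win | goal differential, minute) as a simple win share."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for goal_diff, minute, home_won in snapshots:
        key = (goal_diff, minute)
        totals[key] += 1
        wins[key] += int(home_won)
    return {key: wins[key] / totals[key] for key in totals}


def win_expectancy_added(table, goal_diff, minute):
    """Change in home win expectancy if the home team scores at this gamestate."""
    return table[(goal_diff + 1, minute)] - table[(goal_diff, minute)]


table = win_expectancy(snapshots)
added = win_expectancy_added(table, 0, 80)
```

A raw win share like this gets noisy when a gamestate has few observations, so in practice one would likely bucket minutes or smooth the estimates; the sketch leaves that out.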
Essentially, this data set can help us to create a Win Expectancy statistic, something that now exists in both baseball and American football. Additionally, it will help us to better understand what teams and certain players do tactically during various game states, and how that predicts their success in the future. We have begun publishing both Expected Goals and Even Gamestate Expected Goals here.