Analysis of SMJHL preseason team rankings, player upgrades, and scoring stats
juke
Registered Posting Freak
First media bonus
This says 1635 words but Word says 1255 + some pictures
R code word count: an additional 1306 words (I don't know what you want to do with that information, but I'm including it just in case)

Edit: Sorry to any goalies, this entire report negates and ignores your existence :(

Hey guys, I realize this piece ends up a week too late since we're 17 games into the regular season, but I put together a few pictures highlighting how each of the SMJHL teams and players performed over the full preseason, based on their builds and attributes. It's kind of dry material, and I haven't put much effort into organizing it into a piece that reads well, so buckle up.

I started because I wanted to see which attributes might be the best to upgrade. The data was easy to scrape and format from the website, and I figured that for the time being the preseason was still the biggest collection of games to analyze. I know that some of the analyses may not be entirely accurate, since I think some GMs may have been adjusting their preseason lines to get rookies more ice time, so I plan on re-running the code once the regular season has progressed a little further to see how well the preseason data holds up.

Like I said, it all started with me wanting to see which attributes I should be putting points into. The first thing I did was pull everyone's build (basically what they spent their TPE on) and correlate each attribute with how well they had been playing. That is already flawed, because there's no overall rating for a player's performance, so I just went with their points/60 (PP60). This is especially confounding for defensemen, since the best shutdown defender might not necessarily score at a high rate. That being said, it's clear that for both forwards and defensemen, defense, scoring, puck handling, passing, and skating are the 5 attributes that correlate most strongly with scoring.
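For anyone who wants to reproduce the PP60 number, here is a minimal sketch in Python (not the R code used for the post). It assumes PP60 means points per 60 minutes of ice time and that time on ice is available in minutes; the function name and the two example skaters are made up for illustration.

```python
def points_per_60(points: int, toi_minutes: float) -> float:
    """Points scored per 60 minutes of ice time (assumed PP60 definition)."""
    if toi_minutes <= 0:
        raise ValueError("time on ice must be positive")
    return points * 60.0 / toi_minutes


# Hypothetical skaters: (name, points, total time on ice in minutes)
players = [("Skater A", 30, 900.0), ("Skater B", 12, 300.0)]

# Normalizing by ice time lets a sheltered rookie be compared with a
# top-line player who simply played far more minutes.
rates = {name: points_per_60(pts, toi) for name, pts, toi in players}
```

In this made-up example Skater B has fewer points but the higher rate, which is the whole reason for using a per-60 stat instead of raw points.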
That is probably a shock to no one, and it was kind of unnecessary, since I'd wager about 90% of the players I've seen don't really upgrade anything else, as you can also see by the vertical lines on the graph.

I also took a look at the attributes compared to +/-, but didn't find this as helpful. A lot of the correlations dropped, and as a lot of hockey fans already know, +/- is not the best metric for determining how good a player is, since it largely depends on the talent of the opposition and teammates. It would be nice to use the stats that the simulation generates to build a metric for defensive success (maybe some combination of +/-, blocked shots, hits, takeaways, etc.), but unfortunately for now everyone is valued by their scoring.

Putting it all together, I wanted to see which attributes had the highest Spearman 'r' value. Again, not too helpful in the end (pay close attention, because that's the theme of this entire post): I assume there's a lot of covariance between the different attributes, and for some attributes correlation is kind of pointless because 95% of the values are the same. But here you have it.

This next picture is, I believe, a little more telling. For the most part it's simply a heatmap of the attributes of all the players in the SMJHL, with players sorted by their PP60. You can see that despite scoring having only the 4th best correlation with PP60 for both forwards and defense, the top scorers in the preseason seemed to upgrade scoring more than passing, puck handling, and skating, while defense remains highly relevant to scoring (Andrei, pay attention).

Since it's clear that these 5 core attributes are the most relevant to how good players are, I wanted to see if the team standings had anything to do with each team's average values for these attributes. One important note: most team averages ranged from about 63 to 77, which made differences hard to see in the plot.
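As a rough illustration of the Spearman correlation mentioned above, here is a small Python sketch (again, not the original R code). It ranks both variables, gives tied values their average rank, and then takes the Pearson correlation of the ranks. The attribute values and PP60 numbers are invented, and returning `None` for a constant attribute is just one assumed way of handling the "almost everyone has the same value" problem.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation, or None if either variable is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return None  # no variation in ranks, so the correlation is undefined
    return cov / sqrt(vx * vy)


# Hypothetical defense ratings and PP60 values for five skaters.
defense = [70, 65, 80, 75, 60]
pp60 = [2.1, 1.8, 2.9, 2.0, 1.5]
rho = spearman(defense, pp60)
```

An attribute that nobody upgrades has identical values for every player, so its ranks have zero variance and the sketch returns `None` rather than a misleading number.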
To make the team differences visible, I rescaled all the values to a 0-20 scale. This means the differences in attribute values are exaggerated, just for the sake of the plot, which is ordered from top left to bottom right by where teams placed in the standings. It's tough to see any clear trends, but I think it is pretty clear that defense wins championships, or at least preseason games. Other than that, it looks like a team's average puck handling and maybe passing have some loose correlation with its place in the standings. This plot would also have you believe that Detroit is like 3x less talented than all the other teams (just like the real NHL), but again, these differences are not proportional to the actual differences in the teams' ratings. Interestingly enough, Detroit had 4 of the top 10 PP60 players from the preseason, so they were quite the top-heavy team.

The team standings over time turned out to be a little disappointing, in the sense that they didn't make for an exciting graph. Most teams were pretty consistent, with no huge slumps. I would say Carolina was the streakiest team: starting around game 20, the last 30 games of their season were pretty much two giant win streaks sandwiching a basically winless stretch. On a personal level, our beloved Berserkers were sitting in 2nd place 80% of the way through the preseason, then fell flat during the last 10 games. Conditioning coach has got to go.

The next plot turned out to be so useless that I debated even putting it in here. I wanted to see the scoring breakdown of the teams (ordered again by their place in the standings), hoping we'd maybe see that the better teams scored at a higher even-strength rate or something. But the scoring breakdown is super consistent across all teams, with the entire league within a few percentage points, so I learned nothing here.

This last plot shows the shot differentials for each team, but it's a little unclear at first.
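The 0-20 rescaling of the team averages mentioned earlier could look something like the Python sketch below. This assumes a plain min-max rescale applied to one attribute at a time across teams, which may not be exactly what the original R code did; the team names and values are placeholders.

```python
def rescale(values, new_min=0.0, new_max=20.0):
    """Min-max rescale: lowest value maps to new_min, highest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Every team has the same average; put them all at the midpoint.
        return [(new_min + new_max) / 2 for _ in values]
    span = new_max - new_min
    return [new_min + (v - lo) / (hi - lo) * span for v in values]


# Hypothetical team averages for one attribute, in the 63-77 range.
team_defense = {"Team A": 63.0, "Team B": 70.0, "Team C": 77.0}
scaled = dict(zip(team_defense, rescale(list(team_defense.values()))))
```

Under this assumption the lowest team always lands at 0 no matter how close it actually is to the rest, which is why a team a few points below the others can look several times worse on the plot.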
At this point, I was getting kind of sick of making graphs, so I combined 3 different visual scales into one graph. Obviously the x and y coordinates of the points are the team's shots for/against. The color of the dots represents how many points each team got in the standings, and I put the legend in for reference. The size of the dots represents the ranked goal differential for each team. So Vancouver, for instance, despite being dead on the league average for both shots for and against, was last in the preseason for goal differential. The last scale is harder to see, but the transparency of the dots represents what percent of the team's shots hit the net, as opposed to being blocked or missed. More solid means a higher percent. I think all the teams were within 1 or 2% of each other, so I wouldn't really worry if you can't see the differences too well.

This chart shows how thoroughly Anaheim dominated. A ton of shots for, not many shots against, a higher percent of their shots hitting the net, and they were rewarded with the best goal differential and the most points in the league.

That's all I've got for now. This ended up being more work than it was worth, I think, especially the parts regarding player upgrades, because most people already knew what the data said anyway. But I think the team graphs will be fun enough to keep updating and following along with during the season. If you made it all the way to the end of this huge post, I doubt you got anything out of it, but thanks for reading!

I've posted the code below, but be warned: I don't annotate/comment my code well.

Code:
---

Sigs: Thanks JNH, Lime, Carpy, and ckroyal92
boom
SHL GM pure of heart, dumb of ass
Awesome analysis!
luke
SHL GM Admiral of the Data Seas
juke
Registered Posting Freak
02-03-2020, 02:46 PM luketd Wrote: Oh fuck, this is legit work. What did you use to pull the data, Python? And then did you use the libraries in Python or R to get the graphs?
Obviously answered you in the Discord, but in case anyone else was wondering: the rvest R package for scraping, and ggplot2 in R for the graphs.
Clean Andrei Kostitsyn
Registered Senior Member
roastpuff
Registered Posting Freak
mxman991
Registered Senior Member
This is awesome work! Quick question though... Does Halifax have that low of scoring? Maybe that's our problem haha
juke
Registered Posting Freak
02-03-2020, 04:29 PM mxman991 Wrote: This is awesome work! Quick question though... Does Halifax have that low of scoring? Maybe that's our problem haha
Halifax ended up with the fewest goals scored in the preseason out of the 10 teams, and the second-lowest goal differential (Vancouver was lowest).
thiefofcheese
Media Graders Posting Freak