Data Incubator Challenge: Plot 2

Hi everyone! Today I’m using our blog to host some plots that I made as a challenge project for the Data Incubator. For more information on this program, visit the Data Incubator’s homepage.

[Figure: plot2]

When it’s Hot, Does Smoke Catch Fire?

Ten races into the NASCAR Sprint Cup season, Tony Stewart has put up rather pedestrian numbers. With only two top 5s and four top 10s, Smoke, as he is affectionately known throughout NASCAR circles, currently sits 21st in the point standings. His average finish of 19.8 is well worse than his career average of 12.9, yet some fans don’t seem to be worried. The reason? Memorial Day is fast approaching, and as legend has it, when the temperature heats up Smoke starts to catch fire. Throw in the fact that under NASCAR’s 2014 rules a single win will probably qualify a driver for the chase, and many fans are betting that Tony picks up at least one win sometime between now and the last race before the chase cutoff at Richmond. But will Tony pick up that win? Is there any merit to the legend that summer heat helps Smoke catch fire?

We’ll start simple and look at average finishes overall. From 1999 through 2013 Smoke has an average finish of 14.91 before Memorial Day and 11.66 after. If we limit “the summer heat” to the stretch between Memorial Day and Labor Day, the gap tightens a bit: during the summer Tony averages 11.74, compared to 13.38 during cooler times of year. To check whether these differences are significant we need a statistical test. Despite the large sample size (N = 521), the data are highly non-normal, so a Student’s t-test is not appropriate here. Instead we use a nonparametric test, the Wilcoxon-Mann-Whitney test, which lets us test for significance even though the data violate the normality assumption. Doing so gives p-values of 0.0006 (before vs. after Memorial Day) and 0.0325 (spring/fall vs. summer), indicating that Smoke’s improved post-Memorial Day and summertime performance is unlikely to be random noise. On this first look, the legend is not a myth.
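For readers who want to see the mechanics of the test, here is a minimal pure-Python sketch of a two-sided Wilcoxon-Mann-Whitney test using the normal approximation with a tie correction (and no continuity correction). It is not the software used for the numbers above, which the analysis does not depend on, and the finishing positions in the example are made up purely for illustration.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value) via normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    # Tie correction for the variance of U.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance <= 0:
        return u1, 1.0  # every value tied: no evidence of a difference

    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u1, p_value


# Hypothetical finishing positions, NOT Stewart's actual results.
before_memorial_day = [22, 15, 30, 9, 18, 27, 12]
after_memorial_day = [5, 11, 3, 14, 8, 20, 2]
u_stat, p = mann_whitney_u(before_memorial_day, after_memorial_day)
print(f"U = {u_stat}, p = {p:.4f}")
```

With several hundred races, as in the real data, the normal approximation is reasonable; for very small samples an exact version of the test would be preferable.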

Stewart's finishes before vs. after Memorial Day (left) and spring/fall vs. summer (right) with Wilcoxon-Mann-Whitney test for significance.

However, we can take this analysis further. First, let’s remove the impact of mechanical failures, which the great majority of the time are not the driver’s fault (there are situations where the driver abuses the car, causing a tire or other mechanical failure, but they tend to be few and far between). We leave in races where accidents have caused a DNF (did not finish) because sometimes this is the driver’s fault and sometimes the driver is simply caught up in someone else’s wreck. Since there is no way to differentiate fault, we play it safe and leave accidents in. Smoke now has average finishes of 14.01 before and 11.39 after Memorial Day, and 12.73 in spring/fall and 11.51 in summer. In both cases, removing mechanical failures has tightened the gap. Applying Wilcoxon-Mann-Whitney gives p-values of 0.0024 and 0.0561 respectively: Smoke’s finishes are still significantly better after Memorial Day, while the summer advantage now falls just short of the conventional 0.05 threshold. Moving forward, the rest of this analysis has DNFs from mechanical failures removed.

It’s fair to ask, “Well, isn’t Tony Stewart simply better at some tracks than others, and the tracks he’s better at tend to fall in the summer?” Here’s where it gets interesting. Looking at the 14 tracks that Tony has raced on both before and after Memorial Day, before Memorial Day Tony has averaged 1.15 positions worse than his average finish at each track. After Memorial Day, Smoke has finished 1.03 positions better than his average finish at each track, meaning that relative to his track average Tony finishes almost 2.2 positions better after Memorial Day than before. This difference is statistically significant (p = 0.0225). However, if we look at the 9 tracks that have had both summer and spring/fall races, Tony finishes near his track average regardless of time of year (0.14 worse in spring/fall, 0.22 better in summer). This small difference could be purely random chance (p = 0.4895). Since Tony does better after Memorial Day, but no better between Memorial Day and Labor Day, it stands to reason that he does best after Labor Day.
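The track-average comparison described above can be sketched in a few lines. This is one plausible way to compute it, assuming each track’s average is taken over all of the driver’s races at that track; the function name and the sample results are hypothetical and are not Stewart’s actual finishes.

```python
from collections import defaultdict
from statistics import mean


def finish_vs_track_average(results):
    """Mean of (finish - track average) for each period.

    results: list of (track, period, finish) tuples.
    Only tracks raced in more than one period are used.
    Negative values mean better (lower) finishes than the track average.
    """
    finishes_by_track = defaultdict(list)
    periods_by_track = defaultdict(set)
    for track, period, finish in results:
        finishes_by_track[track].append(finish)
        periods_by_track[track].add(period)

    diffs_by_period = defaultdict(list)
    for track, period, finish in results:
        if len(periods_by_track[track]) < 2:
            continue  # track confined to one time of year
        track_avg = mean(finishes_by_track[track])
        diffs_by_period[period].append(finish - track_avg)

    return {period: mean(diffs) for period, diffs in diffs_by_period.items()}


# Made-up results for illustration only: (track, period, finish)
sample = [
    ("Track A", "before", 10), ("Track A", "after", 6),
    ("Track B", "before", 20), ("Track B", "after", 16),
    ("Track C", "before", 12),  # raced only before Memorial Day, so excluded
]
print(finish_vs_track_average(sample))
```

Measuring each finish against the track’s own average is what removes the “he is just better at summer tracks” objection: a driver who is strong at a given track gets no credit for that strength, only for doing better or worse than usual there.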

Stewart's finish against track average for before vs. after Memorial Day (left) and spring/fall vs. summer (right) with Wilcoxon-Mann-Whitney test for significance.

As it turns out, Smoke does indeed perform best late in the season. Now we divide the season into three segments, “Beginning”, consisting of races before Memorial Day, “Middle”, consisting of races between Memorial Day and Labor Day, and “End”, for races after Labor Day. For tracks that are not confined to one time of year he outperforms his track average by 1.21 positions at the end of the season, best of the three groups. This is one position better than the middle of the year, and 2.36 positions better than the beginning of the year against his track averages. The difference between beginning and end of year finishes produces the only result of significance (p = 0.0236).

Stewart’s finishes against track average for beginning, middle, and end of year with Kruskal-Wallis (the multiple-group extension of Wilcoxon-Mann-Whitney) test for significance.

In other words, we can conclude, yes Tony Stewart does perform better after Memorial Day than before, but that is because stating it this way includes the end of the year. Instead, it is more correct to say he performs best at the end of the year, outperforming his track averages the most as the season winds down.

So, will Smoke similarly heat up this year? Under NASCAR’s new rules, winning matters most. Before Memorial Day Smoke has won 6 times in 174 races for a win ratio of 3.45%. After Memorial Day, he boasts a 12.54% win ratio. With 16 races left before the chase, a similar 12.5% winning clip gives Smoke an expected value of 2 wins before the chase in an average year. However, this has been no ordinary year so far for Smoke. Through 10 races, the 2014 season has been his worst start to date: Smoke has finished on average 8 spots worse than his track average. By comparison, his second worst start through 10 races was in 2007, when he averaged 4.9 spots worse than his track average. Applying the eight extra positions to Tony’s expected finish at each track, but crediting him with a post-Memorial Day bonus at all the tracks after Charlotte, we can try to calculate his expected wins. To do so, the data for (actual finish – track average finish) need to be transformed since they are highly non-normal. I first shifted the data so that all values are positive and then applied a Box-Cox transformation to produce a near-normal distribution. From here, we can calculate the expected win value for each of the 16 remaining races, since we have the transformed mean and standard deviation of his actual finishes around his average track finish. When all is said and done, Tony’s new expected wins value is 0.1275 wins before the chase, a far cry from the 2.0 expected wins. This equates to about a 12% chance that Tony wins at least one race. Not too great.
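The last step of that calculation, going from per-race win probabilities to expected wins and to the chance of at least one win, can be sketched as follows. The per-race probabilities from the Box-Cox model are not listed above, so this sketch assumes, purely for illustration, that the 0.1275 expected wins are spread evenly over the 16 remaining races; that even split is my simplification, not part of the original model.

```python
def expected_wins(win_probs):
    """Expected number of wins is the sum of the per-race win probabilities."""
    return sum(win_probs)


def prob_at_least_one_win(win_probs):
    """1 minus the probability of losing every race (races treated as independent)."""
    p_no_wins = 1.0
    for p in win_probs:
        p_no_wins *= 1.0 - p
    return 1.0 - p_no_wins


races_left = 16

# Historical post-Memorial Day clip from the article, applied to every race.
historical = [0.1254] * races_left

# ASSUMPTION: the adjusted 0.1275 expected wins split evenly across races.
adjusted = [0.1275 / races_left] * races_left

print(f"Historical expected wins: {expected_wins(historical):.2f}")
print(f"Adjusted expected wins:   {expected_wins(adjusted):.4f}")
print(f"Adjusted P(at least one win): {prob_at_least_one_win(adjusted):.1%}")
```

Under these assumptions the chance of at least one win comes out at roughly 12%, in line with the figure quoted above. With unequal per-race probabilities the result would shift slightly, but for probabilities this small the total expected wins largely determines the answer.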

Histogram of difference between actual finish and track average finish showing significantly skewed distribution of data

Tony will have to step up his game if he wishes to make the chase. The data suggest he is only marginally better in the middle of the year, contrary to the “as the summer heats up so does Tony” myth that is out there. Lucky for him, all it takes is one win, and if he does manage to significantly improve his performance, or grab an unexpected win with subpar performance, and make the chase, all bets are off. As the data show, Tony at the end of the year is a vastly different Tony than at the beginning of the year. Smoke could very well contend for the title if he simply makes the chase.

The Final Four: Why You Should Bet “On Wisconsin”

Florida is the Vegas favorite to cut down the nets Monday night. But that doesn’t mean betting on them gives the best odds for bettors. I preview each match-up from a statistical point of view, and provide the chances each team has of moving on to the title game, and becoming the NCAA basketball champion.

Wisconsin v. Kentucky (-2)

This game pits a talented yet under-performing Kentucky squad, full of future NBA players and coached by the legendary John Calipari, against the last Big Ten hopeful, Wisconsin, in its first Final Four in the 13 years Bo Ryan has been at the helm. Vegas has jumped on board the Kentucky bandwagon and has made them a 2-point favorite over Wisconsin. My numbers disagree.

I have enhanced the cluster analysis I performed last week by adding tempo to the equation. This gives me two sets of clusters. The first set has eight clusters of teams where tempo is not included, while the second set has seven clusters with tempo included. Combining the two could group the teams into as many as 56 clusters (eight clusters multiplied by seven clusters); however, only 28 clusters of teams are actually formed. This isn’t unexpected, as the only difference was the inclusion of tempo, so teams that were similar in the first cluster analysis will often be similar in the second. Wisconsin is a Type 7 team in the first analysis and a Type 6 team in the second; I notate this as (7,6). Kentucky falls into group (5,4). As it turns out, (7,6) vs. (5,4) has occurred 164 times in the 2013-2014 NCAA basketball season, with (5,4) winning 83 times and (7,6) winning 81 times. Additionally, (5,4) teams have won more despite being ranked lower on average according to Pomeroy’s ratings. Type (7,6) teams had an expected winning percentage of 54.0%, but with a 164-game sample size, the 83 wins by (5,4) teams could have occurred by pure chance (p = 0.13). Thus, we cannot show that (5,4) teams match up well against (7,6) teams, so we give no match-up bonus to Kentucky.
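To give a feel for where a p-value like this comes from, here is a minimal sketch of a one-sided exact binomial test. It assumes every one of the 164 games is an independent trial with the same 54.0% chance of a (7,6) win, which is a simplification (the expected winning percentage differs from game to game), and it is not necessarily the exact test behind the number quoted above.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


games = 164
wins_76 = 81          # actual wins by (7,6) teams
expected_rate = 0.54  # expected (7,6) winning percentage

# One-sided question: how likely are 81 or fewer wins if the true rate is 54%?
p_value = binom_cdf(wins_76, games, expected_rate)
print(f"Expected wins: {games * expected_rate:.1f}, actual: {wins_76}")
print(f"One-sided p-value: {p_value:.3f}")
```

A shortfall of about seven or eight wins over 164 games is well within what chance alone produces, which is why no match-up bonus is applied here.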

This is supported by the fact that Wisconsin has been quite successful against Kentucky-type teams this year, going 7-1 against (5,4) teams with the sole loss coming to Minnesota on the road (they beat the Gophers two other times). This 87.5% win rate is in line with a 79.0% expected winning percentage. Wisconsin recently dismantled Baylor, who, as Mike Portscheller at btn.com points out, is quite similar in style to Kentucky. In fact, they are so similar that my two cluster analyses classified Baylor as a (5,4) team, just like Kentucky. Kentucky, on the other hand, has gone 1-1 against (7,6) teams, with a win over Boise St. and a loss against Michigan St.

I also built a regression model that predicts each team’s offensive and defensive efficiency from Pomeroy’s four factors. Normally, I’d adjust the four factors for strength of schedule, but the difference in each team’s schedule strength is negligible for this analysis, so I stick with unadjusted values. The model predicts Wisconsin to score 110.2 points per 100 possessions and Kentucky to score 108.1 points per 100 possessions. With no match-up adjustment, this means Wisconsin should win 60.9% of the time.

Throw in the injury to Willie Cauley-Stein (who hasn’t been ruled out, but I’m counting as out, or at best ineffective) and Wisconsin definitely has the edge, expected to win 63.1% of the time. Wisconsin moves on.

Prediction: Wisconsin 69, Kentucky 63.

Florida (-6) v. Connecticut

The earlier game is a rematch of a meeting earlier in the year, when the home Huskies edged out a one-point victory over Florida in Storrs. Since then Florida has not lost, reeling off 30 straight wins en route to the #1 overall seed in the tournament.

Florida falls into the (6,5) group of teams while Connecticut is a (7,2). Here, the cluster analysis again predicts no match-up difference. Adjusting for missing players actually gives a slight advantage to Florida. One of Florida’s losses (to Wisconsin on the road) came with Dorian Finney-Smith and Scottie Wilbekin sidelined, and accounting for this gives Florida a small boost in each of the four factors. On the other hand, while two of the Huskies’ losses came while missing key players, they also went 3-2 in other games when their opponents were missing key players. This produces a near neutral net effect, with minimal changes to each component of the four factors. Using the regression model with the missing player adjusted four factors, Florida is expected to win 75.1% of the time.

UConn’s Cinderella run comes to an end, and the Gators advance and cover the six point spread.

Prediction: Florida 65, Connecticut 56.

NCAA Championship Game

Using the probabilities above, it’s easy to calculate the odds of each potential match-up. Here are the probabilities for each match-up; those where the chances are better than the Vegas odds imply are noted with the expected return on a dollar.

  • Florida v. Wisconsin 47.4% chance (Vegas odds 1.6 to 1), $1.232
  • Florida v. Kentucky 27.7% chance (Vegas odds 1.2 to 1)
  • Connecticut v. Wisconsin 15.7% chance (Vegas odds 6 to 1), $1.099
  • Connecticut v. Kentucky 9.2% chance (Vegas odds 5 to 1)
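The list above can be reproduced with a short sketch. It assumes the two semifinals are independent, and it interprets the “expected return on a dollar” as the win probability times the total payout on a $1 bet (winnings plus the returned stake); that reading is my assumption, chosen because it matches the dollar figures in the list.

```python
def expected_return(prob, odds_to_one):
    """Expected payout on a $1 bet at 'odds_to_one to 1', counting the returned stake."""
    return prob * (odds_to_one + 1)


p_wisconsin = 0.631  # Wisconsin over Kentucky, from the article
p_florida = 0.751    # Florida over Connecticut, from the article

# match-up name -> (probability both teams advance, Vegas odds "x to 1")
matchups = {
    "Florida v. Wisconsin": (p_florida * p_wisconsin, 1.6),
    "Florida v. Kentucky": (p_florida * (1 - p_wisconsin), 1.2),
    "Connecticut v. Wisconsin": ((1 - p_florida) * p_wisconsin, 6.0),
    "Connecticut v. Kentucky": ((1 - p_florida) * (1 - p_wisconsin), 5.0),
}

results = {}
for name, (prob, odds) in matchups.items():
    prob = round(prob, 3)  # round first, as the listed percentages are rounded
    results[name] = (prob, expected_return(prob, odds))

for name, (prob, payout) in results.items():
    note = " (favorable)" if payout > 1 else ""
    print(f"{name}: {prob:.1%} chance, ${payout:.3f} per $1{note}")
```

Any bet whose expected return exceeds $1.00 is one where the model’s probability is higher than the odds imply, which is why only the two Wisconsin match-ups carry a dollar figure.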

NCAA Title Odds

The only significant adjustment to Pomeroy’s ratings for potential championship game match-ups comes from a possible Florida v. Wisconsin game. Here, Wisconsin, a (7,6) team, would face Florida, a (6,5) team; (7,6) teams have won 36.2% more often than expected against (6,5) teams. This holds true for Wisconsin, as they are 3-1 against (6,5) teams, having beaten Virginia, Florida, and St. Louis, with the lone loss to Ohio State. The only other adjustments are the aforementioned missing player adjustments to Kentucky, Florida, and Connecticut. The missing player and match-up adjustments move Wisconsin’s chances of a victory over Florida to 45.5%.

The missing player and match-up adjusted winning percentages produce the following championship probabilities, with favorable bets noted with the expected return on a dollar:

  • Florida 45.5% (Vegas odds EVEN)
  • Wisconsin 30.8% (Vegas odds 3.5 to 1), $1.386
  • Kentucky 13.0% (Vegas odds 2.75 to 1)
  • Connecticut 10.7% (Vegas odds 7 to 1)
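Another way to read this list is to convert each set of Vegas odds into the break-even probability it implies and compare that with the model’s probability. The sketch below does this; it ignores the bookmaker’s margin, and the names `implied_probability` and `edges` are my own, introduced only for illustration.

```python
def implied_probability(odds_to_one):
    """Break-even win probability for a bet paying 'odds_to_one to 1'."""
    return 1.0 / (odds_to_one + 1.0)


# team -> (model title probability from the article, Vegas odds "x to 1")
title_odds = {
    "Florida": (0.455, 1.0),  # EVEN
    "Wisconsin": (0.308, 3.5),
    "Kentucky": (0.130, 2.75),
    "Connecticut": (0.107, 7.0),
}

# Positive edge: the model thinks the team is more likely to win than the odds imply.
edges = {
    team: model_p - implied_probability(odds)
    for team, (model_p, odds) in title_odds.items()
}
value_bets = [team for team, edge in edges.items() if edge > 0]

for team, (model_p, odds) in title_odds.items():
    print(f"{team}: model {model_p:.1%}, implied {implied_probability(odds):.1%}, "
          f"edge {edges[team]:+.1%}")
print("Favorable bets:", value_bets)
```

Only Wisconsin’s model probability exceeds its break-even probability, which is the same conclusion as the $1.386 expected return shown in the list.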

Wisconsin is being undervalued both against Kentucky and in a potential championship game against Florida. As a result, they are highly undervalued as national champions. Given a one-time shot, Florida is the team expected to win it all, but if the Final Four were played 100 times, you’d make the most money betting “On Wisconsin”.

Virginia vs. Michigan State: Diving Deeper and a Prediction

Last time I discussed two improvements to Ken Pomeroy’s rankings. One adjustment was for missing players, and the other was a new approach to adjusting for on-court match-ups using a technique called cluster analysis. Applying these adjustments, I determined that Michigan State had a 51.9% chance of winning, with an average margin of victory predicted around 1 point. However, even this of course neglects two very important factors. The first factor is that after Virginia’s loss to Tennessee by 35 points, Joe Harris went to coach Tony Bennett’s house on New Year’s Eve to discuss the team’s issues. This seemingly has resulted in a new, transformed Virginia team. The Wahoos have only lost two games since: a close loss to Duke at Cameron Indoor Stadium, and an overtime loss at former ACC rival Maryland. The other 21 games have all been wins, including the ACC regular season and tournament championships, and quality wins over North Carolina, Pittsburgh (twice), Duke, Syracuse, and Memphis. Second, Michigan State played a significant portion of the year with injuries to key players. Branden Dawson, Keith Appling, and Adreian Payne have all missed time this year to one injury or another. Experts argue Michigan State is a better performing team with all three healthy and 100%.

I looked at a couple of metrics to rate Virginia since the course of their season changed with that New Year’s Eve meeting. First, the Cavaliers have posted a BPI of 93.0 since the Tennessee loss, compared to an 87.0 yearly average, an increase of 6.9%. Pomeroy’s metric (adjusted for injury in the full-year case) also shows Virginia has played better since the Tennessee game: their rating since then is 3.2% higher than their yearly rating.

Michigan State also sees its ratings boosted. Its ESPN BPI for games where both teams were fully healthy is 91.1 compared to 85.4 for the year. This represents an increase of 6.7% over their yearly average. Using Pomeroy’s metrics (adjusted for injury in the full-year case), Michigan State plays better by 3.9% over their yearly average when they and their opponent are fully healthy.

I let these values represent each team’s ceiling. However, these ideal scenarios aren’t the only way these teams played this year, and I will still use their full-year metric in some way. To do so, I weight the ideal scenario (since this game will be played under the ideal scenario for both teams) twice that of the full-year. Doing so gives Virginia a 63.7% chance of winning before the match-up adjustment. The match-up adjustment knocks this down to a 50.9% chance of Virginia winning.

Combining the way the teams have performed at their peak with their full-year performance gives Virginia a slight edge. So, given that this game is virtually a toss-up, what can give one team the edge over the other?

Analyzing the teams’ styles, we see that Michigan State is a highly efficient shooting team, with an effective field goal percentage of 54.7%, good enough for 13th in the nation. They do not really stand out in any other area of offense, although they are in the top 100 in protecting the ball, turning the ball over on only 17.2% of possessions. In MSU’s two fully healthy losses, they lost to Illinois with an eFG% of only 45.7% and a TO% of 28.6%, and to Ohio St. with a turnover ratio of 24.6%. Both Ohio State and Illinois play somewhat similar defensive styles to Virginia, with a slow pace (all are 298th or lower in defensive pace) and top 11 defensive efficiencies. However, pace is where the comparisons end. Virginia’s strengths are minimizing eFG% and dominating the defensive boards, while generating a turnover rate that is only mid-pack. Ohio State is also strong at limiting opponents’ shooting and strong at generating turnovers, while being relatively weak on the defensive boards. Illinois is not dominant in any area defensively, but above average to good in each category. To win this game, Virginia will have to continue to dominate the defensive glass, and either limit Michigan State to an eFG% under 50% or force a turnover ratio over 20%.

Virginia should have no trouble winning the battle on the glass, being a great defensive rebounding team (5th in the nation, limiting opponents to offensive rebounds on only 25.9% of missed shots), while Michigan State is a mediocre offensive rebounding team (104th in the nation, grabbing only 33.4% of their missed shots). Since the Wahoos are not a team that tries to create turnovers, this game looks like it will come down to Michigan State’s shooting against UVa’s shooting defense. Virginia must contest as many shots as possible, and even then hope the Spartans aren’t knocking down shots left and right. As was evident against Coastal Carolina, Virginia struggled when its opponents hit shots. Coastal started out 5 for 9 on three point attempts in the first half, despite two of those made attempts coming with a hand in the face.

When Virginia is on offense, there doesn’t appear to be a game changing match-up. Virginia’s offense is above average to good in all areas. Michigan State’s main defensive strength is defensive rebounding. The offensive glass is an area Virginia has all but given up on in NCAA tournament play. Virginia had NO offensive rebounds against Coastal Carolina, and only six against Memphis. Instead, during tourney-time, Tony Bennett seems to have instructed his team to get several players back in favor of a set defense over an offensive put-back opportunity.

Finally, in a game as much of a toss-up as this, it could come down to free throw shooting. Neither team is particularly great from the line. Michigan State attempts only 34% as many free throws as field goals, good for 300th in the nation, and shoots only 70.4% at the charity stripe (163rd in the nation). The Cavaliers, meanwhile, shoot a paltry 67.1% from the line (272nd in the nation) but at least get to the line more often, attempting 42.4% as many free throws as field goals. Whichever team is ahead late in the game might not necessarily win it. With a 50.9% chance of winning, Virginia has the higher chance to be ahead in the waning minutes. However, toward the end of the game when the trailing team MUST foul, Michigan State has the higher chance to hold on to a lead due to their higher FT%.

This is truly a toss-up of epic proportions. Being a Wahoo fan, I have no doubt Virginia will win the battle of the boards by a wide margin when MSU is on offense. I also don’t see Virginia’s defense creating many turnovers. However, I did see Virginia, in person, contest almost every shot in the first half against Coastal Carolina. This time contesting shots will work in their favor and Michigan State will knock down shots well below their yearly clip. A close game at halftime will be opened up in the second half, as Virginia slowly builds its way to an 8-point lead in the closing minute. An eight-point lead will be enough to ensure free throws are not an issue. MSU will not come all the way back in the final minute and Virginia will move on to the Elite Eight. Virginia 67, Michigan State 62.

Virginia vs. Michigan State: A Statistical Analysis

The upcoming game between Virginia and Michigan State is intriguing on several levels. First, this is a battle between two contrasting programs. Long established national power Michigan State, with legendary coach Tom Izzo, fields a team of sharpshooters who face off against a Virginia team that has slowly built its way back to national prominence through the grind-it-out style basketball coach Tony Bennett emphasizes. Second, this game is the only Sweet Sixteen matchup where the Vegas betting line favors the lower seed. In other words, Michigan State is considered the favorite. Last but not least, especially in regards to this piece, according to various statistical models Virginia is anywhere from a moderate favorite to the ever so slightest of underdogs. These intriguing points raise three questions: how do these teams match up? What current analytic model is most accurate? And how can we improve upon these existing models to come up with the best possible estimate of each team’s probability of advancing to the East Region final?

Starting with the “simplest” of models, Ken Pomeroy’s formula over at kenpom.com estimates that Virginia is a 63% favorite to advance to the Elite Eight. His model uses tempo-adjusted offensive and defensive efficiencies to assess team strength using Pythagorean Expectation. Essentially, he is calculating the expected winning percentage of one team against an average Division-I team. However, when two teams match up, it’s highly unlikely one of them is the average D-I team. He compares two teams by using what’s called a log5 formula to calculate the expected winning percentage for one team over the other. This is how his formula derives that Virginia should beat Michigan State 63% of the time. His model can also account for home-court advantage using an adjustment to the tempo-adjusted offensive and defensive efficiencies for the home team. However, since this game will be played in a neutral venue, we’ll skip past his adjustments for home-court advantage. Kenpom’s method also has an advantage over purely margin-of-victory-based models such as Georgia Tech’s Bayesian LRMC model or Raymond Cheong’s rankings which do not account for the tempo of play. In these models, a 30 point win is counted the same, regardless of whether this margin of victory was achieved in a 50 possession per team game or an 80 possession per team game. Clearly, achieving such a large margin of victory in a 50 possession per team game shows that the winning team was extremely efficient, reaching this margin with significantly fewer chances to do so when compared to an 80 possession per team game. Pomeroy’s model does predict margin of victory as described before using tempo-adjusted efficiencies.
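The two building blocks named above, Pythagorean Expectation and the log5 formula, are easy to sketch. The exponent of 10.25 is an assumption on my part (it is a value commonly cited for college basketball, but the exact exponent in Pomeroy’s formula is not stated here), and the efficiency numbers in the example are hypothetical rather than either team’s actual ratings.

```python
def pythagorean_expectation(off_eff, def_eff, exponent=10.25):
    """Expected winning percentage against an average team.

    off_eff / def_eff: points scored / allowed per 100 possessions.
    exponent: ASSUMED value; not taken from the article.
    """
    return off_eff**exponent / (off_eff**exponent + def_eff**exponent)


def log5(p_a, p_b):
    """Probability that team A beats team B, given each team's
    expected winning percentage against an average opponent."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)


# Hypothetical efficiencies, for illustration only.
team_a = pythagorean_expectation(off_eff=114.0, def_eff=91.0)
team_b = pythagorean_expectation(off_eff=119.0, def_eff=97.0)

print(f"Team A vs. average team: {team_a:.3f}")
print(f"Team B vs. average team: {team_b:.3f}")
print(f"P(Team A beats Team B):  {log5(team_a, team_b):.3f}")
```

Two sanity checks follow directly from the formula: against an exactly average opponent (winning percentage 0.5), log5 returns the team’s own winning percentage unchanged, and the two teams’ probabilities of beating each other always sum to one.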

However, as useful as this model is, Pomeroy himself admits and understands that this model cannot account for several factors. These include things like injuries or suspensions, on-court matchups, and other intangibles whose effects on a particular game are hard to quantify (examples include player experience, depth, coaching, officiating, etc.). We will focus on the two factors that can be adjusted for, injuries or suspensions, and matchups.

Nate Silver at fivethirtyeight.com and ESPN’s BPI are two models that do account for injuries or suspensions. Silver’s missing player adjustment uses a concept called win shares, which is roughly equivalent to measuring the impact a player’s absence has on the point differential per game. The example he gives is that Brandon Davies’ suspension in 2011 hurt BYU by about 1.7 points per game. ESPN’s missing player adjustment de-weights games where one or both teams are missing key players and makes the adjustments on a minutes-per-game basis. This is a good adjustment because it is tempo independent (whether a game has 80 possessions or 50 possessions per team, if a player missed half the game, he missed approximately half of the possessions). One additional item to point out is that Silver’s calculations for the 2013-2014 season now include ESPN BPI as one-seventh of his base power rankings, and the base power rankings are then adjusted for missing players. So there is a full missing player adjustment plus another one-seventh missing player adjustment already built in through BPI. To avoid this double counting, we choose to use ESPN’s BPI adjustment, which accounts for tempo and gives a specific weight to each game.

The second adjustment we explore here is for the type of matchup this game presents. There are no rating methods (to my knowledge) that account for the nuances of matchups. However, there is some recent work in this area as ESPN, in conjunction with Liz Bouzarth, John Harris and Kevin Hutson of Furman University, has led the way in matchup-based analysis. They apply their model simply to identify potential NCAA tournament upsets in what they call their “Giant Killers” model. They use a technique called cluster analysis to group similar teams together and identify which groups of significantly lower-seeded teams have the potential to upset much higher-seeded teams (the seed differential for their model to apply must be at least 5).

We build off their ideas and use cluster analysis to group all 351 NCAA Division-I teams into 8 distinct groups based on their style of play. We then analyze all the win/loss results from the 2013-2014 season and compare each group’s actual winning percentage against every other group with the expected winning percentage calculated from Pomeroy’s ratings, adjusted using BPI’s tempo-free missing player de-weighting. In simpler terms, if Group A was expected to beat Group B at a 57% clip (according to Pomeroy and adjusted for missing players), but Group A actually beat Group B 65% of the time, we might conclude, depending on the sample size, that this is a significant difference and that Group A outperforms expectations against Group B based on how they match up. We then apply this to the Virginia/Michigan State game to make one final adjustment to Pomeroy’s winning percentage, giving us two adjustments, one for missing players and one for the matchup.

The missing player adjustment is fairly straightforward. Using BPI’s weightings, we see Virginia’s results are deflated by missing players only slightly as the Wahoos and their opponents played mostly full strength all year. The Cavaliers’ adjusted offensive efficiency increases by 0.1% while their defensive efficiency decreases (they become less efficient defensively) by 0.3%. However, Michigan State was struck by injuries for half the year, so their offensive and defensive efficiencies are boosted by about 0.6% and 0.1%, respectively. Recalculating the projected winning percentage with these additions, Virginia drops from a 63.0% favorite over the Spartans down to a 60.2% favorite. In other words, even accounting for Michigan State’s injury troubles throughout the year, Virginia is still expected to produce overall better results against the rest of NCAA Division-I.

However, we have only adjusted for missing players up to this point. We still need to adjust for the on-court matchup. According to the results of my cluster analysis there are eight distinct styles (or “types”) of teams. Each type has similar features that distinguish it from other types of teams. This produces 64 possible matchup combinations (each of the eight types can face any of the eight types). UVa is a “Type 6” team. These teams are very efficient defensively, forcing an extremely low effective field goal percentage and a very high rate of turnovers. They tend to be good rebounding teams on both ends, but are only slightly above average in effective field goal percentage. Michigan State is a “Type 7” team. These teams are extremely efficient shooters, usually win the turnover battle, and are great on the defensive boards. Type 6 and Type 7 teams are very good and quite similar overall, with Type 6 teams better defensively and Type 7 teams better offensively, especially at shooting. Type 6 and Type 7 teams have the two highest average rankings according to Pomeroy’s rankings adjusted for missing players.

According to the missing player-adjusted winning expectations, Type 6 teams tend to be slightly higher rated than Type 7 as a whole. For matches played in the 2013-2014 season between one Type 6 team against one Type 7 team, Type 6 teams were predicted to win 64.6% of the time over Type 7 teams using Pomeroy’s ratings that have been adjusted for missing players. However, in reality the win rate was about 20% lower than the expected missing player-adjusted win rate, with a 51.7% actual win rate. Based on the sample size the p-value associated with this is <0.01, and we can indeed conclude that Type 7 teams pose a particularly tough matchup for Type 6 teams. In other words, Type 6 teams tend to dominate teams that are inferior, but struggle more than expected against other top-tier teams who are more efficient offensively and less efficient defensively. As a result, we adjust UVa’s expected win ratio of 60.2% down by the average 20% giving the ‘Hoos a 48.1% chance of winning the game against Michigan State.
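The matchup adjustment described above amounts to scaling the expected win probability by the ratio of the actual to the expected win rate for that pairing of team types. Here is a minimal sketch using the rounded figures from the text; the function name is my own, and because the inputs are rounded the result can differ from the quoted figure in the last decimal place.

```python
def matchup_adjusted(p_win, expected_rate, actual_rate):
    """Scale a win probability by how a team type has actually fared
    against another type, relative to expectation."""
    return p_win * (actual_rate / expected_rate)


expected_type6 = 0.646  # expected Type 6 win rate vs. Type 7 (from the article)
actual_type6 = 0.517    # actual Type 6 win rate vs. Type 7 (from the article)
uva_before = 0.602      # Virginia after the missing player adjustment

shortfall = 1 - actual_type6 / expected_type6
uva_after = matchup_adjusted(uva_before, expected_type6, actual_type6)

print(f"Type 6 shortfall vs. Type 7: {shortfall:.1%}")
print(f"Virginia after matchup adjustment: {uva_after:.1%}")
```

The shortfall works out to about 20% and Virginia’s adjusted chance to roughly 48%, in line with the numbers in the text.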

When compared to other analytic models, this model actually produces the least favorable chance for a Cavalier victory. Adjusted for tempo, this yields a result of Michigan State winning on average by just over 1 point. Currently, the Vegas consensus is Michigan State favored by 2 points with 67% of the money coming down on the Spartans. Thus, bettors seem to be underestimating the Cavaliers’ chances of victory.

The next entry discusses the intangibles for each team, the impact of which is hard to quantify statistically, and delves into what Virginia needs to do to turn the tables in their favor.