Monday, May 21, 2012

Predictive Analytics: March Madness Style

This post was originally published in April 2012. Re-posting here for archival purposes.

March Madness has the unfortunate reputation of being one of the most productivity-draining events of the year for many organizations. Instead of focusing on this phenomenon, which many others have already done so eloquently (here’s a great one), this post is about the underlying data and its capacity to predict a team’s success in the tournament, and about how analytics is as important for those of us filling out our brackets as it is for the success of our organizations.

Each year, as NCAA tournament time rolls around, millions of college hoops fans, myself included, try to predict which lower seeds will upset a higher seed, which school will make a Cinderella run and who will ultimately make the Final Four.

I bleed Orange and Blue. This means my beloved Florida Gators are a lock for the Final Four every year, and traditionally I’ve put them through to the championship game in my bracket.

However, as the landscape was dominated this year by the likes of Kentucky (to whom we Gator fans endured three painful losses), Syracuse and Missouri, which made an untimely exit compliments of Cinderella Norfolk State, I decided to take a more objective approach to filling out my brackets. I did a side-by-side comparison of how teams would perform in the tournament based on two key data items used by the NCAA men’s basketball selection committee.

The first was RPI (Rating Percentage Index), a computer ranking of a team based on the opponents it has played during the season. The RPI is calculated from a formula that assigns weightings to things such as winning percentage, strength of schedule, wins at home vs. road wins, and opponents’ winning percentage. Fortunately for college basketball fans, the RPI rankings carry much less significance (or do they?) in determining the overall national champion than that other infamous computer ranking system we all love to hate. Yes, I’m talking about you, BCS!
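For the curious, here is a minimal sketch of how an RPI-style number can be put together. The weights below (25% own winning percentage, 50% opponents’ winning percentage, 25% opponents’ opponents’ winning percentage, with home wins counted as 0.6 and road wins as 1.4) are the commonly published ones and are an assumption on my part, as are the function names. Treat it as an illustration, not the committee’s exact calculation.

```python
def weighted_winning_pct(home_wins, home_losses, road_wins, road_losses,
                         neutral_wins=0, neutral_losses=0):
    """Winning percentage with location weights.

    Assumed weights: a home win counts 0.6 and a road win 1.4;
    a home loss counts 1.4 and a road loss 0.6; neutral games count 1.0.
    """
    wins = 0.6 * home_wins + 1.4 * road_wins + 1.0 * neutral_wins
    losses = 1.4 * home_losses + 0.6 * road_losses + 1.0 * neutral_losses
    total = wins + losses
    return wins / total if total else 0.0


def rpi(wp, owp, oowp):
    """Combine the three components with the assumed 25/50/25 weights.

    wp   -- the team's own (weighted) winning percentage
    owp  -- its opponents' winning percentage (strength of schedule)
    oowp -- its opponents' opponents' winning percentage
    """
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp
```

Notice that half of the number comes from who you played rather than how you did, which is why strength of schedule matters so much in these rankings.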

The second was the team’s won/loss record over its last 10 games. Those we consider experts believe that how a team performed over its last 10 contests, conference tournaments included, is as good a predictor as any of whether a team deserves to be selected for the tournament field. The thought process here is that a team that won 7 of its last 10 has a much better chance of going deeper in the tournament than one that may have lost 5 of 6 down the stretch.

And so I began to fill out my brackets, putting my love for my alma mater aside and giving way to the numbers and higher rationale. As we always say, the numbers don’t lie. In my RPI bracket, the team with the higher RPI went through to the next round. In other words, the higher-seeded team advanced. The “last 10” bracket proved a bit more of a challenge, with many teams recording the same won/loss record over their last ten games. When that was the case, my tiebreaker was overall number of wins.
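The two pick rules are simple enough to write down. Here is a rough sketch, assuming a hypothetical Team record with made-up numbers for two placeholder teams. How to break a tie on overall wins is not something I spelled out, so the sketch simply keeps the first team listed.

```python
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    rpi: float         # higher is better
    last10_wins: int   # wins over the last 10 games
    total_wins: int    # overall wins, used as the tiebreaker


def pick_by_rpi(a, b):
    """RPI bracket: the team with the higher RPI advances."""
    return a if a.rpi >= b.rpi else b


def pick_by_last10(a, b):
    """'Last 10' bracket: better recent record advances.

    If the last-10 records match, overall number of wins breaks the tie.
    Assumption: if that is also level, the first team listed advances.
    """
    if a.last10_wins != b.last10_wins:
        return a if a.last10_wins > b.last10_wins else b
    return a if a.total_wins >= b.total_wins else b


# Placeholder teams with illustrative numbers, not real 2012 data.
team_a = Team("Team A", rpi=0.62, last10_wins=7, total_wins=24)
team_b = Team("Team B", rpi=0.58, last10_wins=7, total_wins=27)

print(pick_by_rpi(team_a, team_b).name)
print(pick_by_last10(team_a, team_b).name)
```

With these made-up numbers the two rules disagree: Team A has the higher RPI, but the teams are level over their last 10, so Team B’s extra overall wins send it through in the “last 10” bracket.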

Now before we get to which was a better predictor of who would eventually be crowned champion, let’s discuss how using your company’s own data can help identify those employees in your organization who may become champions and those who may make an early exit.

If I told you that, out of your most recent group of new hires, I could predict with a high level of accuracy which of those employees would no longer be with your organization after one year, what would you say? How about if I could predict which of your high performers might consider leaving their current position with your company? What would you do? I expect you would say, “Heath, show me how I can use my organization’s data to predict employee behavior.” I’d be happy to. By creating models that leverage key data points in our HCM systems, such as salary, length of service, location, job and performance data, we can successfully predict an individual’s behavior with greater than 80% accuracy. This is exactly what the selection committee is trying to do when building the tournament field: creating a model and using readily available data to determine a team’s success. In this case, though, we’re dealing with individuals and your organization instead of a team and a tournament.

Imagine the value this type of tool would provide to your organization. Think about reducing employee churn and saving the cost of replacing an employee (estimated in the thousands of dollars). And how about the ability to recognize the flight risk of a high performer, and the potential opportunities gained if you were ultimately able to retain that individual? The possibilities are endless. As an HR practitioner, demonstrating these capabilities to your C-level execs will definitely earn you a seat at the table.

So how’d my brackets do? Ignoring the inherent flaws in my model, RPI turned out to be a better predictor of success than won/loss record. A team’s RPI successfully predicted the game winner 62% of the time, including the overall national champion, the Kentucky Wildcats. Using “last 10” as a predictor was accurate in selecting the winner only 30% of the time. And my Florida Gators, who were a dismal 4-6 in their last 10? Well, they defied the odds, making it through to the Elite 8.
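Scoring a bracket this way is just correct picks divided by games played. A small sketch, assuming a hypothetical pick_accuracy helper and placeholder team names rather than my actual brackets:

```python
def pick_accuracy(picks, winners):
    """Share of games where the picked team actually won.

    picks and winners are equal-length lists, one entry per game.
    """
    if len(picks) != len(winners):
        raise ValueError("picks and winners must cover the same games")
    if not picks:
        return 0.0
    correct = sum(1 for pick, winner in zip(picks, winners) if pick == winner)
    return correct / len(picks)


# Placeholder games: two of the four picks match the actual winners.
example_picks = ["Team A", "Team B", "Team C", "Team D"]
example_winners = ["Team A", "Team X", "Team C", "Team Y"]
print(f"{pick_accuracy(example_picks, example_winners):.0%}")
```

Run the same function over each bracket’s picks and the two predictors can be compared game for game.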

Fortune tellers have their crystal ball; the selection committee has its RPI. As professionals, we have access to powerful business intelligence tools and data to help us predict the future. Harness that power to ensure the ongoing success of your organization. Like we always say, the numbers don’t lie.
