This is an advanced-beginner data science tutorial in which we build a simple forecast model in R that predicts NCAA football spreads and win probabilities for upcoming games.

This model correctly predicted the winner in 73% of games in the 2019 season.

The dataset featured in this tutorial is free and open source and contains more than 25,000 unique games dating back to the 2000 season. It is updated weekly with results from the most recent games and also includes upcoming games for the current season.

Follow me on Twitter for future forecasts

Check out my videos…

Let’s predict the 2020 presidential election!



Get the full code on my GitHub page

Follow me on Twitter for model updates


Building a basic prediction model for the 2020 general election between Trump and Biden is fairly straightforward. All we need is an estimate of each candidate’s state-by-state polling average and polling standard deviation, which together are enough to drive a basic Monte Carlo simulation.
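
To make the idea concrete, here is a minimal sketch of that kind of simulation in R. It is not the code from my GitHub page: the three states, their polling margins, standard deviations, and electoral votes are all made-up placeholders, and the sketch assumes each state's margin is normally distributed and independent of the others, which is a simplification.

```r
# Minimal Monte Carlo sketch with made-up inputs (NOT real polling data).
set.seed(42)  # make the random draws reproducible

# Hypothetical states: mean polling margin (candidate A minus candidate B,
# in percentage points), polling standard deviation, and electoral votes.
polls <- data.frame(
  state       = c("State A", "State B", "State C"),
  mean_margin = c(4.0, -1.5, 0.5),
  sd_margin   = c(3.0, 3.5, 4.0),
  votes       = c(10, 20, 15),
  stringsAsFactors = FALSE
)

n_sims <- 10000
votes_needed <- floor(sum(polls$votes) / 2) + 1  # majority of the toy total

# One row per simulation, one column per state: draw a margin for each state
# from a normal distribution centred on its polling average (an assumption).
sim_margins <- sapply(seq_len(nrow(polls)), function(i) {
  rnorm(n_sims, mean = polls$mean_margin[i], sd = polls$sd_margin[i])
})

# 1 where candidate A carries the state in that simulation, 0 otherwise.
wins <- 1 * (sim_margins > 0)

# Electoral votes won by candidate A in each simulated election.
a_votes <- as.vector(wins %*% polls$votes)

# Share of simulations in which candidate A reaches the majority threshold.
win_prob_a <- mean(a_votes >= votes_needed)
print(win_prob_a)
```

Swapping in real state-level polling averages, standard deviations, and electoral vote counts turns `win_prob_a` into a rough win probability. Treating states as independent ignores correlated polling errors, so a fuller model would likely want to account for that.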

This simulation is currently tracking very closely with Nate Silver’s 538 model. As of this writing, my model gives Biden a 70% chance of winning the election, in lockstep with the 538 forecast.
