How to Mine Twitter Data For Fantasy Football Gains
Fantasy Football, like all fantasy sports, is a game that requires a lot of research. This research often consists of watching and reading sports news, discussing with friends and colleagues, or just thinking about what you could do better than the coach who’s actually in control of the player you care about.
I’m going to show you how to tap into another source for your research: social media — specifically Twitter. While books like Moneyball have convinced many that the future of sports is quantification, we are still in a place where people are able to make decisions that are not based solely on statistics. In fact, many teams (not just in football, but in all sports) have recently hired a social media director whose job it is to determine how they can leverage these online interactions to their advantage.
What I will show you here is how you can mine the same data that these directors have access to and use it to gain an edge over your opponents. As with any analysis, though, there’s more than one way to approach this problem; I hope that this provides a good starting point for you so that you can find your own best path forward.
So you want to be a fantasy football champion? Do you want to be on top of the latest and greatest trends in football? The best way to do that is to mine data from Twitter. The information you need is out there. You just have to find it, organize it, and use it. This blog is going to focus on how we can mine Twitter data for fantasy football gains.
Twitter is one of the most popular social networking sites in the world. It has fewer monthly active users than Facebook, but, unlike Facebook, most of its content is public, and it offers a developer API for reaching that content (subject to access tiers and rate limits). That means that if you know what you are looking for, you can get access to a lot of high-quality information that could help your team win your league.
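To make the API idea concrete, here is a minimal sketch of how a search request could be put together in Python. It assumes the v2 "recent search" endpoint and bearer-token authentication; the player name and the token are placeholders, and the function name is my own. The request is only built, not sent, so check the current API documentation and your access tier before running it for real.

```python
import urllib.parse
import urllib.request

# Assumption: the v2 recent-search endpoint. Verify against the current API docs.
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_search_request(query, bearer_token, max_results=10):
    """Return an unsent urllib Request for a recent-tweet search."""
    params = urllib.parse.urlencode({"query": query, "max_results": max_results})
    return urllib.request.Request(
        f"{SEARCH_URL}?{params}",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )


# "Alex Example" is a made-up player; "YOUR_TOKEN" is a placeholder credential.
req = build_search_request('"Alex Example" -is:retweet lang:en', "YOUR_TOKEN")
print(req.full_url)
```

Passing the resulting object to urllib.request.urlopen would send the request; the response is JSON that you can parse with the json module.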
Of course, there’s a lot of garbage out there as well. You have to sift through all of that to get at the good stuff. In this blog I will hopefully give you some tips and tricks for doing that. I also plan on sharing lots of useful links and resources, as well as my own code which I hope will serve as a starting point for anyone interested in doing this kind of work.
A word about me…
I am not a professional
The decisions we make in fantasy football have a lot to do with the information we have. The more we know, the better our chances of winning.
So what if there was a way to use Twitter data to get an edge on your league-mates? There is. It’s called Social Media Mining.
The concept is simple: We filter through large amounts of Twitter data to extract useful information and feed it into your fantasy analysis workflow – be it projections, rankings or custom models.
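As a rough illustration of that filtering step, the sketch below keeps only tweets that mention a tracked player together with a fantasy-relevant keyword. The player names, the keyword list, and the sample tweets are all invented for the example, and plain substring matching is the simplest possible approach, not a recommendation.

```python
def filter_tweets(tweets, players, keywords):
    """Keep tweets that mention a tracked player and at least one keyword.

    Returns a list of (player, tweet_text) pairs. Matching is a
    case-insensitive substring check; keywords are expected in lowercase.
    """
    hits = []
    for text in tweets:
        lowered = text.lower()
        for player in players:
            if player.lower() in lowered and any(k in lowered for k in keywords):
                hits.append((player, text))
    return hits


# Invented sample data, for illustration only.
tweets = [
    "Alex Example took first-team reps at practice today",
    "Great weather in Green Bay this morning",
    "Sam Sample is listed as questionable with a hamstring injury",
    "Alex Example signed autographs for fans",
]
players = ["Alex Example", "Sam Sample"]
keywords = ["first-team", "injury", "questionable", "starter"]

for player, text in filter_tweets(tweets, players, keywords):
    print(f"{player}: {text}")
```

The filtered pairs are what you would pass along to your projections, rankings, or custom model.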
From Fantasy Football Analytics
We know that the best predictor of future fantasy football performance is past fantasy football performance. But unfortunately, we don’t get a free pass to ignore the offseason just because we won our league last year – the NFL landscape is constantly changing and evolving.
The NFL offseason is an important time for fantasy managers to evaluate new situations and changes in roles, as well as who’s going where. In this post, we’ll take a look at how trends in Twitter data can give us an edge in identifying which players are primed for increased playing time, and thus increased production, heading into next season.
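One simple way to put a number on a "trend" is the relative change in how often a player is mentioned from one week to the next. The sketch below assumes you have already reduced your tweets to (player, week) pairs; that input format, the function name, and the sample data are my own choices for illustration.

```python
from collections import Counter


def mention_trend(mentions, earlier_week, later_week):
    """Relative change in mention volume per player between two weeks.

    `mentions` is a list of (player, week) pairs, one per tweet.
    Players with no mentions in the earlier week are skipped, since
    the relative change is undefined for them.
    """
    counts = Counter(mentions)
    players = {player for player, _ in mentions}
    trend = {}
    for player in players:
        before = counts[(player, earlier_week)]
        after = counts[(player, later_week)]
        if before > 0:
            trend[player] = (after - before) / before
    return trend


# Invented sample: 4 then 6 mentions for one player, 5 then 5 for another.
mentions = (
    [("Alex Example", 1)] * 4 + [("Alex Example", 2)] * 6
    + [("Sam Sample", 1)] * 5 + [("Sam Sample", 2)] * 5
)
print(mention_trend(mentions, earlier_week=1, later_week=2))
```

A rising value only says that people are talking about a player more; whether the talk is about a bigger role or about an injury is a separate question.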
It goes without saying that news breaks on Twitter first – it’s the go-to outlet for all things sports. But when it comes to fantasy football, Twitter has even more utility for us: it’s the best place to read about player sentiment and expectations from fans, coaches, analysts and players themselves. If a player is tweeting about being excited for camp or working out with his team this summer, we can feel confident he’s getting closer to achieving his full potential and should be moving up draft boards. If a coach talks about wanting to use a running back more in an interview, we should take note; if analysts are talking about how great a young player
I’m a big fan of the computational analysis of social and professional networks, so I’ve become interested in applying techniques from that domain to fantasy football. The idea is to use data from Twitter to extract features that are useful for predicting outcomes in fantasy football (e.g., will this player have a good game this week?).
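To give a feel for what "extracting features" could mean in practice, here is a minimal sketch that turns a player's tweets into a few numbers using a tiny hand-made word list. The word lists, the feature names, and the sample tweets are assumptions made up for this example; a real model would need a much richer vocabulary or a trained sentiment classifier.

```python
import re

# Hypothetical, deliberately tiny lexicons for illustration.
POSITIVE = {"excited", "healthy", "starter", "breakout"}
NEGATIVE = {"injury", "questionable", "benched", "limited"}


def tweet_features(texts):
    """Turn a list of tweet texts about one player into a small feature dict."""
    pos = neg = 0
    for text in texts:
        # Unique lowercase words in this tweet.
        words = set(re.findall(r"[a-z']+", text.lower()))
        pos += len(words & POSITIVE)
        neg += len(words & NEGATIVE)
    total = pos + neg
    return {
        "n_tweets": len(texts),
        "positive_hits": pos,
        "negative_hits": neg,
        # Ranges from -1 (all negative) to +1 (all positive); 0 if no hits.
        "net_sentiment": (pos - neg) / total if total else 0.0,
    }


sample = [
    "Excited for camp, feeling healthy",
    "Limited in practice with an injury",
    "Named the starter for Sunday",
]
print(tweet_features(sample))
```

A dict like this, computed per player per week, is the kind of row you could join onto your existing stats before fitting a model.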
My first attempt at analyzing Twitter data has been an exercise in frustration. The NFL is not particularly friendly to data miners, with no free, open API available. Archives like Retrosheet (which covers baseball rather than football) contain lots of useful data, but it’s all historical, and they contain no information about the “social network” of each player (i.e., who he follows and who follows him).
I did, however, manage to scrape some data from NFL.com. This was not easy because I had to jump through a lot of hoops to get around their legal threats and anti-scraping measures. Eventually I ended up writing a script that goes through the list of players on each team’s roster page on NFL.com and extracts their Twitter handles (if they have one).
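This is not my original script, but the handle-extraction step can be sketched with nothing more than the standard library. The HTML below is invented and far simpler than a real roster page, and the assumption that handles appear as plain links to twitter.com is mine; the actual page structure will differ.

```python
import re
from html.parser import HTMLParser

# Twitter usernames are 1-15 characters: letters, digits, underscore.
HANDLE_RE = re.compile(r"^https?://(?:www\.)?twitter\.com/([A-Za-z0-9_]{1,15})/?$")


class HandleParser(HTMLParser):
    """Collect Twitter handles from <a href="https://twitter.com/..."> links."""

    def __init__(self):
        super().__init__()
        self.handles = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        match = HANDLE_RE.match(href)
        if match:
            self.handles.append("@" + match.group(1))


def extract_handles(html):
    """Return the handles found in an HTML string, in document order."""
    parser = HandleParser()
    parser.feed(html)
    return parser.handles


# Invented roster snippet for illustration.
roster_html = """
<ul>
  <li>Alex Example <a href="https://twitter.com/alex_example">Twitter</a></li>
  <li>Sam Sample <a href="/players/sam-sample">Profile</a></li>
  <li>Pat Placeholder <a href="http://www.twitter.com/PatPlaceholder/">Twitter</a></li>
</ul>
"""
print(extract_handles(roster_html))
```

Players without a Twitter link simply produce no entry, which matches the "if they have one" case.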
I spent a few hours last night trying to find interesting things on Twitter. I was mostly looking for things related to Fantasy Football and the NFL, but I also saw some great stuff about how Twitter can be used to track the spread of disease, how it can be used by politicians to keep track of what is said about them, etc.
I found a few interesting, albeit far-fetched, ideas, but nothing that could be turned into a real product. Then I came across this tweet:
I’m going to write a script that pulls all tweets with