Take Me Out to the Ballgame: the Growing Role of Data on the Fan Side of Baseball
The following is a guest post by Master of Information and Data Science instructor and PhD student Andy Brooks, whose research examines the role of information technology in strategic and high-stakes decision making. He will be co-teaching the Research Design and Application for Data and Analysis course with Professor Steve Weber, starting spring 2014. To find out more, follow Andy on Twitter at @andybroo.
If you’ve seen the movie Moneyball, or read the original book, you probably know a great deal about the Oakland A’s use of data to improve their on-field performance and to ultimately win twenty consecutive games in 2002, an American League record at the time. Since Billy Beane’s initial foray into the field of data science, many other teams have followed suit, crunching numbers to improve their team performance.
Proper analysis of sports data can clearly improve the performance of a team. What people don’t often talk about, however, is the effect data science can have on the fan side of the sport, whether by drawing spectators to the games, driving them away, or engaging them during their time at the stadium.
As “America’s Pastime,” baseball creates die-hard, lifelong fans, and teams are beginning to leverage the same tools they’re using to crunch the speed, strength, and abilities of their players to also engage with fans as never before.
For starters, teams are broadening their definition of what it means to be a fan. For the last century, baseball fandom was often demarcated by a single, readily measurable action: whether or not you went to a baseball game. It was difficult for teams to identify, and more importantly measure, other types of fans: those who pored over box scores, watched the games on television, or listened on the radio. Purchasing a ticket was one of the few measurable transactions. In this era, teams didn’t readily know who you were and were unable to easily track consecutive purchases or interactions. This meant that they didn’t have a deep, long relationship with any of their fans, at least in a data sense, unless those fans were season ticketholders. Even then, data was scarce: once a year, those fans received a phone call about renewing their subscription, or a brochure and invoice in the mail.
All of this began to change about a decade ago, when professional sports teams started to replace paper tickets with barcode-based electronic tickets. With this digitization, the amount of data that teams could gather about fans and their activities exploded overnight. Digitization made it easier to track attendance, repeat purchases, and ticket resale prices, as well as find out where fans were located via their payment information. Teams were better able to use data to understand their fan base, just as they could their players on the field.
The San Francisco Giants were pioneers in this sense when they opened AT&T Park (then Pacific Bell Park) in 2000. The team implemented these new electronic, barcode-based tickets to gather more intelligence about their game-attending fans. The team also created secondary sources to gather intelligence: the Double Play Ticket Window and Ticket Relay. These proprietary online marketplace and ticket exchange technologies allowed fans to easily buy and sell tickets with one another.
The primary user group for these technologies was season ticketholders, a group that clearly had a keen interest in engaging with the team. They provided the perfect sample set for the Giants to get a more detailed understanding of part of their fan base. By tracking the exchange of electronic tickets, the Giants could see not only who was selling and buying tickets and for what price, but also how often they did so, where they chose to sit, and who they ended up selling to. All of this was crucial information that the Giants could use to attract and retain their long-term fans.
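To make the idea concrete, here is a minimal sketch of how resale records might be rolled up per seller. It is purely illustrative: the `Resale` record, its field names, and the `summarize_sellers` function are assumptions made for this example, not a description of the Giants’ actual systems.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Resale:
    """One hypothetical ticket resale; all field names are assumed."""
    seller_id: str
    buyer_id: str
    section: str
    price: float


def summarize_sellers(resales):
    """Group resales by seller and report count, average price, and buyers."""
    by_seller = defaultdict(list)
    for resale in resales:
        by_seller[resale.seller_id].append(resale)

    return {
        seller: {
            "listings": len(records),
            "avg_price": mean(r.price for r in records),
            "buyers": sorted({r.buyer_id for r in records}),
        }
        for seller, records in by_seller.items()
    }
```

Even a simple summary like this answers several of the questions above at once: how often a ticketholder resells, at what price, and to whom.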
At the same time, these technologies offered tremendous value to season ticketholders, who had to buy a full season of tickets: 81 games, plus a handful of preseason games. Even the most loyal, passionate game-going fan can’t attend every game. Technologies like the Double Play Ticket Window and Ticket Relay made it easier to sell tickets to a game a season ticketholder couldn’t attend. Rather than turning to traditional outlets (selling on Craigslist or to questionably legal sidewalk scalpers, giving the tickets away, or “eating” them at a loss), the season ticketholder could easily recoup their investment in those tickets. This was a “win” for both the team and ticketholders, much like StubHub today.
Nowadays, electronic ticketing provides a great data source for direct marketing. When fans purchase a ticket, the system collects their address, credit card, name, e-mail, and more. Teams will take this a step further and turn to a data broker or marketing insights consultancy, which will use the collected information to build a more detailed profile of purchasers. The data broker or consultants will try to deduce things like:
- What is their level of education?
- Where do they live?
- Where do they work?
- What is their income?
- How do they get to the stadium?
- What are their other interests (particularly sports)?
- What is the likelihood that they will attend certain types of games, such as day, night, or against certain rival teams?
Why do these things matter to a baseball team? For starters, knowledge of a fan’s income may hint at the proper amount of product or experience upselling a team should pursue. Information about where the fan lives or works and how they get to the stadium might lead to better targeted ads on public transit or on local television. Knowing the likelihood that a fan is interested in certain types of games allows teams to further tailor e-mail messages to pique interest. Data can have more ephemeral uses as well — from attracting visitors to a website to ensuring international fans remain invested in the franchise.
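As a rough illustration of the e-mail tailoring described above, the sketch below picks which type of game to promote to a fan. It assumes that some upstream model has already estimated, for each fan, a likelihood of attending each game type; the `pick_campaign` function, the game-type labels, and the 0.5 threshold are all hypothetical.

```python
def pick_campaign(likelihoods, threshold=0.5):
    """Return the game type a fan is most likely to attend.

    `likelihoods` maps a game type (e.g. "day", "night", "rivalry") to an
    estimated attendance probability. Returns None when there are no
    estimates or when none reaches the (assumed) threshold.
    """
    if not likelihoods:
        return None
    game_type, score = max(likelihoods.items(), key=lambda item: item[1])
    return game_type if score >= threshold else None


# Hypothetical fan profiles keyed by an anonymized fan ID.
fans = {
    "fan-001": {"day": 0.2, "night": 0.7, "rivalry": 0.6},
    "fan-002": {"day": 0.1, "night": 0.3},
}

campaigns = {fan_id: pick_campaign(scores) for fan_id, scores in fans.items()}
```

Here the first fan would receive a night-game message, while the second falls below the threshold and would get no targeted message at all.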