A Simple Model for Running Marathons

Alejandro Rojas (@Venamax) is an entrepreneur and student in the datascience@berkeley program. He is a former top-tier management consultant with a proven track record of developing tools that drive strategic decision making. He is a data enthusiast who runs experiments even when running marathons. 

I love running models and running marathons. In fact, last year I joined the Data Science program at UC Berkeley and also finished my 10th New York City Marathon. Combining both experiences, I decided to put together a model for running marathons.

Any model should start with clear premises, and I laid out two: a simple model is better than a complex one, and the model has to be useful. To be useful, a model needs to take into account who will use it and how it will be used. I wanted runners to use mine to track their performance during races and to define training strategies that help them achieve their goals. How can runners use a model while running a marathon? By keeping it so simple that it needs only a basic calculation requiring little mental effort.

My approach breaks down the 26.2-mile race into three 8-mile segments and one 2.2-mile segment, while assuming a baseline pace of 10 minutes per mile. At that pace, the marathon time will total 262 minutes (26.2 miles multiplied by 10 minutes per mile), a time that, for example, is 22 minutes above my own four-hour target. Similar to card-counting techniques, this approach points to a single indicator that reflects a simple goal: how to save 22 minutes, or a runner’s own individual target count, over the 26.2 miles. Just as if it were a game, a simple calculation determines a “score” that shows at every mile how much progress has been made toward that goal: multiply the miles completed by 10 and subtract the minutes elapsed. Reaching mile 8 at 74 minutes instead of 80, for example, gives a score of 6 minutes saved.
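
The running score can be sketched in a few lines of Python. This is a minimal sketch, assuming the score is the time a 10-minute-per-mile runner would need to reach the current mile minus the actual elapsed time; the function and constant names are illustrative and not part of the original model.

```python
BASELINE_PACE = 10.0    # minutes per mile, the reference pace used in the text
MARATHON_MILES = 26.2


def minutes_saved(miles_done, elapsed_minutes):
    """Score so far: baseline time to this mile minus actual elapsed time."""
    return BASELINE_PACE * miles_done - elapsed_minutes


def target_savings(goal_minutes):
    """Minutes that must be saved over the whole race to hit the goal time."""
    return BASELINE_PACE * MARATHON_MILES - goal_minutes


if __name__ == "__main__":
    # A four-hour goal (240 minutes) against the 262-minute baseline.
    print(f"target to save: {target_savings(240):.0f} minutes")
    # Hypothetical split: mile 8 reached after 74 minutes.
    print(f"saved at mile 8: {minutes_saved(8, 74):.0f} minutes")
```

During a race the same calculation can be done mentally: ten times the mile marker, minus the minutes on the watch.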

Breaking up the 26.2-mile distance into three 8-mile segments and one 2.2-mile segment establishes a “format” that can be replicated and rehearsed before facing the challenge. Focusing all training on 8-mile runs lets runners manage, measure, and get comfortable with an experience that is repeated three times during the race. This approach also simplifies the creation and execution of strategies geared toward achieving a stated goal.

For the marathons I ran, I implemented a strategy I named GAPP, which stands for “glide,” “accelerate,” “push,” and “pray or protect.” The strategy is to “glide” over the first 8 miles of the marathon, seeking to save only six minutes. After the first 8-mile segment, I start to “accelerate” the pace with the objective of saving an additional 10 minutes. In the third 8-mile segment, I want to “push” so that 12 more minutes are saved. The last 2.2 miles of the run are all about keeping the total accumulated savings of 28 minutes. In other words, I “protect” those saved minutes or “pray” to have enough energy to get to the finish line at a theoretical time of 234 minutes, six minutes below the four-hour target.
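
The GAPP numbers can be checked with a short calculation. The following Python is a rough sketch, assuming each segment's savings are measured against the 10-minute-per-mile baseline and spread evenly across the segment; the names `GAPP_PLAN`, `segment_pace`, and `finish_time` are illustrative, not from the original strategy.

```python
BASELINE_PACE = 10.0  # minutes per mile, the reference pace used in the text

# (segment name, miles, minutes to save in that segment), as stated in the text
GAPP_PLAN = [
    ("glide", 8.0, 6.0),
    ("accelerate", 8.0, 10.0),
    ("push", 8.0, 12.0),
    ("pray or protect", 2.2, 0.0),
]


def segment_pace(miles, minutes_to_save):
    """Average pace (min/mile) needed to save the given minutes in a segment."""
    return (BASELINE_PACE * miles - minutes_to_save) / miles


def finish_time(plan):
    """Total race time in minutes if every segment hits its savings target."""
    return sum(BASELINE_PACE * miles - saved for _, miles, saved in plan)


if __name__ == "__main__":
    for name, miles, saved in GAPP_PLAN:
        print(f"{name}: {segment_pace(miles, saved):.2f} min/mile")
    print(f"finish: {finish_time(GAPP_PLAN):.0f} minutes")
```

Under these assumptions the plan asks for roughly 9.25, 8.75, and 8.5 minutes per mile over the three 8-mile segments, then the baseline 10 minutes per mile to the finish.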

The best thing about “running” this “experiment” is that runners can test their hypotheses and learn what works and what doesn’t. I encourage you all to run your own “experiments,” because simple models become more powerful as more data is collected. So go collect more data by running your own marathon.

