Why FiveThirtyEight’s National Burrito Bracket is Tragically Flawed

Nate Silver’s data-driven search for the country’s best burrito isn’t as neatly rolled as it seems

Let me get this out of the way first: I love burritos.

In fact, I doubt you could find someone less anti-burrito than I am. That’s not to say my diet consists solely of them, but when it comes to admiring burritos as a wonderfully delicious yet practical byproduct of the complex Mexican-American culinary tradition, I would rank myself pretty high up there.

That’s why, like many burrito devotees, I was excited to hear about Nate Silver and Anna Barry-Jester’s newest project, the Burrito Bracket, which aims to determine the nation’s best burrito using empirical analysis and statistics, with added help from national Mexican food experts. (I should probably add here that I am a huge fan of Silver’s work on FiveThirtyEight in general, in which he and his team break down the math involved in predicting everything from Obama’s 2012 reelection to the likelihood of each team winning this year’s World Cup.)

So, you may wonder, how does a hardened stat-head go about finding the best burrito in the country? Mainly, by using collected data from reviews of over 67,000 establishments on Yelp, all of which contain some mention of the term “burrito.” The star rating and total number of reviews of each location were then plugged into a complex formula to generate a figure dubbed “VORB,” or Value Over Replacement Burrito. Those of you familiar with sabermetrics should be salivating right about now (read about the process in greater depth here).
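For the curious, here is roughly what a stars-plus-volume metric like this could look like in practice. To be clear, this is my own back-of-the-napkin sketch, not FiveThirtyEight’s actual formula (theirs is laid out in the methodology post linked above); the 3.0-star “replacement level” and the logarithmic damping of review counts are assumptions I chose purely for illustration.

    import math

    REPLACEMENT_STARS = 3.0  # assumed baseline: the "replacement burrito"

    def vorb_like_score(avg_stars, review_count):
        """Rating above replacement, damped by review volume (illustrative only)."""
        if review_count == 0:
            return 0.0
        return (avg_stars - REPLACEMENT_STARS) * math.log1p(review_count)

    # A 4.5-star spot with 800 reviews outscores a 5.0-star spot with 12:
    print(round(vorb_like_score(4.5, 800), 1))  # 10.0
    print(round(vorb_like_score(5.0, 12), 1))   # 5.1

The point of any formula in this family is the same: a perfect rating earned from a dozen reviewers shouldn’t automatically beat a very good rating earned from hundreds.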

The top-ranking VORB restaurants in the country were then curated and ranked by a panel of experts that included chef/restaurateur David Chang, OC Weekly editor and author Gustavo Arellano, Eater restaurant critic Bill Addison, and Mexican food historian Jeffrey Pilcher. Sixteen winners from each of four regions (California, West, South, Northeast) were then seeded into a March Madness-style bracket of 64, in which each individual burrito battle will be judged by FiveThirtyEight burrito correspondent Anna Barry-Jester.

The entire process has yielded several fascinating blog posts so far, including terrific interviews with the judges debating the burrito merits of each region of the country. I enjoyed reading them all immensely and would highly recommend doing the same. But there is one inherent problem: Although Silver and Barry-Jester’s in-depth analysis seems to hold up on initial review, closer inspection reveals some major issues with the Burrito Bracket methodology.

The largest flaw, perhaps, lies within the Yelp data mining itself. That method treats any review that mentions the word “burrito” as a review of the burrito itself, rather than of the restaurant as a whole. We know from experience that this is often untrue. Yelp reviews can be long-winded affairs in which stars are doled out or withheld based on everything from “they forgot to give me napkins” to “this place is always open late after I get drunk.” And while a restaurant with a very high overall Yelp rating has a greater chance of having a terrific burrito than, say, one with a low overall rating, the one is not necessarily determinative of the other.
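To see how coarse that keyword filter is, consider two toy reviews that both “mention” a burrito. These examples are invented, and I have no idea how FiveThirtyEight’s actual scraping handled cases like this, but the general failure mode is easy to demonstrate:

    reviews = [
        ("The carne asada burrito here is life-changing.", 5),
        ("Skipped the burrito, got tacos instead. Service was painfully slow.", 2),
    ]

    # A naive mention-based filter keeps both, dragging the taco complaint's
    # two stars into the "burrito" data.
    burrito_reviews = [(text, stars) for text, stars in reviews
                       if "burrito" in text.lower()]
    print(len(burrito_reviews))  # 2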

If Yelper A, for example, enjoyed his or her burrito but decided that $1.25 was too high a price for extra guacamole, or that the staff was rude, and gave the restaurant two stars instead of four, that could incorrectly affect results in a burrito search.

If this factor seems insignificant, it’s also worth noting that in areas where Yelp is not heavily used (basically anywhere not near a large metropolitan area), the star rating often becomes skewed in one direction or the other, as some reviewers are known to exaggerate ratings for the sake of visibility, e.g., “This is the absolute WORST burrito in Flagstaff!” This leads to a larger point: Even though the formula Silver uses aims to balance the weight of star ratings against total reviews, it can’t overcome the fundamental fact that smaller sample sizes on Yelp are much more prone to error than larger ones. Yelp, as a company, has encountered this problem as well, and uses a complex formula to determine the effect of an individual review on an overall star rating.
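There are standard statistical remedies for the small-sample problem. One common one, sketched below, is to shrink a restaurant’s observed average toward a site-wide prior in proportion to how few reviews it has. This is the generic “Bayesian average” idea, not Yelp’s or FiveThirtyEight’s actual formula (neither is fully public), and the prior mean and weight here are numbers I made up for illustration.

    GLOBAL_MEAN = 3.7   # assumed site-wide average star rating
    PRIOR_WEIGHT = 25   # assumed weight: the prior counts as 25 pseudo-reviews

    def shrunk_rating(avg_stars, review_count):
        """Blend the observed average with the global mean, weighted by review count."""
        total = avg_stars * review_count + GLOBAL_MEAN * PRIOR_WEIGHT
        return total / (review_count + PRIOR_WEIGHT)

    # Five perfect reviews barely move the needle; 500 reviews mostly stand alone.
    print(round(shrunk_rating(5.0, 5), 2))    # 3.92
    print(round(shrunk_rating(4.6, 500), 2))  # 4.56

Under a scheme like this, a tiny-sample five-star taqueria can’t leapfrog a heavily reviewed institution on enthusiasm alone.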

This may explain why beloved restaurants in sparsely populated states like New Mexico and Arizona (where the burrito historically originated) yielded very underwhelming VORB scores, despite anecdotal evidence to the contrary. In fact, many of the seeded restaurants from the West region (west of the Mississippi, omitting California) only made the list by grace of the experts, in spite of their dismal VORB scores.

Overall, the final list of seeded burritos (at least in California, where I am familiar with most of the nominees) seems pretty solid. El Farolito and La Taqueria, both Mission-style joints in S.F., Lolita’s Taco Shop in San Diego, and Manuel’s El Tepeyac in Boyle Heights are all deserving candidates, each buttressed by a sheer mass of positive reviews. But it does make you wonder what under-the-radar gems were overlooked in less Yelp-friendly regions outside the West Coast. Indeed, it was in this area that the judges’ input was most needed.

The team of experts Silver recruited was enlisted as a mitigating factor in the face of inconclusive data sets (kind of like how frog DNA was used to fill in damaged dinosaur genes in Jurassic Park). In this situation, you probably couldn’t ask for a more esteemed ensemble of food writers, historians, and chefs. My question, then: Why not use their judgments (or the judgment of a larger group of critics) to determine seeding entirely, rather than shape the VORB data to align with the judges’ opinions? At the least, poll a more informed set of users, like those who populate Chowhound.com, where entire threads of arguments over superior burritos exist, rather than draw on an artificially quantitative burrito metric like Yelp ratings for seeding.

A final qualm lies with the 64-burrito bracket itself, which will determine a winner based on the sole judgment of editor and burrito correspondent Anna Barry-Jester. I have no problems whatsoever with Barry-Jester’s tasting ability, but it seems strange that a process that goes to such lengths to be populist-driven reverts to the opinion of a single person in its final stages.

In the end, this grand tournament of burritos will undoubtedly be fun to read, and the overall winner will almost certainly be deserving of its high praise. But will it truly, objectively yield the best burrito in the country, eliminating the need for future “Top 10” and “Best of” lists? Of course not.

Although Nate Silver’s cultural sabermetrics have worked wonders in the past, applying those skills to the food world might prove a much trickier feat, much like rolling a burrito so the filling doesn’t fall out of the tortilla. Until then, maybe just ask the guy in your office who really likes Mexican food.

Editor’s note: I should mention that my favorite burrito in Los Angeles, the cheese-oozing, chile relleno behemoth at East L.A.’s La Azteca Tortilleria, did in fact make the list as an unseeded addition. Go La Azteca!


Comments

  1. Richard Robin

    June 12, 2014 at 10:50 am

    Good going Garrett. You are becoming an excellent restaurant critic! This review was right-on!

  2. Is “Data-driven Science” an Oxymoron? | Science Political

    October 6, 2014 at 4:46 pm

    […] their “burrito competition,” which could be a fun idea but their bracket apparently neglected sparsely-populated states like New Mexico and Arizona, where the burrito historically […]