My favorite anecdote in Michael Lewis’s The Big Short is about one of the first people to make big bets against mortgage bonds, a Deutsche Bank trader who liked to brag that he employed China’s second-smartest mathematician. According to Lewis, the trader talked about the mathematician as if he were “a pet tied to his desk.” When anyone doubted the mathematician’s claims, the trader would say: “How can a guy lie who doesn’t even speak English?”
It’s hardly news that one of the world’s biggest brains dedicated his life to modeling the havoc created by a million subprime mortgage brokers. If “A Beautiful Mind” had been set in the 1980s, it would have ended at a hedge fund, not with a Nobel Prize. It’s sort of a tragedy. I’ve sometimes wondered if God calibrated the size of our brains and the amount of fuel in the sun to give us just enough time to figure out the universe and send a space-ark toward a new galaxy, but the only guys who could figure this out are working for Wall Street.
What’s interesting now is that they’re fanning out to other industries on Main Street and Sand Hill Road. The most coveted employees in Silicon Valley may no longer be software engineers, but mathematicians. And the reason is simple: we now record so much data about what people are doing within the vast virtual world of the web that our biggest challenge is just making sense of it.
My last trip down to the Valley was a field trip set up by one of our investors, Greylock Partners. I met a mathematician who once developed models for predicting the likely locations of nuclear weapons in Iraq. He’s now spending his time more profitably at a social networking site, working out when to send diffident users a “win-back” email. Later that same day, I met a Chief Marketing Officer at one of the world’s largest retailers’ websites. He had no interest in my questions about branding. The team he ran was a team of mathematicians.
And the world he described was fascinating. Imagine if, every time you walk into Anthropologie or Macy’s, a guy with a clipboard follows you around, noting the path you take through the racks, the clothes you pick up, the ones you try on, the ones you get in line with, and the ones you finally buy.
He measures how much time you spend on each floor, and he comes into the changing room with you to measure how long you spend at the mirror sizing up each shirt. The next time you stop in, the whole place is re-arranged so that you don’t have to walk as far, or see clothes you don’t like, and its decor has shifted in subtle ways, which somehow makes you want to stay longer.
This is what’s happening within every well-run website. Just ask Jeff Hammerbacher, who built the data storage system for Facebook. He visited Redfin last month to talk about Hadoop, an open-source framework for storing and analyzing vast data sets. Jeff observed that most data now is collected by machines monitoring other machines, not by a machine collecting the input of a Macy’s sales clerk, or storing a letter to your mother.
The difference in volume between machine- and human-produced data is as great as the difference in volume between Model Ts and their hand-built predecessors. With a single setting adjusted, a web server can increase the amount of data it sends to another machine by a thousandfold.
It dawned on me then that Jeff hadn’t come to talk about how Facebook stores all your messages and status updates; he came to talk about what turned out to be a far larger data set: how Facebook stores data about what you were doing before you posted that message, and what you did next.
The result? Facebook was capturing a terabyte of information about its users every day. A trillion bytes of data. In 2007. Back then, Facebook was a tenth of its current size.
This data creates a new competitive dynamic. First, it favors size: Lowe’s knows better than the corner hardware store what to stock because it has more data about what people want. CarMax can price used cars better than a mom-and-pop dealership because it has more data on what people will pay for, say, a 2008 Honda Odyssey. CarMax’s founder, Austin Ligon, called this information dominance.
More importantly for Redfin, this dynamic favors company-owned operations over franchises. Last Tuesday, I visited CNBC on the same day as the CEO of Coldwell Banker. We were both interviewed about the market. And the CB CEO should know far more about it than I do: he has decades of experience, not just a few years; he manages an organization with tens of thousands of real estate agents, not a few dozen. In short, he has forgotten more about real estate than I will ever know.
But outside of calling one agent after another, the CB CEO has no way of knowing what his agents are doing; most work as contractors, for franchises, recording their deals in spreadsheets and notepads. Redfin, on the other hand, has a system for scheduling home tours and writing offers, which means we also have a system for storing data about every tour and offer. Months before the numbers are recorded at county courthouses or by federal agencies, we know when bidding wars are back, or when tire-kickers have taken over the market. We can see the whole elephant, and we’re minutely sensitive to when he’s about to roll on top of us or stampede through the jungle.
(Earlier we said that Coldwell Banker’s CEO writes a column for Inman News. This was incorrect. We apologize for the error.)