Stephens-Davidowitz is right:
One more important point that becomes clear when we zoom in: the world is complicated. Actions we take today can have distant effects, most of them unintended. Ideas spread—sometimes slowly; other times exponentially, like viruses. People respond in unpredictable ways to incentives.
Yet we seem to like simple stories and seem to believe that our actions will have simple, easy-to-understand consequences. Data complicates or invalidates many of those stories, so we ought to seek it whenever we can. Stephens-Davidowitz does just this in Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are. An alternate subtitle could be “Why Most of Us Are Full of Shit.” You may suspect, intuitively, that most of us are full of shit, but it’s nice to see it confirmed. The miracle of aggregation gives us a lot of new tools for looking at human nature.
This book can be read as part of a series, as it’s congruent with Dan Ariely’s Predictably Irrational and especially Jon Birger’s Date-onomics, which doesn’t discuss data from porn, as Stephens-Davidowitz does, but it could have (albeit at the risk of making it longer and perhaps turning off some of its readers). We’re going to see a lot more books like Everybody Lies as the Internet allows us to aggregate huge amounts of data that tell us something about what we do—as opposed to what we say. What we say seems to be a very poor guide to understanding what we really think; while this has been obvious on some level for a long time, it’s useful to see the specific ways action and speech are mismatched.
Take one sensitive area:
Somewhat surprisingly, porn data is rarely utilized by sociologists, most of whom are comfortable relying on the traditional survey datasets they have built their careers on. But a moment’s reflection shows that the widespread use of porn—and the search and view data that comes with it—is the most important development in our ability to understand human sexuality in, well . . . Actually, it’s probably the most important data ever.
“Ever” might be an overstatement (what about Masters and Johnson’s live observations?), but calling it “very important” and, perhaps most importantly, “novel” is legitimate. While the observation is useful, it’s also worth remembering that what people want in fantasy may differ from what they want in reality. Many people like watching people get shot in movies without thinking we should shoot more people in real life.
Or, in the same domain, there is this, with the data from the General Social Survey:
when it comes to heterosexual sex, women say they have sex, on average, fifty-five times per year, using a condom 15 percent of the time. This adds up to about 1.1 billion condoms per year. But heterosexual men say they use 1.6 billion condoms every year. Those numbers, by definition, would have to be the same. So who is telling the truth, men or women?
Neither, it turns out. According to Nielsen, the global information and measurement company that tracks consumer behavior, fewer than 600 million condoms are sold every year. So everyone is lying; the only question is by how much.
A meta-lesson may be: be very wary of survey data.
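The arithmetic in the condom passage is easy to check. Here is a rough sketch that uses only the three figures quoted above; the variable names are mine, and treating condoms sold as a stand-in for condoms used is an assumption (some sold condoms go unused, and some used condoms are given away free). Because Nielsen's number is "fewer than 600 million," the ratios are lower bounds on the overstatement.

```python
# Figures as quoted from the book; all are condoms per year.
WOMEN_REPORTED = 1.1e9     # implied by women's survey answers
MEN_REPORTED = 1.6e9       # implied by men's survey answers
SOLD_UPPER_BOUND = 600e6   # Nielsen: "fewer than 600 million" sold

def overreport_factor(reported: float, sold: float) -> float:
    """How many times larger the survey-implied count is than sales."""
    return reported / sold

# Assumption: sales approximate actual use. Since the sales figure is an
# upper bound, each factor below is at least this large.
women_factor = overreport_factor(WOMEN_REPORTED, SOLD_UPPER_BOUND)
men_factor = overreport_factor(MEN_REPORTED, SOLD_UPPER_BOUND)

# The two survey totals should match by definition; this is how far apart they are.
men_women_gap = MEN_REPORTED - WOMEN_REPORTED

print(f"Women overstate by at least {women_factor:.2f}x")
print(f"Men overstate by at least {men_factor:.2f}x")
print(f"Men claim {men_women_gap / 1e6:.0f} million more condoms than women")
```

On these numbers, both groups' answers imply substantially more condoms than are sold, and men's answers are further off than women's.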
(If you recognize some of these ideas, you’ve probably read A Billion Wicked Thoughts or my essay on it.)
Other problems, this time outside the realm of sexuality, include estimation:
When relying on our gut, we can also be thrown off by the basic human fascination with the dramatic. We tend to overestimate the prevalence of anything that makes for a memorable story. For example, when asked in a survey, people consistently rank tornadoes as a more common cause of death than asthma. In fact, asthma causes about seventy times more death. Deaths by asthma don’t stand out—and don’t make the news.
Still, I wonder what would happen if researchers paid survey respondents for right answers. In surveys, people have little incentive to try to be right. In some other parts of life, they do.
Much of the data comes from Google, and we should remember something important: “Google can display a bias toward unseemly thoughts, thoughts people feel they can’t discuss with anyone else.” Which makes sense: before Google or the Internet more generally, many of those thoughts would never have left the mind in a way that in turn left a residue on the rest of the world. Now they do. Perhaps one lesson of Everybody Lies is that more of us should use DuckDuckGo, the search engine that famously doesn’t record its users’ search terms. I infer, from the prevalence of Google search and Gmail, that most people don’t give a damn about privacy, regardless of the numerous articles about privacy one sees in the media. People’s revealed preferences seem to indicate they want convenience and familiarity far more than privacy.
Then there is this, which may be most useful for people doing Internet marketing:
The lesson of A/B testing, to a large degree, is to be wary of general lessons. Clark Benson is the CEO of ranker.com, a news and entertainment site that relies heavily on A/B testing to choose headlines and site designs. “At the end of the day, you can’t assume anything,” Benson says. “Test literally everything.”
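For readers who haven't run one, here is a minimal sketch of how a headline A/B test might be scored. It is not Ranker's method, which the book doesn't describe; it's a standard two-proportion z-test, and the function name and the click counts are made up for illustration.

```python
import math

def two_proportion_z_test(clicks_a: int, views_a: int,
                          clicks_b: int, views_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in click rates B - A."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled click rate under the null hypothesis that both headlines perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: headline A got 120 clicks in 2,400 views,
# headline B got 156 clicks in 2,400 views.
z, p_value = two_proportion_z_test(120, 2400, 156, 2400)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference between the two headlines is unlikely to be noise. The approximation is only reasonable with a decent number of views and clicks per variant, which is one more reason to test rather than assume.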
By the way, the school(s) you attend also seem to matter little for any measurable life outcomes. The money spent on expensive private schools seems to be largely wasted, or, if not wasted, then it should at least be considered a consumption expense rather than an investment. The entire education industry has worked hard to convince you otherwise, but the papers Stephens-Davidowitz cites are convincing and congruent with similar research I’ve seen on the issue.
Stephens-Davidowitz ends by saying that data from the Amazon Kindle indicates that few people read to the end of books. This one is worth reading in full.