Summer reading: Everybody Lies

I heard about Everybody Lies from listening to the Freakonomics podcast where Seth Stephens-Davidowitz was interviewed about the book.  I absolutely love data.  I’m not necessarily good at evaluating it as I don’t have the same toolkit as many modern data scientists, but I do often turn to data to answer questions I have in my life.  This book didn’t disappoint in answering some really interesting questions–about racism, sexism, poverty, sex, and more.

The main point of the book, I’d say, is that our intuition about things is often wrong and that we have enough data at our fingertips, and the tools to dig into it (in the form of computing power), to answer some really big and important questions that might make life better for lots of people.

Stephens-Davidowitz is also a really good writer, so while the book is about datasets and regression analyses, it’s not at all dry.  And the insights the book reveals about human nature are also compelling.  Here are a few of my favorites:

  • While we talk all the time about implicit bias when it comes to race, search data reveals that racism is not as implicit as we think it is.  It’s really explicit. People just hide it well.  They’re not unaware that they’re racist, as implicit bias would have us believe.  They just don’t share their racism with others.  But they share it with Google.
  • Parents display a lot of bias against their daughters. They assume their daughters aren’t smart, that looks are more important, and that ugliness is a very undesirable characteristic to have in daughters but not necessarily sons.  (I found this nugget particularly interesting given my interest in girls’ education.)
  • The Internet is not as segregated as one might think.  Most of us regularly bump into people whose opinions are very different from our own.
  • People say they’re going to do one thing, like watch a documentary and not the chick flick, but they do something else entirely.  Which is why Netflix and Amazon and other Internet sellers pay more attention to what you actually do (watch the chick flick) and not what you signal you’ll do (by adding that documentary to your queue).
  • Sometimes data doesn’t give you the whole picture, so you need human intervention. Test scores, for example, don’t tell you everything you might need to know about how effective a teacher is in creating student success. Test scores, student surveys, and teacher observations (the last two qualitative data from humans) taken all together give you a really solid picture.
  • Also, the size of a horse’s left ventricle is a big indicator of whether that horse will win a lot of races.

And those are just a few of the cool things I learned.  But the other cool thing about the book is that it’s also a story of data itself: of how much we have (even us regular people), of what kinds of things scientists are investigating and discovering from all this data, and of the untapped potential that’s there.  I actually think I’ll be applying some of what I learned from this book pretty immediately.  And that’s cool.