If you are a Twitter user, congratulate yourself: your average word length is greater than Shakespeare’s. If you are a man or a woman, I know what age group you find most attractive. I also know those naughty and bizarre things you search for in Google.
Okay, I fess up. I don’t know YOU, but I know the collective representation of you thanks to Christian Rudder’s new book DATACLYSM. And, dang it…I know a bit more about myself, too.
Rudder got me interested in his book when he started talking about the dating site he co-founded, OkCupid. So much data! I loved peeking in with him as he explored what happened when photos were removed from the site and people went on blind dates (hint: ugly folks make great dates), and then later when the pictures were made larger (hint: beautiful people received tons more messages; homely folks even fewer than before). Everything was extrapolated, even down to the average number of keystrokes versus the average message length (can you figure out how 1,000 letters are typed just by pressing 10 keys?).
Rudder kept his Harvard math powers rolling through Twitter, Craigslist, Google, and, my favorite, reddit. WARNING: I spent way too many hours last night rolling through Google Trends. Be prepared. You will lose precious time fondling the data in this book.
Here’s a snapshot of page 202. Rudder created a map of reddit, called “The United States of Reddit.” The size of each subreddit represents its popularity. The darker the color, the higher the percentage of people who stay within that subreddit and don’t post to other subs.
One more thing: Rudder is hilarious and insightful. Describing what we learn from Google, he writes, “It’s the site acting not as Big Brother but as older Brother, giving you mental cigarettes.” Going back to Twitter versus Shakespeare (average word length 4.8 versus 3.99, respectively), Rudder writes, “Looking through the data, instead of a wasteland of cut stumps, we find a forest of bonsai.” As for the data on his OkCupid site, he writes, “People saying one thing and doing another is pretty much par for the course in social science.”
The one thing I can complain about is the length of the book: I want more! Of its 300 pages, I would say about 100 are devoted to end-of-book references and section-introduction pages. The good news is that Rudder has his blog with tons more data, and the book has its own site, too. The book’s site is supposed to have several tools on it, such as an algorithm to predict whether you are going to divorce or break up based on your Facebook profile. Those features were not available before the book’s publication.
Bottom line: tons of data and insight provided through creative charting and exquisite writing. I love it.
Thanks to Crown for sending this to me for review. Pure awesome!
You can find this book’s preview and other reviews here: Dataclysm: Who We Are (When We Think No One’s Looking)