Books Collected from Patrick Collison's Tweet

Posted by HQ on Mon 13 November 2017

Last Sunday when I opened up Twitter, I saw this tweet from @patrickc. Even before followers started to reply, I thought this could be a really nice thread to follow.

This also reminded me of something I heard a few days ago in this episode with David Darmanin on IndieHackers, where David mentioned that he asked people "What's your favourite 10 books?" and took the 10 most commonly mentioned books to start reading.

I'm definitely interested in making a list of the books mentioned in this thread, but I'm a little reluctant to go through more than 600 tweets and copy them into a spreadsheet. Being a Data Scientist myself, the first idea that came to me was: can I automate this, or at least make it a fun exercise?

In this blog post, I'll only cover the top 10 mentioned books, as that's as far as I've got in my analysis. Other things I want to do are to look at the authors, publication dates and genres of the books. I'll probably need to use Goodreads or some other APIs to do that.

Top 10 (Actually 12) Mentioned Books

Let's jump to the result first. If you are interested in how I got this list, I cover it briefly at the end of this blog post. Personally, I found many interesting books I hadn't read before, and this was the biggest reason I started the exercise.

I've put all books mentioned at least twice in a spreadsheet. Feel free to check it out over there.

Title Review Mentions
Bible. "We are pressed on every side by troubles, but we are not crushed. We are perplexed, but not driven to despair." 2 Corinthians 4:8. (Affiliate Link) 27
Sapiens by Yuval Noah Harari, which is a summer reading pick by Barack Obama, Bill Gates and Mark Zuckerberg. At the time of writing the blog, it's the fourth most read book on Amazon Charts. (Affiliate Link) 25
Thinking, Fast and Slow. Daniel Kahneman, the renowned psychologist and winner of the Nobel Prize in Economics, takes us on a groundbreaking tour of the mind and explains the two systems that drive the way we think. (Affiliate Link) 25
1984 was George Orwell’s chilling prophecy about the future. And while 1984 has come and gone, his dystopian vision of a government that will do anything to control the narrative is timelier than ever. (Affiliate Link) 20
Antifragile is a standalone book in Nassim Nicholas Taleb’s landmark Incerto series, an investigation of opacity, luck, uncertainty, probability, human error, risk, and decision-making in a world we don’t understand. (Affiliate Link) 20
The Black Swan. A black swan is an event, positive or negative, that is deemed improbable yet causes massive consequences. In this groundbreaking and prophetic book, Taleb shows in a playful way that Black Swan events explain almost everything about our world, and yet we—especially the experts—are blind to them. (Affiliate Link) 18
Zero To One. The great secret of our time is that there are still uncharted frontiers to explore and new inventions to create. In Zero to One, legendary entrepreneur and investor Peter Thiel shows how we can find singular ways to create those new things. (Affiliate Link) 16
Meditations. One of the world's most famous and influential books, Meditations, by the Roman emperor Marcus Aurelius (A.D. 121–180), incorporates the stoic precepts he used to cope with his life as a warrior and administrator of an empire. (Affiliate Link) 13
Guns, Germs, and Steel: The Fates of Human Societies. In this "artful, informative, and delightful" (William H. McNeill, New York Review of Books) book, Jared Diamond convincingly argues that geographical and environmental factors shaped the modern world. (Affiliate Link) 12
Brave New World. Now more than ever: Aldous Huxley's enduring "masterpiece ... one of the most prophetic dystopian works of the 20th century" (Wall Street Journal) must be read and understood by anyone concerned with preserving the human spirit in the face of our "brave new world". (Affiliate Link) 11
Zen And The Art Of Motorcycle Maintenance. Acclaimed as one of the most exciting books in the history of American letters, this modern epic became an instant bestseller upon publication in 1974, transforming a generation and continuing to inspire millions. (Affiliate Link) 11
The Selfish Gene. Professor Dawkins articulates a gene's eye view of evolution - a view giving centre stage to these persistent units of information, and in which organisms can be seen as vehicles for their replication. (Affiliate Link) 11

What did I do?

The first thing to do is to get access to the Twitter API. I chose the python-twitter library and obtained access tokens from Twitter Developer.

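    import twitter  # provided by the python-twitter package

    # CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN and ACCESS_TOKEN_SECRET
    # are the credentials obtained from Twitter Developer.
    api = twitter.Api(
        consumer_key=CONSUMER_KEY,
        consumer_secret=CONSUMER_SECRET,
        access_token_key=ACCESS_TOKEN,
        access_token_secret=ACCESS_TOKEN_SECRET,
        sleep_on_rate_limit=True)  # wait instead of failing when rate-limited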

Once you have this, it's pretty easy to obtain a single status or tweet. For example, the post by @patrickc has status_id = '929862403763798016'. Getting the status via the API is as simple as

    api.GetStatus('929862403763798016') 

and it returns something like this

Status(ID=929862403763798016, ScreenName=patrickc, Created=Mon Nov 13 00:04:07 +0000 2017, Text='So, Sunday evening Twitter: which five books have influenced you the most? (In terms of shaping your worldview.)')

However, it took me some time to get all the replies to this post, mainly because the Twitter API itself doesn't support this directly. I googled a little bit and ended up with this gist, which searches all tweets sent to a given user between a since_id and a max_id.

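    import urllib.parse

    # Search for tweets sent to `user` with IDs between since_id and max_id.
    query = urllib.parse.urlencode({
        "q": "to:@%s" % user,
        "count": 100,
        "since_id": since_id,
        "max_id": max_id,
        "tweet_mode": "extended"  # ask for the full text, not a truncated one
    })

    replies = api.GetSearch(raw_query=query, result_type='recent')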

The code snippet is slightly different from the gist, because I had to make some changes to get it to work. The most important piece is "tweet_mode": "extended", which guarantees that you receive the full tweet rather than a truncated version with a link to the original tweet. After this, I did lots of re.sub, re.split and strip calls to process the text until I had something cleaner to work with.
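Since one search call returns at most 100 tweets, the call above has to be repeated while moving max_id backwards, and the results still include tweets sent to @patrickc that are not replies to this particular post. Here is a rough sketch of how that loop could look. It is not the exact code I ran: make_search_page and collect_replies are names made up for this illustration, and it only keeps direct replies (replies to replies carry a different in_reply_to_status_id).

```python
import urllib.parse


def make_search_page(api, user, since_id):
    """Wrap the GetSearch call from this post as a one-page fetcher.

    Hypothetical helper: `api` is the twitter.Api object created earlier.
    """
    def search_page(max_id):
        params = {"q": "to:@%s" % user, "count": 100,
                  "since_id": since_id, "tweet_mode": "extended"}
        if max_id is not None:
            params["max_id"] = max_id
        return api.GetSearch(raw_query=urllib.parse.urlencode(params))
    return search_page


def collect_replies(search_page, status_id, max_pages=50):
    """Page backwards through search results and keep direct replies.

    `search_page(max_id)` returns a list of tweets no newer than max_id;
    None means "start from the newest". max_pages is an arbitrary safety cap.
    """
    replies = []
    max_id = None
    for _ in range(max_pages):
        page = search_page(max_id)
        if not page:
            break
        for tweet in page:
            # Keep only tweets that answer the original status directly.
            if tweet.in_reply_to_status_id == status_id:
                replies.append(tweet)
        # Next page: everything strictly older than the oldest tweet seen.
        max_id = min(tweet.id for tweet in page) - 1
    return replies
```

With the real API this would be called as collect_replies(make_search_page(api, 'patrickc', 929862403763798016), 929862403763798016). Passing the fetcher in as a function also makes it easy to try the loop on fake data without hitting the rate limit.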

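To give an idea of what the cleaning and counting step can look like, here is a minimal sketch built on re.sub, re.split and strip. The regular expressions are my guesses for illustration, not the exact ones I used, and extract_titles and count_mentions are hypothetical names. It assumes each book sits on its own line or is separated by semicolons, and it keeps only books mentioned at least twice, like the spreadsheet.

```python
import re
from collections import Counter

MENTION = re.compile(r"@\w+")          # @usernames at the start of replies
URL = re.compile(r"https?://\S+")      # t.co links and other URLs
# Leading list markers such as "1.", "2)", "-", "*" or a bullet.
LIST_MARKER = re.compile(r"^\s*(?:\d{1,2}[.)]|[-*•])\s*")


def extract_titles(text):
    """Turn the full text of one reply into a list of candidate titles."""
    text = MENTION.sub("", text)
    text = URL.sub("", text)
    titles = []
    for part in re.split(r"[\n;]+", text):
        part = LIST_MARKER.sub("", part)
        part = part.strip(" .\t\"'")
        if part:
            # Lower-case so "Sapiens" and "sapiens" count as the same book.
            titles.append(part.lower())
    return titles


def count_mentions(texts, min_mentions=2):
    """Count in how many replies each title appears, most common first."""
    counts = Counter()
    for text in texts:
        # set(): a title repeated within one reply is counted once.
        counts.update(set(extract_titles(text)))
    return [(title, n) for title, n in counts.most_common()
            if n >= min_mentions]
```

For example, count_mentions([t.full_text for t in replies]) would give a list of (title, mentions) pairs. A sketch like this will not merge spelling variants such as "Thinking Fast and Slow" and "Thinking, Fast and Slow", so some manual tidying is still needed afterwards.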
