Aspiring Data Scientists! Start to learn Statistics with these 6 books!

Tomi Mester
5 min read · Jan 8, 2018

The original version of this article is found here: https://data36.com/aspiring-data-scientists-start-to-learn-statistics-with-these-6-books/

Statistics is difficult. Of course it is, as it’s most of the actual science part in data science. But that doesn’t mean you can’t learn it by yourself if you are smart and determined enough.

In this article, I am going to list 6 books that I recommend starting with if you want to learn statistics. The first three are lighter reads. These books are really good for setting your mind to think more numerically, mathematically and statistically. They also do a good job of presenting why statistics is exciting (it is!).

The second three books are more scientific, with formulas and Python or R code. Don’t get intimidated though! Mathematics is like LEGO: if you build the small pieces up right, you won’t have trouble with the more complex parts either!

Let’s see the list!

1. You Are Not So Smart — by David McRaney

When I first saw the title, I loved it already! This is a very well written book, containing many stories — and everything in it is based on real experiments and real scientific research.

David McRaney introduces one sad but true fact of life: that our brain constantly tricks us and we are not even smart enough to realize it. For an aspiring data scientist, this book is essential, because it lists many common statistical bias types. It points out classic mistakes like the self-serving bias, the availability heuristic, and the confirmation bias. It also shows why people tend to be tricked by fake news or scams and why people don’t always help when seeing someone having a heart attack on a busy street. Being aware of these biases should be basic, but I see even practicing data professionals fall for them from time to time…

(I wrote a detailed article about Statistical Bias Types. Find it here.)

You can buy the book: here (affiliate link).

2. Think Like a Freak — by Dubner & Levitt

The previous book was about why we are not so smart. But this one is about how to be smarter! Think Like a Freak shows us how critical and unconventional thinking can lead to huge success… and, hey, that’s something that as a data scientist, you should practice every day.

The book lists a bunch of case studies from everyday life, goes into details and analyzes why a solution for a problem is good or bad. Reading it will definitely boost your analytical thinking.

You can buy the book: here (affiliate link).

3. Innumeracy — by John Allen Paulos

If you hated mathematics in middle or high school, it was for one reason: you had a bad teacher. A good teacher turns mathematical equations into mystical puzzles, probability theory into detective stories, and linear algebra into the ultimate solution for all the big questions in life. Luckily, I had really good math teachers, so I was always generally excited by mathematics and statistics. Looking back, this really affected my life.

If you didn’t have a good math teacher, John Allen Paulos is here to make up for that loss: he’s the awesome teacher you wish you’d had. Innumeracy focuses mostly on one specific segment of statistics: probability theory and calculations. It explains the math behind it, shows the formulas and puts everything into a very logical context. And it does so by showing the real-life applications of these calculations, so you can immediately understand the advantage of being more math-minded.

You can buy the book: here (affiliate link).

4. Naked Statistics — by Charles Wheelan

I have already highlighted this book in my previous article, but I can’t resist adding it to this list, too. It’s the perfect transition between the previous light-read statistics books and the next two more scientific ones. Reading it, you can easily understand basic concepts like mean, median, mode, standard deviation, variance, and standard error, or the more advanced things like the central limit theorem, normal distribution, correlation analysis or regression analysis.

It almost goes without saying that all of these are packed into metaphors for ease of understanding.
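If you would like to see what the basic concepts above look like in practice, here is a minimal sketch using only Python's standard library. It is not taken from the book, and the page_views numbers are made up purely for illustration.

```python
import math
import statistics

# Hypothetical sample: daily page views for a small blog (made-up numbers).
page_views = [12, 15, 15, 18, 20, 22, 24]

mean = statistics.mean(page_views)          # arithmetic average
median = statistics.median(page_views)      # middle value of the sorted data
mode = statistics.mode(page_views)          # most frequent value
variance = statistics.variance(page_views)  # sample variance (n - 1 in the denominator)
std_dev = statistics.stdev(page_views)      # sample standard deviation
# Standard error of the mean: sample standard deviation divided by sqrt(n)
std_error = std_dev / math.sqrt(len(page_views))

print(f"mean={mean}, median={median}, mode={mode}")
print(f"variance={variance:.2f}, std_dev={std_dev:.2f}, std_error={std_error:.2f}")
```

Note that statistics.variance and statistics.stdev treat the data as a sample. If your data is the whole population, statistics.pvariance and statistics.pstdev are the matching functions.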

You can buy the book: here (affiliate link).

5. Practical Statistics for Data Scientists — by Andrew & Peter Bruce

This is a relatively new book and it contains everything that a Junior Data Scientist has to know about the practical part of statistics. In my opinion, the biggest advantage of the book is the structure. It really makes it clear how things are built on top of each other. But it also goes into detail on the most common prediction and classification models — and it talks a bit about Machine Learning and Unsupervised Learning too.

The book comes with R code examples, but if you don’t know R, that’s not a problem; you can simply skip those parts.

You can buy the book: here (affiliate link).

6. Think Stats — by Allen B. Downey

Topic-wise, Think Stats is really similar to Practical Statistics for Data Scientists. I wanted to have it on the list, though, because even if the topic is the same, different writers usually approach things differently. On a topic as complex as data science, I think it’s worth looking at different angles and having things explained by two different data professionals.

Plus, this is a book from 2011. It’s good to see how much the interpretation of (even these standard) things has changed in as little as six years.

Oh, and I almost forgot to mention that Think Stats is available for free in PDF format, here: http://greenteapress.com/thinkstats/

Or you can buy the book: here (affiliate link).

And that’s it!

By reading these 6 books you can get a solid understanding of Statistics for Data Science! What’s the next step in becoming a data scientist? Well, first of all:

I’ve created a comprehensive (free) online video course to help you get started with Data Science. Click here for more info: How to Become a Data Scientist.

REGISTER HERE (FOR FREE): https://data36.com/how-to-become-a-data-scientist/

You can read even more books: here are my 7 favorite data books. Or you can start to learn coding in SQL or in Python.

If you want to learn even faster, check out my new 6-week online data science course: The Junior Data Scientist’s First Month

If you think this list is missing something, let me know in the comment section below!

Thanks for reading!

Enjoyed the article? Please just let me know by clicking the 👏 below. It also helps other people see the story!

Tomi Mester
my blog: data36.com
my Twitter: @data36_com
