Aspiring Data Scientists! Learn the basics with these 7 books!
In the last few years I’ve spent a significant amount of time reading books about Data Science. I found these 7 books to be the best. Together they are a very valuable resource for learning the basics, and they walk you through everything you need to know.
Though they are very enjoyable, none of these is light reading. So if you decide to go with them, set aside some time and energy. It is worth it! If you combine this knowledge with the right online data science courses, that’s already a good enough foundation for an entry-level Data Scientist position. (In my opinion, at least.)
Note: you can see I listed four O’Reilly books here. If that looks suspicious: I’m not affiliated with them in any way. ;-) I just find their books really useful.
I suggest this specific order:
1. Lean Analytics — by Croll & Yoskovitz
The first book to read is about the basic business mindset for using data. It says it’s for startups, but I feel like it’s much more than that. You will learn why it is so important to select the One Metric That Matters, as well as the 6 basic online business types and the data strategy behind each of them.
You can buy the book: here (affiliate link).
2. Business value in the ocean of data — by Fajszi, Cser & Fehér
If Lean Analytics is about business + data for startups, this book is business + data for big companies. It sounds less fancy than the first one, but there is always a chance to pick up some useful knowledge from the big guys, including how insurance companies use predictive analytics and what data issues banks are facing.
3. Naked Statistics — Charles Wheelan
I constantly promote this book on my channels. It’s not just for Data Scientists. It covers the very basics of statistical thinking, which I think everyone should be familiar with. The book is full of stories, and you will learn how not to be fooled by headlines like “How we pushed 1300% on our conversion rate by changing only one word” and other BS.
You can buy the book: here (affiliate link).
4. Doing Data Science — Schutt and O’Neil
This is the last book before things get really tech-focused. It takes what you learned from the first 3 books to the next level, and goes deeper into topics like regression models, spam filtering, recommendation engines and even big data.
You can buy the book: here (affiliate link).
5. Data Science at the Command Line — Janssens
The other thing I constantly promote is learning (at least) basic coding. With that you can be much more flexible when retrieving, cleaning, transforming and analyzing your data. It simply expands what you can do in Data Science.
And when you start, I suggest starting with the Command Line. This is the only book I’ve seen about Data Science + Command Line, but one is enough as it pretty much covers everything.
You can buy the book: here (affiliate link).
6. Python for Data Analysis — McKinney
The second data language to learn is Python. It’s not too difficult and it’s very widely used. You can do almost everything in Python when it comes to analysis, prediction and even machine learning.
This is a heavy book (literally: it’s more than 400 pages), but it covers pretty much everything you need to know about Python for data analysis.
You can buy the book: here (affiliate link).
7. I heart logs — Jay Kreps
The last book on the list is only 60 pages and very technical. It gives you a good look at the technical background of data collection and processing. As an analyst or data scientist you probably won’t use this kind of knowledge directly, but at least you will be aware of what the data infrastructure specialists at your company do.
You can buy the book: here (affiliate link).
And that’s it!
As I mentioned before, if you go through all of these — combined with the right online data science courses — you will have a solid knowledge of Data Science!
UPDATE: I’ve created a (free) online video course to help you get started with Data Science. Click here for more info: How to Become a Data Scientist.
If you want to try out what it is like to be a junior data scientist at a true-to-life startup, check out my new 6-week online data science course: The Junior Data Scientist’s First Month!
Learn more about the data analytics basics — and don’t miss my new data coding tutorial series: SQL for Data Analysis and Python for Data Science!
I wrote a new article about my favorite Statistics Books:
Aspiring Data Scientists! Start to learn Statistics with these 6 books!
Thanks for reading!
Enjoyed the article? Please just let me know by clicking the 💚 below. It also helps other people see the story!
Tomi Mester
my blog: data36.com
my Twitter: @data36_com