Free, Open Source Books Recommended by the Author

Here are some great free/open source books that have helped me along the way.
(I have no affiliation or partnership with any of the books listed.)

1.) Regression Analysis with Python (Level: Beginner – Intermediate)

This is a great book for getting a full introduction to regression and understanding the fundamentals. It takes you through the entire model training process and covers feature engineering and selection, model fitting, and model evaluation.

The regression coverage is especially good. The book begins with simple single-feature linear regression models and works its way up to multiclass logistic regression and more complex methods like gradient boosting.

https://github.com/priscilj/Hadoop-related-books/blob/master/Regression%20Analysis%20with%20Python%20-%20Luca%20Massaron%20Feb%202016%20PACKT.pdf
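To give a taste of where a book like this starts, here is a minimal sketch of single-feature linear regression, fitted with the closed-form least-squares solution and scored with R². It is not taken from the book. The function names and the toy data are made up for illustration, and it uses only the Python standard library.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Centered sums of squares and cross-products.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data that lies exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_linear_regression(xs, ys)
score = r_squared(xs, ys, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={score:.2f}")
```

In practice you would fit the model on one part of the data and evaluate it on a held-out part, which is the kind of workflow the book walks through.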

2.) Python Data Science Handbook (Level: Beginner – Intermediate)

This book covers a wide range of topics, from basic data manipulation to data visualization and machine learning. As you read through each chapter, you’ll find detailed explanations and Python code examples that make it easy to learn and apply the concepts.

Bonus: try the ‘Open in Colab’ option to get hands-on and run the code directly in your browser.

3.) The Elements of Statistical Learning (Level: Intermediate – Advanced)

This book is a classic. Originally intended for Stanford graduate students and written by three Stanford statistics professors (Trevor Hastie, Robert Tibshirani, and Jerome Friedman), it has been made freely available online by the authors and has received updates over the years to keep it current.

The book provides a rigorous introduction to various statistical learning techniques, including supervised and unsupervised learning, classification, regression, clustering, and more.

The first part of the book covers linear regression and its variants, eventually diving into ridge regression and lasso regression. The second part covers classification methods, such as logistic regression, support vector machines, and decision trees. The third part covers unsupervised learning, including principal component analysis and various clustering methods. The final part covers advanced topics, such as neural networks and deep learning.

https://hastie.su.domains/ElemStatLearn/
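For a feel of the ridge and lasso ideas mentioned above, here is a minimal sketch for the single-feature case, where both penalized slopes have a simple closed form. It is not code from the book. The function names, penalty values, and toy data are assumptions for illustration, and the exact scaling of the penalty differs between textbooks and libraries.

```python
import math


def _centered_sums(xs, ys):
    """Return (sxx, sxy): centered sum of squares and cross-products."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxx, sxy


def ridge_slope(xs, ys, lam):
    """Slope minimizing sum((y - b0 - b*x)^2) + lam * b^2, intercept unpenalized."""
    sxx, sxy = _centered_sums(xs, ys)
    return sxy / (sxx + lam)


def lasso_slope(xs, ys, lam):
    """Slope minimizing 0.5 * sum((y - b0 - b*x)^2) + lam * |b|, intercept unpenalized."""
    sxx, sxy = _centered_sums(xs, ys)
    # Soft-thresholding: shrink |sxy| by lam, and clip to zero if it goes negative.
    shrunk = max(abs(sxy) - lam, 0.0)
    return math.copysign(shrunk, sxy) / sxx


# Hypothetical toy data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

for lam in (0.0, 5.0, 10.0, 25.0):
    print(f"lam={lam:5.1f}  ridge={ridge_slope(xs, ys, lam):.3f}  "
          f"lasso={lasso_slope(xs, ys, lam):.3f}")
```

With a penalty of zero, both functions return the ordinary least-squares slope. As the penalty grows, ridge shrinks the slope smoothly toward zero, while lasso can set it to exactly zero, which is why lasso is often used for feature selection.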