This repository is the companion code for the book Machine Learning From Scratch. It implements ten core machine learning algorithms from the ground up in NumPy, so that anyone with basic Python skills can follow how each one works, and it compares each implementation against established libraries such as Scikit-learn and PyTorch.
Overview of Features
This project shows what actually happens behind calls such as fit() and predict(), and works through each algorithm in five stages:
- Intuition: Grasp the idea behind each algorithm in plain terms before any math.
- Formalization: Convert intuitive concepts into mathematical expressions suitable for implementation.
- Implementation: Translate mathematical formulas into clean, well-documented code using NumPy.
- Test: Validate the code against real data and benchmark it against popular libraries in the field.
- Tips: Gain insights into the strengths, weaknesses, and practical applications of each algorithm.
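As an illustration of the Formalization, Implementation, and Test stages, here is a minimal sketch of what a from-scratch fit() and predict() can look like for linear regression. It is not taken from the repository's notebooks: the class name `LinearRegressionGD`, its hyperparameters, and the synthetic data are assumptions made for this example, and NumPy's closed-form `np.linalg.lstsq` stands in for the library benchmark.

```python
import numpy as np


class LinearRegressionGD:
    """Hypothetical example class: least-squares regression via batch gradient descent."""

    def __init__(self, lr=0.1, n_iters=2000):
        self.lr = lr            # learning rate (assumed value)
        self.n_iters = n_iters  # number of gradient steps (assumed value)
        self.w = None
        self.b = 0.0

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.w = np.zeros(n_features)
        self.b = 0.0
        for _ in range(self.n_iters):
            residual = X @ self.w + self.b - y
            # Gradients of the mean squared error (1/n) * sum(residual**2)
            grad_w = (2.0 / n_samples) * (X.T @ residual)
            grad_b = 2.0 * residual.mean()
            self.w -= self.lr * grad_w
            self.b -= self.lr * grad_b
        return self

    def predict(self, X):
        return X @ self.w + self.b


# Synthetic data: y = 3*x1 - 2*x2 + 1 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([3.0, -2.0]) + 1.0 + rng.normal(scale=0.1, size=200)

model = LinearRegressionGD().fit(X, y)

# Reference solution: closed-form least squares with an appended bias column
X_aug = np.column_stack([X, np.ones(len(X))])
ref, *_ = np.linalg.lstsq(X_aug, y, rcond=None)

print("gradient descent:", model.w, model.b)
print("closed form:     ", ref[:2], ref[2])
```

Both approaches minimize the same squared-error objective, so the two sets of coefficients should agree closely. That agreement is the kind of check the Test stage describes.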
Table of Contents
The repository is organized into chapters, each paired with an overview, a code file, or both:
| Chapter | Content/Code |
|---|---|
| Setup | Overview |
| Fundamentals of ML | Overview |
| Introduction to Data | Overview |
| The Math You Actually Need for ML | Overview |
| Data Preparation | Overview, data_loader.py |
| Linear Regression | 01_linear_regression.ipynb |
| Logistic Regression | 02_logistic_regression.ipynb |
| Regularization | 03_regularization.ipynb |
| K-Nearest Neighbors | 04_k_nearest_neighbors.ipynb |
| Naïve Bayes | 05_naive_bayes.ipynb |
| Decision Tree | 06_decision_tree.ipynb |
| Random Forest | 07_random_forest.ipynb |
| Gradient Boosting | 08_gradient_boosting.ipynb |
| XGBoost | 09_xgboost.ipynb |
| Neural Network | 10_neural_network.ipynb |
| Model Optimization | Overview, 11_model_optimization.ipynb |
| Conclusion | Overview |
Core Algorithms
The repository implements ten fundamental ML algorithms:
- Linear Regression
- Logistic Regression
- Regularization
- K-Nearest Neighbors
- Naïve Bayes
- Decision Tree
- Random Forest
- Gradient Boosting
- Extreme Gradient Boosting (XGBoost)
- Neural Network
Practical Applications
Alongside the code, practical sections describe how to approach a machine learning problem. They emphasize data quality and cover methods for model evaluation and optimization. The repository therefore supports both understanding the algorithms and applying them in practice.
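To make the idea of model evaluation concrete, here is a minimal sketch of a hold-out evaluation in NumPy. It is an illustration only, not the repository's method: the helper names `holdout_split` and `mse`, the 25% test fraction, and the synthetic data are all assumptions.

```python
import numpy as np


def holdout_split(X, y, test_fraction=0.25, seed=0):
    """Hypothetical helper: shuffle row indices, then split into train and test parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(round(len(X) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


def mse(y_true, y_pred):
    """Hypothetical helper: mean squared error between targets and predictions."""
    return float(np.mean((y_true - y_pred) ** 2))


# Synthetic regression data (assumed for the example)
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 3))
y = X @ np.array([1.5, 0.0, -0.5]) + rng.normal(scale=0.2, size=120)

X_train, X_test, y_train, y_test = holdout_split(X, y)

# Fit on the training rows only, then score on rows the model has not seen
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

train_mse = mse(y_train, X_train @ w)
test_mse = mse(y_test, X_test @ w)
print(f"train MSE: {train_mse:.4f}, test MSE: {test_mse:.4f}")
```

The test error is the more honest estimate of how the model will behave on new data, because the model was never fitted to those rows.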