Phase 3 · 28 Days · Intermediate

Phase 3: Core Machine Learning

Train, evaluate, and explain classical ML models on real tabular datasets, building pipelines that survive contact with messy, real-world data.

  • Build end-to-end pipelines from raw CSV to deployed predictions.
  • Choose models based on data geometry, interpretability needs, and error costs.
  • Use SHAP to explain predictions and appropriate evaluation metrics to judge them.

⚑ Must Know

  • Supervised vs Unsupervised vs RL
  • Linear Regression: OLS, MSE, R²
  • Logistic Regression: sigmoid, log-loss
  • Decision Trees: Gini, entropy, depth
  • Random Forests: bagging, feature importance
  • XGBoost + LightGBM: gradient boosting
  • K-Means Clustering: centroids, elbow method
  • Train/Val/Test Split + Cross-Validation
  • Precision, Recall, F1, ROC-AUC
  • Feature Scaling: StandardScaler, MinMaxScaler
  • Encoding Categorical Features
  • Feature Engineering + Selection
  • Regularization: L1/L2, ElasticNet
  • Hyperparameter Tuning: GridSearchCV
  • sklearn Pipelines
  • SHAP for Model Interpretability

✨ Good to Know

  • SVM + Kernel Trick
  • Naive Bayes
  • Hierarchical Clustering
  • Handling Imbalanced Data: SMOTE
  • Collaborative Filtering: recommenders
  • Joblib / Pickle: model persistence
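For the model persistence item above, here is a minimal sketch of a save-and-reload round trip with the standard-library `pickle` module. The `model_params` dictionary and the `predict` helper are hypothetical stand-ins for a fitted model, used so the example runs without extra dependencies.

```python
import os
import pickle
import tempfile

# Hypothetical fitted parameters standing in for a trained model.
model_params = {"weights": [0.4, -1.2, 3.0], "bias": 0.5}


def predict(params, features):
    """Linear prediction: dot(weights, features) + bias."""
    return sum(w * x for w, x in zip(params["weights"], features)) + params["bias"]


# Write the parameters to disk, then read them back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")
    with open(path, "wb") as f:
        pickle.dump(model_params, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

sample = [1.0, 2.0, 3.0]
print(predict(model_params, sample) == predict(restored, sample))
```

`joblib.dump` and `joblib.load` follow the same save-and-load pattern and are the usual choice for scikit-learn estimators that hold large NumPy arrays. Unpickling can execute arbitrary code, so only load files from sources you trust.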

📚 Resources

scikit-learn User Guide
Docs by scikit-learn

The authoritative reference for the classical ML algorithms that scikit-learn implements.

scikit-learn.org

Kaggle ML Course
Course by Kaggle

Short, hands-on course covering core ML with competitions.

kaggle.com/learn

SHAP Documentation
Docs by SHAP Team

Learn model explainability for production-grade ML.

shap.readthedocs.io

Hands-On ML (Géron)
Book by Aurélien Géron

A practical, hands-on ML book that covers theory and sklearn end-to-end.

oreilly.com

🏗️ Projects

House Price Prediction

Tabular regression pipeline with feature engineering and cross-validated tuning.

Regression · XGBoost · sklearn
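The cross-validated tuning in this project rests on splitting the rows into k folds, so that every row is used for validation exactly once. The sketch below shows that split in plain Python. The function name `k_fold_indices` and its `seed` parameter are assumptions made for illustration. scikit-learn's `KFold` and `GridSearchCV` provide the production versions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return k (train_indices, val_indices) pairs covering every sample once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle so folds are not ordered blocks
    folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
    splits = []
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, val))
    return splits


splits = k_fold_indices(n_samples=10, k=5)
for fold_number, (train, val) in enumerate(splits, start=1):
    print(f"fold {fold_number}: train={sorted(train)} val={sorted(val)}")
```

A tuning loop would fit one model per fold for each candidate hyperparameter value, average the validation scores, and keep the value with the best average.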

Customer Churn Classifier

Predict churn risk with SHAP-driven feature explanations for retention.

Classification · SHAP · Imbalanced
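Because churn data is usually imbalanced, accuracy alone can hide a model that never flags a churner. The sketch below computes precision, recall, and F1 from scratch on made-up labels to show the difference. The label vectors and function names are illustrative assumptions. scikit-learn's `precision_score`, `recall_score`, and `f1_score` are the usual tools.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1); each is 0.0 when its denominator is zero."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up labels: 1 = churned, 0 = stayed (2 churners out of 10 customers).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_stay = [0] * 10                         # baseline that never predicts churn
model_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]    # hypothetical model output

for name, pred in [("always_stay", always_stay), ("model", model_pred)]:
    p, r, f = precision_recall_f1(y_true, pred)
    print(f"{name}: accuracy={accuracy(y_true, pred):.2f} "
          f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Both predictors reach the same accuracy on this toy data, but only the second one recovers any churners, which is what recall and F1 expose.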

Movie Recommender

Collaborative filtering recommender evaluated on ranking quality.

Recommender · Surprise · Collaborative
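To make the collaborative filtering idea concrete, the sketch below predicts a user's rating for an unseen movie as a similarity-weighted average of other users' ratings. It is a plain-Python illustration of user-based filtering with cosine similarity, not the Surprise library's API, and the ratings table is made up.

```python
from math import sqrt

# Made-up ratings: user -> {movie: rating on a 1-5 scale}.
ratings = {
    "ana": {"Alien": 5, "Heat": 4, "Up": 1},
    "ben": {"Alien": 4, "Heat": 5, "Up": 2, "Jaws": 5},
    "cat": {"Alien": 1, "Heat": 2, "Up": 5, "Jaws": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def predict_rating(user, movie, ratings):
    """Similarity-weighted average of other users' ratings for `movie`.

    Returns None when no other user has rated the movie.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or movie not in their_ratings:
            continue
        sim = cosine_similarity(ratings[user], their_ratings)
        weighted_sum += sim * their_ratings[movie]
        weight_total += abs(sim)
    return weighted_sum / weight_total if weight_total else None


predicted = predict_rating("ana", "Jaws", ratings)
print(f"predicted rating for ana on Jaws: {predicted:.2f}")
```

The prediction leans toward the rating from the more similar user. This sketch stops at rating prediction. Evaluating ranking quality, as the project describes, would mean sorting each user's unseen movies by predicted rating and scoring the resulting order. Mean-centering each user's ratings before computing similarity is a common refinement, since raw positive ratings make most users look alike under cosine similarity.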