Case Study 12

Logistic Regression on Imbalanced Data

Class-imbalance techniques for reliable classification

2025
Python · Scikit-learn · Pandas · NumPy · Matplotlib
Key impact
Studied class-imbalance strategies — resampling/SMOTE and class weighting — applied to logistic regression in scikit-learn.
[Figure: representative mockup titled "Imbalanced Logistic". Labels: raw, SMOTE, τ = 0.32, precision–recall, SMOTE + class weights · tuned threshold]

What I did

  1. Studied class-imbalance strategies (resampling with SMOTE, and class weighting) applied to logistic regression in scikit-learn.

  2. Evaluated with imbalance-aware metrics (precision/recall, ROC-AUC, precision–recall curves) instead of accuracy, which is misleading when one class dominates, and tuned the decision threshold for the operating point that mattered.

  3. Compared a baseline model against models trained with rebalancing to quantify the recall gain on the minority class.
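
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the data is synthetic (`make_classification` with about 5% positives), rebalancing is represented only by scikit-learn's `class_weight="balanced"` (SMOTE comes from the separate imbalanced-learn package and is not shown), and the threshold rule in the hypothetical helper `pick_threshold` (maximize F1 on a validation split) stands in for whatever operating point the study targeted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    average_precision_score,
    precision_recall_curve,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Assumption: synthetic data with roughly 5% positives stands in for the real dataset.
X, y = make_classification(
    n_samples=5000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)

# Stratified train / validation / test split (60 / 20 / 20).
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=0
)

models = {
    "baseline": LogisticRegression(max_iter=1000),
    # Reweights the loss inversely to class frequency; no resampling involved.
    "weighted": LogisticRegression(max_iter=1000, class_weight="balanced"),
}


def pick_threshold(y_true, scores):
    """Return the score threshold with the highest F1 (hypothetical selection rule)."""
    precision, recall, thresholds = precision_recall_curve(y_true, scores)
    p, r = precision[:-1], recall[:-1]  # the final curve point has no threshold
    f1 = 2 * p * r / np.maximum(p + r, 1e-12)
    return float(thresholds[np.argmax(f1)])


results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_scores = model.predict_proba(X_test)[:, 1]
    # Tune the threshold on validation data so the test set stays untouched.
    tau = pick_threshold(y_val, model.predict_proba(X_val)[:, 1])
    results[name] = {
        "roc_auc": roc_auc_score(y_test, test_scores),
        "avg_precision": average_precision_score(y_test, test_scores),
        "recall_at_0.5": recall_score(y_test, (test_scores >= 0.5).astype(int)),
        "tau": tau,
        "recall_at_tau": recall_score(y_test, (test_scores >= tau).astype(int)),
    }

for name, row in results.items():
    print(name, {k: round(float(v), 3) for k, v in row.items()})
```

Comparing `recall_at_0.5` between the two rows gives the recall gain from rebalancing at the default cutoff, while `recall_at_tau` shows what threshold tuning alone changes. Choosing the threshold on the validation split rather than the test split keeps the reported test metrics from being optimistically biased.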

Tech stack

Python · Scikit-learn · Pandas · NumPy · Matplotlib