Telco Churn Predictor
A two-stage churn model with explainable drivers on the IBM Telco dataset.
- Python
- XGBoost
- scikit-learn
- pandas
- SHAP
- Metric shown · 0.82 ROC-AUC
- GitHub available
- Feature importance
- Reproducible notebook
At a glance
- Problem: Which customers are likely to churn, and what drives each prediction so retention can be targeted?
- Built: A two-stage pipeline: a gradient-boosted risk scorer feeding an XGBoost classifier, with explainable drivers.
- Models / methods: Gradient boosting (risk scoring) and XGBoost (classification).
- Result: 0.82 ROC-AUC on a held-out test set.
- Strength shown: Practical pipeline design plus explainable, actionable output.
- Links: GitHub · Case Study
Visual proof
The charts and diagrams are real outputs and architecture from the project.
01 Objective
Predict which customers are likely to churn and explain why, so retention efforts can be targeted at the highest-risk accounts.
02 Dataset / input
The IBM Telco customer dataset (contract, billing, tenure, and service-usage attributes), with churn as the target. The confusion matrix covers roughly 1,400 held-out test customers.
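A minimal sketch of how attributes like these might be prepared for modelling. The column names follow the public Telco CSV as commonly distributed, the four rows are made up for illustration, and the choice to fill blank TotalCharges with zero is an assumption, not necessarily what the project notebook does.

```python
import pandas as pd

# Tiny hand-made sample mimicking a few columns of the public Telco CSV.
raw = pd.DataFrame({
    "customerID": ["0001-A", "0002-B", "0003-C", "0004-D"],
    "tenure": [1, 34, 0, 45],
    "Contract": ["Month-to-month", "One year", "Two year", "One year"],
    "MonthlyCharges": [29.85, 56.95, 52.55, 42.30],
    "TotalCharges": ["29.85", "1889.5", " ", "1840.75"],  # blank for a new customer
    "Churn": ["Yes", "No", "No", "No"],
})


def prepare(df):
    """Return (features, target) from a raw Telco-style frame."""
    out = df.drop(columns=["customerID"]).copy()
    # TotalCharges arrives as text; blanks become NaN, then 0.0 (an assumption).
    out["TotalCharges"] = pd.to_numeric(out["TotalCharges"], errors="coerce").fillna(0.0)
    # Binary target: 1 = churned, 0 = stayed.
    y = out.pop("Churn").map({"No": 0, "Yes": 1})
    # One-hot encode the categorical contract type.
    X = pd.get_dummies(out, columns=["Contract"], dtype=int)
    return X, y


X, y = prepare(raw)
print(X.columns.tolist())
```

The same function could be applied to the full CSV once the remaining categorical columns are added to the one-hot list.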
03 Model approach
- Cleaning, encoding, and feature engineering on customer attributes
- A two-stage pipeline: a gradient-boosted risk scorer feeding an XGBoost classifier
- Threshold tuning and class-imbalance handling for fair evaluation
04 Results / metrics
The pipeline reached 0.82 ROC-AUC on a held-out test set. Feature-importance analysis identified contract type, tenure, and charges as the strongest churn drivers, giving retention teams concrete factors to act on.
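For readers who want to see how a ROC-AUC figure and a driver ranking of this kind can be produced, here is a small sketch. It uses synthetic data and scikit-learn's permutation importance as a stand-in for the project's own feature-importance method (the tool list mentions SHAP). The feature names are hypothetical labels attached to synthetic columns, so the printed numbers say nothing about the real model.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical labels: the first three columns carry signal, the last two are noise.
names = ["contract_type", "tenure", "monthly_charges", "noise_a", "noise_b"]
X, y = make_classification(n_samples=2000, n_features=5, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# ROC-AUC is computed from predicted probabilities, not hard labels.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# Permutation importance: the drop in held-out ROC-AUC when one column is shuffled.
result = permutation_importance(model, X_test, y_test, scoring="roc_auc",
                                n_repeats=10, random_state=0)
order = np.argsort(result.importances_mean)[::-1]
ranked = [(names[i], float(result.importances_mean[i])) for i in order]

print(f"ROC-AUC: {auc:.3f}")
for name, drop in ranked:
    print(f"{name:16s} {drop:+.4f}")
```

Permutation importance gives a global ranking; per-customer explanations of the kind SHAP provides would need a separate step.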
05 Deployment / reproducibility
Reproducible notebook plus a saved model artefact ready to score new customers in batch.
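Batch scoring from a saved artefact might look like the sketch below. The file name, the bundle layout (model plus expected column list), and the score_batch helper are all hypothetical; the model is trained on synthetic data purely so the example is self-contained.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Train a small stand-in model on synthetic data.
cols = [f"f{i}" for i in range(6)]
X_arr, y = make_classification(n_samples=500, n_features=6, random_state=0)
X = pd.DataFrame(X_arr, columns=cols)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

# Save the model together with the column order it was trained on.
path = os.path.join(tempfile.mkdtemp(), "churn_model.joblib")  # hypothetical name
joblib.dump({"model": model, "columns": cols}, path)


def score_batch(df, artefact_path):
    """Load the artefact and return df ranked by predicted churn probability."""
    bundle = joblib.load(artefact_path)
    # Reorder incoming columns to match training; missing ones would become NaN.
    features = df.reindex(columns=bundle["columns"])
    scored = df.copy()
    scored["churn_probability"] = bundle["model"].predict_proba(features)[:, 1]
    return scored.sort_values("churn_probability", ascending=False)


# New customers arrive with columns in a different order.
new_customers = X.sample(10, random_state=1)[cols[::-1]]
ranked = score_batch(new_customers, path)
print(ranked[["churn_probability"]].head())
```

Storing the column list next to the model guards against silently scoring features in the wrong order.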
06 Limitations
- A single public dataset that may not match a given operator's mix
- Static snapshot — no time-aware or streaming churn signals
07 Future improvements
- Calibrated churn probabilities tied to retention-offer economics
- Drift monitoring once deployed against live data
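As an illustration of the first improvement, the sketch below calibrates a classifier's probabilities and applies a simple break-even rule for retention offers. Everything here is assumed for the example: the data is synthetic, and the customer value, offer cost, and save rate are made-up figures in a deliberately simple cost model.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=10,
                           weights=[0.73, 0.27], random_state=0)
X_train, X_new, y_train, _ = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Isotonic calibration with internal cross-validation, so predicted
# probabilities can be read as approximate churn rates.
calibrated = CalibratedClassifierCV(GradientBoostingClassifier(random_state=0),
                                    method="isotonic", cv=5)
calibrated.fit(X_train, y_train)
p_churn = calibrated.predict_proba(X_new)[:, 1]

# Hypothetical economics (illustrative numbers only).
CUSTOMER_VALUE = 600.0  # value retained if a would-be churner stays
OFFER_COST = 50.0       # cost of making the offer to one customer
SAVE_RATE = 0.30        # share of would-be churners the offer retains

# Expected gain of offering = P(churn) * save rate * value - offer cost.
expected_gain = p_churn * SAVE_RATE * CUSTOMER_VALUE - OFFER_COST
target = expected_gain > 0

# Equivalent probability cut-off: offer only above the break-even point.
break_even = OFFER_COST / (SAVE_RATE * CUSTOMER_VALUE)
print(f"break-even probability: {break_even:.3f}; targeted: {int(target.sum())}")
```

This rule is only meaningful when the probabilities are calibrated, which is why the two ideas appear together in the list above.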
08 Key takeaway
A clean, explainable classifier with a real metric (0.82 ROC-AUC) and drivers a business could act on.