Muzzamil Rasully

ML & AI Engineer building NLP, RAG and applied machine-learning systems

I build practical ML and AI projects using Python, PyTorch, scikit-learn, Transformers and LangChain — from model training and evaluation to retrieval pipelines, APIs and demos.

Open to ML Engineer, AI Engineer, NLP Engineer, RAG/LLM Engineer and ML-focused Data Scientist roles.

rag_pipeline.py · Python
from transformers import AutoModel
import torch

# retrieve, then generate a grounded answer
ctx = retriever.search(query, k=5)
prompt = build_prompt(query, ctx)
answer = llm.generate(prompt)
# → grounded, cited, on-device

10+ ML / DL models trained
3 end-to-end AI systems
0.82 ROC-AUC · churn classifier

Currently focused on

Applied ML · Retrieval systems (RAG) · Model evaluation · MLOps & deployment · Data analytics
Open to
ML Engineer · AI Engineer · NLP Engineer · ML-focused Data Scientist
Best proof
NLP Retrieval & RAG · FinBERT stock ML pipeline · 0.82 ROC-AUC churn model · CNN from scratch
Core stack
Python · PyTorch · scikit-learn · Transformers · LangChain · FAISS · FastAPI · Docker · SQL

Featured Projects

Practical ML and AI projects — clear problem, method and result. GitHub and case studies available.

Builds a question-answering system over documents using retrieval, re-ranking and generated answers.

  • Built: TF-IDF, dense and hybrid retrieval baselines
  • Added: FAISS search, cross-encoder re-ranking and Flan-T5 answer generation
  • Proof: GitHub and case study
  • Python
  • FAISS
  • Transformers
  • Flan-T5
  • NLP
  • RAG
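
To make the TF-IDF baseline above concrete, here is a minimal sketch in plain Python. It is an illustration, not the project's code: the function names (`tokenize`, `build_tfidf`, `cosine`, `search`), the whitespace tokenizer and the three sample documents are all assumptions made for this example. The dense, FAISS and re-ranking stages are not shown.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumption; real pipelines do more)."""
    return text.lower().split()


def build_tfidf(docs):
    """Return one sparse TF-IDF vector per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF, so terms that occur in every document keep a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs, k=5):
    """Return the top-k (doc_index, score) pairs, best first."""
    vectors, idf = build_tfidf(docs)
    # Query terms that never occur in the corpus are ignored.
    q_vec = {t: c * idf[t] for t, c in Counter(tokenize(query)).items() if t in idf}
    scored = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by index
    return [(i, score) for score, i in scored[:k]]


# Hypothetical mini-corpus, for illustration only.
docs = [
    "faiss builds an index for fast vector search",
    "a cross-encoder re-ranks retrieved passages",
    "flan-t5 generates an answer from retrieved passages",
]
top = search("vector search index", docs, k=2)
```

Here `top[0]` points at the first document, the only one sharing terms with the query. A lexical baseline like this is a useful reference point before adding dense or hybrid retrieval.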

Tests whether short-term stock movement can be predicted using market data, sentiment and document context.

  • Built: leakage-safe financial ML pipeline with technical indicators and FinBERT sentiment
  • Finding: short-horizon prediction was only slightly above baseline, a result reported as-is rather than overstated
  • Proof: reproducible pipeline, dashboard, Docker and case study
  • Python
  • PyTorch
  • FinBERT
  • Transformers
  • RAG
  • scikit-learn
Case Study · Repo available on request
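
One common way to keep a time-series evaluation leakage-safe is a walk-forward split with a gap between training and test rows. The sketch below shows that idea only; the function name `walk_forward_splits`, the `horizon` parameter and the sample sizes are assumptions for illustration, and the actual pipeline may split its data differently.

```python
def walk_forward_splits(n_samples, n_splits, horizon=1):
    """Yield (train_idx, test_idx) pairs over time-ordered rows.

    Every test window starts after its training window, and `horizon`
    rows are skipped in between. If the label for row t is built from the
    price at t + horizon, no training label reaches into the test window.
    """
    fold = n_samples // (n_splits + 1)
    if fold <= horizon:
        raise ValueError("not enough samples for this many splits")
    for i in range(1, n_splits + 1):
        train_end = i * fold                  # training uses rows [0, train_end)
        test_start = train_end + horizon      # skip the gap rows
        test_end = min(test_start + fold, n_samples)
        yield list(range(train_end)), list(range(test_start, test_end))


# Hypothetical sizes, for illustration only.
splits = list(walk_forward_splits(n_samples=20, n_splits=3, horizon=1))
```

With these sizes the first fold trains on rows 0 to 4 and tests on rows 6 to 10, leaving row 5 as the gap. The training window grows with each fold while the test window always lies in the future.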

Predicts customer churn and explains the drivers behind each prediction.

  • Built: two-stage ML pipeline with GBM risk scoring and XGBoost classification
  • Result: 0.82 ROC-AUC on the IBM Telco dataset
  • Explainability: SHAP used to surface churn drivers
  • Python
  • XGBoost
  • scikit-learn
  • pandas
  • SHAP
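
For readers less familiar with the metric, ROC-AUC can be read as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The sketch below computes it directly from that definition. The function name `roc_auc` and the four-row example are illustrative assumptions, not data from the churn project.

```python
def roc_auc(y_true, y_score):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)  # → 0.75
```

The pairwise loop is quadratic, which is fine for a demonstration; library implementations use a sort-based method instead.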

Skills

ML / Modelling

  • Python
  • PyTorch
  • scikit-learn
  • XGBoost
  • Model evaluation

NLP / RAG

  • Transformers
  • LangChain
  • FAISS
  • Flan-T5
  • FinBERT
  • Embeddings

Deployment

  • FastAPI
  • Streamlit
  • Docker
  • MLflow
  • GitHub

Data foundation

  • SQL
  • Pandas
  • Power BI
  • Excel

More projects

Supporting builds across applied ML, full-stack and scientific computing.

GenAI search and recommendation platform for a streaming catalogue.

  • Built: hybrid BM25 + semantic retrieval and a grounded RAG chatbot
  • Added: neural re-ranking and a feedback-loop experiment
  • Python
  • PyTorch
  • RAG
  • FastAPI
  • Docker
  • MLflow
Case Study · Repo available on request
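
Hybrid retrieval needs a way to merge a lexical ranking with a semantic one. Reciprocal rank fusion is one widely used option, sketched below; whether this platform uses it or a different fusion method is not stated here, so treat it as an assumption. The function name, the document ids and the two input rankings are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in.
    k = 60 is a conventional default, not a tuned value.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_ranking = ["title_a", "title_b", "title_c"]      # hypothetical lexical results
semantic_ranking = ["title_b", "title_d", "title_a"]  # hypothetical embedding results
fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
```

Because the method works on ranks rather than raw scores, BM25 and embedding similarities do not need to be put on a common scale before merging.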

A convolutional neural network implemented in pure NumPy.

  • Built: hand-coded forward and backward passes — no frameworks
  • Covers: conv, pooling and dense layers for image classification
  • Python
  • NumPy
  • Computer Vision
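
As a small illustration of what hand-coding the passes involves, the sketch below implements a single-channel convolution layer in NumPy, forward and backward, and checks one gradient against a finite difference. It is a simplified stand-in written for this page (one channel, no stride, no padding, no bias), not an excerpt from the project, and the function names are assumptions.

```python
import numpy as np


def conv2d_forward(x, w):
    """Valid cross-correlation of a 2-D image x with a 2-D kernel w."""
    kh, kw = w.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out


def conv2d_backward(x, w, grad_out):
    """Gradients of the loss with respect to the kernel and the input."""
    kh, kw = w.shape
    grad_w = np.zeros_like(w)
    grad_x = np.zeros_like(x)
    for i in range(grad_out.shape[0]):
        for j in range(grad_out.shape[1]):
            # Each output pixel sends gradient to the patch and kernel that made it.
            grad_w += grad_out[i, j] * x[i:i + kh, j:j + kw]
            grad_x[i:i + kh, j:j + kw] += grad_out[i, j] * w
    return grad_w, grad_x


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 5))
w = rng.normal(size=(3, 3))

# Use loss = sum of all outputs, so the upstream gradient is all ones.
grad_w, grad_x = conv2d_backward(x, w, np.ones((3, 3)))

# Central finite difference on one kernel weight as a sanity check.
eps = 1e-6
w_plus, w_minus = w.copy(), w.copy()
w_plus[1, 2] += eps
w_minus[1, 2] -= eps
numeric = (conv2d_forward(x, w_plus).sum() - conv2d_forward(x, w_minus).sum()) / (2 * eps)
```

`numeric` should agree with `grad_w[1, 2]` to within floating-point error. A gradient check of this kind is a standard way to catch mistakes in a hand-written backward pass.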

Full-stack stock-research platform with analytics-backed, explainable summaries.

  • Built: live search with earnings, valuation and risk breakdowns
  • Added: volatility-based forecasts and LLM-generated summaries
  • Next.js
  • TypeScript
  • Supabase
  • LLM
  • Recharts
Case Study · Repo available on request

Regression models that estimate property prices and price drivers.

  • Built: feature engineering and cross-validated model comparison
  • Result: ranked the strongest drivers of valuation
  • scikit-learn
  • pandas
  • Matplotlib
  • Python

Classifies borrower default risk with careful imbalance handling.

  • Built: feature engineering on applicant and loan data
  • Result: interpretable risk metrics under class imbalance
  • scikit-learn
  • pandas
  • Python
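
One simple way to handle class imbalance is to weight each class inversely to its frequency. The sketch below shows that heuristic as an example only; the project's actual imbalance handling is not specified here, and the function name and the 90/10 label split are assumptions.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 90 repaid loans (0) and 10 defaults (1).
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
```

With these weights the rare class counts ten times more per example than the common class, so both classes contribute equally to a weighted loss.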

Computational Physics (C++)

Numerical-methods toolkit in modern C++ with an N-body simulation.

  • Built: integration, interpolation and root-finding routines
  • Added: RK4 N-body simulation of the TRAPPIST-1 system
  • C++
  • Numerical Methods
  • Simulation
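
The RK4 integrator at the heart of such a simulation fits in a few lines. The toolkit itself is written in C++; the sketch below uses Python to match the other examples on this page and is not a port of the project code. It integrates a single test particle around a fixed unit mass with G = 1, a much simpler setup than the TRAPPIST-1 system, and all names in it are assumptions.

```python
import math


def rk4_step(f, state, dt):
    """Advance `state` by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return [b + scale * s for b, s in zip(base, slope)]

    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return [
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    ]


def kepler(state):
    """Derivative of [x, y, vx, vy] for a test particle around a fixed unit mass."""
    x, y, vx, vy = state
    r3 = (x * x + y * y) ** 1.5
    return [vx, vy, -x / r3, -y / r3]


# Circular orbit of radius 1: speed 1 and period 2*pi in these units.
state = [1.0, 0.0, 0.0, 1.0]
steps = 1000
dt = 2 * math.pi / steps
for _ in range(steps):
    state = rk4_step(kepler, state, dt)
```

After one full period the particle should be back very close to its starting point at (1, 0), which makes a convenient correctness check for the integrator.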

Where I've worked

  1. Data Scientist · Ginseng

    Aug 2024 – Aug 2025
    • Engineered Python ML workflows using scikit-learn, XGBoost and LightGBM on datasets of 100,000+ records, delivering classification, regression and clustering outputs across 6+ concurrent client workstreams.
    • Reduced analytical turnaround time by 30% by integrating LLM APIs and LangChain-assisted document processing, increasing the volume of recurring client deliverables without additional headcount.
    • Converted exploratory analysis into model evaluation reports using precision, recall, F1 and AUC, giving senior stakeholders clearer evidence for operational and strategic decisions.
    • Led a 10-person cross-functional data standardisation initiative, creating a unified reporting framework adopted across client portfolios.
  2. Finance Intern · Xceedure Business Solutions

    Jun 2024 – Aug 2024
    • Improved financial reporting accuracy by 15–20% by validating, cleansing and reconciling 30,000+ records across Excel and SQL datasets, strengthening finance data integrity before reports were issued to senior stakeholders.
    • Reduced weekly reporting preparation time from several hours to under 10 minutes by developing automated Power BI dashboards and structured Excel reporting templates for financial and operational KPI tracking.
    • Automated a recurring monthly finance reporting process using VBA macros, saving approximately 3 hours per cycle while reducing manual input errors and improving consistency across reporting outputs.
    • Strengthened spreadsheet controls using pivot tables, XLOOKUP/VLOOKUP, SUMIFS, IF statements, data validation and formula checks, improving the traceability and auditability of financial analysis.

Hiring for ML, NLP or AI roles?

I'm open to ML Engineer, AI Engineer, NLP Engineer, RAG/LLM Engineer and ML-focused Data Scientist opportunities.

Email me