StreamMind AI
A production-style GenAI search and recommendation platform for a streaming catalogue.
- Python
- PyTorch
- RAG
- FastAPI
- Docker
- MLflow
- Case study available
- Docker included
- CI/tests included
- MLflow tracking
At a glance
- Problem: How do you let users search and chat with a streaming catalogue and get grounded, explainable recommendations?
- Built: Hybrid retrieval, a grounded RAG chatbot, ranking models and a feedback loop behind a FastAPI + Streamlit app.
- Models / methods: BM25 + semantic retrieval, classical rankers, a PyTorch neural re-ranker, and a LinUCB contextual bandit.
- Result: Re-ranking and the bandit loop measurably reshuffled results toward preferred items; experiments were tracked in MLflow.
- Strength shown: End-to-end system design, retrieval evaluation, containerised deployment.
- Links: Case Study; repo available on request
Visual proof
The charts and diagrams show real outputs and the actual architecture of the project.
01 Objective
Let users search and chat with a movie/TV/sport catalogue, returning grounded, explainable recommendations rather than opaque suggestions.
02 Dataset / input
- A structured streaming catalogue with rich metadata
- Text embeddings per title for semantic matching
- Simulated thumbs-up / thumbs-down feedback events
03 Model approach
- Hybrid retrieval combining BM25 keyword search with semantic similarity
- A grounded RAG chatbot that cites the catalogue entries it used
- Classical rankers plus a PyTorch neural re-ranker
- A LinUCB contextual bandit that adapts to feedback online
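As a rough illustration of the hybrid retrieval idea above, the sketch below scores a toy catalogue with a hand-rolled BM25 and with cosine similarity, then blends the two after min-max normalisation. The catalogue entries, the embeddings, the `alpha` weight and the weighted-sum fusion are all assumptions made for illustration; the project may fuse scores differently.

```python
import math
from collections import Counter

# Hypothetical toy catalogue with made-up embeddings (not project data).
CATALOGUE = [
    {"title": "Station Zero", "text": "space crew drama on a distant station", "emb": [0.9, 0.1, 0.0]},
    {"title": "Final Whistle", "text": "football documentary about a title race", "emb": [0.0, 0.1, 0.9]},
    {"title": "Orbit", "text": "a lone astronaut drifts through space", "emb": [0.8, 0.3, 0.1]},
]


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_search(query, query_emb, catalogue, alpha=0.5, top_k=3):
    """Blend keyword and semantic scores; alpha weights the BM25 side."""
    keyword = min_max(bm25_scores(query, [item["text"] for item in catalogue]))
    semantic = min_max([cosine(query_emb, item["emb"]) for item in catalogue])
    rows = [
        {"title": item["title"], "score": alpha * k + (1 - alpha) * s, "bm25": k, "semantic": s}
        for item, k, s in zip(catalogue, keyword, semantic)
    ]
    return sorted(rows, key=lambda r: r["score"], reverse=True)[:top_k]


if __name__ == "__main__":
    # The query embedding is made up; a real system would come from an encoder.
    for row in hybrid_search("space drama", [1.0, 0.2, 0.0], CATALOGUE):
        print(row)
```

Keeping the per-signal scores in each result row makes it easy to see why an item ranked where it did, which is the kind of information a retrieval debug trace would surface.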
04 Results / metrics
Retrieval and ranking quality were tracked with an evaluation harness; the neural re-ranker and bandit loop reshuffled results toward preferred items. Experiments were logged with MLflow.
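The bandit loop can be sketched with the disjoint LinUCB formulation, which keeps one ridge-regression model per arm and adds an upper-confidence bonus to each predicted reward. This is a minimal sketch, assuming two hypothetical arms, two-dimensional context vectors and a made-up `simulated_feedback` function standing in for the thumbs-up / thumbs-down events; none of these come from the project itself. It requires NumPy.

```python
import numpy as np


class LinUCB:
    """Disjoint LinUCB: an independent linear model per arm."""

    def __init__(self, n_arms, n_features, alpha=1.0):
        self.alpha = alpha  # exploration strength
        self.A = [np.eye(n_features) for _ in range(n_arms)]    # X^T X + I per arm
        self.b = [np.zeros(n_features) for _ in range(n_arms)]  # X^T r per arm

    def select(self, context):
        """Return the arm with the highest upper confidence bound."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b  # ridge-regression estimate for this arm
            bonus = self.alpha * np.sqrt(context @ A_inv @ context)
            scores.append(theta @ context + bonus)
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        """Fold one observed (context, reward) pair into the chosen arm."""
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context


def simulated_feedback(arm, context):
    """Hypothetical feedback: thumbs-up (1.0) when the arm matches the
    user's dominant preference dimension, thumbs-down (0.0) otherwise."""
    return 1.0 if arm == int(np.argmax(context)) else 0.0


def run_simulation(rounds=50):
    """Alternate between two made-up user contexts and learn online."""
    bandit = LinUCB(n_arms=2, n_features=2, alpha=1.0)
    contexts = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    for t in range(rounds):
        x = contexts[t % 2]
        arm = bandit.select(x)
        bandit.update(arm, x, simulated_feedback(arm, x))
    return bandit, contexts


if __name__ == "__main__":
    bandit, contexts = run_simulation()
    for x in contexts:
        print(x, "-> arm", bandit.select(x))
```

In this toy setting the bandit learns to pick a different arm for each context, which is the mechanism by which feedback reshuffles results toward preferred items.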
05 Deployment / reproducibility
A FastAPI REST backend with a multi-tab Streamlit front end and a retrieval debug trace. The app is containerised with Docker, wired to GitHub Actions CI, and runs fully offline.
06 Limitations
- Built on a synthetic catalogue rather than a licensed production dataset
- Feedback signals are simulated, not real user behaviour
- Single-node, in-memory retrieval rather than a hosted vector store
07 Future improvements
- Swap the in-memory index for a managed vector database
- Add A/B testing around the ranker and bandit policies
- Add personalisation from longer-term histories
08 Key takeaway
Shows I can assemble a full retrieval-to-ranking system, not just a model, with evaluation, tracking and deployment built in.