Case Study

StreamMind AI

A production-style GenAI search and recommendation platform for a streaming catalogue.

Python
PyTorch
RAG
FastAPI
Docker
MLflow

Case study available
Docker included
CI/tests included
MLflow tracking

At a glance

Problem: How do you let users search and chat with a streaming catalogue and get grounded, explainable recommendations?
Built: Hybrid retrieval, a grounded RAG chatbot, ranking models and a feedback loop behind a FastAPI + Streamlit app.
Models / methods: BM25 + semantic retrieval, classical rankers, a PyTorch neural re-ranker, and a LinUCB contextual bandit.
Result: Re-ranking and the bandit loop measurably reshuffled results toward preferred items; tracked in MLflow.
Strength shown: End-to-end system design, retrieval evaluation, containerised deployment.
Links: Case Study | Repo available on request

Visual proof

Chat UI screenshot (coming soon)
Retrieval debug trace (coming soon)
Evaluation dashboard (coming soon)
Architecture diagram (coming soon)

Charts and diagrams are real outputs and architecture from the project.

01 Objective

Let users search and chat with a movie/TV/sport catalogue, returning grounded, explainable recommendations rather than opaque suggestions.

02 Dataset / input

A structured streaming catalogue with rich metadata
Text embeddings per title for semantic matching
Simulated thumbs-up / thumbs-down feedback events
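As a rough illustration of the inputs listed above, the sketch below models a catalogue entry and a simulated feedback event. The field names, the `simulate_feedback` helper and the 0.8 / 0.2 thumbs-up probabilities are all assumptions made for this example, not the project's actual schema or simulator.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogueItem:
    # Hypothetical schema: the real catalogue fields may differ.
    item_id: str
    title: str
    kind: str          # "movie", "tv" or "sport"
    genres: tuple      # e.g. ("sci-fi", "thriller")
    embedding: tuple   # precomputed text embedding for semantic matching


@dataclass(frozen=True)
class FeedbackEvent:
    user_id: str
    item_id: str
    thumbs_up: bool


def simulate_feedback(items, user_id, liked_genre, n_events, seed=0):
    """Draw random items and thumbs them up more often if they match liked_genre."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    events = []
    for _ in range(n_events):
        item = rng.choice(items)
        # Assumed probabilities, chosen only to make the preference visible.
        p_up = 0.8 if liked_genre in item.genres else 0.2
        events.append(FeedbackEvent(user_id, item.item_id, rng.random() < p_up))
    return events


# Invented toy entries with 2-d stand-in embeddings.
ITEMS = [
    CatalogueItem("t1", "Orbital", "movie", ("sci-fi",), (0.9, 0.1)),
    CatalogueItem("t2", "Matchday Live", "sport", ("football",), (0.1, 0.9)),
]

events = simulate_feedback(ITEMS, "u1", "sci-fi", n_events=50, seed=1)
```

Seeding the generator keeps the simulated events identical across runs, which makes downstream ranking experiments easier to compare.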

03 Model approach

Hybrid retrieval combining BM25 keyword search with semantic similarity
A grounded RAG chatbot that cites the catalogue entries it used
Classical rankers plus a PyTorch neural re-ranker
A LinUCB contextual bandit that adapts to feedback online
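The hybrid retrieval step could look roughly like the sketch below: BM25 scores and cosine similarities are each min-max normalised, then blended with a weight `alpha`. This is a minimal illustration only. The toy catalogue, the 3-d stand-in embeddings, the `alpha` weighting and the min-max fusion rule are assumptions, and the project may fuse scores differently (for example with rank fusion).

```python
import math
from collections import Counter

# Invented toy catalogue and hand-made 3-d vectors standing in for real embeddings.
CATALOGUE = {
    "t1": "space crew survives alien planet science fiction thriller",
    "t2": "football league highlights weekly sport show",
    "t3": "detective solves murder in small town crime drama",
}
DOC_VECS = {"t1": [0.9, 0.1, 0.0], "t2": [0.0, 0.9, 0.1], "t3": [0.1, 0.0, 0.9]}


def tokenize(text):
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of every document for the query."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs
    doc_freq = Counter(term for t in tokens.values() for term in set(t))
    scores = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def min_max(scores):
    """Rescale scores to [0, 1] so lexical and semantic scores are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (s - lo) / (hi - lo) for k, s in scores.items()}


def hybrid_search(query, query_vec, docs, doc_vecs, alpha=0.5, top_k=3):
    """Blend normalised BM25 (weight alpha) with cosine similarity (weight 1 - alpha)."""
    lexical = min_max(bm25_scores(query, docs))
    semantic = min_max({d: cosine(query_vec, v) for d, v in doc_vecs.items()})
    fused = {d: alpha * lexical[d] + (1 - alpha) * semantic[d] for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


results = hybrid_search("sport highlights", [0.1, 0.95, 0.0], CATALOGUE, DOC_VECS)
```

In this toy run the sport title `t2` ranks first because it wins on both the keyword and the embedding side. Setting `alpha=1.0` falls back to pure BM25, and `alpha=0.0` to pure semantic search.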

04 Results / metrics

Retrieval and ranking quality tracked with an evaluation harness; the neural re-ranker and bandit loop reshuffled results toward preferred items. Experiments logged with MLflow.
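To make the bandit loop concrete, here is a minimal sketch of disjoint LinUCB, with one ridge-regression model per arm. It assumes NumPy is available. The two one-hot user contexts, the three arms and the deterministic thumbs-up rule are invented for illustration; the project's features, arms and simulated feedback will differ.

```python
import numpy as np


class LinUCB:
    """Disjoint LinUCB: each arm (here, a candidate title) keeps its own linear model."""

    def __init__(self, n_arms, n_features, alpha=1.0):
        self.alpha = alpha  # width of the exploration bonus
        self.A = [np.eye(n_features) for _ in range(n_arms)]    # X^T X + I per arm
        self.b = [np.zeros(n_features) for _ in range(n_arms)]  # X^T rewards per arm

    def scores(self, x):
        """Upper confidence bound of each arm for context x."""
        out = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b  # ridge-regression estimate of the arm's weights
            out.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return np.array(out)

    def select(self, x):
        return int(np.argmax(self.scores(x)))

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


# Hypothetical setup: two user contexts, each with one preferred arm.
X_A = np.array([1.0, 0.0])
X_B = np.array([0.0, 1.0])
CONTEXTS = [("a", X_A), ("b", X_B)]
PREFERRED = {"a": 2, "b": 0}

bandit = LinUCB(n_arms=3, n_features=2, alpha=1.0)
for t in range(20):
    name, x = CONTEXTS[t % 2]
    arm = bandit.select(x)
    # Simulated feedback, deterministic for clarity: thumbs-up only for the preferred arm.
    reward = 1.0 if arm == PREFERRED[name] else 0.0
    bandit.update(arm, x, reward)
```

After the loop, `bandit.select(X_A)` picks arm 2 and `bandit.select(X_B)` picks arm 0, which is the "reshuffling toward preferred items" effect in miniature. A noisier feedback simulator would need more rounds before the choice stabilises.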

05 Deployment / reproducibility

A FastAPI REST backend with a multi-tab Streamlit front end plus a retrieval debug trace. Containerised with Docker and wired to GitHub Actions CI; runs fully offline.

06 Limitations

Built on a synthetic catalogue rather than a licensed production dataset
Feedback signals are simulated, not real user behaviour
Single-node, in-memory retrieval rather than a hosted vector store

07 Future improvements

Swap the in-memory index for a managed vector database
A/B testing around the ranker and bandit policies
Personalisation from longer-term histories

08 Key takeaway

Shows I can assemble a full retrieval-to-ranking system, not just a model, with evaluation, tracking and deployment built in.
