RAG-on-Me

Project Overview

RAG-on-Me is a minimal Retrieval-Augmented Generation (RAG) system I built to demonstrate the core moving parts of a modern AI pipeline without hiding behind heavy frameworks. It answers questions about my experience, projects, and skills, grounded strictly in my own documents (CV, case studies, READMEs).

Learning Exercise

Deeply understand RAG mechanics from scratch

Portfolio Showcase

Interactive resume that recruiters can query directly

Problem & Opportunity

RAG has become a standard design for grounding LLMs with external knowledge. Yet many examples are over-engineered or tied to a single library. I wanted to create something compact, transparent, and reusable that could serve as both a personal assistant and a code reference for others exploring RAG.

The Opportunity

A system that not only demonstrates technical skill but also functions as an interactive resume.

Scope & Features

Core Features

Markdown ingestion: Load structured documents such as cv.md into a pgvector store

Threaded chat: Each thread_id maintains conversation state

Postgres-backed checkpoints: Persistent chat memory

Minimal graph design: Clear retrieve → generate flow using LangGraph
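To make the retrieve → generate flow and per-thread state concrete, here is a minimal sketch in plain Python. It is not the project's LangGraph code: the names ChatState, retrieve, generate, chat, and THREADS are hypothetical, retrieval is a toy keyword match rather than a vector search, and generation is a string template standing in for an LLM call.

```python
from typing import Dict, List, TypedDict


class ChatState(TypedDict):
    question: str
    context: List[str]
    answer: str
    history: List[str]


def retrieve(state: ChatState, docs: List[str]) -> ChatState:
    # Toy retrieval (assumption): keep documents sharing a word with the question.
    words = set(state["question"].lower().split())
    hits = [d for d in docs if words & set(d.lower().split())]
    return {**state, "context": hits}


def generate(state: ChatState) -> ChatState:
    # Stand-in for an LLM call: answer only from the retrieved context.
    if state["context"]:
        answer = "Based on my documents: " + " ".join(state["context"])
    else:
        answer = "I could not find that in my documents."
    history = state["history"] + [state["question"], answer]
    return {**state, "answer": answer, "history": history}


# In-memory stand-in for persistent checkpoints: thread_id -> history.
THREADS: Dict[str, List[str]] = {}


def chat(thread_id: str, question: str, docs: List[str]) -> str:
    state: ChatState = {
        "question": question,
        "context": [],
        "answer": "",
        "history": THREADS.get(thread_id, []),
    }
    state = generate(retrieve(state, docs))  # retrieve -> generate
    THREADS[thread_id] = state["history"]    # each thread keeps its own state
    return state["answer"]
```

Each node takes the state and returns an updated copy, which is the property that makes the flow easy to inspect and to checkpoint between steps.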

API Endpoints

POST /initialize

Ingest documents into the vector store

POST /chat

Query with conversation memory

GET /graph/state

Inspect graph state for debugging
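The three endpoints can be pictured as a small dispatch table. The sketch below is a framework-free stand-in, not the project's FastAPI code. Only thread_id comes from the description above; the message field, the response shapes, the 409 status before initialization, and the names ChatRequest, STATE, ROUTES, and handle are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Response = Tuple[int, Dict[str, Any]]


@dataclass
class ChatRequest:
    thread_id: str
    message: str  # hypothetical field name


# Hypothetical in-process state standing in for the vector store and checkpoints.
STATE: Dict[str, Any] = {"initialized": False, "turns": {}}


def initialize(payload: Dict[str, Any]) -> Response:
    # The real endpoint would ingest documents; here we only flip a flag.
    STATE["initialized"] = True
    return 200, {"status": "initialized"}


def chat(payload: Dict[str, Any]) -> Response:
    if not STATE["initialized"]:
        return 409, {"error": "call /initialize first"}  # assumed behaviour
    req = ChatRequest(**payload)
    turns: List[str] = STATE["turns"].setdefault(req.thread_id, [])
    turns.append(req.message)
    return 200, {"thread_id": req.thread_id, "turn": len(turns)}


def graph_state(payload: Dict[str, Any]) -> Response:
    thread_id = payload["thread_id"]
    return 200, {"thread_id": thread_id,
                 "messages": list(STATE["turns"].get(thread_id, []))}


ROUTES: Dict[Tuple[str, str], Callable[[Dict[str, Any]], Response]] = {
    ("POST", "/initialize"): initialize,
    ("POST", "/chat"): chat,
    ("GET", "/graph/state"): graph_state,
}


def handle(method: str, path: str, payload: Dict[str, Any]) -> Response:
    handler = ROUTES.get((method, path))
    if handler is None:
        return 404, {"error": "not found"}
    return handler(payload)
```

A typical session calls POST /initialize once, then POST /chat repeatedly with the same thread_id, and uses GET /graph/state when a turn needs debugging.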

Technical Architecture

Data Flow

Client → FastAPI → LangGraph → Vector Store + LLM → Postgres (checkpoints)

Stack Components

FastAPI for REST endpoints

LangGraph for pipeline orchestration

PostgreSQL + pgvector for storage

OpenAI API for embeddings & generation
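As a rough picture of what the vector store contributes, the sketch below ranks stored chunks by cosine similarity to a query embedding in plain Python. In the real stack this ranking would happen inside Postgres (pgvector's `<=>` operator computes cosine distance), and the embeddings would come from the OpenAI API. Whether the project uses cosine or another metric is not stated, so treat the metric, the function names, and the tiny two-dimensional vectors as assumptions.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # Dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # guard against zero vectors


def top_k(
    query: Sequence[float],
    rows: Sequence[Tuple[str, Sequence[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    # Score every (text, embedding) row, then keep the k most similar.
    scored = [(text, cosine_similarity(query, vec)) for text, vec in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

A brute-force scan like this is fine for a single CV; an index inside the database only starts to matter once the document set grows.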

Module Structure

main.py API lifecycle + graph compile

graph_runtime.py Retrieve → generate flow

nodes.py Retrieval & generation logic

adapters.py LLM, embeddings, vector store

ingest.py Markdown document indexing
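For the ingestion module, one simple approach is to split each Markdown file at its headings so that every chunk carries its section title as metadata. This is a sketch of that idea only; the actual chunking strategy in ingest.py is not described here, and chunk_markdown and its output keys are hypothetical.

```python
from typing import Dict, List


def chunk_markdown(text: str, source: str) -> List[Dict[str, str]]:
    # Collect lines under the most recent heading; text before the first
    # heading goes into a "(preamble)" section.
    # Caveat: a line starting with "#" inside a code fence is also treated
    # as a heading in this simplified version.
    sections: List[list] = [["(preamble)", []]]
    for line in text.splitlines():
        if line.startswith("#"):
            sections.append([line.lstrip("#").strip(), []])
        else:
            sections[-1][1].append(line)

    chunks: List[Dict[str, str]] = []
    for heading, lines in sections:
        body = "\n".join(lines).strip()
        if body:  # skip sections with no content
            chunks.append({"source": source, "heading": heading, "text": body})
    return chunks
```

Each resulting chunk would then be embedded and written to the vector store together with its source and heading, which lets answers point back to the section they came from.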

Challenges & Trade-offs

Scope Control

Kept the codebase to roughly 1K lines so it stays readable, at the cost of advanced features (e.g., reranking, hybrid search).

State Management

Designing checkpoints to reliably track multi-turn conversations required careful persistence logic.

Generality vs Personal Use

Balancing a system tailored to my CV with code flexible enough for others to adapt.
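The persistence logic behind the state-management challenge can be sketched as an append-only table keyed by thread and step. The example uses sqlite3 from the standard library purely as a stand-in for Postgres so that it runs anywhere; the table layout and the names open_store, save_checkpoint, and load_latest are assumptions, not the project's schema or LangGraph's checkpointer.

```python
import json
import sqlite3
from typing import Any, Dict, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints ("
        " thread_id TEXT NOT NULL,"
        " step INTEGER NOT NULL,"
        " state TEXT NOT NULL,"
        " PRIMARY KEY (thread_id, step))"
    )
    return conn


def save_checkpoint(conn: sqlite3.Connection, thread_id: str,
                    state: Dict[str, Any]) -> int:
    # Append a new step rather than overwriting, so history stays inspectable.
    row = conn.execute(
        "SELECT COALESCE(MAX(step), 0) + 1 FROM checkpoints WHERE thread_id = ?",
        (thread_id,),
    ).fetchone()
    step = row[0]
    conn.execute(
        "INSERT INTO checkpoints (thread_id, step, state) VALUES (?, ?, ?)",
        (thread_id, step, json.dumps(state)),
    )
    conn.commit()
    return step


def load_latest(conn: sqlite3.Connection,
                thread_id: str) -> Optional[Dict[str, Any]]:
    # Resume a conversation from its most recent checkpoint, if any.
    row = conn.execute(
        "SELECT state FROM checkpoints WHERE thread_id = ?"
        " ORDER BY step DESC LIMIT 1",
        (thread_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Keying on (thread_id, step) keeps threads isolated from each other and makes it possible to replay or debug an earlier turn.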

Outcomes & Learnings

Technical Achievements

Built a clean RAG service from scratch

Gained hands-on clarity on ingestion, retrieval, generation, and memory management

Integrated LangGraph pipelines with FastAPI

Validation Results

Validated the system by querying my portfolio documents in multi-turn conversations, which confirmed that the pipeline works end to end and suggests it is practically useful for recruiters.

Proudest Achievement

Turning my resume into an interactive chatbot that demonstrates RAG in a recruiter-friendly way.

Future Directions

Hybrid Search

Experiment with BM25 + vector search for better grounding

Guardrails

Handle hallucinations and off-topic queries

Multi-Document Ingestion

Expand to case studies, blog posts, academic notes
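For the hybrid search direction, one common way to combine a BM25 ranking with a vector ranking is reciprocal rank fusion (RRF), which needs only the two ranked lists and not their raw scores. The source does not say which fusion method is planned, so RRF is swapped in here as one option; the function name is hypothetical, and k = 60 is the constant commonly used with RRF.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]],
                           k: int = 60) -> List[str]:
    # Each document earns 1 / (k + rank) from every list it appears in;
    # documents ranked well by several retrievers accumulate the most.
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by id for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Because it works on ranks alone, this avoids having to normalise BM25 scores against cosine similarities, which live on different scales.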

Reflection

RAG-on-Me taught me how to go beyond toy demos and build a real, production-style pipeline with persistence, modularity, and API design. It is also a personal branding tool: instead of reading my CV, recruiters can chat with it.

Advice for Others

"Strip down RAG to its essentials first. Understand the flow from request → retrieval → generation → memory. Once you've nailed that, scaling features becomes easier."
