Bharat Khanna

Projects

AI/ML systems, evaluation tools, and benchmark infrastructure -- each with a public GitHub repo you can clone and run.

These are projects I’ve built or extracted from production work. Each one has a write-up explaining the problem and design decisions, plus a public repo with working code and tests.

Every repo runs locally with Python and pytest. No hosted services or API keys needed for the default path.

Active

Production RAG Pipeline

In-memory RAG pipeline that demonstrates chunking, hybrid retrieval, caching, and answer assembly -- no external services required.

Focus: Ingestion, hybrid retrieval, caching, and grounded answers

  • Chunking with configurable size and overlap
  • Hybrid scoring that blends lexical overlap and cosine similarity
  • Query-level caching with hit tracking
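The three bullets above can be sketched in plain Python. This is a minimal illustration, not the repo's actual API: the names (`chunk`, `hybrid_score`, `CachedRetriever`) and the 50/50 blend weight are assumptions of mine.

```python
import math
from collections import Counter

def chunk(text, size=40, overlap=10):
    """Split text into word-level chunks of `size` words, sharing `overlap` words."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def cosine(a, b):
    """Cosine similarity over simple term-frequency vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def lexical_overlap(query, text):
    """Fraction of query terms that appear in the chunk."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def hybrid_score(query, text, alpha=0.5):
    """Blend lexical overlap and cosine similarity with weight alpha."""
    return alpha * lexical_overlap(query, text) + (1 - alpha) * cosine(query, text)

class CachedRetriever:
    """Ranks chunks by hybrid score; caches results per (query, k) and counts hits."""
    def __init__(self, chunks):
        self.chunks = chunks
        self.cache = {}
        self.hits = 0

    def retrieve(self, query, k=2):
        key = (query, k)
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        ranked = sorted(self.chunks, key=lambda c: hybrid_score(query, c), reverse=True)
        self.cache[key] = ranked[:k]
        return ranked[:k]
```

Blending lexical overlap with cosine similarity is what makes the retrieval "hybrid": exact term matches and fuzzy similarity each cover the other's misses.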
Active

Agent Evaluation Harness

Evaluation harness for LLM agent workflows -- deterministic scoring, trajectory checks, and regression gating you can run with just Python and pytest.

Focus: Agent evaluation and regression gating

  • Single-turn answer scoring with keyword overlap
  • Trajectory scoring that catches wrong tool order
  • Regression gate with configurable pass/fail threshold
Active

Vector Search Benchmark Harness

Local benchmark harness for comparing exact vs approximate vector search -- recall, latency, and candidate coverage on synthetic clustered data.

Focus: Benchmark methodology and reproducible vector search comparisons

  • Deterministic clustered dataset generation with fixed seeds
  • Three backends: exact linear, sign-bucket ANN, and projection ANN
  • Recall@k, p50/p95 latency, build time, and candidate ratio in one table
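A stdlib-only sketch of that comparison: seeded clustered data, an exact baseline, a toy sign-bucket ANN, and recall@k against the exact result. All names and parameters here are my own illustration, not the harness's actual backends.

```python
import math
import random

def make_clusters(n_clusters=4, per_cluster=50, dim=8, seed=42):
    """Deterministic clustered dataset: a fixed seed makes every run reproducible."""
    rng = random.Random(seed)
    centers = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_clusters)]
    return [[c + rng.gauss(0, 0.5) for c in center]
            for center in centers for _ in range(per_cluster)]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_knn(data, query, k):
    """Exact linear scan: the ground truth the ANN backends are measured against."""
    return sorted(range(len(data)), key=lambda i: dist(data[i], query))[:k]

class SignBucketANN:
    """Toy ANN: bucket vectors by the sign pattern of their coordinates,
    then scan only the query's bucket. Fast but can miss true neighbours."""
    def __init__(self, data):
        self.data = data
        self.buckets = {}
        for i, vec in enumerate(data):
            self.buckets.setdefault(self._key(vec), []).append(i)

    @staticmethod
    def _key(vec):
        return tuple(x >= 0 for x in vec)

    def query(self, query, k):
        candidates = self.buckets.get(self._key(query), [])
        top = sorted(candidates, key=lambda i: dist(self.data[i], query))[:k]
        return top, len(candidates)  # candidate count feeds the coverage ratio

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate backend recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)
```

The candidate count returned alongside the results is what a candidate-ratio column measures: how much of the dataset the ANN actually had to scan to reach its recall.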