Projects
2026
Python, RAG, LLM, Systematic Review, BioNLP, Oncology
HGSOC Liquid Biopsy Expert — RAG System
Domain-specific expert assistant for high-grade serous ovarian cancer liquid biopsy research, grounded in a PROSPERO-registered systematic review.
Overview
The system answers clinical and research questions about liquid biopsy biomarkers in high-grade serous ovarian cancer (HGSOC). Unlike generic biomedical RAG systems, it is grounded in a curated two-tier corpus: the full-text papers from a PROSPERO-registered systematic review (the evidence base), combined with a structured research wiki covering HGSOC biology, liquid biopsy methodology, and related literature.
Tech & Architecture
- Corpus curation: Docling-based extraction pipeline for full-text PDFs (tables, figures, supplementary materials); structured Markdown knowledge base (~180 documents across two tiers)
- Retrieval: hybrid BM25 + dense retrieval (MedCPT / BioLORD adapters) over deterministic JSONL chunk exports; metadata filters expose corpus tier at query time
- Evaluation: three-arm comparison (long-context agent vs. RAG vs. QLoRA fine-tune) scored on field accuracy, citation accuracy, and hallucination rate against extraction_v2.db (PROSPERO CRD420261405303)
- Search automation: PICO-to-boolean query generation benchmarked against a frozen 2,927-record human PubMed search (134/158 priority records recovered by template arm)
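The hybrid retrieval step with a tier filter can be illustrated with a minimal sketch. The project description does not say how the BM25 and dense rankings are combined, so reciprocal rank fusion (RRF) is an assumption here, as are the chunk IDs, the `tier` field name, and the tier labels `review` and `wiki`. The two input rankings stand in for the output of the BM25 and dense retrievers.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into a single ranking.

    Each chunk earns 1 / (k + rank) from every list it appears in.
    RRF is an assumed fusion method, not one confirmed by the project.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def filter_by_tier(chunk_ids, chunk_meta, tier):
    """Keep only chunks whose metadata marks them as the requested tier."""
    return [cid for cid in chunk_ids if chunk_meta[cid]["tier"] == tier]


# Hypothetical chunk metadata, standing in for the JSONL chunk exports.
chunk_meta = {
    "sr-001": {"tier": "review"},
    "sr-002": {"tier": "review"},
    "wiki-001": {"tier": "wiki"},
    "wiki-002": {"tier": "wiki"},
}

# Hypothetical ranked results from the two retrievers for one query.
bm25_ranking = ["sr-001", "wiki-001", "sr-002"]
dense_ranking = ["sr-002", "sr-001", "wiki-002"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
review_only = filter_by_tier(fused, chunk_meta, "review")

print(fused)
print(review_only)
```

Filtering after fusion keeps the example short; a real pipeline might instead apply the tier filter inside each retriever so that the top-k budget is not spent on chunks that are later discarded.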
Results & Highlights
- PROSPERO-registered systematic review as the evaluation ground truth, giving the benchmark stronger provenance than LLM-annotated alternatives
- Full PRISMA pipeline automated: search recall benchmarking (Task 1), title/abstract screening with per-criterion scoring (Task 2), structured data extraction (Task 3)
- Per-criterion screening labels captured for 42 PI-confirmed full-text decisions, including 9 reclassifications that reveal the precise boundary where automated screeners systematically fail
- Two-tier corpus design decouples SR-quality extraction (benchmark-grounded) from broader domain Q&A (wiki-grounded), enabling both rigorous evaluation and expert-level synthesis
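The search recall figure reported for Task 1 (134 of 158 priority records recovered by the template arm) is a set-overlap calculation. The sketch below shows the arithmetic, assuming records are matched on a unique identifier such as a PMID; the function name and the synthetic IDs are illustrative and not taken from the project.

```python
def search_recall(retrieved_ids, priority_ids):
    """Return (hits, total, recall) for a generated query's result set.

    Recall is the share of priority records that the query recovered.
    """
    retrieved = set(retrieved_ids)
    priority = set(priority_ids)
    if not priority:
        raise ValueError("priority set is empty")
    hits = retrieved & priority
    return len(hits), len(priority), len(hits) / len(priority)


# Synthetic identifiers that reproduce the reported counts.
priority_ids = [f"PMID{i}" for i in range(158)]
# The query recovers 134 priority records plus two unrelated ones.
retrieved_ids = priority_ids[:134] + ["PMID9001", "PMID9002"]

hits, total, recall = search_recall(retrieved_ids, priority_ids)
print(f"{hits}/{total} priority records recovered (recall = {recall:.1%})")
# prints "134/158 priority records recovered (recall = 84.8%)"
```

Recall alone says nothing about how many irrelevant records a generated query returns, so in practice it would be read alongside the size of the retrieved set.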