Featured Project

Goal-Based Agentic AI using Knowledge Graph

LangGraph · Gemini 2.0-Flash · ChromaDB · SGDAEO Framework

Python · LangGraph · Gemini · RAG · Knowledge Graph · ChromaDB

Built an agentic AI system that extracts structured strategy knowledge from business documents. V2's multi-step pipeline with reflection loops extracted about 3.6x as many elements as V1's single-shot approach (106 vs. 29). Both versions use the same model and the same temperature, so the architecture is the difference.

V1 vs V2 Pipeline Comparison
V1 vs V2: Single LLM call (29 elements) vs agentic pipeline with 21 calls, 6 reflection loops, and self-correction (106 elements + 108 relationships).
SGDAEO Knowledge Graph
SGDAEO Influence Chain: 6 entity types with 7 typed relationships forming a traceable graph — Evidence grounds Strategy, Strategy drives Goal, Goal requires Decision, Decision implemented by Action, Outcome measures Goal.
Gap Detection and Filling
Graph-Guided Gap Detection: Traversing the graph found 81 structural gaps at zero LLM cost. Three targeted LLM calls then added 28 new entities, growing the graph by 26% (106 to 134 elements) through precision re-extraction.
3.6x More Entities
134 Graph Elements
136 Relationships
81 Gaps Detected
8 Self-Corrections

Architecture Highlights

Agentic Pipeline (V2): 21 LLM calls per document — 6 extraction passes (one per SGDAEO category), 6 reflection loops for self-correction, document profiling, and cross-category context accumulation. Each category's prompt includes results from all prior categories.
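
The extract-then-reflect loop with cross-category context accumulation could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `run_pipeline`, the `extract` and `reflect` callables, and the prompt wording are all hypothetical, and the document profiling step is omitted.

```python
from typing import Callable, Dict, List

# The six SGDAEO categories, processed in order.
CATEGORIES = ["Strategy", "Goal", "Decision", "Action", "Evidence", "Outcome"]

# Hypothetical LLM interface: takes a prompt, returns a list of extracted items.
LLM = Callable[[str], List[str]]


def run_pipeline(document: str, extract: LLM, reflect: LLM) -> Dict[str, List[str]]:
    """Run one extraction pass and one reflection pass per category."""
    results: Dict[str, List[str]] = {}
    for category in CATEGORIES:
        # Context accumulation: the prompt carries every prior category's results.
        context = "\n".join(
            f"{name}: {', '.join(items)}" for name, items in results.items()
        )
        prompt = (
            f"Extract {category} items.\n"
            f"Prior results:\n{context}\n"
            f"Document:\n{document}"
        )
        draft = extract(prompt)
        # Reflection: a second call reviews the draft and returns a corrected list.
        review_prompt = f"Review these {category} items: {draft}\nDocument:\n{document}"
        results[category] = reflect(review_prompt)
    return results
```

Passing the model calls in as plain callables keeps the loop testable with stubs, so the control flow can be checked without spending any LLM calls.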

Knowledge Graph: Before gap filling, 106 elements across 6 types (Strategy, Goal, Decision, Action, Evidence, Outcome) connected by 108 typed relationships (7 edge types: DRIVES, REQUIRES, IMPLEMENTED_BY, SUPPORTS, MEASURES, VALIDATES, EVOLVES_TO).
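
One way to represent such a typed graph is sketched below. The six node types and seven edge types come from the description above; the class names (`KnowledgeGraph`, `Edge`) and the `trace` helper are assumptions made for illustration, not the project's actual data model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class NodeType(Enum):
    STRATEGY = "Strategy"
    GOAL = "Goal"
    DECISION = "Decision"
    ACTION = "Action"
    EVIDENCE = "Evidence"
    OUTCOME = "Outcome"


class EdgeType(Enum):
    DRIVES = "DRIVES"
    REQUIRES = "REQUIRES"
    IMPLEMENTED_BY = "IMPLEMENTED_BY"
    SUPPORTS = "SUPPORTS"
    MEASURES = "MEASURES"
    VALIDATES = "VALIDATES"
    EVOLVES_TO = "EVOLVES_TO"


@dataclass(frozen=True)
class Edge:
    source: str
    kind: EdgeType
    target: str


@dataclass
class KnowledgeGraph:
    nodes: Dict[str, NodeType] = field(default_factory=dict)
    edges: List[Edge] = field(default_factory=list)

    def add_node(self, node_id: str, node_type: NodeType) -> None:
        self.nodes[node_id] = node_type

    def add_edge(self, source: str, kind: EdgeType, target: str) -> None:
        # Refuse dangling edges so every relationship stays traceable.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before linking them")
        self.edges.append(Edge(source, kind, target))

    def trace(self, start: str) -> List[str]:
        """Follow outgoing edges depth-first; return node ids in visit order."""
        seen: List[str] = []
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.append(node)
            # Reversed so that earlier-added edges are visited first.
            stack.extend(e.target for e in reversed(self.edges) if e.source == node)
        return seen
```

With a Strategy linked to a Goal, the Goal to a Decision, and the Decision to an Action, `trace` on the Strategy node walks the whole influence chain in order.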

Graph Gap Detection: Pure graph traversal finds 81 structural gaps without any LLM call. Targeted re-extraction then fills them with focused prompts: 3 calls produced 28 new entities, growing the graph by 26%.
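
A structural gap check of this kind needs only the graph itself. The sketch below assumes a small, hypothetical rule table (`EXPECTED`) stating which edges a node of a given type should have; the project's real rules are not given here, so treat these as placeholders.

```python
from typing import Dict, List, Tuple

# Hypothetical rules: (node type, edge type, direction the edge must have).
EXPECTED = [
    ("Goal", "DRIVES", "in"),             # some Strategy should drive each Goal
    ("Goal", "MEASURES", "in"),           # some Outcome should measure each Goal
    ("Goal", "REQUIRES", "out"),          # each Goal should require a Decision
    ("Decision", "IMPLEMENTED_BY", "out"),  # each Decision needs an Action
]

# An edge is (source id, edge type, target id).
EdgeTuple = Tuple[str, str, str]


def find_gaps(nodes: Dict[str, str], edges: List[EdgeTuple]) -> List[Tuple[str, str]]:
    """Return (node id, description) for every expected edge that is absent."""
    gaps: List[Tuple[str, str]] = []
    for node_id, node_type in nodes.items():
        for rule_type, edge_type, direction in EXPECTED:
            if node_type != rule_type:
                continue
            if direction == "out":
                present = any(s == node_id and k == edge_type for s, k, _ in edges)
            else:
                present = any(t == node_id and k == edge_type for _, k, t in edges)
            if not present:
                gaps.append((node_id, f"missing {direction} {edge_type}"))
    return gaps
```

Each returned gap names a specific node and a specific missing relationship, which is what lets a follow-up prompt ask a narrow question instead of re-reading the whole document.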

3-Layer RAG: Metadata filtering (19 section patterns) → vector similarity (ChromaDB + 1536-dim embeddings) → keyword boost. Narrows 10K chunks to the 5 most relevant before querying.
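
The three retrieval layers can be illustrated in plain Python. This sketch deliberately avoids the ChromaDB API and uses an in-memory list with hand-rolled cosine similarity as a stand-in for the vector store; the chunk fields (`text`, `section`, `embedding`), the `retrieve` function, and the `boost` weight are all assumptions.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    chunks: List[Dict],
    query_vec: Sequence[float],
    query_terms: Sequence[str],
    section: str,
    top_k: int = 5,
    boost: float = 0.1,  # hypothetical weight per matching keyword
) -> List[Dict]:
    # Layer 1: metadata filter keeps only chunks from the requested section.
    candidates = [c for c in chunks if c["section"] == section]

    scored = []
    for chunk in candidates:
        # Layer 2: vector similarity between the chunk and the query.
        score = cosine(chunk["embedding"], query_vec)
        # Layer 3: keyword boost for each query term found in the chunk text.
        text = chunk["text"].lower()
        score += boost * sum(1 for term in query_terms if term.lower() in text)
        scored.append((score, chunk))

    # Sort on the score only, then keep the top_k best chunks.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]
```

Filtering on metadata first means the similarity computation only runs over a fraction of the corpus, and the keyword boost lets an exact term match break ties between chunks with near-identical embeddings.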