PROJECT [02]

JDIH Jawa Timur Chatbot

RAG SYSTEM / CHATBOT
TYPE: RAG SYSTEM / CHATBOT
STATUS: PRODUCTION
YEAR: 2026
STACK: Python / FastAPI / ChromaDB / Amazon Bedrock

The JDIH Jawa Timur Real-Time Chatbot is a Retrieval-Augmented Generation (RAG) system that acts as an assistant for finding and querying legal and regulatory documents of the East Java (Jawa Timur) region.

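The system implements Hybrid Search: it runs a semantic (embedding-based) search and a lexical (keyword-based) search for each query, then merges the two ranked result lists with Reciprocal Rank Fusion (RRF) to improve retrieval accuracy. This dual approach captures both contextual meaning and exact term matches, which matters for legal queries that cite specific wording or regulation numbers.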
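
As a minimal sketch of how RRF merges two ranked lists: each document receives a score of 1 / (k + rank) from every list it appears in, and the scores are summed. The function name, the sample document IDs, and the constant k = 60 (a commonly used default) are assumptions for illustration, not details taken from the project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a common default; assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for the documents it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers.
semantic = ["perda-5-2019", "pergub-12-2021", "perda-3-2020"]
lexical = ["pergub-12-2021", "perda-7-2018", "perda-5-2019"]

fused = reciprocal_rank_fusion([semantic, lexical])
print(fused)
```

A document ranked well by both retrievers rises above one that only a single retriever found, without needing the two retrievers' raw scores to be on a comparable scale.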

A key innovation is the real-time ingestion pipeline, which cut the delay between a new legal document being published and it becoming searchable from approximately 24 hours to less than 1 minute. The system also features Dynamic Regex Extraction for recognizing complex numbered legal document identifiers in queries.
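
The identifier extraction could look roughly like the sketch below. It assumes that "dynamic" means the pattern is built at runtime from a table of document-type aliases, and that identifiers follow forms such as "Peraturan Daerah Nomor 5 Tahun 2019" or "Pergub No. 12/2021". The alias table, function names, and accepted formats are all hypothetical; the project's actual rules are not described here.

```python
import re

# Hypothetical alias table: canonical key -> spellings seen in user queries.
DOC_TYPES = {
    "perda": ["peraturan daerah", "perda"],
    "pergub": ["peraturan gubernur", "pergub"],
}


def build_pattern(doc_types):
    """Compile a regex from the alias table (assumed meaning of 'dynamic')."""
    aliases = sorted(
        (alias for names in doc_types.values() for alias in names),
        key=len,
        reverse=True,  # try longer aliases first
    )
    type_group = "|".join(re.escape(alias) for alias in aliases)
    return re.compile(
        rf"\b(?P<type>{type_group})\s+"
        r"(?:(?:nomor|no\.?)\s*)?"    # optional "Nomor" / "No."
        r"(?P<number>\d+)\s*"
        r"(?:/|\s+tahun\s+)\s*"       # "5/2019" or "5 Tahun 2019"
        r"(?P<year>\d{4})\b",
        re.IGNORECASE,
    )


def extract_identifiers(text, doc_types=DOC_TYPES):
    """Return (canonical_type, number, year) tuples found in the text."""
    canonical = {
        alias: key for key, names in doc_types.items() for alias in names
    }
    pattern = build_pattern(doc_types)
    results = []
    for match in pattern.finditer(text):
        alias = " ".join(match.group("type").lower().split())
        results.append(
            (canonical[alias], int(match.group("number")), int(match.group("year")))
        )
    return results


query = "Apa isi Peraturan Daerah Nomor 5 Tahun 2019 dan Pergub No. 12/2021?"
print(extract_identifiers(query))
```

Normalizing every spelling to one canonical tuple would let the lexical search filter on the exact regulation type, number, and year, whichever form the user typed.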

  • Python
  • FastAPI
  • ChromaDB
  • Amazon Bedrock
  • Hybrid Search (RRF)
ARCHITECTURE DIAGRAM
┌───────────────┐     ┌──────────────┐     ┌─────────────────┐
│  USER QUERY   │────▶│  EMBEDDING   │────▶│ SEMANTIC SEARCH │
└───────────────┘     └──────────────┘     └────────┬────────┘
        │                                           │
        │             ┌──────────────┐              │
        └────────────▶│ LEXICAL SRCH │──────┐       │
                      └──────────────┘      ▼       ▼
                                     ┌──────────────────┐
                                     │    RRF FUSION    │
                                     └────────┬─────────┘
                                              ▼
                                     ┌──────────────────┐
                                     │   LLM SYNTHESIS  │
                                     └──────────────────┘

Document Ingestion → Hybrid Search (Semantic + Lexical) → RRF Fusion → LLM Synthesis