Engineering

Beyond LLMs: Building a Proprietary Knowledge Base

Marcus Kane

Oct 08, 2024 · 10 min read

Lead Engineer, Engium · Oct 08, 2024

Off-the-shelf LLMs are remarkably capable, yet they hallucinate on your specific pricing, get product names wrong, and confidently answer questions about policies that changed six months ago. The solution is not a better model — it is better context.

Why RAG Works

Retrieval-Augmented Generation grounds the LLM in verified facts at inference time. Instead of relying on statistical patterns learned during training, the model reads your actual documents and answers based on what it finds.

The architectural insight is that retrieval is a search problem, not a generation problem. Embedding-based semantic search finds relevant chunks even when users phrase questions differently from your documentation.
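
To make the "retrieval is a search problem" point concrete, here is a minimal sketch of similarity search over embeddings. The three-dimensional vectors are toy stand-ins for real embeddings, and `cosine_similarity` and `top_k` are illustrative helper names, not part of Engium's API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs, k=3):
    """Return (index, score) pairs for the k most similar chunks."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors standing in for chunk embeddings (assumption: real ones are much larger)
chunks = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query = [0.9, 0.1, 0.0]
results = top_k(query, chunks, k=2)
```

In production the same ranking is typically delegated to the vector store rather than computed in application code; the brute-force loop here only illustrates what the index approximates.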

Ingestion Pipeline

Engium's ingestion pipeline accepts PDFs, Markdown, DOCX, and web URLs. Each document is parsed, split into overlapping chunks, embedded with gemini-embedding-001 (768 dimensions), and stored in pgvector.

ingestion-pipeline.py
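# Engium knowledge ingestion
# Assumes `splitter`, `gemini`, `db`, and the `knowledge_chunks` table are
# configured elsewhere, and that this runs inside an async function.
# `insert` is assumed to be SQLAlchemy's insert construct.
from sqlalchemy import insert

# Split the parsed document into overlapping chunks
chunks = splitter.split(document, chunk_size=512, overlap=64)
# Embed every chunk; the zip below relies on results coming back in chunk order
embeddings = gemini.embed(chunks, model="gemini-embedding-001")

# Bulk-insert one row per chunk, scoped to the tenant
await db.execute(
    insert(knowledge_chunks).values([
        {"content": c, "embedding": e, "tenant_id": tenant_id}
        for c, e in zip(chunks, embeddings)
    ])
)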

Chunking Strategy

Chunk size is often the most consequential hyperparameter in a RAG pipeline. Chunks that are too large dilute the embedding signal; chunks that are too small lose context. For FAQ-style content, 256–512 tokens with a 64-token overlap works reliably.
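
The sliding-window idea behind overlapping chunks can be sketched as follows. This is an illustration under assumptions, not Engium's splitter: `split_with_overlap` is a hypothetical helper, and it operates on an already-tokenized list, so the token counts only match the figures above if you tokenize with the embedding model's tokenizer.

```python
def split_with_overlap(tokens, chunk_size=512, overlap=64):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks

# Integers stand in for tokens so the overlap is easy to inspect
tokens = list(range(1000))
chunks = split_with_overlap(tokens, chunk_size=512, overlap=64)
```

With these defaults each window advances 448 tokens, so the last 64 tokens of one chunk reappear at the start of the next and a sentence cut at a boundary still appears whole in at least one chunk.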

Hierarchical chunking, where document summaries are stored alongside paragraph-level chunks, can markedly improve recall on complex multi-part questions.
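
One way to represent a two-level hierarchy is to store a summary record per document and link each paragraph chunk back to it. The sketch below assumes the summary text is produced elsewhere (for example by an LLM call); `ChunkRecord` and `build_hierarchy` are hypothetical names, not Engium's schema.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ChunkRecord:
    doc_id: str
    level: str                          # "summary" or "paragraph"
    content: str
    parent_index: Optional[int] = None  # position of the summary record, for paragraphs

def build_hierarchy(doc_id: str, summary: str, paragraphs: List[str]) -> List[ChunkRecord]:
    """Return one summary record followed by its non-empty paragraph records."""
    records = [ChunkRecord(doc_id, "summary", summary)]
    for paragraph in paragraphs:
        text = paragraph.strip()
        if text:  # skip blank paragraphs
            records.append(ChunkRecord(doc_id, "paragraph", text, parent_index=0))
    return records

records = build_hierarchy(
    "doc-1",
    "Refund policy overview",
    ["Refunds are accepted within 30 days.", "   ", "Contact support to start a refund."],
)
```

Both levels would then be embedded and indexed together, so a broad question can match the summary while a narrow one matches a specific paragraph.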

Embedding Strategy

Embedding model choice affects both retrieval quality and cost. Engium defaults to gemini-embedding-001 at 768 dimensions rather than OpenAI's text-embedding-3-small (1536 dimensions). The smaller vectors reduce storage costs and speed up ANN queries without measurable recall degradation on business content.
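
The storage side of that trade-off is simple arithmetic. The sketch below assumes single-precision floats (4 bytes per dimension) and counts only the raw vector payload, ignoring per-row and index overhead; the one-million-chunk corpus is an illustrative figure, not an Engium number.

```python
BYTES_PER_FLOAT32 = 4

def raw_vector_bytes(num_vectors, dimensions):
    """Raw payload size of float32 vectors, excluding row and index overhead."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32

num_chunks = 1_000_000  # hypothetical corpus size
bytes_768 = raw_vector_bytes(num_chunks, 768)
bytes_1536 = raw_vector_bytes(num_chunks, 1536)

print(f"768 dims:  {bytes_768 / 1e9:.2f} GB")
print(f"1536 dims: {bytes_1536 / 1e9:.2f} GB")
```

Halving the dimensionality halves the raw payload, and distance computations during search scale with dimension count in the same way.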

"The jump from keyword search to semantic search is not incremental — it's categorical. Users phrase things in ten different ways, and only embedding-based retrieval handles that reliably."

Retrieval Tuning

HNSW indexes (used for FAQ items in Engium) deliver sub-millisecond lookups on datasets up to several million vectors. Tune the similarity threshold as well: 0.78 is the default, but domain-specific content often benefits from raising it to 0.82 to reduce false-positive retrievals.
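
As a rough illustration of what the threshold does, the sketch below filters scored matches after retrieval. It assumes scores are cosine similarities in the range the article describes; `filter_by_threshold` and the sample matches are hypothetical.

```python
def filter_by_threshold(matches, threshold=0.78):
    """Keep (chunk, score) pairs scoring at or above the threshold, best first."""
    kept = [(chunk, score) for chunk, score in matches if score >= threshold]
    kept.sort(key=lambda match: match[1], reverse=True)
    return kept

# Made-up retrieval results for demonstration only
matches = [("office hours", 0.74), ("refund policy", 0.80), ("pricing tiers", 0.91)]

default_hits = filter_by_threshold(matches)                 # threshold 0.78
strict_hits = filter_by_threshold(matches, threshold=0.82)  # stricter cutoff
```

Raising the cutoff from 0.78 to 0.82 drops the borderline "refund policy" match here. If you filter inside pgvector instead, note that its `<=>` operator returns cosine distance, so a similarity floor of 0.78 corresponds to a distance ceiling of about 0.22.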

  1. Use HNSW indexes for FAQ retrieval (fast, high recall)
  2. Use IVFFlat for large document sets (cheaper to build, good for batch)
  3. Re-embed stale content monthly as your knowledge base evolves
  4. Monitor mean similarity scores; a drop signals knowledge base drift
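
The last item in the list above can be sketched as a comparison of mean top-match scores between a baseline window and a recent one. The `similarity_drift` helper, the sample scores, and the 0.05 alert margin are all assumptions for illustration; pick a margin based on how much your own scores normally vary.

```python
from statistics import mean

def similarity_drift(baseline_scores, recent_scores, max_drop=0.05):
    """Compare mean similarity across two windows and flag a significant drop."""
    baseline = mean(baseline_scores)
    recent = mean(recent_scores)
    drop = baseline - recent
    return {
        "baseline": baseline,
        "recent": recent,
        "drop": drop,
        "drifting": drop > max_drop,  # True when scores fell by more than the margin
    }

# Made-up top-match scores from two periods
last_month = [0.86, 0.84, 0.85]
this_week = [0.78, 0.76, 0.77]
report = similarity_drift(last_month, this_week)
```

A flagged drop is a prompt to check whether recent questions cover topics the knowledge base no longer describes well, which is usually the cue to re-embed or add content.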
