Knowledge Base & RAG

Orbitali provides a built-in Retrieval-Augmented Generation (RAG) system. You can upload company policy sheets, FAQs, or product catalogs directly, and the agent draws on their content when answering caller questions.

This eliminates the need to build a custom vector database, handle document chunking, or create external search APIs.

Ingesting Documents

You can manage knowledge documents through the Knowledge Base tab in the developer dashboard.

Supported formats

Markdown (.md): Parsed directly to plain text.
PDF (.pdf): Text is extracted programmatically.

Ingestion Pipeline

When a document is uploaded:

The raw file is stored in a secure Amazon S3 bucket (orbitali-knowledge).
The file is split into overlapping text chunks:
- Chunk Size: Approximately 600 tokens.
- Overlap: 100 tokens (to ensure contextual continuity at boundaries).
Each chunk is sent to the Google Vertex AI text-embedding-004 API, which returns a 768-dimensional embedding vector.

The chunks and embeddings are stored in PostgreSQL inside the knowledge_chunks table, which is indexed using a pgvector IVFFlat index:

CREATE INDEX knowledge_chunks_embedding_idx
  ON knowledge_chunks
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);

Once processing completes, the document's status changes to ready, and the knowledge base becomes active for the agent.
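
The chunking step described above can be sketched as a sliding window. This is only an illustration, not Orbitali's implementation: the helper names `chunk_tokens` and `chunk_text` are hypothetical, and whitespace splitting stands in for whichever tokenizer the pipeline really uses, which this page does not specify.

```python
from typing import List

CHUNK_SIZE = 600  # tokens per chunk, as stated in the pipeline description
OVERLAP = 100     # tokens shared between neighbouring chunks


def chunk_tokens(tokens: List[str], size: int = CHUNK_SIZE,
                 overlap: int = OVERLAP) -> List[List[str]]:
    """Split a token list into windows of `size` tokens overlapping by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap  # each new window starts this many tokens later
    chunks: List[List[str]] = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the document
        start += step
    return chunks


def chunk_text(text: str) -> List[str]:
    """Chunk raw text and return each chunk as a string."""
    # ASSUMPTION: whitespace splitting is a stand-in for the real tokenizer.
    return [" ".join(chunk) for chunk in chunk_tokens(text.split())]
```

With these defaults each window begins 500 tokens after the previous one, so the last 100 tokens of a chunk reappear at the start of the next. A sentence that straddles a boundary is therefore still retrievable in full from at least one chunk.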

The `search_knowledge` System Tool

As soon as an agent has at least one ready document, the Go voice runtime automatically injects the search_knowledge tool into the Gemini Live session.

You do not need to register this tool manually. Orbitali executes it internally, and it never calls your backend Server URL.

Tool Contract

The tool is exposed to the model with the following function declaration:

{
  "name": "search_knowledge",
  "description": "Search the agent's knowledge base for information relevant to the caller's question. Use this before saying you don't know something.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The question or topic to search for."
      }
    },
    "required": ["query"]
  }
}
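
To make the contract concrete, the sketch below checks a function call's arguments against the declaration above. It is a hypothetical helper for illustration only: `validate_arguments` is not part of Orbitali, and the page does not say whether or how the runtime validates arguments.

```python
from typing import Any, Dict, List

# Copy of the search_knowledge declaration shown above.
SEARCH_KNOWLEDGE_DECLARATION: Dict[str, Any] = {
    "name": "search_knowledge",
    "description": (
        "Search the agent's knowledge base for information relevant to the "
        "caller's question. Use this before saying you don't know something."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The question or topic to search for.",
            }
        },
        "required": ["query"],
    },
}


def validate_arguments(declaration: Dict[str, Any],
                       arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments conform."""
    params = declaration["parameters"]
    problems: List[str] = []
    # Every required parameter must be present.
    for name in params.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    # Every supplied argument must be declared and, if a string, typed as one.
    for name, value in arguments.items():
        spec = params["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
        elif spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"argument {name} must be a string")
    return problems
```

Because the declaration has a single required string parameter, any conforming call carries exactly one `query` value.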

Runtime Query Flow

When a caller asks a question that requires knowledge lookup:

Model Invocation: The Gemini Live session decides that it needs a knowledge lookup and emits a search_knowledge function call with a query string:

{
  "name": "search_knowledge",
  "arguments": { "query": "What is your cancellation policy?" }
}

Embedding Generation: The Go agent intercepts the call and embeds the query string using text-embedding-004.
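Vector Database Query: The agent runs a cosine similarity search against the PostgreSQL database, scoped to the current agent ID. The <=> operator is pgvector's cosine distance, so sorting in ascending order returns the most similar chunks first: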
```
SELECT content FROM knowledge_chunks
WHERE agent_id = $1
ORDER BY embedding <=> $2
LIMIT 5;
```
Context Injection: The text content of the top 5 chunks is merged and returned as a single string payload to the model.
Speech Synthesis: The model synthesizes an audio response using the retrieved facts and speaks it back to the caller.
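
The retrieval and context-injection steps above can be mimicked in memory with the following sketch. It rests on several assumptions: rows are plain dictionaries rather than PostgreSQL records, the query embedding is passed in instead of being fetched from text-embedding-004, the blank-line separator is a guess (the page only says the chunks are merged), and the ranking is exact, whereas an IVFFlat index performs an approximate search.

```python
import math
from typing import Any, Dict, List, Sequence

TOP_K = 5  # matches LIMIT 5 in the SQL query


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity, the quantity pgvector's <=> operator computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def search_knowledge(rows: List[Dict[str, Any]], agent_id: str,
                     query_embedding: Sequence[float], k: int = TOP_K,
                     separator: str = "\n\n") -> str:
    """Return the k nearest chunks for one agent, merged into a single string."""
    # WHERE agent_id = $1
    candidates = [row for row in rows if row["agent_id"] == agent_id]
    # ORDER BY embedding <=> $2 (smallest distance first)
    candidates.sort(
        key=lambda row: cosine_distance(row["embedding"], query_embedding)
    )
    # LIMIT k, then merge the chunk text (ASSUMPTION: blank-line separator)
    return separator.join(row["content"] for row in candidates[:k])
```

Filtering by agent before ranking mirrors the WHERE clause and keeps one agent's documents from surfacing in another agent's answers.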