Orbitali Docs

Concepts

The core architectural resources and terminology of the Orbitali platform.

Understanding the following primary resources and behaviors will help you design and build voice experiences with Orbitali.


Agents

An Agent is the central voice AI configuration. It defines:

  • The conversational persona (the system prompt)
  • The language and voice (e.g., Google Gemini's regional voice assets)
  • The system and developer tools available to the model during the call
  • The linked carrier phone numbers that route to it

Phone Numbers & BYO Carrier

Orbitali uses a Bring Your Own Carrier (BYOC) model. Rather than purchasing numbers from Orbitali, developers lease numbers in their own carrier accounts (Twilio or Telnyx).

When you point the carrier's inbound webhook URL at Orbitali's ingress nodes, the carrier routes incoming calls to Orbitali. The platform maps the dialed number to its assigned agent ID and starts the voice session.
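
The number-to-agent mapping can be pictured as a simple lookup. This is only an illustrative sketch, not Orbitali's internal implementation: the table contents, the agent IDs, and the function name resolve_agent are all hypothetical.

```python
from typing import Optional

# Hypothetical routing table: dialed E.164 number -> assigned agent ID.
NUMBER_TO_AGENT = {
    "+15550100": "agent_support",
    "+15550101": "agent_sales",
}


def resolve_agent(dialed_number: str) -> Optional[str]:
    """Return the agent ID assigned to the dialed number, or None if unassigned."""
    # Stripping whitespace is an assumption about how carrier input is cleaned up.
    return NUMBER_TO_AGENT.get(dialed_number.strip())
```

A call to a number with no entry resolves to None, which a real system would presumably reject or route to a fallback.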


Prompts (Static vs. Dynamic)

Static Prompts

Static prompts are stored directly in the Orbitali database. To make authoring easier, each static prompt is split into two fields:

  • Identity: Defines who the agent is (e.g., persona, tone, gender, name).
  • Instructions: Defines what the agent does (e.g., business hours, triage rules, appointment policies).

At call startup, Orbitali combines these into a single system instruction payload:

{identity}

{instructions}
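
The template above amounts to joining the two fields with one blank line. A minimal sketch, assuming surrounding whitespace is trimmed first (the source does not say whether Orbitali does this); the function name is hypothetical:

```python
def build_system_instruction(identity: str, instructions: str) -> str:
    """Join the two static prompt fields with a single blank line between them."""
    # Trimming each field is an assumption, not documented Orbitali behavior.
    return f"{identity.strip()}\n\n{instructions.strip()}"
```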

Dynamic Prompts

If your agent needs customer-specific context (e.g., calling the user by name or checking their membership tier), use Dynamic Prompts.

Before answering the call, the Go agent sends an agent:assistant-request webhook (an HTTP POST) to your Server URL with the caller's metadata. Your server looks up the customer's records and returns a customized prompt, which overrides the agent's identity and instructions, along with an optional custom greeting.
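
The server side of this exchange might look roughly like the handler below. It is a sketch under stated assumptions: the payload and response field names ("type", "caller_number", "prompt", "greeting") are hypothetical and should be checked against the actual webhook reference, and CUSTOMERS stands in for a real customer database.

```python
from typing import Any, Dict

# Stand-in for a real customer database, keyed by caller number (E.164).
CUSTOMERS = {"+15550123": {"name": "Dana", "tier": "gold"}}

DEFAULT_PROMPT = "You are a friendly support agent for Example Co."


def handle_assistant_request(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Build a per-caller prompt and optional greeting for one webhook request."""
    # ASSUMPTION: all field names here are illustrative, not the documented schema.
    if payload.get("type") != "agent:assistant-request":
        raise ValueError("unexpected webhook type")

    customer = CUSTOMERS.get(payload.get("caller_number", ""))
    if customer is None:
        # Unknown caller: return a generic prompt and omit the optional greeting.
        return {"prompt": DEFAULT_PROMPT}

    prompt = (
        f"{DEFAULT_PROMPT}\n"
        f"The caller is {customer['name']}, a {customer['tier']}-tier member."
    )
    return {"prompt": prompt, "greeting": f"Hi {customer['name']}, thanks for calling!"}
```

Because this runs before the call is answered, the lookup should be fast; a slow handler would delay pickup.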


Tools

System Tools

System tools are built-in actions handled natively by the Go agent. They do not trigger webhook calls, so they add no latency from a round trip to your server:

  • hang_up: Always exposed. Allows the agent to end the telephony leg gracefully when the call finishes.
  • transfer_call: Exposed if the agent has a transfer_destination_e164 number configured. Executes a carrier-level transfer to hand the call off to a human agent.
  • search_knowledge: Automatically exposed if the agent has active documents in its Knowledge Base. Connects the model to local RAG retrieval.
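
The exposure rules in the list above can be summarized as a small function. This is a sketch, not the Go agent's actual code: the AgentConfig class and its active_document_count field are hypothetical, while transfer_destination_e164 and the tool names come from the list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AgentConfig:
    transfer_destination_e164: Optional[str] = None
    active_document_count: int = 0  # hypothetical field: active Knowledge Base documents


def exposed_system_tools(agent: AgentConfig) -> List[str]:
    """Return the system tools the model would see for this agent."""
    tools = ["hang_up"]  # always exposed
    if agent.transfer_destination_e164:
        tools.append("transfer_call")  # requires a configured transfer number
    if agent.active_document_count > 0:
        tools.append("search_knowledge")  # requires active Knowledge Base documents
    return tools
```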

Developer Tools

Developer tools are custom functions that you register on the agent (using standard JSON Schema parameters). When the model invokes a developer tool, Orbitali halts synthesized speech and sends an agent:tool-call HTTP POST webhook to your Server URL. Your server processes the request and returns a JSON response, which is fed back to the model to continue the conversation.
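
As an illustration, a developer tool and its webhook handler might look like the sketch below. The parameters object is ordinary JSON Schema, but everything else is an assumption: the tool name check_order_status, the definition's outer keys, and the payload fields ("type", "tool_name", "arguments") are hypothetical and not taken from Orbitali's API reference.

```python
from typing import Any, Dict

# Hypothetical tool definition; only the "parameters" value is standard JSON Schema.
CHECK_ORDER_TOOL = {
    "name": "check_order_status",
    "description": "Look up the shipping status of an order.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string", "description": "Order number given by the caller."}
        },
        "required": ["order_id"],
    },
}

# Stand-in for a real order system.
ORDERS = {"A-1001": "shipped"}


def handle_tool_call(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return the JSON-serializable result that is fed back to the model."""
    # ASSUMPTION: payload field names are illustrative, not the documented schema.
    if payload.get("type") != "agent:tool-call":
        raise ValueError("unexpected webhook type")
    if payload.get("tool_name") != CHECK_ORDER_TOOL["name"]:
        return {"error": f"unknown tool: {payload.get('tool_name')}"}

    order_id = payload.get("arguments", {}).get("order_id")
    status = ORDERS.get(order_id)
    if status is None:
        return {"error": "order not found"}
    return {"order_id": order_id, "status": status}
```

Returning a structured error instead of raising lets the model explain the problem to the caller. Since speech is halted while the webhook runs, the handler should respond quickly.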


Knowledge Base (RAG)

The Knowledge Base is a native Retrieval-Augmented Generation (RAG) system. You can upload company documents (Markdown or PDF) directly.

  • Processing: Orbitali extracts the raw text, splits it into overlapping chunks (~600 tokens per chunk with 100-token overlap), and calls Google's text-embedding-004 to create 768-dimension vectors.
  • Retrieval: Chunks are stored in a PostgreSQL database using pgvector. During calls, if the model invokes the search_knowledge tool, the agent embeds the query and performs a fast cosine similarity search, returning the top matches to the model context.
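
The two steps above can be sketched in plain Python. This is an illustration under assumptions, not Orbitali's pipeline: the tokens are any pre-tokenized sequence (the real tokenizer is not specified), the vectors are tiny stand-ins for 768-dimension embeddings, and the ranking is done in memory rather than in PostgreSQL, where pgvector's `<=>` cosine distance operator would typically do this work.

```python
import math
from typing import List, Sequence, Tuple


def chunk_tokens(tokens: Sequence[str], chunk_size: int = 600, overlap: int = 100) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens with the previous window."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the document
    return chunks


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query: Sequence[float], chunk_vecs: Sequence[Sequence[float]], k: int = 3) -> List[Tuple[int, float]]:
    """Return (chunk index, similarity) pairs for the k most similar chunks."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

With the default sizes, each new chunk starts 500 tokens after the previous one, so a 1,300-token document yields three chunks of 600, 600, and 300 tokens.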

Realtime Web Sessions

For browser-based calling, you can create ephemeral WebSocket voice sessions:

  1. Security: Your backend makes an authorized POST request to /public/v1/agents/{id}/realtime-sessions and receives an ephemeral token that expires in 60 seconds. This allows browser calling without exposing your primary API keys.
  2. WebSocket Streaming: The client browser establishes a direct WebSocket connection to the voice runtime. The client streams raw mono PCM16 16 kHz input audio to the server and receives back synthesized PCM16 24 kHz output audio to play.
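
The audio formats above imply fixed byte rates, which help when sizing buffers on the client. The sketch below works through that arithmetic and shows one way to convert float samples to PCM16. The 20 ms frame length and little-endian byte order are assumptions; the source specifies only mono PCM16 at 16 kHz in and 24 kHz out.

```python
import struct
from typing import Sequence

BYTES_PER_SAMPLE = 2  # PCM16 means 16 bits (2 bytes) per sample


def frame_bytes(sample_rate_hz: int, frame_ms: int, channels: int = 1) -> int:
    """Size in bytes of one audio frame of the given duration."""
    return sample_rate_hz * frame_ms // 1000 * channels * BYTES_PER_SAMPLE


def floats_to_pcm16(samples: Sequence[float]) -> bytes:
    """Convert samples in [-1.0, 1.0] to signed 16-bit PCM (little-endian assumed)."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    return struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))
```

Under these assumptions, a 20 ms input frame at 16 kHz is 640 bytes and a 20 ms output frame at 24 kHz is 960 bytes, or 32,000 and 48,000 bytes per second respectively.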

Calls & Transcripts

A Call represents an active or completed voice interaction.

  • Status Lifecycle: A call starts as initiated, changes to answered and in_progress once connected, and ends as completed, failed (e.g., model or carrier failure), or missed.
  • Transcripts: Speech transcripts are stored in the database as a sequence of dialogue turns (with a role of caller or assistant).
  • Observability: Every custom tool invocation is logged with its full request headers, request body, response status, response body, latency, and errors for auditability.
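
The status lifecycle can be modeled as a small state machine, which is handy when validating call records on your side. The status names come from the list above, but the exact set of allowed transitions is an assumption; the source does not spell out, for example, whether a call can fail directly from answered.

```python
# Statuses after which a call no longer changes.
TERMINAL_STATUSES = {"completed", "failed", "missed"}

# ASSUMPTION: this transition table is inferred from the prose, not documented.
ALLOWED_TRANSITIONS = {
    "initiated": {"answered", "failed", "missed"},
    "answered": {"in_progress", "failed"},
    "in_progress": {"completed", "failed"},
}


def can_transition(current: str, new: str) -> bool:
    """True if moving from `current` to `new` is allowed by the assumed table."""
    return new in ALLOWED_TRANSITIONS.get(current, set())


def is_terminal(status: str) -> bool:
    """True if the status marks the end of a call."""
    return status in TERMINAL_STATUSES
```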
