Welcome to the CallmAI Agentic AI Platform
CallmAI Agents
Realtime framework for production-grade multimodal and voice AI agents.
Introduction
The Agents framework lets you add agents and client participants to, or remove them from, a full realtime conversation on the platform. The API includes a complete set of tools and abstractions that make it easy to feed realtime media and data through an AI pipeline that works with multiple languages, and to publish realtime results back to the client.
Use cases
Some applications for agents include:
- Multimodal assistant: Talk to, or text an AI assistant.
- Telehealth: Bring AI into realtime telemedicine consultations, with or without humans in the loop.
- Call center: Deploy AI to the front lines of customer service with inbound and outbound call support.
- Realtime language learning: Practice conversations in different languages in realtime.
- NPCs: Add lifelike non-player characters (NPCs) to video games, backed by language models instead of static scripts. NPCs typically interact with the player in some way, often providing quests, dialogue, or other information.
The following examples demonstrate some of these use cases:
- Medical Office Receptionist: An agent that has engaging conversations with patients based on their symptoms and medical history.
- Restaurant Agent: A restaurant front-of-house agent that can take orders, add items to a shared cart, and checkout.
- Concierge Agent: An AI voice agent that acts as a smart, always-on assistant, answering inquiries, booking showings, and guiding potential clients at any time, day or night.
- Travel Agency Agent: A travel agent that quickly answers questions about bookings, cancellations, documents, and changes. It is equipped to respond instantly and around the clock.
Framework overview
flowchart TD
classDef adminActions fill:#4a5568,stroke:#2d3748,color:#e2e8f0
classDef systemActions fill:#553c9a,stroke:#44337a,color:#e9d8fd
classDef customerActions fill:#2c5282,stroke:#2a4365,color:#bee3f8
%% Admin Setup Flow
A[Company Administrator] -->|1. Sign Up| B[Provide Company Info]
B -->|2. Upload| C[Knowledge Base]
C -->|3. Select| D[Configure Agent]
D -->|Set Phone Number| E[Assign Phone Number]
D -->|Set Language| F[Configure Initial Language]
E --> G[4. Start Agent]
F --> G
%% Distribution Flow
G -->|5a. Share Link| H[Add to Website UI]
G -->|5b. Share Phone| I[Publish Phone Number]
%% Customer Interaction Flow
J[Customer] -->|Web Access| H
J -->|Phone Call| I
H -->|6a. Web Interaction| K[AI Agent]
I -->|6b. Voice Interaction| K
%% Styling
class A,B,C,D,E,F,G,H,I adminActions
class J,K customerActions
%% Subgraphs for visual grouping
subgraph "Admin Setup Phase"
A
B
C
D
E
F
G
end
subgraph "Customer Access Points"
H
I
end
subgraph "Interaction Phase"
J
K
end
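The admin setup phase in the diagram above can be modelled as a small checklist. The following is a minimal sketch, not the CallmAI API: the `AgentSetup` class, its field names, and the rule that the knowledge base, phone number, and language must all be set before the agent starts are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AgentSetup:
    """Hypothetical model of the admin setup steps; not the CallmAI API."""
    company_name: str                                        # step 1: sign up
    knowledge_base: List[str] = field(default_factory=list)  # step 2: uploads
    phone_number: Optional[str] = None                       # step 3: configure
    language: Optional[str] = None                           # step 3: configure
    started: bool = False

    def upload(self, document: str) -> None:
        # Step 2: add a document name to the knowledge base.
        self.knowledge_base.append(document)

    def missing_steps(self) -> List[str]:
        # Report which setup steps are still incomplete.
        missing = []
        if not self.knowledge_base:
            missing.append("knowledge base")
        if self.phone_number is None:
            missing.append("phone number")
        if self.language is None:
            missing.append("language")
        return missing

    def start(self) -> None:
        # Step 4: start only when every earlier step is complete (assumed rule).
        missing = self.missing_steps()
        if missing:
            raise ValueError("cannot start agent, missing: " + ", ".join(missing))
        self.started = True


# Example values are placeholders.
setup = AgentSetup(company_name="Example Clinic")
setup.upload("faq.pdf")
setup.phone_number = "+15550100"
setup.language = "en"
setup.start()
```

Calling `start()` before every step is complete raises a `ValueError` that names the missing steps, which mirrors the ordering of the arrows in the diagram.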
Our Agent operates as a stateful, realtime bridge between powerful AI models and your users. While AI models typically run in data centers with reliable connectivity, users often connect from mobile networks with varying quality.
WebRTC ensures smooth communication between agents and users, even over unstable connections. CallmAI WebRTC is used between the frontend and the agent, while the agent communicates with our servers using HTTP requests and WebSockets. This setup provides the benefits of WebRTC without its typical complexity.
The platform includes components for handling the core challenges of realtime voice AI, such as streaming audio through a voice pipeline, reliable turn detection, handling interruptions, and LLM orchestration.
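To make turn detection and interruption handling concrete, here is a toy simulation. It is a sketch under stated assumptions: the energy threshold, the silence-frame count, and the names `TurnDetector` and `run_pipeline` are illustrative and do not describe how the platform's own components work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TurnDetector:
    """Toy energy-based turn detector (an assumption, not CallmAI's detector)."""
    threshold: float = 0.5    # energy at or above this counts as speech
    silence_frames: int = 3   # consecutive quiet frames that end a turn
    _in_turn: bool = False
    _quiet: int = 0

    def push(self, energy: float) -> str:
        """Consume one audio frame's energy and return an event name."""
        if energy >= self.threshold:
            self._quiet = 0
            if not self._in_turn:
                self._in_turn = True
                return "turn_started"
            return "speaking"
        if self._in_turn:
            self._quiet += 1
            if self._quiet >= self.silence_frames:
                self._in_turn = False
                self._quiet = 0
                return "turn_ended"
            return "pause"
        return "idle"


def run_pipeline(energies: List[float]) -> List[str]:
    """Drive the detector and log what a voice agent would do per frame."""
    detector = TurnDetector()
    agent_speaking = False
    log: List[str] = []
    for energy in energies:
        event = detector.push(energy)
        if event == "turn_started" and agent_speaking:
            # Barge-in: the user spoke over the agent, so stop playback.
            agent_speaking = False
            log.append("interrupted")
        elif event == "turn_ended":
            # The user finished; an LLM reply would be generated here.
            agent_speaking = True
            log.append("agent_replies")
    return log


events = run_pipeline([0.9, 0.8, 0.1, 0.1, 0.1, 0.0, 0.9, 0.1, 0.1, 0.1])
# events == ["agent_replies", "interrupted", "agent_replies"]
```

A production detector would typically use a voice-activity model rather than raw energy, but the control flow (detect end of turn, reply, stop on barge-in) has the same shape.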
Other framework features include:
- Voice, video, and text: We build agents that can process realtime input and produce output in any modality.
- Multi-agent handoff: Break down complex workflows into simpler tasks.
- Extensive integrations: Integrate with nearly every service using APIs.
- Made for developers: Control your agents through simple API calls, without worrying about configuration or architecture.
- Production ready: Includes built-in agent orchestration, load balancing, and detailed statistics.
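Multi-agent handoff, listed above, can be pictured as a routing loop in which each agent either answers or names the agent that should take over. This is an illustrative sketch only: the `greeter` and `billing` agents, the `AGENTS` registry, and the `handle` function are hypothetical and are not part of the platform's API.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Each agent takes the user's message and returns (reply, next_agent_name).
# A next_agent_name of None means the agent keeps the conversation.
AgentFn = Callable[[str], Tuple[str, Optional[str]]]


def greeter(message: str) -> Tuple[str, Optional[str]]:
    # Route invoice questions to a specialist; handle everything else directly.
    if "invoice" in message.lower():
        return "Let me connect you to billing.", "billing"
    return "How can I help you today?", None


def billing(message: str) -> Tuple[str, Optional[str]]:
    # The specialist answers and keeps the conversation.
    return "I can help with your invoice.", None


AGENTS: Dict[str, AgentFn] = {"greeter": greeter, "billing": billing}


def handle(message: str, current: str = "greeter") -> Tuple[List[str], str]:
    """Follow handoffs until an agent keeps the conversation."""
    replies: List[str] = []
    seen = set()
    while current not in seen:  # guard against handoff cycles
        seen.add(current)
        reply, next_agent = AGENTS[current](message)
        replies.append(reply)
        if next_agent is None:
            break
        current = next_agent
    return replies, current


replies, final_agent = handle("I have a question about my invoice")
# final_agent == "billing"
```

Keeping each agent small and letting it name its successor is one way to break a complex workflow into simpler tasks.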
How our agents connect to your clients
sequenceDiagram
participant Agent as Agent
participant Server as CallmAI Server
participant Session as CallmAI Session
participant User as User
participant Frontend as Frontend App
participant Phone as Phone
Note over Agent,Server: Agent Registration Phase
Agent->>Server: Register as available agent
Server-->>Agent: Acknowledge registration
Note over User,Server: Session Initiation Phase
User->>Server: Request session
Server->>Session: Create new session
Server->>Agent: Dispatch to session
Agent->>Session: Join session
Note over User,Session: User Connection Phase
alt Web/App Connection
User->>Frontend: Open application
Frontend->>Session: Connect via WebRTC
else Phone Connection
User->>Phone: Dial in
Phone->>Session: Connect via telephony
end
Note over Agent,User: Realtime Communication
Session->>Agent: User joined notification
Agent->>User: Initial greeting
loop Active Session
User->>Agent: Send audio/video/text
Agent->>User: Process and respond
end
When your agent starts, it first registers with a CallmAI server. The agent then waits until a real user starts and joins a session.
After your agent and user are both in a session, the agent and your frontend app can communicate using CallmAI WebRTC. This enables reliable and fast realtime communication under any network conditions. CallmAI also includes full support for telephony, so the user can join the call from a phone instead of a frontend app.
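The connection lifecycle described above (registration, session creation, dispatch, user join, greeting) can be sketched as an in-memory simulation. Everything here is an assumption made for illustration: the `Server` and `Session` classes, their method names, and the transport labels `"webrtc"` and `"telephony"` are hypothetical and do not represent the CallmAI API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Session:
    session_id: int
    participants: List[str] = field(default_factory=list)
    transcript: List[str] = field(default_factory=list)


class Server:
    """Hypothetical in-memory stand-in for the server's dispatch role."""

    def __init__(self) -> None:
        self.available_agents: List[str] = []
        self.sessions: Dict[int, Session] = {}

    def register(self, agent_name: str) -> None:
        # Agent registration phase: the agent announces it is available.
        self.available_agents.append(agent_name)

    def request_session(self) -> Session:
        # Session initiation phase: create a session and dispatch an agent.
        if not self.available_agents:
            raise RuntimeError("no registered agent is available")
        session = Session(session_id=len(self.sessions) + 1)
        self.sessions[session.session_id] = session
        agent = self.available_agents.pop(0)
        session.participants.append(agent)
        return session

    def join(self, session: Session, user: str, via: str) -> None:
        # User connection phase: the user joins over WebRTC or telephony.
        if via not in ("webrtc", "telephony"):
            raise ValueError("unsupported transport: " + via)
        session.participants.append(user)
        # Realtime communication: the agent greets the user once they join.
        agent = session.participants[0]
        session.transcript.append(f"{agent}: Hello {user}!")


server = Server()
server.register("agent-1")
session = server.request_session()
server.join(session, "alice", via="telephony")
```

In this sketch the agent is already in the session before the user connects, so the greeting can be sent as soon as the user joins, whether from the frontend app or from a phone.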