AI voice agent glossary

Plain-English definitions for the terms you'll see across AI voice technology, telephony, and call automation. Useful whether you're evaluating Stellar, comparing vendors, or just trying to make sense of an industry that loves its acronyms.


ASR (Automatic Speech Recognition)

ASR is the technology that turns spoken audio into text, so an AI system can read what a caller is actually saying. Modern ASR handles accents, background noise, and natural speech patterns well enough for live phone conversations. In Stellar, ASR is part of the native speech-to-speech model running every call, so transcription happens in real time without a separate pipeline step.

Agent

An AI voice agent is a virtual assistant configured to handle phone conversations for a specific business workflow, like qualifying leads or booking appointments. Each agent has its own persona, voice, knowledge base, and set of tools it can invoke during a call. In Stellar, you build agents visually in a no-code agent builder and can run multiple agents per account on a single subscription.

See the agent builder

Conversational AI

Conversational AI describes any AI system designed to hold natural, human-like dialogue with people over voice or text. The core ingredients are a large language model for reasoning, a speech recognition layer for understanding voice input, and a text-to-speech layer for replying. AI voice agents like Stellar are conversational AI built specifically for the phone, with the additional layers needed to handle telephony, calendars, and CRMs.

DTMF (Dual-Tone Multi-Frequency)

DTMF is the signaling system behind the beeps you hear when you press a key on a phone keypad: each key sends a pair of tones, one low-frequency and one high-frequency. Legacy phone systems use DTMF for menu navigation, like "press 1 for sales, press 2 for support." Modern AI voice agents replace DTMF menus with natural-language conversation, so a caller can just say what they need and the agent handles the routing without anyone pressing a button.
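
To make the "dual-tone" part concrete, here is a minimal Python sketch of the standard keypad frequency grid. The row and column frequencies are the standard DTMF values; the function name dtmf_frequencies is just an illustrative choice, not part of any Stellar API.

```python
# Standard DTMF grid: every key combines one row tone with one column tone.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str) -> tuple[int, int]:
    """Return the (low, high) tone pair in Hz for a single keypad key."""
    if len(key) == 1:
        for row, keys in enumerate(KEYPAD):
            col = keys.find(key)
            if col != -1:
                return ROW_HZ[row], COL_HZ[col]
    raise ValueError(f"not a DTMF key: {key!r}")


print(dtmf_frequencies("1"))  # (697, 1209)
```

An IVR listens for these tone pairs to work out which key was pressed; a voice agent skips the step entirely and works from the caller's words.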

Embeddings

Embeddings are numerical vector representations of text that capture meaning rather than just keywords. Two sentences that say the same thing in different words end up close to each other in vector space, even if they share no common words. In Stellar, embeddings power the knowledge base so the agent can find the right passage in your docs even when the caller phrases the question completely differently.
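
A rough Python sketch of the idea, using cosine similarity to compare vectors. The three-dimensional vectors below are hand-made for illustration only; real embeddings come from a model and have hundreds or thousands of dimensions, and nothing here reflects how Stellar stores or searches them.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical passage vectors (made up for this example).
passages = {
    "Refunds are issued within 30 days of purchase.": [0.9, 0.1, 0.0],
    "We are open Monday to Friday, 9 to 5.": [0.1, 0.9, 0.1],
}

# Hypothetical vector for the caller question "Can I get my money back?"
query_vector = [0.8, 0.2, 0.1]

best = max(passages, key=lambda p: cosine_similarity(query_vector, passages[p]))
print(best)  # Refunds are issued within 30 days of purchase.
```

The question and the refund passage share no keywords, yet their vectors point in nearly the same direction, which is why the lookup still lands on the right passage.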

See the knowledge base

Function Calling

Function calling is the ability for an AI model to invoke an external tool or API during a live conversation, such as checking calendar availability or sending an email. The model decides when to call which function based on the conversation flow, then uses the result in its next response. In Stellar, function calling is what lets the agent actually book the appointment or send the follow-up email mid-call.
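
The loop can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the tool name check_availability, the JSON shape of the model's request, and the canned calendar data are hypothetical, and real platforms each define their own format.

```python
import json


def check_availability(date: str) -> dict:
    """Hypothetical tool: returns canned open slots instead of querying a calendar."""
    open_slots = {"2025-03-04": ["10:00", "14:30"]}
    return {"date": date, "slots": open_slots.get(date, [])}


# Registry of tools the agent is allowed to invoke.
TOOLS = {"check_availability": check_availability}


def run_tool_call(model_output: str) -> str:
    """Parse the model's tool request, run the tool, and return JSON for the next turn."""
    request = json.loads(model_output)
    tool = TOOLS.get(request["name"])
    if tool is None:
        return json.dumps({"error": f"unknown tool: {request['name']}"})
    return json.dumps(tool(**request["arguments"]))


# Simulated model output asking for a calendar lookup (format is an assumption).
model_output = '{"name": "check_availability", "arguments": {"date": "2025-03-04"}}'
print(run_tool_call(model_output))
```

The returned JSON is fed back to the model, which then phrases the result for the caller, for example by offering the two open slots.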

IVR (Interactive Voice Response)

IVR is a legacy phone system that uses pre-recorded menus and keypad input ("press 1 for sales") to route callers. Most callers find IVR menus frustrating and a meaningful share hang up before reaching a human. AI voice agents replace IVR with natural conversation, so a caller can describe what they need in their own words and the agent handles routing, scheduling, or escalation directly.

Knowledge Base

A knowledge base is the collection of documents, policies, FAQs, and pricing information an AI voice agent references to give accurate, business-specific answers. Without a knowledge base, an AI agent answers from generic training data and gets details about your business wrong. Stellar's knowledge base lets you upload PDFs, docs, and URLs your agent answers from, with retrieval running in under 200 milliseconds per call.

See knowledge base

Latency

Latency in voice AI is the delay between when a caller stops speaking and when the agent starts responding. Anything over about 700 milliseconds feels sluggish, and over a full second feels broken. Sub-500ms response time is considered natural and conversational. Stellar uses a native speech-to-speech model that targets sub-500ms latency on most calls, so the agent feels like a person rather than a recording with delays.
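
As a quick illustration, the thresholds above can be written as a small Python helper. The function name and the "acceptable" label for the 500 to 700 ms band are assumptions added here; the text above only names the other three bands.

```python
def rate_latency(ms: float) -> str:
    """Bucket a response delay (in milliseconds) using the thresholds described above."""
    if ms < 500:
        return "natural"
    if ms <= 700:
        return "acceptable"  # assumed label: the band between "natural" and "sluggish"
    if ms <= 1000:
        return "sluggish"
    return "broken"


for sample in (450, 800, 1200):
    print(sample, rate_latency(sample))
```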

Lead Qualification

Lead qualification is the process of evaluating whether a new prospect actually fits your buyer profile, usually through a series of screening questions about budget, timeline, decision-making authority, and need. Done manually, qualification rarely happens within an hour of the inquiry, by which point most leads have already engaged a competitor. Stellar's lead qualification agent calls every new lead in under sixty seconds and runs the qualification questions automatically.

See lead qualification

LLM (Large Language Model)

A Large Language Model (LLM) is a deep learning model trained on a huge volume of text data to generate human-like responses. LLMs power the reasoning, dialogue, and decision-making behind AI voice agents. In Stellar, every agent runs on a frontier-grade LLM with a custom system prompt, a knowledge base, and a set of tools layered on top to keep responses on-brand and on-task.

No-Show Rate

No-show rate is the percentage of confirmed appointments where the customer doesn't actually show up. No-shows are a real cost: empty chairs at a salon, lost revenue at a med spa, dead seats at a webinar. AI confirmation calls typically reduce no-show rates by 25 to 40%, especially when run as a multi-touch sequence. Stellar's event confirmation agent runs that sequence 24 hours, 2 hours, and 30 minutes before every event.
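
A short Python sketch of both pieces: the rate itself, and the 24-hour, 2-hour, 30-minute reminder schedule. The function names are illustrative, and the numbers in the usage lines are made-up inputs, not benchmark data.

```python
from datetime import datetime, timedelta

# Offsets before the event at which a confirmation touch goes out.
REMINDER_OFFSETS = (timedelta(hours=24), timedelta(hours=2), timedelta(minutes=30))


def no_show_rate(confirmed: int, no_shows: int) -> float:
    """Percentage of confirmed appointments where the customer did not show up."""
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    return 100 * no_shows / confirmed


def reminder_times(event_start: datetime) -> list[datetime]:
    """When each confirmation touch should fire for a given event start time."""
    return [event_start - offset for offset in REMINDER_OFFSETS]


print(no_show_rate(confirmed=200, no_shows=30))  # 15.0
for when in reminder_times(datetime(2025, 3, 4, 15, 0)):
    print(when.isoformat())
```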

See event confirmation

Prompt Engineering

Prompt engineering is the practice of writing instructions that guide an LLM toward the behaviors and response patterns you actually want. In voice AI, the system prompt defines the agent's persona, the questions to ask, the tools available, and the boundaries the agent should not cross (like giving legal or medical advice). In Stellar, prompt engineering happens visually in the agent builder instead of as raw text editing.

See agent builder

RAG (Retrieval-Augmented Generation)

RAG is a technique where the AI retrieves relevant documents from a knowledge base before generating a response, so the answer is grounded in real data instead of the model's training set. RAG is what lets a voice agent answer accurately about your specific pricing, policies, and procedures. Stellar's knowledge base is a RAG pipeline tuned for voice, with retrieval times fast enough to run mid-call without an audible pause.
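
The retrieve-then-generate flow can be sketched as follows. To keep the example self-contained, retrieval here is a simple word-overlap ranking standing in for the embedding search a production system would use, and the documents, function names, and prompt wording are all hypothetical.

```python
import re

# Hypothetical knowledge base entries.
DOCS = [
    "Standard cleaning costs 120 dollars per visit.",
    "Appointments can be cancelled up to 24 hours in advance.",
]


def tokens(text: str) -> set[str]:
    """Lowercased word set, used for a crude relevance score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the question."""
    q = tokens(question)
    ranked = sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, docs: list[str]) -> str:
    """Put the retrieved text in front of the question before it goes to the LLM."""
    context = "\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


print(build_prompt("How much does a standard cleaning cost?", DOCS))
```

The model then answers from the retrieved pricing line rather than from whatever it happened to learn in training.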

See knowledge base

SIP (Session Initiation Protocol)

SIP is the standard internet protocol for setting up, managing, and ending voice calls over IP networks. Most modern phone systems and voice AI platforms use SIP for call signaling, while the audio itself typically travels over a separate protocol, RTP. Most Stellar customers never interact with SIP directly because phone provisioning and call routing are handled inside the platform, but SIP knowledge matters when you're integrating Stellar with a self-hosted PBX or telephony stack.
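
For readers who do end up debugging a PBX integration, SIP messages are plain text, much like HTTP. The Python sketch below parses a trimmed, hypothetical INVITE; the addresses and IDs are invented, and a real INVITE carries further required headers (such as Via and Max-Forwards) plus a body describing the audio session.

```python
# A trimmed, made-up INVITE request for illustration only.
RAW_INVITE = (
    "INVITE sip:+15551230000@pbx.example.com SIP/2.0\r\n"
    "From: <sip:agent@voice.example.com>;tag=1928301774\r\n"
    "To: <sip:+15551230000@pbx.example.com>\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "CSeq: 1 INVITE\r\n"
    "\r\n"
)


def parse_sip_request(raw: str) -> tuple[str, str, dict[str, str]]:
    """Split a SIP request into its method, request URI, and headers."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, uri, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return method, uri, headers


method, uri, headers = parse_sip_request(RAW_INVITE)
print(method, uri, headers["Call-ID"])
```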

STT (Speech-to-Text)

Speech-to-Text (STT) is the process of converting audio speech into written text. STT is functionally the same as ASR (Automatic Speech Recognition), and the two terms are used interchangeably across the industry. STT is what lets a voice agent read a caller's words. In Stellar, STT is built into the same speech-to-speech model that handles language understanding and response generation, so there's no separate transcription step adding latency.

TCPA (Telephone Consumer Protection Act)

The TCPA is the U.S. federal law regulating automated calls and text messages. It requires prior express written consent before placing autodialed or prerecorded-voice telemarketing calls to a consumer's phone. The FCC adopted a rule in late 2023 that would have tightened that to one-to-one consent per seller, but a federal appeals court vacated it in January 2025 before it took effect. Stellar's compliance layer handles consent capture, TCPA-safe calling windows by region, and AI identity disclosure on every call by default.
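
One piece of this that is easy to picture in code is the calling window. Federal rules bar telephone solicitations before 8 a.m. or after 9 p.m. in the called party's local time. The Python sketch below checks that window. It assumes the recipient's UTC offset is already known, the function name is illustrative, and it does not describe how Stellar's compliance layer works. Some states set narrower windows, so treat this as a floor rather than a full rule set.

```python
from datetime import datetime, time, timedelta, timezone

# Federal baseline window, in the called party's local time.
WINDOW_START = time(8, 0)
WINDOW_END = time(21, 0)


def in_calling_window(now_utc: datetime, recipient_utc_offset_hours: float) -> bool:
    """True if the recipient's local time falls inside the 8 a.m. to 9 p.m. window.

    now_utc must be timezone-aware. The end of the window is treated as
    exclusive, which is the conservative reading.
    """
    recipient_tz = timezone(timedelta(hours=recipient_utc_offset_hours))
    local = now_utc.astimezone(recipient_tz)
    return WINDOW_START <= local.time() < WINDOW_END


now = datetime(2025, 3, 4, 13, 30, tzinfo=timezone.utc)
print(in_calling_window(now, -5))  # 08:30 local: True
print(in_calling_window(now, -8))  # 05:30 local: False
```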

See compliance

TTS (Text-to-Speech)

Text-to-Speech (TTS) is the technology that turns written text into natural-sounding spoken audio. Modern TTS produces human-like voices with proper pacing, intonation, and emotional inflection. Stellar's voice agents use a native speech-to-speech model, so TTS happens inside the same model that handles understanding and reasoning, instead of as a separate pipeline step that adds latency to every reply.

Voice Cloning

Voice cloning is the practice of creating a synthetic voice that mimics a specific real person's speech patterns, tone, and cadence. Stellar does not offer voice cloning. Stellar's voice library uses high-quality but generic neural voices, paired with explicit AI identity disclosure on every call, to stay aligned with state laws restricting AI voice impersonation in commercial calling.

Webhook

A webhook is an HTTP callback that sends real-time data from one system to another when a specific event happens, like a call ending or a lead being qualified. Webhooks are how Stellar integrates with tools that don't have a native integration. In Stellar, you can add a webhook step to a pipeline so it fires on call outcomes, qualification scores, or any other point in the workflow.
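
On the receiving side, a webhook is just an HTTP request body you parse and act on. The Python sketch below shows a common pattern: check an HMAC signature before trusting the payload. This is an assumption for illustration. The text above does not say whether Stellar signs its webhooks, and the event fields, secret, and function name here are all hypothetical.

```python
import hashlib
import hmac
import json


def verify_and_parse(body: bytes, signature_hex: str, secret: bytes) -> dict:
    """Reject the payload unless its HMAC-SHA256 signature matches, then decode the JSON."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature_hex):
        raise ValueError("signature mismatch")
    return json.loads(body)


# Simulate a sender signing a hypothetical "call ended" event.
secret = b"demo-shared-secret"
body = json.dumps({"event": "call.ended", "outcome": "qualified"}).encode()
signature = hmac.new(secret, body, hashlib.sha256).hexdigest()

event = verify_and_parse(body, signature, secret)
print(event["outcome"])  # qualified
```

In a real integration the body and signature would arrive in an HTTP POST, and the handler would route on the event type, for example by pushing qualified leads into a CRM.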

See pipelines