Overview
The MeshAgent VoiceBot lets you add a voice agent to any MeshAgent Room with just a few lines of code. VoiceBots use the same toolkits as ChatBots, so everything you learned in the Build and Deploy a ChatBot quick-start still applies; your agent simply talks instead of typing.
In this guide we’ll build a voice agent in three phases:
- Basic VoiceBot: A simple voice agent with a system prompt (rules).
- VoiceBot with Built-in MeshAgent Tools: Extend our voice agent with prebuilt tools to interact with the user and write documents to the room.
- VoiceBot with Built-in MeshAgent Tools and Custom Tools: Add our own tools to the agent for use-case-specific tasks.
You’ll learn how to:
- Build a voice-based agent with MeshAgent.
- Connect the agent to a MeshAgent Room for live testing.
- Deploy the agent as a MeshAgent Service.
- Generate a shareable link so others can start talking to the voice agent right away.
Getting Started
Note: These steps are identical to the Build and Deploy a ChatBot getting started steps. Feel free to skip them if you are already set up.
Install MeshAgent
Note: uv is a faster drop-in replacement for pip; feel free to use pip install meshagent[all] if you don’t have uv.
Connect to MeshAgent Project
meshagent auth login
meshagent project list
meshagent project activate YOUR_PROJECT_ID
meshagent api-key list
meshagent api-key activate YOUR_MESHAGENT_API_KEY_ID
Note: You can create a MeshAgent API Key through the MeshAgent Studio UI, or by running meshagent api-key create and passing in the key name and description.
Chat with the built-in VoiceBot (zero code)
Let’s bring a ready-made voicebot into a room and talk to it:
meshagent voicebot join --room YOUR_ROOM --agent-name YOUR_VOICEBOT --name YOUR_VOICEBOT
Running this command will:
- Create and open a room
- Call the voicebot into the room
Now you can ask the voicebot a question out loud and hear it reply.
Phase 1: Building a Simple VoiceBot
Now that we’ve seen the default voicebot in action, let’s create one from scratch.
Create a main.py file and paste the starter code for the basic VoiceBot. To start, we can give the agent one rule / system instruction. We’ll build up this code over the course of this example.
We’ll also add context to the conversation so the VoiceBot knows what day it is. If you need to add context to the VoiceBot before the conversation begins you can modify this accordingly.
import os
import asyncio
from datetime import date

from openai import AsyncOpenAI
from livekit.agents import function_tool, ChatContext, Agent, RunContext, AgentSession
from livekit.plugins import openai, silero
from meshagent.livekit.agents.voice import VoiceBot
from meshagent.api.services import ServiceHost
from meshagent.tools import ToolContext

# MeshAgent Service, Tools, and Agent
service = ServiceHost(
    port=int(os.getenv("MESHAGENT_PORT", "7777"))
)


@service.path("/voice")
class SimpleVoicebot(VoiceBot):
    def __init__(self):
        super().__init__(
            name="voice_agent",
            title="voice_agent",
            description="a sample voicebot",
            rules=[
                "Always respond to the user and include a fun fact at the end of your response.",
                "keep your answers short and sweet and be friendly DO NOT include emojis in your response",
            ],
        )

    def create_session(self, *, context: ToolContext) -> AgentSession:
        # Point the OpenAI client at the room's proxy URL, using the room token as the API key
        token: str = context.room.protocol.token
        url: str = context.room.room_url
        room_proxy_url = f"{url}/v1"

        oaiclient = AsyncOpenAI(
            api_key=token,
            base_url=room_proxy_url,
            default_headers={
                "Meshagent-Session": context.room.session_id
            },
        )

        # Voice pipeline: voice activity detection, speech-to-text, text-to-speech, and the LLM
        session = AgentSession(
            max_tool_steps=50,
            allow_interruptions=True,
            vad=silero.VAD.load(),
            stt=openai.STT(
                client=oaiclient
            ),
            tts=openai.TTS(
                client=oaiclient,
                voice="sage"
            ),
            llm=openai.LLM(
                client=oaiclient,
                model="gpt-4.1"
            ),
        )
        return session

    async def create_agent(self, *, context, session):
        # Seed the conversation with today's date
        ctx = ChatContext()
        today = date.today()
        # Use today.day instead of "%-d", which is not supported on Windows
        today_str = f"{today.strftime('%A %B')} {today.day}"
        ctx.add_message(role="assistant", content=f"Today's date is: {today_str}")

        @function_tool
        async def say(context: RunContext, text: str):
            "says something out loud to the user"
            session.say(text)
            return "success"

        return Agent(
            chat_ctx=ctx,
            instructions="\n".join(self.rules),
            allow_interruptions=True,
            tools=[
                *await self.make_function_tools(context=context),
                say
            ]
        )


print(f"running on port {service.port}")
asyncio.run(service.run())
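The context message above is built from a formatted date string. As a minimal sketch, assuming you want the same "weekday month day" text on every platform, the helper below builds it using only standard strftime codes. The name format_context_date is hypothetical and not part of MeshAgent or LiveKit.

```python
from datetime import date


def format_context_date(day: date) -> str:
    """Return text such as 'Sunday January 5' for the given date."""
    # %A and %B are standard strftime codes; the day number is taken from
    # the date object itself, so no platform-specific padding flag is needed.
    return f"{day.strftime('%A %B')} {day.day}"


if __name__ == "__main__":
    print(f"Today's date is: {format_context_date(date.today())}")
```

The returned string can be passed to ctx.add_message in the same way as today_str in the example.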
Running the Simple VoiceBot in Test Mode:
Testing a VoiceBot works the same way as a ChatBot — just be sure to update your agent name and service path.
To run the agent you will need two tabs open in your terminal. Make sure you have exported your environment variables, installed the MeshAgent libraries, and authenticated to MeshAgent (see the steps above).
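In the first tab, start the service. Assuming the code above is saved as main.py, this is typically done by running python main.py.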
In the second tab run the call agent command to bring our agent into the room:
meshagent call agent --url=http://localhost:MESHAGENT_PORT/SERVICE_PATH --room=ROOM_NAME --agent-name=AGENT_NAME --name=AGENT_NAME
As an example:
meshagent call agent --url=http://localhost:7777/voice --room=myroom --agent-name=myvoiceagent --name=myvoiceagent
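The placeholders in the command template map directly onto the port and service path used in main.py. Here is a small sketch of that mapping, assuming you want to assemble the command from Python. build_call_agent_command is a hypothetical helper and uses only the flags shown above.

```python
import shlex


def build_call_agent_command(port: int, service_path: str, room: str, agent_name: str) -> list[str]:
    """Assemble the `meshagent call agent` arguments shown above."""
    # Normalise the path so both "voice" and "/voice" are accepted.
    path = "/" + service_path.lstrip("/")
    return [
        "meshagent", "call", "agent",
        f"--url=http://localhost:{port}{path}",
        f"--room={room}",
        f"--agent-name={agent_name}",
        f"--name={agent_name}",
    ]


if __name__ == "__main__":
    command = build_call_agent_command(7777, "/voice", "myroom", "myvoiceagent")
    print(shlex.join(command))
```

Returning a list rather than a single string lets the result be passed to subprocess.run without shell quoting concerns.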
Now we can navigate to MeshAgent Studio, where our agent, “myvoiceagent”, shows up under the messaging tab. We can talk to the agent and make sure that it responds with a fun fact, as we instructed it to.
Phase 2: Adding Built-in MeshAgent Tools to the VoiceBot
Next, let’s add the same built-in MeshAgent tools to our VoiceBot that we gave to our ChatBot in the previous example. These tools will give the voice agent the ability to interact with the user, convert documents to markdown, and write documents to the room.
Toolkits group tools together and can be used by both chat and voice agents.
We’ll also add a few imports and update the voice agent’s rules so it knows how to interact with the available tools efficiently.
import os
import uuid
import asyncio
from datetime import datetime, date

from openai import AsyncOpenAI
from livekit.agents import function_tool, ChatContext, Agent, RunContext, AgentSession
from livekit.plugins import openai, silero
from meshagent.api import RequiredToolkit, RequiredSchema
from meshagent.livekit.agents.voice import VoiceBot
from meshagent.api.services import ServiceHost
from meshagent.tools.document_tools import DocumentAuthoringToolkit, DocumentTypeAuthoringToolkit
from meshagent.agents.schemas.document import document_schema
from meshagent.tools import Tool, Toolkit, ToolContext

service = ServiceHost(
    port=int(os.getenv("MESHAGENT_PORT", "7777"))
)


@service.path("/voice")
class SimpleVoicebot(VoiceBot):
    def __init__(self):
        super().__init__(
            name="voice_agent",
            title="voice_agent",
            description="a sample voicebot",
            rules=[
                "Always respond to the user and include a fun fact at the end of your response.",
                "keep your answers short and sweet and be friendly DO NOT include emojis in your response",
                "Use the ask_user tool to pick the name of a document, pick a document name if the tool is not available.",
                "The document names MUST have the extension .document, automatically add the extension if it is not provided",
                "You MUST always write content to a document",
                "First open a document, then use tools to write the document content before closing the document",
                "Before closing the document, ask the user if they would like any additional modifications to be made to the document, and if so, make them. continue to ask the user until they are happy with the contents. you are not finished until the user is happy.",
                "Blob URLs MUST not be added to documents, they must be saved as files first",
            ],
            # Toolkits and schemas the room must provide to this agent
            requires=[
                RequiredToolkit(
                    name="ui"
                ),
                RequiredSchema(
                    name="document"
                ),
                RequiredToolkit(
                    name="meshagent.markitdown",
                    tools=["markitdown_from_user", "markitdown_from_file"]
                ),
            ],
            # Toolkits bundled with the agent itself
            toolkits=[
                DocumentAuthoringToolkit(),
                DocumentTypeAuthoringToolkit(
                    schema=document_schema,
                    document_type="document"
                )
            ],
        )

    def create_session(self, *, context: ToolContext) -> AgentSession:
        # Point the OpenAI client at the room's proxy URL, using the room token as the API key
        token: str = context.room.protocol.token
        url: str = context.room.room_url
        room_proxy_url = f"{url}/v1"

        oaiclient = AsyncOpenAI(
            api_key=token,
            base_url=room_proxy_url,
            default_headers={
                "Meshagent-Session": context.room.session_id
            },
        )

        session = AgentSession(
            max_tool_steps=50,
            allow_interruptions=True,
            vad=silero.VAD.load(),
            stt=openai.STT(
                client=oaiclient
            ),
            tts=openai.TTS(
                client=oaiclient,
                voice="sage"
            ),
            llm=openai.LLM(
                client=oaiclient,
                model="gpt-4.1"
            ),
        )
        return session

    async def create_agent(self, *, context, session):
        # Seed the conversation with today's date
        ctx = ChatContext()
        today = date.today()
        # Use today.day instead of "%-d", which is not supported on Windows
        today_str = f"{today.strftime('%A %B')} {today.day}"
        ctx.add_message(role="assistant", content=f"Today's date is: {today_str}")

        @function_tool
        async def say(context: RunContext, text: str):
            "says something out loud to the user"
            session.say(text)
            return "success"

        return Agent(
            chat_ctx=ctx,
            instructions="\n".join(self.rules),
            allow_interruptions=True,
            tools=[
                *await self.make_function_tools(context=context),
                say
            ]
        )


print(f"running on port {service.port}")
asyncio.run(service.run())
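The rules above ask the model to add the .document extension itself. If you would rather enforce that rule in code before a name reaches a document tool, a minimal sketch could look like the following. ensure_document_extension is a hypothetical helper, not part of the MeshAgent toolkits.

```python
DOCUMENT_EXTENSION = ".document"


def ensure_document_extension(name: str) -> str:
    """Return the name with a trailing '.document' extension."""
    cleaned = name.strip()
    if not cleaned:
        raise ValueError("document name must not be empty")
    # Compare case-insensitively so 'Notes.DOCUMENT' is left unchanged.
    if cleaned.lower().endswith(DOCUMENT_EXTENSION):
        return cleaned
    return cleaned + DOCUMENT_EXTENSION


if __name__ == "__main__":
    print(ensure_document_extension("meeting notes"))
```

Keeping the rule in the prompt as well is still reasonable, since the model chooses the name before any code sees it.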
We can test the agent using the same commands as we did with the simple voice agent!
meshagent call agent --url=http://localhost:MESHAGENT_PORT/SERVICE_PATH --room=ROOM_NAME --agent-name=AGENT_NAME --name=AGENT_NAME
Phase 3: Adding Custom Tools to the VoiceBot
Now let’s add a custom tool to our VoiceBot! We’ll use the same TaskTracker toolkit we used for the ChatBot in our previous example. This toolkit allows the agent to write tasks to, and read tasks from, the Room database.
To do this we will update the voice agent initialization and create a table for the tasks when the room starts up. We will create a tasks toolkit with two custom tools: WriteTask, which adds a task to the database, and GetTasks, which reads the tasks back.
This is a simple example of adding tasks. To create a more useful task writer we’d want to add date information and other metadata to better track and filter tasks. This example is mainly meant to demonstrate writing to room storage and adding custom tools to our voice agent.
import os
import uuid
import asyncio
from datetime import datetime, date

from openai import AsyncOpenAI
from livekit.agents import function_tool, ChatContext, Agent, RunContext, AgentSession
from livekit.plugins import openai, silero
from meshagent.api import RequiredToolkit, RequiredSchema
from meshagent.livekit.agents.voice import VoiceBot
from meshagent.api.services import ServiceHost
from meshagent.tools.document_tools import DocumentAuthoringToolkit, DocumentTypeAuthoringToolkit
from meshagent.agents.schemas.document import document_schema
from meshagent.api.room_server_client import TextDataType, TimestampDataType
from meshagent.api.messaging import TextResponse, JsonResponse
from meshagent.tools import Tool, Toolkit, ToolContext

# MeshAgent Service, Tools, and Agent
service = ServiceHost(
    port=int(os.getenv("MESHAGENT_PORT", "7777"))
)


class WriteTask(Tool):
    def __init__(self):
        super().__init__(
            name="WriteTask",
            title="Add a task",
            description="A tool to add tasks to the database",
            input_schema={
                "type": "object",
                "additionalProperties": False,
                "required": [
                    "taskdescription"
                ],
                "properties": {
                    "taskdescription": {"type": "string"}
                }
            }
        )

    async def execute(self, context, taskdescription: str):
        # Insert one row with a generated ID into the "tasks" table
        await context.room.database.insert(
            table="tasks",
            records=[{
                "task_id": str(uuid.uuid4()),
                "taskdescription": taskdescription
            }]
        )
        return TextResponse(text="Task added!")


class GetTasks(Tool):
    def __init__(self):
        super().__init__(
            name="GetTasks",
            title="List tasks",
            # The search below is unfiltered, so this returns every stored task
            description="List all tasks recorded in the database",
            input_schema={
                "type": "object",
                "additionalProperties": False,
                "required": [],
                "properties": {}
            }
        )

    async def execute(self, context):
        return JsonResponse(json={"values": await context.room.database.search(table="tasks")})


@service.path("/voice")
class SimpleVoicebot(VoiceBot):
    def __init__(self):
        super().__init__(
            name="voice_agent",
            title="voice_agent",
            description="a sample voicebot",
            rules=[
                "Always respond to the user and include a fun fact at the end of your response.",
                "keep your answers short and sweet and be friendly DO NOT include emojis in your response",
                "Use the ask_user tool to pick the name of a document, pick a document name if the tool is not available.",
                "The document names MUST have the extension .document, automatically add the extension if it is not provided",
                "You MUST always write content to a document",
                "First open a document, then use tools to write the document content before closing the document",
                "Before closing the document, ask the user if they would like any additional modifications to be made to the document, and if so, make them. continue to ask the user until they are happy with the contents. you are not finished until the user is happy.",
                "Blob URLs MUST not be added to documents, they must be saved as files first",
            ],
            requires=[
                RequiredToolkit(
                    name="ui"
                ),
                RequiredSchema(
                    name="document"
                ),
                RequiredToolkit(
                    name="meshagent.markitdown",
                    tools=["markitdown_from_user", "markitdown_from_file"]
                ),
            ],
            toolkits=[
                DocumentAuthoringToolkit(),
                DocumentTypeAuthoringToolkit(
                    schema=document_schema,
                    document_type="document"
                ),
                # Custom toolkit grouping the two task tools defined above
                Toolkit(name="tasktools", tools=[WriteTask(), GetTasks()])
            ],
        )

    async def start(self, *, room):
        await super().start(room=room)
        # One tiny table:
        await room.database.create_table_with_schema(
            name="tasks",
            schema={
                "task_id": TextDataType(),
                "taskdescription": TextDataType()
            },
            mode="overwrite",
            data=None
        )

    def create_session(self, *, context: ToolContext) -> AgentSession:
        # Point the OpenAI client at the room's proxy URL, using the room token as the API key
        token: str = context.room.protocol.token
        url: str = context.room.room_url
        room_proxy_url = f"{url}/v1"

        oaiclient = AsyncOpenAI(
            api_key=token,
            base_url=room_proxy_url,
            default_headers={
                "Meshagent-Session": context.room.session_id
            },
        )

        session = AgentSession(
            max_tool_steps=50,
            allow_interruptions=True,
            vad=silero.VAD.load(),
            stt=openai.STT(
                client=oaiclient
            ),
            tts=openai.TTS(
                client=oaiclient,
                voice="sage"
            ),
            llm=openai.LLM(
                client=oaiclient,
                model="gpt-4.1"
            ),
        )
        return session

    async def create_agent(self, *, context, session):
        # Seed the conversation with today's date
        ctx = ChatContext()
        today = date.today()
        # Use today.day instead of "%-d", which is not supported on Windows
        today_str = f"{today.strftime('%A %B')} {today.day}"
        ctx.add_message(role="assistant", content=f"Today's date is: {today_str}")

        @function_tool
        async def say(context: RunContext, text: str):
            "says something out loud to the user"
            session.say(text)
            return "success"

        return Agent(
            chat_ctx=ctx,
            instructions="\n".join(self.rules),
            allow_interruptions=True,
            tools=[
                *await self.make_function_tools(context=context),
                say
            ]
        )


print(f"running on port {service.port}")
asyncio.run(service.run())
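As noted earlier, a more useful task tracker would store date information so that tasks can be filtered. The sketch below shows one way to shape such a record and filter it in plain Python. The created_at field and both function names are assumptions for illustration, and the room database calls from the example above are deliberately left out.

```python
import uuid
from datetime import datetime, timedelta, timezone


def build_task_record(taskdescription: str, now: datetime) -> dict:
    """Build a task row that also carries an ISO 8601 creation timestamp."""
    return {
        "task_id": str(uuid.uuid4()),
        "taskdescription": taskdescription,
        "created_at": now.isoformat(),  # hypothetical extra column
    }


def tasks_since(records: list[dict], cutoff: datetime) -> list[dict]:
    """Keep only the records created at or after the cutoff."""
    return [
        record for record in records
        if datetime.fromisoformat(record["created_at"]) >= cutoff
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    records = [
        build_task_record("water the plants", now - timedelta(days=10)),
        build_task_record("send the report", now - timedelta(hours=2)),
    ]
    this_week = tasks_since(records, now - timedelta(days=7))
    print([record["taskdescription"] for record in this_week])
```

Storing created_at in the room database would also require a matching column in the tasks table schema. Whether filtering is better done in Python or in the database query depends on how many tasks you expect.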
We can test the agent using the same commands as we did with the simple voice agent! And now when we go to the room we can ask the agent to add a task to our task database.
meshagent call agent --url=http://localhost:MESHAGENT_PORT/SERVICE_PATH --room=ROOM_NAME --agent-name=AGENT_NAME --name=AGENT_NAME
Once we’re satisfied with how the agent is performing we can deploy and share it.
Note: Building an agent usually takes several rounds of iteration, both on the wording of the system prompt and on the tools you give the agent.
Deploying and Running the Agent as a MeshAgent Service
This is the same process as in the Build and Deploy a ChatBot Example.
Prerequisites:
To deploy the agent you will need to have Docker set up and a container registry with your cloud provider (e.g. GCP, Azure, AWS).
For this example we are using Azure Container Registry. These steps assume you have created a container registry in Azure, have a service principal set up, and have the appropriate permissions to access and push images to the container registry.
You will need to create a new image pull secret so that MeshAgent can pull and run your container. The image pull secret requires the service principal ID and password. To save the image pull secret, navigate to MeshAgent Studio, click the Secrets tab, then click New Image Pull Secret, give your secret a name, and fill in the required information.
Once you have the registry set up, proceed with the following steps.
Step 1: Build and Push Your Container
- Create and activate a dedicated Buildx builder
We recommend using zstd images to speed up image pulls. To enable building zstd images, run the following commands:
docker buildx create --name zstd-builder --driver docker-container
docker buildx use zstd-builder
- Log in to Azure, connect to your ACR instance, then build and push the docker container. To build this container, use docker buildx to make a linux/amd64 image and push it to your registry.
az login
az acr login --name myregistry
docker buildx build . --platform linux/amd64 --output=type=image,name=myregistry.azurecr.io/mychatagent:v1,oci-mediatypes=true,compression=zstd,compression-level=3,force-compression=true,push=true
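The --output value packs several comma-separated options into a single argument. Here is a small sketch of how that string is composed, assuming you script the build from Python. build_output_option is a hypothetical helper, and the option names are the ones used in the command above.

```python
def build_output_option(registry: str, image: str, tag: str, compression_level: int = 3) -> str:
    """Compose the --output value for a zstd-compressed image that is pushed."""
    options = {
        "type": "image",
        "name": f"{registry}/{image}:{tag}",
        "oci-mediatypes": "true",
        "compression": "zstd",
        "compression-level": str(compression_level),
        "force-compression": "true",
        "push": "true",
    }
    # Dicts preserve insertion order, so the options come out as listed.
    return "--output=" + ",".join(f"{key}={value}" for key, value in options.items())


if __name__ == "__main__":
    print(build_output_option("myregistry.azurecr.io", "mychatagent", "v1"))
```

Building the value from named parts makes it harder to mistype the registry, image name, or tag when you publish a new version.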
Step 2: Create a Service in MeshAgent for each of your Agents
- Navigate to the Services tab and click the button to create a New Service and fill in the required information about the agent.
- Add MESHAGENT_PORT as an environment variable. Its value needs to match the port that you register the service with.
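The service code reads MESHAGENT_PORT with a default of 7777, so the value set on the service has to be one the code can parse. Below is a minimal sketch of that lookup with basic validation added. resolve_port is a hypothetical helper, and the range check is an assumption rather than documented MeshAgent behavior.

```python
import os
from typing import Mapping


def resolve_port(env: Mapping[str, str] = os.environ, default: int = 7777) -> int:
    """Read MESHAGENT_PORT from the environment, falling back to the default."""
    raw = env.get("MESHAGENT_PORT", str(default))
    port = int(raw)  # raises ValueError for non-numeric values
    if not 1 <= port <= 65535:
        raise ValueError(f"MESHAGENT_PORT out of range: {port}")
    return port


if __name__ == "__main__":
    print(f"service would listen on port {resolve_port()}")
```

Failing fast on a bad value at startup is usually easier to diagnose than a service that starts on an unexpected port.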
Step 3: Try your agent!
Navigate to the Sessions tab in the MeshAgent Studio and either join an existing room or create a new room by starting a session. You will then see the VoiceBot Agent available in the room for you to interact with!
Generate a Link and Share Your Agent
- Navigate to your Room, click the hamburger menu icon in the top left corner, then click “Share”.
- Select the Room and Agent you’d like to share and click “Generate Link”.
- Proudly share your Agent!