The MeshAgent VoiceBot lets you build a real-time, voice-interactive agent that can listen and converse in any MeshAgent Room. VoiceBots use the same toolkits as ChatBots, so everything you learned in the build a chat agent quickstart still applies—your agent simply talks instead of types. In this guide we’ll iteratively build a voice agent in four phases:
  1. Using the MeshAgent CLI: Bring a ready-made VoiceBot into a room using a single CLI command.
  2. Creating a Basic VoiceBot: Build a simple voice agent with a system prompt (rules).
  3. Adding Built-in MeshAgent Tools: Extend our voice agent with prebuilt tools to interact with the user and write documents to the room.
  4. Adding Custom Tools: Add our own tools to the agent for use-case specific tasks.
You’ll learn how to:
  • Build a voice-based agent with MeshAgent
  • Connect the agent to a MeshAgent Room for live testing
  • Define and add custom tools to an agent
  • Decide how you want to run and deploy your agent

Prerequisites

Before you begin, be sure you have:
  • Created your MeshAgent account and project
  • Set up and activated a virtual environment, and installed MeshAgent. See the Getting Started Guide for help getting set up.
  • Authenticated to MeshAgent using the CLI by running meshagent setup

Phase 1: Talk with the built-in VoiceBot from the CLI

Start by calling the built-in VoiceBot into a room directly from the CLI:
meshagent voicebot join --room gettingstarted --agent-name voiceagent
Running this command will:
  1. Create and open a room called gettingstarted inside your project
  2. Call the voiceagent into the gettingstarted room
Next, in a web browser, go to studio.meshagent.com and join the gettingstarted room. You will see the voiceagent participant appear! Simply select it and begin talking.
Tip: We recommend muting the microphone after you finish speaking so that the agent does not pick up unwanted background noise.

Phase 2: Building a Simple VoiceBot

Now that we’ve seen the default VoiceBot in action, let’s create one from scratch. Create a main.py file and paste the starter code for the basic VoiceBot. To start, we give the agent a couple of rules (system instructions); we’ll build up this code over the course of the tutorial. We’ll also add context to the conversation so the VoiceBot knows what day it is. If you need to give the VoiceBot other context before the conversation begins, you can extend this pattern accordingly.
Note: You can optionally set the port parameter on ServiceHost. If you don’t pass a port, it will default to an available port.
import asyncio
from datetime import date
from openai import AsyncOpenAI
from livekit.agents import function_tool, ChatContext, Agent, RunContext, AgentSession
from livekit.plugins import openai, silero

from meshagent.livekit.agents.voice import VoiceBot
from meshagent.api.services import ServiceHost
from meshagent.tools import ToolContext
from meshagent.otel import otel_config

service = ServiceHost()

otel_config(
    service_name="my-service"
)  # automatically enables telemetry data collection for your agents and tools


@service.path("/voice")
class SimpleVoicebot(VoiceBot):
    def __init__(self):
        super().__init__(
            name="voice_agent",
            title="voice_agent",
            description="a sample voicebot",
            rules=[
                "Always respond to the user and include a fun fact at the end of your response.",
                "keep your answers short and sweet and be friendly DO NOT include emojis in your response",
            ],
        )

    def create_session(self, *, context: ToolContext) -> AgentSession:
        token: str = context.room.protocol.token
        url: str = context.room.room_url

        room_proxy_url = f"{url}/v1"

        oaiclient = AsyncOpenAI(
            api_key=token,
            base_url=room_proxy_url,
            default_headers={"Meshagent-Session": context.room.session_id},
        )

        session = AgentSession(
            max_tool_steps=50,
            allow_interruptions=True,
            vad=silero.VAD.load(),
            stt=openai.STT(client=oaiclient),
            tts=openai.TTS(client=oaiclient, voice="sage"),
            llm=openai.LLM(client=oaiclient, model="gpt-4.1"),
        )
        return session

    async def create_agent(self, *, context, session):
        ctx = ChatContext()
        today_str = date.today().strftime("%A %B %-d")
        ctx.add_message(role="assistant", content=f"Today's date is: {today_str}")

        @function_tool
        async def say(context: RunContext, text: str):
            "says something out loud to the user"
            session.say(text)
            return "success"

        return Agent(
            chat_ctx=ctx,
            instructions="\n".join(self.rules),
            allow_interruptions=True,
            tools=[*await self.make_function_tools(context=context), say],
        )


asyncio.run(service.run())
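
If you would rather pin the ServiceHost to a specific port than let it pick one (see the note above), a minimal sketch looks like this; the port value is only an example:
from meshagent.api.services import ServiceHost

# Bind the ServiceHost to a fixed port instead of an automatically selected one.
service = ServiceHost(port=8081)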

Run the VoiceBot locally and connect it to a Room. From your terminal run:
meshagent setup # authenticate to meshagent if not already signed in
meshagent service run "main.py" --room=gettingstarted
MeshAgent will start the ServiceHost, discover your VoiceBot endpoint automatically, and call it into the specified room. In MeshAgent Studio, open the gettingstarted room from the Sessions tab and talk to your new agent!
Tip: There are two ways to connect an agent to a room:
  • meshagent service run – Starts the ServiceHost, automatically discovers all agents and tools defined in your service, and calls them into the room for you. This is the easiest way to test multi-agent or multi-tool setups.
  • meshagent call agent – Manually calls a single agent into a room by URL. Use this when you want finer control or to test one agent endpoint at a time, for example: meshagent call agent --url=http://localhost:8081/voice --room=gettingstarted --participant-name=voicebot. You will also need to start the agent locally (e.g., python main.py) before running the call command; see the two-terminal sketch below.
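The manual flow runs in two terminals; this sketch assumes the ServiceHost in main.py is listening on port 8081:
# Terminal 1: start the ServiceHost that exposes the /voice endpoint
python main.py

# Terminal 2: call the voice agent into the room
meshagent call agent --url=http://localhost:8081/voice --room=gettingstarted --participant-name=voicebot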
To see live logs, traces, and metrics as you interact with the agent, check out the Developer Console in the bottom pane of the Studio. You can toggle this console on and off by selecting or deselecting it from the menu in the room.

Phase 3: Adding Built-in MeshAgent Tools to our VoiceBot

Next, let’s give our VoiceBot the same built-in MeshAgent tools we gave the ChatBot in the previous example. These tools let the voice agent interact with the user, convert documents to markdown, and write documents to the room. Toolkits group tools together and can be used by any MeshAgent agent. We’ll also add a few imports and update the voice agent’s rules so it knows how to use the available tools efficiently.
import asyncio
from datetime import date
from openai import AsyncOpenAI
from livekit.agents import function_tool, ChatContext, Agent, RunContext, AgentSession
from livekit.plugins import openai, silero

from meshagent.api import RequiredToolkit, RequiredSchema
from meshagent.livekit.agents.voice import VoiceBot
from meshagent.api.services import ServiceHost
from meshagent.tools.document_tools import (
    DocumentAuthoringToolkit,
    DocumentTypeAuthoringToolkit,
)
from meshagent.markitdown.tools import MarkItDownToolkit
from meshagent.agents.schemas.document import document_schema
from meshagent.tools import ToolContext
from meshagent.otel import otel_config

service = ServiceHost()

otel_config(
    service_name="my-service"
)  # automatically enables telemetry data collection for your agents and tools


@service.path("/voice")
class SimpleVoicebot(VoiceBot):
    def __init__(self):
        super().__init__(
            name="voice_agent",
            title="voice_agent",
            description="a sample voicebot",
            rules=[
                "Always respond to the user and include a fun fact at the end of your response.",
                "keep your answers short and sweet and be friendly DO NOT include emojis in your response",
                "Use the ask_user tool to pick the name of a document, pick a document name if the tool is not available.",
                "The document names MUST have the extension .document, automatically add the extension if it is not provided",
                "You MUST always write content to a document",
                "First open a document, then use tools to write the document content before closing the document",
                "Before closing the document, ask the user if they would like any additional modifications to be made to the document, and if so, make them. continue to ask the user until they are happy with the contents. you are not finished until the user is happy.",
                "Blob URLs MUST not be added to documents, they must be saved as files first",
            ],
            requires=[RequiredToolkit(name="ui"), RequiredSchema(name="document")],
            toolkits=[
                MarkItDownToolkit(),
                DocumentAuthoringToolkit(),
                DocumentTypeAuthoringToolkit(
                    schema=document_schema, document_type="document"
                ),
            ],
        )

    def create_session(self, *, context: ToolContext) -> AgentSession:
        token: str = context.room.protocol.token
        url: str = context.room.room_url

        room_proxy_url = f"{url}/v1"

        oaiclient = AsyncOpenAI(
            api_key=token,
            base_url=room_proxy_url,
            default_headers={"Meshagent-Session": context.room.session_id},
        )

        session = AgentSession(
            max_tool_steps=50,
            allow_interruptions=True,
            vad=silero.VAD.load(),
            stt=openai.STT(client=oaiclient),
            tts=openai.TTS(client=oaiclient, voice="sage"),
            llm=openai.LLM(client=oaiclient, model="gpt-4.1"),
        )
        return session

    async def create_agent(self, *, context, session):
        ctx = ChatContext()
        today_str = date.today().strftime("%A %B %-d")
        ctx.add_message(role="assistant", content=f"Today's date is: {today_str}")

        @function_tool
        async def say(context: RunContext, text: str):
            "says something out loud to the user"
            session.say(text)
            return "success"

        return Agent(
            chat_ctx=ctx,
            instructions="\n".join(self.rules),
            allow_interruptions=True,
            tools=[*await self.make_function_tools(context=context), say],
        )


asyncio.run(service.run())

We can test the agent using the same command we used with the simple voice agent. (Be sure to stop the previous command with Ctrl+C before rerunning.)
meshagent service run "main.py" --room=gettingstarted
Head back to MeshAgent Studio and try giving the agent a document to chat about.

Phase 4: Adding Custom Tools to our VoiceBot

Now let’s add custom tools to our VoiceBot. We’ll use the same TaskTracker toolkit we used for the ChatBot in the previous example, which lets the agent write tasks to and read tasks from the Room database. To do this we will update the voice agent initialization and create a table for the tasks when the room starts up. We will then define a tasks toolkit with two custom tools, WriteTask and GetTasks, for adding tasks to the database and listing them. This is a deliberately simple example; a more useful task tracker would add dates and other metadata so tasks can be tracked and filtered. The goal here is mainly to demonstrate writing to room storage and adding custom tools to our voice agent.
import uuid
import asyncio
from datetime import date
from openai import AsyncOpenAI
from livekit.agents import function_tool, ChatContext, Agent, RunContext, AgentSession
from livekit.plugins import openai, silero

from meshagent.api import RequiredToolkit, RequiredSchema
from meshagent.livekit.agents.voice import VoiceBot
from meshagent.api.services import ServiceHost
from meshagent.tools.document_tools import (
    DocumentAuthoringToolkit,
    DocumentTypeAuthoringToolkit,
)
from meshagent.agents.schemas.document import document_schema
from meshagent.api.room_server_client import TextDataType
from meshagent.markitdown.tools import MarkItDownToolkit
from meshagent.api.messaging import TextResponse, JsonResponse
from meshagent.tools import Tool, Toolkit, ToolContext
from meshagent.otel import otel_config

service = ServiceHost()

otel_config(
    service_name="my-service"
)  # automatically enables telemetry data collection for your agents and tools


class WriteTask(Tool):
    def __init__(self):
        super().__init__(
            name="WriteTask",
            title="Add a task",
            description="A tool to add tasks to the database",
            input_schema={
                "type": "object",
                "additionalProperties": False,
                "required": ["taskdescription"],
                "properties": {"taskdescription": {"type": "string"}},
            },
        )

    async def execute(self, context, taskdescription: str):
        await context.room.database.insert(
            table="tasks",
            records=[
                {"task_id": str(uuid.uuid4()), "taskdescription": taskdescription}
            ],
        )
        return TextResponse(text="Task added!")


class GetTasks(Tool):
    def __init__(self):
        super().__init__(
            name="GetTasks",
            title="List tasks",
            description="List tasks recorded today or this week",
            input_schema={
                "type": "object",
                "additionalProperties": False,
                "required": [],
                "properties": {},
            },
        )

    async def execute(self, context):
        return JsonResponse(
            json={"values": await context.room.database.search(table="tasks")}
        )


@service.path("/voice")
class SimpleVoicebot(VoiceBot):
    def __init__(self):
        super().__init__(
            name="voice_agent",
            title="voice_agent",
            description="a sample voicebot",
            rules=[
                "Always respond to the user and include a fun fact at the end of your response.",
                "keep your answers short and sweet and be friendly DO NOT include emojis in your response",
                "Use the ask_user tool to pick the name of a document, pick a document name if the tool is not available.",
                "The document names MUST have the extension .document, automatically add the extension if it is not provided",
                "You MUST always write content to a document",
                "First open a document, then use tools to write the document content before closing the document",
                "Before closing the document, ask the user if they would like any additional modifications to be made to the document, and if so, make them. continue to ask the user until they are happy with the contents. you are not finished until the user is happy.",
                "Blob URLs MUST not be added to documents, they must be saved as files first",
            ],
            requires=[RequiredToolkit(name="ui"), RequiredSchema(name="document")],
            toolkits=[
                MarkItDownToolkit(),
                DocumentAuthoringToolkit(),
                DocumentTypeAuthoringToolkit(
                    schema=document_schema, document_type="document"
                ),
                Toolkit(name="tasktools", tools=[WriteTask(), GetTasks()]),
            ],
        )

    async def start(self, *, room):
        await super().start(room=room)
        # Create a small table in the room database to store tasks
        await room.database.create_table_with_schema(
            name="tasks",
            schema={"task_id": TextDataType(), "taskdescription": TextDataType()},
            mode="overwrite",
            data=None,
        )

    def create_session(self, *, context: ToolContext) -> AgentSession:
        token: str = context.room.protocol.token
        url: str = context.room.room_url

        room_proxy_url = f"{url}/v1"

        oaiclient = AsyncOpenAI(
            api_key=token,
            base_url=room_proxy_url,
            default_headers={"Meshagent-Session": context.room.session_id},
        )

        session = AgentSession(
            max_tool_steps=50,
            allow_interruptions=True,
            vad=silero.VAD.load(),
            stt=openai.STT(client=oaiclient),
            tts=openai.TTS(client=oaiclient, voice="sage"),
            llm=openai.LLM(client=oaiclient, model="gpt-4.1"),
        )
        return session

    async def create_agent(self, *, context, session):
        ctx = ChatContext()
        today_str = date.today().strftime("%A %B %-d")
        ctx.add_message(role="assistant", content=f"Today's date is: {today_str}")

        @function_tool
        async def say(context: RunContext, text: str):
            "says something out loud to the user"
            session.say(text)
            return "success"

        return Agent(
            chat_ctx=ctx,
            instructions="\n".join(self.rules),
            allow_interruptions=True,
            tools=[*await self.make_function_tools(context=context), say],
        )


asyncio.run(service.run())

We can test the agent using the same command as before. Now when we go to the room, we can ask the agent to add a task to our task database.
meshagent service run "main.py" --room=gettingstarted
Once we’re satisfied with how the agent is performing we can deploy and share it.
Note: Building an agent typically takes multiple rounds of iteration: rewriting the system prompt and refining the agent’s tools until it behaves the way you want.

Agent Deployment Options

Once your VoiceBot is working locally, you can run it in different ways depending on your needs and permissions.
  • Local (Development): Run directly from your machine using meshagent service run. Best for testing, debugging, and iteration.
  • Project Service: Deploy once to run automatically in every room of a project (admin only). Best for always-on shared agents.
  • Room Service: Launch dynamically in specific rooms via the Containers API. Best for on-demand or user-triggered agents.
Each mode uses the same main.py file with our ServiceHost and agent definition. For both project and room services we’ll need to create a Dockerfile for our agent and a meshagent.yaml file that defines the agent configuration. Project Services use a meshagent.yaml with type Service, and Room Services use a meshagent.yaml with type ServiceTemplate. To learn more about containerizing and managing your VoiceBot at scale, see Services & Room Containers.
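As a rough sketch of what the Dockerfile might look like (the base image, package name, and file layout are assumptions; adjust them to match your project and see Services & Room Containers for the exact setup):
# Illustrative container image for the VoiceBot service; details are assumptions.
FROM python:3.12-slim
WORKDIR /app
# Install the same MeshAgent packages you used locally (names may differ for your setup).
RUN pip install --no-cache-dir meshagent
COPY main.py .
# Start the ServiceHost defined in main.py.
CMD ["python", "main.py"]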

Troubleshooting & Tips

  • Mute your mic when not speaking to avoid echo and prevent the agent from picking up background noise.
  • Use the Developer Console in Studio to view logs, traces, and LLM calls live.
  • Small tweaks to your rules (system prompt) and toolkits can have big quality impacts.
  • Restart the service whenever you modify code; the room will auto-reconnect.

Next Steps

  • Review the VoiceBot documentation to understand how the MeshAgent VoiceBot class works
  • ChatBot Overview — Learn about lifecycle, context building, and key methods for MeshAgent ChatBots
  • TaskRunner — Learn how to run agents in the background with TaskRunners
  • Worker — Offload background or long-running jobs
  • Services & Containers - Deploy your VoiceBot as a managed or on-demand service
  • Secrets & Registries — Learn how to store credentials securely for deployment