Overview

The MeshAgent VoiceBot lets you add a voice agent to any MeshAgent Room with just a few lines of code! VoiceBots use the same toolkits as ChatBots, so everything you learned in the build and deploy a chat agent quick-start still applies—your agent simply talks instead of types. In this guide we’ll build a voice agent in three phases:
  1. Basic VoiceBot: A simple voice agent with a system prompt (rules)
  2. VoiceBot with Built-in MeshAgent Tools: Extend our voice agent with prebuilt tools to interact with the user and write documents to the room.
  3. VoiceBot with Built-in MeshAgent Tools and Custom Tools: Add our own tools to the agent for use-case specific tasks.
You’ll learn how to:
  • Build a voice based agent with MeshAgent
  • Connect the agent to a MeshAgent Room for live testing.
  • Deploy the agent as a MeshAgent Service.
  • Generate a shareable link so others can start talking to the voice agent right away.

Prerequisites

Be sure you have created your MeshAgent account and project, set up and activated a virtual environment, and installed the MeshAgent Python SDK. For help getting setup see the Getting Started Guide.

Connect to MeshAgent Project

meshagent setup

Chat with the built-in VoiceBot (zero code)

Let’s bring a ready-made voicebot into a room and talk to it:
meshagent voicebot join --room gettingstarted --agent-name voiceagent --name voiceagent
Running this command will:
  1. Create and open a room called gettingstarted inside your project
  2. Call the voiceagent into the gettingstarted room
In a web browser, go to studio.meshagent.com and join the gettingstarted room. You will see the voiceagent participant appear! Simply select the agent and begin talking to it. We recommend muting the microphone after you finish speaking so that the agent does not pick up on any undesired background noise.

Phase 1: Building a Simple VoiceBot

Now that we’ve seen the default voicebot in action, let’s create one from scratch. Create a main.py file and paste the starter code for the basic VoiceBot. To start, we can give the agent one rule / system instruction. We’ll build up this code over the course of this example. We’ll also add context to the conversation so the VoiceBot knows what day it is. If you need to add context to the VoiceBot before the conversation begins you can modify this accordingly.
Python
import os
import asyncio
from datetime import date
from openai import AsyncOpenAI
from livekit.agents import function_tool, ChatContext, Agent, RunContext, AgentSession
from livekit.plugins import openai, silero

from meshagent.livekit.agents.voice import VoiceBot
from meshagent.api.services import ServiceHost
from meshagent.tools import ToolContext
from meshagent.otel import otel_config

service = ServiceHost(
    port=int(os.getenv("MESHAGENT_PORT","7777"))
)

otel_config(service_name="my-service") # automatically enables telemetry data collection for your agents and tools 

@service.path("/voice")
class SimpleVoicebot(VoiceBot):
    def __init__(self):
        super().__init__(
            name="voice_agent",
            title="voice_agent",
            description="a sample voicebot",
            rules=[
                "Always respond to the user and include a fun fact at the end of your response.",
                "keep your answers short and sweet and be friendly DO NOT include emojis in your response",
            ],
        )

    def create_session(self, *, context: ToolContext) -> AgentSession:
        token : str = context.room.protocol.token
        url : str = context.room.room_url
         
        room_proxy_url = f"{url}/v1"
            
        oaiclient = AsyncOpenAI(
            api_key=token,
            base_url=room_proxy_url,
            default_headers={
                "Meshagent-Session" : context.room.session_id
            }
        )

        session = AgentSession(
            max_tool_steps=50,
            allow_interruptions=True,
            vad=silero.VAD.load(),
            stt=openai.STT(
                client=oaiclient
            ),
            tts=openai.TTS(
                client=oaiclient,
                voice="sage"
            ),
            llm=openai.LLM(
                client=oaiclient, 
                model="gpt-4.1"
            ),
        )
        return session


    async def create_agent(self, *, context, session):
        ctx=ChatContext()
        today_str = date.today().strftime("%A %B %-d")
        ctx.add_message(role="assistant", content=f"Today's date is: {today_str}")

        @function_tool
        async def say(context: RunContext, text: str):
            "says something out loud to the user"
            session.say(text)
            return "success"

        return Agent(
            chat_ctx=ctx,
            instructions="\n".join(self.rules),
            allow_interruptions=True,
            tools=[
                *await self.make_function_tools(context=context),
                say
            ]
        )
    
print(f"running on port {service.port}")
asyncio.run(service.run())

Running the Simple VoiceBot in Test Mode:

Testing a VoiceBot works the same way as a ChatBot — just be sure to update your agent name and service path. To run the agent you will need two tabs open in your terminal. You need to be in the appropriate directory and have your virtual environment activated in each tab. In the first tab run:
python main.py
In the second tab run the call agent command to bring our agent into the room:
meshagent call agent --url=http://localhost:7777/voice --room=gettingstarted --participant-name=voiceagent
Note: This will bring in the agent named voiceagent into the gettingstarted room and run it on port 7777 using the service path voice. If you want to rename the agent, room, port, or service path simply update the variables in the command. Now we can navigate to the MeshAgent Studio and we’ll see our agent, voiceagent show up under the participants tab! We can talk to the agent now and make sure that it responds with a fun fact like we instructed it to. We can also click on the menu inside the room and turn on the “Developer console” this will allow us to see live logs, traces, and metrics as we interact with the agent.

Phase 2: Adding Built-in MeshAgent Tools to our VoiceBot

Next, let’s add the same built-in MeshAgent tools to our VoiceBot that we gave to our ChatBot in the previous example. These tools will give the voice agent the ability to interact with the user, convert documents to markdown, and write documents to the room. Toolkits group tools together and can be used by both chat and voice agents. We’ll also add a few imports and update the voice agent’s rules so it knows how to interact with the available tools efficiently.
Python
import os
import uuid
import asyncio
from datetime import datetime, date
from openai import AsyncOpenAI
from livekit.agents import function_tool, ChatContext, Agent, RunContext, AgentSession
from livekit.plugins import openai, silero

from meshagent.api import RequiredToolkit, RequiredSchema
from meshagent.livekit.agents.voice import VoiceBot
from meshagent.api.services import ServiceHost
from meshagent.tools.document_tools import DocumentAuthoringToolkit, DocumentTypeAuthoringToolkit
from meshagent.agents.schemas.document import document_schema
from meshagent.tools import Tool, Toolkit, ToolContext
from meshagent.otel import otel_config

service = ServiceHost(
    port=int(os.getenv("MESHAGENT_PORT","7777"))
)

otel_config(service_name="my-service") # automatically enables telemetry data collection for your agents and tools 

@service.path("/voice")
class SimpleVoicebot(VoiceBot):
    def __init__(self):
        super().__init__(
            name="voice_agent",
            title="voice_agent",
            description="a sample voicebot",
            rules=[
                "Always respond to the user and include a fun fact at the end of your response.",
                "keep your answers short and sweet and be friendly DO NOT include emojis in your response",
                "Use the ask_user tool to pick the name of a document, pick a document name if the tool is not available.",
                "The document names MUST have the extension .document, automatically add the extension if it is not provided",
                "You MUST always write content to a document",
                "First open a document, then use tools to write the document content before closing the document",
                "Before closing the document, ask the user if they would like any additional modifications to be made to the document, and if so, make them. continue to ask the user until they are happy with the contents. you are not finished until the user is happy.",
                "Blob URLs MUST not be added to documents, they must be saved as files first",
            ],
            requires=[
                RequiredToolkit(
                    name="ui"
                ),
                RequiredSchema(
                    name="document"
                ),
                RequiredToolkit(
                    name="meshagent.markitdown", 
                    tools=["markitdown_from_file"]
                ),
            ],
            toolkits=[
                DocumentAuthoringToolkit(),
                DocumentTypeAuthoringToolkit(
                    schema=document_schema,
                    document_type="document"
                )
            ],
        )

    def create_session(self, *, context: ToolContext) -> AgentSession:
        token : str = context.room.protocol.token
        url : str = context.room.room_url
         
        room_proxy_url = f"{url}/v1"
            
        oaiclient = AsyncOpenAI(
            api_key=token,
            base_url=room_proxy_url,
            default_headers={
                "Meshagent-Session" : context.room.session_id
            }
        )

        session = AgentSession(
            max_tool_steps=50,
            allow_interruptions=True,
            vad=silero.VAD.load(),
            stt=openai.STT(
                client=oaiclient
            ),
            tts=openai.TTS(
                client=oaiclient,
                voice="sage"
            ),
            llm=openai.LLM(
                client=oaiclient, 
                model="gpt-4.1"
            ),
        )
        return session


    async def create_agent(self, *, context, session):
        ctx=ChatContext()
        today_str = date.today().strftime("%A %B %-d")
        ctx.add_message(role="assistant", content=f"Today's date is: {today_str}")

        @function_tool
        async def say(context: RunContext, text: str):
            "says something out loud to the user"
            session.say(text)
            return "success"

        return Agent(
            chat_ctx=ctx,
            instructions="\n".join(self.rules),
            allow_interruptions=True,
            tools=[
                *await self.make_function_tools(context=context),
                say
            ]
        )
    
print(f"running on port {service.port}")
asyncio.run(service.run())
We can test the agent using the same commands as we did with the simple voice agent!
python main.py
meshagent call agent --url=http://localhost:7777/voice --room=gettingstarted --participant-name=voiceagent

Phase 3: Adding Custom Tools to our VoiceBot

Now let’s add a custom tool to our VoiceBot! We’ll use the same TaskTracker toolkit we used for the ChatBot in our previous example. This Toolkit allows the agent to write and read tasks to the Room database. To do this we will update the voice agent initialization and create a table for the tasks when the room starts up. We will create a tasks toolkit with two custom tools to WriteTask and GetTasks from the database. This is a simple example of adding tasks, to create a more useful task writer we’d want to add date information and other metadata to better track and filter tasks. This example is mainly to demonstrate writing to room storage and adding custom tools to our voice agent.
Python
import os
import uuid
import asyncio
from datetime import datetime, date
from openai import AsyncOpenAI
from livekit.agents import function_tool, ChatContext, Agent, RunContext, AgentSession
from livekit.plugins import openai, silero

from meshagent.api import RequiredToolkit, RequiredSchema
from meshagent.livekit.agents.voice import VoiceBot
from meshagent.api.services import ServiceHost
from meshagent.tools.document_tools import DocumentAuthoringToolkit, DocumentTypeAuthoringToolkit
from meshagent.agents.schemas.document import document_schema
from meshagent.api.room_server_client import TextDataType, TimestampDataType
from meshagent.api.messaging import TextResponse, JsonResponse
from meshagent.tools import Tool, Toolkit, ToolContext
from meshagent.otel import otel_config

service = ServiceHost(
    port=int(os.getenv("MESHAGENT_PORT","7777"))
)

otel_config(service_name="my-service") # automatically enables telemetry data collection for your agents and tools 

class WriteTask(Tool):
    def __init__(self):
        super().__init__(
            name="WriteTask",
            title="Add a task",
            description="A tool to add tasks to the database",
            input_schema={
                "type": "object",
                "additionalProperties" : False,
                "required": [
                    "taskdescription"
                    ],
                "properties": {
                    "taskdescription": {"type": "string"}
                    }
            }
        )

    async def execute(self, context, taskdescription: str):
        await context.room.database.insert(
            table="tasks",
            records=[{
                "task_id": str(uuid.uuid4()),
                "taskdescription": taskdescription
            }]
        )
        return TextResponse(text="Task added!")
    
class GetTasks(Tool):
    def __init__(self):
        super().__init__(
            name="GetTasks",
            title="List tasks",
            description="List tasks recorded today or this week",
            input_schema={
                "type": "object",
                "additionalProperties":False,
                "required": [],
                "properties": {}
            }
        )

    async def execute(self, context):
        return JsonResponse(json={"values": await context.room.database.search(table="tasks")})

@service.path("/voice")
class SimpleVoicebot(VoiceBot):
    def __init__(self):
        super().__init__(
            name="voice_agent",
            title="voice_agent",
            description="a sample voicebot",
            rules=[
                "Always respond to the user and include a fun fact at the end of your response.",
                "keep your answers short and sweet and be friendly DO NOT include emojis in your response",
                "Use the ask_user tool to pick the name of a document, pick a document name if the tool is not available.",
                "The document names MUST have the extension .document, automatically add the extension if it is not provided",
                "You MUST always write content to a document",
                "First open a document, then use tools to write the document content before closing the document",
                "Before closing the document, ask the user if they would like any additional modifications to be made to the document, and if so, make them. continue to ask the user until they are happy with the contents. you are not finished until the user is happy.",
                "Blob URLs MUST not be added to documents, they must be saved as files first",
            ],
            requires=[
                RequiredToolkit(
                    name="ui"
                ),
                RequiredSchema(
                    name="document"
                ),
                RequiredToolkit(
                    name="meshagent.markitdown", 
                    tools=["markitdown_from_file"]
                ),
            ],
            toolkits=[
                DocumentAuthoringToolkit(),
                DocumentTypeAuthoringToolkit(
                    schema=document_schema,
                    document_type="document"
                ),
                Toolkit(name="tasktools", tools=[WriteTask(), GetTasks()])
            ],
        )

    async def start(self, *, room):
        await super().start(room=room)
        # One tiny table:
        await room.database.create_table_with_schema(
            name="tasks",
            schema={
                "task_id": TextDataType(),
                "taskdescription": TextDataType()
            },
            mode="overwrite",
            data=None
        )

    def create_session(self, *, context: ToolContext) -> AgentSession:
        token : str = context.room.protocol.token
        url : str = context.room.room_url
         
        room_proxy_url = f"{url}/v1"
            
        oaiclient = AsyncOpenAI(
            api_key=token,
            base_url=room_proxy_url,
            default_headers={
                "Meshagent-Session" : context.room.session_id
            }
        )

        session = AgentSession(
            max_tool_steps=50,
            allow_interruptions=True,
            vad=silero.VAD.load(),
            stt=openai.STT(
                client=oaiclient
            ),
            tts=openai.TTS(
                client=oaiclient,
                voice="sage"
            ),
            llm=openai.LLM(
                client=oaiclient, 
                model="gpt-4.1"
            ),
        )
        return session


    async def create_agent(self, *, context, session):
        ctx=ChatContext()
        today_str = date.today().strftime("%A %B %-d")
        ctx.add_message(role="assistant", content=f"Today's date is: {today_str}")

        @function_tool
        async def say(context: RunContext, text: str):
            "says something out loud to the user"
            session.say(text)
            return "success"

        return Agent(
            chat_ctx=ctx,
            instructions="\n".join(self.rules),
            allow_interruptions=True,
            tools=[
                *await self.make_function_tools(context=context),
                say
            ]
        )
    
print(f"running on port {service.port}")
asyncio.run(service.run())
We can test the agent using the same commands as we did with the simple voice agent! And now when we go to the room we can ask the agent to add a task to our task database.
python main.py
meshagent call agent --url=http://localhost:7777/voice --room=gettingstarted --participant-name=voiceagent
Once we’re satisfied with how the agent is performing we can deploy and share it. Note: Building an agent will likely take multiple rounds of iterating through writing different versions of the system prompt and crafting the best tools for the agent.

Deploying and Running the Agent as a MeshAgent Service

This is the same process as in the Build and Deploy a ChatBot Example.

Prerequisites:

To deploy the agent you will need to have docker setup and a container registry with your cloud provider (e.g. GCP, Azure, AWS). For this example we are using Azure Container Registry. These steps assume you have created a container registry in Azure, have a service principal setup, and have the appropriate permissions to access and push images to the container registry. You will need to create a new image pull secret so that MeshAgent can pull and run your container, the image pull secret requires the service principal ID and password. To save the image pull secret navigate to the MeshAgent Studio, click the Secrets tab, then click New Image Pull Secret, give a name to your secret and fill in the required information. Once you have the registry setup proceed with the following steps.

Step 1: Build and Push Your Container

  1. Create and activate a dedicated Buildx builder We recommend using zstd images to speed up image pulls. To enable building zstd images, run the following commands:
docker buildx create --name zstd-builder --driver docker-container
docker buildx use zstd-builder
  1. Log in to Azure, connect to your ACR instance, then build and push the docker container. To build this container, use docker buildx to make a linux/amd64 image and push it to your registry.
az login
az acr login --name myregistry
docker buildx build . --platform linux/amd64 --output=type=image,name=myregistry.azurecr.io/mychatagent:v1,oci-mediatypes=true,compression=zstd,compression-level=3,force-compression=true,push=true

Step 2: Create a Service in MeshAgent for each of your Agents

  1. Navigate to the Services tab and click the button to create a New Service and fill in the required information about the agent.
  2. Add the MESHAGENT_PORT as an environment variable, this needs to match the port that you register the service with.

Step 3: Try your agent!

Navigate to the Sessions tab in the MeshAgent Studio and either join an existing room or create a new room by starting a session. You will then see the VoiceBot Agent available in the room for you to interact with!
  1. Navigate to your Room, click the hamburger menu icon in the top left corner, next click “Share”
  2. Select the Room and Agent you’d like to share and click “Generate Link”
  3. Proudly share your Agent!