# Custom Logs, Traces, and Metrics

> Add your own OpenTelemetry spans, logs, and metrics inside a MeshAgent service or room-connected toolkit.

MeshAgent already gives you logs, traces, metrics, session data, and developer logs in MeshAgent Studio.

Use this page when you want to add your own spans, logs, and metrics inside a custom Python service or room-connected toolkit.

For the built-in observability model first, see [Observability](./overview).

## Enable telemetry in your process

Call `otel_config()` once at process startup:

```python Python theme={null}
from meshagent.otel import otel_config
otel_config(service_name="weather-service")
```

When this code runs inside MeshAgent, `otel_config()` uses the injected `OTEL_ENDPOINT` and room/session environment to send telemetry to MeshAgent Studio.
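
When the same script runs outside MeshAgent (for example with a plain `python3` invocation), the injected endpoint may be missing. A minimal startup check could look like the sketch below. It assumes only the `OTEL_ENDPOINT` variable named above; the helper `telemetry_endpoint` is hypothetical and not part of the MeshAgent SDK.

```python
import os
from typing import Mapping, Optional


def telemetry_endpoint(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the injected OTEL endpoint, or None if it is missing or blank."""
    value = env.get("OTEL_ENDPOINT", "").strip()
    return value or None


if __name__ == "__main__":
    endpoint = telemetry_endpoint()
    if endpoint is None:
        # Hypothetical warning text; adjust to your own startup logging.
        print("OTEL_ENDPOINT is not set; telemetry may not reach MeshAgent Studio")
    else:
        print(f"Exporting telemetry to {endpoint}")
```

Passing the environment as a parameter keeps the check easy to unit test without touching the real process environment.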

## Example: Weather Toolkit with custom instrumentation

The example adds custom spans around validation, the external HTTP call, and response parsing. It also adds logs and metrics.

<CodeGroup>
  ```python Python theme={null}
  import asyncio
  import logging

  import httpx
  from meshagent.agents import SingleRoomAgent
  from meshagent.otel import otel_config
  from meshagent.tools import FunctionTool, ToolContext, Toolkit
  from opentelemetry import metrics, trace
  from opentelemetry.trace import Status, StatusCode

  # Configure OpenTelemetry
  otel_config(service_name="weather_tools")
  log = logging.getLogger(__name__)
  tracer = trace.get_tracer(__name__)
  meter = metrics.get_meter(__name__)

  # Counters
  calls = meter.create_counter(
      "weather.calls", unit="1", description="Total weather tool invocations"
  )
  errors = meter.create_counter(
      "weather.errors", unit="1", description="Errors during execution"
  )


  class WeatherTool(FunctionTool):
      def __init__(self):
          super().__init__(
              name="get_weather",
              title="Weather Tool",
              description="Get current weather for a city using wttr.in API",
              input_schema={
                  "type": "object",
                  "additionalProperties": False,
                  "required": ["city", "units"],
                  "properties": {
                      "city": {"type": "string", "description": "City name"},
                      "units": {
                          "type": "string",
                          "enum": ["metric", "imperial"],
                          "description": "Units the temperature will be returned in",
                      },
                  },
              },
          )

      async def execute(self, context: ToolContext, *, city: str, units: str):
          """
          Demonstrates custom instrumentation that MeshAgent does not add automatically:
          1. Separate spans for validation, API call, parsing
          2. Custom attributes (API endpoint, response size)
          3. Events for important moments (rate limits etc.)
          4. Error handling with span status
          """
          log.info(f"Weather tool is running for city: {city} with units: {units}")
          calls.add(1, attributes={"city": city.lower(), "units": units})
          # Custom span for input validation
          with tracer.start_as_current_span("validate_input") as span:
              span.set_attribute("city", city)

              if not city or len(city) < 2:
                  span.set_status(Status(StatusCode.ERROR, "Invalid city name"))
                  span.add_event("validation_failed", {"reason": "city too short"})
                  return {"error": "City name must be at least 2 characters"}

              span.add_event("validation_passed")

          # Custom span for external API call
          with tracer.start_as_current_span("fetch_weather_api") as span:
              # Add attributes about the API call
              api_url = f"https://wttr.in/{city}?format=j1"
              span.set_attribute("http.url", api_url)
              span.set_attribute("http.method", "GET")
              span.set_attribute("api.provider", "wttr.in")

              span.add_event("api_request_start")

              try:
                  async with httpx.AsyncClient(timeout=10.0) as client:
                      response = await client.get(api_url)

                      # Record response attributes
                      span.set_attribute("http.status_code", response.status_code)
                      span.set_attribute("http.response_size", len(response.content))

                      if response.status_code == 429:
                          span.add_event("rate_limit_exceeded")
                          span.set_status(Status(StatusCode.ERROR, "Rate limited"))
                          return {"error": "Rate limit exceeded, try again later"}

                      response.raise_for_status()
                      data = response.json()

                      span.add_event(
                          "api_request_success",
                          {
                              "data_keys": list(data.keys()),
                          },
                      )

              except httpx.TimeoutException:
                  span.set_status(Status(StatusCode.ERROR, "API timeout"))
                  span.add_event("api_timeout")
                  errors.add(1, attributes={"kind": "timeout", "city": city.lower()})
                  return {"error": "Weather service timeout"}
              except Exception as e:
                  errors.add(
                      1, attributes={"kind": type(e).__name__, "city": city.lower()}
                  )
                  span.set_status(Status(StatusCode.ERROR, str(e)))
                  span.add_event("api_error", {"error_type": type(e).__name__})
                  return {"error": f"Failed to fetch weather: {str(e)}"}

          # Custom span for parsing and formatting
          with tracer.start_as_current_span("parse_response") as span:
              try:
                  current = data["current_condition"][0]
                  location = data["nearest_area"][0]

                  if units == "metric":
                      temperature = current["temp_C"]
                      degrees_in = "°C"
                  elif units == "imperial":
                      temperature = current["temp_F"]
                      degrees_in = "°F"
                  else:
                      log.warning(
                          f"Units {units} is not a valid unit. Must use metric or imperial"
                      )
                      span.set_status(Status(StatusCode.ERROR, "Invalid units"))
                      return {"error": "Invalid units: must be 'metric' or 'imperial'"}

                  result = {
                      "city": location["areaName"][0]["value"],
                      "country": location["country"][0]["value"],
                      "temperature": temperature,
                      "units": degrees_in,
                      "description": current["weatherDesc"][0]["value"],
                      "humidity": current["humidity"],
                      "wind_speed": current["windspeedKmph"],
                  }

                  # Record what we parsed
                  span.set_attribute("parsed_fields", len(result))
                  span.add_event("parse_success")

                  return result

              except (KeyError, IndexError) as e:
                  span.set_status(Status(StatusCode.ERROR, "Parse failed"))
                  span.add_event("parse_error", {"error": str(e)})
                  errors.add(1, attributes={"kind": "parse_error"})
                  return {"error": "Failed to parse weather data"}


  class WeatherToolkit(Toolkit):
      def __init__(self):
          super().__init__(
              name="weather-toolkit",
              title="Weather Toolkit",
              description="Tools for getting weather information",
              tools=[WeatherTool()],
          )


  class WeatherAgent(SingleRoomAgent):
      async def get_exposed_toolkits(self) -> list[Toolkit]:
          return [WeatherToolkit()]


  async def main() -> None:
      agent = WeatherAgent(title="weather-toolkit-host")
      await agent.run()


  if __name__ == "__main__":
      asyncio.run(main())

  ```
</CodeGroup>

### Custom spans

Use spans to track specific operations inside a trace.

This example records:

* `execute.weather-toolkit.get_weather`, which MeshAgent creates for the overall tool call
* `validate_input` for city validation
* `fetch_weather_api` for the outbound HTTP request, including attributes such as `http.url`, `http.method`, `http.status_code`, and `http.response_size`
* `parse_response` for unit selection and the final response shaping step

This gives you finer-grained visibility inside the spans MeshAgent already creates.

**Pattern:**

```python Python theme={null}
from opentelemetry import trace
from opentelemetry.trace import Status, StatusCode

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("fetch_weather_api") as span:
    span.set_attribute("http.url", api_url)
    span.add_event("api_request_start")
    ...
    if response.status_code == 429:
        span.set_status(Status(StatusCode.ERROR, "Rate limited"))
```

These spans appear in the room's **Traces** tab.
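
In the full example, the `validate_input` span checks only the city, and the units are checked later during parsing. One option is to validate both inputs up front so that a single span reflects every input check. The sketch below is a pure helper that could be called inside that span. The names `validation_error` and `VALID_UNITS` are hypothetical. The input schema already restricts `units`, so the second check is defensive.

```python
from typing import Optional

VALID_UNITS = ("metric", "imperial")


def validation_error(city: str, units: str) -> Optional[str]:
    """Return a short reason if the inputs are invalid, otherwise None."""
    if not city or len(city.strip()) < 2:
        return "city too short"
    if units not in VALID_UNITS:
        return f"unsupported units: {units!r}"
    return None


if __name__ == "__main__":
    print(validation_error("Costa Mesa", "imperial"))
    print(validation_error("X", "metric"))
```

Inside the span, a non-`None` result can be passed to `span.add_event("validation_failed", {"reason": reason})` and `span.set_status(...)`, mirroring the calls the example already makes.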

### Custom logs

`otel_config()` sets up an OTEL logging handler so normal `logging` calls are captured too.

```python Python theme={null}
import logging
log = logging.getLogger(__name__)

log.info(f"Weather tool is running for city: {city} with units: {units}")
```

These logs appear in the room's **Logs** tab.
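
Because these are ordinary `logging` calls, standard-library behavior such as level filtering and lazy `%s` formatting applies as usual. The sketch below uses an in-memory handler to show what a handler receives. `ListHandler` and the logger name are hypothetical test helpers, not part of MeshAgent or OpenTelemetry.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical test helper: keeps emitted records in memory."""

    def __init__(self) -> None:
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


log = logging.getLogger("weather_tools.demo")
log.setLevel(logging.INFO)
log.propagate = False  # keep the demo records away from the root logger
handler = ListHandler()
log.addHandler(handler)

# %-style arguments are formatted only when a handler asks for the message.
log.info("Weather tool is running for city: %s with units: %s", "Lisbon", "metric")
# DEBUG is below the configured level, so no record is created.
log.debug("raw payload: %s", {"ignored": True})

print(handler.records[0].getMessage())
```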

### Custom metrics

You can also add counters and histograms to track trends over time.

```python Python theme={null}
from opentelemetry import metrics
meter = metrics.get_meter("weather.tools")

calls = meter.create_counter("weather.calls", unit="1",
                             description="Total weather tool invocations")

# When a call starts
calls.add(1, attributes={"city": city.lower(), "units": units})
```

These metrics appear in the room's **Metrics** tab.
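
The sample creates counters only. For a histogram, a common use is request latency. The sketch below assumes `histogram` is any object with OpenTelemetry's `Histogram.record(amount, attributes=...)` method, such as one returned by `meter.create_histogram(...)`. The metric name `weather.latency`, the `timed` helper, and the `FakeHistogram` stand-in are hypothetical and exist only so the snippet runs without a configured meter.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(histogram, attributes=None):
    """Record the elapsed time of the with-block, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        histogram.record(elapsed_ms, attributes=attributes or {})


class FakeHistogram:
    """Stand-in with the same record() shape as an OpenTelemetry Histogram."""

    def __init__(self):
        self.samples = []

    def record(self, amount, attributes=None):
        self.samples.append((amount, attributes))


# In a service: latency = meter.create_histogram("weather.latency", unit="ms")
latency = FakeHistogram()
with timed(latency, {"api.provider": "wttr.in"}):
    time.sleep(0.01)  # placeholder for the outbound HTTP request
```

Recording in a `finally` block means failed requests are timed too, which keeps the latency distribution honest when the upstream API is slow to fail.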

## Deploy the sample

You can run this toolkit locally with `meshagent room connect`, deploy it as a service, and invoke `get_weather()` from Studio or the CLI.

### Run it locally

During development, run the toolkit with `meshagent room connect` so MeshAgent provides the room token, room name, LLM proxy credentials, and telemetry environment:

```bash bash theme={null}
meshagent rooms create gettingstarted --if-not-exists
meshagent room connect --room=gettingstarted --identity=weather-toolkit -- python3 observability.py
```

### Step 1: Package and deploy the room service

Package the sample with a `meshagent.yaml` file and a container image that MeshAgent can run. For the general deployment flow, see [Service YAML](../services/deployment/deploy_services).

This example uses a MeshAgent runtime image plus a lightweight code image. The default YAML points at the public `python-docs-examples` image so you can run the docs example without building your own image first.

Project structure:

```bash theme={null}
your-project/
├── Dockerfile                    # Shared by all samples
├── observability/
│   ├── observability.py
│   └── meshagent.yaml            # Config specific to this sample
└── another_sample/               # Other samples follow same pattern
    ├── another_sample.py
    └── meshagent.yaml
```

If you are building a single tool, you only need the `observability/` folder.

### Step 1a: Build a Docker image

Create a Dockerfile based on `scratch` that contains only the files you want to run:

<CodeGroup>
  ```dockerfile Dockerfile theme={null}
  FROM scratch

  COPY . /
  ```
</CodeGroup>

Build and push the image:

```bash bash theme={null}
docker buildx build . \
  -t "<REGISTRY>/<NAMESPACE>/<IMAGE_NAME>:<TAG>" \
  --platform linux/amd64 \
  --push
```

### Step 1b: Define the service

Create a `meshagent.yaml` file that references:

* Runtime image: The MeshAgent Python SDK image with all dependencies
* Code mount: Your code-only image mounted at `/src`
* Command path: Points to your sample's specific location
* Participant token: Injects `MESHAGENT_TOKEN` for the room-connected toolkit process

<CodeGroup>
  ```yaml Yaml theme={null}
  kind: Service
  version: v1
  metadata:
    name: otel-example
    description: "An example weather tool with otel instrumentation"
  container:
    image: "us-central1-docker.pkg.dev/meshagent-public/images/python-sdk:{SERVER_VERSION}-esgz"
    command: python /src/observability/observability.py
    environment:
      - name: MESHAGENT_TOKEN
        token:
          identity: weather-toolkit
          role: agent
    storage:
      images:
        # Replace this image tag with your own code-only image if you build one.
        - image: "us-central1-docker.pkg.dev/meshagent-public/images/python-docs-examples:{SERVER_VERSION}"
          path: /src
          read_only: true

  ```
</CodeGroup>

Path mapping:

* Your code image contains `/observability/observability.py`
* It's mounted at `/src` in the runtime container
* The command runs `python /src/observability/observability.py`
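
The mapping above can be checked with a few lines of Python. This is only an illustration of how the mount point and the path inside the code image combine; the variable names are hypothetical.

```python
from pathlib import PurePosixPath

mount_point = PurePosixPath("/src")  # storage.images[].path in meshagent.yaml
path_in_image = PurePosixPath("/observability/observability.py")  # location inside the code image

# Strip the leading "/" so the image path is appended under the mount point.
# Joining two absolute paths directly would discard the mount point.
path_in_container = mount_point / path_in_image.relative_to("/")
print(path_in_container)
```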

To deploy your own code, replace the `us-central1-docker.pkg.dev/meshagent-public/images/python-docs-examples` image reference with the tag of the image you built in Step 1a.

### Step 1c: Deploy the service

From the directory that contains `meshagent.yaml`:

```bash theme={null}
meshagent service create --file "meshagent.yaml" --room=gettingstarted
```

### Step 2: Invoke the tool

Once the service is deployed, invoke the tool from MeshAgent Studio or with the MeshAgent CLI:

```bash bash theme={null}
meshagent room agents invoke-tool \
  --room=gettingstarted \
  --toolkit=weather-toolkit \
  --tool=get_weather \
  --arguments='{"city":"Costa Mesa","units":"imperial"}'
```
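
When the arguments come from a script, hand-written JSON inside shell quotes is easy to get wrong. The sketch below builds the same command from Python, using only the flags shown above. The helper name `invoke_tool_argv` is hypothetical.

```python
import json
import shlex
from typing import Any, Dict, List


def invoke_tool_argv(room: str, toolkit: str, tool: str, arguments: Dict[str, Any]) -> List[str]:
    """Build the argv for `meshagent room agents invoke-tool` with JSON-encoded arguments."""
    return [
        "meshagent", "room", "agents", "invoke-tool",
        f"--room={room}",
        f"--toolkit={toolkit}",
        f"--tool={tool}",
        f"--arguments={json.dumps(arguments, separators=(',', ':'))}",
    ]


argv = invoke_tool_argv(
    "gettingstarted",
    "weather-toolkit",
    "get_weather",
    {"city": "Costa Mesa", "units": "imperial"},
)
print(shlex.join(argv))  # shell-quoted form that can be pasted into a terminal
```

The list form can be passed directly to `subprocess.run(argv, check=True)`, which avoids shell quoting altogether.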

## View telemetry

In [MeshAgent Studio](https://studio.meshagent.com), you can inspect the telemetry from this example in both the **Session** view and the **Developer Console**.

After you invoke the weather tool, the **Traces** tab should show a tree like:

```
execute.weather-toolkit.get_weather
 ├─ validate_input
 ├─ fetch_weather_api
 └─ parse_response
```

Logs and metrics appear under their respective tabs.

## Next Steps

* [Observability](./overview): understand the built-in telemetry model and where to inspect it in MeshAgent Studio
* [Agents](../agents/overview): understand how agents work in MeshAgent and start building your first one
* [Tools and Toolkits](../agents/tools/tools_and_toolkits): learn how tools are discovered, shared, and called inside a MeshAgent room
* [Service YAML](../services/deployment/deploy_services): write service manifests for instrumented agents and tools
