Datasets

Overview

The DatasetsClient is the Room API for room-scoped structured data. Use it to create tables, insert and update rows, build indexes, and run text or vector search without provisioning a separate database. Datasets are designed and optimized for batch writes, indexing, scans, search, and retrieval. They are not a replacement for a transactional database: avoid using them for high-frequency row-by-row mutations, cross-table transactions, locks, or workloads that require immediate transactional consistency.

CLI commands

Start with the CLI help, then use a few common commands:

meshagent room dataset --help
meshagent room dataset table --room myroom
meshagent room dataset inspect --room myroom --table users
meshagent room dataset search --room myroom --table users

Why use the Datasets API?

- Keep structured room data close to the agents and services that use it.
- Support filtering, analytics, semantic search, and retrieval workflows in one place.
- Manage schemas and indexes through the same Room API surface you already use for messaging, storage, and sync.

How it works

Each room dataset contains named tables. You can create tables from a schema or from raw data, update rows, create scalar/full-text/vector indexes, and run searches that combine filters with text or vector similarity. Use it when your room needs structured records instead of only files or chat history.

Current implementation: The MeshAgent room dataset is currently backed directly by Lance datasets, which provide the table, vector, and full-text primitives used by the room dataset toolkit.

Typed values: MeshAgent also supports json, uuid, list, and struct dataset types. In the typed SDKs, use the wrapper classes for those values, such as DatasetJson, DatasetStruct, DatasetUuid / UuidValue, and DatasetExpression where the client requires them.

Permissions and grants

The Datasets API is controlled by the dataset grant on the participant token. In practice:

- list_tables controls whether the participant can list tables.
- Table grants control read, write, and alter access per table.
- If no table list is supplied, access is broad across the room dataset.

See API Scopes and Service YAML.
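
The three grant rules above can be read as a small decision procedure. The sketch below is a hypothetical model for illustration, not the MeshAgent token format: the `DatasetGrant` and `TableGrant` classes, their fields, and the `is_allowed` helper are all assumptions. It also assumes that a table missing from a supplied table list gets no access, which this page does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TableGrant:
    """Hypothetical per-table grant; field names are illustrative only."""
    name: str
    read: bool = False
    write: bool = False
    alter: bool = False


@dataclass(frozen=True)
class DatasetGrant:
    """Hypothetical dataset grant mirroring the rules listed above."""
    list_tables: bool = False
    # None models "no table list was supplied".
    tables: Optional[Tuple[TableGrant, ...]] = None


def is_allowed(grant: DatasetGrant, table: str, action: str) -> bool:
    """Return True if `action` ("read", "write", or "alter") is permitted on `table`."""
    if action not in ("read", "write", "alter"):
        raise ValueError(f"unknown action: {action}")
    if grant.tables is None:
        # No table list supplied: access is broad across the room dataset.
        return True
    for entry in grant.tables:
        if entry.name == table:
            return getattr(entry, action)
    # Assumption: a table not named in the list is not accessible.
    return False
```

For example, `is_allowed(DatasetGrant(tables=(TableGrant("users", read=True),)), "users", "write")` evaluates to `False` in this model, while a grant with no table list allows every action.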

API reference

Use the methods below to manage room tables, rows, indexes, and search workflows. Each method is asynchronous, so you should await the call.

`list_tables()`

Description: Retrieves a list of all table names currently present in the dataset.
Returns: A promise that resolves to an array of table name strings.

Example:

meshagent room dataset table \
  --room myroom

`create_table_with_schema(...)`

Description: Creates a new table with an optional schema and initial data. You can specify how the table should be created through the mode parameter.
- modes:
  - "create": Creates the table; fails if it already exists.
  - "overwrite": Drops the existing table (if any) and creates a new one.
  - "create_if_not_exists": Creates the table only if it does not already exist.
Parameters:
- name: The name of the new table.
- schema: An optional record defining column names and their data types.
- data: An optional array of initial records to populate the table. Prefer batching records together rather than issuing many single-row writes.
- mode: The creation mode (default is "create").
Returns: A promise that resolves once the table is created.

Example:

meshagent room dataset create \
  --room myroom \
  --table users \
  --columns "id int, username text, email text" \
  --data-json '[{"id":1,"username":"alice","email":"alice@example.com"},{"id":2,"username":"bob","email":"bob@example.com"}]'
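
To make the three mode values concrete, here is a minimal in-memory sketch of their semantics. It does not call the MeshAgent SDK: the `create_table` function and the `tables` dictionary are hypothetical stand-ins used only to show how each mode treats an existing table.

```python
from typing import Dict, List, Optional


def create_table(
    tables: Dict[str, List[dict]],
    name: str,
    data: Optional[List[dict]] = None,
    mode: str = "create",
) -> None:
    """Apply the documented creation modes to an in-memory dict of tables."""
    rows = list(data or [])
    if mode == "create":
        # Fails if the table already exists.
        if name in tables:
            raise ValueError(f"table {name!r} already exists")
        tables[name] = rows
    elif mode == "overwrite":
        # Drops any existing table and creates a new one.
        tables[name] = rows
    elif mode == "create_if_not_exists":
        # Leaves an existing table untouched.
        tables.setdefault(name, rows)
    else:
        raise ValueError(f"unknown mode: {mode}")
```

In this model, calling `create_table` twice with the default mode raises on the second call, whereas `create_if_not_exists` silently keeps the rows that are already there.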

`create_table_from_data(...)`

Description: Creates a table using only data and an optional mode.
Parameters:
- name: The table name to create.
- data: An array of records to initialize the table with.
- mode: Table creation mode (default "create").
Returns: A promise that resolves once the table is created.

Example:

meshagent room dataset create \
  --room myroom \
  --table orders \
  --mode overwrite \
  --data-json '[{"id":1,"product":"Laptop","quantity":2},{"id":2,"product":"Phone","quantity":5}]'

`drop_table(name, ...)`

Description: Drops (deletes) a table by name. Optionally suppresses the error when the table does not exist.
Parameters:
- name: The name of the table to drop.
- ignoreMissing: If true, no error is thrown if the table does not exist.
Returns: A promise that resolves once the table is dropped.

Example:

meshagent room dataset drop \
  --room myroom \
  --table temp_table \
  --ignore-missing

`add_columns(...)`

Description: Adds one or more columns to an existing table, specifying default value expressions.
Parameters:
- table: Name of the target table.
- newColumns: A record mapping column names to default value expressions (SQL or literal).
Returns: A promise that resolves once the columns are added.

For Dart, the expression-based helper is addColumnWithExpression(...). If you want to add columns by explicit Arrow schema instead, use addColumnsWithSchema(...).

Example:

meshagent room dataset add-columns \
  --room myroom \
  --table users \
  --columns "isActive bool, createdAt timestamp"

`drop_columns(...)`

Description: Drops (removes) one or more columns from an existing table.
Parameters:
- table: Name of the target table.
- columns: An array of column names to remove.
Returns: A promise that resolves once the columns are dropped.

Example:

meshagent room dataset drop-columns \
  --room myroom \
  --table users \
  --column deprecatedColumn1 \
  --column deprecatedColumn2

`insert(table, records)`

Description: Inserts one or more new records into a table.
Parameters:
- table: The name of the table to insert into.
- records: An array of objects, each containing column-value pairs.
Returns: A promise that resolves once the records are inserted.

Example:

meshagent room dataset insert \
  --room myroom \
  --table users \
  --json '[{"id":3,"username":"charlie","email":"charlie@example.com"},{"id":4,"username":"dana","email":"dana@example.com"}]'
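
Because datasets favor batch writes over many single-row writes, it can help to group records before inserting them. The helper below is a small sketch in plain Python; the `batched` name and the batch size are illustrative choices, and each yielded list would be passed as the records argument of one insert call.

```python
from typing import Iterable, Iterator, List


def batched(records: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield lists of at most `size` records, preserving input order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch
```

With five records and a size of two, this produces three batches of two, two, and one records, so the table receives three writes instead of five.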

`update(table, where, ...)`

Description: Updates existing records in a table.
Parameters:
- table: Name of the table to update.
- where: A SQL WHERE clause specifying which records to update (e.g. "id = 123").
- values: A record of key-value pairs for direct assignment or expressions (e.g. { age: 30 } or { age: new DatasetExpression("age + 1") } in typed clients).
Returns: A promise that resolves once the update is complete.

Example:

meshagent room dataset update \
  --room myroom \
  --table users \
  --where "id = 3" \
  --values-json '{"email":"newcharlie@example.com","loginCount":{"expression":"loginCount + 1"}}'
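
The CLI example above mixes a literal value with an expression, encoded as an object with an "expression" key. The sketch below shows one way to build that JSON payload in Python. The `Expression` class is a hypothetical stand-in, not the SDK's DatasetExpression, and `to_values_json` is an illustrative helper that only reproduces the shape shown in the CLI example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    """Hypothetical marker for a SQL expression value."""
    sql: str


def to_values_json(values: dict) -> str:
    """Encode update values, wrapping expressions as {"expression": "..."}."""
    encoded = {
        key: {"expression": value.sql} if isinstance(value, Expression) else value
        for key, value in values.items()
    }
    return json.dumps(encoded)


payload = to_values_json(
    {"email": "newcharlie@example.com", "loginCount": Expression("loginCount + 1")}
)
```

The resulting string can be passed to --values-json. Literal values are assigned directly, while expression values are evaluated against the existing row.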

`delete(table, where)`

Description: Deletes records from a table that match a specified condition.
Parameters:
- table: The target table.
- where: A SQL WHERE clause for filtering which records to delete.
Returns: A promise that resolves once the records are deleted.

Example:

meshagent room dataset delete \
  --room myroom \
  --table users \
  --where "id = 4"

`merge(table, records, ...)`

Description: Performs an upsert (update/insert) by merging incoming records into an existing table. Records matching the on column are updated; otherwise, new rows are inserted.
Parameters:
- table: The target table.
- on: The column name used to match existing records.
- records: The record(s) to merge/upsert.
Returns: A promise that resolves once the operation is complete.

Example:

meshagent room dataset merge \
  --room myroom \
  --table users \
  --on id \
  --json '[{"id":1,"username":"alice","email":"alice_new@example.com"},{"id":5,"username":"eric","email":"eric@example.com"}]'
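
The upsert behavior described above can be modeled in a few lines of plain Python. This is an illustrative sketch rather than the service's implementation: `merge_rows` is a hypothetical helper, it assumes the `on` column is unique within the existing rows, and it assumes a matched row is replaced by the incoming record as a whole.

```python
from typing import Dict, List


def merge_rows(existing: List[dict], incoming: List[dict], on: str) -> List[dict]:
    """Upsert `incoming` into `existing`, matching rows on the `on` column."""
    merged: Dict[object, dict] = {row[on]: dict(row) for row in existing}
    for row in incoming:
        # A matching key replaces the stored row; a new key appends a row.
        merged[row[on]] = dict(row)
    return list(merged.values())


users = [
    {"id": 1, "username": "alice", "email": "alice@example.com"},
    {"id": 2, "username": "bob", "email": "bob@example.com"},
]
result = merge_rows(
    users,
    [
        {"id": 1, "username": "alice", "email": "alice_new@example.com"},
        {"id": 5, "username": "eric", "email": "eric@example.com"},
    ],
    on="id",
)
```

Here the row with id 1 is updated, the row with id 2 is left unchanged, and the row with id 5 is inserted.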

`search(table, ...)`

Description: Searches for records in a table. This can be used for plain text search, vector similarity search, or simple SQL filtering.
Parameters:
- table: The target table name.
- text: An optional search string (if using full-text indexes).
- vector: An optional numeric array for vector-based similarity queries.
- where: SQL WHERE clause string or an object representing key-value equals conditions.
- offset: Optional offset for pagination in Python and Dart.
- limit: Maximum number of matching records to return.
- select: An array of column names to be returned.
Returns: An array of matching records.

Example:

meshagent room dataset search \
  --room myroom \
  --table users \
  --where-json '{"username":"alice"}' \
  --limit 1
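
The where parameter accepts either a SQL string or an object of key-value equals conditions. To show how the two forms relate, the sketch below renders a key-value object as an equivalent SQL clause. The `where_from_equals` helper is hypothetical and is not how the client necessarily performs the conversion; it also does not validate column names, so it should only be given trusted keys.

```python
def where_from_equals(conditions: dict) -> str:
    """Render key-value equals conditions as a SQL WHERE clause body."""
    parts = []
    for column, value in conditions.items():
        if value is None:
            parts.append(f"{column} IS NULL")
        elif isinstance(value, bool):
            # Check bool before int, because bool is a subclass of int.
            parts.append(f"{column} = {'true' if value else 'false'}")
        elif isinstance(value, (int, float)):
            parts.append(f"{column} = {value}")
        else:
            # Escape single quotes by doubling them, as in standard SQL.
            escaped = str(value).replace("'", "''")
            parts.append(f"{column} = '{escaped}'")
    return " AND ".join(parts)
```

Under these assumptions, the object from the CLI example, `{"username": "alice"}`, corresponds to the string `username = 'alice'`.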

`optimize(table, ...)`

Description: Optimizes a table (e.g., compacts its storage or rebuilds indexes if required).
Parameters:
- table: Name of the table to optimize.
- config: Optional optimization configuration using Lance option names, including compact_files, optimize_indices, cleanup_old_versions, target_rows_per_fragment, max_rows_per_group, max_bytes_per_file, materialize_deletions, materialize_deletions_threshold, defer_index_remap, num_threads, batch_size, compaction_mode, binary_copy_read_batch_bytes, num_indices_to_merge, index_names, retrain, older_than_seconds, retain_versions, delete_unverified, error_if_tagged_old_versions, and delete_rate_limit.
Returns: Optimization result details for compaction, index optimization, and cleanup.

Example:

meshagent room dataset optimize \
  --room myroom \
  --table users

`stats(table, ...)`

Description: Returns Lance dataset and data statistics for a table.
Parameters:
- table: Name of the table.
- max_rows_per_group: Optional row-group threshold used by Lance dataset stats.
Returns: Parsed dataset and data statistics.

Example:

meshagent room dataset stats \
  --room myroom \
  --table users

meshagent room dataset stats \
  --room myroom \
  --table users \
  --output json

`create_index(...)`

Description: Creates a Lance index on a dataset table.
Parameters:
- table: The target table name.
- config: Index configuration. Uses Lance option names such as column, index_type, name, replace, metric, num_partitions, num_sub_vectors, target_partition_size, filter_nan, train, fragment_ids, index_uuid, skip_transpose, num_bits, index_file_version, max_level, m, ef_construction, with_position, memory_limit, num_workers, skip_merge, base_tokenizer, language, max_token_length, lower_case, stem, remove_stop_words, custom_stop_words, and ascii_folding.
Returns: A promise that resolves once the index is created.

Supported index_type values include vector indexes (IVF_PQ, IVF_HNSW_PQ, IVF_HNSW_SQ, IVF_RQ) and scalar/text indexes (BTREE, BITMAP, LABEL_LIST, NGRAM, ZONEMAP, INVERTED, FTS, BLOOMFILTER, RTREE).

Example:

meshagent room dataset index-create \
  --room myroom \
  --table documents \
  --column embedding \
  --index-type IVF_PQ \
  --num-partitions 32 \
  --num-sub-vectors 8
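
Product quantization splits each vector into num_sub_vectors equal pieces, so the embedding dimension generally needs to be divisible by that number. The sketch below checks this before assembling a config that uses the option names listed above. The `ivf_pq_config` helper and its `dimension` argument are hypothetical, and this page does not say whether the service performs the same check itself.

```python
def ivf_pq_config(
    column: str, dimension: int, num_partitions: int, num_sub_vectors: int
) -> dict:
    """Validate IVF_PQ parameters and return a config dict for index creation."""
    for label, value in (
        ("dimension", dimension),
        ("num_partitions", num_partitions),
        ("num_sub_vectors", num_sub_vectors),
    ):
        if value < 1:
            raise ValueError(f"{label} must be positive")
    if dimension % num_sub_vectors != 0:
        # Each sub-vector must cover the same number of dimensions.
        raise ValueError("dimension must be divisible by num_sub_vectors")
    return {
        "column": column,
        "index_type": "IVF_PQ",
        "num_partitions": num_partitions,
        "num_sub_vectors": num_sub_vectors,
    }
```

For a 128-dimensional embedding column, the values from the CLI example (32 partitions, 8 sub-vectors) pass the check, giving 16 dimensions per sub-vector.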

`drop_index(...)`

Description: Drop an index by name.
Parameters:
- table: Table name.
- name: Index name to drop.
Returns: None.
Availability: Python and Dart expose helpers today. JS/TS/.NET may add a helper in a future release.

Example:

meshagent room dataset index-drop \
  --room myroom \
  --table users \
  --name email_idx

`list_indexes(table)`

Description: Lists the indexes currently defined on a table.
Parameters:
- table: The name of the table for which to list indexes.
Returns: A list of index entries. Entries include name, columns, type, fields, type_url, num_rows_indexed, num_segments, total_size_bytes, details, and statistics.

Example:

meshagent room dataset index \
  --room myroom \
  --table users

`list_versions(table)`

Description: List historical versions of a table. Reads can target a specific version directly, and restore creates a new head version from an older snapshot.
Methods:
- list_versions(table, branch=None) → list of versions (version, timestamp, metadata) for the selected branch.
- search(table, branch=None, version=None) → read a branch head or a specific historical version without any checkout step.
- restore(table, version, branch=None) → restore a table branch to a prior version by creating a new head commit.
Availability: Python exposes list_versions, search(..., version=...), restore, and dataset branch operations. Dart currently exposes listVersions.

Example:

meshagent room dataset version \
  --room myroom \
  --table users
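
The key point about restore is that it does not rewind history: it adds a new head version whose contents match an older snapshot. The toy model below illustrates that behavior in plain Python. The `VersionedTable` class is hypothetical, ignores branches, and treats every write as a full snapshot, which is a simplification rather than a description of how Lance stores versions.

```python
import copy
from typing import List, Optional


class VersionedTable:
    """Toy append-only history: every write or restore adds a new head version."""

    def __init__(self) -> None:
        # Version 1 is the empty table.
        self._snapshots: List[List[dict]] = [[]]

    @property
    def head(self) -> int:
        return len(self._snapshots)

    def write(self, rows: List[dict]) -> int:
        """Store a new snapshot and return its version number."""
        self._snapshots.append(copy.deepcopy(rows))
        return self.head

    def read(self, version: Optional[int] = None) -> List[dict]:
        """Read the head, or a specific historical version, with no checkout step."""
        index = self.head if version is None else version
        if not 1 <= index <= self.head:
            raise ValueError(f"unknown version: {index}")
        return copy.deepcopy(self._snapshots[index - 1])

    def restore(self, version: int) -> int:
        """Create a new head whose contents equal an older version."""
        return self.write(self.read(version))
```

After two writes and a restore of the first write, the head is a fourth version, and the intermediate version remains readable by number.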

`list_branches()`

Description: List, create, and delete dataset branches for a namespace. Table reads and writes can then target a branch directly with branch=....
Methods:
- list_branches() → list available branches.
- create_branch(branch, from_branch=None) → create a branch from the head of main or another branch.
- delete_branch(branch) → delete a non-main branch.

Example:

meshagent room dataset branch list --room myroom
meshagent room dataset branch create --room myroom --branch exp --from-branch main
