Overview
The DatasetsClient is the Room API for room-scoped structured data. Use it to create tables, insert and update rows, build indexes, and run text or vector search without provisioning a separate dataset.
Datasets are designed and optimized for batch writes, indexing, scans, search, and retrieval. They are not a transactional database replacement: avoid using them for high-frequency row-by-row mutations, cross-table transactions, locks, or workloads that require immediate transactional consistency.
CLI commands
Start with the CLI help, then use a few common commands.
Why use the Datasets API?
- Keep structured room data close to the agents and services that use it.
- Support filtering, analytics, semantic search, and retrieval workflows in one place.
- Manage schemas and indexes through the same Room API surface you already use for messaging, storage, and sync.
How it works
Each room dataset contains named tables. You can create tables from a schema or from raw data, update rows, create scalar/full-text/vector indexes, and run searches that combine filters with text or vector similarity. Use it when your room needs structured records instead of only files or chat history.
Current implementation: The MeshAgent room dataset is currently backed directly by Lance datasets, which provide the table, vector, and full-text primitives used by the room dataset toolkit.
Typed values: MeshAgent also supports json, uuid, list, and struct dataset types. In the typed SDKs, use the wrapper classes for those values, such as DatasetJson, DatasetStruct, DatasetUuid/UuidValue, and DatasetExpression where the client requires them.
Permissions and grants
The Datasets API is controlled by the dataset grant on the participant token.
In practice:
- list_tables controls whether the participant can list tables
- table grants control read, write, and alter access per table
- if no table list is supplied, access is broad across the room dataset
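The rules above can be modeled as a small access check. This is a sketch only: the grant shape used here (a list_tables flag plus an optional list of per-table read/write/alter flags) and the names TableGrant, DatasetGrant, and can_access are hypothetical, chosen for illustration rather than taken from the actual token format. Denying tables that are missing from a supplied list is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TableGrant:
    """Hypothetical per-table grant; field names are assumptions."""
    name: str
    read: bool = False
    write: bool = False
    alter: bool = False


@dataclass(frozen=True)
class DatasetGrant:
    """Hypothetical dataset grant carried on a participant token."""
    list_tables: bool = False
    # None means no table list was supplied.
    tables: Optional[tuple[TableGrant, ...]] = None


def can_access(grant: DatasetGrant, table: str, action: str) -> bool:
    """Return True if the grant allows action (read, write, alter) on table."""
    if action not in ("read", "write", "alter"):
        raise ValueError(f"unknown action: {action}")
    if grant.tables is None:
        # No table list supplied: access is broad across the room dataset.
        return True
    for entry in grant.tables:
        if entry.name == table:
            return getattr(entry, action)
    # Assumption: a table that is not in the supplied list is denied.
    return False
```

Under this model, a grant with no table list allows every action on every table, while a grant listing only one table with read access rejects writes to that table and any access to other tables.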
API reference
Use the methods below to manage room tables, rows, indexes, and search workflows. Each method is asynchronous, so you should await the call.
list_tables()
- Description: Retrieves a list of all table names currently present in the dataset.
- Returns: A promise that resolves to an array of table name strings.
create_table_with_schema(...)
- Description: Creates a new table with an optional schema and initial data. You can specify how the table should be created through the mode parameter.
- Modes:
  - "create": Creates the table; fails if it already exists.
  - "overwrite": Drops the existing table (if any) and creates a new one.
  - "create_if_not_exists": Creates the table only if it does not already exist.
- Parameters:
- name: The name of the new table.
- schema: An optional record defining column names and their data types.
- data: An optional array of initial records to populate the table. Prefer batching records together rather than issuing many single-row writes.
- mode: The creation mode (default is "create").
- Returns: A promise that resolves once the table is created.
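The three creation modes can be illustrated with an in-memory stand-in. This sketch models only the documented mode semantics against a plain dictionary of tables; the create_table function and its signature are hypothetical and do not call the Room API.

```python
from typing import Optional


def create_table(
    tables: dict[str, list[dict]],
    name: str,
    data: Optional[list[dict]] = None,
    mode: str = "create",
) -> None:
    """Apply the documented creation modes to an in-memory table registry."""
    rows = list(data or [])
    if mode == "create":
        # Fails if the table already exists.
        if name in tables:
            raise ValueError(f"table {name!r} already exists")
        tables[name] = rows
    elif mode == "overwrite":
        # Drops any existing table and replaces it.
        tables[name] = rows
    elif mode == "create_if_not_exists":
        # Existing tables are left untouched.
        tables.setdefault(name, rows)
    else:
        raise ValueError(f"unknown mode: {mode}")
```

The practical difference shows up on the second call for the same name: "create" raises, "create_if_not_exists" keeps the existing rows, and "overwrite" discards them.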
create_table_from_data(...)
- Description: Creates a table using only data and an optional mode.
- Parameters:
- name: The table name to create.
- data: An array of records to initialize the table with.
- mode: Table creation mode (default "create").
- Returns: A promise that resolves once the table is created.
drop_table(name, ...)
- Description: Drops (deletes) a table by name, optionally ignoring if it does not exist.
- Parameters:
- name: The name of the table to drop.
- ignoreMissing: If true, no error is thrown if the table does not exist.
- Returns: A promise that resolves once the table is dropped.
add_columns(...)
- Description: Adds one or more columns to an existing table, specifying default value expressions.
- Parameters:
- table: Name of the target table.
- newColumns: A record mapping column names to default value expressions (SQL or literal).
- Returns: A promise that resolves once the columns are added.
Related helpers: addColumnWithExpression(...) adds a column from an expression. If you want to add columns by explicit Arrow schema instead, use addColumnsWithSchema(...).
drop_columns(...)
- Description: Drops (removes) one or more columns from an existing table.
- Parameters:
- table: Name of the target table.
- columns: An array of column names to remove.
- Returns: A promise that resolves once the columns are dropped.
insert(table, records)
- Description: Inserts one or more new records into a table.
- Parameters:
- table: The name of the table to insert into.
- records: An array of objects, each containing column-value pairs.
- Returns: A promise that resolves once the records are inserted.
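Because datasets favor batch writes over row-by-row mutations, it can help to group records before inserting them. The following is a minimal sketch, assuming a client object whose insert(table, records) coroutine accepts positional arguments as in the signature above; the helper names chunked and insert_in_batches and the default batch size of 500 are illustrative assumptions, not part of the SDK.

```python
import asyncio
from typing import Any, Iterable, Iterator


def chunked(records: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most `size` records."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: list[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # Final partial batch.


async def insert_in_batches(
    datasets: Any, table: str, records: Iterable[dict], size: int = 500
) -> int:
    """Insert records batch by batch; returns the number of insert calls made."""
    calls = 0
    for batch in chunked(records, size):
        # Assumption: the client exposes an awaitable insert(table, records).
        await datasets.insert(table, batch)
        calls += 1
    return calls
```

Five records with a batch size of two result in three insert calls instead of five single-row writes.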
update(table, where, ...)
- Description: Updates existing records in a table.
- Parameters:
- table: Name of the table to update.
- where: A SQL WHERE clause specifying which records to update (e.g. "id = 123").
- values: A record of key-value pairs for direct assignment or expressions (e.g. { age: 30 } or { age: new DatasetExpression("age + 1") } in typed clients).
- Returns: A promise that resolves once the update is complete.
delete(table, where)
- Description: Deletes records from a table that match a specified condition.
- Parameters:
- table: The target table.
- where: A SQL WHERE clause for filtering which records to delete.
- Returns: A promise that resolves once the records are deleted.
merge(table, records, ...)
- Description: Performs an upsert (update/insert) by merging incoming records into an existing table. Records matching the on column are updated; otherwise, new rows are inserted.
- Parameters:
- table: The target table.
- on: The column name used to match existing records.
- records: The record(s) to merge/upsert.
- Returns: A promise that resolves once the operation is complete.
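The upsert behavior can be sketched against plain Python lists. This is an in-memory model of the matching rule only, not the Lance-backed implementation; merge_rows is a hypothetical helper, and it assumes each incoming record carries every column (columns missing from an incoming record keep their old values here, which is a simplification).

```python
def merge_rows(existing: list[dict], incoming: list[dict], on: str) -> list[dict]:
    """Return a new row list with `incoming` upserted into `existing` by the `on` column."""
    merged = [dict(row) for row in existing]  # Copy so the input is not mutated.
    position = {row[on]: i for i, row in enumerate(merged)}
    for record in incoming:
        key = record[on]
        if key in position:
            # A row with the same key exists: update it.
            merged[position[key]].update(record)
        else:
            # No matching row: insert a new one.
            position[key] = len(merged)
            merged.append(dict(record))
    return merged
```

Merging a record whose id already exists replaces that row's values, while a record with a new id is appended.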
search(table, ...)
- Description: Searches for records in a table. This can be used for plain text search, vector similarity search, or simple SQL filtering.
- Parameters:
- table: The target table name.
- text: An optional search string (if using full-text indexes).
- vector: An optional numeric array for vector-based similarity queries.
- where: SQL WHERE clause string or an object representing key-value equals conditions.
- offset: Optional offset for pagination in Python and Dart.
- limit: Maximum number of matching records to return.
- select: An array of column names to be returned.
- Returns: An array of matching records.
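To show how the parameters interact, here is a brute-force in-memory model of a filtered vector search. It is a sketch under stated assumptions: where is handled only in its key-value equals form, similarity is plain Euclidean distance, the vector column is named embedding, and search_rows itself is a hypothetical function. The real search runs against Lance indexes and may rank differently depending on the configured metric.

```python
import math
from typing import Optional, Sequence


def search_rows(
    rows: list[dict],
    vector: Optional[Sequence[float]] = None,
    where: Optional[dict] = None,
    offset: int = 0,
    limit: Optional[int] = None,
    select: Optional[list[str]] = None,
    vector_column: str = "embedding",  # Assumed column name.
) -> list[dict]:
    """Filter, rank by distance to `vector`, paginate, then project columns."""
    conditions = where or {}
    # 1. Keep rows that satisfy every key-value equals condition.
    matches = [r for r in rows if all(r.get(k) == v for k, v in conditions.items())]
    # 2. Rank the survivors by Euclidean distance (nearest first).
    if vector is not None:
        matches.sort(key=lambda r: math.dist(r[vector_column], vector))
    # 3. Apply offset and limit for pagination.
    end = None if limit is None else offset + limit
    page = matches[offset:end]
    # 4. Return only the selected columns, if any were requested.
    if select is not None:
        page = [{column: r[column] for column in select} for r in page]
    return page
```

In this model the filter is applied before ranking, so limit counts only rows that already satisfy where.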
optimize(table, ...)
- Description: Optimizes a table (e.g., compacts its storage or rebuilds indexes if required).
- Parameters:
- table: Name of the table to optimize.
- config: Optional optimization configuration using Lance option names, including compact_files, optimize_indices, cleanup_old_versions, target_rows_per_fragment, max_rows_per_group, max_bytes_per_file, materialize_deletions, materialize_deletions_threshold, defer_index_remap, num_threads, batch_size, compaction_mode, binary_copy_read_batch_bytes, num_indices_to_merge, index_names, retrain, older_than_seconds, retain_versions, delete_unverified, error_if_tagged_old_versions, and delete_rate_limit.
- Returns: Optimization result details for compaction, index optimization, and cleanup.
stats(table, ...)
- Description: Returns Lance dataset and data statistics for a table.
- Parameters:
- table: Name of the table.
- max_rows_per_group: Optional row-group threshold used by Lance dataset stats.
- Returns: Parsed dataset and data statistics.
create_index(...)
- Description: Creates a Lance index on a dataset table.
- Parameters:
- table: The target table name.
- config: Index configuration. Uses Lance option names such as column, index_type, name, replace, metric, num_partitions, num_sub_vectors, target_partition_size, filter_nan, train, fragment_ids, index_uuid, skip_transpose, num_bits, index_file_version, max_level, m, ef_construction, with_position, memory_limit, num_workers, skip_merge, base_tokenizer, language, max_token_length, lower_case, stem, remove_stop_words, custom_stop_words, and ascii_folding.
- Returns: A promise that resolves once the index is created.
index_type values include vector indexes (IVF_PQ, IVF_HNSW_PQ, IVF_HNSW_SQ, IVF_RQ) and scalar/text indexes (BTREE, BITMAP, LABEL_LIST, NGRAM, ZONEMAP, INVERTED, FTS, BLOOMFILTER, RTREE).
Example:
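The following is a minimal sketch of building index configurations from the option names listed above. It assumes the config can be expressed as a plain dictionary; the index_config helper, the column names embedding and body, and the specific option values are illustrative assumptions rather than recommended settings.

```python
VECTOR_INDEX_TYPES = {"IVF_PQ", "IVF_HNSW_PQ", "IVF_HNSW_SQ", "IVF_RQ"}
SCALAR_TEXT_INDEX_TYPES = {
    "BTREE", "BITMAP", "LABEL_LIST", "NGRAM", "ZONEMAP",
    "INVERTED", "FTS", "BLOOMFILTER", "RTREE",
}


def index_config(column: str, index_type: str, **options) -> dict:
    """Build an index config dict, rejecting index types not listed in this reference."""
    if index_type not in VECTOR_INDEX_TYPES | SCALAR_TEXT_INDEX_TYPES:
        raise ValueError(f"unsupported index_type: {index_type}")
    return {"column": column, "index_type": index_type, **options}


# A vector index on a hypothetical "embedding" column; values are illustrative.
vector_index = index_config(
    "embedding", "IVF_PQ",
    metric="cosine", num_partitions=16, num_sub_vectors=8, replace=True,
)

# A full-text index on a hypothetical "body" column.
text_index = index_config(
    "body", "INVERTED",
    with_position=True, lower_case=True, stem=True,
)
```

Validating index_type on the client side surfaces typos before the request is sent; the resulting dictionary would then be passed as the config argument of create_index.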
drop_index(...)
- Description: Drop an index by name.
- Parameters:
- table: Table name.
- name: Index name to drop.
- Returns:
None. - Availability: Python and Dart expose helpers today. JS/TS/.NET may add a helper in a future release.
list_indexes(table)
- Description: Lists the indexes currently defined on a table.
- Parameters:
- table: The name of the table for which to list indexes.
- Returns: A list of index entries. Entries include name, columns, type, fields, type_url, num_rows_indexed, num_segments, total_size_bytes, details, and statistics.
list_versions(table)
- Description: List historical versions of a table. Reads can target a specific version directly, and restore creates a new head version from an older snapshot.
- Methods:
  - list_versions(table, branch=None) → list of versions (version, timestamp, metadata) for the selected branch.
  - search(table, branch=None, version=None) → read a branch head or a specific historical version without any checkout step.
  - restore(table, version, branch=None) → restore a table branch to a prior version by creating a new head commit.
- Availability: Python exposes list_versions, search(..., version=...), restore, and dataset branch operations. Dart currently exposes listVersions.
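The versioning behavior described above, where restore adds a new head instead of rewinding history, can be modeled in a few lines. This is an in-memory sketch only: the VersionedTable class is hypothetical, and numbering versions from 1 with an empty initial snapshot is an assumption made for the example.

```python
from typing import Optional


class VersionedTable:
    """In-memory model: every write or restore appends a new snapshot."""

    def __init__(self) -> None:
        # Assumption: version 1 is the empty table.
        self._snapshots: list[list[dict]] = [[]]

    @property
    def head(self) -> int:
        """Version number of the latest snapshot."""
        return len(self._snapshots)

    def insert(self, records: list[dict]) -> int:
        """Append records as a new version and return its number."""
        self._snapshots.append(self._snapshots[-1] + [dict(r) for r in records])
        return self.head

    def read(self, version: Optional[int] = None) -> list[dict]:
        """Read the head, or a specific historical version, without a checkout step."""
        v = self.head if version is None else version
        if not 1 <= v <= self.head:
            raise ValueError(f"unknown version: {v}")
        return [dict(r) for r in self._snapshots[v - 1]]

    def restore(self, version: int) -> int:
        """Create a new head whose contents equal an older snapshot."""
        self._snapshots.append(self.read(version))
        return self.head
```

After two inserts and a restore to the first insert, the head is a fourth version with the older contents, and the intermediate version is still readable.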
list_branches()
- Description: List, create, and delete dataset branches for a namespace. Table reads and writes can then target a branch directly with the branch=... argument.
- Methods:
  - list_branches() → list available branches.
  - create_branch(branch, from_branch=None) → create a branch from the head of main or another branch.
  - delete_branch(branch) → delete a non-main branch.
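The branch rules can be sketched with a small registry that tracks one head version per branch. This is an illustrative model, not the SDK: the BranchRegistry class is hypothetical, and rejecting duplicate branch names is an assumption that the description above does not state.

```python
from typing import Optional


class BranchRegistry:
    """In-memory model of branch heads; main always exists."""

    def __init__(self) -> None:
        self._heads: dict[str, int] = {"main": 1}

    def list_branches(self) -> list[str]:
        return sorted(self._heads)

    def create_branch(self, branch: str, from_branch: Optional[str] = None) -> None:
        """Create a branch from the head of main or of another branch."""
        source = from_branch or "main"
        if source not in self._heads:
            raise KeyError(f"unknown branch: {source}")
        if branch in self._heads:
            # Assumption: creating an existing branch is an error.
            raise ValueError(f"branch {branch!r} already exists")
        self._heads[branch] = self._heads[source]

    def delete_branch(self, branch: str) -> None:
        """Delete a non-main branch."""
        if branch == "main":
            raise ValueError("the main branch cannot be deleted")
        del self._heads[branch]
```

A new branch starts at the same head as its source, so reads against it initially return the same data until writes target the branch.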