Overview
Chroma Sync exposes endpoints for developers to chunk, embed, and index various data sources. The API is intended for Chroma Cloud users and can be accessed for free (up to $5 in credits) by creating a Chroma Cloud account.Key Concepts
Chroma Sync has three primary concepts: source types, sources and invocations.Source Types
A source type defines a kind of entity that contains data that can be chunked, embedded, and indexed. An example of a source type—and notably the only currently supported one—is a GitHub repository. Each source type defines its own schema for configuring sources of its type. For example, the sources of the GitHub repository type allow developers to define a parameterinclude_globs, which is an array of glob patterns for which matching files will be synced.
Other examples of (future) source types may be S3 buckets, web pages, Notion workspaces, or any other corpus of continually updated content. If there is a specific source type for which you would like support, please reach out to [email protected].
GitHub Repositories
The GitHub repository source type allows developers to sync code in public and private GitHub repositories. Public repositories require no setup other than creating a Chroma Cloud account and issuing an API key. Chroma Sync for private repositories is available at two different tiers: direct and platform.Direct Sync
The direct tier requires you to install Chroma’s GitHub App into any repository for which you wish to perform syncing. The direct tier is only available via the Chroma Cloud UI and does not enable you to perform Sync-related operations via the API. This tier is ideal for developers who wish to sync private repositories that they own. If you are interested in using the direct tier via API, please reach out to us at [email protected].Platform Sync
The platform tier requires you to grant Chroma access to a GitHub App that you own, which has been installed into the private repositories you wish to sync. This GitHub App must have read-only access to the “Contents” and “Metadata” permissions on the list of “Repository permissions”. The platform tier grants access to the Chroma Sync API and is ideal for companies and organizations that offer services which access their users’ codebases. For a detailed walkthrough, see Platform Sync docs.Web
The web source type allows developers to scrape the contents of web pages into Chroma. Given a starting URL, Sync will crawl the page and its links up to a specified depth.Sources
A source is a specific instance of a source type configured according to the global and source type-specific configuration schema. The global source configuration schema refers to the configuration parameters that are required across sources of all types, while the source-type specific configuration schema refers to the configuration parameters required for a specific source type. The global source configuration schema requires the following parameters:database_namedefines the Chroma database in which collections should be created by invocations run on this source. A database must exist before creating sources that point to it.embedding.dense.modeldefines the embedding model that should be used to generate dense embeddings for chunked documents. Currently, only the Qwen3-Embedding-0.6B model is supported, but if there is a model you would like to use, please let us know by reaching out to [email protected].
GitHub Repositories
A source of the GitHub repository type is an individual GitHub repository configured with the global source configuration parameters, and the GitHub source-specific configuration parameters:repositorydefines the GitHub repository whose code should be synced. This must be the forward slash-separated combination of the repository owner’s GitHub username and the repository name (e.g.,chroma-core/chroma). Note that changing a repository name after creating a Chroma Sync source for it will result in invocations on that source failing, so a new source with the updated repository name must be created.app_iddefines the GitHub App ID of the GitHub App that has access to the providedrepository. This parameter should only be supplied if the provided repository is private. If you are unsure of the GitHub App ID you should use, see more about the two tiers Chroma offers for the GitHub repository source type.include_globsdefines a set of glob patterns for which matching files should be synced. If this parameter is not provided, files matching"*"will be synced. Note that Chroma will not sync binary data, images, and other large or non-UTF-8 files.
Web
A source of the web type is configured with a starting URL and a few other optional parameters:Invocations
Invocations refer to runs of the Sync Function over the data in a source. One invocation corresponds to one sync pass through all of the data in a source. A single invocation will result in the creation of exactly one collection in the database specified by the invocation’s source. This collection will contain the chunked, embedded, and indexed data that represents the state of the source at the time of the invocation’s creation. Invocations, like sources, have some global configuration parameters, as well as parameters specific to the type of the source for which the invocation is being run. The global invocation configuration parameters are:target_collection_namedefines the name of the Chroma collection in which synced data should be stored. This must be a collection that does not already exist with synced data. Chroma Sync uses the metadata keyfinished_ingestto indicate whether a collection contains synced data. If an invocation creation request is received for a collection with metadata in which this key is present and set to true, the API will return a 409 Conflict.
GitHub Repositories
Invocations on sources of the GitHub repository type are sync runs over an individual GitHub repository with some set of configuration parameters. The configuration parameters that are specific to invocations on sources of this type are:ref_identifieris either the commit SHA-256 or the name of the branch from which to retrieve the code to be synced. If a branch is provided, the code will be retrieved from the branch’s latest commit.
Endpoints
Sources
Create Source
Creates a new source of the specified type with the provided configuration. Method POST Endpoint https://sync.trychroma.com/api/v1/sources Required Headersx-chroma-tokenmust carry the caller’s Chroma Cloud API key.
-
200 OKIf the source is successfully created. -
400 Bad RequestIf the request payload is missing required fields or a provided field was malformed. -
401 UnauthorizedIf thex-chroma-tokenheader is not present or invalid. -
403 ForbiddenIf the user does not have permission to create a source. Note that only team owners and administrators can create sources. -
404 Not FoundIf the provided database or GitHub App does not exist. -
409 ConflictIf the source already exists. -
500 Internal Server ErrorIf an unknown error occurs while creating the source.
Get Source
Retrieve a specific source by its ID. Method GET Endpoint https://sync.trychroma.com/api/v1/sources/{source_id} Required Headersx-chroma-tokenmust carry the caller’s Chroma Cloud API key.
source_idis the ID of the source to retrieve.
200 OKIf the source is found.
400 Bad RequestIf the request parameters are missing required values or a provided value was malformed.401 UnauthorizedIf thex-chroma-tokenheader is not present or invalid.403 ForbiddenIf the user does not have permission to access this source.404 Not FoundIf the requested source does not exist.500 Internal Server ErrorIf an unknown error occurs while retrieving the source.
List Sources
List sources with optional filtering. Method GET Endpoint https://sync.trychroma.com/api/v1/sources Required Headersx-chroma-tokenmust carry the caller’s Chroma Cloud API key.
database_nameallows callers to optionally filter sources by their database.source_typeallows callers to optionally filter sources by their type. If provided, must be one of[github].github.app_idallows callers to optionally filter GitHub sources by their app ID. If provided,source_typemust begithub.github.repositoryallows callers to optionally filter GitHub sources by their repository. If provided,source_typemust begithub.limitspecifies the maximum number of results to return. Defaults to 100.offsetspecifies the number of results to skip before starting to assemble the list of returned sources when sorting by the sources’created_attimestamps. Defaults to 0.order_byindicates whether to perform sorting by the sources’created_attimestamps in ascending or descending order. If provided, must be one of[ASC, DESC]. Defaults toDESC.
-
200 OKIf the request does not encounter any errors while listing sources. -
400 Bad RequestIf the request parameters are missing required values or a provided value was malformed. -
401 UnauthorizedIf thex-chroma-tokenheader is not present or invalid. -
403 ForbiddenIf the user does not have permission to list sources. -
500 Internal Server ErrorIf an unknown error occurs while retrieving sources.
Delete Source
Delete a source. Does not cancel in-flight invocations on this source. Method DELETE Endpoint https://sync.trychroma.com/api/v1/sources/{source_id} Required Headersx-chroma-tokenmust carry the caller’s Chroma Cloud API key.
source_idis the ID of the source to delete.
204 No ContentIf the source is deleted.400 Bad RequestIf the request parameters are missing required values or a provided value was malformed.401 UnauthorizedIf thex-chroma-tokenheader is not present or invalid.403 ForbiddenIf the user does not have permission to delete this source. Note that only team owners and administrators can delete sources.404 Not FoundIf the provided source does not exist.500 Internal Server ErrorIf an unknown error occurs while deleting the source.
Invocations
Create Invocation
Creates a new invocation on the specified source with the provided configuration parameters. Method POST Endpoint https://sync.trychroma.com/api/v1/sources/{source_id}/invocations Required Headersx-chroma-tokenmust carry the caller’s Chroma Cloud API key.
source_idis the ID of the source for which to create an invocation.
-
200 OKIf the invocation is successfully created. -
400 Bad RequestIf the request payload is missing required fields or a provided field was malformed. -
401 UnauthorizedIf thex-chroma-tokenheader is not present or invalid. -
403 ForbiddenIf the user does not have permission to create an invocation. -
404 Not FoundIf the provided source is not found. -
409 ConflictIf the providedtarget_collection_nameexists withfinished_ingestset totrue. -
500 Internal Server ErrorIf an unknown error occurs while creating the invocation.
Get Invocation
Retrieve a specific invocation by its ID. Method GET Endpoint https://sync.trychroma.com/api/v1/invocations/{invocation_id} Required Headersx-chroma-tokenmust carry the caller’s Chroma Cloud API key.
invocation_idis the ID of the invocation to retrieve.
-
200 OKIf the invocation is found. -
400 Bad RequestIf the request parameters are missing required values or a provided value was malformed. -
401 UnauthorizedIf thex-chroma-tokenheader is not present or invalid. -
403 ForbiddenIf the user does not have permission to access this invocation. -
404 Not FoundIf the requested invocation does not exist. -
500 Internal Server ErrorIf an unknown error occurs while retrieving the invocation.
List Invocations
List invocations with optional filtering. Method GET Endpoint https://sync.trychroma.com/api/v1/invocations Required Headersx-chroma-tokenmust carry the caller’s Chroma Cloud API key.
source_idallows callers to optionally filter invocations by their source.statusallows callers to optionally filter invocations by their status. If provided, must be one of[pending, processing, complete, failed, cancelled].source_typeallows callers to optionally filter invocations by their source’s type. If provided, must be one of[github].github.app_idallows callers to optionally filter invocations on sources of the GitHub repository source type by their source’s app ID. If provided,source_typemust begithub.github.repositoryallows callers to optionally filter invocations on sources of the GitHub repository source type by their source’s repository. If provided,source_typemust begithub.limitspecifies the maximum number of results to return. Defaults to 100.offsetspecifies the number of results to skip before starting to assemble the list of returned invocations when sorting by the invocations’created_attimestamps. Defaults to 0.order_byindicates whether to perform sorting by the invocations’created_attimestamps in ascending or descending order. If provided, must be one of[ASC, DESC]. Defaults toDESC.
-
200 OKIf the request does not encounter any errors while listing invocations. -
400 Bad RequestIf the request parameters are missing required values or a provided value was malformed. -
401 UnauthorizedIf thex-chroma-tokenheader is not present or invalid. -
403 ForbiddenIf the user does not have permission to list invocations. -
500 Internal Server ErrorIf an unknown error occurs while retrieving invocations.
Cancel Pending Invocation
Cancel an invocation that is in thepending state. Invocations not in this state cannot be cancelled.
Method
PUT
Endpoint
https://sync.trychroma.com/api/v1/invocations/{invocation_id}
Required Headers
x-chroma-tokenmust carry the caller’s Chroma Cloud API key.
invocation_idis the ID of the invocation to cancel.
202 AcceptedIf the invocation is cancelled.400 Bad RequestIf the request parameters are missing required values or a provided value was malformed.401 UnauthorizedIf thex-chroma-tokenheader is not present or invalid.403 ForbiddenIf the user does not have permission to cancel this invocation.404 Not FoundIf the provided invocation does not exist.412 Precondition Not MetIf the invocation is not in thependingstate.500 Internal Server ErrorIf an unknown error occurs while cancelling the invocation.