Creating Collections
Chroma collections are created with a name. Collection names are used in the URL, so there are a few restrictions on them:
- The length of the name must be between 3 and 512 characters.
- The name must start and end with a lowercase letter or a digit, and it can contain dots, dashes, and underscores in between.
- The name must not contain two consecutive dots.
- The name must not be a valid IP address.
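The rules above can be sketched as a small pre-flight check. This is only an illustration of the restrictions as stated here, not Chroma's own validator: `is_valid_collection_name` is a hypothetical helper, and the assumption that characters in the middle of the name are limited to lowercase letters, digits, dots, dashes, and underscores may be stricter than what Chroma actually enforces.

```python
import ipaddress
import re

# Assumption: first and last characters are a lowercase letter or digit;
# characters in between may also be dots, dashes, or underscores.
_NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9._-]*[a-z0-9]")


def is_valid_collection_name(name: str) -> bool:
    """Check a name against the restrictions listed above (hypothetical helper)."""
    # Length must be between 3 and 512 characters.
    if not 3 <= len(name) <= 512:
        return False
    # Allowed characters, including the first and last character rule.
    if not _NAME_PATTERN.fullmatch(name):
        return False
    # No two consecutive dots.
    if ".." in name:
        return False
    # The name must not parse as an IP address.
    try:
        ipaddress.ip_address(name)
    except ValueError:
        return True
    return False
```

Chroma performs its own validation when a collection is created, so a check like this is mainly useful for rejecting bad names early, for example when they come from user input.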
Embedding Functions
When you add documents to a collection, Chroma embeds them for you using the collection’s embedding function. By default, Chroma uses a sentence transformer embedding function. Chroma also offers various other embedding functions, which you can provide when creating a collection. For example, you can create a collection using the OpenAIEmbeddingFunction:
Install the openai package, then create your collection with the OpenAIEmbeddingFunction.
Instead of having Chroma embed documents, you can also provide embeddings directly when adding data to a collection. In this case, your collection will not have an embedding function set, and you will be responsible for providing embeddings directly both when adding data and when querying.
Collection Metadata
When creating collections, you can pass the optional metadata argument to add a mapping of metadata key-value pairs to your collections. This can be useful for adding general information about the collection, such as its creation time or a description of the data stored in it.
Getting Collections
There are several ways to get a collection after it was created.
The get_collection function will get a collection from Chroma by name. It returns a Collection object with name, metadata, configuration, and embedding_function.
The get_or_create_collection function behaves similarly, but will create the collection if it doesn’t exist. You can pass to it the same arguments create_collection expects, and the client will ignore them if the collection already exists.
The list_collections function returns the collections you have in your Chroma database. The collections will be ordered by creation time from oldest to newest.
By default, list_collections returns up to 100 collections. If you have more than 100 collections, or need to get only a subset of your collections, you can use the limit and offset arguments.
Current versions of Chroma store the embedding function you used to create a collection on the server, so the client can resolve it for you on subsequent “get” operations. If you are running an older version of the Chroma client or server (earlier than 1.1.13), you will need to provide the same embedding function you used to create a collection when using get_collection.
Modifying Collections
After a collection is created, you can modify its name, metadata, and elements of its index configuration with the modify method.
Deleting Collections
You can delete a collection by name. This action will delete a collection, all of its embeddings, and associated documents and records’ metadata.
Convenience Methods
Collections also offer a few useful convenience methods:
- count: returns the number of records in the collection.
- peek: returns the first 10 records in the collection.