
Index Configuration Reference

Comprehensive reference for all index types and their configuration parameters.

Index Types Overview

Schema recognizes six value types, each with one or more associated index types. If you do not provide a Schema, collections use these built-in defaults:
| Config Class | Value Type | Default Behavior | Use Case |
| --- | --- | --- | --- |
| StringInvertedIndexConfig | string | Enabled for all metadata | Filter on string values |
| FtsIndexConfig | string | Enabled for K.DOCUMENT only | Full-text search on documents |
| VectorIndexConfig | float_list | Enabled for K.EMBEDDING only | Similarity search on embeddings |
| SparseVectorIndexConfig | sparse_vector | Disabled (requires config) | Keyword-based search |
| IntInvertedIndexConfig | int_value | Enabled for all metadata | Filter on integer values |
| FloatInvertedIndexConfig | float_value | Enabled for all metadata | Filter on float values |
| BoolInvertedIndexConfig | boolean | Enabled for all metadata | Filter on boolean values |
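As a reading aid for the defaults above, the following sketch maps a Python metadata value to the config class that would index it by default. The dictionary and the `default_index_for` helper are hypothetical illustrations built from the table; they are not part of the chromadb API.

```python
# Hypothetical helper: mirrors the defaults table above; not a chromadb API.
DEFAULT_METADATA_INDEX = {
    bool: "BoolInvertedIndexConfig",
    int: "IntInvertedIndexConfig",
    float: "FloatInvertedIndexConfig",
    str: "StringInvertedIndexConfig",
}


def default_index_for(value):
    """Return the name of the config class that indexes `value` by default."""
    # Look up the exact type: in Python, bool is a subclass of int, so an
    # isinstance(value, int) check would route True/False to the int index.
    try:
        return DEFAULT_METADATA_INDEX[type(value)]
    except KeyError:
        raise TypeError(f"no default metadata index for {type(value).__name__}")


print(default_index_for("science"))  # StringInvertedIndexConfig
print(default_index_for(2020))       # IntInvertedIndexConfig
print(default_index_for(True))       # BoolInvertedIndexConfig
```

The exact-type lookup matters in practice: a metadata field that mixes `True` and `1` would otherwise appear to belong to a single index in this model.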

Simple Index Configs

These index types have no configuration parameters.

FtsIndexConfig

Use Case: Full-text search and regular expression search on documents (e.g., where(K.DOCUMENT.contains("search term"))).
Limitations:
  • Cannot be deleted
  • Applies to K.DOCUMENT only

StringInvertedIndexConfig

Use Case: Exact and prefix string matching on metadata fields (e.g., where(K("category") == "science")).

IntInvertedIndexConfig

Use Case: Range and equality queries on integer metadata (e.g., where(K("year") >= 2020)).

FloatInvertedIndexConfig

Use Case: Range and equality queries on float metadata (e.g., where(K("price") < 99.99)).

BoolInvertedIndexConfig

Use Case: Filtering on boolean metadata (e.g., where(K("published") == True)).
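
To make the equality and range use cases above concrete, here is a conceptual model of an inverted index over one metadata field. It is only a sketch under simplifying assumptions (in-memory, single field); the `ToyInvertedIndex` class is hypothetical and does not describe how Chroma stores or queries its indexes.

```python
from bisect import bisect_left
from collections import defaultdict


class ToyInvertedIndex:
    """Conceptual model: maps each metadata value to the record ids holding it."""

    def __init__(self):
        self._postings = defaultdict(set)  # value -> set of record ids
        self._sorted_values = []           # distinct values, kept sorted

    def add(self, record_id, value):
        if value not in self._postings:
            # First time this value is seen: keep the value list sorted.
            position = bisect_left(self._sorted_values, value)
            self._sorted_values.insert(position, value)
        self._postings[value].add(record_id)

    def eq(self, value):
        """Equality filter, e.g. K("year") == 2020."""
        return set(self._postings.get(value, set()))

    def ge(self, value):
        """Range filter, e.g. K("year") >= 2020."""
        start = bisect_left(self._sorted_values, value)
        ids = set()
        for v in self._sorted_values[start:]:
            ids |= self._postings[v]
        return ids


years = ToyInvertedIndex()
years.add("a", 2019)
years.add("b", 2020)
years.add("c", 2023)

print(years.eq(2020))          # {'b'}
print(sorted(years.ge(2020)))  # ['b', 'c']
```

Equality is a single dictionary lookup, while a range filter locates the first qualifying value by binary search and unions the posting sets from there.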

VectorIndexConfig

Use Case: Semantic similarity search on dense embeddings for finding conceptually similar content.
Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| space | string | No | Distance function: l2 (geometric), ip (inner product), or cosine (angle-based, most common for text). Default: l2 |
| embedding_function | EmbeddingFunction | No | Function to auto-generate embeddings from K.DOCUMENT. If not provided, supply embeddings manually |
| source_key | string | No | Reserved for future use. Currently always uses K.DOCUMENT |
| hnsw | HnswConfig | No | Advanced: HNSW algorithm tuning for single-node deployments |
| spann | SpannConfig | No | Advanced: SPANN algorithm tuning (clustering, probing) for Chroma Cloud |
Limitations:
  • Cannot be deleted
  • Applies to K.EMBEDDING only
Advanced tuning: HNSW and SPANN parameters control index build and search behavior. They are pre-optimized for most use cases. Only adjust if you have specific performance requirements and understand the tradeoffs between recall, speed, and resource usage. Incorrect tuning can degrade performance.
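
The three `space` options can be compared with a small pure-Python sketch. It assumes the hnswlib-style definitions (squared L2, one minus inner product, one minus cosine similarity); treat these formulas as an assumption and confirm them against the documentation for your Chroma version. The function names are illustrative only.

```python
import math


def l2(a, b):
    """Squared Euclidean distance (assumed definition of space="l2")."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ip(a, b):
    """One minus the inner product (assumed definition of space="ip")."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """One minus cosine similarity (assumed definition of space="cosine")."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Two vectors pointing in the same direction but with different lengths.
a = [1.0, 0.0]
b = [2.0, 0.0]

print(l2(a, b))      # 1.0
print(ip(a, b))      # -1.0
print(cosine(a, b))  # 0.0
```

Under these definitions, cosine distance ignores vector magnitude, whereas l2 and ip are both affected by it. That is one reason cosine is a common choice for text embeddings that are not normalized.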

SparseVectorIndexConfig

Use Case: Keyword-based search for exact term matching, domain-specific terminology, and technical terms. Ideal for hybrid search when combined with dense embeddings.
Parameters:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| source_key | string | No | Field to generate sparse embeddings from. Typically K.DOCUMENT, but can be any text field |
| embedding_function | SparseEmbeddingFunction | No | Sparse embedding function (e.g., ChromaCloudSpladeEmbeddingFunction, HuggingFaceSparseEmbeddingFunction, Bm25EmbeddingFunction) |
| bm25 | boolean | No | Set to true when using Bm25EmbeddingFunction to enable inverse document frequency (IDF) scaling for queries. Not applicable for SPLADE |
Limitations:
  • Must specify a metadata key name (per-key configuration required)
  • Only one sparse vector index allowed per collection
  • Cannot be deleted once created
For complete sparse vector search setup and querying examples, see Sparse Vector Search Setup.
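
To illustrate what IDF scaling means for the bm25 parameter, the sketch below computes inverse document frequency weights for query terms using the widely used Okapi/Lucene-style formula. This is an assumption for illustration: the formula and the `idf` and `idf_weights` helpers are hypothetical and may differ from what Chroma or Bm25EmbeddingFunction actually computes.

```python
import math


def idf(num_docs, docs_with_term):
    """Okapi/Lucene-style IDF: rarer terms receive larger weights."""
    return math.log(1.0 + (num_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))


def idf_weights(query_terms, documents):
    """Weight each distinct query term by its IDF over tokenized documents."""
    n = len(documents)
    weights = {}
    for term in set(query_terms):
        # Document frequency: how many documents contain the term.
        df = sum(1 for doc in documents if term in doc)
        weights[term] = idf(n, df)
    return weights


# Each document is represented as a set of tokens.
docs = [
    {"the", "index"},
    {"the", "vector"},
    {"the", "hnsw"},
]

weights = idf_weights(["the", "hnsw"], docs)
print(weights["hnsw"] > weights["the"])  # True
```

A term that appears in every document ("the") contributes little to the query, while a term found in only one document ("hnsw") dominates the score.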

Next Steps