
Pagination & Field Selection

Control how many results to return and which fields to include in your search results.

Pagination with Limit

Use limit() to control how many results are returned, and its offset argument to skip results for pagination.
from chromadb import Search

# Limit results
search = Search().limit(10)  # Return top 10 results

# Pagination with offset
search = Search().limit(10, offset=20)  # Skip first 20, return next 10

# No limit - returns all matching results
search = Search()  # Be careful with large collections!

Limit Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| limit | int or None | None | Maximum results to return (None = no limit) |
| offset | int | 0 | Number of results to skip (for pagination) |

For Chroma Cloud users: The actual number of results returned will be capped by your quota limits, regardless of the limit value specified. This applies even when no limit is set.

Pagination Patterns

# Page through results (0-indexed)
page_size = 10

# Page 0: Results 1-10
page_0 = Search().limit(page_size, offset=0)

# Page 1: Results 11-20
page_1 = Search().limit(page_size, offset=10)

# Page 2: Results 21-30
page_2 = Search().limit(page_size, offset=20)

# General formula
def get_page(page_number, page_size=10):
    return Search().limit(page_size, offset=page_number * page_size)
Pagination uses 0-based indexing. The first page is page 0, not page 1.
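
User interfaces often number pages from 1, so the offset arithmetic above can be wrapped in small helpers. The following is a minimal sketch in plain Python. The names `offset_for_page` and `page_count` are hypothetical helpers, not part of chromadb. The returned offset would be passed on as `Search().limit(page_size, offset=...)`.

```python
import math


def offset_for_page(page_number: int, page_size: int = 10, one_based: bool = False) -> int:
    """Return the offset for a page (hypothetical helper, not a chromadb API)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Convert a 1-based page number to the 0-based index used for the offset
    index = page_number - 1 if one_based else page_number
    if index < 0:
        raise ValueError("page_number is out of range")
    return index * page_size


def page_count(total_results: int, page_size: int = 10) -> int:
    """Number of pages needed to show total_results items."""
    return math.ceil(total_results / page_size)


print(offset_for_page(2))                  # 0-based page 2 -> 20
print(offset_for_page(3, one_based=True))  # 1-based page 3 -> 20
print(page_count(95, page_size=10))        # 10
```

Keeping the conversion in one place avoids off-by-one errors when the display uses 1-based page labels but the query uses 0-based offsets.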

Field Selection with Select

Control which fields are returned in your results to optimize data transfer and processing.
from chromadb import Search, K

# Default - returns IDs only
search = Search()

# Select specific fields
search = Search().select(K.DOCUMENT, K.SCORE)

# Select metadata fields
search = Search().select("title", "author", "date")

# Mix predefined and metadata fields
search = Search().select(K.DOCUMENT, K.SCORE, "title", "author")

# Select all available fields
search = Search().select_all()
# Returns: IDs, documents, embeddings, metadata, scores

Selectable Fields

| Field | Internal Key | Usage | Description |
|---|---|---|---|
| IDs | #id | Always included | Document IDs are always returned |
| K.DOCUMENT | #document | .select(K.DOCUMENT) | Full document text |
| K.EMBEDDING | #embedding | .select(K.EMBEDDING) | Vector embeddings |
| K.METADATA | #metadata | .select(K.METADATA) | All metadata fields as a dict |
| K.SCORE | #score | .select(K.SCORE) | Search scores (when ranking is used) |
| "field_name" | (user-defined) | .select("title", "author") | Specific metadata fields |

Field constants: The K.* constants (e.g., K.DOCUMENT, K.EMBEDDING, K.ID) correspond to internal keys with a # prefix (e.g., #document, #embedding, #id). Use the K.* constants in queries. Internal keys such as #document and #embedding are used in schema configuration, while #metadata and #score are query-only fields that are not used in the schema.

When you select specific metadata fields (e.g., "title"), they appear directly in the metadata dict. Using K.METADATA returns all metadata fields at once.

Performance Considerations

Selecting fewer fields improves performance by reducing data transfer:
  • Minimal: IDs only (default) - fastest queries
  • Moderate: Add scores and specific metadata fields
  • Heavy: Including documents and embeddings - larger payloads
  • Maximum: select_all() - returns everything
# Fast - minimal data
search = Search().limit(100)  # IDs only

# Moderate - just what you need
search = Search().limit(100).select(K.SCORE, "title", "date")

# Slower - large fields
search = Search().limit(100).select(K.DOCUMENT, K.EMBEDDING)

# Slowest - everything
search = Search().limit(100).select_all()
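
To get a feel for why embeddings dominate payload size, a rough back-of-the-envelope estimate can help. The sketch below is plain Python, and `embedding_payload_bytes` is a hypothetical helper. The 1536-dimensional float32 embedding is an assumed example, not a Chroma default, and the figure ignores serialization overhead, so treat it as a lower bound.

```python
def embedding_payload_bytes(num_results: int, dimensions: int, bytes_per_value: int = 4) -> int:
    """Raw size of the embeddings alone, before any serialization overhead."""
    return num_results * dimensions * bytes_per_value


# Assumed example: 100 results with 1536-dimensional float32 embeddings
size = embedding_payload_bytes(100, 1536)
print(f"{size} bytes (~{size / 1024:.0f} KiB)")  # 614400 bytes (~600 KiB)
```

By comparison, 100 IDs and scores usually amount to a few kilobytes, which is why leaving out K.EMBEDDING tends to be the largest single saving.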

Edge Cases

No Limit Specified

Without a limit, the search attempts to return all matching results, but will be capped by quota limits in Chroma Cloud.
# Attempts to return ALL matching documents
search = Search().where(K("status") == "active")  # No limit()
# Chroma Cloud: Results capped by quota

Empty Results

When no documents match, results will have empty lists/arrays.
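
An empty page is also a convenient stop signal when paging through results. The following sketch shows one way to express that as a generator in plain Python. `iter_pages` and the `fetch_page(offset, limit)` callable are hypothetical names introduced for illustration, and the stand-in data source replaces a real collection so the example runs on its own.

```python
from typing import Callable, Iterator, List


def iter_pages(fetch_page: Callable[[int, int], List[dict]], page_size: int = 10) -> Iterator[List[dict]]:
    """Yield pages until fetch_page returns an empty list.

    fetch_page(offset, limit) is a hypothetical callable supplied by the caller.
    """
    offset = 0
    while True:
        rows = fetch_page(offset, page_size)
        if not rows:  # An empty page means there is nothing left
            return
        yield rows
        offset += page_size


# Stand-in data source so the sketch runs without a Chroma collection
fake_rows = [{"id": f"doc-{i}"} for i in range(25)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return fake_rows[offset:offset + limit]


page_lengths = [len(page) for page in iter_pages(fake_fetch, page_size=10)]
print(page_lengths)  # [10, 10, 5]
```

In real use, `fetch_page` would presumably wrap a call such as `collection.search(Search().limit(limit, offset=offset)).rows()[0]`. Stopping on an empty page rather than on a short page is a deliberate choice here, since quota caps can make a page shorter than the requested limit.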

Non-existent Fields

Selecting non-existent metadata fields simply omits them from the results - they won’t appear in the metadata dict.
# If "non_existent_field" doesn't exist
search = Search().select("title", "non_existent_field")

# Result metadata will only contain "title" if it exists
# "non_existent_field" will not appear in the metadata dict at all
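
Because omitted fields are simply absent, reading them with plain indexing can raise a KeyError. A defensive accessor is one way to handle this. The sketch below is plain Python. `get_metadata_field` is a hypothetical helper, and the row shape (a dict with a "metadata" dict inside) is an assumption based on how rows are accessed elsewhere in this guide.

```python
def get_metadata_field(row: dict, field: str, default=None):
    """Read a metadata field from a result row, tolerating omitted fields."""
    # "metadata" itself may be missing or None when no metadata was selected
    metadata = row.get("metadata") or {}
    return metadata.get(field, default)


# Stand-in row (assumed shape) so the sketch runs without a Chroma collection
row = {"id": "doc-1", "metadata": {"title": "Intro to Search"}}

print(get_metadata_field(row, "title"))                      # Intro to Search
print(get_metadata_field(row, "non_existent_field", "n/a"))  # n/a
```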

Complete Example

Here’s a practical example combining pagination with field selection:
from chromadb import Search, K, Knn

# Paginated search with field selection
def search_with_pagination(collection, query_text, page_size=20):
    current_page = 0

    while True:
        search = (Search()
            .where(K("status") == "published")
            .rank(Knn(query=query_text))
            .limit(page_size, offset=current_page * page_size)
            .select(K.DOCUMENT, K.SCORE, "title", "author", "date")
        )

        results = collection.search(search)
        rows = results.rows()[0]  # Get first (and only) search results

        if not rows:  # No more results
            break

        print(f"\n--- Page {current_page + 1} ---")
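        for i, row in enumerate(rows, 1):
            # Selected metadata fields that a record lacks are omitted, so read them with .get()
            metadata = row.get("metadata") or {}
            print(f"{i}. {metadata.get('title')} by {metadata.get('author')}")
            print(f"   Score: {row['score']:.3f}, Date: {metadata.get('date')}")
            print(f"   Preview: {row['document'][:100]}...")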

        # Check if we want to continue
        user_input = input("\nPress Enter for next page, or 'q' to quit: ")
        if user_input.lower() == 'q':
            break

        current_page += 1

Tips and Best Practices

  • Select only what you need - Reduces network transfer and memory usage
  • Use appropriate page sizes - 10-50 for UI, 100-500 for batch processing
  • Consider bandwidth - Avoid selecting embeddings unless necessary
  • IDs are always included - No need to explicitly select them
  • Use select_all() sparingly - Only when you truly need all fields

Next Steps