
Chroma: Open-source embedding database for AI

Vector database for embedding storage and semantic search.

Overall rank: #50
Data Engineering rank: #5
Stars: 25.4K
Forks: 2.0K
Downloads: 2.1K
Stars (last 7 days): +51
Forks (last 7 days): +4

Learn more about chroma

Chroma is a vector database that stores embeddings and enables retrieval through nearest-neighbor search rather than traditional substring matching. It handles the full pipeline of tokenization, embedding generation, and indexing automatically, though users can also provide custom embeddings. The system supports filtering through metadata and document content, and can run in multiple modes including in-memory for development, persistent local storage, or client-server architecture. Common deployment contexts include retrieval-augmented generation (RAG) systems, semantic search applications, and LLM-based chat interfaces that require contextual document retrieval.
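To make the idea of nearest-neighbor retrieval with metadata filtering concrete, here is a minimal brute-force sketch in plain Python. It is not how Chroma is implemented (a vector database would normally use an index rather than scanning every record); the `records` list, the `nearest` helper, and the toy three-dimensional embeddings are all hypothetical and exist only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical toy data: real embeddings have hundreds of dimensions.
records = [
    {"id": "doc1", "embedding": [0.9, 0.1, 0.0], "metadata": {"topic": "cats"}},
    {"id": "doc2", "embedding": [0.1, 0.9, 0.0], "metadata": {"topic": "dogs"}},
    {"id": "doc3", "embedding": [0.8, 0.2, 0.1], "metadata": {"topic": "dogs"}},
]

def nearest(query, k=1, where=None):
    """Return the ids of the k records most similar to `query`.

    `where` is an optional dict of metadata key/value pairs that a
    record must match exactly before it is ranked.
    """
    candidates = [
        r for r in records
        if where is None
        or all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(query, r["embedding"]),
        reverse=True,
    )
    return [r["id"] for r in ranked[:k]]

top_any = nearest([1.0, 0.0, 0.0])
top_dogs = nearest([1.0, 0.0, 0.0], where={"topic": "dogs"})
```

The filter narrows the candidate set before ranking, so the same query vector can return a different document depending on the metadata constraint.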


1. Minimal API surface

The core functionality is exposed through four primary functions for collection management and querying, reducing the learning curve for integration into existing applications.

2. Automatic embedding handling

The system can automatically tokenize, embed, and index documents using default models like Sentence Transformers, while also accepting custom embeddings from alternative providers like OpenAI or Cohere.

3. Multi-mode deployment

Chroma runs in-memory for prototyping, supports persistent local storage, and offers a client-server mode for scaling, allowing the same API to function across development, testing, and production environments.


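import chromadb

# In-memory client: data is discarded when the process exits
client = chromadb.Client()
collection = client.create_collection(name="my_documents")

# No embeddings are passed, so the collection's default embedding function is used
collection.add(
    documents=["This is a document about cats", "This is about dogs"],
    ids=["doc1", "doc2"]
)

# Retrieve the stored document closest to a natural-language query
results = collection.query(query_texts=["a document about cats"], n_results=1)
print(results["ids"])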

v1.3.4.dev19

Development pre-release build from main branch; release notes do not specify changes, breaking updates, or requirements.

  • Treat as unstable dev snapshot; pin to a stable release for production workloads.
  • No changelog provided; review commit history or wait for official release notes before upgrading.

v1.3.3

Adds BM25 embedding function and fixes Qwen embedding function hydration; no breaking changes noted.

  • Use the new chroma_bm25 embedding function in Python for sparse retrieval workflows.
  • Qwen embedding function now correctly hydrates with custom prompts and tasks.

v1.3.2

Patch release fixing compaction with empty logs and sparse autoembed queries in the search API.

  • Update if you hit compaction failures when log segments are empty during rebuild operations.
  • Upgrade to resolve broken sparse autoembed queries when using the search API endpoint.
