Chroma: Open-source embedding database for AI
Vector database for embedding storage and semantic search.
Chroma is a vector database that stores embeddings and enables retrieval through nearest-neighbor search rather than traditional substring matching. It handles the full pipeline of tokenization, embedding generation, and indexing automatically, though users can also provide custom embeddings. The system supports filtering through metadata and document content, and can run in multiple modes including in-memory for development, persistent local storage, or client-server architecture. Common deployment contexts include retrieval-augmented generation (RAG) systems, semantic search applications, and LLM-based chat interfaces that require contextual document retrieval.
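To make the contrast with substring matching concrete, here is a minimal pure-Python sketch of nearest-neighbor retrieval. It does not use Chroma at all: the two-dimensional vectors are hand-written stand-ins for real embeddings, and the names `cosine_similarity`, `nearest`, and `records` are hypothetical. It also scans every record, whereas a vector database would normally use an index to avoid a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vec, records, k=1):
    """Return the ids of the k records most similar to query_vec (brute force)."""
    ranked = sorted(
        records,
        key=lambda r: cosine_similarity(query_vec, r["embedding"]),
        reverse=True,
    )
    return [r["id"] for r in ranked[:k]]


# Toy data: the embeddings are invented for illustration, not model output.
records = [
    {"id": "doc1", "text": "This is a document about cats", "embedding": [0.9, 0.1]},
    {"id": "doc2", "text": "This is about dogs", "embedding": [0.1, 0.9]},
]

query_text = "kitten"
query_vec = [0.8, 0.2]  # assumed embedding of "kitten", close to the cat document

# Substring matching finds nothing because "kitten" never appears verbatim.
substring_hits = [r["id"] for r in records if query_text in r["text"]]

# Nearest-neighbor search still ranks the cat document first.
semantic_hits = nearest(query_vec, records, k=1)
```

The point of the sketch is only that similarity in vector space can surface a relevant document that shares no literal text with the query.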

Minimal API surface
The core functionality is exposed through four primary functions for collection management and querying, reducing the learning curve for integration into existing applications.
Automatic embedding handling
The system can automatically tokenize, embed, and index documents using default models like Sentence Transformers, while also accepting custom embeddings from alternative providers like OpenAI or Cohere.
Multi-mode deployment
Chroma runs in-memory for prototyping, supports persistent local storage, and offers a client-server mode for scaling, allowing the same API to function across development, testing, and production environments.
import chromadb

# In-memory client, suitable for prototyping.
client = chromadb.Client()
collection = client.create_collection(name="my_documents")

# No embeddings are supplied, so Chroma embeds the documents itself.
collection.add(
    documents=["This is a document about cats", "This is about dogs"],
    ids=["doc1", "doc2"],
)
Related Repositories
Discover similar tools and frameworks used by developers
Heretic
Tool that removes safety alignment from transformer language models using directional ablation without post-training.
InvokeAI
Node-based workflow interface for local Stable Diffusion deployment.
OpenAI.fm
Web demo showcasing OpenAI's Speech API text-to-speech capabilities with an interactive Next.js interface.
OpenAI Python
Type-safe Python client for OpenAI's REST API.
Summarize
CLI and browser extension that generates summaries from URLs, files, videos, podcasts, and other media sources.