Faiss: Similarity search and clustering for dense vectors
Efficient approximate nearest neighbor search for billion-scale vectors.
Faiss is a C++ library designed for similarity search and clustering operations on dense vector collections. It implements various indexing algorithms that support L2 distance and dot product comparisons, including methods based on binary vectors, quantization codes, and graph-based structures like HNSW and NSG. The library can handle vectors that exceed available RAM through compressed representations and scaling techniques, while optional GPU implementations provide accelerated search and clustering operations. Common applications include approximate nearest neighbor search at scale, vector database operations, and clustering tasks in machine learning pipelines.
Multiple Index Types
Offers exact search baselines and approximate methods using quantization, graphs, and hybrid structures. Engineers select indexes based on specific constraints like memory limits, accuracy requirements, or query latency targets.
Drop-in GPU Acceleration
CPU indexes can be moved to NVIDIA or AMD GPUs with minimal code changes, and copies between host and GPU memory are handled automatically. Single- and multi-GPU configurations are supported, with an optional cuVS backend for additional performance.
Compressed Vector Storage
Compressed index types store only quantized codes instead of the original vectors, which can reduce memory use by roughly 8-64x depending on the code size. This enables billion-scale indexing on a single machine, at the cost of a controlled loss in precision.
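The arithmetic behind a range like 8-64x can be sketched as follows, assuming 128-dimensional float32 vectors and product-quantization codes with 8 bits per sub-quantizer. Both are assumptions for illustration, and the totals ignore per-index overhead such as IDs and codebooks.

```python
def raw_bytes(d: int) -> int:
    """Bytes needed to store one float32 vector of dimension d."""
    return 4 * d

def pq_code_bytes(m: int, nbits: int = 8) -> int:
    """Bytes needed for one product-quantization code with m sub-quantizers."""
    return (m * nbits + 7) // 8

d = 128               # assumed dimensionality
raw = raw_bytes(d)    # 512 bytes per uncompressed vector

ratios = {}
for m in (8, 16, 32, 64):
    code = pq_code_bytes(m)
    ratios[m] = raw / code
    print(f"m={m:2d}: {code:2d} bytes per vector, {ratios[m]:.0f}x smaller")

# One billion vectors: 512 GB raw versus 16 GB at m=16 (vector storage only)
n = 1_000_000_000
print(raw * n / 1e9, "GB raw;", pq_code_bytes(16) * n / 1e9, "GB with m=16")
```

Under these assumptions the ratio runs from 64x (8-byte codes) down to 8x (64-byte codes).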
import faiss
import numpy as np
# Create random vectors and build index
vectors = np.random.random((1000, 128)).astype('float32')
index = faiss.IndexFlatL2(128)
index.add(vectors)
# Search for 5 nearest neighbors
query = np.random.random((1, 128)).astype('float32')
distances, indices = index.search(query, k=5)
Related Repositories
Discover similar tools and frameworks used by developers
Open WebUI
Extensible multi-LLM chat platform with RAG pipeline.
LightRAG
Graph-based retrieval framework for structured RAG reasoning.
Triton
Domain-specific language and compiler for writing GPU deep learning primitives with higher productivity than CUDA.
CodeFormer
Transformer-based face restoration using vector-quantized codebook lookup.
PaddleOCR
Multilingual OCR toolkit with document structure extraction.