Zvec: In-process vector database
Lightweight vector database that embeds directly into applications for similarity search and vector operations.
Learn more about Zvec
Zvec is an in-process vector database designed to embed directly into applications without requiring separate server infrastructure. Built on Alibaba's Proxima vector search engine, it provides approximate nearest neighbor (ANN) search for both dense and sparse vector embeddings. The database is a library that runs inside the application process, and it supports hybrid search that combines vector similarity with structured filtering. It targets retrieval-augmented generation (RAG), semantic search, and other applications that need low-latency vector operations.
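To make "nearest neighbor search" concrete, the following is a minimal sketch of the exact, brute-force version of the problem that an ANN index approximates. It is plain Python and does not use Zvec's API. The helper names `cosine_similarity` and `exact_top_k` are hypothetical, and cosine similarity is assumed as the metric purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def exact_top_k(query, vectors, k):
    """Score every stored vector against the query and keep the k best.

    This costs O(n) per query. An ANN index trades a little accuracy
    to avoid scanning every vector.
    """
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data (illustrative only)
vectors = {
    "doc_1": [1.0, 0.0, 0.0],
    "doc_2": [0.0, 1.0, 0.0],
    "doc_3": [0.9, 0.1, 0.0],
}
top = exact_top_k([1.0, 0.0, 0.0], vectors, k=2)
for doc_id, score in top:
    print(doc_id, round(score, 4))
```

An approximate index aims to return mostly the same top results as this scan without visiting every stored vector.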
In-Process Architecture
Runs as a library within applications rather than as a separate server, eliminating network overhead and simplifying deployment.
Multi-Vector Support
Handles both dense and sparse vector embeddings, and supports multi-vector queries in a single operation.
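The difference between dense and sparse embeddings can be sketched in plain Python. This is not Zvec's API or storage format: the `{index: weight}` dictionary layout, the helper names, and the weighted-sum fusion of the two scores are all assumptions chosen for illustration.

```python
def dense_dot(a, b):
    """Dot product of two dense vectors (every dimension is stored)."""
    return sum(x * y for x, y in zip(a, b))


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {dimension_index: weight}.

    Only dimensions present in both vectors contribute, so we iterate
    over the smaller dictionary.
    """
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[idx] for idx, weight in a.items() if idx in b)


# Dense embedding: a short, fully populated list of floats.
dense_query = [0.5, 0.5, 0.0, 0.0]
dense_doc = [1.0, 0.0, 0.0, 0.0]

# Sparse embedding: a very wide space where only a few dimensions are non-zero.
sparse_query = {3: 1.0, 1042: 0.5}
sparse_doc = {3: 2.0, 77: 1.0}

dense_score = dense_dot(dense_query, dense_doc)      # 0.5
sparse_score = sparse_dot(sparse_query, sparse_doc)  # 2.0

# Hypothetical fusion step: one query carries both vectors and the
# per-field scores are merged with fixed weights.
combined = 0.7 * dense_score + 0.3 * sparse_score
print(dense_score, sparse_score, combined)
```

A multi-vector query follows the same idea: each vector field is scored separately and the per-field scores are then merged into one ranking.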
Hybrid Search
Combines vector similarity search with structured attribute filtering for more precise query results.
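As a rough sketch of what hybrid search means, the plain-Python example below applies a structured filter first and then ranks the surviving documents by vector similarity. It does not use Zvec's API. The `hybrid_search` helper, the document layout, and the field names `lang` and `year` are hypothetical, and filtering before ranking is only one possible strategy.

```python
def hybrid_search(query, docs, predicate, k):
    """Filter documents by their attributes, then rank the rest by dot product."""
    # Step 1: structured filtering on scalar fields.
    candidates = [d for d in docs if predicate(d["fields"])]
    # Step 2: vector similarity on the remaining candidates only.
    scored = [
        (d["id"], sum(q * v for q, v in zip(query, d["vector"])))
        for d in candidates
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy data (illustrative only)
docs = [
    {"id": "doc_1", "vector": [1.0, 0.0], "fields": {"lang": "en", "year": 2023}},
    {"id": "doc_2", "vector": [0.9, 0.1], "fields": {"lang": "de", "year": 2024}},
    {"id": "doc_3", "vector": [0.2, 0.8], "fields": {"lang": "en", "year": 2024}},
]

hits = hybrid_search(
    [1.0, 0.0],
    docs,
    predicate=lambda f: f["lang"] == "en" and f["year"] >= 2024,
    k=2,
)
print(hits)
```

Here `doc_1` and `doc_2` are closer to the query vector, but only `doc_3` satisfies the filter, so it is the only hit returned.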
import zvec

# Define collection schema with vector field
schema = zvec.CollectionSchema(
    name="documents",
    vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP32, 128),
)

# Create and open collection
collection = zvec.create_and_open(path="./my_vector_db", schema=schema)

# Insert documents with embeddings
documents = [
    zvec.Doc(id="doc_1", vectors={"embedding": [0.1] * 128}),
    zvec.Doc(id="doc_2", vectors={"embedding": [0.2] * 128}),
    zvec.Doc(id="doc_3", vectors={"embedding": [0.3] * 128}),
]
collection.insert(documents)

# Search for similar vectors
query_vector = [0.15] * 128
results = collection.search(
    vectors={"embedding": query_vector},
    topk=5
)

# Process results
for result in results:
    print(f"Document ID: {result.id}, Score: {result.score}")

A major update focused on optimization, platform expansion, and developer experience improvements.
- Support AI extension framework for on-device embedding workflows (#88)
- Auto-scalable segment metadata in MMap storage files (#67)
- Unified search invocation interface in core (#15)
- Linux ARM64 build support (#71)
- Silent failure on repeated initialization attempts (#79)
v0.1.1: refactor: clarify HNSW 'm' as max neighbors of upper layer
- refactor: clarify HNSW 'm' as max neighbors of upper layer
- refactor: set HNSW 'scaling_factor' default to 'm' instead of 50
- refactor: flat param string cleanup
- fix: flat support more segments
- Chore/add codecov badge
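The release notes do not explain what 'scaling_factor' controls. The sketch below assumes, without confirmation from the source, that it plays the role of the level-normalization base in the standard HNSW algorithm, where a new node's top layer is drawn as floor(-ln(u) / ln(base)). Under that assumption, the fraction of nodes promoted above layer 0 is about 1/base. The function names are hypothetical and this is not Zvec or Proxima code.

```python
import math
import random


def sample_level(scaling_factor, rng):
    """Draw the top layer for a new node, standard HNSW style.

    ASSUMPTION: scaling_factor is the base of the level distribution,
    so P(level >= l) = scaling_factor ** -l. Requires scaling_factor > 1.
    """
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return int(-math.log(u) / math.log(scaling_factor))


def level_histogram(scaling_factor, n, seed=0):
    """Count how many of n simulated nodes land on each top layer."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n):
        level = sample_level(scaling_factor, rng)
        counts[level] = counts.get(level, 0) + 1
    return counts


n = 100_000
hist_m = level_histogram(scaling_factor=16, n=n)   # base equal to m = 16
hist_50 = level_histogram(scaling_factor=50, n=n)  # fixed base of 50

print("share above layer 0, base 16:", 1 - hist_m[0] / n)
print("share above layer 0, base 50:", 1 - hist_50[0] / n)
```

In the original HNSW formulation the recommended normalization is 1/ln(M), which corresponds to using the neighbor count itself as the base. A larger base such as 50 leaves fewer nodes on the upper layers.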
v0.1.0: minor: update readme
- minor: update readme
- chore: add git commit msg and branch name in pre-commit and modify org
- feat(core): support cpu flag detect & dispatch
- chore(cmake): auto detect cpu arch flag in cmake and rm redundant option
- chore: release to pypi
Related Repositories
Discover similar tools and frameworks used by developers
Neo4j
Open-source graph database storing data as nodes and relationships with Cypher query language.
Luigi
Build complex batch pipelines with dependency management.
n8n
Node-based automation platform with JavaScript and Python scripting.
dbt
SQL-based transformation framework for analytics data warehouses.
Flyway
Version-controlled SQL migrations with automated execution tracking.