
OpenCLIP: Open source CLIP implementation

PyTorch library for contrastive language-image pretraining.

Live rankings (06:51 AM, steady): Overall #155 · AI & ML #64
Stars: 13.2K (+12 over 7 days) · Forks: 1.2K (unchanged over 7 days)

Learn more about open_clip

OpenCLIP is a PyTorch-based library implementing contrastive language-image pretraining, a technique that learns joint embeddings of images and text by maximizing similarity between matched pairs while minimizing similarity between unmatched pairs. The implementation supports multiple vision encoders (ViT, ConvNeXt, SigLIP) and text encoders, trained on datasets ranging from LAION-400M to DataComp-1B. The library provides pretrained model checkpoints with documented zero-shot performance across 38 datasets and enables inference through simple APIs for encoding images and text into comparable embedding spaces. Common applications include zero-shot image classification, image-text retrieval, and transfer learning for vision tasks without task-specific labeled data.
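As an illustration of the encoding API described above, here is a minimal zero-shot classification sketch that follows open_clip's standard usage pattern; the model name, pretrained tag, candidate labels, and image path are example choices, not requirements.

```python
import torch
from PIL import Image
import open_clip

# Load a pretrained CLIP model with its matching image preprocessing and tokenizer.
# 'ViT-B-32' + 'laion2b_s34b_b79k' is one common combination; many others exist.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

# Encode one image and a few candidate captions into the shared embedding space.
image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # placeholder path
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize so the dot product is cosine similarity.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    # Zero-shot classification: softmax over image-text similarities.
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probabilities:", probs)
```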

1. Reproducible scaling studies
   The project includes detailed research on scaling laws for contrastive language-image learning, with models trained across different compute budgets and dataset sizes to document how performance scales with training data and model capacity.

2. Multiple architecture support
   Supports diverse vision encoders, including Vision Transformers, ConvNeXt, and SigLIP variants, along with different training datasets, allowing users to select models optimized for specific accuracy-efficiency trade-offs.

3. Comprehensive model collection
   Provides access to numerous pretrained models through a unified interface, with model cards on the Hugging Face Hub and documented zero-shot results across 38 datasets for transparent performance comparison (see the sketch after this list).
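To make the unified interface concrete, the sketch below enumerates registered (architecture, pretrained tag) pairs with `open_clip.list_pretrained()` and loads one of them; the pair chosen at the end is just one example from the catalog.

```python
import open_clip

# Enumerate registered (architecture, pretrained tag) combinations.
for model_name, pretrained_tag in open_clip.list_pretrained()[:10]:
    print(f"{model_name:30s} {pretrained_tag}")

# Any listed pair can be loaded through the same factory function.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"  # one example pair from the list
)
```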


pip install open_clip_torch

v3.2.0

Removes invalid MetaCLIP 2 L/14 checkpoint and adds MobileCLIP2 model configs with pretrained weights.

  • If your code depends on the MetaCLIP 2 L/14 checkpoint, remove those references; the invalid checkpoint has been dropped.
  • Use new MobileCLIP2 model configs and pretrained weights now available in this release.
v3.1.0

Adds MetaCLIP2 WorldWide model support, fixes CoCa generation masking bug, and introduces unified text-locking across CLIP variants.

  • Use MetaCLIP2 WorldWide models now available in the model registry for improved multilingual capabilities.
  • Upgrade if you use CoCa generation; mask handling was corrected to prevent inference errors (see the sketch after this entry).
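Because the fix above touches CoCa generation, here is a minimal captioning sketch following the library's CoCa usage pattern; the model name, pretrained tag, and image path are example values.

```python
import torch
from PIL import Image
import open_clip

# Load a CoCa model; this model/pretrained pair is one published combination.
model, _, transform = open_clip.create_model_and_transforms(
    model_name="coca_ViT-L-14",
    pretrained="mscoco_finetuned_laion2B-s13B-b90k",
)
model.eval()

# Preprocess an image (placeholder path) and generate a caption.
image = transform(Image.open("cat.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    generated = model.generate(image)

# Decode token ids back to text and strip the special tokens.
caption = open_clip.decode(generated[0])
caption = caption.split("<end_of_text>")[0].replace("<start_of_text>", "")
print(caption)
```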
v3.0.0

Major release that raises the minimum Python version to 3.10 and adds local model loading, custom tokenizers, and configurable attention blocks.

  • Upgrade to Python 3.10 or later; the minimum supported version has changed.
  • Use the `local-dir:` schema to load models and tokenizers from local folders instead of remote sources (see the sketch after this entry).
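The snippet below is a sketch of local loading under the assumption, taken from the release note, that the `local-dir:` schema is passed where a model or tokenizer reference normally goes; the folder path is a placeholder and the exact form should be checked against the v3.0.0 notes.

```python
import open_clip

# Assumption: the `local-dir:` schema points the factory functions at a local
# folder containing the model config, weights, and tokenizer files
# (the path below is a placeholder).
local_ref = "local-dir:/path/to/exported_model"

model, _, preprocess = open_clip.create_model_and_transforms(local_ref)
tokenizer = open_clip.get_tokenizer(local_ref)
```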

