DINOv2: Self-supervised visual feature learning
PyTorch vision transformers pretrained on 142M unlabeled images.
import torch
model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')
embedding = model(torch.randn(1, 3, 224, 224))
Large-scale unsupervised pretraining
Models are trained on 142 million unlabeled images with no manual annotations, producing features that generalize across domains without fine-tuning.
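Using features without fine-tuning usually means keeping the backbone frozen and fitting something simple on top of its embeddings. The sketch below illustrates that idea with a nearest-neighbour lookup over precomputed embedding vectors. The helper names (cosine_similarity, nearest_label) and the toy 3-dimensional vectors are assumptions made for this example and are not part of dinov2; real embeddings would come from the model and be much wider.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def nearest_label(query, gallery):
    """Return the label of the gallery embedding most similar to `query`.

    `gallery` is a list of (label, embedding) pairs computed once
    with the frozen backbone.
    """
    if not gallery:
        raise ValueError("gallery is empty")
    best_label, _ = max(gallery, key=lambda item: cosine_similarity(query, item[1]))
    return best_label

# Toy stand-ins for embeddings (hypothetical values, not model output)
gallery = [
    ("cat", [0.9, 0.1, 0.0]),
    ("dog", [0.1, 0.9, 0.0]),
]
query = [0.8, 0.2, 0.1]
predicted = nearest_label(query, gallery)
```

No model weights are updated here: adding a new class only requires embedding a few labelled examples and appending them to the gallery.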
Multiple model scales with registers
Four model sizes are available (ViT-S/14, ViT-B/14, ViT-L/14, ViT-g/14), each with an optional register-token variant intended to improve feature quality and stability in the transformer layers.
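Each size and register variant is loaded through its own torch.hub entry name. The sketch below builds that name from a size letter and a register flag. hub_entry_name and load_backbone are hypothetical helpers introduced for this example; the "_reg" suffix and the embedding widths in EMBED_DIM are stated as assumptions based on the repository's published entry points and the standard ViT configurations, and are worth checking against the installed version.

```python
# Embedding width of each backbone (assumed from the standard ViT configurations)
EMBED_DIM = {"s": 384, "b": 768, "l": 1024, "g": 1536}

def hub_entry_name(size, registers=False):
    """Build a torch.hub entry name such as 'dinov2_vits14' or 'dinov2_vitb14_reg'."""
    if size not in EMBED_DIM:
        raise ValueError(f"size must be one of {sorted(EMBED_DIM)}, got {size!r}")
    suffix = "_reg" if registers else ""
    return f"dinov2_vit{size}14{suffix}"

def load_backbone(size, registers=False):
    """Download and return the chosen backbone in inference mode (needs network access)."""
    import torch  # imported lazily so the name helper works without PyTorch installed
    model = torch.hub.load("facebookresearch/dinov2", hub_entry_name(size, registers))
    return model.eval()

# All eight combinations of size and register flag
names = [hub_entry_name(s, r) for s in "sblg" for r in (False, True)]
```

For instance, load_backbone("b", registers=True) would request the ViT-B/14 variant with registers; the first call downloads the weights and later calls reuse the local cache.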
Patch-level feature extraction
Generates both image-level and per-patch visual features that enable pixel-level tasks like segmentation and depth estimation alongside image classification.
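With a 14-pixel patch size, a 224 x 224 input is split into a 16 x 16 grid, so the model yields 256 per-patch vectors in addition to the single image-level vector. The sketch below computes that grid and, in a separate function, reshapes patch tokens into a spatial map that a segmentation or depth head could consume. patch_grid and patch_feature_map are hypothetical helpers; the forward_features call and its 'x_norm_patchtokens' key reflect the dinov2 repository as commonly used and are an assumption to verify against the installed version.

```python
PATCH_SIZE = 14  # the "/14" in ViT-S/14

def patch_grid(height, width, patch_size=PATCH_SIZE):
    """Return (rows, cols) of the patch grid for an input of the given pixel size."""
    if height % patch_size or width % patch_size:
        raise ValueError(f"height and width must be multiples of {patch_size}")
    return height // patch_size, width // patch_size

def patch_feature_map(model, input_tensor):
    """Return per-patch features shaped (batch, rows, cols, channels).

    Assumes `model.forward_features` returns a dict with an
    'x_norm_patchtokens' entry of shape (batch, rows * cols, channels).
    """
    import torch  # lazy import: only needed when a model is actually run
    rows, cols = patch_grid(input_tensor.shape[-2], input_tensor.shape[-1])
    with torch.no_grad():
        tokens = model.forward_features(input_tensor)["x_norm_patchtokens"]
    batch, _, channels = tokens.shape
    return tokens.reshape(batch, rows, cols, channels)

rows, cols = patch_grid(224, 224)
num_patches = rows * cols
```

Keeping the grid arithmetic separate from the model call makes it easy to validate input sizes before any weights are loaded.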
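import torch
from PIL import Image
from torchvision import transforms
model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')
model.eval()  # inference mode
# Input height and width must be multiples of the 14-pixel patch size, hence the crop.
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),  # ImageNet statistics
])
image = Image.open('photo.jpg').convert('RGB')
input_tensor = transform(image).unsqueeze(0)
with torch.no_grad():
    features = model(input_tensor)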
Related Repositories
Discover similar tools and frameworks used by developers
yolov7
PyTorch single-stage detector with bag-of-freebies training optimizations.
tesseract
LSTM-based OCR engine supporting 100+ languages.
LivePortrait
PyTorch implementation for animating portraits by transferring expressions from driving videos.
mediapipe
Graph-based framework for streaming media ML pipelines.
evo2
Foundation model for DNA sequence generation and scoring.