Ultralytics YOLO: Object detection and computer vision models
PyTorch library for YOLO-based real-time computer vision.
Ultralytics YOLO is a PyTorch-based computer vision library that implements successive versions of the YOLO (You Only Look Once) object detection architecture. The codebase provides model definitions, training pipelines, inference engines, and utilities for tasks including object detection, instance segmentation, image classification, pose estimation, and multi-object tracking. Models are distributed through the Ultralytics Hub and can be deployed via command-line interface or Python API. The library supports various hardware configurations and includes integration with popular deployment platforms.
Unified Multi-Task Interface
Single codebase handles detection, segmentation, classification, pose estimation, and tracking through consistent model APIs. Eliminates the need for separate specialized implementations or framework switching across vision tasks.
Versioned Model Lineage
Multiple YOLO versions (v8, v10, v11) with documented architectural differences and benchmarked performance characteristics, enabling explicit accuracy-latency trade-offs driven by deployment constraints rather than guesswork.
CLI and Python API
Offers both command-line interface for quick experiments and a comprehensive Python API for integration. Train, validate, and deploy models using simple commands or programmatic workflows with identical capabilities.
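As a quick illustration of the CLI side, the following runs prediction with a pretrained nano model (the image path is a placeholder):

```shell
# Run detection from the command line; results are saved under runs/detect/
yolo predict model=yolov8n.pt source=path/to/your/image.jpg
```

The same `model=` and `source=` arguments map directly onto the Python API shown below.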
```python
from ultralytics import YOLO
import cv2

# Load a pre-trained YOLOv8 model (nano version for faster inference)
model = YOLO('yolov8n.pt')

# Load an image
image_path = 'path/to/your/image.jpg'
image = cv2.imread(image_path)

# Run inference
results = model(image)

# Process results
for result in results:
    boxes = result.boxes
    for box in boxes:
        # Get bounding box coordinates
        x1, y1, x2, y2 = box.xyxy[0].cpu().numpy()
        confidence = box.conf[0].cpu().numpy()
        class_id = int(box.cls[0].cpu().numpy())
        class_name = model.names[class_id]
        print(f"Detected {class_name} with confidence {confidence:.2f}")
        # Draw bounding box and label on the image
        cv2.rectangle(image, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
        cv2.putText(image, f"{class_name} {confidence:.2f}",
                    (int(x1), int(y1 - 10)), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)

# Save or display the result
cv2.imwrite('output.jpg', image)
cv2.imshow('Detection Results', image)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

Adds AVIF training support and introduces new COCO12-Formats dataset.
- AVIF is now a supported training image format
- `IMG_FORMATS` expanded to include `avif`
- More robust image decoding for modern formats
- Adds a Pillow-based fallback image reader (`imreadpil`) for cases where OpenCV can't decode AVIF/HEIC
- New "COCO12-Formats" dataset + generator script
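The fallback pattern described above can be sketched as follows. This is an illustrative reader, not the actual `imreadpil` implementation; the function name and details are assumptions, but the idea is the same: try OpenCV first, then let Pillow decode formats OpenCV cannot handle.

```python
import numpy as np

def imread_with_fallback(path):
    """Read an image as a BGR numpy array, falling back to Pillow.

    Sketch only: try cv2.imread first, and if it returns None (decode
    failure, e.g. for AVIF/HEIC) decode via Pillow instead.
    """
    try:
        import cv2
        img = cv2.imread(str(path))  # returns None when OpenCV cannot decode
    except ImportError:
        img = None
    if img is None:
        from PIL import Image
        img = np.asarray(Image.open(path).convert("RGB"))[:, :, ::-1]  # RGB -> BGR
    return img
```

Keeping the output in OpenCV's BGR channel order means downstream code does not need to know which decoder actually ran.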
Fixes DDP multi-GPU training crash and improves dataset URI resolution.
- Fixed DDP multi-GPU training crash
- Added missing `PosixPath` import to the generated temporary DDP training script
- Prevents `NameError: name 'PosixPath' is not defined` when model paths are passed as `PosixPath` objects during Distributed Data Parallel training
- More robust dataset URI resolution for large Ultralytics HUB/Platform datasets
- `ul://...` dataset URLs now allow a much longer server-side preparation time to avoid premature timeouts
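The `PosixPath` failure mode above can be reproduced in miniature. This is a generic sketch of the bug class, not the actual Ultralytics DDP script generator: when generated source code embeds the `repr()` of a `PosixPath`, that source must also import `PosixPath` or it fails when executed.

```python
from pathlib import PosixPath

model_path = PosixPath("yolov8n.pt")
body = f"model = {model_path!r}\n"  # source text contains "PosixPath('yolov8n.pt')"

# Without the import, executing the generated source raises NameError
broken = {}
try:
    exec(body, broken)
except NameError as err:
    print(f"without import: {err}")

# Prepending the import (the fix) makes the same source execute cleanly
fixed = {}
exec("from pathlib import PosixPath\n" + body, fixed)
print(f"with import: {fixed['model']}")
```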
Adds 2D Pose Result.summary() support with YOLO26 documentation updates.
- Version bump: `8.4.4` → `8.4.5`
- Docs & examples shift to YOLO26: Kaggle links, Ultralytics Platform docs, and multiple notebooks now point to YOLO26 as the recommended current model family
- NDJSON dataset docs improved: clearer per-task examples/tabs for Detect/Segment/Pose/OBB/Classify, including pose visibility explanations
- Dependency cleanup: removed optional `hub-sdk` extra from `pyproject.toml`
- More robust downstream integrations: safer summaries improve reliability for APIs, analytics, logging, and dataset export workflows
Related Repositories
Discover similar tools and frameworks used by developers
Tesseract
LSTM-based OCR engine supporting 100+ languages.
DeepFace
Python library wrapping multiple face recognition deep learning models.
YOLOv7
PyTorch single-stage detector with bag-of-freebies training optimizations.
OpenAI.fm
Web demo showcasing OpenAI's Speech API text-to-speech capabilities with an interactive Next.js interface.
LangChain
Modular framework for chaining LLMs with external data.