
MMDetection: PyTorch object detection toolbox

Modular PyTorch framework for object detection research and deployment.

Rankings: #183 overall · #74 in AI & ML
Stats: 32.3K stars · 9.8K forks · +14 stars and 0 forks in the last 7 days

Learn more about MMDetection

MMDetection is a PyTorch-based detection framework developed as part of the OpenMMLab project. The codebase uses a modular architecture where detection pipelines are constructed by combining interchangeable components for backbones, necks, heads, and loss functions. It supports multiple detection paradigms including two-stage detectors (Faster R-CNN, Cascade R-CNN), single-stage detectors (RetinaNet, SSD, YOLO variants), and transformer-based approaches (DETR, Grounding DINO). The framework is commonly used for research, benchmarking detection algorithms, and deploying detection models in production applications.


1. Modular Component Architecture

Detection pipelines are built by combining independent modules for backbones, feature pyramids, detection heads, and loss functions. This enables customization and experimentation without modifying core framework code or maintaining a fork.
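
As a hedged sketch of that composition (abbreviated; real configs carry more required fields, and exact keys vary by version), a detector is declared in a config file as nested dicts of registered modules, so swapping the neck or head is an edit to these fields rather than a code change:

model = dict(
    type='FasterRCNN',
    backbone=dict(type='ResNet', depth=50, num_stages=4,
                  out_indices=(0, 1, 2, 3)),          # feature extractor
    neck=dict(type='FPN', in_channels=[256, 512, 1024, 2048],
              out_channels=256, num_outs=5),          # feature pyramid
    rpn_head=dict(type='RPNHead', in_channels=256),   # proposal head
    roi_head=dict(type='StandardRoIHead'))            # box classification head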

2. Multi-Task Detection Framework

Handles object detection, instance segmentation, panoptic segmentation, and semi-supervised learning within a single unified framework. This eliminates the need to switch between separate tools or codebases for different vision tasks.

3. Comprehensive Model Zoo

Includes implementations of diverse detector architectures spanning two-stage methods, single-stage detectors, and transformer-based approaches. It ships with pre-trained weights for immediate benchmarking and transfer learning without training from scratch.
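
As one hedged way to pull from that zoo, the high-level DetInferencer (available in MMDetection 3.x) resolves a model-zoo alias to its config and downloads the matching checkpoint; the alias below is illustrative, so check the model zoo docs for the exact names your installed version ships:

from mmdet.apis import DetInferencer

# Illustrative model-zoo alias; actual names come from the zoo's metafiles.
inferencer = DetInferencer(model='rtmdet_tiny_8xb32-300e_coco')

# Runs detection and writes visualizations and predictions under out_dir.
results = inferencer('demo/demo.jpg', out_dir='outputs/')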


from mmdet.apis import init_detector, inference_detector

# Config and checkpoint pair for a Faster R-CNN (ResNet-50 + FPN) COCO model.
config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco.pth'

# Build the detector from the config and load pre-trained weights onto the GPU.
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Run inference on a single image (a file path or a loaded image array).
result = inference_detector(model, 'demo/demo.jpg')
print(result)
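
Depending on the installed major version, result is either a list of per-class NumPy arrays (MMDetection 2.x) or a DetDataSample whose pred_instances field holds the predicted boxes, labels, and scores (3.x).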

v3.3.0

Adds MM-Grounding-DINO, an open-source baseline for open-set detection (open-vocabulary detection, phrase grounding, and referring expression comprehension) built on MMDetection, with full training code and pre-trained models.

  • Use MM-Grounding-DINO-Tiny for open-vocabulary detection tasks; it outperforms the original Grounding-DINO-Tiny baseline (see the inference sketch after this list).
  • Access training code, pre-training datasets, and fine-tuning configs at configs/mm_grounding_dino for reproducible grounding and detection pipelines.
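
A minimal, hedged sketch of text-prompted open-set inference through DetInferencer; the model name is a placeholder for an actual MM-Grounding-DINO config/checkpoint from configs/mm_grounding_dino, and the texts argument follows the grounding demo's prompt convention:

from mmdet.apis import DetInferencer

# Placeholder alias; substitute a real MM-Grounding-DINO model name.
inferencer = DetInferencer(model='mm_grounding_dino_tiny')

# Classes in the prompt are separated by ' . ' per the grounding demo convention.
results = inferencer('demo/demo.jpg', texts='person . bench . bicycle .')
print(results['predictions'])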
v3.2.0

Adds four SOTA Transformer models (DDQ, CO-DETR, AlignDETR, H-DINO) plus Grounding DINO fine-tuning support; introduces FSDP/DeepSpeed training and the RF100 benchmark for CNN vs. Transformer comparison.

  • Enable AMP, gradient checkpointing, and FrozenBN in DINO to reduce memory usage; use FSDP or DeepSpeed to train large models with as low as 8.5 GB peak memory (see the config sketch after this list).
  • Fine-tune Grounding DINO (only library supporting this) for +0.9 mAP over official zero-shot; train Detic for open-vocabulary detection or multi-dataset joint training.
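
A hedged sketch of those memory savers as config overrides, assuming a config that inherits from a DINO variant; AmpOptimWrapper and with_cp are standard MMEngine/MMDetection switches, while the frozen-BN keys shown follow ResNet-backbone conventions and should be checked against the actual config:

# Mixed-precision training via MMEngine's AMP optimizer wrapper
# (equivalently, pass --amp to tools/train.py).
optim_wrapper = dict(type='AmpOptimWrapper')

model = dict(
    backbone=dict(
        with_cp=True,                                   # activation checkpointing
        norm_cfg=dict(type='BN', requires_grad=False),  # freeze BN affine params
        norm_eval=True))                                # keep BN in eval mode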
v3.1.0

Adds tracking (MOT/VIS), multimodal inference (GLIP, XDecoder), and ViTDet; install multimodal dependencies via pip install -r requirements/multimodal.txt or mim install mmdet[multimodal].

  • Install multimodal dependencies (requirements/multimodal.txt) to enable GLIP and XDecoder inference and evaluation.
  • Use the new tracking algorithms (SORT, ByteTrack, OCSORT, etc.) and the Gradio demo for experimenting with image tasks locally.

