ControlNet: Control diffusion models with extra conditions
Dual-branch architecture for conditional diffusion model control.
ControlNet is a neural network module designed to add conditional control to diffusion models such as Stable Diffusion. It uses a dual-branch architecture in which each network block is duplicated: a locked branch preserves the original pretrained weights while a trainable copy learns the new condition, and the two branches are joined through zero-initialized convolutions. Because the locked encoder never stores gradients, the approach stays computationally efficient and can be fine-tuned on small datasets or personal devices. Common applications include edge-based image generation, depth-guided synthesis, pose-controlled generation, and scribble-to-image tasks.
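Each of these conditioning types is served by its own pretrained ControlNet checkpoint that drops into the same pipeline. Below is a minimal sketch of that mapping, assuming the commonly published lllyasviel/sd-controlnet-* checkpoints on the Hugging Face Hub; only the canny checkpoint appears in the full example later on this page, so verify the other names before use.

from diffusers import ControlNetModel

# One pretrained checkpoint per conditioning type (checkpoint names other than
# the canny one are assumptions based on the lllyasviel/sd-controlnet-* family).
CONTROLNET_CHECKPOINTS = {
    "canny_edges": "lllyasviel/sd-controlnet-canny",
    "depth_maps": "lllyasviel/sd-controlnet-depth",
    "human_pose": "lllyasviel/sd-controlnet-openpose",
    "scribbles": "lllyasviel/sd-controlnet-scribble",
}

# Swap the key to change the kind of control applied to generation.
controlnet = ControlNetModel.from_pretrained(CONTROLNET_CHECKPOINTS["depth_maps"])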
Dual-Branch Architecture
Network blocks are duplicated into locked and trainable branches, preserving the original diffusion model weights during fine-tuning. This prevents model degradation even with small training datasets, enabling safe training on personal devices without risking production model quality.
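A minimal PyTorch sketch of the duplication step, using a single stand-in block rather than the real U-Net encoder blocks; the names here are illustrative, not the actual repository API.

import copy
import torch.nn as nn

def make_dual_branch(block: nn.Module):
    """Split one pretrained block into a frozen original and a trainable copy."""
    locked = block                    # original weights, kept frozen
    trainable = copy.deepcopy(block)  # copy that learns the new condition
    locked.requires_grad_(False)      # the locked branch is never updated
    return locked, trainable

# Stand-in for one pretrained encoder block of the diffusion U-Net.
locked, trainable = make_dual_branch(nn.Conv2d(64, 64, kernel_size=3, padding=1))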
Zero Convolution Initialization
Connections between the branches use 1x1 convolutions whose weights and biases are initialized to zero, so they output zeros before training starts. This guarantees the pretrained model is undistorted at initialization, allowing fine-tuning to begin immediately without warmup or careful learning-rate scheduling.
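A minimal sketch of a zero convolution and the identity-at-initialization property it gives the combined output (illustrative, not the exact repository code):

import torch
import torch.nn as nn

def zero_conv(channels: int) -> nn.Conv2d:
    """1x1 convolution whose weights and bias start at exactly zero."""
    conv = nn.Conv2d(channels, channels, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv

# Before training, the zero conv outputs zeros, so the combined output equals
# the locked branch alone and the pretrained model behaves exactly as before.
x = torch.randn(1, 64, 32, 32)
gate = zero_conv(64)
locked_out = x                    # stand-in for the locked block's output
combined = locked_out + gate(x)   # trainable contribution is zero at init
assert torch.allclose(combined, locked_out)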
Gradient-Free Original Encoder
The locked branch requires no gradient computation or storage during training, keeping memory usage comparable to that of the base model. Multiple control layers can therefore be added throughout the architecture without a proportional increase in GPU memory.
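A minimal sketch of the resulting training step, again with stand-in modules rather than the real U-Net blocks; the frozen branch runs under torch.no_grad(), so it caches no activations for backpropagation.

import copy
import torch
import torch.nn as nn

# Stand-ins for one pretrained block, its trainable copy, and a zero conv.
locked = nn.Conv2d(64, 64, kernel_size=3, padding=1)
trainable = copy.deepcopy(locked)
zero_gate = nn.Conv2d(64, 64, kernel_size=1)
nn.init.zeros_(zero_gate.weight)
nn.init.zeros_(zero_gate.bias)
locked.requires_grad_(False)            # frozen: no gradients, no optimizer state

# Only the trainable copy and the zero conv are optimized.
optimizer = torch.optim.AdamW(
    list(trainable.parameters()) + list(zero_gate.parameters()), lr=1e-5
)

x = torch.randn(2, 64, 32, 32)          # noisy latents
condition = torch.randn(2, 64, 32, 32)  # encoded control signal (e.g. edge map)

with torch.no_grad():                   # locked branch stores nothing for backprop
    locked_out = locked(x)
control_out = zero_gate(trainable(x + condition))

loss = (locked_out + control_out).pow(2).mean()  # dummy loss for illustration
loss.backward()                          # gradients flow only through the copy
optimizer.step()
assert locked.weight.grad is None        # the original weights never see gradients

The complete inference example below uses the diffusers library with the canny-edge ControlNet checkpoint.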
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import torch
from PIL import Image
import cv2
import numpy as np

# Load the Canny-edge ControlNet and attach it to Stable Diffusion (fp16 on a CUDA GPU)
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# Extract Canny edges and replicate them to three channels,
# since the pipeline expects an RGB conditioning image
edges = cv2.Canny(cv2.imread("input.jpg", cv2.IMREAD_GRAYSCALE), 100, 200)
canny_image = Image.fromarray(np.stack([edges] * 3, axis=-1))

output = pipe("a beautiful mansion", image=canny_image).images[0]
output.save("output.jpg")
Related Repositories
Discover similar tools and frameworks used by developers
Kimi-K2
Trillion-parameter MoE model with Muon-optimized training.
transformers
Unified API for pre-trained transformer models across frameworks.
tiktoken
Fast BPE tokenizer for OpenAI language models.
ultralytics
PyTorch library for YOLO-based real-time computer vision.
Real-ESRGAN
PyTorch framework for blind super-resolution using GANs.