
Wan2.1: Open-source video generation models

Diffusion transformer models for text and image-to-video generation.

Live rankings (captured 06:51 AM, steady):
Overall: #62 (Δ23) · AI & ML: #33 (Δ9)
Stars: 15.1K (+33 over 7 days) · Forks: 2.3K (+6 over 7 days)

Learn more about Wan2.1

Wan2.1 is a collection of diffusion-based video generation models developed for multiple video synthesis tasks. The architecture includes a custom video VAE component (Wan-VAE) for encoding and decoding video frames while preserving temporal information, paired with transformer-based diffusion models of varying scales. The smallest variant (T2V-1.3B) requires approximately 8GB of VRAM and can generate 480p video on consumer hardware, while larger variants support higher resolutions and more complex generation tasks. The models are integrated with standard frameworks like Hugging Face Diffusers and ComfyUI for inference.
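To see why the 1.3B variant fits on consumer cards, a back-of-envelope memory estimate helps. This sketch assumes the weights are held in fp16/bf16 (2 bytes per parameter); the ~8GB figure quoted above additionally covers activations, the Wan-VAE, and the text encoder, which this arithmetic does not model.

```python
# Rough weight-memory estimate for the T2V-1.3B variant.
# Assumption: parameters stored in fp16/bf16 (2 bytes each).
params = 1.3e9               # 1.3B parameters
bytes_per_param = 2          # fp16 / bf16
weight_gb = params * bytes_per_param / 1024**3

# Weights alone are only a fraction of the quoted ~8 GB budget;
# the rest goes to activations, the VAE, and the text encoder.
print(f"weights alone: ~{weight_gb:.1f} GB")  # → weights alone: ~2.4 GB
```

The gap between the ~2.4 GB of weights and the ~8 GB total is why offloading the text encoder and VAE (as inference frameworks commonly do) matters on smaller cards.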

Wan2.1

1. Multi-task capability — Supports text-to-video, image-to-video, video editing, text-to-image, and video-to-audio generation within a single model family, rather than requiring a separate specialized model for each task.

2. Consumer GPU compatibility — The 1.3B-parameter variant operates within 8GB of VRAM, enabling deployment on standard consumer graphics cards without specialized hardware or quantization.

3. Multilingual text generation — Can render both Chinese and English text within generated video frames, addressing a gap in open-source video models at the time of release.
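For getting started locally, the repository's quickstart is roughly as follows. This is a hedged sketch: the `generate.py` entry point, flag names, and checkpoint directory are recalled from the Wan2.1 repository's README and may have changed, so verify them against the current repo before relying on this.

```shell
# Hedged sketch of the Wan2.1 repo quickstart (verify flags against the README).
git clone https://github.com/Wan-Video/Wan2.1.git
cd Wan2.1
pip install -r requirements.txt

# Model weights must be downloaded separately (e.g. from Hugging Face)
# into the directory passed as --ckpt_dir before running generation.
python generate.py \
    --task t2v-1.3B \
    --size 832*480 \
    --ckpt_dir ./Wan2.1-T2V-1.3B \
    --prompt "A corgi running on a beach at sunset"
```

Note that the models are also usable through Hugging Face Diffusers and ComfyUI, which handle weight downloads and memory offloading automatically.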



