
Mochi 1: Open-source video generation model

10B parameter diffusion model for text-to-video generation using Asymmetric Diffusion Transformer.

Live rankings (10:20 AM, steady): #278 overall · #85 in AI & ML
Stars: 3.6K (+13 over 7 days) · Forks: 473 (+5 over 7 days)

Learn more about Mochi 1

Mochi 1 is an open-source video generation model that creates videos from text descriptions using diffusion-based machine learning. The system uses a novel Asymmetric Diffusion Transformer (AsymmDiT) architecture with 10 billion parameters, processing text and visual tokens in separate streams with different capacities. It includes an AsymmVAE component that compresses videos by 128x, with 8x8 spatial and 6x temporal compression. The model generates 480p videos and supports LoRA fine-tuning for customization on specific datasets.

Key features of Mochi 1

1. AsymmDiT Architecture

Uses asymmetric processing streams in which the visual pathway has roughly 4x more parameters than the text pathway. This design reduces memory requirements while concentrating computational capacity on visual reasoning.
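As a rough illustration of how a 2x-wider stream yields a 4x parameter split (the hidden sizes below are hypothetical, not Mochi's actual configuration), since MLP parameter counts grow with the square of the hidden dimension:

```python
# Hypothetical sketch of asymmetric stream sizing (dims are illustrative,
# not Mochi's real configuration).
def mlp_params(hidden: int, expansion: int = 4) -> int:
    """Parameter count of a 2-layer MLP: hidden -> hidden*expansion -> hidden."""
    return hidden * (hidden * expansion) * 2  # two weight matrices, biases ignored

visual_hidden = 3072  # wide visual stream (assumed)
text_hidden = 1536    # narrower text stream (assumed)

ratio = mlp_params(visual_hidden) / mlp_params(text_hidden)
print(f"visual/text parameter ratio: {ratio:.0f}x")  # doubling width gives 4x params
```

Because parameters scale quadratically with width, halving the text stream's hidden size cuts its parameter budget to a quarter of the visual stream's.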

2. Efficient Video Compression

Includes AsymmVAE, which compresses videos by 128x using an asymmetric encoder-decoder structure: 8x8 spatial and 6x temporal compression into a 12-channel latent space.
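These compression factors determine the shape of the latent tensor the diffusion model operates on. A minimal sketch, assuming simple floor division (the real causal VAE handles frame boundaries differently):

```python
def latent_shape(frames: int, height: int, width: int,
                 t_comp: int = 6, s_comp: int = 8, latent_ch: int = 12):
    """Approximate latent tensor shape for 8x8 spatial / 6x temporal
    compression into 12 latent channels (floor division is a simplification)."""
    return (latent_ch, frames // t_comp, height // s_comp, width // s_comp)

# A 31-frame 480x848 clip (the demo settings below):
print(latent_shape(31, 480, 848))  # -> (12, 5, 60, 106)
```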

3. LoRA Fine-tuning Support

Provides a built-in trainer for creating LoRA fine-tunes on custom video datasets. Fine-tuning runs on a single H100 or A100 80GB GPU and outputs weights in safetensors format.
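The LoRA idea itself is compact: keep the pretrained weight W frozen and train a low-rank pair (A, B), applying W + (alpha/r)·B·A at inference. A generic NumPy sketch of the technique (not Mochi's trainer API):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 64, 64, 8, 16  # rank r << d; alpha scales the update

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-initialized

def lora_forward(x: np.ndarray) -> np.ndarray:
    """Apply the LoRA-adapted weight: W x + (alpha / r) * B (A x)."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# With B zero-initialized, the adapter starts as an exact no-op:
assert np.allclose(lora_forward(x), W @ x)
# The trainable adapter is far smaller than the frozen weight:
print(A.size + B.size, "trainable vs", W.size, "frozen parameters")
```

Zero-initializing B is what makes training stable: the fine-tune starts from the base model's behavior and only gradually deviates.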


# Build a single-GPU inference pipeline from the pretrained weights.
from genmo.mochi_preview.pipelines import (
    DecoderModelFactory,
    DitModelFactory,
    MochiSingleGPUPipeline,
    T5ModelFactory,
    linear_quadratic_schedule,
)

pipeline = MochiSingleGPUPipeline(
    text_encoder_factory=T5ModelFactory(),
    dit_factory=DitModelFactory(
        model_path="weights/dit.safetensors", model_dtype="bf16"
    ),
    decoder_factory=DecoderModelFactory(
        model_path="weights/decoder.safetensors",
    ),
    cpu_offload=True,             # offload idle components to CPU to fit on one GPU
    decode_type="tiled_spatial",  # decode the latent in spatial tiles to save memory
)

# Generate a 480p, 31-frame video with 64 denoising steps.
video = pipeline(
    height=480,
    width=848,
    num_frames=31,
    num_inference_steps=64,
    sigma_schedule=linear_quadratic_schedule(64, 0.025),
    cfg_schedule=[6.0] * 64,  # classifier-free guidance scale per step
    batch_cfg=False,
    prompt="your favorite prompt here ...",
    negative_prompt="",
    seed=12345,
)
