- (🔥 New) [2025/9/30] We released the DC-VideoGen technical report on arXiv.
DC-VideoGen is a new post-training framework for accelerating video diffusion models. Key features:
- 🎬 Supports video generation up to 2160×3840 resolution on a single H100 GPU
- ⚡ Delivers 14.8× faster inference than the base model
- 💰 230× lower training cost compared to training from scratch (only 10 H100 GPU days for Wan-2.1-14B)
DC-VideoGen is built on two core innovations:
- Deep Compression Video Autoencoder (DC-AE-V): a new family of deep compression autoencoders for video data, providing 32×/64× spatial and 4× temporal compression (see the shape sketch after this list).
- AE-Adapt-V: a robust adaptation strategy that enables rapid and stable transfer of pre-trained video diffusion models to DC-AE-V.
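To make the compression factors concrete, here is a minimal sketch of the latent-shape arithmetic they imply. The function name, the `latent_channels` default, and the padding policy are illustrative assumptions, not the released DC-AE-V API.

```python
# Hypothetical helper illustrating DC-AE-V's stated compression factors:
# 32x (or 64x) spatially and 4x temporally. Not the released API.

def latent_shape(frames: int, height: int, width: int,
                 spatial: int = 32, temporal: int = 4,
                 latent_channels: int = 32) -> tuple[int, int, int, int]:
    """Return the (t, c, h, w) latent shape for a video of the given size.

    `latent_channels` is a placeholder; the real channel count is set by
    the DC-AE-V checkpoint.
    """
    assert height % spatial == 0 and width % spatial == 0, "pad to a multiple of the spatial factor"
    assert frames % temporal == 0, "pad to a multiple of the temporal factor"
    return frames // temporal, latent_channels, height // spatial, width // spatial

# A 16-frame 1024x2048 clip maps to a (4, 32, 32, 64) latent: the diffusion
# model attends over 32*32*4 = 4096x fewer spatiotemporal positions than pixels.
print(latent_shape(16, 1024, 2048))  # (4, 32, 32, 64)
```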
- Under deep compression settings, causal video autoencoders suffer from low reconstruction quality, while non-causal video autoencoders reconstruct better but generalize poorly to longer videos.
- DC-AE-V introduces a new temporal modeling design, chunk-causal, to overcome the limitations of both causal and non-causal video autoencoders: it preserves causal information flow across chunks while enabling bidirectional flow within each chunk, as sketched below.
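As a rough sketch of that masking pattern, the code below builds an attention mask that is bidirectional within a temporal chunk and causal across chunks. The chunk size and the mask-based formulation are illustrative assumptions, not the paper's implementation.

```python
# Illustrative chunk-causal attention mask: frames attend bidirectionally
# inside their own chunk and only to earlier chunks across chunk boundaries.
import torch

def chunk_causal_mask(num_frames: int, chunk_size: int) -> torch.Tensor:
    """Boolean mask where entry (i, j) is True if frame i may attend to frame j."""
    chunk_id = torch.arange(num_frames) // chunk_size
    # Frame i sees frame j iff j's chunk is not later than i's chunk.
    return chunk_id.unsqueeze(1) >= chunk_id.unsqueeze(0)

mask = chunk_causal_mask(num_frames=8, chunk_size=4)
# Frames 0-3 all see each other; frames 4-7 see each other plus frames 0-3.
print(mask.int())
```

The True-means-attend convention matches the boolean `attn_mask` of `torch.nn.functional.scaled_dot_product_attention`, so a mask like this can be dropped into a standard attention layer.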
- Direct fine-tuning without AE-Adapt-V is unstable and yields suboptimal quality. AE-Adapt-V instead provides a robust initialization that preserves the pretrained diffusion model's semantics in the new latent space, so visual quality recovers rapidly and the model matches the base model's performance after only lightweight fine-tuning (illustrated below).
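Purely to illustrate the initialization idea, and not the actual AE-Adapt-V recipe from the report: one common way to move a pretrained backbone to a new latent space without destabilizing it is to freeze the backbone and first train only fresh input/output projections that map between the two spaces.

```python
# Hypothetical sketch (NOT AE-Adapt-V itself): freeze the pretrained diffusion
# backbone and train only new projections into and out of the DC-AE-V latent
# space, so the backbone keeps the semantics it already learned.
import torch.nn as nn

class LatentAdapter(nn.Module):
    def __init__(self, backbone: nn.Module, new_latent_dim: int, backbone_dim: int):
        super().__init__()
        self.proj_in = nn.Linear(new_latent_dim, backbone_dim)    # trainable
        self.proj_out = nn.Linear(backbone_dim, new_latent_dim)   # trainable
        self.backbone = backbone                                  # frozen
        for p in self.backbone.parameters():
            p.requires_grad = False

    def forward(self, z):
        return self.proj_out(self.backbone(self.proj_in(z)))
```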
- The code and pretrained models will be released after the legal review is completed.
@article{chen2025dc,
  title={DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder},
  author={Chen, Junyu and He, Wenkun and Gu, Yuchao and Zhao, Yuyang and Yu, Jincheng and Chen, Junsong and Zou, Dongyun and Lin, Yujun and Zhang, Zhekai and Li, Muyang and others},
  journal={arXiv preprint arXiv:2509.25182},
  year={2025}
}




