---
license: mit
pipeline_tag: image-to-video
datasets:
  - MIN-Lab/minWM-data
tags:
  - Video
  - WorldModels
  - Stream
  - Diffusion
---

๐ŸŒ minWM: The First Full-Stack Open-Source World Model Framework

A full-stack framework and tutorial for newcomers, rather than a specific model.

minWM is our contribution to the world-model community: a full-stack open-source framework that walks you end to end through turning a bidirectional T2V foundation model into an action-conditioned video world model. It includes example data, runnable scripts, Claude skills capturing our hands-on experience, and onboarding knowledge for newcomers. We hope more researchers and developers will join us in growing the community.

Code: https://github.com/shengshu-ai/minWM

Citation

If you find this work useful, please cite:

@article{zhu2026causal,
  title={Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation},
  author={Zhu, Hongzhou and Zhao, Min and He, Guande and Su, Hang and Li, Chongxuan and Zhu, Jun},
  journal={arXiv preprint arXiv:2602.02214},
  year={2026}
}

@article{zhao2026causal,
  title={Causal Forcing++: Scalable Few-Step Autoregressive Diffusion Distillation for Real-Time Interactive Video Generation},
  author={Zhao, Min and Zhu, Hongzhou and Zheng, Kaiwen and Zhou, Zihan and Yan, Bokai and Li, Xinyuan and Yang, Xiao and Li, Chongxuan and Zhu, Jun},
  journal={arXiv preprint arXiv:2605.15141},
  year={2026}
}