Accelerate documentation
Example Zoo
Below is a non-exhaustive list of tutorials and scripts showcasing Accelerate.
Official Accelerate Examples:
Basic Examples
These examples showcase the base features of Accelerate and are a great starting point.
- Barebones NLP example
- Barebones distributed NLP example in a Jupyter Notebook
- Barebones computer vision example
- Barebones distributed computer vision example in a Jupyter Notebook
- Using Accelerate in Kaggle
Feature Specific Examples
These examples showcase specific features that the Accelerate framework offers.
- Automatic memory-aware gradient accumulation
- Checkpointing states
- Cross validation
- DeepSpeed
- Fully Sharded Data Parallelism
- Gradient accumulation
- Memory-aware batch size finder
- Metric Computation
- Using Trackers
- Using Megatron-LM
Full Examples
These examples combine, in a single script, every feature shown in “Feature Specific Examples”.
- Complete NLP example
- Complete computer vision example
- Very complete and extensible vision example showcasing SLURM, Hydra, and a highly extensible use of the framework
- Causal language model fine-tuning example
- Masked language model fine-tuning example
- Speech pretraining example
- Translation fine-tuning example
- Text classification fine-tuning example
- Semantic segmentation fine-tuning example
- Question answering fine-tuning example
- Beam search question answering fine-tuning example
- Multiple choice question answering fine-tuning example
- Named entity recognition fine-tuning example
- Image classification fine-tuning example
- Summarization fine-tuning example
- End-to-end examples of how to use the AWS SageMaker integration of Accelerate
- Megatron-LM examples for various NLP tasks
Integration Examples
These are tutorials from libraries that integrate with Accelerate:
Don’t see your integration here? Make a PR to include it!
Amphion
- Training Text-to-Speech Models with Amphion
- Training Singing Voice Conversion Models with Amphion
- Training Vocoders with Amphion
Catalyst
DALLE2-pytorch
Diffusers
fastai
- Distributed training from Jupyter Notebooks with fastai
- Basic distributed training examples with fastai
GradsFlow
imagen-pytorch
Kornia
PyTorch Accelerated
PyTorch3D
Stable-Dreamfusion
Tez
trlx
Comfy-UI
In Science
Below is a non-exhaustive list of papers utilizing Accelerate.
Don’t see your paper here? Make a PR to include it!
- Yuval Kirstain, Adam Polyak, Uriel Singer, Shahbuland Matiana, Joe Penna, Omer Levy: “Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation”, 2023; arXiv:2305.01569.
- Lei Wang, Wanyu Xu, Yihuai Lan, Zhiqiang Hu, Yunshi Lan, Roy Ka-Wei Lee, Ee-Peng Lim: “Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models”, 2023; arXiv:2305.04091.
- Arthur Câmara, Claudia Hauff: “Moving Stuff Around: A study on efficiency of moving documents into memory for Neural IR models”, 2022; arXiv:2205.08343.
- Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang: “High-throughput Generative Inference of Large Language Models with a Single GPU”, 2023; arXiv:2303.06865.
- Peter Melchior, Yan Liang, ChangHoon Hahn, Andy Goulding: “Autoencoding Galaxy Spectra I: Architecture”, 2022; arXiv:2211.07890.
- Jiaao Chen, Aston Zhang, Mu Li, Alex Smola, Diyi Yang: “A Cheaper and Better Diffusion Language Model with Soft-Masked Noise”, 2023; arXiv:2304.04746.
- Ayaan Haque, Matthew Tancik, Alexei A. Efros, Aleksander Holynski, Angjoo Kanazawa: “Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions”, 2023; arXiv:2303.12789.
- Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi: “RealFusion: 360° Reconstruction of Any Object from a Single Image”, 2023; arXiv:2302.10663.
- Xiaoshi Wu, Keqiang Sun, Feng Zhu, Rui Zhao, Hongsheng Li: “Better Aligning Text-to-Image Models with Human Preference”, 2023; arXiv:2303.14420.
- Yongliang Shen, Kaitao Song, Xu Tan, Dongsheng Li, Weiming Lu, Yueting Zhuang: “HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace”, 2023; arXiv:2303.17580.
- Yue Yang, Wenlin Yao, Hongming Zhang, Xiaoyang Wang, Dong Yu, Jianshu Chen: “Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination”, 2022; arXiv:2210.12261.
- Sheng-Yen Chou, Pin-Yu Chen, Tsung-Yi Ho: “How to Backdoor Diffusion Models?”, 2022; arXiv:2212.05400.
- Junyoung Seo, Wooseok Jang, Min-Seop Kwak, Jaehoon Ko, Hyeonsu Kim, Junho Kim, Jin-Hwa Kim, Jiyoung Lee, Seungryong Kim: “Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation”, 2023; arXiv:2303.07937.
- Or Patashnik, Daniel Garibi, Idan Azuri, Hadar Averbuch-Elor, Daniel Cohen-Or: “Localizing Object-level Shape Variations with Text-to-Image Diffusion Models”, 2023; arXiv:2303.11306.
- Dídac Surís, Sachit Menon, Carl Vondrick: “ViperGPT: Visual Inference via Python Execution for Reasoning”, 2023; arXiv:2303.08128.
- Chenyang Qi, Xiaodong Cun, Yong Zhang, Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen: “FateZero: Fusing Attentions for Zero-shot Text-based Video Editing”, 2023; arXiv:2303.09535.
- Sean Welleck, Jiacheng Liu, Ximing Lu, Hannaneh Hajishirzi, Yejin Choi: “NaturalProver: Grounded Mathematical Proof Generation with Language Models”, 2022; arXiv:2205.12910.
- Elad Richardson, Gal Metzer, Yuval Alaluf, Raja Giryes, Daniel Cohen-Or: “TEXTure: Text-Guided Texturing of 3D Shapes”, 2023; arXiv:2302.01721.
- Puijin Cheng, Li Lin, Yijin Huang, Huaqing He, Wenhan Luo, Xiaoying Tang: “Learning Enhancement From Degradation: A Diffusion Model For Fundus Image Enhancement”, 2023; arXiv:2303.04603.
- Shun Shao, Yftah Ziser, Shay Cohen: “Erasure of Unaligned Attributes from Neural Representations”, 2023; arXiv:2302.02997.
- Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang, Hyeongu Yun, Yireun Kim, Minjoon Seo: “In-Context Instruction Learning”, 2023; arXiv:2302.14691.
- Shikun Liu, Linxi Fan, Edward Johns, Zhiding Yu, Chaowei Xiao, Anima Anandkumar: “Prismer: A Vision-Language Model with An Ensemble of Experts”, 2023; arXiv:2303.02506.
- Haoyu Chen, Zhihua Wang, Yang Yang, Qilin Sun, Kede Ma: “Learning a Deep Color Difference Metric for Photographic Images”, 2023; arXiv:2303.14964.
- Van-Hoang Le, Hongyu Zhang: “Log Parsing with Prompt-based Few-shot Learning”, 2023; arXiv:2302.07435.
- Keito Kudo, Yoichi Aoki, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, Kentaro Inui: “Do Deep Neural Networks Capture Compositionality in Arithmetic Reasoning?”, 2023; arXiv:2302.07866.
- Ruoyao Wang, Peter Jansen, Marc-Alexandre Côté, Prithviraj Ammanabrolu: “Behavior Cloned Transformers are Neurosymbolic Reasoners”, 2022; arXiv:2210.07382.
- Martin Wessel, Tomáš Horych, Terry Ruas, Akiko Aizawa, Bela Gipp, Timo Spinde: “Introducing MBIB — the first Media Bias Identification Benchmark Task and Dataset Collection”, 2023; arXiv:2304.13148. DOI: 10.1145/3539618.3591882.
- Hila Chefer, Yuval Alaluf, Yael Vinker, Lior Wolf, Daniel Cohen-Or: “Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models”, 2023; arXiv:2301.13826.
- Marcio Fonseca, Yftah Ziser, Shay B. Cohen: “Factorizing Content and Budget Decisions in Abstractive Summarization of Long Documents”, 2022; arXiv:2205.12486.
- Tianxing He, Jingyu Zhang, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James Glass, Yulia Tsvetkov: “On the Blind Spots of Model-Based Evaluation Metrics for Text Generation”, 2022; arXiv:2212.10020.
- Ori Ram, Yoav Levine, Itay Dalmedigos, Dor Muhlgay, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham: “In-Context Retrieval-Augmented Language Models”, 2023; arXiv:2302.00083.
- Dacheng Li, Rulin Shao, Hongyi Wang, Han Guo, Eric P. Xing, Hao Zhang: “MPCFormer: fast, performant and private Transformer inference with MPC”, 2022; arXiv:2211.01452.
- Baolin Peng, Michel Galley, Pengcheng He, Chris Brockett, Lars Liden, Elnaz Nouri, Zhou Yu, Bill Dolan, Jianfeng Gao: “GODEL: Large-Scale Pre-Training for Goal-Directed Dialog”, 2022; arXiv:2206.11309.
- Egil Rønningstad, Erik Velldal, Lilja Øvrelid: “Entity-Level Sentiment Analysis (ELSA): An exploratory task survey”, 2023, Proceedings of the 29th International Conference on Computational Linguistics, 2022, pages 6773-6783; arXiv:2304.14241.
- Charlie Snell, Ilya Kostrikov, Yi Su, Mengjiao Yang, Sergey Levine: “Offline RL for Natural Language Generation with Implicit Language Q Learning”, 2022; arXiv:2206.11871.
- Zhiruo Wang, Shuyan Zhou, Daniel Fried, Graham Neubig: “Execution-Based Evaluation for Open-Domain Code Generation”, 2022; arXiv:2212.10481.
- Minh-Long Luu, Zeyi Huang, Eric P. Xing, Yong Jae Lee, Haohan Wang: “Expeditious Saliency-guided Mix-up through Random Gradient Thresholding”, 2022; arXiv:2212.04875.
- Jun Hao Liew, Hanshu Yan, Daquan Zhou, Jiashi Feng: “MagicMix: Semantic Mixing with Diffusion Models”, 2022; arXiv:2210.16056.
- Yaqing Wang, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao: “LiST: Lite Prompted Self-training Makes Parameter-Efficient Few-shot Learners”, 2021; arXiv:2110.06274.