NVIDIA Transformer Engine
A Deep Dive into Efficient Transformer Model Training
Abstract
NVIDIA's Transformer Engine (TE, https://github.com/NVIDIA/TransformerEngine) significantly accelerates Transformer model training and inference, particularly on NVIDIA's Hopper and Ada archit…
