Data Parallelism
Step 1 of 5
Distributed data parallelism in pure PyTorch, with gradients synchronized across processes by all-reduce.
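
As a rough illustration of what all-reduce gradient sync can look like, here is a minimal sketch that uses only `torch.distributed` primitives. The helper names (`average_gradients`, `train_step`, `demo_single_process`), the toy `nn.Linear` model, and the single-process `gloo` setup are assumptions made for this example, not code from the tutorial itself.

```python
import os
import tempfile

import torch
import torch.distributed as dist
import torch.nn as nn


def average_gradients(model: nn.Module) -> None:
    """Sum each gradient across all ranks, then divide by the world size.

    Hypothetical helper: after it returns, every rank holds the same
    averaged gradients.
    """
    world_size = dist.get_world_size()
    for param in model.parameters():
        if param.grad is None:
            continue
        dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
        param.grad.div_(world_size)


def train_step(model, optimizer, inputs, targets) -> float:
    """One data-parallel step: local backward, all-reduce, optimizer update."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(inputs), targets)
    loss.backward()            # gradients from this rank's shard of the batch
    average_gradients(model)   # synchronize gradients across ranks
    optimizer.step()           # identical update on every rank
    return loss.item()


def demo_single_process() -> float:
    """Run one step with world_size=1 so the sketch works without a launcher."""
    # Assumption: file-based rendezvous in a fresh temporary directory.
    init_file = os.path.join(tempfile.mkdtemp(), "ddp_init")
    dist.init_process_group(
        backend="gloo",
        init_method=f"file://{init_file}",
        rank=0,
        world_size=1,
    )
    try:
        torch.manual_seed(0)
        model = nn.Linear(4, 1)
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
        inputs, targets = torch.randn(8, 4), torch.randn(8, 1)
        return train_step(model, optimizer, inputs, targets)
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    print(f"loss: {demo_single_process():.4f}")
```

In a real multi-process run, each rank would be started by a launcher such as `torchrun`, would read a different shard of the data, and would need to begin from identical parameters (for example by broadcasting rank 0's weights with `dist.broadcast`); the sketch omits those parts.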