Gradient Sparsification
Distributed-Methods Optimization Stochastic-Optimization Data-Parallel-Methods