
On Arbitrary Ignorance of Stragglers with Gradient Coding

2023 IEEE 43rd International Conference on Distributed Computing Systems (ICDCS), 2023

Abstract
Gradient methods, such as gradient descent, are widely deployed to train optimization-based models in machine learning. To train such models on a large dataset, the dataset is commonly split into multiple partitions, each processed by a different worker. To tolerate stragglers, existing techniques either use gradient coding (GC) to recover the full gradients from a certain number of workers or recover gradients only partially from an arbitrary number of workers. In this paper, we propose ignore-straggler gradient coding (IS-GC), which allows GC to tolerate an arbitrary number of stragglers. Compared to approximated gradient descent, IS-GC recovers more gradients given the same number of stragglers. We design a graph-based model to decode coded gradients under an arbitrary number of stragglers, and prove that it maximizes the recovery of gradients. We apply IS-GC to fractional repetition (FR) and cyclic repetition (CR), two representative dataset placement schemes of GC. We also propose hybrid repetition (HR), which generalizes over FR and CR and achieves a flexible trade-off between the two. With extensive experiments, we demonstrate that IS-GC can flexibly tolerate an arbitrary number of stragglers and achieve low training completion time.
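The abstract does not spell out the IS-GC decoder, but a minimal sketch of the fractional repetition (FR) placement it builds on may help illustrate how coded gradients tolerate stragglers. The sketch below shows the classical FR scheme, in which groups of s+1 workers replicate the same block of partitions and the master recovers the exact full gradient as long as at most s workers straggle. All names (fr_assignments, worker_message, master_decode) are illustrative and not taken from the paper.

```python
import numpy as np

def fr_assignments(n_workers, s):
    """Fractional repetition (FR) placement: workers are split into groups of
    size s+1, and each group replicates the same block of s+1 data partitions
    (n_workers partitions in total)."""
    assert n_workers % (s + 1) == 0, "FR needs (s+1) to divide n_workers"
    groups = n_workers // (s + 1)
    assign = {}
    for g in range(groups):
        parts = list(range(g * (s + 1), (g + 1) * (s + 1)))
        for w in parts:          # worker ids coincide with partition ids here
            assign[w] = parts
    return assign, groups

def worker_message(worker, assign, partial_grads):
    """Each worker sends the sum of the gradients of its assigned partitions."""
    return sum(partial_grads[p] for p in assign[worker])

def master_decode(messages, groups, s):
    """Recover the full gradient from any responses covering every group at
    least once (guaranteed when at most s workers straggle)."""
    total, seen = 0, set()
    for w, msg in messages.items():
        g = w // (s + 1)
        if g not in seen:
            seen.add(g)
            total = total + msg
    assert len(seen) == groups, "an entire group straggled; cannot recover"
    return total

# Toy run: 6 workers, tolerate s = 2 stragglers.
n, s, dim = 6, 2, 4
rng = np.random.default_rng(0)
partial_grads = [rng.normal(size=dim) for _ in range(n)]   # one gradient per partition
assign, groups = fr_assignments(n, s)

# Pretend workers 1 and 5 straggle: the master still decodes the exact full gradient.
messages = {w: worker_message(w, assign, partial_grads) for w in range(n) if w not in (1, 5)}
recovered = master_decode(messages, groups, s)
assert np.allclose(recovered, sum(partial_grads))
```

Under this placement, losing more than s workers can leave a whole group unanswered, in which case the full gradient cannot be recovered; the paper's IS-GC instead decodes as much of the gradient as possible from whatever workers respond.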