
Faster Federated Learning With Decaying Number of Local SGD Steps

IEEE Transactions on Parallel and Distributed Systems (2023)

Abstract
In Federated Learning (FL), client devices connected over the internet collaboratively train a machine learning model without sharing their private data with a central server or with other clients. The seminal Federated Averaging (FedAvg) algorithm trains a single global model by performing rounds of local training on clients followed by model averaging. FedAvg can improve the communication efficiency of training by performing more steps of Stochastic Gradient Descent (SGD) on clients in each round. However, client data in real-world FL is highly heterogeneous, which has been extensively shown to slow model convergence and harm final performance when K > 1 steps of SGD are performed on clients per round. In this article we propose decaying K as training progresses, which can jointly improve the final performance of the FL model whilst reducing the wall-clock time and the total computational cost of training compared to using a fixed K. We analyse the convergence of FedAvg with decaying K for strongly-convex objectives, providing novel insights into the convergence properties, and derive three theoretically-motivated decay schedules for K. We then perform thorough experiments on four benchmark FL datasets (FEMNIST, CIFAR100, Sentiment140, Shakespeare) to show the real-world benefit of our approaches in terms of real-world convergence time, computational cost, and generalisation performance.
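The abstract describes FedAvg with a per-round local-step budget K that decays over training. The sketch below illustrates that idea on a toy least-squares problem with synthetic heterogeneous clients; the exponential decay schedule, step counts, and learning rate are illustrative assumptions and are not the theoretically-motivated schedules derived in the paper.

```python
# Minimal sketch of FedAvg with a decaying number of local SGD steps K.
# The decay schedule (exponential), K0, DECAY and LR are assumed values for
# illustration only; the paper derives its own decay schedules for K.
import numpy as np

rng = np.random.default_rng(0)

NUM_CLIENTS = 10
DIM = 5
ROUNDS = 50
K0 = 32        # initial number of local SGD steps per round (assumed)
DECAY = 0.95   # per-round multiplicative decay factor (assumed)
LR = 0.05

# Synthetic heterogeneous client data: each client has a different optimum.
client_data = []
for _ in range(NUM_CLIENTS):
    X = rng.normal(size=(100, DIM))
    w_true = rng.normal(size=DIM)
    y = X @ w_true + 0.1 * rng.normal(size=100)
    client_data.append((X, y))

def local_sgd(w, X, y, steps, lr, batch=10):
    """Run `steps` of mini-batch SGD on one client's least-squares loss."""
    w = w.copy()
    for _ in range(steps):
        idx = rng.choice(len(y), size=batch, replace=False)
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch
        w -= lr * grad
    return w

w_global = np.zeros(DIM)
for t in range(ROUNDS):
    K_t = max(1, int(round(K0 * DECAY ** t)))  # decaying local-step budget
    # Each client trains locally from the current global model, then the
    # server averages the resulting models (vanilla FedAvg aggregation).
    local_models = [local_sgd(w_global, X, y, K_t, LR) for X, y in client_data]
    w_global = np.mean(local_models, axis=0)

print("final global model:", w_global)
```

Starting with a large K keeps early rounds communication-efficient, while shrinking K later reduces the client drift caused by heterogeneous data as the model approaches convergence.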
Key words
Training, Convergence, Computational modeling, Servers, Data models, Costs, Benchmark testing, Computational efficiency, deep learning, edge computing, federated learning