Enhancing Federated Learning With Server-Side Unlabeled Data by Adaptive Client and Data Selection.

IEEE Trans. Mob. Comput. (2024)

Abstract
Federated learning (FL) has been widely applied to collaboratively train deep learning (DL) models on massive end devices (i.e., clients). Due to limited storage capacity and high labeling cost, the data on each client may be insufficient for model training. Conversely, cloud datacenters hold large-scale unlabeled data, which are easy to collect from public sources (e.g., social media). Herein, we propose the Ada-FedSemi system, which leverages both on-device labeled data and in-cloud unlabeled data to boost the performance of DL models. In each round, local models are aggregated to produce pseudo-labels for the unlabeled data, which are then utilized to enhance the global model. Considering that the number of participating clients and the quality of pseudo-labels have a significant impact on training performance, we introduce a multi-armed bandit (MAB) based online algorithm to adaptively determine the participating fraction and confidence threshold. Besides, to alleviate the impact of stragglers, we assign local models of different depths to heterogeneous clients. Extensive experiments on benchmark models and datasets show that, given the same resource budget, the model trained by Ada-FedSemi achieves 3%–14.8% higher test accuracy than the baseline methods. When achieving the same test accuracy, Ada-FedSemi saves up to 48% of the training cost compared with the baselines. Under the scenario with heterogeneous clients, the proposed HeteroAda-FedSemi can further speed up the training process by 1.3×–1.5×.
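The abstract's two control knobs, the participating fraction and the pseudo-label confidence threshold, can be illustrated with a minimal sketch. This is not the paper's algorithm: the arm grid, reward signal, and helper names below are assumptions for illustration; a classic UCB1 bandit stands in for the paper's MAB-based online algorithm, and pseudo-labels are kept only when the model's top class probability clears the chosen threshold.

```python
# Hypothetical sketch, NOT the Ada-FedSemi implementation: a UCB1 bandit over a
# small grid of (participation fraction, confidence threshold) arms, plus a
# confidence filter for server-side pseudo-labels. All names are illustrative.
import math

# Discrete arms: each pairs a client participation fraction with a threshold.
ARMS = [(f, t) for f in (0.1, 0.3, 0.5) for t in (0.8, 0.9, 0.95)]

class UCB1:
    """Classic UCB1 over the discrete (fraction, threshold) arms."""
    def __init__(self, n_arms):
        self.counts = [0] * n_arms     # times each arm was played
        self.values = [0.0] * n_arms   # running mean reward per arm
        self.total = 0                 # total plays

    def select(self):
        # Play each arm once before applying the exploration bonus.
        for i, c in enumerate(self.counts):
            if c == 0:
                return i
        return max(
            range(len(self.counts)),
            key=lambda i: self.values[i]
            + math.sqrt(2 * math.log(self.total) / self.counts[i]),
        )

    def update(self, arm, reward):
        # Reward could be, e.g., accuracy gain per unit of training cost.
        self.total += 1
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

def filter_pseudo_labels(probs, threshold):
    """Keep unlabeled samples whose max class probability clears the threshold.

    probs: list of per-sample class-probability lists (e.g., softmax outputs).
    Returns (sample index, pseudo-label) pairs for the retained samples.
    """
    kept = []
    for i, p in enumerate(probs):
        conf = max(p)
        if conf >= threshold:
            kept.append((i, p.index(conf)))
    return kept
```

In a round, the server would call `select()` to pick an arm, sample that fraction of clients, filter the unlabeled pool with the arm's threshold, and feed the observed training reward back via `update()`. A higher threshold yields fewer but cleaner pseudo-labels, which is exactly the trade-off the bandit is meant to navigate.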
Keywords
Federated learning, pseudo-labeling, semi-supervised learning, edge computing