FedISM: Enhancing Data Imbalance via Shared Model in Federated Learning

Wu-Chun Chung, Yan-Hui Lin, Sih-Han Fang

MATHEMATICS(2023)

Abstract
Considering the sensitivity of data in medical scenarios, federated learning (FL) is suitable for applications that require data privacy. Medical personnel can use the FL framework to apply machine learning to large-scale data that remain protected within each institution. However, not all clients have the same dataset distribution, so data imbalance arises among clients. The main challenge is to overcome the resulting performance degradation: low accuracy and failure of the model to converge. This paper proposes FedISM, a method to enhance performance under non-independent and identically distributed (non-IID) data. FedISM exploits a shared model trained on a candidate dataset before performing FL among clients. A Candidate Selection Mechanism (CSM) is proposed to effectively select the most suitable candidate among clients for training the shared model. With these approaches, FedISM not only trains the shared model without sharing any raw data but also provides an optimal solution by selecting the best shared model. To evaluate performance, FedISM was applied to classify coronavirus disease (COVID), pneumonia, normal, and viral pneumonia cases in the experiments, and the Dirichlet process was used to simulate a variety of imbalanced data distributions. Experimental results show that FedISM improves accuracy by up to 25%, which is significant as privacy concerns regarding patient data rise among medical institutions.
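The abstract mentions using the Dirichlet process to simulate imbalanced (non-IID) data distributions across clients. A minimal sketch of the common Dirichlet-based partitioning scheme is shown below; the function name, the concentration parameter `alpha = 0.5`, and the synthetic labels are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices among clients using a Dirichlet(alpha) prior
    per class; smaller alpha yields a more imbalanced (non-IID) split.
    Illustrative sketch -- not the paper's exact procedure."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        # Shuffle the indices of samples belonging to class c.
        idx = rng.permutation(np.where(labels == c)[0])
        # Draw the proportion of class c assigned to each client.
        props = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices

# Example: 4 classes (e.g., COVID, pneumonia, normal, viral pneumonia)
# with 250 synthetic samples each, split across 5 clients.
labels = np.repeat([0, 1, 2, 3], 250)
parts = dirichlet_partition(labels, num_clients=5, alpha=0.5)
```

With small `alpha` (e.g., 0.1), each client receives most of its samples from only a few classes, reproducing the kind of label-distribution skew the paper evaluates; large `alpha` approaches an IID split.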
Key words
federated learning, shared model, data imbalance, Non-IID, COVID-19