Learning while Respecting Privacy and Robustness to Adversarial Distributed Datasets

European Signal Processing Conference (EUSIPCO) (2022)

Abstract
Massive datasets are typically distributed geographically across multiple sites, where scalability, data privacy and integrity, as well as bandwidth scarcity typically discourage uploading these data to a central server. This has propelled the so-called federated learning framework, where multiple workers exchange information with a server to learn a "centralized" model using data locally generated and/or stored across workers. This framework requires workers to communicate iteratively with the server. Although appealing for its scalability, the framework must carefully account for the various data distribution shifts across workers, which degrade the performance of the learnt model. In this context, the distributionally robust optimization framework is considered here. The objective is to endow the trained model with robustness against adversarially manipulated input data, or distributional uncertainties, such as mismatches between training and testing data distributions, or among datasets stored at different workers. To this aim, the data distribution is assumed unknown, and to land within a Wasserstein ball centered around the empirical data distribution. This robust learning task entails an infinite-dimensional optimization problem, which is challenging. Leveraging a strong duality result, a surrogate is obtained, for which a primal-dual algorithm is developed. Compared to classical methods, the proposed algorithm offers robustness with little computational overhead. Numerical tests using image datasets showcase the merits of the proposed algorithm under several existing adversarial attacks and distributional uncertainties.
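The strong-duality surrogate mentioned above can be illustrated with a common Wasserstein-DRO reformulation: the worst-case expected loss over a Wasserstein ball is upper-bounded by a penalized inner maximization over input perturbations, max_delta loss(w; x + delta, y) - gamma * ||delta||^2, where gamma relates to the dual variable of the ball's radius constraint. The minimal sketch below is illustrative only and is not the paper's algorithm: it uses a simple logistic loss and plain gradient ascent for the inner maximization, with all function names and step sizes chosen here for exposition.

```python
import numpy as np

def robust_loss(w, x, y, gamma=1.0, steps=20, lr=0.1):
    """Penalized robust surrogate of the logistic loss (illustrative sketch).

    Approximates  max_delta  log(1 + exp(-y * w.(x + delta))) - gamma * ||delta||^2
    by gradient ascent on delta. For gamma large enough relative to ||w||^2,
    the inner problem is strongly concave and ascent converges.
    """
    delta = np.zeros_like(x)
    for _ in range(steps):
        z = (x + delta) @ w
        p = 1.0 / (1.0 + np.exp(-y * z))            # sigmoid(y * w.x')
        # Gradient of the penalized objective with respect to delta.
        grad_delta = -(1.0 - p) * y * w - 2.0 * gamma * delta
        delta += lr * grad_delta                    # ascent step
    z = (x + delta) @ w
    return np.log1p(np.exp(-y * z)) - gamma * np.dot(delta, delta)
```

Since the perturbation starts at zero, the surrogate is never below the nominal loss at the same weights, which is the sense in which training on it hedges against distribution shift within the ball.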
Keywords
bandwidth scarcity, central server, federated learning framework, multiple workers exchange information, centralized model, data distribution shifts, learnt model, distributionally robust optimization framework, trained model, adversarially manipulated input data, distributional uncertainties, testing data distributions, different workers, empirical data distribution, robust learning task, image datasets, existing adversarial attacks, adversarial distributed datasets, massive datasets, data privacy