Secure and Practical Federated Matrix Factorization

PhD Thesis Proposal Defence


Title: "Secure and Practical Federated Matrix Factorization"

by

Mr. Di CHAI


Abstract:

Matrix factorization (MF) is an essential primitive to support various 
applications including recommender systems, genetic studies, topic modeling, 
financial applications etc. While most MF-based applications deal with 
large-scale sensitive data, e.g., users' shopping history in the recommender 
system and genome data in genetic studies, they encounter the 
isolated-data-islands problem. Restricted by the privacy-preserving 
regulations, it is challenging to share the data across different parties. 
Consequently, the data are in the form of isolated islands and conventional MF 
that requires data collected centrally cannot work. To solve this problem, 
federated MF has been proposed to decompose data distributed among different 
parties. While pioneered studies have shown the feasibility of federated MF, 
they are limited in terms of privacy, utility, or efficiency.

In this proposal, we first define the problem of federated MF and its two 
subproblems: federated sparse MF and federated dense MF. We then present the 
targets achieved by existing works and propose three new research questions. In 
the first question, we focus on the MF-based recommender system that deals with 
sparse matrices. Existing works exchange gradients in plaintext, raising 
significant privacy concerns. Our first contribution is mathematically proving 
that gradients leak raw data, leading us to protect the gradients using 
homomorphic encryption. However, even with the gradient values secured, the 
indexes of the gradients can still leak information (e.g., used for inference 
attacks) due to the sparsity of the rating matrices. To address this, we 
propose an obfuscation-based method to defend against inference attacks with 
bounded utility loss and efficiency overhead. In the second question, we focus 
on federated dense MF which supports applications such as genetic studies, 
topic modeling, and financial applications. These applications handle 
large-scale dense matrices and require high accuracy. Existing works either 
suffer from utility penalties because of leveraging differential privacy or 
face severe efficiency issues due to leveraging homomorphic encryption. We 
propose a practical lossless federated singular vector decomposition (SVD) 
system that is capable of decomposing billion-scale data. Specifically, we 
propose lossless masking-based protection tailored for federated SVD and 
improve the efficiency through extensive system optimizations. The evaluation 
results demonstrate that our proposed solutions outperform existing studies and 
significantly improve the practicality of MF-based real-world applications.

In the future thesis work, we plan to concentrate on decentralized federated 
MF, which is the third question. Most of the existing solutions rely on 
third-party servers, which compromises the system's security and increases 
implementation costs, given that trustworthy servers are not easy to find. We 
believe that decentralized solutions can further improve the security and 
practicality of federated MF systems.


Date:                   Monday, 15 April 2024

Time:                   2:00pm - 4:00pm

Venue:                  Room 5501
                        Lifts 25/26

Committee Members:      Prof. Qiang Yang (Supervisor)
                        Prof. Kai Chen (Supervisor)
                        Prof. Qifeng Chen (Chairperson)
                        Prof. Ke Yi