Why Federated Learning Is Pivotal to Privacy-Preserving Machine Learning


The simplest way to implement machine learning (ML) is to run it on a central processing platform against all the available data, applying algorithms to that data to extract as much information as possible and to build models from it. The data may come from any number of other processing platforms and may well contain information that could be viewed as private. Moving that data away from its original platform could also put data privacy at risk. So, what can be done instead to enable machine learning while preserving the privacy of the data? The answer is federated learning – but how does it work?

Federated learning uses decentralized edge devices (e.g., mobile phones) or servers to hold the data and runs machine learning algorithms against this distributed data. At no time is the original data transferred to a centralized server; it stays on the device. The benefits of this approach are data privacy and data security – the raw data is never exposed to anyone else. Local copies of the algorithm are trained on local data, and the results of that learning are then shared with a central server to produce a ‘global’ model or algorithm. This global model can then be sent back to the edge devices to continue learning.
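To make that round trip concrete, here is a minimal sketch of one federated round on a toy linear-regression task, written in plain Python/NumPy. The function names (`local_update`, `federated_round`) and the synthetic per-device datasets are illustrative only, not part of any production framework.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Train the current global weights on one device's private data.
    Only the updated weights leave the device, never X or y."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
        w -= lr * grad
    return w

def federated_round(global_weights, devices):
    """Collect locally trained weights and average them into a new global model."""
    updates = [local_update(global_weights, X, y) for X, y in devices]
    return np.mean(updates, axis=0)

# Synthetic 'devices': each holds its own small private dataset.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
devices = []
for _ in range(10):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    devices.append((X, y))

w = np.zeros(2)
for round_no in range(20):
    w = federated_round(w, devices)
print(w)  # approaches [2.0, -1.0] without any raw data leaving a device
```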

The approach has been described as low-impact, resilient, and secure. The final model can also be more accurate than one trained centrally, because it learns from the huge, diverse pools of relevant data held across all the participating devices. In addition, with the growing importance of green issues, it’s worth noting that less power is consumed centrally because the training work is spread across edge devices.

Learn more: Combating Infobesity: 5 Things To Keep in Mind When Building Machine Learning Models

Difficulties With Federated Learning

Working this way does have some problems. For example, the local datasets are not identically distributed: some may contain lots of relevant data for the algorithm to use, whereas others may not. Another problem with using mobile phones and IoT devices, for example, is that they may suffer power failures or lose their Wi-Fi connection. So, the algorithm may not always run, or may fail to send its results back to the central server.

Another problem that has to be overcome is that training the algorithm on the device, e.g., a mobile phone, could drain its battery. One way around that is to ensure that devices only participate in training when they are eligible – for instance, a phone might count as eligible only when it is charging, connected to free Wi-Fi, and not doing anything else, as in the sketch below.
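As a simple illustration, an eligibility check of the kind just described might look something like the following; the field and function names here are hypothetical, not taken from any specific framework.

```python
from dataclasses import dataclass

@dataclass
class DeviceState:
    is_charging: bool      # plugged in, so training won't drain the battery
    on_free_wifi: bool     # unmetered connection for uploading results
    is_idle: bool          # not being actively used by its owner

def eligible_for_training(state: DeviceState) -> bool:
    """A device joins a training round only when all conditions hold."""
    return state.is_charging and state.on_free_wifi and state.is_idle

print(eligible_for_training(DeviceState(True, True, True)))   # True
print(eligible_for_training(DeviceState(True, False, True)))  # False: metered connection
```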

Training and Security

Once the training algorithm finishes work on the data, it sends the results (not the data) back to the server. The results are encrypted before they are sent, which prevents anyone from intercepting them and reconstructing the original data. To make things more secure, the results can be encrypted using a key that the server doesn’t have – making it almost impossible to reconstruct any data. Several packages of model updates can be sent over time to build a high-quality model from each device, and the training algorithm can then delete itself from the device it was running on. To keep upload times as short as possible, the updates are compressed using random rotations and quantization.
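As a rough illustration of that compression idea, the sketch below applies a shared random rotation followed by coarse uniform quantization to an update vector. It is a simplified stand-in for the production scheme, with illustrative function names; the key point is that the rotation spreads the values out so that only a few bits per value are needed.

```python
import numpy as np

def random_rotation(dim, seed):
    """Random orthogonal matrix derived from a seed both client and server know."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    return q

def compress(update, rotation, bits=4):
    """Rotate the update, then quantize it to a small number of integer levels."""
    rotated = rotation @ update
    lo, hi = rotated.min(), rotated.max()
    levels = 2 ** bits - 1
    codes = np.round((rotated - lo) / (hi - lo) * levels).astype(np.uint8)
    return codes, lo, hi               # small integer codes plus two floats

def decompress(codes, lo, hi, rotation, bits=4):
    """Undo the quantization and apply the inverse (transposed) rotation."""
    levels = 2 ** bits - 1
    rotated = codes / levels * (hi - lo) + lo
    return rotation.T @ rotated

update = np.random.default_rng(1).normal(size=64)
R = random_rotation(64, seed=42)
codes, lo, hi = compress(update, R)
recovered = decompress(codes, lo, hi, R)
print(np.abs(update - recovered).max())  # modest reconstruction error at a fraction of the bits
```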

Federated averaging is a technique in which the server uses only the averaged results from the updates it receives. An additional technique is secure aggregation: the server combines the encrypted results from any number of edge devices and can only decrypt the aggregate, never an individual contribution. This adds an extra layer of security, making it even more difficult to reconstruct the original data. Under the secure aggregation protocol, each edge device adds zero-sum masks to its results before transmitting them. The scrambled results are sent to the server, and when those training results are added up, the masks exactly cancel out.
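The zero-sum masking trick can be shown in a few lines. In the toy sketch below, every pair of devices agrees on a random mask that one adds and the other subtracts; the server sees only scrambled updates, yet their sum – and therefore the federated average – comes out exactly right. Real secure aggregation derives these masks cryptographically; plain random numbers are used here purely to show the cancellation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_devices, dim = 4, 3
updates = [rng.normal(size=dim) for _ in range(n_devices)]

# Pairwise zero-sum masks: mask[(i, j)] = -mask[(j, i)]
pair_masks = {}
for i in range(n_devices):
    for j in range(i + 1, n_devices):
        m = rng.normal(size=dim)
        pair_masks[(i, j)] = m
        pair_masks[(j, i)] = -m

masked = []
for i in range(n_devices):
    total_mask = sum(pair_masks[(i, j)] for j in range(n_devices) if j != i)
    masked.append(updates[i] + total_mask)    # what the server actually receives

# Individually, masked updates look like noise; summed, the masks cancel exactly,
# so the server can still compute the federated average.
server_average = sum(masked) / n_devices
print(np.allclose(sum(masked), sum(updates)))                  # True
print(np.allclose(server_average, np.mean(updates, axis=0)))   # True
```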

Although the server can’t see the results from any individual edge device, there is still a privacy question: what happens if one device produces data so unusual that it stands out from the rest of the collected results? Could the privacy of that device’s owner be compromised by its contribution appearing among the other results on the server? The worrying answer is that it could. To prevent this from happening, any unusual data is discarded. The rationale is that, for machine learning to work optimally, it needs to capture and use common patterns in the data, so anything specific to a single device will not be very useful for the model anyway.

Another technique that can be used is differential privacy. This limits how much any single edge device can contribute to the model, and it is also possible to add noise to obscure rare data. Together, these measures prevent a single device from contributing too much and having too big an influence on the model being created – a problem referred to as model memorization.
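Here is a minimal sketch of that idea, assuming each device’s update is clipped to a fixed norm so no single device can dominate, and Gaussian noise is added to the aggregate to obscure rare values. The clip norm and noise scale are illustrative and not calibrated to any formal privacy budget.

```python
import numpy as np

def clip_update(update, clip_norm=1.0):
    """Scale the update down if its L2 norm exceeds the clipping threshold."""
    norm = np.linalg.norm(update)
    if norm == 0:
        return update
    return update * min(1.0, clip_norm / norm)

def dp_average(updates, clip_norm=1.0, noise_scale=0.1, seed=0):
    """Average clipped updates and add Gaussian noise before updating the model."""
    rng = np.random.default_rng(seed)
    clipped = [clip_update(u, clip_norm) for u in updates]
    noise = rng.normal(scale=noise_scale * clip_norm, size=clipped[0].shape)
    return (np.sum(clipped, axis=0) + noise) / len(updates)

rng = np.random.default_rng(1)
updates = [rng.normal(size=5) for _ in range(100)]
updates.append(rng.normal(size=5) * 50)   # one outlier device with an extreme update
print(dp_average(updates))                # the outlier's influence is bounded by the clip
```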

Learn more: Getting the Data and AI Implementation Right for Your Organization

Testing the Improved Model

At this stage, the new, improved model can be tested by pushing it back out to the edge devices. Initially, the algorithm was trained on each device; now the quality of the resulting model can be tested on those devices as they are used.

It’s possible to train and test simultaneously, but, obviously, on different devices. Federated learning allows training, testing, and analytics to be performed privately and securely. Using this method, the centralized model can be continuously updated, improved, and tested again; several thousand iterations can take place over a few days. The final version can then be pushed out to edge devices and used. Because it will have been tested on lots of data, it will be very accurate. And, because it used federated learning, there was no centralization of the data, which all stayed on the devices it came from. Naturally, the model in use will be static, i.e. there won’t be any updates for a while, until a new round of training takes place to produce the next update.

This type of approach is often used with mobile phones. An example would be predictive text, which suggests the whole word you have just started to type and then tries to predict the next word you want to type. Other examples where it might be used are composing emails or training self-driving cars based on the way drivers actually behave. It could also be used in hospitals to improve the diagnoses given to patients while maintaining patient privacy.

Alternative Models to Federated Learning

In theory, federated learning comes in three forms. There’s centralized federated learning, in which a central server orchestrates the different steps of the training process and coordinates all of the edge devices being used.

With the decentralized federated learning model, the edge devices themselves coordinate the training process and produce the final model. This arrangement removes any single point of failure.

Heterogeneous federated learning involves the use of heterogeneous clients such as mobile phones. This is the type looked at here.

Learn more: Ethical AI May Not See Widespread Adoption by 2030: Pew Research

Conclusion

Federated learning is said to provide a way to “learn from everyone without learning about anyone”. It’s a much more secure way of working than centralized working, and all the data used is kept private. It’s also faster and more energy-efficient than using a centralized ML model.

Do you think federated learning is a better bet than a centralized ML model? Comment below or let us know on LinkedIn, Twitter, or Facebook. We would love to hear from you!