Overview
This page provides an overview of the Federated Learning module in MEDomicsLab, offering insights into both the application's interface and the backend package employed for conducting experiments.
Introduction
The Federated Learning Module in MEDomicsLab simulates the process of federated learning and allows for training models in a decentralized manner using multiple datasets. This approach preserves privacy and enhances data security by ensuring that data never leaves its original location.
Key Aspects of the Federated Learning Module:
Decentralized Training: Models are trained across multiple nodes without transferring raw data.
Privacy Preservation: Utilizing techniques like differential privacy to ensure data confidentiality.
Hyperparameter Optimization: Tools to automatically tune and optimize model hyperparameters for improved performance.
Transfer Learning: Pre-trained models can be used to initialize the central server and improve model performance.
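To make the decentralized-training idea concrete, here is a minimal, library-free sketch of one common aggregation scheme, federated averaging (FedAvg). It is an illustration only and does not use MEDfl's API. The function names (`local_update`, `federated_average`, `run_rounds`), the toy one-parameter linear model, and the choice of FedAvg itself are assumptions, since this page does not state which aggregation strategy the module applies.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair held privately by one client


def local_update(weights: List[float], data: List[Sample], lr: float = 0.05) -> List[float]:
    """One gradient-descent step for the model y = w * x, using only local data."""
    w = weights[0]
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return [w - lr * grad]


def federated_average(client_weights: List[List[float]], client_sizes: List[int]) -> List[float]:
    """Average client weights, weighting each client by its number of samples."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def run_rounds(clients: List[List[Sample]], rounds: int = 50) -> List[float]:
    """Simulate federated rounds: only weights travel, raw samples stay with clients."""
    global_weights = [0.0]
    sizes = [len(data) for data in clients]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Two hypothetical clients whose data follow y = 2 * x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
]
final_weights = run_rounds(clients)
print(final_weights)  # the single weight approaches 2.0
```

Only the weight lists are exchanged between `run_rounds` and the clients, which mirrors the property that raw data never leaves its original location.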
MEDfl package
The Federated Learning module in the MEDomicsLab application uses MEDfl, a standalone Python package designed for simulating federated learning, as its backend.
You can also use MEDfl independently from the app to create your networks and pipelines directly with code. Below is a brief example demonstrating how to do that.
For more detailed examples, you can check the tutorials on the GitHub repository.
Application Interface
The interface of the MEDfl module in the MEDomicsLab application provides a user-friendly space where you can visually manage and connect multiple nodes to create your federated learning pipelines. Each node type in the interface has a specific role and attributes, allowing you to build and customize your federated learning networks seamlessly.
Below is a table explaining the role and attributes of each node:
| Node | Description | Input | Output |
|---|---|---|---|
| Dataset | The Dataset Node is where you specify the master dataset for your experiment. The master dataset is used differently based on the type of network you create. To select a master dataset, click on the "Select Dataset" button, choose the file, and specify the target of the dataset. | / | Dataset |
| Network | The Network Node is responsible for creating the federated network. Clicking on it opens a new screen with additional node types: the Client Node and the Server Node. You can add multiple clients and a central server that aggregates the results. | Dataset | Network |
| FL Setup | The FL Setup Node is responsible for configuring the federated learning setup. The user only needs to specify the name and description of the setup. | Network | FL Setup |
| FL Dataset | The FL Dataset Node creates the federated dataset, which generates train, test, and validation loaders from the clients' datasets. To create a federated dataset, the user must specify two parameters. | FL Setup | FL Dataset |
| Model | The Model Node is responsible for creating the model that initializes the federated learning process. The user has several options, depending on whether transfer learning is activated or deactivated. | FL Dataset | Model |
| Optimize | The Optimize Node is responsible for hyperparameter optimization. Users can choose among several optimization methods. For more details on Optuna, refer to its documentation. | Model + Dataset | Model |
| FL Strategy | The FL Strategy Node is responsible for creating the server strategy used to aggregate and manage the network. | Model | FL Strategy |
| Train Model | The Train Model Node is used to define the client resources for training, specifying whether to use GPU or CPU resources during the training process. | FL Strategy | Train Results |
|  | The Train Model Node is used to define the client resources for training, specifying whether to use GPU or CPU resources during the training process. | Train Results | Save Results |
|  | This node is used to merge two or more results files into one file. | Save Results / none |  |
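
The input and output columns of the table imply an ordering constraint between nodes. The following sketch, which is not part of MEDomicsLab or MEDfl, encodes those columns in a hypothetical `NODE_IO` mapping and checks that every node's inputs are produced earlier in the pipeline. The lowercase type labels and the `validate_pipeline` helper are illustrative assumptions, and only the eight nodes that the table names explicitly are included.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding of the table: node name -> (required inputs, output).
NODE_IO: Dict[str, Tuple[Set[str], str]] = {
    "Dataset": (set(), "dataset"),
    "Network": ({"dataset"}, "network"),
    "FL Setup": ({"network"}, "fl_setup"),
    "FL Dataset": ({"fl_setup"}, "fl_dataset"),
    "Model": ({"fl_dataset"}, "model"),
    "Optimize": ({"model", "dataset"}, "model"),
    "FL Strategy": ({"model"}, "fl_strategy"),
    "Train Model": ({"fl_strategy"}, "train_results"),
}


def validate_pipeline(order: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the order is consistent."""
    available: Set[str] = set()  # outputs produced by the nodes seen so far
    problems: List[str] = []
    for name in order:
        if name not in NODE_IO:
            problems.append(f"unknown node: {name}")
            continue
        required, produced = NODE_IO[name]
        missing = required - available
        if missing:
            problems.append(
                f"{name} is missing input(s): {', '.join(sorted(missing))}"
            )
        available.add(produced)
    return problems


full_pipeline = [
    "Dataset", "Network", "FL Setup", "FL Dataset",
    "Model", "Optimize", "FL Strategy", "Train Model",
]
print(validate_pipeline(full_pipeline))            # []
print(validate_pipeline(["Dataset", "FL Setup"]))  # ['FL Setup is missing input(s): network']
```

In the application itself these connections are made visually; the sketch only restates the table's constraints in code form.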