MEDomicsLab-docs
V1
V1
  • 👋Welcome!
  • 👊Quick start
  • 👀Overview
  • 🏫Tutorials
    • 🔵Design
      • Extraction Module
        • Image Extraction Page
        • Text Extraction Page
        • Time Series Extraction Page
        • MEDimage
      • Input Module
        • Feature Reduction Tool
        • MEDprofiles
          • MEDprofiles Viewer
      • Exploratory Module
    • 🟠Development
      • Learning Module
      • Evaluation Module
    • 🟢Deployment
      • Application Module
  • 🆕New features
  • Upcoming features
  • 💻Contributing
    • Our coding standards
    • How to push my modification ?
  • 🤕Troubleshooting
  • ❓FAQ
  • 🤓About us
  • Important Links
    • Official Website
    • 📔Release Notes
    • 🥲Known Issues
    • 😎Project Board
    • 🧬Physionet
  • MEDIA
    • ⚛️MEDomics
    • 👾Discord
    • 😺Github
    • 📺YouTube
  • Forms
    • 🗣️Contact us
    • 📝Report an issue
Powered by GitBook
On this page
  • 1. Select JPG data
  • 2. Select an extraction type
  • 2.1. DenseNet
  • 3. Extract features
  • 4. Extracted data
Edit on GitHub
  1. Tutorials
  2. Design
  3. Extraction Module

Image Extraction Page

The image extraction page takes JPG images as input and extracts embeddings using a selected model.

PreviousExtraction ModuleNextText Extraction Page

Last updated 11 months ago

When you click on the image extraction icon, you should see this page :

1. Select JPG data

The first step on this page is to select your input, which is a folder containing folders containing JPG images. The folders contained in the main folder must correspond to patients (one folder by patient). All the JPG files must be at the same folder depth level. Once your data will be imported, the warning "No data imported" will be replaced with a success message "Data successfully imported".

2. Select an extraction type

For now, only the DenseNet extraction type is available.

2.1. DenseNet

2.2.1. Select your model weights

The TorchXRayVision library provides seven different weights for the DenseNet model that are available in our application. By default the model weights is set to 'densenet121-res224-chex'.

2.1.2. Select the features you want to generate

The TorchXRayVision DenseNet model provides a vector of 1024 densefeatures and a vector of 18 predictions for the following targets : Atelectasis, Consolidation, Infiltration, Pneumothorax, Edema, Emphysema, Fibrosis, Effusion, Pneumonia, Pleural_Thickening, Cardiomegaly, Nodule, Mass, Hernia, Lung Lesion, Fracture, Lung Opacity, Enlarged Cardiomediastinum.

You may choose to generate only the densefeatures, only the predictions or both.

2.1.3. Format the dataset as master table

Regardless of the selected options, there is a toggle button indicating whether you want your generated embeddings to be Master Table Compatible. Turning this option on will generate embeddings that can be used in the MEDprofiles' process within the input module. The tables generated for the MEDprofiles' process may contain less or different information than the original tables.

For the images extraction, turning this option on will require a CSV file that associates the image filenames (including the .jpg extension) to a datetime. Also, you have to indicate which folder level corresponds to the patients identifiers and there is a checkbox that allows you to convert the patients identifiers folder name into integers (for example if a folder name is 'p123', it will be converted into '123' in the generated embeddings table). This option is useful if you want to compare the patients image data to other data types where the patients identifiers are numbers.

2.1.4. Column name prefix

You can choose to assign a prefix to the generated embeddings column names. This is useful for entering the MEDprofiles' process in the input module, especially for creating MEDclasses that depend on this prefix column name. The prefix must consist only of letters and/or numbers and cannot be empty. The default prefix is 'img'.

3. Extract features

Once all the previous steps have been completed, you can proceed to feature extraction. If a warning appears stating, 'You must select convenient options for feature generation', and the 'Extract Data' button is disabled, please check if you have provided all the required information in the 'Select an Extraction Type' section.

In this section, you can specify the filename under which you want to save your generated embeddings. The filename must be followed by the .csv extension, composed only of letters, numbers, and/or the '_' character, and cannot be empty. The default filename is 'image_extracted_features.csv.' The file will be saved under DATA/extracted_features.

Finally, you can initiate the extraction process by clicking the 'Extract Data' button. This may take a few minutes, and the progress will be displayed in this section and in the output tab.

4. Extracted data

Once the extraction process is complete (which may take a few minutes, but you can monitor the progress on the output tab), a message will appear at the bottom of the page indicating where the features have been saved. You can review your results in the 'Extracted data' section by toggling on the switch. Alternatively, you can open your generated CSV file in your workspace.

This extraction type uses the pre-trained DenseNet model from the TorchXRayVision python library : . TorchXRayVision is a library of chest X-ray datasets and models, therefore this model is intended to be used on chest radiography.

🏫
🔵
https://mlmed.org/torchxrayvision/models.html
Image extraction page
Select JPG data
Select your model weights
Select the features you want to generate
Format the dataset as master table
Column name prefix
Extract features
Section Extracted data while features have been generated