
Step 6: Create Model

Mar 25 – Apr 8 | Create Model


If you completed Step 4 - Explore Data, you have data ready for Step 6 - Create Model.

However, before proceeding to Step 6 - Create Model, we recommend that you replace your own output data from Step 4 - Explore Data (the MEDprofiles/timePoints folder) with the data that we prepared for you (MEDomicsLab_TestingPhase_Step6.zip). This ensures consistent results across all participants of the Testing Phase.

An invitation to access the MEDomicsLab_TestingPhase_Step6.zip data was sent by email.

Scene 1: Time-Dependent Model Comparison

We aim to assess the impact of patient timelines on model performance, hypothesizing that the performance will increase with time, particularly nearing the last hospital stay. We will compare the best models from the following datasets:

  1. Dataset from the data obtained at the first time point (T1_learning_modified.csv).

  2. Dataset combining data from the first and second time points (T1_learning_modified.csv and T2_learning_modified.csv).

  3. Dataset combining data from the first, second, and third time points (T1_learning_modified.csv, T2_learning_modified.csv, and T3_learning_modified.csv).

  4. Dataset combining data from all time points (T1_learning_modified.csv, T2_learning_modified.csv, T3_learning_modified.csv, and T4_learning_modified.csv).
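The four Scene 1 datasets are cumulative: each one adds the next time point to the previous set. As a minimal sketch of that grouping, the snippet below builds the four file lists from the file names given above. The helper name `cumulative_time_point_sets` is hypothetical and is not part of MEDomicsLab; it only makes the structure of the comparison explicit.

```python
from typing import List

# File names taken from the Scene 1 description.
TIME_POINT_FILES = [
    "T1_learning_modified.csv",
    "T2_learning_modified.csv",
    "T3_learning_modified.csv",
    "T4_learning_modified.csv",
]


def cumulative_time_point_sets(files: List[str]) -> List[List[str]]:
    """Return one cumulative file list per dataset: [T1], [T1, T2], ..."""
    return [files[: i + 1] for i in range(len(files))]


if __name__ == "__main__":
    sets = cumulative_time_point_sets(TIME_POINT_FILES)
    for number, subset in enumerate(sets, start=1):
        print(f"Dataset {number}: {', '.join(subset)}")
```

Each printed line corresponds to one of the four datasets whose best models are compared in this scene.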

Scene 2: Variable-Dependent Model Comparison

This scene aims to assess the impact of considered variables on model performance. We will use data from the first two time points (T1_learning_modified.csv and T2_learning_modified.csv), assuming that models involving data from the last time points might make predictions too late in a patient's timeline. We'll compare the best models from the following datasets:

  1. All demographic and time-series data (tslab, tsprocedure, and tschart classes) from T1_learning_modified.csv and T2_learning_modified.csv.

  2. All demographic and notes data (ndischarge and nradiology) from T1_learning_modified.csv and T2_learning_modified.csv.

  3. All demographic and image data from T1_learning_modified.csv and T2_learning_modified.csv.

  4. Selected variables from various data types based on observations made using the first three pipelines, aiming to obtain the best possible model.
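The first three Scene 2 datasets each restrict the variables to demographics plus one data type. The sketch below shows one way to express that restriction as a filter on column names. It assumes, hypothetically, that every column name starts with its class name followed by an underscore (for example `tslab_...`); the actual column names in your files may follow a different convention, so check them first. The helper `select_columns` and all column names in the usage example are invented for illustration.

```python
from typing import Iterable, List

# Class names quoted in the Scene 2 description.
TIME_SERIES_CLASSES = ("tslab", "tsprocedure", "tschart")
NOTES_CLASSES = ("ndischarge", "nradiology")


def select_columns(
    columns: Iterable[str],
    classes: Iterable[str],
    always_keep: Iterable[str] = (),
) -> List[str]:
    """Keep columns listed in `always_keep` or named '<class>_...'.

    ASSUMPTION: the '<class>_' prefix convention is hypothetical.
    """
    prefixes = tuple(f"{name}_" for name in classes)
    keep = set(always_keep)
    return [c for c in columns if c in keep or c.startswith(prefixes)]


if __name__ == "__main__":
    # Invented column names, for illustration only.
    example = ["subject_id", "age", "tslab_a", "tschart_b", "ndischarge_c"]
    demographics = ["subject_id", "age"]
    print(select_columns(example, TIME_SERIES_CLASSES, always_keep=demographics))
    print(select_columns(example, NOTES_CLASSES, always_keep=demographics))
```

The fourth dataset has no fixed rule: its variables are chosen by hand from the observations made with the first three pipelines.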

These scenes are designed to provide a comprehensive comparison of models under different temporal and variable considerations.

Recommendations

Before proceeding with Step 6 - Create Model of the MEDomicsLab Testing Phase, we recommend consulting the documentation of the Learning Module.

Please note that the Learning Module is a graphical implementation of the PyCaret Python library. If you are seeking information about elements in the Learning Module, you may find it in the PyCaret documentation.

The PyCaret documentation often refers to other Python packages, because PyCaret builds its functions around these packages. To learn more about the options of certain functionalities, you may need to search the documentation of these other packages.

For example, if you are looking for information on the fold_strategy parameter in the Dataset box:

  1. Visit the PyCaret documentation, specifically the Data Preprocessing section.

  2. Look for the category related to the fold_strategy parameter, which is under Other Setup Parameters -> Model Selection.

  3. The Model Selection part contains explanations about related parameters, including the fold_strategy parameter. It specifies that this parameter takes, as input, predefined strings or a cross-validation object compatible with scikit-learn. For additional information about the possible values, you will have to search the scikit-learn documentation yourself. For example, to learn more about the default value of fold_strategy (stratifiedkfold), search for 'stratifiedkfold' in the scikit-learn documentation, which has a dedicated page on it.

If you want to fully understand how PyCaret works in the background, it is an open-source library and its code is available on GitHub. As our application uses version 3.1.0, we recommend consulting the 3.1.0 code if your research relates to our application.

Please pay attention to the last sections of the Learning Module documentation:

  • What PyCaret does?

  • PyCaret ROC (Receiver Operating Characteristic)/AUC (Area Under the Curve) plots

Instructions for Step 6 - Create Model

In Step 6 - Create Model, we will use the Learning Module to build machine learning models from the learning set obtained in Step 4 - Explore Data. We will create two Learning scenes, both described above: Scene 1 (Time-Dependent Model Comparison) and Scene 2 (Variable-Dependent Model Comparison).

You are welcome to use this step to conduct your own experiments and explore the functionalities of the Learning Module. However, please note that some options and tooltips are not implemented yet; we intend to address these before Step 8 - Challenge.

Content

  • 0:00 Intro

  • 1:09 First Pipeline

  • 5:37 Explanations about PyCaret

  • 7:35 Scene 1: Time-Dependent Model Comparison

  • 17:12 Scene 2: Variable-Dependent Model Comparison
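As a complement to the fold_strategy example in the Recommendations: 'stratifiedkfold' refers to stratified k-fold cross-validation, which scikit-learn provides as `StratifiedKFold`. The sketch below uses toy data invented here (not Testing Phase data) to show that every test fold keeps the class proportions of the full set. It only illustrates the scikit-learn object and makes no claim about how the Learning Module configures it internally.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Toy data, invented for illustration: 12 samples, 8 negatives and 4 positives.
X = np.arange(24).reshape(12, 2)
y = np.array([0] * 8 + [1] * 4)

# With 4 folds, each test fold should hold 2 negatives and 1 positive.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)

fold_counts = []
for train_idx, test_idx in cv.split(X, y):
    # Count the samples of each class in this test fold.
    counts = np.bincount(y[test_idx], minlength=2)
    fold_counts.append(counts.tolist())

print(fold_counts)
```

An object such as `cv` is the kind of scikit-learn-compatible cross-validation object that, according to the Model Selection part of the PyCaret documentation, fold_strategy accepts in place of a predefined string.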
