Step 6: Create Model
Mar 25 – Apr 8 | Create Model
Scene 1: Time-Dependent Model Comparison
We aim to assess how the amount of a patient's timeline included in the data affects model performance. Our hypothesis is that performance improves as later time points are added, particularly those nearing the last hospital stay. We will compare the best models from the following datasets:
Dataset containing data from the first time point (T1_learning_modified.csv).
Dataset combining data from the first and second time points (T1_learning_modified.csv and T2_learning_modified.csv).
Dataset combining data from the first, second, and third time points (T1_learning_modified.csv, T2_learning_modified.csv, and T3_learning_modified.csv).
Dataset combining data from all time points (T1_learning_modified.csv, T2_learning_modified.csv, T3_learning_modified.csv, and T4_learning_modified.csv).
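The four datasets are cumulative: each one adds the next time point to the previous set. A minimal sketch of that enumeration, using only the file names given above. How the files are actually joined (for example, on a patient identifier) is not specified here, so the sketch stops at listing the files that belong to each comparison.

```python
# File names as given in the text; T1 is the first time point, T4 the last.
TIME_POINT_FILES = [
    "T1_learning_modified.csv",
    "T2_learning_modified.csv",
    "T3_learning_modified.csv",
    "T4_learning_modified.csv",
]


def cumulative_datasets(files):
    """Return one list of files per comparison: [T1], [T1, T2], and so on."""
    return [files[: i + 1] for i in range(len(files))]


scene1_datasets = cumulative_datasets(TIME_POINT_FILES)

for number, dataset in enumerate(scene1_datasets, start=1):
    print(f"Dataset {number}: {', '.join(dataset)}")
```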
Scene 2: Variable-Dependent Model Comparison
This scene assesses how the choice of variables affects model performance. We will use data from the first two time points only (T1_learning_modified.csv and T2_learning_modified.csv), on the assumption that models relying on data from the last time points might make predictions too late in a patient's timeline. We will compare the best models from the following datasets:
All demographic and time-series data (tslab, tsprocedure, and tschart classes) from T1_learning_modified.csv and T2_learning_modified.csv.
All demographic and notes data (ndischarge and nradiology) from T1_learning_modified.csv and T2_learning_modified.csv.
All demographic and image data from T1_learning_modified.csv and T2_learning_modified.csv.
Selected variables from various data types based on observations made using the first three pipelines, aiming to obtain the best possible model.
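One way to picture the first three datasets above is as column subsets of the same table. The sketch below assumes, hypothetically, that each column name starts with its class name (for example `tslab` or `ndischarge`) and that demographic and image columns use the prefixes `demographic` and `img`. None of these naming details is confirmed by this guide, and the example column names are invented.

```python
# Class names for time series and notes come from the text; the "demographic"
# and "img" prefixes are hypothetical placeholders.
VARIABLE_GROUPS = {
    "time_series": ("demographic", "tslab", "tsprocedure", "tschart"),
    "notes": ("demographic", "ndischarge", "nradiology"),
    "images": ("demographic", "img"),
}


def select_columns(columns, prefixes):
    """Keep the columns whose name starts with one of the given prefixes."""
    return [name for name in columns if name.startswith(prefixes)]


# Invented column names, for illustration only.
example_columns = [
    "demographic_age",
    "tslab_0",
    "tsprocedure_0",
    "tschart_0",
    "ndischarge_0",
    "nradiology_0",
    "img_0",
]

subsets = {
    group: select_columns(example_columns, prefixes)
    for group, prefixes in VARIABLE_GROUPS.items()
}
```

The fourth dataset has no fixed prefix list, since its variables are chosen after inspecting the results of the first three.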
These scenes are designed to provide a comprehensive comparison of models under different temporal and variable considerations.
Content
In Step 6 - Create Model, we will use the Learning Module to build machine learning models from the learning set prepared in an earlier step. We will create two Learning scenes: Scene 1, a time-dependent model comparison, and Scene 2, a variable-dependent model comparison.
You are welcome to use this step to conduct your own experiments and explore the Learning Module. Please note, however, that some options and tooltips are not implemented yet; we intend to add them.
Before proceeding with Step 6 - Create Model of the MEDomicsLab Testing Phase, we recommend consulting the documentation of the Learning Module.
Please note that the Learning Module is a graphical implementation of the PyCaret library. If you are looking for information about elements in the Learning Module, you may also find it in the PyCaret documentation.
The PyCaret documentation often refers to other Python packages, because PyCaret builds its functions on top of them. To learn more about the options of certain functionalities, you may need to search the documentation of those packages.
For example, visit the PyCaret documentation and look up the fold_strategy parameter.
The documentation explains the related parameters, including fold_strategy. It specifies that this parameter takes, as input, either a predefined string or a cross-validation object compatible with scikit-learn. For more detail on the possible values, you will have to search the scikit-learn documentation yourself. For instance, to learn more about the default value of fold_strategy (which is stratifiedkfold), search for 'stratifiedkfold' in the scikit-learn documentation; the relevant page is the one describing the StratifiedKFold class.
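To make the default concrete: stratified k-fold splits the data into k folds so that each fold keeps roughly the same class proportions as the full dataset. The following is a simplified pure-Python illustration of that idea, not scikit-learn's implementation; the function name and the toy labels are made up for the example.

```python
from collections import defaultdict


def stratified_fold_ids(labels, n_splits):
    """Assign each sample a fold id, dealing each class out round-robin.

    Simplified illustration only: no shuffling, and not the algorithm
    scikit-learn uses internally.
    """
    indices_by_class = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)

    fold_ids = [None] * len(labels)
    for indices in indices_by_class.values():
        for position, index in enumerate(indices):
            fold_ids[index] = position % n_splits
    return fold_ids


# Toy labels: 8 negatives and 4 positives (a 2:1 ratio).
labels = [0] * 8 + [1] * 4
fold_ids = stratified_fold_ids(labels, n_splits=4)

# Count negatives and positives in each fold.
for fold in range(4):
    fold_labels = [lab for lab, f in zip(labels, fold_ids) if f == fold]
    print(fold, fold_labels.count(0), fold_labels.count(1))
```

With these toy labels every fold keeps the 2:1 ratio of the full set, which is the property that distinguishes a stratified split from a plain k-fold split.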
Also, if you want to fully understand how PyCaret works in the background, it is an open-source library and its code is available on GitHub. (Since our application uses version 3.1.0, we recommend consulting the code for that version if your research relates to our application.)
Please pay attention to the last sections of the Learning Module documentation:
Intro
First Pipeline
Explanations about PyCaret
Scene 1: Time-Dependent Model Comparison
Scene 2: Variable-Dependent Model Comparison