Step 2: Extract Data
Jan 29 - Feb 12 | Extract Data
This is the step where coffee breaks will be most useful, since it requires a considerable amount of computation time.
The Testing Phase of MEDomicsLab uses MIMIC data. For this phase, we selected a small subset of 200 patients to simulate a longitudinal clinical decision support system (CDSS) scenario, as in the study of Morin et al. (see Figure 5), using multimodal data (tabular, time series, text, images) acquired from multiple patient visits over one year (i.e., all data from the year preceding a given patient's death or last visit). In the subsequent steps of the Testing Phase, one of the goals will be to predict one-year mortality using this data.
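The one-year window described above can be pictured with a minimal Python sketch. It is only an illustration under assumptions: the function name `select_last_year`, the record layout, and the dates are hypothetical and are not part of MEDomicsLab or of the MIMIC schema.

```python
from datetime import datetime, timedelta

def select_last_year(events, reference_time, window_days=365):
    """Keep events recorded within window_days before reference_time (inclusive)."""
    start = reference_time - timedelta(days=window_days)
    return [e for e in events if start <= e["time"] <= reference_time]

# Hypothetical records for one patient (not real MIMIC data).
events = [
    {"time": datetime(2020, 1, 10), "modality": "image"},
    {"time": datetime(2021, 3, 5), "modality": "text"},
    {"time": datetime(2021, 11, 20), "modality": "time_series"},
]

# Hypothetical reference point: date of death or of the last visit.
reference = datetime(2021, 12, 1)
recent = select_last_year(events, reference)
print([e["modality"] for e in recent])
```

Only the two records that fall within the 365 days before the reference date are kept; the older image record is dropped.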
In this step of the Testing Phase, we will extract relevant features from time series, text, and imaging data using the tsfresh package (time series) and pretrained models from the study of Soenksen et al. (text and images). More specifically, this covers the following data:
Images from the MIMIC-CXR-JPG database
Laboratory events (time series) from the MIMIC-IV database (we considered a subset of events as in the study of Soenksen et al.)
Chart events (time series) from the MIMIC-IV database (we considered a subset of events as in the study of Soenksen et al.)
Procedure events (time series) from the MIMIC-IV database (we considered a subset of events as in the study of Soenksen et al.)
Radiology notes from the MIMIC-IV-Note database
Discharge notes from the MIMIC-IV-Note database
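To give a sense of what extracting features from the time series modalities means, the sketch below reduces each patient's event values to a few summary statistics. It uses only the Python standard library as a stand-in: it is not tsfresh's API, and tsfresh computes a much larger set of features. The function name `summarize_events`, the feature naming, and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

def summarize_events(rows):
    """Group (patient_id, event_name, value) rows and compute simple summary features."""
    grouped = defaultdict(list)
    for patient_id, event_name, value in rows:
        grouped[(patient_id, event_name)].append(value)

    features = defaultdict(dict)
    for (patient_id, event_name), values in grouped.items():
        features[patient_id][f"{event_name}__mean"] = mean(values)
        features[patient_id][f"{event_name}__minimum"] = min(values)
        features[patient_id][f"{event_name}__maximum"] = max(values)
        features[patient_id][f"{event_name}__length"] = len(values)
    return dict(features)

# Hypothetical laboratory and chart values (not real MIMIC data).
rows = [
    (1, "glucose", 90.0),
    (1, "glucose", 110.0),
    (1, "heart_rate", 72.0),
    (2, "glucose", 130.0),
]
features = summarize_events(rows)
print(features[1])
```

The result is one row of tabular features per patient, which is the general shape of output that makes time series usable alongside the other modalities in later steps.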
Finally, note that the criterion for selecting patients for the Testing Phase (100 alive and 100 deceased after one year) was based on the maximization of the entropy of the distribution of multimodal data over one year.
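The exact entropy computation used for patient selection is not described here. As one possible reading, the sketch below computes the Shannon entropy of how a patient's records are spread across modalities; a more even spread gives a higher value. The function name `shannon_entropy` and the counts are hypothetical.

```python
import math

def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

# Hypothetical record counts per modality: tabular, time series, text, images.
balanced = shannon_entropy([5, 5, 5, 5])
skewed = shannon_entropy([17, 1, 1, 1])
print(balanced, skewed)
```

Under this reading, a patient whose data is evenly distributed across the four modalities scores higher than one whose data is concentrated in a single modality.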
At this point, you must have completed the MIMIC data access requirements in order to access the data.
Once you have submitted the required documents to the MEDomicsLab team, you will receive an email with a link to a drive space containing a zip folder. Download the folder into your documents and follow the instructions for Step 2 in the video below.
For additional information about the Extraction Module and the extraction types used in our application, please refer to the tutorials of the Extraction Module.
Content
Get data and create your workspace 0:00
Extract data from discharge notes 1:32
Extract data from radiology notes 2:52
Extract data from chart events 3:26
Extract data from laboratory events 4:51
Extract data from procedure events 5:50
Extract data from chest X-Ray images 6:59
NOTE: We commonly refer to the study of Soenksen et al. as the "HAIM study".