2025.10.09

Machine Learning Applications for Health

Assignment 2: Deep Learning for Patient Outcome Prediction

[Update as of September 22]: Please note that length of stay (LoS) is typically recorded after the patient is discharged. For in-hospital mortality, LoS will not be used as a feature and therefore will not be included in the feature analysis for step 1 either.

Introduction

Profound hypotension is a life-threatening condition in critically ill patients. For clinicians on the front lines, the ability to accurately and rapidly identify which of these patients face the greatest risk of in-hospital mortality is paramount. This assignment focuses on the application of deep learning for clinical prediction. You will work with a pre-identified cohort of patients with profound hypotension to develop a model that predicts a critical outcome: in-hospital mortality.

The Task

Your primary objective is to construct, train, and evaluate a Multi-Layer Perceptron (MLP) to classify patients based on their in-hospital mortality risk (a classification task). You will navigate the entire machine learning workflow, from exploratory data analysis and feature preprocessing to hyperparameter tuning and model interpretation.

The deadline for this assignment is Wednesday, October 8th, at 11:59 PM.

Dataset

You are provided with the dataset hypotension patients.csv. The key features for your model include demographics, a comorbidity score, and a severity of illness score. The target outcome, in-hospital mortality, is derived from the dod (Date of Death) column.

The provided cohort has the following columns:

"ID": Patient de-identified number;

"anchor_age": Age of the patient (years);

"gender": F for female and M for male;

"dod": Date of Death (if empty means survived - patient outcome variable);

"apsiii”: Severity of illness score;

"LoS": Length of stay in ICU (days - patient outcome variable);

"charlson_comorbidity_index”: Comrbidity index.

Assignment Steps

Step 1: Describe the main properties of the patient cohort via summary statistics

Begin by performing an exploratory data analysis to understand the fundamental characteristics of the patient cohort.

Calculate and present summary statistics (i.e., mean and standard deviation) for all numerical features.

Visualize the distributions of all relevant features using appropriate plots, such as histograms.

Write a brief summary interpreting these findings, describing the overall patient population. (Up to 200 words)
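A minimal sketch of the summary statistics, assuming the cohort is held in a pandas DataFrame. The values below are invented, and NUMERIC and summarize are illustrative names. numpy.histogram is used here only to obtain bin counts; a plotting library would normally draw the histograms themselves.

```python
import numpy as np
import pandas as pd

# Invented values for illustration only; replace with the real cohort.
df = pd.DataFrame(
    {
        "anchor_age": [67, 81, 54, 73, 60],
        "apsiii": [45, 102, 38, 77, 51],
        "charlson_comorbidity_index": [5, 8, 2, 6, 4],
    }
)

NUMERIC = ["anchor_age", "apsiii", "charlson_comorbidity_index"]


def summarize(frame: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Mean and sample standard deviation (ddof=1) for each column."""
    return frame[columns].agg(["mean", "std"]).T


summary = summarize(df, NUMERIC)

# Bin counts and edges for one feature, using 10-year age bins from 50 to 90.
counts, edges = np.histogram(df["anchor_age"], bins=[50, 60, 70, 80, 90])
```

The resulting table has one row per feature and one column per statistic, which is a convenient shape to include in the notebook.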

Step 2: Linear MLP Development and Hyperparameter Tuning

Develop a linear Multi-Layer Perceptron. A linear MLP is a neural network that does not use nonlinear activation functions in its hidden layers, effectively acting as a linear classifier. Your task is to determine the optimal architecture for this model.
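The statement that a linear MLP acts as a linear classifier can be checked numerically. The sketch below uses arbitrary layer sizes and random weights, chosen only for illustration, to show that two stacked layers without an activation collapse to a single affine map, and that inserting ReLU breaks the equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary sizes: 4 input features, 16 hidden units, 1 output logit.
W1, b1 = rng.normal(size=(4, 16)), rng.normal(size=16)
W2, b2 = rng.normal(size=(16, 1)), rng.normal(size=1)

x = rng.normal(size=(8, 4))  # a batch of 8 made-up samples

# Two-layer network with no activation between the layers.
two_layer = (x @ W1 + b1) @ W2 + b2

# Equivalent single layer: the weights and biases fold together.
W, b = W1 @ W2, b1 @ W2 + b2
one_layer = x @ W + b

# Inserting ReLU between the layers breaks this equivalence.
with_relu = np.maximum(x @ W1 + b1, 0.0) @ W2 + b2

print(np.allclose(two_layer, one_layer))
print(np.allclose(with_relu, one_layer))
```

Because of this collapse, extra hidden layers in the linear model change how training behaves but not the family of decision boundaries it can represent.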

Systematically tune the model's key hyperparameters: the number of hidden layers (1, 2, 3), the number of neurons per layer (16, 32, 64), and the learning rate (1e-2, 1e-3, 1e-4).

Justify your final hyperparameter choices using a clear validation-based approach (e.g., a grid search).

Provide plots that visualize the tuning process and support your selection of the best-performing model configuration.

Summarize your methodology, the chosen parameters, and the reasoning behind your choices. (Up to 300 words)
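The grid described in this step has 3 x 3 x 3 = 27 configurations. A minimal sketch of enumerating it and selecting the best configuration by validation score follows. It assumes the first learning rate is meant to be 1e-2, and validation_score is a hypothetical placeholder standing in for training a model and scoring it on a held-out validation split.

```python
from itertools import product

GRID = {
    "hidden_layers": [1, 2, 3],
    "neurons": [16, 32, 64],
    "learning_rate": [1e-2, 1e-3, 1e-4],  # assumption: 1e-2, not 1e2
}


def validation_score(config: dict) -> float:
    """Hypothetical placeholder: train with `config`, return a validation metric.

    It returns a dummy value so the sketch runs; replace it with real
    training and evaluation on the validation split.
    """
    return -abs(config["neurons"] - 32) - abs(config["hidden_layers"] - 2)


def grid_search(grid: dict) -> tuple[dict, list[tuple[dict, float]]]:
    """Score every combination and return the best one plus all results."""
    keys = list(grid)
    results = []
    for values in product(*(grid[k] for k in keys)):
        config = dict(zip(keys, values))
        results.append((config, validation_score(config)))
    best_config, _ = max(results, key=lambda item: item[1])
    return best_config, results


best, results = grid_search(GRID)
```

Keeping every (config, score) pair makes it straightforward to plot validation performance against each hyperparameter. The test set should stay untouched until the final configuration has been chosen.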

Step 3: Evaluating the Impact of Non-Linearity

Investigate the benefit of introducing nonlinearity into your model.

Using the best architecture (layers and neurons) found in Step 2, create a non-linear MLP by adding a non-linear activation function (e.g., ReLU) between the hidden layers.

Train this non-linear model and evaluate its performance on the test set.

Directly compare the performance metrics (i.e., accuracy, precision, recall, F1-score and confusion matrix) of the linear model against the non-linear model.

Write a justification for your choice of activation function and discuss what the performance comparison reveals about the complexity of the patterns in the data. (Up to 300 words)
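The metrics listed in this step all follow from the four cells of the binary confusion matrix. A minimal, dependency-free sketch is shown below. The labels and predictions are invented, binary_metrics is an illustrative name, and treating death (1) as the positive class is an assumption.

```python
def binary_metrics(y_true: list[int], y_pred: list[int]) -> dict:
    """Confusion-matrix counts and derived metrics, with 1 as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    return {
        # Rows: true class 0/1. Columns: predicted class 0/1.
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented labels and predictions for two hypothetical models; the numbers
# say nothing about which model is actually better on the real data.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
linear_pred = [1, 0, 0, 1, 0, 1, 0, 0]
nonlinear_pred = [1, 0, 1, 1, 0, 1, 0, 0]

linear_metrics = binary_metrics(y_true, linear_pred)
nonlinear_metrics = binary_metrics(y_true, nonlinear_pred)
```

If deaths are a minority in the cohort, accuracy alone can look high for a model that rarely predicts death, which is why recall and the confusion matrix are worth reporting alongside it.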

Your code should be written in Python. You can use any publicly available libraries of your choice.

Submission Requirements

Your submission should consist of three files:

1. A README file that states your name, the assignment, a description of the problem being addressed by the code, and instructions on how to create the environment to run your submission.

2. A file that completely specifies the code environment you used (that is, the dependencies and version numbers for all the software you used in your solution). This could be either a requirements.txt (vanilla Python) or an environment.yaml (for Anaconda).

3. A single Jupyter notebook with all the code, output, and explanatory text included.
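One way to collect pinned versions for a requirements.txt is to query the installed packages from Python. The sketch below uses the standard-library importlib.metadata module. The package list is an assumption and should match whatever the notebook actually imports, and pinned_requirements is an illustrative name.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed package list; adjust to the libraries the notebook really uses.
PACKAGES = ["numpy", "pandas", "matplotlib", "scikit-learn", "torch"]


def pinned_requirements(packages: list[str]) -> list[str]:
    """Return 'name==version' lines for the packages that are installed."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Skip packages that are not installed in this environment.
            continue
    return lines


requirements = pinned_requirements(PACKAGES)
print("\n".join(requirements))
```

The printed lines can be pasted into requirements.txt so that the environment described in the README can be recreated.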