Quinnipiac iQ Career and Experiential Learning Lab

Data Science

Maternal Health Risk Classification

By testing multiple machine learning methods, this study conducted by Lily Vogel '25 aimed to improve maternal health risk classification. This research was completed for DS 480: Data Science Capstone.

Overview

Maternal health risk refers to the likelihood that a mother or child experiences harmful outcomes during pregnancy or labor. Despite medical advances, it remains a global issue. This study aimed to improve maternal health risk classification by testing multiple machine learning methods.

Researcher

Lily Vogel '25

Data Science

College of Arts & Sciences

 

Background

Maternal health risk refers to the likelihood that a mother or child experiences harmful outcomes during the pregnancy, labor, or postpartum period (Togunwa et al., 2023). 

Unfortunately, maternal health risk remains a global concern. According to the World Health Organization, nearly 800 women died every day on average from preventable issues during pregnancy and childbirth in 2020 (WHO, 2024). 

Machine learning models can recognize complex patterns within medical data that could be overlooked by humans or traditional statistical analyses (Sathasivam and Abdullahi, 2024). 

While previous studies have explored machine learning techniques for maternal health risk classification, they remain limited in the variety of models compared and in their analysis of feature importance.

 

Introduction

Given these gaps in the research, this study examined multiple machine learning (ML) methods for assessing maternal health risk, using a dataset with more maternal health features in order to identify the most important ones.

In this study, I aimed to improve maternal health risk classification by addressing two research goals: 

  1. First, I used three machine learning models – an ordered probit logistic regression, a decision tree, and a random forest – to identify the most important factors for classifying maternal health risk.
  2. Second, I evaluated and compared overall model accuracy as well as within-group performance using confusion matrices.

By understanding key risk factors and improving accuracy, this study contributes to the effort to use machine learning to improve maternal healthcare outcomes.

 

Methods

Dataset

  • All data came from the Maternal Health Risk dataset from the University of California, Irvine (UCI) Machine Learning Repository (Ahmed, 2020).
  • Data was collected from hospitals, community clinics, and other healthcare facilities in Bangladesh.
  • It contained six features: age, systolic blood pressure, diastolic blood pressure, blood glucose level, body temperature, and heart rate.
  • The target variable, risk level during pregnancy, had three classes – low, mid, and high risk. 

Predictors of Maternal Health Risk

  • Fit 3 optimized models – ordered probit logistic regression, decision tree, and random forest.
  • Found most important predictors for classifying risk in each model. 
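As a rough illustration of how predictors can be ranked in tree-based models, the sketch below fits a decision tree and a random forest with scikit-learn and sorts features by impurity-based importance. Everything here is an assumption for illustration: the column names, the hyperparameters, and especially the data, which is synthetic and built so that the label depends mainly on two columns. The resulting ranking therefore says nothing about the study's findings, and the study's actual tooling and importance measure may differ. The ordered probit model is not shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical column names for the six features described in the dataset.
FEATURES = ["age", "systolic_bp", "diastolic_bp",
            "blood_glucose", "body_temp", "heart_rate"]


def make_synthetic_data(n=300, seed=0):
    """Build made-up data; NOT the UCI dataset.

    By construction the label depends mostly on column 3 and partly on
    column 1, so the ranking below is an artifact of this function.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(FEATURES)))
    score = 2.0 * X[:, 3] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)
    y = np.digitize(score, bins=[-1.0, 1.0])  # 0 = low, 1 = mid, 2 = high
    return X, y


def rank_features(model, X, y):
    """Fit the model and return (feature, importance) pairs, largest first."""
    model.fit(X, y)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]
    return [(FEATURES[i], float(importances[i])) for i in order]


X, y = make_synthetic_data()
tree_ranking = rank_features(
    DecisionTreeClassifier(max_depth=4, random_state=0), X, y)
forest_ranking = rank_features(
    RandomForestClassifier(n_estimators=100, random_state=0), X, y)

for name, value in forest_ranking:
    print(f"{name:>14}: {value:.3f}")
```

Impurity-based importances sum to 1 within each model, so they describe relative contribution inside one model rather than an absolute effect size.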

Model Comparison

  • Compared the accuracy of each of the 3 models. 
  • Accuracy = number of correct classifications / total number of classifications
  • Analyzed within-group classification accuracies by creating confusion matrices.
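The accuracy formula and the confusion matrix above can be sketched in a few lines of plain Python. This is a minimal illustration, not the study's code: the label names follow the three risk classes, and the `actual` and `predicted` lists are made-up toy values.

```python
LABELS = ["low", "mid", "high"]


def confusion_matrix(actual, predicted, labels=LABELS):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Correct classifications (the diagonal) divided by all classifications."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy labels for illustration only.
actual = ["low", "low", "mid", "mid", "high", "high"]
predicted = ["low", "low", "low", "mid", "high", "mid"]

matrix = confusion_matrix(actual, predicted)
for label, row in zip(LABELS, matrix):
    print(f"{label:>4}: {row}")
print(f"accuracy = {accuracy(matrix):.3f}")
```

Reading along a row shows where cases of one actual class ended up, which is what makes within-group errors visible even when overall accuracy looks acceptable.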

 

Discussion

Predictors of Maternal Health Risk

  • Blood glucose level was the most important predictor for classification.
  • Systolic blood pressure was the second most important predictor.
  • Heart rate was the least important feature for classification, contributing the least to the models overall.

Model Comparison

  • The results suggest a substantial difference in classification performance between the models, with random forest outperforming the other two models by a considerable margin.
  • The decision tree model demonstrated moderate performance, ranking second in terms of accuracy.
  • The ordinal logistic regression model yielded the lowest accuracy of the three models.

Within-Group Classification

  • The low- and high-risk classes were classified most accurately.
  • Classifying mid risk was difficult for the models: over half of the mid-risk cases were misclassified, most of them as low risk.
  • Overall, this suggests that the models underestimate mid-level risk.
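One way to surface this kind of pattern is to compute, for each actual class, the share classified correctly and the most common wrong label. The sketch below does that in plain Python. The function name `within_group_summary` is hypothetical, and the toy labels are invented to mimic the shape of the pattern described above; they are not the study's predictions.

```python
from collections import Counter


def within_group_summary(actual, predicted):
    """Per actual class: fraction correct and the most frequent wrong label."""
    totals = Counter(actual)
    correct = Counter(a for a, p in zip(actual, predicted) if a == p)
    errors = {}
    for a, p in zip(actual, predicted):
        if a != p:
            errors.setdefault(a, Counter())[p] += 1
    summary = {}
    for label, n in totals.items():
        top_error = errors[label].most_common(1)[0][0] if label in errors else None
        summary[label] = {
            "accuracy": correct[label] / n,
            "most_common_error": top_error,
        }
    return summary


# Invented toy labels: four cases per class.
actual = ["low"] * 4 + ["mid"] * 4 + ["high"] * 4
predicted = ["low", "low", "low", "mid",
             "low", "low", "mid", "high",
             "high", "high", "high", "mid"]

summary = within_group_summary(actual, predicted)
for label, stats in summary.items():
    print(label, stats)
```

In a screening setting, a mid-risk case labeled as low risk is arguably the more costly kind of error, so the direction of misclassification matters as much as the rate.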

Research Limitations

  • Since all the data was collected in a single country, the dataset lacked geographic diversity, so the results may not generalize to populations elsewhere.
  • The dataset lacked social, economic, and psychological variables, which could be important in maternal health risk prediction.

 

Future Directions

In the future, there are several different ways I could extend my research. Specifically, I would like to:

  • Improve model performance by experimenting with hybrid models and tuning more hyperparameters.
  • Test model generalizability by using a dataset from a different region.
  • Apply the same research goals to a dataset containing multiple demographic features to explore feature importance across different patient groups.
  • Integrate data from Internet of Things (IoT) devices, such as wearables, to develop real-time risk evaluation.

 

For Further Discussion

This serves as an overview of the project and does not include the complete work. To further discuss this project, please email Lily Vogel.

Course Overview

DS 480: Data Science Capstone serves as a culminating experience for the Data Science major. Students work on an independent project that will allow them to integrate knowledge from their previous courses in the major and apply that knowledge to a problem in a domain of their interest.
