Goal
Identify predictors of blood pressure and build a predictive model that flags individuals at high risk of hypertension. The analysis explores how blood pressure relates to number of pregnancies, blood glucose, BMI, age, and diabetes status.
Dataset
The dataset (available on Kaggle) contains records for 768 female patients of Pima Indian heritage, aged 21 years and older. Variables include pregnancies, glucose, blood pressure, BMI, age, and diabetes status, among others. Although originally intended for diabetes classification, the dataset also makes it possible to explore correlations with hypertension, broadening its applicability in medical research.
What I Did
- Data Preparation: Used Python to address data-quality issues, such as missing values encoded as 0.
- Exploratory Data Analysis (EDA): Conducted EDA using histograms, KDE plots, and boxplots to understand variable distributions and identify outliers.
- Statistical Testing: Performed two-sample t-tests to compare blood pressure between groups defined by number of pregnancies, glucose level, BMI, and age, finding statistically significant differences in most comparisons.
- Predictive Modeling: Developed a multiple linear regression model to predict blood pressure, focusing on variables like pregnancies, BMI, and age.
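The data preparation and testing steps above can be sketched as follows. This is a minimal illustration, not the original analysis code: the column names, the tiny made-up data frame, the choice to drop rather than impute missing rows, and the BMI cutoff of 30 are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Made-up rows standing in for the real data (NOT taken from the Kaggle file).
# Column names are assumed for illustration.
df = pd.DataFrame({
    "BloodPressure": [72, 66, 0, 64, 80, 74, 0, 90, 70, 88],
    "BMI": [33.6, 26.6, 23.3, 28.1, 43.1, 25.6, 31.0, 35.3, 0.0, 37.6],
})

# A value of 0 is not physiologically plausible for these columns,
# so treat it as a missing-value marker and convert it to NaN.
zero_as_missing = ["BloodPressure", "BMI"]
df[zero_as_missing] = df[zero_as_missing].replace(0, np.nan)

# Assumption: drop incomplete rows. Imputation would be another option.
clean = df.dropna(subset=zero_as_missing)

# Split blood pressure into two groups at a hypothetical BMI cutoff of 30.
high_bmi = clean.loc[clean["BMI"] >= 30, "BloodPressure"]
low_bmi = clean.loc[clean["BMI"] < 30, "BloodPressure"]

# Two-sample t-test; equal_var=False selects Welch's variant,
# which does not assume equal group variances.
t_stat, p_value = stats.ttest_ind(high_bmi, low_bmi, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

The same pattern applies to the other grouping variables (pregnancies, glucose, age) by changing the column and cutoff used to form the two groups.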
Major Findings
- Statistically Significant Relationships: Found statistically significant differences in blood pressure across groups defined by pregnancies, glucose level, BMI, and age in most comparisons.
- Predictive Modeling Insights: The multiple linear regression model, with predictors such as BMI and age, explained 18% of the variance in blood pressure, underscoring the complexity of hypertension.
- Interconnected Health Conditions: Showed that a diabetes-focused dataset can yield insights into hypertension and overall patient health, highlighting how closely related these conditions are.
- Cardiovascular Risk Management: Stressed the need for early detection of hypertension indicators to enhance cardiovascular health management.
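The share of variance explained by a multiple linear regression, as reported in the findings above, is the model's R². The sketch below shows how that quantity is computed from an ordinary least-squares fit. The data are randomly generated purely for illustration (they are not the study data and will not reproduce the reported figure), and the variable names and coefficients are assumptions.

```python
import numpy as np

# Synthetic predictors and outcome, generated only to make the example runnable.
rng = np.random.default_rng(0)
n = 200
pregnancies = rng.integers(0, 10, size=n)
bmi = rng.normal(32, 6, size=n)
age = rng.integers(21, 70, size=n)
blood_pressure = (
    50 + 0.1 * pregnancies + 0.3 * bmi + 0.2 * age + rng.normal(0, 10, size=n)
)

# Design matrix with an intercept column followed by the three predictors.
X = np.column_stack([np.ones(n), pregnancies, bmi, age])

# Ordinary least-squares fit of blood pressure on the predictors.
coef, *_ = np.linalg.lstsq(X, blood_pressure, rcond=None)
predicted = X @ coef

# R^2 = 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum((blood_pressure - predicted) ** 2)
ss_tot = np.sum((blood_pressure - blood_pressure.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(f"R^2 = {r_squared:.2f}")
```

An R² well below 1 means most of the variation in the outcome is left unexplained by the chosen predictors, which is how a figure such as 18% should be read.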