
Utilization of Predictive Modelling to Identify Risk Factors for Infant Mortality in Ohio
Rose Webster, Yiqing Yang
Advisor: Prof. Zhe Shan
Collaborators: Perinatal Center, CCHMC
WHY INFANT MORTALITY RATE?

TO DECLARE OHIO'S RATE OF INFANT MORTALITY A
PUBLIC HEALTH CRISIS AND URGE COMPREHENSIVE
PRETERM BIRTH RISK SCREENING FOR ALL PREGNANT
WOMEN IN OHIO
HOUSE CONCURRENT RESOLUTION #12

INFANT MORTALITY REDUCTION PLAN 2015-2020


Ohio had the third-highest rate of infant mortality for black babies nationally
Ohio ranked 45th nationally, near the bottom, for infant mortality


APPLICATION OF THE INFORMATION SYSTEM
EXPERIENCE
TO A PUBLIC HEALTH CRISIS

Data Visualization: BANA 6037, Prof. Shaffer & Duell
Advanced Business Intelligence: IS7036, Prof. Shan
https://public.tableau.com/profile/rwbstr#!/vizhome/linkedinfantmortality2/OHIOINFANTMORTALITYRATEIMR
Background to Current Project
WHY DATA VIZ & ADV BI

Immense Potential for Healthcare Research


Both provide a means to examine large datasets and many variables at once
Coming from a non-IT background, these two courses were fascinating
Both had excellent instructors and excellent organization
Comparatively, a gentler learning curve
PROJECT SEQUENCE

Data Identification & Acquisition


Data Cleanup & Preparation
Analysis using the BI tool SPSS Modeler
PROJECT GOALS

Extend the analysis of the Data Visualization Project


Established leading medical causes of high IMR are:
i) preterm births (47 percent)
ii) birth defects (14 percent)
iii) sleep-related deaths (15 percent)
There are many non-medical factors that correlate to poor infant health
outcomes, including race, poverty, poor nutrition, and education.
Utilize the potential of SPSS Modeler in an unbiased, non-directed
approach to examine socio-economic causes of infant mortality
Initiate an interdisciplinary project
Data Identification & Acquisition

Data used in the Data Viz project was easily available on the ODH website: composite data, 2007-2012
Current data was obtained with the help of the Perinatal Center, CCHMC: 2007-2014
ODH allows use of data only after a long approval process
that could potentially take up to 3 months
There was enormous uncertainty about the data being available
Data Characteristics

~1.2 million cases, 2007-2014; 429 fields


Many missing and null values
Very rich in information on many variables required for this
study
Data Preparation by cleaning & filtering was necessary
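As an illustration of why preparation was needed, the share of missing values per field can be profiled before deciding which fields to keep. The sketch below is a minimal standard-library example, not the procedure used in SPSS Modeler. The missing-value markers and the toy records are assumptions for illustration only; the field names are borrowed from the variables of interest listed later in the deck.

```python
from collections import Counter

# Hypothetical markers treated as "missing"; the real dataset's coding may differ.
MISSING_MARKERS = {None, "", "NULL", "NA"}

def missing_share(records, fields):
    """Return {field: fraction of records in which the field is missing}."""
    if not records:
        return {f: 0.0 for f in fields}
    counts = Counter()
    for rec in records:
        for f in fields:
            if rec.get(f) in MISSING_MARKERS:
                counts[f] += 1
    return {f: counts[f] / len(records) for f in fields}

# Toy records (illustrative only, not taken from the study data).
records = [
    {"Mom_Age": 27, "NPREV": 12, "Zip_Res": "45202"},
    {"Mom_Age": None, "NPREV": 9, "Zip_Res": ""},
    {"Mom_Age": 31, "NPREV": "NULL", "Zip_Res": "43215"},
    {"Mom_Age": 22, "NPREV": 10, "Zip_Res": "44101"},
]
shares = missing_share(records, ["Mom_Age", "NPREV", "Zip_Res"])
for field, share in shares.items():
    print(f"{field}: {share:.0%} missing")
```

Fields with a very high missing share are candidates for filtering out before modelling.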
Indiana Project
Reducing Infant Mortality in Indiana
http://www.in.gov/omb/files/Infant_Mortality_Report.pdf
Three groups (KSM Consulting, Management Performance Hub &
Indiana Office of Technology) carried out this study to examine Indiana
IMR
Dataset: 216,488 records, 2011-2013
Using logistic regression and the Kohonen self-organizing map algorithm,
they created an infant mortality risk quantifier
This project aimed to follow the same path
Method Adapted
Domain knowledge & the Indiana project were a guide to variable
selection

Steps of Process
I. Auto Data Preparation: 53 fields identified as top predictors; Zipcode was not
recommended for use
II. Initial Filter node was used to reduce the fields from >400 to the 6-10 of interest. Top
medical predictors were not used
III. Multiple Select nodes were used to remove odd data, e.g. baby weight = 9999, number of
prenatal visits > 40
IV. Stratified sampling was done to pool together an equal number of the two conditions of
infant death (0 = no death, 1 = death)
V. Auto Cluster & Auto Classifier to select models
VI. Connect full dataset to models
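Steps III and IV above can be sketched outside SPSS Modeler as follows. This is a minimal illustration under stated assumptions, not the actual node configuration: the thresholds (weight = 9999, prenatal visits > 40) come from the slide, while the field name Infant_Death, the toy records, and the choice of undersampling the larger class to balance the outcomes are assumptions.

```python
import random

def is_plausible(rec):
    """Mimic the Select-node rules: drop the 9999 weight code and >40 prenatal visits."""
    return rec["Weight"] != 9999 and rec["NPREV"] <= 40

def balanced_sample(records, label="Infant_Death", seed=0):
    """Undersample the larger class so both outcomes (0/1) are equally represented."""
    rng = random.Random(seed)
    deaths = [r for r in records if r[label] == 1]
    survivors = [r for r in records if r[label] == 0]
    n = min(len(deaths), len(survivors))
    sample = rng.sample(deaths, n) + rng.sample(survivors, n)
    rng.shuffle(sample)
    return sample

# Toy records (illustrative only; weights are assumed to be in grams).
records = [
    {"Infant_Death": 0, "Weight": 3400, "NPREV": 12},
    {"Infant_Death": 0, "Weight": 9999, "NPREV": 10},  # removed: weight code 9999
    {"Infant_Death": 0, "Weight": 3100, "NPREV": 55},  # removed: >40 visits
    {"Infant_Death": 0, "Weight": 2900, "NPREV": 8},
    {"Infant_Death": 0, "Weight": 3600, "NPREV": 14},
    {"Infant_Death": 1, "Weight": 900, "NPREV": 3},
    {"Infant_Death": 1, "Weight": 1200, "NPREV": 99},  # removed: >40 visits
]
clean = [r for r in records if is_plausible(r)]
sample = balanced_sample(clean)
print(len(clean), "clean records;", len(sample), "in the balanced sample")
```

Balancing matters here because infant death is a rare outcome; a model trained on the raw class proportions could score well by predicting "no death" for every case.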
I. Auto Data Preparation
Predictors recommended for Use
II. Filtering Variables of Interest
III & IV. Cleaning Variables of Interest

1. Infant Death
2. NPREV: Number of Prenatal Visits
3. Weight: Birth Weight
4. Gest_Calculation: Weeks for Delivery
5. Mom_Age
6. Zip_Res: Zipcode
V. Model Selection
Top Models for Binary & Set Outcomes- Auto Classifier
Auto Cluster Predictive Models
Logistic Regression
K Means
[Figure: Infant Death vs. K-Means Clusters. Axes: Number of Infant Deaths, K-Means cluster]
Models also tested: TwoStep & Kohonen
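The comparison of infant death against cluster membership shown on this slide amounts to a cross-tabulation. A minimal sketch follows, assuming each record already carries a cluster id assigned by a clustering model; the cluster ids and outcome flags below are invented for illustration and are not results from the study.

```python
from collections import Counter

def death_by_cluster(cluster_ids, deaths):
    """Cross-tabulate cluster id against the 0/1 infant-death flag."""
    table = Counter(zip(cluster_ids, deaths))
    clusters = sorted(set(cluster_ids))
    return {c: {"no_death": table[(c, 0)], "death": table[(c, 1)]} for c in clusters}

# Toy inputs (illustrative only): one cluster id and one outcome flag per record.
cluster_ids = [1, 1, 2, 2, 2, 3, 3, 1]
deaths = [0, 0, 1, 1, 0, 0, 1, 0]

crosstab = death_by_cluster(cluster_ids, deaths)
for cluster, counts in crosstab.items():
    print(cluster, counts)
```

A cluster whose death count is far out of proportion to its size points to a combination of variables worth examining more closely.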
Conclusions & Future Directions
Models used in the Indiana project such as Logistic Regression &
Kohonen were identified to be the top models by independent analysis
using auto cluster and auto classifier nodes.
Top medical causes such as genetic abnormalities, Apgar scores and
low birth weight could be identified as such
On excluding the known medical causes, other variables that showed
the most predictive power in multiple analyses were number of prenatal
visits, mother's race, age, pay method and WIC status
Using Zipcode information & census data, it will be possible to identify
geographical areas with high rates and to correct the causes
Creating an infant mortality risk quantifier tool that will ultimately predict
potential loss of life
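The Zipcode direction mentioned above could start from a per-area rate. The sketch below computes the infant mortality rate as deaths per 1,000 live births for each zip code, assuming one record per live birth with a 0/1 death flag. The field name Infant_Death, the placeholder zip labels, the min_births parameter, and the toy records are all assumptions for illustration.

```python
from collections import defaultdict

def imr_by_zip(records, min_births=1):
    """Infant deaths per 1,000 live births for each zip code.

    Zip codes with fewer than min_births records are skipped,
    since rates based on very few births are unstable.
    """
    births = defaultdict(int)
    deaths = defaultdict(int)
    for rec in records:
        zip_code = rec["Zip_Res"]
        births[zip_code] += 1
        deaths[zip_code] += rec["Infant_Death"]
    return {
        z: 1000.0 * deaths[z] / births[z]
        for z in births
        if births[z] >= min_births
    }

# Toy records with placeholder zip labels (illustrative only).
records = [
    {"Zip_Res": "ZIP_A", "Infant_Death": 0},
    {"Zip_Res": "ZIP_A", "Infant_Death": 1},
    {"Zip_Res": "ZIP_A", "Infant_Death": 0},
    {"Zip_Res": "ZIP_A", "Infant_Death": 0},
    {"Zip_Res": "ZIP_B", "Infant_Death": 0},
    {"Zip_Res": "ZIP_B", "Infant_Death": 0},
]
rates = imr_by_zip(records)
for zip_code, rate in sorted(rates.items()):
    print(zip_code, rate)
```

In practice a larger min_births would be needed before ranking areas, and the per-zip rates could then be joined to census data.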
References

http://tamarackcci.ca/blogs/sylvia-cheuy/improving-health-outcomes-women-and-children-ohio
Trends in Infant Mortality in United States: A Brief Study of the Southeastern States from 2005-2009. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4454945/
Office of Health Transformation, Reduce Infant Mortality, March 2016. http://www.healthtransformation.ohio.gov/LinkClick.aspx?fileticket=J2ZnzQtVKXo%3D&tabid=120
Reducing Infant Mortality in Indiana. http://www.in.gov/omb/files/Infant_Mortality_Report.pdf
http://www.odh.ohio.gov/-/media/ODH/ASSETS/Files/cfhs/Infant-Mortality/collaborative/2015/Dr-James-IMRP-Power-Point.ashx?la=en
THANK YOU
Infant Death Vs K Means 10 fields
