Multimethod, multidataset analysis reveals paradoxical relationships between sociodemographic factors, Hispanic ethnicity and diabetes.
BMJ open diabetes research & care
2020; 8 (2)
INTRODUCTION: Population-level and individual-level analyses have strengths and limitations as do 'blackbox' machine learning (ML) and traditional, interpretable models. Diabetes mellitus (DM) is a leading cause of morbidity and mortality with complex sociodemographic dynamics that have not been analyzed in a way that leverages population-level and individual-level data as well as traditional epidemiological and ML models. We analyzed complementary individual-level and county-level datasets with both regression and ML methods to study the association between sociodemographic factors and DM. RESEARCH DESIGN AND METHODS: County-level DM prevalence, demographics, and socioeconomic status (SES) factors were extracted from the 2018 Robert Wood Johnson Foundation County Health Rankings and merged with US Census data. Analogous individual-level data were extracted from 2007 to 2016 National Health and Nutrition Examination Survey studies and corrected for oversampling with survey weights. We used multivariate linear (logistic) regression and ML regression (classification) models for county (individual) data. Regression and ML models were compared using measures of explained variation (area under the receiver operating characteristic curve (AUC) and R²). RESULTS: Among the 3138 counties assessed, the mean DM prevalence was 11.4% (range: 3.0%-21.1%). Among the 12824 individuals assessed, 1688 met DM criteria (13.2% unweighted; 10.2% weighted). Age, gender, race/ethnicity, income, and education were associated with DM at the county and individual levels. Higher county Hispanic ethnic density was negatively associated with county DM prevalence, while Hispanic ethnicity was positively associated with individual DM. ML outperformed regression in both datasets (mean R² of 0.679 vs 0.610, respectively (p<0.001) for county-level data; mean AUC of 0.737 vs 0.727 (p<0.0427) for individual-level data). CONCLUSIONS: Hispanic individuals are at higher risk of DM, while counties with larger Hispanic populations have lower DM prevalence. Analyses of population-level and individual-level data with multiple methods may afford more confidence in results and identify areas for further study.
View details for DOI 10.1136/bmjdrc-2020-001725
View details for PubMedID 33229378
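The abstract above reports a DM prevalence of 13.2% unweighted versus 10.2% weighted, after correcting for oversampling with survey weights. The following is a minimal sketch of that distinction, assuming a plain weighted mean; it is not the study's actual weighting procedure, which the abstract does not describe. The toy sample and its weights are hypothetical; only the counts 1688 and 12824 come from the abstract.

```python
def prevalence(has_dm, weights=None):
    """Share of respondents with DM, optionally weighted by survey weight."""
    if weights is None:
        weights = [1.0] * len(has_dm)
    total = sum(weights)
    return sum(w for flag, w in zip(has_dm, weights) if flag) / total


# Unweighted prevalence from the counts reported in the abstract.
reported_unweighted = round(100 * 1688 / 12824, 1)

# Hypothetical toy sample: the first four respondents belong to an
# oversampled group, so each stands for fewer people (weight 1.0) than
# the last two respondents (weight 3.0).
has_dm = [True, True, False, False, False, False]
weights = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0]

toy_unweighted = prevalence(has_dm)         # 2 of 6 respondents
toy_weighted = prevalence(has_dm, weights)  # weight 2 out of total weight 10
```

In this toy case the oversampled group has more DM, so weighting pulls the estimate down, which is the same direction as the shift reported in the abstract.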
- The Hispanic paradox in the prevalence of obesity at the county-level OBESITY SCIENCE & PRACTICE 2020
COUNTY-LEVEL FACTORS ASSOCIATED WITH CARDIOVASCULAR MORTALITY DISAGGREGATED BY RACE/ETHNICITY
ELSEVIER SCIENCE INC. 2020: 1884
View details for Web of Science ID 000522979101871
- Identification of Factors Associated With Variation in US County-Level Obesity Prevalence Rates Using Epidemiologic vs Machine Learning Models JAMA NETWORK OPEN 2019; 2 (4)
EXPLAINING VARIATION IN US COUNTY-LEVEL OBESITY PREVALENCE
ELSEVIER SCIENCE INC. 2019: 1762
View details for Web of Science ID 000460565901774
The Impact of Ethnicity on Metabolic Outcomes after Bariatric Surgery.
The Journal of surgical research
2019; 236: 345-51
BACKGROUND: Previous studies have demonstrated that ethnic minority patients experience significant metabolic improvements after bariatric surgery but less so than non-Hispanic whites. Previous research has primarily investigated differences between non-Hispanic white and black patients. Thus, there remains a need to assess differences in diabetic outcomes among other ethnic groups, including Hispanic and Asian patient populations. MATERIALS AND METHODS: A retrospective analysis including 650 patients with type II diabetes mellitus (T2DM), who underwent either laparoscopic Roux-en-Y gastric bypass or laparoscopic sleeve gastrectomy (LSG) procedures, was conducted to understand ethnic disparities in diabetic metabolic outcomes, including weight loss, serum concentrations of glucose, fasting insulin, and hemoglobin A1c (HbA1c). Data were from a single academic institution in northern California. Ethnicity data were self-reported. T2DM was defined as having one or more of the following criteria: a fasting glucose concentration >125 mg/dL, HbA1c >6.5%, or taking one or more diabetic oral medications. Diabetes resolution was defined as having a fasting glucose <125 mg/dL, an HbA1c <6.5%, and discontinuation of diabetic oral medications. RESULTS: Within-group comparisons in all ethnic groups showed significant reductions in body mass index, body weight, fasting insulin, fasting glucose, and HbA1c by 6 mo, but Asian patients did not experience further improvement in body mass index or diabetic outcomes at the 12-mo visit. Black patients did not experience additional reductions in fasting insulin or glucose between the 6- and 12-mo visits, and their HbA1c significantly increased. Nevertheless, the majority of patients had diabetes remission by the 12-mo postoperative visit (98%, 97%, 98%, and 92% in Non-Hispanic, Hispanic, black, and Asian, respectively). CONCLUSIONS: The results of this study demonstrate that bariatric surgery serves as an effective treatment for normalizing glucose metabolism among patients with T2DM. However, this study suggests that additional interventions that support black and Asian patients in achieving metabolic outcomes similar to those of non-Hispanic white and Hispanic patients warrant further consideration.
View details for DOI 10.1016/j.jss.2018.09.061
View details for PubMedID 30694776
Identification of Factors Associated With Variation in US County-Level Obesity Prevalence Rates Using Epidemiologic vs Machine Learning Models.
JAMA network open
2019; 2 (4): e192884
IMPORTANCE: Obesity is a leading cause of high health care expenditures, disability, and premature mortality. Previous studies have documented geographic disparities in obesity prevalence. OBJECTIVE: To identify county-level factors associated with obesity using traditional epidemiologic and machine learning methods. DESIGN, SETTING, AND PARTICIPANTS: Cross-sectional study using linear regression models and machine learning models to evaluate the associations between county-level obesity and county-level demographic, socioeconomic, health care, and environmental factors from summarized statistical data extracted from the 2018 Robert Wood Johnson Foundation County Health Rankings and merged with US Census data from each of 3138 US counties. The explanatory power of the linear multivariate regression and the top performing machine learning model were compared using mean R² measured in 30-fold cross validation. EXPOSURES: County-level demographic factors (population; rural status; census region; and race/ethnicity, sex, and age composition), socioeconomic factors (median income, unemployment rate, and percentage of population with some college education), health care factors (rate of uninsured adults and primary care physicians), and environmental factors (access to healthy foods and access to exercise opportunities). MAIN OUTCOMES AND MEASURES: County-level obesity prevalence in 2018, its association with each county-level factor, and the percentage of variation in county-level obesity prevalence explained by linear multivariate and gradient boosting machine regression measured with R². RESULTS: Among the 3138 counties studied, the mean (range) obesity prevalence was 31.5% (12.8%-47.8%). In multivariate regressions, demographic factors explained 44.9% of variation in obesity prevalence; socioeconomic factors, 33.0%; environmental factors, 15.5%; and health care factors, 9.1%. The county-level factors with the strongest association with obesity were census region, median household income, and percentage of population with some college education. R² values of univariate regressions of obesity prevalence were 0.238 for census region, 0.218 for median household income, and 0.160 for percentage of population with some college education. Multivariate linear regression and gradient boosting machine regression (the best-performing machine learning model) of obesity prevalence using all county-level demographic, socioeconomic, health care, and environmental factors had R² values of 0.58 and 0.66, respectively (P < .001). CONCLUSIONS AND RELEVANCE: Obesity prevalence varies significantly between counties. County-level demographic, socioeconomic, health care, and environmental factors explain the majority of variation in county-level obesity prevalence. Using machine learning models may explain significantly more of the variation in obesity prevalence.
View details for PubMedID 31026030
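The abstract above compares models by mean R² measured in 30-fold cross validation. The sketch below shows how a mean held-out R² can be computed. It is an illustration under stated assumptions, not the authors' code: it fits a single-predictor least-squares line rather than a multivariate or gradient boosting model, it uses 5 folds instead of 30 to keep the example small, and the data are synthetic and hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(xs, ys, k):
    """Mean held-out R² over k folds; fold i holds out rows i, i+k, i+2k, ..."""
    scores = []
    for i in range(k):
        test_idx = set(range(i, len(xs), k))
        train_x = [x for j, x in enumerate(xs) if j not in test_idx]
        train_y = [y for j, y in enumerate(ys) if j not in test_idx]
        test_x = [xs[j] for j in sorted(test_idx)]
        test_y = [ys[j] for j in sorted(test_idx)]
        a, b = fit_line(train_x, train_y)
        scores.append(r_squared(test_y, [a + b * x for x in test_x]))
    return mean(scores)


# Synthetic, hypothetical data: an exact line, and the same line with a
# deterministic alternating offset of +3 / -3 standing in for noise.
xs = [float(i) for i in range(30)]
ys_clean = [2.0 * x + 1.0 for x in xs]
ys_noisy = [y + (3.0 if i % 2 == 0 else -3.0) for i, y in enumerate(ys_clean)]

r2_clean = cross_validated_r2(xs, ys_clean, k=5)
r2_noisy = cross_validated_r2(xs, ys_noisy, k=5)
```

Scoring each fold on rows the model did not see during fitting is what makes the two models' R² values comparable: a more flexible model cannot raise its score simply by memorizing the training rows.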