I am a senior biostatistician in the Quantitative Sciences Unit in the Department of Medicine. I am interested in applications of statistical methods to all areas of medicine, with a particular interest in clinical decision making. My current work focuses on computational methods for developing risk prediction tools for health outcomes, with an emphasis on using real-time, genetic, and longitudinal data.

Academic Appointments

  • Instructor, Medicine

Professional Education

  • BA, Wesleyan University, Psychology (2002)
  • MPH, UC Berkeley, Epidemiology & Biostatistics (2007)
  • PhD, UC Berkeley, Biostatistics (2011)

Research & Scholarship

Current Research and Scholarly Interests

I am involved in a variety of research collaborations and always welcome new areas of collaboration. A particular interest is work related to clinical decision making. Current collaborations include:

Predicting Adverse Events using Electronic Health Records:
With researchers in the Division of Nephrology, we are using EHR data from hemodialysis sessions to predict risk of near-term cardiac events. In addition, we are using the longitudinal records to understand blood pressure variability, changes in laboratory measures over time, and other risk factors.

Development of Genetic Risk Scores:
With researchers in Cardiovascular Medicine and Dermatology, we are developing genetic risk scores for cardiovascular disease and melanoma, respectively. We are assessing how these scores improve risk assessment beyond clinical metrics alone. The goal is to integrate these tools into EHR systems.

Transplantation Decision Making:
With researchers in Cardiovascular Medicine, we are working to understand the relationship of donor characteristics to heart transplant recipient outcomes. We are working to create a decision-making tool to assist centers in deciding whether to accept a heart for transplantation.

Health Services/Health Policy:
Across collaborations with researchers in PCOR, Pediatrics, and Surgery, I am working on a variety of projects assessing the ACA, children's health policy, disparities in neonatal outcomes, and the use of quality health indicators.


Journal Articles

  • Early acute lung injury: criteria for identifying lung injury prior to the need for positive pressure ventilation*. Critical care medicine Levitt, J. E., Calfee, C. S., Goldstein, B. A., Vojnik, R., Matthay, M. A. 2013; 41 (8): 1929-1937


    OBJECTIVE: Mortality associated with acute lung injury remains high. Early identification of acute lung injury prior to onset of respiratory failure may provide a therapeutic window to target in future clinical trials. The recently validated Lung Injury Prediction Score identifies patients at risk for acute lung injury but may be limited for routine clinical use. We sought to empirically derive clinical criteria for a pragmatic definition of early acute lung injury to identify patients with lung injury prior to the need for positive pressure ventilation. DESIGN: Prospective observational cohort study. SETTING: Stanford University Hospital. PATIENTS: We prospectively evaluated 256 patients admitted to Stanford University Hospital with bilateral opacities on chest radiograph without isolated left atrial hypertension. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Of the 256 patients enrolled, 62 patients (25%) progressed to acute lung injury requiring positive pressure ventilation. Clinical variables (through first 72 hr or up to 6 hr prior to acute lung injury) associated with progression to acute lung injury were analyzed by backward regression. Oxygen requirement, maximal respiratory rate, and baseline immune suppression were independent predictors of progression to acute lung injury. A simple three-component early acute lung injury score (1 point for oxygen requirement > 2-6 L/min or 2 points for > 6 L/min; 1 point each for a respiratory rate ≥ 30 and immune suppression) accurately identified patients who progressed to acute lung injury requiring positive pressure ventilation (area under the receiver-operator characteristic curve, 0.86) and performed similarly to the Lung Injury Prediction Score. An early acute lung injury score greater than or equal to 2 identified patients who progressed to acute lung injury with 89% sensitivity and 75% specificity. Median time of progression from early acute lung injury criteria to acute lung injury requiring positive pressure ventilation was 20 hours. CONCLUSIONS: This pragmatic definition of early acute lung injury accurately identified patients who progressed to acute lung injury prior to requiring positive pressure ventilation. Pending further validation, these criteria could be useful for future clinical trials targeting early treatment of acute lung injury.
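
    The three-component score described in this abstract can be sketched in a few lines of Python. This is only an illustration of the point rule as worded above, not the authors' code: the function and parameter names are hypothetical, and reading "> 2-6 L/min" as "greater than 2 and up to 6 L/min" is an assumption.

```python
def eali_score(oxygen_l_per_min: float, max_resp_rate: float,
               immune_suppressed: bool) -> int:
    """Early acute lung injury score (0-4 points), following the abstract's wording.

    Assumption: '> 2-6 L/min' is read as oxygen > 2 and <= 6 L/min.
    """
    score = 0
    if oxygen_l_per_min > 6:
        score += 2          # 2 points for oxygen requirement > 6 L/min
    elif oxygen_l_per_min > 2:
        score += 1          # 1 point for oxygen requirement > 2 up to 6 L/min
    if max_resp_rate >= 30:
        score += 1          # 1 point for respiratory rate >= 30
    if immune_suppressed:
        score += 1          # 1 point for baseline immune suppression
    return score


def meets_eali_threshold(score: int) -> bool:
    """The abstract reports a cutoff of score >= 2."""
    return score >= 2


# Hypothetical patient: 4 L/min oxygen, respiratory rate 32, not immune suppressed.
example = eali_score(4, 32, False)
print(example, meets_eali_threshold(example))
```

    The cutoff of 2 is the one the abstract pairs with 89% sensitivity and 75% specificity; the sketch does not reproduce those estimates, only the scoring rule.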

    View details for DOI 10.1097/CCM.0b013e31828a3d99

    View details for PubMedID 23782966

  • Trends in Relative Mortality Between Hispanic and Non-Hispanic Whites Initiating Dialysis: A Retrospective Study of the US Renal Data System. American journal of kidney diseases Arce, C. M., Goldstein, B. A., Mitani, A. A., Winkelmayer, W. C. 2013; 62 (2): 312-321


    BACKGROUND: Hispanic patients undergoing long-term dialysis experience better survival compared with non-Hispanic whites. It is unknown whether this association differs by age, has changed over time, or is due to differential access to kidney transplantation. STUDY DESIGN: National retrospective cohort study. SETTING & PARTICIPANTS: Using the US Renal Data System, we identified 615,618 white patients 18 years or older who initiated dialysis therapy between January 1, 1995, and December 31, 2007. PREDICTORS: Hispanic ethnicity (vs non-Hispanic whites), year of end-stage renal disease incidence, age (as potential effect modifier). OUTCOMES: All-cause and cause-specific mortality. RESULTS: We found that Hispanics initiating dialysis therapy experienced lower mortality, but age modified this association (P < 0.001). Compared with non-Hispanic whites, mortality in Hispanics was 33% lower at ages 18-39 years (adjusted cause-specific HR [HRcs], 0.67; 95% CI, 0.64-0.71) and 40-59 years (HRcs, 0.67; 95% CI, 0.66-0.68), 19% lower at ages 60-79 years (HRcs, 0.81; 95% CI, 0.80-0.82), and 6% lower at 80 years or older (HRcs, 0.94; 95% CI, 0.91-0.97). Accounting for the differential rates of kidney transplantation, the associations were attenuated markedly in the younger age strata; the survival benefit for Hispanics was reduced from 33% to 10% at ages 18-39 years (adjusted subdistribution-specific HR [HRsd], 0.90; 95% CI, 0.85-0.94) and from 33% to 19% among those aged 40-59 years (HRsd, 0.81; 95% CI, 0.80-0.83). LIMITATIONS: Inability to analyze Hispanic subgroups that may experience heterogeneous mortality outcomes. CONCLUSIONS: Overall, Hispanics experienced lower mortality, but differential access to kidney transplantation was responsible for much of the apparent survival benefit noted in younger Hispanics.
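
    The percentages quoted in this abstract follow directly from the hazard ratios: a hazard ratio of 0.67 corresponds to (1 - 0.67) × 100 = 33% lower mortality. A minimal Python sketch of that arithmetic, using only the hazard ratios reported above (the function name and dictionary are illustrative, not taken from the paper):

```python
def percent_lower_mortality(hazard_ratio: float) -> int:
    """Convert a hazard ratio below 1 into a rounded percent reduction."""
    return round((1 - hazard_ratio) * 100)


# Cause-specific hazard ratios quoted in the abstract, keyed by age stratum.
hr_cause_specific = {"18-39": 0.67, "40-59": 0.67, "60-79": 0.81, "80+": 0.94}

# Subdistribution hazard ratios after accounting for transplantation.
hr_subdistribution = {"18-39": 0.90, "40-59": 0.81}

for stratum, hr in hr_cause_specific.items():
    print(stratum, percent_lower_mortality(hr))
for stratum, hr in hr_subdistribution.items():
    print(stratum, percent_lower_mortality(hr))
```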

    View details for DOI 10.1053/j.ajkd.2013.02.375

    View details for PubMedID 23647836

  • Temporal Trends in the Incidence, Treatment, and Outcomes of Hip Fracture in Older Patients Initiating Dialysis in the United States CLINICAL JOURNAL OF THE AMERICAN SOCIETY OF NEPHROLOGY Nair, S. S., Mitani, A. A., Goldstein, B. A., Chertow, G. M., Lowenberg, D. W., Winkelmayer, W. C. 2013; 8 (8): 1336-1342


    BACKGROUND AND OBJECTIVES: Patients with ESRD experience a fivefold higher incidence of hip fracture than the age- and sex-matched general population. Despite multiple changes in the treatment of CKD mineral bone disorder, little is known about long-term trends in hip fracture incidence, treatment patterns, and outcomes in patients on dialysis. DESIGN, SETTING, PARTICIPANTS, & MEASUREMENTS: Fourteen annual cohorts (1996-2009) of older patients (≥67 years) initiating dialysis in the United States were studied. Eligible patients had Medicare fee-for-service coverage for ≥2 years before dialysis initiation and were followed for ≤3 years for a first hip fracture. Type of treatment (internal fixation or partial or total hip replacement) was ascertained along with 30-day mortality. Cox and modified Poisson regressions were used to describe trends in study outcomes. RESULTS: This study followed 409,040 patients over 607,059 person-years, during which time 17,887 hip fracture events were recorded (29.3 events/1000 person-years). Compared with patients incident for ESRD in 1996, adjusted hip fracture rates increased until the 2004 cohort (+41%) and declined thereafter. Surgical treatment included internal fixation in 56%, partial hip replacement in 29%, and total hip replacement in 2%, which remained essentially unchanged over time; 30-day mortality after hip fracture declined from 20% (1996) to 16% (2009). CONCLUSIONS: Hip fracture incidence rates remain higher today than in patients reaching ESRD in 1996, despite multiple purported improvements in the management of CKD mineral bone disorder. Although recent declines in incidence and steady declines in associated short-term mortality are encouraging, hip fractures remain among the most common and consequential noncardiovascular complications of ESRD.

    View details for DOI 10.2215/CJN.10901012

    View details for Web of Science ID 000323122500011

    View details for PubMedID 23660182

  • Donor Predictors of Allograft Use and Recipient Outcomes After Heart Transplantation CIRCULATION-HEART FAILURE Khush, K. K., Menza, R., Nguyen, J., Zaroff, J. G., Goldstein, B. A. 2013; 6 (2): 300-309


    Despite a national organ-donor shortage and a growing population of patients with end-stage heart disease, the acceptance rate of donor hearts for transplantation is low. We sought to identify donor predictors of allograft nonuse, and to determine whether these predictors are in fact associated with adverse recipient post-transplant outcomes. We studied a cohort of 1872 potential organ donors managed by the California Transplant Donor Network from 2001 to 2008. Forty-five percent of available allografts were accepted for heart transplantation. Donor predictors of allograft nonuse included age >50 years, female sex, death attributable to cerebrovascular accident, hypertension, diabetes mellitus, a positive troponin assay, left-ventricular dysfunction and regional wall motion abnormalities, and left-ventricular hypertrophy. For hearts that were transplanted, only donor cause of death was associated with prolonged recipient hospitalization post-transplant, and only donor diabetes mellitus was predictive of increased recipient mortality. Whereas there are many donor predictors of allograft discard in the current era, these characteristics seem to have little effect on recipient outcomes when the hearts are transplanted. Our results suggest that more liberal use of cardiac allografts with relative contraindications may be warranted.

    View details for DOI 10.1161/CIRCHEARTFAILURE.112.000165

    View details for Web of Science ID 000331381200026



    Switching from peritoneal dialysis (PD) to hemodialysis (HD) is undesirable, because of complications from temporary vascular access, disruption of daily routine, and higher costs. Little is known about the role that social factors play in technique failure. DESIGN, SETTING, PARTICIPANTS, MEASUREMENTS: We followed for 3 years a nationally representative cohort of US patients who initiated PD in 1996 - 1997. Technique failure was defined as any switch from PD to HD for 30 days or more. We used Cox regression to examine associations between technique failure and demographic, medical, social, and pre-dialysis factors. We estimated hazard ratios (HRs) with 95% confidence intervals (CIs). We identified an inception cohort of 1587 patients undergoing PD. In multivariate analysis, female sex (HR: 0.78; 95% CI: 0.64 to 0.95) was associated with lower rates of technique failure, and black race [compared with white race (HR: 1.48; 95% CI: 1.20 to 1.82)] and receiving Medicaid (HR: 1.48; 95% CI: 1.17 to 1.86) were associated with higher rates. Compared with patients who worked full-time, those who were retired (HR: 1.49; 95% CI: 1.07 to 2.08) or disabled (HR: 1.38; 95% CI: 1.01 to 1.88) had higher rates of failure. Patients with a systolic blood pressure of 140 - 160 mmHg had a higher rate of failure than did those with a pressure of 120 - 140 mmHg (HR: 1.24; 95% CI: 1.00 to 1.52). Earlier referral to a nephrologist (>3 months before dialysis initiation) and the primary decision-maker for the dialysis modality (physician vs patient vs shared) were not associated with technique failure. This study confirms that several socio-demographic factors are associated with technique failure, emphasizing the potential importance of social and financial support in maintaining PD.

    View details for DOI 10.3747/pdi.2011.00233

    View details for Web of Science ID 000315995000007

    View details for PubMedID 23032086

  • Beta-Adrenergic Receptor Polymorphisms and Cardiac Graft Function in Potential Organ Donors AMERICAN JOURNAL OF TRANSPLANTATION Khush, K. K., Pawlikowska, L., Menza, R. L., Goldstein, B. A., Hayden, V., Nguyen, J., Kim, H., Poon, A., Sapru, A., Matthay, M. A., Kwok, P. Y., Young, W. L., Baxter-Lowe, L. A., Zaroff, J. G. 2012; 12 (12): 3377-3386


    Prior studies have demonstrated associations between beta-adrenergic receptor (βAR) polymorphisms and left ventricular dysfunction, an important cause of allograft nonutilization for transplantation. We hypothesized that βAR polymorphisms predispose donor hearts to LV dysfunction after brain death. A total of 1043 organ donors managed from 2001-2006 were initially studied. The following βAR single nucleotide polymorphisms were genotyped: β1AR 1165C/G (Arg389Gly), β1AR 145A/G (Ser49Gly), β2AR 46G/A (Gly16Arg) and β2AR 79C/G (Gln27Glu). In multivariable regression analyses, the β2AR46 SNP was significantly associated with LV systolic dysfunction, with each minor allele additively decreasing the odds for LV ejection fraction <50%. The β1AR1165 and β2AR46 SNPs were associated with higher dopamine requirement during the donor management period: donors with the GG and AA genotypes had ORs of 2.64 (95% CI 1.52-4.57) and 2.70 (1.07-2.74) respectively for requiring >10 µg/kg/min of dopamine compared to those with the CC and GG genotypes. However, no significant associations were found between βAR SNPs and cardiac dysfunction in 364 donors managed from 2007-2008, perhaps due to changes in donor management, lack of power in this validation cohort, or the absence of a true association. βAR polymorphisms may be associated with cardiac dysfunction after brain death, but these relationships require further study in independent donor cohorts.

    View details for DOI 10.1111/j.1600-6143.2012.04266.x

    View details for Web of Science ID 000311854800022

    View details for PubMedID 22994654

  • Trends in the Incidence of Atrial Fibrillation in Older Patients Initiating Dialysis in the United States CIRCULATION Goldstein, B. A., Arce, C. M., Hlatky, M. A., Turakhia, M., Setoguchi, S., Winkelmayer, W. C. 2012; 126 (19): 2293-?


    One sixth of US dialysis patients ≥65 years of age have been diagnosed with atrial fibrillation/flutter (AF). Little is known, however, about the incidence of AF in this population. We identified 258 605 older patients (≥67 years of age) with fee-for-service Medicare initiating dialysis in 1995 to 2007, who had not been diagnosed with AF within the previous 2 years. Patients were followed for newly diagnosed AF. Multivariable proportional hazard regression was used to examine temporal trends and associations of race and ethnicity with incident AF. We also studied temporal trends in the mortality and risk of ischemic stroke after new AF. Over 514 395 person-years of follow-up, 76 252 patients experienced incident AF for a crude AF incidence rate of 148/1000 person-years. Incidence of AF increased by 11% (95% confidence interval, 5-16) from 1995 to 2007. Compared with non-Hispanic whites, blacks (-30%), Asians (-19%), Native Americans (-42%), and Hispanics (-29%) all had lower rates of incident AF. Mortality after incident AF decreased by 22% from 1995 to 2008. Even more pronounced reductions were seen for incident ischemic stroke during these years. The incidence of AF is high in older patients initiating dialysis in the United States and has been increasing over the 13 years of study. Mortality declined during that time but remained >50% during the first year after newly diagnosed AF. Because data on warfarin use were not available, we were unable to study whether trends toward better outcomes could be explained by higher rates of oral anticoagulation.
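
    The crude incidence rate quoted in this abstract is the number of events divided by the person-years of follow-up, scaled to 1000 person-years. A short Python sketch of that calculation using the two counts reported above (the function name is illustrative):

```python
def crude_rate_per_1000(events: int, person_years: float) -> float:
    """Crude incidence rate expressed per 1000 person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return events / person_years * 1000


# Counts quoted in the abstract: 76 252 incident AF events over 514 395 person-years.
rate = crude_rate_per_1000(76_252, 514_395)
print(round(rate))  # 148, matching the rate reported in the abstract
```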

    View details for Web of Science ID 000310744100010

    View details for PubMedID 23032326

  • Breast cancer risk factors differ between Asian and white women with BRCA1/2 mutations FAMILIAL CANCER de Bruin, M. A., Kwong, A., Goldstein, B. A., Lipson, J. A., Ikeda, D. M., McPherson, L., Sharma, B., Kardashian, A., Schackmann, E., Kingham, K. E., Mills, M. A., West, D. W., Ford, J. M., Kurian, A. W. 2012; 11 (3): 429-439


    The prevalence and penetrance of BRCA1 and BRCA2 (BRCA1/2) mutations may differ between Asians and whites. We investigated BRCA1/2 mutations and cancer risk factors in a clinic-based sample. BRCA1/2 mutation carriers were enrolled from cancer genetics clinics in Hong Kong and California according to standardized entry criteria. We compared BRCA mutation position, cancer history, hormonal and reproductive exposures. We analyzed DNA samples for single-nucleotide polymorphisms reported to modify breast cancer risk. We performed logistic regression to identify independent predictors of breast cancer. Fifty Asian women and forty-nine white American women were enrolled. BRCA1 mutations were more common among whites (67 vs. 42 %, p = 0.02), and BRCA2 mutations among Asians (58 vs. 37 %, p = 0.04). More Asians had breast cancer (76 vs. 53 %, p = 0.03); more whites had relatives with breast cancer (86 vs. 50 %, p = 0.0003). More whites than Asians had breastfed (71 vs. 42 %, p = 0.005), had high BMI (median 24.3 vs. 21.2, p = 0.04), consumed alcohol (2 drinks/week vs. 0, p < 0.001), and had oophorectomy (61 vs. 34 %, p = 0.01). Asians had a higher frequency of risk-associated alleles in MAP3K1 (88 vs. 59 %, p = 0.005) and TOX3/TNRC9 (88 vs. 55 %, p = 0.0002). On logistic regression, MAP3K1 was associated with increased breast cancer risk for BRCA2, but not BRCA1 mutation carriers; breast density was associated with increased risk among Asians but not whites. We found significant differences in breast cancer risk factors between Asian and white BRCA1/2 mutation carriers. Further investigation of racial differences in BRCA1/2 mutation epidemiology could inform targeted cancer risk-reduction strategies.

    View details for DOI 10.1007/s10689-012-9531-9

    View details for Web of Science ID 000311025000016

    View details for PubMedID 22638769

  • Electrocardiographic Characteristics of Potential Organ Donors and Associations With Cardiac Allograft Use CIRCULATION-HEART FAILURE Khush, K. K., Menza, R., Nguyen, J., Goldstein, B. A., Zaroff, J. G., Drew, B. J. 2012; 5 (4): 475-483


    Current regulations require that all cardiac allograft offers for transplantation must include an interpreted 12-lead electrocardiogram (ECG). However, little is known about the expected ECG findings in potential organ donors or the clinical significance of any identified abnormalities in terms of cardiac allograft function and suitability for transplantation. A single experienced reviewer interpreted the first ECG obtained after brain stem herniation in 980 potential organ donors managed by the California Transplant Donor Network from 2002 to 2007. ECG abnormalities were summarized, and associations between specific ECG findings and cardiac allograft use for transplantation were studied. ECG abnormalities were present in 51% of all cases reviewed. The most common abnormalities included voltage criteria for left ventricular hypertrophy, prolongation of the corrected QT interval, and repolarization changes (ST/T wave abnormalities). Fifty-seven percent of potential cardiac allografts in this cohort were accepted for transplantation. Left ventricular hypertrophy on ECG was a strong predictor of allograft nonuse. No significant associations were seen among corrected QT interval prolongation, repolarization changes, and allograft use for transplantation after adjusting for donor clinical variables and echocardiographic findings. We have performed the first comprehensive study of ECG findings in potential donors for cardiac transplantation. Many of the common ECG abnormalities seen in organ donors may result from the heightened state of sympathetic activation that occurs after brain stem herniation and are not associated with allograft use for transplantation.

    View details for DOI 10.1161/CIRCHEARTFAILURE.112.968388

    View details for Web of Science ID 000313578100018

    View details for PubMedID 22615333

  • Randomized Trial of Personal Genomics for Preventive Cardiology: Design and Challenges CIRCULATION-CARDIOVASCULAR GENETICS Knowles, J. W., Assimes, T. L., Kiernan, M., Pavlovic, A., Goldstein, B. A., Yank, V., McConnell, M. V., Absher, D., Bustamante, C., Ashley, E. A., Ioannidis, J. P. 2012; 5 (3): 368-376
  • Hispanic Ethnicity and Vascular Access Use in Patients Initiating Hemodialysis in the United States CLINICAL JOURNAL OF THE AMERICAN SOCIETY OF NEPHROLOGY Arce, C. M., Mitani, A. A., Goldstein, B. A., Winkelmayer, W. C. 2012; 7 (2): 289-296


    Hispanics are the largest minority in the United States (comprising 16.3% of the US population) and have 1.5 times the age-, sex-, and race-adjusted incidence of ESRD compared with non-Hispanics. Poor health care access and low-quality care generally received by Hispanics are well documented. However, little is known regarding dialysis preparation of Hispanic patients with progressive CKD. Using data from Medical Evidence Report form CMS-2728-U3, 321,996 adult patients of white or black race were identified who initiated hemodialysis (HD) between July 1, 2005 and December 31, 2008. The form captures Hispanic ethnicity, vascular access use at first outpatient HD, sociodemographic characteristics, and comorbidities. This study also examined whether use of an arteriovenous fistula (AVF) or graft (AVG) was reported. AVF/AVG use was reported in 14.5% of Hispanics and 17.6% of non-Hispanics (P<0.001). The unadjusted prevalence ratio (PR) was 0.85 (95% confidence interval [95% CI], 0.83-0.88), indicating that Hispanics were 15% less likely to use AVG/AVF for their first outpatient HD. Adjustment for age, sex, and race, as well as a large number of comorbidities and frailty indicators, did not change this association (PR, 0.85; 95% CI, 0.83-0.88). Further adjustment for timing of first predialysis nephrology care, however, attenuated the PR by two-thirds (PR, 0.94; 95% CI, 0.92-0.97). Hispanics are less likely to use arteriovenous access for first outpatient HD compared with non-Hispanics, which seems to be explained by variation in the access to predialysis nephrology care.

    View details for DOI 10.2215/CJN.08370811

    View details for Web of Science ID 000300124300014

    View details for PubMedID 22114148

  • The rs4774 CIITA missense variant is associated with risk of systemic lupus erythematosus GENES AND IMMUNITY Bronson, P. G., Goldstein, B. A., Ramsay, P. P., Beckman, K. B., Noble, J. A., Lane, J. A., Seldin, M. F., Kelly, J. A., Harley, J. B., Moser, K. L., Gaffney, P. M., Behrens, T. W., Criswell, L. A., Barcellos, L. F. 2011; 12 (8): 667-671


    The major histocompatibility complex (MHC) class II transactivator gene (CIITA) encodes an important transcription factor required for human leukocyte antigens (HLA) class II MHC-restricted antigen presentation. MHC genes, including the HLA class II DRB1*03:01 allele, are strongly associated with systemic lupus erythematosus (SLE). Recently the rs4774 CIITA missense variant (+1632G/C) was reported to be associated with susceptibility to multiple sclerosis. In the current study, we investigated CIITA, DRB1*03:01 and risk of SLE using a multi-stage analysis. In stage 1, 9 CIITA variants were tested in 658 cases and 1363 controls (N=2021). In stage 2, rs4774 was tested in 684 cases and 2938 controls (N=3622). We also performed a meta-analysis of the pooled 1342 cases and 4301 controls (N=5643). In stage 1, rs4774*C was associated with SLE (odds ratio (OR)=1.24, 95% confidence interval (95% CI)=1.07-1.44, P=4.2 × 10⁻³). Similar results were observed in stage 2 (OR=1.16, 95% CI=1.02-1.33, P=8.5 × 10⁻³) and the meta-analysis of the combined data set (OR=1.20, 95% CI=1.09-1.33, P(meta)=2.5 × 10⁻⁴). In all three analyses, the strongest evidence for association between rs4774*C and SLE was present in individuals who carried at least one copy of DRB1*03:01 (P(meta)=1.9 × 10⁻³). Results support a role for CIITA in SLE, which appears to be stronger in the presence of DRB1*03:01.

    View details for DOI 10.1038/gene.2011.36

    View details for Web of Science ID 000297928300009

    View details for PubMedID 21614020

  • Random Forests for Genetic Association Studies Statistical Applications in Genetics and Molecular Biology Benjamin A. Goldstein, Eric C. Polley, Farren B.S. Briggs 2011; 10 (1): 1
  • Variation Within DNA Repair Pathway Genes and Risk of Multiple Sclerosis AMERICAN JOURNAL OF EPIDEMIOLOGY Briggs, F. B., Goldstein, B. A., McCauley, J. L., Zuvich, R. L., De Jager, P. L., Rioux, J. D., Ivinson, A. J., Compston, A., Hafler, D. A., Hauser, S. L., Oksenberg, J. R., Sawcer, S. J., Pericak-Vance, M. A., Haines, J. L., Barcellos, L. F. 2010; 172 (2): 217-224


    Multiple sclerosis (MS) is a complex autoimmune disease of the central nervous system with a prominent genetic component. The primary genetic risk factor is the human leukocyte antigen (HLA)-DRB1*1501 allele; however, much of the remaining genetic contribution to MS has not been elucidated. The authors investigated the relation between variation in DNA repair pathway genes and risk of MS. Single-locus association testing, epistatic tests of interactions, logistic regression modeling, and nonparametric Random Forests analyses were performed by using genotypes from 1,343 MS cases and 1,379 healthy controls of European ancestry. A total of 485 single nucleotide polymorphisms within 72 genes related to DNA repair pathways were investigated, including base excision repair, nucleotide excision repair, and double-strand breaks repair. A single nucleotide polymorphism variant within the general transcription factor IIH, polypeptide 4 gene, GTF2H4, on chromosome 6p21.33 was significantly associated with MS (odds ratio = 0.7, P = 3.5 × 10⁻⁵) after accounting for multiple testing and was not due to linkage disequilibrium with HLA-DRB1*1501. Although other candidate genes examined here warrant further follow-up studies, collectively, these results derived from a well-powered study do not support a strong role for common variation within DNA repair pathway genes in MS.

    View details for DOI 10.1093/aje/kwq086

    View details for Web of Science ID 000280263900013

    View details for PubMedID 20522537

  • An application of Random Forests to a genome-wide association dataset: Methodological considerations & new findings BMC GENETICS Goldstein, B. A., Hubbard, A. E., Cutler, A., Barcellos, L. F. 2010; 11


    As computational power improves, the application of more advanced machine learning techniques to the analysis of large genome-wide association (GWA) datasets becomes possible. While most traditional statistical methods can only elucidate main effects of genetic variants on risk for disease, certain machine learning approaches are particularly suited to discover higher order and non-linear effects. One such approach is the Random Forests (RF) algorithm. The use of RF for SNP discovery related to human disease has grown in recent years; however, most work has focused on small datasets or simulation studies which are limited. Using a multiple sclerosis (MS) case-control dataset comprised of 300 K SNP genotypes across the genome, we outline an approach and some considerations for optimally tuning the RF algorithm based on the empirical dataset. Importantly, results show that typical default parameter values are not appropriate for large GWA datasets. Furthermore, gains can be made by sub-sampling the data, pruning based on linkage disequilibrium (LD), and removing strong effects from RF analyses. The new RF results are compared to findings from the original MS GWA study and demonstrate overlap. In addition, four new interesting candidate MS genes are identified, MPHOSPH9, CTNNA3, PHACTR2 and IL7, by RF analysis and warrant further follow-up in independent studies. This study presents one of the first illustrations of successfully analyzing GWA data with a machine learning algorithm. It is shown that RF is computationally feasible for GWA data and the results obtained make biologic sense based on previous studies. More importantly, new genes were identified as potentially being associated with MS, suggesting new avenues of investigation for this complex disease.

    View details for DOI 10.1186/1471-2156-11-49

    View details for Web of Science ID 000279859700001

    View details for PubMedID 20546594
