Clinical Focus

  • Cancer > GI Oncology
  • Medical Oncology
  • Oncology (Cancer)
  • Gastrointestinal Neoplasms
  • Inherited Cancer Disorders

Academic Appointments

Administrative Appointments

  • Senior Associate Director, Stanford Genome Technology Center (2008 - Present)

Honors & Awards

  • Research Scholar Award, American Cancer Society (2013)
  • Clinical Scientist Development Award, Doris Duke Charitable Foundation (2009)
  • Physician Scientist Early Career Award, Howard Hughes Medical Institute (2008)
  • Merit Award for Research Achievement, American Society of Clinical Oncology Foundation (2006)
  • Scholar-in-Training Award for Research Achievement, American Association for Cancer Research (2005)
  • Physician-Scientist Fellowship Award, Howard Hughes Medical Institute (1998)

Professional Education

  • Fellowship: Stanford University Hospital - Clinical Excellence Research Center (2005) CA
  • Residency: University of Iowa Hospitals and Clinics (1996) IA
  • Board Certification: Medical Oncology, American Board of Internal Medicine (2004)
  • Residency: University of Washington (2001) WA
  • Medical Education: Johns Hopkins University School of Medicine (1994) MD
  • B.A., Reed College, Biology
  • M.D., Johns Hopkins University, Medicine

Research & Scholarship

Current Research and Scholarly Interests

To improve the lives of individuals with cancer, our research group has embarked on a research initiative to use cutting-edge genetics and technology to interrogate the fundamental genetic "digital" code responsible for cancer development and overall clinical behavior.

We are pursuing projects focused on personalized medicine. Specifically, we are interested in using genetic and genomic approaches in oncology to improve targeted cancer therapy development, make accurate prognoses, predict cancer therapy efficacy and identify clinically relevant cancer mutations. These projects are aimed at establishing the paradigm for individualized medicine, facilitating the introduction of these approaches into validation clinical studies and thus developing the next generation of cancer diagnostics and treatment.

Our research program is specifically focused on:
1) Discovery and validation of genetic signatures portending prognosis and therapeutic drug targets for individuals with cancer

2) Development of novel approaches for analyzing cancer genomes and identifying personalized therapeutic targets

3) Determining inherited pathogenic mutations that increase the risk of developing gastrointestinal malignancies

4) The genetic analysis of complete cancer genome sequences derived from inherited cancer

5) Technology development on novel genetic diagnostic methods to help individuals with cancer

Clinical Trials

  • Genetic & Pathological Studies of BRCA1/BRCA2: Associated Tumors & Blood Samples Recruiting

    The purpose of this study is to try to understand the biology of development of breast, ovarian, fallopian tube, peritoneal or endometrial cancer from persons at high genetic risk for these diseases. The influence of environmental factors on cancer development in individuals and families will be studied. The efficacy of treatments for these diseases will be evaluated.


  • Molecular Analysis of Thoracic Malignancies Recruiting

    A research study to learn about the biologic features of cancer development, growth, and spread. We are studying components of blood, tumor tissue, normal tissue, and other fluids, such as urine, cerebrospinal fluid, abdominal or chest fluid in patients with cancer. Our analyses of blood, tissue, and/or fluids may lead to improved diagnosis and treatment of cancer by the identification of markers that predict clinical outcome, markers that predict response to specific therapies, and the identification of targets for new therapies.


  • Molecular Genetic and Pathological Studies of Anal Tumors Recruiting

    Study the Genetics of Anal Cancer


  • Phase II Gemcitabine + Fractionated Stereotactic Radiotherapy for Unresectable Pancreatic Adenocarcinoma Not Recruiting

    This multi-institutional trial aims to evaluate the potential benefit and side effects of adding fractionated stereotactic body radiotherapy/surgery (SBRT) before and after chemotherapy with gemcitabine for locally advanced pancreatic cancer.

    Stanford is currently not accepting patients for this trial. For more information, please contact Laurie Ann Columbo, 650-736-0792.


  • The Gastric Cancer Foundation: A Gastric Cancer Registry Recruiting

    The Gastric Cancer Registry will combine data acquired directly from patients with gastric cancer via an online questionnaire with genomic data obtained from blood and tissue samples. The purpose of this registry is to gain a better understanding of the causes of gastric cancer, both environmental and genetic; to determine whether certain genomic data can predict outcomes of treatment and survival; and to explore the issues that affect the quality of life of these patients after diagnosis and treatment.


  • Genetic Analysis of Liver Cancer Not Recruiting

    Liver cancer is a leading cause of cancer deaths worldwide. While the molecular pathogenesis of liver cancer has been extensively studied, less is known about how the molecular biology of liver cancer influences clinical outcome and treatment response. We are developing a translational research program that will characterize molecular changes in liver cancer. We plan to use molecular information obtained from studying liver tumor tissues to develop new diagnostics and treatment regimens for patients with these cancers. The experimental approach will require freezing fresh tumor tissues obtained from surgical procedures, which will be subsequently used for analysis of DNA, protein and mRNA expression. Many patients with liver cancer are referred to the Stanford Liver Tumor Board for consultation and treatment recommendations. We propose to gather tissue samples from those who subsequently undergo biopsy, liver resection surgery, or transplant surgery.

    Stanford is currently not accepting patients for this trial. For more information, please contact Mei-Sze Chua, (650) 724 - 3525.


  • Genome, Proteome and Tissue Microarray in Childhood Acute Leukemia Recruiting

    We will study gene and protein expression in leukemia cells of children diagnosed with acute leukemia. We hope to identify genes or proteins which can help us grade leukemia at diagnosis in order to: (a) develop better means of diagnosis and (b) more accurately choose the best therapy for each patient.


  • Clinical & Pathological Studies of Upper Gastrointestinal Carcinoma Recruiting

    Our research of the biology of upper gastrointestinal cancers involves the study of tissue samples and cells from biopsies of persons with gastric or esophageal cancer or blood samples from upper gastrointestinal cancer patients and persons at high inherited risk for these cancers. We hope to learn the role genes and proteins play in the development of gastric and esophageal cancer.


  • Comprehensive Screening for Women at High Genetic Risk for Developing Breast Cancer Not Recruiting

    To screen women who are at high risk for breast cancer with breast MRI, mammogram and random periareolar fine needle aspiration.

    Stanford is currently not accepting patients for this trial. For more information, please contact Meredith Mills, (650) 724 - 5223.



2014-15 Courses

Graduate and Fellowship Programs


Journal Articles

  • A programmable method for massively parallel targeted sequencing. Nucleic acids research Hopmans, E. S., Natsoulis, G., Bell, J. M., Grimes, S. M., Sieh, W., Ji, H. P. 2014; 42 (10)


    We have developed a targeted resequencing approach referred to as Oligonucleotide-Selective Sequencing. In this study, we report a series of significant improvements and novel applications of this method whereby the surface of a sequencing flow cell is modified in situ to capture specific genomic regions of interest from a sample and then sequenced. These improvements include a fully automated targeted sequencing platform through the use of a standard Illumina cBot fluidics station. Targeting optimization increased the yield of total on-target sequencing data 2-fold compared to the previous iteration, while simultaneously increasing the percentage of reads that could be mapped to the human genome. The described assays cover up to 1421 genes with a total coverage of 5.5 Megabases (Mb). We demonstrate a 10-fold abundance uniformity of greater than 90% in 1 log distance from the median and a targeting rate of up to 95%. We also sequenced continuous genomic loci up to 1.5 Mb while simultaneously genotyping SNPs and genes. Variants with low minor allele fraction were sensitively detected at levels of 5%. Finally, we determined the exact breakpoint sequence of cancer rearrangements. Overall, this approach has high performance for selective sequencing of genome targets, configuration flexibility and variant calling accuracy.

    View details for DOI 10.1093/nar/gku282

    View details for PubMedID 24782526

  • High Sensitivity Detection and Quantitation of DNA Copy Number and Single Nucleotide Variants with Single Color Droplet Digital PCR ANALYTICAL CHEMISTRY Miotke, L., Lau, B. T., Rumma, R. T., Ji, H. P. 2014; 86 (5): 2618-2624


    In this study, we present a highly customizable method for quantifying copy number and point mutations utilizing a single-color, droplet digital PCR platform. Droplet digital polymerase chain reaction (ddPCR) is rapidly replacing real-time quantitative PCR (qRT-PCR) as an efficient method of independent DNA quantification. Compared to quantitative PCR, ddPCR eliminates the need for traditional standards; instead, it measures target and reference DNA within the same well. The applications for ddPCR are widespread including targeted quantitation of genetic aberrations, which is commonly achieved with a two-color fluorescent oligonucleotide probe (TaqMan) design. However, the overall cost and need for optimization can be greatly reduced with an alternative method of distinguishing between target and reference products using the nonspecific DNA binding properties of EvaGreen (EG) dye. By manipulating the length of the target and reference amplicons, we can distinguish between their fluorescent signals and quantify each independently. We demonstrate the effectiveness of this method by examining copy number in the proto-oncogene FLT3 and the common V600E point mutation in BRAF. Using a series of well-characterized control samples and cancer cell lines, we confirmed the accuracy of our method in quantifying mutation percentage and integer value copy number changes. As another novel feature, our assay was able to detect a mutation comprising less than 1% of an otherwise wild-type sample, as well as copy number changes from cancers even in the context of significant dilution with normal DNA. This flexible and cost-effective method of independent DNA quantification proves to be a robust alternative to the commercialized TaqMan assay.

    View details for DOI 10.1021/ac403843j

    View details for Web of Science ID 000332494100048

    View details for PubMedID 24483992

  • Metastatic tumor evolution and organoid modeling implicate TGFBR2 as a cancer driver in diffuse gastric cancer GENOME BIOLOGY Nadauld, L. D., Garcia, S., Natsoulis, G., Bell, J. M., Miotke, L., Hopmans, E. S., Xu, H., Pai, R. K., Palm, C., Regan, J. F., Chen, H., Flaherty, P., Ootani, A., Zhang, N. R., Ford, J. M., Kuo, C. J., Ji, H. P. 2014; 15 (8)


    Gastric cancer is the second-leading cause of global cancer deaths, with metastatic disease representing the primary cause of mortality. To identify candidate drivers involved in oncogenesis and tumor evolution, we conduct an extensive genome sequencing analysis of metastatic progression in a diffuse gastric cancer. This involves a comparison between a primary tumor from a hereditary diffuse gastric cancer syndrome proband and its recurrence as an ovarian metastasis. Both the primary tumor and ovarian metastasis have common biallelic loss-of-function of both the CDH1 and TP53 tumor suppressors, indicating a common genetic origin. While the primary tumor exhibits amplification of the Fibroblast growth factor receptor 2 (FGFR2) gene, the metastasis notably lacks FGFR2 amplification but rather possesses unique biallelic alterations of Transforming growth factor-beta receptor 2 (TGFBR2), indicating the divergent in vivo evolution of a TGFBR2-mutant metastatic clonal population in this patient. As TGFBR2 mutations have not previously been functionally validated in gastric cancer, we modeled the metastatic potential of TGFBR2 loss in a murine three-dimensional primary gastric organoid culture. The Tgfbr2 shRNA knockdown within Cdh1-/-; Tp53-/- organoids generates invasion in vitro and robust metastatic tumorigenicity in vivo, confirming Tgfbr2 metastasis suppressor activity. We document the metastatic differentiation and genetic heterogeneity of diffuse gastric cancer and reveal the potential metastatic role of TGFBR2 loss-of-function. In support of this study, we apply a murine primary organoid culture method capable of recapitulating in vivo metastatic gastric cancer. Overall, we describe an integrated approach to identify and functionally validate putative cancer drivers involved in metastasis.

    View details for DOI 10.1186/s13059-014-0428-9

    View details for Web of Science ID 000346604100009

    View details for PubMedID 25315765

  • Systematic genomic identification of colorectal cancer genes delineating advanced from early clinical stage and metastasis. BMC medical genomics Lee, H., Flaherty, P., Ji, H. P. 2013; 6: 54-?


    Colorectal cancer is the third leading cause of cancer deaths in the United States. The initial assessment of colorectal cancer involves clinical staging that takes into account the extent of primary tumor invasion, determining the number of lymph nodes with metastatic cancer and the identification of metastatic sites in other organs. Advanced clinical stage indicates metastatic cancer, either in regional lymph nodes or in distant organs. While the genomic and genetic basis of colorectal cancer has been elucidated to some degree, less is known about the identity of specific cancer genes that are associated with advanced clinical stage and metastasis. We compiled multiple genomic data types (mutations, copy number alterations, gene expression and methylation status) as well as clinical meta-data from The Cancer Genome Atlas (TCGA). We used an elastic-net regularized regression method on the combined genomic data to identify genetic aberrations and their associated cancer genes that are indicators of clinical stage. We ranked candidate genes by their regression coefficient and level of support from multiple assay modalities. A fit of the elastic-net regularized regression to 197 samples and integrated analysis of four genomic platforms identified the set of top gene predictors of advanced clinical stage, including: WRN, SYK, DDX5 and ADRA2C. These genetic features were identified robustly in bootstrap resampling analysis. We conducted an analysis integrating multiple genomic features including mutations, copy number alterations, gene expression and methylation. This integrated approach in which one considers all of these genomic features performs better than any individual genomic assay. We identified multiple genes that robustly delineate advanced clinical stage, suggesting their possible role in colorectal cancer metastatic progression.

    View details for DOI 10.1186/1755-8794-6-54

    View details for PubMedID 24308539

  • Improving bioinformatic pipelines for exome variant calling GENOME MEDICINE Ji, H. P. 2012; 4


    Exome sequencing analysis is a cost-effective approach for identifying variants in coding regions. However, recognizing the relevant single nucleotide variants, small insertions and deletions remains a challenge for many researchers and diagnostic laboratories typically do not have access to the bioinformatic analysis pipelines necessary for clinical application. The Atlas2 suite, recently released by Baylor Genome Center, is designed to be widely accessible, runs on desktop computers but is scalable to computational clusters, and performs comparably with other popular variant callers. Atlas2 may be an accessible alternative for data processing when a rapid solution for variant calling is required.

    View details for DOI 10.1186/gm306

    View details for Web of Science ID 000314564600001

    View details for PubMedID 22289516

  • Ultrasensitive detection of rare mutations using next-generation targeted resequencing NUCLEIC ACIDS RESEARCH Flaherty, P., Natsoulis, G., Muralidharan, O., Winters, M., Buenrostro, J., Bell, J., Brown, S., Holodniy, M., Zhang, N., Ji, H. P. 2012; 40 (1)


    With next-generation DNA sequencing technologies, one can interrogate a specific genomic region of interest at very high depth of coverage and identify less prevalent, rare mutations in heterogeneous clinical samples. However, the mutation detection levels are limited by the error rate of the sequencing technology as well as by the availability of variant-calling algorithms with high statistical power and low false positive rates. We demonstrate that we can robustly detect mutations at 0.1% fractional representation. This represents accurate detection of one mutant per every 1000 wild-type alleles. To achieve this sensitive level of mutation detection, we integrate a high accuracy indexing strategy and reference replication for estimating sequencing error variance. We employ a statistical model to estimate the error rate at each position of the reference and to quantify the fraction of variant base in the sample. Our method is highly specific (99%) and sensitive (100%) when applied to a known 0.1% sample fraction admixture of two synthetic DNA samples to validate our method. As a clinical application of this method, we analyzed nine clinical samples of H1N1 influenza A and detected an oseltamivir (antiviral therapy) resistance mutation in the H1N1 neuraminidase gene at a sample fraction of 0.18%.

    View details for DOI 10.1093/nar/gkr861

    View details for Web of Science ID 000298733500002

    View details for PubMedID 22013163

  • The Human OligoGenome Resource: a database of oligonucleotide capture probes for resequencing target regions across the human genome NUCLEIC ACIDS RESEARCH Newburger, D. E., Natsoulis, G., Grimes, S., Bell, J. M., Davis, R. W., Batzoglou, S., Ji, H. P. 2012; 40 (D1): D1137-D1143

    View details for DOI 10.1093/nar/gkr973

    View details for Web of Science ID 000298601300170

  • Quantitative and Sensitive Detection of Cancer Genome Amplifications from Formalin Fixed Paraffin Embedded Tumors with Droplet Digital PCR. Translational medicine (Sunnyvale, Calif.) Nadauld, L., Regan, J. F., Miotke, L., Pai, R. K., Longacre, T. A., Kwok, S. S., Saxonov, S., Ford, J. M., Ji, H. P. 2012; 2 (2)


    For the analysis of cancer, there is great interest in rapid and accurate detection of cancer genome amplifications containing oncogenes that are potential therapeutic targets. The vast majority of cancer tissue samples are formalin fixed and paraffin embedded (FFPE) which enables histopathological examination and long term archiving. However, FFPE cancer genomic DNA is oftentimes degraded and generally a poor substrate for many molecular biology assays. To overcome the issues of poor DNA quality from FFPE samples and detect oncogenic copy number amplifications with high accuracy and sensitivity, we developed a novel approach. Our assay requires nanogram amounts of genomic DNA, thus facilitating study of small amounts of clinical samples. Using droplet digital PCR (ddPCR), we can determine the relative copy number of specific genomic loci even in the presence of intermingled normal tissue. We used a control dilution series to determine the limits of detection for the ddPCR assay and report its improved sensitivity on minimal amounts of DNA compared to standard real-time PCR. To develop this approach, we designed an assay for the fibroblast growth factor receptor 2 gene (FGFR2) that is amplified in gastric and breast cancers as well as others. We successfully utilized ddPCR to ascertain FGFR2 amplifications from FFPE-preserved gastrointestinal adenocarcinomas.

    View details for PubMedID 23682346

  • Efficient targeted resequencing of human germline and cancer genomes by oligonucleotide-selective sequencing NATURE BIOTECHNOLOGY Myllykangas, S., Buenrostro, J. D., Natsoulis, G., Bell, J. M., Ji, H. P. 2011; 29 (11): 1024-U95


    We describe an approach for targeted genome resequencing, called oligonucleotide-selective sequencing (OS-Seq), in which we modify the immobilized lawn of oligonucleotide primers of a next-generation DNA sequencer to function as both a capture and sequencing substrate. We apply OS-Seq to resequence the exons of either 10 or 344 cancer genes from human DNA samples. In our assessment of capture performance, >87% of the captured sequence originated from the intended target region with sequencing coverage falling within a tenfold range for a majority of all targets. Single nucleotide variants (SNVs) called from OS-Seq data agreed with >95% of variants obtained from whole-genome sequencing of the same individual. We also demonstrate mutation discovery from a colorectal cancer tumor sample matched with normal tissue. Overall, we show the robust performance and utility of OS-Seq for the resequencing analysis of human germline and cancer genomes.

    View details for DOI 10.1038/nbt.1996

    View details for Web of Science ID 000296801300024

    View details for PubMedID 22020387

  • A Flexible Approach for Highly Multiplexed Candidate Gene Targeted Resequencing PLOS ONE Natsoulis, G., Bell, J. M., Xu, H., Buenrostro, J. D., Ordonez, H., Grimes, S., Newburger, D., Jensen, M., Zahn, J. M., Zhang, N., Ji, H. P. 2011; 6 (6)


    We have developed an integrated strategy for targeted resequencing and analysis of gene subsets from the human exome for variants. Our capture technology is geared towards resequencing gene subsets substantially larger than can be done efficiently with simplex or multiplex PCR but smaller in scale than exome sequencing. We describe all the steps from the initial capture assay to single nucleotide variant (SNV) discovery. The capture methodology uses in-solution 80-mer oligonucleotides. To provide optimal flexibility in choosing human gene targets, we designed an in silico set of oligonucleotides, the Human OligoExome, that covers the gene exons annotated by the Consensus Coding Sequencing Project (CCDS). This resource is openly available as an Internet accessible database where one can download capture oligonucleotides sequences for any CCDS gene and design custom capture assays. Using this resource, we demonstrated the flexibility of this assay by custom designing capture assays ranging from 10 to over 100 gene targets with total capture sizes from over 100 Kilobases to nearly one Megabase. We established a method to reduce capture variability and incorporated indexing schemes to increase sample throughput. Our approach has multiple applications that include but are not limited to population targeted resequencing studies of specific gene subsets, validation of variants discovered in whole genome sequencing surveys and possible diagnostic analysis of disease gene subsets. We also present a cost analysis demonstrating its cost-effectiveness for large population studies.

    View details for DOI 10.1371/journal.pone.0021088

    View details for Web of Science ID 000292291800008

    View details for PubMedID 21738606

  • Targeted deep resequencing of the human cancer genome using next-generation technologies BIOTECHNOLOGY AND GENETIC ENGINEERING REVIEWS, VOL 27 Myllykangas, S., Ji, H. P. 2010; 27: 135-158


    Next-generation sequencing technologies have revolutionized our ability to identify genetic variants, either germline or somatic point mutations, that occur in cancer. Parallelization and miniaturization of DNA sequencing enables massive data throughput and for the first time, large-scale, nucleotide resolution views of cancer genomes can be achieved. Systematic, large-scale sequencing surveys have revealed that the genetic spectrum of mutations in cancers appears to be highly complex with numerous low frequency bystander somatic variations, and a limited number of common, frequently mutated genes. Large sample sizes and deeper resequencing are much needed in resolving clinical and biological relevance of the mutations as well as in detecting somatic variants in heterogeneous samples and cancer cell sub-populations. However, even with the next-generation sequencing technologies, the overwhelming size of the human genome and need for very high fold coverage represents a major challenge for up-scaling cancer genome sequencing projects. Assays to target, capture, enrich or partition disease-specific regions of the genome offer immediate solutions for reducing the complexity of the sequencing libraries. Integration of targeted DNA capture assays and next-generation deep resequencing improves the ability to identify clinically and biologically relevant mutations.

    View details for Web of Science ID 000286179900006

    View details for PubMedID 21415896



    A collection of yeast strains bearing single marked Ty1 insertions on chromosome III was generated. Over 100 such insertions were physically mapped by pulsed-field gel electrophoresis. These insertions are very nonrandomly distributed. Thirty-two such insertions were cloned by the inverted PCR technique, and the flanking DNA sequences were determined. The sequenced insertions all fell within a few very limited regions of chromosome III. Most of these regions contained tRNA coding regions and/or LTRs of preexisting transposable elements. Open reading frames were disrupted at a far lower frequency than expected for random transposition. The results suggest that the Ty1 integration machinery can detect regions of the genome that may represent "safe havens" for insertion. These regions of the genome do not contain any special DNA sequences, nor do they behave as particularly good targets for Ty1 integration in vitro, suggesting that the targeted regions have special properties allowing specific recognition in vivo.

    View details for Web of Science ID A1993LF06100016

    View details for PubMedID 8388781

  • Oncogenic transformation of diverse gastrointestinal tissues in primary organoid culture NATURE MEDICINE Li, X., Nadauld, L., Ootani, A., Corney, D. C., Pai, R. K., Gevaert, O., Cantrell, M. A., Rack, P. G., Neal, J. T., Chan, C. W., Yeung, T., Gong, X., Yuan, J., Wilhelmy, J., Robine, S., Attardi, L. D., Plevritis, S. K., Hung, K. E., Chen, C., Ji, H. P., Kuo, C. J. 2014; 20 (7): 769-777


    The application of primary organoid cultures containing epithelial and mesenchymal elements to cancer modeling holds promise for combining the accurate multilineage differentiation and physiology of in vivo systems with the facile in vitro manipulation of transformed cell lines. Here we used a single air-liquid interface culture method without modification to engineer oncogenic mutations into primary epithelial and mesenchymal organoids from mouse colon, stomach and pancreas. Pancreatic and gastric organoids exhibited dysplasia as a result of expression of Kras carrying the G12D mutation (Kras(G12D)), p53 loss or both and readily generated adenocarcinoma after in vivo transplantation. In contrast, primary colon organoids required combinatorial Apc, p53, Kras(G12D) and Smad4 mutations for progressive transformation to invasive adenocarcinoma-like histology in vitro and tumorigenicity in vivo, recapitulating multi-hit models of colorectal cancer (CRC), as compared to the more promiscuous transformation of small intestinal organoids. Colon organoid culture functionally validated the microRNA miR-483 as a dominant driver oncogene at the IGF2 (insulin-like growth factor-2) 11p15.5 CRC amplicon, inducing dysplasia in vitro and tumorigenicity in vivo. These studies demonstrate the general utility of a highly tractable primary organoid system for cancer modeling and driver oncogene validation in diverse gastrointestinal tissues.

    View details for DOI 10.1038/nm.3585

    View details for Web of Science ID 000338689500021

  • Allele-specific copy number profiling by next-generation DNA sequencing. Nucleic acids research Chen, H., Bell, J. M., Zavala, N. A., Ji, H. P., Zhang, N. R. 2014


    The progression and clonal development of tumors often involve amplifications and deletions of genomic DNA. Estimation of allele-specific copy number, which quantifies the number of copies of each allele at each variant locus rather than the total number of chromosome copies, is an important step in the characterization of tumor genomes and the inference of their clonal history. We describe a new method, falcon, for finding somatic allele-specific copy number changes by next generation sequencing of tumors with matched normals. falcon is based on a change-point model on a bivariate mixed Binomial process, which explicitly models the copy numbers of the two chromosome haplotypes and corrects for local allele-specific coverage biases. By using the Binomial distribution rather than a normal approximation, falcon more effectively pools evidence from sites with low coverage. A modified Bayesian information criterion is used to guide model selection for determining the number of copy number events. Falcon is evaluated on in silico spike-in data and applied to the analysis of a pre-malignant colon tumor sample and late-stage colorectal adenocarcinoma from the same individual. The allele-specific copy number estimates obtained by falcon allow us to draw detailed conclusions regarding the clonal history of the individual's colon cancer.

    View details for DOI 10.1093/nar/gku1252

    View details for PubMedID 25477383

  • MendeLIMS: a web-based laboratory information management system for clinical genome sequencing. BMC bioinformatics Grimes, S. M., Ji, H. P. 2014; 15 (1): 290-?


    Large clinical genomics studies using next generation DNA sequencing require the ability to select and track samples from a large population of patients through many experimental steps. With the number of clinical genome sequencing studies increasing, it is critical to maintain adequate laboratory information management systems to manage the thousands of patient samples that are subject to this type of genetic analysis. To meet the needs of clinical population studies using genome sequencing, we developed a web-based laboratory information management system (LIMS) with a flexible configuration that is adaptable to continuously evolving experimental protocols of next generation DNA sequencing technologies. Our system is referred to as MendeLIMS, is easily implemented with open source tools and is also highly configurable and extensible. MendeLIMS has been invaluable in the management of our clinical genome sequencing studies. We maintain a publicly available demonstration version of the application for evaluation purposes at MendeLIMS is programmed in Ruby on Rails (RoR) and accesses data stored in SQL-compliant relational databases. Software is freely available for non-commercial use at

    View details for DOI 10.1186/1471-2105-15-290

    View details for PubMedID 25159034

  • RVD: a command-line program for ultrasensitive rare single nucleotide variant detection using targeted next-generation DNA resequencing. BMC research notes Cushing, A., Flaherty, P., Hopmans, E., Bell, J. M., Ji, H. P. 2013; 6: 206-?


    Rare single nucleotide variants play an important role in genetic diversity and heterogeneity of specific human disease. For example, an individual clinical sample can harbor rare mutations at minor frequencies. Genetic diversity within an individual clinical sample is oftentimes reflected in rare mutations. Therefore, detecting rare variants prior to treatment may prove to be a useful predictor for therapeutic response. Current rare variant detection algorithms using next generation DNA sequencing are limited by inherent sequencing error rate and platform availability. Here we describe an optimized implementation of a rare variant detection algorithm called RVD for use in targeted gene resequencing. RVD is available both as a command-line program and for use in MATLAB and estimates context-specific error using a beta-binomial model to call variants with minor allele frequency (MAF) as low as 0.1%. We show that RVD accepts standard BAM formatted sequence files. We tested RVD analysis on multiple Illumina sequencing platforms, among the most widely used DNA sequencing platforms. RVD meets a growing need for highly sensitive and specific tools for variant detection. To demonstrate the usefulness of RVD, we carried out a thorough analysis of the software's performance on synthetic and clinical virus samples sequenced on both an Illumina GAIIx and a MiSeq. We expect RVD can improve understanding of the genetics and treatment of common viral diseases including influenza. RVD is available at the following URL:

    View details for DOI 10.1186/1756-0500-6-206

    View details for PubMedID 23701658
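
    The beta-binomial idea mentioned in the RVD abstract can be illustrated with a small sketch. This is not RVD's code or its calling procedure: the function names, the error-model parameters `alpha` and `beta`, and the example depth are all hypothetical, and the parameters are assumed to have been fitted already from reference data. The sketch only shows how a beta-binomial tail probability separates a count near the expected error level from one well above it.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, alpha, beta):
    """P(X = k) for X ~ BetaBinomial(n, alpha, beta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


def tail_probability(k, n, alpha, beta):
    """P(X >= k): chance of seeing k or more non-reference reads from error alone."""
    total = sum(beta_binomial_pmf(i, n, alpha, beta) for i in range(k, n + 1))
    return min(1.0, total)


# Hypothetical error model with mean error rate alpha / (alpha + beta) = 0.1%.
depth = 10000
p_low = tail_probability(5, depth, 1.0, 999.0)     # count below the expected error level
p_high = tail_probability(100, depth, 1.0, 999.0)  # 1% non-reference reads
```

    Under these assumed parameters, `p_high` is far smaller than `p_low`, so a position with 1% non-reference reads would stand out against the modelled error while a position with 5 reads in 10,000 would not.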

  • Identification of Insertion Deletion Mutations from Deep Targeted Resequencing. Journal of data mining in genomics & proteomics Natsoulis, G., Zhang, N., Welch, K., Bell, J., Ji, H. P. 2013; 4 (3)


    Taking advantage of the deep targeted sequencing capabilities of next generation sequencers, we have developed a novel two-step insertion deletion (indel) detection algorithm (IDA) that can determine indels from single read sequences with high computational efficiency and sensitivity when indel-bearing reads are a small fraction relative to the wild type reference sequence. First, it identifies candidate indel positions utilizing specific sequence alignment artifacts produced by rapid alignment programs. Second, it confirms the location of the candidate indel by using the Smith-Waterman (SW) algorithm on a restricted subset of sequence reads. We demonstrate that IDA is applicable to indels of varying sizes from deep targeted sequencing data at low fractions where the indel is diluted by wild type sequence. Our algorithm is useful in detecting indel variants present at variable allelic frequencies such as may occur in heterozygotes and mixed normal-tumor tissue.

    View details for DOI 10.4172/2153-0602.1000132

    View details for PubMedID 24511426


    View details for DOI 10.1214/12-AOAS538

    View details for Web of Science ID 000314457400010

  • The Human OligoGenome Resource: a database of oligonucleotide capture probes for resequencing target regions across the human genome. Nucleic acids research Newburger, D. E., Natsoulis, G., Grimes, S., Bell, J. M., Davis, R. W., Batzoglou, S., Ji, H. P. 2012; 40 (Database issue): D1137-43


    Recent exponential growth in the throughput of next-generation DNA sequencing platforms has dramatically spurred the use of accessible and scalable targeted resequencing approaches. This includes candidate region diagnostic resequencing and novel variant validation from whole genome or exome sequencing analysis. We have previously demonstrated that selective genomic circularization is a robust in-solution approach for capturing and resequencing thousands of target human genome loci such as exons and regulatory sequences. To facilitate the design and production of customized capture assays for any given region in the human genome, we developed the Human OligoGenome Resource. This online database contains over 21 million capture oligonucleotide sequences. It enables one to create customized and highly multiplexed resequencing assays of target regions across the human genome and is not restricted to coding regions. In total, this resource provides 92.1% in silico coverage of the human genome. The online server allows researchers to download a complete repository of oligonucleotide probes and design customized capture assays to target multiple regions throughout the human genome. The website has query tools for selecting and evaluating capture oligonucleotides from specified genomic regions.

    View details for DOI 10.1093/nar/gkr973

    View details for PubMedID 22102592

  • Performance comparison of whole-genome sequencing platforms NATURE BIOTECHNOLOGY Lam, H. Y., Clark, M. J., Chen, R., Chen, R., Natsoulis, G., O'Huallachain, M., Dewey, F. E., Habegger, L., Ashley, E. A., Gerstein, M. B., Butte, A. J., Ji, H. P., Snyder, M. 2012; 30 (1): 78-U118

    View details for DOI 10.1038/nbt.2065

    View details for Web of Science ID 000299110600023

  • A cross-sample statistical model for SNP detection in short-read sequencing data NUCLEIC ACIDS RESEARCH Muralidharan, O., Natsoulis, G., Bell, J., Newburger, D., Xu, H., Kela, I., Ji, H., Zhang, N. 2012; 40 (1)


    Highly multiplex DNA sequencers have greatly expanded our ability to survey human genomes for previously unknown single nucleotide polymorphisms (SNPs). However, sequencing and mapping errors, though rare, contribute substantially to the number of false discoveries in current SNP callers. We demonstrate that we can significantly reduce the number of false positive SNP calls by pooling information across samples. Although many studies prepare and sequence multiple samples with the same protocol, most existing SNP callers ignore cross-sample information. In contrast, we propose an empirical Bayes method that uses cross-sample information to learn the error properties of the data. This error information lets us call SNPs with a lower false discovery rate than existing methods.

    View details for DOI 10.1093/nar/gkr851

    View details for Web of Science ID 000298733500005

    View details for PubMedID 22064853

  • Identification of a novel deletion mutant strain in Saccharomyces cerevisiae that results in a microsatellite instability phenotype. BioDiscovery Ji, H. P., Morales, S., Welch, K., Yuen, C., Farnam, K., Ford, J. M. 2012


    The DNA mismatch repair (MMR) pathway corrects specific types of DNA replication errors that affect microsatellites and thus is critical for maintaining genomic integrity. The genes of the MMR pathway are highly conserved across different organisms. Likewise, defective MMR function universally results in microsatellite instability (MSI), which is a hallmark of certain types of cancer associated with the Mendelian disorder hereditary nonpolyposis colorectal cancer (Lynch syndrome). To identify previously unrecognized deleted genes or loci that can lead to MSI, we developed a functional genomics screen utilizing a plasmid containing a microsatellite sequence that is a hot spot for MSI mutations and the comprehensive homozygous diploid deletion mutant resource for Saccharomyces cerevisiae. This pool represents a collection of non-essential homozygous yeast diploid (2N) mutants in which there are deletions for over four thousand yeast open reading frames (ORFs). From our screen, we identified a deletion mutant strain of the PAU24 gene that leads to MSI. In a series of validation experiments, we determined that this PAU24 mutant strain had an increased MSI-specific mutation rate in comparison to the original background wildtype strain and other deletion mutants, and a rate comparable to an MMR mutant involving the MLH1 gene. Likewise, in yeast strains with a deletion of PAU24, we identified specific de novo indel mutations that occurred within the targeted microsatellite used for this screen.

    View details for PubMedID 23667739

  • Targeted sequencing library preparation by genomic DNA circularization BMC BIOTECHNOLOGY Myllykangas, S., Natsoulis, G., Bell, J. M., Ji, H. P. 2011; 11


    For next generation DNA sequencing, we have developed a rapid and simple approach for preparing DNA libraries of targeted DNA content. Current protocols for preparing DNA for next-generation targeted sequencing are labor-intensive, require large amounts of starting material, and are prone to artifacts that result from necessary PCR amplification of sequencing libraries. Typically, sample preparation for targeted NGS is a two-step process where (1) the desired regions are selectively captured and (2) the ends of the DNA molecules are modified to render them compatible with any given NGS sequencing platform. In this proof-of-concept study, we present an integrated approach that combines these two separate steps into one. Our method involves circularization of a specific genomic DNA molecule that directly incorporates the necessary components for conducting sequencing in a single assay and requires only one PCR amplification step. We also show that specific regions of the genome can be targeted and sequenced without any PCR amplification. We anticipate that these rapid targeted libraries will be useful for validation of variants and may have diagnostic application.

    View details for DOI 10.1186/1472-6750-11-122

    View details for Web of Science ID 000300427900001

    View details for PubMedID 22168766

  • Genetic-based biomarkers and next-generation sequencing: the future of personalized care in colorectal cancer PERSONALIZED MEDICINE Kim, R. Y., Xu, H., Myllykangas, S., Ji, H. 2011; 8 (3): 331-345

    View details for DOI 10.2217/PME.11.16

    View details for Web of Science ID 000291444800013

  • Oncogenic BRAF Mutation with CDKN2A Inactivation Is Characteristic of a Subset of Pediatric Malignant Astrocytomas CANCER RESEARCH Schiffman, J. D., Hodgson, J. G., VandenBerg, S. R., Flaherty, P., Polley, M. C., Yu, M., Fisher, P. G., Rowitch, D. H., Ford, J. M., Berger, M. S., Ji, H., Gutmann, D. H., James, C. D. 2010; 70 (2): 512-519


    Malignant astrocytomas are a deadly solid tumor in children. Limited understanding of their underlying genetic basis has contributed to modest progress in developing more effective therapies. In an effort to identify such alterations, we performed a genome-wide search for DNA copy number aberrations (CNA) in a panel of 33 tumors encompassing grade 1 through grade 4 tumors. Genomic amplifications of 10-fold or greater were restricted to grade 3 and 4 astrocytomas and included the MDM4 (1q32), PDGFRA (4q12), MET (7q21), CMYC (8q24), PVT1 (8q24), WNT5B (12p13), and IGF1R (15q26) genes. Homozygous deletions of CDKN2A (9p21), PTEN (10q26), and TP53 (17p13.1) were evident among grade 2 to 4 tumors. BRAF gene rearrangements that were indicated in three tumors prompted the discovery of KIAA1549-BRAF fusion transcripts expressed in 10 of 10 grade 1 astrocytomas and in none of the grade 2 to 4 tumors. In contrast, an oncogenic missense BRAF mutation (BRAF(V600E)) was detected in 7 of 31 grade 2 to 4 tumors but in none of the grade 1 tumors. BRAF(V600E) mutation seems to define a subset of malignant astrocytomas in children, in which there is frequent concomitant homozygous deletion of CDKN2A (five of seven cases). Taken together, these findings highlight BRAF as a frequent mutation target in pediatric astrocytomas, with distinct types of BRAF alteration occurring in grade 1 versus grade 2 to 4 tumors.

    View details for DOI 10.1158/0008-5472.CAN-09-1851

    View details for Web of Science ID 000278485500011

    View details for PubMedID 20068183

  • Detecting simultaneous changepoints in multiple sequences. Biometrika Zhang, N. R., Siegmund, D. O., Ji, H., Li, J. Z. 2010; 97 (3): 631-645


    We discuss the detection of local signals that occur at the same location in multiple one-dimensional noisy sequences, with particular attention to relatively weak signals that may occur in only a fraction of the sequences. We propose simple scan and segmentation algorithms based on the sum of the chi-squared statistics for each individual sample, which is equivalent to the generalized likelihood ratio for a model where the errors in each sample are independent. The simple geometry of the statistic allows us to derive accurate analytic approximations to the significance level of such scans. The formulation of the model is motivated by the biological problem of detecting recurrent DNA copy number variants in multiple samples. We show using replicates and parent-child comparisons that pooling data across samples results in more accurate detection of copy number variants. We also apply the multisample segmentation algorithm to the analysis of a cohort of tumour samples containing complex nested and overlapping copy number aberrations, for which our method gives a sparse and intuitive cross-sample summary.

    View details for PubMedID 22822250
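
    The scan described in the Biometrika abstract sums a chi-squared statistic over samples for each candidate interval. A minimal sketch of that idea follows. It makes simplifying assumptions that are not taken from the paper: every sequence is taken to have zero baseline mean and unit noise variance, intervals are searched exhaustively up to a hypothetical `max_width`, and no significance approximation is computed. The function name and the toy data are illustrative only.

```python
def sum_chisq_scan(samples, max_width):
    """Return (statistic, start, end) for the half-open interval [start, end)
    that maximizes the sum over samples of (interval sum)^2 / interval length.

    samples: list of equal-length lists of floats, assumed standardized
             (zero mean and unit variance outside any signal).
    """
    n = len(samples[0])

    # Prefix sums let each interval sum be computed in constant time.
    prefixes = []
    for seq in samples:
        prefix = [0.0]
        for value in seq:
            prefix.append(prefix[-1] + value)
        prefixes.append(prefix)

    best = (0.0, 0, 0)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_width) + 1):
            width = end - start
            # Each term is a one-degree-of-freedom chi-squared statistic under the null.
            stat = sum((p[end] - p[start]) ** 2 / width for p in prefixes)
            if stat > best[0]:
                best = (stat, start, end)
    return best


# Toy data: two of three sequences share a shift of +2 at positions 5..9.
carrier = [0.0] * 5 + [2.0] * 5 + [0.0] * 10
flat = [0.0] * 20
result = sum_chisq_scan([carrier, carrier, flat], max_width=8)
```

    On this noise-free toy input the scan recovers the shared interval even though one sequence carries no signal, which is the pooling effect the abstract refers to.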

  • Identification of a biomarker panel using a multiplex proximity ligation assay improves accuracy of pancreatic cancer diagnosis JOURNAL OF TRANSLATIONAL MEDICINE Chang, S. T., Zahn, J. M., Horecka, J., Kunz, P. L., Ford, J. M., Fisher, G. A., Le, Q. T., Chang, D. T., Ji, H., Koong, A. C. 2009; 7


    Pancreatic cancer continues to prove difficult to clinically diagnose. Multiple simultaneous measurements of plasma biomarkers can increase sensitivity and selectivity of diagnosis. Proximity ligation assay (PLA) is a highly sensitive technique for multiplex detection of biomarkers in plasma with little or no interfering background signal. We examined the plasma levels of 21 biomarkers in a clinically defined cohort of 52 locally advanced (Stage II/III) pancreatic ductal adenocarcinoma cases and 43 age-matched controls using a multiplex proximity ligation assay. The optimal biomarker panel for diagnosis was computed using a combination of the PAM algorithm and logistic regression modeling. Biomarkers that were significantly prognostic for survival in combination were determined using univariate and multivariate Cox survival models. Three markers, CA19-9, OPN and CHI3L1, measured in multiplex were found to have superior sensitivity for pancreatic cancer vs. CA19-9 alone (93% vs. 80%). In addition, we identified two markers, CEA and CA125, that when measured simultaneously have prognostic significance for survival for this clinical stage of pancreatic cancer (p < 0.003). A multiplex panel assaying CA19-9, OPN and CHI3L1 in plasma improves accuracy of pancreatic cancer diagnosis. A panel assaying CEA and CA125 in plasma can predict survival for this clinical cohort of pancreatic cancer patients.

    View details for DOI 10.1186/1479-5876-7-105

    View details for Web of Science ID 000272889900001

    View details for PubMedID 20003342

  • Molecular inversion probes reveal patterns of 9p21 deletion and copy number aberrations in childhood leukemia CANCER GENETICS AND CYTOGENETICS Schiffman, J. D., Wang, Y., McPherson, L. A., Welch, K., Zhang, N., Davis, R., Lacayo, N. J., Dahl, G. V., Faham, M., Ford, J. M., Ji, H. P. 2009; 193 (1): 9-18


    Childhood leukemia, which accounts for >30% of newly diagnosed childhood malignancies, is one of the leading causes of death for children with cancer. Genome-wide studies using microarray chips to identify copy number changes in human cancer are becoming more common. In this pilot study, 45 pediatric leukemia samples were analyzed for gene copy aberrations using novel molecular inversion probe (MIP) technology. Acute leukemia subtypes included precursor B-cell acute lymphoblastic leukemia (ALL) (n=23), precursor T-cell ALL (n=6), and acute myeloid leukemia (n=14). The MIP analysis identified 69 regions of recurring copy number changes, of which 41 have not been identified with other DNA microarray platforms. Copy number gains and losses were validated in 98% of clinical karyotypes and 100% of fluorescence in situ hybridization studies available. We report unique patterns of copy number loss in samples with 9p21.3 (CDKN2A) deletion in the precursor B-cell ALL patients, compared with the precursor T-cell ALL patients. MIPs represent an attractive technology for identifying novel copy number aberrations, validating previously reported copy number changes, and translating molecular findings into clinically relevant targets for further investigation.

    View details for DOI 10.1016/j.cancergencyto.2009.03.005

    View details for Web of Science ID 000268922900002

    View details for PubMedID 19602459

  • Disperse-a software system for design of selector probes for exon resequencing applications BIOINFORMATICS Stenberg, J., Zhang, M., Ji, H. 2009; 25 (5): 666-667


    Selector probes enable the amplification of many selected regions of the genome in multiplex. Disperse is a software pipeline that automates the procedure of designing selector probes for exon resequencing applications. Software and documentation are available at

    View details for DOI 10.1093/bioinformatics/btp001

    View details for Web of Science ID 000263834600018

    View details for PubMedID 19158162

  • Molecular inversion probe assay for allelic quantitation. Methods in molecular biology (Clifton, N.J.) Ji, H., Welch, K. 2009; 556: 67-87


    Molecular inversion probe (MIP) technology has been demonstrated to be a robust platform for large-scale dual genotyping and copy number analysis. Applications in human genomic and genetic studies include the possibility of running dual germline genotyping and combined copy number variation ascertainment. MIPs analyze large numbers of specific genetic target sequences in parallel, relying on interrogation of a barcode tag, rather than direct hybridization of genomic DNA to an array. The MIP approach does not replace, but is complementary to many of the copy number technologies being performed today. Some specific advantages of MIP technology include: less DNA required (37 ng vs. 250 ng), DNA quality less important, more dynamic range (amplifications detected up to copy number 60), allele-specific information "cleaner" (less SNP cross-talk/contamination), and quality of markers better (fewer individual MIPs versus SNPs needed to identify copy number changes). MIPs can be considered a candidate gene (targeted whole genome) approach and can find specific areas of interest that otherwise may be missed with other methods.

    View details for DOI 10.1007/978-1-60327-192-9_6

    View details for PubMedID 19488872

  • Next-generation DNA sequencing NATURE BIOTECHNOLOGY Shendure, J., Ji, H. 2008; 26 (10): 1135-1145


    DNA sequence represents a single format onto which a broad range of biological phenomena can be projected for high-throughput data collection. Over the past three years, massively parallel DNA sequencing platforms have become widely available, reducing the cost of DNA sequencing by over two orders of magnitude, and democratizing the field by putting the sequencing capacity of a major genome center in the hands of individual investigators. These new technologies are rapidly evolving, and near-term challenges include the development of robust protocols for generating sequencing libraries, building effective new approaches to data-analysis, and often a rethinking of experimental design. Next-generation DNA sequencing has the potential to dramatically accelerate biological and biomedical research, by enabling the comprehensive analysis of genomes, transcriptomes and interactomes to become inexpensive, routine and widespread, rather than requiring significant production-scale efforts.

    View details for DOI 10.1038/nbt1486

    View details for Web of Science ID 000259926000028

    View details for PubMedID 18846087

  • Multigene amplification and massively parallel sequencing for cancer mutation discovery PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA Dahl, F., Stenberg, J., Fredriksson, S., Welch, K., Zhang, M., Nilsson, M., Bicknell, D., Bodmer, W. F., Davis, R. W., Ji, H. 2007; 104 (22): 9387-9392


    We have developed a procedure for massively parallel resequencing of multiple human genes by combining a highly multiplexed and target-specific amplification process with a high-throughput parallel sequencing technology. The amplification process is based on oligonucleotide constructs, called selectors, that guide the circularization of specific DNA target regions. Subsequently, the circularized target sequences are amplified in multiplex and analyzed by using a highly parallel sequencing-by-synthesis technology. As a proof-of-concept study, we demonstrate parallel resequencing of 10 cancer genes covering 177 exons with average sequence coverage per sample of 93%. Seven cancer cell lines and one normal genomic DNA sample were studied with multiple mutations and polymorphisms identified among the 10 genes. Mutations and polymorphisms in the TP53 gene were confirmed by traditional sequencing.

    View details for DOI 10.1073/pnas.0702165104

    View details for Web of Science ID 000246935700055

    View details for PubMedID 17517648

  • Multiplex amplification of all coding sequences within 10 cancer genes by Gene-Collector NUCLEIC ACIDS RESEARCH Fredriksson, S., Baner, J., Dahl, F., Chu, A., Ji, H., Welch, K., Davis, R. W. 2007; 35 (7)


    Herein we present Gene-Collector, a method for multiplex amplification of nucleic acids. The procedure has been employed to successfully amplify the coding sequence of 10 human cancer genes in one assay with uniform abundance of the final products. Amplification is initiated by a multiplex PCR, in this case with 170 primer pairs. Each PCR product is then specifically circularized by ligation on a Collector probe capable of juxtapositioning only the perfectly matched cognate primer pairs. Amplification artifacts typically associated with multiplex PCR using many primer pairs, such as false amplicons and primer-dimers, are not circularized and are degraded by exonuclease treatment. Circular DNA molecules are then further enriched by randomly primed rolling circle replication. Amplification was successful for 90% of the targeted amplicons as seen by hybridization to a custom resequencing DNA micro-array. Real-time quantitative PCR revealed that 96% of the amplification products were all within 4-fold of the average abundance. Gene-Collector has utility for numerous applications such as high throughput resequencing, SNP analyses, and pathogen detection.

    View details for DOI 10.1093/nar/gkm078

    View details for Web of Science ID 000246294700001

    View details for PubMedID 17317684

  • Multiplexed protein detection by proximity ligation for cancer biomarker validation NATURE METHODS Fredriksson, S., Dixon, W., Ji, H., Koong, A. C., Mindrinos, M., Davis, R. W. 2007; 4 (4): 327-329


    We present a proximity ligation-based multiplexed protein detection procedure in which several selected proteins can be detected via unique nucleic-acid identifiers and subsequently quantified by real-time PCR. The assay requires a 1-µl sample, has low-femtomolar sensitivity as well as five-log linear range and allows for modular multiplexing without cross-reactivity. The procedure can use a single polyclonal antibody batch for each target protein, simplifying affinity-reagent creation for new biomarker candidates.

    View details for DOI 10.1038/NMETH1020

    View details for Web of Science ID 000245584900013

    View details for PubMedID 17369836

  • Under-expression of Kalirin-7 increases iNOS activity in cultured cells and correlates to elevated iNOS activity in Alzheimer's disease hippocampus JOURNAL OF ALZHEIMERS DISEASE Youn, H., Ji, I., Ji, H. P., Markesbery, W. R., Ji, T. H. 2007; 12 (3): 271-281


    Recently, it has been reported that Kalirin gene transcripts are under-expressed in AD hippocampal specimens compared to the controls. The Kalirin gene generates a dozen Kalirin isoforms. Kalirin-7 is the predominant protein expressed in the adult brain and plays crucial roles in growth and maintenance of neurons. Yet its role in human diseases is unknown. We report that Kalirin-7 is significantly diminished both at the mRNA and protein levels in the hippocampus specimens from 19 AD patients compared to the specimens from 15 controls. Kalirin-7 associates with iNOS in the hippocampus, and therefore, Kalirin-7 is complexed with iNOS less in AD hippocampus extracts than in control hippocampus extracts. In cultured cells, Kalirin-7 associates with iNOS and down-regulates the enzyme activity. The down-regulation is attributed to the highly conserved 33 amino acid sequence, K(617)-H(649), of the 1,663 amino acid long Kalirin-7. Remarkably, the iNOS activity is considerably higher in the hippocampus specimens from AD patients than in the specimens from 15 controls. These observations suggest that the under-expression of Kalirin-7 in AD hippocampus correlates to the elevated iNOS activity.

    View details for Web of Science ID 000252300000009

    View details for PubMedID 18057561

  • Reproducibility Probability Score - incorporating measurement variability across laboratories for gene selection NATURE BIOTECHNOLOGY Lin, G., He, X., Ji, H., Shi, L., Davis, R. W., Zhong, S. 2006; 24 (12): 1476-1477

    View details for Web of Science ID 000242795800015

    View details for PubMedID 17160039

  • Data quality in genomics and microarrays NATURE BIOTECHNOLOGY Ji, H., Davis, R. W. 2006; 24 (9): 1112-1113

    View details for DOI 10.1038/nbt0906-1108

    View details for Web of Science ID 000240495200031

    View details for PubMedID 16964224

  • The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements NATURE BIOTECHNOLOGY Shi, L., Reid, L. H., Jones, W. D., Shippy, R., Warrington, J. A., Baker, S. C., Collins, P. J., de Longueville, F., Kawasaki, E. S., Lee, K. Y., Luo, Y., Sun, Y. A., Willey, J. C., Setterquist, R. A., Fischer, G. M., Tong, W., Dragan, Y. P., Dix, D. J., Frueh, F. W., Goodsaid, F. M., Herman, D., Jensen, R. V., Johnson, C. D., Lobenhofer, E. K., Puri, R. K., Scherf, U., Thierry-Mieg, J., Wang, C., Wilson, M., Wolber, P. K., Zhang, L., Amur, S., Bao, W., Barbacioru, C. C., Lucas, A. B., Bertholet, V., Boysen, C., Bromley, B., Brown, D., Brunner, A., Canales, R., Cao, X. M., Cebula, T. A., Chen, J. J., Cheng, J., Chu, T., Chudin, E., Corson, J., Corton, J. C., Croner, L. J., Davies, C., Davison, T. S., Delenstarr, G., Deng, X., Dorris, D., Eklund, A. C., Fan, X., Fang, H., Fulmer-Smentek, S., Fuscoe, J. C., Gallagher, K., Ge, W., Guo, L., Guo, X., Hager, J., Haje, P. K., Han, J., Han, T., Harbottle, H. C., Harris, S. C., Hatchwell, E., Hauser, C. A., Hester, S., Hong, H., Hurban, P., Jackson, S. A., Ji, H., Knight, C. R., Kuo, W. P., LeClerc, J. E., Levy, S., Li, Q., Liu, C., Liu, Y., Lombardi, M. J., Ma, Y., Magnuson, S. R., Maqsodi, B., McDaniel, T., Mei, N., Myklebost, O., Ning, B., Novoradovskaya, N., Orr, M. S., Osborn, T. W., Papallo, A., Patterson, T. A., Perkins, R. G., Peters, E. H., Peterson, R., Philips, K. L., Pine, P. S., Pusztai, L., Qian, F., Ren, H., Rosen, M., Rosenzweig, B. A., Samaha, R. R., Schena, M., Schroth, G. P., Shchegrova, S., Smith, D. D., Staedtler, F., Su, Z., Sun, H., Szallasi, Z., Tezak, Z., Thierry-Mieg, D., Thompson, K. L., Tikhonova, I., Turpaz, Y., Vallanat, B., Van, C., Walker, S. J., Wang, S. J., Wang, Y., Wolfinger, R., Wong, A., Wu, J., Xiao, C., Xie, Q., Xu, J., Yang, W., Zhang, L., Zhong, S., Zong, Y., Slikker, W. 2006; 24 (9): 1151-1161


    Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, has raised concerns about the reliability of this technology. The MicroArray Quality Control (MAQC) project was initiated to address these concerns, as well as other performance and data analysis issues. Expression data on four titration pools from two distinct reference RNA samples were generated at multiple test sites using a variety of microarray-based and alternative technology platforms. Here we describe the experimental design and probe mapping efforts behind the MAQC project. We show intraplatform consistency across test sites as well as a high level of interplatform concordance in terms of genes identified as differentially expressed. This study provides a resource that represents an important first step toward establishing a framework for the use of microarrays in clinical and regulatory settings.

    View details for DOI 10.1038/nbt1239

    View details for Web of Science ID 000240495200036

    View details for PubMedID 16964229

  • Molecular inversion probe analysis of gene copy alterations reveals distinct categories of colorectal carcinoma CANCER RESEARCH Ji, H., Kumm, J., Zhang, M., Farnam, K., Salari, K., Faham, M., Ford, J. M., Davis, R. W. 2006; 66 (16): 7910-7919


    Genomic instability is a major feature of neoplastic development in colorectal carcinoma and other cancers. Specific genomic instability events, such as deletions in chromosomes and other alterations in gene copy number, have potential utility as biologically relevant prognostic biomarkers. For example, genomic deletions on chromosome arm 18q are an indicator of colorectal carcinoma behavior and potentially useful as a prognostic indicator. Adapting a novel genomic technology called molecular inversion probes which can determine gene copy alterations, such as genomic deletions, we designed a set of probes to interrogate several hundred individual exons of >200 cancer genes with an overall distribution covering all chromosome arms. In addition, >100 probes were designed in close proximity of microsatellite markers on chromosome arm 18q. We analyzed a set of colorectal carcinoma cell lines and primary colorectal tumor samples for gene copy alterations and deletion mutations in exons. Based on clustering analysis, we distinguished the different categories of genomic instability among the colorectal cancer cell lines. Our analysis of primary tumors uncovered several distinct categories of colorectal carcinoma, each with specific patterns of 18q deletions and deletion mutations in specific genes. This finding has potential clinical ramifications given the application of 18q loss of heterozygosity events as a potential indicator for adjuvant treatment in stage II colorectal carcinoma.

    View details for DOI 10.1158/0008-5472.CAN-06-0595

    View details for Web of Science ID 000239828200013

    View details for PubMedID 16912164

  • A functional assay for mutations in tumor suppressor genes caused by mismatch repair deficiency HUMAN MOLECULAR GENETICS Ji, H. P., King, M. C. 2001; 10 (24): 2737-2743


    The coding sequences of multiple human tumor suppressor genes include microsatellite sequences that are prone to mutations. Saccharomyces cerevisiae strains deficient in DNA mismatch repair (MMR) can be used to determine de novo mutation rates of these human tumor suppressor genes as well as any other gene sequence. Microsatellites in human TGFBR2, PTEN and APC genes were placed in yeast vectors and analyzed in isogenic yeast strains that were wild-type or deletion mutants for MSH2 or MLH1. In MMR-deficient strains, the vector containing the (A)10 microsatellite sequence of TGFBR2 had a mutation rate (mutations/cell division) of 1.4 x 10^-4, compared to a mutation rate of 1.7 x 10^-6 in the wild-type strain. In MMR-deficient strains, mutation rates in PTEN and APC were also elevated above background levels. PTEN mutation rates were higher in both msh2 (4.4 x 10^-5) and mlh1 strains (2.3 x 10^-5). APC mutation rates in the msh2 strain (2.4 x 10^-6) and the mlh1 strain (1.7 x 10^-6) were also significantly, but less dramatically, elevated over background. Mutations selected for in the yeast screen were identical to those previously observed in human tumor samples with microsatellite instability (MSI). This functional assay has applicability in providing quantitative data about microsatellite mutation rates caused by MMR deficiency in any human tumor suppressor gene sequence. It can also be applied as a genetic screen to identify new genes that are vulnerable to such microsatellite mutations and thus may be involved in the neoplastic development of tumors with MSI.

    View details for Web of Science ID 000172867500001

    View details for PubMedID 11734538

  • Spondyloepimetaphyseal dysplasia with joint laxity (SEMDJL): Presentation in two unrelated patients in the United States AMERICAN JOURNAL OF MEDICAL GENETICS Smith, W., Ji, H. L., Mouradian, W., Pagon, R. A. 1999; 86 (3): 245-252


    This is a report of two North American patients with spondyloepimetaphyseal dysplasia with joint laxity, an uncommon autosomal recessive skeletal dysplasia rarely reported outside of South Africa. Patients with SEMDJL have vertebral abnormalities and ligamentous laxity that results in spinal misalignment and progressive severe kyphoscoliosis, thoracic asymmetry, and respiratory compromise resulting in early death. Nonaxial skeletal involvement includes elbow deformities with radial head dislocation, dislocated hips, clubbed feet, and tapered fingers with spatulate distal phalanges. Many affected children have an oval face, flat midface, prominent eyes with blue sclerae, and a long philtrum. Palatal abnormalities and congenital heart disease are also observed. Diagnosis in infancy may be difficult because many of the typical findings are not apparent early and only evolve over time. We review the physical and radiographic findings in two unrelated patients with this disorder in order to increase the awareness of this disorder, particularly for clinicians outside of South Africa.

    View details for Web of Science ID 000082714300010

    View details for PubMedID 10482874

  • Molecular classification of the inherited hamartoma polyposis syndromes: Clearing the muddied waters AMERICAN JOURNAL OF HUMAN GENETICS Eng, C., Ji, H. L. 1998; 62 (5): 1020-1022

    View details for Web of Science ID 000073487000004

    View details for PubMedID 9545417

  • Inherited mutations in PTEN that are associated with breast cancer, Cowden disease, and juvenile polyposis AMERICAN JOURNAL OF HUMAN GENETICS Lynch, E. D., Ostermeyer, E. A., Lee, M. K., Arena, J. F., Ji, H. L., Dann, J., Swisshelm, K., Suchard, D., MacLeod, P. M., Kvinnsland, S., Gjertsen, B. T., Heimdal, K., Lubs, H., Moller, P., King, M. C. 1997; 61 (6): 1254-1260


    PTEN, a protein tyrosine phosphatase with homology to tensin, is a tumor-suppressor gene on chromosome 10q23. Somatic mutations in PTEN occur in multiple tumors, most markedly glioblastomas. Germ-line mutations in PTEN are responsible for Cowden disease (CD), a rare autosomal dominant multiple-hamartoma syndrome. PTEN was sequenced from constitutional DNA from 25 families. Germ-line PTEN mutations were detected in all of five families with both breast cancer and CD, in one family with juvenile polyposis syndrome, and in one of four families with breast and thyroid tumors. In this last case, signs of CD were subtle and were diagnosed only in the context of mutation analysis. PTEN mutations were not detected in 13 families at high risk of breast and/or ovarian cancer. No PTEN-coding-sequence polymorphisms were detected in 70 independent chromosomes. Seven PTEN germ-line mutations occurred, five nonsense and two missense mutations, in six of nine PTEN exons. The wild-type PTEN allele was lost from renal, uterine, breast, and thyroid tumors from a single patient. Loss of PTEN expression was an early event, reflected in loss of the wild-type allele in DNA from normal tissue adjacent to the breast and thyroid tumors. In RNA from normal tissues from three families, mutant transcripts appeared unstable. Germ-line PTEN mutations predispose to breast cancer in association with CD, although the signs of CD may be subtle.

    View details for Web of Science ID 000071555900007

    View details for PubMedID 9399897
