Bio


Throughout my doctoral training, I focused on developing and applying bioinformatics and machine learning methods. During my postdoctoral training, I contributed to the development of machine-learning-based approaches and applied them to cardiovascular and cancer genetic data. My current interests are in developing methods to investigate how different layers of (epi)genomic data can be integrated to establish a holistic view of the molecular mechanisms underlying cancer initiation, progression, and drug resistance.

Academic Appointments


Honors & Awards


  • Susan G. Komen Postdoctoral Fellowship, The Susan G. Komen Breast Cancer Foundation (June 2016)

Professional Education


  • PhD, University of A Coruna, Computer Science (2012)

Publications

All Publications


  • Chromatin state as a mechanism of anthracycline response in breast cancer Seoane, J. A., Kirkland, J. G., Caswell-Jin, J. L., Crabtree, G. R., Curtis, C. AMER ASSOC CANCER RESEARCH. 2019
  • Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen NATURE COMMUNICATIONS Menden, M. P., Wang, D., Mason, M. J., Szalai, B., Bulusu, K. C., Guan, Y., Yu, T., Kang, J., Jeon, M., Wolfinger, R., Nguyen, T., Zaslavskiy, M., Jang, I., Ghazoui, Z., Ahsen, M., Vogel, R., Neto, E., Norman, T., Tang, E. Y., Garnett, M. J., Di Veroli, G. Y., Fawell, S., Stolovitzky, G., Guinney, J., Dry, J. R., Saez-Rodriguez, J., Abante, J., Abecassis, B., Aben, N., Aghamirzaie, D., Aittokallio, T., Akhtari, F. S., Al-lazikani, B., Alam, T., Allam, A., Allen, C., de Almeida, M., Altarawy, D., Alves, V., Amadoz, A., Anchang, B., Antolin, A. A., Ash, J. R., Romeo Aznar, V., Ba-alawi, W., Bagheri, M., Bajic, V., Ball, G., Ballester, P. J., Baptista, D., Bare, C., Bateson, M., Bender, A., Bertrand, D., Wijayawardena, B., Boroevich, K. A., Bosdriesz, E., Bougouffa, S., Bounova, G., Brouwer, T., Bryant, B., Calaza, M., Calderone, A., Calza, S., Capuzzi, S., Carbonell-Caballero, J., Carlin, D., Carter, H., Castagnoli, L., Celebi, R., Cesareni, G., Chang, H., Chen, G., Chen, H., Chen, H., Cheng, L., Chernomoretz, A., Chicco, D., Cho, K., Cho, S., Choi, D., Choi, J., Choi, K., Choi, M., De Cock, M., Coker, E., Cortes-Ciriano, I., Cserzo, M., Cubuk, C., Curtis, C., Van Daele, D., Dang, C. C., Dijkstra, T., Dopazo, J., Draghici, S., Drosou, A., Dumontier, M., Ehrhart, F., Eid, F., ElHefnawi, M., Elmarakeby, H., van Engelen, B., Engin, H., de Esch, I., Evelo, C., Falcao, A. O., Farag, S., Fernandez-Lozano, C., Fisch, K., Flobak, A., Fornari, C., Foroushani, A. K., Fotso, D., Fourches, D., Friend, S., Frigessi, A., Gao, F., Gao, X., Gerold, J. M., Gestraud, P., Ghosh, S., Gillberg, J., Godoy-Lorite, A., Godynyuk, L., Godzik, A., Goldenberg, A., Gomez-Cabrero, D., Gonen, M., de Graaf, C., Gray, H., Grechkin, M., Guimera, R., Guney, E., Haibe-Kains, B., Han, Y., Hase, T., He, D., He, L., Heath, L. S., Hellton, K. 
H., Helmer-Citterich, M., Hidalgo, M. R., Hidru, D., Hill, S. M., Hochreiter, S., Hong, S., Hovig, E., Hsueh, Y., Hu, Z., Huang, J. K., Huang, R., Hunyady, L., Hwang, J., Hwang, T., Hwang, W., Hwang, Y., Isayev, O., Walk, O., Jack, J., Jahandideh, S., Ji, J., Jo, Y., Kamola, P. J., Kanev, G. K., Karacosta, L., Karimi, M., Kaski, S., Kazanov, M., Khamis, A. M., Khan, S., Kiani, N. A., Kim, A., Kim, J., Kim, J., Kim, K., Kim, K., Kim, S., Kim, Y., Kim, Y., Kirk, P. W., Kitano, H., Klambauer, G., Knowles, D., Ko, M., Kohn-Luque, A., Kooistra, A. J., Kuenemann, M. A., Kuiper, M., Kurz, C., Kwon, M., van Laarhoven, T., Laegreid, A., Lederer, S., Lee, H., Lee, J., Lee, Y., Leppaho, E., Lewis, R., Li, J., Li, L., Liley, J., Lim, W., Lin, C., Liu, Y., Lopez, Y., Low, J., Lysenko, A., Machado, D., Madhukar, N., De Maeyer, D., Malpartida, A., Mamitsuka, H., Marabita, F., Marchal, K., Marttinen, P., Mason, D., Mazaheri, A., Mehmood, A., Mehreen, A., Michaut, M., Miller, R. A., Mitsopoulos, C., Modos, D., Van Moerbeke, M., Moo, K., Motsinger-Reif, A., Movva, R., Muraru, S., Muratov, E., Mushthofa, M., Nagarajan, N., Nakken, S., Nath, A., Neuvial, P., Newton, R., Ning, Z., De Niz, C., Oliva, B., Olsen, C., Palmeri, A., Panesar, B., Papadopoulos, S., Park, J., Park, S., Park, S., Pawitan, Y., Peluso, D., Pendyala, S., Peng, J., Perfetto, L., Pirro, S., Plevritis, S., Politi, R., Poon, H., Porta, E., Prellner, I., Preuer, K., Angel Pujana, M., Ramnarine, R., Reid, J. E., Reyal, F., Richardson, S., Ricketts, C., Rieswijk, L., Rocha, M., Rodriguez-Gonzalvez, C., Roell, K., Rotroff, D., de Ruiter, J. R., Rukawa, P., Sadacca, B., Safikhani, Z., Safitri, F., Sales-Pardo, M., Sauer, S., Schlichting, M., Seoane, J. 
A., Serra, J., Shang, M., Sharma, A., Sharma, H., Shen, Y., Shiga, M., Shin, M., Shkedy, Z., Shopsowitz, K., Sinai, S., Skola, D., Smirnov, P., Soerensen, I., Soerensen, P., Song, J., Song, S., Soufan, O., Spitzmueller, A., Steipe, B., Suphavilai, C., Tamayo, S., Tamborero, D., Tang, J., Tanoli, Z., Tarres-Deulofeu, M., Tegner, J., Thommesen, L., Tonekaboni, S., Tran, H., De Troyer, E., Truong, A., Tsunoda, T., Turu, G., Tzeng, G., Verbeke, L., Videla, S., Vis, D., Voronkov, A., Votis, K., Wang, A., Wang, H., Wang, P., Wang, S., Wang, W., Wang, X., Wang, X., Wennerberg, K., Wernisch, L., Wessels, L., van Westen, G. P., Westerman, B. A., White, S., Willighagen, E., Wurdinger, T., Xie, L., Xie, S., Xu, H., Yadav, B., Yau, C., Yeerna, H., Yin, J., Yu, M., Yu, M., Yun, S., Zakharov, A., Zamichos, A., Zanin, M., Zeng, L., Zenil, H., Zhang, F., Zhang, P., Zhang, W., Zhao, H., Zhao, L., Zheng, W., Zoufir, A., Zucknick, M., AstraZeneca-Sanger Drug Combinatio 2019; 10: 2674

    Abstract

    The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca's large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and results of a DREAM Challenge to evaluate computational strategies for predicting synergistic drug pairs and biomarkers. 160 teams participated to provide a comprehensive methodological development and benchmarking. Winning methods incorporate prior knowledge of drug-target interactions. Synergy is predicted with an accuracy matching biological replicates for >60% of combinations. However, 20% of drug combinations are poorly predicted by all methods. Genomic rationale for synergy predictions are identified, including ADAM17 inhibitor antagonism when combined with PIK3CB/D inhibition contrasting to synergy when combined with other PI3K-pathway inhibitors in PIK3CA mutant cells.

    View details for DOI 10.1038/s41467-019-09799-2

    View details for Web of Science ID 000471758500010

    View details for PubMedID 31209238

    View details for PubMedCentralID PMC6572829

  • Quantitative evidence for early metastatic seeding in colorectal cancer. Nature genetics Hu, Z., Ding, J., Ma, Z., Sun, R., Seoane, J. A., Scott Shaffer, J., Suarez, C. J., Berghoff, A. S., Cremolini, C., Falcone, A., Loupakis, F., Birner, P., Preusser, M., Lenz, H., Curtis, C. 2019

    Abstract

    Both the timing and molecular determinants of metastasis are unknown, hindering treatment and prevention efforts. Here we characterize the evolutionary dynamics of this lethal process by analyzing exome-sequencing data from 118 biopsies from 23 patients with colorectal cancer with metastases to the liver or brain. The data show that the genomic divergence between the primary tumor and metastasis is low and that canonical driver genes were acquired early. Analysis within a spatial tumor growth model and statistical inference framework indicates that early disseminated cells commonly (81%, 17 out of 21 evaluable patients) seed metastases while the carcinoma is clinically undetectable (typically, less than 0.01 cm³). We validated the association between early drivers and metastasis in an independent cohort of 2,751 colorectal cancers, demonstrating their utility as biomarkers of metastasis. This conceptual and analytical framework provides quantitative in vivo evidence that systemic spread can occur early in colorectal cancer and illuminates strategies for patient stratification and therapeutic targeting of the canonical drivers of tumorigenesis.

    View details for DOI 10.1038/s41588-019-0423-x

    View details for PubMedID 31209394

  • Dynamics of breast-cancer relapse reveal late-recurring ER-positive genomic subgroups. Nature Rueda, O. M., Sammut, S., Seoane, J. A., Chin, S., Caswell-Jin, J. L., Callari, M., Batra, R., Pereira, B., Bruna, A., Ali, H. R., Provenzano, E., Liu, B., Parisien, M., Gillett, C., McKinney, S., Green, A. R., Murphy, L., Purushotham, A., Ellis, I. O., Pharoah, P. D., Rueda, C., Aparicio, S., Caldas, C., Curtis, C. 2019

    Abstract

    The rates and routes of lethal systemic spread in breast cancer are poorly understood owing to a lack of molecularly characterized patient cohorts with long-term, detailed follow-up data. Long-term follow-up is especially important for those with oestrogen-receptor (ER)-positive breast cancers, which can recur up to two decades after initial diagnosis [1-6]. It is therefore essential to identify patients who have a high risk of late relapse [7-9]. Here we present a statistical framework that models distinct disease stages (locoregional recurrence, distant recurrence, breast-cancer-related death and death from other causes) and competing risks of mortality from breast cancer, while yielding individual risk-of-recurrence predictions. We apply this model to 3,240 patients with breast cancer, including 1,980 for whom molecular data are available, and delineate spatiotemporal patterns of relapse across different categories of molecular information (namely immunohistochemical subtypes; PAM50 subtypes, which are based on gene-expression patterns [10,11]; and integrative or IntClust subtypes, which are based on patterns of genomic copy-number alterations and gene expression [12,13]). We identify four late-recurring integrative subtypes, comprising about one quarter (26%) of tumours that are both positive for ER and negative for human epidermal growth factor receptor 2, each with characteristic tumour-driving alterations in genomic copy number and a high risk of recurrence (mean 47-62%) up to 20 years after diagnosis. We also define a subgroup of triple-negative breast cancers in which cancer rarely recurs after five years, and a separate subgroup in which patients remain at risk. Use of the integrative subtypes improves the prediction of late, distant relapse beyond what is possible with clinical covariates (nodal status, tumour size, tumour grade and immunohistochemical subtype). These findings highlight opportunities for improved patient stratification and biomarker-driven clinical trials.

    View details for PubMedID 30867590

  • Assessment of ERBB2/HER2 Status in HER2-Equivocal Breast Cancers by FISH and 2013/2014 ASCO-CAP Guidelines. JAMA oncology Press, M. F., Seoane, J. A., Curtis, C., Quinaux, E., Guzman, R., Sauter, G., Eiermann, W., Mackey, J. R., Robert, N., Pienkowski, T., Crown, J., Martin, M., Valero, V., Bee, V., Ma, Y., Villalobos, I., Slamon, D. J. 2018

    Abstract

    Importance: The 2013/2014 American Society of Clinical Oncology and College of American Pathologists (ASCO-CAP) guidelines for HER2 testing by fluorescence in situ hybridization (FISH) designated an "equivocal" category (average HER2 copies per tumor cell ≥4-6 with HER2/CEP17 ratio <2.0) to be resolved as negative or positive by assessments with alternative control probes. Approximately 4% to 12% of all invasive breast cancers are characterized as HER2-equivocal based on FISH.

    Objective: To evaluate the following hypotheses: (1) genetic loci used as alternative controls are heterozygously deleted in a substantial proportion of breast cancers; (2) use of these loci for assessment of HER2 by FISH leads to false-positive assessments; and (3) these HER2 false-positive breast cancer patients have outcomes that do not differ from clinical outcomes for patients with HER2-negative breast cancer.

    Design, Setting, and Participants: We retrospectively assessed the use of chromosome 17 p-arm and q-arm alternative control genomic sites (TP53, D17S122, SMS, RARA, TOP2A), as recommended by the 2013/2014 ASCO-CAP guidelines for HER2 testing, in patients whose data were available through Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and whose tissues were available through the Breast Cancer International Research Group clinical trials. We used data from an international cohort database of invasive breast cancers (1980 participants) and international clinical trial of adjuvant chemotherapy in invasive, node-positive breast cancer patients.

    Main Outcomes and Measures: The primary objectives were to (1) assess frequency of heterozygous deletions in chromosome 17 genomic sites used as FISH internal controls for evaluation of HER2 status among HER2-equivocal cancers; (2) characterize impact of using deleted sites for determination of HER2-to-internal-control-gene ratios; (3) assess HER2 protein expression in each subgroup; and (4) compare clinical outcomes for each subgroup.

    Results: Of the 1980 patients in METABRIC, 1915 patients were fully evaluated. In addition, 100 HER2-equivocal breast cancers by FISH and 100 comparator FISH-negative breast cancers from the BCIRG-005 trial were analyzed. Heterozygous deletions, particularly in specific p-arm sites, were common in both HER2-amplified and HER2-not-amplified breast cancers. Use of alternative control probes from these regions to assess HER2 by FISH in HER2-equivocal as well as HER2-not-amplified breast cancers resulted in high rates of false-positive ratios (HER2-to-alternative control ratio ≥2.0) owing to heterozygous deletions of control p-arm genomic sites used in ratio denominators. Misclassification of HER2 status was observed not only in breast cancers with ASCO-CAP equivocal status but also in breast cancers with an average of fewer than 4.0 HER2 copies per tumor cell when using alternative control probes.

    Conclusions and Relevance: The indiscriminate use of alternative control probes to calculate HER2 FISH ratios in HER2-equivocal breast cancers may lead to false-positive interpretations of HER2 status resulting from unrecognized heterozygous deletions in 1 or more of these alternative control genomic sites and incorrect HER2 ratio determinations.

    View details for PubMedID 30520947

  • The DLK1-DIO3 imprinted region regulates long-term proliferation in normal and malignant breast epithelium Zabala, M., Lobo, N. A., Seoane, J. A., Stelzer, Y., Luong, A. V., Isobe, T., Zarnegar, M. A., Watanabe, N., Antonana, S., Lam, J., Qian, D., Sikandar, S. S., Kuo, A. H., Heitink, L. S., Shimono, Y., Scheeren, F. A., Cai, S., Hisamori, S., Sahoo, D., Dirbas, F. M., Somlo, G., Jaenisch, R., Christina, C., Clarke, M. F. AMER ASSOC CANCER RESEARCH. 2018: 95
  • Comparative Molecular Analysis of Gastrointestinal Adenocarcinomas CANCER CELL Liu, Y., Sethi, N. S., Hinoue, T., Schneider, B. G., Cherniack, A. D., Sanchez-Vega, F., Seoane, J. A., Farshidfar, F., Bowlby, R., Islam, M., Kim, J., Chatila, W., Akbani, R., Kanchi, R. S., Rabkin, C. S., Willis, J. E., Wang, K. K., McCall, S. J., Mishra, L., Ojesina, A. I., Bullman, S., Pedamallu, C., Lazar, A. J., Sakai, R., Thorsson, V., Bass, A. J., Laird, P. W., Canc Genome Atlas Res Network 2018; 33 (4): 721-+

    Abstract

    We analyzed 921 adenocarcinomas of the esophagus, stomach, colon, and rectum to examine shared and distinguishing molecular characteristics of gastrointestinal tract adenocarcinomas (GIACs). Hypermutated tumors were distinct regardless of cancer type and comprised those enriched for insertions/deletions, representing microsatellite instability cases with epigenetic silencing of MLH1 in the context of CpG island methylator phenotype, plus tumors with elevated single-nucleotide variants associated with mutations in POLE. Tumors with chromosomal instability were diverse, with gastroesophageal adenocarcinomas harboring fragmented genomes associated with genomic doubling and distinct mutational signatures. We identified a group of tumors in the colon and rectum lacking hypermutation and aneuploidy termed genome stable and enriched in DNA hypermethylation and mutations in KRAS, SOX9, and PCBP1.

    View details for PubMedID 29622466

  • Mapping the in vivo fitness landscape of lung adenocarcinoma tumor suppression in mice NATURE GENETICS Rogers, Z. N., McFarland, C. D., Winters, I. P., Seoane, J. A., Brady, J. J., Yoon, S., Curtis, C., Petrov, D. A., Winslow, M. M. 2018; 50 (4): 483-+

    Abstract

    The functional impact of most genomic alterations found in cancer, alone or in combination, remains largely unknown. Here we integrate tumor barcoding, CRISPR/Cas9-mediated genome editing and ultra-deep barcode sequencing to interrogate pairwise combinations of tumor suppressor alterations in autochthonous mouse models of human lung adenocarcinoma. We map the tumor suppressive effects of 31 common lung adenocarcinoma genotypes and identify a landscape of context dependence and differential effect strengths.

    View details for PubMedID 29610476

  • Identification and validation of a novel drug target in an organoid model of esophageal cancer. Shukla, N., Salahudeen, A., de la O, S., Hart, D., Taylor, G., Zhu, J., Yuki, K., Seoane, J., Ma, Z., Ding, J., Han, K., Morgens, D., Bassik, M., Curtis, C., Kuo, C. AMER SOC CLINICAL ONCOLOGY. 2018
  • The chromatin accessibility landscape of primary human cancers. Science (New York, N.Y.) Corces, M. R., Granja, J. M., Shams, S., Louie, B. H., Seoane, J. A., Zhou, W., Silva, T. C., Groeneveld, C., Wong, C. K., Cho, S. W., Satpathy, A. T., Mumbach, M. R., Hoadley, K. A., Robertson, A. G., Sheffield, N. C., Felau, I., Castro, M. A., Berman, B. P., Staudt, L. M., Zenklusen, J. C., Laird, P. W., Curtis, C., Greenleaf, W. J., Chang, H. Y. 2018; 362 (6413)

    Abstract

    We present the genome-wide chromatin accessibility profiles of 410 tumor samples spanning 23 cancer types from The Cancer Genome Atlas (TCGA). We identify 562,709 transposase-accessible DNA elements that substantially extend the compendium of known cis-regulatory elements. Integration of ATAC-seq (the assay for transposase-accessible chromatin using sequencing) with TCGA multi-omic data identifies a large number of putative distal enhancers that distinguish molecular subtypes of cancers, uncovers specific driving transcription factors via protein-DNA footprints, and nominates long-range gene-regulatory interactions in cancer. These data reveal genetic risk loci of cancer predisposition as active DNA regulatory elements in cancer, identify gene-regulatory interactions underlying cancer immune evasion, and pinpoint noncoding mutations that drive enhancer activation and may affect patient survival. These results suggest a systematic approach to understanding the noncoding genome in cancer to advance diagnosis and therapy.

    View details for DOI 10.1126/science.aav1898

    View details for PubMedID 30361341

  • Genome co-amplification upregulates a mitotic gene network activity that predicts outcome and response to mitotic protein inhibitors in breast cancer (vol 18, pg 70, 2016) BREAST CANCER RESEARCH Hu, Z., Mao, J., Curtis, C., Huang, G., Gu, S., Heiser, L., Lenburg, M. E., Korkola, J. E., Bayani, N., Samarajiwa, S., Seoane, J. A., Dane, M. A., Esch, A., Feiler, H. S., Wang, N. J., Hardwicke, M., Laquerre, S., Jackson, J., Wood, K. W., Weber, B., Spellman, P. T., Aparicio, S., Wooster, R., Caldas, C., Gray, J. W. 2017; 19: 17

    View details for PubMedID 28183333

    View details for PubMedCentralID PMC5301377

  • Integrated genomic characterization of oesophageal carcinoma NATURE Kim, J., Bowlby, R., Mungall, A. J., Robertson, A. G., Odze, R. D., Cherniack, A. D., Shih, J., Pedamallu, C. S., Cibulskis, C., Dunford, A., Meier, S. R., Kim, J., Raphael, B. J., Wu, H., Wong, A. M., Willis, J. E., Bass, A. J., Derks, S., Garman, K., McCall, S. J., Wiznerowicz, M., Pantazi, A., Parfenov, M., Thorsson, V., Shmulevich, I., Dhankani, V., Miller, M., Sakai, R., Wang, K., Schultz, N., Shen, R., Arora, A., Weinhold, N., Sanchez-Vega, F., Kelsen, D. P., Zhang, J., Felau, I., Demchok, J., Rabkin, C. S., Camargo, M. C., Zenklusen, J. C., Bowen, J., Leraas, K., Lichtenberg, T. M., Curtis, C., Seoane, J. A., Ojesina, A. I., Beer, D. G., Gulley, M. L., Pennathur, A., Luketich, J. D., Zhou, Z., Weisenberger, D. J., Akbani, R., Lee, J., Liu, W., Mills, G. B., Zhang, W., Reid, B. J., Hinoue, T., Laird, P. W., Shen, H., Piazuelo, M. B., Schneider, B. G., McLellan, M., Taylor-Weiner, A., Cibulskis, C., Lawrence, M., Cibulskis, K., Stewart, C., Getz, G., Lander, E., Gabriel, S. B., Ding, L., McLellan, M. D., Miller, C. A., Appelbaum, E. L., Cordes, M. G., Fronick, C. C., Fulton, L. A., Mardis, E. R., Wilson, R. K., Schmidt, H. K., Fulton, R. S., Ally, A., Balasundaram, M., Bowlby, R., Carlsen, R., Chuah, E., Dhalla, N., Holt, R. A., Jones, S. J., Kasaian, K., Brooks, D., Li, H. I., Ma, Y., Marra, M. A., Mayo, M., Moore, R. A., Mungall, A. J., Mungall, K. L., Robertson, A. G., Schein, J. E., Sipahimalani, P., Tam, A., Thiessen, N., Wong, T., Cherniack, A. D., Shih, J., Pedamallu, C. S., Beroukhim, R., Bullman, S., Cibulskis, C., Murray, B. A., Saksena, G., Schumacher, S. E., Gabriel, S., Meyerson, M., Hadjipanayis, A., Kucherlapati, R., Pantazi, A., Parfenov, M., Ren, X., Park, P. J., Lee, S., Kucherlapati, M., Yang, L., Baylin, S. B., Hoadley, K. A., Weisenberger, D. J., Bootwalla, M. S., Lai, P. H., Van den Berg, D. 
J., Berrios, M., Holbrook, A., Akbani, R., Hwang, J., Jang, H., Liu, W., Weinstein, J. N., Lee, J., Lu, Y., Sohn, B. H., Mills, G., Seth, S., Protopopov, A., Bristow, C. A., Mahadeshwar, H. S., Tang, J., Song, X., Zhang, J., Laird, P. W., Hinoue, T., Shen, H., Cho, J., Defrietas, T., Frazer, S., Gehlenborg, N., Heiman, D. I., Lawrence, M. S., Lin, P., Meier, S. R., Noble, M. S., Doug Voet, D., Zhang, H., Kim, J., Polak, P., Saksena, G., Chin, L., Getz, G., Wong, A. M., Raphael, B. J., Wu, H., Lee, S., Park, P. J., Yang, L., Thorsson, V., Bernard, B., Iype, L., Miller, M., Reynolds, S. M., Shmulevich, I., Dhankani, V., Abeshouse, A., Arora, A., Armenia, J., Kundra, R., Ladanyi, M., Kjong-Van Lehmann, Gao, J., Sander, C., Schultz, N., Sanchez-Vega, F., Shen, R., Weinhold, N., Chakravarty, D., Zhang, H., Radenbaugh, A., Hegde, A., Akbani, R., Liu, W., Weinstein, J. N., Chin, L., Bristow, C. A., Lu, Y., Penny, R., Crain, D., Gardner, J., Curley, E., Mallery, D., Morris, S., Paulauskis, J., Shelton, T., Shelton, C., Bowen, J., Frick, J., Gastier-Foster, J. M., Gerken, M., Leraas, K. M., Lichtenberg, T. M., Ramirez, N. C., Wise, L., Zmuda, E., Tarvin, K., Saller, C., Park, Y. S., Button, M., Carvalho, A. L., Reis, R. M., Matsushita, M. M., Lucchesi, F., de Oliveira, A. T., Le, X., Paklina, O., Setdikova, G., Lee, J., Bennett, J., Iacocca, M., Huelsenbeck-Dill, L., Potapova, C. O., Voronina, O., Liu, O., Fulidou, V., Cates, C., Sharp, A., Behera, M., Force, S., Khuri, F., Owonikoko, T., Pickens, A., Ramalingam, S., Sica, G., Dinjens, W., van Nistelrooij, A., Wijnhoven, B., Sandusky, G., Stepa, S., Crain, D., Paulauskis, J., Penny, R., Gardner, J., Mallery, D., Morris, S., Shelton, T., Shelton, C., Curley, E., Juhl, I. H., Zornig, C., Kwon, S. Y., Kelsen, D., Kim, G. H., Bartlett, J., Parfitt, J., Chetty, R., Darling, G., Knox, J., Wong, R., El-Zimaity, H., Liu, G., Boussioutas, A., Park, D. Y., Kemp, R., Carlotti, C. G., da Cunha Tirapelli, D. P., Saggioro, F. 
P., Sankarankutty, A. K., Noushmehr, H., dos Santos, J. S., Trevisan, F. A., Eschbacher, J., Eschbacher, J., Dubina, M., Mozgovoy, E., Carey, F., Chalmers, S., Forgie, I., Godwin, A., Reilly, C., Madan, R., Naima, Z., Ferrer-Torres, D., Rathmell, W. K., Dhir, R., Luketich, J., Pennathur, A., Ajani, J. A., McCall, S. J., Janjigian, Y., Kelsen, D., Ladanyi, M., Tang, L., Camargo, M. C., Ajani, J. A., Cheong, J., Chudamani, S., Liu, J., Lolla, L., Naresh, R., Pihl, T., Sun, Q., Wan, Y., Wu, Y., Demchok, J. A., Felau, I., Ferguson, M. L., Shaw, K. R., Sheth, M., Tarnuzzer, R., Wang, Z., Yang, L., Zenklusen, J. C., Hutter, C. M., Sofia, H. J., Zhang, J. 2017; 541 (7636): 169-?

    Abstract

    Oesophageal cancers are prominent worldwide; however, there are few targeted therapies and survival rates for these cancers remain dismal. Here we performed a comprehensive molecular analysis of 164 carcinomas of the oesophagus derived from Western and Eastern populations. Beyond known histopathological and epidemiologic distinctions, molecular features differentiated oesophageal squamous cell carcinomas from oesophageal adenocarcinomas. Oesophageal squamous cell carcinomas resembled squamous carcinomas of other organs more than they did oesophageal adenocarcinomas. Our analyses identified three molecular subclasses of oesophageal squamous cell carcinomas, but none showed evidence for an aetiological role of human papillomavirus. Squamous cell carcinomas showed frequent genomic amplifications of CCND1 and SOX2 and/or TP63, whereas ERBB2, VEGFA and GATA4 and GATA6 were more commonly amplified in adenocarcinomas. Oesophageal adenocarcinomas strongly resembled the chromosomally unstable variant of gastric adenocarcinoma, suggesting that these cancers could be considered a single disease entity. However, some molecular features, including DNA hypermethylation, occurred disproportionally in oesophageal adenocarcinomas. These data provide a framework to facilitate more rational categorization of these tumours and a foundation for new therapies.

    View details for DOI 10.1038/nature20805

    View details for Web of Science ID 000396125500030

    View details for PubMedID 28052061

  • A p53 Super-tumor Suppressor Reveals a Tumor Suppressive p53-Ptpn14-Yap Axis in Pancreatic Cancer. Cancer cell Mello, S. S., Valente, L. J., Raj, N., Seoane, J. A., Flowers, B. M., McClendon, J., Bieging-Rolett, K. T., Lee, J., Ivanochko, D., Kozak, M. M., Chang, D. T., Longacre, T. A., Koong, A. C., Arrowsmith, C. H., Kim, S. K., Vogel, H., Wood, L. D., Hruban, R. H., Curtis, C., Attardi, L. D. 2017; 32 (4): 460-73.e6

    Abstract

    The p53 transcription factor is a critical barrier to pancreatic cancer progression. To unravel mechanisms of p53-mediated tumor suppression, which have remained elusive, we analyzed pancreatic cancer development in mice expressing p53 transcriptional activation domain (TAD) mutants. Surprisingly, the p53^53,54 TAD2 mutant behaves as a "super-tumor suppressor," with an enhanced capacity to both suppress pancreatic cancer and transactivate select p53 target genes, including Ptpn14. Ptpn14 encodes a negative regulator of the Yap oncoprotein and is necessary and sufficient for pancreatic cancer suppression, like p53. We show that p53 deficiency promotes Yap signaling and that PTPN14 and TP53 mutations are mutually exclusive in human cancers. These studies uncover a p53-Ptpn14-Yap pathway that is integral to p53-mediated tumor suppression.

    View details for PubMedID 29017057

  • Genome co-amplification upregulates a mitotic gene network activity that predicts outcome and response to mitotic protein inhibitors in breast cancer BREAST CANCER RESEARCH Hu, Z., Mao, J., Curtis, C., Huang, G., Gu, S., Heiser, L., Lenburg, M. E., Korkola, J. E., Bayani, N., Samarajiwa, S., Seoane, J. A., Dane, M. A., Esch, A., Feiler, H. S., Wang, N. J., Hardwicke, M. A., Laquerre, S., Jackson, J., Wood, K. W., Weber, B., Spellman, P. T., Aparicio, S., Wooster, R., Caldas, C., Gray, J. W. 2016; 18

    Abstract

    High mitotic activity is associated with the genesis and progression of many cancers. Small molecule inhibitors of mitotic apparatus proteins are now being developed and evaluated clinically as anticancer agents. With clinical trials of several of these experimental compounds underway, it is important to understand the molecular mechanisms that determine high mitotic activity, identify tumor subtypes that carry molecular aberrations that confer high mitotic activity, and to develop molecular markers that distinguish which tumors will be most responsive to mitotic apparatus inhibitors. We identified a coordinately regulated mitotic apparatus network by analyzing gene expression profiles for 53 malignant and non-malignant human breast cancer cell lines and two separate primary breast tumor datasets. We defined the mitotic network activity index (MNAI) as the sum of the transcriptional levels of the 54 coordinately regulated mitotic apparatus genes. The effect of those genes on cell growth was evaluated by small interfering RNA (siRNA). High MNAI was enriched in basal-like breast tumors and was associated with reduced survival duration and preferential sensitivity to inhibitors of the mitotic apparatus proteins, polo-like kinase, centromere associated protein E and aurora kinase designated GSK462364, GSK923295 and GSK1070916, respectively. Co-amplification of regions of chromosomes 8q24, 10p15-p12, 12p13, and 17q24-q25 was associated with the transcriptional upregulation of this network of 54 mitotic apparatus genes, and we identify transcription factors that localize to these regions and putatively regulate mitotic activity. Knockdown of the mitotic network by siRNA identified 22 genes that might be considered as additional therapeutic targets for this clinically relevant patient subgroup. We define a molecular signature which may guide therapeutic approaches for tumors with high mitotic network activity.

    View details for DOI 10.1186/s13058-016-0728-y

    View details for Web of Science ID 000378898900001

    View details for PubMedCentralID PMC4930593

  • Texture analysis in gel electrophoresis images using an integrative kernel-based approach SCIENTIFIC REPORTS Fernandez-Lozano, C., Seoane, J. A., Gestal, M., Gaunt, T. R., Dorado, J., Pazos, A., Campbell, C. 2016; 6

    Abstract

    Texture information could be used in proteomics to improve the quality of the image analysis of proteins separated on a gel. In order to evaluate the best technique to identify relevant textures, we use several different kernel-based machine learning techniques to classify proteins in 2-DE images into spot and noise. We evaluate the classification accuracy of each of these techniques with proteins extracted from ten 2-DE images of different types of tissues and different experimental conditions. We found that the best classification model was FSMKL, a data integration method using multiple kernel learning, which achieved AUROC values above 95% while using a reduced number of features. This technique allows us to increment the interpretability of the complex combinations of textures and to weight the importance of each particular feature in the final model. In particular the Inverse Difference Moment exhibited the highest discriminating power. A higher value can be associated with an homogeneous structure as this feature describes the homogeneity; the larger the value, the more symmetric. The final model is performed by the combination of different groups of textural features. Here we demonstrated the feasibility of combining different groups of textures in 2-DE image analysis for spot detection.

    View details for DOI 10.1038/srep19256

    View details for Web of Science ID 000368106100001

    View details for PubMedID 26758643

    View details for PubMedCentralID PMC4713050

  • RRegrs: an R package for computer-aided model selection with multiple regression models JOURNAL OF CHEMINFORMATICS Tsiliki, G., Munteanu, C. R., Seoane, J. A., Fernandez-Lozano, C., Sarimveis, H., Willighagen, E. L. 2015; 7: 46

    Abstract

    Predictive regression models can be created with many different modelling approaches. Choices need to be made for data set splitting, cross-validation methods, specific regression parameters and best model criteria, as they all affect the accuracy and efficiency of the produced predictive models, and therefore, raising model reproducibility and comparison issues. Cheminformatics and bioinformatics are extensively using predictive modelling and exhibit a need for standardization of these methodologies in order to assist model selection and speed up the process of predictive model development. A tool accessible to all users, irrespectively of their statistical knowledge, would be valuable if it tests several simple and complex regression models and validation schemes, produce unified reports, and offer the option to be integrated into more extensive studies. Additionally, such methodology should be implemented as a free programming package, in order to be continuously adapted and redistributed by others. We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated fully validated procedure which produces standardized reports to quickly oversee the impact of choices in modelling algorithms and assess the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, by reusing and extending on the caret package. The universality of the new methodology is demonstrated using five standard data sets from different scientific fields. Its efficiency in cheminformatics and QSAR modelling is shown with three use cases: proteomics data for surface-modified gold nanoparticles, nano-metal oxides descriptor data, and molecular descriptors for acute aquatic toxicity data. The results show that for all data sets RRegrs reports models with equal or better performance for both training and test sets than those reported in the original publications. Its good performance as well as its adaptability in terms of parameter optimization could make RRegrs a popular framework to assist the initial exploration of predictive models, and with that, the design of more comprehensive in silico screening applications. Graphical abstract: RRegrs is a computer-aided model selection framework for R multiple regression models; this is a fully validated procedure with application to QSAR modelling.

    View details for PubMedID 26379782

  • Texture classification using feature selection and kernel-based techniques SOFT COMPUTING Fernandez-Lozano, C., Seoane, J. A., Gestal, M., Gaunt, T. R., Dorado, J., Campbell, C. 2015; 19 (9): 2469-2480
  • Canonical Correlation Analysis for Gene-Based Pleiotropy Discovery PLOS COMPUTATIONAL BIOLOGY Seoane, J. A., Campbell, C., Day, I. M., Casas, J. P., Gaunt, T. R. 2014; 10 (10): e1003876

    Abstract

    Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and for testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.

    View details for PubMedID 25329069

    View details for PubMedCentralID PMC4199483

  • A pathway-based data integration framework for prediction of disease progression BIOINFORMATICS Seoane, J. A., Day, I. M., Gaunt, T. R., Campbell, C. 2014; 30 (6): 838-45

    Abstract

    Within medical research there is an increasing trend toward deriving multiple types of data from the same individual. The most effective prognostic prediction methods should use all available data, as this maximizes the amount of information used. In this article, we consider a variety of learning strategies to boost prediction performance based on the use of all available data. We consider data integration via the use of multiple kernel learning supervised learning methods. We propose a scheme in which feature selection by statistical score is performed separately per data type and by pathway membership. We further consider the introduction of a confidence measure for the class assignment, both to remove some ambiguously labeled datapoints from the training data and to implement a cautious classifier that only makes predictions when the associated confidence is high. We use the METABRIC dataset for breast cancer, with prediction of survival at 2000 days from diagnosis. Predictive accuracy is improved by using kernels that exclusively use those genes, as features, which are known members of particular pathways. We show that yet further improvements can be made by using a range of additional kernels based on clinical covariates such as Estrogen Receptor (ER) status. Using this range of measures to improve prediction performance, we show that the test accuracy on new instances is nearly 80%, though predictions are only made on 69.2% of the patient cohort. Availability: https://github.com/jseoane/FSMKL. Contact: J.Seoane@bristol.ac.uk. Supplementary data are available at Bioinformatics online.

    View details for PubMedID 24162466

    View details for PubMedCentralID PMC3957070

  • Breast density classification to reduce false positives in CADe systems COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE Vallez, N., Bueno, G., Deniz, O., Dorado, J., Antonio Seoane, J., Pazos, A., Pastor, C. 2014; 113 (2): 569-84

    Abstract

    This paper describes a novel weighted voting tree classification scheme for breast density classification. Breast parenchymal density is an important risk factor in breast cancer. Moreover, it is known that mammogram interpretation is more difficult when dense tissue is involved. Therefore, automated breast density classification may aid in breast lesion detection and analysis. Several classification methods have been compared and a novel hierarchical classification procedure of combined classifiers with linear discriminant analysis (LDA) is proposed as the best solution to classify the mammograms into the four BIRADS tissue classes. The classification scheme is based on 298 texture features. Statistical analysis to test the normality and homoscedasticity of the data was carried out for feature selection. Thus, only features that are influenced by the tissue type were considered. The novel classification techniques have been incorporated into a CADe system to drive the detection algorithms and tested with 1459 images. The results obtained on the 322 screen-film mammograms (SFM) of the mini-MIAS dataset show that 99.75% of samples were correctly classified. On the 1137 full-field digital mammograms (FFDM) dataset results show 91.58% agreement. The results of the lesion detection algorithms were obtained from modules integrated within the CADe system developed by the authors and show that using breast tissue classification prior to lesion detection leads to an improvement of the detection results. The tools enhance the detectability of lesions and they are able to distinguish their local attenuation without local tissue density constraints.

    View details for DOI 10.1016/j.cmpb.2013.10.004

    View details for Web of Science ID 000330137600014

    View details for PubMedID 24286729

  • An artificial neural network improves the non-invasive diagnosis of significant fibrosis in HIV/HCV coinfected patients JOURNAL OF INFECTION Resino, S., Antonio Seoane, J., Maria Bellon, J., Dorado, J., Martin-Sanchez, F., Alvarez, E., Cosin, J., Carlos Lopez, J., Lopez, G., Miralles, P., Berenguer, J. 2011; 62 (1): 77-86

    Abstract

    To develop an artificial neural network to predict significant fibrosis (F≥2) (ANN-SF) in HIV/Hepatitis C (HCV) coinfected patients using clinical data derived from peripheral blood. Patients were randomly divided into an estimation group (217 cases) used to generate the ANN and a test group (145 cases) used to confirm its power to predict F≥2. Liver fibrosis was estimated according to the METAVIR score. The values of the area under the receiver operating characteristic curve (AUC-ROC) of the ANN-SF were 0.868 in the estimation set and 0.846 in the test set. In the estimation set, with a cut-off value of <0.35 to predict the absence of F≥2, the sensitivity (Se), specificity (Sp), and positive (PPV) and negative predictive values (NPV) were 94.1%, 41.8%, 66.3% and 85.4% respectively. Furthermore, with a cut-off value of >0.75 to predict the presence of F≥2, the ANN-SF provided Se, Sp, PPV and NPV of 53.8%, 94.9%, 92.8% and 62.8% respectively. In the test set, with a cut-off value of <0.35 to predict the absence of F≥2, the Se, Sp, PPV and NPV were 91.8%, 51.7%, 72.9% and 81.6% respectively. Furthermore, with a cut-off value of >0.75 to predict the presence of F≥2, the ANN-SF provided Se, Sp, PPV and NPV of 43.5%, 96.7%, 94.9% and 54.7% respectively. The ANN-SF accurately predicted significant fibrosis and outperformed other simple non-invasive indices for HIV/HCV coinfected patients. Our data suggest that ANN may be a helpful tool for guiding therapeutic decisions in clinical practice concerning HIV/HCV coinfection.

    View details for DOI 10.1016/j.jinf.2010.11.003

    View details for Web of Science ID 000286456600011

    View details for PubMedID 21073895
