DataMed is a prototype biomedical data search engine based on the Data Discovery Index (DDI) developed by the biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) team. The DDI indexes data that are stored elsewhere, with the goal of making datasets discoverable across data repositories and data aggregators. DataMed supports the NIH-endorsed FAIR principles of Findability, Accessibility, Interoperability and Reusability of datasets; its current functionality helps users find datasets and provides access information about them.

Encyclopedia of DNA Elements (ENCODE)

The Encyclopedia of DNA Elements (ENCODE) project has been producing data for over a decade to investigate DNA- and RNA-binding proteins, chromatin structure, transcriptional activity and DNA methylation in a variety of human and model organism tissues and cell lines. The ENCODE Data Coordination Center (DCC) has built a representational state transfer application programming interface (REST API) that returns JSON (JavaScript Object Notation) objects, which gives programmatic access to the same ENCODE experimental metadata shown in the web portal. Metadata can be retrieved and data can be searched at the ENCODE website by sending an HTTP request from a script or with the curl command. The portal also allows the metadata to be filtered through search URLs. This system lets external researchers write their own interfaces to access, analyze and visualize the ENCODE data. It also facilitates the integration of other large-scale datasets, such as REMC, modENCODE, modERN and GGR, with the ENCODE data. Data from the ENCODE project can be accessed via the ENCODE portal and documentation for the REST API can be accessed here.
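As an illustration of the script-based HTTP access and search URL filtering described above, here is a minimal Python sketch using only the standard library. The portal base URL, the `/search/` path, the `format=json` parameter and the helper names `build_search_url` and `fetch_json` are assumptions made for this example and are not taken from the text; check the REST API documentation for the authoritative details.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: base URL of the ENCODE portal (not stated in the text above).
ENCODE_BASE_URL = "https://www.encodeproject.org"


def build_search_url(filters, base_url=ENCODE_BASE_URL):
    """Build a search URL from (field, value) filter pairs, requesting JSON.

    Hypothetical helper: the "/search/" path and "format=json" parameter
    are assumptions about how the portal exposes filtered searches.
    """
    params = list(filters) + [("format", "json")]
    return f"{base_url}/search/?{urlencode(params)}"


def fetch_json(url, timeout=30):
    """Send an HTTP GET request and decode the JSON response body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `fetch_json(build_search_url([("type", "Experiment"), ("limit", "5")]))` would then return the search result as a Python dictionary, assuming the portal is reachable and accepts these filter names. Keeping URL construction separate from the network call makes the filter logic easy to check without sending a request.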

Schema profiles can be found on the site at *.json, where * is replaced by the name of the object of interest. A complete listing of all the current schemas can be found on the DCC GitHub.

ISB-Cancer Genomics Cloud Platform

The Institute for Systems Biology - Cancer Genomics Cloud (ISB-CGC) platform hosts the majority of the TCGA dataset, as well as other reference and annotation datasets, across the appropriate Google Cloud technologies. Documentation for the ISB-CGC is available on Read the Docs. Anyone can sign in to the ISB-CGC web app and access the open-access data resources in BigQuery and the example code on GitHub. To access controlled-access TCGA data hosted by the ISB-CGC, users need to be authenticated and authorized through the web app.

mPower Public Researcher Portal

mPower is a mobile application-based study piloting new approaches to monitoring key indicators of Parkinson disease progression and diagnosis by supplementing traditional behavioral symptom measurements with metrics gleaned from sensor-rich mobile devices. The goal of this study is to understand the frequency and degree of variation of patient symptoms, the sources of those variations, and the potential modulators of those variations.

This study is sponsored by Sage Bionetworks, a 501(c)(3) nonprofit research organization, with funding from the Robert Wood Johnson Foundation. Sage Bionetworks has made these data available to citizen scientists. Please check the mPower wiki for details.

NASA GeneLab

NASA has been conducting biological experiments on the International Space Station. To share these data broadly, NASA has established its GeneLab group and website. This data repository holds NASA's space biology datasets, including experimental metadata, and is easy to search and access. Details of NASA's Human Research Program are also available on its website, although access to the human data is not open because of privacy concerns for the astronauts.

You can find out more about NASA research opportunities by visiting the NASA Solicitation and Proposal Integrated Review and Evaluation System (NSPIRES), and you can browse past research activities by following the link here. NASA chose the Center for the Advancement of Science in Space (CASIS) to be the sole manager of the International Space Station U.S. National Laboratory. A good place for researchers to find out more about CASIS is here.

Population Health Sciences

The Stanford Center for Population Health Sciences (PHS) is making certain datasets available to the Stanford research community. These currently include Truven, IPUM, Optum, UK BioBank and CMS. More data will become available over time.

The mission of PHS is to improve individual and population health by bringing together diverse disciplines and data to understand and address social, environmental, behavioral, and biological factors.

Stanford Center for Population Health Sciences