Bioinformatics

In the past three years alone, the Jackson Lab has conducted over 100 affinity purification/mass spectrometry (AP/MS) experiments (LINK to my AP/MS page). We have developed a new computational method for identifying and curating hits from these experiments, which we currently call HANDI (Historical AP/MS NSAF Distribution Identification). In brief, we compare the AP/MS experiments conducted in a given cell line with one another, automatically filter outliers, and infer a null distribution for every observed gene. The signal associated with each protein in a new experiment is then tested against the null distribution for the corresponding gene.
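The steps above can be illustrated with a minimal sketch. It is not the HANDI implementation: the function names, the log transform with a pseudocount, the median-absolute-deviation outlier filter, the normal null model, and the one-sided test are all assumptions chosen for illustration. The input is assumed to be the NSAF (normalized spectral abundance factor) values observed for one gene across historical experiments in the same cell line.

```python
import math
import statistics


def filter_outliers(values, max_dev=3.0):
    """Drop values more than max_dev scaled MADs from the median.

    Hypothetical filter; the source does not say how outliers are detected.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return list(values)  # no spread to judge against, keep everything
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [v for v in values if abs(v - med) / scale <= max_dev]


def null_p_value(historical_nsaf, new_nsaf, pseudocount=1e-6):
    """One-sided p-value for a new NSAF value against a gene's history.

    Assumes log-NSAF is roughly normal under the null (an assumption).
    Returns None when there is too little history to fit a distribution.
    """
    logs = [math.log(v + pseudocount) for v in historical_nsaf]
    kept = filter_outliers(logs)
    if len(kept) < 2:
        return None
    mu = statistics.mean(kept)
    sigma = statistics.stdev(kept)
    if sigma == 0:
        return None
    z = (math.log(new_nsaf + pseudocount) - mu) / sigma
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))


# Illustrative values only: five background-level observations and one
# historical outlier (0.05) that the filter should discard.
history = [0.001, 0.0012, 0.0009, 0.0011, 0.001, 0.05]
p_enriched = null_p_value(history, 0.02)
p_background = null_p_value(history, 0.001)
```

In this toy example the enriched value yields a small p-value, while a value at the background level does not. A real analysis would need a null model validated against the actual distribution of NSAF values.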

A single experiment can produce dozens of hits with p < 0.05 after FDR correction. Some of these bind the bait directly and specifically, while others may bind it through an intermediate molecule. To aid curation of these data, we incorporate genetic and physical interaction data from many sources, including BioGRID, TCGA, and IMPC, to identify functional modules, both curated and uncurated, in our datasets. We also incorporate sequence, domain, and publication data gathered directly from NCBI to identify genes frequently studied in the same context, as well as sequence features enriched in our datasets that may be determinants of function. The result is an annotated pathway map that we use to propose testable hypotheses about cellular processes. An online client for these tools is under development.
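For readers unfamiliar with the FDR correction mentioned above, the following is a minimal sketch of one common procedure, Benjamini-Hochberg. The text does not state which FDR method is used, so this choice, the function name, and the example p-values are assumptions for illustration only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over its own rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative p-values for four proteins in one hypothetical experiment.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
hits = [i for i, q in enumerate(adjusted) if q < 0.05]
```

Three of the four raw p-values here fall below 0.05, but only the first protein remains a hit after adjustment, which shows why the correction matters when many proteins are tested in a single experiment.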