Professional Education

  • Bachelor of Science, Universita Degli Studi Di Roma (2004)
  • Master of Science, Universita Degli Studi Di Roma (2006)
  • Doctor of Philosophy, Universita Degli Studi Di Roma (2009)

All Publications

  • An interactive reference framework for modeling a dynamic immune system SCIENCE Spitzer, M. H., Gherardini, P. F., Fragiadakis, G. K., Bhattacharya, N., Yuan, R. T., Hotson, A. N., Finck, R., Carmi, Y., Zunder, E. R., Fantl, W. J., Bendall, S. C., Engleman, E. G., Nolan, G. P. 2015; 349 (6244): 155


    Immune cells function in an interacting hierarchy that coordinates the activities of various cell types according to genetic and environmental contexts. We developed graphical approaches to construct an extensible immune reference map from mass cytometry data of cells from different organs, incorporating landmark cell populations as flags on the map to compare cells from distinct samples. The maps recapitulated canonical cellular phenotypes and revealed reproducible, tissue-specific deviations. The approach revealed influences of genetic variation and circadian rhythms on immune system structure, enabled direct comparisons of murine and human blood cell phenotypes, and even enabled archival fluorescence-based flow cytometry data to be mapped onto the reference framework. This foundational reference map provides a working definition of systemic immune organization to which new data can be integrated to reveal deviations driven by genetics, environment, or pathology.

    View details for DOI 10.1126/science.1259425

    View details for PubMedID 26160952

  • Experimental and computational methods for the analysis and modeling of signaling networks NEW BIOTECHNOLOGY Gherardini, P. F., Helmer-Citterich, M. 2013; 30 (3): 327-332


    External cues are processed and integrated by signal transduction networks that drive appropriate cellular responses. Characterizing these programs, as well as how their deregulation leads to disease, is crucial for our understanding of cell biology. The past ten years have witnessed a gradual increase in the number of molecular parameters that can be simultaneously measured in a sample. Moreover, our capacity to handle multiple samples in parallel has expanded, thus allowing a deeper profiling of cellular states under diverse experimental conditions. These technological advances have been complemented by the development of computational methods aimed at mining, analyzing and modeling these data. In this review we give a general overview of the most important experimental and computational techniques used in the field and describe several interesting applications of these methodologies. We conclude by highlighting the issues that we think will keep researchers in the field busy in the next few years.

    View details for DOI 10.1016/j.nbt.2012.11.007

    View details for Web of Science ID 000317240000012

    View details for PubMedID 23165097

  • Exploring the diversity of SPRY/B30.2-mediated interactions TRENDS IN BIOCHEMICAL SCIENCES Perfetto, L., Gherardini, P. F., Davey, N. E., Diella, F., Helmer-Citterich, M., Cesareni, G. 2013; 38 (1): 38-46


    The SPla/Ryanodine receptor (SPRY)/B30.2 domain is one of the most common folds in higher eukaryotes. The human genome encodes 103 SPRY/B30.2 domains, several of which are involved in the immune response. Approximately 45% of human SPRY/B30.2-containing proteins are E3 ligases. The role and function of the majority of SPRY/B30.2 domains are still poorly understood; however, in several cases mutations in this domain have been linked to congenital disorders. The recent characterization of SPRY/B30.2-mediated protein interactions has provided evidence for a role of this domain as an adaptor module to assemble macromolecular complexes, analogous to Src homology (SH)2, SH3, and WW domains. However, functional and structural evidence suggests that SPRY/B30.2 is a more versatile fold, allowing a wide range of binding modes.

    View details for DOI 10.1016/j.tibs.2012.10.001

    View details for Web of Science ID 000314143700006

    View details for PubMedID 23164942

  • Identification of Nucleotide-Binding Sites in Protein Structures: A Novel Approach Based on Nucleotide Modularity PLOS ONE Parca, L., Gherardini, P. F., Truglio, M., Mangone, I., Ferre, F., Helmer-Citterich, M., Ausiello, G. 2012; 7 (11)


    Nucleotides are involved in several cellular processes, ranging from the transmission of genetic information to energy transfer and storage. Both sequence- and structure-based methods have been developed to predict the location of nucleotide-binding sites in proteins. Here we propose a novel methodology that leverages the observation that nucleotide-binding sites have a modular structure. Nucleotides are composed of identifiable fragments, i.e. the phosphate, the nucleobase and the carbohydrate moieties. These fragments are bound by specific structural motifs that recur in proteins of different fold. Moreover, these motifs behave as modules and are found in different combinations across fold space. Our method predicts binding sites for each nucleotide fragment by comparing a query protein with a database of templates extracted from proteins of known structure. Whenever a similarity is found, the fragment bound by the template is transferred onto the query protein, thus identifying a putative binding site. Predictions falling inside the surface of the protein are discarded, and the remaining ones are scored using clustering and conservation. The method ranks a correct prediction first in 48%, 48% and 68% of the analyzed proteins for the nucleobase, carbohydrate and phosphate, respectively; when the first five predictions are considered, these figures change to 71%, 65% and 86%, respectively. Furthermore, we attempted to reconstruct the full structure of the binding site, starting from the predicted positions of the fragments. We calculated that in 59% of the analyzed proteins the method ranks first a reconstructed binding site or a part of it. Finally, we tested the reliability of our method in a real-world case in which it has to predict nucleotide-binding sites in unbound proteins. We analyzed proteins whose structure has been solved with and without the nucleotide and observed only small variations in the method's performance.

    View details for DOI 10.1371/journal.pone.0050240

    View details for Web of Science ID 000311885800056

    View details for PubMedID 23209685

  • Mapping the human phosphatome on growth pathways MOLECULAR SYSTEMS BIOLOGY Sacco, F., Gherardini, P. F., Paoluzi, S., Saez-Rodriguez, J., Helmer-Citterich, M., Ragnini-Wilson, A., Castagnoli, L., Cesareni, G. 2012; 8


    Large-scale siRNA screenings allow linking the function of poorly characterized genes to phenotypic readouts. According to this strategy, genes are associated with a function of interest if the alteration of their expression perturbs the phenotypic readouts. However, given the intricacy of the cell regulatory network, the mapping procedure is low resolution and the resulting models provide little mechanistic insights. We have developed a new strategy that combines multiparametric analysis of cell perturbation with logic modeling to achieve a more detailed functional mapping of human genes onto complex pathways. A literature-derived optimized model is used to infer the cell activation state following upregulation or downregulation of the model entities. By matching this signature with the experimental profile obtained in the high-throughput siRNA screening it is possible to infer the target of each protein, thus defining its 'entry point' in the network. By this novel approach, 41 phosphatases that affect key growth pathways were identified and mapped onto a human epithelial cell-specific growth model, thus providing insights into the mechanisms underlying their function.

    View details for DOI 10.1038/msb.2012.36

    View details for Web of Science ID 000308316400003

    View details for PubMedID 22893001

  • What has proteomics taught us about Leishmania development? PARASITOLOGY Tsigankov, P., Gherardini, P. F., Helmer-Citterich, M., Zilberstein, D. 2012; 139 (9): 1146-1157


    Leishmania are obligatory intracellular parasitic protozoa that cycle between sand fly mid-gut and phagolysosomes of mammalian macrophages. They have developed genetically programmed changes in gene and protein expression that enable rapid optimization of cell function according to vector and host environments. During the last two decades, host-free systems that mimic intra-lysosomal environments have been devised in which promastigotes differentiate into amastigotes axenically. These cultures have facilitated detailed investigation of the molecular mechanisms underlying Leishmania development inside its host. Axenic promastigotes and amastigotes have been subjected to transcriptome and proteomic analyses. Development had appeared somewhat variable but was revealed by proteomics to be strictly coordinated and regulated. Here we summarize the current understanding of Leishmania promastigote to amastigote differentiation, highlighting the data generated by proteomics.

    View details for DOI 10.1017/S0031182012000157

    View details for Web of Science ID 000308382200005

    View details for PubMedID 22369930

  • Identification of binding pockets in protein structures using a knowledge-based potential derived from local structural similarities Bianchi, V., Gherardini, P. F., Helmer-Citterich, M., Ausiello, G. BIOMED CENTRAL LTD. 2012


    The identification of ligand binding sites is a key task in the annotation of proteins with known structure but uncharacterized function. Here we describe a knowledge-based method exploiting the observation that unrelated binding sites share small structural motifs that bind the same chemical fragments irrespective of the nature of the ligand as a whole. PDBinder compares a query protein against a library of binding and non-binding protein surface regions derived from the PDB. The results of the comparison are used to derive a propensity value for each residue which is correlated with the likelihood that the residue is part of a ligand binding site. The method was applied to two different problems: i) the prediction of ligand binding residues and ii) the identification of which surface cleft harbours the binding site. In both cases PDBinder performed consistently better than existing methods. PDBinder has been trained on a non-redundant set of 1356 high-quality protein-ligand complexes and tested on a set of 239 holo and apo complex pairs. We obtained an MCC of 0.313 on the holo set with a PPV of 0.413 while on the apo set we achieved an MCC of 0.271 and a PPV of 0.372. We show that PDBinder performs better than existing methods. The good performance on the unbound proteins is extremely important for real-world applications where the location of the binding site is unknown. Moreover, since our approach is orthogonal to those used in other programs, the PDBinder propensity value can be integrated in other algorithms further increasing the final performance.

    View details for DOI 10.1186/1471-2105-13-S4-S17

    View details for Web of Science ID 000303936400018

    View details for PubMedID 22536963
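
    The MCC (Matthews correlation coefficient) and PPV (positive predictive value) quoted in the abstract above are standard per-residue classification metrics. As a minimal sketch, assuming each residue is labelled as binding or non-binding and predictions are tallied into confusion-matrix counts, they can be computed as follows. The function names and the counts in the usage lines are illustrative only and are not taken from the paper.

```python
import math


def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: fraction of predicted binding residues that are correct."""
    total = tp + fp
    return tp / total if total else 0.0


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal is empty (the usual convention
    when the denominator would otherwise be zero).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Illustrative counts only (hypothetical, not from the PDBinder benchmark).
example_ppv = ppv(tp=30, fp=70)
example_mcc = mcc(tp=30, fp=70, tn=880, fn=20)
```

    MCC is often preferred over accuracy for this kind of task because binding residues are a small minority of all residues, so a predictor that labels everything as non-binding would still score a high accuracy but an MCC of zero.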

  • B-Pred, a structure based B-cell epitopes prediction server. Advances and applications in bioinformatics and chemistry : AABC Giacò, L., Amicosante, M., Fraziano, M., Gherardini, P. F., Ausiello, G., Helmer-Citterich, M., Colizzi, V., Cabibbo, A. 2012; 5: 11-21


    The ability to predict immunogenic regions in selected proteins by in-silico methods has broad implications, such as allowing a quick selection of potential reagents to be used as diagnostics, vaccines, immunotherapeutics, or research tools in several branches of biological and biotechnological research. However, the prediction of antibody target sites in proteins using computational methodologies has proven to be a highly challenging task, which is likely due to the somewhat elusive nature of B-cell epitopes. This paper proposes a web-based platform for scoring potential immunological reagents based on the structures or 3D models of the proteins of interest. The method scores a protein's peptides set, which is derived from a sliding window, based on the average solvent exposure, with a filter on the average local model quality for each peptide. The platform was validated on a custom-assembled database of 1336 experimentally determined epitopes from 106 proteins for which a reliable 3D model could be obtained through standard modeling techniques. Despite showing poor sensitivity, this method can achieve a specificity of 0.70 and a positive predictive value of 0.29 by combining these two simple parameters. These values are slightly higher than those obtained with other established sequence-based or structure-based methods that have been evaluated using the same epitopes dataset. This method is implemented in a web server called B-Pred, which is accessible at The server contains a number of original features that allow users to perform personalized reagent searches by manipulating the sliding window's width and sliding step, changing the exposure and model quality thresholds, and running sequential queries with different parameters. The B-Pred server should assist experimentalists in the rational selection of epitope antigens for a wide range of applications.

    View details for DOI 10.2147/AABC.S30620

    View details for PubMedID 22888263
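
    The scoring scheme described in the abstract above (a sliding window over the sequence, scored by average solvent exposure and filtered by average local model quality) can be sketched roughly as follows. This is an illustration under stated assumptions, not the B-Pred implementation: the function name, the per-residue input lists, the default window width, step and quality threshold are all hypothetical.

```python
from typing import List, Tuple


def score_peptides(
    exposure: List[float],
    quality: List[float],
    width: int = 5,            # hypothetical default window width
    step: int = 1,             # hypothetical default sliding step
    min_quality: float = 0.5,  # hypothetical model-quality threshold
) -> List[Tuple[int, float]]:
    """Return (start_index, mean_exposure) for each window that passes the quality filter.

    `exposure` and `quality` hold one value per residue. Windows whose mean
    quality falls below `min_quality` are dropped. Results are sorted from
    most to least exposed.
    """
    if len(exposure) != len(quality):
        raise ValueError("exposure and quality must have the same length")
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")

    results = []
    for start in range(0, len(exposure) - width + 1, step):
        window_quality = quality[start:start + width]
        if sum(window_quality) / width < min_quality:
            continue  # skip peptides modelled with low local confidence
        window_exposure = exposure[start:start + width]
        results.append((start, sum(window_exposure) / width))

    results.sort(key=lambda item: item[1], reverse=True)
    return results


# Toy per-residue values (made up for illustration).
ranked = score_peptides([0.0, 1.0, 1.0, 0.0], [0.0, 0.0, 1.0, 1.0], width=2)
```

    Exposing the width, step and thresholds as parameters mirrors the abstract's description of letting users rerun queries with different settings.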

  • PhosTryp: a phosphorylation site predictor specific for parasitic protozoa of the family trypanosomatidae BMC GENOMICS Palmeri, A., Gherardini, P. F., Tsigankov, P., Ausiello, G., Spaeth, G. F., Zilberstein, D., Helmer-Citterich, M. 2011; 12


    Protein phosphorylation modulates protein function in organisms at all levels of complexity. Parasites of the Leishmania genus undergo various developmental transitions in their life cycle triggered by changes in the environment. The molecular mechanisms that these organisms use to process and integrate these external cues are largely unknown. However, Leishmania lacks transcription factors, therefore most regulatory processes may occur at a post-translational level and phosphorylation has recently been demonstrated to be an important player in this process. Experimental identification of phosphorylation sites is a time-consuming task. Moreover, some sites could be missed due to the highly dynamic nature of this process or to difficulties in phospho-peptide enrichment. Here we present PhosTryp, a phosphorylation site predictor specific for trypanosomatids. This method uses an SVM-based approach and has been trained with recent Leishmania phosphoproteomics data. PhosTryp achieved a 17% improvement in prediction performance compared with Netphos, a non organism-specific predictor. The analysis of the peptides correctly predicted by our method but missed by Netphos demonstrates that PhosTryp captures Leishmania-specific phosphorylation features. More specifically, our results show that Leishmania kinases have sequence specificities which are different from their counterparts in higher eukaryotes. Consequently we were able to propose two possible Leishmania-specific phosphorylation motifs. We further demonstrate that this improvement in performance extends to the related trypanosomatids Trypanosoma brucei and Trypanosoma cruzi. Finally, in order to maximize the usefulness of PhosTryp, we trained a predictor combining all the peptides from L. infantum, T. brucei and T. cruzi. Our work demonstrates that training on organism-specific data results in an improvement that extends to related species. PhosTryp is freely available at

    View details for DOI 10.1186/1471-2164-12-614

    View details for Web of Science ID 000300735700001

    View details for PubMedID 22182631

  • Adaptation of a 2D in-gel kinase assay to trace phosphotransferase activities in the human pathogen Leishmania donovani JOURNAL OF PROTEOMICS Schmidt-Arras, D., Leclercq, O., Gherardini, P. F., Helmer-Citterich, M., Faigle, W., Loew, D., Spaeth, G. F. 2011; 74 (9): 1644-1651


    The protozoan parasite Leishmania donovani undergoes various developmental transitions during its infectious cycle that are triggered by environmental signals encountered inside insect and vertebrate hosts. Intracellular differentiation of the pathogenic amastigote stage is induced by pH and temperature shifts that affect protein kinase activities and downstream protein phosphorylation. Identification of parasite proteins with phosphotransferase activity during intracellular infection may reveal new targets for pharmacological intervention. Here we describe an improved protocol to trace this activity in L. donovani extracts at high resolution combining in-gel kinase assay and two-dimensional gel electrophoresis. This 2D procedure allowed us to identify proteins that are associated with amastigote ATP-binding, ATPase, and phosphotransferase activities. The 2D in-gel kinase assay, in combination with recombinant phospho-protein substrates previously identified by phospho-proteomics analyses, provides a novel tool to establish specific protein kinase-substrate relationships thus improving our understanding of Leishmania signal transduction with relevance for future drug development.

    View details for DOI 10.1016/j.jprot.2011.03.024

    View details for Web of Science ID 000295149000014

    View details for PubMedID 21443974

  • Phosfinder: a web server for the identification of phosphate-binding sites on protein structures NUCLEIC ACIDS RESEARCH Parca, L., Mangone, I., Gherardini, P. F., Ausiello, G., Helmer-Citterich, M. 2011; 39: W278-W282


    Phosfinder is a web server for the identification of phosphate binding sites in protein structures. Phosfinder uses a structural comparison algorithm to scan a query structure against a set of known 3D phosphate binding motifs. Whenever a structural similarity between the query protein and a phosphate binding motif is detected, the phosphate bound by the known motif is added to the protein structure thus representing a putative phosphate binding site. Predicted binding sites are then evaluated according to (i) their position with respect to the query protein solvent-excluded surface and (ii) the conservation of the binding residues in the protein family. The server accepts as input either the PDB code of the protein to be analyzed or a user-submitted structure in PDB format. All the search parameters are user modifiable. Phosfinder outputs a list of predicted binding sites with detailed information about their structural similarity with known phosphate binding motifs, and the conservation of the residues involved. A graphical applet allows the user to visualize the predicted binding sites on the query protein structure. The results on a set of 52 apo/holo structure pairs show that the performance of our method is largely unaffected by ligand-induced conformational changes. Phosfinder is available at

    View details for DOI 10.1093/nar/gkr389

    View details for Web of Science ID 000292325300045

    View details for PubMedID 21622655

  • Phosphate binding sites identification in protein structures NUCLEIC ACIDS RESEARCH Parca, L., Gherardini, P. F., Helmer-Citterich, M., Ausiello, G. 2011; 39 (4): 1231-1242


    Nearly half of known protein structures interact with phosphate-containing ligands, such as nucleotides and other cofactors. Many methods have been developed for the identification of metal ion-binding sites and some for bigger ligands such as carbohydrates, but none is yet available for the prediction of phosphate-binding sites. Here we describe Pfinder, a method that predicts binding sites for phosphate groups, either in the form of ions or as parts of other non-peptide ligands, in proteins of known structure. Pfinder uses the Query3D local structural comparison algorithm to scan a protein structure for the presence of a number of structural motifs identified for their ability to bind the phosphate chemical group. Pfinder has been tested on a data set of 52 proteins for which both the apo and holo forms were available. We obtained at least one correct prediction in 63% of the holo structures and in 62% of the apo structures. The ability of Pfinder to recognize a phosphate-binding site in unbound protein structures makes it an ideal tool for functional annotation and for complementing docking and drug design methods. The Pfinder program is available at

    View details for DOI 10.1093/nar/gkq987

    View details for Web of Science ID 000288019400013

    View details for PubMedID 20974634

  • Phospho3D 2.0: an enhanced database of three-dimensional structures of phosphorylation sites NUCLEIC ACIDS RESEARCH Zanzoni, A., Carbajo, D., Diella, F., Gherardini, P. F., Tramontano, A., Helmer-Citterich, M., Via, A. 2011; 39: D268-D271


    Phospho3D is a database of three-dimensional (3D) structures of phosphorylation sites (P-sites) derived from the Phospho.ELM database, which also collects information on the residues surrounding the P-site in space (3D zones). The database also provides the results of a large-scale structural comparison of the 3D zones versus a representative dataset of structures, thus associating to each P-site a number of structurally similar sites. The new version of Phospho3D presents an 11-fold increase in the number of 3D sites and incorporates several additional features, including new structural descriptors, the possibility of selecting non-redundant sets of 3D structures and the availability for download of non-redundant sets of structurally annotated P-sites. Moreover, it features P3Dscan, a new functionality that allows the user to submit a protein structure and scan it against the 3D zones collected in the Phospho3D database. Phospho3D version 2.0 is available at:

    View details for DOI 10.1093/nar/gkq936

    View details for Web of Science ID 000285831700045

    View details for PubMedID 20965970

  • Identification of Leishmania-specific protein phosphorylation sites by LC-ESI-MS/MS and comparative genomics analyses PROTEOMICS Hem, S., Gherardini, P. F., Osorio y Fortea, J., Hourdel, V., Morales, M. A., Watanabe, R., Pescher, P., Kuzyk, M. A., Smith, D., Borchers, C. H., Zilberstein, D., Helmer-Citterich, M., Namane, A., Spaeth, G. F. 2010; 10 (21): 3868-3883


    Human pathogenic protozoa of the genus Leishmania undergo various developmental transitions during the infectious cycle that are triggered by changes in the host environment. How these parasites sense, transduce, and respond to these signals is only poorly understood. Here we used phosphoproteomic approaches to monitor signaling events in L. donovani axenic amastigotes, which may be important for intracellular parasite survival. LC-ESI-MS/MS analysis of IMAC-enriched phosphoprotein extracts identified 445 putative phosphoproteins in two independent biological experiments. Functional enrichment analysis allowed us to gain insight into parasite pathways that are regulated by protein phosphorylation and revealed significant enrichment in our data set of proteins whose biological functions are associated with protein turn-over, stress response, and signal transduction. LC-ESI-MS/MS analysis of TiO(2)-enriched phosphopeptides confirmed these results and identified 157 unique phosphopeptides covering 181 unique phosphorylation sites in 126 distinct proteins. Investigation of phosphorylation site conservation across related trypanosomatids and higher eukaryotes by multiple sequence alignment and cluster analysis revealed L. donovani-specific phosphoresidues in highly conserved proteins that share significant sequence homology to orthologs of the human host. These unique phosphorylation sites reveal important differences between host and parasite biology and post-translational protein regulation, which may be exploited for the design of novel anti-parasitic interventions.

    View details for DOI 10.1002/pmic.201000305

    View details for Web of Science ID 000284045300009

    View details for PubMedID 20960452

  • Superpose3D: A Local Structural Comparison Program That Allows for User-Defined Structure Representations PLOS ONE Gherardini, P. F., Ausiello, G., Helmer-Citterich, M. 2010; 5 (8)


    Local structural comparison methods can be used to find structural similarities involving functional protein patches such as enzyme active sites and ligand binding sites. The outcome of such analyses is critically dependent on the representation used to describe the structure. Indeed different categories of functional sites may require the comparison program to focus on different characteristics of the protein residues. We have therefore developed superpose3D, a novel structural comparison software that lets users specify, with a powerful and flexible syntax, the structure description most suited to the requirements of their analysis. Input proteins are processed according to the user's directives and the program identifies sets of residues (or groups of atoms) that have a similar 3D position in the two structures. The advantages of using such a general purpose program are demonstrated with several examples. These test cases show that no single representation is appropriate for every analysis, hence the usefulness of having a flexible program that can be tailored to different needs. Moreover we also discuss how to interpret the results of a database screening where a known structural motif is searched against a large ensemble of structures. The software is written in C++ and is released under the open source GPL license. Superpose3D does not require any external library, runs on Linux, Mac OSX, Windows and is available at

    View details for DOI 10.1371/journal.pone.0011988

    View details for Web of Science ID 000280605400021

    View details for PubMedID 20700534

  • Modular architecture of nucleotide-binding pockets NUCLEIC ACIDS RESEARCH Gherardini, P. F., Ausiello, G., Russell, R. B., Helmer-Citterich, M. 2010; 38 (11): 3809-3816


    Recently, modularity has emerged as a general attribute of complex biological systems. This is probably because modular systems lend themselves readily to optimization via random mutation followed by natural selection. Although they are not traditionally considered to evolve by this process, biological ligands are also modular, being composed of recurring chemical fragments, and moreover they exhibit similarities reminiscent of mutations (e.g. the few atoms differentiating adenine and guanine). Many ligands are also promiscuous in the sense that they bind to many different protein folds. Here, we investigated whether ligand chemical modularity is reflected in an underlying modularity of binding sites across unrelated proteins. We chose nucleotides as paradigmatic ligands, because they can be described as composed of well-defined fragments (nucleobase, ribose and phosphates) and are quite abundant both in nature and in protein structure databases. We found that nucleotide-binding sites do indeed show a modular organization and are composed of fragment-specific protein structural motifs, which parallel the modular structure of their ligands. Through an analysis of the distribution of these motifs in different proteins and in different folds, we discuss the evolutionary implications of these findings and argue that the structural features we observed can arise both as a result of divergence from a common ancestor or convergent evolution.

    View details for DOI 10.1093/nar/gkq090

    View details for Web of Science ID 000279188800035

    View details for PubMedID 20185567

  • Identification from Chest X-Rays: Reliability of Bone Density Patterns of the Humerus Ciaffi, R., De Angelis, D., Gherardini, P. F., Arcudi, G., Nessi, R., Cornalba, G. P., Grandi, M., Cattaneo, C. WILEY-BLACKWELL PUBLISHING, INC. 2010: 478-481


    A critical review of Kahana and Hiss' study on identification from bone trabecular pattern and a test of their method conducted on the humerus are presented. Bone trabecular pattern was studied through the generation of a numerical file representing the gray scale. Using the correlation coefficient, several pairwise comparisons between numerical files were performed. The test gave nearly 30% of incorrect exclusions (the method did not recognize pairs of radiographs belonging to the same subject) and 50% of misidentifications (the method recognized pairs of radiographs belonging to different subjects as belonging to the same subject); therefore, this research shows that, at the present time, it is not possible to safely quantify identification through bone density patterns of the proximal humerus taken from thoracic X-rays. Thus, an "easy" but dangerous use of trabecular density patterns on this specific type of radiogram as an identification method should currently be avoided.

    View details for DOI 10.1111/j.1556-4029.2009.01297.x

    View details for Web of Science ID 000275098700028

    View details for PubMedID 20158592

  • Structural motifs recurring in different folds recognize the same ligand fragments BMC BIOINFORMATICS Ausiello, G., Gherardini, P. F., Gatti, E., Incani, O., Helmer-Citterich, M. 2009; 10


    The structural analysis of protein ligand binding sites can provide information relevant for assigning functions to unknown proteins, to guide the drug discovery process and to infer relations among distant protein folds. Previous approaches to the comparative analysis of binding pockets have usually been focused either on the ligand or the protein component. Even though several useful observations have been made with these approaches they both have limitations. In the former case the analysis is restricted to binding pockets interacting with similar ligands, while in the latter it is difficult to systematically check whether the observed structural similarities have a functional significance. Here we propose a novel methodology that takes into account the structure of both the binding pocket and the ligand. We first look for local similarities in a set of binding pockets and then check whether the bound ligands, even if completely different, share a common fragment that can account for the presence of the structural motif. Thanks to this method we can identify structural motifs whose functional significance is explained by the presence of shared features in the interacting ligands. The application of this method to a large dataset of binding pockets allows the identification of recurring protein motifs that bind specific ligand fragments, even in the context of molecules with a different overall structure. In addition some of these motifs are present in a high number of evolutionarily unrelated proteins.

    View details for DOI 10.1186/1471-2105-10-182

    View details for Web of Science ID 000267597100001

    View details for PubMedID 19527512

  • Structure-based function prediction: approaches and applications. Briefings in functional genomics & proteomics Gherardini, P. F., Helmer-Citterich, M. 2008; 7 (4): 291-302


    The ever increasing number of protein structures determined by structural genomic projects has spurred much interest in the development of methods for structure-based function prediction. Existing methods can be roughly classified in two groups: some use a comparative approach looking for the presence of structural motifs possibly associated with a known biochemical function. Other methods try to identify functional patches on the surface of a protein using only its physicochemical characteristics. This review will cover both kinds of approaches to structure-based function prediction as well as their use in real-world cases. The main issues and limitations in using protein structure to predict function will also be discussed. These are mainly: the assessment of the statistical significance of structural similarities and the extent to which these methods depend on the accuracy and availability of structural data.

    View details for DOI 10.1093/bfgp/eln030

    View details for PubMedID 18599513

  • FunClust: a web server for the identification of structural motifs in a set of non-homologous protein structures BMC BIOINFORMATICS Ausiello, G., Gherardini, P. F., Marcatili, P., Tramontano, A., Via, A., Helmer-Citterich, M. 2008; 9


    The occurrence of very similar structural motifs brought about by different parts of non homologous proteins is often indicative of a common function. Indeed, relatively small local structures can mediate binding to a common partner, be it a protein, a nucleic acid, a cofactor or a substrate. While it is relatively easy to identify short amino acid or nucleotide sequence motifs in a given set of proteins or genes, and many methods do exist for this purpose, much more challenging is the identification of common local substructures, especially if they are formed by non consecutive residues in the sequence. Here we describe a publicly available tool, able to identify common structural motifs shared by different non homologous proteins in an unsupervised mode. The motifs can be as short as three residues and need not be contiguous or even present in the same order in the sequence. Users can submit a set of protein structures, whether or not they are deemed to share a common function (e.g. they bind similar ligands, or share a common epitope). The server finds and lists structural motifs composed of three or more spatially well conserved residues shared by at least three of the submitted structures. The method uses a local structural comparison algorithm to identify subsets of similar amino acids between each pair of input protein chains and a clustering procedure to group similarities shared among different structure pairs. FunClust is fast, completely sequence independent, and does not need an a priori knowledge of the motif to be found. The output consists of a list of aligned structural matches displayed in both tabular and graphical form. We show here examples of its usefulness by searching for the largest common structural motifs in test sets of non homologous proteins and showing that the identified motifs correspond to a known common functional feature.

    View details for DOI 10.1186/1471-2105-9-S2-S2

    View details for Web of Science ID 000259022900002

    View details for PubMedID 18387204

  • Convergent evolution of enzyme active sites is not a rare phenomenon JOURNAL OF MOLECULAR BIOLOGY Gherardini, P. F., Wass, M. N., Helmer-Citterich, M., Sternberg, M. J. 2007; 372 (3): 817-845


    Since convergent evolution of enzyme active sites was first identified in serine proteases, other individual instances of this phenomenon have been documented. However, a systematic analysis assessing the frequency of this phenomenon across enzyme space is still lacking. This work uses the Query3d structural comparison algorithm to integrate for the first time detailed knowledge about catalytic residues, available through the Catalytic Site Atlas (CSA), with the evolutionary information provided by the Structural Classification of Proteins (SCOP) database. This study considers two modes of convergent evolution: (i) mechanistic analogues, which are enzymes that use the same mechanism to perform related, but possibly different, reactions (considered here as sharing the first three digits of the EC number); and (ii) transformational analogues, which catalyse exactly the same reaction (identical EC numbers), but may use different mechanisms. Mechanistic analogues were identified in 15% (26 out of 169) of the three-digit EC groups considered, showing that this phenomenon is not rare. Furthermore, 11 of these groups also contain transformational analogues. The catalytic triad is the most widespread active site; the results of the structural comparison show that this mechanism, or variations thereof, is present in 23 superfamilies. Transformational analogues were identified for 45 of the 951 four-digit EC numbers present within the CSA and about half of these were also mechanistic analogues exhibiting convergence of their active sites. This analysis has also been extended to the whole Protein Data Bank to provide a complete and manually curated list of all the transformational analogues whose structure is classified in SCOP. The results of this work show that the phenomenon of convergent evolution is not rare, especially when considering large enzymatic families.
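
    The EC-number bookkeeping described in this abstract can be sketched as follows. This is only an illustration of the two EC-based criteria (shared first three digits versus identical EC numbers); it does not reproduce the Query3d structural comparison or the SCOP-based check that two enzymes are evolutionarily unrelated. All function names are hypothetical, and the two EC numbers in the usage lines are illustrative inputs, not data taken from the paper.

```python
def ec_prefix(ec_number, digits=3):
    """Return the first `digits` fields of a dotted EC number."""
    return ".".join(ec_number.split(".")[:digits])


def same_three_digit_group(ec_a, ec_b):
    """EC criterion used for mechanistic analogues: same first three digits."""
    return ec_prefix(ec_a) == ec_prefix(ec_b)


def same_reaction(ec_a, ec_b):
    """EC criterion used for transformational analogues: identical EC numbers."""
    return ec_a == ec_b


def percentage(part, whole):
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole


# Illustrative inputs: two serine protease EC numbers (assumed for the example).
print(same_three_digit_group("3.4.21.4", "3.4.21.62"))
print(same_reaction("3.4.21.4", "3.4.21.62"))

# Fraction of three-digit EC groups with mechanistic analogues, as reported.
print(f"{percentage(26, 169):.1f}%")
```

    A pair that passes either EC test would still need to belong to different SCOP superfamilies, and to show matching catalytic residues, before it could count as an analogue in the sense used above.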

    View details for DOI 10.1016/j.jmb.2007.06.017

    View details for Web of Science ID 000249494800020

    View details for PubMedID 17681532

  • 3dLOGO: a web server for the identification, analysis and use of conserved protein substructures NUCLEIC ACIDS RESEARCH Via, A., Peluso, D., Gherardini, P. F., De Rinaldis, E., Colombo, T., Ausiello, G., Helmer-Citterich, M. 2007; 35: W416-W419


    3dLOGO is a web server for the identification and analysis of conserved protein 3D substructures. Given a set of residues in a PDB (Protein Data Bank) chain, the server detects the matching substructure(s) in a set of user-provided protein structures, generates a multiple structure alignment centered on the input substructures and highlights other residues whose structural conservation becomes evident after the defined superposition. Conserved residues are proposed to the user for highlighting functional areas, deriving refined structural motifs or building sequence patterns. Residue structural conservation can be visualized through an expressly designed Java application, 3dProLogo, which is a 3D implementation of a sequence logo. The 3dLOGO server, with related documentation, is available at

    View details for DOI 10.1093/nar/gkm228

    View details for Web of Science ID 000255311500078

    View details for PubMedID 17488847

  • False occurrences of functional motifs in protein sequences highlight evolutionary constraints BMC BIOINFORMATICS Via, A., Gherardini, P. F., Ferraro, E., Ausiello, G., Tomba, G. S., Helmer-Citterich, M. 2007; 8


    False occurrences of functional motifs in protein sequences can be considered as random events due solely to the sequence composition of a proteome. Here we use a numerical approach to investigate the random appearance of functional motifs with the aim of addressing biological questions such as: How are organisms protected from undesirable occurrences of motifs otherwise selected for their functionality? Has the random appearance of functional motifs in protein sequences been affected during evolution? Here we analyse the occurrence of functional motifs in random sequences and compare it to that observed in biological proteomes; the behaviour of random motifs is also studied. Most motifs exhibit a number of false positives significantly similar to the number of times they appear in randomized proteomes (=expected number of false positives). Interestingly, about 3% of the analysed motifs show a different kind of behaviour and appear in biological proteomes less than they do in random sequences. In some of these cases, a mechanism of evolutionary negative selection is apparent; this helps to prevent unwanted functionalities which could interfere with cellular mechanisms. Our thorough statistical and biological analysis showed that there are several mechanisms and evolutionary constraints both of which affect the appearance of functional motifs in protein sequences.
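
    The comparison between observed and randomized motif counts described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that randomization means shuffling residues within each sequence (which preserves composition), that motifs are written as regular expressions, and that non-overlapping matches are counted. The function names, the toy sequences and the example pattern are all hypothetical.

```python
import random
import re


def count_motif(sequences, pattern):
    """Count non-overlapping regex matches of `pattern` over all sequences."""
    regex = re.compile(pattern)
    return sum(len(regex.findall(seq)) for seq in sequences)


def shuffle_sequence(seq, rng):
    """Return a copy of `seq` with residues shuffled (composition preserved)."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def expected_random_count(sequences, pattern, n_shuffles=100, seed=0):
    """Average motif count over `n_shuffles` randomized copies of the input."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_shuffles):
        shuffled = [shuffle_sequence(s, rng) for s in sequences]
        totals.append(count_motif(shuffled, pattern))
    return sum(totals) / len(totals)


# Toy input (assumed): two short sequences and a glycosylation-like pattern.
toy_proteome = ["MNASAKLNPTGNGSA", "MKKNVTLNQSP"]
motif = r"N[^P][ST][^P]"

observed = count_motif(toy_proteome, motif)
expected = expected_random_count(toy_proteome, motif)
print(observed, expected)
```

    A motif whose observed count falls well below the randomized average would be a candidate for the under-represented behaviour mentioned in the abstract; judging significance would need a proper statistical test, which this sketch omits.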

    View details for DOI 10.1186/1471-2105-8-68

    View details for Web of Science ID 000244856400001

    View details for PubMedID 17331242

  • Phospho3D: a database of three-dimensional structures of protein phosphorylation sites NUCLEIC ACIDS RESEARCH Zanzoni, A., Ausiello, G., Via, A., Gherardini, P. F., Helmer-Citterich, M. 2007; 35: D229-D231


    Phosphorylation is the most common protein post-translational modification. Phosphorylated residues (serine, threonine and tyrosine) play critical roles in the regulation of many cellular processes. Since the amount of data produced by screening assays is growing continuously, the development of computational tools for collecting and analysing experimental data has become a pivotal task for unravelling the complex network of interactions regulating eukaryotic cell life. Here we present Phospho3D, a database of 3D structures of phosphorylation sites, which stores information retrieved from the phospho.ELM database and is enriched with structural information and annotations at the residue level. The database also collects the results of a large-scale structural comparison procedure providing clues for the identification of new putative phosphorylation sites.

    View details for DOI 10.1093/nar/gkl922

    View details for Web of Science ID 000243494600047

    View details for PubMedID 17142231