This user guide describes the features of the Human Salivary Proteome Wiki, a service of the National Institute of Dental and Craniofacial Research (NIDCR) that stores information on human salivary proteins drawn from a large collection of biomedical knowledge bases. For a direct link to the appropriate help page, look for the tab with the icon at the top of the page you are on.
Types of Data on the Wiki
This wiki focuses on more than 1,000 unique human salivary proteins identified by high-throughput proteomic technologies. The experiments were mostly conducted under the Human Salivary Proteome Project (HSPP) framework. The content is separated into multiple levels of detail. These are described here.
Salivary Proteins
These are the proteins that have sequence segments found in human saliva using the latest proteomics technologies. Since many proteins can share identical sequence segments, the proteins listed here may or may not actually be in saliva depending on many other factors. Descriptions of individual proteins, including community annotations, are displayed on each protein page to provide better understanding of their characteristics.
Protein Clusters
Each protein cluster represents a set of proteins with overlapping peptide hits. Peptide-based evidence cannot distinguish between redundant identifications that likely represent homologs and isoforms. In the Human Salivary Proteome Project (HSPP), protein identifications are grouped together when they are produced from identical peptide evidence.
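The grouping rule described above can be illustrated with a minimal Python sketch. It is not the HSPP pipeline itself: the function name, the protein identifiers, and the peptide strings are all hypothetical, and the sketch only covers the stated case where identifications share identical peptide evidence.

```python
from collections import defaultdict

def cluster_by_peptide_evidence(protein_peptides):
    """Group protein IDs whose sets of identified peptides are identical.

    protein_peptides maps a protein ID to the set of peptide sequences
    that support its identification (hypothetical input format).
    """
    groups = defaultdict(list)
    for protein_id, peptides in protein_peptides.items():
        # frozenset makes the peptide set hashable and order-independent
        groups[frozenset(peptides)].append(protein_id)
    # Sort members and clusters so the result is deterministic
    return sorted(sorted(members) for members in groups.values())

# Hypothetical identifications: PROT_A and PROT_B share the same evidence
example = {
    "PROT_A": {"PEPTIDEK", "SAMPLER"},
    "PROT_B": {"SAMPLER", "PEPTIDEK"},
    "PROT_C": {"PEPTIDEK"},
}
clusters = cluster_by_peptide_evidence(example)
print(clusters)
```

Here PROT_A and PROT_B fall into one cluster because their peptide sets match exactly, while PROT_C, supported by only a subset of those peptides, stays in its own cluster.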
Protein Signatures
Protein sequence signatures are sequence patterns frequently found in related proteins. Each signature can be classified as a protein family, functional domain, active site, or merely a conserved region. The collection of protein signatures stored in the wiki is imported from the InterPro database.
Genes
Basic information about human genes is stored in the wiki to augment the protein data. Gene pages are created on demand by retrieving contents from NCBI's Entrez Gene database.
Citations
Citations indexed by the PubMed database and referenced within the wiki are stored locally. This help page describes the information associated with a PubMed citation in this wiki.
Proteomics Database
Data from tandem mass spectrometry (MS/MS) experiments are stored in a standards-compliant proteomics database called PRIDE (PRoteomics IDEntifications database). Various interfaces are provided to query and display these data in the wiki.
Protein Interactions
Protein-protein interactions are important to our understanding of cell functioning. Curated protein interaction data, including experimental details and evidence extracted from the literature, can be retrieved from the IntAct database.
Adding Annotations or Comments
The content on this wiki is open to the community for addition and modification. Your contribution is important to increase our understanding of the salivary proteome and its potential applications.
Protein Annotation
User contributions that enrich the proteome catalog are greatly appreciated. This help page describes the user-friendly interface used to add new annotations or modify existing ones.
Curation Process
As content is added to the wiki by you, the users, it is reviewed by one or more independent scientists to confirm that the content is grounded in scientific evidence and relevant to the community. The process of curation is described in this help page.
Discussions
Adding comments and discussions to talk pages is less formal than adding specific annotation content. This help page explains how to be part of the community and participate in scientific exchange on the wiki.
Semantic Features
One of the unique features of the wiki is the semantic framework built on top of the data. This framework facilitates many advanced reporting and retrieval tasks as described below.
Semantic Annotations
Semantic annotations are metadata about a page which describe facts that are computer interpretable. This help page introduces some of the semantic features available for you to take advantage of the wiki's semantic framework.
Semantic Queries
Semantic queries are structured ways to find information on the wiki. They are fully customizable and extremely powerful. There are two ways to search: 1) using forms with drop-down menus (simpler and still powerful, but not quite as flexible) and 2) constructing your own custom queries. This help page describes how to construct your own queries.
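As a rough illustration of what a custom query might look like, the Python sketch below assembles a query string in the inline `#ask` syntax used by the Semantic MediaWiki extension. That this wiki uses that syntax is an assumption, and the category and property names shown are hypothetical; check the semantic annotations on the actual wiki pages for the real names.

```python
def build_ask_query(conditions, printouts=(), limit=None):
    """Assemble an inline #ask query string (Semantic MediaWiki style).

    conditions: selection criteria such as "Category:..." or "Property::value"
    printouts:  property names to show as result columns
    limit:      optional maximum number of results
    """
    # Each condition is wrapped in [[...]]; adjacent conditions are ANDed
    parts = ["".join(f"[[{c}]]" for c in conditions)]
    # Each printout request is prefixed with "?"
    parts += [f"?{p}" for p in printouts]
    if limit is not None:
        parts.append(f"limit={limit}")
    return "{{#ask: " + "|".join(parts) + "}}"

# Hypothetical category and property names, for illustration only
query = build_ask_query(
    ["Category:Salivary Proteins", "Has gene symbol::AMY1A"],
    printouts=["Has gene symbol"],
    limit=10,
)
print(query)
```

Building the string from separate lists of conditions and printouts mirrors the two questions a query answers: which pages to select, and which of their properties to display.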
Tools
Many tools are available on the wiki to help with specific tasks. In addition to the standard wiki features, custom tools have been developed for you to retrieve and understand the data stored in the database.
Page Search
This help page describes the various tools you can use to look for and retrieve pages matching your criteria. Some of these tools provide user-friendly interfaces to construct semantic queries as described above.
BLAST Search
This help page describes how to use the BLAST search tool to compare protein sequences against the IPI protein database.
Concept Extraction
Concept extraction is a tool that helps "annotate" free text using terms from a wide range of domain-specific controlled vocabularies. This concept extraction feature is available on PubMed and InterPro pages, as described in this help page.
InterProScan
InterProScan is a tool to "search" for potential regions of the protein that have biological significance, using the InterPro database of known protein features. Even though many features are already identified in the wiki, there is also a way for you to do a new search with InterProScan. This help page describes the details of InterProScan and how searches are done.
Ontology Lookup
Ontology lookup is a tool used to find information on specific ontology terms. This page describes how to use the tool to support searching and annotating.
Sequence Alignment
ClustalW is a widely used multiple sequence alignment program for proteins (or DNA). It produces biologically meaningful multiple sequence alignments of divergent sequences. This help page describes the details of the ClustalW program and how to use it in the wiki.
Structure Prediction
Phyre2 uses homology-based techniques to predict the 3D structure of a protein from its amino acid sequence. This page describes how to submit a job to the Phyre2 server.