This user guide describes the features of the Human Salivary Proteome Wiki, a service of the National Institute of Dental and Craniofacial Research (NIDCR) that stores information on human salivary proteins drawn from a large collection of biomedical knowledge bases. For a direct link to the relevant help page, look for the tab with the question mark icon at the top of the page you are viewing.

Types of Data on the Wiki

This wiki focuses on more than 1,000 unique human salivary proteins identified by high-throughput proteomic technologies. The experiments were mostly conducted under the Human Salivary Proteome Project (HSPP) framework. The content is separated into multiple levels of detail. These are described here.

Salivary Proteins

These are the proteins that have sequence segments found in human saliva using the latest proteomics technologies. Since many proteins can share identical sequence segments, the proteins listed here may or may not actually be in saliva depending on many other factors. Descriptions of individual proteins, including community annotations, are displayed on each protein page to provide better understanding of their characteristics.

Protein Clusters

Each protein cluster represents a set of proteins with overlapping peptide hits. Peptide-based evidence cannot distinguish between redundant identifications that likely represent homologs and isoforms. In the Human Salivary Proteome Project (HSPP), protein identifications are grouped together when they are produced from identical peptide evidence.
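The grouping rule described above can be illustrated with a minimal Python sketch. It is not the HSPP pipeline itself: the function name, the protein identifiers, and the peptide sequences are all hypothetical, and the sketch only covers the simplest case, where two identifications are supported by exactly the same set of peptides.

```python
from collections import defaultdict


def cluster_by_peptide_evidence(protein_peptides):
    """Group protein IDs whose supporting peptide sets are identical.

    protein_peptides maps a protein ID to the peptides matched to it.
    (Hypothetical input shape, chosen only for this illustration.)
    """
    groups = defaultdict(list)
    for protein_id, peptides in protein_peptides.items():
        # frozenset ignores peptide order and duplicates, so two proteins
        # with the same evidence end up under the same dictionary key.
        groups[frozenset(peptides)].append(protein_id)
    # Sort members and clusters so the result is deterministic.
    return sorted(sorted(members) for members in groups.values())


# Made-up identifiers and peptide strings, for illustration only.
example = {
    "PROT_A": ["PEPTIDEK", "SAMPLER"],
    "PROT_B": ["SAMPLER", "PEPTIDEK"],
    "PROT_C": ["SAMPLER"],
}
clusters = cluster_by_peptide_evidence(example)
print(clusters)
```

Here PROT_A and PROT_B fall into one cluster because their peptide evidence is identical, while PROT_C, supported by only a subset of those peptides, stays on its own.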

Protein Signatures

Protein sequence signatures are sequence patterns frequently found in related proteins. Each signature can be classified as a protein family, functional domain, active site, or merely a conserved region. The collection of protein signatures stored in the wiki is imported from the InterPro database.


Basic information about human genes is stored in the wiki to augment the protein data. Gene pages are created on demand by retrieving content from NCBI's Entrez Gene database.


Citations indexed by the PubMed database and referenced within the wiki are stored locally. This help page describes the information associated with a PubMed citation in this wiki.

Proteomics Database

Data from tandem mass spectrometry (MS/MS) experiments are stored in a standard-compliant proteomics database called PRIDE (PRoteomics IDEntifications database). Various interfaces are provided to query and display these data in the wiki.

Protein Interactions

Protein-protein interactions are important to our understanding of cell functioning. Curated protein interaction data, including experimental details and evidence extracted from the literature, can be retrieved from the IntAct database.

Adding Annotations or Comments

The content on this wiki is open to the community for addition and modification. Your contribution is important to increase our understanding of the salivary proteome and its potential applications.

Protein Annotation

User contributions that enrich the proteome catalog are greatly appreciated. This help page describes the user-friendly interface used to add new annotations or modify existing ones.

Curation Process

As content is added to the wiki by you, the users, it is reviewed by one or more independent scientists to confirm that it is grounded in scientific evidence and relevant to the community. The process of curation is described in this help page.


Adding comments and discussions to talk pages on the wiki is less formal than adding specific annotation content. Go here to understand how to be part of the community and participate in scientific exchange on the wiki.

Semantic Features

One of the unique features of the wiki is the semantic framework built on top of the data. This framework facilitates many advanced reporting and retrieval tasks as described below.

Semantic Annotations

Semantic annotations are metadata about a page which describe facts that are computer interpretable. This help page introduces some of the semantic features available for you to take advantage of the wiki's semantic framework.

Semantic Queries

Semantic queries are structured ways to find information on the wiki. These are fully customizable and extremely powerful ways of finding data. There are two ways of doing searches: 1) using forms with drop-down menus (simpler, powerful but not quite as flexible) and 2) constructing your own custom queries. This help page describes how to construct your own queries.
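As a rough illustration of what a custom query can look like, the Python sketch below assembles a query string in the inline `#ask` style used by Semantic MediaWiki. That this wiki accepts exactly this syntax is an assumption, and the helper name, the category, and the property names are hypothetical placeholders; the actual names are documented on the help page.

```python
def build_ask_query(conditions, printouts=(), limit=None):
    """Assemble an inline query string in Semantic MediaWiki #ask style.

    conditions: selection criteria, each wrapped in [[...]]
    printouts:  properties to display, each prefixed with ?
    limit:      optional maximum number of results
    All names passed in are the caller's; nothing is validated here.
    """
    parts = ["".join(f"[[{c}]]" for c in conditions)]
    parts += [f"?{p}" for p in printouts]
    if limit is not None:
        parts.append(f"limit={limit}")
    return "{{#ask: " + " | ".join(parts) + " }}"


# Hypothetical category and property names, for illustration only.
query = build_ask_query(
    conditions=["Category:Salivary Proteins", "Has gene::+"],
    printouts=["Has gene"],
    limit=10,
)
print(query)
```

The conditions select pages, the printouts choose which properties to show for each match, and the limit caps the number of results. The form-based search builds comparable queries for you without requiring this syntax.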


Many tools are available on the wiki to help with specific tasks. In addition to the standard wiki features, custom tools have been developed for you to retrieve and understand the data stored in the database.

Page Search

This help page describes the various tools you can use to look for and retrieve pages matching your criteria. Some of these tools provide user-friendly interfaces to construct semantic queries as described above.

BLAST Search

This help page describes the utility of the BLAST search tool to compare protein sequences against the IPI protein database.

Concept Extraction

Concept extraction is a tool that helps "annotate" free text using terms from a wide range of domain-specific controlled vocabularies. This concept extraction feature is available on PubMed and InterPro pages, as described in this help page.


InterProScan is a tool to "search" for potential regions of the protein that have biological significance, using the InterPro database of known protein features. Even though many features are already identified in the wiki, there is also a way for you to do a new search with InterProScan. This help page describes the details of InterProScan and how searches are done.

Ontology Lookup

Ontology lookup is a tool used to find information on specific ontology terms. This page describes how to use the tool to support searching and annotating.

Sequence Alignment

ClustalW is a widely used multiple sequence alignment program for proteins (or DNA). It produces biologically meaningful multiple sequence alignments of divergent sequences. This help page describes the details of the ClustalW program and how to use it in the wiki.

Structure Prediction

Phyre2 uses homology-based techniques to predict the 3D structure of a protein from its amino acid sequence. This page describes how to submit a job to the Phyre2 server.

HSPW Version 1.5.3. This page was last modified on 8 May 2019, at 20:40. This page has been accessed 1,490 times.