From Human Salivary Proteome Wiki

Concept Extraction

Many of the annotations and table fields in the wiki contain a large amount of free text. Being able to automatically highlight important biomedical terms in that text and then quickly see their definitions can be very useful. The wiki offers this feature through a process called Concept Extraction, which is described in detail below.



Over the past decade or so, biomedical subject matter experts have created a large number of ontologies to describe concepts in specific domains. One central resource for viewing and using these ontologies is the National Center for Biomedical Ontology (NCBO). This NIH-supported center collects biomedical ontologies and makes them available to the research community. The specific NCBO tool we use to "mark up" free text is the NCBO Annotator. The Ontology Lookup function in this wiki is a supplementary gateway for exploring these concepts in more depth.

See also: Help:Ontology Lookup

Available terms

The Human Salivary Proteome Wiki uses a subset of the NCBO ontologies that are relevant to saliva and proteomics research. You can also explore these ontologies in the Ontology Lookup tool. The subset is listed below, each entry followed by its abbreviation and its authors:

  • Medical Subject Headings (MSH); National Library of Medicine
  • Foundational Model of Anatomy (FMA); University of Washington
  • NCI Thesaurus (NCI); National Cancer Institute
  • Human Disease (DOID); University of Maryland School of Medicine
  • Pathway Ontology (PW); Medical College of Wisconsin
  • Human Phenotype Ontology (HP); OBO Foundry
  • Gene Ontology (GO); Gene Ontology Consortium
  • Cell Type (CL); OBO Foundry
  • Mammalian Phenotype (MP); The Jackson Laboratory
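
The abbreviations in this list reappear in the extraction results, so it can help to treat the list as a lookup table. The following is a minimal Python sketch that uses only the names given above; the `ONTOLOGIES` dictionary and the `describe` helper are illustrative names and are not part of the wiki.

```python
# Abbreviation -> (ontology name, authors), copied from the list above.
ONTOLOGIES = {
    "MSH": ("Medical Subject Headings", "National Library of Medicine"),
    "FMA": ("Foundational Model of Anatomy", "University of Washington"),
    "NCI": ("NCI Thesaurus", "National Cancer Institute"),
    "DOID": ("Human Disease", "University of Maryland School of Medicine"),
    "PW": ("Pathway Ontology", "Medical College of Wisconsin"),
    "HP": ("Human Phenotype Ontology", "OBO Foundry"),
    "GO": ("Gene Ontology", "Gene Ontology Consortium"),
    "CL": ("Cell Type", "OBO Foundry"),
    "MP": ("Mammalian Phenotype", "The Jackson Laboratory"),
}


def describe(abbreviation):
    """Return 'Name (Authors)' for an abbreviation, or None if it is not listed."""
    entry = ONTOLOGIES.get(abbreviation.upper())
    if entry is None:
        return None
    name, authors = entry
    return f"{name} ({authors})"


print(describe("MSH"))
```

Looking up "MSH" this way resolves it to Medical Subject Headings, which is the same resolution you would do by eye when reading an extraction result.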

Seeing it in action

The concept extraction feature is available on pages that contain a lot of free text, including citation and protein signature pages. On those pages, a "Tag" tab appears at the top of the page (see Figure 1).

Fig. 1: Start the concept extraction by selecting the "Tag" tab at the top of the page (highlighted).

Let's use the citation page PubMed:9563472 as an example of the kind of information you can obtain with this feature. Simply click the "Tag" tab to start the extraction process. Once the results are returned from NCBO, the concepts extracted from the text are underlined and highlighted in yellow (see Figure 2).

See also: Help:PubMed Citations, Help:Protein Signatures

Fig. 2: Extracted biomedical concepts are highlighted in yellow.
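
One way to picture the highlighting step is as wrapping each matched stretch of text in markup. The sketch below is an illustration only, not the wiki's actual implementation: it assumes the matches arrive as (start, end) character offsets, and the `highlight` function, the `<mark>` tag, and the sample sentence are all made up for this example.

```python
import html


def highlight(text, spans):
    """Wrap each (start, end) span of text in <mark> tags.

    Spans use Python slice offsets (end is exclusive) and are assumed
    not to overlap. All text is HTML-escaped before it is emitted.
    """
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        pieces.append(html.escape(text[cursor:start]))  # text before the match
        pieces.append("<mark>" + html.escape(text[start:end]) + "</mark>")
        cursor = end
    pieces.append(html.escape(text[cursor:]))  # text after the last match
    return "".join(pieces)


# Made-up sentence for illustration; not taken from the example citation.
sentence = "Cystatins inhibit cysteine proteinase activity."
term = "cysteine proteinase"
start = sentence.index(term)
print(highlight(sentence, [(start, start + len(term))]))
```

Sorting the spans and moving a cursor through the text keeps the unmatched text intact between highlights, which is why the sketch requires non-overlapping spans.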

Understanding the extracted concepts

To see the list of concepts that were mapped to a particular term, hover over the highlighted area and right-click (or Control-click on a Mac). Figure 3 shows a case in which one ontology has a concept for "cysteine proteinase". As highlighted in the figure, the term was identified by "MSH", which, per the list above, stands for "Medical Subject Headings".

In addition to the source ontology, each mapping lists the unique identifier of the concept and the preferred name of the term. A score in square brackets (also highlighted in Figure 3) indicates the confidence of the mapping. In principle, the higher the score, the more accurately the text matches the ontological concept. In general, you should use the highest-scoring concept to get the most accurate definition.

Fig. 3: Right-clicking on a term shows details of the ontologies that found the term, the matching term names, and a score that represents the strength of the match (the higher the score, the more accurate the match).
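
The rule of thumb of preferring the highest-scoring concept can be sketched in a few lines of Python. The `Mapping` record mirrors the four pieces of information each mapping shows (source ontology, identifier, preferred name, score), but the field names, the `best_mapping` helper, and every value in the example list are placeholders invented for illustration; they are not real extraction results.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    ontology: str        # source ontology abbreviation, e.g. "MSH"
    concept_id: str      # unique identifier of the concept
    preferred_name: str  # preferred name of the term
    score: int           # mapping confidence; higher means a closer match


def best_mapping(mappings):
    """Return the highest-scoring mapping, or None if the list is empty.

    When several mappings share the top score, the first one wins.
    """
    return max(mappings, key=lambda m: m.score, default=None)


# Placeholder values only; identifiers, names, and scores are hypothetical.
example = [
    Mapping("NCI", "EXAMPLE-ID-2", "example preferred name B", 8),
    Mapping("MSH", "EXAMPLE-ID-1", "example preferred name A", 10),
]

best = best_mapping(example)
print(best.ontology, best.score)
```

Since the score is described only as an indication of confidence, a helper like this is a starting point; it is still worth reading the other mappings when their scores are close.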

Detailed information of the concept

To drill into the details of a concept, right-click the highlighted text, then hover over the term of interest and left-click. This takes you directly to the term in the Ontology Lookup tool, where various properties of the concept and its position in the ontology are displayed (see Figure 4).

Fig. 4: Selecting and clicking on a term takes you to a detailed description of the concept on the "Ontology Term" page (the term that was matched to the free text is highlighted).
HSPW Version 1.5.3. This page was last modified on 4 January 2022, at 17:51.