InterPro Entries: essential information
An InterPro entry is created for each protein family, domain or important site signature that is integrated into InterPro from one or more of its 13 member databases. Where signatures from two or more member databases describe the same family, domain or site, the member database signatures are brought together under one InterPro entry.
An InterPro entry provides a written description of the family, domain or site and lists the contributing member database signatures. Each entry has a name, a unique InterPro identifier and an entry type. GO terms associated with the entry are also displayed. For each InterPro entry, further information is provided showing, for example, the proteins, structures and pathways matching the entry, along with its taxonomic distribution. This information can be viewed by Browsing entries in the InterPro website.
InterPro entry types
InterPro entries are created for protein families, domains, sites, repeats and homologous superfamilies, defined as follows:
Homologous Superfamily - a group of proteins that share a common evolutionary origin, reflected by similarity in their structure, even if sequence similarity is low. This entry type contains signatures from the CATH-Gene3D and SUPERFAMILY member databases exclusively.
Other entry and page types
In addition to the main InterPro entries, which bring together protein signatures from the member database consortium, InterPro also provides entry pages for the individual member database signatures and for the proteins, structures, taxa, proteomes and sets/clans integrated or used by InterPro. These entry pages also have further information available that can be viewed by Browsing entries in the InterPro website. More information is available in the corresponding train online section.
InterPro entries that represent a subset of proteins from another InterPro entry are identified as “children” of the “parent” entry. InterPro displays these connections between entries in the “Family Relationships” or “Domain Relationships” sections. Entries at the top of these hierarchies describe broad families or domains that share higher level structure and/or function, while those entries at the bottom describe more specific functional subfamilies or structural/functional subclasses of domains. More information is available in the corresponding train online section.
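The parent/child hierarchy described above can be pictured as a simple child-to-parent mapping. The following is a minimal sketch only: the accessions (`ENTRY_A`, `ENTRY_B`, `ENTRY_C`), the `parent_of` mapping and the `lineage` helper are hypothetical names chosen for illustration, not InterPro identifiers or part of any InterPro tool.

```python
# Hypothetical hierarchy: ENTRY_A (broad family) -> ENTRY_B -> ENTRY_C (specific subfamily).
# Each key is a child entry; each value is its parent entry.
parent_of = {
    "ENTRY_B": "ENTRY_A",
    "ENTRY_C": "ENTRY_B",
}


def lineage(accession, parent_of):
    """Return the chain of entries from the top of the hierarchy down to `accession`."""
    chain = [accession]
    seen = {accession}
    while chain[-1] in parent_of:
        parent = parent_of[chain[-1]]
        if parent in seen:
            # Guard against malformed input containing a cycle.
            raise ValueError("cycle detected in hierarchy")
        chain.append(parent)
        seen.add(parent)
    # Reverse so that the broadest entry comes first.
    return list(reversed(chain))


print(lineage("ENTRY_C", parent_of))  # ['ENTRY_A', 'ENTRY_B', 'ENTRY_C']
```

Reading the result from left to right moves from the broad entry at the top of the hierarchy to the most specific entry at the bottom.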
Relationships between homologous superfamilies and either family or domain entries are generated automatically using the Jaccard and containment indexes. These relationships are shown in the Overlapping homologous superfamilies/Overlapping entries section on the InterPro entry pages. More information is available in the corresponding train online section.
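As an illustration of the two measures named above, the sketch below computes the Jaccard index and the containment index for two sets. It assumes that the sets being compared are the protein accessions matched by each entry; the accessions shown are made up, and the thresholds InterPro applies to these values are not stated here and are not modelled.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def containment_index(a, b):
    """Fraction of the members of `a` that are also found in `b`."""
    if not a:
        return 0.0
    return len(a & b) / len(a)


# Made-up protein sets for a homologous superfamily entry and a domain entry.
superfamily_proteins = {"P1", "P2", "P3", "P4"}
domain_proteins = {"P2", "P3", "P4", "P5", "P6"}

print(jaccard_index(superfamily_proteins, domain_proteins))      # 0.5
print(containment_index(superfamily_proteins, domain_proteins))  # 0.75
print(containment_index(domain_proteins, superfamily_proteins))  # 0.6
```

The Jaccard index is symmetric, whereas the containment index depends on which set is taken as the reference, which is why it can reveal that one entry is largely contained in another even when the Jaccard value is moderate.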
InterPro uses several standards and ontologies:
the NCBI Taxonomy for taxa: the NCBI assigns unique taxonomic identifiers for all organisms (taxa) that are represented in UniProtKB. As these taxonomic identifiers are stable, InterPro uses them to let users search the resource by organism;
the Gene Ontology (GO) for functions, processes and cellular components: InterPro2GO (https://doi.org/10.1093/database/bar068) is a manually created mapping between InterPro entries and GO terms. Where an InterPro entry matches a set of functionally similar proteins, GO terms describing the conserved function or location are associated with the InterPro entry;
the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) via IntEnz: Enzyme Commission (EC) numbers describe enzyme-catalyzed reactions and are available in UniProtKB, e.g. P17050. Where an InterPro entry matches reviewed/Swiss-Prot proteins annotated with EC numbers, the EC numbers are associated with the InterPro entry;
Reactome and MetaCyc for pathways: where an InterPro entry matches a reviewed/Swiss-Prot protein involved in a pathway described by Reactome, the pathway is associated with the InterPro entry. As reactions in MetaCyc include EC numbers, InterPro uses the EC numbers assigned to an entry (as described above) and to a metabolic pathway to link InterPro entries and MetaCyc pathways.
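The EC-number-based linking to MetaCyc described in the last item can be sketched as a join on shared EC numbers. This is an illustrative sketch under stated assumptions: the entry and pathway identifiers (`ENTRY_A`, `PWY_X`, and so on), the EC assignments and the `link_entries_to_pathways` helper are all hypothetical, and the rule that a single shared EC number is enough to create a link is an assumption rather than a documented InterPro criterion.

```python
def link_entries_to_pathways(entry_ecs, pathway_ecs):
    """Link each entry to every pathway that shares at least one EC number with it."""
    links = {}
    for entry, ecs in entry_ecs.items():
        # Keep pathways whose EC number set overlaps the entry's EC number set.
        links[entry] = sorted(
            pathway for pathway, p_ecs in pathway_ecs.items() if ecs & p_ecs
        )
    return links


# Hypothetical EC numbers assigned to entries (via matched reviewed proteins).
entry_ecs = {
    "ENTRY_A": {"1.1.1.1"},
    "ENTRY_B": {"1.1.1.1", "2.7.1.1"},
    "ENTRY_C": {"3.2.1.49"},
}

# Hypothetical EC numbers of the reactions in each pathway.
pathway_ecs = {
    "PWY_X": {"1.1.1.1", "1.2.1.3"},
    "PWY_Y": {"2.7.1.1"},
}

links = link_entries_to_pathways(entry_ecs, pathway_ecs)
print(links)
# {'ENTRY_A': ['PWY_X'], 'ENTRY_B': ['PWY_X', 'PWY_Y'], 'ENTRY_C': []}
```

An entry with no EC number in common with any pathway, such as `ENTRY_C` here, simply receives no pathway links.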