Using the InterPro API

This guide provides an introduction to the InterPro API. The query syntax is flexible and makes it possible to fetch and filter many different kinds of data from the InterPro databases.

The InterPro website is built on top of the API and provides many practical examples of how the API can be used to fetch data. The script generator is particularly useful for generating examples to download InterPro data in Python 3, Perl and JavaScript.

Getting started with examples

How many entries are there in the InterPro API?

This query returns counts for all the different sources of entries in the InterPro dataset. It also includes counts for member database entries that are integrated into InterPro and those that are still unintegrated.

/api/entry

How do I get a list of all CDD entries in the InterPro API?

This query returns a paginated list of summary information about CDD entries in the dataset. The response returns the first 20 hits and provides a link to the next page of results.

/api/entry/cdd
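Because list responses are paginated, collecting every CDD entry means following the next-page link until it runs out. The sketch below is one way to do that. It assumes each page is a JSON object with a `results` list and a `next` URL that is null on the last page; those field names, and the `iter_results` helper itself, are assumptions for illustration and should be checked against a real response.

```python
def iter_results(first_url, fetch):
    """Yield every item of a paginated list by following next-page links.

    `fetch` is any callable that takes a URL and returns the decoded JSON
    page. Assumed page shape: {"results": [...], "next": <url or None>}.
    """
    url = first_url
    while url:
        page = fetch(url)
        # Items of the current page (assumed to live under "results").
        yield from page.get("results", [])
        # Link to the following page; missing or None ends the loop.
        url = page.get("next")


# Offline demonstration: two made-up pages stand in for real responses.
fake_pages = {
    "page1": {"results": ["a", "b"], "next": "page2"},
    "page2": {"results": ["c"], "next": None},
}
demo_items = list(iter_results("page1", fetch=fake_pages.__getitem__))
```

Against the live service, `fetch` would be a function that downloads the URL and parses the JSON body, and `first_url` would be the absolute URL for `/api/entry/cdd`.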

How many entries match human P53 protein (UniProt accession P04637)?

This query returns counts for entries linked to P53 across the different data sources represented in InterPro.

/api/entry/protein/uniprot/P04637

How do I retrieve all InterPro entries found in human P53 protein (P04637)?

This query returns the list of InterPro entries mapped to human P53 protein.

/api/entry/interpro/protein/uniprot/P04637

How do I retrieve UniProtKB reviewed proteins containing IPR002117?

This query returns proteins that contain IPR002117, together with the location of that match in each protein sequence.

/api/protein/reviewed/entry/interpro/ipr002117

How do I retrieve organisms that possess IPR026381?

This query returns organisms that have at least one protein containing a match to IPR026381.

/api/taxonomy/uniprot/entry/interpro/IPR026381

Key concepts

Main data types

Currently, six main types of data are available through the API:

| Type | Description | Source |
| --- | --- | --- |
| Entry | Predicted functional and structural domains on proteins | InterPro, CATH-Gene3D, CDD, HAMAP, PANTHER, Pfam, PIRSF, PRINTS, PROSITE Patterns, PROSITE Profiles, SMART, SFLD, SUPERFAMILY, NCBIfam |
| Protein | Protein sequence | UniProtKB, including reviewed and unreviewed proteins |
| Structure | Macromolecular structures involving proteins | PDB |
| Set | Sets describing relationships between entries | Pfam, CDD, PIRSF |
| Taxonomy | Taxonomic information about proteins | UniProtKB |
| Proteome | Collections of proteins defined from whole genome sequencing projects | UniProtKB |

REST interface

Queries are expressed as URLs. In general, a short query returns broad summary data, while a more specific query returns more detailed data.

The most general query is:

https://www.ebi.ac.uk/interpro/api/

The JSON response includes, among other fields, a list of supported endpoints.
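As a minimal sketch of sending such a query from Python, the helpers below turn one of the query paths used in this guide into an absolute URL and download the JSON. The names `SITE_ROOT`, `api_url` and `get_json` are illustrative, not part of the API; only the standard library is used.

```python
import json
from urllib.request import Request, urlopen

# Site root derived from the base URL above. Query paths in this guide
# already start with /api, so they are appended to the site root.
SITE_ROOT = "https://www.ebi.ac.uk/interpro"


def api_url(path):
    """Turn a query path such as "/api/entry" into an absolute URL."""
    return SITE_ROOT + "/" + path.lstrip("/")


def get_json(path, timeout=30):
    """Send a GET request for the given path and return the decoded JSON.

    This performs a network request, so it is defined here but not called.
    """
    request = Request(api_url(path), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


root_url = api_url("/api/")
```

Calling `get_json("/api/")` would then return the most general response described above.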

Main endpoints

The main data types are the most important endpoints because they determine the type of data returned by a query.

Endpoint blocks

An endpoint block can contain up to three parts: a data type, a source database, and a unique identifier. The type is mandatory, while the source and identifier are optional.

These parts determine the response:

  • A query with only a data type returns aggregated counts grouped by source.

  • A query with a data type and source returns a paginated list of entities.

  • A query that also includes an accession returns detailed information about a single entity.

Examples:

| Query | Response type | Description |
| --- | --- | --- |
| /api/entry | List of counts | Counts of entries in InterPro and member databases |
| /api/protein | List of counts | Counts of proteins in UniProtKB and its reviewed and unreviewed sets |
| /api/structure | List of counts | Counts of structures in PDB |
| /api/entry/interpro | List of entities | List of InterPro entries |
| /api/entry/cdd | List of entities | List of CDD entries |
| /api/proteome/uniprot | List of entities | List of proteomes |
| /api/entry/interpro/ipr023411 | Detailed object | Details of InterPro entry IPR023411 |
| /api/entry/pfam/pf06235 | Detailed object | Details of Pfam entry PF06235 |
| /api/taxonomy/uniprot/9606 | Detailed object | Details of taxonomy identifier 9606 |
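The rules for endpoint blocks can be sketched in a few lines of Python. The helper names below are hypothetical, and the check that an accession must be preceded by a source is an assumption drawn from the type/source/accession ordering rather than something the API documentation states explicitly.

```python
def endpoint_block(data_type, source=None, accession=None):
    """Join the parts of one endpoint block: type[/source[/accession]]."""
    # Assumption: an accession is only meaningful after a source database.
    if accession is not None and source is None:
        raise ValueError("an accession requires a source database")
    parts = [data_type, source, accession]
    return "/".join(part for part in parts if part is not None)


def expected_response(source=None, accession=None):
    """Name the kind of response predicted by the three rules above."""
    if accession is not None:
        return "detailed object"
    if source is not None:
        return "list of entities"
    return "list of counts"


counts_block = endpoint_block("entry")
list_block = endpoint_block("entry", "cdd")
detail_block = endpoint_block("taxonomy", "uniprot", "9606")
```

Prefixing any of these blocks with `/api/` gives a query path of the kind used throughout this guide.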

Filtering data

The first endpoint block defines the main type of data returned. Additional endpoint blocks then filter that main dataset.

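Examples:

| Query | Description |
| --- | --- |
| /api/entry/protein/uniprot/P04637 | Counts of entries, restricted to those matching protein P04637 |
| /api/entry/interpro/protein/uniprot/P04637 | List of InterPro entries, restricted to those matching protein P04637 |
| /api/protein/reviewed/entry/interpro/ipr002117 | List of reviewed proteins, restricted to those containing IPR002117 |
| /api/taxonomy/uniprot/entry/interpro/IPR026381 | List of organisms, restricted to those with a protein matching IPR026381 |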
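Composing a filtered query amounts to concatenating endpoint blocks, with the first block selecting the data and the rest acting as filters. The sketch below builds such paths from tuples; `build_path` is a hypothetical helper, and it does not check that the block contents are valid types, sources or accessions.

```python
def build_path(main_block, *filter_blocks):
    """Build a query path from one main block and any number of filters.

    Each block is a tuple of one to three strings:
    (type,), (type, source) or (type, source, accession).
    """
    segments = []
    for block in (main_block,) + filter_blocks:
        if not 1 <= len(block) <= 3:
            raise ValueError(f"a block has one to three parts, got {block!r}")
        segments.extend(block)
    return "/api/" + "/".join(segments)


# InterPro entries, filtered to those found in human P53.
p53_entries = build_path(("entry", "interpro"), ("protein", "uniprot", "P04637"))
# Entry counts across all sources, filtered by the same protein.
p53_counts = build_path(("entry",), ("protein", "uniprot", "P04637"))
```

Swapping the order of the blocks changes what is returned: putting the protein block first would return protein data filtered by entry instead.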

API query modifiers

Modifiers allow filtering and/or ordering the data returned by an API call. They are appended to the URL as query parameters (e.g. ?page_size=100). Multiple compatible modifiers can be combined with &.
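A small helper can append modifiers to a query path, as sketched below. `with_modifiers` is an illustrative name. Passing `True` writes a bare flag such as `?latest_entries`, and leaving colons and commas unescaped is a readability choice that mirrors the example URLs in this guide rather than a documented requirement.

```python
from urllib.parse import quote


def with_modifiers(path, **modifiers):
    """Append modifiers to a query path as URL query parameters.

    A value of True adds a bare flag; any other value is written as
    key=value. Colons and commas are kept unescaped so GO identifiers
    and accession lists look like the examples in this guide.
    """
    parts = []
    for key, value in modifiers.items():
        if value is True:
            parts.append(key)
        else:
            parts.append(f"{key}={quote(str(value), safe=':,')}")
    if not parts:
        return path
    return path + "?" + "&".join(parts)


paged = with_modifiers("/api/protein/reviewed", page_size=100)
combined = with_modifiers("/api/protein/reviewed", go_term="GO:0004298", page_size=100)
flag_only = with_modifiers("/api/entry/interpro", latest_entries=True)
```

The helper does not know which modifiers are compatible with each other; the Combinable column in the tables below still has to be checked by the caller.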

Apply to any API call

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| page_size=<number up to 200> | Yes | Number of results returned per page (default: 20) | /api/protein/reviewed?page_size=100 |
| search=<text> | Yes | Entries matching the text search | /api/taxonomy/uniprot/?search=9606 |

/api/entry

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| group_by=type | No | Number of entries for each entry type (e.g. family, domain, site…) | /api/entry?group_by=type |
| group_by=source_database | No | Number of entries for each member database (e.g. Pfam, CDD…) | /api/entry?group_by=source_database |
| group_by=tax_id | No | Number of entries (InterPro + member databases) for key species | /api/entry?group_by=tax_id |
| group_by=go_terms | No | Number of entries (InterPro + member databases) for each GO term | /api/entry?group_by=go_terms |
| type=<entry type> | Yes | List of signatures with the entry type specified | /api/entry?type=family |
| go_category=[F, C, P] | Yes | List of GO terms for the category specified (P for Biological Process, F for Molecular Function, and C for Cellular Component) | /api/entry?go_category=F |
| go_term=<GO identifier> | Yes | Number of entries annotated with the given GO term, grouped by member database | /api/entry?go_term=GO:0004298 |
| ida_search | Yes | List of InterPro domain architectures (IDA) with protein counts | /api/entry?ida_search |
| ida_search=<ipr1,pf2,ipr3> | Yes | List of domain architectures and protein counts for the specified domain accessions | /api/entry?ida_search=IPR003100,IPR003165 |
| ida_search=<ipr1,pf2,ipr3>&ordered | Only with ida_search | List of domain architectures and protein counts for the specified domain accessions, where the order of the accessions matters | /api/entry?ida_search=IPR003100,IPR003165&ordered |
| ida_search=<ipr1,pf2,ipr3>&ordered&exact | Only with ida_search + ordered | Protein count for architectures containing only the specified domain accessions | /api/entry?ida_search=IPR003100,IPR003165&ordered&exact |
| ida_search=<ipr1,pf2,ipr3>&ida_ignore=<ipr4,pf6> | Only with ida_search | List of domain architectures and protein counts for the specified domain accessions, excluding architectures that contain any accession listed in ida_ignore | /api/entry?ida_search=IPR003100,IPR003165&ida_ignore=PF08699 |
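The dependencies between the domain architecture modifiers can be encoded in a small query builder. The sketch below is hypothetical: `ida_search_query` is not part of any InterPro client, it uses the `ida_ignore` spelling from the example URL, and it takes the Combinable column at its word that `exact` is only valid together with `ordered`. The bare `?ida_search` form without accessions is not covered.

```python
def ida_search_query(accessions, ordered=False, exact=False, ignore=()):
    """Build an /api/entry query that searches domain architectures.

    accessions: domain accessions the architecture must contain.
    ordered:    the accession order matters.
    exact:      only these accessions may be present (requires ordered).
    ignore:     accessions that must not appear in the architecture.
    """
    if not accessions:
        raise ValueError("at least one accession is required here")
    if exact and not ordered:
        raise ValueError("exact can only be used together with ordered")
    query = "ida_search=" + ",".join(accessions)
    if ordered:
        query += "&ordered"
    if exact:
        query += "&exact"
    if ignore:
        query += "&ida_ignore=" + ",".join(ignore)
    return "/api/entry?" + query


domains = ["IPR003100", "IPR003165"]
unordered_query = ida_search_query(domains)
exact_query = ida_search_query(domains, ordered=True, exact=True)
ignore_query = ida_search_query(domains, ignore=["PF08699"])
```

Raising an error for `exact` without `ordered` is a deliberate choice: it surfaces an invalid combination before any request is sent.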

/api/entry/<database name>

Database name can be: interpro, pfam, cdd, panther, sfld, cathgene3d, ssf, hamap, pirsf, prints, prosite, profile, smart, ncbifam.

Note

Since InterPro 94.0, tigrfams has been replaced by ncbifam. Temporary redirects are in place, but we recommend switching to ncbifam to avoid any problems.
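Client code that still carries the old name can normalise it before building a query. The sketch below validates a database name against the list above and maps tigrfams to ncbifam. The helper name is hypothetical, and lower-casing the input is an assumption based on the mixed-case examples in this guide (for instance InterPro and interpro).

```python
# Database names accepted after /api/entry/, as listed above.
ENTRY_DATABASES = {
    "interpro", "pfam", "cdd", "panther", "sfld", "cathgene3d", "ssf",
    "hamap", "pirsf", "prints", "prosite", "profile", "smart", "ncbifam",
}

# Names that have been replaced, mapped to their current equivalent.
RENAMED_DATABASES = {"tigrfams": "ncbifam"}


def normalize_database(name):
    """Return the current lower-case database name, or raise ValueError."""
    key = name.strip().lower()
    key = RENAMED_DATABASES.get(key, key)
    if key not in ENTRY_DATABASES:
        raise ValueError(f"unknown entry database: {name!r}")
    return key


legacy_name = normalize_database("TIGRFAMs")
pfam_name = normalize_database("Pfam")
```

Doing the mapping on the client side avoids relying on the temporary redirects.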

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| group_by=type | No | Number of signatures for each member database entry type (e.g. family, domain…) for the database selected | /api/entry/pfam?group_by=type |
| group_by=source_database | No | Number of signatures for the database selected | /api/entry/pfam?group_by=source_database |
| group_by=go_categories | No | Number of signatures for each GO term category for the database selected | /api/entry/cathgene3d?group_by=go_categories |
| group_by=tax_id | No | Number of signatures for key species for the database selected | /api/entry/cathgene3d?group_by=tax_id |
| group_by=go_terms | No | Number of entries for each GO term for the database selected | /api/entry/integrated?group_by=go_terms |
| type=<entry_type> | Yes | List of signatures with the entry type specified for the database selected | /api/entry/smart?type=domain |
| sort_by=accession | Yes | List of signatures sorted by accession (low to high) for the database selected | /api/entry/pfam?sort_by=accession |
| sort_by=integrated | Yes | List of signatures sorted with integrated ones first for the database selected | /api/entry/pfam?sort_by=integrated |
| extra_fields=[counters, entry_id, short_name, description, wikipedia, literature, hierarchy, cross_references, entry_date, is_featured, overlaps_with] | No | Includes the value of the selected fields in the results | /api/entry/InterPro?signature_in=hamap&extra_fields=description |
| annotation=[hmm, alignment, logo] | No | List of entries that have an annotation of the given type (hmm, alignment, logo) | /api/entry/pfam?annotation=hmm |

/api/entry/<integrated, unintegrated>

Information on member database signatures integrated/unintegrated in InterPro entries.

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| group_by=type | No | Number of integrated/unintegrated entries for each entry type (e.g. family, domain, site…) | /api/entry/unintegrated?group_by=type |
| group_by=source_database | No | Number of integrated/unintegrated entries for each member database (e.g. Pfam, CDD…) | /api/entry/unintegrated?group_by=source_database |
| group_by=tax_id | No | Number of integrated/unintegrated entries for key species | /api/entry/unintegrated?group_by=tax_id |
| group_by=go_terms | No | Number of integrated/unintegrated entries for each GO term | /api/entry/integrated?group_by=go_terms |
| extra_fields=[counters, entry_id, short_name, description, wikipedia, literature, hierarchy, cross_references, entry_date, is_featured, overlaps_with] | No | Includes the value of the selected fields in the results | /api/entry/integrated?group_by=go_terms&extra_fields=counters |

/api/entry/interpro

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| group_by=member_databases | No | Number of integrated signatures for each member database (e.g. Pfam, CDD…) | /api/entry/interpro?group_by=member_databases |
| latest_entries | No | List of InterPro entries integrated in the last InterPro release | /api/entry/interpro?latest_entries |
| signature_in=<memberdb> | Yes | List of InterPro entries that include a signature from the given member database | /api/entry/InterPro?signature_in=hamap |
| go_category=[F, C, P] | Yes | List of GO terms for the category specified (P for Biological Process, F for Molecular Function, and C for Cellular Component) | /api/entry/interpro?go_category=F |
| go_term=<GO identifier> | No | List of InterPro entries that have been annotated with the given GO term | /api/entry/interpro?go_term=GO:0004298 |

/api/entry/interpro/<InterPro entry accession>

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| interactions | No | List of interactions involving proteins that match the entry (obtained from the IntAct database) | /api/entry/InterPro/IPR000477?interactions |
| pathways | No | List of pathways involving proteins that match the entry (obtained from the MetaCyc and Reactome databases) | /api/entry/InterPro/IPR024156?pathways |
| annotation:info | No | Entry information | /api/entry/InterPro/IPR025743?annotation:info |
| extra_fields=[counters, entry_id, short_name, description, wikipedia, literature, hierarchy, cross_references, entry_date, is_featured, overlaps_with] | No | Includes the value of the selected fields in the results | /api/entry/InterPro/IPR024156?extra_fields=short_name |

/api/entry/<member database>

Note

Not available for InterPro.

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| interpro_status | Yes | Number of signatures integrated and unintegrated in InterPro entries for the member database selected | /api/entry/panther?interpro_status |
| integrated=<interpro accession> | Yes | List of signatures integrated in the specified InterPro entry | /api/entry/pfam?integrated=IPR003165 |

/api/entry/<member database>/<accession>

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| annotation=[hmm, alignment, logo] | No | Downloads the compressed annotation file of the given type (e.g. the signature HMM file), if it exists | /api/entry/pfam/pf02171?annotation=hmm |

/api/entry/protein

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| group_by=type | No | Number of proteins for each entry type (e.g. family, domain, site…) | /api/entry/protein?group_by=type |

/api/protein

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| group_by=tax_id | No | Number of proteins for each taxon | /api/protein?group_by=tax_id |
| group_by=go_terms | No | Number of proteins for each GO term | /api/protein?group_by=go_terms |
| group_by=match_presence | No | Number of proteins with/without an InterPro entry match | /api/protein?group_by=match_presence |
| group_by=is_fragment | No | Number of full/fragmented proteins | /api/protein?group_by=is_fragment |
| group_by=source_database | No | Number of reviewed and unreviewed proteins | /api/protein?group_by=source_database |
| match_presence=[true,false] | Yes | Number of proteins with [true]/without [false] a match to an InterPro entry | /api/protein?match_presence=false |
| tax_id=<accession> | Yes | Number of proteins that belong to this taxonomy id | /api/protein?tax_id=2711 |
| is_fragment=[true,false] | Yes | Number of proteins that are [true]/aren’t [false] fragments | /api/protein?is_fragment=true |

/api/protein/<uniprot/reviewed/unreviewed>

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| group_by=go_terms | No | List of proteins for each GO term for the protein source selected | /api/protein/reviewed?group_by=go_terms |
| group_by=is_fragment | No | Number of proteins that are and aren’t fragments for the protein source selected | /api/protein/reviewed?group_by=is_fragment |
| group_by=match_presence | No | Number of proteins that have and don’t have matches to InterPro entries for the protein source selected | /api/protein/reviewed?group_by=match_presence |
| group_by=tax_id | No | Number of proteins for each taxon for the protein source selected | /api/protein/uniprot?group_by=tax_id |
| group_by=source_database | No | Number of proteins for the protein source selected | /api/protein/unreviewed?group_by=source_database |
| go_term=<GO identifier> | Yes | List of proteins for the GO term and protein source selected | /api/protein/reviewed?go_term=GO:0004298 |
| id=<UniProt identifier> | Yes | Information about the protein with the specified UniProt identifier | /api/protein/reviewed?id=CYC_HUMAN |
| tax_id=<accession> | Yes | List of proteins corresponding to the tax_id specified for the protein source selected | /api/protein/uniprot?tax_id=2711 |
| ida=<ida_accession> | Yes | List of proteins with the specified domain architecture for the protein source selected | /api/protein/reviewed?ida=6ad3f81f5ba41a43b4c938fb2018f519f64e0548 |
| match_presence=[true,false] | Yes | List of proteins for the protein source selected with [true]/without [false] a match to an InterPro entry | /api/protein/reviewed?match_presence=true |
| is_fragment=[true,false] | Yes | List of proteins for the protein source selected that are [true]/aren’t [false] fragments | /api/protein/reviewed?is_fragment=true |
| extra_fields=[counters, identifier, description, sequence, gene, go_terms, evidence_code, residues, tax_id, proteome, extra_features, structure, is_fragment, ida_id, ida] | No | Includes the value of the selected fields in the results | /api/protein/reviewed?id=CYC_HUMAN&extra_fields=sequence |

/api/protein/<uniprot/reviewed/unreviewed>/<protein accession>

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| residues | No | Residue annotations for the protein selected | /api/protein/uniprot/A0A000?residues |
| structureinfo | No | CATH/SCOP domains matching the protein selected | /api/protein/uniprot/P02185?structureinfo |
| ida | No | Domain architecture of the protein, based on Pfam domains | /api/protein/reviewed/D4A7N1?ida |
| extra_features | No | Matches from the extra feature section (e.g. MobiDB-lite, Coil) | /api/protein/reviewed/D4A7N1?extra_features |
| isoforms | No | Different isoforms of the protein | /api/protein/reviewed/Q6ZNL6?isoforms |
| isoforms=<isoform_id> | No | Information about the isoform selected | /api/protein/reviewed/Q6ZNL6?isoforms=Q6ZNL6-1 |
| extra_fields=[counters, identifier, description, sequence, gene, go_terms, evidence_code, residues, tax_id, proteome, extra_features, structure, is_fragment, ida_id, ida] | No | Includes the value of the selected fields in the results | /api/protein/reviewed/Q6ZNL6?extra_fields=sequence |
| conservation=<member database> | No | Residue conservation calculated using HMMER for the member database specified | /api/protein/reviewed/D4A7N1?conservation=panther |

/api/proteome/uniprot

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| group_by=proteome_is_reference | No | Number of UniProt proteomes that are/aren’t UniProt reference proteomes | /api/proteome/uniprot?group_by=proteome_is_reference |
| extra_fields=[counters, strain, assembly] | No | Includes the value of the selected fields in the results | /api/proteome/uniprot?extra_fields=counters |

/api/set/<all, cdd, pfam>

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| extra_fields=[counters, description, relationships] | No | Includes the value of the selected fields in the results | /api/set/cdd?extra_fields=counters |

/api/set/<all, cdd, pfam>/<accession>

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| alignments | No | Alignment information for the database and set specified | /api/set/cdd/cl00014/?alignments= |

/api/structure

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| experiment_type=[x_ray,nmr,em] | No | Number of structures for the experiment type selected | /api/structure?experiment_type=nmr |
| group_by=experiment_type | No | Number of structures for each experiment type | /api/structure?group_by=experiment_type |

/api/structure/pdb

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| experiment_type=[x_ray,nmr,em] | Yes | List of PDB structures for the experiment type selected | /api/structure/PDB/?experiment_type=x-ray |
| resolution=<start-end> | Yes | List of PDB structures within the resolution range selected | /api/structure/pdb?resolution=1.0-2.5 |
| group_by=experiment_type | No | Number of PDB structures for each experiment type | /api/structure/pdb?group_by=experiment_type |
| extra_fields=[release_date, literature, chains, secondary_structures, counters] | Yes | Includes the value of the selected fields in the results | /api/structure/pdb?resolution=1.0-2.5&extra_fields=secondary_structures |

/api/structure/pdb/<pdb accession>

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| extra_fields=[release_date, literature, chains, secondary_structures, counters] | No | Includes the value of the selected fields in the results | /api/structure/pdb/101m?extra_fields=release_date |

/api/taxonomy/uniprot

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| scientific_name=<name> | No | Taxon hierarchy and counters | /api/taxonomy/uniprot?scientific_name=Bacteria |
| key_species | No | Taxonomy info for key species | /api/taxonomy/uniprot?key_species |
| extra_fields=[counters, scientific_name, full_name, lineage, rank] | Yes | Includes the value of the selected fields in the results | /api/taxonomy/uniprot?extra_fields=full_name |

/api/taxonomy/uniprot/<taxonomy accession>

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| with_names | No | Selected taxon hierarchy and names | /api/taxonomy/uniprot/1?with_names |
| filter_by_entry=<InterPro accession> | No | Selected taxon hierarchy and counters for the InterPro entry accession specified | /api/taxonomy/uniprot/1?filter_by_entry=IPR001165 |
| filter_by_entry_db=<db name> | No | Selected taxon hierarchy and counters for the database name specified (e.g. interpro, pfam, smart…) | /api/taxonomy/uniprot/1?filter_by_entry_db=interpro |

/api/protein/uniprot/entry/<source database>/<interpro accession>

Proteins with an AlphaFold model.

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| has_model=[true,false] | No | List of proteins with/without an AlphaFold prediction for the source database entry selected | /api/protein/uniprot/entry/InterPro/IPR000001/?has_model=true |

Proteins with an AlphaFold or BFVD model.

| Modifier | Combinable | Data returned | Example |
| --- | --- | --- | --- |
| with=[alphafold,bfvd] | No | List of proteins with an AlphaFold prediction or a BFVD prediction for the source database entry selected | /api/protein/uniprot/entry/InterPro/IPR000001/?with=alphafold |

Ongoing issues and future work

Some care is needed when combining filters to make sure the requested data matches the intended query. Some responses still contain nested lists that are limited to a maximum of 20 hits and cannot be paginated.