How to download InterPro data?¶
InterPro data and search tools are freely available for download. We provide bulk downloads, data exports on each relevant InterPro page and an API to allow easy access for user scripts.
Bulk downloads are available under the Download section in the navigation menu. This section provides a downloadable version of the InterProScan software as well as files containing pre-calculated InterPro data for the current release. Data from previous releases are available on the InterPro FTP site.
This page is accessible through the Results tab in the navigation menu, under the “Your downloads” section.
This page lets the user select and filter InterPro data. The filtered data can then be downloaded in different file formats (if the selection contains fewer than 10,000 entities), retrieved using the provided API call, or retrieved through an automatically generated script.
For example, the image above shows Protein selected as the main data type, restricted to proteins in the UniProtKB/Swiss-Prot database; this selection is then filtered by the entry endpoint, with InterPro as the database and IPR000001 as the accession. In other words, this generates the list of Swiss-Prot proteins that match IPR000001 (also available under the Proteins tab on the InterPro entry page for IPR000001, with the reviewed option selected).
The following output formats are currently supported when fewer than 10,000 entities are selected:
Text: a list of accessions, one per line.
FASTA: a single file with multiple sequences in FASTA format (only available for proteins).
JSON: reuses the format returned by the InterPro API.
TSV: reformats the JSON from the API into a TSV file.
After selecting the output format, click the Download button at the bottom of the page to start the download.
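As a rough illustration of how the Text and TSV formats relate to the JSON format, the sketch below converts a small JSON payload into both. The field names (`results`, `metadata`, `accession`, `name`, `source_database`) and the sample records are assumptions made for this example, not a specification of the InterPro output, and the column choice is arbitrary.

```python
import csv
import io

# Hypothetical payload shaped like an API response; the field names and
# the second record are placeholders chosen for illustration only.
SAMPLE = {
    "results": [
        {"metadata": {"accession": "P99999", "name": "Cytochrome c",
                      "source_database": "reviewed"}},
        {"metadata": {"accession": "EXAMPLE1", "name": "placeholder",
                      "source_database": "reviewed"}},
    ]
}


def to_text(payload):
    """Return one accession per line, mirroring the Text format."""
    return "\n".join(r["metadata"]["accession"] for r in payload["results"])


def to_tsv(payload, columns=("accession", "name", "source_database")):
    """Flatten selected metadata fields into tab-separated rows with a header."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for record in payload["results"]:
        # Missing fields become empty cells rather than raising an error.
        writer.writerow([record["metadata"].get(c, "") for c in columns])
    return out.getvalue()
```

Calling `to_text(SAMPLE)` yields the two accessions on separate lines, and `to_tsv(SAMPLE)` yields a header row followed by one row per record.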
InterPro Application Programming Interface (API)¶
The InterPro API provides programmatic access to all the InterPro entries and their related entities in JSON format. The API has six main endpoints, which correspond to the InterPro data types: entry, protein, structure, taxonomy, proteome and set.
An API call is formed of one or multiple endpoint blocks. An endpoint block consists of a data type, a source database and an accession (e.g. api/datatype/sourcedb/accession).
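The way endpoint blocks combine into a call can be sketched with a small helper. This is a minimal sketch: the base URL is an assumption (the text above only gives relative paths), and the helper names `endpoint_block` and `build_url` are hypothetical, not part of any InterPro client library.

```python
# Assumed base URL of the public API; adjust if your deployment differs.
BASE_URL = "https://www.ebi.ac.uk/interpro/api"

# The six data types that act as main endpoints.
DATA_TYPES = {"entry", "protein", "structure", "taxonomy", "proteome", "set"}


def endpoint_block(data_type, source_db=None, accession=None):
    """Build one block of the form datatype[/sourcedb[/accession]]."""
    if data_type not in DATA_TYPES:
        raise ValueError(f"unknown data type: {data_type!r}")
    if accession is not None and source_db is None:
        raise ValueError("an accession requires a source database")
    parts = [data_type]
    if source_db is not None:
        parts.append(source_db)
    if accession is not None:
        parts.append(accession)
    return "/".join(parts)


def build_url(*blocks, base=BASE_URL):
    """Join one or more endpoint blocks onto the base URL."""
    if not blocks:
        raise ValueError("at least one endpoint block is required")
    return base.rstrip("/") + "/" + "/".join(blocks)
```

For instance, `build_url(endpoint_block("protein", "uniprot", "p99999"))` produces a single-block call, and passing several blocks to `build_url` chains them in the order given.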
For example, the URL /entry/interpro provides a pageable list of all the InterPro entries, and the URL /protein/uniprot/p99999 returns all the details of the protein identified by the UniProt accession P99999.
The combined URL /entry/interpro/protein/uniprot/p99999 returns the list of all the InterPro entries that match the protein P99999.
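A script consuming a pageable list might look like the sketch below, which uses only the Python standard library. It rests on several assumptions not stated above: the base URL, that each page is a JSON object with a `results` list and a `next` field holding the URL of the following page (or null on the last page), and that each result carries a `metadata.accession` field. Check these against an actual response before relying on them.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url):
    """GET a URL and decode the JSON body; an empty body yields an empty dict."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=30) as response:
        body = response.read()
    return json.loads(body) if body else {}


def iter_results(url, fetch=fetch_json):
    """Yield every item across pages by following the assumed 'next' links."""
    while url:
        page = fetch(url)
        yield from page.get("results", [])
        url = page.get("next")


def main():
    # Assumed absolute form of /entry/interpro/protein/uniprot/p99999.
    url = ("https://www.ebi.ac.uk/interpro/api"
           "/entry/interpro/protein/uniprot/p99999")
    for item in iter_results(url):
        # 'metadata' and 'accession' are assumed field names.
        print(item["metadata"]["accession"])
```

Calling `main()` performs the network requests. The `fetch` parameter exists so the paging logic can be exercised with a stand-in function, without contacting the server.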