How to search the InterPro website?¶
A search can be performed on the InterPro homepage using the Search box component, by clicking on the Search tab in the navigation menu, or by clicking on the magnifying glass in the navigation banner. There are five different types of search available in InterPro:
Quick search¶
The magnifying glass in the navigation banner allows a quick search for a specified keyword. A search is triggered by entering some text and then pressing the enter/return key or clicking the magnifying glass. If the keyword is text, the results are displayed as described in the Text search section. If the keyword is an accession, the search redirects automatically to the corresponding InterPro page under the Browse tab in the navigation menu.
Sequence search¶
A sequence, or a batch of sequences, of nucleotides or amino acids can be submitted in FASTA format, either by pasting it into the dedicated text area or by uploading a FASTA file. The Advanced options allow users to select the sequence type (protein, i.e. amino acids, or RNA/DNA, i.e. nucleotides), as well as the InterPro member databases and other sequence features of interest to search against (all are selected by default). The sequence search is performed using the InterProScan software. While the search is running, the user can continue to navigate the website, other browser tabs or other applications, and will receive a pop-up notification when the job has completed (this requires browser notifications to be allowed).
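Before submitting a batch, it can help to check locally that the input really is well-formed FASTA. The following is a minimal sketch, not part of InterPro or InterProScan: the function names `parse_fasta` and `guess_sequence_type` are hypothetical, and the nucleotide check is only a rough heuristic (a sequence made up solely of A, C, G, T, U or N is assumed to be nucleotide).

```python
def parse_fasta(text):
    """Split FASTA-formatted text into (header, sequence) pairs."""
    records = []
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def guess_sequence_type(sequence):
    """Heuristic only: sequences using just A, C, G, T, U, N are treated as nucleotide."""
    if sequence and set(sequence) <= set("ACGTUN"):
        return "nucleotide"
    return "protein"


# Hypothetical two-sequence batch used purely for illustration.
batch = """>seq1
MKKTAIAIAV
ALAGFATVAQ
>seq2
ATGAAAAAGACC
"""
records = parse_fasta(batch)
for name, seq in records:
    print(name, len(seq), guess_sequence_type(seq))
```

A short peptide composed only of those six letters would be misclassified by this heuristic, which is one reason the sequence type can be set explicitly in the Advanced options.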
Sequence search results¶
Accessing results¶
Protein sequence search results can be found under the Results tab within the Your InterProScan Searches section of the navigation menu. This page presents sequence searches performed within the past seven days, with the most recent searches appearing at the top of the list.
Search status and management¶
The Status column provides information about the search state. A completed search is indicated with the word ‘Completed’, whilst a searching symbol shows an in-progress search. Users can delete their searches through the Action column (bin icon). Search results are automatically saved in the browser.
Summary of sequence searches.¶
Importing previous searches¶
Users can import previous searches through two methods. The first method involves entering the job ID for searches conducted within the last seven days on InterPro servers. Alternatively, users can upload an InterProScan output file in JSON format. When importing nucleotide sequence searches, the system creates separate job results for each Open Reading Frame (ORF), with ORFs from the same sequence automatically grouped together. This feature is particularly useful for users requiring InterProScan graphic output formats for publications and other purposes.
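An InterProScan JSON file can also be inspected locally before it is imported. The sketch below is illustrative only: it assumes the file has a top-level "results" list whose items carry "xref", "matches", "signature", "entry" and "locations" keys, which should be checked against the actual output of the InterProScan version in use. The function name `summarise_matches` and all accessions in the example are made-up placeholders.

```python
import json


def summarise_matches(document):
    """Flatten a parsed InterProScan JSON document into simple tuples.

    Each tuple is (sequence id, signature accession, InterPro entry
    accession or None, start, end). The key names are assumptions about
    the JSON layout, not a guaranteed schema.
    """
    rows = []
    for result in document.get("results", []):
        xrefs = result.get("xref") or [{}]
        sequence_id = xrefs[0].get("id", "unknown")
        for match in result.get("matches", []):
            signature = match.get("signature", {})
            entry = signature.get("entry") or {}  # null for unintegrated signatures
            for location in match.get("locations", []):
                rows.append((
                    sequence_id,
                    signature.get("accession"),
                    entry.get("accession"),
                    location.get("start"),
                    location.get("end"),
                ))
    return rows


# Placeholder document with invented accessions and coordinates.
example = json.loads("""
{
  "results": [
    {
      "xref": [{"id": "my_protein"}],
      "matches": [
        {"signature": {"accession": "SIG0001", "entry": {"accession": "ENTRY0001"}},
         "locations": [{"start": 10, "end": 50}]},
        {"signature": {"accession": "SIG0002", "entry": null},
         "locations": [{"start": 60, "end": 90}]}
      ]
    }
  ]
}
""")
rows = summarise_matches(example)
for row in rows:
    print(row)
```

To read a real file, the inline string would be replaced by `json.load` on an open file handle.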
Search results summary¶
Selecting a job ID or entry in the Results column reveals detailed information about the search, including the sequence type, number of sequences analysed, current status, and expiry date. Users can perform several actions on their search results. The Resubmit All button allows running searches again using the latest InterProScan version. The search results can also be downloaded in different formats.
InterProScan search results (Sequences) page.¶
Result export options¶
Search results can be exported using the Download button. Within the first seven days, users can choose from multiple format options including TSV, JSON, XML, and GFF. For searches saved locally after the seven-day period, results remain available in JSON format only.
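A downloaded TSV file can be processed with a few lines of Python. This is a rough sketch under stated assumptions: the column order follows the InterProScan TSV layout as commonly documented (protein accession, MD5, length, analysis, signature accession, signature description, start, stop, score, status, date, then optional InterPro and annotation columns) and should be verified against the file actually downloaded. The column names in `COLUMNS`, the function `read_tsv` and the example row are hypothetical.

```python
import csv
import io

# Assumed column order; the labels themselves are chosen for this sketch.
COLUMNS = [
    "protein_accession", "md5", "length", "analysis",
    "signature_accession", "signature_description",
    "start", "stop", "score", "status", "date",
    "interpro_accession", "interpro_description",
    "go_terms", "pathways",
]


def read_tsv(handle):
    """Read tab-separated match rows into dictionaries keyed by COLUMNS."""
    rows = []
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if not fields:
            continue  # skip empty lines
        # zip() stops at the shorter input, so missing optional columns are simply absent.
        row = dict(zip(COLUMNS, fields))
        row["start"] = int(row["start"])
        row["stop"] = int(row["stop"])
        rows.append(row)
    return rows


# One invented row with placeholder values, for illustration only.
example_line = "\t".join([
    "my_protein", "md5-placeholder", "350", "ExampleDB", "SIG0001",
    "Example signature", "10", "50", "1.0E-5", "T", "01-01-2024", "-", "-",
])
tsv_rows = read_tsv(io.StringIO(example_line + "\n"))
print(tsv_rows[0]["signature_accession"], tsv_rows[0]["start"], tsv_rows[0]["stop"])
```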
Sequence viewer interface¶
The sequence viewer displays the full-length sequence as a grey bar at the top of the interface, followed by InterPro matches organised into categories such as Families, Domains, and Conserved residues. Users can choose between two display modes: Summary View and Full View. The Summary View presents a condensed overview showing Families, a simplified domain representation, and Conserved sites, whilst the Full View reveals all available annotations.
Match visualisation¶
Each match in the viewer is represented by colour-coded bars indicating protein families, domains, or important sites. When a signature has been integrated into an InterPro entry, the entry appears above its contributing database signatures, on the right-hand side of the viewer. Non-integrated database signatures have the Unintegrated label displayed on the right-hand side of the viewer. InterProScan does not specify signature types for unintegrated signatures. When a signature lacks a consistent type and is not integrated into an InterPro entry, it is displayed in the Unintegrated category.
Additional features and annotations¶
The viewer includes conserved residue annotations and, when available, signal peptide and transmembrane region information. Below the sequence viewer, users can find GO terms associated with matching InterPro entries and PANTHER signatures. The GO terms are assigned manually to InterPro entries using the Gene Ontology and provide insights into the protein’s biological process, molecular function, and cellular location.
Example of a protein sequence analysis¶
Let’s imagine you would like to analyse the following protein sequence:
>my_protein
MKKTAIAIAVALAGFATVAQAAPKDNTWYAGAKLGWSQYHDTGFIHNDGPTHENQLGAGAFGGYQVNPYVGFEMGYDWLG
RMPYKGDNINGAYKAQGVQLTAKLGYPITDDLDVYTRLGGMVWRADTKSNVPGGPSTKDHDTGVSPVFAGGIEYAITPEI
ATRLEYQWTNNIGDANTIGTRPDNGLLSVGVSYRFGQQEAAPVVAPAPAPAPEVQTKHFTLKSDVLFNFNKSTLKPEGQQ
ALDQLYSQLSNLDPKDGSVVVLGFTDRIGSDAYNQGLSEKRAQSVVDYLISKGIPSDKISARGMGESNPVTGNTCDNVKP
RAALIDCLAPDRRVEIEVKGVKDVVTQPQA
The sequence viewer reveals several InterPro entries including two families (F), three domains (D), and two homologous superfamilies (H). The first family entry contains signatures from both PRINTS (PR01022) and HAMAP (MF_00842), whilst subsequent entries show various combinations of database signatures. The protein contains two domains: an N-terminal OmpA_membrane and a C-terminal OMPA_2. To learn more about each domain’s function, hover over it to display a tooltip and click the InterPro accession. Additional features include an N-terminal signal peptide and specific conserved residue annotations towards the C-terminus, provided by CDD.
Sequence viewer displaying the results of the sequence search.¶
Text search¶
The text search is available by selecting the “By Text” section under the Search tab in the website menu. It allows the database to be searched for the following information:
Name or keyword (e.g. Afadin)
InterPro accession (e.g. IPR000562)
Member database signature accession (e.g. PF00040)
Protein accession (e.g. P04937) or identifier/short name (e.g. FINC_RAT)
PDB structure (e.g. 6AR9)
Gene name (e.g. BRCA2)
GO terms (e.g. GO:0005911)
Proteome accession (e.g. UP000000304)
Taxonomy accession (e.g. 7240)
Set/Clan accession (e.g. CL0451)
Entering a name or keywords retrieves a list of all the InterPro entries and InterPro member database signatures that contain the searched words in their title or description. By default, the searched term is highlighted in the results list and the description is shortened. Clicking on the symbol located to the left of the Export button removes the highlight and shows the full description text. The setting is saved and is also applied to other text searches throughout the website.
Entering an accession number gives an exact match and a quick access to the corresponding InterPro page. It also displays the list of the InterPro entries and any member database signatures linked to that accession number/identifier.
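The accession formats listed above are distinctive enough that a query can often be recognised from its shape alone. The sketch below illustrates that idea with regular expressions derived from the examples in this section. It is not the logic used by the InterPro website: the function name `classify_query`, the labels it returns and the order in which patterns are tried are all assumptions made for illustration.

```python
import re

# Patterns are tried in order; the first full match wins.
# The labels are hypothetical names chosen for this sketch.
PATTERNS = [
    ("interpro_entry", r"IPR\d{6}"),
    ("pfam", r"PF\d{5}"),
    ("pfam_clan", r"CL\d{4}"),
    ("go_term", r"GO:\d{7}"),
    ("proteome", r"UP\d{9}"),
    # Purely numeric queries are treated as taxonomy IDs before the PDB
    # pattern is tried, because "7240" would also fit the PDB shape.
    ("taxonomy", r"\d+"),
    ("pdb_structure", r"[0-9][A-Z0-9]{3}"),
    # Published UniProtKB accession pattern.
    ("uniprot_accession",
     r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"),
]


def classify_query(query):
    """Return a label for an accession-like query, or 'text' for anything else."""
    candidate = query.strip().upper()
    for label, pattern in PATTERNS:
        if re.fullmatch(pattern, candidate):
            return label
    return "text"


for example in ["IPR000562", "PF00040", "P04937", "6AR9", "GO:0005911",
                "UP000000304", "7240", "CL0451", "Afadin", "FINC_RAT"]:
    print(example, "->", classify_query(example))
```

Names, gene symbols and protein identifiers such as FINC_RAT have no fixed shape, so this sketch leaves them to a plain text search.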
Selecting the accession number or name of any entry in the list of entries opens the corresponding InterPro page (e.g. member database signature, InterPro entry). An overview of the entry is provided, and tabs in the left-hand side menu allow specific information for the entry to be viewed, for example the species in which a protein has been found, or structures matching an entry. More information is available in the section on browsing an InterPro page.
Domain architecture search¶
This search option allows the retrieval of protein sequences that contain specific Pfam/InterPro domains in a particular arrangement, referred to as a “domain architecture”. For example, protein sequences containing both an SH2 domain and an SH3 domain can be retrieved. Domains that the proteins should or should not contain can be included in or excluded from the domain architecture respectively. Selecting “Order of domain matters” makes it possible to arrange the domains in a particular order. Selecting “Exact match” restricts the search to proteins containing only the selected domains (no extra domains in the proteins). Domains can be selected by entering a domain name, a Pfam accession, or an InterPro accession if a Pfam entry is integrated in it.
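The interplay of included domains, excluded domains, “Order of domain matters” and “Exact match” can be made concrete with a small sketch. This is one possible reading of those options, not the implementation behind the website: the function `matches_architecture` is hypothetical, domains are represented by plain labels rather than accessions, and ordered matching is interpreted here as "the included domains appear in that order, possibly with other domains in between".

```python
def matches_architecture(protein_domains, include, exclude=(),
                         order_matters=False, exact=False):
    """Check a protein's domains (listed N- to C-terminal) against a query.

    include       -- domains the protein must contain
    exclude       -- domains the protein must not contain
    order_matters -- included domains must occur in the given order
    exact         -- the protein must contain the included domains and no others
    """
    if any(domain in protein_domains for domain in exclude):
        return False
    if exact:
        if order_matters:
            return list(protein_domains) == list(include)
        return sorted(protein_domains) == sorted(include)
    if order_matters:
        # Ordered-subsequence test: each 'in' consumes the iterator up to
        # and including the matching domain, so order is enforced.
        remaining = iter(protein_domains)
        return all(domain in remaining for domain in include)
    return all(domain in protein_domains for domain in include)


# Hypothetical protein with three domains, N- to C-terminal.
protein = ["SH3", "SH2", "Kinase"]
print(matches_architecture(protein, ["SH2", "SH3"]))                      # contains both
print(matches_architecture(protein, ["SH2", "SH3"], order_matters=True))  # wrong order
print(matches_architecture(protein, ["SH3", "SH2"], exact=True))          # extra domain present
```

Under this reading, “Exact match” without ordering compares the domains as unordered collections, so repeated domains must also be repeated in the query.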
Once a search has been performed, the results are displayed below the search component and show the number of proteins followed by the corresponding domain architecture. For each domain architecture, the domains are drawn to scale according to their actual length in a reference protein. Hovering over a domain displays a tooltip with more details, including the domain’s position. Clicking on the number of proteins redirects to the protein section of the Browse tab in the navigation menu, showing the list of proteins, which can be filtered to a specific member database, if required, as described in the browse feature.
By default, Pfam entries are shown in the results. This can be changed to show InterPro entries by toggling the Pfam checkbox to InterPro and vice versa.
The domain architectures can be downloaded in JSON and TSV formats through the Export button.
Using Browse feature to search and filter InterPro¶
The browse search page can be accessed by clicking on the Browse tab in the navigation menu. The browse search provides a powerful functionality to select subsets of data available in InterPro by selecting filters according to the results required. For example, this page can be used to browse all entries which have a contributing signature from a particular member database e.g. HAMAP, or to retrieve all proteins from a certain taxon, e.g. Escherichia coli, that contain a specific domain e.g. OmpA-like domain.
Below we describe how to use the browse search feature:
Select a data type
The browse page opens with seven data types to choose from: InterPro entries, Member database signatures, Proteins, Structures, Taxonomies, Proteomes and Clans/Sets.
Select any additional filters
The filter options displayed vary according to the data type selected.
Sort by accession
The lists can be ordered by accession in ascending or descending order by clicking on the arrow on the right side of the column name Accession when browsing by InterPro, Member DB and Clan/Set.
Member database filter¶
The “Select your database” option is available when browsing by Member DB, Protein, Structure, Taxonomy and Set. It allows results to be retrieved from all or a selection of InterPro member databases. Only the databases that contain signatures for the chosen data type are displayed as options. By default, all the member databases are selected, except when browsing by Member DB, where Pfam is the default option.
Text filter¶
The “Search entries” box allows results to be filtered to match the text entered. For example, the text could be a keyword that might be found in entry names. Specific protein names or taxa can also be entered. By default, the searched term is highlighted in yellow in the results list. This can be disabled by clicking on the symbol that appears between the text box and the Export button once the search has started. The setting is saved and is also applied to other text searches throughout the website.
Data-type specific filters¶
InterPro entry filters¶
When Browse by InterPro is selected, the following filters can be applied:
InterPro Type: limits the data in the data views to the selected InterPro entry types.
GO Terms: filters by selected GO terms from InterPro2GO.
New entries: shows all the entries or only the entries created or made available in the most recent release.
Curation status: shows all the entries or only those in one of the following categories:
Curated: entries that have been created by an InterPro curator
AI-generated (unreviewed): entries that have been created automatically by Artificial Intelligence
AI-generated (reviewed): entries that have been created automatically by Artificial Intelligence for which the content has been verified by an InterPro curator
More information about AI-generated content on the InterPro website.
Member database filters¶
When Browse by Member DB is selected and a member database has been chosen, subsequent filters can be applied:
Member Database Entry Type: select the types of signatures required. This is dependent on the database type selected. For example, if a database contains both domains and family signatures you can filter the results for a specific type.
InterPro state: select all signatures from the selected database or only those signatures that have been integrated into InterPro.
Curation status: shows all the signatures or only those in one of the following categories:
Curated: signatures for which the name and description have been created by a scientific curator.
AI-generated (unreviewed): signatures for which the name and description have been created automatically by Artificial Intelligence.
More information about AI-generated content on the InterPro website.
Protein filters¶
Just as with the Member DB data type, the Protein filters change based on the selection in the member database filter component. The basic filters are displayed irrespective of the selection made, and an extra filter is displayed when the “All Proteins” option is selected.
Database selected¶
If a member database has been selected, the following filters are displayed:
UniProt Curation: UniProtKB is split into two sections. The reviewed set (Swiss-Prot) is manually curated, and the unreviewed set (TrEMBL) is derived from public databases and automatically integrated into UniProt.
Taxonomy: this filter allows the displayed list of proteins to be limited to certain organisms.
Sequence Status: this filter allows proteins to be limited to complete proteins or fragments.
All Proteins¶
In addition to the filters mentioned above, the Matching Entries filter is displayed when the “All Proteins” option is selected in the member database filter component. This filter allows the selection of proteins which do or do not contain matches to entries in the InterPro dataset.
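The protein filters combine with each other: a protein is listed only if it passes every filter that has been set. The sketch below models that behaviour on a small in-memory list. It is purely illustrative: the `Protein` class, its field names, the `filter_proteins` function and the example records are all hypothetical, and the taxonomy filter is simplified to an exact name comparison.

```python
from dataclasses import dataclass, field


@dataclass
class Protein:
    accession: str
    reviewed: bool        # UniProt Curation: reviewed vs unreviewed
    taxon: str            # Taxonomy
    is_fragment: bool     # Sequence Status: fragment vs complete
    entries: list = field(default_factory=list)  # matching InterPro entries


def filter_proteins(proteins, reviewed=None, taxon=None,
                    is_fragment=None, has_matches=None):
    """Keep proteins that satisfy every filter that is set (None means 'no filter')."""
    selected = []
    for protein in proteins:
        if reviewed is not None and protein.reviewed != reviewed:
            continue
        if taxon is not None and protein.taxon != taxon:
            continue
        if is_fragment is not None and protein.is_fragment != is_fragment:
            continue
        if has_matches is not None and bool(protein.entries) != has_matches:
            continue
        selected.append(protein)
    return selected


# Invented records with placeholder accessions.
proteins = [
    Protein("PROT_A", True, "Escherichia coli", False, ["ENTRY0001"]),
    Protein("PROT_B", False, "Escherichia coli", True, []),
    Protein("PROT_C", True, "Homo sapiens", False, []),
]
complete_reviewed = filter_proteins(proteins, reviewed=True, is_fragment=False)
unmatched = filter_proteins(proteins, has_matches=False)
print([p.accession for p in complete_reviewed])
print([p.accession for p in unmatched])
```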
Structure filters¶
Structure filters do not vary depending on which option has been selected in the member database filter component.
Experiment Type: this filter allows selection of structures based on the type of experimental data the structure is based on.
Resolution: this filter allows structures to be selected based on the resolution of the structure.
Data Display Options¶
The data display is the main part of the results section in the browse page and shows the data selected in the data type menu. The actual details shown will also be dependent on the selected data type.
Tabular view¶
The tabular view is the default view and is available for all InterPro data types. The table view icon formats data into a tabular view composed of rows representing individual entities. The table header describes the contents of each column. Clicking on one of the rows redirects to the corresponding InterPro page.
Tabular view example for InterPro entry data type¶
Grid view¶
The grid view is available for all InterPro data types. It displays a series of cards summarising details of the entities being viewed. Clicking on one of the cards redirects to the corresponding InterPro page.
Grid view example for InterPro entry data type¶
Tree view¶
The tree view is currently only enabled for taxonomy data. The tree view icon is only shown where a tree view is possible. The taxonomy tree viewer can be navigated by clicking on nodes or using keyboard arrow keys. This component is also used in the Taxonomy entry page.
Tree view example for Euryarchaeota phylum¶















