The MINT-inspired score (MI score) is a measure of confidence in molecular interactions annotated from literature. To assign MI scores, we closely follow an earlier approach developed by the MINT database team (Ceol et al., 2010) and adopt their confidence score formula.
The idea underlying the score is to collect pieces of evidence from each publication that supports an interaction record. Unlike the MINT team, we do not curate the actual publications and hence are not able to list various figures, figure panels, tables, etc., as separate pieces of evidence. Instead, we rely on the annotations of the interaction types and experimental detection methods provided by the original database curators using the PSI-MI controlled vocabulary. As an illustration, if an interaction was detected using two independent methods, such as "two hybrid pooling approach (MI:0398)" and "anti tag coimmunoprecipitation (MI:0007)", they are treated as two separate pieces of evidence. On the other hand, if the listed methods are "two hybrid pooling approach (MI:0398)" and "two hybrid (MI:0018)", they are treated as only one piece of evidence because the former term is a descendant of the latter in the PSI-MI ontology hierarchy.
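The merging of related detection methods can be sketched as follows. This is only an illustration, not the iRefWeb implementation: the `PARENT` map is a toy fragment holding the single parent-child relation mentioned above, the function names are hypothetical, and keeping the most general listed term is an assumption about how ties are resolved.

```python
# Toy fragment of the PSI-MI hierarchy (child term -> parent term).
# Only the relation stated in the text is included; a real run would
# need the full PSI-MI ontology.
PARENT = {"MI:0398": "MI:0018"}


def ancestors(term, parent=PARENT):
    """Return all ancestors of a term in the given hierarchy."""
    found = set()
    while term in parent:
        term = parent[term]
        found.add(term)
    return found


def count_evidence(methods, parent=PARENT):
    """Count listed methods, dropping any term whose ancestor is also listed."""
    listed = set(methods)
    kept = {m for m in listed if not (ancestors(m, parent) & listed)}
    return len(kept)


print(count_evidence(["MI:0398", "MI:0007"]))  # unrelated methods
print(count_evidence(["MI:0398", "MI:0018"]))  # descendant and ancestor
```

With the toy map, the first call counts two pieces of evidence and the second counts one, matching the two cases described above.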
In accordance with the MINT approach, we assign the full weight of 1 to pieces of evidence that correspond to the interaction type "direct interaction (MI:0407)" or its descendants. Otherwise, we reduce the evidence weight by half. Furthermore, we apply another reduction by half to evidence coming from high-throughput publications. Following the MINT approach, we define a publication as "high-throughput" if it supports 50 or more different interaction records in iRefWeb. Hence, for example, if the source-database curators have annotated a particular interaction as a "physical association (MI:0915)" supported by a high-throughput experiment, this piece of evidence contributes only 0.25 (non-direct, high-throughput) towards the overall weighted evidence for that interaction record.
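The weighting rule above can be written as a small function. This is a minimal sketch: the function and parameter names are hypothetical, and it assumes the two halvings multiply, so that direct evidence from a high-throughput publication would weigh 0.5.

```python
def is_high_throughput(num_interaction_records, threshold=50):
    """A publication is high-throughput if it supports 50 or more records."""
    return num_interaction_records >= threshold


def evidence_weight(is_direct, high_throughput):
    """Weight of one piece of evidence under the rule described above."""
    weight = 1.0 if is_direct else 0.5  # non-direct evidence is halved
    if high_throughput:
        weight *= 0.5                   # high-throughput evidence is halved again
    return weight


# "physical association" (non-direct) from a publication with 120 records
print(evidence_weight(is_direct=False, high_throughput=is_high_throughput(120)))
```

The example call reproduces the 0.25 case from the text.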
Once the weighted evidence is determined for each interaction record, the score assignment proceeds as follows. For every pair of interacting proteins A and B, we first identify all interactions that contain this pair, including multi-protein complexes. Next, we identify various homologs of A and B in this and other organisms (i.e. both paralogs and orthologs); we then find all interactions containing pairs A' and B', where A' is a homolog of A and B' is a homolog of B. The information on homology is taken from Inparanoid.
To apply the MI score formula, three types of information are collected:
These three values are substituted directly into the MI formula to arrive at a score for the pair of proteins A and B.
For multi-subunit complexes, we determine the MI scores for all possible pairs (i.e. using the so-called matrix expansion of a complex) and then take their median as the overall MI score for the complex.
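As a rough sketch of the matrix expansion step, the complex score can be computed by enumerating all unordered pairs of members and taking the median of their pairwise scores. The MI formula itself is not reproduced here, so `pair_score` is a hypothetical callable supplied by the caller, and the example scores are made up for illustration.

```python
from itertools import combinations
from statistics import median


def complex_score(members, pair_score):
    """Median of pairwise scores over all unordered pairs in a complex."""
    pairs = combinations(sorted(set(members)), 2)  # matrix expansion
    return median(pair_score(a, b) for a, b in pairs)


# Made-up pairwise scores for a three-protein complex.
toy_scores = {("A", "B"): 0.2, ("A", "C"): 0.5, ("B", "C"): 0.9}
print(complex_score(["A", "B", "C"], lambda a, b: toy_scores[(a, b)]))
```

A complex with fewer than two distinct members has no pairs, in which case `statistics.median` raises `StatisticsError`.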
MI (MINT-inspired) scores lie in the range [0,1]. They are calculated only for interactions in which all proteins come from the same species, that are supported by valid PubMed IDs, that are not predicted or taken from OPHID, that are not self-interactions, and that come from one of the following organisms:
Any interaction that does not meet the above requirements has no score calculated; this is displayed as 'n/c'.
Go here to find out how the score is calculated.
Go here for a detailed description of each column.
In general, we have tried to preserve identifiers across releases. However, due to a change in iRefIndex 9, interaction IDs had to be updated. Thus the same interaction in iRefWeb 3.9 and iRefWeb 4.0 (or higher) will have different IDs.
However, entering an old ID in a URL like this
http://wodaklab.org/iRefWeb/interaction/show/OLD_ID
should bring you to
http://wodaklab.org/iRefWeb/interaction/show/NEW_ID
Note that not all interactions in iRefIndex 8 still exist in iRefIndex 9. In particular, due to changes in how taxonomy was handled for yeast and to database changes, a large number of interactions have changed, so many old identifiers will no longer exist.
If you require them, mapping files can be found here.
You can use the MITAB file (either downloaded from the search page, or from here) as the basis for generating links to iRefWeb using columns 49, 50, and 51.
column 48 | column 49 (icrogida) | column 50 (icrogidb) | column 51 (icrigid) | column 52 |
---|---|---|---|---|
... | 1466236 | 4803728 | 101 | ... |
Using the examples above, you can then generate your links as follows:
Page | Link Structure |
---|---|
Protein (Interactor) Page | http://wodaklab.org/iRefWeb/interactor/show/1466236 http://wodaklab.org/iRefWeb/interactor/show/4803728 |
Search Results Page for a Protein | http://wodaklab.org/iRefWeb/search/index?search.q=act_id:1466236 http://wodaklab.org/iRefWeb/search/index?search.q=act_id:4803728 |
Interaction Page | http://wodaklab.org/iRefWeb/interaction/show/101 |
Search Results Page for the Interaction | http://wodaklab.org/iRefWeb/search/index?search.q=int_id:101 |
If you have an Entrez Gene ID (for example 10277), you can also try linking this way:
http://wodaklab.org/iRefWeb/interactor/showForGene/10277
(Note not all proteins in iRefWeb can be mapped to Entrez Gene IDs.)
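The link structures above can be generated programmatically. The following is a minimal sketch, assuming the MITAB file is tab-separated and that columns 49, 50 and 51 are counted from 1; the function names are hypothetical.

```python
BASE_URL = "http://wodaklab.org/iRefWeb"


def links_from_mitab_row(row, base_url=BASE_URL):
    """Build iRefWeb links from one tab-separated MITAB row."""
    cols = row.rstrip("\n").split("\t")
    # Columns 49, 50, 51 (1-based) are indices 48, 49, 50 (0-based).
    icrogid_a, icrogid_b, icrigid = cols[48], cols[49], cols[50]
    interactors = (icrogid_a, icrogid_b)
    return {
        "interactor_pages": [f"{base_url}/interactor/show/{i}" for i in interactors],
        "interactor_searches": [
            f"{base_url}/search/index?search.q=act_id:{i}" for i in interactors
        ],
        "interaction_page": f"{base_url}/interaction/show/{icrigid}",
        "interaction_search": f"{base_url}/search/index?search.q=int_id:{icrigid}",
    }


def gene_link(entrez_gene_id, base_url=BASE_URL):
    """Build the interactor link for an Entrez Gene ID."""
    return f"{base_url}/interactor/showForGene/{entrez_gene_id}"


# Placeholder row with 52 columns; only columns 49-51 carry real values.
example_cols = ["-"] * 52
example_cols[48:51] = ["1466236", "4803728", "101"]
for name, value in links_from_mitab_row("\t".join(example_cols)).items():
    print(name, value)
print(gene_link(10277))
```

When reading a downloaded file, any header line would need to be skipped before passing rows to `links_from_mitab_row`.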
There are a variety of reasons why the same paper might be annotated differently by the various source databases. We have examined these variations in the paper "Literature curation of protein interactions: measuring agreement across major public databases".
Each distinct protein binary interaction or complex can be assigned a distinct rigid that is calculated using only the primary sequence and taxon identifiers of the participant proteins. If two interaction records share the same rigid, they are said to belong to the same group of redundant interactions; this means that their protein participants all have the same primary sequence and come from the same organism. However, the experiments used to support the interaction may be different in each record. The rigid is an alphanumeric string.
NP, LPR and HPR values can help you focus your search on interactions according to their relationship(s) to PubMed records, for example if you want interactions with multiple pieces of evidence (PubMed records) or interactions from (or not from) high-throughput experiments.
Below is a simplified example showing how these numbers are derived for a given interaction. A checkmark indicates that an interaction was noted in that PubMed record; an x indicates that it was not.
Interaction | Pubmed 1 | Pubmed 2 | Pubmed 3 | Pubmed 4 | NP | LPR | HPR |
---|---|---|---|---|---|---|---|
interaction A | ✓ | ✓ | x | x | 2 | 1 | 3 |
interaction B | x | ✓ | ✓ | x | 2 | 3 | 3 |
interaction C | x | ✓ | ✓ | ✓ | 3 | 1 | 3 |
interaction D | x | x | ✓ | x | 1 | 3 | 3 |
In the table above, for Interaction C, its NP, LPR and HPR values are determined as shown below:
Total interactions found in each Pubmed that includes Interaction C:
P2 Ints = |{ int A, int B, int C }| = 3
P3 Ints = |{ int B, int C, int D }| = 3
P4 Ints = |{ int C }| = 1
Calculations for Interaction C:
NP (int C) = |{ Pubmed 2, Pubmed 3, Pubmed 4 }| = 3
LPR (int C) = min( P2 Ints, P3 Ints, P4 Ints ) = min( 3, 3, 1 ) = 1
HPR (int C) = max( P2 Ints, P3 Ints, P4 Ints ) = max( 3, 3, 1 ) = 3
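The same calculation can be expressed in a few lines of code. This sketch follows the worked example for Interaction C: NP is the number of PubMed records that report the interaction, and LPR and HPR are the smallest and largest number of interactions reported by any of those records. The function name and data layout are hypothetical, and only the three PubMed records listed in the worked example are included.

```python
def np_lpr_hpr(interaction, pubmed_to_interactions):
    """Return (NP, LPR, HPR) for one interaction."""
    sizes = [
        len(interactions)
        for interactions in pubmed_to_interactions.values()
        if interaction in interactions
    ]
    if not sizes:
        return 0, None, None  # no supporting PubMed record
    return len(sizes), min(sizes), max(sizes)


# Interactions reported by each PubMed record in the worked example.
example = {
    "Pubmed 2": {"int A", "int B", "int C"},
    "Pubmed 3": {"int B", "int C", "int D"},
    "Pubmed 4": {"int C"},
}
print(np_lpr_hpr("int C", example))  # (3, 1, 3)
```

A low HPR indicates that every supporting publication reports few interactions, while a high LPR indicates that the interaction is only seen in larger-scale publications.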