FAQ - iRefWeb - Wodak Lab

Topics:

How do you account for the differences between databases?

The differences in which interactions are annotated from a given paper can have several origins. A frequent one is inconsistency in the (often arbitrary) choice of protein splice forms. Despite the best attempts to find and merge duplicate interaction records across databases, such inconsistencies cannot be reliably resolved at present. Genuine differences and biases in the interpretation of the same reported data may also contribute. We are currently carrying out systematic analyses of the identified discrepancies in order to gain further insight into these issues. Until such insights are obtained, finding that one database failed to record some interactions from a given paper that another database did capture does not necessarily indicate an error in either database.

What is a RIGID (redundant interaction group identifier)?

Each distinct binary protein interaction or complex can be assigned a distinct RIGID that is calculated using only the primary sequences and taxon identifiers of the participant proteins. If two interaction records share the same RIGID, they are said to belong to the same group of redundant interactions; this means that their protein participants all have the same primary sequences and come from the same organisms. However, the experiments used to support the interaction may be different in each record. The RIGID is an alphanumeric string.
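As an illustration of the idea only, the sketch below derives an order-independent alphanumeric key from the participants' sequences and taxon identifiers. The function name `redundancy_key`, the use of a SHA-1 hex digest, and the short sequences are assumptions made for this example; this is not presented as the actual RIGID algorithm.

```python
import hashlib


def redundancy_key(participants):
    """Return an alphanumeric key for a list of (sequence, taxon_id) pairs.

    Hypothetical helper for illustration; not the real RIGID computation.
    """
    # Normalise each participant, then sort so participant order is irrelevant.
    normalised = sorted(f"{seq.upper()}|{taxon}" for seq, taxon in participants)
    # Hash the joined participants into a fixed-length hexadecimal string.
    return hashlib.sha1("\n".join(normalised).encode("ascii")).hexdigest()


# Made-up sequences and taxon identifiers, used only to show the behaviour.
record_1 = [("MKTAYIAKQR", 9606), ("MSDNGELEDK", 9606)]
record_2 = [("MSDNGELEDK", 9606), ("MKTAYIAKQR", 9606)]   # same proteins, other order
record_3 = [("MKTAYIAKQR", 9606), ("MSDNGELEDK", 10090)]  # one taxon differs

print(redundancy_key(record_1) == redundancy_key(record_2))  # True
print(redundancy_key(record_1) == redundancy_key(record_3))  # False
```

Records 1 and 2 fall into the same redundant group regardless of the experiments behind them, while record 3 does not because one participant comes from a different organism.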

What are NP, LPR, and HPR?

NP, LPR and HPR values can help you focus your search on interactions according to their relationship to PubMed publications, for example if you want interactions supported by multiple publications (PMIDs), or interactions that come (or do not come) from high-throughput experiments.

NP
Number of Publications:
Number of distinct publications (PubMed identifiers) that support this interaction.
LPR
Lowest PubMed Identifier (PMID) Reuse:
A publication may be used to support more than one interaction. The LPR score is the lowest number of unique interactions that are supported by any one of the interaction's PMIDs.
HPR
Highest PubMed Identifier (PMID) Reuse:
A publication may be used to support more than one interaction. The HPR score is the highest number of unique interactions that are supported by any one of the interaction's PMIDs.
LPR = 1
At least one of the PMIDs supporting the interaction has never been used to support any other interaction, and the interaction is not likely to rely solely on high-throughput data.
LPR < 20
Likely describes an interaction that is supported by a low-throughput study.
LPR >= 20
Likely describes an interaction derived solely from middle-throughput or high-throughput experiments.
HPR = 1
None of the PMIDs supporting the interaction has ever been used to support any other interaction, and the interaction has not been detected as part of a high-throughput study.
HPR >= 20
The interaction has been detected as part of a middle-throughput or high-throughput study.
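
The thresholds listed above can be read as a simple set of rules. The following is a minimal sketch of that reading; the function name `describe_reuse` and the `cutoff` parameter are hypothetical and are not part of iRefWeb, and the value 20 is taken from the list above.

```python
def describe_reuse(lpr, hpr, cutoff=20):
    """Translate LPR and HPR values into the interpretations listed above.

    Hypothetical helper for illustration; not an iRefWeb function.
    """
    notes = []
    if lpr == 1:
        notes.append("at least one supporting PMID supports no other interaction")
    if lpr < cutoff:
        notes.append("likely supported by a low-throughput study")
    else:
        notes.append("likely derived solely from middle- or high-throughput experiments")
    if hpr == 1:
        notes.append("no supporting PMID supports any other interaction")
    if hpr >= cutoff:
        notes.append("detected as part of a middle- or high-throughput study")
    return notes


for note in describe_reuse(lpr=1, hpr=35):
    print(note)
```

An interaction with LPR = 1 and HPR = 35, for instance, has at least one dedicated low-throughput publication but was also picked up by a larger screen.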

Below is a simplified example showing how these numbers are derived for a given interaction. A checkmark (✓) indicates that the interaction was noted in that PubMed publication; an x indicates that it was not.

Interaction  Pubmed 1  Pubmed 2  Pubmed 3  Pubmed 4  NP  LPR  HPR
Int A        ✓         ✓         x         x         2   1    3
Int B        x         ✓         ✓         x         2   3    3
Int C        x         ✓         ✓         ✓         3   1    3
Int D        x         x         ✓         x         1   3    3

In the table above, the NP, LPR and HPR values for Interaction C are determined as follows.

Total number of interactions found in each PubMed publication that includes Interaction C:
P2 Ints = |{ Int A, Int B, Int C }| = 3
P3 Ints = |{ Int B, Int C, Int D }| = 3
P4 Ints = |{ Int C }| = 1

NP = 3 (Interaction C is supported by Pubmed 2, Pubmed 3 and Pubmed 4)
LPR = min(3, 3, 1) = 1
HPR = max(3, 3, 1) = 3
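
The same derivation can be sketched in a few lines of code. The function name `reuse_scores` and the dictionary layout are assumptions made for this example, and the content of Pubmed 1 is inferred from the NP and LPR values given for Int A in the table.

```python
# Interactions reported by each publication in the simplified example.
pubmed_to_interactions = {
    "Pubmed 1": {"Int A"},  # inferred: Int A has NP = 2 and LPR = 1
    "Pubmed 2": {"Int A", "Int B", "Int C"},
    "Pubmed 3": {"Int B", "Int C", "Int D"},
    "Pubmed 4": {"Int C"},
}


def reuse_scores(interaction, pubmed_to_interactions):
    """Return (NP, LPR, HPR) for one interaction.

    Hypothetical helper for illustration; not an iRefWeb function.
    """
    # For every publication that reports the interaction, count how many
    # unique interactions that publication supports in total.
    counts = [
        len(interactions)
        for interactions in pubmed_to_interactions.values()
        if interaction in interactions
    ]
    if not counts:
        raise ValueError(f"no publication supports {interaction!r}")
    # NP: number of supporting publications; LPR and HPR: lowest and highest reuse.
    return len(counts), min(counts), max(counts)


print(reuse_scores("Int C", pubmed_to_interactions))  # (3, 1, 3)
```

Running the function for Int A, Int B and Int D reproduces the remaining rows of the table.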

How is iRefWeb implemented?

iRefWeb uses Grails as its web framework. Grails, in turn, is built on Spring, which provides the inversion of control needed to knit together the underlying data sources. For iRefWeb, the two main sources are MySQL and Lucene, each wrapped in its own object mapper: Hibernate and Compass, respectively.