About - Local Coherence Detection - Wodak Lab


Clustering is a popular unsupervised classification procedure that groups entities (genes or proteins) according to some distance metric. Genes can be clustered on their gene expression or genetic interaction profiles. Proteins can be clustered on sequence or structural similarity, or on their interaction partners.
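As a minimal sketch of distance-based clustering, the example below groups genes whose profiles lie within a chosen correlation distance of each other (single linkage). The gene names, the profile values and the `max_distance` threshold are invented for illustration; this is generic clustering, not the LCD procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant numeric profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_by_distance(profiles, max_distance):
    """Single-linkage grouping: genes joined by a chain of close pairs share a cluster."""
    genes = list(profiles)
    parent = {g: g for g in genes}  # simple union-find forest

    def find(g):
        while parent[g] != g:
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            # Correlation distance: 0 for identical shape, 2 for opposite shape.
            distance = 1.0 - pearson(profiles[a], profiles[b])
            if distance <= max_distance:
                parent[find(a)] = find(b)

    clusters = {}
    for g in genes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(c) for c in clusters.values())


# Toy profiles (invented values, hypothetical gene names).
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 4.0, 2.1],
}
clusters = cluster_by_distance(toy_profiles, max_distance=0.1)
print(clusters)
```

Note that every gene ends up in exactly one cluster here, and the distance is computed over the full profile.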

Biclustering is a special type of clustering technique in which entities are simultaneously clustered on two criteria. Applied to gene expression data, biclustering optimally groups genes displaying coherent expression patterns under subsets of the experimental conditions.

The Local Coherence Detection (LCD) algorithm is a specialized biclustering procedure that handles numerical data matrices that may contain positive, negative and missing values. LCD extracts groups of entities in which the members of each group display a highly coherent and statistically significant pattern of interactions with the same subset of the remaining entities.
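To make the idea of coherence over a subset concrete, the sketch below scores a group of rows by their mean pairwise Pearson correlation, restricted to a chosen set of columns and skipping missing values (represented as `None`). The scoring function, the minimum of three shared values and the toy matrix are assumptions made for illustration; the actual LCD scoring and significance test are described in the reference below and are not reproduced here.

```python
from itertools import combinations
from math import sqrt


def pearson_observed(x, y):
    """Pearson correlation over positions where both profiles have a value.

    Returns None when fewer than 3 shared values exist or a profile is constant.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 3:
        return None
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    vx = sum((a - mx) ** 2 for a, _ in pairs)
    vy = sum((b - my) ** 2 for _, b in pairs)
    if vx == 0 or vy == 0:
        return None
    return cov / sqrt(vx * vy)


def group_coherence(matrix, rows, cols):
    """Mean pairwise correlation of `rows`, using only the columns in `cols`."""
    scores = []
    for r1, r2 in combinations(rows, 2):
        x = [matrix[r1][c] for c in cols]
        y = [matrix[r2][c] for c in cols]
        score = pearson_observed(x, y)
        if score is not None:  # pairs with too little shared data are skipped
            scores.append(score)
    return sum(scores) / len(scores) if scores else None


# Toy matrix (invented values): rows agree on columns 0-3 and diverge on 4-6.
toy_matrix = {
    "q1": [-2.0, 1.0, -3.0, 0.5, 1.0, -1.0, 2.0],
    "q2": [-2.2, 0.9, None, 0.4, -2.0, 2.0, -1.5],
    "q3": [-1.9, 1.1, -3.1, None, 3.0, 0.0, -3.0],
}
rows = ["q1", "q2", "q3"]
local_score = group_coherence(toy_matrix, rows, cols=[0, 1, 2, 3])
global_score = group_coherence(toy_matrix, rows, cols=list(range(7)))
print(local_score, global_score)
```

In this toy case the group is strongly coherent on the column subset but only weakly coherent across all columns, which is the kind of local pattern a whole-profile distance would miss.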

When applied to Epistatic Miniarray Profile (EMAP) data on S. cerevisiae, LCD groups genes into modules that display a coherent profile of genetic interactions with a subset of the library genes. Some of these modules correspond to complexes of physically interacting proteins, whereas others group genes from the same or related pathways. Importantly, LCD can assign a gene to more than one module, making it possible to discover multiple cellular functions of genes.
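Because modules may overlap, a natural follow-up step is to list the genes that belong to more than one module. The sketch below assumes the modules are available as a mapping from module name to a set of gene names; that format, the function name and the toy data are hypothetical and are not LCD output.

```python
from collections import defaultdict


def genes_in_multiple_modules(modules):
    """Return {gene: sorted module names} for genes found in two or more modules."""
    membership = defaultdict(set)
    for module_name, genes in modules.items():
        for gene in genes:
            membership[gene].add(module_name)
    return {g: sorted(m) for g, m in membership.items() if len(m) > 1}


# Toy module assignments (invented, hypothetical gene and module names).
toy_modules = {
    "module_1": {"geneA", "geneB", "geneC"},
    "module_2": {"geneC", "geneD"},
    "module_3": {"geneB", "geneC", "geneE"},
}
shared = genes_in_multiple_modules(toy_modules)
print(shared)
```

Genes returned by such a lookup are candidates for having more than one cellular role.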

In the heatmap example above, query genes (grouped in a cluster) are displayed along the right side and library genes along the top. The Mre11 complex is co-clustered with a DNA damage epistasis group that operates in the same DNA double-strand break repair pathway. For details of the LCD algorithm and its application to the analysis of Epistatic Miniarray Profiles, please refer to:

Pu, S., Ronen, K., Vlasblom, J., Greenblatt, J. and Wodak, S.J.
Local coherence in genetic interaction patterns reveals prevalent functional versatility.
Bioinformatics (Oxford, England), 24, 2376-2383 (2008). PMID: 18718945

In addition to clustering genetic interaction profiles, LCD can be applied to other problems, such as clustering gene expression data and detecting modules in protein-protein interaction networks. Further inquiries can be directed to shuyepu@sickkids.ca.
