Protein-DNA Interactions in Eukaryotes - Wodak Lab

Protein-DNA interactions play a key role in the regulation of gene expression and in DNA damage repair. Understanding the factors that govern the affinity and specificity of these interactions is therefore of great importance. To gain insight into these factors, we classify (cluster) eukaryotic DNA-binding domains into smaller groups (sub-families) on the basis of the information contained in their amino acid sequences. The resulting classifications and characteristic sequence signatures are mapped onto the known 3D structures of these proteins and of their complexes with DNA in order to rationalize the role of specific residues in binding. We recently started a collaboration with colleagues at the University of Toronto and at Harvard Medical School in Boston to relate our protein sequence-based classifications to DNA-binding specificities derived from high-throughput protein binding microarrays (PBMs).

Some of our earlier work on protein-DNA interactions, which analyzed general properties of the interfaces in known protein-DNA complexes, is described in:

  1. 11504874
    Nadassy K, Tomàs-Oliveira I, Alberts I, Janin J, Wodak SJ.
    Standard atomic volumes in double-stranded DNA and packing in protein-DNA interfaces.
    Nucleic Acids Res. 2001 Aug 15;29(16):3362-76.
  2. 10026283
    Nadassy K, Wodak SJ, Janin J.
    Structural features of protein-nucleic acid recognition sites.
    Biochemistry. 1999 Feb 16;38(7):1999-2017.

Stereo view of the HOX homeodomain (PDB entry 1ig7) in contact with DNA. Key residues of the homeodomain family are shown in stick mode: conserved residues are in blue, while the others correspond to the set of 9 specificity-determining positions (SDPs) that make contact with the DNA. Together, the SDPs contribute on average a significant 23% of the interface with the DNA, while the conserved residues contribute around 15%. In comparison, the N-terminal coil contributes 42% of the relative interface.


40 GMD03350_1 ------FSSFQRKGLEIQF--QQQKYITKPDRRKLAARL--NLTD--AQVKVWFQNRRMKWR-
41 GAP01770_1 ------FTNHQIYELEKRF--LYQKYLSPADRDQIAQQL--GLTN--AQVITWFQNRRAKLKR
42 SSF35481_1 ------FTDHQLAQLERSF--ERQKYLSVQDRMELAASL--NLTD--TQVKTWYQNRR-----
43 BMD57173_1 ------FTDHQLQTLEKSF--ERQKYLSVQDRMELAAKL--GLTD--TQVKTWYQNRRTKWKR
44 SSF01710_1 ------FTELQLMGLEKRF--EKQKYLSTPDRIDLAECL--DLSQ--LQVKTWYQNRRMKWKK
45 MCP03763_1 ------FSDQQLQGLEQRF--NGQKYLSTPERISLAESL--HLSE--TQVKTWFQNRRMK---
46 CVP02277_1 ------FSDQQLNGLEKRF--EAQRYLSTPERVELANQL--SLSE--TQVKTWFQNRRMKHKK
47 BMD56378_1 ------FTSEQLLELEREF--HAKKYLSLTERSQIAAAL--KLSE--VQVKIWFQNRRAKWKR
48 IPD17949_1 ------FTHLQVLELEKKF--SRQRYLSAPERAHLASAL--RLTE--TQVKIWFQNRRYKTKR
49 CCE05724_1 ------FTTSQLLVLERKF--LQKQYLSIAERAEFSNSL--NLTE--TQVKIWFSNTRAKAKR
50 BME03454_1 ------FTTQQLLALERKF--RVKQYLSIAERAEFSSSL--NLTE--TQVKIWFQNRRAKEKR
51 1ig7A RKPRTPFTTAQLLALERKF--RQKQYLSIAERAEFSSSL--SLTE--TQVKIWFQNRRAKAKR
52 SPE19093_1 ------FSGRQIFELEKQF--EVKKYLSASERAELASLL--NVTD--TQVKIWFQNRRTKWKK
53 SSF54974_1 ------FSKRQIFQLESTF--DMKRYLSSAERACLASSL--QLTE--TQVKIWFQNRRNKLKR
54 SSP03100_1 ------FSRHQVSQLEMTF--DMKRYLSSQERAHLASNL--QLTE--TQVKIWFQNRRNKWKR
55 SPE27538_1 ------FSRSQVFQLESTF--EVKRYLSSSERAGLAANL--HLTE--TQVKIWFQNRRNKWKR
56 SSF22556_1 ------FSRVQICELEKRF--HRQKYLASAERATLAKSL--KMTD--AQVKTWFQNRRTKWRR
57 SPE11478_1 ------FSNDQTMELEKKF--ENQKYLSPPERKKLAKVL--QLSE--RQVKTWFQNRRAKWRR
58 SPE27764_1 ------FTREQIGRLEKEF--ARENYVSRPKRCELATAL--NLPE--TTIKVWFQNRRMKDKR
59 1jggA -RYRTAFTRDQLGRLEKEF--YKENYVSRPRRCELAAQL--NLPE--STIKVWFQNRRMKDKR
60 CSH13076_1 ------FTHEQVRQLELDF--SENHYLTRLRRYELSLKL--SLTE--RQIKVWFQNRRMKLKR
61 SPE28572_1 ------FTKEQIRELENEF--NHHNYLTRLRRYEIAVTL--NLTE--RQVKVWFQNRRMKWKR
62 CCE07325_1 ------FTKEQIRELESEF--AHHNYLTRLRRYEIAVNL--DLTE--RQVKVWFQNRRMKWKR
78 SRP01584_1 ------FTTHQLTELEKEY--YTSKYLDRSRRREIAKQL--ALNE--TQVKIWFQNRRMKEKK
79 sp_P09022_HXA1_MOUSE ---RTNFTTKQLTELEKEF--HFNKYLTRARRVEIAASL--QLNE--TQVKIWFQNRRMKQKK
80 1b72A --LRTNFTTRQLTELEKEF--HFNKYLSRARRVEIAATL--ELNE--TQVKIWFQNRRMKQKK
92 HRP00014_1 ------FTPEQLERLEREF--LKQQYMVGTERFYLAKEL--NLGE--AQVKVWFQNRRIKWRK
93 ASP20371_1 ------FTPTQADTLEKEY--LTDQYMPRTRRILIAESL--GLSE--GQVKTWFQNRRAKEKR
94 TCP00488_1 ------FTPAQADTLEKEY--LTDQYMPRTRRILIAESL--GLNE--GQVKTWFQNRRAKEKR
95 BMD20907_1 ------FTGDQQLRLEQTL--EKTQYINGTDRRELAQKW--GIGE--KGIKIWFQNRRMKNKR
96 BMD49473_1 ------FTTEQINYLENEF--KKSHYISAVQRKEIANIV--NVPE--KVIKIWFQNRRMREKK
Subset of the multiple sequence alignment (MSA) for the HOX homeodomain subfamily. Highlighted in red is the set of SDPs obtained from a classification of a non-redundant set of 456 homeodomain proteins.
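
The difference between positions conserved across the family and positions that vary between groups can be illustrated with a minimal Python sketch. It is not the classification method used to derive the SDPs. It takes four rows copied from the alignment above, splits them into two hypothetical groups chosen purely for illustration, and reports (a) columns that are invariant across all four rows and (b) columns that are invariant within each group but differ between the groups. The names ROWS, GROUPS and the helper functions are assumptions of this sketch.

```python
# Four rows copied verbatim from the alignment on this page.
ROWS = {
    "BME03454_1": "------FTTQQLLALERKF--RVKQYLSIAERAEFSSSL--NLTE--TQVKIWFQNRRAKEKR",
    "1ig7A":      "RKPRTPFTTAQLLALERKF--RQKQYLSIAERAEFSSSL--SLTE--TQVKIWFQNRRAKAKR",
    "SPE28572_1": "------FTKEQIRELENEF--NHHNYLTRLRRYEIAVTL--NLTE--RQVKVWFQNRRMKWKR",
    "CCE07325_1": "------FTKEQIRELESEF--AHHNYLTRLRRYEIAVNL--DLTE--RQVKVWFQNRRMKWKR",
}

# Hypothetical grouping for illustration only; NOT the lab's classification.
GROUPS = {
    "A": ["BME03454_1", "1ig7A"],
    "B": ["SPE28572_1", "CCE07325_1"],
}

# Aligned rows must all have the same length.
assert len({len(seq) for seq in ROWS.values()}) == 1


def invariant_residue(column):
    """Return the residue if the column is gap-free and identical in every row, else None."""
    residues = set(column)
    if len(residues) == 1 and "-" not in residues:
        return column[0]
    return None


def conserved_columns(rows):
    """0-based indices of columns that are invariant across all given rows."""
    return [i for i, col in enumerate(zip(*rows)) if invariant_residue(col) is not None]


def group_specific_columns(rows_by_name, groups):
    """0-based indices of columns invariant within every group but differing between groups."""
    per_group = []
    for members in groups.values():
        columns = zip(*(rows_by_name[name] for name in members))
        per_group.append([invariant_residue(col) for col in columns])
    hits = []
    for i, residues in enumerate(zip(*per_group)):
        if None not in residues and len(set(residues)) > 1:
            hits.append(i)
    return hits


conserved = conserved_columns(list(ROWS.values()))
specific = group_specific_columns(ROWS, GROUPS)
print("conserved columns:", conserved)
print("group-specific columns:", specific)
```

With only four sequences many columns qualify by chance, so the output should be read as an illustration of the idea rather than as a prediction of SDPs; a realistic analysis would use the full non-redundant set and a statistical score.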
