Direct Coupling Analysis (DCA) is a statistical inference framework used to infer direct co-evolutionary couplings among residue pairs in multiple sequence alignments. A recent formulation of DCA, termed mean-field DCA (mfDCA), is computationally efficient enough to apply this tool in a high-throughput manner to a large number of protein and domain families. mfDCA is able to uncover a large number of native intra-domain and inter-domain residue-residue contacts in many domain families.
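For orientation, the quantities behind mfDCA can be summarized as follows. This is a brief sketch in the notation of Morcos et al. (2011), not a full derivation; the symbols are introduced here for illustration and are not defined elsewhere on this page. Here f_i(A) and f_ij(A,B) denote the reweighted, pseudocount-regularized single-site and pair frequencies of amino acids A and B at alignment columns i and j, and L is the alignment length.

```latex
% Global statistical model for an aligned sequence (A_1, ..., A_L)
P(A_1,\dots,A_L) = \frac{1}{Z}\exp\Big\{ \sum_{i<j} e_{ij}(A_i,A_j) + \sum_{i} h_i(A_i) \Big\}

% Mean-field approximation: couplings from the inverse of the connected correlation matrix
C_{ij}(A,B) = f_{ij}(A,B) - f_i(A)\,f_j(B), \qquad e_{ij}(A,B) \approx -\big(C^{-1}\big)_{ij}(A,B)

% Direct information (DI) used to rank residue pairs
DI_{ij} = \sum_{A,B} P^{\mathrm{dir}}_{ij}(A,B)\,\ln\frac{P^{\mathrm{dir}}_{ij}(A,B)}{f_i(A)\,f_j(B)}
```

P^dir_ij is the two-site model that keeps only the direct coupling e_ij, with local fields chosen so that its marginals reproduce f_i and f_j. Pairs are then sorted by DI, and the top-ranked pairs are taken as predicted contacts.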
(Left) An example of a contact map for the bovine pancreatic trypsin inhibitor (PDB ID: 5pti), created with the top 80 DCA-coupled pairs (lower triangle) and compared to the native C-alpha contacts (upper triangle). The colorbar gradient corresponds to the DI rank, with the extremes being the background (white) and the native contacts (blue). The figure was created with the Matlab script plotDCAmap.m. (Right) The top 30 DCA contacts depicted as green sticks in the trypsin inhibitor (PDB ID: 5pti). This figure was generated using Chimera with input files generated by the bash script GeneratePseudobonds.bash.
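The comparison behind such a contact map can be sketched in a few lines of Python. This is a minimal illustration, not the plotDCAmap.m script: the function names, the 8 Å C-alpha cutoff, the minimum sequence separation of 5, and the toy scores and coordinates are all assumptions made for the example.

```python
import math
from itertools import combinations


def top_ranked_pairs(di_scores, n_top, min_separation=5):
    """Return the n_top pairs with the highest DI, skipping pairs close in sequence.

    di_scores maps (i, j) residue-index pairs to DI values (hypothetical input format).
    """
    candidates = [
        (pair, score)
        for pair, score in di_scores.items()
        if abs(pair[0] - pair[1]) >= min_separation
    ]
    candidates.sort(key=lambda item: item[1], reverse=True)
    return [pair for pair, _ in candidates[:n_top]]


def native_contacts(ca_coords, cutoff=8.0):
    """Return the set of pairs (i, j), i < j, whose C-alpha atoms lie within cutoff (in Å)."""
    contacts = set()
    for (i, a), (j, b) in combinations(sorted(ca_coords.items()), 2):
        if math.dist(a, b) <= cutoff:
            contacts.add((i, j))
    return contacts


def true_positive_rate(predicted, contacts):
    """Fraction of predicted pairs that are native contacts."""
    if not predicted:
        return 0.0
    hits = sum(1 for i, j in predicted if (min(i, j), max(i, j)) in contacts)
    return hits / len(predicted)


# Toy hairpin: residues 1-5 on one strand, 6-10 on the return strand (made-up coordinates).
ca_coords = {
    1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0),
    4: (11.4, 0.0, 0.0), 5: (15.2, 0.0, 0.0), 6: (15.2, 5.0, 0.0),
    7: (11.4, 5.0, 0.0), 8: (7.6, 5.0, 0.0), 9: (3.8, 5.0, 0.0),
    10: (0.0, 5.0, 0.0),
}
# Made-up DI values; (1, 2) and (4, 7) are removed by the sequence-separation filter.
di_scores = {
    (1, 10): 0.30, (2, 9): 0.25, (1, 2): 0.40,
    (1, 6): 0.20, (3, 8): 0.10, (4, 7): 0.05,
}

contacts = native_contacts(ca_coords)
top_pairs = top_ranked_pairs(di_scores, n_top=3)
tp_rate = true_positive_rate(top_pairs, contacts)
print(top_pairs, tp_rate)
```

In a real analysis the DI values would come from a DCA run on the family alignment and the coordinates from the PDB structure, with alignment columns mapped to PDB residue numbers before the comparison.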
The predictive performance statistics of mfDCA for 131 domain families and 856 PDB structures are shown in the following figure:
April, 2014
[1] J. I. Sułkowska, F. Morcos, M. Weigt, T. Hwa, and J. N. Onuchic, “Genomics-aided structure prediction,” Proc Natl Acad Sci U S A, vol. 109, no. 26, pp. 10340–5, 2012 Jun 26. Epub 2012 Jun 12.
[2] A. E. Dago, A. Schug, A. Procaccini, J. A. Hoch, M. Weigt, and H. Szurmant, “Structural basis of histidine kinase autophosphorylation deduced by integrating genomics, molecular dynamics, and mutagenesis,” Proc Natl Acad Sci U S A, vol. 109, no. 26, pp. E1733–42, 2012. Epub 2012 Jun 5.
[3] F. Morcos, A. Pagnani, B. Lunt, A. Bertolino, D. S. Marks, C. Sander, R. Zecchina, J. N. Onuchic, T. Hwa, and M. Weigt, “Direct-coupling analysis of residue coevolution captures native contacts across many protein families,” Proc Natl Acad Sci U S A, vol. 108, no. 49, pp. E1293–301, 2011. Epub 2011 Nov 21.
[4] A. Procaccini, B. Lunt, H. Szurmant, T. Hwa, and M. Weigt, “Dissecting the specificity of protein-protein interaction in bacterial two-component signaling: orphans and crosstalks,” PLoS ONE, vol. 6, p. e19729, 2011.
[5] M. Weigt, R. A. White, H. Szurmant, J. A. Hoch, and T. Hwa, “Identification of direct residue contacts in protein-protein interaction by message passing,” Proc Natl Acad Sci U S A, vol. 106, pp. 67–72, 2009.
[6] A. Schug, M. Weigt, J. N. Onuchic, T. Hwa, and H. Szurmant, “High-resolution protein complexes from integrating genomic information with molecular simulation,” Proc Natl Acad Sci U S A, vol. 106, pp. 22124–22129, 2009.
[1] D. de Juan, F. Pazos, and A. Valencia, “Emerging methods in protein co-evolution,” Nat Rev Genet, vol. 14, no. 4, pp. 249–61, 2013. Epub 2013 Mar 5.
[2] H. Szurmant and J. A. Hoch, “Statistical analyses of protein sequence alignments identify structures and mechanisms in signal activation of sensor histidine kinases,” Mol Microbiol, vol. 87, no. 4, pp. 707–12, 2013. Epub 2012 Dec 28.
[3] D. S. Marks, T. A. Hopf, and C. Sander, “Protein structure prediction from sequence variation,” Nat Biotechnol, vol. 30, no. 11, pp. 1072–80, 2012.
[4] B. K. Ho, D. Perahia, and A. M. Buckle, “Hybrid approaches to molecular simulation,” Curr Opin Struct Biol, vol. 22, no. 3, pp. 386–93, 2012. Epub 2012 May 25.
[5] K. A. Dill and J. L. MacCallum, “The protein-folding problem, 50 years on,” Science, vol. 338, no. 6110, pp. 1042–1046, 2012.