Motivation: The commonly accepted statistical mechanical theory has been repeatedly confirmed by weight matrix methods, which successfully recognize DNA sites that bind regulatory proteins in prokaryotes. However, a recent evaluation of weight matrix methods applied to transcription factor binding site recognition in eukaryotes unexpectedly revealed that the matrix scores correlate better with each other than with the activity of the DNA sites interacting with proteins. This observation suggests that the molecular mechanisms of DNA/protein recognition are more complicated in eukaryotes than in prokaryotes. Additional events in eukaryotes may include: (i) competition between a protein and the nucleosome core particle for the DNA site that binds the protein and (ii) interaction between two synergistic or antagonistic proteins recognizing a composite element formed by the two DNA sites that bind them. Identifying the sequence-dependent DNA features that correlate with the affinity of DNA sites for a protein can therefore pinpoint the molecular event that limits this protein/DNA recognition.
Results: An approach for predicting the activity of a site from its primary nucleotide sequence has been developed. The approach is implemented in the computer system ACTIVITY, which contains databases on site activity and on conformational and physicochemical DNA/RNA parameters. Using ACTIVITY, several sites were analyzed and methods for predicting site activity were constructed. The predictions of these methods are in good agreement with the experimental data.
Availability: The database ACTIVITY is available at http://wwwmgs.bionet.nsc.ru/systems/Activity/ and the mirror site, http://www.cbil.upenn.edu/mgs/systems/activity/.