FeatureREDUCE PSAM XML Format

The FeatureREDUCE PSAM XML format is a positional-independence model format: it assumes that each nucleotide position in the binding site contributes independently and additively to the overall binding free energy of the DNA motif.
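The additivity assumption can be illustrated with a minimal Python sketch. It assumes the usual PSAM convention that a relative affinity w corresponds to a free-energy penalty ΔΔG = -RT ln(w), so multiplying relative affinities across positions is the same as summing penalties. The three-position matrix and the function names are hypothetical and are not part of FeatureREDUCE.

```python
import math

# Hypothetical 3-position PSAM, for illustration only. Each column maps a base
# to its affinity relative to the best base at that position (scaled to 1.0).
psam = [
    {"A": 0.5, "C": 1.0, "G": 0.1, "T": 0.2},
    {"A": 1.0, "C": 0.25, "G": 0.5, "T": 0.1},
    {"A": 0.1, "C": 0.2, "G": 1.0, "T": 0.5},
]

def relative_affinity(psam, site):
    """Multiply the per-position relative affinities for one site."""
    if len(site) != len(psam):
        raise ValueError("site length must equal motif length")
    affinity = 1.0
    for column, base in zip(psam, site.upper()):
        affinity *= column[base]
    return affinity

def ddg_over_rt(psam, site):
    """Sum the per-position free-energy penalties, in units of RT."""
    return sum(-math.log(column[base]) for column, base in zip(psam, site.upper()))

site = "AAG"
affinity = relative_affinity(psam, site)  # 0.5 * 1.0 * 1.0
penalty = ddg_over_rt(psam, site)         # -ln(0.5) + 0 + 0
```

Under this convention, exp(-penalty) equals the product of the per-position weights, which is why the model is described as both independent and additive.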

This format also assumes that the protein's sequence-affinity landscape has a single affinity maximum, so it cannot capture multiple distinct binding modes.

The FeatureREDUCE PSAM XML format is identical to the XML format for the WeightMatrix() class in the BioJava API, except that the table contains nucleotide relative affinities (PSAM format) instead of nucleotide probabilities (PWM format). The ADB expects models saved in this format to have a “.xml” file extension.
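Because the two formats share the same XML layout, the only way to tell them apart is by the values themselves. The following sketch is a heuristic, not something the ADB or BioJava provides: it assumes that a PSAM column has its best base scaled to 1.0, while a PWM column sums to 1.0. The function name and tolerance are hypothetical.

```python
def column_convention(column, tol=1e-6):
    """Guess whether one column of weights follows the PSAM or PWM convention.

    A degenerate PWM column (one base with probability 1.0, the rest 0.0)
    satisfies both tests and is reported as "PSAM".
    """
    values = list(column.values())
    if abs(max(values) - 1.0) <= tol:
        return "PSAM"  # best base scaled to a relative affinity of 1.0
    if abs(sum(values) - 1.0) <= tol:
        return "PWM"   # probabilities summing to 1.0
    return "unknown"

# Column 1 of the Cbf1 model shown in this document.
cbf1_col1 = {
    "G": 1.0,
    "C": 0.4901327784034081,
    "T": 0.2436050722556149,
    "A": 0.7174272856934215,
}
# Hypothetical uniform PWM column, for contrast.
uniform_pwm_col = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

psam_guess = column_convention(cbf1_col1)
pwm_guess = column_convention(uniform_pwm_col)
```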

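Number of parameters in the model: 3N, where N is the length of the motif. Each position holds four relative affinities, one of which is fixed at 1.0 as the reference, leaving three free parameters per position.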

Here is the complete FeatureREDUCE PSAM XML for a 10 bp Cbf1 affinity model:

<MarkovModel>
<alphabet name="DNA"/>
<col indx="1">
<weight sym="guanine" prob="1.0"/>
<weight sym="cytosine" prob="0.4901327784034081"/>
<weight sym="thymine" prob="0.2436050722556149"/>
<weight sym="adenine" prob="0.7174272856934215"/>
</col>
<col indx="2">
<weight sym="guanine" prob="0.4304885624184548"/>
<weight sym="cytosine" prob="0.31120698067127917"/>
<weight sym="thymine" prob="1.0"/>
<weight sym="adenine" prob="0.3292149865558403"/>
</col>
<col indx="3">
<weight sym="guanine" prob="0.01142492769995876"/>
<weight sym="cytosine" prob="1.0"/>
<weight sym="thymine" prob="0.013169232992975328"/>
<weight sym="adenine" prob="0.03537356050042596"/>
</col>
<col indx="4">
<weight sym="guanine" prob="0.05702088446098836"/>
<weight sym="cytosine" prob="0.04298085463382874"/>
<weight sym="thymine" prob="0.043602639700604066"/>
<weight sym="adenine" prob="1.0"/>
</col>
<col indx="5">
<weight sym="guanine" prob="0.023618096698772937"/>
<weight sym="cytosine" prob="1.0"/>
<weight sym="thymine" prob="0.07341816465065486"/>
<weight sym="adenine" prob="0.007440266508326506"/>
</col>
<col indx="6">
<weight sym="guanine" prob="1.0"/>
<weight sym="cytosine" prob="0.023618096698772937"/>
<weight sym="thymine" prob="0.007440266508326506"/>
<weight sym="adenine" prob="0.07341816465065486"/>
</col>
<col indx="7">
<weight sym="guanine" prob="0.04298085463382874"/>
<weight sym="cytosine" prob="0.05702088446098836"/>
<weight sym="thymine" prob="1.0"/>
<weight sym="adenine" prob="0.043602639700604066"/>
</col>
<col indx="8">
<weight sym="guanine" prob="1.0"/>
<weight sym="cytosine" prob="0.01142492769995876"/>
<weight sym="thymine" prob="0.03537356050042596"/>
<weight sym="adenine" prob="0.013169232992975328"/>
</col>
<col indx="9">
<weight sym="guanine" prob="0.31120698067127917"/>
<weight sym="cytosine" prob="0.4304885624184548"/>
<weight sym="thymine" prob="0.3292149865558403"/>
<weight sym="adenine" prob="1.0"/>
</col>
<col indx="10">
<weight sym="guanine" prob="0.4901327784034081"/>
<weight sym="cytosine" prob="1.0"/>
<weight sym="thymine" prob="0.7174272856934215"/>
<weight sym="adenine" prob="0.2436050722556149"/>
</col>
</MarkovModel>
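
A file in this format can be read with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module and relies only on the element and attribute names visible in the example above (col, indx, weight, sym, prob). The function names are hypothetical, and to keep the sketch short only the first two columns of the Cbf1 model are embedded.

```python
import xml.etree.ElementTree as ET

# Map the symbol names used in the file to one-letter base codes.
SYMBOL_TO_BASE = {"adenine": "A", "cytosine": "C", "guanine": "G", "thymine": "T"}

def parse_psam(xml_text):
    """Return a list of {base: relative affinity} dicts, ordered by column index."""
    root = ET.fromstring(xml_text)
    columns = {}
    for col in root.findall("col"):
        weights = {
            SYMBOL_TO_BASE[w.get("sym")]: float(w.get("prob"))
            for w in col.findall("weight")
        }
        columns[int(col.get("indx"))] = weights
    return [columns[i] for i in sorted(columns)]

def relative_affinity(psam, site):
    """Multiply the per-position relative affinities for one site."""
    if len(site) != len(psam):
        raise ValueError("site length must equal motif length")
    affinity = 1.0
    for column, base in zip(psam, site.upper()):
        affinity *= column[base]
    return affinity

# First two columns of the Cbf1 model, copied from the example above.
example = """<MarkovModel>
<alphabet name="DNA"/>
<col indx="1">
<weight sym="guanine" prob="1.0"/>
<weight sym="cytosine" prob="0.4901327784034081"/>
<weight sym="thymine" prob="0.2436050722556149"/>
<weight sym="adenine" prob="0.7174272856934215"/>
</col>
<col indx="2">
<weight sym="guanine" prob="0.4304885624184548"/>
<weight sym="cytosine" prob="0.31120698067127917"/>
<weight sym="thymine" prob="1.0"/>
<weight sym="adenine" prob="0.3292149865558403"/>
</col>
</MarkovModel>"""

psam = parse_psam(example)
best = relative_affinity(psam, "GT")   # both positions carry the reference base
weaker = relative_affinity(psam, "AT") # adenine at position 1 lowers the affinity
```

To score a real model, read the whole ".xml" file into a string and pass it to parse_psam in place of the truncated example.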