Uniprobe PWM Format

The Uniprobe PWM format is a positional-independence model format: it assumes that each nucleotide position in the binding site contributes independently and additively to the overall binding free energy of the DNA motif.
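To make the independence assumption concrete, here is a minimal sketch in Python. The three-position matrix and the function name log_score are hypothetical and used only for illustration. The sketch scores a site by summing per-position log-probabilities, so changing one base changes the score only through that position's term. Treating a log-probability as a stand-in for a free-energy contribution is itself an assumption of the sketch.

```python
import math

# Hypothetical 3-position PWM (not taken from any real model):
# one list of per-position probabilities for each nucleotide.
pwm = {
    "A": [0.7, 0.1, 0.25],
    "C": [0.1, 0.1, 0.25],
    "G": [0.1, 0.7, 0.25],
    "T": [0.1, 0.1, 0.25],
}

def log_score(pwm, site):
    """Sum the per-position log-probabilities of a site.

    Each position adds its own term, independently of the bases
    found at every other position.
    """
    length = len(pwm["A"])
    if len(site) != length:
        raise ValueError("site length must match motif length")
    return sum(math.log(pwm[base][i]) for i, base in enumerate(site.upper()))

# Swapping the first base from A to C changes only the first term.
difference = log_score(pwm, "AGT") - log_score(pwm, "CGT")
```

Here difference depends only on the two probabilities at the first position, whatever bases occupy the other positions.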

This format also assumes that there is only one affinity maximum in the sequence-affinity space of the protein, and it therefore cannot capture multiple distinct binding modes.

This format contains a table of nucleotide probabilities (PWM format), instead of nucleotide relative affinities (PSAM format) or nucleotide counts (TRANSFAC format). The ADB expects models saved in this format to have a “.uniprobe” file extension.
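The three kinds of table differ only in how each column is scaled, which the following sketch illustrates for a single motif position. The function names and the example counts are hypothetical. Dividing by the column maximum to obtain relative affinities is a common approximation that assumes a uniform background; it is not necessarily the conversion the ADB uses.

```python
def counts_to_probabilities(column):
    """Normalize nucleotide counts at one position so they sum to 1."""
    total = sum(column.values())
    return {base: count / total for base, count in column.items()}

def probabilities_to_relative(column):
    """Rescale probabilities so the most favored nucleotide equals 1.

    Assumption: relative affinity is taken as p / max(p) at each position.
    """
    best = max(column.values())
    return {base: p / best for base, p in column.items()}

# Hypothetical counts for one motif position.
counts = {"A": 6, "C": 2, "G": 1, "T": 1}
probabilities = counts_to_probabilities(counts)
relative = probabilities_to_relative(probabilities)
```

In the probability form each column sums to 1, while in the relative form the best nucleotide in each column is fixed at 1.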

Number of parameters in the model: 3N, where N is the length of the motif (the four probabilities at each position sum to 1, so only three of them are free)

Here is the complete Uniprobe PWM file for a 10 base-pair affinity model:

Protein: Cbf1	Seed k-mer: ATCACGTG	Enrichment Score: 0.499010437669239
A:	0.387398837326962	0.160370450851774	0.00579566973382471	0.984310428811586	0.000520578518462409	0.0512168242470759	0.00554791387069823	0.00108871328362558	0.436684281349379	0.106429865986653
C:	0.0418650553618826	0.045745403962552	0.99055073718171	0.0105040001038691	0.975244090256611	0.0064124125195024	0.00433861322483728	0.0016092288172844	0.262472975122313	0.184027817720692
G:	0.554463521652852	0.0956960762910429	0.00242503126912309	0.0010685902059978	0.00922202192948052	0.94180586404824	0.00940766558113401	0.973242502288466	0.0857562483507934	0.0600857360004131
T:	0.0162725856583034	0.698188068894632	0.00122856181534235	0.00411698087854669	0.0150133092954459	0.00056489918
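A file laid out like the one above can be read with a few lines of Python. This is a minimal sketch, not the ADB's own parser: it assumes a tab-separated header line of "key: value" fields followed by one tab-separated row per nucleotide. The function name parse_uniprobe and the two-column example file are hypothetical.

```python
def parse_uniprobe(text):
    """Parse Uniprobe-style PWM text into (metadata, matrix).

    Assumed layout: a tab-separated header of "key: value" fields,
    then rows such as "A:<TAB>p1<TAB>p2...".
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]

    metadata = {}
    for field in header.split("\t"):
        key, _, value = field.partition(":")
        metadata[key.strip()] = value.strip()

    matrix = {}
    for row in rows:
        label, *values = row.split("\t")
        base = label.strip().rstrip(":")
        matrix[base] = [float(v) for v in values if v.strip()]
    return metadata, matrix

# Hypothetical two-position file, used only to exercise the parser.
example = (
    "Protein: Demo\tSeed k-mer: AC\tEnrichment Score: 0.4\n"
    "A:\t0.5\t0.25\n"
    "C:\t0.25\t0.25\n"
    "G:\t0.125\t0.25\n"
    "T:\t0.125\t0.25\n"
)
metadata, matrix = parse_uniprobe(example)

# Each column of a probability matrix should sum to approximately 1.
column_sums = [sum(matrix[b][i] for b in "ACGT") for i in range(len(matrix["A"]))]
```

Checking the column sums after parsing is a cheap way to confirm that a file really holds probabilities rather than counts or relative affinities.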