ADB Sequence Annotations

The ADB Sequence Browser loads and saves sequence annotations in a table format with 15 required columns and an optional 16th description column. The table format is designed to accommodate all gene, polymorphism, binding site, and positional prior annotations attributed to a sequence. Because one table describes 4 different types of annotation, the meaning of some columns changes slightly depending on the annotation type. Below is a table describing each column by annotation type:

The table format also supports general domains that can have any subtype or subsubtype. The scale of a domain controls the height of the rendered rectangle for the given domain. All entries of type “domain” are rendered in the track labelled “Domains”. Lastly, the table format also currently supports 3 secondary structures: alpha helix, beta strand, and turn. All entries of type “structure” are rendered in the track labelled “Structures”. Below are examples of the 3 structure annotations as well as domain annotations:

Below are the gene annotations for the human RefSeq gene NM_175840. Chromosomal locations are inclusive:

Below are some polymorphisms in snp141 for the human RefSeq gene NM_175840 above. Chromosomal locations are inclusive. Polymorphisms are always labeled “reference/polymorphism”. If the polymorphism is an insertion, then the reference is “-”. The direction is “none” since the polymorphism exists on both strands of double-stranded DNA:

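The labelling convention described above can be sketched in a few lines of Python. The helper name `parse_polymorphism_label` and the example labels are hypothetical; they only illustrate the “reference/polymorphism” rule, with “-” as the reference marking an insertion, and are not code taken from the ADB.

```python
def parse_polymorphism_label(label):
    """Split a 'reference/polymorphism' label into its two parts."""
    reference, sep, variant = label.partition("/")
    if not sep:
        raise ValueError(f"label has no '/': {label!r}")
    # Per the convention above, a reference of '-' marks an insertion.
    return {
        "reference": reference,
        "polymorphism": variant,
        "is_insertion": reference == "-",
    }


# Hypothetical labels, not taken from snp141.
snp = parse_polymorphism_label("A/G")
insertion = parse_polymorphism_label("-/TT")
```

Here `snp["is_insertion"]` is false and `insertion["is_insertion"]` is true, because only the second label has “-” as its reference.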

Below are some binding site (a.k.a. response element or RE) annotations in the human RefSeq gene NM_175840 above when using the HT-SELEX model for the human transcription factor BHLH40. Chromosomal locations are inclusive. If the binding site overlaps one or more polymorphisms, then the relative affinities are always labeled “reference affinity/polymorphism(s) affinity”. The direction is “none” since the binding site exists on both strands of double-stranded DNA:

Below are some functional prior annotations. Again, chromosomal locations are inclusive. To create gene-generic positional priors, reserved variables can be used as placeholders for chromosomal positions and distances. In addition, the JEPLite library is used to parse the mathematical formulas in the Scale column that define the probabilities of accessibility, function, or conserved function. All of the JEPLite functions are available for use in the ADB. All of the functional priors below are also available in the ~/.ADB/positionalPriors directory under the user’s home directory.

Besides mathematical formulas, the Scale column also supports a comma-separated list of values, one for each position in the defined location range. An example of a comma-separated list of values is “0.78,0.34,0.89,0.34,0.89,….”.
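As a rough illustration of the comma-separated form, the Python sketch below pairs each value with a position in an inclusive location range. The function name `expand_scale_values` and the coordinates are hypothetical and are not part of the ADB. How the ADB itself treats a list whose length does not match the range is not stated here, so the sketch simply rejects it.

```python
def expand_scale_values(scale, start, end):
    """Map a comma-separated Scale string onto positions start..end (inclusive)."""
    values = [float(v) for v in scale.split(",")]
    # Locations are inclusive, so the range covers end - start + 1 positions.
    expected = end - start + 1
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    return dict(zip(range(start, end + 1), values))


# Hypothetical 5-position range; the values echo the example list above.
priors = expand_scale_values("0.78,0.34,0.89,0.34,0.89", 100, 104)
```

Because both endpoints are included, a range from 100 to 104 needs exactly five values.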

Below are the reserved variables that can be used as placeholders for certain gene positions and distances: