9.3.2 Alignment graphs

In addition to the basic graphs available for individual sequences, the following graphs are available for alignments and assemblies:

Coverage: The height of the graph at each position represents the number of sequences which have a non-gap character at that position. The coverage graph is made up of three bar graphs overlaid on each other: a blue graph shows the minimum coverage, a black graph shows the mean coverage and a yellow graph (underneath the blue and black graphs) shows the maximum coverage. The minimum graph is drawn over the top of the mean graph, but if necessary the minimum graph is reduced in height so that a single pixel of the mean graph is always visible at each position. Thus, when the alignment is zoomed in so that each site is one pixel wide or more, the graph is shown in blue with a black line across the top, denoting the coverage at that position. For large alignments which are zoomed out so that each site is less than one pixel wide (i.e. each pixel represents more than one site in the alignment), all three bars are visible, showing the minimum, mean and maximum coverage of the sites within that pixel (see Figure 9.5).




Figure 9.5: The coverage graph for an assembly, shown zoomed out in the top panel, and zoomed in below
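The coverage calculation described above can be illustrated with a minimal sketch. It is not the application's own code: the function names `column_coverage` and `bin_coverage`, the use of `-` as the gap character, and the fixed `sites_per_pixel` bin width are all assumptions made for illustration.

```python
def column_coverage(alignment, gap_chars="-"):
    """Count the non-gap characters in each column of an alignment.

    `alignment` is a list of equal-length strings (an assumed representation).
    """
    length = len(alignment[0])
    return [
        sum(1 for seq in alignment if seq[i] not in gap_chars)
        for i in range(length)
    ]


def bin_coverage(coverage, sites_per_pixel):
    """Summarise coverage as (min, mean, max) for each pixel-wide bin.

    This mimics the zoomed-out view, where one pixel covers several sites.
    """
    bins = []
    for start in range(0, len(coverage), sites_per_pixel):
        chunk = coverage[start:start + sites_per_pixel]
        bins.append((min(chunk), sum(chunk) / len(chunk), max(chunk)))
    return bins


alignment = ["ACGT--", "ACGTAC", "--GTAC"]
coverage = column_coverage(alignment)
pixels = bin_coverage(coverage, sites_per_pixel=2)
```

With `sites_per_pixel=1` the three values in each bin are equal, which corresponds to the zoomed-in case where only the blue bar and its black top line are seen.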


To highlight regions above or below a particular coverage level, check Highlight above... or Highlight below... and a bar will appear below the coverage graph across regions which fit these criteria. The “Highlight above” bar is blue, and the “Highlight below” bar is yellow. Regions where the alignment or assembly is made up of sequences in a single direction (e.g. forward or reverse sequences only) can be highlighted by checking Highlight single strand.
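As a rough sketch of how highlighted regions could be derived from per-site coverage values, the hypothetical function `highlight_regions` below collects runs of consecutive sites that lie above or below a threshold. The strict comparison (sites exactly at the threshold are not highlighted) is an assumption; the manual does not state how the boundary value is treated.

```python
def highlight_regions(coverage, threshold, above=True):
    """Return (start, end) index pairs, inclusive, for runs of sites whose
    coverage is strictly above (or strictly below) the threshold."""
    regions = []
    start = None
    for i, depth in enumerate(coverage):
        hit = depth > threshold if above else depth < threshold
        if hit and start is None:
            start = i                      # a new run begins here
        elif not hit and start is not None:
            regions.append((start, i - 1))  # the run ended at the previous site
            start = None
    if start is not None:                   # close a run that reaches the end
        regions.append((start, len(coverage) - 1))
    return regions


coverage = [2, 2, 3, 3, 2, 2]
above_two = highlight_regions(coverage, threshold=2, above=True)
below_three = highlight_regions(coverage, threshold=3, above=False)
```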

The scale bar to the left of the graph shows minimum and maximum coverage for the entire alignment or assembly, as well as a tick somewhere in between for the mean coverage.

Sequence Logo: This displays a sequence logo, where the height of the logo at each site is equal to the total information at that site and the height of each symbol in the logo is proportional to its contribution to the information content. When zoomed out far enough that each site is less than one pixel wide, the height is the average of the information over multiple sites. When gaps occur at a site, the height is scaled down further in proportion to the number of non-gap residues.
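The logo heights can be illustrated with the commonly used information-content formulation, in which the information at a site is the maximum possible entropy minus the observed entropy. The sketch below assumes that formulation without any small-sample correction, an alphabet of four nucleotides, and `-` as the gap character; the function name `logo_column` is hypothetical and the exact formula used by the application is not stated in this section.

```python
import math
from collections import Counter


def logo_column(column, alphabet_size=4, gap_chars="-"):
    """Return (total_height, per_symbol_heights) in bits for one column."""
    residues = [c for c in column if c not in gap_chars]
    if not residues:
        return 0.0, {}
    n = len(residues)
    freqs = {sym: count / n for sym, count in Counter(residues).items()}
    # Shannon entropy of the observed residue frequencies
    entropy = -sum(p * math.log2(p) for p in freqs.values())
    info = math.log2(alphabet_size) - entropy
    # Scale down in proportion to the fraction of non-gap residues
    info *= n / len(column)
    # Each symbol's height is its share of the total information
    heights = {sym: p * info for sym, p in freqs.items()}
    return info, heights


conserved, _ = logo_column("AAAA")
mixed, mixed_heights = logo_column("AACC")
gapped, _ = logo_column("AA--")
```

Under these assumptions a fully conserved nucleotide column reaches the maximum of 2 bits, a column split evenly between two bases gives 1 bit, and a conserved column in which half of the sequences have gaps is scaled from 2 bits down to 1 bit.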

Identity: This displays the identity across all sequences at every position. Green means that the residue at that position is the same across all sequences, yellow indicates less than complete identity, and red indicates very low identity at that position (Figure 9.6).




Figure 9.6: The identity graph for an alignment of nucleotide sequences
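The colour scheme of the identity graph can be sketched as follows. This is only an illustration: identity is taken here to be the fraction of sequences sharing the most common character in a column (with gaps counted as a character), and the cut-off for "very low" identity (`low_threshold=0.3`) is an invented value, since this section does not give the actual calculation or threshold.

```python
from collections import Counter


def column_identity(column):
    """Fraction of sequences that share the most common character."""
    most_common_count = Counter(column).most_common(1)[0][1]
    return most_common_count / len(column)


def identity_colour(identity, low_threshold=0.3):
    """Map an identity value to a graph colour (threshold is an assumption)."""
    if identity == 1.0:
        return "green"
    if identity < low_threshold:
        return "red"
    return "yellow"


alignment = ["ACGTA", "ACGTC", "ACTAG", "ACGCT", "ACG--"]
# zip(*alignment) yields the alignment one column at a time
colours = [identity_colour(column_identity(col)) for col in zip(*alignment)]
```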