10.2.2 Trim Ends

Trimming low quality ends of sequences is normally performed before assembling a contig. This is because the noise introduced by low quality regions and vector contamination can produce incorrect assemblies.

To trim vectors, primers and poor quality bases using the Geneious tools, select the sequences you wish to trim and choose Annotate and Predict → Trim Ends. Trimming can also be performed at the assembly step, by checking the trim sequences option in the assembly set-up. Geneious R9 and above also have a plugin for trimming using the BBDuk algorithm from the BBTools suite. This can be installed by going to Tools → Plugins.

Trim Ends can soft or hard trim your sequences. If you wish to soft trim, choose Annotate new trimmed regions in the Trim Ends set-up. The trimmed sequence will then remain visible but will be annotated with “Trimmed” annotations. Sequence annotated with a trimmed annotation is ignored by the assembler when constructing a contig and is not included in the consensus sequence calculation (refer to Operations that respect soft trims and Operations that do not respect soft trims in section 10.2.2 for lists of operations that respect or ignore soft trims). So although the trimmed regions are visible, they do not affect the results of the assembly at all. Soft trims can be adjusted as needed, or deleted completely. Dragging the ends of the trim annotation will make the newly untrimmed sequence visible and part of the consensus (Figure 10.1). If you wish to remove the trimmed sequence completely (hard trim), choose Remove new trimmed regions from sequences.



Figure 10.1: Click and drag the trims to adjust


If you choose to trim your sequences at the assembly step, the sequences are trimmed and assembled in one operation and you will not be able to view the trimming before assembly is performed. However, the trimmed regions will still be available and adjustable after assembly is complete. If you choose to trim your sequences prior to assembly, select Use Existing Trim Regions when you set up the assembly.

Trimmed annotations can also be created manually using the annotation editing in the sequence viewer. If you create annotations of type Trimmed and save them, then Geneious will treat them the same as ones generated automatically and they will be ignored during assembly. Trimmed annotations can also be modified in this way before or after assembly.

Trim Ends options




Figure 10.2: Trimming options


The Modified Mott algorithm

The modified-Mott algorithm for trimming ends based on quality operates as follows:

For each base, it subtracts the base error probability from an error probability cutoff value (default 0.05) to form the base score. The base error probability is calculated from the quality score (Q), such that P(error) = 10^(-Q/10). This means that low quality bases have high error probabilities and thus may have a negative base score.

E.g. for Q10, P(error) = 0.1; for Q30, P(error) = 0.001.

So with an error probability cutoff of 0.05, a base with Q10 has a base score of 0.05-0.1= -0.05, and a base with Q30 would have a base score of 0.05-0.001=0.049.
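The arithmetic above can be checked with a short Python sketch. The function names error_probability and base_score are hypothetical and chosen for illustration only; they are not part of Geneious.

```python
def error_probability(q):
    """Convert a Phred quality score Q to an error probability, 10^(-Q/10)."""
    return 10 ** (-q / 10)


def base_score(q, cutoff=0.05):
    """Base score = error probability cutoff minus the base error probability."""
    return cutoff - error_probability(q)


for q in (10, 30):
    print(f"Q{q}: P(error)={error_probability(q):.3f}, base score={base_score(q):.3f}")
```

With the default cutoff of 0.05, a Q10 base gets a negative score and a Q30 base gets a positive one, matching the figures in the text.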

The trimming algorithm then calculates the running sum of the base score across the sequence. If the sum drops below zero it is set to zero. The part of the sequence not trimmed is the region between the first positive value of the running sum and the highest value of the running sum (i.e. the highest scoring segment of the sequence). Everything before and after this region is trimmed.
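The procedure described above can be sketched in Python as follows. This is a minimal illustration, not the Geneious implementation: the function name mott_trim is hypothetical, and the sketch assumes that "the first positive value of the running sum" means the start of the run that goes on to reach the highest value (that is, the highest scoring segment).

```python
def mott_trim(qualities, cutoff=0.05):
    """Return (start, end) half-open indices of the region that is NOT trimmed.

    qualities: list of Phred quality scores, one per base.
    Assumption: the kept region is the highest scoring segment of base scores.
    """
    running = 0.0        # running sum of base scores, never below zero
    run_start = 0        # index where the current positive run began
    best = 0.0           # highest running sum seen so far
    best_start = best_end = 0
    for i, q in enumerate(qualities):
        running += cutoff - 10 ** (-q / 10)   # add this base's score
        if running <= 0:
            # Sum dropped to or below zero: reset and start a new run after this base
            running = 0.0
            run_start = i + 1
        elif running > best:
            # New maximum: the kept region runs from run_start through base i
            best = running
            best_start, best_end = run_start, i + 1
    return best_start, best_end


quals = [10, 10, 30, 30, 30, 30, 10, 10]
start, end = mott_trim(quals)
print(start, end)   # bases before start and from end onward would be trimmed
```

For the example qualities, the two Q10 bases at each end fall outside the returned region (2, 6), so only the four Q30 bases are kept. A read in which no base scores above the cutoff returns (0, 0), meaning the whole sequence would be trimmed.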

Operations that respect soft trims

The following operations will exclude sequence that has been soft trimmed:

*Only the Geneious assembler supports the use of trimmed annotations. Sequences should be hard trimmed if using other assembly algorithms, such as SPAdes, Tadpole or Bowtie.

Operations that do not respect soft trims

The following operations will include sequence that has been soft trimmed:

Export formats SAM/BAM, Genbank, GFF, and ACE incorporate the trim information in the export.