Exercise 1: Transferring annotations by homology using the "Copy to..." function

The Copy to... method for transferring annotations requires that you have an alignment or assembly of two or more homologous sequences that have differing annotations that you would like to transfer or combine.

This method does not compare or consider sequence similarity. It assumes that homologous features are accurately aligned across the full length of the alignment, and copies each annotation to the same alignment columns on the target sequence.
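To make the idea of a position-based transfer concrete, the following is a minimal sketch in Python of how an annotation interval could be carried from one aligned sequence to another using alignment columns alone. It is an illustration only and not the software's actual implementation; the function names (position_to_column, transfer_interval) and the toy sequences are hypothetical.

```python
def position_to_column(aligned_seq, gap="-"):
    """Return a list mapping each ungapped position (0-based) to its alignment column."""
    return [col for col, ch in enumerate(aligned_seq) if ch != gap]


def transfer_interval(src_aligned, dst_aligned, start, end, gap="-"):
    """Map a 1-based inclusive interval from the source row to the destination row.

    Only alignment columns are used; the bases themselves are never compared.
    Returns (start, end) on the destination, or None if the destination has
    only gaps in that region.
    """
    if len(src_aligned) != len(dst_aligned):
        raise ValueError("aligned rows must have the same length")
    src_cols = position_to_column(src_aligned, gap)
    col_start = src_cols[start - 1]
    col_end = src_cols[end - 1]
    # Count destination bases before the region and inside the region.
    dst_before = sum(1 for ch in dst_aligned[:col_start] if ch != gap)
    dst_inside = sum(1 for ch in dst_aligned[col_start:col_end + 1] if ch != gap)
    if dst_inside == 0:
        return None
    return (dst_before + 1, dst_before + dst_inside)


# Toy alignment (hypothetical data): an annotation on source bases 5-9.
source_row = "ATGC--GTTAA"
target_row = "AT-CAAGT-AA"
print(transfer_interval(source_row, target_row, 5, 9))
```

Because only columns are consulted, a misaligned region would shift the transferred coordinates without any warning, which is why the boundaries need to be checked by eye afterwards.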

This method can be used to transfer individual annotations, groups of annotations or the entire annotation set from one sequence to any other sequence within the alignment, or to the consensus sequence of the alignment.

If the alignment retains links to the parent sequences used to generate the alignment, then you will be given the option to apply the transferred annotations to the parental sequences when you save the changes to the alignment.

Exercise - Annotating a mitochondrial genome sequence

In this exercise we will transfer annotations from a published annotated sequence of the emu mitochondrial genome to a "new" unannotated kiwi mitochondrial genome. The sequences are provided with this tutorial and are named Mitochondrion_Emu and Mitochondrion_Kiwi_1.

Select the two files from the document list, select the Sequence view panel, select the "General" tab, and make sure the option to display annotations is turned on.

You should see two sequences in the Sequence viewer panel, the emu sequence with annotations, and the kiwi sequence without annotations.

The first thing we need to do is to create an alignment of these sequences, as they differ by about 300 bp in length.

With the two files selected, click on Align/Assemble → Pairwise align, select the MAFFT aligner and click OK to align the two genomes. This will create an alignment file called Nucleotide Alignment. Select this file from the File list to view it. If you zoom in you will see that the sequences share high similarity across most of the alignment.

The next step is to perform the annotation transfer. We will transfer all annotations from the emu sequence to the kiwi sequence. To do this, right click (or Alt/CTRL-click) on the Mitochondrion_Emu sequence title. This will select the sequence and all annotations associated with it, and display a contextual menu. From the menu, select Annotation → Copy all in selected region to → Mitochondrion_Kiwi_1.

Once you have "copied all" you should see all of the annotations now added to the kiwi sequence. Save the alignment. Because the alignment is linked back to the parental sequences, you should be given the option to apply the changes to the parental sequences. Make sure you choose Yes to apply the changes to the Mitochondrion_Kiwi_1 sequence.

Note that if you had wanted to transfer only a single feature, or a single class of feature (for instance only CDSs), then right clicking on an individual feature will change the contextual menu options to allow you to do this.

If you select the Mitochondrion_Kiwi_1 file and zoom out if required, you should see that it now contains all of the transferred annotations. Hovering the mouse over any of the annotations will show you details of the transferred annotations. For CDS annotations this includes an automatic translation of the region spanned by the annotation coordinates.

You may notice that the source annotation has also been transferred (the thick blue line labelled source Dromaius novaehollandiae; you may need to turn on display of Source annotations in the Annotations tab to see this annotation type). Double click on the blue Source annotation to edit it and change the Name: to source Apteryx owenii, the binomial name for the kiwi. Before closing the Edit Annotations window, you should also click on Properties, then double click on the organism: property and change the property value to Apteryx owenii. Also, click on interval and edit the interval so that it covers the entire genome sequence (1-17,020 bp).

If you hover over the ND4 CDS annotation on the kiwi sequence (bases 10,240-11,613) you will see that the automatic translation includes two extra amino acids after a stop codon.

We will now correct this error. Go to the Display tab and ensure that Translation is turned on, and that the Frame: to display is set to By selection or annotation.

If you select the 3' end of the ND4 CDS and zoom in using the Full Zoom tool, you will see the kiwi CDS actually terminates two codons earlier than the emu homolog. You may also notice that the stop codon is a non-standard AGA. It is called as a stop because the transferred ND4 annotation contains information specifying the genetic code, and is using translation table 2 for vertebrate mitochondrial genomes (see http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi for more information).
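As an illustration of why AGA reads as a stop here, the sketch below translates a short made-up DNA string with the vertebrate mitochondrial code (translation table 2), in which AGA and AGG are stops, ATA codes for methionine and TGA codes for tryptophan. The function name translate_mito and the example sequence are hypothetical; the amino acid string is the table 2 ordering with bases in T, C, A, G order.

```python
BASES = "TCAG"
# Vertebrate mitochondrial code (translation table 2), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... with each base cycling through T, C, A, G.
TABLE_2 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"

CODON_TO_AA = {
    b1 + b2 + b3: TABLE_2[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate_mito(seq):
    """Translate a forward-strand DNA string in frame 1; '*' marks a stop.

    Codons containing ambiguous bases are reported as 'X', and any trailing
    partial codon is ignored.
    """
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        protein.append(CODON_TO_AA.get(seq[i:i + 3], "X"))
    return "".join(protein)


# A made-up CDS whose annotation runs two codons past the AGA stop.
print(translate_mito("ATGATAAGATTTCTA"))
```

A '*' followed by further residues in the output is the same symptom seen in the hover translation of the transferred ND4 annotation: the annotation extends beyond the real stop codon.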

To correct the discrepancy between the automatic translation and the predicted translation, select the end of the CDS annotation feature and drag it so that it ends at the AGA stop codon. You should also drag and adjust the corresponding ND4 gene annotation.
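The manual adjustment above amounts to finding the first in-frame stop codon within the transferred interval and ending the annotation there. A minimal sketch of that check follows, assuming a forward-strand CDS with 1-based inclusive coordinates and the complete stop codons of translation table 2 (TAA, TAG, AGA, AGG). The function name trim_cds_end and the toy sequence are hypothetical, and the sketch does not handle the incomplete stop codons (T or TA) found on some mitochondrial genes.

```python
# Complete stop codons in the vertebrate mitochondrial code (table 2).
MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def trim_cds_end(genome, start, end):
    """Return a corrected end coordinate for a forward-strand CDS.

    start and end are 1-based and inclusive. The returned value is the last
    base of the first in-frame stop codon inside the interval, or the
    original end if no in-frame stop is found.
    """
    seq = genome.upper()
    # Step through whole codons that fit inside the annotated interval.
    for codon_start in range(start - 1, end - 2, 3):
        if seq[codon_start:codon_start + 3] in MITO_STOPS:
            return codon_start + 3
    return end


# Toy example: the annotation covers bases 3-17 but AGA occupies bases 9-11.
toy_genome = "CCATGAAAAGATTTCTAGG"
print(trim_cds_end(toy_genome, 3, 17))
```

A result that differs from the annotated end flags a CDS worth inspecting by eye, and the matching gene annotation would need the same adjustment.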

This exercise has demonstrated how the Copy to... function allows you to rapidly transfer annotations between homologous sequences. It has also demonstrated that this method of transfer is only as good as the quality of the alignment. You should always double check the boundaries of all annotations that you have transferred to make sure they are correct.

Go to next exercise, Exercise 2: Transferring annotations using "Transfer Annotations"