Exercise 2: Transferring annotations by homology using the "Annotate From..." tool

The Annotate from Database tool is found on the Live Annotate & Predict tab associated with the Sequence Viewer panel. This tool uses a BLAST-like search to identify and annotate any features on your sequence that share homology with features found in a custom database. It can also be accessed from the Geneious Annotate & Predict menu.

The Annotate from Database tool, as the name suggests, requires that you create a database of the annotations that are likely to be found in your sequence. A database comprises a Geneious folder containing annotated sequences of interest. The database may contain sequences with multiple annotations, for instance a related genome, and/or annotated sequences that encode single features.

When first installed, Geneious provides a Sample Documents folder that contains an example annotation database. This folder is normally located within your Geneious database at

/Local/Sample Documents/PlasMapper Features

If for any reason you do not have the sample documents in your local database, you can download them from the Geneious menu via File > Import > Sample Documents.

The PlasMapper Features database contains plasmid-related annotated sequences and is derived from the database used by PlasMapper (see http://wishart.biology.ualberta.ca/PlasMapper/). This database can be used to annotate plasmids with many common features, including promoters, terminators, origins of replication (oris) and selection markers. Take a look at the sequences within the PlasMapper Features folder to see what a database can look like.

2A. Annotation of a plasmid sequence

In this exercise we will annotate an unannotated plasmid sequence using the PlasMapper Features folder as a database.

Select the unannotated plasmid sequence pPROEXHTA, select the Live Annotate & Predict tab, and turn on the Annotate from... option. If the Source: folder is not already set to "PlasMapper Features", click on the Source: name and use the Select Feature Folder window to navigate to

/Local/Sample Documents/PlasMapper Features

then click OK.

Straight away you should see a number of features appear on the plasmid sequence. These features appear because they share 100% similarity with annotated features present in the PlasMapper Features database.

Use the slider in the Annotate From... tab to decrease the % Similarity required for a match. Drop it to 98% and you will see all of the major features of the plasmid appear. Click the Apply button to add the matched features to the plasmid, then save the document. That's it: you now have an annotated plasmid sequence.
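To make the role of the % Similarity setting more concrete, here is a minimal Python sketch of threshold-based feature matching. It is not how Geneious works internally: the tool uses a BLAST-like search that can handle gaps, whereas this sketch assumes a simple ungapped sliding window. The function names percent_identity and find_feature are hypothetical and chosen only for illustration.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def find_feature(target, feature, min_similarity=98.0):
    """Slide the feature along the target (ungapped, forward strand only)
    and report every window that meets the similarity threshold.

    Returns a list of (start, end, percent_identity) tuples, 0-based,
    with end exclusive.
    """
    hits = []
    n = len(feature)
    for start in range(len(target) - n + 1):
        pid = percent_identity(target[start:start + n], feature)
        if pid >= min_similarity:
            hits.append((start, start + n, pid))
    return hits


# A feature that differs from the target at one position out of six is
# missed at 100% but found once the threshold is lowered.
target = "AAACGTACGTTT"
print(find_feature(target, "CGTTCG", min_similarity=100.0))
print(find_feature(target, "CGTTCG", min_similarity=80.0))
```

Lowering min_similarity here plays the same role as moving the slider down: features that differ slightly from the database copy start to match.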

2B. Annotation of a mitochondrial genome using a custom annotation database

In Exercise 2B, as in Exercise 1, we will take an unannotated kiwi mitochondrial genome, this time annotating it using a simple database comprising only the emu mitochondrial genome.

Step 1: Creating your own annotation database

To create a database we first need to create a folder to hold our annotated sequence. Right-click (or ALT/CTRL-click) on the /Local folder in your Geneious Sources list and choose the option for New Folder. Give your new folder an appropriate name (in this example we'll use the name Emu database), then click OK.

Now copy the Mitochondrion_Emu file in this Tutorial folder and paste it into the new Emu database folder. That's it: you now have a very simple annotation database.

Step 2: Annotating your kiwi sequence

Switch back to the Annotation Tutorial folder and select the unannotated sequence file called Mitochondrion_Kiwi_2. Select the Sequence View panel and click on the Live Annotate & Predict tab.

To set the Emu database folder as an annotation database, click on Source: and use the Select Feature Folder window that opens to navigate to the Emu database folder.

Once you have specified the database, the live annotation tool will go to work comparing your sequence to all annotated features found in sequences within the database folder. For large databases, a progress bar will appear showing that the live annotation search is in progress. Adjust the % Similarity slider downwards until no new features appear in the Sequence Viewer. You should find that below about 45% similarity no new features appear on the kiwi sequence.
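The idea of lowering the threshold until nothing new appears can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function names are hypothetical, and the similarity scores are invented numbers, not values measured from the kiwi and emu sequences.

```python
def features_at_threshold(best_similarity, threshold):
    """Names of features whose best match meets the threshold."""
    return sorted(name for name, score in best_similarity.items()
                  if score >= threshold)


def lowest_useful_threshold(best_similarity, start=100, stop=0, step=5):
    """Step the threshold down from start to stop and return the lowest
    value at which at least one new feature appeared."""
    previous = set()
    last_gain = start
    for threshold in range(start, stop - 1, -step):
        current = set(features_at_threshold(best_similarity, threshold))
        if current - previous:
            last_gain = threshold
        previous = current
    return last_gain


# Invented best-match scores (percent) for four example features.
scores = {"tRNA-Phe": 92.0, "COX1": 88.0, "ND6": 71.5, "D-loop": 52.0}
print(features_at_threshold(scores, 85))
print(lowest_useful_threshold(scores))
```

With these made-up scores, nothing new is gained by going below the returned threshold, which mirrors the point at which you can stop moving the slider.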

If you are happy that the majority of features have been identified, click the Apply button to permanently add the annotations to your sequence.

If you hover over any newly added annotation in the sequence viewer window, a yellow pop-up note will appear showing data relating to the annotation, including the Hit name, feature type, gene product function (if known) and a predicted translation if the feature is a coding sequence (CDS).

Note that the Annotate From... tool has also transferred the Source annotation from the emu file (coloured blue). As in Exercise 1, you should edit the Source annotation to specify Apteryx owenii as the source organism, and correct the Feature organism: property and the Feature interval.

Once you have finished checking and editing the transferred annotations, save the sequence.

In the yellow pop-up you will also see that the new annotation shows the "Transferred Translation" of the matching emu CDS. To delete the emu translations from all of your CDS annotations, select the Mitochondrion_Kiwi_2 file, click on the Annotations tab, and type CDS in the search field to display annotations of type CDS. Then click in the Annotation table, use command/control-A to select all, and select Edit Annotations. From this window, remove the Transferred Translation property.

Finally, as seen in Exercise 1, because these annotations are transferred based on shared homology with an annotated feature, there may be errors in CDS and gene ranges due to slight differences in gene product sizes. Double-check all of the newly annotated features to ensure the boundaries and translations make sense. Adjust the annotation ranges if required.

This exercise has demonstrated how the Annotate From... function allows you to rapidly transfer annotations to a sequence, based on the nucleotide similarity between the annotations and the sequence. In the next exercise we will use protein annotations instead of nucleotide annotations to annotate our sequence.

2C. Annotation using a protein database

The Annotate From... tool also allows you to transfer annotations from protein sequences. In this exercise we will use a list of annotated proteins as our annotation database. Select the list Mitochondrion_Emu_CDS to view the list of annotated proteins. If you zoom in you will see that each protein has a stop at the end. This is required for proper annotation of a complete CDS. As in Exercise 2B Step 1, create a new folder, this time calling it Protein DB, and place the Mitochondrion_Emu_CDS list in the new folder.

Next, select the unannotated sequence Mitochondrion_Kiwi_4 and go to the Live Annotate & Predict tab. Check the option to Annotate from... and set the Source: folder to the new Protein DB folder. Then click the Advanced button, make sure the option for Translation Search is checked, and click Done.

The translation search translates the nucleotide sequence in all six reading frames for comparison to the protein sequences in the annotation database. Adjust the Similarity slider to ensure all matches are found, then click Apply and Save to permanently add the CDS annotations to the sequence.
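As an illustration of what translating in six frames means, the following Python sketch produces all six translations of a short nucleotide sequence. It assumes the vertebrate mitochondrial genetic code (NCBI translation table 2), since the sequences in this exercise are mitochondrial. How Geneious selects a genetic code for the translation search is not covered here, and the function names are hypothetical.

```python
# Vertebrate mitochondrial code (NCBI translation table 2), with codons
# ordered by base in the order T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CCWW" "LLLLPPPPHHQQRRRR"
               "IIMMTTTTNNKKSS**" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq):
    """Return translations for frames +1, +2, +3, -1, -2 and -3."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


for frame, protein in six_frame_translations("ATGGCATGA").items():
    print(frame, protein)
```

Each of the six translated strings can then be compared against the protein sequences in the database, which is why a protein database can annotate a CDS on either strand and in any frame.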

Go to next exercise, Exercise 3: Using the transfer annotations tool