java.lang.Object

com.biomatters.geneious.publicapi.utilities.SequenceUtilities

public final class SequenceUtilities extends Object

A noninstantiable class providing static methods for common tasks associated with nucleotide and protein sequences.

Method Summary

Modifier and Type

Method

Description

static DefaultAlignmentDocument

alignmentFromJeblSequences(String name, List<Sequence> jeblSequences)

Converts the given alignment of Jebl sequences into a DefaultAlignmentDocument

static CharSequence

asDna(CharSequence nucleotideCharSequence)

Views an underlying (nucleotide) CharSequence as DNA by dynamically translating 'U's to 'T's and 'u's to 't's. It is guaranteed that if charSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).

static Alignment

asJeblAlignment(List<SequenceDocument> sequences)

Convert a list of (aligned) Geneious sequences to a jebl alignment.

static Sequence

asJeblSequence(AnnotatedPluginDocument referenceDocument, SequenceDocument sequence)

Deprecated.
use asJeblSequence(SequenceAlignmentDocument.ReferencedSequence, SequenceDocument)

static Sequence

asJeblSequence(SequenceAlignmentDocument.ReferencedSequence referencedSequence, SequenceDocument sequence)

Convert from a Geneious sequence to a jebl sequence.

static Sequence

asJeblSequence(SequenceDocument sequence)

Convert from a Geneious sequence to a jebl sequence.

static List<Sequence>

asJeblSequences(SequenceDocument... sequences)

Convert a set of Geneious sequences to jebl sequences.

static List<Sequence>

asJeblSequences(List<SequenceDocument> sequences)

Convert a set of Geneious sequences to jebl sequences.

static CharSequence

asRna(CharSequence nucleotideCharSequence)

Views an underlying (nucleotide) CharSequence as RNA by dynamically translating 'T's to 'U's and 't's to 'u's. It is guaranteed that if charSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).

static CharSequence

asTranslation(CharSequence nucleotideCharSequence, GeneticCode geneticCode)

Deprecated.
use asTranslation(CharSequence, jebl.evolution.sequences.GeneticCode, boolean) instead.

static CharSequence

asTranslation(CharSequence nucleotideCharSequence, GeneticCode geneticCode, boolean translateFirstCodonUsingFirstCodonTable)

Views an underlying (nucleotide) CharSequence as its translation.

static SequenceDocument

concatenateSequences(List<? extends SequenceDocument> sequences, boolean circular, int indexOfDocumentToUseForOrigin, ProgressListener progressListener)

Concatenate a list of sequence documents.

static String

containsInvalidResidues(SequenceDocument sequenceDocument, boolean allowGaps, boolean fastIncompleteCheck)

Checks if a sequence contains invalid sequence residues.

static String

containsInvalidResidues(CharSequence sequenceResidues, SequenceDocument.Alphabet alphabet, boolean allowGaps, boolean fastIncompleteCheck)

Checks if a sequence contains invalid sequence residues.

static String

containsInvalidResidues(CharSequence sequenceResidues, SequenceType sequenceType, boolean allowGaps, boolean fastIncompleteCheck)

Checks if a sequence contains invalid sequence residues.

static List<AnnotatedPluginDocument>

createNewDocumentsByTransformingSequences(List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, ProgressListener progressListener, String newSequenceOrDocumentNamePrefix)

Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.

static List<AnnotatedPluginDocument>

createNewDocumentsByTransformingSequences(List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, ProgressListener progressListener, String newSequenceOrDocumentNamePrefix, String newSequenceOrDocumentNameSuffix)

Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.

static SequenceDocument

createSequenceCopy(SequenceDocument original)

Creates a copy of the original sequence if necessary.

static SequenceDocument

createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, CharSequence gappedSequenceCharacters)

Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion.

static SequenceDocument

createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, CharSequence gappedSequenceCharacters, boolean includeTracks)

Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion.

static DefaultSequenceDocument

createSequenceCopyEditable(SequenceDocument original)

Creates a copy of the original sequence that is editable.

static DefaultSequenceDocument

createSequenceDocument(SequenceType sequenceType, String name, String description, CharSequence sequenceString, Date creationDate)

Creates a DefaultNucleotideSequence or DefaultAminoAcidSequence depending on sequenceType.

static SequenceDocument

generateConsensus(SequenceAlignmentDocument alignment, ProgressListener progressListener)

Generates a consensus sequence for an alignment using default consensus settings.

static SequenceDocument

generateConsensusSequence(SequenceAlignmentDocument alignment, ProgressListener progressListener)

Deprecated.
use generateConsensus(com.biomatters.geneious.publicapi.documents.sequence.SequenceAlignmentDocument, jebl.util.ProgressListener) instead

static SequenceDocument.Alphabet

getAlphabet(AnnotatedPluginDocument... documents)

static SequenceDocument.Alphabet

getAlphabet(SequenceDocument sequence)

Get the Alphabet of a sequence.

static SequenceDocument.Alphabet

getAlphabet(SequenceType sequenceType)

Gets a Geneious alphabet type that is equivalent to a jebl library SequenceType.

static List<SequenceAnnotation>

getAnnotationsOfType(SequenceDocument document, String type)

Deprecated.
use getAnnotationsOfType(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument, String, boolean) instead.

static List<SequenceAnnotation>

getAnnotationsOfType(SequenceDocument document, String type, boolean returnAnnotationsInTracks)

Get all annotations in document matching the given type.

static List<SequenceAnnotation>

getAnnotationsOfType(List<SequenceAnnotation> annotations, String type)

Get all annotations in list matching the given type

static String

getBlastAlignmentText(SequenceAlignmentDocument alignment, boolean geneiousFriendly)

Formats the given alignment in BLAST text format

static String

getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery)

Equivalent to calling getForwardRegexForSequence(charSequence, sequenceType, interpretAmbiguities, false, true)

static String

getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget)

Deprecated.
use getForwardRegexForSequence(CharSequence, jebl.evolution.sequences.SequenceType, boolean, boolean, boolean) instead

static String

getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget, boolean allowExtraGapsInTarget)

Given a nucleotide or amino acid sequence, returns a regular expression that matches forward occurrences of this sequence in a larger sequence.

static Pattern

getForwardRegexPatternForSequence(CharSequence sequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery)

Equivalent to calling getForwardRegexPatternForSequence(charSequence, sequenceType, interpretAmbiguities, false)

static Pattern

getForwardRegexPatternForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget)

Given a nucleotide or amino acid sequence string, returns a regular expression pattern that matches forward occurrences of this search string in a larger sequence string.

static Integer

getIndexBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, int index, boolean mapToOriginal)

Gets the extraction annotations from the sequence document and maps a residue index to a residue index on either the original sequence or the result sequence, depending on the value of mapToOriginal

static SequenceAnnotationInterval

getIntervalBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, SequenceAnnotationInterval interval, boolean mapToOriginal)

Gets the extraction annotations from the sequence document and maps the interval to either the original sequence or the result sequence, depending on the value of mapToOriginal

static int

getLeadingGapsLength(CharSequence charSequence)

Returns the start index of the non-gap regions in the specified charSequence.

static String

getMaximalAmbiguitySymbol(SequenceType sequenceType)

Gets the code for the state in this sequence type which represents a base/residue that is completely unknown.

static long

getNumberOfSequences(AnnotatedPluginDocument document, SequenceDocument.Alphabet alphabet)

Gets the total number of nucleotide or amino acid sequences contained in the given document which may be an individual sequence, sequence list, or alignment/contig.

static long

getNumberOfSequences(List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet)

Gets the total number of nucleotide or amino acid sequences contained in the given documents which may be individual sequences, sequence lists, or alignments/contigs.

static int

getOriginalIndex(SequenceDocument sequence, int index)

Gets the original numbering of the given index if it is covered by a SequenceAnnotation.TYPE_EXTRACTED_REGION annotation.

static Iterable<SequenceAnnotation>

getSequenceAndTrackAnnotations(SequenceDocument sequence)

A convenience method to get all annotations on the sequence and all annotations on all SequenceTracks on this sequence.

static List<SequenceAnnotation>

getSequenceAnnotationsIncludingImmutableSequencesTrims(SequenceDocument sequence)

Gets all the annotations on the given sequence.

static String

getSequenceCharSequenceHash(SequenceCharSequence charSequence)

static String

getSequenceHash(SequenceDocument sequence)

static String

getSequenceHash(SequenceDocument sequence, List<Interval> intervals)

static List<? extends SequenceDocument>

getSequences(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet, ProgressListener progressListener)

Gets all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments.

static List<? extends SequenceDocument>

getSequences(List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet, ProgressListener progressListener)

Gets all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments.

static Collection<? extends SequenceDocument>

getSequencesWithoutImmediateLoading(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet)

Like getSequences(com.biomatters.geneious.publicapi.documents.AnnotatedPluginDocument[], com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument.Alphabet, jebl.util.ProgressListener) but doesn't require each plugin document to be in memory as long as this Collection is around.

static List<SequenceType>

getSequenceType(AnnotatedPluginDocument document)

Examines a document and determines what the (jebl) sequence type (or types) of the document is (or are), and returns it (or them).

Always returns a List<SequenceType> of size 0, 1 or 2.

static SequenceType

getSequenceType(SequenceDocument sequence)

Get the (jebl) sequence type.

static SequenceType

getSequenceType(SequenceDocument.Alphabet alphabet)

Gets a jebl library SequenceType that is equivalent to a Geneious alphabet.

static int

getTrailingGapsLength(CharSequence charSequence)

Get the number of trailing gap ('-') characters in the sequence.

static int

getTrailingGapsStartIndex(CharSequence charSequence)

Returns the end index of the non-gap regions in the specified charSequence.

static CharSequence

getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps)

Replace any invalid bases/residues in the given sequence document with ambiguity symbols.

static CharSequence

getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps, boolean replaceWithGaps)

Replace any invalid bases/residues in the given sequence document with ambiguity symbols or gaps.

static boolean

isPredominantlyRna(CharSequence charSequence, int maximumNonGapsToLookAt)

Checks whether a sequence is predominantly RNA (rather than DNA).

static boolean

isRna(CharSequence charSequence)

Checks whether a sequence is RNA (rather than DNA) based on whether the sequence contains either a T/t or a U/u first.

static boolean

isRna(CharSequence charSequence, int maxNucleotidesToCheck)

Checks whether a sequence is RNA (rather than DNA) based on whether the sequence contains either a T/t or a U/u first.

static boolean

isStateAssignableFrom(State stateA, State stateB)

Same as stateA.getCanonicalStates().containsAll(stateB.getCanonicalStates()) except that for NucleotideStates and AminoAcidStates it caches the result.

static CharSequence

removeGaps(CharSequence charSequence)

Constructs a sequence without gaps ('-') from a specified sequence that potentially has gaps.

static CharSequence

removeInvalidResidues(CharSequence sequence, SequenceType sequenceType, boolean allowGaps)

Get a sequence string identical to sequence except that any invalid residues are removed.

static String

replaceQuestionMarksWithMaximalAmbiguitySymbol(SequenceType sequenceType, String sequence)

Gets a version of a sequence string with any question marks replaced with N (for nucleotide sequences) or X (for protein sequences).

static CharSequence

reverseComplement(CharSequence charSequence)

Provides a dynamic reverse complement view onto a nucleotide CharSequence.

static CharSequence

reverseComplementAsDna(CharSequence charSequence)

Similar to reverseComplement except that the result will be returned as DNA even if the input sequence is RNA.

static void

setOriginalResidueNumbering(EditableSequenceDocument document, int startIndex, boolean isReverse)

Sets the original residue numbering of a document. The residue index of a document will appear shifted if the user has "show original residue numbers" selected in the sequence view.

static String

toHTMLFragment(SequenceDocument sequence, String additionalContent)

Generate a HTML fragment that summarises a sequence, including the sequence string.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Method Details
- getForwardRegexForSequence
  
  public static String getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery)
  
  Equivalent to calling getForwardRegexForSequence(charSequence, sequenceType, interpretAmbiguities, false, true)
- getForwardRegexForSequence
  
  @Deprecated public static String getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget)
  
  Deprecated.
  use getForwardRegexForSequence(CharSequence, jebl.evolution.sequences.SequenceType, boolean, boolean, boolean) instead
  
  Equivalent to getForwardRegexForSequence(...,true)
- getForwardRegexForSequence
  
  public static String getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget, boolean allowExtraGapsInTarget)
  
  Given a nucleotide or amino acid sequence, returns a regular expression that matches forward occurrences of this sequence in a larger sequence, i.e. a String s such that Pattern.compile(s, Pattern.CASE_INSENSITIVE) will find all case-insensitive forward matches of querySequence in a larger sequence. The regular expression returned will also match sequences with gaps inserted at any point within the sequence.
  
  Parameters:
  
  querySequence - The nucleotide or amino acid sequence to search for.
  
  sequenceType - The type of the sequence
  
  interpretAmbiguitiesInQuery - If true, then an ambiguous character (e.g. R for nucleotides) in querySequence will match the corresponding canonical states (A and G) in the target.
  
  interpretAmbiguitiesInTarget - If true, then an ambiguous character (e.g. R for nucleotides) in the sequence being searched within will match the corresponding canonical states (A and G) in the querySequence.
  
  allowExtraGapsInTarget - If true, then additional gaps will be allowed in the sequence being searched within.
  
  Returns:
  
  a regular expression that matches forward occurrences of this search string in a larger sequence string or null if any of the characters in the sequence string are not valid residues for sequenceType
  
  Since:
  
  API 4.610 (Geneious 6.1.0)
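
To make the roles of interpretAmbiguitiesInQuery and allowExtraGapsInTarget concrete, the following is a minimal, self-contained sketch of how such a regular expression could be assembled for DNA. It is not the Geneious implementation: the class ForwardRegexSketch, its IUPAC table and the exact shape of the generated regex are assumptions made for illustration, and interpretAmbiguitiesInTarget is left out.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Geneious public API.
final class ForwardRegexSketch {
    // IUPAC nucleotide codes and the canonical bases each one stands for.
    private static final Map<Character, String> CANONICAL = Map.ofEntries(
            Map.entry('A', "A"), Map.entry('C', "C"), Map.entry('G', "G"), Map.entry('T', "T"),
            Map.entry('R', "AG"), Map.entry('Y', "CT"), Map.entry('S', "CG"), Map.entry('W', "AT"),
            Map.entry('K', "GT"), Map.entry('M', "AC"), Map.entry('B', "CGT"), Map.entry('D', "AGT"),
            Map.entry('H', "ACT"), Map.entry('V', "ACG"), Map.entry('N', "ACGT"));

    /** Returns a regex for query, or null if query contains a character outside the table. */
    static String forwardRegex(CharSequence query,
                               boolean interpretAmbiguitiesInQuery,
                               boolean allowExtraGapsInTarget) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < query.length(); i++) {
            char c = Character.toUpperCase(query.charAt(i));
            String canonical = CANONICAL.get(c);
            if (canonical == null) {
                return null; // mirrors the documented null result for invalid residues
            }
            if (i > 0 && allowExtraGapsInTarget) {
                regex.append("-*"); // tolerate gaps between consecutive residues
            }
            if (interpretAmbiguitiesInQuery && canonical.length() > 1) {
                // Assumption: an ambiguity code matches itself and its canonical bases.
                regex.append('[').append(c).append(canonical).append(']');
            } else {
                regex.append(c);
            }
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String regex = forwardRegex("ARG", true, true);
        System.out.println(regex); // prints A-*[RAG]-*G
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher("ttA-a-gcc");
        System.out.println(m.find() ? "match at " + m.start() : "no match"); // prints match at 2
    }
}
```

As in the description above, the returned string is meant to be compiled with Pattern.CASE_INSENSITIVE, so the sketch only emits upper-case characters.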
- getForwardRegexPatternForSequence
  
  public static Pattern getForwardRegexPatternForSequence(CharSequence sequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery)
  
  Equivalent to calling getForwardRegexPatternForSequence(charSequence, sequenceType, interpretAmbiguities, false)
- getForwardRegexPatternForSequence
  
  public static Pattern getForwardRegexPatternForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget)
  
  Given a nucleotide or amino acid sequence string, returns a regular expression pattern that matches forward occurrences of this search string in a larger sequence string.
  
  Parameters:
  
  querySequence - The nucleotide or amino acid sequence to search for.
  
  sequenceType - The type of the sequence
  
  interpretAmbiguitiesInQuery - If true, then an ambiguous character (e.g. R for nucleotides) in querySequence will match the corresponding canonical states (A and G) in the target.
  
  Returns:
  
  a regular expression that matches forward occurrences of this search string in a larger sequence string or null if any of the characters in the sequence string are not valid residues for sequenceType
- isStateAssignableFrom
  
  public static boolean isStateAssignableFrom(State stateA, State stateB)
  
  Same as stateA.getCanonicalStates().containsAll(stateB.getCanonicalStates()) except that for NucleotideStates and AminoAcidStates it caches the result.
  
  Parameters:
  
  stateA - A state (e.g. a NucleotideState or AminoAcidState)
  
  stateB - A state of the same type as stateA
  
  Returns:
  
  true if stateA.getCanonicalStates().containsAll(stateB.getCanonicalStates())
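
The containsAll relationship can be illustrated without the jebl State classes. The sketch below is an assumption-laden stand-in: it models a state as a single IUPAC nucleotide code and its canonical states as a set of characters, using the hypothetical class StateAssignableSketch. It does no caching.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; the real method works on jebl State objects.
final class StateAssignableSketch {
    // A small subset of IUPAC codes mapped to their canonical bases.
    private static final Map<Character, String> CANONICAL = Map.of(
            'A', "A", 'C', "C", 'G', "G", 'T', "T",
            'R', "AG", 'Y', "CT", 'N', "ACGT");

    /** Canonical bases for a code; throws NullPointerException for codes outside the table. */
    static Set<Character> canonicalStates(char code) {
        Set<Character> states = new HashSet<>();
        for (char c : CANONICAL.get(code).toCharArray()) {
            states.add(c);
        }
        return states;
    }

    /** True if every canonical state of stateB is also a canonical state of stateA. */
    static boolean isAssignableFrom(char stateA, char stateB) {
        return canonicalStates(stateA).containsAll(canonicalStates(stateB));
    }

    public static void main(String[] args) {
        System.out.println(isAssignableFrom('R', 'A')); // prints true: {A, G} contains {A}
        System.out.println(isAssignableFrom('A', 'R')); // prints false: {A} lacks G
    }
}
```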
- createSequenceDocument
  
  public static DefaultSequenceDocument createSequenceDocument(SequenceType sequenceType, String name, String description, CharSequence sequenceString, Date creationDate)
  
  Creates a DefaultNucleotideSequence or DefaultAminoAcidSequence depending on sequenceType. See the documentation of these classes' constructors for the semantics of the parameters.
- setOriginalResidueNumbering
  
  public static void setOriginalResidueNumbering(EditableSequenceDocument document, int startIndex, boolean isReverse)
  
  Sets the original residue numbering of a document. The residue index of a document will appear shifted if the user has "show original residue numbers" selected in the sequence view.
  
  Parameters:
  
  document - document to set the residue numbering for
  
  startIndex - start index for the residue numbering. The first original residue is residue 1.
  
  isReverse - true if the residue numbering should count down from startIndex, false if it should count up.
- containsInvalidResidues
  
  public static String containsInvalidResidues(SequenceDocument sequenceDocument, boolean allowGaps, boolean fastIncompleteCheck)
  
  Checks if a sequence contains invalid sequence residues.
  
  Parameters:
  
  sequenceDocument - the sequence to check for validity.
  
  allowGaps - true if the sequence is allowed to contain gaps
  
  fastIncompleteCheck - This parameter is ignored. It was added when Java 5 was widely used which is 10 times slower than Java 6. Checking enormous sequences is slow (a 2GB sequence takes about 20 seconds in Java 5, 2 seconds in Java 6). Set this parameter to true to check only the first and last 1,000,000 residues which catches almost all invalid cases and is much faster on enormous sequences.
  
  Returns:
  
  null if all sequence residues are valid; otherwise a message describing the first invalid residue.
- getSequenceType
  
  public static SequenceType getSequenceType(SequenceDocument.Alphabet alphabet)
  
  Gets a jebl library SequenceType that is equivalent to a Geneious alphabet.
  
  Parameters:
  
  alphabet -
  
  Returns:
  
  sequence type that is equivalent to this alphabet
- getAlphabet
  
  public static SequenceDocument.Alphabet getAlphabet(SequenceType sequenceType)
  
  Gets a Geneious alphabet type that is equivalent to a jebl library SequenceType.
  
  Parameters:
  
  sequenceType -
  
  Returns:
  
  alphabet that is equivalent to this sequence type
- containsInvalidResidues
  
  public static String containsInvalidResidues(CharSequence sequenceResidues, SequenceDocument.Alphabet alphabet, boolean allowGaps, boolean fastIncompleteCheck)
  
  Checks if a sequence contains invalid sequence residues.
  
  Parameters:
  
  sequenceResidues - sequence residues to check for validity.
  
  alphabet - the alphabet of residues expected to be in sequenceResidues
  
  allowGaps - true if the sequence is allowed to contain gaps
  
  fastIncompleteCheck - This parameter is ignored. It was added when Java 5 was widely used which is 10 times slower than Java 6. Checking enormous sequences is slow (a 2GB sequence takes about 20 seconds in Java 5, 2 seconds in Java 6). Set this parameter to true to check only the first and last 1,000,000 residues which catches almost all invalid cases and is much faster on enormous sequences.
  
  Returns:
  
  null if all sequence residues are valid; otherwise a message describing the first invalid residue.
- containsInvalidResidues
  
  public static String containsInvalidResidues(CharSequence sequenceResidues, SequenceType sequenceType, boolean allowGaps, boolean fastIncompleteCheck)
  
  Checks if a sequence contains invalid sequence residues.
  
  Parameters:
  
  sequenceResidues - sequence residues to check for validity.
  
  sequenceType - the type of residues expected to be in sequenceResidues
  
  allowGaps - true if the sequence is allowed to contain gaps
  
  fastIncompleteCheck - This parameter is ignored. It was added when Java 5 was widely used which is 10 times slower than Java 6. Checking enormous sequences is slow (a 2GB sequence takes about 20 seconds in Java 5, 2 seconds in Java 6). Set this parameter to true to check only the first and last 1,000,000 residues which catches almost all invalid cases and is much faster on enormous sequences.
  
  Returns:
  
  null if all sequence residues are valid; otherwise a message describing the first invalid residue.
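
The return convention of the containsInvalidResidues overloads (null for a valid sequence, otherwise a message about the first invalid residue) is easy to misread as a boolean. The sketch below illustrates it for nucleotides only. The class InvalidResidueSketch, its list of valid characters and the wording of the message are assumptions, not what Geneious returns.

```java
// Hypothetical illustration only; not part of the Geneious public API.
final class InvalidResidueSketch {
    // Assumed set of valid nucleotide characters (IUPAC codes plus U).
    private static final String VALID_NUCLEOTIDES = "ACGTURYSWKMBDHVN";

    /** Returns null if all residues are valid, otherwise a message about the first invalid one. */
    static String containsInvalidResidues(CharSequence residues, boolean allowGaps) {
        for (int i = 0; i < residues.length(); i++) {
            char c = residues.charAt(i);
            boolean valid = c == '-'
                    ? allowGaps
                    : VALID_NUCLEOTIDES.indexOf(Character.toUpperCase(c)) >= 0;
            if (!valid) {
                // The message format is made up for this sketch.
                return "Invalid residue '" + c + "' at position " + (i + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String problem = containsInvalidResidues("ACXT", true);
        if (problem != null) {
            System.out.println(problem); // prints Invalid residue 'X' at position 3
        }
    }
}
```

Callers therefore test the result against null rather than treating it as a flag.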
- removeInvalidResidues
  
  public static CharSequence removeInvalidResidues(CharSequence sequence, SequenceType sequenceType, boolean allowGaps)
  
  Get a sequence string identical to sequence except that any invalid residues are removed. Gaps are only removed if allowGaps is false. All valid characters remain unchanged (they maintain their original case and there are no U->T replacements for nucleotides.)
  
  Parameters:
  
  sequence - a string of residues that may or may not be valid residues
  
  sequenceType - the type of residues in sequence
  
  allowGaps - if this is true, then gaps are not removed.
  
  Returns:
  
  a sequence string identical to sequence except that any invalid residues are removed. If there are no invalid residues, sequence is returned.
  
  Throws:
  
  OutOfMemoryError - if a sequence contains invalid residues and a valid version of the sequence cannot fit in memory
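
The following sketch illustrates the two documented properties of removeInvalidResidues: valid characters are kept unchanged (case preserved, no U->T replacement), and the original object is returned when nothing needs removing. The class RemoveInvalidSketch and its set of valid nucleotide characters are assumptions for illustration; the real method also handles other sequence types.

```java
// Hypothetical illustration only; not part of the Geneious public API.
final class RemoveInvalidSketch {
    // Assumed set of valid nucleotide characters (IUPAC codes plus U).
    private static final String VALID = "ACGTURYSWKMBDHVN";

    static CharSequence removeInvalidResidues(CharSequence sequence, boolean allowGaps) {
        StringBuilder kept = null; // allocated lazily, only once something is dropped
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            boolean keep = c == '-'
                    ? allowGaps
                    : VALID.indexOf(Character.toUpperCase(c)) >= 0;
            if (!keep && kept == null) {
                // First removal: copy everything seen so far.
                kept = new StringBuilder(sequence.length()).append(sequence, 0, i);
            } else if (keep && kept != null) {
                kept.append(c);
            }
        }
        return kept == null ? sequence : kept;
    }

    public static void main(String[] args) {
        System.out.println(removeInvalidResidues("aC-X?gU", false)); // prints aCgU
        System.out.println(removeInvalidResidues("aC-X?gU", true));  // prints aC-gU
    }
}
```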
- getValidSequence
  
  public static CharSequence getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps)
  
  Replace any invalid bases/residues in the given sequence document with ambiguity symbols.
  
  Parameters:
  
  sequenceDocument - sequence document to replace the invalid bases in
  
  allowGaps - whether gaps are allowed (if false they will be replaced with ambiguity symbols)
  
  Returns:
  
  the version of the sequence string with the invalid bases/residues replaced with ambiguity symbols
- getValidSequence
  
  public static CharSequence getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps, boolean replaceWithGaps)
  
  Replace any invalid bases/residues in the given sequence document with ambiguity symbols or gaps.
  
  Parameters:
  
  sequenceDocument - sequence document to replace the invalid bases in
  
  allowGaps - whether gaps are allowed (if false they will be replaced with ambiguity symbols)
  
  replaceWithGaps - whether invalid bases/residues should be replaced with gaps - should only be done if sequence is in an alignment
  
  Returns:
  
  the version of the sequence string with the invalid bases/residues replaced with ambiguity symbols
  
  Throws:
  
  IllegalArgumentException - if allowGaps is false but replaceWithGaps is true
  
  Since:
  
  API 4.20 (Geneious 5.2.0)
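
How allowGaps and replaceWithGaps interact can be sketched as follows. This is an illustration under stated assumptions: the hypothetical class ValidSequenceSketch works on a plain CharSequence rather than a SequenceDocument, handles nucleotides only, and uses N as the ambiguity symbol.

```java
// Hypothetical illustration only; not part of the Geneious public API.
final class ValidSequenceSketch {
    // Assumed set of valid nucleotide characters (IUPAC codes plus U).
    private static final String VALID = "ACGTURYSWKMBDHVN";

    static CharSequence getValidSequence(CharSequence residues, boolean allowGaps, boolean replaceWithGaps) {
        if (!allowGaps && replaceWithGaps) {
            // Same precondition as documented above.
            throw new IllegalArgumentException("replaceWithGaps requires allowGaps");
        }
        char replacement = replaceWithGaps ? '-' : 'N';
        StringBuilder result = new StringBuilder(residues.length());
        for (int i = 0; i < residues.length(); i++) {
            char c = residues.charAt(i);
            boolean valid = c == '-'
                    ? allowGaps
                    : VALID.indexOf(Character.toUpperCase(c)) >= 0;
            result.append(valid ? c : replacement);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(getValidSequence("AC-X", true, false));  // prints AC-N
        System.out.println(getValidSequence("AC-X", false, false)); // prints ACNN
        System.out.println(getValidSequence("AC-X", true, true));   // prints AC--
    }
}
```

Unlike removeInvalidResidues, this keeps the length of the sequence unchanged, which is why replacing with gaps only makes sense for sequences in an alignment.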
- asRna
  
  public static CharSequence asRna(CharSequence nucleotideCharSequence)
  
  Views an underlying (nucleotide) CharSequence as RNA by dynamically translating 'T's to 'U's and 't's to 'u's. It is guaranteed that if charSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
  
  Parameters:
  
  nucleotideCharSequence - A nucleotide sequence which may already have some RNA residues; it is not guaranteed that it is checked whether the sequence contains invalid residues. Must not be null.
  
  Returns:
  
  A CharSequence with the same sequence of characters as charSequence, except that 'T's are replaced with 'U's and 't's are replaced with 'u's
- asDna
  
  public static CharSequence asDna(CharSequence nucleotideCharSequence)
  
  Views an underlying (nucleotide) CharSequence as DNA by dynamically translating 'U's to 'T's and 'u's to 't's. It is guaranteed that if charSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
  
  Parameters:
  
  nucleotideCharSequence - A nucleotide sequence which may already have some DNA residues; it is not guaranteed that it is checked whether the sequence contains invalid residues. Must not be null.
  
  Returns:
  
  A CharSequence with the same sequence of characters as charSequence, except that 'U's are replaced with 'T's and 'u's are replaced with 't's
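
"Views ... by dynamically translating" in asRna and asDna means that no converted copy is built; characters are mapped when they are read. The sketch below shows one way such a view could look. The class NucleotideViewSketch is hypothetical and is not the type returned by Geneious; in particular it makes no attempt to be a SequenceCharSequence.

```java
// Hypothetical illustration only; not part of the Geneious public API.
final class NucleotideViewSketch implements CharSequence {
    private final CharSequence underlying;
    private final boolean toRna;

    private NucleotideViewSketch(CharSequence underlying, boolean toRna) {
        this.underlying = underlying;
        this.toRna = toRna;
    }

    static CharSequence asRna(CharSequence nucleotides) {
        return new NucleotideViewSketch(nucleotides, true);
    }

    static CharSequence asDna(CharSequence nucleotides) {
        return new NucleotideViewSketch(nucleotides, false);
    }

    @Override
    public int length() {
        return underlying.length();
    }

    @Override
    public char charAt(int index) {
        // Conversion happens per character, on access.
        char c = underlying.charAt(index);
        if (toRna) {
            return c == 'T' ? 'U' : c == 't' ? 'u' : c;
        }
        return c == 'U' ? 'T' : c == 'u' ? 't' : c;
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new NucleotideViewSketch(underlying.subSequence(start, end), toRna);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(length());
        for (int i = 0; i < length(); i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(asRna("ACGTt-u")); // prints ACGUu-u
        System.out.println(asDna("ACGUu-t")); // prints ACGTt-t
    }
}
```

Because the view reads through to the underlying sequence, it costs constant extra memory regardless of sequence length.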
- asTranslation
  
  @Deprecated public static CharSequence asTranslation(CharSequence nucleotideCharSequence, GeneticCode geneticCode)
  
  Deprecated.
  use asTranslation(CharSequence, jebl.evolution.sequences.GeneticCode, boolean) instead.
- asTranslation
  
  public static CharSequence asTranslation(CharSequence nucleotideCharSequence, GeneticCode geneticCode, boolean translateFirstCodonUsingFirstCodonTable)
  
  Views an underlying (nucleotide) CharSequence as its translation. If the CharSequence is not a multiple of 3, the extra 1 or 2 characters are ignored. The translated sequence will have length nucleotideCharSequence.length()/3. If the nucleotide sequence contains unknown nucleotide characters, these are treated as unknown states and the corresponding translated site will also be the unknown state (?) unless the nucleotide base would not affect the translation (e.g. the 3rd base in some triplets). The concrete type of the return value is not guaranteed. The specified charSequence must not change after it was passed to this method, but it is not guaranteed that violations of this contract will be detected.
  
  Parameters:
  
  nucleotideCharSequence - A nucleotide sequence which may be DNA, RNA or a mixture. Must not be null and must be immutable. Must not contain gaps.
  
  geneticCode - the genetic code to use for the translation. Must not be null.
  
  translateFirstCodonUsingFirstCodonTable - each genetic code specifies a set of codons which get translated as M if they are the first codon even though they normally wouldn't translate as an M when occurring elsewhere in a coding region. If this parameter is true, the first codon will be translated using this alternative translation table for the genetic code.
  
  Returns:
  
  A CharSequence which is a translation of nucleotideCharSequence
  
  Throws:
  
  IllegalArgumentException - if nucleotideCharSequence contains gaps.
  
  NullPointerException - if nucleotideCharSequence or geneticCode is null.
  
  Since:
  
  API 4.41 (Geneious 5.4.1)
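
The length rule (length()/3, trailing 1 or 2 characters ignored) and the effect of translateFirstCodonUsingFirstCodonTable can be sketched with a hard-coded standard genetic code. Everything here is an assumption for illustration: the class TranslationSketch is hypothetical, it builds a String instead of a view, it takes no GeneticCode argument, it treats TTG, CTG and ATG as the start codons of the standard code, and it simplifies unknown bases by emitting '?' for any codon that contains one.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of the Geneious public API.
final class TranslationSketch {
    private static final String BASES = "TCAG";
    // Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
    private static final String AMINO_ACIDS =
            "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";
    // Assumed first-codon table: codons translated as M when they come first.
    private static final Set<String> START_CODONS = Set.of("TTG", "CTG", "ATG");

    static String translate(CharSequence nucleotides, boolean translateFirstCodonUsingFirstCodonTable) {
        for (int i = 0; i < nucleotides.length(); i++) {
            if (nucleotides.charAt(i) == '-') {
                throw new IllegalArgumentException("sequence must not contain gaps");
            }
        }
        StringBuilder protein = new StringBuilder(nucleotides.length() / 3);
        for (int i = 0; i + 3 <= nucleotides.length(); i += 3) {
            String codon = nucleotides.subSequence(i, i + 3).toString()
                    .toUpperCase(Locale.ROOT).replace('U', 'T');
            if (i == 0 && translateFirstCodonUsingFirstCodonTable && START_CODONS.contains(codon)) {
                protein.append('M');
                continue;
            }
            int first = BASES.indexOf(codon.charAt(0));
            int second = BASES.indexOf(codon.charAt(1));
            int third = BASES.indexOf(codon.charAt(2));
            boolean unknown = first < 0 || second < 0 || third < 0;
            protein.append(unknown ? '?' : AMINO_ACIDS.charAt(16 * first + 4 * second + third));
        }
        return protein.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("ATGGCCTAA", false)); // prints MA*
        System.out.println(translate("TTGGCUAA", false));  // prints LA
        System.out.println(translate("TTGGCUAA", true));   // prints MA
    }
}
```

In the last two calls the trailing "AA" is ignored, and only the first codon changes between L and M.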
- reverseComplement
  
  public static CharSequence reverseComplement(CharSequence charSequence)
  
  Provides a dynamic reverse complement view onto a nucleotide CharSequence. For performance, it is not guaranteed whether the charSequence will be checked for invalid residues. If an invalid nucleotide CharSequence is passed in, arbitrary nondeterministic behaviour may occur at any later time, such as e.g. unchecked exceptions thrown from CharSequence.charAt(int).
  
  It is guaranteed that if charSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
  
  Attention: Unlike Utils.reverseComplement(String), this method preserves case and doesn't remove gaps. To remove gaps and convert the sequence to upper case, use removeGaps(CharSequence) and CharSequenceUtilities.asUpperCase(CharSequence).
  
  This method may be slow on sequences which do not contain a T or U near the start of the sequence as it needs to scan through the sequence to determine if it is RNA or DNA. Consider using reverseComplementAsDna(CharSequence) for better performance.
  
  Parameters:
  
  charSequence - The charSequence for which to construct a reverse complement view.
  
  Returns:
  
  A reverse complement view onto charSequence, with case and gaps preserved.
  
  See Also:
  
  reverseComplementAsDna(CharSequence)
- reverseComplementAsDna
  
  public static CharSequence reverseComplementAsDna(CharSequence charSequence)
  
  Similar to reverseComplement except that the result will be returned as DNA even if the input sequence is RNA. This implementation is more efficient than reverseComplement(CharSequence) because it does not need to check the input sequence data type.
  
  Parameters:
  
  charSequence - The charSequence for which to construct a reverse complement view.
  
  Returns:
  
  A reverse complement view onto charSequence, with case and gaps preserved, but RNA converted to DNA
  
  Since:
  
  API 4.202100 (Geneious 2021.0.0)
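
A reverse complement view that always answers in DNA needs no scan of the input, which is the efficiency argument made above. The sketch below shows the idea with a hypothetical class ReverseComplementSketch. The complement table and the decision to pass unrecognised characters through unchanged are assumptions; the real method leaves behaviour on invalid input unspecified.

```java
// Hypothetical illustration only; not part of the Geneious public API.
final class ReverseComplementSketch implements CharSequence {
    // IUPAC codes and their complements, position by position; U complements to A.
    private static final String FROM = "ACGTURYSWKMBDHVNacgturyswkmbdhvn-";
    private static final String TO   = "TGCAAYRSWMKVHDBNtgcaayrswmkvhdbn-";

    private final CharSequence underlying;

    private ReverseComplementSketch(CharSequence underlying) {
        this.underlying = underlying;
    }

    static CharSequence reverseComplementAsDna(CharSequence nucleotides) {
        return new ReverseComplementSketch(nucleotides);
    }

    @Override
    public int length() {
        return underlying.length();
    }

    @Override
    public char charAt(int index) {
        // Read from the opposite end and complement on access; case and gaps are preserved.
        char c = underlying.charAt(underlying.length() - 1 - index);
        int i = FROM.indexOf(c);
        return i < 0 ? c : TO.charAt(i);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        int n = underlying.length();
        return new ReverseComplementSketch(underlying.subSequence(n - end, n - start));
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(length());
        for (int i = 0; i < length(); i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseComplementAsDna("AAGC"));   // prints GCTT
        System.out.println(reverseComplementAsDna("ACGu-n")); // prints n-aCGT
    }
}
```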
- isPredominantlyRna
  
  public static boolean isPredominantlyRna(CharSequence charSequence, int maximumNonGapsToLookAt)
  
  Checks whether a sequence is predominantly RNA (rather than DNA). Same as Utils.isPredominantlyRNA(CharSequence, int), but more efficient on SequenceCharSequences.
  
  Parameters:
  
  charSequence - A charSequence that may contain DNA or RNA characters.
  
  maximumNonGapsToLookAt - Maximum number of non-gap residues to look at before making a decision; Pass in Integer.MAX_VALUE to look at all residues
  
  Returns:
  
  true if the non-gap residues of charSequence are predominantly RNA
- isRna
  
  public static boolean isRna(CharSequence charSequence)
  
  Checks whether a sequence is RNA (rather than DNA) based on whether the sequence contains either a T/t or a U/u first. If it contains neither T/t nor U/u, this method returns false.
  Parameters:
  
  charSequence - A charSequence that may contain DNA or RNA characters.
  
  Returns:
  
  true if this charSequence is RNA (rather than DNA)
  
  See Also:
  
  isPredominantlyRna(CharSequence, int)
- isRna
  
  public static boolean isRna(CharSequence charSequence, int maxNucleotidesToCheck)
  
  Checks whether a sequence is RNA (rather than DNA) based on whether the sequence contains either a T/t or a U/u first. If it contains neither T/t nor U/u, this method returns false.
  Parameters:
  
  charSequence - A charSequence that may contain DNA or RNA characters.
  
  maxNucleotidesToCheck - the maximum number of nucleotides/gaps to check (excluding leading/trailing gaps) before giving up and calling it DNA if no T or U is found.
  
  Returns:
  
  true if this charSequence is RNA (rather than DNA)
  
  Since:
  
  API 4.202200 (Geneious 2022.0.0)
  
  See Also:
  
  isPredominantlyRna(CharSequence, int)
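  
  The "whichever of T/t or U/u comes first decides" rule described for isRna(CharSequence) can be illustrated with a small stand-alone sketch. The class RnaDetectionSketch and the method looksLikeRna are hypothetical names used for illustration; the sketch does not model the maxNucleotidesToCheck limit or any SequenceCharSequence optimisation.
  
  ```java
  final class RnaDetectionSketch {
      /**
       * Scans from the start and lets the first T/t or U/u decide.
       * Returns false when the sequence contains neither.
       */
      static boolean looksLikeRna(CharSequence sequence) {
          for (int i = 0; i < sequence.length(); i++) {
              char c = Character.toUpperCase(sequence.charAt(i));
              if (c == 'U') {
                  return true;   // a U was seen before any T
              }
              if (c == 'T') {
                  return false;  // a T was seen before any U
              }
          }
          return false;          // neither T nor U present: treated as DNA
      }

      public static void main(String[] args) {
          System.out.println(looksLikeRna("ACGU"));  // true
          System.out.println(looksLikeRna("ACGTU")); // false
          System.out.println(looksLikeRna("ACG-C")); // false
      }
  }
  ```
  
  The sketch also shows why such a check can be slow on sequences without an early T or U: the loop has to run to the end before it can answer.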
- removeGaps
  
  public static CharSequence removeGaps(CharSequence charSequence)
  
  Constructs a sequence without gaps ('-') from a specified sequence that potentially has gaps. If the specified sequence does contain gaps, then a gapless copy is returned. Otherwise, the original charSequence is returned. It is guaranteed that CharSequenceUtilities.equals(removeGaps(cs), cs.toString().replace("-", "")) for any CharSequence cs that fulfills its contract.
  
  Parameters:
  
  charSequence - A nucleotide or amino acid sequence, potentially with gaps ('-')
  
  Returns:
  
  A CharSequence that contains the same sequence of characters but without the gaps ('-'). Returns charSequence if charSequence doesn't contain any gaps.
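  
  The documented contract (a gapless copy when gaps are present, the original instance otherwise) can be sketched as follows. RemoveGapsSketch is a hypothetical stand-in written for illustration, not the library code.
  
  ```java
  final class RemoveGapsSketch {
      /** Returns the argument itself when it has no '-' characters, otherwise a gapless copy. */
      static CharSequence removeGaps(CharSequence charSequence) {
          boolean hasGap = false;
          for (int i = 0; i < charSequence.length() && !hasGap; i++) {
              hasGap = charSequence.charAt(i) == '-';
          }
          if (!hasGap) {
              return charSequence; // nothing to remove, so no copy is made
          }
          StringBuilder gapless = new StringBuilder(charSequence.length());
          for (int i = 0; i < charSequence.length(); i++) {
              char c = charSequence.charAt(i);
              if (c != '-') {
                  gapless.append(c);
              }
          }
          return gapless.toString();
      }

      public static void main(String[] args) {
          CharSequence gapped = "A-C--G";
          // Same characters as the String-based formulation in the contract above.
          System.out.println(removeGaps(gapped));                     // ACG
          System.out.println(gapped.toString().replace("-", ""));     // ACG
      }
  }
  ```
  
  Since the original instance may be returned, callers should not assume the result is a fresh object that is safe to mutate independently of the input.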
- getLeadingGapsLength
  
  public static int getLeadingGapsLength(CharSequence charSequence)
  
  Returns the start index of the non-gap regions in the specified charSequence, i.e. the length of the longest prefix of charSequence that contains only '-' characters.
  
  Parameters:
  
  charSequence - A CharSequence that may contain some leading gap characters '-'
  
  Returns:
  
  The length of the longest prefix of charSequence that contains only '-' characters.
- getTrailingGapsLength
  
  public static int getTrailingGapsLength(CharSequence charSequence)
  
  Get the number of trailing gap ('-') characters in the sequence.
  
  Parameters:
  
  charSequence - A CharSequence that may have trailing gap characters.
  
  Returns:
  
  the number of trailing gap ('-') characters in the sequence or 0 if the sequence is entirely gaps.
- getTrailingGapsStartIndex
  
  public static int getTrailingGapsStartIndex(CharSequence charSequence)
  
  Returns the end index of the non-gap regions in the specified charSequence. This is identical to charSequence.length() minus the length of the longest suffix of charSequence that consists only of '-', except when charSequence consists only of '-', in which case this method returns charSequence.length() because there are no non-gap regions. In other words, in a sequence that consists only of gaps, all gaps are considered leading rather than trailing gaps, i.e. the non-gap region is considered to start just beyond the end of the sequence.
  
  Parameters:
  
  charSequence - A CharSequence that may contain some trailing gap characters '-'
  
  Returns:
  
  1+the index of the last nongap character in charSequence, or charSequence.length() if charSequence consists only of gaps
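  
  The relationship between the three gap-index methods, including the all-gaps special case, can be sketched in plain Java. GapIndexSketch and its method names are hypothetical and written only to mirror the behaviour described in this section.
  
  ```java
  final class GapIndexSketch {
      /** Length of the longest prefix consisting only of '-'. */
      static int leadingGapsLength(CharSequence s) {
          int i = 0;
          while (i < s.length() && s.charAt(i) == '-') {
              i++;
          }
          return i;
      }

      /** 1 + index of the last non-gap character, or s.length() if s is all gaps. */
      static int trailingGapsStartIndex(CharSequence s) {
          int end = s.length();
          while (end > 0 && s.charAt(end - 1) == '-') {
              end--;
          }
          // In an all-gap sequence every gap counts as leading, none as trailing.
          return end == 0 ? s.length() : end;
      }

      /** Number of trailing gaps; 0 for an all-gap sequence. */
      static int trailingGapsLength(CharSequence s) {
          return s.length() - trailingGapsStartIndex(s);
      }

      public static void main(String[] args) {
          String s = "--AC-";
          System.out.println(leadingGapsLength(s));      // 2
          System.out.println(trailingGapsStartIndex(s)); // 4
          System.out.println(trailingGapsLength(s));     // 1
          System.out.println(trailingGapsLength("---")); // 0
      }
  }
  ```
  
  With these definitions the non-gap region of a sequence is the half-open range from leadingGapsLength to trailingGapsStartIndex, which is empty for a sequence made up entirely of gaps.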
- getAlphabet
  
  public static SequenceDocument.Alphabet getAlphabet(SequenceDocument sequence)
  
  Get the Alphabet of a sequence.
  
  Parameters:
  
  sequence - a SequenceDocument to get the alphabet for.
  
  Returns:
  
  Alphabet of sequence
- getSequenceType
  
  public static SequenceType getSequenceType(SequenceDocument sequence)
  
  Get the (jebl) sequence type.
  
  Parameters:
  
  sequence - a SequenceDocument to get the sequence type of.
  
  Returns:
  
  type of sequence
  
  Throws:
  
  IllegalArgumentException - if sequence is neither a NucleotideSequenceDocument nor an AminoAcidSequenceDocument.
- getSequenceType
  
  public static List<SequenceType> getSequenceType(AnnotatedPluginDocument document)
  Examines a document and returns its (jebl) sequence type, or both types if it contains sequences of each.
  
  Always returns a List<SequenceType> of size 0, 1 or 2.
  
  Size 0: when the given document is of a type that could have either SequenceType.AMINO_ACID or SequenceType.NUCLEOTIDE or both, and that document has no sequences at all, for example an empty SequenceListDocument
  
  Size 1: when the given document just contains a single sequence or is of a type where the SequenceType is always known, e.g. a NucleotideSequenceDocument or a SequenceAlignmentDocument.
  
  Size 2: when the given document has sequences of both types, e.g. a SequenceListDocument with sequences of both types.
  Parameters:
  
  document - the document to determine the SequenceType of
  
  Returns:
  
  a list containing the sequence type or types of the given document.
  
  Throws:
  
  IllegalArgumentException - if the given document type wasn't a valid type to determine the SequenceType of.
  
  Since:
  
  API 4.610 (Geneious 6.1.0)
- getAlphabet
  
  public static SequenceDocument.Alphabet getAlphabet(AnnotatedPluginDocument... documents)
  
  Parameters:
  
  documents - the documents to get the alphabet for
  
  Returns:
  
  The alphabet that all these documents have in common, or null if they are not all the same alphabet or if any of the documents have multiple alphabets
  
  Throws:
  
  IllegalArgumentException - if any of the documents aren't a type of sequence (nucleotide, protein, sequence list or alignment)
  
  Since:
  
  API 4.1010 (Geneious 10.1.0)
- toHTMLFragment
  
  public static String toHTMLFragment(SequenceDocument sequence, String additionalContent)
  
  Generates an HTML fragment that summarises a sequence, including the sequence string. If the sequence is longer than a certain threshold X, then only the first X residues are shown.
  
  Parameters:
  
  sequence - a SequenceDocument
  
  additionalContent - additional content to include
  
  Returns:
  
  the HTML-formatted summary
- asJeblSequence
  
  public static Sequence asJeblSequence(SequenceDocument sequence)
  
  Convert from a Geneious sequence to a jebl sequence.
  
  Parameters:
  
  sequence - a Geneious sequence
  
  Returns:
  
  sequence as a jebl sequence.
- asJeblSequences
  
  public static List<Sequence> asJeblSequences(List<SequenceDocument> sequences)
  
  Convert a set of Geneious sequences to jebl sequences.
  
  Parameters:
  
  sequences - Geneious sequences
  
  Returns:
  
  the Geneious sequences as jebl sequences
- asJeblSequences
  
  public static List<Sequence> asJeblSequences(SequenceDocument... sequences)
  
  Convert a set of Geneious sequences to jebl sequences.
  
  Parameters:
  
  sequences - Geneious sequences
  
  Returns:
  
  the Geneious sequences as jebl sequences
- asJeblAlignment
  
  public static Alignment asJeblAlignment(List<SequenceDocument> sequences)
  
  Convert a list of (aligned) Geneious sequences to a jebl alignment
  
  Parameters:
  
  sequences - aligned Geneious sequences
  
  Returns:
  
  the Geneious sequences as a jebl alignment
- createSequenceCopy
  
  public static SequenceDocument createSequenceCopy(SequenceDocument original)
  
  Creates a copy of the original sequence if necessary. If the sequence is an immutable sequence (ImmutableSequence) then it is not copied and is just returned from this method.
  Parameters:
  
  original - the original sequence
  
  Returns:
  
  a new sequence document or the original sequence if the original sequence is immutable.
  
  Since:
  
  API 4.11 (Geneious 5.0)
  
  See Also:
  
  createSequenceCopyEditable(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument)
- createSequenceCopyEditable
  
  public static DefaultSequenceDocument createSequenceCopyEditable(SequenceDocument original)
  
  Creates a copy of the original sequence that is editable.
  Parameters:
  
  original - the original sequence
  
  Returns:
  
  a new sequence document
  
  Since:
  
  API 4.11 (Geneious 5.0)
  
  See Also:
  
  createSequenceCopy(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument)
- getSequenceAnnotationsIncludingImmutableSequencesTrims
  
  public static List<SequenceAnnotation> getSequenceAnnotationsIncludingImmutableSequencesTrims(SequenceDocument sequence)
  
  Gets all the annotations on the given sequence. Additionally if it is an ImmutableSequence with ImmutableSequence.getLeadingTrimLength() or ImmutableSequence.getTrailingTrimLength()>0 then annotations are created to represent these trims.
  
  Parameters:
  
  sequence - the sequence to get annotations from
  
  Returns:
  
  the annotations from the sequence
  
  Since:
  
  API 4.52 (Geneious 5.5.2)
- asJeblSequence
  
  public static Sequence asJeblSequence(SequenceAlignmentDocument.ReferencedSequence referencedSequence, SequenceDocument sequence) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
  
  Convert from a Geneious sequence to a jebl sequence.
  
  Parameters:
  
  referencedSequence - original referenced sequence to copy additional fields from. May be null.
  
  sequence - a Geneious sequence
  
  Returns:
  
  sequence as a jebl sequence
  
  Throws:
  
  com.biomatters.geneious.publicapi.plugin.DocumentOperationException - when the referenced sequence cannot be loaded
  
  Since:
  
  API 4.700 (Geneious 7.0.0)
- asJeblSequence
  
  @Deprecated public static Sequence asJeblSequence(AnnotatedPluginDocument referenceDocument, SequenceDocument sequence)
  
  Deprecated.
  use asJeblSequence(SequenceAlignmentDocument.ReferencedSequence, SequenceDocument)
  
  Convert from a Geneious sequence to a jebl sequence.
  
  Parameters:
  
  referenceDocument - original AnnotatedPluginDocument to copy additional fields from. May be null.
  
  sequence - a Geneious sequence
  
  Returns:
  
  sequence as a jebl sequence
  
  Since:
  
  API 4.43 (Geneious 5.4.3)
- replaceQuestionMarksWithMaximalAmbiguitySymbol
  
  public static String replaceQuestionMarksWithMaximalAmbiguitySymbol(SequenceType sequenceType, String sequence)
  
  Gets a version of a sequence string with any question marks replaced with N (for nucleotide sequences) or X (for protein sequences).
  
  Parameters:
  
  sequenceType - sequence type of sequence
  
  sequence - sequence string
  
  Returns:
  
  version of sequence with any question marks replaced with N (for nucleotide sequences) or X (for protein sequences)
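  
  The replacement rule can be shown with a short stand-alone sketch. AmbiguitySketch and its Kind enum are hypothetical stand-ins for the real SequenceType, and the sketch assumes upper-case replacement symbols; it is meant only to illustrate the documented N/X substitution.
  
  ```java
  final class AmbiguitySketch {
      /** Hypothetical stand-in for the two sequence types discussed in this section. */
      enum Kind { NUCLEOTIDE, AMINO_ACID }

      /** Assumed symbol for a completely unknown residue: N for nucleotides, X for proteins. */
      static String maximalAmbiguitySymbol(Kind kind) {
          return kind == Kind.NUCLEOTIDE ? "N" : "X";
      }

      /** Replaces every '?' with the maximal ambiguity symbol for the given kind. */
      static String replaceQuestionMarks(Kind kind, String sequence) {
          return sequence.replace("?", maximalAmbiguitySymbol(kind));
      }

      public static void main(String[] args) {
          System.out.println(replaceQuestionMarks(Kind.NUCLEOTIDE, "AC?GT?")); // ACNGTN
          System.out.println(replaceQuestionMarks(Kind.AMINO_ACID, "M?K"));    // MXK
      }
  }
  ```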
- getMaximalAmbiguitySymbol
  
  public static String getMaximalAmbiguitySymbol(SequenceType sequenceType)
  
  Gets the code for the state in this sequence type that represents a base/residue that is completely unknown.
  
  Parameters:
  
  sequenceType -
  
  Returns:
- getAnnotationsOfType
  
  public static List<SequenceAnnotation> getAnnotationsOfType(List<SequenceAnnotation> annotations, String type)
  
  Get all annotations in list matching the given type
  
  Parameters:
  
  annotations - annotations
  
  type - type of annotations to get
  
  Returns:
  
  all annotations in the list matching the given type
- getAnnotationsOfType
  
  @Deprecated public static List<SequenceAnnotation> getAnnotationsOfType(SequenceDocument document, String type)
  
  Deprecated.
  use getAnnotationsOfType(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument, String, boolean) instead.
  
  Get all annotations in document matching the given type
  
  Parameters:
  
  document - document to get annotations from
  
  type - type of annotations to get
  
  Returns:
  
  all annotations in document matching the given type
- getAnnotationsOfType
  
  public static List<SequenceAnnotation> getAnnotationsOfType(SequenceDocument document, String type, boolean returnAnnotationsInTracks)
  
  Get all annotations in document matching the given type.
  WARNING: this list may not include all SequenceAnnotations represented as annotations in the sequence viewer. One such case is with trim annotations, which should be found using getSequenceAnnotationsIncludingImmutableSequencesTrims(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument).
  
  This may also return annotations that are not visible in the sequence viewer, such as SequenceAnnotation.TYPE_EXTRACTED_REGION.
  
  Parameters:
  
  document - document to get annotations from
  
  type - type of annotations to get
  
  returnAnnotationsInTracks - true iff we want annotations from SequenceTracks as well as those annotated directly on a document
  
  Returns:
  
  all annotations in document matching the given type
  
  Since:
  
  API 4.50 (Geneious 5.5.0)
- getSequenceAndTrackAnnotations
  
  public static Iterable<SequenceAnnotation> getSequenceAndTrackAnnotations(SequenceDocument sequence)
  
  A convenience method to get all annotations on the sequence and all annotations on all SequenceTracks on this sequence. Most code should instead manually load tracks on demand since they may be too large to fit into memory. To get tracks on a sequence use SequenceTrack.getTrackManager(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument) followed by SequenceTrack.Manager.getTracks().
  
  Iterating over the returned value may throw a RuntimeException whose cause is an XMLSerializationException if there is insufficient memory available to load the annotations. When running DocumentOperations or SequenceAnnotationGenerators, core Geneious will automatically catch such exceptions and display an appropriate message to the user.
  
  Parameters:
  
  sequence - the sequence to get annotations for
  
  Returns:
  
  an Iterable over the annotations. Will not return null.
  
  Since:
  
  API 4.50 (Geneious 5.5.0)
- createSequenceCopyAdjustedForGapInsertion
  
  public static SequenceDocument createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, CharSequence gappedSequenceCharacters)
  
  Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion. Note, the returned copy does not create gapped versions of SequenceTracks. Tracks are instead automatically propagated from referenced documents in alignments.
  Parameters:
  
  sequenceDocument - a sequence to insert gaps into. If the sequence already contains gaps, the gaps are removed first
  
  gappedSequenceCharacters - the sequence characters to appear in the new gapped sequence. The positions of gaps in this character sequence determine how annotations and chromatograms are adjusted.
  
  Returns:
  
  a copy of sequenceDocument adjusted for gap insertion. This is always a DefaultSequenceDocument but this method isn't declared to return that for API backwards compatibility reasons
  
  See Also:
  
  SequenceExtractionUtilities.removeGaps(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument, boolean)
- createSequenceCopyAdjustedForGapInsertion
  
  public static SequenceDocument createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, CharSequence gappedSequenceCharacters, boolean includeTracks)
  
  Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion.
  Parameters:
  
  sequenceDocument - a sequence to insert gaps into. If the sequence already contains gaps, the gaps are removed first
  
  gappedSequenceCharacters - the sequence characters to appear in the new gapped sequence. The positions of gaps in this character sequence determine how annotations and chromatograms are adjusted.
  
  includeTracks - true if tracks should also be copied. If this is intended for use with an alignment which references the original documents, this should be false as alignment documents propagate tracks on demand from referenced documents.
  
  Returns:
  
  a copy of sequenceDocument adjusted for gap insertion. This is always a DefaultSequenceDocument but this method isn't declared to return that for API backwards compatibility reasons
  
  Since:
  
  API 4.202000 (Geneious 2020.0.0)
  
  See Also:
  
  SequenceExtractionUtilities.removeGaps(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument, boolean)
- concatenateSequences
  
  public static SequenceDocument concatenateSequences(List<? extends SequenceDocument> sequences, boolean circular, int indexOfDocumentToUseForOrigin, ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
  
  Concatenates a list of sequence documents. All sequences must be of the same type (all nucleotide or all amino acid). For circular results, indexOfDocumentToUseForOrigin may be used to specify which input sequence should be used to determine the origin for the result. If the specified input sequence is circular and has an annotated origin, this position will be used; otherwise, the start of the specified sequence will be the origin of the result. If circular is false, indexOfDocumentToUseForOrigin must be -1.
  
  Parameters:
  
  sequences - sequence documents to concatenate
  
  circular - if true, the result will be circular
  
  indexOfDocumentToUseForOrigin - index of document to use for the origin (must be -1 for linear results)
  
  progressListener -
  
  Returns:
  
  concatenated sequence
  
  Throws:
  
  com.biomatters.geneious.publicapi.plugin.DocumentOperationException
  
  Since:
  
  API 4.1100 (Geneious 11.0.0)
- getSequences
  
  public static List<? extends SequenceDocument> getSequences(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet, ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
  
  Gets all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments. Large sequence lists (SequenceListOnDisk) and genome-sized sequences (those longer than SequenceDocument.GENOME_SEQUENCE_THRESHOLD) in other sequence lists are only loaded into memory on demand to ensure this method doesn't use excessive memory. If this method is potentially called on thousands of documents, then getSequencesWithoutImmediateLoading should be considered instead.
  
  Parameters:
  
  documents - documents to get the sequences out of
  
  alphabet - alphabet the sequences need to be to be included
  
  progressListener - for notifying the caller about progress of this method and for cancelling.
  
  Returns:
  
  all the sequences. Sequences are ordered by the AnnotatedPluginDocument they are in, and then by their index in that document.
  
  Throws:
  
  com.biomatters.geneious.publicapi.plugin.DocumentOperationException - if there is a problem getting the PluginDocument out of an AnnotatedPluginDocument or if the progress listener cancels the request.
- getSequences
  
  public static List<? extends SequenceDocument> getSequences(List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet, ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
  
  Gets all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments. Large sequence lists (SequenceListOnDisk) and genome-sized sequences (those longer than SequenceDocument.GENOME_SEQUENCE_THRESHOLD) in other sequence lists are only loaded into memory on demand to ensure this method doesn't use excessive memory. If this method is potentially called on thousands of documents, then getSequencesWithoutImmediateLoading should be considered instead.
  
  Parameters:
  
  documents - documents to get the sequences out of
  
  alphabet - alphabet the sequences need to be to be included
  
  progressListener - for notifying the caller about progress of this method and for cancelling.
  
  Returns:
  
  all the sequences
  
  Throws:
  
  com.biomatters.geneious.publicapi.plugin.DocumentOperationException - if there is a problem getting the PluginDocument out of an AnnotatedPluginDocument or if the progress listener cancels the request.
  
  Since:
  
  API 4.700 (Geneious 7.0.0)
- getSequencesWithoutImmediateLoading
  
  public static Collection<? extends SequenceDocument> getSequencesWithoutImmediateLoading(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
  
  Like getSequences(com.biomatters.geneious.publicapi.documents.AnnotatedPluginDocument[], com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument.Alphabet, jebl.util.ProgressListener) but doesn't require each plugin document to be in memory as long as this Collection is around. The trade-off is that the sequences can only be accessed sequentially (hence the Collection return type of this method). Also the Collection does not support removal.
  
  Since this Collection doesn't store the sequences immediately, DocumentOperationExceptions may be thrown down the line. Such a situation may be caught by surrounding the given iteration with try {... } catch (RuntimeDocumentOperationException e) and then handling the exception from there.
  Note that using getSequences(com.biomatters.geneious.publicapi.documents.AnnotatedPluginDocument[], com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument.Alphabet, jebl.util.ProgressListener) is preferable to using getSequencesWithoutImmediateLoading when dealing with under a thousand documents.
  Parameters:
  
  documents - documents to get the sequences out of
  
  alphabet - alphabet the sequences need to be to be included
  
  Returns:
  
  all the sequences whose iterator may throw a RuntimeDocumentOperationException
  
  Throws:
  
  com.biomatters.geneious.publicapi.plugin.DocumentOperationException - if one or more of the documents has more than Integer.MAX_VALUE sequences.
  
  Since:
  
  API 4.610 (Geneious 6.1.0)
  
  See Also:
  
  RuntimeDocumentOperationException
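  
  The deferred-failure pattern described above (a Collection whose iterator may throw an unchecked wrapper around a checked loading exception) can be sketched without the Geneious API. Everything in this sketch is a hypothetical stand-in: LoadException plays the role of DocumentOperationException, RuntimeLoadException the role of RuntimeDocumentOperationException, and Loader represents whatever loads a sequence on demand.
  
  ```java
  import java.util.AbstractCollection;
  import java.util.ArrayList;
  import java.util.Collection;
  import java.util.Iterator;
  import java.util.List;
  import java.util.NoSuchElementException;

  final class LazyLoadingSketch {
      /** Stand-in for a checked loading failure. */
      static final class LoadException extends Exception {
          LoadException(String message) { super(message); }
      }

      /** Stand-in for the unchecked wrapper thrown while iterating. */
      static final class RuntimeLoadException extends RuntimeException {
          RuntimeLoadException(LoadException cause) { super(cause); }
      }

      /** Loads one element on demand. */
      interface Loader {
          String load(int index) throws LoadException;
      }

      /** A sequential-access collection that loads each element only when it is iterated over. */
      static Collection<String> lazyCollection(final int size, final Loader loader) {
          return new AbstractCollection<String>() {
              @Override public int size() { return size; }
              @Override public Iterator<String> iterator() {
                  return new Iterator<String>() {
                      private int next = 0;
                      @Override public boolean hasNext() { return next < size; }
                      @Override public String next() {
                          if (!hasNext()) {
                              throw new NoSuchElementException();
                          }
                          try {
                              return loader.load(next++);
                          } catch (LoadException e) {
                              throw new RuntimeLoadException(e); // surfaces during iteration
                          }
                      }
                  };
              }
          };
      }

      /** Surrounds the iteration with try/catch, keeping what was loaded before the failure. */
      static List<String> collectUntilFailure(Collection<String> lazy) {
          List<String> loaded = new ArrayList<>();
          try {
              for (String s : lazy) {
                  loaded.add(s);
              }
          } catch (RuntimeLoadException e) {
              System.err.println("Stopped early: " + e.getCause().getMessage());
          }
          return loaded;
      }

      public static void main(String[] args) {
          Collection<String> lazy = lazyCollection(4, index -> {
              if (index == 2) {
                  throw new LoadException("document 2 could not be loaded");
              }
              return "seq" + index;
          });
          System.out.println(collectUntilFailure(lazy)); // [seq0, seq1]
      }
  }
  ```
  
  The iterator in the sketch does not override remove(), so removal is unsupported, in line with the restriction noted above.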
- getOriginalIndex
  
  public static int getOriginalIndex(SequenceDocument sequence, int index)
  
  Gets the original numbering of the given index if it is covered by a SequenceAnnotation.TYPE_EXTRACTED_REGION annotation.
  
  Parameters:
  
  sequence - the sequence this index belongs to.
  
  index - the index to get the original numbering for.
  
  Returns:
  
  'translated' index or the original index if no other numbering can be found.
  
  Since:
  
  API 4.900 (Geneious 9.0.0)
- getNumberOfSequences
  
  public static long getNumberOfSequences(List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet)
  
  Gets the total number of nucleotide or amino acid sequences contained in the given documents which may be individual sequences, sequence lists, or alignments/contigs.
  
  Parameters:
  
  documents - the documents to get the number of sequences in
  
  alphabet - the alphabet (nucleotide or amino acid) of the sequences to count.
  
  Returns:
  
  the total number of nucleotide sequences or amino acid contained in the given documents
  
  Since:
  
  API 4.40 (Geneious 5.4.0)
- getNumberOfSequences
  
  public static long getNumberOfSequences(AnnotatedPluginDocument document, SequenceDocument.Alphabet alphabet)
  
  Gets the total number of nucleotide or amino acid sequences contained in the given document which may be an individual sequence, sequence list, or alignment/contig.
  
  Parameters:
  
  document - the document to get the number of sequences in
  
  alphabet - the alphabet (nucleotide or amino acid) of the sequences to count.
  
  Returns:
  
  the total number of nucleotide or amino acid sequences contained in the given document
  
  Since:
  
  API 4.40 (Geneious 5.4.0)
- generateConsensusSequence
  
  @Deprecated public static SequenceDocument generateConsensusSequence(SequenceAlignmentDocument alignment, ProgressListener progressListener)
  
  Deprecated.
  use generateConsensus(com.biomatters.geneious.publicapi.documents.sequence.SequenceAlignmentDocument, jebl.util.ProgressListener) instead
  
  Generates a consensus sequence for an alignment using default consensus settings. Note that the returned sequence may contain gaps. If it is to be used as a stand-alone sequence, then SequenceExtractionUtilities.removeGaps(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument) should be used.
  
  Parameters:
  
  alignment - the alignment to generate the consensus sequence for
  
  progressListener - for reporting progress and cancelling.
  
  Returns:
  
  a sequence equal in length to the alignment. The sequence may contain gaps. May return null if progressListener requests cancellation.
  
  Since:
  
  API 4.60 (Geneious 5.6.0)
- generateConsensus
  
  public static SequenceDocument generateConsensus(SequenceAlignmentDocument alignment, ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
  
  Generates a consensus sequence for an alignment using default consensus settings. Note that the returned sequence may contain gaps. If it is to be used as a stand-alone sequence, then SequenceExtractionUtilities.removeGaps(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument) should be used.
  To generate consensus sequences with non-default options, use PluginUtilities.getDocumentOperation("Generate_Consensus"). Note that this operation generates a sequence with gaps removed by default.
  
  Parameters:
  
  alignment - the alignment to generate the consensus sequence for
  
  progressListener - for reporting progress and cancelling.
  
  Returns:
  
  a sequence equal in length to the alignment. The sequence may contain gaps. Will not return null
  
  Throws:
  
  com.biomatters.geneious.publicapi.plugin.DocumentOperationException - if the consensus can't be generated because there is insufficient free memory.
  
  com.biomatters.geneious.publicapi.plugin.DocumentOperationException.Canceled - if the progressListener requests the consensus generation be cancelled.
  
  Since:
  
  API 4.610 (Geneious 6.1.0)
- getBlastAlignmentText
  
  public static String getBlastAlignmentText(SequenceAlignmentDocument alignment, boolean geneiousFriendly)
  
  Formats the given alignment in BLAST text format
  
  Parameters:
  
  alignment - alignment to format
  
  geneiousFriendly - whether to format the alignment in an html-formatted "Geneious friendly" way that is useful generally for alignments and not just for BLAST output
  
  Returns:
  
  alignment represented in BLAST text format
  
  Since:
  
  API 4.700 (Geneious 7.0.0)
- alignmentFromJeblSequences
  
  public static DefaultAlignmentDocument alignmentFromJeblSequences(String name, List<Sequence> jeblSequences)
  
  Converts the given alignment of Jebl sequences into a DefaultAlignmentDocument
  
  Parameters:
  
  name - name for alignment
  
  jeblSequences - aligned jebl sequences
  
  Returns:
  
  a DefaultAlignmentDocument representing the given alignment.
  
  Since:
  
  API 4.700 (Geneious 7.0.0)
- createNewDocumentsByTransformingSequences
  
  public static List<AnnotatedPluginDocument> createNewDocumentsByTransformingSequences(List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, ProgressListener progressListener, String newSequenceOrDocumentNamePrefix) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
  
  Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.
  
  Parameters:
  
  sourceDocuments - the source documents containing sequences to transform. These may be SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocuments
  
  transformer - the transformer for transforming each sequence
  
  progressListener - for reporting progress and canceling
  
  newSequenceOrDocumentNamePrefix - an optional prefix to assign to the name of each newly generated document. May be an empty String to leave names unchanged.
  
  Returns:
  
  the new documents
  
  Throws:
  
  com.biomatters.geneious.publicapi.plugin.DocumentOperationException - if documents can't be loaded, or if the input documents are not SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocuments
  
  Since:
  
  API 4.701 (Geneious 7.0.1)
- createNewDocumentsByTransformingSequences
  
  public static List<AnnotatedPluginDocument> createNewDocumentsByTransformingSequences(List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, ProgressListener progressListener, String newSequenceOrDocumentNamePrefix, String newSequenceOrDocumentNameSuffix) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
  
  Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.
  
  Parameters:
  
  sourceDocuments - the source documents containing sequences to transform. These may be SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocuments
  
  transformer - the transformer for transforming each sequence
  
  progressListener - for reporting progress and canceling
  
  newSequenceOrDocumentNamePrefix - an optional prefix to assign to the name of each newly generated document. May be an empty String to leave names unchanged.
  
  newSequenceOrDocumentNameSuffix - an optional suffix to assign to the name of each newly generated document. May be an empty String to leave names unchanged.
  
  Returns:
  
  the new documents
  
  Throws:
  
  com.biomatters.geneious.publicapi.plugin.DocumentOperationException - if documents can't be loaded, or if the input documents are not SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocuments
  
  Since:
  
  API 4.201920 (Geneious 2019.2.0)
- getIntervalBasedOnExtractionAnnotation
  
  public static SequenceAnnotationInterval getIntervalBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, SequenceAnnotationInterval interval, boolean mapToOriginal)
  
  Gets the extraction annotations from the sequence document and maps the interval to either the original sequence or the result sequence, depending on the value of mapToOriginal
  
  Parameters:
  
  sequenceDocument - the document to get the extractionAnnotations from
  
  interval - the interval to re-map
  
  mapToOriginal - whether to map this interval to the corresponding bit on the original or to the corresponding bit on the result
  
  Returns:
  
  a new interval that represents the given interval on either the original or result document, or the interval parameter itself if no mapping can be found
  
  Since:
  
  API 4.1000 (Geneious 10.0.0)
- getIndexBasedOnExtractionAnnotation
  
  public static Integer getIndexBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, int index, boolean mapToOriginal)
  
  Gets the extraction annotations from the sequence document and maps a residue index to a residue index on either the original sequence or the result sequence, depending on the value of mapToOriginal
  
  Parameters:
  
  sequenceDocument - the document to get the extractionAnnotations from
  
  index - the 1-based residue position in the sequence to re-map.
  
  mapToOriginal - whether to map this index to the corresponding position on the original or to the corresponding position on the result
  
  Returns:
  
  a new index that represents the given index on either the original or the result document, or null if the index can't be mapped
  
  Since:
  
  API 4.1000 (Geneious 10.0.0)
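
The mapping described above can be illustrated with a small standalone sketch. It does not call the Geneious API: the class `ExtractionIndexSketch` and its `mapIndex` method are hypothetical, and the sketch assumes a single forward-strand extraction whose position on the original is already known, whereas the real method reads that information from the document's extraction annotations. It only shows the 1-based arithmetic and the null result for an index that has no counterpart.

```java
// Hypothetical illustration only; not part of the Geneious public API.
final class ExtractionIndexSketch {

    /**
     * Maps a 1-based residue index between an original sequence and a result
     * sequence extracted from originalStart..originalEnd (1-based, inclusive).
     * Assumes a forward-strand extraction with no gaps or edits.
     *
     * @return the mapped 1-based index, or null if the index can't be mapped
     */
    static Integer mapIndex(int originalStart, int originalEnd, int index, boolean mapToOriginal) {
        int extractedLength = originalEnd - originalStart + 1;
        if (mapToOriginal) {
            // index is a position on the result sequence
            if (index < 1 || index > extractedLength) {
                return null;
            }
            return originalStart + index - 1;
        }
        // index is a position on the original sequence
        if (index < originalStart || index > originalEnd) {
            return null;
        }
        return index - originalStart + 1;
    }

    public static void main(String[] args) {
        System.out.println(mapIndex(101, 200, 1, true));    // prints 101
        System.out.println(mapIndex(101, 200, 150, false)); // prints 50
        System.out.println(mapIndex(101, 200, 50, false));  // prints null
    }
}
```

Because the return type is Integer rather than int, callers of the real method should check for null before unboxing the result.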
- getSequenceCharSequenceHash
  
  public static String getSequenceCharSequenceHash(SequenceCharSequence charSequence)
  
  Parameters:
  
  charSequence - a sequence returned from SequenceDocument.getCharSequence()
  
  Returns:
  
  a hexadecimal encoded MD5 hash of the nucleotides or amino acids in a sequence
  
  Since:
  
  API 4.202500 (Geneious 2025.0.0)
- getSequenceHash
  
  public static String getSequenceHash(SequenceDocument sequence)
  
  Parameters:
  
  sequence - the sequence to get an MD5 hash of
  
  Returns:
  
  a hexadecimal encoded MD5 hash of the nucleotides or amino acids in this sequence
  
  Since:
  
  API 4.202500 (Geneious 2025.0.0)
- getSequenceHash
  
  public static String getSequenceHash(SequenceDocument sequence, List<Interval> intervals)
  
  Parameters:
  
  sequence - the sequence to get an MD5 hash of
  
  intervals - residue (nucleotide or amino acid) intervals within the sequence
  
  Returns:
  
  a hexadecimal encoded MD5 hash of the nucleotides or amino acids within the specified intervals in this sequence
  
  Since:
  
  API 4.202500 (Geneious 2025.0.0)
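
To make "hexadecimal encoded MD5 hash" concrete, here is a minimal sketch that uses only the standard `java.security.MessageDigest` class. It is not the Geneious implementation: the class `SequenceHashSketch` and its methods are hypothetical, and the choices made here (ASCII encoding, no case normalization, gaps left in place, 1-based inclusive intervals concatenated in list order) are assumptions that the documentation above does not confirm.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; not part of the Geneious public API.
final class SequenceHashSketch {

    /** Returns the lower-case hexadecimal MD5 digest of the residues (32 characters). */
    static String md5Hex(CharSequence residues) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(residues.toString().getBytes(StandardCharsets.US_ASCII));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // two hex digits per byte, high nibble first
                hex.append(Character.forDigit((b >> 4) & 0xF, 16));
                hex.append(Character.forDigit(b & 0xF, 16));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /**
     * Hashes only the residues inside the given intervals. Each interval is a
     * {from, to} pair, 1-based and inclusive (an assumption of this sketch);
     * the selected residues are concatenated in list order before hashing.
     */
    static String md5HexOfIntervals(CharSequence residues, int[][] intervals) {
        StringBuilder selected = new StringBuilder();
        for (int[] interval : intervals) {
            selected.append(residues, interval[0] - 1, interval[1]);
        }
        return md5Hex(selected);
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("ACGT"));
        System.out.println(md5HexOfIntervals("ACGTACGT", new int[][] {{1, 4}}));
    }
}
```

In this sketch, hashing a single interval that covers residues 1 to 4 of "ACGTACGT" gives the same value as hashing "ACGT" directly, so both lines printed by `main` are identical. Whether the real methods behave the same way for equivalent inputs is not stated above and should be verified against the API.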
