Class SequenceUtilities
-
Method Summary
Modifier and Type | Method | Description
static DefaultAlignmentDocument | alignmentFromJeblSequences(String name, List<Sequence> jeblSequences) | Converts the given alignment of Jebl sequences into a DefaultAlignmentDocument.
static CharSequence | asDna(CharSequence nucleotideCharSequence) | Views an underlying (nucleotide) CharSequence as DNA by dynamically translating 'U's to 'T's and 'u's to 't's. It is guaranteed that if charSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
static Alignment | asJeblAlignment(List<SequenceDocument> sequences) | Convert a list of (aligned) Geneious sequences to a jebl alignment.
static Sequence | asJeblSequence(AnnotatedPluginDocument referenceDocument, SequenceDocument sequence) | Deprecated.
static Sequence | asJeblSequence(SequenceAlignmentDocument.ReferencedSequence referencedSequence, SequenceDocument sequence) | Convert from a Geneious sequence to a jebl sequence.
static Sequence | asJeblSequence(SequenceDocument sequence) | Convert from a Geneious sequence to a jebl sequence.
 | asJeblSequences(SequenceDocument... sequences) | Convert a set of Geneious sequences to jebl sequences.
 | asJeblSequences(List<SequenceDocument> sequences) | Convert a set of Geneious sequences to jebl sequences.
static CharSequence | asRna(CharSequence nucleotideCharSequence) | Views an underlying (nucleotide) CharSequence as RNA by dynamically translating 'T's to 'U's and 't's to 'u's. It is guaranteed that if charSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
static CharSequence | asTranslation(CharSequence nucleotideCharSequence, GeneticCode geneticCode) | Deprecated.
static CharSequence | asTranslation(CharSequence nucleotideCharSequence, GeneticCode geneticCode, boolean translateFirstCodonUsingFirstCodonTable) | Views an underlying (nucleotide) CharSequence as its translation.
static SequenceDocument | concatenateSequences(List<? extends SequenceDocument> sequences, boolean circular, int indexOfDocumentToUseForOrigin, ProgressListener progressListener) | Concatenate a list of sequence documents.
static String | containsInvalidResidues(SequenceDocument sequenceDocument, boolean allowGaps, boolean fastIncompleteCheck) | Checks if a sequence contains invalid sequence residues.
static String | containsInvalidResidues(CharSequence sequenceResidues, SequenceDocument.Alphabet alphabet, boolean allowGaps, boolean fastIncompleteCheck) | Checks if a sequence contains invalid sequence residues.
static String | containsInvalidResidues(CharSequence sequenceResidues, SequenceType sequenceType, boolean allowGaps, boolean fastIncompleteCheck) | Checks if a sequence contains invalid sequence residues.
static List<AnnotatedPluginDocument> | createNewDocumentsByTransformingSequences(List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, ProgressListener progressListener, String newSequenceOrDocumentNamePrefix) | Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.
static List<AnnotatedPluginDocument> | createNewDocumentsByTransformingSequences(List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, ProgressListener progressListener, String newSequenceOrDocumentNamePrefix, String newSequenceOrDocumentNameSuffix) | Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.
static SequenceDocument | createSequenceCopy(SequenceDocument original) | Creates a copy of the original sequence if necessary.
static SequenceDocument | createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, CharSequence gappedSequenceCharacters) | Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion.
static SequenceDocument | createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, CharSequence gappedSequenceCharacters, boolean includeTracks) | Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion.
static DefaultSequenceDocument | createSequenceCopyEditable(SequenceDocument original) | Creates a copy of the original sequence that is editable.
static DefaultSequenceDocument | createSequenceDocument(SequenceType sequenceType, String name, String description, CharSequence sequenceString, Date creationDate) | Creates a DefaultNucleotideSequence or DefaultAminoAcidSequence depending on sequenceType.
static SequenceDocument | generateConsensus(SequenceAlignmentDocument alignment, ProgressListener progressListener) | Generates a consensus sequence for an alignment using default consensus settings.
static SequenceDocument | generateConsensusSequence(SequenceAlignmentDocument alignment, ProgressListener progressListener) |
static SequenceDocument.Alphabet | getAlphabet(AnnotatedPluginDocument... documents) |
static SequenceDocument.Alphabet | getAlphabet(SequenceDocument sequence) | Get the Alphabet of a sequence.
static SequenceDocument.Alphabet | getAlphabet(SequenceType sequenceType) | Gets a Geneious alphabet type that is equivalent to a jebl library SequenceType.
static List<SequenceAnnotation> | getAnnotationsOfType(SequenceDocument document, String type) |
static List<SequenceAnnotation> | getAnnotationsOfType(SequenceDocument document, String type, boolean returnAnnotationsInTracks) | Get all annotations in document matching the given type.
static List<SequenceAnnotation> | getAnnotationsOfType(List<SequenceAnnotation> annotations, String type) | Get all annotations in list matching the given type.
static String | getBlastAlignmentText(SequenceAlignmentDocument alignment, boolean geneiousFriendly) | Formats the given alignment in BLAST text format.
static String | getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery) |
static String | getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget) |
static String | getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget, boolean allowExtraGapsInTarget) | Given a nucleotide or amino acid sequence, returns a regular expression that matches forward occurrences of this sequence in a larger sequence.
static Pattern | getForwardRegexPatternForSequence(CharSequence sequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery) |
static Pattern | getForwardRegexPatternForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget) | Given a nucleotide or amino acid sequence string, returns a regular expression pattern that matches forward occurrences of this search string in a larger sequence string.
static Integer | getIndexBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, int index, boolean mapToOriginal) | Gets the extraction annotations from the sequence document and maps a residue index to a residue index on either the original sequence or the result sequence, depending on the value of mapToOriginal.
static SequenceAnnotationInterval | getIntervalBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, SequenceAnnotationInterval interval, boolean mapToOriginal) | Gets the extraction annotations from the sequence document and maps the interval to either the original sequence or the result sequence, depending on the value of mapToOriginal.
static int | getLeadingGapsLength(CharSequence charSequence) | Returns the start index of the non-gap regions in the specified charSequence.
static String | getMaximalAmbiguitySymbol(SequenceType sequenceType) | Get the code for the state in this sequence type which represents a base/residue that is completely unknown.
static long | getNumberOfSequences(AnnotatedPluginDocument document, SequenceDocument.Alphabet alphabet) | Gets the total number of nucleotide or amino acid sequences contained in the given document which may be an individual sequence, sequence list, or alignment/contig.
static long | getNumberOfSequences(List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet) | Gets the total number of nucleotide or amino acid sequences contained in the given documents which may be individual sequences, sequence lists, or alignments/contigs.
static int | getOriginalIndex(SequenceDocument sequence, int index) | Gets the original numbering of the given index if it is covered by a SequenceAnnotation.TYPE_EXTRACTED_REGION annotation.
static Iterable<SequenceAnnotation> | | A convenience method to get all annotations on the sequence and all annotations on all SequenceTracks on this sequence.
static List<SequenceAnnotation> | | Gets all the annotations on the given sequence.
static String | getSequenceCharSequenceHash(SequenceCharSequence charSequence) |
static String | getSequenceHash(SequenceDocument sequence) |
static String | getSequenceHash(SequenceDocument sequence, List<Interval> intervals) |
static List<? extends SequenceDocument> | getSequences(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet, ProgressListener progressListener) | Get all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments.
static List<? extends SequenceDocument> | getSequences(List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet, ProgressListener progressListener) | Get all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments.
static Collection<? extends SequenceDocument> | getSequencesWithoutImmediateLoading(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet) | Like getSequences(com.biomatters.geneious.publicapi.documents.AnnotatedPluginDocument[], com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument.Alphabet, jebl.util.ProgressListener) but doesn't require each plugin document to be in memory as long as this Collection is around.
static List<SequenceType> | getSequenceType(AnnotatedPluginDocument document) | Examines a document and determines what the (jebl) sequence type (or types) of the document is (or are), and returns it (or them). Always returns a List<SequenceType> of size 0, 1 or 2.
static SequenceType | getSequenceType(SequenceDocument sequence) | Get the (jebl) sequence type.
static SequenceType | getSequenceType(SequenceDocument.Alphabet alphabet) | Gets a jebl library SequenceType that is equivalent to a Geneious alphabet.
static int | getTrailingGapsLength(CharSequence charSequence) | Get the number of trailing gap ('-') characters in the sequence.
static int | getTrailingGapsStartIndex(CharSequence charSequence) | Returns the end index of the non-gap regions in the specified charSequence.
static CharSequence | getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps) | Replace any invalid bases/residues in the given sequence document with ambiguity symbols.
static CharSequence | getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps, boolean replaceWithGaps) | Replace any invalid bases/residues in the given sequence document with ambiguity symbols or gaps.
static boolean | isPredominantlyRna(CharSequence charSequence, int maximumNonGapsToLookAt) | Checks whether a sequence is predominantly RNA (rather than DNA).
static boolean | isRna(CharSequence charSequence) | Checks whether a sequence is RNA (rather than DNA) based on whether a T/t or a U/u occurs first in the sequence.
static boolean | isRna(CharSequence charSequence, int maxNucleotidesToCheck) | Checks whether a sequence is RNA (rather than DNA) based on whether a T/t or a U/u occurs first in the sequence.
static boolean | isStateAssignableFrom(State stateA, State stateB) | Same as stateA.getCanonicalStates().containsAll(stateB.getCanonicalStates()) except that for NucleotideStates and AminoAcidStates it caches the result.
static CharSequence | removeGaps(CharSequence charSequence) | Constructs a sequence without gaps ('-') from a specified sequence that potentially has gaps.
static CharSequence | removeInvalidResidues(CharSequence sequence, SequenceType sequenceType, boolean allowGaps) | Get a sequence string identical to sequence except that any invalid residues are removed.
static String | replaceQuestionMarksWithMaximalAmbiguitySymbol(SequenceType sequenceType, String sequence) | Get a version of a sequence string with any question marks replaced with N (for nucleotide sequences) or X (for protein sequences).
static CharSequence | reverseComplement(CharSequence charSequence) | Provides a dynamic reverse complement view onto a nucleotide CharSequence.
static CharSequence | reverseComplementAsDna(CharSequence charSequence) | Similar to reverseComplement except that the result will be returned as DNA even if the input sequence is RNA.
static void | setOriginalResidueNumbering(EditableSequenceDocument document, int startIndex, boolean isReverse) | Sets the original residue numbering of a document. The residue index of a document will appear shifted if the user has "show original residue numbers" selected in the sequence view.
static String | toHTMLFragment(SequenceDocument sequence, String additionalContent) | Generate an HTML fragment that summarises a sequence, including the sequence string.
-
Method Details
-
getForwardRegexForSequence
public static String getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery)
-
getForwardRegexForSequence
@Deprecated public static String getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget)
Deprecated. Equivalent to getForwardRegexForSequence(...,true)
-
getForwardRegexForSequence
public static String getForwardRegexForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget, boolean allowExtraGapsInTarget)
Given a nucleotide or amino acid sequence, returns a regular expression that matches forward occurrences of this sequence in a larger sequence, i.e. a String s such that Pattern.compile(s, Pattern.CASE_INSENSITIVE) will find all case-insensitive forward matches of querySequence in a larger sequence. The regular expression returned will also match sequences with gaps inserted at any point within the sequence.
- Parameters:
querySequence - The nucleotide or amino acid sequence to search for.
sequenceType - The type of the sequence.
interpretAmbiguitiesInQuery - If true, then an ambiguous character (e.g. R for nucleotides) in querySequence will match the corresponding canonical states (A and G) in the target.
interpretAmbiguitiesInTarget - If true, then an ambiguous character (e.g. R for nucleotides) in the sequence being searched within will match the corresponding canonical states (A and G) in the querySequence.
allowExtraGapsInTarget - If true, then additional gaps will be allowed in the sequence being searched within.
- Returns:
- a regular expression that matches forward occurrences of this search string in a larger sequence string or null if any of the characters in the sequence string are not valid residues for sequenceType
- Since:
- API 4.610 (Geneious 6.1.0)
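As a rough illustration of the behaviour described for getForwardRegexForSequence, the following standalone sketch builds a comparable expression for nucleotide queries without using the Geneious API. The class ForwardRegexSketch, its toRegex method and the deliberately partial ambiguity table are hypothetical, and the expression the real method returns may well differ in form.

```java
import java.util.Map;
import java.util.regex.Pattern;

final class ForwardRegexSketch {
    // Assumption: a partial IUPAC table, enough for the illustration only.
    private static final Map<Character, String> CANONICAL = Map.of(
            'A', "A", 'C', "C", 'G', "G", 'T', "T",
            'R', "AG", 'Y', "CT", 'N', "ACGT");

    /** Returns a regex for query, or null if it contains a residue unknown to this sketch. */
    static String toRegex(String query, boolean interpretAmbiguitiesInQuery) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < query.length(); i++) {
            char c = Character.toUpperCase(query.charAt(i));
            String states = CANONICAL.get(c);
            if (states == null) {
                return null; // mirrors the documented null result for invalid residues
            }
            if (i > 0) {
                regex.append("-*"); // tolerate gaps between consecutive residues
            }
            if (interpretAmbiguitiesInQuery && states.length() > 1) {
                // The ambiguity code matches itself and each of its canonical states.
                regex.append('[').append(c).append(states).append(']');
            } else {
                regex.append(c);
            }
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile(toRegex("ARC", true), Pattern.CASE_INSENSITIVE);
        System.out.println(pattern.matcher("ttA-gCaa").find());
    }
}
```

The sketch only covers ambiguities in the query; handling ambiguities in the target would require widening each character class further.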
-
getForwardRegexPatternForSequence
public static Pattern getForwardRegexPatternForSequence(CharSequence sequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery)
-
getForwardRegexPatternForSequence
public static Pattern getForwardRegexPatternForSequence(CharSequence querySequence, SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget)
Given a nucleotide or amino acid sequence string, returns a regular expression pattern that matches forward occurrences of this search string in a larger sequence string.
- Parameters:
querySequence - The nucleotide or amino acid sequence to search for.
sequenceType - The type of the sequence.
interpretAmbiguitiesInQuery - If true, then an ambiguous character (e.g. R for nucleotides) in querySequence will match the corresponding canonical states (A and G) in the target.
- Returns:
- a regular expression that matches forward occurrences of this search string in a larger sequence string or null if any of the characters in the sequence string are not valid residues for sequenceType
-
isStateAssignableFrom
Same as stateA.getCanonicalStates().containsAll(stateB.getCanonicalStates()) except that for NucleotideStates and AminoAcidStates it caches the result.
- Parameters:
stateA - A state (e.g. a NucleotideState or AminoAcidState)
stateB - A state of the same type as stateA
- Returns:
- true if stateA.getCanonicalStates().containsAll(stateB.getCanonicalStates())
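The containsAll relationship can be pictured with a small standalone sketch. StateAssignableSketch and its character-keyed table are hypothetical stand-ins for jebl State objects and their canonical states, and the caching mentioned above is left out.

```java
import java.util.Map;
import java.util.Set;

final class StateAssignableSketch {
    // Hypothetical stand-in for State.getCanonicalStates(): code -> canonical bases.
    private static final Map<Character, Set<Character>> CANONICAL = Map.of(
            'A', Set.of('A'),
            'G', Set.of('G'),
            'R', Set.of('A', 'G'),
            'N', Set.of('A', 'C', 'G', 'T'));

    /** True if every canonical state of b is also a canonical state of a. */
    static boolean isAssignableFrom(char a, char b) {
        return CANONICAL.get(a).containsAll(CANONICAL.get(b));
    }

    public static void main(String[] args) {
        System.out.println(isAssignableFrom('R', 'A')); // R covers A
        System.out.println(isAssignableFrom('A', 'R')); // A does not cover R
    }
}
```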
-
createSequenceDocument
public static DefaultSequenceDocument createSequenceDocument(SequenceType sequenceType, String name, String description, CharSequence sequenceString, Date creationDate)
Creates a DefaultNucleotideSequence or DefaultAminoAcidSequence depending on sequenceType. See the documentation of these classes' constructors for the semantics of the parameters.
-
setOriginalResidueNumbering
public static void setOriginalResidueNumbering(EditableSequenceDocument document, int startIndex, boolean isReverse)
Sets the original residue numbering of a document. The residue index of a document will appear shifted if the user has "show original residue numbers" selected in the sequence view.
- Parameters:
document - document to set the residue numbering for
startIndex - start index for the residue numbering. The first original residue is residue 1.
isReverse - true if the residue numbering should count down from startIndex, false if it should count up.
-
containsInvalidResidues
public static String containsInvalidResidues(SequenceDocument sequenceDocument, boolean allowGaps, boolean fastIncompleteCheck)
Checks if a sequence contains invalid sequence residues.
- Parameters:
sequenceDocument - the sequence to check for validity.
allowGaps - true if the sequence is allowed to contain gaps
fastIncompleteCheck - This parameter is ignored. It was added when Java 5, which is 10 times slower than Java 6, was widely used. Checking enormous sequences is slow (a 2GB sequence takes about 20 seconds in Java 5, 2 seconds in Java 6). Set this parameter to true to check only the first and last 1,000,000 residues, which catches almost all invalid cases and is much faster on enormous sequences.
- Returns:
- null if the sequence residues are all valid; otherwise a message describing the first invalid residue.
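The null-or-message return convention can be sketched in plain Java as follows. InvalidResidueSketch, its firstInvalidResidue method, the simplified nucleotide alphabet and the message wording are all assumptions made for illustration and are not taken from the Geneious implementation.

```java
final class InvalidResidueSketch {
    // Assumption: IUPAC nucleotide codes only; the real alphabets may accept more symbols.
    private static final String VALID_NUCLEOTIDES = "ACGTUNRYKMSWBDHV";

    /** Returns null if every residue is valid, otherwise a message about the first invalid one. */
    static String firstInvalidResidue(CharSequence residues, boolean allowGaps) {
        for (int i = 0; i < residues.length(); i++) {
            char c = Character.toUpperCase(residues.charAt(i));
            boolean valid = VALID_NUCLEOTIDES.indexOf(c) >= 0 || (allowGaps && c == '-');
            if (!valid) {
                // Hypothetical message format, using 1-based positions.
                return "Invalid residue '" + residues.charAt(i) + "' at position " + (i + 1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstInvalidResidue("ACG-T", true));
        System.out.println(firstInvalidResidue("ACG-T", false));
    }
}
```

Callers following this convention test the result against null before using the sequence.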
-
getSequenceType
Gets a jebl library SequenceType that is equivalent to a Geneious alphabet.
- Parameters:
alphabet -
- Returns:
- sequence type that is equivalent to this alphabet
-
getAlphabet
Gets a Geneious alphabet type that is equivalent to a jebl library SequenceType.
- Parameters:
sequenceType -
- Returns:
- alphabet that is equivalent to this sequence type
-
containsInvalidResidues
public static String containsInvalidResidues(CharSequence sequenceResidues, SequenceDocument.Alphabet alphabet, boolean allowGaps, boolean fastIncompleteCheck)
Checks if a sequence contains invalid sequence residues.
- Parameters:
sequenceResidues - sequence residues to check for validity.
alphabet - the alphabet of residues expected to be in sequenceResidues
allowGaps - true if the sequence is allowed to contain gaps
fastIncompleteCheck - This parameter is ignored. It was added when Java 5, which is 10 times slower than Java 6, was widely used. Checking enormous sequences is slow (a 2GB sequence takes about 20 seconds in Java 5, 2 seconds in Java 6). Set this parameter to true to check only the first and last 1,000,000 residues, which catches almost all invalid cases and is much faster on enormous sequences.
- Returns:
- null if the sequence residues are all valid; otherwise a message describing the first invalid residue.
-
containsInvalidResidues
public static String containsInvalidResidues(CharSequence sequenceResidues, SequenceType sequenceType, boolean allowGaps, boolean fastIncompleteCheck)
Checks if a sequence contains invalid sequence residues.
- Parameters:
sequenceResidues - sequence residues to check for validity.
sequenceType - the type of residues expected to be in sequenceResidues
allowGaps - true if the sequence is allowed to contain gaps
fastIncompleteCheck - This parameter is ignored. It was added when Java 5, which is 10 times slower than Java 6, was widely used. Checking enormous sequences is slow (a 2GB sequence takes about 20 seconds in Java 5, 2 seconds in Java 6). Set this parameter to true to check only the first and last 1,000,000 residues, which catches almost all invalid cases and is much faster on enormous sequences.
- Returns:
- null if the sequence residues are all valid; otherwise a message describing the first invalid residue.
-
removeInvalidResidues
public static CharSequence removeInvalidResidues(CharSequence sequence, SequenceType sequenceType, boolean allowGaps)
Get a sequence string identical to sequence except that any invalid residues are removed. Gaps are only removed if allowGaps is false. All valid characters remain unchanged (they maintain their original case and there are no U->T replacements for nucleotides).
- Parameters:
sequence - a string of residues that may or may not be valid residues
sequenceType - the type of residues in sequence
allowGaps - if this is true, then gaps are not removed.
- Returns:
- a sequence string identical to sequence except that any invalid residues are removed. If there are no invalid residues, sequence is returned.
- Throws:
OutOfMemoryError - if a sequence contains invalid residues and a valid version of the sequence cannot fit in memory
-
getValidSequence
Replace any invalid bases/residues in the given sequence document with ambiguity symbols.
- Parameters:
sequenceDocument - sequence document to replace the invalid bases in
allowGaps - whether gaps are allowed (if false they will be replaced with ambiguity symbols)
- Returns:
- the version of the sequence string with the invalid bases/residues replaced with ambiguity symbols
-
getValidSequence
public static CharSequence getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps, boolean replaceWithGaps)
Replace any invalid bases/residues in the given sequence document with ambiguity symbols or gaps.
- Parameters:
sequenceDocument - sequence document to replace the invalid bases in
allowGaps - whether gaps are allowed (if false they will be replaced with ambiguity symbols)
replaceWithGaps - whether invalid bases/residues should be replaced with gaps. This should only be done if the sequence is in an alignment.
- Returns:
- the version of the sequence string with the invalid bases/residues replaced with ambiguity symbols or gaps
- Throws:
IllegalArgumentException - if allowGaps is false but replaceWithGaps is true
- Since:
- API 4.20 (Geneious 5.2.0)
-
asRna
Views an underlying (nucleotide) CharSequence as RNA by dynamically translating 'T's to 'U's and 't's to 'u's. It is guaranteed that if nucleotideCharSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
- Parameters:
nucleotideCharSequence - A nucleotide sequence which may already have some RNA residues; it is not guaranteed that it is checked whether the sequence contains invalid residues. Must not be null.
- Returns:
- A CharSequence with the same sequence of characters as nucleotideCharSequence, except that 'T's are replaced with 'U's and 't's are replaced with 'u's
-
asDna
Views an underlying (nucleotide) CharSequence as DNA by dynamically translating 'U's to 'T's and 'u's to 't's. It is guaranteed that if nucleotideCharSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
- Parameters:
nucleotideCharSequence - A nucleotide sequence which may already have some DNA residues; it is not guaranteed that it is checked whether the sequence contains invalid residues. Must not be null.
- Returns:
- A CharSequence with the same sequence of characters as nucleotideCharSequence, except that 'U's are replaced with 'T's and 'u's are replaced with 't's
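To make the idea of a dynamic view concrete, here is a minimal sketch of a wrapper that translates characters on access instead of copying the sequence. DnaViewSketch is a hypothetical class written for illustration; it is not the type that asDna returns and it does not reproduce the SequenceCharSequence guarantee.

```java
final class DnaViewSketch implements CharSequence {
    private final CharSequence source;

    DnaViewSketch(CharSequence source) {
        this.source = source;
    }

    @Override
    public int length() {
        return source.length();
    }

    @Override
    public char charAt(int index) {
        // Translate lazily; the underlying sequence is never copied or modified.
        char c = source.charAt(index);
        if (c == 'U') return 'T';
        if (c == 'u') return 't';
        return c;
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new DnaViewSketch(source.subSequence(start, end));
    }

    @Override
    public String toString() {
        StringBuilder builder = new StringBuilder(length());
        for (int i = 0; i < length(); i++) {
            builder.append(charAt(i));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(new DnaViewSketch("AUGcu"));
    }
}
```

An RNA view would follow the same shape with the two character mappings reversed.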
-
asTranslation
@Deprecated public static CharSequence asTranslation(CharSequence nucleotideCharSequence, GeneticCode geneticCode)
Deprecated.
-
asTranslation
public static CharSequence asTranslation(CharSequence nucleotideCharSequence, GeneticCode geneticCode, boolean translateFirstCodonUsingFirstCodonTable)
Views an underlying (nucleotide) CharSequence as its translation. If the length of the CharSequence is not a multiple of 3, the extra 1 or 2 characters are ignored. The translated sequence will have length nucleotideCharSequence.length()/3. If the nucleotide sequence contains unknown nucleotide characters, these are treated as unknown states and the corresponding translated site will also be the unknown state (?) unless the nucleotide base would not affect the translation (e.g. the 3rd base in some triplets). The concrete type of the return value is not guaranteed. The specified charSequence must not change after it was passed to this method, but it is not guaranteed that violations of this contract will be detected.
- Parameters:
nucleotideCharSequence - A nucleotide sequence which may be DNA, RNA or a mixture. Must not be null and must be immutable. Must not contain gaps.
geneticCode - the genetic code to use for the translation. Must not be null.
translateFirstCodonUsingFirstCodonTable - each genetic code specifies a set of codons which get translated as M if they are the first codon even though they normally wouldn't translate as an M when occurring elsewhere in a coding region. If this parameter is true the first codon will be translated using this alternative translation table for the genetic code.
- Returns:
- A CharSequence which is a translation of nucleotideCharSequence
- Throws:
IllegalArgumentException - if nucleotideCharSequence contains gaps.
NullPointerException - if nucleotideCharSequence or geneticCode is null.
- Since:
- API 4.41 (Geneious 5.4.1)
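The length and unknown-state rules above can be tried out with a simplified standalone sketch. TranslationSketch is hypothetical: it hard-codes the standard genetic code, produces an eager String rather than a view, ignores the first-codon table, and (unlike the documented behaviour) marks every codon containing an unknown base as '?' even where the third base would not matter.

```java
final class TranslationSketch {
    private static final String BASES = "TCAG";
    // Standard genetic code with codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
    private static final String AMINO_ACIDS =
            "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    /** Translates whole codons only; a trailing 1 or 2 bases are ignored. */
    static String translate(CharSequence nucleotides) {
        StringBuilder protein = new StringBuilder(nucleotides.length() / 3);
        for (int i = 0; i + 3 <= nucleotides.length(); i += 3) {
            int index = 0;
            boolean known = true;
            for (int j = 0; j < 3; j++) {
                char c = Character.toUpperCase(nucleotides.charAt(i + j));
                int base = BASES.indexOf(c == 'U' ? 'T' : c); // accept DNA or RNA
                if (base < 0) {
                    known = false;
                    break;
                }
                index = index * 4 + base;
            }
            protein.append(known ? AMINO_ACIDS.charAt(index) : '?');
        }
        return protein.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("ATGGCCTAA"));
    }
}
```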
-
reverseComplement
Provides a dynamic reverse complement view onto a nucleotide CharSequence. For performance, it is not guaranteed whether the charSequence will be checked for invalid residues. If an invalid nucleotide CharSequence is passed in, arbitrary nondeterministic behaviour may occur at any later time, such as unchecked exceptions thrown from CharSequence.charAt(int).
It is guaranteed that if charSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
Attention: Unlike Utils.reverseComplement(String), this method preserves case and doesn't remove gaps. To remove gaps and convert the sequence to upper case, use removeGaps(CharSequence) and CharSequenceUtilities.asUpperCase(CharSequence).
This method may be slow on sequences which do not contain a T or U near the start of the sequence as it needs to scan through the sequence to determine if it is RNA or DNA. Consider using reverseComplementAsDna(CharSequence) for better performance.
- Parameters:
charSequence - The charSequence for which to construct a reverse complement view.
- Returns:
- A reverse complement view onto charSequence, with case and gaps preserved.
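The case-preserving, gap-preserving behaviour can be illustrated with a short standalone sketch. ReverseComplementSketch is hypothetical, builds an eager copy rather than a view, always emits DNA, and passes ambiguity codes through unchanged, which the real method may not do.

```java
final class ReverseComplementSketch {
    private static final String FROM = "ACGTUacgtu";
    private static final String TO = "TGCAAtgcaa";

    /** Reverses the sequence and complements canonical bases, keeping case and gaps. */
    static String reverseComplementAsDna(CharSequence sequence) {
        StringBuilder result = new StringBuilder(sequence.length());
        for (int i = sequence.length() - 1; i >= 0; i--) {
            char c = sequence.charAt(i);
            int k = FROM.indexOf(c);
            // Gaps and any other symbols are copied as they are in this sketch.
            result.append(k >= 0 ? TO.charAt(k) : c);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseComplementAsDna("AAc-G"));
    }
}
```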
-
reverseComplementAsDna
Similar to reverseComplement except that the result will be returned as DNA even if the input sequence is RNA. This implementation is more efficient than reverseComplement(CharSequence) because it does not need to check the input sequence data type.
- Parameters:
charSequence - The charSequence for which to construct a reverse complement view.
- Returns:
- A reverse complement view onto charSequence, with case and gaps preserved, but RNA converted to DNA
- Since:
- API 4.202100 (Geneious 2021.0.0)
-
isPredominantlyRna
Checks whether a sequence is predominantly RNA (rather than DNA). Same as Utils.isPredominantlyRNA(CharSequence, int), but more efficient on SequenceCharSequences.
- Parameters:
charSequence - A charSequence that may contain DNA or RNA characters.
maximumNonGapsToLookAt - Maximum number of non-gap residues to look at before making a decision. Pass in Integer.MAX_VALUE to look at all residues.
- Returns:
- true if the non-gap residues of charSequence are predominantly RNA
-
isRna
Checks whether a sequence is RNA (rather than DNA) based on whether a T/t or a U/u occurs first in the sequence. If it contains neither T/t nor U/u, this method returns false.
- Parameters:
charSequence - A charSequence that may contain DNA or RNA characters.
- Returns:
- true if this charSequence is RNA (rather than DNA)
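The first-occurrence rule described above amounts to a single left-to-right scan. The following sketch shows one way to write it in plain Java; RnaCheckSketch is a hypothetical name and the code is not the Geneious implementation.

```java
final class RnaCheckSketch {
    /** True if a U/u is found before any T/t; false if a T/t comes first or neither occurs. */
    static boolean isRna(CharSequence sequence) {
        for (int i = 0; i < sequence.length(); i++) {
            switch (sequence.charAt(i)) {
                case 'U':
                case 'u':
                    return true;
                case 'T':
                case 't':
                    return false;
                default:
                    break; // keep scanning
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isRna("ACGUT"));
        System.out.println(isRna("ACGTU"));
    }
}
```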
-
isRna
Checks whether a sequence is RNA (rather than DNA) based on whether a T/t or a U/u occurs first in the sequence. If it contains neither T/t nor U/u, this method returns false.
- Parameters:
charSequence - A charSequence that may contain DNA or RNA characters.
maxNucleotidesToCheck - the maximum number of nucleotides/gaps to check (excluding leading/trailing gaps) before giving up and calling it DNA if no T or U is found.
- Returns:
- true if this charSequence is RNA (rather than DNA)
- Since:
- API 4.202200 (Geneious 2022.0.0)
-
removeGaps
Constructs a sequence without gaps ('-') from a specified sequence that potentially has gaps. If the specified sequence does contain gaps, then a gapless copy is returned. Otherwise, the original charSequence is returned. It is guaranteed that CharSequenceUtilities.equals(removeGaps(cs), cs.toString().replace("-", "")) for any CharSequence cs that fulfills its contract.
- Parameters:
charSequence - A nucleotide or amino acid sequence, potentially with gaps ('-')
- Returns:
- A CharSequence that contains the same sequence of characters but without the gaps ('-'). Returns charSequence if charSequence doesn't contain any gaps.
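The copy-only-when-needed contract can be sketched in a few lines of plain Java. RemoveGapsSketch is a hypothetical illustration; the concrete return type of the real method is not specified here.

```java
final class RemoveGapsSketch {
    /** Returns the same instance if there are no gaps, otherwise a gapless copy. */
    static CharSequence removeGaps(CharSequence sequence) {
        boolean hasGap = false;
        for (int i = 0; i < sequence.length() && !hasGap; i++) {
            hasGap = sequence.charAt(i) == '-';
        }
        if (!hasGap) {
            return sequence; // nothing to remove, so avoid copying
        }
        StringBuilder gapless = new StringBuilder(sequence.length());
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (c != '-') {
                gapless.append(c);
            }
        }
        return gapless;
    }

    public static void main(String[] args) {
        System.out.println(removeGaps("A-C--G"));
    }
}
```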
-
getLeadingGapsLength
Returns the start index of the non-gap regions in the specified charSequence, i.e. the length of the longest prefix of charSequence that contains only '-' characters.
- Parameters:
charSequence - A CharSequence that may contain some leading gap characters '-'
- Returns:
- The length of the longest prefix of charSequence that contains only '-' characters.
-
getTrailingGapsLength
Get the number of trailing gap ('-') characters in the sequence.
- Parameters:
charSequence - A CharSequence that may have trailing gap characters.
- Returns:
- the number of trailing gap ('-') characters in the sequence or 0 if the sequence is entirely gaps.
-
getTrailingGapsStartIndex
Returns the end index of the non-gap regions in the specified charSequence. This is identical to charSequence.length() minus the length of the longest suffix of charSequence that consists only of '-', except when charSequence consists only of '-', in which case this method returns charSequence.length() because there are no non-gap regions. In other words, in a sequence that consists only of gaps, all gaps are considered leading rather than trailing gaps, i.e. the non-gap region is considered to start just beyond the end of the sequence.
- Parameters:
charSequence - A CharSequence that may contain some leading gap characters '-'
- Returns:
- 1+the index of the last nongap character in charSequence, or charSequence.length() if charSequence consists only of gaps
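The all-gaps special case is easiest to see in code. The sketch below pairs a leading-gap count with the trailing-gap start index as described above; GapBoundsSketch and its method names are hypothetical and written only to mirror the documented results.

```java
final class GapBoundsSketch {
    /** Length of the longest prefix consisting only of '-'. */
    static int leadingGapsLength(CharSequence s) {
        int i = 0;
        while (i < s.length() && s.charAt(i) == '-') {
            i++;
        }
        return i;
    }

    /** 1 + index of the last non-gap character, or s.length() if s is all gaps. */
    static int trailingGapsStartIndex(CharSequence s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == '-') {
            end--;
        }
        // In an all-gap sequence every gap counts as leading, so report the full length.
        return end == 0 ? s.length() : end;
    }

    public static void main(String[] args) {
        System.out.println(leadingGapsLength("--AC-G---"));
        System.out.println(trailingGapsStartIndex("--AC-G---"));
    }
}
```

With these two values, the non-gap region of a sequence runs from the leading gap length (inclusive) to the trailing gap start index (exclusive).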
-
getAlphabet
Get the Alphabet of a sequence.
- Parameters:
sequence - a SequenceDocument to get the alphabet for.
- Returns:
- Alphabet of sequence
-
getSequenceType
Get the (jebl) sequence type.
- Parameters:
sequence - a SequenceDocument to get the sequence type of.
- Returns:
- type of sequence
- Throws:
IllegalArgumentException - if sequence is neither a NucleotideSequenceDocument nor an AminoAcidSequenceDocument.
-
getSequenceType
Examines a document and determines what the (jebl) sequence type (or types) of the document is (or are), and returns it (or them).
Always returns a List<SequenceType> of size 0, 1 or 2.
- Size 0: when the given document was a type that could have either SequenceType.AMINO_ACID or SequenceType.NUCLEOTIDE or both, and that document has no sequences at all, for example an empty SequenceListDocument
- Size 1: when the given document just contains a single sequence or is of a type where the SequenceType is always known, e.g. a NucleotideSequenceDocument or a SequenceAlignmentDocument.
- Size 2: when the given document has sequences of both types, e.g. a SequenceListDocument with sequences of both types.
- Parameters:
document- the document to determine the SequenceType of- Returns:
- a list containing the sequence type or types of the given document.
- Throws:
IllegalArgumentException- if the given document type wasn't a valid type to determine the SequenceType of.- Since:
- API 4.610 (Geneious 6.1.0)
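The size 0, 1 or 2 contract for a list-like document can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the enum Kind stands in for the jebl SequenceType, the class and method names are hypothetical, and the ordering of the two types in the returned list is a guess (the source does not specify it). Documents whose type is always known, such as a NucleotideSequenceDocument, are not modelled here.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

// Hypothetical sketch; does not use the Geneious or jebl classes.
final class SequenceTypeListSketch {
    /** Stand-in for the two jebl sequence types named in the text. */
    enum Kind { NUCLEOTIDE, AMINO_ACID }

    /**
     * Returns the distinct kinds found in a list document:
     * empty for no sequences, one entry for a uniform list, two for a mixed list.
     */
    static List<Kind> typesPresent(List<Kind> kindsOfSequencesInDocument) {
        EnumSet<Kind> seen = EnumSet.noneOf(Kind.class);
        seen.addAll(kindsOfSequencesInDocument);
        return new ArrayList<>(seen); // order follows the enum declaration (an assumption)
    }
}
```

Callers of the real method would typically branch on the list size before choosing a nucleotide-only or protein-only code path.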
-
getAlphabet
- Parameters:
documents- the documents to get the alphabet for- Returns:
- The alphabet that all these documents have in common, or null if they are not all the same alphabet or if any of the documents have multiple alphabets
- Throws:
IllegalArgumentException- if any of the documents aren't a type of sequence (nucleotide, protein, sequence list or alignment)- Since:
- API 4.1010 (Geneious 10.1.0)
-
toHTMLFragment
Generate a HTML fragment that summarises a sequence, including the sequence string. If the sequence is longer than a certain threshold X, then only the first X residues are shown.- Parameters:
sequence- a SequenceDocument
additionalContent- additional content to include- Returns:
- the html formatted summary
-
asJeblSequence
Convert from a Geneious sequence to a jebl sequence.- Parameters:
sequence- a Geneious sequence- Returns:
- sequence as a jebl sequence.
-
asJeblSequences
Convert a set of Geneious sequences to jebl sequences.- Parameters:
sequences- Geneious sequences- Returns:
- the Geneious sequences as jebl sequences
-
asJeblSequences
Convert a set of Geneious sequences to jebl sequences.- Parameters:
sequences- Geneious sequences- Returns:
- the Geneious sequences as jebl sequences
-
asJeblAlignment
Convert a list of (aligned) Geneious sequences to a jebl alignment- Parameters:
sequences- aligned Geneious sequences- Returns:
- the Geneious sequences as a jebl alignment
-
createSequenceCopy
Creates a copy of the original sequence if necessary. If the sequence is an immutable sequence (ImmutableSequence) then it is not copied and is just returned from this method.- Parameters:
original- the original sequence- Returns:
- a new sequence document or the original sequence if the original sequence is immutable.
- Since:
- API 4.11 (Geneious 5.0)
- See Also:
-
createSequenceCopyEditable
Creates a copy of the original sequence that is editable.- Parameters:
original- the original sequence- Returns:
- a new sequence document
- Since:
- API 4.11 (Geneious 5.0)
- See Also:
-
getSequenceAnnotationsIncludingImmutableSequencesTrims
public static List<SequenceAnnotation> getSequenceAnnotationsIncludingImmutableSequencesTrims(SequenceDocument sequence) Gets all the annotations on the given sequence. Additionally, if it is an ImmutableSequence with ImmutableSequence.getLeadingTrimLength() or ImmutableSequence.getTrailingTrimLength() > 0, then annotations are created to represent these trims.- Parameters:
sequence- the sequence to get annotations from- Returns:
- the annotations from the sequence
- Since:
- API 4.52 (Geneious 5.5.2)
-
asJeblSequence
public static Sequence asJeblSequence(SequenceAlignmentDocument.ReferencedSequence referencedSequence, SequenceDocument sequence) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException Convert from a Geneious sequence to a jebl sequence.- Parameters:
referencedSequence- original referenced sequence to copy additional fields from. May be null.sequence- a Geneious sequence- Returns:
- sequence as a jebl sequence
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException- when the referenced sequence cannot be loaded- Since:
- API 4.700 (Geneious 7.0.0)
-
asJeblSequence
@Deprecated public static Sequence asJeblSequence(AnnotatedPluginDocument referenceDocument, SequenceDocument sequence) Convert from a Geneious sequence to a jebl sequence.- Parameters:
referenceDocument- original AnnotatedPluginDocument to copy additional fields from. May be null.sequence- a Geneious sequence- Returns:
- sequence as a jebl sequence
- Since:
- API 4.43 (Geneious 5.4.3)
-
replaceQuestionMarksWithMaximalAmbiguitySymbol
public static String replaceQuestionMarksWithMaximalAmbiguitySymbol(SequenceType sequenceType, String sequence) Gets a version of a sequence string with any question marks replaced with N (for nucleotide sequences) or X (for protein sequences).- Parameters:
sequenceType- sequence type of sequence
sequence- sequence string- Returns:
- version of sequence with any question marks replaced with N (for nucleotide sequences) or X (for protein sequences)
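The replacement rule can be shown with a short plain-Java sketch. It is not the library's code: the class AmbiguitySketch is hypothetical, and a boolean stands in for the SequenceType parameter so that the example has no Geneious or jebl dependency.

```java
// Hypothetical sketch of the documented '?' replacement rule.
final class AmbiguitySketch {
    /**
     * Replaces every '?' with 'N' for nucleotide sequences
     * or with 'X' for protein sequences.
     */
    static String replaceQuestionMarks(boolean nucleotide, String sequence) {
        char maximalAmbiguitySymbol = nucleotide ? 'N' : 'X';
        return sequence.replace('?', maximalAmbiguitySymbol);
    }
}
```

With this sketch, "AC?GT?" treated as nucleotide becomes "ACNGTN", and "MK?L" treated as protein becomes "MKXL".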
-
getMaximalAmbiguitySymbol
Gets the code for the state in this sequence type which represents a base/residue that is completely unknown.- Parameters:
sequenceType-- Returns:
-
getAnnotationsOfType
public static List<SequenceAnnotation> getAnnotationsOfType(List<SequenceAnnotation> annotations, String type) Get all annotations in list matching the given type- Parameters:
annotations- annotations
type- type of annotations to get- Returns:
- all annotations in the list matching the given type
-
getAnnotationsOfType
@Deprecated public static List<SequenceAnnotation> getAnnotationsOfType(SequenceDocument document, String type) Deprecated. Get all annotations in document matching the given type- Parameters:
document- document to get annotations from
type- type of annotations to get- Returns:
- all annotations in document matching the given type
-
getAnnotationsOfType
public static List<SequenceAnnotation> getAnnotationsOfType(SequenceDocument document, String type, boolean returnAnnotationsInTracks) Get all annotations in document matching the given type. WARNING: this list may not include all SequenceAnnotations represented as annotations in the sequence viewer. One such case is with trim annotations, which should be found using getSequenceAnnotationsIncludingImmutableSequencesTrims(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument).
This may also return annotations that are not visible in the sequence viewer, such as SequenceAnnotation.TYPE_EXTRACTED_REGION.- Parameters:
document- document to get annotations from
type- type of annotations to get
returnAnnotationsInTracks- true iff we want annotations from SequenceTracks as well as those annotated directly on a document- Returns:
- all annotations in document matching the given type
- Since:
- API 4.50 (Geneious 5.5.0)
-
getSequenceAndTrackAnnotations
public static Iterable<SequenceAnnotation> getSequenceAndTrackAnnotations(SequenceDocument sequence) A convenience method to get all annotations on the sequence and all annotations on all SequenceTracks on this sequence. Most code should instead manually load tracks on demand since they may be too large to fit into memory. To get tracks on a sequence use SequenceTrack.getTrackManager(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument) followed by SequenceTrack.Manager.getTracks().
Iterating over the returned value may throw a RuntimeException whose cause is an XMLSerializationException if there is insufficient memory available to load the annotations. When running DocumentOperations or SequenceAnnotationGenerators, core Geneious will automatically catch such exceptions and display a nice message to the user.- Parameters:
sequence- the sequence to get annotations for- Returns:
- iterator containing the annotations. Will not return null.
- Since:
- API 4.50 (Geneious 5.5.0)
-
createSequenceCopyAdjustedForGapInsertion
public static SequenceDocument createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, CharSequence gappedSequenceCharacters) Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion. Note, the returned copy does not create gapped versions of SequenceTracks. Tracks are instead automatically propagated from referenced documents in alignments.- Parameters:
sequenceDocument- a sequence to insert gaps into. If the sequence already contains gaps, the gaps are removed first
gappedSequenceCharacters- the sequence characters to appear in the new gapped sequence. The positions of gaps in this character sequence determine how annotations and chromatograms are adjusted.- Returns:
- a copy of sequenceDocument adjusted for gap insertion. This is always a DefaultSequenceDocument but this method isn't declared to return that for API backwards compatibility reasons
- See Also:
-
createSequenceCopyAdjustedForGapInsertion
public static SequenceDocument createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, CharSequence gappedSequenceCharacters, boolean includeTracks) Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion.- Parameters:
sequenceDocument- a sequence to insert gaps into. If the sequence already contains gaps, the gaps are removed first
gappedSequenceCharacters- the sequence characters to appear in the new gapped sequence. The positions of gaps in this character sequence determine how annotations and chromatograms are adjusted.
includeTracks- true if tracks should also be copied. If this is intended for use with an alignment which references the original documents, this should be false as alignment documents propagate tracks on demand from referenced documents.- Returns:
- a copy of sequenceDocument adjusted for gap insertion. This is always a DefaultSequenceDocument but this method isn't declared to return that for API backwards compatibility reasons
- Since:
- API 4.202000 (Geneious 2020.0.0)
- See Also:
-
concatenateSequences
public static SequenceDocument concatenateSequences(List<? extends SequenceDocument> sequences, boolean circular, int indexOfDocumentToUseForOrigin, ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException Concatenate a list of sequence documents. All sequences must be of the same type (all nucleotide or all amino acid). For circular results, indexOfDocumentToUseForOrigin may be used to specify which input sequence should be used to determine the origin for the result. If the specified input sequence is circular and has an annotated origin, this position will be used; otherwise, the start of the specified sequence will be the origin of the result. If circular is false, indexOfDocumentToUseForOrigin must be -1.- Parameters:
sequences- sequence documents to concatenate
circular- if true, the result will be circular
indexOfDocumentToUseForOrigin- index of document to use for the origin (must be -1 for linear results)
progressListener-- Returns:
- concatenated sequence
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException- Since:
- API 4.1100 (Geneious 11.0.0)
-
getSequences
public static List<? extends SequenceDocument> getSequences(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet, ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException Gets all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments. For large sequence lists (SequenceListOnDisk) and genome sized sequences (those longer than SequenceDocument.GENOME_SEQUENCE_THRESHOLD) in other sequence lists, these are only loaded into memory on demand to ensure this method doesn't use excessive memory. If this method is potentially called on thousands of documents, then getSequencesWithoutImmediateLoading should be considered instead.- Parameters:
documents- documents to get the sequences out of
alphabet- alphabet the sequences must have in order to be included
progressListener- for notifying the caller about progress of this method and for cancelling.- Returns:
- all the sequences. Sequences are ordered by the AnnotatedPluginDocument they are in, and then by their index in that document.
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException- if there is a problem getting the PluginDocument out of an AnnotatedPluginDocument or if the progress listener cancels the request.
-
getSequences
public static List<? extends SequenceDocument> getSequences(List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet, ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException Gets all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments. For large sequence lists (SequenceListOnDisk) and genome sized sequences (those longer than SequenceDocument.GENOME_SEQUENCE_THRESHOLD) in other sequence lists, these are only loaded into memory on demand to ensure this method doesn't use excessive memory. If this method is potentially called on thousands of documents, then getSequencesWithoutImmediateLoading should be considered instead.- Parameters:
documents- documents to get the sequences out of
alphabet- alphabet the sequences must have in order to be included
progressListener- for notifying the caller about progress of this method and for cancelling.- Returns:
- all the sequences
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException- if there is a problem getting the PluginDocument out of an AnnotatedPluginDocument or if the progress listener cancels the request.- Since:
- API 4.700 (Geneious 7.0.0)
-
getSequencesWithoutImmediateLoading
public static Collection<? extends SequenceDocument> getSequencesWithoutImmediateLoading(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException Like getSequences(com.biomatters.geneious.publicapi.documents.AnnotatedPluginDocument[], com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument.Alphabet, jebl.util.ProgressListener) but doesn't require each plugin document to be in memory as long as this Collection is around. The trade-off is that the sequences can only be accessed sequentially (hence the Collection return type of this method). Also the Collection does not support removal.
Since this Collection doesn't store the sequences immediately, DocumentOperationExceptions may be thrown down the line. Such a situation may be caught by surrounding the given iteration with try {... } catch (RuntimeDocumentOperationException e) and then handling the exception from there. Note that using getSequences(com.biomatters.geneious.publicapi.documents.AnnotatedPluginDocument[], com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument.Alphabet, jebl.util.ProgressListener) is preferable to using getSequencesWithoutImmediateLoading when dealing with under a thousand documents.- Parameters:
documents- documents to get the sequences out of
alphabet- alphabet the sequences must have in order to be included- Returns:
- all the sequences whose iterator may throw a RuntimeDocumentOperationException
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException- if one or more of the documents has more than Integer.MAX_VALUE sequences.- Since:
- API 4.610 (Geneious 6.1.0)
- See Also:
-
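The pattern described for getSequencesWithoutImmediateLoading, where loading is deferred and failures surface as an unchecked exception during iteration, can be sketched in plain Java. Everything here is a hypothetical stand-in: LoadFailedException plays the role of RuntimeDocumentOperationException, Loader plays the role of whatever loads a document, and strings stand in for documents and sequences.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch of lazy, sequential-only access; not the Geneious implementation.
final class LazyLoadingSketch {
    /** Stand-in for RuntimeDocumentOperationException. */
    static final class LoadFailedException extends RuntimeException {
        LoadFailedException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Stand-in for loading one document's sequence on demand. */
    interface Loader {
        String load(String documentName) throws Exception;
    }

    /** Nothing is loaded until the returned Iterable is iterated. */
    static Iterable<String> lazySequences(List<String> documentNames, Loader loader) {
        return () -> new Iterator<String>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < documentNames.size();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String name = documentNames.get(next++);
                try {
                    return loader.load(name);
                } catch (Exception e) {
                    // A checked failure can only surface as an unchecked exception here.
                    throw new LoadFailedException("Could not load " + name, e);
                }
            }
            // remove() is not overridden, so removal is unsupported.
        };
    }

    /** Shows the try/catch placement around the iteration. */
    static List<String> collectUntilFailure(Iterable<String> sequences) {
        List<String> loaded = new ArrayList<>();
        try {
            for (String sequence : sequences) {
                loaded.add(sequence);
            }
        } catch (LoadFailedException e) {
            // Handle the deferred failure; here we simply stop and keep what was loaded.
        }
        return loaded;
    }
}
```

The design point is that the try block must wrap the loop itself, since constructing the lazy collection does not trigger any loading.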
getOriginalIndex
Gets the original numbering of the given index if it is covered by a SequenceAnnotation.TYPE_EXTRACTED_REGION annotation.- Parameters:
sequence- the sequence this index belongs to.
index- the index to get the original numbering for.- Returns:
- 'translated' index or the original index if no other numbering can be found.
- Since:
- API 4.900 (Geneious 9.0.0)
-
getNumberOfSequences
public static long getNumberOfSequences(List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet) Gets the total number of nucleotide or amino acid sequences contained in the given documents which may be individual sequences, sequence lists, or alignments/contigs.- Parameters:
documents- the documents to get the number of sequences in
alphabet- the alphabet (nucleotide or amino acid) of the sequences to count.- Returns:
- the total number of nucleotide or amino acid sequences contained in the given documents
- Since:
- API 4.40 (Geneious 5.4.0)
-
getNumberOfSequences
public static long getNumberOfSequences(AnnotatedPluginDocument document, SequenceDocument.Alphabet alphabet) Gets the total number of nucleotide or amino acid sequences contained in the given document which may be an individual sequence, sequence list, or alignment/contig.- Parameters:
document- the document to get the number of sequences in
alphabet- the alphabet (nucleotide or amino acid) of the sequences to count.- Returns:
- the total number of nucleotide or amino acid sequences contained in the given document
- Since:
- API 4.40 (Geneious 5.4.0)
-
generateConsensusSequence
@Deprecated public static SequenceDocument generateConsensusSequence(SequenceAlignmentDocument alignment, ProgressListener progressListener) Deprecated. Generates a consensus sequence for an alignment using default consensus settings. Note that the returned sequence may contain gaps. If it is to be used as a stand-alone sequence, then SequenceExtractionUtilities.removeGaps(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument) should be used.- Parameters:
alignment- the alignment to generate the consensus sequence for
progressListener- for reporting progress and cancelling.- Returns:
- a sequence equal in length to the alignment. The sequence may contain gaps. May return null if progressListener requests this get cancelled.
- Since:
- API 4.60 (Geneious 5.6.0)
-
generateConsensus
public static SequenceDocument generateConsensus(SequenceAlignmentDocument alignment, ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException Generates a consensus sequence for an alignment using default consensus settings. Note that the returned sequence may contain gaps. If it is to be used as a stand-alone sequence, then SequenceExtractionUtilities.removeGaps(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument) should be used. To generate consensus sequences with non-default options, use PluginUtilities.getDocumentOperation("Generate_Consensus"). Note that this operation generates a sequence with gaps removed by default.- Parameters:
alignment- the alignment to generate the consensus sequence for
progressListener- for reporting progress and cancelling.- Returns:
- a sequence equal in length to the alignment. The sequence may contain gaps. Will not return null
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException- if the consensus can't be generated because there is insufficient free memory.
com.biomatters.geneious.publicapi.plugin.DocumentOperationException.Canceled- if the progressListener requests the consensus generation be cancelled.- Since:
- API 4.610 (Geneious 6.1.0)
-
getBlastAlignmentText
public static String getBlastAlignmentText(SequenceAlignmentDocument alignment, boolean geneiousFriendly) Formats the given alignment in BLAST text format- Parameters:
alignment- alignment to format
geneiousFriendly- whether to format the alignment in an html-formatted "Geneious friendly" way that is useful generally for alignments and not just for BLAST output- Returns:
- alignment represented in BLAST text format
- Since:
- API 4.700 (Geneious 7.0.0)
-
alignmentFromJeblSequences
public static DefaultAlignmentDocument alignmentFromJeblSequences(String name, List<Sequence> jeblSequences) Converts the given alignment of Jebl sequences into a DefaultAlignmentDocument- Parameters:
name- name for alignment
jeblSequences- aligned jebl sequences- Returns:
- a DefaultAlignmentDocument representing the given alignment.
- Since:
- API 4.700 (Geneious 7.0.0)
-
createNewDocumentsByTransformingSequences
public static List<AnnotatedPluginDocument> createNewDocumentsByTransformingSequences(List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, ProgressListener progressListener, String newSequenceOrDocumentNamePrefix) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.- Parameters:
sourceDocuments- the source documents containing sequences to transform. These may be SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocuments
transformer- the transformer for transforming each sequence
progressListener- for reporting progress and canceling
newSequenceOrDocumentNamePrefix- an optional prefix to assign to the name of each newly generated document. May be an empty String to leave names unchanged.- Returns:
- the new documents
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException- if documents can't be loaded, or if the input documents are not SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocuments- Since:
- API 4.701 (Geneious 7.0.1)
-
createNewDocumentsByTransformingSequences
public static List<AnnotatedPluginDocument> createNewDocumentsByTransformingSequences(List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, ProgressListener progressListener, String newSequenceOrDocumentNamePrefix, String newSequenceOrDocumentNameSuffix) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.- Parameters:
sourceDocuments- the source documents containing sequences to transform. These may be SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocuments
transformer- the transformer for transforming each sequence
progressListener- for reporting progress and canceling
newSequenceOrDocumentNamePrefix- an optional prefix to assign to the name of each newly generated document. May be an empty String to leave names unchanged.
newSequenceOrDocumentNameSuffix- an optional suffix to assign to the name of each newly generated document. May be an empty String to leave names unchanged.- Returns:
- the new documents
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException- if documents can't be loaded, or if the input documents are not SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocuments- Since:
- API 4.201920 (Geneious 2019.2.0)
-
getIntervalBasedOnExtractionAnnotation
public static SequenceAnnotationInterval getIntervalBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, SequenceAnnotationInterval interval, boolean mapToOriginal) Gets the extraction annotations from the sequence document and maps the interval to either the original sequence or the result sequence, depending on the value of mapToOriginal- Parameters:
sequenceDocument- the document to get the extractionAnnotations from
interval- the interval to re-map
mapToOriginal- whether to map this interval to the corresponding bit on the original or to the corresponding bit on the result- Returns:
- a new interval that represents the given interval on either the original or result document, return parameter interval back if can not find mapping
- Since:
- API 4.1000 (Geneious 10.0.0)
-
getIndexBasedOnExtractionAnnotation
public static Integer getIndexBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, int index, boolean mapToOriginal) Gets the extraction annotations from the sequence document and maps a residue index to a residue index on either the original sequence or the result sequence, depending on the value of mapToOriginal- Parameters:
sequenceDocument- the document to get the extractionAnnotations from
index- the 1-based residue position in the sequence to re-map.
mapToOriginal- whether to map this index to the corresponding bit on the original or to the corresponding bit on the result- Returns:
- a new index that represents the given index on either the original or result document, return null if the index can't be mapped.
- Since:
- API 4.1000 (Geneious 10.0.0)
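The idea of re-mapping a 1-based index between an extracted result and its original can be sketched with simple arithmetic. This is an assumption-laden illustration, not the library's algorithm: it assumes a single forward-direction extracted region covering original positions originalStart to originalEnd (both 1-based, inclusive), and the class and parameter names are hypothetical. How the real method handles reversed or multiple extraction annotations is not stated in the source.

```java
// Hypothetical sketch: one forward extracted region, 1-based inclusive coordinates.
final class ExtractedRegionSketch {
    /**
     * Maps an index between the result sequence and the original sequence.
     * Returns null when the index cannot be mapped, mirroring the documented contract.
     */
    static Integer mapIndex(int index, int originalStart, int originalEnd, boolean mapToOriginal) {
        int regionLength = originalEnd - originalStart + 1;
        if (mapToOriginal) {
            // index is a position on the extracted result
            if (index < 1 || index > regionLength) {
                return null;
            }
            return originalStart + index - 1;
        }
        // index is a position on the original sequence
        if (index < originalStart || index > originalEnd) {
            return null;
        }
        return index - originalStart + 1;
    }
}
```

Under these assumptions, position 1 of a result extracted from original positions 101 to 200 maps to 101 on the original, and original position 150 maps to 50 on the result.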
-
getSequenceCharSequenceHash
- Parameters:
charSequence- a sequence returned from SequenceDocument.getCharSequence()- Returns:
- a hexadecimal encoded MD5 hash of the nucleotides or amino acids in a sequence
- Since:
- API 4.202500 (Geneious 2025.0.0)
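A hexadecimal encoded MD5 hash of a character sequence can be computed with the standard java.security.MessageDigest class, as in the sketch below. This is not the Geneious implementation: the class name SequenceHashSketch is hypothetical, and encoding the characters as UTF-8 bytes without any case or gap normalisation is an assumption, so the values it produces need not match those returned by getSequenceCharSequenceHash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of a hex-encoded MD5 over sequence residues.
final class SequenceHashSketch {
    /** Returns a 32-character lowercase hexadecimal MD5 digest of the residues. */
    static String md5Hex(CharSequence residues) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            // Assumption: residues are hashed as UTF-8 bytes, exactly as given.
            byte[] digest = md5.digest(residues.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

A hash like this is useful for cheaply checking whether two sequences have identical residues without comparing them character by character.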
-
getSequenceHash
- Parameters:
sequence- sequence to get a MD5 hash of- Returns:
- a hexadecimal encoded MD5 hash of the nucleotides or amino acids in this sequence
- Since:
- API 4.202500 (Geneious 2025.0.0)
-
getSequenceHash
- Parameters:
sequence- sequence to get a MD5 hash of
intervals- residue (nucleotide or amino acid) intervals within the sequence- Returns:
- a hexadecimal encoded MD5 hash of the nucleotides or amino acids within the specified intervals in this sequence
- Since:
- API 4.202500 (Geneious 2025.0.0)
-
asJeblSequence(SequenceAlignmentDocument.ReferencedSequence, SequenceDocument)