Class SequenceUtilities
- java.lang.Object
-
- com.biomatters.geneious.publicapi.utilities.SequenceUtilities
-
public final class SequenceUtilities extends java.lang.Object
A noninstantiable class providing static methods for common tasks associated with nucleotide and protein sequences.
See Also: SequenceDocument, SequenceExtractionUtilities
-
-
Method Summary
Modifier and Type, Method, Description
static DefaultAlignmentDocument
alignmentFromJeblSequences(java.lang.String name, java.util.List<jebl.evolution.sequences.Sequence> jeblSequences)
Converts the given alignment of Jebl sequences into a DefaultAlignmentDocument.
static java.lang.CharSequence
asDna(java.lang.CharSequence nucleotideCharSequence)
Views an underlying (nucleotide) CharSequence as DNA by dynamically translating 'U's to 'T's and 'u's to 't's. It is guaranteed that if nucleotideCharSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
static jebl.evolution.alignments.Alignment
asJeblAlignment(java.util.List<SequenceDocument> sequences)
Convert a list of (aligned) Geneious sequences to a jebl alignment.
static jebl.evolution.sequences.Sequence
asJeblSequence(AnnotatedPluginDocument referenceDocument, SequenceDocument sequence)
static jebl.evolution.sequences.Sequence
asJeblSequence(SequenceAlignmentDocument.ReferencedSequence referencedSequence, SequenceDocument sequence)
Convert from a Geneious sequence to a jebl sequence.
static jebl.evolution.sequences.Sequence
asJeblSequence(SequenceDocument sequence)
Convert from a Geneious sequence to a jebl sequence.
static java.util.List<jebl.evolution.sequences.Sequence>
asJeblSequences(SequenceDocument... sequences)
Convert a set of Geneious sequences to jebl sequences.
static java.util.List<jebl.evolution.sequences.Sequence>
asJeblSequences(java.util.List<SequenceDocument> sequences)
Convert a set of Geneious sequences to jebl sequences.
static java.lang.CharSequence
asRna(java.lang.CharSequence nucleotideCharSequence)
Views an underlying (nucleotide) CharSequence as RNA by dynamically translating 'T's to 'U's and 't's to 'u's. It is guaranteed that if nucleotideCharSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
static java.lang.CharSequence
asTranslation(java.lang.CharSequence nucleotideCharSequence, jebl.evolution.sequences.GeneticCode geneticCode)
Deprecated.
static java.lang.CharSequence
asTranslation(java.lang.CharSequence nucleotideCharSequence, jebl.evolution.sequences.GeneticCode geneticCode, boolean translateFirstCodonUsingFirstCodonTable)
Views an underlying (nucleotide) CharSequence as its translation.
static SequenceDocument
concatenateSequences(java.util.List<? extends SequenceDocument> sequences, boolean circular, int indexOfDocumentToUseForOrigin, jebl.util.ProgressListener progressListener)
Concatenate a list of sequence documents.
static java.lang.String
containsInvalidResidues(SequenceDocument sequenceDocument, boolean allowGaps, boolean fastIncompleteCheck)
Checks if a sequence contains invalid sequence residues.
static java.lang.String
containsInvalidResidues(java.lang.CharSequence sequenceResidues, SequenceDocument.Alphabet alphabet, boolean allowGaps, boolean fastIncompleteCheck)
Checks if a sequence contains invalid sequence residues.
static java.lang.String
containsInvalidResidues(java.lang.CharSequence sequenceResidues, jebl.evolution.sequences.SequenceType sequenceType, boolean allowGaps, boolean fastIncompleteCheck)
Checks if a sequence contains invalid sequence residues.
static java.util.List<AnnotatedPluginDocument>
createNewDocumentsByTransformingSequences(java.util.List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, jebl.util.ProgressListener progressListener, java.lang.String newSequenceOrDocumentNamePrefix)
Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.
static java.util.List<AnnotatedPluginDocument>
createNewDocumentsByTransformingSequences(java.util.List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, jebl.util.ProgressListener progressListener, java.lang.String newSequenceOrDocumentNamePrefix, java.lang.String newSequenceOrDocumentNameSuffix)
Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.
static SequenceDocument
createSequenceCopy(SequenceDocument original)
Creates a copy of the original sequence if necessary.
static SequenceDocument
createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, java.lang.CharSequence gappedSequenceCharacters)
Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion.
static SequenceDocument
createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, java.lang.CharSequence gappedSequenceCharacters, boolean includeTracks)
Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion.
static DefaultSequenceDocument
createSequenceCopyEditable(SequenceDocument original)
Creates a copy of the original sequence that is editable.
static DefaultSequenceDocument
createSequenceDocument(jebl.evolution.sequences.SequenceType sequenceType, java.lang.String name, java.lang.String description, java.lang.CharSequence sequenceString, java.util.Date creationDate)
Creates a DefaultNucleotideSequence or DefaultAminoAcidSequence depending on sequenceType.
static SequenceDocument
generateConsensus(SequenceAlignmentDocument alignment, jebl.util.ProgressListener progressListener)
Generates a consensus sequence for an alignment using default consensus settings.
static SequenceDocument
generateConsensusSequence(SequenceAlignmentDocument alignment, jebl.util.ProgressListener progressListener)
static SequenceDocument.Alphabet
getAlphabet(AnnotatedPluginDocument... documents)
static SequenceDocument.Alphabet
getAlphabet(SequenceDocument sequence)
Get the Alphabet of a sequence.
static SequenceDocument.Alphabet
getAlphabet(jebl.evolution.sequences.SequenceType sequenceType)
Gets a Geneious alphabet type that is equivalent to a jebl library SequenceType.
static java.util.List<SequenceAnnotation>
getAnnotationsOfType(SequenceDocument document, java.lang.String type)
static java.util.List<SequenceAnnotation>
getAnnotationsOfType(SequenceDocument document, java.lang.String type, boolean returnAnnotationsInTracks)
Get all annotations in document matching the given type.
static java.util.List<SequenceAnnotation>
getAnnotationsOfType(java.util.List<SequenceAnnotation> annotations, java.lang.String type)
Get all annotations in list matching the given type.
static java.lang.String
getBlastAlignmentText(SequenceAlignmentDocument alignment, boolean geneiousFriendly)
Formats the given alignment in BLAST text format.
static java.lang.String
getForwardRegexForSequence(java.lang.CharSequence querySequence, jebl.evolution.sequences.SequenceType sequenceType, boolean interpretAmbiguitiesInQuery)
static java.lang.String
getForwardRegexForSequence(java.lang.CharSequence querySequence, jebl.evolution.sequences.SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget)
static java.lang.String
getForwardRegexForSequence(java.lang.CharSequence querySequence, jebl.evolution.sequences.SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget, boolean allowExtraGapsInTarget)
Given a nucleotide or amino acid sequence, returns a regular expression that matches forward occurrences of this sequence in a larger sequence.
static java.util.regex.Pattern
getForwardRegexPatternForSequence(java.lang.CharSequence sequence, jebl.evolution.sequences.SequenceType sequenceType, boolean interpretAmbiguitiesInQuery)
static java.util.regex.Pattern
getForwardRegexPatternForSequence(java.lang.CharSequence querySequence, jebl.evolution.sequences.SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget)
Given a nucleotide or amino acid sequence string, returns a regular expression pattern that matches forward occurrences of this search string in a larger sequence string.
static java.lang.Integer
getIndexBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, int index, boolean mapToOriginal)
Gets the extraction annotations from the sequence document and maps a residue index to a residue index on either the original sequence or the result sequence, depending on the value of mapToOriginal.
static SequenceAnnotationInterval
getIntervalBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, SequenceAnnotationInterval interval, boolean mapToOriginal)
Gets the extraction annotations from the sequence document and maps the interval to either the original sequence or the result sequence, depending on the value of mapToOriginal.
static int
getLeadingGapsLength(java.lang.CharSequence charSequence)
Returns the start index of the non-gap regions in the specified charSequence.
static java.lang.String
getMaximalAmbiguitySymbol(jebl.evolution.sequences.SequenceType sequenceType)
Gets the code for the state in this sequence type which represents a base/residue that is completely unknown.
static long
getNumberOfSequences(AnnotatedPluginDocument document, SequenceDocument.Alphabet alphabet)
Gets the total number of nucleotide or amino acid sequences contained in the given document which may be an individual sequence, sequence list, or alignment/contig.
static long
getNumberOfSequences(java.util.List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet)
Gets the total number of nucleotide or amino acid sequences contained in the given documents which may be individual sequences, sequence lists, or alignments/contigs.
static int
getOriginalIndex(SequenceDocument sequence, int index)
Gets the original numbering of the given index if it is covered by a SequenceAnnotation.TYPE_EXTRACTED_REGION annotation.
static java.lang.Iterable<SequenceAnnotation>
getSequenceAndTrackAnnotations(SequenceDocument sequence)
A convenience method to get all annotations on the sequence and all annotations on all SequenceTracks on this sequence.
static java.util.List<SequenceAnnotation>
getSequenceAnnotationsIncludingImmutableSequencesTrims(SequenceDocument sequence)
Gets all the annotations on the given sequence.
static java.util.List<? extends SequenceDocument>
getSequences(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet, jebl.util.ProgressListener progressListener)
Gets all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments.
static java.util.List<? extends SequenceDocument>
getSequences(java.util.List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet, jebl.util.ProgressListener progressListener)
Gets all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments.
static java.util.Collection<? extends SequenceDocument>
getSequencesWithoutImmediateLoading(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet)
Like getSequences(com.biomatters.geneious.publicapi.documents.AnnotatedPluginDocument[], com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument.Alphabet, jebl.util.ProgressListener) but doesn't require each plugin document to be in memory as long as this Collection is around.
static java.util.List<jebl.evolution.sequences.SequenceType>
getSequenceType(AnnotatedPluginDocument document)
Examines a document and determines what the (jebl) sequence type (or types) of the document is (or are), and returns it (or them). Always returns a List<SequenceType> of size 0, 1 or 2.
static jebl.evolution.sequences.SequenceType
getSequenceType(SequenceDocument sequence)
Get the (jebl) sequence type.
static jebl.evolution.sequences.SequenceType
getSequenceType(SequenceDocument.Alphabet alphabet)
Gets a jebl library SequenceType that is equivalent to a Geneious alphabet.
static int
getTrailingGapsLength(java.lang.CharSequence charSequence)
Get the number of trailing gap ('-') characters in the sequence.
static int
getTrailingGapsStartIndex(java.lang.CharSequence charSequence)
Returns the end index of the non-gap regions in the specified charSequence.
static java.lang.CharSequence
getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps)
Replace any invalid bases/residues in the given sequence document with ambiguity symbols.
static java.lang.CharSequence
getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps, boolean replaceWithGaps)
Replace any invalid bases/residues in the given sequence document with ambiguity symbols or gaps.
static boolean
isPredominantlyRna(java.lang.CharSequence charSequence, int maximumNonGapsToLookAt)
Checks whether a sequence is predominantly RNA (rather than DNA).
static boolean
isRna(java.lang.CharSequence charSequence)
Checks whether a sequence is RNA (rather than DNA) based on whether the sequence contains either a T/t or a U/u first.
static boolean
isRna(java.lang.CharSequence charSequence, int maxNucleotidesToCheck)
Checks whether a sequence is RNA (rather than DNA) based on whether the sequence contains either a T/t or a U/u first.
static boolean
isStateAssignableFrom(jebl.evolution.sequences.State stateA, jebl.evolution.sequences.State stateB)
Same as stateA.getCanonicalStates().containsAll(stateB.getCanonicalStates()) except that for NucleotideStates and AminoAcidStates it caches the result.
static java.lang.CharSequence
removeGaps(java.lang.CharSequence charSequence)
Constructs a sequence without gaps ('-') from a specified sequence that potentially has gaps.
static java.lang.CharSequence
removeInvalidResidues(java.lang.CharSequence sequence, jebl.evolution.sequences.SequenceType sequenceType, boolean allowGaps)
Get a sequence string identical to sequence except that any invalid residues are removed.
static java.lang.String
replaceQuestionMarksWithMaximalAmbiguitySymbol(jebl.evolution.sequences.SequenceType sequenceType, java.lang.String sequence)
Gets a version of a sequence string with any question marks replaced with N (for nucleotide sequences) or X (for protein sequences).
static java.lang.CharSequence
reverseComplement(java.lang.CharSequence charSequence)
Provides a dynamic reverse complement view onto a nucleotide CharSequence.
static java.lang.CharSequence
reverseComplementAsDna(java.lang.CharSequence charSequence)
Similar to reverseComplement except that the result will be returned as DNA even if the input sequence is RNA.
static void
setOriginalResidueNumbering(EditableSequenceDocument document, int startIndex, boolean isReverse)
Sets the original residue numbering of a document. The residue index of a document will appear shifted if the user has "show original residue numbers" selected in the sequence view.
static java.lang.String
toHTMLFragment(SequenceDocument sequence, java.lang.String additionalContent)
Generate a HTML fragment that summarises a sequence, including the sequence string.
-
-
-
Method Detail
-
getForwardRegexForSequence
public static java.lang.String getForwardRegexForSequence(java.lang.CharSequence querySequence, jebl.evolution.sequences.SequenceType sequenceType, boolean interpretAmbiguitiesInQuery)
-
getForwardRegexForSequence
@Deprecated public static java.lang.String getForwardRegexForSequence(java.lang.CharSequence querySequence, jebl.evolution.sequences.SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget)
Deprecated. Equivalent to getForwardRegexForSequence(..., true)
-
getForwardRegexForSequence
public static java.lang.String getForwardRegexForSequence(java.lang.CharSequence querySequence, jebl.evolution.sequences.SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget, boolean allowExtraGapsInTarget)
Given a nucleotide or amino acid sequence, returns a regular expression that matches forward occurrences of this sequence in a larger sequence, i.e. a String s such that Pattern.compile(s, Pattern.CASE_INSENSITIVE) will find all case-insensitive forward matches of querySequence in a larger sequence. The regular expression returned will also match sequences with gaps inserted at any point within the sequence.
Parameters:
querySequence - The nucleotide or amino acid sequence to search for.
sequenceType - The type of the sequence.
interpretAmbiguitiesInQuery - If true, then an ambiguous character (e.g. R for nucleotides) in querySequence will match the corresponding canonical states (A and G) in the target.
interpretAmbiguitiesInTarget - If true, then an ambiguous character (e.g. R for nucleotides) in the sequence being searched within will match the corresponding canonical states (A and G) in the querySequence.
allowExtraGapsInTarget - If true, then additional gaps will be allowed in the sequence being searched within.
Returns:
a regular expression that matches forward occurrences of querySequence in a larger sequence string, or null if any of the characters in querySequence are not valid residues for sequenceType
Since:
API 4.610 (Geneious 6.1.0)
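To make the ambiguity and gap handling concrete, here is a minimal plain-Java sketch of how such a regular expression could be assembled. It is not the Geneious implementation: the class name ForwardRegexSketch, the reduced ambiguity table, and the exact regex shape are assumptions made for illustration only.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a simplified forward-regex builder for nucleotide queries. */
final class ForwardRegexSketch {
    // Assumption: a small subset of the IUPAC nucleotide codes is enough for the example.
    private static final Map<Character, String> CANONICAL = Map.of(
            'A', "A", 'C', "C", 'G', "G", 'T', "T",
            'R', "AG", 'Y', "CT", 'N', "ACGT");

    /** Returns a regex for the query, or null if a character is not in the table above. */
    static String forwardRegex(CharSequence query, boolean interpretAmbiguitiesInQuery) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < query.length(); i++) {
            char c = Character.toUpperCase(query.charAt(i));
            String states = CANONICAL.get(c);
            if (states == null) {
                return null; // mirrors the documented "null for invalid residues" contract
            }
            if (i > 0) {
                regex.append("-*"); // allow gaps between consecutive residues
            }
            // With ambiguity interpretation, R matches R itself plus its canonical states A and G.
            String allowed = interpretAmbiguitiesInQuery && states.length() > 1
                    ? c + states
                    : String.valueOf(c);
            regex.append('[').append(allowed).append(']');
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile(forwardRegex("ARC", true), Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher("ttA-g--Ctt");
        System.out.println(m.find() ? m.group() : "no match"); // prints "A-g--C"
    }
}
```

As in the documented contract, the returned string is meant to be compiled with Pattern.CASE_INSENSITIVE by the caller.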
-
getForwardRegexPatternForSequence
public static java.util.regex.Pattern getForwardRegexPatternForSequence(java.lang.CharSequence sequence, jebl.evolution.sequences.SequenceType sequenceType, boolean interpretAmbiguitiesInQuery)
-
getForwardRegexPatternForSequence
public static java.util.regex.Pattern getForwardRegexPatternForSequence(java.lang.CharSequence querySequence, jebl.evolution.sequences.SequenceType sequenceType, boolean interpretAmbiguitiesInQuery, boolean interpretAmbiguitiesInTarget)
Given a nucleotide or amino acid sequence string, returns a regular expression pattern that matches forward occurrences of this search string in a larger sequence string.
Parameters:
querySequence - The nucleotide or amino acid sequence to search for.
sequenceType - The type of the sequence.
interpretAmbiguitiesInQuery - If true, then an ambiguous character (e.g. R for nucleotides) in querySequence will match the corresponding canonical states (A and G) in the target.
Returns:
a regular expression pattern that matches forward occurrences of querySequence in a larger sequence string, or null if any of the characters in querySequence are not valid residues for sequenceType
-
isStateAssignableFrom
public static boolean isStateAssignableFrom(jebl.evolution.sequences.State stateA, jebl.evolution.sequences.State stateB)
Same as stateA.getCanonicalStates().containsAll(stateB.getCanonicalStates()) except that for NucleotideStates and AminoAcidStates it caches the result.
Parameters:
stateA - A state (e.g. a NucleotideState or AminoAcidState)
stateB - A state of the same type as stateA
Returns:
true if stateA.getCanonicalStates().containsAll(stateB.getCanonicalStates())
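The containsAll relation can be illustrated without the jebl State classes. The sketch below models states as plain characters with a hand-written table of canonical states; the class StateAssignableSketch and its table are hypothetical stand-ins, and the caching mentioned above is omitted.

```java
import java.util.Map;
import java.util.Set;

/** Illustrative only: "assignable from" as a subset test on canonical states. */
final class StateAssignableSketch {
    // Assumption: a few IUPAC nucleotide codes mapped to their canonical states.
    private static final Map<Character, Set<Character>> CANONICAL = Map.of(
            'A', Set.of('A'),
            'C', Set.of('C'),
            'G', Set.of('G'),
            'R', Set.of('A', 'G'),
            'N', Set.of('A', 'C', 'G', 'T'));

    /** True if every canonical state of stateB is also a canonical state of stateA. */
    static boolean isAssignableFrom(char stateA, char stateB) {
        // Throws NullPointerException for codes missing from the table; fine for a sketch.
        return CANONICAL.get(stateA).containsAll(CANONICAL.get(stateB));
    }
}
```

Under this model R is assignable from A (A is one of R's canonical states), but A is not assignable from R.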
-
createSequenceDocument
public static DefaultSequenceDocument createSequenceDocument(jebl.evolution.sequences.SequenceType sequenceType, java.lang.String name, java.lang.String description, java.lang.CharSequence sequenceString, java.util.Date creationDate)
Creates a DefaultNucleotideSequence or DefaultAminoAcidSequence depending on sequenceType. See the documentation of these classes' constructors for the semantics of the parameters.
-
setOriginalResidueNumbering
public static void setOriginalResidueNumbering(EditableSequenceDocument document, int startIndex, boolean isReverse)
Sets the original residue numbering of a document. The residue index of a document will appear shifted if the user has "show original residue numbers" selected in the sequence view.
Parameters:
document - document to set the residue numbering for
startIndex - start index for the residue numbering. The first original residue is residue 1.
isReverse - true if the residue numbering should count down from startIndex, false if it should count up.
-
containsInvalidResidues
public static java.lang.String containsInvalidResidues(SequenceDocument sequenceDocument, boolean allowGaps, boolean fastIncompleteCheck)
Checks if a sequence contains invalid sequence residues.
Parameters:
sequenceDocument - the sequence to check for validity.
allowGaps - true if the sequence is allowed to contain gaps
fastIncompleteCheck - This parameter is ignored. It was added when Java 5 was widely used, which is 10 times slower than Java 6. Checking enormous sequences is slow (a 2GB sequence takes about 20 seconds in Java 5, 2 seconds in Java 6). Set this parameter to true to check only the first and last 1,000,000 residues, which catches almost all invalid cases and is much faster on enormous sequences.
Returns:
null if the sequence residues are all valid; if the sequence contains invalid residues, a message describing the first invalid residue is returned.
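The return contract (null for a valid sequence, otherwise a message about the first offending residue) is easy to misread as a boolean. A minimal plain-Java sketch of that contract follows; the class InvalidResidueSketch, the accepted character set, and the message wording are assumptions, not what Geneious actually produces.

```java
/** Illustrative only: "null means valid" validation for nucleotide residues. */
final class InvalidResidueSketch {
    // Assumption: the IUPAC nucleotide codes (plus U) count as valid residues.
    private static final String NUCLEOTIDES = "ACGTUNRYSWKMBDHV";

    /** Returns null if all residues are valid, else a message about the first invalid one. */
    static String firstInvalidResidue(CharSequence residues, boolean allowGaps) {
        for (int i = 0; i < residues.length(); i++) {
            char c = residues.charAt(i);
            boolean valid = NUCLEOTIDES.indexOf(Character.toUpperCase(c)) >= 0
                    || (allowGaps && c == '-');
            if (!valid) {
                // Hypothetical message format; positions are reported 1-based.
                return "Invalid residue '" + c + "' at position " + (i + 1);
            }
        }
        return null;
    }
}
```

Callers therefore test the result against null rather than treating it as a flag.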
-
getSequenceType
public static jebl.evolution.sequences.SequenceType getSequenceType(SequenceDocument.Alphabet alphabet)
Gets a jebl library SequenceType that is equivalent to a Geneious alphabet.
Parameters:
alphabet - the Geneious alphabet
Returns:
sequence type that is equivalent to this alphabet
-
getAlphabet
public static SequenceDocument.Alphabet getAlphabet(jebl.evolution.sequences.SequenceType sequenceType)
Gets a Geneious alphabet type that is equivalent to a jebl library SequenceType.
Parameters:
sequenceType - the jebl sequence type
Returns:
alphabet that is equivalent to this sequence type
-
containsInvalidResidues
public static java.lang.String containsInvalidResidues(java.lang.CharSequence sequenceResidues, SequenceDocument.Alphabet alphabet, boolean allowGaps, boolean fastIncompleteCheck)
Checks if a sequence contains invalid sequence residues.
Parameters:
sequenceResidues - sequence residues to check for validity.
alphabet - the alphabet of residues expected to be in sequenceResidues
allowGaps - true if the sequence is allowed to contain gaps
fastIncompleteCheck - This parameter is ignored. It was added when Java 5 was widely used, which is 10 times slower than Java 6. Checking enormous sequences is slow (a 2GB sequence takes about 20 seconds in Java 5, 2 seconds in Java 6). Set this parameter to true to check only the first and last 1,000,000 residues, which catches almost all invalid cases and is much faster on enormous sequences.
Returns:
null if the sequence residues are all valid; if the sequence contains invalid residues, a message describing the first invalid residue is returned.
-
containsInvalidResidues
public static java.lang.String containsInvalidResidues(java.lang.CharSequence sequenceResidues, jebl.evolution.sequences.SequenceType sequenceType, boolean allowGaps, boolean fastIncompleteCheck)
Checks if a sequence contains invalid sequence residues.
Parameters:
sequenceResidues - sequence residues to check for validity.
sequenceType - the type of residues expected to be in sequenceResidues
allowGaps - true if the sequence is allowed to contain gaps
fastIncompleteCheck - This parameter is ignored. It was added when Java 5 was widely used, which is 10 times slower than Java 6. Checking enormous sequences is slow (a 2GB sequence takes about 20 seconds in Java 5, 2 seconds in Java 6). Set this parameter to true to check only the first and last 1,000,000 residues, which catches almost all invalid cases and is much faster on enormous sequences.
Returns:
null if the sequence residues are all valid; if the sequence contains invalid residues, a message describing the first invalid residue is returned.
-
removeInvalidResidues
public static java.lang.CharSequence removeInvalidResidues(java.lang.CharSequence sequence, jebl.evolution.sequences.SequenceType sequenceType, boolean allowGaps)
Get a sequence string identical to sequence except that any invalid residues are removed. Gaps are only removed if allowGaps is false. All valid characters remain unchanged (they maintain their original case and there are no U->T replacements for nucleotides).
Parameters:
sequence - a string of residues that may or may not be valid residues
sequenceType - the type of residues in sequence
allowGaps - if this is true, then gaps are not removed.
Returns:
a sequence string identical to sequence except that any invalid residues are removed. If there are no invalid residues, sequence is returned.
Throws:
java.lang.OutOfMemoryError - if a sequence contains invalid residues and a valid version of the sequence cannot fit in memory
-
getValidSequence
public static java.lang.CharSequence getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps)
Replace any invalid bases/residues in the given sequence document with ambiguity symbols.
Parameters:
sequenceDocument - sequence document to replace the invalid bases in
allowGaps - whether gaps are allowed (if false they will be replaced with ambiguity symbols)
Returns:
the version of the sequence string with the invalid bases/residues replaced with ambiguity symbols
-
getValidSequence
public static java.lang.CharSequence getValidSequence(SequenceDocument sequenceDocument, boolean allowGaps, boolean replaceWithGaps)
Replace any invalid bases/residues in the given sequence document with ambiguity symbols or gaps.
Parameters:
sequenceDocument - sequence document to replace the invalid bases in
allowGaps - whether gaps are allowed (if false they will be replaced with ambiguity symbols)
replaceWithGaps - whether invalid bases/residues should be replaced with gaps; this should only be done if the sequence is in an alignment
Returns:
the version of the sequence string with the invalid bases/residues replaced with ambiguity symbols or gaps
Throws:
java.lang.IllegalArgumentException - if allowGaps is false but replaceWithGaps is true
Since:
API 4.20 (Geneious 5.2.0)
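The interaction between allowGaps and replaceWithGaps can be sketched in plain Java as below. This operates on a raw CharSequence rather than a SequenceDocument, handles nucleotides only, and uses 'N' as the ambiguity symbol; the class ValidSequenceSketch and those choices are assumptions for illustration, not the library's behaviour in every detail.

```java
/** Illustrative only: replace invalid nucleotide residues with 'N' or '-'. */
final class ValidSequenceSketch {
    // Assumption: the IUPAC nucleotide codes (plus U) count as valid residues.
    private static final String NUCLEOTIDES = "ACGTUNRYSWKMBDHV";

    static String validNucleotides(CharSequence sequence, boolean allowGaps, boolean replaceWithGaps) {
        if (!allowGaps && replaceWithGaps) {
            // Same precondition as documented: gaps cannot be inserted if gaps are not allowed.
            throw new IllegalArgumentException("replaceWithGaps requires allowGaps");
        }
        char replacement = replaceWithGaps ? '-' : 'N';
        StringBuilder result = new StringBuilder(sequence.length());
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            boolean valid = NUCLEOTIDES.indexOf(Character.toUpperCase(c)) >= 0
                    || (allowGaps && c == '-');
            result.append(valid ? c : replacement); // valid characters keep their case
        }
        return result.toString();
    }
}
```

Replacing with gaps keeps the sequence length unchanged, which is why it is only sensible for sequences inside an alignment.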
-
asRna
public static java.lang.CharSequence asRna(java.lang.CharSequence nucleotideCharSequence)
Views an underlying (nucleotide) CharSequence as RNA by dynamically translating 'T's to 'U's and 't's to 'u's. It is guaranteed that if nucleotideCharSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
Parameters:
nucleotideCharSequence - A nucleotide sequence which may already have some RNA residues; it is not guaranteed that it is checked whether the sequence contains invalid residues. Must not be null.
Returns:
A CharSequence with the same sequence of characters as nucleotideCharSequence, except that 'T's are replaced with 'U's and 't's are replaced with 'u's
-
asDna
public static java.lang.CharSequence asDna(java.lang.CharSequence nucleotideCharSequence)
Views an underlying (nucleotide) CharSequence as DNA by dynamically translating 'U's to 'T's and 'u's to 't's. It is guaranteed that if nucleotideCharSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
Parameters:
nucleotideCharSequence - A nucleotide sequence which may already have some DNA residues; it is not guaranteed that it is checked whether the sequence contains invalid residues. Must not be null.
Returns:
A CharSequence with the same sequence of characters as nucleotideCharSequence, except that 'U's are replaced with 'T's and 'u's are replaced with 't's
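"Views ... by dynamically translating" means no copy of the residues is made; each character is converted when it is read. A minimal sketch of such a view for the RNA direction is shown below. RnaViewSketch is a hypothetical class written for this illustration and does not reproduce the SequenceCharSequence guarantees described above.

```java
import java.util.Objects;

/** Illustrative only: a read-through RNA view over a nucleotide CharSequence. */
final class RnaViewSketch implements CharSequence {
    private final CharSequence source;

    RnaViewSketch(CharSequence source) {
        this.source = Objects.requireNonNull(source, "source");
    }

    @Override
    public int length() {
        return source.length();
    }

    @Override
    public char charAt(int index) {
        char c = source.charAt(index);
        if (c == 'T') return 'U'; // translated on access, nothing is copied
        if (c == 't') return 'u';
        return c;                 // everything else, including gaps, passes through
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new RnaViewSketch(source.subSequence(start, end));
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(length());
        for (int i = 0; i < length(); i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }
}
```

A DNA view would be the mirror image, mapping 'U' to 'T' and 'u' to 't' in charAt.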
-
asTranslation
@Deprecated public static java.lang.CharSequence asTranslation(java.lang.CharSequence nucleotideCharSequence, jebl.evolution.sequences.GeneticCode geneticCode)
Deprecated.
-
asTranslation
public static java.lang.CharSequence asTranslation(java.lang.CharSequence nucleotideCharSequence, jebl.evolution.sequences.GeneticCode geneticCode, boolean translateFirstCodonUsingFirstCodonTable)
Views an underlying (nucleotide) CharSequence as its translation. If the length of the CharSequence is not a multiple of 3, the extra 1 or 2 characters are ignored. The translated sequence will have length nucleotideCharSequence.length()/3. If the nucleotide sequence contains unknown nucleotide characters, these are treated as unknown states and the corresponding translated site will also be the unknown state (?) unless the nucleotide base would not affect the translation (e.g. the 3rd base in some triplets). The concrete type of the return value is not guaranteed. The specified nucleotideCharSequence must not change after it has been passed to this method, but it is not guaranteed that violations of this contract will be detected.
Parameters:
nucleotideCharSequence - A nucleotide sequence which may be DNA, RNA or a mixture. Must not be null and must be immutable. Must not contain gaps.
geneticCode - the genetic code to use for the translation. Must not be null.
translateFirstCodonUsingFirstCodonTable - each genetic code specifies a set of codons which get translated as M if they are the first codon even though they normally wouldn't translate as an M when occurring elsewhere in a coding region. If this parameter is true the first codon will be translated using this alternative translation table for the genetic code.
Returns:
A CharSequence which is a translation of nucleotideCharSequence
Throws:
java.lang.IllegalArgumentException - if nucleotideCharSequence contains gaps.
java.lang.NullPointerException - if nucleotideCharSequence or geneticCode is null.
Since:
API 4.41 (Geneious 5.4.1)
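The length rule and the treatment of unknown bases can be illustrated with a small eager translator. This sketch hard-codes the standard genetic code instead of taking a jebl GeneticCode, ignores the first-codon table, treats every character other than A, C, G, T and U as fully unknown, and returns a String rather than a view; TranslationSketch and those simplifications are assumptions for illustration.

```java
/** Illustrative only: eager translation with the standard genetic code. */
final class TranslationSketch {
    private static final String BASES = "TCAG";
    // Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
    private static final String STANDARD_CODE =
            "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";
    private static final int[] ANY_BASE = {0, 1, 2, 3};

    /** Candidate base indices for one character: a single index if known, all four otherwise. */
    private static int[] options(char c) {
        if (c == '-') {
            throw new IllegalArgumentException("gaps are not allowed");
        }
        char upper = Character.toUpperCase(c);
        int index = BASES.indexOf(upper == 'U' ? 'T' : upper);
        return index >= 0 ? new int[] {index} : ANY_BASE;
    }

    static String translate(CharSequence nucleotides) {
        int codons = nucleotides.length() / 3; // trailing 1 or 2 characters are ignored
        StringBuilder protein = new StringBuilder(codons);
        for (int i = 0; i < codons; i++) {
            char aminoAcid = 0;
            boolean consistent = true;
            for (int b1 : options(nucleotides.charAt(3 * i))) {
                for (int b2 : options(nucleotides.charAt(3 * i + 1))) {
                    for (int b3 : options(nucleotides.charAt(3 * i + 2))) {
                        char candidate = STANDARD_CODE.charAt(16 * b1 + 4 * b2 + b3);
                        if (aminoAcid == 0) {
                            aminoAcid = candidate;
                        } else if (aminoAcid != candidate) {
                            consistent = false;
                        }
                    }
                }
            }
            // '?' only when the unknown base actually changes the translation.
            protein.append(consistent ? aminoAcid : '?');
        }
        return protein.toString();
    }
}
```

For example, GCN still translates to A because all four GC* codons encode alanine, whereas NNN yields the unknown state.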
-
reverseComplement
public static java.lang.CharSequence reverseComplement(java.lang.CharSequence charSequence)
Provides a dynamic reverse complement view onto a nucleotide CharSequence. For performance, it is not guaranteed whether the charSequence will be checked for invalid residues. If an invalid nucleotide CharSequence is passed in, arbitrary nondeterministic behaviour may occur at any later time, e.g. unchecked exceptions thrown from CharSequence.charAt(int).
It is guaranteed that if charSequence instanceof SequenceCharSequence, the returned value will also be instanceof SequenceCharSequence (but it may not support log-time modifications).
Attention: Unlike Utils.reverseComplement(String), this method preserves case and doesn't remove gaps. To remove gaps and convert the sequence to upper case, use removeGaps(CharSequence) and CharSequenceUtilities.asUpperCase(CharSequence).
This method may be slow on sequences which do not contain a T or U near the start of the sequence as it needs to scan through the sequence to determine if it is RNA or DNA. Consider using reverseComplementAsDna(CharSequence) for better performance.
Parameters:
charSequence - The charSequence for which to construct a reverse complement view.
Returns:
A reverse complement view onto charSequence, with case and gaps preserved.
See Also:
reverseComplementAsDna(CharSequence)
-
reverseComplementAsDna
public static java.lang.CharSequence reverseComplementAsDna(java.lang.CharSequence charSequence)
Similar to reverseComplement except that the result will be returned as DNA even if the input sequence is RNA. This implementation is more efficient than reverseComplement(CharSequence) because it does not need to check the input sequence data type.
Parameters:
charSequence - The charSequence for which to construct a reverse complement view.
Returns:
A reverse complement view onto charSequence, with case and gaps preserved, but RNA converted to DNA
Since:
API 4.202100 (Geneious 2021.0.0)
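The "case and gaps preserved, RNA converted to DNA" behaviour can be sketched in plain Java as follows. Unlike the documented method, this version builds a new String eagerly instead of returning a view; ReverseComplementSketch and its complement table are assumptions for illustration.

```java
/** Illustrative only: eager reverse complement that always answers in DNA. */
final class ReverseComplementSketch {
    // Upper-case IUPAC codes and their complements; U is complemented to A like T.
    private static final String FROM = "ACGTURYKMBVDH";
    private static final String TO   = "TGCAAYRMKVBHD";

    static String reverseComplementAsDna(CharSequence sequence) {
        StringBuilder result = new StringBuilder(sequence.length());
        for (int i = sequence.length() - 1; i >= 0; i--) {
            char c = sequence.charAt(i);
            int index = FROM.indexOf(Character.toUpperCase(c));
            if (index < 0) {
                result.append(c); // gaps and self-complementary codes (S, W, N) pass through
            } else {
                char complement = TO.charAt(index);
                // Preserve the case of the input residue.
                result.append(Character.isLowerCase(c) ? Character.toLowerCase(complement) : complement);
            }
        }
        return result.toString();
    }
}
```

Because A always complements to T here, no scan for T or U is needed to decide between DNA and RNA output, which matches the efficiency argument given above.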
-
isPredominantlyRna
public static boolean isPredominantlyRna(java.lang.CharSequence charSequence, int maximumNonGapsToLookAt)
Checks whether a sequence is predominantly RNA (rather than DNA). Same as Utils.isPredominantlyRNA(CharSequence, int), but more efficient on SequenceCharSequences.
Parameters:
charSequence - A charSequence that may contain DNA or RNA characters.
maximumNonGapsToLookAt - Maximum number of non-gap residues to look at before making a decision; pass in Integer.MAX_VALUE to look at all residues
Returns:
true if the non-gap residues of charSequence are predominantly RNA
-
isRna
public static boolean isRna(java.lang.CharSequence charSequence)
Checks whether a sequence is RNA (rather than DNA) based on whether the sequence contains either a T/t or a U/u first. If it contains neither T/t nor U/u, this method returns false.- Parameters:
charSequence
- A charSequence that may contain DNA or RNA characters.- Returns:
- true if this charSequence is RNA (rather than DNA)
- See Also:
isPredominantlyRna(CharSequence, int)
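The "whichever of T or U comes first" rule can be sketched as a single left-to-right scan. The code below is a hypothetical stand-alone illustration of the documented behaviour, not the library's implementation. It also shows why detection is slow on sequences with no T or U near the start: the loop must run until it finds one or reaches the end.

```java
// Hypothetical illustration only; not part of the Geneious public API.
final class IsRnaSketch {
    private IsRnaSketch() {}

    // Returns true if a U/u is encountered before any T/t.
    // Returns false if a T/t comes first, or if neither occurs at all.
    static boolean isRna(CharSequence charSequence) {
        for (int i = 0; i < charSequence.length(); i++) {
            char c = charSequence.charAt(i);
            if (c == 'U' || c == 'u') {
                return true;
            }
            if (c == 'T' || c == 't') {
                return false;
            }
        }
        return false; // no T or U found: treated as DNA
    }
}
```

With this sketch, `"ACGU"` is classified as RNA, while `"ACGTU"` and `"ACG"` are both classified as DNA.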
-
isRna
public static boolean isRna(java.lang.CharSequence charSequence, int maxNucleotidesToCheck)
Checks whether a sequence is RNA (rather than DNA) based on whether the sequence contains either a T/t or a U/u first. If it contains neither T/t nor U/u, this method returns false.- Parameters:
charSequence
- A charSequence that may contain DNA or RNA characters.maxNucleotidesToCheck
- the maximum number of nucleotides/gaps to check (excluding leading/trailing gaps) before giving up and calling it DNA if no T or U is found.- Returns:
- true if this charSequence is RNA (rather than DNA)
- Since:
- API 4.202200 (Geneious 2022.0.0)
- See Also:
isPredominantlyRna(CharSequence, int)
-
removeGaps
public static java.lang.CharSequence removeGaps(java.lang.CharSequence charSequence)
Constructs a sequence without gaps ('-') from a specified sequence that potentially has gaps. If the specified sequence does contain gaps, then a gapless copy is returned. Otherwise, the original charSequence is returned. It is guaranteed that CharSequenceUtilities.equals(removeGaps(cs), cs.toString().replace("-", "")) for any CharSequence cs that fulfills its contract.- Parameters:
charSequence
- A nucleotide or amino acid sequence, potentially with gaps ('-')- Returns:
- A CharSequence that contains the same sequence of characters but without the gaps ('-'). Returns charSequence if charSequence doesn't contain any gaps.
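A minimal sketch of the contract described above, assuming nothing about the real implementation: the result has the same characters as `cs.toString().replace("-", "")`, and an input without gaps is returned as the same instance. The class name is hypothetical.

```java
// Hypothetical illustration only; not part of the Geneious public API.
final class RemoveGapsSketch {
    private RemoveGapsSketch() {}

    static CharSequence removeGaps(CharSequence charSequence) {
        // Find the first gap; if there is none, hand back the original instance.
        int firstGap = -1;
        for (int i = 0; i < charSequence.length(); i++) {
            if (charSequence.charAt(i) == '-') {
                firstGap = i;
                break;
            }
        }
        if (firstGap < 0) {
            return charSequence;
        }
        // Copy the gap-free prefix in one go, then filter the remainder.
        StringBuilder result = new StringBuilder(charSequence.length());
        result.append(charSequence, 0, firstGap);
        for (int i = firstGap + 1; i < charSequence.length(); i++) {
            char c = charSequence.charAt(i);
            if (c != '-') {
                result.append(c);
            }
        }
        return result.toString();
    }
}
```

Returning the original instance when there is nothing to remove avoids an unnecessary copy, which matters for long sequences.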
-
getLeadingGapsLength
public static int getLeadingGapsLength(java.lang.CharSequence charSequence)
Returns the start index of the non-gap regions in the specified charSequence, i.e. the length of the longest prefix of charSequence that contains only '-' characters.
- Parameters:
charSequence
- A CharSequence that may contain some leading gap characters '-'- Returns:
- The length of the longest prefix of charSequence that contains only '-' characters.
-
getTrailingGapsLength
public static int getTrailingGapsLength(java.lang.CharSequence charSequence)
Get the number of trailing gap ('-') characters in the sequence.- Parameters:
charSequence
- A CharSequence that may have trailing gap characters.- Returns:
- the number of trailing gap ('-') characters in the sequence or 0 if the sequence is entirely gaps.
-
getTrailingGapsStartIndex
public static int getTrailingGapsStartIndex(java.lang.CharSequence charSequence)
Returns the end index of the non-gap regions in the specified charSequence. This is identical to charSequence.length() minus the length of the longest suffix of charSequence that consists only of '-', except when charSequence consists only of '-', in which case this method returns charSequence.length() because there are no non-gap regions. In other words, in a sequence that consists only of gaps, all gaps are considered leading rather than trailing gaps, i.e. the non-gap region is considered to start just beyond the end of the sequence.- Parameters:
charSequence
- A CharSequence that may contain some trailing gap characters '-'
- Returns:
- 1+the index of the last nongap character in charSequence, or charSequence.length() if charSequence consists only of gaps
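The relationship between the three gap methods, including the all-gaps edge case, can be sketched as follows. The names are hypothetical and the code only mirrors the documented behaviour; it is not the library source.

```java
// Hypothetical illustration only; not part of the Geneious public API.
final class GapBoundsSketch {
    private GapBoundsSketch() {}

    // Length of the longest prefix consisting only of '-'.
    static int leadingGapsLength(CharSequence s) {
        int i = 0;
        while (i < s.length() && s.charAt(i) == '-') {
            i++;
        }
        return i;
    }

    // 1 + index of the last non-gap character, or s.length() if s is all gaps
    // (in that case every gap counts as a leading gap).
    static int trailingGapsStartIndex(CharSequence s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == '-') {
            end--;
        }
        return end == 0 ? s.length() : end;
    }

    // Number of trailing gaps; 0 when the sequence is entirely gaps.
    static int trailingGapsLength(CharSequence s) {
        return s.length() - trailingGapsStartIndex(s);
    }
}
```

For `"-AC--"` the sketch gives a leading gap length of 1, a trailing gaps start index of 3 and a trailing gap length of 2. For `"---"` it gives 3, 3 and 0 respectively, matching the rule that an all-gap sequence has only leading gaps.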
-
getAlphabet
public static SequenceDocument.Alphabet getAlphabet(SequenceDocument sequence)
Get the Alphabet of a sequence.- Parameters:
sequence
- a SequenceDocument to get the alphabet for.- Returns:
- Alphabet of sequence
-
getSequenceType
public static jebl.evolution.sequences.SequenceType getSequenceType(SequenceDocument sequence)
Get the (jebl) sequence type.- Parameters:
sequence
- a SequenceDocument to get the sequence type of.- Returns:
- type of sequence
- Throws:
java.lang.IllegalArgumentException
- if sequence is not either a NucleotideSequenceDocument or a AminoAcidSequenceDocument.
-
getSequenceType
public static java.util.List<jebl.evolution.sequences.SequenceType> getSequenceType(AnnotatedPluginDocument document)
Examines a document and determines what the (jebl) sequence type (or types) of the document is (or are), and returns it (or them).
Always returns a List<SequenceType> of size 0, 1 or 2.
- Size 0: when the given document was a type that could have either SequenceType.AMINO_ACID or SequenceType.NUCLEOTIDE or both, and that document has no sequences at all, for example an empty SequenceListDocument.
- Size 1: when the given document just contains a single sequence or is of a type where the SequenceType is always known, e.g. a NucleotideSequenceDocument or a SequenceAlignmentDocument.
- Size 2: when the given document has sequences of both types, e.g. a SequenceListDocument with sequences of both types.
- Parameters:
document
- the document to determine the SequenceType of- Returns:
- a list containing the sequence type or types of the given document.
- Throws:
java.lang.IllegalArgumentException
- if the given document type wasn't a valid type to determine the SequenceType of.- Since:
- API 4.610 (Geneious 6.1.0)
-
getAlphabet
public static SequenceDocument.Alphabet getAlphabet(AnnotatedPluginDocument... documents)
- Parameters:
documents
- the documents to get the alphabet for- Returns:
- The alphabet that all these documents have in common, or null if they are not all the same alphabet or if any of the documents have multiple alphabets
- Throws:
java.lang.IllegalArgumentException
- if any of the documents aren't a type of sequence (nucleotide, protein, sequence list or alignment)- Since:
- API 4.1010 (Geneious 10.1.0)
-
toHTMLFragment
public static java.lang.String toHTMLFragment(SequenceDocument sequence, java.lang.String additionalContent)
Generate a HTML fragment that summarises a sequence, including the sequence string. If the sequence is longer than a certain threshold X, then only the first X residues are shown.- Parameters:
sequence
- a SequenceDocumentadditionalContent
- additional content to include- Returns:
- the html formatted summary
-
asJeblSequence
public static jebl.evolution.sequences.Sequence asJeblSequence(SequenceDocument sequence)
Convert from a Geneious sequence to a jebl sequence.- Parameters:
sequence
- a Geneious sequence- Returns:
- sequence as a jebl sequence.
-
asJeblSequences
public static java.util.List<jebl.evolution.sequences.Sequence> asJeblSequences(java.util.List<SequenceDocument> sequences)
Convert a set of Geneious sequences to jebl sequences.- Parameters:
sequences
- Geneious sequences- Returns:
- the Geneious sequences as jebl sequences
-
asJeblSequences
public static java.util.List<jebl.evolution.sequences.Sequence> asJeblSequences(SequenceDocument... sequences)
Convert a set of Geneious sequences to jebl sequences.- Parameters:
sequences
- Geneious sequences- Returns:
- the Geneious sequences as jebl sequences
-
asJeblAlignment
public static jebl.evolution.alignments.Alignment asJeblAlignment(java.util.List<SequenceDocument> sequences)
Convert a list of (aligned) Geneious sequences to a jebl alignment.
- Parameters:
sequences
- aligned Geneious sequences
- Returns:
- the Geneious sequences as a jebl alignment
-
createSequenceCopy
public static SequenceDocument createSequenceCopy(SequenceDocument original)
Creates a copy of the original sequence if necessary. If the sequence is an immutable sequence (ImmutableSequence) then it is not copied and is just returned from this method.
- Parameters:
original
- the original sequence- Returns:
- a new sequence document or the original sequence if the original sequence is immutable.
- Since:
- API 4.11 (Geneious 5.0)
- See Also:
createSequenceCopyEditable(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument)
-
createSequenceCopyEditable
public static DefaultSequenceDocument createSequenceCopyEditable(SequenceDocument original)
Creates a copy of the original sequence that is editable.- Parameters:
original
- the original sequence- Returns:
- a new sequence document
- Since:
- API 4.11 (Geneious 5.0)
- See Also:
createSequenceCopy(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument)
-
getSequenceAnnotationsIncludingImmutableSequencesTrims
public static java.util.List<SequenceAnnotation> getSequenceAnnotationsIncludingImmutableSequencesTrims(SequenceDocument sequence)
Gets all the annotations on the given sequence. Additionally, if it is an ImmutableSequence with ImmutableSequence.getLeadingTrimLength() or ImmutableSequence.getTrailingTrimLength() > 0, then annotations are created to represent these trims.
- Parameters:
sequence
- the sequence to get annotations from- Returns:
- the annotations from the sequence
- Since:
- API 4.52 (Geneious 5.5.2)
-
asJeblSequence
public static jebl.evolution.sequences.Sequence asJeblSequence(SequenceAlignmentDocument.ReferencedSequence referencedSequence, SequenceDocument sequence) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
Convert from a Geneious sequence to a jebl sequence.- Parameters:
referencedSequence
- original referenced sequence to copy additional fields from. May be null.sequence
- a Geneious sequence- Returns:
- sequence as a jebl sequence
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException
- when the referenced sequence cannot be loaded- Since:
- API 4.700 (Geneious 7.0.0)
-
asJeblSequence
@Deprecated public static jebl.evolution.sequences.Sequence asJeblSequence(AnnotatedPluginDocument referenceDocument, SequenceDocument sequence)
Convert from a Geneious sequence to a jebl sequence.- Parameters:
referenceDocument
- original AnnotatedPluginDocument to copy additional fields from. May be null.sequence
- a Geneious sequence- Returns:
- sequence as a jebl sequence
- Since:
- API 4.43 (Geneious 5.4.3)
-
replaceQuestionMarksWithMaximalAmbiguitySymbol
public static java.lang.String replaceQuestionMarksWithMaximalAmbiguitySymbol(jebl.evolution.sequences.SequenceType sequenceType, java.lang.String sequence)
Gets a version of a sequence string with any question marks replaced with N (for nucleotide sequences) or X (for protein sequences).
- Parameters:
sequenceType
- sequence type of sequence
sequence
- sequence string
- Returns:
- version of sequence with any question marks replaced with N (for nucleotide sequences) or X (for protein sequences)
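As a rough stand-alone illustration of this substitution, the sketch below uses a hypothetical `Kind` enum in place of jebl's SequenceType so that it compiles without any library on the classpath. It is an assumption-laden mirror of the documented behaviour, not the actual implementation.

```java
// Hypothetical illustration only; not part of the Geneious public API.
final class AmbiguitySketch {
    private AmbiguitySketch() {}

    // Stand-in for jebl.evolution.sequences.SequenceType.
    enum Kind { NUCLEOTIDE, AMINO_ACID }

    // "N" for nucleotides, "X" for amino acids, as described in the text.
    static String maximalAmbiguitySymbol(Kind kind) {
        return kind == Kind.NUCLEOTIDE ? "N" : "X";
    }

    // Replaces every '?' with the maximal ambiguity symbol for the given kind.
    static String replaceQuestionMarks(Kind kind, String sequence) {
        return sequence.replace("?", maximalAmbiguitySymbol(kind));
    }
}
```

For example, `replaceQuestionMarks(Kind.NUCLEOTIDE, "AC?G?")` gives `"ACNGN"` and `replaceQuestionMarks(Kind.AMINO_ACID, "M?K")` gives `"MXK"`.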
-
getMaximalAmbiguitySymbol
public static java.lang.String getMaximalAmbiguitySymbol(jebl.evolution.sequences.SequenceType sequenceType)
Gets the code for the state in this sequence type which represents a base/residue that is completely unknown.
- Parameters:
sequenceType
-- Returns:
-
getAnnotationsOfType
public static java.util.List<SequenceAnnotation> getAnnotationsOfType(java.util.List<SequenceAnnotation> annotations, java.lang.String type)
Get all annotations in the list matching the given type.
- Parameters:
annotations
- annotations
type
- type of annotations to get
- Returns:
- all annotations in the list matching the given type
-
getAnnotationsOfType
@Deprecated public static java.util.List<SequenceAnnotation> getAnnotationsOfType(SequenceDocument document, java.lang.String type)
Deprecated. Get all annotations in document matching the given type.
- Parameters:
document
- document to get annotations from
type
- type of annotations to get
- Returns:
- all annotations in document matching the given type
-
getAnnotationsOfType
public static java.util.List<SequenceAnnotation> getAnnotationsOfType(SequenceDocument document, java.lang.String type, boolean returnAnnotationsInTracks)
Get all annotations in document matching the given type. WARNING: this list may not include all SequenceAnnotations represented as annotations in the sequence viewer. One such case is with trim annotations, which should be found using getSequenceAnnotationsIncludingImmutableSequencesTrims(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument).
This may also return annotations that are not visible in the sequence viewer, such as SequenceAnnotation.TYPE_EXTRACTED_REGION.
- Parameters:
document
- document to get annotations from
type
- type of annotations to get
returnAnnotationsInTracks
- true iff we want annotations from SequenceTracks as well as those annotated directly on a document
- Returns:
- all annotations in document matching the given type
- Since:
- API 4.50 (Geneious 5.5.0)
-
getSequenceAndTrackAnnotations
public static java.lang.Iterable<SequenceAnnotation> getSequenceAndTrackAnnotations(SequenceDocument sequence)
A convenience method to get all annotations on the sequence and all annotations on all SequenceTracks on this sequence. Most code should instead manually load tracks on demand since they may be too large to fit into memory. To get tracks on a sequence use SequenceTrack.getTrackManager(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument) followed by SequenceTrack.Manager.getTracks().
Iterating over the returned value may throw a RuntimeException whose cause is an XMLSerializationException if there is insufficient memory available to load the annotations. When running DocumentOperations or SequenceAnnotationGenerators, core Geneious will automatically catch such exceptions and display a nice message to the user.
- Parameters:
sequence
- the sequence to get annotations for- Returns:
- iterator containing the annotations. Will not return null.
- Since:
- API 4.50 (Geneious 5.5.0)
-
createSequenceCopyAdjustedForGapInsertion
public static SequenceDocument createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, java.lang.CharSequence gappedSequenceCharacters)
Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion. Note, the returned copy does not create gapped versions of SequenceTracks. Tracks are instead automatically propagated from referenced documents in alignments.
- Parameters:
sequenceDocument
- a sequence to insert gaps into. If the sequence already contains gaps, the gaps are removed first
gappedSequenceCharacters
- the sequence characters to appear in the new gapped sequence. The positions of gaps in this character sequence determine how annotations and chromatograms are adjusted.- Returns:
- a copy of sequenceDocument adjusted for gap insertion. This is always a DefaultSequenceDocument but this method isn't declared to return that for API backwards compatibility reasons
- See Also:
SequenceExtractionUtilities.removeGaps(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument, boolean)
-
createSequenceCopyAdjustedForGapInsertion
public static SequenceDocument createSequenceCopyAdjustedForGapInsertion(SequenceDocument sequenceDocument, java.lang.CharSequence gappedSequenceCharacters, boolean includeTracks)
Creates a copy of the given sequence with annotations, sequence residues, and chromatogram values adjusted to account for gap insertion.- Parameters:
sequenceDocument
- a sequence to insert gaps into. If the sequence already contains gaps, the gaps are removed first
gappedSequenceCharacters
- the sequence characters to appear in the new gapped sequence. The positions of gaps in this character sequence determine how annotations and chromatograms are adjusted.includeTracks
- true if tracks should also be copied. If this is intended for use with an alignment which references the original documents, this should be false as alignment documents propagate tracks on demand from referenced documents.- Returns:
- a copy of sequenceDocument adjusted for gap insertion. This is always a DefaultSequenceDocument but this method isn't declared to return that for API backwards compatibility reasons
- Since:
- API 4.202000 (Geneious 2020.0.0)
- See Also:
SequenceExtractionUtilities.removeGaps(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument, boolean)
-
concatenateSequences
public static SequenceDocument concatenateSequences(java.util.List<? extends SequenceDocument> sequences, boolean circular, int indexOfDocumentToUseForOrigin, jebl.util.ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
Concatenate a list of sequence documents. All sequences must be of the same type (all nucleotide or all amino acid). For circular results, indexOfDocumentToUseForOrigin may be used to specify which input sequence should be used to determine the origin for the result. If the specified input sequence is circular and has an annotated origin, this position will be used; otherwise, the start of the specified sequence will be the origin of the result. If circular is false, indexOfDocumentToUseForOrigin must be -1.
- Parameters:
sequences
- sequence documents to concatenatecircular
- if true, the result will be circularindexOfDocumentToUseForOrigin
- index of document to use for the origin (must be -1 for linear results)progressListener
-- Returns:
- concatenated sequence
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException
- Since:
- API 4.1100 (Geneious 11.0.0)
-
getSequences
public static java.util.List<? extends SequenceDocument> getSequences(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet, jebl.util.ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
Get all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments. For large sequence lists (SequenceListOnDisk) and genome sized sequences (those longer than SequenceDocument.GENOME_SEQUENCE_THRESHOLD) in other sequence lists, these are only loaded into memory on demand to ensure this method doesn't use excessive memory. If this method is potentially called on thousands of documents, then getSequencesWithoutImmediateLoading should be considered instead.
- Parameters:
documents
- documents to get the sequences out ofalphabet
- alphabet the sequences need to be to be includedprogressListener
- for notifying the caller about progress of this method and for cancelling.- Returns:
- all the sequences. Sequences are ordered by the AnnotatedPluginDocument they are in, and then by their index in that document.
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException
- if there is a problem getting the PluginDocument out of an AnnotatedPluginDocument or if the progress listener cancels the request.
-
getSequences
public static java.util.List<? extends SequenceDocument> getSequences(java.util.List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet, jebl.util.ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
Get all the sequences out of a set of AnnotatedPluginDocuments that may wrap SequenceDocuments, SequenceListDocuments or SequenceAlignmentDocuments. For large sequence lists (SequenceListOnDisk) and genome sized sequences (those longer than SequenceDocument.GENOME_SEQUENCE_THRESHOLD) in other sequence lists, these are only loaded into memory on demand to ensure this method doesn't use excessive memory. If this method is potentially called on thousands of documents, then getSequencesWithoutImmediateLoading should be considered instead.
- Parameters:
documents
- documents to get the sequences out ofalphabet
- alphabet the sequences need to be to be includedprogressListener
- for notifying the caller about progress of this method and for cancelling.- Returns:
- all the sequences
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException
- if there is a problem getting the PluginDocument out of an AnnotatedPluginDocument or if the progress listener cancels the request.- Since:
- API 4.700 (Geneious 7.0.0)
-
getSequencesWithoutImmediateLoading
public static java.util.Collection<? extends SequenceDocument> getSequencesWithoutImmediateLoading(AnnotatedPluginDocument[] documents, SequenceDocument.Alphabet alphabet) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
Like getSequences(com.biomatters.geneious.publicapi.documents.AnnotatedPluginDocument[], com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument.Alphabet, jebl.util.ProgressListener) but doesn't require each plugin document to be in memory as long as this Collection is around. The trade-off is that the sequences can only be accessed sequentially (hence the Collection return type of this method). Also the Collection does not support removal.
Since this Collection doesn't store the sequences immediately, DocumentOperationExceptions may be thrown down the line. Such a situation may be caught by surrounding the given iteration with try {... } catch (RuntimeDocumentOperationException e) and then handling the exception from there. Note that using getSequences(com.biomatters.geneious.publicapi.documents.AnnotatedPluginDocument[], com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument.Alphabet, jebl.util.ProgressListener) is preferable to using getSequencesWithoutImmediateLoading when dealing with under a thousand documents.
- Parameters:
documents
- documents to get the sequences out ofalphabet
- alphabet the sequences need to be to be included- Returns:
- all the sequences whose iterator may throw a RuntimeDocumentOperationException
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException
- if one or more of the documents has more than Integer.MAX_VALUE sequences.- Since:
- API 4.610 (Geneious 6.1.0)
- See Also:
RuntimeDocumentOperationException
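The try/catch pattern recommended above can be sketched without the Geneious library as follows. Everything here is a hypothetical stand-in: `LazyLoadException` plays the role of RuntimeDocumentOperationException, and `lazySequences` simulates a lazily loading collection in which a null entry represents a document that fails to load during iteration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not part of the Geneious public API.
final class LazyIterationSketch {
    private LazyIterationSketch() {}

    // Stand-in for RuntimeDocumentOperationException.
    static final class LazyLoadException extends RuntimeException {
        LazyLoadException(String message) {
            super(message);
        }
    }

    // Elements are only "loaded" when the iterator reaches them;
    // a null entry simulates a load failure at that point.
    static Iterable<String> lazySequences(final List<String> names) {
        return () -> new Iterator<String>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < names.size();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String name = names.get(next++);
                if (name == null) {
                    throw new LazyLoadException("could not load item " + (next - 1));
                }
                return name;
            }
        };
    }

    // Surrounds the whole iteration with try/catch, keeping what loaded so far.
    static List<String> collectUntilFailure(Iterable<String> sequences) {
        List<String> loaded = new ArrayList<>();
        try {
            for (String sequence : sequences) {
                loaded.add(sequence);
            }
        } catch (LazyLoadException e) {
            // A real plugin would typically report or rethrow here.
        }
        return loaded;
    }
}
```

The point of the pattern is that the failure surfaces inside the loop rather than at the call that created the collection, so the try block has to enclose the iteration itself.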
-
getOriginalIndex
public static int getOriginalIndex(SequenceDocument sequence, int index)
Gets the original numbering of the given index if it is covered by a SequenceAnnotation.TYPE_EXTRACTED_REGION annotation.
- Parameters:
sequence
- the sequence this index belongs to.index
- the index to get the original numbering for.- Returns:
- 'translated' index or the original index if no other numbering can be found.
- Since:
- API 4.900 (Geneious 9.0.0)
-
getNumberOfSequences
public static long getNumberOfSequences(java.util.List<AnnotatedPluginDocument> documents, SequenceDocument.Alphabet alphabet)
Gets the total number of nucleotide or amino acid sequences contained in the given documents which may be individual sequences, sequence lists, or alignments/contigs.- Parameters:
documents
- the documents to get the number of sequences inalphabet
- the alphabet (nucleotide or amino acid) of the sequences to count.- Returns:
- the total number of nucleotide or amino acid sequences contained in the given documents
- Since:
- API 4.40 (Geneious 5.4.0)
-
getNumberOfSequences
public static long getNumberOfSequences(AnnotatedPluginDocument document, SequenceDocument.Alphabet alphabet)
Gets the total number of nucleotide or amino acid sequences contained in the given document which may be an individual sequence, sequence list, or alignment/contig.- Parameters:
document
- the document to get the number of sequences inalphabet
- the alphabet (nucleotide or amino acid) of the sequences to count.- Returns:
- the total number of nucleotide or amino acid sequences contained in the given document
- Since:
- API 4.40 (Geneious 5.4.0)
-
generateConsensusSequence
@Deprecated public static SequenceDocument generateConsensusSequence(SequenceAlignmentDocument alignment, jebl.util.ProgressListener progressListener)
Deprecated. Generates a consensus sequence for an alignment using default consensus settings. Note that the returned sequence may contain gaps. If it is to be used as a stand-alone sequence, then SequenceExtractionUtilities.removeGaps(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument) should be used.
- Parameters:
alignment
- the alignment to generate the consensus sequence for
progressListener
- for reporting progress and cancelling.
- Returns:
- a sequence equal in length to the alignment. The sequence may contain gaps. May return null if progressListener requests cancellation.
- Since:
- API 4.60 (Geneious 5.6.0)
-
generateConsensus
public static SequenceDocument generateConsensus(SequenceAlignmentDocument alignment, jebl.util.ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
Generates a consensus sequence for an alignment using default consensus settings. Note that the returned sequence may contain gaps. If it is to be used as a stand-alone sequence, then SequenceExtractionUtilities.removeGaps(com.biomatters.geneious.publicapi.documents.sequence.SequenceDocument) should be used. To generate consensus sequences with non-default options, use PluginUtilities.getDocumentOperation("Generate_Consensus"). Note that this operation generates a sequence with gaps removed by default.
- Parameters:
alignment
- the alignment to generate the consensus sequence for
progressListener
- for reporting progress and cancelling.
- Returns:
- a sequence equal in length to the alignment. The sequence may contain gaps. Will not return null
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException
- if the consensus can't be generated because there is insufficient free memory.
com.biomatters.geneious.publicapi.plugin.DocumentOperationException.Canceled
- if the progressListener requests the consensus generation be cancelled.
- Since:
- API 4.610 (Geneious 6.1.0)
-
getBlastAlignmentText
public static java.lang.String getBlastAlignmentText(SequenceAlignmentDocument alignment, boolean geneiousFriendly)
Formats the given alignment in BLAST text format- Parameters:
alignment
- alignment to formatgeneiousFriendly
- whether to format the alignment in an html-formatted "Geneious friendly" way that is useful generally for alignments and not just for BLAST output- Returns:
- alignment represented in BLAST text format
- Since:
- API 4.700 (Geneious 7.0.0)
-
alignmentFromJeblSequences
public static DefaultAlignmentDocument alignmentFromJeblSequences(java.lang.String name, java.util.List<jebl.evolution.sequences.Sequence> jeblSequences)
Converts the given alignment of Jebl sequences into a DefaultAlignmentDocument- Parameters:
name
- name for alignmentjeblSequences
- aligned jebl sequences- Returns:
- a DefaultAlignmentDocument representing the given alignment.
- Since:
- API 4.700 (Geneious 7.0.0)
-
createNewDocumentsByTransformingSequences
public static java.util.List<AnnotatedPluginDocument> createNewDocumentsByTransformingSequences(java.util.List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, jebl.util.ProgressListener progressListener, java.lang.String newSequenceOrDocumentNamePrefix) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.- Parameters:
sourceDocuments
- the source documents containing sequences to transform. These may be SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocumentstransformer
- the transformer for transforming each sequenceprogressListener
- for reporting progress and cancelingnewSequenceOrDocumentNamePrefix
- an optional prefix to assign to the name of each newly generated document. May be an empty String to leave names unchanged.- Returns:
- the new documents
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException
- if documents can't be loaded, or if the input documents are not SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocuments- Since:
- API 4.701 (Geneious 7.0.1)
-
createNewDocumentsByTransformingSequences
public static java.util.List<AnnotatedPluginDocument> createNewDocumentsByTransformingSequences(java.util.List<AnnotatedPluginDocument> sourceDocuments, SequenceDocument.Transformer transformer, jebl.util.ProgressListener progressListener, java.lang.String newSequenceOrDocumentNamePrefix, java.lang.String newSequenceOrDocumentNameSuffix) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException
Transforms the sequence(s) in each input document and returns a new document corresponding to each input document.- Parameters:
sourceDocuments
- the source documents containing sequences to transform. These may be SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocumentstransformer
- the transformer for transforming each sequenceprogressListener
- for reporting progress and cancelingnewSequenceOrDocumentNamePrefix
- an optional prefix to assign to the name of each newly generated document. May be an empty String to leave names unchanged.newSequenceOrDocumentNameSuffix
- an optional suffix to assign to the name of each newly generated document. May be an empty String to leave names unchanged.- Returns:
- the new documents
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException
- if documents can't be loaded, or if the input documents are not SequenceDocuments or SequenceListDocuments or SequenceAlignmentDocuments- Since:
- API 4.201920 (Geneious 2019.2.0)
-
getIntervalBasedOnExtractionAnnotation
public static SequenceAnnotationInterval getIntervalBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, SequenceAnnotationInterval interval, boolean mapToOriginal)
Gets the extraction annotations from the sequence document and maps the interval to either the original sequence or the result sequence, depending on the value of mapToOriginal
- Parameters:
sequenceDocument
- the document to get the extractionAnnotations frominterval
- the interval to re-mapmapToOriginal
- whether to map this interval to the corresponding bit on the original or to the corresponding bit on the result- Returns:
- a new interval that represents the given interval on either the original or result document, or the given interval unchanged if no mapping can be found
- Since:
- API 4.1000 (Geneious 10.0.0)
-
getIndexBasedOnExtractionAnnotation
public static java.lang.Integer getIndexBasedOnExtractionAnnotation(SequenceDocument sequenceDocument, int index, boolean mapToOriginal)
Gets the extraction annotations from the sequence document and maps a residue index to a residue index on either the original sequence or the result sequence, depending on the value of mapToOriginal
- Parameters:
sequenceDocument
- the document to get the extractionAnnotations fromindex
- the 1-based residue position in the sequence to re-map.mapToOriginal
- whether to map this index to the corresponding position on the original or to the corresponding position on the result
- Returns:
- a new index that represents the given index on either the original or result document, return null if the index can't be mapped.
- Since:
- API 4.1000 (Geneious 10.0.0)
-
-