Class SequenceGapInformation
- java.lang.Object
-
- com.biomatters.geneious.publicapi.implementations.SequenceGapInformation
-
public final class SequenceGapInformation extends java.lang.Object
Precalculates information about the location of gaps ('-') in a CharSequence, and can efficiently translate between indices in the gapped and ungapped sequence. This translation also works for indices beyond the ends of the sequence, which is necessary for translating SequenceAnnotations, which can go beyond sequence ends. Like all other sequence indices except for the ones in SequenceAnnotations, indices used in this class are 0-based. When the sequence actually contains internal gaps, this class uses about 0.5 bytes of memory per base in the gapped internal sequence for large sequences (over 10 million base pairs).
Some DefaultSequenceDocument instances (usually only instances of reference sequences in a big contig) store a pre-built SequenceGapInformation, which is available from DefaultSequenceDocument.getSequenceGapInformation().
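The precalculation idea can be illustrated with a plain prefix-count table. This is only a sketch under stated assumptions: the class name PrefixGapTable is hypothetical and not part of the Geneious API, and the int[] it uses costs 4 bytes per base, far more than the roughly 0.5 bytes per base quoted above, so the real class presumably uses a more compact structure.

```java
// Hypothetical sketch; PrefixGapTable is not part of the Geneious API.
final class PrefixGapTable {
    // residuesBefore[i] = number of non-gap characters in gapped[0, i)
    private final int[] residuesBefore;

    PrefixGapTable(CharSequence gapped) {
        residuesBefore = new int[gapped.length() + 1];
        for (int i = 0; i < gapped.length(); i++) {
            int isResidue = gapped.charAt(i) == '-' ? 0 : 1;
            residuesBefore[i + 1] = residuesBefore[i] + isResidue;
        }
    }

    /** Non-gap characters strictly before gappedIndex (0 <= gappedIndex <= gapped length). */
    int residuesBefore(int gappedIndex) {
        return residuesBefore[gappedIndex];
    }

    /** Total number of non-gap characters. */
    int ungappedLength() {
        return residuesBefore[residuesBefore.length - 1];
    }

    public static void main(String[] args) {
        PrefixGapTable table = new PrefixGapTable("-ABC--DE--f-");
        System.out.println(table.ungappedLength());   // 6 residues: A, B, C, D, E, f
        System.out.println(table.residuesBefore(6));  // 3 residues (A, B, C) precede index 6
    }
}
```

Once the table is built, each lookup is a single array read, which is the trade-off a precalculated structure offers over rescanning the sequence.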
-
-
Nested Class Summary
Nested Classes
static interface SequenceGapInformation.Provider
An interface that SequenceDocuments can optionally implement to indicate they may provide a (potentially) pre-built SequenceGapInformation.
-
Constructor Summary
Constructors
SequenceGapInformation(java.lang.CharSequence gappedSequence)
Constructs SequenceGapInformation for the specified gapped sequence.
SequenceGapInformation(java.lang.CharSequence gappedSequence, jebl.util.ProgressListener progressListener)
Constructs SequenceGapInformation for the specified gapped sequence.
SequenceGapInformation(org.jdom.Element element, SequenceCharSequence gappedCharSequence)
Deserializes a SequenceGapInformation from XML previously returned from toXML(String).
-
Method Summary
Modifier and Type, Method, Description
java.util.List<SequenceAnnotation> adjustAnnotationsForGapInsertion(java.util.List<SequenceAnnotation> annotations)
Creates a new list of annotations by adjusting their locations to compensate for adding gaps into the sequence.
java.util.List<SequenceAnnotation> adjustAnnotationsForGapRemoval(java.util.List<SequenceAnnotation> annotations)
Creates a new list of annotations by adjusting their locations to compensate for removing all the gaps in the sequence.
static SequenceGapInformation forSequenceDocument(SequenceDocument sequence)
Get SequenceGapInformation for the given sequence.
char getGappedCharAt(int indexInGappedSequence)
Returns the character at the given index in the gapped sequence.
SequenceCharSequence getGappedCharSequence()
int getGappedIndex(int indexInUngappedSequence)
Translates an index in the ungapped sequence to the index of the corresponding nongap character in the gappedSequence passed to the constructor.
int getGappedIndexTreatingEndGapsLikeInternalGaps(int indexInUngappedSequence)
Translates an index in the ungapped sequence to the index of the corresponding nongap character in the gappedSequence passed to the constructor.
int getGappedSequenceLength()
int getLeadingGapsLength()
int getTrailingGapsLength()
int getTrailingGapsStartIndex()
int getUngappedIndexOfThisOrNextResidue(int indexInGappedSequence)
Same as getUngappedIndexOfThisOrPreviousResidue(int), but if the specified index is on a gap in the gapped sequence, then the ungapped index of the next rather than the previous nongap residue is returned.
static int getUngappedIndexOfThisOrNextResidue(SequenceCharSequence sequence, int indexInGappedSequence)
Gets the ungapped index corresponding to a gapped index.
int getUngappedIndexOfThisOrNextResidueTreatingEndGapsLikeInternalGaps(int indexInGappedSequence)
Same as getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps(int), but if the specified index is on a gap in the gapped sequence, then the ungapped index of the next rather than the previous nongap residue is returned.
int getUngappedIndexOfThisOrPreviousResidue(int indexInGappedSequence)
Calculates the index where the character gappedSequence.charAt(indexInGappedSequence) would move if all gaps were stripped from gappedSequence.
static int getUngappedIndexOfThisOrPreviousResidue(SequenceCharSequence sequence, int indexInGappedSequence)
Gets the ungapped index corresponding to a gapped index.
int getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps(int indexInGappedSequence)
Calculates the index where the character gappedSequence.charAt(indexInGappedSequence) would move if all gaps were stripped from gappedSequence.
int getUngappedSequenceLength()
Returns the ungapped length of the sequence passed to the constructor.
static Geneious.MajorVersion getVersionSupportStatic(XMLSerializable.VersionSupportType versionType)
Returns version support as defined by XMLSerializable.OldVersionCompatible.getVersionSupport(com.biomatters.geneious.publicapi.documents.XMLSerializable.VersionSupportType).
boolean isGap(int indexInGappedSequence)
Returns true if the character at the specified gapped sequence index is an internal gap or end gap. Characters beyond the ends of the gapped sequence are assumed to be non-gaps (for consistency with getUngappedIndexOfThisOrPreviousResidue(int)), therefore isGap(x) where x<0 or x>=gappedSequenceLength will return false.
boolean isInternalGap(int indexInGappedSequence)
Returns true if the character at the specified gapped sequence index is an internal (non-end) gap.
org.jdom.Element toXML(Geneious.MajorVersion version, java.lang.String name)
Serializes this gap information (excluding the char sequence) to XML, which may use PluginDocument.FILE_DATA_ATTRIBUTE_NAME.
org.jdom.Element toXML(java.lang.String name)
Serializes this gap information (excluding the char sequence) to XML, which may use PluginDocument.FILE_DATA_ATTRIBUTE_NAME.
-
-
-
Constructor Detail
-
SequenceGapInformation
public SequenceGapInformation(org.jdom.Element element, SequenceCharSequence gappedCharSequence)
Deserializes a SequenceGapInformation from XML previously returned from toXML(String).
- Parameters:
element
- an element previously returned from toXML(String).
gappedCharSequence
- the gapped char sequence previously associated with the previously serialized SequenceGapInformation.
- Since:
- API 4.50 (Geneious 5.5.0)
-
SequenceGapInformation
public SequenceGapInformation(java.lang.CharSequence gappedSequence)
Constructs SequenceGapInformation for the specified gapped sequence.
- Parameters:
gappedSequence
- A CharSequence that contains gaps and for which we want to be able to translate between gapped and ungapped indices.
-
SequenceGapInformation
public SequenceGapInformation(java.lang.CharSequence gappedSequence, jebl.util.ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException.Canceled
Constructs SequenceGapInformation for the specified gapped sequence.
- Parameters:
gappedSequence
- A CharSequence that contains gaps and for which we want to be able to translate between gapped and ungapped indices.
progressListener
- for reporting progress and cancelling
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException.Canceled
- if the progress listener requests that we cancel.
- Since:
- API 4.50 (Geneious 5.5.0)
-
-
Method Detail
-
getVersionSupportStatic
public static Geneious.MajorVersion getVersionSupportStatic(XMLSerializable.VersionSupportType versionType)
Returns version support as defined by XMLSerializable.OldVersionCompatible.getVersionSupport(com.biomatters.geneious.publicapi.documents.XMLSerializable.VersionSupportType). All SequenceGapInformation instances support version serialization in the same way, so this method is static.
- Parameters:
versionType
- the type of version support to query.
- Returns:
- version support as defined by XMLSerializable.OldVersionCompatible.getVersionSupport(com.biomatters.geneious.publicapi.documents.XMLSerializable.VersionSupportType)
- Since:
- API 4.600 (Geneious 6.0.0)
-
toXML
public org.jdom.Element toXML(java.lang.String name) throws java.io.IOException
Serializes this gap information (excluding the char sequence) to XML, which may use PluginDocument.FILE_DATA_ATTRIBUTE_NAME.
- Parameters:
name
- the name of the element to return
- Returns:
- an XML element with the given name containing the serialized gap information
- Throws:
java.io.IOException
- if it can't be serialized because we can't write to a local temporary file
- Since:
- API 4.50 (Geneious 5.5.0)
-
toXML
public org.jdom.Element toXML(Geneious.MajorVersion version, java.lang.String name) throws java.io.IOException
Serializes this gap information (excluding the char sequence) to XML, which may use PluginDocument.FILE_DATA_ATTRIBUTE_NAME.
- Parameters:
version
- the version of Geneious to serialize for
name
- the name of the element to return
- Returns:
- an XML element with the given name containing the serialized gap information
- Throws:
java.io.IOException
- if it can't be serialized because we can't write to a local temporary file
- Since:
- API 4.600 (Geneious 6.0.0)
-
forSequenceDocument
public static SequenceGapInformation forSequenceDocument(SequenceDocument sequence)
Gets SequenceGapInformation for the given sequence. This is the preferred method of getting a SequenceGapInformation because it can return the cached copy for a DefaultSequenceDocument.
- Parameters:
sequence
- a SequenceDocument that contains gaps and for which we want to be able to translate between gapped and ungapped indices.
- Returns:
- gap information for the sequence
- Since:
- API 4.600 (Geneious 6.0.0)
-
getLeadingGapsLength
public int getLeadingGapsLength()
- Returns:
- the number of leading gaps in the sequence passed to the constructor. Equivalent to
SequenceCharSequence.getLeadingGapsLength()
- Since:
- API 4.60 (Geneious 5.6.0)
-
getTrailingGapsLength
public int getTrailingGapsLength()
- Returns:
- the number of trailing gaps in the sequence passed to the constructor. Equivalent to
SequenceCharSequence.getTrailingGapsLength()
- Since:
- API 4.60 (Geneious 5.6.0)
-
getTrailingGapsStartIndex
public int getTrailingGapsStartIndex()
- Returns:
- the start index of the trailing gaps in the sequence passed to the constructor. Equivalent to
SequenceCharSequence.getTrailingGapsStartIndex()
- Since:
- API 4.60 (Geneious 5.6.0)
-
getUngappedIndexOfThisOrPreviousResidue
public int getUngappedIndexOfThisOrPreviousResidue(int indexInGappedSequence)
Calculates the index where the character gappedSequence.charAt(indexInGappedSequence) would move if all gaps were stripped from gappedSequence. If the specified index is on a gap, the adjusted index of its nearest nongap neighbour on the left (or -1 if there is none) is returned. This corresponds to stripping all gap characters out of the sequence and the intervals covering those residues, with the characters moving to the left to fill the gaps, and implicitly treating characters beyond the sequence length and in end-gap regions as nongaps.
Example:
old indices:  0  1  2  3  4  5  6  7  8  9 10 11
sequence:     -  A  B  C  -  -  D  E  -  -  f  -
new indices: -1  0  1  2  2  2  3  4  4  4  5  6
- Parameters:
indexInGappedSequence
- The 0-based index in the gapped sequence to convert to an index in the sequence without gaps
- Returns:
- the resulting index in the gapless CharSequence.
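The mapping described above can be reproduced with a naive linear scan. The following is a minimal sketch, not the class's actual implementation: ThisOrPreviousSketch and its methods are hypothetical names, the scan is O(n) per call rather than precalculated, and it assumes the sequence contains at least one residue. End-gap and out-of-range positions are extrapolated as if they were residues of a larger sequence, which matches the documented example.

```java
import java.util.Arrays;

// Hypothetical reference sketch; not part of the Geneious API.
final class ThisOrPreviousSketch {
    static int ungappedIndexOfThisOrPrevious(CharSequence gapped, int index) {
        int length = gapped.length();
        // Locate the end-gap regions: [0, leadingGaps) and [trailingStart, length).
        int leadingGaps = 0;
        while (leadingGaps < length && gapped.charAt(leadingGaps) == '-') leadingGaps++;
        int trailingStart = length;
        while (trailingStart > leadingGaps && gapped.charAt(trailingStart - 1) == '-') trailingStart--;

        if (index < leadingGaps) {
            // Leading end gaps and positions before the sequence count as nongaps.
            return index - leadingGaps;
        }
        // Count residues from the first residue up to and including index,
        // but never past the last residue.
        int residues = 0;
        int lastScanned = Math.min(index, trailingStart - 1);
        for (int i = leadingGaps; i <= lastScanned; i++) {
            if (gapped.charAt(i) != '-') residues++;
        }
        if (index >= trailingStart) {
            // Trailing end gaps and positions after the sequence count as nongaps.
            return residues + (index - trailingStart);
        }
        return residues - 1; // this residue, or the nearest residue to the left
    }

    static int[] mapAll(CharSequence gapped) {
        int[] result = new int[gapped.length()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ungappedIndexOfThisOrPrevious(gapped, i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [-1, 0, 1, 2, 2, 2, 3, 4, 4, 4, 5, 6], the "new indices" row of the example.
        System.out.println(Arrays.toString(mapAll("-ABC--DE--f-")));
    }
}
```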
-
getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps
public int getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps(int indexInGappedSequence)
Calculates the index where the character gappedSequence.charAt(indexInGappedSequence) would move if all gaps were stripped from gappedSequence. If the specified index is on a gap, the adjusted index of its nearest nongap neighbour on the left (or -1 if there is none) is returned. This corresponds to stripping all gap characters out of the sequence and the intervals covering those residues, with the characters moving to the left to fill the gaps, and implicitly treating characters beyond the sequence length as nongaps. Characters in end gap regions will be treated the same as internal gaps.
Example:
old indices:  0  1  2  3  4  5  6  7  8  9 10 11
sequence:     -  A  B  C  -  -  D  E  -  -  f  -
new indices: -1  0  1  2  2  2  3  4  4  4  5  5
- Parameters:
indexInGappedSequence
- The 0-based index in the gapped sequence to convert to an index in the sequence without gaps
- Returns:
- the resulting index in the gapless CharSequence.
- Since:
- API 4.700 (Geneious 7.0.0)
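When end gaps are treated like internal gaps, the rule reduces to "count the residues at or before the index, minus one". The sketch below illustrates that reading; EndGapsAsInternalSketch is a hypothetical name, and the handling of indices outside the sequence (treated as residues of a larger ungapped sequence) is an assumption based on the description above rather than confirmed behaviour.

```java
import java.util.Arrays;

// Hypothetical reference sketch; not part of the Geneious API.
final class EndGapsAsInternalSketch {
    static int ungappedIndexOfThisOrPrevious(CharSequence gapped, int index) {
        if (index < 0) {
            return index; // assumption: positions before the start are nongaps
        }
        int length = gapped.length();
        int residues = 0;
        for (int i = 0; i <= Math.min(index, length - 1); i++) {
            if (gapped.charAt(i) != '-') residues++;
        }
        // Assumption: each position past the last character is one more nongap.
        int beyondEnd = Math.max(0, index - (length - 1));
        return residues - 1 + beyondEnd; // -1 when no residue lies at or before index
    }

    static int[] mapAll(CharSequence gapped) {
        int[] result = new int[gapped.length()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ungappedIndexOfThisOrPrevious(gapped, i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [-1, 0, 1, 2, 2, 2, 3, 4, 4, 4, 5, 5]; only the last value differs
        // from the variant that treats end gaps as nongaps.
        System.out.println(Arrays.toString(mapAll("-ABC--DE--f-")));
    }
}
```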
-
isGap
public boolean isGap(int indexInGappedSequence)
Returns true if the character at the specified gapped sequence index is an internal gap or end gap. Characters beyond the ends of the gapped sequence are assumed to be non-gaps (for consistency with getUngappedIndexOfThisOrPreviousResidue(int)), therefore isGap(x) where x<0 or x>=gappedSequenceLength will return false.
- Parameters:
indexInGappedSequence
- an index of a character in the gapped sequence.
- Returns:
- true if the character at the specified gapped sequence index is a gap.
-
isInternalGap
public boolean isInternalGap(int indexInGappedSequence)
Returns true if the character at the specified gapped sequence index is an internal (non-end) gap. Characters beyond the ends of the gapped sequence are assumed to be non-gaps (for consistency with getUngappedIndexOfThisOrPreviousResidue(int)), therefore isInternalGap(x) where x<0 or x>=gappedSequenceLength will return false.
- Parameters:
indexInGappedSequence
- an index of a character in the gapped sequence.
- Returns:
- true if the character at the specified gapped sequence index is an internal gap.
- Since:
- API 4.60 (Geneious 5.6.0)
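The difference between a gap and an internal gap can be sketched as follows. GapKindSketch is a hypothetical helper, not the library's code: it treats a gap as internal only when at least one residue exists on each side of it, and it returns false for out-of-range indices as described above.

```java
// Hypothetical reference sketch; not part of the Geneious API.
final class GapKindSketch {
    /** True for any '-' inside the sequence; false for out-of-range indices. */
    static boolean isGap(CharSequence gapped, int index) {
        return index >= 0 && index < gapped.length() && gapped.charAt(index) == '-';
    }

    /** True only for a '-' that has a residue somewhere on both sides. */
    static boolean isInternalGap(CharSequence gapped, int index) {
        if (!isGap(gapped, index)) {
            return false;
        }
        return hasResidue(gapped, 0, index) && hasResidue(gapped, index + 1, gapped.length());
    }

    private static boolean hasResidue(CharSequence gapped, int from, int to) {
        for (int i = from; i < to; i++) {
            if (gapped.charAt(i) != '-') return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String gapped = "-ABC--DE--f-";
        System.out.println(isGap(gapped, 0) + " " + isInternalGap(gapped, 0)); // true false (leading end gap)
        System.out.println(isGap(gapped, 4) + " " + isInternalGap(gapped, 4)); // true true (internal gap)
        System.out.println(isGap(gapped, 12));                                  // false (beyond the end)
    }
}
```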
-
getGappedCharAt
public char getGappedCharAt(int indexInGappedSequence)
Returns the character at the given index in the gapped sequence.
- Parameters:
indexInGappedSequence
- 0-based index of the character in the gapped sequence
- Returns:
- the character at the given index in the gapped sequence
- Throws:
java.lang.IndexOutOfBoundsException
- if indexInGappedSequence is less than 0 or greater than or equal to the gapped sequence length
- Since:
- API 4.202000 (Geneious 2020.0.0)
-
getGappedCharSequence
public SequenceCharSequence getGappedCharSequence()
- Returns:
- the gapped sequence passed to the constructor.
- Since:
- API 4.202010 (Geneious 2020.1.0)
-
getUngappedIndexOfThisOrNextResidueTreatingEndGapsLikeInternalGaps
public int getUngappedIndexOfThisOrNextResidueTreatingEndGapsLikeInternalGaps(int indexInGappedSequence)
Same as getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps(int), but if the specified index is on a gap in the gapped sequence, then the ungapped index of the next rather than the previous nongap residue is returned.
- Parameters:
indexInGappedSequence
- a 0-based index in the gapped sequence
- Returns:
- the 0-based index of the first residue at or after the specified position in the ungapped sequence.
- Since:
- API 4.700 (Geneious 7.0.0)
-
getUngappedIndexOfThisOrNextResidue
public int getUngappedIndexOfThisOrNextResidue(int indexInGappedSequence)
Same as getUngappedIndexOfThisOrPreviousResidue(int), but if the specified index is on a gap in the gapped sequence, then the ungapped index of the next rather than the previous nongap residue is returned.
- Parameters:
indexInGappedSequence
- a 0-based index in the gapped sequence
- Returns:
- the 0-based index of the first residue at or after the specified position in the ungapped sequence.
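For an internal position, the "this or next" result is simply the number of residues strictly before the index. The sketch below shows this; ThisOrNextSketch is a hypothetical name, and the treatment of end gaps and out-of-range positions as nongaps is inferred from the statement that this method otherwise behaves like getUngappedIndexOfThisOrPreviousResidue(int), so treat those cases as an assumption. It also assumes the sequence contains at least one residue.

```java
import java.util.Arrays;

// Hypothetical reference sketch; not part of the Geneious API.
final class ThisOrNextSketch {
    static int ungappedIndexOfThisOrNext(CharSequence gapped, int index) {
        int length = gapped.length();
        int leadingGaps = 0;
        while (leadingGaps < length && gapped.charAt(leadingGaps) == '-') leadingGaps++;
        int trailingStart = length;
        while (trailingStart > leadingGaps && gapped.charAt(trailingStart - 1) == '-') trailingStart--;

        if (index < leadingGaps) {
            return index - leadingGaps; // assumption: leading end gaps count as nongaps
        }
        // Residues strictly before index (capped at the start of the trailing gaps).
        int residuesBefore = 0;
        int end = Math.min(index, trailingStart);
        for (int i = leadingGaps; i < end; i++) {
            if (gapped.charAt(i) != '-') residuesBefore++;
        }
        if (index >= trailingStart) {
            return residuesBefore + (index - trailingStart); // assumption: trailing end gaps count as nongaps
        }
        return residuesBefore; // index of this residue, or of the next one to the right
    }

    static int[] mapAll(CharSequence gapped) {
        int[] result = new int[gapped.length()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ungappedIndexOfThisOrNext(gapped, i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [-1, 0, 1, 2, 3, 3, 3, 4, 5, 5, 5, 6]: internal gaps now map to the residue on their right.
        System.out.println(Arrays.toString(mapAll("-ABC--DE--f-")));
    }
}
```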
-
getUngappedIndexOfThisOrNextResidue
public static int getUngappedIndexOfThisOrNextResidue(SequenceCharSequence sequence, int indexInGappedSequence)
Gets the ungapped index corresponding to a gapped index. If the gapped index is a gap then the ungapped index of the next non-gap is returned. This is a static version of getUngappedIndexOfThisOrNextResidue(int) that doesn't require the time and memory usage of constructing a reusable SequenceGapInformation.
- Parameters:
sequence
- the sequence
indexInGappedSequence
- a 0-based index in the gapped sequence
- Returns:
- the 0-based index of the first residue at or after the specified position in the ungapped sequence.
- Since:
- API 4.31 (Geneious 5.3.1)
-
getUngappedIndexOfThisOrPreviousResidue
public static int getUngappedIndexOfThisOrPreviousResidue(SequenceCharSequence sequence, int indexInGappedSequence)
Gets the ungapped index corresponding to a gapped index. If the gapped index is a gap then the ungapped index of the previous non-gap is returned. This is a static version of getUngappedIndexOfThisOrPreviousResidue(int) that doesn't require the time and memory usage of constructing a reusable SequenceGapInformation.
- Parameters:
sequence
- the sequence
indexInGappedSequence
- a 0-based index in the gapped sequence
- Returns:
- the 0-based index of the nearest residue at or before the specified position in the ungapped sequence.
- Since:
- API 4.31 (Geneious 5.3.1)
-
getGappedIndex
public int getGappedIndex(int indexInUngappedSequence)
Translates an index in the ungapped sequence to the index of the corresponding nongap character in the gappedSequence passed to the constructor.
It is permissible for indexInUngappedSequence to be < 0 or >= getUngappedSequenceLength(). For that case, it is assumed that gappedSequence is part of a larger sequence that contains no gaps beyond gappedSequence's ends. If the gapped sequence contains end gaps, the returned position may lie within the end gap region.
- Parameters:
indexInUngappedSequence
- An index in the ungapped sequence, i.e. SequenceUtilities.removeGaps(gappedSequence)
- Returns:
- The position of the indexInUngappedSequence'th nongap character in gappedSequence
- See Also:
getGappedIndexTreatingEndGapsLikeInternalGaps(int)
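The reverse translation can be sketched with a linear scan for the n-th residue. GappedIndexSketch is a hypothetical name and this is not the library's implementation. For out-of-range ungapped indices, the sketch extrapolates one position per step from the first or last residue; that is one reading of "part of a larger sequence that contains no gaps beyond its ends" and should be treated as an assumption.

```java
// Hypothetical reference sketch; not part of the Geneious API.
final class GappedIndexSketch {
    static int gappedIndex(CharSequence gapped, int ungappedIndex) {
        int firstResidue = -1;
        int lastResidue = -1;
        int count = 0;
        for (int i = 0; i < gapped.length(); i++) {
            if (gapped.charAt(i) != '-') {
                if (count == ungappedIndex) {
                    return i; // in-range case: position of the requested residue
                }
                if (firstResidue < 0) firstResidue = i;
                lastResidue = i;
                count++;
            }
        }
        if (count == 0) {
            throw new IllegalArgumentException("sketch requires at least one residue");
        }
        if (ungappedIndex < 0) {
            // Assumption: step left from the first residue, possibly into leading end gaps.
            return firstResidue + ungappedIndex;
        }
        // Assumption: step right from the last residue, possibly into trailing end gaps.
        return lastResidue + (ungappedIndex - (count - 1));
    }

    public static void main(String[] args) {
        String gapped = "-ABC--DE--f-";
        System.out.println(gappedIndex(gapped, 3));  // 6: 'D' is the residue with ungapped index 3
        System.out.println(gappedIndex(gapped, -1)); // 0: lands in the leading end gap
        System.out.println(gappedIndex(gapped, 6));  // 11: lands in the trailing end gap
    }
}
```

Under these assumptions the result round-trips with the documented example for getUngappedIndexOfThisOrPreviousResidue(int): gapped positions 0 and 11 map to ungapped -1 and 6, and back again.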
-
getGappedIndexTreatingEndGapsLikeInternalGaps
public int getGappedIndexTreatingEndGapsLikeInternalGaps(int indexInUngappedSequence)
Translates an index in the ungapped sequence to the index of the corresponding nongap character in the gappedSequence passed to the constructor.
It is permissible for indexInUngappedSequence to be < 0 or >= getUngappedSequenceLength(). For that case, it is assumed that gappedSequence is part of a larger sequence that contains no gaps beyond gappedSequence's ends. If the gapped sequence contains end gaps, the returned position will always lie outside the end gap region.
- Parameters:
indexInUngappedSequence
- an index in the ungapped sequence
- Returns:
- the index in the gapped sequence of the corresponding nongap character
- Since:
- API 4.910 (Geneious 9.1.0)
- See Also:
getGappedIndex(int), getUngappedIndexOfThisOrNextResidueTreatingEndGapsLikeInternalGaps(int), getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps(int)
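A sketch of the end-gaps-as-internal-gaps variant follows. GappedIndexEndGapsInternalSketch is a hypothetical name, and the out-of-range arithmetic is an assumption chosen so that the result never falls inside an end gap region: negative indices map to positions before the sequence, and indices past the last residue map to positions at or after the gapped length.

```java
// Hypothetical reference sketch; not part of the Geneious API.
final class GappedIndexEndGapsInternalSketch {
    static int gappedIndex(CharSequence gapped, int ungappedIndex) {
        if (ungappedIndex < 0) {
            return ungappedIndex; // assumption: skips over any leading end gaps
        }
        int count = 0;
        for (int i = 0; i < gapped.length(); i++) {
            if (gapped.charAt(i) != '-') {
                if (count == ungappedIndex) {
                    return i; // in-range case: position of the requested residue
                }
                count++;
            }
        }
        // Assumption: skips over any trailing end gaps and continues past the end.
        return gapped.length() + (ungappedIndex - count);
    }

    public static void main(String[] args) {
        String gapped = "-ABC--DE--f-";
        System.out.println(gappedIndex(gapped, 5));  // 10: position of 'f'
        System.out.println(gappedIndex(gapped, -1)); // -1: before the sequence, not in the leading gap
        System.out.println(gappedIndex(gapped, 6));  // 12: after the sequence, not in the trailing gap
    }
}
```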
-
getUngappedSequenceLength
public int getUngappedSequenceLength()
Returns the ungapped length of the sequence passed to the constructor.
- Returns:
- the ungapped length of the sequence passed to the constructor
-
getGappedSequenceLength
public int getGappedSequenceLength()
- Returns:
- the length of the gapped sequence passed to the constructor.
-
adjustAnnotationsForGapRemoval
public java.util.List<SequenceAnnotation> adjustAnnotationsForGapRemoval(java.util.List<SequenceAnnotation> annotations)
Creates a new list of annotations by adjusting their locations to compensate for removing all the gaps in the sequence. Opposite of adjustAnnotationsForGapInsertion(java.util.List).
- Parameters:
annotations
- the original list of sequence annotations
- Returns:
- the adjusted sequence annotations
- Since:
- API 4.13 (Geneious 5.0.2)
- See Also:
adjustAnnotationsForGapInsertion(java.util.List)
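One plausible way to adjust a single annotation interval for gap removal is to move its start to the next residue and its end to the previous residue. The sketch below illustrates only that idea and is an assumption, not the documented algorithm: IntervalGapRemovalSketch is a hypothetical name, intervals are plain 1-based inclusive {min, max} pairs rather than SequenceAnnotation objects, and positions beyond the sequence ends are clamped rather than extrapolated.

```java
import java.util.Arrays;

// Hypothetical reference sketch; not part of the Geneious API.
final class IntervalGapRemovalSketch {
    /**
     * Maps a 1-based inclusive interval in gapped coordinates to ungapped coordinates.
     * Returns null when the interval covers only gaps.
     */
    static int[] adjustForGapRemoval(CharSequence gapped, int min1, int max1) {
        // Start: 0-based index of this-or-next residue is residuesBefore(min1 - 1); add 1 for 1-based.
        int newMin = residuesBefore(gapped, min1 - 1) + 1;
        // End: 0-based index of this-or-previous residue is residuesBefore(max1) - 1; add 1 for 1-based.
        int newMax = residuesBefore(gapped, max1);
        return newMin <= newMax ? new int[] {newMin, newMax} : null;
    }

    /** Number of non-gap characters in gapped[0, gappedIndex), with the index clamped to the sequence. */
    private static int residuesBefore(CharSequence gapped, int gappedIndex) {
        int end = Math.min(Math.max(gappedIndex, 0), gapped.length());
        int count = 0;
        for (int i = 0; i < end; i++) {
            if (gapped.charAt(i) != '-') count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Gapped interval 4..8 covers "C--DE"; after gap removal "CDE" occupies 3..5.
        System.out.println(Arrays.toString(adjustForGapRemoval("-ABC--DE--f-", 4, 8))); // [3, 5]
    }
}
```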
-
adjustAnnotationsForGapInsertion
public java.util.List<SequenceAnnotation> adjustAnnotationsForGapInsertion(java.util.List<SequenceAnnotation> annotations)
Creates a new list of annotations by adjusting their locations to compensate for adding gaps into the sequence. Opposite of adjustAnnotationsForGapRemoval(java.util.List).
- Parameters:
annotations
- annotations to be adjusted for gap insertion
- Returns:
- annotations adjusted for gap insertion
- Since:
- API 4.13 (Geneious 5.0.2)
- See Also:
adjustAnnotationsForGapRemoval(java.util.List)
-
-