Class SequenceGapInformation
- java.lang.Object
-
- com.biomatters.geneious.publicapi.implementations.SequenceGapInformation
-
public final class SequenceGapInformation extends java.lang.Object
Precalculates information about the location of gaps ('-') in a CharSequence, and can efficiently translate between indices in the gapped and ungapped sequence. This translation also works for indices beyond the ends of the sequence, which is necessary for translating SequenceAnnotations, which can extend beyond the sequence ends.
Like all other sequence indices except for the ones in SequenceAnnotations, indices used in this class are 0-based.
When the sequence actually contains internal gaps, this class uses about 0.5 bytes of memory per base in the gapped internal sequence for large sequences (over 10 million base pairs).
Some DefaultSequenceDocument instances (usually only instances of reference sequences in a big contig) store a pre-built SequenceGapInformation, which is available from DefaultSequenceDocument.getSequenceGapInformation().
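To illustrate why precalculation makes index translation cheap, here is a minimal standalone sketch in plain Java. It is not the Geneious implementation: the class name PrecomputedGapIndexSketch and its methods are hypothetical, it handles in-range indices only, it treats end gaps like internal gaps, and its two int arrays use far more memory than the roughly 0.5 bytes per base quoted above.

```java
// Standalone sketch: plain Java, not the Geneious implementation.
final class PrecomputedGapIndexSketch {
    // residuesBefore[i] = number of non-gap characters in positions [0, i)
    private final int[] residuesBefore;
    // gappedPositions[u] = gapped index of the residue with ungapped index u
    private final int[] gappedPositions;

    PrecomputedGapIndexSketch(CharSequence gapped) {
        int length = gapped.length();
        residuesBefore = new int[length + 1];
        for (int i = 0; i < length; i++) {
            residuesBefore[i + 1] = residuesBefore[i] + (gapped.charAt(i) == '-' ? 0 : 1);
        }
        gappedPositions = new int[residuesBefore[length]];
        for (int i = 0, u = 0; i < length; i++) {
            if (gapped.charAt(i) != '-') {
                gappedPositions[u++] = i;
            }
        }
    }

    int ungappedLength() {
        return gappedPositions.length;
    }

    // In-range only: ungapped index of the residue at gappedIndex, or of the
    // nearest residue to its left (-1 if there is none). Constant time.
    int ungappedIndexOfThisOrPrevious(int gappedIndex) {
        return residuesBefore[gappedIndex + 1] - 1;
    }

    // In-range only: gapped index of the residue with the given ungapped index.
    int gappedIndex(int ungappedIndex) {
        return gappedPositions[ungappedIndex];
    }
}
```

Once the arrays are built, each lookup is a single array read, which is the trade-off a reusable gap index makes against a one-off linear scan.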
-
-
Nested Class Summary
Nested Classes
static interface SequenceGapInformation.Provider
    An interface that SequenceDocuments can optionally implement to indicate they may provide a (potentially) pre-built SequenceGapInformation.
-
Constructor Summary
Constructors
SequenceGapInformation(java.lang.CharSequence gappedSequence)
    Constructs SequenceGapInformation for the specified gapped sequence.
SequenceGapInformation(java.lang.CharSequence gappedSequence, jebl.util.ProgressListener progressListener)
    Constructs SequenceGapInformation for the specified gapped sequence.
SequenceGapInformation(org.jdom.Element element, SequenceCharSequence gappedCharSequence)
    Deserializes a SequenceGapInformation from XML previously returned from toXML(String).
-
Method Summary
Methods
java.util.List<SequenceAnnotation> adjustAnnotationsForGapInsertion(java.util.List<SequenceAnnotation> annotations)
    Creates a new list of annotations by adjusting their locations to compensate for adding gaps into the sequence.
java.util.List<SequenceAnnotation> adjustAnnotationsForGapRemoval(java.util.List<SequenceAnnotation> annotations)
    Creates a new list of annotations by adjusting their locations to compensate for removing all the gaps in the sequence.
static SequenceGapInformation forSequenceDocument(SequenceDocument sequence)
    Get SequenceGapInformation for the given sequence.
char getGappedCharAt(int indexInGappedSequence)
    Returns the character at the given index in the gapped sequence.
SequenceCharSequence getGappedCharSequence()
int getGappedIndex(int indexInUngappedSequence)
    Translates an index in the ungapped sequence to the index of the corresponding nongap character in the gappedSequence passed to the constructor.
int getGappedIndexTreatingEndGapsLikeInternalGaps(int indexInUngappedSequence)
    Translates an index in the ungapped sequence to the index of the corresponding nongap character in the gappedSequence passed to the constructor.
int getGappedSequenceLength()
int getLeadingGapsLength()
int getTrailingGapsLength()
int getTrailingGapsStartIndex()
int getUngappedIndexOfThisOrNextResidue(int indexInGappedSequence)
    Same as getUngappedIndexOfThisOrPreviousResidue(int), but if the specified index is on a gap in the gapped sequence, then the ungapped index of the next rather than the previous nongap residue is returned.
static int getUngappedIndexOfThisOrNextResidue(SequenceCharSequence sequence, int indexInGappedSequence)
    Gets the ungapped index corresponding to a gapped index.
int getUngappedIndexOfThisOrNextResidueTreatingEndGapsLikeInternalGaps(int indexInGappedSequence)
    Same as getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps(int), but if the specified index is on a gap in the gapped sequence, then the ungapped index of the next rather than the previous nongap residue is returned.
int getUngappedIndexOfThisOrPreviousResidue(int indexInGappedSequence)
    Calculates the index where the character gappedSequence.charAt(indexInGappedSequence) would move if all gaps were stripped from gappedSequence.
static int getUngappedIndexOfThisOrPreviousResidue(SequenceCharSequence sequence, int indexInGappedSequence)
    Gets the ungapped index corresponding to a gapped index.
int getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps(int indexInGappedSequence)
    Calculates the index where the character gappedSequence.charAt(indexInGappedSequence) would move if all gaps were stripped from gappedSequence.
int getUngappedSequenceLength()
    Returns the ungapped length of the sequence passed to the constructor.
static Geneious.MajorVersion getVersionSupportStatic(XMLSerializable.VersionSupportType versionType)
    Returns version support as defined by XMLSerializable.OldVersionCompatible.getVersionSupport(com.biomatters.geneious.publicapi.documents.XMLSerializable.VersionSupportType).
boolean isGap(int indexInGappedSequence)
    Return true if the character at the specified gapped sequence index is an internal gap or end gap. Characters beyond the ends of the gapped sequence are assumed to be non-gaps (for consistency with getUngappedIndexOfThisOrPreviousResidue(int)), therefore isGap(x) where x<0 or x>=gappedSequenceLength will return false.
boolean isInternalGap(int indexInGappedSequence)
    Return true if the character at the specified gapped sequence index is an internal (non-end) gap.
org.jdom.Element toXML(Geneious.MajorVersion version, java.lang.String name)
    Serializes this gap information (excluding the char sequence) to XML which may use PluginDocument.FILE_DATA_ATTRIBUTE_NAME.
org.jdom.Element toXML(java.lang.String name)
    Serializes this gap information (excluding the char sequence) to XML which may use PluginDocument.FILE_DATA_ATTRIBUTE_NAME.
-
-
-
Constructor Detail
-
SequenceGapInformation
public SequenceGapInformation(org.jdom.Element element, SequenceCharSequence gappedCharSequence)
Deserializes a SequenceGapInformation from XML previously returned from toXML(String).
- Parameters:
element - an element previously returned from toXML(String).
gappedCharSequence - the gapped char sequence associated with the previously serialized SequenceGapInformation.
- Since:
- API 4.50 (Geneious 5.5.0)
-
SequenceGapInformation
public SequenceGapInformation(java.lang.CharSequence gappedSequence)
Constructs SequenceGapInformation for the specified gapped sequence.
- Parameters:
gappedSequence - A CharSequence that contains gaps and for which we want to be able to translate between gapped and ungapped indices.
-
SequenceGapInformation
public SequenceGapInformation(java.lang.CharSequence gappedSequence, jebl.util.ProgressListener progressListener) throws com.biomatters.geneious.publicapi.plugin.DocumentOperationException.Canceled
Constructs SequenceGapInformation for the specified gapped sequence.
- Parameters:
gappedSequence - A CharSequence that contains gaps and for which we want to be able to translate between gapped and ungapped indices.
progressListener - for reporting progress and cancelling
- Throws:
com.biomatters.geneious.publicapi.plugin.DocumentOperationException.Canceled - if the progress listener requests we cancel.
- Since:
- API 4.50 (Geneious 5.5.0)
-
-
Method Detail
-
getVersionSupportStatic
public static Geneious.MajorVersion getVersionSupportStatic(XMLSerializable.VersionSupportType versionType)
Returns version support as defined by XMLSerializable.OldVersionCompatible.getVersionSupport(com.biomatters.geneious.publicapi.documents.XMLSerializable.VersionSupportType). All SequenceGapInformation instances support version serialization in the same way, so this method is static.
- Parameters:
versionType - the type of version support to look up.
- Returns:
- version support as defined by XMLSerializable.OldVersionCompatible.getVersionSupport(com.biomatters.geneious.publicapi.documents.XMLSerializable.VersionSupportType)
- Since:
- API 4.600 (Geneious 6.0.0)
-
toXML
public org.jdom.Element toXML(java.lang.String name) throws java.io.IOException
Serializes this gap information (excluding the char sequence) to XML which may use PluginDocument.FILE_DATA_ATTRIBUTE_NAME.
- Parameters:
name - the name of the element to return
- Returns:
- the XML element holding the serialized gap information
- Throws:
java.io.IOException - if it can't be serialized because we can't write to a local temporary file
- Since:
- API 4.50 (Geneious 5.5.0)
-
toXML
public org.jdom.Element toXML(Geneious.MajorVersion version, java.lang.String name) throws java.io.IOException
Serializes this gap information (excluding the char sequence) to XML which may use PluginDocument.FILE_DATA_ATTRIBUTE_NAME.
- Parameters:
version - the version of Geneious to serialize for
name - the name of the element to return
- Returns:
- the XML element holding the serialized gap information
- Throws:
java.io.IOException - if it can't be serialized because we can't write to a local temporary file
- Since:
- API 4.600 (Geneious 6.0.0)
-
forSequenceDocument
public static SequenceGapInformation forSequenceDocument(SequenceDocument sequence)
Get SequenceGapInformation for the given sequence. This is the preferred method of getting a SequenceGapInformation because it can return the cached copy for a DefaultSequenceDocument.
- Parameters:
sequence - a SequenceDocument that contains gaps and for which we want to be able to translate between gapped and ungapped indices.
- Returns:
- gap information for the sequence
- Since:
- API 4.600 (Geneious 6.0.0)
-
getLeadingGapsLength
public int getLeadingGapsLength()
- Returns:
- the number of leading gaps in the sequence passed to the constructor. Equivalent to SequenceCharSequence.getLeadingGapsLength()
- Since:
- API 4.60 (Geneious 5.6.0)
-
getTrailingGapsLength
public int getTrailingGapsLength()
- Returns:
- the number of trailing gaps in the sequence passed to the constructor. Equivalent to SequenceCharSequence.getTrailingGapsLength()
- Since:
- API 4.60 (Geneious 5.6.0)
-
getTrailingGapsStartIndex
public int getTrailingGapsStartIndex()
- Returns:
- the start index of the trailing gaps in the sequence passed to the constructor. Equivalent to SequenceCharSequence.getTrailingGapsStartIndex()
- Since:
- API 4.60 (Geneious 5.6.0)
-
getUngappedIndexOfThisOrPreviousResidue
public int getUngappedIndexOfThisOrPreviousResidue(int indexInGappedSequence)
Calculates the index to which the character gappedSequence.charAt(indexInGappedSequence) would move if all gaps were stripped from gappedSequence. If the specified index is on a gap, the adjusted index of its nearest nongap neighbour on the left (or -1 if there is none) is returned. This corresponds to stripping all gap characters out of the sequence and the intervals covering those residues, with the characters moving to the left to fill the gaps, and implicitly treating characters beyond the sequence length and in end-gap regions as nongaps.
Example:
  old indices:  0  1  2  3  4  5  6  7  8  9 10 11
  sequence:     -  A  B  C  -  -  D  E  -  -  f  -
  new indices: -1  0  1  2  2  2  3  4  4  4  5  6
- Parameters:
indexInGappedSequence - The 0-based index in the gapped sequence to convert to an index in the sequence without gaps
- Returns:
- the resulting index in the gapless CharSequence.
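The mapping in the example can be reproduced with a short standalone sketch in plain Java. It does not call the Geneious API; the class and method names are hypothetical, it uses a linear scan rather than precalculated data, and the formulas for end-gap and out-of-range positions are assumptions inferred from the description above (they are treated as nongaps that simply shift with the sequence).

```java
// Standalone sketch in plain Java; it does not call the Geneious API.
final class ThisOrPreviousSketch {

    // Maps a gapped index to an ungapped index. Internal gaps map to the
    // residue on their left; end gaps and positions outside the sequence
    // are treated as non-gaps (assumed behaviour, inferred from the docs).
    static int ungappedIndexOfThisOrPrevious(CharSequence gapped, int index) {
        int length = gapped.length();
        int leading = 0;
        while (leading < length && gapped.charAt(leading) == '-') {
            leading++;
        }
        int trailingStart = length;
        while (trailingStart > leading && gapped.charAt(trailingStart - 1) == '-') {
            trailingStart--;
        }
        if (index < leading) {
            return index - leading; // before the first residue
        }
        int residues = 0;
        int last = Math.min(index, trailingStart - 1);
        for (int i = leading; i <= last; i++) {
            if (gapped.charAt(i) != '-') {
                residues++;
            }
        }
        if (index >= trailingStart) {
            return residues + (index - trailingStart); // after the last residue
        }
        return residues - 1; // this residue, or the one to its left
    }

    public static void main(String[] args) {
        String gapped = "-ABC--DE--f-";
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < gapped.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(ungappedIndexOfThisOrPrevious(gapped, i));
        }
        System.out.println(out); // prints: -1 0 1 2 2 2 3 4 4 4 5 6
    }
}
```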
-
getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps
public int getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps(int indexInGappedSequence)
Calculates the index to which the character gappedSequence.charAt(indexInGappedSequence) would move if all gaps were stripped from gappedSequence. If the specified index is on a gap, the adjusted index of its nearest nongap neighbour on the left (or -1 if there is none) is returned. This corresponds to stripping all gap characters out of the sequence and the intervals covering those residues, with the characters moving to the left to fill the gaps, and implicitly treating characters beyond the sequence length as nongaps. Characters in end gap regions will be treated the same as internal gaps.
Example:
  old indices:  0  1  2  3  4  5  6  7  8  9 10 11
  sequence:     -  A  B  C  -  -  D  E  -  -  f  -
  new indices: -1  0  1  2  2  2  3  4  4  4  5  5
- Parameters:
indexInGappedSequence - The 0-based index in the gapped sequence to convert to an index in the sequence without gaps
- Returns:
- the resulting index in the gapless CharSequence.
- Since:
- API 4.700 (Geneious 7.0.0)
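The only difference from getUngappedIndexOfThisOrPreviousResidue(int) in the example is the last position: the trailing gap maps to 5 (the residue on its left) instead of 6. A minimal standalone sketch of this variant in plain Java follows; the names are hypothetical, and the handling of indices outside the sequence is an assumption based on the statement that such positions are treated as nongaps.

```java
// Standalone sketch in plain Java; it does not call the Geneious API.
final class EndGapsAsInternalSketch {

    // Every gap, including leading and trailing ones, maps to the residue on
    // its left (-1 if there is none). Positions outside the sequence are
    // assumed to behave like non-gap characters.
    static int ungappedIndexOfThisOrPrevious(CharSequence gapped, int index) {
        if (index < 0) {
            return index; // assumed: nothing before the sequence is removed
        }
        int length = gapped.length();
        int residues = 0;
        int last = Math.min(index, length - 1);
        for (int i = 0; i <= last; i++) {
            if (gapped.charAt(i) != '-') {
                residues++;
            }
        }
        if (index >= length) {
            return residues + (index - length); // assumed: shifted by all gaps
        }
        return residues - 1;
    }

    public static void main(String[] args) {
        // Trailing end gap at index 11 resolves to the last residue 'f'.
        System.out.println(ungappedIndexOfThisOrPrevious("-ABC--DE--f-", 11)); // prints: 5
    }
}
```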
-
isGap
public boolean isGap(int indexInGappedSequence)
Return true if the character at the specified gapped sequence index is an internal gap or end gap. Characters beyond the ends of the gapped sequence are assumed to be non-gaps (for consistency with getUngappedIndexOfThisOrPreviousResidue(int)), therefore isGap(x) where x<0 or x>=gappedSequenceLength will return false.
- Parameters:
indexInGappedSequence - an index of a character in the gapped sequence.
- Returns:
- true if the character at the specified gapped sequence index is a gap.
-
isInternalGap
public boolean isInternalGap(int indexInGappedSequence)
Return true if the character at the specified gapped sequence index is an internal (non-end) gap. Characters beyond the ends of the gapped sequence are assumed to be non-gaps (for consistency with getUngappedIndexOfThisOrPreviousResidue(int)), therefore isInternalGap(x) where x<0 or x>=gappedSequenceLength will return false.
- Parameters:
indexInGappedSequence - an index of a character in the gapped sequence.
- Returns:
- true if the character at the specified gapped sequence index is an internal gap.
- Since:
- API 4.60 (Geneious 5.6.0)
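The distinction between isGap(int) and isInternalGap(int) can be sketched on a plain string. This is a hypothetical standalone illustration in plain Java, not the library code: a gap counts as internal only when there is at least one residue on each side of it, and any index outside the sequence is reported as a non-gap.

```java
// Standalone sketch in plain Java; it does not call the Geneious API.
final class GapKindSketch {

    // True for any '-' inside the sequence; false outside the sequence.
    static boolean isGap(CharSequence gapped, int index) {
        return index >= 0 && index < gapped.length() && gapped.charAt(index) == '-';
    }

    // True only for a '-' that has a residue somewhere on both sides.
    static boolean isInternalGap(CharSequence gapped, int index) {
        if (!isGap(gapped, index)) {
            return false;
        }
        boolean residueBefore = false;
        for (int i = index - 1; i >= 0 && !residueBefore; i--) {
            residueBefore = gapped.charAt(i) != '-';
        }
        boolean residueAfter = false;
        for (int i = index + 1; i < gapped.length() && !residueAfter; i++) {
            residueAfter = gapped.charAt(i) != '-';
        }
        return residueBefore && residueAfter;
    }

    public static void main(String[] args) {
        String gapped = "-ABC--DE--f-";
        System.out.println(isGap(gapped, 0) + " " + isInternalGap(gapped, 0)); // prints: true false
        System.out.println(isGap(gapped, 4) + " " + isInternalGap(gapped, 4)); // prints: true true
    }
}
```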
-
getGappedCharAt
public char getGappedCharAt(int indexInGappedSequence)
Returns the character at the given index in the gapped sequence.
- Parameters:
indexInGappedSequence - 0-based index of the character in the gapped sequence
- Returns:
- the character at the given index in the gapped sequence
- Throws:
java.lang.IndexOutOfBoundsException - if indexInGappedSequence is less than 0 or greater than or equal to the gapped sequence length
- Since:
- API 4.202000 (Geneious 2020.0.0)
-
getGappedCharSequence
public SequenceCharSequence getGappedCharSequence()
- Returns:
- the gapped sequence passed to the constructor.
- Since:
- API 4.202010 (Geneious 2020.1.0)
-
getUngappedIndexOfThisOrNextResidueTreatingEndGapsLikeInternalGaps
public int getUngappedIndexOfThisOrNextResidueTreatingEndGapsLikeInternalGaps(int indexInGappedSequence)
Same as getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps(int), but if the specified index is on a gap in the gapped sequence, then the ungapped index of the next rather than the previous nongap residue is returned.
- Parameters:
indexInGappedSequence - a 0-based index in the gapped sequence
- Returns:
- the 0-based index of the first residue at or after the specified position in the ungapped sequence.
- Since:
- API 4.700 (Geneious 7.0.0)
-
getUngappedIndexOfThisOrNextResidue
public int getUngappedIndexOfThisOrNextResidue(int indexInGappedSequence)
Same as getUngappedIndexOfThisOrPreviousResidue(int), but if the specified index is on a gap in the gapped sequence, then the ungapped index of the next rather than the previous nongap residue is returned.
- Parameters:
indexInGappedSequence - a 0-based index in the gapped sequence
- Returns:
- the 0-based index of the first residue at or after the specified position in the ungapped sequence.
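A compact way to think about the "this or next" methods is that the result equals the number of residues strictly before the given position. The standalone sketch below (plain Java, hypothetical names, linear scan) shows both the plain variant and the one that treats end gaps like internal gaps; the handling of end-gap and out-of-range positions is an assumption inferred from the descriptions of the corresponding "previous" methods.

```java
// Standalone sketch in plain Java; it does not call the Geneious API.
final class ThisOrNextSketch {

    // End gaps and positions outside the sequence count as non-gaps (assumed).
    static int thisOrNext(CharSequence gapped, int index) {
        int length = gapped.length();
        int leading = 0;
        while (leading < length && gapped.charAt(leading) == '-') {
            leading++;
        }
        int trailingStart = length;
        while (trailingStart > leading && gapped.charAt(trailingStart - 1) == '-') {
            trailingStart--;
        }
        if (index < leading) {
            return index - leading;
        }
        int before = countResidues(gapped, leading, Math.min(index, trailingStart));
        return index < trailingStart ? before : before + (index - trailingStart);
    }

    // End gaps behave like internal gaps; only positions outside the
    // sequence count as non-gaps (assumed).
    static int thisOrNextTreatingEndGapsLikeInternalGaps(CharSequence gapped, int index) {
        if (index < 0) {
            return index;
        }
        int length = gapped.length();
        int before = countResidues(gapped, 0, Math.min(index, length));
        return index < length ? before : before + (index - length);
    }

    // Counts non-gap characters in positions [from, to).
    private static int countResidues(CharSequence s, int from, int to) {
        int count = 0;
        for (int i = from; i < to; i++) {
            if (s.charAt(i) != '-') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The gap at index 4 resolves to 'D', which has ungapped index 3.
        System.out.println(thisOrNext("-ABC--DE--f-", 4)); // prints: 3
    }
}
```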
-
getUngappedIndexOfThisOrNextResidue
public static int getUngappedIndexOfThisOrNextResidue(SequenceCharSequence sequence, int indexInGappedSequence)
Gets the ungapped index corresponding to a gapped index. If the gapped index is a gap then the ungapped index of the next non-gap is returned. This is a static version of getUngappedIndexOfThisOrNextResidue(int) that doesn't require the time and memory usage of constructing a reusable SequenceGapInformation.
- Parameters:
sequence - the sequence
indexInGappedSequence - a 0-based index in the gapped sequence
- Returns:
- the 0-based index of the first residue at or after the specified position in the ungapped sequence.
- Since:
- API 4.31 (Geneious 5.3.1)
-
getUngappedIndexOfThisOrPreviousResidue
public static int getUngappedIndexOfThisOrPreviousResidue(SequenceCharSequence sequence, int indexInGappedSequence)
Gets the ungapped index corresponding to a gapped index. If the gapped index is a gap then the ungapped index of the previous non-gap is returned. This is a static version of getUngappedIndexOfThisOrPreviousResidue(int) that doesn't require the time and memory usage of constructing a reusable SequenceGapInformation.
- Parameters:
sequence - the sequence
indexInGappedSequence - a 0-based index in the gapped sequence
- Returns:
- the 0-based index of the first residue at or before the specified position in the ungapped sequence.
- Since:
- API 4.31 (Geneious 5.3.1)
-
getGappedIndex
public int getGappedIndex(int indexInUngappedSequence)
Translates an index in the ungapped sequence to the index of the corresponding nongap character in the gappedSequence passed to the constructor.
It is permissible for indexInUngappedSequence to be < 0 or >= getUngappedSequenceLength(). In that case, it is assumed that gappedSequence is part of a larger sequence that contains no gaps beyond gappedSequence's ends. If the gapped sequence contains end gaps, the returned position may lie within the end gap region.
- Parameters:
indexInUngappedSequence - An index in the ungapped sequence, i.e. SequenceUtilities.removeGaps(gappedSequence)
- Returns:
- The position of the indexInUngappedSequence'th nongap character in gappedSequence
- See Also:
getGappedIndexTreatingEndGapsLikeInternalGaps(int)
-
getGappedIndexTreatingEndGapsLikeInternalGaps
public int getGappedIndexTreatingEndGapsLikeInternalGaps(int indexInUngappedSequence)
Translates an index in the ungapped sequence to the index of the corresponding nongap character in the gappedSequence passed to the constructor.
It is permissible for indexInUngappedSequence to be < 0 or >= getUngappedSequenceLength(). In that case, it is assumed that gappedSequence is part of a larger sequence that contains no gaps beyond gappedSequence's ends. If the gapped sequence contains end gaps, the returned position will always lie outside the end gap region.
- Parameters:
indexInUngappedSequence - an index in the ungapped sequence
- Returns:
- the index in gappedSequence corresponding to indexInUngappedSequence
- Since:
- API 4.910 (Geneious 9.1.0)
- See Also:
getGappedIndex(int), getUngappedIndexOfThisOrNextResidueTreatingEndGapsLikeInternalGaps(int), getUngappedIndexOfThisOrPreviousResidueTreatingEndGapsLikeInternalGaps(int)
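The difference between getGappedIndex(int) and getGappedIndexTreatingEndGapsLikeInternalGaps(int) only shows up for ungapped indices outside the sequence. The standalone sketch below (plain Java, hypothetical names) is one reading of the two descriptions: out-of-range indices are offset from the first or last residue in the plain variant, and from the ends of the whole gapped sequence in the other. Those offsets are assumptions, not confirmed library behaviour.

```java
// Standalone sketch in plain Java; it does not call the Geneious API.
final class GappedIndexSketch {

    static int gappedIndex(CharSequence gapped, int ungappedIndex, boolean endGapsLikeInternal) {
        int length = gapped.length();
        int leading = 0;
        while (leading < length && gapped.charAt(leading) == '-') {
            leading++;
        }
        int trailingStart = length;
        while (trailingStart > leading && gapped.charAt(trailingStart - 1) == '-') {
            trailingStart--;
        }
        if (ungappedIndex < 0) {
            // Assumed: count back from the first residue, or from index 0
            // when end gaps must be skipped entirely.
            return (endGapsLikeInternal ? 0 : leading) + ungappedIndex;
        }
        int residues = 0;
        for (int i = leading; i < trailingStart; i++) {
            if (gapped.charAt(i) != '-') {
                if (residues == ungappedIndex) {
                    return i; // the ungappedIndex'th residue
                }
                residues++;
            }
        }
        // Assumed: count forward from the start of the trailing gaps, or from
        // the end of the sequence when end gaps must be skipped entirely.
        return (endGapsLikeInternal ? length : trailingStart) + (ungappedIndex - residues);
    }

    public static void main(String[] args) {
        String gapped = "-ABC--DE--f-";
        System.out.println(gappedIndex(gapped, 3, false)); // prints: 6
        System.out.println(gappedIndex(gapped, 6, false)); // prints: 11
        System.out.println(gappedIndex(gapped, 6, true));  // prints: 12
    }
}
```

For in-range indices both variants return the same position, because a residue's location in the gapped sequence does not depend on how end gaps are classified.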
-
getUngappedSequenceLength
public int getUngappedSequenceLength()
Returns the ungapped length of the sequence passed to the constructor.
- Returns:
- the ungapped length of the sequence passed to the constructor
-
getGappedSequenceLength
public int getGappedSequenceLength()
- Returns:
- the length of the gapped sequence passed to the constructor.
-
adjustAnnotationsForGapRemoval
public java.util.List<SequenceAnnotation> adjustAnnotationsForGapRemoval(java.util.List<SequenceAnnotation> annotations)
Creates a new list of annotations by adjusting their locations to compensate for removing all the gaps in the sequence. Opposite of adjustAnnotationsForGapInsertion(java.util.List).
- Parameters:
annotations - the original list of sequence annotations
- Returns:
- the adjusted sequence annotations
- Since:
- API 4.13 (Geneious 5.0.2)
- See Also:
adjustAnnotationsForGapInsertion(java.util.List)
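To give a feel for what adjusting a location for gap removal involves, here is a standalone sketch on a single interval. It is plain Java with hypothetical names and does not use SequenceAnnotation: it works on 0-based inclusive indices inside the sequence, moves the start to the next residue and the end to the previous residue, and drops intervals that cover only gaps. Whether the library handles that last case the same way is not stated in this page, so treat it as an assumption.

```java
// Standalone sketch in plain Java; it does not call the Geneious API.
final class IntervalGapRemovalSketch {

    // Converts a 0-based inclusive interval [start, end] in the gapped
    // sequence to ungapped coordinates. Returns {newStart, newEnd}, or null
    // when the interval covers only gaps (assumed handling).
    static int[] removeGaps(CharSequence gapped, int start, int end) {
        int newStart = residuesBefore(gapped, start);   // this or next residue
        int newEnd = residuesBefore(gapped, end + 1) - 1; // this or previous residue
        return newStart <= newEnd ? new int[] {newStart, newEnd} : null;
    }

    // Counts non-gap characters in positions [0, index).
    private static int residuesBefore(CharSequence s, int index) {
        int count = 0;
        for (int i = 0; i < index && i < s.length(); i++) {
            if (s.charAt(i) != '-') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // "--DE" at gapped positions 4..7 becomes "DE" at ungapped positions 3..4.
        int[] result = removeGaps("-ABC--DE--f-", 4, 7);
        System.out.println(result[0] + ".." + result[1]); // prints: 3..4
    }
}
```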
-
adjustAnnotationsForGapInsertion
public java.util.List<SequenceAnnotation> adjustAnnotationsForGapInsertion(java.util.List<SequenceAnnotation> annotations)
Creates a new list of annotations by adjusting their locations to compensate for adding gaps into the sequence. Opposite of adjustAnnotationsForGapRemoval(java.util.List).
- Parameters:
annotations - annotations to be adjusted for gap insertion
- Returns:
- annotations adjusted for gap insertion
- Since:
- API 4.13 (Geneious 5.0.2)
- See Also:
adjustAnnotationsForGapRemoval(java.util.List)
-
-