15.2.1 Operation options

Figure 15.3: Analyze CRISPR editing results options dialog.

Reference Sequence

The reference sequence should be a short sequence spanning the CRISPR editing site, of similar length to the reads. It is normally the unedited (target) sequence against which variants are called. The reference sequence can be selected together with the reads before opening the operation, or it can be set from the operation dialog.

Workflows: The reference sequence option is not available from workflows. If this operation is included in a workflow, the reference sequence must either be provided as input to the workflow or be inserted into the workflow using the 'Add document chosen when running workflow' option.

Variants of Interest

Only the portion of each read which spans the specified region of interest will be used for variant calling. This region can be either a specified number of bases around the probable cut site (default 50 bp), the region currently selected in the sequence viewer, or the entire range covered by the reads.
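As a rough illustration of the first choice, the sketch below computes a window around a cut site and clamps it to the reference. The function name, the 0-based half-open coordinates, and the reading of the setting as a number of bases on each side of the cut site are all assumptions made for this example; the operation itself may define the window differently.

```python
def region_of_interest(cut_site: int, flank: int, ref_length: int) -> tuple[int, int]:
    """Return a half-open [start, end) window around the cut site.

    Assumption: `flank` bases are taken on EACH side of the cut site,
    and the window is clamped to the bounds of the reference sequence.
    """
    start = max(0, cut_site - flank)
    end = min(ref_length, cut_site + flank)
    return start, end


# Cut site in the middle of a 300 bp reference, default-sized flank.
middle = region_of_interest(cut_site=100, flank=50, ref_length=300)

# Cut site close to the start of a short reference: the window is clamped.
near_start = region_of_interest(cut_site=20, flank=50, ref_length=120)
```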

Reads are excluded entirely from variant calling if they match poorly at the ends of the reference sequence range that is matched by 99% of reads. See the algorithm overview for details.

Minimum Variant Frequency

The minimum variant frequency setting is used to exclude low-frequency variants from the displayed results. Note that this setting does not change the reported frequencies of variants: each frequency is still calculated as a percentage of all variants, both included and excluded.
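The effect of this setting on the reported numbers can be sketched as follows. The function name, the variant labels, and the use of "greater than or equal to" for the threshold are assumptions for illustration only. The point is that frequencies are computed over all variants before the filter is applied, so the displayed percentages need not sum to 100.

```python
def filter_variants(counts: dict[str, int], min_frequency_percent: float) -> dict[str, float]:
    """Hide low-frequency variants without renormalizing the rest.

    Frequencies are percentages of ALL reads counted, including those
    belonging to variants that end up hidden.
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    frequencies = {name: 100.0 * n / total for name, n in counts.items()}
    # Assumption: a variant exactly at the threshold is kept.
    return {name: f for name, f in frequencies.items() if f >= min_frequency_percent}


# Hypothetical read counts per variant.
counts = {"unedited": 70, "3 bp deletion": 25, "1 bp insertion": 4, "SNP": 1}
shown = filter_variants(counts, min_frequency_percent=5.0)
```

Here the two displayed variants keep their original frequencies of 70% and 25%; the remaining 5% belongs to the hidden variants.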

Translation Frame

The translation frame is used for calculating variant effects on the protein. The genetic code is obtained from the reference sequence properties, which can be set in the Info tab or Sequence View.

Sequencing Error Handling

Most of the time we can be reasonably confident whether or not a rare variant is due to sequencing error, and can either correctly collapse it into the cluster it belongs to or correctly keep it separate. The setting Collapse sequencing errors with confidence controls what to do in borderline cases.

The value is on a log scale, so a value of +10 (or -10) means reads are collapsed (or not collapsed) with 90% confidence that it is correct to do so; ±20 means 99% confidence, and ±30 means 99.9% confidence.
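The three values quoted above are consistent with a Phred-style conversion, in which the confidence is 1 - 10^(-|value|/10). The sketch below applies that formula; the function name is hypothetical, and the formula is inferred from the quoted values rather than taken from the implementation.

```python
def confidence_from_setting(value: float) -> float:
    """Convert a log-scale setting to a confidence between 0 and 1.

    Assumption: confidence = 1 - 10 ** (-|value| / 10), which reproduces
    the 90% / 99% / 99.9% figures quoted for 10, 20 and 30. The sign of
    the value only selects collapsing versus not collapsing.
    """
    return 1.0 - 10.0 ** (-abs(value) / 10.0)


for setting in (10, -10, 20, 30):
    print(f"{setting:+d}: {confidence_from_setting(setting):.1%}")
```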

Turning off this setting is equivalent to using a large positive value. For sequencing reads without Phred quality scores, each base is assumed to have a quality score of 20 (99% confidence).
