The Illumina MiSeq Integration Package v8.3.0 supports the integration of Clarity LIMS with Illumina MiSeq Sequencing Systems.
The integration allows for automated tracking of an Illumina sequencing run in Clarity LIMS. This capability includes tracking sequencing run status, generating a run report, and capturing and parsing run statistics. In addition, this integration provides automated generation of a sample sheet file for use with the MiSeq Control Software (MCS) and Local Run Manager (LRM).
The MiSeq Sequencing v3.2 workflow is compatible with MiSeq Integration Package v8.2.0 and v8.3.0.
MiSeq Integration v8.2.0 and later is required only for MiSeq Control Software (MCS) v4.0. For MCS v3.1 or earlier, do not upgrade to MiSeq Integration v8.2.0 or later, because the upgrade breaks the integration.
Prerequisites and Assumptions
Before samples are assigned to the MiSeq Sequencing v3.2 workflow, make sure that the following prerequisites are completed:
Samples have been accessioned into Clarity LIMS.
Samples have been run through QC and library prep.
Samples have been normalized, and the value is captured in a field called Normalized Molarity (nM).
For more information on sample accessioning, refer to Sample Accessioning and Upload and Modify Samples in the Clarity LIMS (Clarity & LabLink Reference Guide) documentation.
Samples can be assigned to the MiSeq Sequencing v3.2 workflow automatically using a routing script or manually from the Projects & Samples dashboard. Refer to Assign and Process Samples in the Clarity LIMS (Clarity & LabLink Reference Guide) documentation.
Workflows, Protocols, and Steps
The Illumina MiSeq Integration includes the MiSeq Sequencing v3.2 workflow, which contains a single protocol of the same name.
Step 1: Library Pooling (MiSeq v3.2)
In this step, the lab scientist manually places libraries into pools in the Clarity LIMS Placement screen.
Set Next Step - Advance Automation
This automation advances samples to the next step in the protocol. The automation is automatically triggered on exit of the Record Details screen.
A separate automation checks the maximum number of samples that is allowed in a single pool. The default maximum for Illumina MiSeq Control Software (MCS) is 1536 samples. This automation is automatically triggered on entry of the Pooling screen.
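As an illustration only, the pool size check can be sketched in Python as follows. The function name and the dictionary layout are hypothetical and are not part of the integration. Only the 1536 sample default comes from the description above.

```python
MAX_SAMPLES_PER_POOL = 1536  # default maximum stated for MCS


def check_pool_sizes(pools, max_samples=MAX_SAMPLES_PER_POOL):
    """Return one error message for every pool that exceeds max_samples.

    pools is a hypothetical mapping of pool name to a list of sample names.
    """
    errors = []
    for pool_name, samples in pools.items():
        if len(samples) > max_samples:
            errors.append(
                f"Pool '{pool_name}' contains {len(samples)} samples; "
                f"the maximum is {max_samples}."
            )
    return errors
```

An empty list means that every pool is within the limit.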
The following field is configured on the Library Pooling (MiSeq v3.2) master step. The field displays on the Record Details screen at run time.
Library Pooling (MiSeq v3.2) Master Step Field Configuration
Global Fields
The following table lists the global fields that are displayed on the Queue and Ice Bucket screens of the Library Pooling (MiSeq v3.2) step. Most fields display in expanded view only.
Global Field Configuration (Submitted Sample)
Global Field Configuration (Derived Sample)
Step 2: Denature and Dilute (MiSeq v3.2)
In this step, pooled libraries are denatured and diluted, and then placed into the reagent cartridge loaded into the MiSeq instrument.
Validate Single Input Automation
This automation checks that there is only one pooled input in the step. The automation is automatically triggered when starting the step.
Validate Run Setup and Generate MiSeq SampleSheet Automation
This automation validates the information entered on the Record Details screen, generates the sample sheet (refer to Sample Sheet Generation), and attaches the sample sheet to the Denature and Dilute step.
Select a button on the Record Details screen to trigger this automation.
Validation Rules
The Experiment Name custom field cannot exceed 40 characters.
The Experiment Name field can only contain letters, numbers, periods, nonconsecutive spaces, and the following special characters: `~!@#%-_}{.
For Resequencing analysis, if Starling is selected as the variant caller, the Export to gVCF field must be set to Yes.
If the Read 2 Cycles field is empty or 0, the following rules apply:
The Custom Primers field cannot have a selection that contains Read 2.
The Adapter Read 2 field cannot have a value.
The UMI - Read 2 Length field and the UMI - Read 2 Start From Cycle field cannot have values.
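The UMI - Read 1 Length field and the UMI - Read 1 Start From Cycle field must be set together. If one field has a value, the other field must also have a value.
The UMI - Read 2 Length field and the UMI - Read 2 Start From Cycle field must be set together. If one field has a value, the other field must also have a value.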
If the UMI - Read 2 Length field is greater than 0, the UMI - Read 1 Length field must be greater than 0.
The Adapter Read 1 and Adapter Read 2 fields can only contain ACGT+ characters. The adapter sequence cannot start or end with + or contain more than one +.
For DNA Enrichment analysis, if the Variant Caller value is Somatic, the Variant Frequency Percentage must be between 1 and 100.
For DNA Amplicon analysis, if the Variant Caller value is Somatic, the Variant Frequency Percentage field must contain a value between 0.05 and 30.
If the Aligner value is BWA or TruSeq Amplicon Aligner, an error occurs when the Workflow value is not equal to DNA Amplicon.
If the Aligner value is BWA-MEM or BWA-Backtrack Legacy, an error occurs when the Workflow value is equal to DNA Amplicon.
If the Variant Caller value is Starling or GATK, an error occurs when the Workflow value is equal to DNA Amplicon.
If the Variant Caller value is Somatic, an error occurs when the Workflow value is not equal to DNA Enrichment or DNA Amplicon.
If the Variant Caller value is Germline, an error occurs when the Workflow value is not equal to DNA Amplicon.
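The Experiment Name and adapter rules above can be sketched in Python as follows. This is an illustration, not the code that the automation runs. The function names are hypothetical, and the patterns are Python translations of the expressions in Validation Script 1, assuming that the matches call in that script requires the whole value to match.

```python
import re

# Letters, numbers, periods, single spaces, and the listed special characters.
# The negative lookahead rejects two consecutive spaces anywhere in the name.
EXPERIMENT_NAME_RE = re.compile(r"(?!.*[ ]{2})[a-zA-Z0-9\-_`.~!#@%{ }]+")

# One run of ACGT, optionally followed by a single + and a second run of ACGT.
ADAPTER_RE = re.compile(r"[ACGT]+(\+[ACGT]+)?")


def validate_experiment_name(name):
    """Return a list of error messages; an empty list means the name is valid."""
    errors = []
    if len(name) > 40:
        errors.append("Experiment Name cannot exceed 40 characters.")
    if not EXPERIMENT_NAME_RE.fullmatch(name):
        errors.append("Experiment Name contains prohibited characters.")
    return errors


def is_valid_adapter(sequence):
    """Return True if the adapter sequence satisfies the ACGT+ rule."""
    return ADAPTER_RE.fullmatch(sequence) is not None
```

For example, a name with two consecutive spaces or a dollar sign is rejected, and an adapter such as ACGT+ is rejected because it ends with +.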
The Oracle database used for automation storage has an 8000 character limit. Because of this limit, the automation splits the validation expressions and saves them to the Validation Script 1 and Validation Script 2 fields. These fields are configured as master step fields.
Validation Script 1 Command Line
if (step.::Experiment Name::.length() > 40) { fail(::Experiment Name cannot exceed 40 characters.::) }; if (!step.::Experiment Name::.matches(::^(?!.*[ ]{2})[a-zA-Z0-9-_`.~!#@%{ }]+::)) { fail(::Experiment Name contains prohibited characters. Allowed characters are letters, numbers, periods, non-consecutive spaces and the following special characters: `~!@#%-_}{::) }; if (step.::Workflow:: == ::Resequencing:: && step.::Variant Caller:: == ::Starling:: && (!step.hasValue(::Export to gVCF::) || step.::Export to gVCF:: != ::Yes::)) { fail(::Export to gVCF must be set to Yes for Starling variant caller.::) }; if (step.::Workflow:: == ::DNA Enrichment:: && step.::Variant Caller:: == ::Somatic:: && (!step.hasValue(::Variant Frequency Percentage::) || step.::Variant Frequency Percentage:: > 100 || step.::Variant Frequency Percentage:: < 1)) { fail(::In the Variant Frequency Percentage field, please enter values between 1 and 100.::) }; if (step.::Workflow:: == ::DNA Amplicon:: && step.::Variant Caller:: == ::Somatic:: && (!step.hasValue(::Variant Frequency Percentage::) || step.::Variant Frequency Percentage:: > 30 || step.::Variant Frequency Percentage:: < 0.05)) { fail(::In the Variant Frequency Percentage field, please enter values between 0.05 and 30.::) }; if (!step.hasValue(::Read 2 Cycles::) || step.::Read 2 Cycles:: == 0) { if (step.::Custom Primers::.contains(::Read 2::)) { fail(::Custom Primers setting selected is invalid and can only be used in a Paired-End run.::) }; if (step.hasValue(::Adapter Read 2::)) { fail(::Adapter Read 2 is only applicable for a Paired-End run.::) }; if (step.hasValue(::UMI - Read 2 Length::) || step.hasValue(::UMI - Read 2 Start From Cycle::)) { fail(::UMI - Read 2 Length and UMI - Read 2 Start From Cycle are only applicable for a Paired-End run.::) }; }; if (step.hasValue(::UMI - Read 1 Length::) && !step.hasValue(::UMI - Read 1 Start From Cycle::)) { fail(::UMI - Read 1 Start From Cycle must be greater than 0 if UMI - Read 1 Length is greater than 0.::) }; if (!step.hasValue(::UMI - Read 1 Length::) && step.hasValue(::UMI - Read 1 Start From Cycle::)) { fail(::UMI - Read 1 Length must be greater than 0 if UMI - Read 1 Start From Cycle is greater than 0.::) }; if (step.hasValue(::UMI - Read 2 Length::) && !step.hasValue(::UMI - Read 2 Start From Cycle::)) { fail(::UMI - Read 2 Start From Cycle must be greater than 0 if UMI - Read 2 Length is greater than 0.::) }; if (!step.hasValue(::UMI - Read 2 Length::) && step.hasValue(::UMI - Read 2 Start From Cycle::)) { fail(::UMI - Read 2 Length must be greater than 0 if UMI - Read 2 Start From Cycle is greater than 0.::) }; if (!step.hasValue(::UMI - Read 1 Length::) && step.hasValue(::UMI - Read 2 Length::)) { fail(::UMI - Read 1 Length must be greater than 0 if UMI - Read 2 Length is greater than 0.::) }; if (step.hasValue(::Adapter Read 1::) && !step.::Adapter Read 1::.matches(::^[ACGT]+(\\+[ACGT]+){0,1}$::)) { fail(::Adapter Read 1 contains prohibited characters. Allowed characters are: ACGT+ and the adapter sequence cannot start or end with + or contain more than one +.::) }; if (step.hasValue(::Adapter Read 2::) && !step.::Adapter Read 2::.matches(::^[ACGT]+(\\+[ACGT]+){0,1}$::)) { fail(::Adapter Read 2 contains prohibited characters. Allowed characters are: ACGT+ and the adapter sequence cannot start or end with + or contain more than one +.::) };
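The Read 2 and UMI checks in Validation Script 1 can be restated in Python as a reading aid. The following sketch is not the code that the automation runs. The has_value helper is a hypothetical stand-in for step.hasValue, assumed here to mean that the field is present and not empty, and the error messages are shortened.

```python
def has_value(fields, name):
    """Hypothetical stand-in for step.hasValue: present and not empty."""
    return fields.get(name) not in (None, "")


def validate_read2_and_umi(fields):
    """Return a list of error messages for the Read 2 and UMI rules."""
    errors = []

    # Single-read run: no Read 2 settings are allowed.
    if not has_value(fields, "Read 2 Cycles") or fields["Read 2 Cycles"] == 0:
        if "Read 2" in (fields.get("Custom Primers") or ""):
            errors.append("Custom Primers selection requires a Paired-End run.")
        if has_value(fields, "Adapter Read 2"):
            errors.append("Adapter Read 2 requires a Paired-End run.")
        if has_value(fields, "UMI - Read 2 Length") or has_value(
            fields, "UMI - Read 2 Start From Cycle"
        ):
            errors.append("UMI - Read 2 fields require a Paired-End run.")

    # Length and Start From Cycle must be set together for each read.
    for read in ("Read 1", "Read 2"):
        length = f"UMI - {read} Length"
        start = f"UMI - {read} Start From Cycle"
        if has_value(fields, length) != has_value(fields, start):
            errors.append(f"{length} and {start} must be set together.")

    # A Read 2 UMI requires a Read 1 UMI.
    if has_value(fields, "UMI - Read 2 Length") and not has_value(
        fields, "UMI - Read 1 Length"
    ):
        errors.append("UMI - Read 1 Length is required with UMI - Read 2 Length.")

    return errors
```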
Validation Script 2 Command Line
if ((step.::Aligner:: == ::BWA:: || step.::Aligner:: == ::TruSeq Amplicon Aligner::) && step.::Workflow:: != ::DNA Amplicon::) { fail(::Aligner field contains invalid value. Please refer to the \u005c\u0022Applicable analysis fields for the selected Workflow:\u005c\u0022 section for more information.::) }; if ((step.::Aligner:: == ::BWA-MEM:: || step.::Aligner:: == ::BWA-Backtrack Legacy::) && step.::Workflow:: == ::DNA Amplicon::) { fail(::Aligner field contains invalid value. Please refer to the \u005c\u0022Applicable analysis fields for the selected Workflow:\u005c\u0022 section for more information.::) }; if ((step.::Variant Caller:: == ::Starling:: || step.::Variant Caller:: == ::GATK::) && step.::Workflow:: == ::DNA Amplicon::) { fail(::Variant Caller field contains invalid value. Please refer to the \u005c\u0022Applicable analysis fields for the selected Workflow:\u005c\u0022 section for more information.::) }; if (step.::Variant Caller:: == ::Somatic:: && step.::Workflow:: != ::DNA Enrichment:: && step.::Workflow:: != ::DNA Amplicon::) { fail(::Variant Caller field contains invalid value. Please refer to the \u005c\u0022Applicable analysis fields for the selected Workflow:\u005c\u0022 section for more information.::) }; if (step.::Variant Caller:: == ::Germline:: && step.::Workflow:: != ::DNA Amplicon::) { fail(::Variant Caller field contains invalid value. Please refer to the \u005c\u0022Applicable analysis fields for the selected Workflow:\u005c\u0022 section for more information.::) }; step.::Indel-Realignment-Key:: = ::::; step.::Indel-Realignment-Value:: = ::::; step.::Aligner-Key:: = ::::; step.::Aligner-Value:: = ::::; step.::Variant-Caller-Value:: = ::::; if (step.::Workflow:: == ::DNA Amplicon::) { if (step.::Indel Realignment:: == ::Yes::) { step.::Indel-Realignment-Key:: = ::variantcallerrealignindels::; step.::Indel-Realignment-Value:: = ::1::; }; else if (step.::Indel Realignment:: == ::No::) { step.::Indel-Realignment-Key:: = ::variantcallerrealignindels::; step.::Indel-Realignment-Value:: = ::0::; }; if (step.::Aligner:: == ::TruSeq Amplicon Aligner::) { step.::Aligner-Key:: = ::aligner::; step.::Aligner-Value:: = ::Amplicon::; }; else if (step.::Aligner:: == ::BWA::) { step.::Aligner-Key:: = ::aligner::; step.::Aligner-Value:: = ::BWA::; }; }; if (step.::Workflow:: != ::DNA Amplicon::) { if (step.::Indel Realignment:: == ::Yes::) { step.::Indel-Realignment-Key:: = ::indelrealignment::; step.::Indel-Realignment-Value:: = ::GATK::; }; else if (step.::Indel Realignment:: == ::No::) { step.::Indel-Realignment-Key:: = ::indelrealignment::; step.::Indel-Realignment-Value:: = ::None::; }; if (step.::Aligner:: == ::BWA-Backtrack Legacy::) { step.::Aligner-Key:: = ::runbwaaln::; step.::Aligner-Value:: = ::1::; }; else if (step.::Aligner:: == ::BWA-MEM::) { step.::Aligner-Key:: = ::runbwaaln::; step.::Aligner-Value:: = ::0::; }; }; if (step.hasValue(::Variant Caller::)) { if (step.::Variant Caller:: == ::None::) { step.::Variant-Caller-Value:: = ::::; }; if (step.::Variant Caller:: == ::GATK::) { step.::Variant-Caller-Value:: = ::GATK::; }; else if (step.::Variant Caller:: == ::Starling::) { step.::Variant-Caller-Value:: = ::Starling::; }; else if (step.::Variant Caller:: == ::Germline::) { step.::Variant-Caller-Value:: = ::PiscesGermline::; }; else if (step.::Variant Caller:: == ::Somatic:: && step.::Workflow:: == ::DNA Amplicon::) { step.::Variant-Caller-Value:: = ::PiscesSomatic::; }; else if (step.::Variant Caller:: == ::Somatic:: && step.::Workflow:: != ::DNA Amplicon::) { step.::Variant-Caller-Value:: = ::Somatic::; }; }; if (step.hasValue(::Manifest::)) { step.::Manifest-Section:: = ::[Manifests]::; }; else { step.::Manifest-Section:: = ::::; }
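The second half of Validation Script 2 translates the values selected on the Record Details screen into key and value fields. The following Python sketch restates that mapping so that it is easier to read. The function name is hypothetical, the Manifest-Section assignment is omitted, and the sketch is not the code that the automation runs.

```python
def derive_settings(workflow, aligner="", indel_realignment="", variant_caller=""):
    """Mirror the key and value assignments made in Validation Script 2."""
    out = {
        "Indel-Realignment-Key": "",
        "Indel-Realignment-Value": "",
        "Aligner-Key": "",
        "Aligner-Value": "",
        "Variant-Caller-Value": "",
    }
    amplicon = workflow == "DNA Amplicon"

    # Indel Realignment uses a different key and value style per workflow.
    if indel_realignment in ("Yes", "No"):
        yes = indel_realignment == "Yes"
        if amplicon:
            out["Indel-Realignment-Key"] = "variantcallerrealignindels"
            out["Indel-Realignment-Value"] = "1" if yes else "0"
        else:
            out["Indel-Realignment-Key"] = "indelrealignment"
            out["Indel-Realignment-Value"] = "GATK" if yes else "None"

    # Aligner choices differ between DNA Amplicon and the other workflows.
    if amplicon:
        aligner_map = {
            "TruSeq Amplicon Aligner": ("aligner", "Amplicon"),
            "BWA": ("aligner", "BWA"),
        }
    else:
        aligner_map = {
            "BWA-Backtrack Legacy": ("runbwaaln", "1"),
            "BWA-MEM": ("runbwaaln", "0"),
        }
    if aligner in aligner_map:
        out["Aligner-Key"], out["Aligner-Value"] = aligner_map[aligner]

    # Somatic maps to a different value for DNA Amplicon.
    caller_map = {
        "GATK": "GATK",
        "Starling": "Starling",
        "Germline": "PiscesGermline",
        "Somatic": "PiscesSomatic" if amplicon else "Somatic",
    }
    out["Variant-Caller-Value"] = caller_map.get(variant_caller, "")
    return out
```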
Set Next Step - Advance Automation
This automation advances samples to the next step in the protocol. The automation is automatically triggered by exiting the Record Details screen.
Most fields configured on the Denature and Dilute (MiSeq v3.2) step display on the Record Details screen in the Step Data table.
These fields are manually populated at run time. The values are then used to generate the sample sheet.
Denature and Dilute (MiSeq v3.2) Master Step Field Configuration
Groups of Defaults
Resequencing
Workflow = Resequencing
Read 1 Cycles = 251
Custom Primers = None
Aligner = BWA-MEM
Annotation = None
Export to gVCF = No
Flag PCR Duplicates = Yes
Genome Folder = Required
Indel Realignment = No
Indel Repeat Filter Cutoff = None
Manifest = None
Manifest Padding = None
Picard HS Metric Reporting = None
Reverse Complement = None
Read Stitching = None
Variant Caller = GATK
Applicable analysis fields for the selected Workflow =
Aligner (BWA-Backtrack Legacy, BWA-MEM)
Export to gVCF
Flag PCR Duplicates
Genome Folder (fill in the folder path)
Indel Realignment
Variant Caller (Starling, GATK)
Library QC
Workflow = Library QC
Read 1 Cycles = 251
Custom Primers = None
Aligner = None
Annotation = None
Export to gVCF = None
Flag PCR Duplicates = Yes
Genome Folder = Required
Indel Realignment = None
Indel Repeat Filter Cutoff = None
Manifest = None
Manifest Padding = None
Picard HS Metric Reporting = None
Reverse Complement = None
Read Stitching = None
Variant Caller = None
Applicable analysis fields for the selected Workflow =
Flag PCR Duplicates
Genome Folder (fill in the folder path)
GenerateFastQ
Workflow = GenerateFastQ
Read 1 Cycles = 251
Custom Primers = None
Aligner = None
Annotation = None
Export to gVCF = None
Flag PCR Duplicates = None
Genome Folder = None
Indel Realignment = None
Indel Repeat Filter Cutoff = None
Manifest = None
Manifest Padding = None
Picard HS Metric Reporting = None
Reverse Complement = None
Read Stitching = None
Variant Caller = None
Applicable analysis fields for the selected Workflow = None
DNA Enrichment
Workflow = DNA Enrichment
Read 1 Cycles = 251
Custom Primers = None
Aligner = BWA-MEM
Annotation = None
Export to gVCF = None
Flag PCR Duplicates = Yes
Genome Folder = Required
Indel Realignment = Yes
Indel Repeat Filter Cutoff = None
Manifest = Required
Manifest Padding = 150
Picard HS Metric Reporting = No
Reverse Complement = None
Read Stitching = None
Variant Caller = Starling
Applicable analysis fields for the selected Workflow =
Aligner (BWA-Backtrack Legacy, BWA-MEM)
Flag PCR Duplicates
Genome Folder (fill in the folder path)
Indel Realignment
Indel Repeat Filter Cutoff (if Variant Caller is Somatic)
Manifest (fill in the file name)
Manifest Padding
Picard HS Metric Reporting
Variant Caller (Starling, Somatic, GATK)
Variant Frequency Percentage (if Variant Caller is Somatic)
DNA Amplicon
Workflow = DNA Amplicon
Read 1 Cycles = 251
Custom Primers = None
Aligner = BWA
Annotation = RefSeq
Export to gVCF = None
Flag PCR Duplicates = None
Genome Folder = Required
Indel Realignment = Yes
Indel Repeat Filter Cutoff = None
Manifest = Required
Manifest Padding = None
Picard HS Metric Reporting = None
Reverse Complement = None
Read Stitching = None
Variant Caller = Germline
Variant Caller Depth Filter = 10
Variant Quality Filter = 30
Applicable analysis fields for the selected Workflow =
Aligner (BWA, TruSeq Amplicon Aligner)
Annotation (RefSeq, Ensembl)
Genome Folder (fill in the folder path)
Indel Realignment
Manifest (fill in the file name)
Read stitching (if Aligner is TruSeq Amplicon Aligner)
Variant Caller (Germline, Somatic)
Variant Caller Depth Filter (10–10000)
Variant Frequency Percentage (if Variant Caller is Somatic)
Variant Quality Filter (2–1000)
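One way to picture the groups of defaults is as a lookup table keyed by workflow. The following Python sketch is illustrative only. The names PRESETS, apply_preset, and out_of_range are hypothetical, only two of the five groups are shown, fields whose default is None or Required are left out, and the ranges are assumed to be inclusive.

```python
# Defaults copied from the Resequencing and DNA Amplicon groups.
PRESETS = {
    "Resequencing": {
        "Read 1 Cycles": 251,
        "Aligner": "BWA-MEM",
        "Export to gVCF": "No",
        "Flag PCR Duplicates": "Yes",
        "Indel Realignment": "No",
        "Variant Caller": "GATK",
    },
    "DNA Amplicon": {
        "Read 1 Cycles": 251,
        "Aligner": "BWA",
        "Annotation": "RefSeq",
        "Indel Realignment": "Yes",
        "Variant Caller": "Germline",
        "Variant Caller Depth Filter": 10,
        "Variant Quality Filter": 30,
    },
}

# Ranges listed with the applicable DNA Amplicon analysis fields.
AMPLICON_RANGES = {
    "Variant Caller Depth Filter": (10, 10000),
    "Variant Quality Filter": (2, 1000),
}


def apply_preset(workflow, overrides=None):
    """Start from the group defaults, then apply values entered at run time."""
    fields = {"Workflow": workflow, **PRESETS[workflow]}
    fields.update(overrides or {})
    return fields


def out_of_range(fields):
    """Return the names of DNA Amplicon fields that fall outside their range."""
    if fields.get("Workflow") != "DNA Amplicon":
        return []
    bad = []
    for name, (low, high) in AMPLICON_RANGES.items():
        value = fields.get(name)
        if value is not None and not low <= value <= high:
            bad.append(name)
    return bad
```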
Global Fields
The following table shows the global fields that are configured to display on the Queue, Ice Bucket, and Record Details screens of the Denature and Dilute (MiSeq v3.2) step:
Global Field Configuration (Submitted Sample)
Global Field Configuration (Derived Sample)
Step File Placeholders
Placeholders for the following files are configured on the Record Details screen of the Denature and Dilute (MiSeq v3.2) step.
Lab Tracking Form
Manually uploaded
This item in Clarity LIMS allows the lab scientist to attach a lab-specific tracking form to the step manually.
MiSeq SampleSheet
Automatically attached
This CSV file is automatically generated by Clarity LIMS for use with the MiSeq system. The file can be opened as a text file or an Excel spreadsheet.
MiSeq SampleSheet Generation Log
Automatically attached
This log file is automatically generated by Clarity LIMS. The log file captures any errors that Clarity LIMS might encounter when generating the sample sheet.
Log File
Automatically attached
This log file is automatically generated by Clarity LIMS. The log file captures the status of the EvaluateDynamicExpression script that is invoked by the Set Next Step - Advance automation.
Step 3: MiSeq Run (MiSeq v3.2)
Set Next Step - Advance Automation
This automation advances samples to the next step in the protocol. The automation is automatically triggered by exiting the Record Details screen.
The following fields are configured on the MiSeq Run (MiSeq v3.2) step. These fields display on the Record Details screen at run time. The read-only fields are automatically populated at the end of the run.
MiSeq Run (MiSeq v3.2) Master Step Field Configuration
Global Fields
There are several sample and measurement global fields that are displayed on the Record Details screen of the MiSeq Run (MiSeq v3.2) step. These fields are autopopulated at the end of the sequencing run.
Sample Sheet Generation
Sample sheet generation occurs in the Denature and Dilute (MiSeq v3.2) step. This step places samples on the container loaded in the system.
The default configuration provides only the Validate Run Setup and Generate MiSeq SampleSheet automation. This automation uses the Template File Generator (DriverFileGenerator.jar) and a template file to generate a CSV format file for use with the MiSeq Control Software (MCS).
The sample sheet content is determined by the fields on the Record Details screen of the step in the Step Data table. The values entered into these fields are used to populate the sample sheet.
To customize the template used to create the sample sheet, you can insert additional columns.
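To show the general idea of turning step field values into a CSV sample sheet, the following Python sketch writes a minimal sheet with the standard csv module. It does not reproduce the template that is installed with the integration or the behavior of DriverFileGenerator.jar. The function name, the section layout, and the data column names are assumptions made for the example.

```python
import csv
import io


def build_sample_sheet(step_fields, samples,
                       data_columns=("Sample_ID", "Sample_Name", "index")):
    """Return CSV text built from step field values and per-sample rows.

    data_columns is a hypothetical hook: passing a longer tuple adds columns.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")

    writer.writerow(["[Header]"])
    for name in ("Experiment Name", "Workflow"):
        writer.writerow([name, step_fields[name]])
    writer.writerow([])

    writer.writerow(["[Reads]"])
    writer.writerow([step_fields["Read 1 Cycles"]])
    if step_fields.get("Read 2 Cycles"):  # omitted for single-read runs
        writer.writerow([step_fields["Read 2 Cycles"]])
    writer.writerow([])

    writer.writerow(["[Data]"])
    writer.writerow(data_columns)
    for sample in samples:
        writer.writerow([sample.get(column, "") for column in data_columns])

    return buffer.getvalue()
```

In this sketch, an extra column is added by extending data_columns and supplying the matching key in each sample dictionary.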
The MiSeq Run (MiSeq v3.2) step records information for the flow cell lane and generates a report summarizing the results. In addition, run parameters, run info, and a link to the run folder are automatically captured.
Generated and Captured Files
The following table describes the run information files, reports, placeholders, and links that Clarity LIMS automatically generates or captures during a sequencing run:
Run Information Generated or Captured by MiSeq Run (MiSeq v3.2) Step
Metadata
The following list shows metadata that Clarity LIMS automatically captures from the Illumina sequencing software as part of a sequencing run. This information is gathered from various run result files and events.
Chemistry
Experiment Name (entered in software)
Finish Date (run completion date)
If the End Run event contains a date in the format YYYY-MM-DD, Finish Date is set to the date in the event file.
If the End Run event does not contain a date or the date is in the wrong format, Finish Date is set to the date when the event file is processed.
Flow Cell ID
Flow Cell Version
Index 1 Read Cycles (intended Index cycles)
Index 2 Read Cycles (intended Index cycles)
Output Folder (run folder root)
PR2 Bottle ID
Reagent Cartridge ID
Reagent Cartridge Part #
Read 1 Cycles
Read 2 Cycles
Run ID (the unique run ID)
Run Type
Status (current status of the sequencing run on the instrument)
Workflow
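The Finish Date rule in the list above can be sketched in Python as follows. The function name and arguments are hypothetical, and the way that the sequencing service reads the date from the End Run event is not shown.

```python
import re
from datetime import date, datetime

ISO_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def resolve_finish_date(event_date, processed_on):
    """Use the event date if it is a valid YYYY-MM-DD date.

    Otherwise fall back to the date on which the event file is processed.
    """
    if event_date and ISO_DATE_RE.fullmatch(event_date.strip()):
        try:
            return datetime.strptime(event_date.strip(), "%Y-%m-%d").date()
        except ValueError:
            pass  # right shape but not a real calendar date
    return processed_on
```

For example, with a processing date of 20 March 2024, an event date of 2024-03-15 is kept, while a missing date or a value such as 15/03/2024 falls back to the processing date.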
Primary Analysis Metrics
The following table lists the real-time analysis (RTA) primary analysis metrics that Clarity LIMS automatically captures and records per read for samples in each flow cell lane. These metrics are captured upon run completion and are stored as fields in the Sample Details table on the Record Details screen.
To see both per read and per lane metrics, expand the output.
RTA Primary Analysis Metrics Captured by MiSeq Run (MiSeq v3.2) Step
How It Works
The sequencing service runs on the Clarity LIMS server. The service detects event files that the instrument RTA produces as the run progresses. The event files let the service know where to find the run data.
As the run data are written out and the End Run event is detected, the following events occur:
The data are matched to the step based on the reagent cartridge ID that was entered or scanned on the Denature and Dilute (MiSeq v3.2) step.
Read-only field values on the Record Details screen are populated accordingly.
When the service has finished processing the End Run event and updating the fields in Clarity LIMS, the sequencing service generates the report and attaches it to the step.
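The matching of run data to a step by reagent cartridge ID can be pictured with the following Python sketch. The function name and the step dictionaries are hypothetical and do not represent the sequencing service internals. The sketch also shows why the reagent cartridge ID must be unique: with zero or several candidates, there is no single step to update.

```python
def find_step_for_run(cartridge_id, steps):
    """Return the one step whose container name equals the cartridge ID.

    steps is a hypothetical list of dictionaries with a container_name key.
    """
    matches = [step for step in steps if step["container_name"] == cartridge_id]
    if not matches:
        raise LookupError(f"No step found for reagent cartridge {cartridge_id}.")
    if len(matches) > 1:
        raise LookupError(f"Reagent cartridge {cartridge_id} is not unique.")
    return matches[0]
```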
Scripts and Files Installed
This integration requires components installed with the Illumina Preset Protocols (IPP).
The Illumina MiSeq Integration Package v8.3.0 RPM installs the scripts and files listed in the following table.
Reagent categories or label groups are installed with the IPP workflow configuration slices.
The MiSeq Reagent Kit is included in the Illumina MiSeq Integration.
Control Types
The PhiX v3 control type is included in the Illumina MiSeq Integration.
Container Types
The following container types are included in the Illumina MiSeq Integration:
MiSeq Reagent Cartridge
96 well plate
Tube
All one-dimensional container types with both numeric rows and numeric columns are supported.
Instrument Integration
To make sure that the Illumina instrument warranty remains valid, the instrument integration must be performed and maintained by the Clarity LIMS Support team. To perform this integration, the Support team requires remote access to the system while it is idle.
The following steps are performed by the Clarity LIMS Support team when configuring the sequencing instrument for use with the Illumina MiSeq Integration.
Create a directory on the local computer to hold the batch files. These batch files write event files to the network attached storage (NAS) shares.
Create a directory on the NAS to hold the event files.
Modify Illumina software configuration files to call the batch files that create the event files.
Update sequencing service default properties to match the specifics of the installation.
Rules and Constraints
The Illumina MiSeq Integration operates with the following constraints:
The reagent cartridge ID must be unique. There should not be multiple reagent cartridge containers in the system with the same name.
The reagent cartridge ID must be scanned as the reagent cartridge Container Name on the Denature and Dilute (MiSeq v3.2) step.