Validating Illumina TruSeq Index Adapter Combinations

When pooling samples, there are often numerous complex rules and restrictions regarding which combinations of adapters are acceptable.

Clarity LIMS lets you automate the verification of your pools by applying custom business logic.

This example shows how to confirm the composition of pools before they are created, allowing the lab scientist to alter the composition of pools that have caused an error.

Solution

In this example, we will enforce the following Illumina TruSeq DNA LT adapter tube pooling guidelines:

Process

The example script is configured to run on the Library Pooling (Illumina SBS) 4.0 process.

Parameters

The EPP command is configured to pass the following parameters to the script:

-i

The URI of the step that launches the script (Required)

The {stepURI:v2:http} token - in the form http://<Hostname>/api/v2/steps/<ProtocolStepLimsid>

-u

The username of the current user (Required)

The {username} token

-p

The password of the current user (Required)

The {password} token

An example of the full syntax to invoke the script is as follows:

bash -c "/opt/gls/groovy/current/bin/groovy -cp /opt/groovy/lib /opt/gls/clarity/customextensions/ConfirmationOfPoolComposition.groovy -i {stepURI:v2:http} -u {username} -p {password}"

NOTE: The location of Groovy on your server may be different from the one shown in this example. If this is the case, modify the command accordingly.
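The following is a minimal sketch of how a script might read the three parameters described above. It is not the parsing code of the attached script: `parseFlags` is a hypothetical helper, and the sample values are placeholders for the tokens that Clarity LIMS substitutes at run time.

```groovy
// Hypothetical helper: collects "-flag value" pairs into a map and
// fails fast when a required flag is missing.
Map<String, String> parseFlags(List<String> args, List<String> required) {
    Map<String, String> flags = [:]
    for (int i = 0; i + 1 < args.size(); i += 2) {
        flags[args[i]] = args[i + 1]
    }
    List<String> missing = required.findAll { !flags.containsKey(it) }
    if (missing) {
        throw new IllegalArgumentException("Missing required parameter(s): ${missing.join(', ')}")
    }
    return flags
}

// Placeholder values standing in for {stepURI:v2:http}, {username} and {password}
List<String> sampleArgs = ['-i', 'http://localhost/api/v2/steps/24-1234', '-u', 'apiuser', '-p', 'secret']
Map<String, String> parsed = parseFlags(sampleArgs, ['-i', '-u', '-p'])

String stepURI = parsed['-i']
String username = parsed['-u']
String password = parsed['-p']
println "Validating pools for step ${stepURI} as ${username}"
```

Because all three parameters are marked as required, the sketch rejects a call that omits any of them instead of continuing with a null value.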

User Interaction

Assuming samples have been worked through the protocol and have reached the Library Pooling (Illumina SBS) 4.0 protocol step, the user pools the samples following the specified guidelines.
When the pools are created, the user attempts to proceed to the next page.
A message box displays alerting the user that a custom program is executing.
On successful completion, a success message displays.

About the Code

Once the script has processed the input and ensured that all the required information is available, we process the pools to determine if they meet the required specifications.

The first challenge is to represent the adapter combinations in the script. We do this with a map of adapter names keyed by their respective numbers; for example, AD001 is stored under key 1.

Next, we define the three combination groups: 2 plex, 3 plex, and 4 plex.

This is achieved by creating a List of Lists, with the inner lists representing our combinations.

To facilitate the required fallback through lower plexity combinations, we store the combination groups in a list, in ascending order of plexity.

{% code overflow="wrap" %}

// Reagent Adapters
public static final def ADAPTERS = [
            (1): 'AD001 (ATCACG)', (2): 'AD002 (CGATGT)', (3): 'AD003 (TTAGGC)',
            (4): 'AD004 (TGACCA)', (5): 'AD005 (ACAGTG)', (6): 'AD006 (GCCAAT)',
            (7): 'AD007 (CAGATC)', (8): 'AD008 (ACTTGA)', (9): 'AD009 (GATCAG)'...]
// Pooling Combinations
public static final def TWO_PLEX_PATTERNS = [
        [ADAPTERS[6], ADAPTERS[12]], // Combination 1
        [ADAPTERS[5], ADAPTERS[19]]  // Combination 2
]
....
public static final def PATTERNS = [TWO_PLEX_PATTERNS, THREE_PLEX_PATTERNS, FOUR_PLEX_PATTERNS]

{% endcode %}
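To make the indexing explicit, here is a small self-contained sketch of the same nested structure. It uses only adapters that appear in the excerpt above, and the combinations are illustrative placeholders, not the Illumina pooling combinations.

```groovy
// Subset of the adapter map shown above
final Map ADAPTERS = [
    (1): 'AD001 (ATCACG)', (2): 'AD002 (CGATGT)', (3): 'AD003 (TTAGGC)',
    (4): 'AD004 (TGACCA)', (5): 'AD005 (ACAGTG)', (6): 'AD006 (GCCAAT)'
]

// Illustrative patterns only; these are NOT the Illumina combinations
final List TWO_PLEX_PATTERNS = [[ADAPTERS[1], ADAPTERS[2]]]
final List THREE_PLEX_PATTERNS = [[ADAPTERS[1], ADAPTERS[2], ADAPTERS[3]]]
final List FOUR_PLEX_PATTERNS = [[ADAPTERS[1], ADAPTERS[2], ADAPTERS[3], ADAPTERS[4]]]

// Ascending plexity: index 0 holds 2-plex, index 1 holds 3-plex, index 2 holds 4-plex
final List PATTERNS = [TWO_PLEX_PATTERNS, THREE_PLEX_PATTERNS, FOUR_PLEX_PATTERNS]

PATTERNS.eachWithIndex { group, index ->
    println "PATTERNS[${index}] holds ${group.size()} combination(s) of ${group[0].size()} adapters"
}
```

With this ordering, moving to a lower plexity group is a matter of decrementing the list index.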

Once the combinations are defined, we need a method that compares the actual combination of adapters in a pool with our ideal combinations. There are two cases to handle:
- Comparing a pool with a combination of equal plexity.
- Comparing a higher plexity pool with a lower plexity combination.

To handle the first case, we create a function that takes the actual combination and the ideal combination.

If the actual combination contains every adapter in the ideal combination, we remove those adapters. We then ensure that none of the leftover reagent labels is an Illumina TruSeq DNA LT adapter.

Boolean noWildcardsMatch(List disposableList, def combination) {
    if(disposableList.containsAll(combination)) {
        disposableList.removeAll(combination)
         
        // Ensure that there are no other reagent-labels in the pool that may conflict with the pattern
        return !disposableList.find { leftoverReagent -> ADAPTERS.containsValue(leftoverReagent) }
    }
    return false
}
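The behavior of the exact match can be illustrated with a few sample pools. This sketch wraps the function in a hypothetical `ExactMatchDemo` class with a subset of the adapter map so that it runs on its own; the combination used is illustrative, not one of the Illumina patterns.

```groovy
class ExactMatchDemo {
    // Subset of the adapter map from the excerpt above
    static final Map ADAPTERS = [
        (5): 'AD005 (ACAGTG)', (6): 'AD006 (GCCAAT)', (7): 'AD007 (CAGATC)'
    ]

    // Same logic as noWildcardsMatch above
    static Boolean noWildcardsMatch(List disposableList, def combination) {
        if (disposableList.containsAll(combination)) {
            disposableList.removeAll(combination)
            return !disposableList.find { leftoverReagent -> ADAPTERS.containsValue(leftoverReagent) }
        }
        return false
    }
}

// Illustrative combination; not one of the Illumina patterns
def combination = [ExactMatchDemo.ADAPTERS[5], ExactMatchDemo.ADAPTERS[6]]

// Both adapters are present and nothing is left over
def exact = ExactMatchDemo.noWildcardsMatch(['AD005 (ACAGTG)', 'AD006 (GCCAAT)'], combination)

// A leftover label that is not a TruSeq adapter does not conflict with the pattern
def withOtherLabel = ExactMatchDemo.noWildcardsMatch(['AD005 (ACAGTG)', 'AD006 (GCCAAT)', 'Custom label'], combination)

// A leftover TruSeq adapter does conflict with the pattern
def withExtraAdapter = ExactMatchDemo.noWildcardsMatch(['AD005 (ACAGTG)', 'AD006 (GCCAAT)', 'AD007 (CAGATC)'], combination)

// The function consumes the list it is given, so pass a copy to keep the original intact
def pool = ['AD005 (ACAGTG)', 'AD006 (GCCAAT)']
ExactMatchDemo.noWildcardsMatch(new ArrayList(pool), combination)

println "exact=${exact}, withOtherLabel=${withOtherLabel}, withExtraAdapter=${withExtraAdapter}"
```

The parameter name `disposableList` is a reminder that the list is modified in place, which is why the last call passes a copy.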

The second case is similar to the first.

We create a function that takes the actual combination, the ideal combination, and the number of wildcards. A wildcard represents an 'any adapter' condition in Illumina's TruSeq DNA LT adapter tube pooling guidelines.

As in the first case, we ensure that the actual list contains the entire ideal combination.

After removing the ideal adapters, we ensure that the number of leftover Illumina TruSeq DNA LT adapters equals the number of wildcards.

Boolean wildCardMatch(List disposableList, def combination, int wildcards) {
    // If the reagents contain the entire combination
    if(disposableList.containsAll(combination)) {
        disposableList.removeAll(combination)
         
        // Return true only if the number of leftover adapters equals the number of wildcards
        return (disposableList.findAll { leftoverReagent -> ADAPTERS.containsValue(leftoverReagent) }.size() == wildcards)
    }
    return false
}
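As a rough illustration of the wildcard rule, the sketch below runs the same logic against three sample pools. The `WildcardDemo` class, the adapter subset, and the two-adapter combination are assumptions made for the example, not the Illumina patterns.

```groovy
class WildcardDemo {
    // Subset of the adapter map from the excerpt above
    static final Map ADAPTERS = [
        (5): 'AD005 (ACAGTG)', (6): 'AD006 (GCCAAT)',
        (7): 'AD007 (CAGATC)', (8): 'AD008 (ACTTGA)'
    ]

    // Same logic as wildCardMatch above
    static Boolean wildCardMatch(List disposableList, def combination, int wildcards) {
        if (disposableList.containsAll(combination)) {
            disposableList.removeAll(combination)
            return disposableList.findAll { ADAPTERS.containsValue(it) }.size() == wildcards
        }
        return false
    }
}

// Illustrative 2-plex combination; not one of the Illumina patterns
def twoPlex = [WildcardDemo.ADAPTERS[5], WildcardDemo.ADAPTERS[6]]

// The combination plus exactly one other adapter satisfies one wildcard
def oneWildcard = WildcardDemo.wildCardMatch(
        ['AD005 (ACAGTG)', 'AD006 (GCCAAT)', 'AD007 (CAGATC)'], twoPlex, 1)

// Two leftover adapters do not satisfy a single wildcard
def tooManyLeftovers = WildcardDemo.wildCardMatch(
        ['AD005 (ACAGTG)', 'AD006 (GCCAAT)', 'AD007 (CAGATC)', 'AD008 (ACTTGA)'], twoPlex, 1)

// AD006 is absent, so the combination itself is incomplete
def incomplete = WildcardDemo.wildCardMatch(
        ['AD005 (ACAGTG)', 'AD007 (CAGATC)', 'AD008 (ACTTGA)'], twoPlex, 1)

println "oneWildcard=${oneWildcard}, tooManyLeftovers=${tooManyLeftovers}, incomplete=${incomplete}"
```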

To represent the adapter combination fallbacks, we need a method that attempts to match the highest possible plexity for a given list of adapters. If that fails, it attempts to match a lower plexity combination with a wildcard.

To achieve this, we define a recursive function that handles both the exact and wildcard cases. The patternIndex argument selects which plexity group of combinations is used.

If no wildcards are present, we check each combination in the designated plexity group using the noWildcardsMatch function.

If no match is found, the function calls itself again, increasing the number of wildcards by 1 and reducing the plexity of the combinations by 1. It then compares the adapter list using the wildCardMatch function. The recursion stops when a match is found or when the lowest plexity group has been tried, and the result is returned.

Boolean match(List reagents, int patternIndex, int wildcards = 0) {
    Boolean matches = false
    // With no wildcards, look for an exact match; otherwise allow the given number of wildcards
    if(wildcards == 0) {
        // For each combination, check for a match
        PATTERNS[patternIndex].each {
            // Pass a copy: noWildcardsMatch removes elements from the list it receives
            if(noWildcardsMatch(new ArrayList(reagents), it)) {
                matches = true
            }
        }
    } else {
        PATTERNS[patternIndex].each {
            // Pass a copy: wildCardMatch removes elements from the list it receives
            if(wildCardMatch(new ArrayList(reagents), it, wildcards)) {
                matches = true
            }
        }
    }
    // If there was no match, retry against the next lower plexity group with one more wildcard
    if(!matches && patternIndex != 0) {
        matches = match(reagents, patternIndex - 1, wildcards + 1)
    }
    return matches
}
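The fallback can be traced with a compact, self-contained restatement of the recursion. This is a sketch under stated assumptions: the `FallbackDemo` class and the `leftoverAdapters` helper are hypothetical, the patterns are illustrative rather than the Illumina combinations, and the two comparison functions are folded into one because an exact match is the same test with zero wildcards.

```groovy
class FallbackDemo {
    // Subset of the adapter map from the excerpt above
    static final Map ADAPTERS = [
        (1): 'AD001 (ATCACG)', (2): 'AD002 (CGATGT)', (3): 'AD003 (TTAGGC)',
        (4): 'AD004 (TGACCA)', (5): 'AD005 (ACAGTG)'
    ]

    // Illustrative patterns only; these are NOT the Illumina combinations
    static final List PATTERNS = [
        [[ADAPTERS[1], ADAPTERS[2]]],               // index 0: 2-plex group
        [[ADAPTERS[1], ADAPTERS[3], ADAPTERS[4]]]   // index 1: 3-plex group
    ]

    // Hypothetical helper: number of adapters left after removing the combination,
    // or -1 when the combination is not fully present. Works on a copy of the list.
    static int leftoverAdapters(List reagents, List combination) {
        if (!reagents.containsAll(combination)) return -1
        List leftover = new ArrayList(reagents)
        leftover.removeAll(combination)
        return leftover.count { ADAPTERS.containsValue(it) }
    }

    static boolean match(List reagents, int patternIndex, int wildcards = 0) {
        boolean matches = PATTERNS[patternIndex].any { leftoverAdapters(reagents, it) == wildcards }
        // No match: drop to the next lower plexity group and allow one more wildcard
        if (!matches && patternIndex != 0) {
            matches = match(reagents, patternIndex - 1, wildcards + 1)
        }
        return matches
    }
}

def exactPool = ['AD001 (ATCACG)', 'AD003 (TTAGGC)', 'AD004 (TGACCA)']
def fallbackPool = ['AD001 (ATCACG)', 'AD002 (CGATGT)', 'AD005 (ACAGTG)']
def invalidPool = ['AD002 (CGATGT)', 'AD003 (TTAGGC)', 'AD005 (ACAGTG)']

// Start at the 3-plex group (index 1) because each pool holds three adapters
boolean exactResult = FallbackDemo.match(exactPool, 1)
boolean fallbackResult = FallbackDemo.match(fallbackPool, 1)
boolean invalidResult = FallbackDemo.match(invalidPool, 1)

println "exact=${exactResult}, fallback=${fallbackResult}, invalid=${invalidResult}"
```

The first pool matches the 3-plex pattern directly. The second fails at the 3-plex level but matches the 2-plex pattern with AD005 filling the single wildcard. The third contains no complete pattern at any level.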

Now, with our supporting functions defined, we can start processing our pools.

First, we retrieve the definitions of the pools from the API. This node contains a list of the output pools, along with the inputs that each pool contains.
Using this list, we create a map that stores the URIs of the output pools and the number of inputs to each pool.

We then retrieve the output pools using a batchGET.

// Retrieve the pooling information
Node pooling = GLSRestApiUtils.httpGET(stepURI + '/pools', username, password)
         
// Collect all of the unique potential output pools and their number of input samples
Map poolURIs = [:]
pooling.'pooled-inputs'.'pool'.each {
    String poolURI = it.@'output-uri'
    poolURIs[poolURI] = it.'input'.size()
}
 
// Retrieve the pool artifacts
def poolNodes = GLSRestApiUtils.batchGET(poolURIs.keySet(), username, password)
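To try the map-building step without a server, the sketch below builds an in-memory stand-in for the pooling node and runs the same traversal over it. The element and attribute names are taken from the GPath expressions above; the URIs, pool names, and the `addPool` helper are placeholders, and a real response contains more elements than shown here.

```groovy
// In-memory stand-in for the node returned by the /pools request
Node pooling = new Node(null, 'pools')
Node pooledInputs = new Node(pooling, 'pooled-inputs')

// Hypothetical helper: appends one pool element with the given number of input children
def addPool = { String name, String outputUri, int inputs ->
    Node pool = new Node(pooledInputs, 'pool', [name: name, 'output-uri': outputUri])
    inputs.times { index ->
        new Node(pool, 'input', [uri: "http://localhost/api/v2/artifacts/input-${index}".toString()])
    }
}

addPool('Pool 1', 'http://localhost/api/v2/artifacts/2-101', 2)
addPool('Pool 2', 'http://localhost/api/v2/artifacts/2-102', 3)

// Same traversal as above: output pool URI -> number of inputs
Map poolURIs = [:]
pooling.'pooled-inputs'.'pool'.each {
    String poolURI = it.@'output-uri'
    poolURIs[poolURI] = it.'input'.size()
}

poolURIs.each { uri, inputCount -> println "${uri} has ${inputCount} input(s)" }
```

The input count stored for each pool is what the script later passes to verifyReagents alongside the pool artifact.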

Once we have the pools, we iterate through the list.

If a pool is valid, we increment a counter which will be used in our success message.
If invalid, we set the script outcome to failure, and append to the failure message.

The script continues searching for other issues and adding their information to the failure message.

// Verify that each pool contains acceptable reagent labels
Boolean failure = false
String errorMessage = ''
poolNodes.each {
    Boolean accepted = verifyReagents(it, poolURIs[GLSRestApiUtils.stripQuery(it.@uri)])
    if(accepted) {
        validatedPools++
    } else {
        failure = true
        errorMessage = errorMessage + "${it.'name'[0].text()} has an invalid combination of Illumina Adapters.${LINE_TERMINATOR}"
    }
}

After each pool has been checked, we determine how to alert the user of the script's completion.

If a pool is invalid, an error is thrown containing the list of failures and a recommendation to review the Illumina pooling guidelines.

If all pools are valid, we alert the user of a success.

{% code overflow="wrap" %}

// If a pool failed validation, report the message to the user and halt their progress
if(failure) {
    throw new Exception(errorMessage + 'Please consult the Illumina TruSeq adapter pooling guidelines.')
}
 
// Define the success message to the user
outputMessage = "Script has completed successfully.${LINE_TERMINATOR}All ${validatedPools} pools passed Illumina low-plexity pooling guidelines."

{% endcode %}
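The collect-then-report pattern described above can be sketched in isolation. The per-pool results here are hypothetical stand-ins for the verifyReagents outcome, `LINE_TERMINATOR` is assumed to be a plain newline, and the exception is caught only so that the example can print the message; the excerpt above lets it propagate to halt the user's progress.

```groovy
// Assumption: a plain newline stands in for the script's LINE_TERMINATOR
final String LINE_TERMINATOR = '\n'

// Hypothetical per-pool results: pool name -> whether the pool was accepted
Map<String, Boolean> results = ['Pool 1': true, 'Pool 2': false, 'Pool 3': false]

int validatedPools = 0
Boolean failure = false
String errorMessage = ''

// Keep checking after the first failure so that every invalid pool is reported
results.each { name, accepted ->
    if (accepted) {
        validatedPools++
    } else {
        failure = true
        errorMessage += "${name} has an invalid combination of Illumina Adapters.${LINE_TERMINATOR}"
    }
}

String outcome
try {
    if (failure) {
        throw new Exception(errorMessage + 'Please consult the Illumina TruSeq adapter pooling guidelines.')
    }
    outcome = "Script has completed successfully.${LINE_TERMINATOR}All ${validatedPools} pools passed."
} catch (Exception e) {
    // Caught here for demonstration only
    outcome = e.message
}
println outcome
```

Accumulating the messages before throwing means the user sees every invalid pool in one pass, rather than fixing one pool and rerunning to discover the next.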

Assumptions and Notes

Your configuration conforms with the script's requirements, as documented in Solution.
You are running a version of Groovy that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.
The attached Groovy file is placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.
GLSRestApiUtils.groovy is placed in your Groovy lib folder.
You have imported the attached Reagent XML file into your system using the Config Slicer tool.
The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

ConfirmationOfPoolComposition.groovy (10KB)

Single Indexing ReagentTypes.xml (6KB)
