Parsing Metadata into UDFs (BCL Conversion and Demultiplexing)

This example provides a script that parses the lanebarcode.html files produced by demultiplexing. The script is written to work with the out-of-the-box Bcl Conversion & Demultiplexing (HiSeq 3000/4000) protocol.

  • Result values are associated with a barcode sequence as well as lane.

  • Values are attached to the result file output in Clarity LIMS whose barcode sequence (the index on the derived sample input) and lane (the container placement of the derived sample input) match those in the report.

  • Script modifications may be needed so that the index format in Clarity LIMS matches the index format in the HTML result file.
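The lane and index matching described above could be sketched as follows. This is a minimal illustration, not the attached script's code: it assumes the index in Clarity LIMS is a reagent label such as "N701 (TAAGGCGA)", that the HTML report carries only the bare sequence (with dual indexes joined by "+"), and that the container placement has the form "lane:column". The function names are hypothetical.

```python
import re


def index_sequence(reagent_label):
    """Return the bare index sequence from a reagent label.

    Assumption: labels look like 'N701 (TAAGGCGA)' or
    'N701-S502 (TAAGGCGA-CTCTCTAT)'. A label without parentheses is
    treated as already being a bare sequence.
    """
    label = reagent_label.strip()
    match = re.search(r"\(([ACGTN+-]+)\)\s*$", label, re.IGNORECASE)
    sequence = match.group(1) if match else label
    # Assumption: the report joins dual indexes with '+', so normalise '-'.
    return sequence.upper().replace("-", "+")


def lane_from_placement(placement):
    """Return the lane from a placement string such as '3:1' (assumed format)."""
    return placement.split(":")[0]


def match_key(reagent_label, placement):
    """Build the (lane, barcode) key used to look up a row from the report."""
    return (lane_from_placement(placement), index_sequence(reagent_label))
```

With report rows stored in a dictionary keyed the same way, each result file output can be looked up with match_key(label, placement). If your index labels follow a different pattern, the regular expression is the part to adjust.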

Parameters

The script accepts the following parameters:

-u

The username of the current user (Required)

-p

The password of the current user (Required)

-o

The LIMS ID of the result file artifact with the attached lanebarcode.html file (Required)

-s

The LIMS IDs of the individual result files. (Required)

An example of the full syntax to invoke the script is as follows:

bash -l -c "/usr/bin/python /opt/gls/clarity/customextensions/demux_stats_parser.py -s {stepURI:v2} -o {compoundOutputFileLuid0} -u {username} -p {password}" 
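For orientation, the four flags could be read with Python's standard argparse module as in the sketch below. How the attached script actually parses its arguments is not shown in this document, so the function name and the attribute names (username, password, output_limsid, step) are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the four required flags described in the Parameters section."""
    parser = argparse.ArgumentParser(
        description="Parse demultiplexing statistics into Clarity LIMS fields"
    )
    parser.add_argument("-u", dest="username", required=True,
                        help="username of the current user")
    parser.add_argument("-p", dest="password", required=True,
                        help="password of the current user")
    parser.add_argument("-o", dest="output_limsid", required=True,
                        help="LIMS ID of the result file artifact with the attached report")
    parser.add_argument("-s", dest="step", required=True,
                        help="value supplied for -s (the example invocation passes the step URI token)")
    # argv=None makes argparse read sys.argv[1:], as in a normal script run.
    return parser.parse_args(argv)
```

Because every flag is marked required, argparse exits with a usage message if one is missing, which surfaces configuration mistakes in the automation command line early.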

Configuration

Defining the UDFs / Custom Fields

All user defined fields (UDFs) / custom fields must first be defined in the script. Within the UDF / custom field dictionary, the name of the field as it appears in Clarity LIMS (the key) must be associated with the field from the result file (the value).

The fields should be preconfigured in Clarity LIMS for result file outputs.

udfs_in_clarity = {"Yield PF (Gb)":"Yield (Mbases)",
    "%PF":"% PF Clusters",
    "% One Mismatch Reads (Index)":"% One mismatch barcode",
    "% Bases >=Q30":"% >= Q30 bases",
    "Ave Q Score":"Mean Quality Score",
    "% Perfect Index Read":"% Perfect barcode",
    "# Reads":"PF Clusters",
    "% of Raw Clusters Per Lane":"% of the lane"}
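To show how such a dictionary might be used, the sketch below translates one parsed report row into Clarity LIMS field names. It assumes each row has already been read into a dictionary of column header to cell text; the row_to_udfs function and the shortened EXAMPLE_MAPPING are hypothetical and are not taken from the attached script.

```python
# Shortened copy of the mapping, for illustration only:
# Clarity LIMS field name (key) -> column header in the report (value).
EXAMPLE_MAPPING = {
    "Yield PF (Gb)": "Yield (Mbases)",
    "%PF": "% PF Clusters",
    "# Reads": "PF Clusters",
}


def row_to_udfs(row, mapping=EXAMPLE_MAPPING):
    """Translate one report row (header -> text) into field name -> text.

    Columns that are absent from the row are skipped, and columns that
    are not listed in the mapping are ignored.
    """
    udfs = {}
    for clarity_udf, report_column in mapping.items():
        if report_column in row:
            udfs[clarity_udf] = row[report_column]
    return udfs
```

Values are left as the raw text from the report here, so any unit conversion or number parsing is a separate step.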

Modifying individual UDFs / Custom Fields

The UDF / custom field values can be modified before being brought into Clarity LIMS. In the following example, the yield is converted from megabases to gigabases.

if clarity_udf == 'Yield PF (Gb)':
    # The report gives the yield in Mbases with thousands separators, e.g. "1,234"
    yieldmb = udf_value
    yieldmb = yieldmb.replace(",","")
    # 1 Gb = 1000 Mb
    yieldgb = float(yieldmb)*0.001
    udf_value = yieldgb

Checking for matching flow cell ID

The script currently checks the flow cell ID for the projects in Clarity LIMS against the flow cell ID in the result file.

NOTE: If the flow cell IDs do not match, the script still completes and attaches the UDF / custom field values. You may wish to modify the script so that it does not attach the field values when the flow cell ID does not match.
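One way to make that modification is sketched below, under the assumption that both flow cell IDs are available as plain strings at the point where the field values are about to be written. The function names are hypothetical, and the case-insensitive, whitespace-tolerant comparison is a design choice of this sketch rather than the attached script's behavior.

```python
def flow_cell_matches(lims_flow_cell_id, report_flow_cell_id):
    """Compare two flow cell IDs, ignoring case and surrounding whitespace."""
    return lims_flow_cell_id.strip().upper() == report_flow_cell_id.strip().upper()


def udfs_to_attach(udfs, lims_flow_cell_id, report_flow_cell_id):
    """Return the field values to write, or an empty dict on a mismatch.

    Returning an empty dictionary lets the calling code skip the update
    without needing a separate error path.
    """
    if flow_cell_matches(lims_flow_cell_id, report_flow_cell_id):
        return dict(udfs)
    return {}
```

A stricter variant could raise an exception instead, so that the step fails visibly rather than completing with no values attached.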

Assumptions and Notes

  • Your configuration conforms to the script's requirements, as documented in the Configuration section of this document.

  • You are running a version of Python that is supported by Clarity LIMS, as documented in the Clarity LIMS Technical Requirements.

  • The attached Python file is placed on the LIMS server, in the /opt/gls/clarity/customextensions folder.

  • The glsapiutil file is placed on the Clarity LIMS server, in the /opt/gls/clarity/customextensions folder.

  • The example code is provided for illustrative purposes only. It does not contain sufficient exception handling for use 'as is' in a production environment.

Attachments

demux_stats_parser.py:

demux_stats_parser_4.py:
