Accessing Step UDFs from a different Step

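This section outlines several strategies for reading a step UDF that was set on an earlier step from a later step in the workflow.

In all cases, assume that a UDF called Batch ID was set on Step A, and that you want to access it from Step D. Steps B and C run between them.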

NOTE: If the samples in Step D do not have a homogeneous lineage, expect multiple values for the Batch ID.

Scenario 1: Crawl Back

This method involves crawling backwards from Step D to Step A.

The general form is as follows.

  1. Examine the inputs to Step D.

    Each input (I) has a parent-process element with a URI to the step that created the artifact. In this case, it is the URI to Step C.

  2. Get the input-output maps for Step C (from the /details resource) and find the input (I') that produced output I. Each input (I') has a parent-process element with a URI to the step that created the artifact. In this case, it is the URI to Step B.

  3. Get the input-output maps for Step B (from the /details resource) and find the input (I'') that produced the output I'. Each input (I'') has a parent-process element with a URI to the step that created the artifact. In this case, it is the URI to Step A.

  4. Get the value of the UDF (Batch ID) from Step A: 1234.

This method is computationally slow, but it is safe. The more steps the script has to crawl back through, the longer it takes to retrieve the value.
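The crawl described above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation. It assumes that `fetch` is a caller-supplied function (hypothetical here) that takes a URI and returns the XML text for that resource, that each artifact carries its parent-process as a child element, and that the process XML lists its input-output-map elements with limsid attributes on both input and output. Check those assumptions against the XML your server returns.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for step UDF fields in the process XML.
UDF_FIELD = "{http://genologics.com/ri/userdefined}field"


def read_udf(process_root, udf_name):
    """Return the text of the named step UDF, or None if it is not set."""
    for field in process_root.findall(UDF_FIELD):
        if field.get("name") == udf_name:
            return field.text
    return None


def crawl_back(artifact_uri, target_step, udf_name, fetch, max_hops=50):
    """Follow parent-process links from one input of Step D back to target_step."""
    for _ in range(max_hops):
        artifact = ET.fromstring(fetch(artifact_uri))
        parent = artifact.find("parent-process")
        if parent is None:
            return None  # no creating step: the lineage ended before target_step
        process = ET.fromstring(fetch(parent.get("uri")))
        if process.findtext("type") == target_step:
            return read_udf(process, udf_name)
        # Find the input that this step turned into the current artifact.
        previous_uri = None
        for io_map in process.findall("input-output-map"):
            output = io_map.find("output")
            source = io_map.find("input")
            if output is None or source is None:
                continue
            if output.get("limsid") == artifact.get("limsid"):
                previous_uri = source.get("uri")
                break
        if previous_uri is None:
            return None
        artifact_uri = previous_uri
    return None
```

Each hop costs two requests (one artifact, one process), which is why the run time grows with the distance between the two steps. Call the function once per input of Step D and collect the distinct results if the samples might not share a lineage.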

Scenario 2: Jump Back

This method tries to jump straight to Step A, without passing through Steps B and C.

The general form is as follows.

  1. Examine the inputs to Step D. Each input (I) has a sample element that contains the limsid (S) of the related submitted sample.

  2. Query the artifacts resource, filtering on the sample limsid (S) and the name of Step A: https://<your_hostname>/api/v2/artifacts?samplelimsid=S&process-type=Step%20A

    This query should give an XML response containing the URI to Step A. From there, get the value of the UDF (Batch ID): 1234.

This method makes two assumptions:

  1. That Step A produces analytes (derived samples). Thus, if Step A is a QC process, or does not produce analyte outputs, this method fails.

  2. That the analytes (derived samples) resulting from S only passed through Step A one time. If this assumption does not hold, the query leads to a URI for every related instance of Step A, and you cannot be certain which Batch ID to rely upon.

This method is computationally fast, and its duration does not increase when there are many steps between Step A and Step D.
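A minimal Python sketch of the jump follows. As before, `fetch` is a hypothetical caller-supplied function that returns the XML text for a URI, and `api_root` stands for https://<your_hostname>/api/v2. The sketch assumes the query returns a list of artifact links, and that each linked artifact names the step that created it in a parent-process element. It does not follow any paging links in the list response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, urlencode

# Namespace assumed for step UDF fields in the process XML.
UDF_FIELD = "{http://genologics.com/ri/userdefined}field"


def artifact_query(api_root, sample_limsid, step_name):
    """Build the artifacts query for one submitted sample and one step name."""
    params = {"samplelimsid": sample_limsid, "process-type": step_name}
    # quote_via=quote encodes the space in "Step A" as %20 rather than "+".
    return api_root.rstrip("/") + "/artifacts?" + urlencode(params, quote_via=quote)


def jump_back(api_root, sample_limsid, step_name, udf_name, fetch):
    """Return {process URI: UDF value} for every matching instance of the step."""
    query = artifact_query(api_root, sample_limsid, step_name)
    listing = ET.fromstring(fetch(query))
    process_uris = set()
    for link in listing.findall("artifact"):
        artifact = ET.fromstring(fetch(link.get("uri")))
        parent = artifact.find("parent-process")
        if parent is not None:
            process_uris.add(parent.get("uri"))
    values = {}
    for uri in sorted(process_uris):
        process = ET.fromstring(fetch(uri))
        for field in process.findall(UDF_FIELD):
            if field.get("name") == udf_name:
                values[uri] = field.text
    return values
```

Returning a dictionary keyed by process URI makes the second assumption visible to the caller: one entry means the value is unambiguous, an empty result means Step A produced no analytes for S, and several entries mean the sample passed through Step A more than once.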

Scenario 3: Pay it Forward

This method works well, but it involves making configuration changes to the steps. As such, this method is useless for legacy data resulting from samples that passed through the steps before the configuration was applied.

Its general form involves:

  • In Step A: Add a script that copies the value of the Batch ID UDF (1234) to every input and output of type analyte in the step.

  • In Step B: Add a script that copies the value of the Batch ID UDF (1234) to every output of type analyte in the step.

  • In Step C: Add a script that copies the value of the Batch ID UDF (1234) to every output of type analyte in the step.

  • In Step D: The inputs contain the value of the Batch ID.

This method relies on propagating the Step UDF through Steps A, B, and C to Step D. It is safe and fast. However, if the protocol is edited and a new step is inserted between B and C, you must add the propagation script to the new step so that the chain does not break. This method remains safe even if some of the steps are QC steps or do not produce analyte outputs.
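The XML handling inside such a propagation script might look like the sketch below. It is an illustration under assumptions: that the process XML marks analyte outputs with output-type="Analyte", that a UDF named Batch ID is also configured on analytes, and that appending a udf:field element to the artifact XML is acceptable to the server. The function names are hypothetical. Fetching each artifact and sending the updated XML back to its URI is left to the caller.

```python
import xml.etree.ElementTree as ET

UDF_URI = "http://genologics.com/ri/userdefined"
UDF_FIELD = "{%s}field" % UDF_URI
# Keep the familiar prefixes when the XML is written back out.
ET.register_namespace("udf", UDF_URI)
ET.register_namespace("art", "http://genologics.com/ri/artifact")


def analyte_uris(process_xml, include_inputs=False):
    """List the artifact URIs that should receive the copied value."""
    uris = []
    for io_map in ET.fromstring(process_xml).findall("input-output-map"):
        output = io_map.find("output")
        if output is not None and output.get("output-type") == "Analyte":
            uris.append(output.get("uri"))
        source = io_map.find("input")
        if include_inputs and source is not None:
            # The map does not state the input type; this sketch assumes analytes.
            uris.append(source.get("uri"))
    return sorted(set(uris))


def with_udf(artifact_xml, udf_name, value, udf_type="String"):
    """Return the artifact XML with the named UDF set to value."""
    root = ET.fromstring(artifact_xml)
    for field in root.findall(UDF_FIELD):
        if field.get("name") == udf_name:
            field.text = value  # overwrite an existing value
            break
    else:
        field = ET.SubElement(root, UDF_FIELD, {"type": udf_type, "name": udf_name})
        field.text = value
    return ET.tostring(root, encoding="unicode")
```

In Step A the script would call analyte_uris with include_inputs=True and take the value from the step UDF. In Steps B and C it would use the default and take the value from the step's inputs, which already carry it.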

Scenario 4: Along for the Ride

This method is a niche solution, but it works well. It assumes that the samples from Step A proceed to Step D as an intact group, and they are joined by a control sample.

This method involves making configuration changes to the steps. As such, this method is useless for legacy data resulting from samples that passed through the steps before the configuration was applied.

  • In Step A: Identify the control sample for the group, then copy the value of the Batch ID to the control sample.

  • In Step D: Identify the control sample for the group, then retrieve the value of the Batch ID from it.

This method requires the least work, but it makes several assumptions that might make it impractical.
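Identifying the control sample could be sketched as follows. This assumes that the artifact XML of a control carries a control-type child element and that ordinary samples do not. Verify this against your own data before relying on it. The function name is hypothetical, and the input is the XML text of each artifact in the step, already fetched by the caller.

```python
import xml.etree.ElementTree as ET


def find_control(artifact_xmls):
    """Return the URI of the group's single control, or None if there is not exactly one."""
    controls = []
    for xml_text in artifact_xmls:
        root = ET.fromstring(xml_text)
        # Assumption: only control samples carry a control-type element.
        if root.find("control-type") is not None:
            controls.append(root.get("uri"))
    return controls[0] if len(controls) == 1 else None
```

Returning None when there are zero or several controls keeps the script from writing to, or reading from, the wrong artifact when the group is not intact.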
