DRAGEN Array v1.3.0 Release Notes

RELEASE DATE

August 2025

RELEASE HIGHLIGHTS

  • Provides mosaic fraction estimation for mosaic events.

  • Improved accuracy of sex chromosome calling, including pseudo-autosomal regions (PAR).

  • New QC metrics available in cytogenetics JSON output.

NEW FEATURES IN DETAIL

  • Genotyping & Core

    • Added support for GenomeStudio backwards-compatible samplesheets. Separate IDAT and GTC samplesheets are deprecated as a result.

    • User-defined data from the samplesheet is now passed to the gt_sample_summary files during genotyping.

    • Samples that fail IDAT->GTC conversion during genotype call are now added to the gt_sample_summary instead of being skipped. For these samples, the Autosomal Call Rate and Call Rate are set to 0, while the Log R Ratio Std Dev and TGA_Ctrl_5716 Norm R (when applicable for PGx products) are set to NaN.

    • Removed the --smoothing parameter from genotype gtc-to-bedgraph, which was producing incorrect values in the Log R Ratio (LRR) bedgraph.
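Because failed conversions now appear in the sample summary rather than being skipped, downstream scripts may want to flag them explicitly. The following is a minimal sketch, not part of DRAGEN Array: it assumes the summary is available as comma-separated text, that the column headers match the metric names used above, and that a "Sample ID" column exists (a hypothetical name chosen for illustration).

```python
import csv
import io
import math


def find_failed_conversions(summary_csv_text):
    """Return IDs of samples that look like failed IDAT->GTC conversions.

    A sample is flagged when both call rates are 0 and the
    Log R Ratio Std Dev is NaN, as described in the release notes.
    Column names, including "Sample ID", are assumptions.
    """
    failed = []
    for row in csv.DictReader(io.StringIO(summary_csv_text)):
        call_rate = float(row["Call Rate"])
        autosomal_call_rate = float(row["Autosomal Call Rate"])
        lrr_std_dev = float(row["Log R Ratio Std Dev"])  # "NaN" parses to nan
        if call_rate == 0 and autosomal_call_rate == 0 and math.isnan(lrr_std_dev):
            failed.append(row["Sample ID"])
    return failed


# Illustrative input only; real summary files contain more columns.
example = (
    "Sample ID,Call Rate,Autosomal Call Rate,Log R Ratio Std Dev\n"
    "S1,0.998,0.998,0.12\n"
    "S2,0,0,NaN\n"
)
print(find_failed_conversions(example))
```

Checking for NaN with math.isnan rather than an equality test matters here, because NaN never compares equal to itself.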

  • Cytogenetics

    • Fixed an issue causing cyto calling to crash due to overflow errors for noisy samples in v1.2.

    • Fixed a memory issue in v1.2 that limited the number of samples able to run to about 200.

    • Improved accuracy of length normalized median copy number calculation by removing lower limit on included variant size (1 kb).

    • Added reporting of mosaic fraction for mosaic events.

    • Added a method for promoting mosaic events above a user-defined mosaic fraction.

    • Reduced verbosity in STDOUT messages produced by the annotate command.

    • Added sample-level Median Log R Dev statistic to annotate JSON output.

    • Added chromosome-level QC metrics to the annotation JSON output.

    • Added an event-level QC metric, effective size, to the JSON output.

    • Variants are filtered by their effective size in the cyto call command. In the cyto annotate command, they are filtered by the raw size.

    • Fixed a bug whereby the minimum deletion/LOH/duplication thresholds were shown in the wrong units in the annotation JSON, when set higher than the calling thresholds.

    • Fixed a bug that prevented cyto CNV variants with quality scores of 0 from appearing in the output JSON files.

    • The cytogenetic caller now attempts to resolve sample sex if previously classified as unknown by the upstream genotyping module, enabling more accurate results. A log message is generated when sex is resolved, e.g., "Sample XXX sex updated from Unknown to Male."

    • Fixed an issue where the CytoPlatform was incorrectly always reported as LCG in the log regardless of product used.
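To audit which samples had their sex resolved by the cytogenetic caller, the log can be scanned for the message quoted above. This is a rough sketch that assumes the log line follows the example wording exactly and that sample names contain no whitespace. Real log lines may carry prefixes such as timestamps, so the pattern is searched for rather than matched from the start of the line.

```python
import re

# Pattern modeled on the example message in the release notes.
# The exact wording in real logs is an assumption.
SEX_UPDATE = re.compile(
    r"Sample (?P<sample>\S+) sex updated from Unknown to (?P<sex>\w+)\."
)


def resolved_sexes(log_lines):
    """Map sample name to resolved sex for every matching log line."""
    resolved = {}
    for line in log_lines:
        match = SEX_UPDATE.search(line)
        if match:
            resolved[match.group("sample")] = match.group("sex")
    return resolved


# Illustrative log lines only.
log = [
    "Sample NA12878 sex updated from Unknown to Female.",
    "Annotating events for sample NA12878",
]
print(resolved_sexes(log))
```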

  • Pharmacogenomics

    • Fixed a bug in the pgx star-allele annotate command. Samples with a reference allele for ABCG2 genes are now annotated properly, using the default annotation "Normal" for reference alleles.

    • Corrected the CYP2A6 *1 definition by removing the NC_000019.10:g.40848264_40848265delinsT variant, which had been incorrectly added to it.

KNOWN ISSUES

  • Corrupt or invalid GTC files cause the run to abort with an error instead of being skipped. These files must be removed before proceeding.

  • Some multi-nucleotide variant (MNV) designs reverse complement the "Allele1/2 Top" fields in the Final Report.

  • GTC files do not support non-ASCII characters. This is especially problematic when running DRAGEN Array local if operating system locale settings are not English-based (e.g., en-US) as internal datetime fields could write non-ASCII characters. This will result in the following error:

There is a workaround to disable globalization and produce valid GTC files:

  1. Locate the dragena.runtimeconfig.json file inside the installation directory of DRAGEN Array (i.e., where the .zip or .tar.gz file was downloaded and unzipped).

  2. Add the key System.Globalization.Invariant to that file and set its value to true (see step 2 here: https://github.com/dotnet/corefx/blob/master/Documentation/architecture/globalization-invariant-mode.md#enabling-the-invariant-mode).

  3. Re-generate the GTC using the genotype call subcommand.
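Step 2 can also be scripted. The sketch below is an illustration rather than an official tool. It assumes the standard .NET runtimeconfig.json layout, in which runtime settings live under runtimeOptions.configProperties. The path passed in is whatever location the file has in your installation. Back up the file before modifying it.

```python
import json
from pathlib import Path


def enable_invariant_globalization(config_path):
    """Set System.Globalization.Invariant to true in a runtimeconfig.json.

    Assumes the standard .NET layout (runtimeOptions -> configProperties);
    missing sections are created, existing settings are left untouched.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    properties = config.setdefault("runtimeOptions", {}).setdefault(
        "configProperties", {}
    )
    properties["System.Globalization.Invariant"] = True
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

A call would look like enable_invariant_globalization("dragena.runtimeconfig.json"), run from the installation directory. After the change, re-generate the GTC files as described in step 3.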

  • SNV and indel variants are always treated as separate variants and are not collapsed in gtc-to-vcf even if they are designed to the same locus.

  • Some indel variants are missing from SNV VCF due to mapping issue between the designed indels and the reference genome.

  • In the gtc-to-vcf subcommand a mismatch between BPM and CSV manifests will not cause the command to abort with an error. The mismatch will need to be addressed before proceeding.

  • During locus combination logic for gtc-to-vcf, duplicate indels with HET calls on opposite strands (i.e., probe calls of D/I + I/D) incorrectly report a no call (./.) in the VCF instead of HET (0/1).

  • The samplesheet does not handle empty columns. For example this samplesheet:

Will throw the following error: System.ArgumentException : Duplicate column found. Column names are case-insensitive. Please remove or rename the column from the samplesheet and re-process. And this example:

Will produce an empty column/field in the Genotype Sample Summary files, e.g.,

  • Rare, intermittent memory issues can occur during star allele calling. Example error message: "The model has been changed since the solution was last computed." To work around the issue, restart star allele calling or run it on a machine with more memory.

  • Some simple variants have REF and ALT delimited by _ instead of > in the star_alleles.csv and metabolizer status JSON files (e.g., "ryr1.38577931a_c" instead of "ryr1.38577931a>c").

  • Very large samplesheets (i.e., >1K samples) can significantly slow analysis. The recommended workaround is to split analyses into batches of at most 1K samples.
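The batching workaround for very large samplesheets can be sketched as follows. This is a simplified illustration that assumes a flat CSV samplesheet with a single header row. If your samplesheet contains additional preamble or section lines, those would need to be carried over to each batch separately. The function name and the max_samples parameter are hypothetical.

```python
import csv
import io


def split_samplesheet(samplesheet_text, max_samples=1000):
    """Split a flat CSV samplesheet into batches of at most max_samples rows.

    Each batch repeats the header row so it can be used on its own.
    Assumes one header line followed by one row per sample.
    """
    reader = csv.reader(io.StringIO(samplesheet_text))
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines

    batches = []
    for start in range(0, len(rows), max_samples):
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows[start:start + max_samples])
        batches.append(out.getvalue())
    return batches
```

Each returned string can be written to its own file and analyzed as a separate run.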

KNOWN LIMITATIONS

  • Command line options unsquash-duplicates and filter-loci for gtc-to-vcf conversion should not be used when star allele calling is desired. In addition, VCFs must be gzipped and tabix indexed (the default for gtc-to-vcf) to be used in star allele calling.

  • Genotyping only supports diploid organisms. Polyploid genotyping is currently not supported.

  • If the genotyping module reports an unknown sex and the cytogenetic caller cannot resolve it, the caller assumes the sample is male. As a result, sex chromosome detection may be inaccurate if the sample is actually female. This behavior is not currently output in the log.

  • ISCN annotations in the cytogenetic annotation JSON output file are only provided for variants greater than 1 kb in length. This is often cited as a minimum size limit used to define copy number variants.

  • Centromere regions typically have low sequence complexity and are prone to artifacts. As a result, cytogenetic calling results in these regions are likely to be false positives.

  • ISCN annotations are not provided for LOH variants in the cytogenetic annotation JSON output file.

  • DRAGEN Array Cytogenetics analysis is intended for constitutional samples only. Oncology samples are not supported at this time.

  • DRAGEN Array Cytogenetics analysis is validated only for specific array platforms: Infinium Global Diversity Array with Cytogenetics-8, Infinium Global Screening Array with Cytogenetics-24, and Infinium CytoSNP-850K BeadChip (iScan System).

    • Note: DRAGEN Array can process IDAT files from the NextSeq550 for cytogenetic analysis, but this setup hasn’t been formally validated. If you're interested in trying it, check out the demo data in the ‘Demo Data’ section on BaseSpace, which was generated using the iScan system.

  • DRAGEN Array Cytogenetics analysis may call large events that are broken into smaller pieces and require visual confirmation.

  • GT is hardcoded to homozygous alt (1/1) for cyto VCF entries.

  • Tabix indexing from DRAGEN Array is not exactly the same as bcftools index --tbi. For instance, running bcftools index --stats in.vcf.gz or bcftools index --nrecords in.vcf.gz with certain versions of bcftools may produce the following error: "index of in.snv.vcf.gz does not contain any count metadata. Please re-index with a newer version of bcftools or tabix." If these tools are critical to a user's bioinformatics pipeline, a workaround is to unzip and re-index DRAGEN Array VCFs using bcftools' tabix indexing. Note, however, that these index files may not work in downstream VCF-based DRAGEN Array commands such as pgx star-allele call. Use DRAGEN Array end-to-end for analysis flows like the ones detailed in the Quick Start guide.

  • There can be some minor differences when running pgx star-allele call on Windows vs. Linux. During verification testing, out of 1576 samples, we noticed the following discordance:

    Field name                      Number of differences
    Collapsed Star-Alleles          2
    Missing/Masked Core Variants    1
    Solution Long                   1
    Supporting Variants             2

  • Note: All overall solutions tested for comparison were found to be concordant.

  • DRAGEN Array v1.3 is not compatible with Emedgene (EMG) v38. Specifically, automatic case creation is not supported, and Cytogenetics VCF files from v1.3 cannot be manually uploaded into EMG. Users should continue to use DRAGEN Array - Cytogenetics analysis + Emedgene interpretation 1.2.0 for DRAGEN Array + EMG cyto analyses.

  • Star allele calling does not support novel alleles; only alleles defined in the PharmVar and PharmGKB databases are supported.

  • CYP2D6 non-*36 star alleles with exon 9 conversion, such as *83, are reported as *36 with *83 as an underlying allele.
