DRAGEN Array QC Report
The DRAGEN Array QC Report is a self-contained, interactive HTML dashboard that helps you evaluate the quality of microarray datasets processed with the DRAGEN Array pipeline. It combines per-sample functional QC metrics, control-probe QC metrics (probe-level and summarized), and interactive visualizations to help you quickly:
Inspect per-sample metrics such as Autosomal Call Rate, Log R Ratio Standard Deviation (LogRDev), and Sex estimate
Detect assay or instrument issues using control-probe intensity patterns
Identify outliers, spatial artifacts, and batch effects using heatmaps and trend plots
Apply automated QC thresholds and export results for downstream review
Calculate project-wide average call rate and LogRDev, and monitor trends across multiple datasets
Analysis Workflow
Use the following instructions to generate an interactive QC Report (HTML format) and QC Table (spreadsheet format). If you used a Sample Sheet in the upstream workflow, user-defined metadata can be carried into the final QC report outputs. See Command Index for all command parameters.
Methylation and genotyping workflow differences are highlighted below.
| Workflow | Upstream command | Key inputs | Dataset folder contents for dragena qc report | When to use |
| --- | --- | --- | --- | --- |
| Methylation | dragena qc call | CSV manifest (--csv-manifest) + IDAT folder | controls.raw_metrics.csv + controls.qc_metrics.csv | Standard methylation QC-report workflow |
| Genotyping, recommended | dragena genotype call | BPM manifest (--bpm-manifest) + cluster file (--cluster-file) + IDAT folder | controls.raw_metrics.csv + controls.qc_metrics.csv + gt_sample_summary.csv | Recommended when you want the richer HTML QC report experience, including functional QC, Autosomal Call Rate and LogRDev views, and sample heatmaps |
| Genotyping, limited | dragena qc call | CSV manifest (--csv-manifest) + --array-type genotyping + IDAT folder | controls.raw_metrics.csv + controls.qc_metrics.csv | Use only for control-based genotyping QC inputs when gt_sample_summary.csv is not needed |
Methylation workflow
Genotyping workflow
For genotyping, dragena genotype call is the recommended upstream path because it produces gt_sample_summary.csv, which enables functional QC metrics and richer QC-report visualizations such as Autosomal Call Rate and LogRDev views, plus the sample heatmaps.
Control-only path: dragena qc call
Use this path when you only need control-based genotyping QC inputs. It uses a CSV manifest and does not generate gt_sample_summary.csv, so the QC report will not include functional QC metrics.
If you have gt_sample_summary.csv files generated by DRAGEN Array versions earlier than v1.4.0, you can combine them with the outputs of dragena qc call by passing the existing output folder to the --output-folder option of dragena qc call, and then generate the full QC Report.
Recommended path: dragena genotype call
Use this path for most genotyping datasets. It uses the BPM manifest and cluster file, and the output folder already contains the control QC files plus gt_sample_summary.csv for functional QC reporting.
Use dragena qc call for methylation or for control-only genotyping inputs that rely on a CSV manifest. For genotyping, prefer dragena genotype call because it uses the BPM manifest plus cluster file and generates gt_sample_summary.csv, enabling functional QC metrics and the fuller QC-report feature set. In either workflow, --data points to the dataset folder, a parent folder, or a comma-separated dataset list for dragena qc report.
Instructions
Open a command prompt (Windows) or terminal window (Linux) and navigate to the directory where the software was installed. Alternatively, navigate to any working directory if the executable was added to your PATH.
Generate the QC input files using one of the following commands:
For genotyping datasets, use dragena genotype call in most cases. This is the recommended path when you want both control-based QC and functional QC metrics such as Autosomal Call Rate, LogRDev, and Sex estimate.
Use dragena qc call if you want control-based QC inputs only, if you already have a gt_sample_summary.csv file for the input IDATs, or if you are preparing QC-report inputs for methylation datasets. For genotyping, this command uses a CSV manifest and does not produce gt_sample_summary.csv.
Prepare a dataset folder for dragena qc report. The dataset folder must contain the two required control files, and should also contain gt_sample_summary.csv:
controls.raw_metrics.csv (required)
controls.qc_metrics.csv (required)
gt_sample_summary.csv (highly recommended)
If you ran dragena genotype call for the dataset you want to review, you can use the genotype output folder directly as the dataset folder because it already contains controls.raw_metrics.csv, controls.qc_metrics.csv, and gt_sample_summary.csv.
If you want to combine multiple datasets or use a parent folder that contains many dataset folders, see Example input folder structures below.
Run dragena qc report using the dataset folder prepared in the previous step as input. The --config file is optional. See Configuration file (optional) below for template links and default-threshold behavior.
If you want to use the genotype output folder directly and do not need a custom config file, the command can be as simple as: dragena qc report --data <genotype_output_folder>
If you want to combine multiple known dataset folders into one QC report, provide them as a comma-separated list after --data. If you also provide --label, use the same comma-separated style and supply one label per dataset. Do not add spaces between items; use commas only.
Open the HTML report in your web browser:
DRAGENArray_QC_Report_YYYY_MM_DD.html
If you generated the QC report in BaseSpace Sequence Hub (BSSH), download the HTML report to your computer and then double-click the file to open it in your web browser.
If the input data for qc report was generated while using a sample sheet, sample metadata such as sample names, processing information, and other sample-sheet values will be carried into the final QC report outputs when present in the input files. Review the QC report outputs before sharing them. If you need to remove some metadata before sharing, remove it from gt_sample_summary.csv and rerun dragena qc report.
If you provide a parent folder (for example --data data/parent_folder), the tool can automatically discover and process multiple dataset folders. See Example input folder structures below.
Local QC reports can combine multiple dataset folders into one report. This is the supported way to compare runs or batches in a single Trend Analysis view. Current cloud QC reports are generated for a single dataset, so their Trend Analysis view summarizes that dataset only rather than comparing multiple runs.
Command-line options
The dragena qc report command supports the following options to control output location, format, and QC thresholds.
CLI options are case-sensitive and must be entered exactly as shown. In particular, the threshold flags use mixed case: --threshold-callRate and --threshold-logRdev.
Output options
--output-folder
Directory path where output files are written.
Default: Current working directory
Applies to both the HTML report and QC tables
Example: dragena qc report --data <dataset_folder> --output-folder <output_folder>
--output-format <csv|xlsx>
Use this option to choose whether the per-sample QC table is written as an xlsx workbook or a csv file. The default output format is xlsx.
When the QC table is written as xlsx, the workbook includes conditional formatting based on the thresholds that were applied when the report was generated, and it includes a Thresholds worksheet that records those threshold settings. Those applied thresholds can come from built-in defaults, a YAML file provided with --config, or command-line overrides described in QC threshold overrides (CLI).
When the QC table is written as csv, the output contains values only. It does not include workbook formatting or additional worksheets.
Example: dragena qc report --data <dataset_folder> --output-format csv
QC threshold overrides (CLI)
These options override functional QC thresholds directly from the command line. They take precedence over built-in defaults and any corresponding values provided via --config.
For example, if --config sets callRate: 0.90 but you run with --threshold-callRate 0.95, the report uses 0.95.
--threshold-callRate <value>
Override the Autosomal Call Rate threshold.
Default: 0.98
Samples with Autosomal Call Rate below this value are marked as FAIL.
Example: dragena qc report --data <dataset_folder> --threshold-callRate 0.95
--threshold-logRdev <value>
Override the Log R Ratio Standard Deviation (LogRDev) threshold.
Default: 0.20
Samples with LogRDev above this value are marked as FAIL.
Example (illustrative value): dragena qc report --data <dataset_folder> --threshold-logRdev 0.25
For complex or reproducible QC configurations, use a YAML configuration file via --config. CLI threshold options are most useful for quick exploratory runs.
Precedence (highest to lowest):
Command-line threshold flags (for example, --threshold-callRate, --threshold-logRdev)
YAML config file values provided via --config
Built-in defaults
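The precedence order above can be sketched in a few lines of Python. This is an illustration of the merge logic only, not the tool's implementation. The callRate key comes from the example in this section. The logRdev key name is an assumption made for this sketch.

```python
from typing import Optional

# Built-in defaults as stated in this section.
# NOTE: "logRdev" as a config key name is an assumption for illustration.
BUILT_IN_DEFAULTS = {"callRate": 0.98, "logRdev": 0.20}


def resolve_thresholds(config: Optional[dict] = None,
                       cli: Optional[dict] = None) -> dict:
    """Merge thresholds: CLI flags > YAML config values > built-in defaults."""
    resolved = dict(BUILT_IN_DEFAULTS)
    # Values loaded from the YAML file replace the defaults.
    resolved.update(config or {})
    # Only CLI flags that were actually passed (not None) override the config.
    resolved.update({k: v for k, v in (cli or {}).items() if v is not None})
    return resolved


# The config sets callRate to 0.90, but the CLI flag sets 0.95, so 0.95 wins.
print(resolve_thresholds(config={"callRate": 0.90}, cli={"callRate": 0.95}))
# {'callRate': 0.95, 'logRdev': 0.2}
```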
Input files
Each dataset folder must include the required QC metric files below. Optional files enable additional features (for example, functional genotyping QC metrics) or override defaults (configuration).
| File | Description | Required |
| --- | --- | --- |
| controls.raw_metrics.csv | Probe-level control intensities used to compute control QC metrics. | Yes |
| controls.qc_metrics.csv | Per-sample summarized QC metrics derived from control probes. | Yes |
| gt_sample_summary.csv | Per-sample genotyping metrics (including Autosomal Call Rate, LogRDev, Sex estimate). Strongly recommended for functional QC evaluation. | No (strongly recommended) |
| config.yaml | Overrides QC thresholds and report behavior when provided via --config. | No |
See Metadata propagation (gt_sample_summary) for details on how user-defined columns are exposed in the report.
Only a subset of control probe metrics from controls.raw_metrics.csv are propagated to the merged QC outputs.
See Control probe propagation from controls.raw_metrics.csv for details.
Example input files
controls.qc_metrics.csv
controls.raw_metrics.csv
gt_sample_summary.csv
Metadata propagation (gt_sample_summary)
When the upstream workflow uses a sample sheet, sample-sheet metadata columns are first propagated into gt_sample_summary.csv. During QC report generation, all columns present in gt_sample_summary.csv are then propagated into the report sample metadata and into the merged QC table outputs.
In the merged QC xlsx and csv outputs, propagated metadata columns from gt_sample_summary.csv are included and ordered alphabetically.
Not every propagated metadata field is offered in the report UI for Color by or Facet by. Those controls only include fields that behave like useful categorical groupings. In particular, fields with only one unique value are not offered, and fields with more than 20 unique values are also not offered for coloring or faceting.
Metadata fields that are not eligible for Color by or Facet by can still appear in hover text when space allows. However, some metadata may be omitted from hover tooltips when there is not enough room to display all fields.
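The eligibility rule for Color by and Facet by can be sketched as follows. This is a minimal illustration that assumes samples are held as dictionaries. The field names Batch and Site are hypothetical, and treating a missing value as its own value is an assumption of this sketch.

```python
def groupable_fields(samples: list[dict], max_unique: int = 20) -> list[str]:
    """Return fields with more than one and at most `max_unique` unique values."""
    fields = sorted({key for row in samples for key in row})
    eligible = []
    for field in fields:
        # A missing value counts as None here (an assumption of this sketch).
        unique_values = {row.get(field) for row in samples}
        if 1 < len(unique_values) <= max_unique:
            eligible.append(field)
    return eligible


# 30 samples: Sample_ID has 30 unique values, Batch has 2, Site has 1.
samples = [
    {"Sample_ID": f"S{i}", "Batch": "A" if i < 15 else "B", "Site": "Lab1"}
    for i in range(30)
]
print(groupable_fields(samples))  # ['Batch']
```

Only Batch qualifies in this example: Site has a single value, and Sample_ID has more than 20 unique values.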
Control probe propagation from controls.raw_metrics.csv
Not every raw control-probe row present in controls.raw_metrics.csv is carried forward into the QC report outputs.
Here, "not propagated" means that specific raw probe entries from controls.raw_metrics.csv are excluded from the merged downstream raw-probe data used by the report and QC table.
The report focuses on raw control probes that support actionable review and on summarized QC metrics used for downstream evaluation. As a result, some raw probe categories are intentionally excluded during merge and summarization.
Included versus excluded raw probe names
The key distinction is between the standard raw probe names used by the report and the additional raw probe names that may appear in some input files but are not propagated downstream.
| Control category | Standard raw probe names (propagated) | Additional raw probe names (not propagated) |
| --- | --- | --- |
| NON-POLYMORPHIC | Standard non-polymorphic probe rows such as NP (G) and NP (T) | Additional methylation-platform rows NP (G) 1, NP (G) 2, NP (G) 3, NP (G) 4, NP (G) 5 |
| STAINING | Standard staining probe rows Biotin (Bkg), Biotin (High), DNP (Bkg), DNP (High) | Additional methylation-platform rows Biotin(5K) and DNP(20K) |
Excluded raw control-probe entries
The following raw control-probe categories or probe names are excluded from the merged downstream raw-probe outputs:
NEGATIVE
NORM
Additional NON-POLYMORPHIC NP (G) probe rows found on some methylation platforms
NP (G) 1
NP (G) 2
NP (G) 3
NP (G) 4
NP (G) 5
Additional STAINING probe rows found on some methylation platforms
DNP(20K)
Biotin(5K)
These probe rows can still be present in the original controls.raw_metrics.csv input file for completeness, but they are not propagated as downstream raw-probe entries in the QC report outputs.
In other words, users may still see summarized Staining or Non-Polymorphic control metrics in the report, and may also see the standard raw probe rows used for those metrics, even though the extra raw probe rows listed above are excluded.
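The exclusion rules listed above can be expressed as a small filter. This is a sketch only: it assumes each raw row can be reduced to a (category, probe name) pair, and the probe name "Negative 1" is a hypothetical example. The actual column layout of controls.raw_metrics.csv is not described here.

```python
# Whole categories that are excluded from the merged raw-probe outputs.
EXCLUDED_CATEGORIES = {"NEGATIVE", "NORM"}

# Individual probe names excluded within otherwise-included categories.
EXCLUDED_PROBES = {
    "NON-POLYMORPHIC": {"NP (G) 1", "NP (G) 2", "NP (G) 3", "NP (G) 4", "NP (G) 5"},
    "STAINING": {"DNP(20K)", "Biotin(5K)"},
}


def is_propagated(category: str, probe: str) -> bool:
    """Return True if a raw control-probe row is carried into the report outputs."""
    if category in EXCLUDED_CATEGORIES:
        return False
    return probe not in EXCLUDED_PROBES.get(category, set())


rows = [
    ("STAINING", "DNP (High)"),
    ("STAINING", "DNP(20K)"),
    ("NEGATIVE", "Negative 1"),  # hypothetical probe name
    ("NON-POLYMORPHIC", "NP (G)"),
    ("NON-POLYMORPHIC", "NP (G) 3"),
]
kept = [row for row in rows if is_propagated(*row)]
print(kept)  # [('STAINING', 'DNP (High)'), ('NON-POLYMORPHIC', 'NP (G)')]
```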
Configuration file (optional)
Provide a YAML configuration file to override the report’s QC thresholds. You can start from one of the Illumina template config files below or supply your own YAML file. Set a value to null to disable a check, or set a numeric value to enable it (for example tgaControl: 1.0). The template config files below are suggested starting points and should be adjusted based on sample type, platform, and lab-specific performance.
Illumina template config files are available below:
Suggested starting points by product and assay chemistry:
Genotyping assays that use Infinium non-EX reagents/chemistry: start with config_genotyping.yaml.
Genotyping assays that use Infinium EX reagents/chemistry: start with config_genotyping_EX.yaml. Example products include GSA: Infinium Global Screening Array-48 v4.0 Kit, GSA-ePGx: Infinium Global Screening Array with Enhanced PGx-48 v4.0, GCRA: Infinium Global Clinical Research Array-24 v1.0 Kit, GCRA-ePGx: Infinium Global Clinical Research Array with Enhanced PGx-24 v1.0 Kit, and other customized Infinium EX products.
Methylation assays that use MethylationEPIC reagents/chemistry: start with config_methylation_EPIC.yaml.
Methylation assays that use Infinium EX reagents/chemistry: start with config_methylation_MSA.yaml. Example products include Infinium Methylation Screening Array-48 Kit and Infinium Methyl EX iSelect Custom BeadChip (24/48 formats).
If you are using a customized product and are unsure which assay chemistry it uses, contact Illumina Technical Support before selecting a QC configuration file.
You can edit the QC config YAML using a plain‑text editor (for example, Notepad).
When saving a config file from Windows Notepad, verify that the file extension remains .yaml or .yml and was not changed to .yaml.txt or .yml.txt. If needed, use Save As, set Save as type to All Files, and enter a filename such as config.yaml.
For guidance on adjusting DNA methylation QC thresholds, see the following Illumina documentation.
Apply a configuration file
Use the --config CLI flag to apply the selected configuration file, for example: dragena qc report --data <dataset_folder> --config config_genotyping.yaml
If --config is omitted, the report uses built-in defaults.
How suggested thresholds were derived
The suggested thresholds in the example configuration files were derived empirically from an internal review of more than 10 datasets spanning both expected good-quality samples and failed samples.
For each metric, Illumina evaluated the observed distribution of values, including the center and spread of the apparent null or background distribution, and used that information to choose practical starting thresholds for routine QC review.
These values are intended as suggested starting points, not universal acceptance criteria. Users should review and revise thresholds for their own assay, sample type, laboratory workflow, scanner settings, bisulfite conversion method, FFPE usage, and historical performance.
If a dataset repeatedly shows a consistent offset for one of these control metrics while other QC evidence remains acceptable, review the threshold in context rather than treating the default value as absolute.
Note: The thresholds in config_genotyping.yaml were evaluated using only LCG and HTS datasets.
For control metrics, configured thresholds are primarily used to flag samples for review in the QC report. They are recommended operating cutoffs, not assay-independent pass/fail truths.
Generic example YAML structure (illustrative only)
The two YAML blocks below are generic illustrative examples, not chemistry-specific recommended starting points.
The genotyping example below is not specific to Infinium EX or LCG/HTS chemistry.
The methylation example below is not specific to Infinium EX/MSA or MethylationEPIC chemistry.
For chemistry-specific starting points, use the recommended template files listed above.
Generic genotyping YAML example
Generic DNA methylation YAML example
Output files
DRAGENArray_QC_Report_YYYY_MM_DD.html
Self-contained interactive HTML report intended to be distributable and viewable offline in a web browser. The report includes dashboards such as Control Dashboard, Automated QC, Sample QC Heatmaps, Trend Analysis, and a QC Metric Config menu for threshold customization.
DRAGENArray_QC_table_YYYY_MM_DD.<xlsx|csv>
Per-sample QC table (one row per sample) containing functional metrics (when available), derived control metrics, and selected raw control-probe intensities. The file extension depends on --output-format (xlsx or csv).
QC evaluation criteria
Functional QC (per sample)
Functional QC evaluates overall genotyping performance of each sample:
Autosomal Call Rate: Fraction of autosomal probes successfully called for a sample. Higher values indicate better performance.
Log R Ratio Standard Deviation (LogRDev): Measures signal noise across probes. Lower values indicate more stable intensity measurements.
Functional QC status (PASS/FAIL)
A sample’s functional QC status is determined by comparing its metrics to configured thresholds:
PASS: Autosomal Call Rate ≥ threshold and LogRDev ≤ threshold
FAIL: One or both metrics fall outside thresholds
Functional QC requires a per-sample metric file (for example gt_sample_summary.csv). If that file is not provided, functional QC is unavailable. Functional QC does not support methylation arrays at this time. For methylation QC, refer to DRAGEN Array - Methylation QC on cloud.
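The PASS/FAIL rule can be sketched as a small function. This is an illustration that uses the default thresholds given in this document (0.98 and 0.20). The function and parameter names are hypothetical, and passing None for a threshold models a check that has been disabled.

```python
from typing import Optional


def functional_qc_status(call_rate: float, log_r_dev: float,
                         min_call_rate: Optional[float] = 0.98,
                         max_log_r_dev: Optional[float] = 0.20) -> str:
    """Return "FAIL" if any enabled functional metric is outside its threshold."""
    # A call rate below the threshold fails; a value equal to it passes.
    if min_call_rate is not None and call_rate < min_call_rate:
        return "FAIL"
    # A LogRDev above the threshold fails; a value equal to it passes.
    if max_log_r_dev is not None and log_r_dev > max_log_r_dev:
        return "FAIL"
    return "PASS"


print(functional_qc_status(0.99, 0.15))  # PASS
print(functional_qc_status(0.97, 0.15))  # FAIL (call rate too low)
print(functional_qc_status(0.99, 0.25))  # FAIL (LogRDev too high)
```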
Control-based QC (per sample)
Control-based QC evaluates whether array chemistry and processing performed as expected. Control metrics originate from:
controls.raw_metrics.csv: raw, probe-level control intensities
controls.qc_metrics.csv: summarized per-sample control QC values
Common control categories include:
Staining
Extension
Hybridization (High/Medium/Low)
Non-polymorphic
Non-specific binding
Target removal
Stringency
Restoration (when applicable)
Bisulfite conversion controls (methylation arrays)
Specificity (methylation arrays)
For more background on interpreting Infinium controls, see: Evaluation of Infinium Genotyping Assay Controls Training Guide
Control QC status & flags
Each control metric is compared to its configured threshold. When a value is outside the acceptable range, the sample receives a flag indicating the affected control and channel.
Interpreting combined QC results
The report shows both functional QC and control-based QC for every sample:
A sample may PASS functional QC but still receive control warnings.
Multiple or severe control failures may indicate assay-related issues that impact downstream results.
This combined view can help distinguish:
Biological failures (for example, degraded DNA)
Technical failures (for example, staining or hybridization issues)
HTML QC report
The HTML report is organized into dashboards designed for routine QC review and deeper troubleshooting.
Control dashboard
The Control Dashboard provides interactive visualizations of raw control-probe intensities to help identify instrument and assay issues. These plots help you review whether control signals such as staining, extension, hybridization, and non-polymorphic probes behave as expected across samples. You can:
Select samples directly from plots to highlight those same samples across related plots in both the Control Dashboard and the Automated QC dashboard
Hover to see details (for example, sample ID, barcode, position, autosomal call rate when available)
Explore distributions and trends to detect outliers or systematic shifts
Zoom and filter for focused investigation
To clear a plot-based selection, double-click in the plot area. This restores the full sample set in the Automated QC table and removes the cross-plot highlighting.

Chart toolbar reference
Each plot includes a toolbar for zooming, panning, selection, autoscaling, and exporting.

Automated QC
The Automated QC dashboard provides a consolidated, objective view of sample-level QC by applying QC rules and thresholds across all samples. Use it to:
Quickly assess pass/fail status
Identify samples and metrics outside thresholds
Support decisions for sample inclusion, reprocessing, or follow-up analysis
Functional QC Status Histogram
This histogram shows the number and percentage of samples in the dataset that are classified as PASS or FAIL for functional QC.
Functional QC status is determined from the sample-level functional metrics, using the Autosomal Call Rate and LogRDev thresholds currently applied in the report. Those thresholds can come from the defaults, a YAML config file, command-line overrides, or the QC Metric Config settings described in Updating thresholds in the HTML report.
A sample is counted as FAIL if it fails any individual functional QC metric that is currently enabled. Otherwise, the sample is counted as PASS.
Control-Based QC Status Histogram
This histogram shows the number and percentage of samples in the dataset that are FLAGGED or CLEAR for control-based QC.
Control-based QC status is determined from the control QC metric thresholds currently applied in the report. A sample is counted as FLAGGED if any individual control QC metric is outside its applied threshold. Samples without any active control-based QC flags are counted as CLEAR.
Sample QC table
The Sample QC table provides a per-sample summary of the same QC decisions shown in the histograms, along with the underlying metrics used to support review. It includes:
Functional QC status (PASS/FAIL) when functional metrics are available
Control QC flags and annotations that highlight values outside thresholds
Sorting and filtering to focus on failing samples or specific metrics
The table can be exported in either xlsx or csv format. The xlsx output preserves conditional coloring, while the csv output contains values only. For more detail about the table output formats, see --output-format <csv|xlsx>.

QC metric plots
Supporting plots in the Automated QC dashboard complement the table by showing per-sample QC metric scatter plots. In these plots, the x-axis represents samples in the current dataset view, and the y-axis represents one derived QC metric. Users can sort the sample order by Autosomal Call Rate, Log R Dev, any propagated user-provided metadata field, or any derived QC metric available in the report. Users can also color samples by eligible propagated metadata fields to help reveal group-specific patterns or batch effects.
For genotyping arrays, the initial point colors are based on functional QC status: PASS or FAIL determined from the applied Autosomal Call Rate and Log R Dev thresholds.
Plotted metrics can include control-derived metrics such as Staining, Extension, Hybridization, Target Removal, Nonpolymorphic, Stringency, Specificity, and Bisulfite Conversion. For example, Staining Red is calculated as DNP High Red / DNP Bkg Red, and Stringency is calculated as Stringency PM (Red) / Stringency MM (Red).
Each Automated QC scatter plot also includes help text with the formula used to derive the selected metric.
Most control metrics are constructed as ratios that compare expected signal against background or against an opposing control signal. That makes them more stable for QC review than raw intensities alone because even if absolute intensities shift between scanners or runs, the signal-versus-background relationship is expected to remain relatively consistent.
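A short numeric sketch shows why a ratio is more stable than a raw intensity. The intensity values below are made up for illustration, and the helper name is hypothetical. The point is only that scaling signal and background by the same factor leaves the ratio unchanged.

```python
def ratio_metric(signal: float, background: float) -> float:
    """Compute a signal-to-background control metric, e.g. DNP High Red / DNP Bkg Red."""
    if background <= 0:
        raise ValueError("background intensity must be positive")
    return signal / background


# Illustrative (made-up) raw intensities for one sample.
staining_red = ratio_metric(12000.0, 400.0)

# The same sample on a scanner that reads all intensities at half the level.
staining_red_dimmer = ratio_metric(12000.0 * 0.5, 400.0 * 0.5)

print(staining_red, staining_red_dimmer)  # 30.0 30.0
```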
When a QC threshold is set for a plotted metric, the scatter plot shows that cutoff as a dashed threshold reference line. These thresholds can come from built-in defaults, a YAML config file, command-line overrides, or values applied in the QC Metric Config menu. For more detail on how methylation QC control metrics and their recommended starting thresholds are defined, see Methylation Sample QC Summary Files and Methylation QC Threshold Adjustment.
These plots also participate in linked selection. When you select samples in one scatter plot, those same samples are highlighted in the other scatter plots across the Automated QC and Control Dashboard views, and the Automated QC table is filtered to show only the selected samples. To clear the selection, double-click in the plot area.
These plots help you:
See which individual samples fall outside a QC threshold
Compare sample-to-sample variation for one metric at a time
Relate noisy or failing samples to other QC views in the report
In the example figure below, the bottom plot shows a group of samples below the Stringency threshold line, and those points are also colored as functional QC FAIL. Other samples in the same figure are also colored as FAIL even though they do not fall below the Stringency cutoff, which suggests that other QC metrics may be contributing to the failure; review the other QC metric plots and the table to identify the control probes or sample-level metrics associated with those samples.

Sample QC heatmaps
The Sample QC Heatmaps dashboard provides a spatial view of sample-level QC metrics across chips or plates to help detect spatial artifacts and localized issues.
Overview
Each cell represents a sample
Cell color reflects a selected QC metric (for example Autosomal Call Rate, LogRDev)
Hover shows sample identifiers and metric values
Plate information may be derived from:
IDAT metadata, or
A user-provided sample sheet (columns: Sample_Plate and Sample_Well)
If both sources provide plate information, the report uses the user-provided sample sheet values.


Trend analysis
The Trend Analysis dashboard provides a high-level view of QC trends across multiple datasets, runs, batches, or instruments.
For local analysis, this cross-dataset summary is available when you generate one report from multiple dataset folders. If you generate the report from a single dataset folder, the dashboard still appears but summarizes that one dataset only. Current cloud QC reports also operate on a single dataset at a time.
Use it to:
Monitor changes over runs and scan dates
Detect drift, batch effects, or instrument-specific anomalies
Compare QC performance between datasets or groups of chips
Summary table
The summary table aggregates QC metrics at the dataset level (for example):
Number of samples / chips
Scan date range
Autosomal Call Rate statistics (min/mean/standard deviation, counts above/below threshold)
LogRDev statistics (mean/standard deviation, counts above threshold)
TGA control statistics for PGx Genotyping datasets
Sex prediction summary (number of males, females and unknowns)

Summary plots
Summary plots provide visual comparisons across datasets or barcodes, such as:
Autosomal Call Rate distributions
LogRDev box plots
Sample count plots
Sex estimate distributions
For example, box-and-whisker plots in Trend Analysis summarize how a metric is distributed within each dataset so you can compare center, spread, and outliers across runs. In a box plot, the box represents the middle 50% of the data distribution, the center line marks the median, and the whiskers extend to the smallest and largest non-outlier values shown for that dataset. When a threshold is configured for a plotted metric, the plot also shows the applied cutoff as a dashed reference line.
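The quantities drawn in a box plot can be computed with the Python standard library. This sketch assumes the common 1.5 × IQR convention for deciding which points count as outliers. The report's plotting library may use a different quartile or whisker rule, and the LogRDev values below are made up.

```python
import statistics


def box_plot_summary(values):
    """Compute the quartiles, whisker ends, and outliers of a box plot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inliers = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inliers[0],    # smallest non-outlier value
        "whisker_high": inliers[-1],  # largest non-outlier value
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Illustrative LogRDev values for one dataset, with one noisy sample.
print(box_plot_summary([0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.45]))
```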
The following schematic shows the main parts of a box-and-whisker plot:

Updating thresholds in the HTML report
You can update QC threshold cutoffs directly in the HTML report:
Open QC Metric Config (⚙).
Enter new numeric values for metrics you want to enforce.
Leave a field blank (or set it to null in YAML) to disable that check.
Click Apply Thresholds.
Buttons
Apply Thresholds: Recalculates pass/fail and refreshes visuals, including threshold reference lines in plots, using the current values.
Download Thresholds: Exports the currently applied thresholds as a YAML file.
Upload Thresholds: Imports a YAML file and fills the threshold fields.
Changes made in the HTML report are session-only and apply only to the currently opened report file.
To reuse the same thresholds in a future run, export them with Download Thresholds and pass that YAML file to the qc report command (for example, via --config) when generating a new report. Upload Thresholds only loads a YAML file back into the UI fields for the current report.

Interactive features
Filter table samples using scatter plot selection
Selecting samples in a scatter plot filters the corresponding rows in the Automated QC table. The same selected samples are also highlighted in related scatter plots across the Automated QC and Control Dashboard views. To clear this plot-based filtering and cross-plot highlighting, return to the plot and double-click in the plot area to remove the selection.

Highlight samples in scatter plots by selecting from the table
Selecting rows in the table highlights the corresponding points in scatter plots. To clear this table-based highlighting, click Select None in the bottom-right corner of the table.

When exporting the table, any active row selection is preserved. If you click Excel or CSV while rows are selected, only those selected rows are exported. To export all samples, click Select None before exporting.
Table filtering and sorting
The interactive table supports:
Showing/hiding columns
Sorting by any metric
Filtering by search or criteria
Combining table filtering with plot selection for cross-linked exploration
Performance & dataset size limits
For performance reasons, large datasets may use a tabbed layout to keep the browser responsive.
More than 1,000 samples in a single dataset: control and automated-QC plots are arranged into tabs (interactive features remain available).
More than 12,000 samples in a single dataset: scatter plots are disabled and replaced with a notice.
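The two per-dataset limits above amount to a simple decision rule, sketched below. The function name and the returned labels are hypothetical. Only the 1,000 and 12,000 cutoffs come from this document.

```python
TAB_LIMIT = 1_000       # above this, plots are arranged into tabs
SCATTER_LIMIT = 12_000  # above this, scatter plots are disabled


def plot_layout(n_samples: int) -> str:
    """Describe how plots are presented for a dataset of the given size."""
    if n_samples > SCATTER_LIMIT:
        return "scatter plots disabled"
    if n_samples > TAB_LIMIT:
        return "tabbed"
    return "standard"


for n in (960, 1_001, 12_000, 12_001):
    print(n, plot_layout(n))
```

Both limits are "more than" comparisons, so a dataset of exactly 1,000 samples keeps the standard layout and one of exactly 12,000 samples keeps its scatter plots.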
Example tabbed layout:


How to visualize very large projects
If your project exceeds the scatter-plot limit, consider:
Filtering input by run date/batch/folder and generating separate reports
Splitting input into logical batches (for example per-run or per-center)
A QC Report HTML file that contains more than 20,000 samples across datasets may load slowly.
Example input folder structures
Single dataset folder
In this example, project_folder/ is one dataset folder that contains the required input files for a single QC report.
Multiple dataset folders
Use this mode for local analysis when you want to combine multiple runs or batches into a single QC report and compare them in Trend Analysis.
In this mode, --data is a comma-separated list of dataset folder paths. Each path must point directly to a folder that contains the required input files. In the example below, dataset_A/ and dataset_B/ are example folder paths.
When writing a list for --data, do not include spaces before or after the commas. If you use --label, write the labels the same way: one comma-separated item per dataset, with no spaces between items.
The command below tells dragena qc report to load both dataset folders into one report: dragena qc report --data dataset_A,dataset_B
Use this form when you already know the exact dataset folders you want to include.
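If you assemble the command from a script, a helper like the one below can enforce the formatting rules for --data and --label. This is a sketch: the helper name and the labels Run1 and Run2 are hypothetical, and it only builds the argument list without running the tool.

```python
def build_report_args(datasets, labels=None):
    """Build a dragena qc report argument list with comma-separated values."""
    for item in list(datasets) + list(labels or []):
        # Items must not contain commas or leading/trailing spaces.
        if "," in item or item != item.strip():
            raise ValueError(f"invalid list item: {item!r}")
    args = ["dragena", "qc", "report", "--data", ",".join(datasets)]
    if labels is not None:
        if len(labels) != len(datasets):
            raise ValueError("supply exactly one label per dataset")
        args += ["--label", ",".join(labels)]
    return args


print(build_report_args(["dataset_A", "dataset_B"], ["Run1", "Run2"]))
# ['dragena', 'qc', 'report', '--data', 'dataset_A,dataset_B', '--label', 'Run1,Run2']
```

The resulting list can be passed to subprocess.run, which avoids shell quoting issues with folder paths.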
Parent folder containing multiple dataset folders
Use this mode when many dataset folders are organized under one parent folder and you want the local QC report to discover them automatically.
This is different from the previous example only in how --data is used:
In the previous example, --data lists each dataset folder path explicitly.
In this example, --data points to one parent folder path, and the tool searches below that parent folder to find valid dataset folders automatically.
Some users describe this as a "recursive" search. In practice, it means the tool starts from the parent folder and looks through its subfolders for dataset folders that contain the required QC files.
The command below tells dragena qc report to start from the parent folder project_folder and automatically discover dataset folders under it:
If you do not provide --label, the report assigns a dataset label based on each detected dataset folder name (for example dataset_A, dataset_B).
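To preview which folders a parent-folder search is likely to pick up, you can run a similar search yourself. The sketch below is an approximation, not the tool's implementation: `discover_dataset_folders` is a hypothetical helper, it treats any folder containing both required control metric files as a dataset folder, and the alphabetical traversal order is an assumption that may differ from the order the tool uses.

```python
import os

# Files that mark a folder as a valid dataset folder, per this guide.
REQUIRED_FILES = frozenset({"controls.raw_metrics.csv", "controls.qc_metrics.csv"})


def discover_dataset_folders(parent: str) -> list:
    """Walk parent recursively and return folders holding all required files."""
    found = []
    for root, dirs, files in os.walk(parent):
        dirs.sort()  # visit subfolders alphabetically (assumed order)
        if REQUIRED_FILES.issubset(files):
            found.append(root)
    return found
```

Taking `os.path.basename` of each returned path gives the folder names that would serve as default labels when --label is not provided.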
To compare multiple runs locally, use one of these two approaches:
Use --data dataset_A,dataset_B,... when you want to list the dataset folder paths yourself.
Use --data project_folder when project_folder is a parent folder that contains many dataset folders and you want the tool to discover them automatically.
Parent-folder discovery labels and ordering
If you supply --label, labels are assigned in the order datasets are discovered.
If multiple datasets share the same folder name, a numeric suffix is appended (for example datasetA, datasetA_2) to ensure uniqueness.
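The suffix rule can be illustrated with a short sketch. It reproduces the documented example (datasetA, datasetA_2); continuing with `_3`, `_4`, and so on for further repeats is an assumption, and `default_labels` is a hypothetical helper rather than the tool's own code.

```python
def default_labels(folder_names: list) -> list:
    """Derive labels from folder names, suffixing repeats with _2, _3, ... (assumed)."""
    seen = {}
    labels = []
    for name in folder_names:
        seen[name] = seen.get(name, 0) + 1
        # The first occurrence keeps the plain folder name.
        labels.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return labels
```

This sketch does not handle the corner case where a real folder is already named like a generated label (for example, a folder literally called datasetA_2).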
Troubleshooting & FAQ
Missing samples
Possible cause: Sample IDs do not match across input files.
Recommended action: Align Sample IDs across controls.* and (if provided) gt_sample_summary.csv.
Unexpected QC failures
Possible cause: Thresholds are too strict for your dataset or application.
Recommended action: Review and adjust thresholds using --config or the QC Metric Config menu in the HTML report.
Scatter plots are disabled
Possible cause: The dataset exceeds 12,000 samples.
Recommended action: Filter or split the dataset and generate separate reports.
CSV parsing errors
Possible cause: Quoting/encoding issues.
Recommended action: Ensure UTF-8 encoding and valid comma delimiters.
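A quick way to narrow down a parsing problem is to check the file outside the report. The sketch below uses only the Python standard library. `check_csv` is a hypothetical helper, and it assumes the input is a plain rectangular CSV with a single header row, which may not hold for every file the pipeline writes.

```python
import csv


def check_csv(path: str) -> list:
    """Return a list of problems found in a CSV file (empty list if none)."""
    try:
        # newline="" lets the csv module handle quoted line breaks itself.
        with open(path, encoding="utf-8", newline="") as handle:
            rows = list(csv.reader(handle))
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc}"]
    except csv.Error as exc:
        return [f"CSV syntax problem: {exc}"]
    if not rows:
        return ["file is empty"]
    width = len(rows[0])
    problems = []
    for number, row in enumerate(rows[1:], start=2):
        # Blank lines parse as empty rows and are ignored here.
        if row and len(row) != width:
            problems.append(f"row {number} has {len(row)} fields, expected {width}")
    return problems
```

A field-count mismatch often points to an unquoted comma inside a value, such as a sample name or metadata field.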
Report is extremely slow to load or does not finish loading
Possible cause: Very large datasets (for example, >50,000 samples) generate large input files that require significant client‑side processing. In the current implementation, extremely large datasets (for example, ~100,000 samples) may exceed browser or memory limits and fail to load completely.
Recommended action: Filter or split the dataset into smaller subsets and generate separate QC reports for each subset.
What happens if the input folder contains both genotyping and methylation datasets?
If dragena qc report detects both genotyping and methylation dataset folders within the same --data input set, report generation stops with an error.
In this guide, mixed dataset types specifically means this combination.
Example: A parent folder passed to --data contains one dataset folder with methylation qc call outputs and two dataset folders with genotyping qc call outputs.
This folder structure triggers the mixed-dataset error:
Recommended action: Generate separate QC reports for genotyping and methylation inputs. For example, separate them into different parent folders:
What happens if the input folder contains both PGx and non-PGx genotyping datasets?
This combination is supported. The report is generated as long as each dataset folder contains the required input files.
PGx datasets can include TGA control information, while non-PGx datasets do not. Other shared genotyping QC outputs are generated normally.
For example, if the input contains one PGx dataset folder and one non-PGx genotyping dataset folder, the report still runs as a single genotyping report. TGA-related values are populated only for the PGx dataset.
What happens if the input folder contains both legacy and EX genotyping chips?
This combination is supported and does not trigger the mixed-dataset error.
There is no special difference in report generation or QC interpretation beyond chip-format-specific heatmap layout. Because legacy and EX chips use different physical layouts, the chip heatmaps can appear different even when the rest of the report is comparable.
The same guidance applies to different versions of the same genotyping BeadChip family: they are treated as supported genotyping inputs rather than as mixed dataset types.
Warning and error messages
During report generation, the following situations may trigger warnings or errors:
Input data directory does not exist
Required input files are missing in one or more dataset folders
Number of labels does not match number of detected datasets
Labels are modified during sanitization and become non-unique
Genotyping and methylation dataset folders are detected together in a single run
Invalid output format is specified
Invalid or unsupported QC threshold configuration is provided
Note: "mixed dataset types" refers specifically to combining genotyping and methylation datasets in one report. Supported mixed genotyping inputs, such as PGx + non-PGx, legacy + EX, or different versions of the same genotyping BeadChip family, do not trigger this error.
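The two label-related situations in the list above (sanitization, and duplicates that result from it) can be checked before running the report. This sketch is a guess at the behavior, not the tool's actual rule: the replacement pattern is inferred from the single documented example (My Label<1> becomes My_Label_1), and `sanitize_label` and `duplicate_labels` are hypothetical helpers.

```python
import re
from collections import Counter


def sanitize_label(label: str) -> str:
    """Replace runs of characters outside A-Z, a-z, 0-9, _ with one underscore (assumed rule)."""
    return re.sub(r"[^A-Za-z0-9_]+", "_", label).strip("_")


def duplicate_labels(labels: list) -> dict:
    """Return sanitized labels that occur more than once, with their counts."""
    counts = Counter(sanitize_label(label) for label in labels)
    return {label: n for label, n in counts.items() if n > 1}
```

If `duplicate_labels` returns a non-empty result, choosing more distinct labels up front avoids the duplicate-label error.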
Examples
❌ Input data folder does not exist: data/non_existent_dir
Input directory does not exist
❌ No valid dataset folders detected. Each folder must include at least: controls.raw_metrics.csv, controls.qc_metrics.csv
No folders containing both required CSV files were found
❌ Dataset MyDataset is missing required file(s): ... controls.qc_metrics.csv ...
One or more required control metric files are absent
❌ Number of labels (2) does not match number of datasets (3).
--label count does not match detected datasets
⚠️ Label sanitized: My Label<1> → My_Label_1
Label contained unsupported characters
❌ Duplicate labels after sanitization: dataset (x2). Each dataset label must be unique.
Labels became non-unique after sanitization
❌ Mixed dataset types are not supported in a single report. Genotyping datasets: geno_run_A, geno_run_B ... Methylation datasets: meth_run_C ...
A single run included both genotyping and methylation dataset folders. This error is not raised for supported mixed genotyping inputs such as PGx + non-PGx or legacy + EX.
❌ Invalid --output-format value. Only csv or xlsx are accepted.
Unsupported output format
❌ QC config file was provided but does not exist: /path/to/config.yaml
Config path does not exist
❌ Input file must be a YAML file with extension .yaml/.yml. Current config file extension: .json
Config must be YAML
❌ Failed to parse QC config as YAML ...
YAML syntax error
❌ QC config must be wrapped in a top-level QC object containing a Report object.
Config structure is incorrect
❌ Invalid threshold overrides: Unknown threshold key ...
Unrecognized threshold name
❌ Invalid threshold overrides: Invalid numeric value ...
Value could not be parsed as a number
❌ Threshold for Autosomal Call Rate must be between 0 and 1 ...
Autosomal Call Rate threshold out of bounds
❌ Threshold for stainingGreen must be non-negative ...
Threshold must be ≥ 0
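The threshold-related messages above suggest a few checks you can apply to your overrides before passing a config file to the report. The sketch below is illustrative only. `validate_thresholds` is a hypothetical helper that takes a flat mapping of threshold names to values and does not model the QC and Report wrapper objects. It applies range rules only to the two thresholds named in the examples, and it treats the 0 to 1 range as inclusive, which is an assumption.

```python
import math

# (minimum, maximum) per threshold; None means no upper bound.
# Only the two thresholds named in the example messages are listed.
RANGE_RULES = {
    "Autosomal Call Rate": (0.0, 1.0),
    "stainingGreen": (0.0, None),
}


def validate_thresholds(overrides: dict) -> list:
    """Return human-readable problems for a flat {threshold name: value} mapping."""
    errors = []
    for key, raw in overrides.items():
        try:
            value = float(raw)
        except (TypeError, ValueError):
            errors.append(f"Invalid numeric value for {key}: {raw!r}")
            continue
        if not math.isfinite(value):
            errors.append(f"Invalid numeric value for {key}: {raw!r}")
            continue
        low, high = RANGE_RULES.get(key, (None, None))
        if low is not None and value < low:
            errors.append(f"Threshold for {key} must be at least {low:g}")
        elif high is not None and value > high:
            errors.append(f"Threshold for {key} must be at most {high:g}")
    return errors
```

The sketch does not check for unknown threshold keys, because the full list of valid names is not given in this section.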
