VAnDa: Variant Annotation Dashboard

Variant annotation server with an advanced query builder

What is VAnDa?

VAnDa (Variant Annotation Dashboard) is a web-based platform for annotating genomic variants from VCF files. It integrates multiple annotation sources to provide comprehensive functional, clinical, and population-level information for each variant.

Annotation Sources

  • VEP (Variant Effect Predictor) v114 — Functional consequences, transcript impact
  • dbNSFP v5.2a — Deleteriousness scores (CADD, REVEL, MetaSVM) and population frequencies
  • ClinVar — Clinical significance and disease associations
  • COSMIC — Cancer mutation annotations and genome screen counts
  • dbscSNV — Splice site predictions
  • Orphanet — Rare disease associations with HPO categories
  • gnomAD v4.1 — Population allele frequencies

Annotation Pipeline

The annotation process runs through the following steps:

Step 1: VCF Normalization
Normalizes variant representation using bcftools norm: splits multiallelic sites into separate records and left-aligns indels, so that each variant has a single, consistent representation.
Step 2: VCF Cleaning
Removes any pre-existing annotation fields (CSQ, ANN, GCSQ) using bcftools annotate to ensure a clean starting point.
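
Steps 1 and 2 can be sketched as two bcftools command lines. This is a minimal sketch, not VAnDa's actual invocation: the file names, the reference FASTA path, and the choice of bgzip output are assumptions made for illustration. The options used (`norm -m -any -f`, `annotate -x`) are standard bcftools options.

```python
from typing import List, Sequence


def norm_command(in_vcf: str, out_vcf: str, ref_fasta: str) -> List[str]:
    """Build a bcftools norm call that splits multiallelic sites and left-aligns indels."""
    return [
        "bcftools", "norm",
        "-m", "-any",      # split multiallelic records into biallelic ones
        "-f", ref_fasta,   # reference FASTA, required for left-alignment
        "-O", "z",         # bgzip-compressed VCF output (assumption)
        "-o", out_vcf,
        in_vcf,
    ]


def clean_command(in_vcf: str, out_vcf: str,
                  tags: Sequence[str] = ("CSQ", "ANN", "GCSQ")) -> List[str]:
    """Build a bcftools annotate call that removes pre-existing INFO annotation tags."""
    remove = ",".join(f"INFO/{tag}" for tag in tags)
    return ["bcftools", "annotate", "-x", remove, "-O", "z", "-o", out_vcf, in_vcf]


# Hypothetical file names; run each list with subprocess.run(cmd, check=True).
print(" ".join(norm_command("input.vcf.gz", "norm.vcf.gz", "reference.fa")))
print(" ".join(clean_command("norm.vcf.gz", "clean.vcf.gz")))
```
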
Step 3: VEP Annotation
Runs Ensembl VEP v114 with the following plugins and custom annotations:
  • Plugins: dbNSFP (ALL fields), dbscSNV, CADD v1.7
  • Custom tracks: ClinVar (CLNSIG, CLNDN, CLNDISDB), COSMIC (TIER, GENOME_SCREEN_SAMPLE_COUNT)
  • Transcript selection: By default, all transcripts are reported. With the --severity flag, only the most severe consequence per variant is reported.
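
The effect of selecting the most severe consequence can be illustrated on a VEP CSQ value, where transcript entries are comma-separated and fields are pipe-separated in the order declared in the VCF header. This is a simplified sketch: it ranks entries by the IMPACT category only, which is an assumption and not necessarily the rule VAnDa or VEP applies. The field order and example values below are hypothetical.

```python
from typing import Dict, List

# Lower rank means more severe. Simplification: only the IMPACT category is ranked.
IMPACT_RANK = {"HIGH": 0, "MODERATE": 1, "LOW": 2, "MODIFIER": 3}


def parse_csq(csq_value: str, fields: List[str]) -> List[Dict[str, str]]:
    """Split a CSQ value into one dict per transcript entry."""
    return [dict(zip(fields, entry.split("|"))) for entry in csq_value.split(",")]


def most_severe(entries: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the entry with the most severe IMPACT; unknown values rank last."""
    return min(entries, key=lambda e: IMPACT_RANK.get(e.get("IMPACT", ""), len(IMPACT_RANK)))


# Hypothetical field order and values, for illustration only.
fields = ["Allele", "Consequence", "IMPACT", "SYMBOL", "Gene"]
csq = "T|intron_variant|MODIFIER|GENE1|ENSG_EXAMPLE,T|missense_variant|MODERATE|GENE1|ENSG_EXAMPLE"
print(most_severe(parse_csq(csq, fields))["Consequence"])  # missense_variant
```
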
Step 4: Gene-Level Annotation
Adds gene-level information from dbNSFP using annotateNSFP_gene.py, including gene intolerance scores (pLI, LOEUF) and RefSeq gene annotations.
Step 5: Variant Extraction
Extracts the relevant fields from the annotated VCF into a tab-separated (TSV) file, optionally filtering by disease category (all, rare-disease, or cancer).
Step 6: HTML Report Generation
Generates an interactive DataTables HTML report with an advanced query builder, HPO category badges, and clickable links to the Orphanet and HPO ontologies.

Usage

Web Interface

The VAnDa web interface accepts two input modes:

  • Upload VCF file — Drag & drop or browse for a .vcf or .vcf.gz file (max 50 MB)
  • Submit variant list — Paste variant coordinates in chr,pos,ref,alt format
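
A pasted variant list in chr,pos,ref,alt format maps naturally onto the first columns of a VCF record. The sketch below shows one way to validate such input and turn it into a minimal VCF. The function names, the validation rules, and the VCFv4.2 header are assumptions for illustration; the server's own conversion is not described here.

```python
from typing import List, Tuple

Variant = Tuple[str, int, str, str]


def parse_variant_list(text: str) -> List[Variant]:
    """Parse lines of 'chr,pos,ref,alt', skipping blank lines."""
    variants: List[Variant] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 4:
            raise ValueError(f"line {line_no}: expected chr,pos,ref,alt, got {line!r}")
        chrom, pos, ref, alt = parts
        if not pos.isdigit() or int(pos) < 1:
            raise ValueError(f"line {line_no}: position must be a positive integer")
        variants.append((chrom, int(pos), ref.upper(), alt.upper()))
    return variants


def to_vcf(variants: List[Variant]) -> str:
    """Render variants as a minimal VCF with empty ID, QUAL, FILTER, and INFO."""
    lines = ["##fileformat=VCFv4.2", "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"]
    lines += [f"{c}\t{p}\t.\t{r}\t{a}\t.\t.\t." for c, p, r, a in variants]
    return "\n".join(lines) + "\n"


# Hypothetical input, for illustration only.
print(to_vcf(parse_variant_list("chr1,12345,A,G\nchr2,500,c,t")))
```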

Filter Options

  • All variants — No disease-specific filter
  • Rare Disease — Filters for variants with known rare disease associations (Orphanet, ClinVar)
  • Cancer — Filters for variants with cancer annotations (COSMIC)
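
The three filter options can be read as a per-variant predicate over the annotation fields. The following is a sketch under assumptions: it treats a variant as rare-disease-associated when Orphanet_id or ClinVar_CLNSIG is populated and as cancer-associated when the COSMIC ID field (Cancer) is populated, and it treats ".", "-", and the empty string as missing. The exact rule VAnDa applies may differ.

```python
from typing import Dict, Optional

MISSING = {"", ".", "-"}  # assumed placeholders for an empty annotation


def has_value(row: Dict[str, Optional[str]], field: str) -> bool:
    """True when the field is present and not a missing-value placeholder."""
    return (row.get(field) or "").strip() not in MISSING


def keep_variant(row: Dict[str, Optional[str]], category: str = "all") -> bool:
    """Decide whether a variant passes the selected disease-category filter."""
    if category == "all":
        return True
    if category == "rare-disease":
        return has_value(row, "Orphanet_id") or has_value(row, "ClinVar_CLNSIG")
    if category == "cancer":
        return has_value(row, "Cancer")
    raise ValueError(f"unknown category: {category}")


# Hypothetical row, for illustration only.
row = {"Orphanet_id": ".", "ClinVar_CLNSIG": "Pathogenic", "Cancer": "."}
print(keep_variant(row, "rare-disease"), keep_variant(row, "cancer"))  # True False
```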

Genotype Control

  • Keep sample genotypes — Toggle ON to include genotype information in the output

Email Notification

Optionally, provide an email address to be notified when the annotation job completes. Jobs run on a SLURM cluster.

Output Files

Annotated VCF (.vcf.gz)

The full annotated VCF with all VEP, dbNSFP, ClinVar, COSMIC, and gene-level annotations in the INFO/CSQ field.

Variant TSV (_vars.tsv.gz)

Extracted tab-separated file with key annotation fields for downstream analysis.
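
For downstream analysis, the gzipped TSV can be read with the Python standard library alone. This sketch assumes the first line of the file is a header row with column names; the demo file and its contents are hypothetical and written to a temporary directory only so the example is self-contained.

```python
import csv
import gzip
import os
import tempfile
from typing import Dict, Iterator


def read_variants(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per variant row from a gzipped, tab-separated file."""
    with gzip.open(path, "rt", newline="") as handle:
        yield from csv.DictReader(handle, delimiter="\t")


# Hypothetical demo file standing in for a real *_vars.tsv.gz output.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_vars.tsv.gz")
with gzip.open(demo_path, "wt", newline="") as out:
    out.write("CHROM\tPOS\tREF\tALT\tSYMBOL\n")
    out.write("chr1\t12345\tA\tG\tGENE1\n")

for variant in read_variants(demo_path):
    print(variant["CHROM"], variant["POS"], variant["SYMBOL"])
```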

HTML Report (_vars.tsv.html)

Interactive DataTables report featuring:

  • Sortable, searchable columns
  • Advanced query builder with AND/OR/NOT logic and grouping
  • Expandable rows showing all annotation fields
  • HPO category badges with links to OLS4 ontology browser
  • Orphanet and ClinVar links for disease associations
  • Export to CSV, Copy, Print functionality
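
The AND/OR/NOT logic with grouping offered by the query builder amounts to evaluating a nested condition tree against each row. The sketch below illustrates that logic in Python; it is not the report's implementation, and the dictionary-based query format and operator names are assumptions chosen for the example.

```python
from typing import Any, Dict

Row = Dict[str, str]


def _compare(cell: str, op: str, value: Any) -> bool:
    """Apply a single comparison to one cell value."""
    if op == "==":
        return cell == str(value)
    if op == "!=":
        return cell != str(value)
    if op == "contains":
        return str(value) in cell
    if op in ("<", ">"):
        try:
            number = float(cell)
        except ValueError:
            return False  # missing or non-numeric cells never pass a numeric test
        return number < float(value) if op == "<" else number > float(value)
    raise ValueError(f"unknown operator: {op}")


def evaluate(node: Dict[str, Any], row: Row) -> bool:
    """Recursively evaluate a query tree; nested and/or/not nodes act as groups."""
    if "and" in node:
        return all(evaluate(child, row) for child in node["and"])
    if "or" in node:
        return any(evaluate(child, row) for child in node["or"])
    if "not" in node:
        return not evaluate(node["not"], row)
    return _compare(row.get(node["field"], ""), node["op"], node["value"])


# IMPACT == HIGH AND (gnomADe_AF < 0.01 OR NOT ClinVar_CLNSIG contains "Benign")
query = {"and": [
    {"field": "IMPACT", "op": "==", "value": "HIGH"},
    {"or": [
        {"field": "gnomADe_AF", "op": "<", "value": 0.01},
        {"not": {"field": "ClinVar_CLNSIG", "op": "contains", "value": "Benign"}},
    ]},
]}
row = {"IMPACT": "HIGH", "gnomADe_AF": "0.0002", "ClinVar_CLNSIG": "Pathogenic"}
print(evaluate(query, row))  # True
```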

Key Annotation Fields

Variant Identification

  • CHROM — Chromosome
  • POS — Genomic position (1-based)
  • REF — Reference allele
  • ALT — Alternate allele

Functional Impact

  • Consequence — Sequence Ontology term (e.g., missense_variant)
  • IMPACT — Impact category: HIGH, MODERATE, LOW, MODIFIER
  • SYMBOL — Gene symbol (HGNC)
  • Gene — Ensembl Gene ID

Pathogenicity Scores

  • CADD_PHRED — Combined Annotation Dependent Depletion score
  • phyloP100way_vertebrate — Basewise conservation score

Population Frequencies

  • gnomADe_AF — gnomAD exome allele frequency (v4.1)
  • dbNSFP_POPMAX_AF — Maximum population allele frequency
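
A common use of the frequency fields is to keep only rare variants. The sketch below is one possible approach: the 1% threshold is an arbitrary example value, and treating a missing or non-numeric frequency as "not observed, therefore rare" is an assumption that should be checked against the analysis at hand.

```python
from typing import Dict, Optional, Sequence


def parse_af(value: Optional[str]) -> Optional[float]:
    """Convert an allele-frequency cell to float, or None when missing or non-numeric."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def is_rare(row: Dict[str, str], max_af: float = 0.01,
            fields: Sequence[str] = ("gnomADe_AF", "dbNSFP_POPMAX_AF")) -> bool:
    """True when no available frequency exceeds max_af (missing values are ignored)."""
    for field in fields:
        af = parse_af(row.get(field))
        if af is not None and af > max_af:
            return False
    return True


# Hypothetical rows, for illustration only.
print(is_rare({"gnomADe_AF": "0.0001", "dbNSFP_POPMAX_AF": "0.002"}))  # True
print(is_rare({"gnomADe_AF": "0.0001", "dbNSFP_POPMAX_AF": "0.05"}))   # False
```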

Clinical Significance

  • ClinVar_CLNSIG — Clinical significance (Pathogenic, Benign, etc.)
  • ClinVar_CLNDISDB — Associated disease databases

Cancer Annotations

  • Cancer — COSMIC variant ID
  • Cancer_TIER — Cancer gene tier classification
  • Cancer_GENOME_SCREEN_SAMPLE_COUNT — Number of genome-screen samples in COSMIC carrying the variant

Rare Disease

  • Orphanet_id — Orphanet disorder identifier
  • Orphanet_disorder — Disease name from Orphanet
  • HPO_id — Human Phenotype Ontology term IDs
  • HPO_Categories — Disease class categories (e.g., NEUROLOGICAL, CARDIOVASCULAR)