LANCR documentation

Using Synthetic Datasets for Testing

Guide to selecting, validating, and operationalizing synthetic datasets in sandbox workflows.

Why Synthetic Data

Synthetic datasets help teams test product and model behavior without exposing sensitive customer data.

Benefits include:

  • Faster experimentation cycles
  • Safer data governance posture
  • Better edge-case coverage for controlled testing

Dataset Categories

  • Retail banking behavior
  • Lending and repayment patterns
  • Payments and transaction streams
  • Open banking account and consent models
  • Claims and insurance event simulations

Selection Workflow

  1. Define the product decision points you need to validate.
  2. Choose a dataset category that matches the scope of those decisions.
  3. Verify schema fit, volume, and temporal distribution.
  4. Run quality checks before any model or API testing.
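
Step 3 of the workflow above can be sketched as a small check function. This is a minimal illustration, not a sandbox API: the `expectedSpec` object, the field names (`id`, `amount`, `timestamp`), and the thresholds are all assumptions chosen for the example.

```javascript
// Hypothetical expectations for a transactions dataset; all names are illustrative.
const expectedSpec = {
  fields: { id: "string", amount: "number", timestamp: "string" },
  minRows: 3,
  minDistinctMonths: 2,
};

function checkDatasetFit(rows, spec) {
  const problems = [];

  // Schema fit: every row has each expected field with the expected type.
  rows.forEach((row, i) => {
    for (const [field, type] of Object.entries(spec.fields)) {
      if (typeof row[field] !== type) {
        problems.push(`row ${i}: ${field} should be ${type}`);
      }
    }
  });

  // Volume: enough rows to exercise the decision point.
  if (rows.length < spec.minRows) {
    problems.push(`only ${rows.length} rows, need ${spec.minRows}`);
  }

  // Temporal distribution: records should span several calendar months.
  const months = new Set(
    rows
      .filter((r) => typeof r.timestamp === "string")
      .map((r) => r.timestamp.slice(0, 7)) // "YYYY-MM" prefix of an ISO 8601 string
  );
  if (months.size < spec.minDistinctMonths) {
    problems.push(`covers ${months.size} month(s), need ${spec.minDistinctMonths}`);
  }

  return problems;
}

const sampleRows = [
  { id: "t1", amount: 12.5, timestamp: "2024-01-15T10:00:00Z" },
  { id: "t2", amount: 80, timestamp: "2024-02-03T09:30:00Z" },
  { id: "t3", amount: 5.25, timestamp: "2024-02-20T17:45:00Z" },
];
const fitProblems = checkDatasetFit(sampleRows, expectedSpec);
```

An empty `fitProblems` array means the sample passed all three checks. Returning a list of messages rather than a boolean makes it easier to record why a dataset was rejected.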

Data Quality Checklist

  • Field and type consistency
  • Relational integrity across entities
  • Coverage of normal and stress scenarios
  • Reproducibility and version metadata
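
Two of the checklist items above, relational integrity and version metadata, might be checked along these lines. The entity shapes (`accountId`, `txId`) and the `id`/`version` metadata keys are assumptions for illustration, not the actual sandbox schema.

```javascript
// Illustrative entities; field names are assumptions, not the sandbox schema.
function findOrphanTransactions(accounts, transactions) {
  const accountIds = new Set(accounts.map((a) => a.accountId));
  // Relational integrity: every transaction must reference a known account.
  return transactions.filter((t) => !accountIds.has(t.accountId));
}

function hasVersionMetadata(dataset) {
  // Reproducibility: require a non-empty ID and version string on the dataset.
  return ["id", "version"].every(
    (key) => typeof dataset[key] === "string" && dataset[key].length > 0
  );
}

const accounts = [{ accountId: "a1" }, { accountId: "a2" }];
const transactions = [
  { txId: "t1", accountId: "a1" },
  { txId: "t2", accountId: "a9" }, // references an account that does not exist
];
const orphans = findOrphanTransactions(accounts, transactions);
```

Here `orphans` contains only the `t2` transaction, which is the kind of record that should block further testing until the dataset is fixed or replaced.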

Example Access Pattern

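// List the datasets available in the sandbox.
const response = await fetch("/api/sandbox/datasets");
if (!response.ok) throw new Error(`Dataset request failed: ${response.status}`);
const datasets = await response.json();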
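
Once the listing is loaded, it can be narrowed to one category. The sketch below assumes the endpoint returns a JSON array of descriptors with `id`, `category`, and `version` fields; this page does not document the response shape, so treat those names as hypothetical.

```javascript
// Assumed response shape: [{ id, category, version }, ...]. Not confirmed by the docs.
function pickDatasets(datasets, category) {
  return datasets
    .filter((d) => d.category === category) // filter() returns a new array
    .sort((a, b) => a.id.localeCompare(b.id)); // stable, readable ordering by ID
}

// Stand-in for the parsed response, used here instead of a live request.
const listing = [
  { id: "pay-002", category: "payments", version: "1.1.0" },
  { id: "lend-001", category: "lending", version: "2.0.0" },
  { id: "pay-001", category: "payments", version: "1.0.0" },
];
const paymentDatasets = pickDatasets(listing, "payments");
```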

Download and Usage Rules

  • Use datasets only in approved sandbox environments.
  • Do not mix them with production data that is not authorized for sandbox use.
  • Document all transformations used in testing.

For each test run, record:

  • Dataset ID and version used
  • Test objectives and assumptions
  • Observed anomalies and mitigation updates
  • Impact on KPI confidence levels
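
The four items above (dataset ID and version, objectives and assumptions, anomalies, KPI impact) could be captured in a structured record so that every test run documents the same fields. The function name `buildTestRunRecord` and the record layout are hypothetical, shown only as one possible shape.

```javascript
// Hypothetical record shape covering the four documented items.
function buildTestRunRecord({
  datasetId,
  datasetVersion,
  objectives = [],
  assumptions = [],
  anomalies = [],
  kpiImpact = "",
}) {
  // Dataset ID and version are mandatory so the run can be reproduced.
  if (!datasetId || !datasetVersion) {
    throw new Error("datasetId and datasetVersion are required");
  }
  return {
    dataset: { id: datasetId, version: datasetVersion },
    objectives,
    assumptions,
    anomalies, // each entry can pair an observation with its mitigation
    kpiImpact,
  };
}

const runRecord = buildTestRunRecord({
  datasetId: "pay-001", // illustrative ID, not a real dataset
  datasetVersion: "1.0.0",
  objectives: ["Validate duplicate-payment detection"],
  anomalies: [{ observed: "Timestamps out of order", mitigation: "Sort before replay" }],
  kpiImpact: "Lower confidence in latency KPI until rerun",
});
```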

Next Step

Use dataset outputs to refine application, monitoring, and risk controls before live supervised testing.