Using Synthetic Datasets for Testing
Guide to selecting, validating, and operationalizing synthetic datasets in sandbox workflows.
Why Synthetic Data
Synthetic datasets help teams test product and model behavior without exposing sensitive customer data.
Benefits include:
- Faster experimentation cycles
- Safer data governance posture
- Better edge-case coverage for controlled testing
Dataset Categories
- Retail banking behavior
- Lending and repayment patterns
- Payments and transaction streams
- Open banking account and consent models
- Claims and insurance event simulations
Selection Workflow
- Define product decision points to validate.
- Choose a dataset category aligned with the decision scope.
- Verify schema fit, volume, and temporal distribution.
- Run quality checks before model or API testing.
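The verification step above can be sketched as a small pre-flight check. This is a minimal illustration, not part of the sandbox API: the function name `checkDatasetFit`, its option names, and the assumption that dates are ISO 8601 strings are all hypothetical.

```javascript
// Hypothetical pre-flight check for schema fit, volume, and temporal distribution.
// Assumes each record is a plain object and dates are ISO 8601 strings ("YYYY-MM-DD...").
function checkDatasetFit(records, { requiredFields, minRecords, dateField, expectedMonths }) {
  const issues = [];

  // Volume: enough records for the decision being validated.
  if (records.length < minRecords) {
    issues.push(`volume: ${records.length} records, expected at least ${minRecords}`);
  }

  // Schema fit: collect every required field that is absent from at least one record.
  const missing = new Set();
  for (const record of records) {
    for (const field of requiredFields) {
      if (!(field in record)) missing.add(field);
    }
  }
  if (missing.size > 0) {
    issues.push(`schema: missing fields ${[...missing].join(", ")}`);
  }

  // Temporal distribution: count records per month and flag expected months with none.
  const perMonth = new Map();
  for (const record of records) {
    const month = String(record[dateField] ?? "").slice(0, 7); // "YYYY-MM"
    perMonth.set(month, (perMonth.get(month) ?? 0) + 1);
  }
  const emptyMonths = expectedMonths.filter((month) => !perMonth.has(month));
  if (emptyMonths.length > 0) {
    issues.push(`temporal: no records in ${emptyMonths.join(", ")}`);
  }

  return issues;
}
```

An empty result means the sample passed all three checks. Any returned entry should be resolved, or recorded as a known limitation, before model or API testing starts.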
Data Quality Checklist
- Field and type consistency
- Relational integrity across entities
- Coverage of normal and stress scenarios
- Reproducibility and version metadata
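The first two checklist items lend themselves to automation. The sketch below shows one possible approach. The helper names and the entity shapes used in the comments (`accounts` with an `accountId`, `transactions` that reference it) are assumptions for illustration, not the actual dataset schema.

```javascript
// Field and type consistency: report every field whose typeof differs from the expectation.
// expectedTypes maps field names to typeof strings, e.g. { amount: "number" }.
function findTypeMismatches(records, expectedTypes) {
  const mismatches = [];
  records.forEach((record, index) => {
    for (const [field, expected] of Object.entries(expectedTypes)) {
      const actual = typeof record[field];
      if (actual !== expected) {
        mismatches.push({ index, field, expected, actual });
      }
    }
  });
  return mismatches;
}

// Relational integrity: return child records whose foreign key matches no parent record.
// Example (hypothetical): findOrphans(transactions, accounts, "accountId", "accountId").
function findOrphans(children, parents, foreignKey, parentKey) {
  const parentIds = new Set(parents.map((parent) => parent[parentKey]));
  return children.filter((child) => !parentIds.has(child[foreignKey]));
}
```

Both helpers return the offending rows rather than a pass/fail flag, so the findings can be copied directly into the test report.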
Example Access Pattern
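// List the datasets available in the sandbox.
const response = await fetch("/api/sandbox/datasets");
if (!response.ok) throw new Error(`Dataset request failed: ${response.status}`);
const datasets = await response.json();
Download and Usage Rules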
- Use datasets only in approved sandbox environments.
- Do not mix synthetic datasets with production data that is not authorized for sandbox use.
- Document all transformations used in testing.
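One lightweight way to document transformations is to route each one through a log that records what was done and how the record count changed. The following is a sketch only: `createTransformLog` and the fields it records are hypothetical, and a real workflow may need more detail, such as parameters or timestamps.

```javascript
// Hypothetical transformation log tied to one dataset ID and version.
function createTransformLog(datasetId, version) {
  const steps = [];
  return {
    // Runs fn on the records and records a description plus input/output counts.
    apply(description, records, fn) {
      const output = fn(records);
      steps.push({
        description,
        inputCount: records.length,
        outputCount: output.length,
      });
      return output;
    },
    // Returns a copy of the log suitable for attaching to a test report.
    summary() {
      return { datasetId, version, steps: steps.map((step) => ({ ...step })) };
    },
  };
}
```

Each transformation is then applied as `log.apply("description", records, fn)`, and `log.summary()` yields the record of what was changed during testing.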
Recommended Reporting
- Dataset ID and version used
- Test objectives and assumptions
- Observed anomalies and mitigation updates
- Impact on KPI confidence levels
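The reporting items above can be enforced with a simple completeness check before a report is filed. This is an illustrative sketch: the field names below are assumptions mapped from the list, not a prescribed report format.

```javascript
// Hypothetical field names, one or more per reporting item in the list above.
const REQUIRED_REPORT_FIELDS = [
  "datasetId",
  "datasetVersion",
  "objectives",
  "assumptions",
  "anomalies",
  "kpiImpact",
];

// Returns the names of required fields that are absent or blank.
// An empty anomalies array is accepted, since a run may observe none.
function missingReportFields(report) {
  return REQUIRED_REPORT_FIELDS.filter((field) => {
    const value = report[field];
    return value === undefined || value === null || value === "";
  });
}
```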
Next Step
Use dataset outputs to refine application, monitoring, and risk controls before live supervised testing.