Examples

Five worked examples covering the complete synth-bench API.

| Notebook | Topic |
| --- | --- |
| DGP Families | All 8 data-generating process families: generate, metadata, complexity effects |
| Corruptors | All 6 corruptors: MCAR/MAR/MNAR, severity levels, chained pipelines |
| Sweeps & Suites | severity_sweep, difficulty_sweep, experiment_grid, BenchSuite |
| End-to-End Workflow | Generate -> corrupt -> sweep -> serialize -> reload -> sklearn benchmark |
| Mini AMLB Benchmark | OpenML task + 3 sklearn classifiers + synthbench corruption severity sweep and Bayes error floor |
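
For readers new to the missingness terms in the Corruptors row, the following is a minimal standard-library sketch of how MCAR, MAR, and MNAR differ across a few severity levels. It does not use synthbench: the function names (`mcar`, `mar`, `mnar`, `missing_rate`), the median cutoff, and the mapping from severity to drop probability are all assumptions made for illustration, and the library's corruptors may behave differently.

```python
import random
import statistics


def mcar(rows, severity, rng):
    # MCAR: y is dropped with the same probability for every row,
    # independent of any value in the data.
    return [(x, None if rng.random() < severity else y) for x, y in rows]


def mar(rows, severity, rng):
    # MAR: whether y is dropped depends only on the observed column x.
    # Illustrative rule: rows with x above the median lose y with
    # probability min(1, 2 * severity); other rows are untouched.
    cut = statistics.median(x for x, _ in rows)
    p = min(1.0, 2 * severity)
    return [(x, None if x > cut and rng.random() < p else y) for x, y in rows]


def mnar(rows, severity, rng):
    # MNAR: whether y is dropped depends on y itself, which is exactly
    # the value that becomes unobserved.
    cut = statistics.median(y for _, y in rows)
    p = min(1.0, 2 * severity)
    return [(x, None if y > cut and rng.random() < p else y) for x, y in rows]


def missing_rate(rows):
    # Fraction of rows whose y value was removed.
    return sum(y is None for _, y in rows) / len(rows)


rng = random.Random(0)
rows = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(500)]

# A tiny "severity sweep": apply each mechanism at increasing severity.
for severity in (0.1, 0.3, 0.5):
    rates = {f.__name__: missing_rate(f(rows, severity, rng)) for f in (mcar, mar, mnar)}
    print(severity, rates)
```

All three mechanisms target roughly the same overall missing rate at a given severity, but only MCAR leaves the distribution of the surviving `y` values unbiased, which is why the mechanisms are usually benchmarked separately.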

Installation

To run these notebooks locally:

pip install "synthbench[docs,neural,io]"
pip install torch --index-url https://download.pytorch.org/whl/cpu  # for RandomNeuralDGP
jupyter lab

All notebooks use n_samples <= 500 so they run quickly. Cell outputs shown on the site are generated by mkdocs-jupyter at build time from the current codebase.