RandomNeuralDGP

RandomNeuralDGP generates datasets using a randomly initialized neural network as the signal function. The random network maps input features to a nonlinear target, producing complex high-dimensional relationships that are difficult to approximate with simple models.
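The mechanism can be illustrated with a minimal NumPy sketch (not synthbench's actual implementation; the layer sizes and tanh activation are illustrative):

```python
import numpy as np

# A randomly initialized two-layer network used as a data-generating process:
# the weights are drawn once and never trained, so the mapping X -> y is a
# fixed but complex nonlinear function.
rng = np.random.default_rng(0)
n_samples, n_features, hidden = 500, 10, 32

X = rng.normal(size=(n_samples, n_features))
W1 = rng.normal(size=(n_features, hidden))   # random input-to-hidden weights
W2 = rng.normal(size=(hidden, 1))            # random hidden-to-output weights
y = (np.tanh(X @ W1) @ W2).ravel()           # nonlinear target, shape (500,)
print(X.shape, y.shape)  # (500, 10) (500,)
```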

Optional dependency

RandomNeuralDGP requires PyTorch. Install it with:

pip install synthbench[neural]

Quick Start

import synthbench
from synthbench import BenchPipeline, RandomNeuralDGP

dgp = RandomNeuralDGP(complexity="medium", task_type="regression", random_state=0)
pipeline = BenchPipeline(dgp)
result = pipeline.run(n_samples=500, n_features=10, random_state=42)

print(result.X.shape)   # (500, 10)
print(result.y.shape)   # (500,)
print(list(result.metadata.keys()))

# Signal importances sum to 1.0
importances = result.metadata["signal_feature_importances"]
print(sum(importances.values()))  # 1.0
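The sum-to-one property can be sketched as follows (a hedged illustration of the normalization step, not the library's code; the raw magnitudes here are random stand-ins for the gradient magnitudes described in the Notes):

```python
import numpy as np

# Raw per-feature magnitudes (stand-ins for mean |input gradient| values).
rng = np.random.default_rng(0)
raw = np.abs(rng.normal(size=10)).astype(np.float32)

# Normalize in Python floats (double precision) rather than float32.
vals = [float(v) for v in raw]
total = sum(vals)
importances = {f"x{i}": v / total for i, v in enumerate(vals)}
print(round(sum(importances.values()), 12))  # 1.0
```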

Parameters

Parameter      Default        Description
complexity     "medium"       Controls network width, depth, and activation nonlinearity
task_type      "regression"   "regression" for a continuous target, "classification" for binary labels
random_state   0              Integer seed for reproducibility
class_weight   0.5            (Classification only) Fraction of samples in the positive class
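If class_weight controls the positive-class fraction, one way to realize it (an assumed mechanism, not necessarily what synthbench does internally) is to threshold the continuous network output at the matching quantile:

```python
import numpy as np

# Assumed mechanism for class_weight: choose the threshold so that the
# requested fraction of samples falls in the positive class.
rng = np.random.default_rng(0)
scores = rng.normal(size=1000)     # continuous network output
class_weight = 0.5                 # desired positive-class fraction

threshold = np.quantile(scores, 1.0 - class_weight)
y = (scores > threshold).astype(int)
print(y.mean())  # 0.5
```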

Notes

  • Importing synthbench does not load PyTorch into sys.modules. PyTorch is imported lazily only when RandomNeuralDGP is first accessed.
  • Feature importances are computed using input gradient magnitudes, normalized to sum to 1.0. The normalization is done in Python (not torch float32) to guarantee exact sum == 1.0.
  • The network architecture (width, depth) is determined by the complexity parameter.
  • Reproducibility: the same random_state always produces an identical network and dataset.
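The lazy-import behavior described in the first note can be sketched with the module-level __getattr__ hook (PEP 562); fake_pkg and the flag below are illustrative stand-ins, not synthbench internals:

```python
import types

# Track when the "heavy" import would run; the real package would import
# torch at this point instead of flipping a flag.
state = {"imported": False}

def lazy_getattr(name):
    if name == "RandomNeuralDGP":
        state["imported"] = True        # torch import would happen here
        return type("RandomNeuralDGP", (), {})
    raise AttributeError(name)

fake_pkg = types.ModuleType("fake_pkg")
fake_pkg.__getattr__ = lazy_getattr     # PEP 562 module-level __getattr__

print(state["imported"])                # False: nothing loaded at import time
cls = fake_pkg.RandomNeuralDGP          # first access triggers the lazy path
print(state["imported"])                # True
```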