Synthetic datasets#
SMX includes utilities for generating synthetic spectral datasets. These are useful for demos, tests, and quick experimentation.
Generate synthetic spectra#
from smx import generate_synthetic_spectral_data
classes_config = [
{
"name": "A",
"n_samples": 80,
"peaks": [250, 380, 550, 700, 850],
"amplitude_mean": 1.0,
"width_mean": 15.0,
"noise_std": 0.04,
},
{
"name": "B",
"n_samples": 80,
"peaks": [50, 250, 380, 550, 850],
"amplitude_mean": 1.2,
"width_mean": 18.0,
"noise_std": 0.035,
},
]
df = generate_synthetic_spectral_data(
classes_config=classes_config,
n_points=500,
x_min=1,
x_max=1000,
seed=0,
)
The resulting DataFrame has a Class column followed by spectral variables
named after their x-axis values.
Peak model#
The generator uses a Gaussian peak model internally via gaussian_peak_model.
It supports either scalar peak centers or per-peak dictionaries that override
amplitude and width.