smx.datasets.synthetic
======================

.. py:module:: smx.datasets.synthetic


Functions
---------

.. autoapisummary::

   smx.datasets.synthetic.gaussian_peak_model
   smx.datasets.synthetic.generate_synthetic_spectral_data


Module Contents
---------------

.. py:function:: gaussian_peak_model(x, center, amplitude, width)

   Generate a one-dimensional Gaussian peak.

   Implements the equation::

       g(x) = A * exp(-(x - c)² / (2σ²))

   where A is ``amplitude``, c is ``center`` and σ is ``width``.

   Parameters
   ----------
   x : array_like
       Spectral axis (wavelengths, energy, channels).
   center : float
       Central position of the peak (same units as x).
   amplitude : float
       Maximum height of the peak (intensity at the center).
   width : float
       Standard deviation (σ) of the peak; controls its spread.

   Returns
   -------
   ndarray
       Array with the Gaussian peak evaluated at each point of x.

   Notes
   -----
   - For XRF: use a small width (~5–15) to simulate narrow lines.
   - For Vis-NIR: use a larger width (~20–50) for broad absorption bands.


.. py:function:: generate_synthetic_spectral_data(classes_config, n_points=500, x_min=0, x_max=1000, seed=None)

   Generate a synthetic spectral dataset for multiple classes.

   Returns a DataFrame where:

   - First column: ``'Class'`` (values defined by the user: 'A', 'B', 'C', …).
   - Remaining columns: spectral variables (intensity values).
   - Rows: individual samples.

   Parameters
   ----------
   classes_config : list of dict
       List of dicts, each defining one class. Supported keys:

       - ``'name'`` (str): class label (e.g. ``'A'``, ``'B'``, ``'Soil'``).
       - ``'n_samples'`` (int): number of samples to generate.
       - ``'peaks'`` (list): peak definitions on the spectral axis.
         Two formats are supported. A plain list of peak centers::

             [250, 550, 700]

         or a list of per-peak dicts::

             [
                 {'center': 250, 'amplitude_mean': 0.9, 'width_mean': 10},
                 {'center': 550, 'amplitude_mean': 1.3, 'width_mean': 18},
                 {'center': 700, 'amplitude_mean': 0.7, 'width_mean': 25},
             ]

         The second form allows per-peak amplitude/width customisation.
         Optional per-peak keys: ``amplitude_mean``, ``amplitude_std``,
         ``width_mean``, ``width_std``.
         Missing keys fall back to the class-level defaults below.
       - ``'amplitude_mean'`` (float, optional, default ``1.0``): mean peak amplitude.
       - ``'amplitude_std'`` (float, optional, default ``0.1``): std dev of amplitude.
       - ``'width_mean'`` (float, optional, default ``15.0``): mean peak width (σ).
       - ``'width_std'`` (float, optional, default ``2.0``): std dev of peak width.
       - ``'noise_std'`` (float, optional, default ``0.02``): std dev of baseline noise.

       Example::

           [
               {
                   'name': 'A',
                   'n_samples': 50,
                   'peaks': [
                       {'center': 250, 'amplitude_mean': 0.9, 'width_mean': 12},
                       {'center': 550, 'amplitude_mean': 1.4, 'width_mean': 20},
                       {'center': 700, 'amplitude_mean': 0.8, 'width_mean': 16},
                       {'center': 850, 'amplitude_mean': 1.1, 'width_mean': 24},
                   ],
                   'amplitude_mean': 1.0,
                   'amplitude_std': 0.1,
                   'width_mean': 15.0,
                   'width_std': 2.0,
               },
               {
                   'name': 'B',
                   'n_samples': 50,
                   'peaks': [250, 700, 850],
                   'amplitude_mean': 1.2,
                   'width_mean': 20.0,
               },
           ]

   n_points : int, default ``500``
       Number of points on the spectral axis (resolution).
   x_min, x_max : float, default ``0``, ``1000``
       Limits of the spectral axis (e.g. 400–1000 nm for Vis-NIR, 0–40 keV for XRF).
   seed : int, optional
       Random seed for reproducibility.

   Returns
   -------
   df : pandas.DataFrame
       Synthetic spectral dataset.

       - Column 0: ``'Class'`` (str, the class name from *classes_config*).
       - Columns 1 … n_points: spectral intensities named after x-axis values.
       - Shape: ``(total_samples, n_points + 1)``.
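
The Gaussian equation documented for ``gaussian_peak_model`` can be checked with a short standard-library sketch. This is not the ``smx`` implementation: the helper names ``gaussian_peak``, ``make_axis`` and ``synth_sample`` are hypothetical, the evenly spaced axis is an assumption, and building a spectrum as a sum of peaks plus Gaussian baseline noise is only a plausible reading of the ``peaks`` and ``noise_std`` keys.

```python
import math
import random


def gaussian_peak(x, center, amplitude, width):
    """Evaluate g(x) = A * exp(-(x - c)^2 / (2 * sigma^2)) at each point of x."""
    return [
        amplitude * math.exp(-((xi - center) ** 2) / (2.0 * width ** 2))
        for xi in x
    ]


def make_axis(n_points=500, x_min=0.0, x_max=1000.0):
    """Evenly spaced axis including both endpoints (assumed spacing)."""
    step = (x_max - x_min) / (n_points - 1)
    return [x_min + i * step for i in range(n_points)]


def synth_sample(x, peaks, noise_std=0.02, rng=None):
    """One spectrum: sum of Gaussian peaks plus baseline noise (assumed model)."""
    if rng is None:
        rng = random.Random()
    signal = [0.0] * len(x)
    for p in peaks:
        curve = gaussian_peak(x, p["center"], p["amplitude"], p["width"])
        signal = [s + c for s, c in zip(signal, curve)]
    # Add zero-mean Gaussian noise independently at every axis point.
    return [s + rng.gauss(0.0, noise_std) for s in signal]


# 501 points over 0..1000 gives a step of 2.0, so x = 250 lies on the grid.
x = make_axis(n_points=501, x_min=0.0, x_max=1000.0)
peak = gaussian_peak(x, center=250.0, amplitude=0.9, width=10.0)

# Hypothetical fixed per-peak values; the documented config draws them
# from mean/std pairs instead.
peaks = [
    {"center": 250.0, "amplitude": 0.9, "width": 10.0},
    {"center": 700.0, "amplitude": 0.7, "width": 25.0},
]
spectrum = synth_sample(x, peaks, rng=random.Random(0))
```

The peak reaches ``amplitude`` at ``center`` and drops to ``amplitude * exp(-0.5)`` (about 61 % of the maximum) one ``width`` away on either side, which is why ``width`` acts as the standard deviation σ. Passing a seeded ``random.Random`` makes the noisy spectrum repeatable, mirroring the role of the ``seed`` parameter.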