Pipeline overview#

The SMX class wraps the full explanation workflow so it can be run with a single fit() call. It orchestrates:

Spectral zone extraction
PCA aggregation of each zone
Quantile predicate generation
Bagging over samples (and predicates if requested)
Metric-based predicate ranking
Directed predicate graph construction
Local Reaching Centrality (LRC) ranking
Optional mapping back to natural spectral units

Core inputs#

spectral_cuts: zone definitions for the spectral axis
quantiles: thresholds used to build predicates per zone
X_cal_prep: preprocessed calibration spectra
y_pred_cal: continuous model outputs aligned with X_cal_prep
X_cal_natural: original (unpreprocessed) calibration spectra

Key outputs#

After fit(), the most used attributes are:

lrc_summed_: mean LRC ranking across seeds
lrc_summed_unique_: one row per zone, sorted by LRC
lrc_natural_: thresholds mapped back to natural units
zone_scores_: PCA scores per zone
predicates_df_: full predicate catalog
graphs_by_seed_: per-seed predicate graphs

Example#

from smx import SMX

smx = SMX(
    spectral_cuts=spectral_cuts,
    quantiles=[0.25, 0.5, 0.75],
    n_repetitions=4,
    n_bags=10,
    metric="perturbation",
    estimator=model,
)

smx.fit(X_cal_prep, y_pred_cal, X_cal_natural=X_cal)
print(smx.lrc_summed_unique_.head())