smx.graph.interpretation
Threshold mapping and predicate interpretation utilities.
Functions in this module translate LRC results from the preprocessed (score) space back to the natural (unpreprocessed) spectral space, and reconstruct multivariate threshold spectra for visualisation.
Functions
| extract_predicate_info | Extract components from a predicate rule string. |
| map_thresholds_to_natural | Map predicate thresholds from the preprocessed space to natural space. |
| reconstruct_threshold_to_spectrum | Reconstruct a scalar threshold to a multivariate threshold spectrum. |
Module Contents
- smx.graph.interpretation.extract_predicate_info(predicate_rule: str) → dict
Extract components from a predicate rule string.
Parameters
- predicate_rule : str
Rule in the format "zone_name <= threshold" or "zone_name > threshold".
Returns
- dict
{'zone': str, 'operator': str, 'threshold': float}
Examples
>>> extract_predicate_info("Ca ka <= 25.50")
{'zone': 'Ca ka', 'operator': '<=', 'threshold': 25.5}
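The rule format above lends itself to a regular expression. The following is a minimal sketch of one way such a rule could be split, not the module's actual implementation: the name `parse_predicate` and the pattern are hypothetical, and only the two operators named in the parameter description (`<=` and `>`) are accepted.

```python
import re

# Hypothetical pattern: a zone name (which may contain spaces), one of the
# two documented operators, and a numeric threshold.
_RULE_RE = re.compile(
    r"^\s*(?P<zone>.+?)\s*(?P<operator><=|>)\s*"
    r"(?P<threshold>[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s*$"
)


def parse_predicate(rule: str) -> dict:
    """Split a rule such as "Ca ka <= 25.50" into zone, operator, threshold."""
    match = _RULE_RE.match(rule)
    if match is None:
        raise ValueError(f"Unrecognised predicate rule: {rule!r}")
    return {
        "zone": match["zone"],
        "operator": match["operator"],
        "threshold": float(match["threshold"]),
    }


info = parse_predicate("Ca ka <= 25.50")
```

The lazy `.+?` group lets zone names contain spaces while still stopping at the first operator that is followed by a valid number.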
- smx.graph.interpretation.map_thresholds_to_natural(lrc_df: pandas.DataFrame, zone_sums_preprocessed: pandas.DataFrame, zone_sums_natural: pandas.DataFrame) → pandas.DataFrame
Map predicate thresholds from the preprocessed space to natural space.
For each predicate in lrc_df, this finds the calibration sample whose zone score in the preprocessed space is closest to the predicate’s threshold, and retrieves that sample’s value in the natural (unpreprocessed) space as the best approximation.
Parameters
- lrc_df : pd.DataFrame
LRC results. Must contain columns 'Zone', 'Threshold', 'Operator', and 'Node'.
- zone_sums_preprocessed : pd.DataFrame
Zone aggregation scores computed on preprocessed calibration data (same zones as lrc_df).
- zone_sums_natural : pd.DataFrame
Zone aggregation scores computed on the original (unpreprocessed) data.
Returns
- pd.DataFrame
Copy of lrc_df with additional columns:
- 'Threshold_Natural': threshold value in the natural space
- 'Reference_Sample_Index': index of the nearest calibration sample
- 'Approximation_Error': distance (preprocessed space) to the nearest sample
- 'Node_Natural': predicate rule string using the natural threshold
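The nearest-sample lookup described above can be sketched for a single predicate as follows. This is an illustration under stated assumptions, not the library's code: the helper name `nearest_sample_threshold` is hypothetical, the distance is taken to be the absolute difference, and both score frames are assumed to share the same sample index and to have one column per zone.

```python
import pandas as pd


def nearest_sample_threshold(zone, threshold, scores_pre, scores_nat):
    """Approximate a preprocessed-space threshold by a natural-space value."""
    # Distance of every calibration sample to the threshold (preprocessed space).
    distance = (scores_pre[zone] - threshold).abs()
    # Label of the calibration sample closest to the threshold.
    ref_index = distance.idxmin()
    return {
        "Threshold_Natural": float(scores_nat.loc[ref_index, zone]),
        "Reference_Sample_Index": ref_index,
        "Approximation_Error": float(distance.loc[ref_index]),
    }


# Toy calibration data for one zone; the values are made up for illustration.
pre = pd.DataFrame({"Ca ka": [-1.0, 0.2, 1.5]}, index=["s1", "s2", "s3"])
nat = pd.DataFrame({"Ca ka": [10.0, 25.0, 40.0]}, index=["s1", "s2", "s3"])
result = nearest_sample_threshold("Ca ka", 0.25, pre, nat)
```

Here sample `s2` is the closest in the preprocessed space, so its natural-space score is returned as the approximate threshold. A large 'Approximation_Error' would indicate that no calibration sample lies near the threshold and that the mapped value should be read with caution.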
- smx.graph.interpretation.reconstruct_threshold_to_spectrum(threshold_value: float, zone_name: str, pca_info_dict: Dict) → pandas.Series
Reconstruct a scalar threshold to a multivariate threshold spectrum.
Uses the PCA model fitted during zone aggregation to reconstruct a threshold value in score space back into the original spectral variable space:
\[\tau = \bar{x} + q \cdot \mathbf{w}\]
where \(\bar{x}\) is the zone mean, \(\mathbf{w}\) the PC1 loadings vector, and \(q\) the threshold score value.
Parameters
- threshold_value : float
Threshold in PC1 score space.
- zone_name : str
Name of the spectral zone.
- pca_info_dict : dict
PCA info dictionary as stored in smx.zones.aggregation.ZoneAggregator.pca_info_.
Returns
- pd.Series
Threshold spectrum indexed by original column names.
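The reconstruction formula can be illustrated numerically. The sketch below applies the zone mean plus the threshold score times the PC1 loadings directly to arrays. The function name `reconstruct_sketch`, its arguments, and the toy values are assumptions for illustration; the layout of the real `pca_info_` dictionary is not documented here, so the mean and loadings are passed in explicitly.

```python
import numpy as np
import pandas as pd


def reconstruct_sketch(q, mean, loadings, columns):
    """Return the zone mean plus q times the PC1 loadings as a labelled Series."""
    mean = np.asarray(mean, dtype=float)
    loadings = np.asarray(loadings, dtype=float)
    tau = mean + q * loadings  # one value per original spectral variable
    return pd.Series(tau, index=columns)


# Hypothetical three-channel zone; the numbers are made up for illustration.
columns = ["3.60", "3.69", "3.78"]
mean = [10.0, 20.0, 12.0]
loadings = [0.6, 0.8, 0.0]  # unit-norm PC1 loadings
spectrum = reconstruct_sketch(2.0, mean, loadings, columns)
```

With a score of 2.0, each channel is shifted from the zone mean in proportion to its loading, so a channel with a zero loading stays at its mean value.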