smx.graph.interpretation
========================

.. py:module:: smx.graph.interpretation

.. autoapi-nested-parse::

   Threshold mapping and predicate interpretation utilities.

   Functions in this module translate LRC results from the preprocessed (score)
   space back to the natural (unpreprocessed) spectral space, and reconstruct
   multivariate threshold spectra for visualisation.


Functions
---------

.. autoapisummary::

   smx.graph.interpretation.extract_predicate_info
   smx.graph.interpretation.map_thresholds_to_natural
   smx.graph.interpretation.reconstruct_threshold_to_spectrum


Module Contents
---------------

.. py:function:: extract_predicate_info(predicate_rule: str) -> dict

   Extract components from a predicate rule string.

   Parameters
   ----------
   predicate_rule : str
       Rule in the format ``"zone_name <= threshold"`` or
       ``"zone_name > threshold"``.

   Returns
   -------
   dict
       ``{'zone': str, 'operator': str, 'threshold': float}``

   Examples
   --------
   >>> extract_predicate_info("Ca ka <= 25.50")
   {'zone': 'Ca ka', 'operator': '<=', 'threshold': 25.5}


.. py:function:: map_thresholds_to_natural(lrc_df: pandas.DataFrame, zone_sums_preprocessed: pandas.DataFrame, zone_sums_natural: pandas.DataFrame) -> pandas.DataFrame

   Map predicate thresholds from the preprocessed space to natural space.

   For each predicate in *lrc_df*, this finds the calibration sample whose
   zone score in the *preprocessed* space is closest to the predicate's
   threshold, and retrieves that sample's value in the *natural*
   (unpreprocessed) space as the best approximation.

   Parameters
   ----------
   lrc_df : pd.DataFrame
       LRC results. Must contain columns ``'Zone'``, ``'Threshold'``,
       ``'Operator'``, and ``'Node'``.
   zone_sums_preprocessed : pd.DataFrame
       Zone aggregation scores computed on *preprocessed* calibration data
       (same zones as *lrc_df*).
   zone_sums_natural : pd.DataFrame
       Zone aggregation scores computed on *original* (unprocessed) data.

   Returns
   -------
   pd.DataFrame
       Copy of *lrc_df* with additional columns:

       * ``'Threshold_Natural'``: threshold value in the natural space
       * ``'Reference_Sample_Index'``: index of the nearest calibration sample
       * ``'Approximation_Error'``: distance (in the preprocessed space) between
         the threshold and the nearest sample
       * ``'Node_Natural'``: predicate rule string using the natural threshold


.. py:function:: reconstruct_threshold_to_spectrum(threshold_value: float, zone_name: str, pca_info_dict: Dict) -> pandas.Series

   Reconstruct a scalar threshold into a multivariate threshold spectrum.

   Uses the PCA model fitted during zone aggregation to map a threshold
   value *in score space* back into the original spectral variable space:

   .. math::

      \tau = \bar{x} + q \cdot \mathbf{w}

   where :math:`\bar{x}` is the zone mean, :math:`\mathbf{w}` the PC1 loadings
   vector, and :math:`q` the threshold score value.

   Parameters
   ----------
   threshold_value : float
       Threshold in PC1 score space.
   zone_name : str
       Name of the spectral zone.
   pca_info_dict : dict
       PCA info dictionary as stored in
       :attr:`smx.zones.aggregation.ZoneAggregator.pca_info_`.

   Returns
   -------
   pd.Series
       Threshold spectrum indexed by original column names.
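
The two numerical steps described in this module, the nearest-sample lookup
behind ``map_thresholds_to_natural`` and the PC1 reconstruction behind
``reconstruct_threshold_to_spectrum``, can be sketched in plain Python. This is
only an illustration under stated assumptions, not the library's
implementation: the helper names ``nearest_sample`` and
``reconstruct_spectrum`` are hypothetical, dictionaries and lists stand in for
the pandas objects, the numbers are invented, and the zone mean and loadings
are passed directly because the layout of ``pca_info_`` is not documented here.

```python
from typing import Dict, Hashable, List, Tuple


def nearest_sample(
    threshold: float,
    scores_pre: Dict[Hashable, float],
    scores_nat: Dict[Hashable, float],
) -> Tuple[Hashable, float, float]:
    """Hypothetical helper: find the sample whose preprocessed zone score is
    closest to `threshold` and return (sample index, natural value, error)."""
    ref = min(scores_pre, key=lambda idx: abs(scores_pre[idx] - threshold))
    error = abs(scores_pre[ref] - threshold)  # distance in preprocessed space
    return ref, scores_nat[ref], error


def reconstruct_spectrum(
    q: float, mean: List[float], loadings: List[float]
) -> List[float]:
    """Hypothetical helper: tau_j = mean_j + q * loadings_j for each variable j."""
    if len(mean) != len(loadings):
        raise ValueError("mean and loadings must have the same length")
    return [m + q * w for m, w in zip(mean, loadings)]


# Invented zone scores for three calibration samples (one zone).
scores_pre = {"s1": -1.0, "s2": 0.5, "s3": 2.0}
scores_nat = {"s1": 120.0, "s2": 180.0, "s3": 260.0}

ref, natural_threshold, error = nearest_sample(0.75, scores_pre, scores_nat)
print(ref, natural_threshold, error)  # s2 180.0 0.25

# Invented zone mean and PC1 loadings for three spectral variables
# (loadings are not unit-normalised; chosen for easy arithmetic).
tau = reconstruct_spectrum(2.0, [10.0, 20.0, 10.0], [0.5, 1.0, 0.5])
print(tau)  # [11.0, 22.0, 11.0]
```

The lookup returns an observed calibration value instead of interpolating, so
the reported error shows how far the chosen sample sits from the threshold in
the preprocessed space.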