# Predicates and bagging

SMX builds logical predicates from zone scores and uses bagging to stabilize
importance rankings across subsamples.

## Predicate generation

`PredicateGenerator` creates two predicates per quantile per zone:

- `zone <= threshold`
- `zone > threshold`

```python
from smx import PredicateGenerator

generator = PredicateGenerator(quantiles=[0.25, 0.5, 0.75])
generator.fit(zone_scores)

predicates_df = generator.predicates_df_
indicator_df = generator.indicator_df_
```

## Bagging

`PredicateBagger` subsamples rows (and optionally predicates) to build bags
that feed the metric computations:

```python
from smx import PredicateBagger

bagger = PredicateBagger(n_bags=10, n_samples_fraction=0.8, replace=False, random_seed=42)
bags = bagger.run(zone_scores, y_pred_cal, predicates_df)
```

## Metrics

Two main metrics are provided:

- `CovarianceMetric`: covariance or mutual information between zone values and predictions
- `PerturbationMetric`: replace a zone and measure prediction shift

```python
from smx import CovarianceMetric, PerturbationMetric

cov_metric = CovarianceMetric(metric="covariance", threshold=0.01)
rankings = cov_metric.compute(bags)

pert_metric = PerturbationMetric(
    estimator=model,
    Xcalclass_prep=X_cal_prep,
    predicates_df=predicates_df,
    spectral_cuts=spectral_cuts,
    perturbation_mode="median",
    metric="probability_shift",
)
rankings = pert_metric.compute(bags)
```