8. CStaR (Causal Stability Ranking)
CStaR (Stekhoven et al., 2012) is a ranking method rather than a pure structure-learning algorithm.
Given a set of possible causes and possible effects, it repeatedly:
Subsamples the data.
Learns a CPDAG on that subsample.
Uses IDA on that CPDAG to compute a minimum total effect for each candidate cause–effect pair.
Records which pairs are among the “top” strongest effects in that subsample.
Over many subsamples, CStaR estimates for each edge (X \to Y):
how often (X) behaves like a cause of (Y) (π), and
how large the effect tends to be (minBeta / “Effect” column).
It then uses stability selection ideas (Meinshausen & Bühlmann, 2010) to bound the expected number of false positives.
It is especially useful when you care about prioritizing a small set of robust, high-confidence effects (e.g., candidate causal predictors of a biological or clinical outcome) rather than recovering the entire causal graph.
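The two per-edge quantities can be written compactly. The notation here is introduced only for illustration and is not taken from the CStaR implementation: B is the number of subsamples, T_b is the set of pairs in the top bracket of subsample b, and \hat{\beta}^{(b)}_{\min}(X, Y) is the minimum total effect that IDA reports for the pair in subsample b.

```latex
\hat{\pi}_{X \to Y} = \frac{1}{B} \sum_{b=1}^{B} \mathbf{1}\{(X, Y) \in T_b\},
\qquad
\mathrm{Effect}_{X \to Y} = \frac{1}{B} \sum_{b=1}^{B} \hat{\beta}^{(b)}_{\min}(X, Y)
```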
8.1. High-level idea
For each effect variable (Y) and each candidate cause (X):
Subsample the data
Draw a half-sample (with or without replacement, depending on the chosen sampling style).
Learn a CPDAG on the subsample
Use one of several CPDAG-producing algorithms:
PC-Stable
FGES
BOSS
Restricted BOSS
Run IDA on the CPDAG
For each candidate effect (Y), CStaR runs IDA to compute the minimum total effect of each possible cause (X) on (Y) across all DAGs in the CPDAG equivalence class.
This produces an effects matrix for that subsample: one effect size per (cause, effect) pair.
Select the strongest effects in that subsample
For each subsample, CStaR sorts all cause–effect pairs by effect size and identifies a “top bracket” of strongest effects (size = topBracket × #effects).
Any pair whose effect lies in that top bracket is regarded as “selected” in that subsample.
Aggregate across subsamples
Over all subsamples:
π = proportion of subsamples in which (X \to Y) falls into the top bracket.
Effect = average of the minimal total effects from IDA across subsamples.
Rank and filter
Pairs are ranked primarily by π (more stable first), then by effect size.
Pairs with effect size below selectionAlpha are discarded.
A PCER (Per-Comparison Error Rate) is reported using the stability-selection bound.
The final output is a ranked table of candidate causal edges, with stability and effect-size information, and a simple graph view that keeps the most stable edges.
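The subsample, select, and aggregate loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the CStaR implementation: the function names are hypothetical, and `placeholder_effects` is a deliberately crude stand-in (absolute simple-regression slopes) for the real "learn a CPDAG, then run IDA" step. Ranking by absolute effect and drawing half-samples without replacement are also assumptions made for this sketch.

```python
import numpy as np

def placeholder_effects(data, causes, effects):
    """Stand-in for 'learn a CPDAG, then run IDA'. This is NOT IDA: it
    returns the absolute simple-regression slope of each effect on each cause."""
    out = np.zeros((len(causes), len(effects)))
    for i, c in enumerate(causes):
        x = data[:, c] - data[:, c].mean()
        for j, e in enumerate(effects):
            y = data[:, e] - data[:, e].mean()
            out[i, j] = abs(x @ y / (x @ x))
    return out

def cstar_sketch(data, causes, effects, num_subsamples=50, top_bracket=2, seed=0):
    """Return (cause, effect, pi, mean_effect) tuples, most stable first."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    # Top-bracket size as described in the text: topBracket x #effects,
    # capped at the total number of candidate pairs.
    q = min(top_bracket * len(effects), len(causes) * len(effects))
    selected = np.zeros((len(causes), len(effects)))
    effect_sum = np.zeros_like(selected)
    for _ in range(num_subsamples):
        rows = rng.choice(n, size=n // 2, replace=False)  # half-sample
        eff = placeholder_effects(data[rows], causes, effects)
        # Mark the q strongest (cause, effect) pairs in this subsample.
        top = np.argsort(eff, axis=None)[::-1][:q]
        mask = np.zeros(eff.size, dtype=bool)
        mask[top] = True
        selected += mask.reshape(eff.shape)
        effect_sum += eff
    pi = selected / num_subsamples          # stability frequency per pair
    mean_effect = effect_sum / num_subsamples
    records = [(causes[i], effects[j], pi[i, j], mean_effect[i, j])
               for i in range(len(causes)) for j in range(len(effects))]
    # Rank by pi first, then by effect size.
    records.sort(key=lambda r: (-r[2], -r[3]))
    return records

# Toy data: column 4 depends strongly on column 0 only.
rng = np.random.default_rng(1)
x = rng.normal(size=(400, 4))
y = 2.0 * x[:, 0] + 0.1 * rng.normal(size=400)
data = np.column_stack([x, y])
records = cstar_sketch(data, causes=[0, 1, 2, 3], effects=[4], top_bracket=1)
```

In the toy data only column 0 drives the target, so it should sit at the top of `records` with a stability frequency near 1. With a real CPDAG learner and IDA in place of the placeholder, the surrounding loop would stay the same.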
8.2. Inputs
CStaR requires:
A continuous data set (or at least, data for which the chosen score and test are appropriate).
A set of possible causes (predictor variables).
A set of possible effects (outcome variables) — often one or a small number of “targets” of interest.
Choices for:
CPDAG algorithm (PC-Stable, FGES, BOSS, Restricted BOSS)
Sampling style (bootstrap or subsample)
Number of subsamples
Top bracket size (q)
Selection threshold (selectionAlpha)
Background knowledge about forbidden/required edges is not currently used; CStaR relies purely on the chosen CPDAG algorithm.
8.3. Outputs
CStaR produces:
A ranked table of records
Each row corresponds to a candidate edge (X \to Y) and includes:
Cause – the candidate predictor (X).
Effect – the target (Y).
PI – the stability frequency ( \hat{\pi} ) (fraction of subsamples where (X \to Y) lies in the top bracket).
Effect – the average minimal IDA effect for (X \to Y) across subsamples.
PCER – an estimated per-comparison error rate bound based on Meinshausen–Bühlmann stability selection; for edges with low stability (π ≤ 0.5), PCER is replaced by * to flag them as below the reliable range.
#Potential causes and #Potential effects – the sizes of the candidate sets used to compute the table.
A graph view (optional)
CStaR can be used to construct a graph where:
Nodes are the variables that appear in the records.
A directed edge (X \to Y) is drawn when
π > 0.5.
This graph highlights highly stable candidate causal relations but is not meant as a full causal discovery result; it is a visualization of the top-ranked edges.
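As a rough illustration of the graph view, the following Python sketch builds the node and edge lists from a list of records. The dictionary keys (`cause`, `effect`, `pi`) and the function name are hypothetical and chosen for this example; they are not the field names of the CStaR output.

```python
def stable_edges(records, threshold=0.5):
    """Return (nodes, edges), drawing X -> Y only when pi is strictly above threshold."""
    edges = [(r["cause"], r["effect"]) for r in records if r["pi"] > threshold]
    # Every variable that appears in the records is a node, even without edges.
    nodes = sorted({v for r in records for v in (r["cause"], r["effect"])})
    return nodes, edges

example_records = [
    {"cause": "X1", "effect": "Y", "pi": 0.92},
    {"cause": "X2", "effect": "Y", "pi": 0.50},  # exactly 0.5: no edge drawn
    {"cause": "X3", "effect": "Y", "pi": 0.61},
]
nodes, edges = stable_edges(example_records)
```

Here `X2` stays in the graph as an isolated node because it appears in the records but does not exceed the stability threshold.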
Optional intermediate files
For reproducibility and resumability, CStaR can write:
The subsampled data sets,
The CPDAGs fitted on each subsample, and
The matrices of IDA effects per subsample.
If rerun with the same output directory, CStaR will reload existing intermediate results instead of recomputing them.
8.4. Parameters
| Parameter (camelCase) | Description |
|---|---|
| | Non-negative double. Minimum absolute effect size required for a variable to be considered statistically relevant during stability selection. Smaller values make selection more permissive; larger values make it conservative. |
| | Integer ≥ 1. Number of subsamples (bootstrap or subsample splits) to use for stability scoring. Higher values give more stable results but increase computation. Typical range: 20–200. |
| | List of variable names. Restricts CStaR to estimating the parent sets only for the specified target variables. If empty, CStaR analyzes all variables. |
| | Integer ≥ 1. Number of top-ranked candidate graphs (or parent sets) retained per subsample before voting. Controls model diversity and stability. |
| | Boolean. If |
| | String. The algorithm used to convert the aggregated results into a CPDAG (e.g., |
| | String path. If non-empty, results (e.g., subsample graphs, selection frequencies) are written to disk at the given location. Useful for large studies or reproducibility. |
| | Boolean. If |
| | String. Controls how subsamples are constructed (e.g., |
| | Boolean. If |
8.4.1. Interpreting the table
For a given row (X \to Y):
PI (stability)
Close to 1.0: (X \to Y) consistently appears among the strongest effects across subsamples.
Around 0.5: borderline; may be interesting but less robust.
Close to 0: rarely selected; often noise.
Effect (average minimal effect)
Positive values indicate a consistent positive causal effect estimate.
Larger magnitude suggests a stronger effect, but interpretation depends on scale and model assumptions.
PCER
Gives an upper bound on the expected per-comparison error rate for that edge, given the overall selection procedure.
Edges with * (π ≤ 0.5) are not in the reliable regime of the bound and should be treated cautiously.
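For orientation, the stability-selection bound of Meinshausen & Bühlmann (2010) gives E[V] ≤ q² / ((2π − 1) p) for the expected number of false positives, and dividing by p gives a per-comparison rate. The sketch below computes that quantity under the assumption that this is the bound the PCER column reports, with q the top-bracket size and p the number of candidate cause–effect pairs. The function names are hypothetical.

```python
import math

def pcer_bound(pi, q, p):
    """Per-comparison error rate bound q^2 / ((2*pi - 1) * p^2).

    Assumed meanings: q = number of pairs selected per subsample,
    p = total number of candidate (cause, effect) pairs.
    The bound is only defined for pi > 0.5.
    """
    if pi <= 0.5:
        return math.nan
    return (q * q) / ((2.0 * pi - 1.0) * p * p)

def format_pcer(pi, q, p):
    """Render the bound as text, using '*' outside the reliable range."""
    value = pcer_bound(pi, q, p)
    return "*" if math.isnan(value) else f"{value:.4g}"

print(format_pcer(0.75, 10, 100))  # 10^2 / (0.5 * 100^2) = 0.02
print(format_pcer(0.50, 10, 100))  # pi <= 0.5, so '*'
```

The bound tightens as π approaches 1 and as the top bracket q shrinks relative to p, which is why small brackets and high stability thresholds yield the most conservative shortlists.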
A typical use is to pick a PI threshold (e.g. π ≥ 0.8) and an effect threshold (e.g. Effect ≥ 0.1), and then focus on that shortlist as candidate causal predictors for follow-up analysis or experiments.
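A shortlist of this kind is a simple filter over the table. The sketch below assumes the table has been loaded into a list of dictionaries with hypothetical keys `cause`, `effect`, `pi`, and `effect_size`; these names are illustrative and do not come from the CStaR output format.

```python
def shortlist(records, min_pi=0.8, min_effect=0.1):
    """Keep rows meeting both thresholds, ordered by stability then effect size."""
    keep = [r for r in records
            if r["pi"] >= min_pi and r["effect_size"] >= min_effect]
    return sorted(keep, key=lambda r: (-r["pi"], -r["effect_size"]))

rows = [
    {"cause": "X1", "effect": "Y", "pi": 0.95, "effect_size": 0.40},
    {"cause": "X2", "effect": "Y", "pi": 0.85, "effect_size": 0.05},  # effect too small
    {"cause": "X3", "effect": "Y", "pi": 0.60, "effect_size": 0.90},  # not stable enough
    {"cause": "X4", "effect": "Y", "pi": 0.80, "effect_size": 0.10},
]
picked = shortlist(rows)
```

Only `X1` and `X4` pass both thresholds. A strong but unstable effect such as `X3` is excluded, which is the point of ranking by stability first.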
8.5. When to use CStaR
CStaR is most useful when:
You have many potential predictors and a smaller number of key outcomes, and you want a prioritized list of robust causal candidates.
You are worried about model-selection instability: different subsamples might suggest different graphs, and you want edges that “survive” this variability.
You care about controlling false positives in a stability-selection sense, rather than recovering a single “best” graph.
It pairs naturally with workflows where:
The full causal graph is complex or high-dimensional, but
You mainly need a short, interpretable list of predictors that are repeatedly supported by the data across resamples and CPDAG variations.
8.6. References
Stekhoven, D. J., Moraes, I., Sveinbjörnsson, G., Hennig, L., Maathuis, M. H., & Bühlmann, P. (2012).
Causal stability ranking. Bioinformatics, 28(21), 2819–2823.
Meinshausen, N., & Bühlmann, P. (2010).
Stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(4), 417–473.
Colombo, D., & Maathuis, M. H. (2014).
Order-independent constraint-based causal structure learning. Journal of Machine Learning Research, 15(1), 3741–3782.
8.7. Summary
CStaR is a stability-based causal ranking method that repeatedly subsamples the data, fits a CPDAG, and applies IDA to estimate minimal total effects for each candidate cause–effect pair. It aggregates these results using stability selection, producing a ranked list of robust causal candidates with interpretable stability frequencies and effect sizes.
CStaR is best suited to prioritizing reliable causal predictors rather than recovering a full graph, especially in high-dimensional settings where model-selection variability is high. It supports multiple CPDAG learners (PC-Stable, FGES, BOSS, Restricted BOSS), parallelization, and reproducible output, but does not currently incorporate background-knowledge constraints.