23. GFCI — Greedy Fast Causal Inference
Type: Hybrid (Score + Constraints)
Output: PAG
Reference: Ogarrio, Spirtes & Ramsey (PGM 2016)
GFCI is a hybrid latent-variable causal discovery algorithm that combines a score-based CPDAG search (FGES) with a constraint-based refinement phase (FCI-style tests) to produce a PAG that is sound for models containing latent confounding and selection bias.
It serves as the parent template for the other hybrid latent-variable algorithms in Tetrad, including:
➡️ BOSS-FCI, GRaSP-FCI, and FCIT.
23.1. 🔍 Key Idea
GFCI proceeds in two stages:
23.1.1. 1. Score-Based CPDAG Search (FGES)
FGES finds a high-scoring CPDAG using SEM-BIC or another score.
This provides:
a good skeleton,
many arrow orientations,
and a search space strongly biased toward plausible structures.
This dramatically reduces the number of CI tests required in the next stage.
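To make the scoring idea concrete, here is a minimal sketch of a BIC-style local score for a linear Gaussian variable with zero or one parent. It uses the generic form "-n log(residual variance) minus a penalty of k log n" (higher is better); the function names and the `penalty_discount` argument are illustrative assumptions, not Tetrad's implementation or API.

```python
import math

def residual_variance(y, x=None):
    """Variance of y left over after a least-squares fit on x (or on nothing)."""
    n = len(y)
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    if x is None:
        return var_y
    mean_x = sum(x) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return var_y - cov ** 2 / var_x

def local_bic(y, x=None, penalty_discount=1.0):
    """BIC-style score of y given an optional single parent x (hypothetical helper)."""
    n = len(y)
    k = 1 if x is None else 2  # intercept, plus one slope when a parent is present
    return -n * math.log(residual_variance(y, x)) - penalty_discount * k * math.log(n)

# Toy data: y depends strongly on x, while z is essentially unrelated to x.
x = [float(i) for i in range(20)]
y = [2.0 * xi + 0.5 * (-1) ** i for i, xi in enumerate(x)]
z = [float((-1) ** i) for i in range(20)]

gain_y = local_bic(y, x) - local_bic(y)  # positive: adding x as a parent of y helps
gain_z = local_bic(z, x) - local_bic(z)  # negative: the penalty outweighs the tiny fit gain
```

A greedy score-based search repeatedly applies the edge change with the largest score gain, so in this toy case it would add an edge between x and y but not between x and z.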
23.1.2. 2. FCI-Style Refinement
Given the FGES CPDAG, GFCI applies:
conditional independence tests,
collider identification logic,
and PAG orientation rules
to correct mistakes caused by latent variables or selection bias, ultimately producing a correct PAG in the large-sample limit.
This refinement step resembles FCI and RFCI but operates over a much smaller set of adjacencies, which makes it faster and more robust.
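The refinement stage can be sketched as follows. This is a simplified illustration, not Tetrad's code: it prunes adjacencies of a starting skeleton using conditioning sets drawn from each endpoint's neighbours, then marks unshielded colliders for the pairs it separated. The independence oracle, the example graph (a true structure A → B ↔ C ← D with a latent common cause of B and C), and all function names are assumptions made for the example. The full algorithm also reuses orientations from the score-based stage and applies the remaining PAG orientation rules, which are omitted here.

```python
from itertools import combinations

def candidate_sets(pool, max_size):
    """Yield subsets of pool, smallest first, up to max_size elements."""
    for k in range(min(len(pool), max_size) + 1):
        for cond in combinations(sorted(pool), k):
            yield frozenset(cond)

def prune_adjacencies(skeleton, is_independent, max_cond_size=3):
    """Drop x - y when a subset of either endpoint's initial neighbours separates them."""
    pruned = {v: set(nbrs) for v, nbrs in skeleton.items()}
    sepsets = {}
    edges = sorted({tuple(sorted((x, y))) for x in skeleton for y in skeleton[x]})
    for x, y in edges:
        pools = (skeleton[x] - {y}, skeleton[y] - {x})
        sep = next((c for pool in pools
                    for c in candidate_sets(pool, max_cond_size)
                    if is_independent(x, y, c)), None)
        if sep is not None:
            pruned[x].discard(y)
            pruned[y].discard(x)
            sepsets[frozenset((x, y))] = sep
    return pruned, sepsets

def new_colliders(pruned, sepsets):
    """Unshielded triples x - y - z where y is not in the separating set of x and z."""
    found = []
    for pair, sep in sepsets.items():
        x, z = sorted(pair)
        for y in sorted(pruned[x] & pruned[z]):
            if y not in sep:
                found.append((x, y, z))
    return found

# Hypothetical stage-1 skeleton: the score-based search added A - C because
# no DAG over A, B, C, D can represent the latent confounding of B and C.
skeleton = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}

# Hypothetical oracle: the only relevant independence facts are about A and C.
facts = {(frozenset({"A", "C"}), frozenset()),
         (frozenset({"A", "C"}), frozenset({"D"}))}

def oracle(x, y, cond):
    return (frozenset({x, y}), cond) in facts

pruned, sepsets = prune_adjacencies(skeleton, oracle)
colliders = new_colliders(pruned, sepsets)
```

In this toy run the extra A - C edge is removed with an empty separating set, and the triple A, B, C becomes a collider at B, matching the assumed true structure.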
23.2. 🎯 When to Use GFCI
You expect latent confounding or selection bias.
You want a PAG but need more efficiency or robustness than pure FCI.
You trust a score-based search (FGES) to find a good initial CPDAG.
You want a strong hybrid baseline before trying BOSS-FCI, GRaSP-FCI, or FCIT.
GFCI remains one of the most practical algorithms for medium-to-large models with latent structure.
23.3. 🧠 Prior Knowledge
Fully supported.
GFCI respects:
required edges
forbidden edges
temporal/tier constraints
arbitrary Knowledge objects
Knowledge is honored during both FGES and the FCI-style refinement.
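As a rough illustration of how such constraints can gate a search, the sketch below models forbidden edges, required edges, and temporal tiers in a small stand-in class. `SimpleKnowledge` and its methods are hypothetical and written only for this example; they are not Tetrad's Knowledge API.

```python
from dataclasses import dataclass, field

@dataclass
class SimpleKnowledge:
    """Hypothetical stand-in for background knowledge (not Tetrad's class)."""
    forbidden: set = field(default_factory=set)  # pairs (x, y): x -> y is not allowed
    required: set = field(default_factory=set)   # pairs (x, y): x -> y must be present
    tiers: list = field(default_factory=list)    # list of sets of variables, earliest first

    def tier_of(self, var):
        for index, tier in enumerate(self.tiers):
            if var in tier:
                return index
        return None  # variable is not assigned to any tier

    def is_forbidden(self, x, y):
        """True if x -> y is ruled out explicitly or points backwards in time."""
        if (x, y) in self.forbidden:
            return True
        tx, ty = self.tier_of(x), self.tier_of(y)
        return tx is not None and ty is not None and tx > ty

    def is_required(self, x, y):
        return (x, y) in self.required

knowledge = SimpleKnowledge(
    forbidden={("income", "education")},
    required={("age", "income")},
    tiers=[{"age"}, {"education"}, {"income"}],
)
```

A search that honors this knowledge would skip any candidate orientation for which `is_forbidden` returns true, in both the score-based stage and the refinement stage, and would keep every required edge.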
23.4. ⭐ Strengths
Much faster than FCI on moderate-to-large datasets
Reduces spurious CI tests by starting from a strong score-based CPDAG
Produces high-quality PAGs in many real datasets
Template for all modern hybrid latent-variable methods in Tetrad
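The effect of the starting graph on CI testing can be illustrated with a rough count. The sketch below computes an upper bound on the number of candidate (pair, conditioning set) combinations when conditioning sets are drawn from each endpoint's neighbours, and compares a complete graph with a sparse chain. It is a back-of-the-envelope illustration under these assumptions, not a measurement of any actual FCI or GFCI run, and the function name is hypothetical.

```python
from math import comb

def ci_test_upper_bound(adjacency, max_cond_size):
    """Count candidate conditioning sets over all adjacent pairs (both endpoints)."""
    total = 0
    seen = set()
    for x, nbrs in adjacency.items():
        for y in nbrs:
            edge = frozenset((x, y))
            if edge in seen:
                continue  # each undirected edge is counted once
            seen.add(edge)
            for a, b in ((x, y), (y, x)):
                pool = len(adjacency[a] - {b})  # neighbours of a other than b
                total += sum(comb(pool, k) for k in range(max_cond_size + 1))
    return total

n = 10
# Complete graph: the usual starting point of a purely constraint-based search.
complete = {i: set(range(n)) - {i} for i in range(n)}
# Sparse chain 0 - 1 - ... - 9: a stand-in for a sparse score-based skeleton.
chain = {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}

complete_bound = ci_test_upper_bound(complete, max_cond_size=3)
chain_bound = ci_test_upper_bound(chain, max_cond_size=3)
```

With ten variables and conditioning sets of up to three variables, the bound for the complete graph is in the thousands while the bound for the chain is in the tens, which is the kind of gap a sparse starting skeleton can produce.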
23.5. ⚠️ Limitations
Inherits FGES’s weaknesses on dense models.
Final PAG depends partly on score heuristics from stage 1.
Not as aggressive or accurate as FCIT when the data are noisy or the sample is small.
23.6. 🔧 Key Parameters (Tetrad)
| Parameter | Meaning |
|---|---|
| | The score used by FGES (SEM-BIC, mixed-BIC, etc.). |
| | If true, FGES may skip some tests and orientations. |
| | Pruning constraint for FGES; limits parent set size. |
| | Degree of parallelism for FGES scoring. |
| | Prints decisions during both stages. |
| | For time-series adaptations. |
(FGES parameters come directly from FGES; GFCI adds its own CI-test choices for the refinement stage.)
23.7. ⛓ Relation to Other Algorithms
FCI — Constraint-only PAG learning
RFCI — Faster, more conservative variant
GFCI — Hybrid: FGES → FCI refinement
BOSS-FCI — Uses BOSS instead of FGES
GRaSP-FCI — Uses GRaSP instead of FGES
FCIT — Uses targeted testing guided by scores
GFCI is the parent template for the hybrid latent-variable algorithms in Tetrad.
23.8. 📚 Reference
Ogarrio, J. M., Spirtes, P., & Ramsey, J. (2016).
A hybrid causal search algorithm for latent variable models.
In PGM 2016, pp. 368–379.