23. GFCI — Greedy Fast Causal Inference

Type: Hybrid (Score + Constraints)
Output: PAG
Reference: Ogarrio, Spirtes & Ramsey (PGM 2016)

GFCI is a hybrid latent-variable causal discovery algorithm that combines:

  • a score-based CPDAG search (FGES), and

  • a constraint-based refinement phase (FCI-style tests)

to produce a PAG that is sound for models containing latent confounding and selection bias.

It serves as the parent template for the other hybrid latent-variable algorithms in Tetrad, including BOSS-FCI, GRaSP-FCI, and FCIT.


23.1. 🔍 Key Idea

GFCI proceeds in two stages:

23.1.1. Stage 1: Score-Based CPDAG Search (FGES)

FGES finds a high-scoring CPDAG using SEM-BIC or another score.
This provides:

  • a good skeleton,

  • many arrow orientations,

  • and a search space strongly biased toward plausible structures.

This dramatically reduces the number of CI tests required in the next stage.
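To make the scoring step concrete, the sketch below computes a BIC-style local score for one variable with either no parent or a single parent, and compares the two. It is a minimal illustration and not Tetrad's SEM-BIC implementation: the function names, the `penalty_discount` parameter name, and the sign convention (higher is better) are assumptions made for this example.

```python
import math
import random


def residual_variance(y, x=None):
    """Variance of y, or of the residuals of y regressed on x (with intercept)."""
    n = len(y)
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    if x is None:
        return var_y
    mean_x = sum(x) / n
    var_x = sum((v - mean_x) ** 2 for v in x) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Simple linear regression: residual variance = var(y) - cov(x, y)^2 / var(x)
    return var_y - cov_xy ** 2 / var_x


def local_bic(y, x=None, penalty_discount=1.0):
    """BIC-style local score, higher is better (an assumed convention)."""
    n = len(y)
    num_parents = 0 if x is None else 1
    fit = -n * math.log(residual_variance(y, x))
    penalty = penalty_discount * num_parents * math.log(n)
    return fit - penalty


rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(500)]
y = [0.8 * a + rng.gauss(0, 1) for a in x]  # y depends on x
w = [rng.gauss(0, 1) for _ in range(500)]   # w is independent of x

# Score change from adding x as a parent of each variable.
gain_dependent = local_bic(y, x) - local_bic(y)
gain_independent = local_bic(w, x) - local_bic(w)
print(gain_dependent, gain_independent)
```

A greedy forward step would favor the edge with the largest positive gain. Here the gain for `y` is large, while the gain for `w` is dominated by the complexity penalty.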

23.1.2. Stage 2: FCI-Style Refinement

Given the FGES CPDAG, GFCI applies:

  • conditional independence tests,

  • collider identification logic,

  • and PAG orientation rules

to correct mistakes caused by latent variables or selection bias, ultimately producing a correct PAG in the large-sample limit.

This refinement step is similar to FCI and RFCI but operates over a much smaller set of adjacencies, which makes it faster and leaves fewer opportunities for erroneous test results.
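As an illustration of the kind of conditional independence test the refinement stage relies on, here is a minimal Fisher Z sketch for continuous data with a single conditioning variable. The helper names are hypothetical and this is not Tetrad's test implementation. The chain X → Z → Y is used only to exercise the test: X and Y are marginally dependent but independent given Z.

```python
import math
import random
from statistics import NormalDist


def corr(a, b):
    """Pearson correlation of two equal-length samples."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def partial_corr(x, y, z):
    """Partial correlation of x and y given a single variable z."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def fisher_z_pvalue(r, n, cond_size):
    """Two-sided p-value for the null hypothesis of zero (partial) correlation."""
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - cond_size - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


rng = random.Random(1)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
z = [a + rng.gauss(0, 1) for a in x]  # x -> z
y = [b + rng.gauss(0, 1) for b in z]  # z -> y

r_marginal = corr(x, y)
r_partial = partial_corr(x, y, z)
p_marginal = fisher_z_pvalue(r_marginal, n, cond_size=0)
p_conditional = fisher_z_pvalue(r_partial, n, cond_size=1)
print(p_marginal, p_conditional)
```

An adjacency between X and Y would be kept if every tested conditioning set gives a p-value below the chosen significance level, and removed once some set yields a p-value above it.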


23.2. 🎯 When to Use GFCI

  • You expect latent confounding or selection bias.

  • You want a PAG but need more efficiency or robustness than pure FCI.

  • You trust a score-based model (FGES) to find a good CPDAG skeleton.

  • You want a strong hybrid baseline before trying BOSS-FCI, GRaSP-FCI, or FCIT.

GFCI remains one of the most practical algorithms for medium-to-large models with latent structure.


23.3. 🧠 Prior Knowledge

Fully supported.
GFCI respects:

  • required edges

  • forbidden edges

  • temporal/tier constraints

  • arbitrary Knowledge objects

Knowledge is honored during both FGES and the FCI-style refinement.
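The effect of tier and forbidden-edge constraints can be sketched with a small check that either stage could apply before adding or orienting a directed edge. This is a hypothetical data structure for illustration, not Tetrad's Knowledge class; the function name, the variable names, and the dictionary-of-tiers representation are all assumptions.

```python
def violates_knowledge(src, dst, tiers, forbidden=frozenset()):
    """Return True if the directed edge src -> dst is ruled out."""
    if (src, dst) in forbidden:
        return True
    # A variable in a later tier may not cause a variable in an earlier tier.
    return tiers[src] > tiers[dst]


# Hypothetical temporal ordering: lower tier numbers come first in time.
tiers = {"age": 0, "smoking": 1, "cancer": 2}
forbidden = frozenset({("age", "cancer")})

checks = {
    "smoking -> cancer": violates_knowledge("smoking", "cancer", tiers, forbidden),
    "cancer -> smoking": violates_knowledge("cancer", "smoking", tiers, forbidden),
    "age -> cancer": violates_knowledge("age", "cancer", tiers, forbidden),
}
print(checks)
```

Required edges would need a separate check (for example, refusing to delete an edge listed as required); that part is omitted here.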


23.4. ⭐ Strengths

  • Much faster than FCI on moderate-to-large datasets

  • Reduces spurious CI tests by starting from a strong score-based CPDAG

  • Produces high-quality PAGs in many real datasets

  • Template for all modern hybrid latent-variable methods in Tetrad


23.5. ⚠️ Limitations

  • Inherits FGES’s weaknesses on dense models.

  • Final PAG depends partly on score heuristics from stage 1.

  • Less aggressive and less accurate than FCIT when data are noisy or the sample is small.


23.6. 🔧 Key Parameters (Tetrad)

Parameter                         Meaning
score                             The score used by FGES (SEM-BIC, mixed-BIC, etc.).
faithfulnessAssumed               If true, FGES may skip some tests and orientations.
maxDegree                         Pruning constraint for FGES; limits the number of edges adjacent to any node.
numThreads                        Degree of parallelism for FGES scoring.
verbose                           Prints decisions during both stages.
timeLag, timeLagReplicatingGraph  For time-series adaptations.

(FGES parameters come directly from FGES; GFCI adds its own CI-test choices for the refinement stage.)


23.7. ⛓ Relation to Other Algorithms

  • FCI — Constraint-only PAG learning

  • RFCI — Faster, more conservative variant

  • GFCI — Hybrid: FGES → FCI refinement

  • BOSS-FCI — Uses BOSS instead of FGES

  • GRaSP-FCI — Uses GRaSP instead of FGES

  • FCIT — Uses targeted testing guided by scores

GFCI is the parent template for the hybrid latent-variable algorithms in Tetrad.
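The template idea can be sketched as a function that takes the stage-1 search as a pluggable argument and then prunes adjacencies with CI tests. This is a simplified illustration under stated assumptions: it covers only adjacency pruning and omits collider orientation and the PAG orientation rules, and `hybrid_skeleton`, `toy_stage1`, and `toy_oracle` are hypothetical names, not Tetrad functions.

```python
from itertools import combinations


def adjacent(edges, node):
    """Nodes sharing an edge with `node`; edges are frozensets of two names."""
    return {v for e in edges if node in e for v in e if v != node}


def hybrid_skeleton(variables, initial_search, is_independent):
    """Stage 1: any search returning adjacencies. Stage 2: prune with CI tests."""
    edges = set(initial_search(variables))
    for edge in sorted(edges, key=sorted):  # fixed snapshot, deterministic order
        a, b = sorted(edge)
        for node, other in ((a, b), (b, a)):
            candidates = sorted(adjacent(edges, node) - {other})
            # Try every subset of the current neighbors as a separating set.
            if any(
                is_independent(a, b, set(cond))
                for size in range(len(candidates) + 1)
                for cond in combinations(candidates, size)
            ):
                edges.discard(edge)
                break
    return edges


def toy_stage1(variables):
    # Stand-in for FGES, BOSS, or GRaSP: returns one spurious adjacency (A, C).
    return {frozenset(p) for p in [("A", "B"), ("B", "C"), ("A", "C")]}


def toy_oracle(a, b, cond):
    # Stand-in for a CI test: only A and C are independent, and only given B.
    return {a, b} == {"A", "C"} and cond == {"B"}


result = hybrid_skeleton(["A", "B", "C"], toy_stage1, toy_oracle)
print(sorted(sorted(e) for e in result))
```

Swapping `toy_stage1` for a different search changes the starting graph while the pruning code stays the same, which is the sense in which the variants listed above share one template.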


23.8. 📚 Reference

Ogarrio, J. M., Spirtes, P., & Ramsey, J. (2016).
A hybrid causal search algorithm for latent variable models.
In PGM 2016, pp. 368–379.