# PC — Peter–Clark Algorithm **Type:** Constraint-based **Output:** CPDAG PC is the classic constraint-based causal discovery algorithm given in *Causation, Prediction, and Search*. It learns a **CPDAG** (the Markov equivalence class of DAGs) under the assumptions of acyclicity, causal sufficiency, and the Causal Markov + Faithfulness conditions. PC proceeds by systematically removing edges using conditional independence (CI) tests, then orienting v-structures and propagating orientations while avoiding new unshielded colliders. --- ## Key Idea PC operates in two stages: 1. **Adjacency search (FAS phase)** Begin with a fully connected undirected graph and iteratively remove edges whenever variables are found conditionally independent given some subset of their neighbors. 2. **Orientation phase** Using the separating sets from the adjacency search: - Orient **unshielded colliders** (`X → Y ← Z` where X–Z are nonadjacent and Y not in the separating set), - Apply orientation propagation rules (Meek’s rules) to complete the CPDAG without introducing new v-structures or directed cycles. This yields a **sound and complete CPDAG** for oracle CI tests. --- ## When to Use - Data are plausibly generated by a **DAG without latent confounders**. - You want explicit **statistical control via α**. - The CI test used (Gaussian, G², RCIT, etc.) matches the data type. - You want a **lightweight** and widely understood baseline for comparison. PC is also used as the adjacency phase of many other algorithms (e.g., FCI, PC-Max, CCD). --- ## Prior Knowledge Support **PC fully supports background knowledge.** You may supply: - **Required edges** - **Forbidden edges** - **Temporal/tier constraints** - Other standard `Knowledge` constraints These constraints are respected during both adjacency removal and orientation propagation. --- ## Strengths - Highly interpretable — every edge removal justified by a CI test - Produces a **clean CPDAG** with explicit orientation logic - Works with many types of CI tests (parametric, nonparametric, hybrid) - Often very fast, especially with sparse graphs - Serves as the foundation for many advanced algorithms (FCI, GFCI, BOSS-FCI, FCIT) --- ## Limitations - The **theoretical guarantee** of correctness requires the data to be i.i.d. and faithful to a causally sufficient DAG. - Sensitive to CI test errors when conditioning sets become large. - May leave many edges unoriented, especially in dense graphs or with weak dependencies. - Cannot by itself distinguish latent confounding from direct causation (FCI and its variants address this). --- ## Key Parameters in Tetrad The PC wrapper exposes the following parameters (camelCase names shown): | Parameter (camelCase) | Description | |------------------------|-------------| | `stableFas` | If true, uses the *stable* adjacency phase (order-independent edge removals). | | `colliderOrientationStyle` | Chooses the style of collider identification/orientation (PC-standard, conservative, etc.). | | `allowBidirected` | Allows temporary bidirected edges during orientation. | | `depth` | Maximum size of conditioning sets during adjacency search. (`-1` = unlimited) | | `fdrQ` | Use FDR correction at level Q instead of fixed α (optional). | | `timeLag` | Lag τ when applying PC to time-series or lagged datasets. | | `timeLagReplicatingGraph` | Whether lagged graphs replicate structure across time slices. | | `verbose` | Print detailed CI-test and orientation decisions. | These allow fine control over complexity, reproducibility, and collider logic. ## Historical Notes The **PC algorithm** was one of the original “Tetrad” algorithms and has been implemented and studied widely. Its development unfolded across three major milestones: 1. **Spirtes & Glymour (1991)** *“An Algorithm for Fast Recovery of Sparse Causal Graphs”* Introduced the first version of the algorithm and established the key insight that conditional independence information can efficiently prune a fully connected graph under sparsity assumptions. 2. **Spirtes, Glymour & Scheines (1993)** *Causation, Prediction, and Search (1st edition)* Refined and extended the algorithm, with clearer statements of assumptions, correctness proofs, and the introduction of what later became known as the **FAS** (Fast Adjacency Search) phase. 3. **Spirtes, Glymour & Scheines (2000)** *Causation, Prediction, and Search (2nd edition)* Formalized PC in its modern form. Meek’s 1995 orientation rules were incorporated, establishing the now-standard procedure of (i) adjacency pruning via conditional independence, followed by (ii) propagation of orientations via sound and complete rules. ### Why PC is historically important - **First scalable causal graph algorithm**: PC demonstrated that conditional independence tests, together with sparsity, allowed structure discovery in time that is *polynomial* for sparse graphs. - **Introduced the idea of equivalence classes**: PC made CPDAGs the standard representation for causal structures under Markov and faithfulness assumptions. - **Inspired nearly every major causal algorithm since**: FCI, RFCI, GFCI, PC-Max, CCD, and many others are direct conceptual descendants of PC. Today PC remains: - a baseline that all new algorithms are compared against, - a teaching tool for causal inference fundamentals, and - a practical choice in many real-world settings. The Tetrad implementation continues this lineage, with additional improvements (parallelization, stable FAS, PC-Max, skewness extensions, etc.) that preserve the spirit of the original algorithm while adapting it to modern data sizes. --- ## Additional Reference Meek, C. (1995). Causal inference and construction of explanation with background knowledge. In Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence (UAI-95), pp. 403-411. Morgan Kaufmann Publishers. --- ## Summary PC is the foundational constraint-based algorithm for learning CPDAGs, combining systematic CI testing with principled orientation rules. It remains a gold-standard baseline for settings without latent confounders.