# FOFC: Find One-Factor Clusters

**Type:** Latent cluster discovery

**Output:** A set of clusters, each corresponding to a single latent factor

FOFC (Find One-Factor Clusters) detects groups of observed variables that share a **single latent parent**. It does this by testing **tetrad constraints**, algebraic equalities among covariances that are implied when a set of variables is generated by the same latent factor. FOFC is usually the first step in pipelines that incorporate latent variables into causal discovery.

---

## Key Idea

Suppose several observed variables are indicators of the same latent factor. Then, for any four of them, each difference of products of covariances of the form `cov(X1,X2)*cov(X3,X4) - cov(X1,X3)*cov(X2,X4)` must equal zero in the population. These expressions are called **tetrads**.

FOFC identifies clusters of variables by:

1. Computing the correlation matrix.
2. Testing whether all tetrads for a candidate group vanish at significance level `alpha`.
3. Applying a "substitution test": for every variable inside the group, replacing it with any outside variable should break the tetrad relationships. This ensures the cluster is **pure**, rather than an artifact of chance correlations or mixed latent structure.
4. Growing the cluster by adding variables that also satisfy all tetrad tests relative to the existing cluster.
5. Returning all such clusters as one-factor groups.

Everything is done with statistical tetrad tests rather than by fitting full factor models.

---

## When to Use

- When latent factors are known or suspected.
- When performing MimBuild, GIN, or Blocks-Test-TS analyses.
- Before running causal discovery, to factor out measurement structure.
- When indicators are continuous or approximately continuous.

FOFC works best when each latent variable has at least three indicators.

---

## Prior Knowledge Support

FOFC does **not** incorporate background knowledge (forbidden/required edges, tiers, etc.). It is purely a clustering procedure on the covariance matrix.
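
To make the tetrad constraint from the Key Idea section concrete, the following is a minimal sketch that works at the population level only. It builds the covariance matrix implied by a linear factor model and evaluates the three tetrad differences for four variables. The helper names `implied_covariance` and `tetrad_differences`, the loadings, and the error variances are all illustrative assumptions; this is not Tetrad's implementation, and it performs no statistical test, whereas FOFC tests the constraints on sample data at level `alpha`.

```python
import numpy as np


def implied_covariance(loadings, factor_cov, error_var):
    """Population covariance of X = loadings @ L + independent noise.

    Hypothetical helper for illustration; not part of the Tetrad API.
    """
    loadings = np.asarray(loadings, dtype=float)
    factor_cov = np.asarray(factor_cov, dtype=float)
    return loadings @ factor_cov @ loadings.T + np.diag(error_var)


def tetrad_differences(cov, i, j, k, l):
    """Return the three tetrad differences for variables i, j, k, l."""
    return np.array([
        cov[i, j] * cov[k, l] - cov[i, k] * cov[j, l],
        cov[i, j] * cov[k, l] - cov[i, l] * cov[j, k],
        cov[i, k] * cov[j, l] - cov[i, l] * cov[j, k],
    ])


# Case 1: four indicators of a single latent factor (unit variance).
one_factor = implied_covariance(
    loadings=[[0.9], [0.8], [0.7], [0.6]],
    factor_cov=[[1.0]],
    error_var=[0.5, 0.5, 0.5, 0.5],
)
one_factor_tetrads = tetrad_differences(one_factor, 0, 1, 2, 3)

# Case 2: two latent factors correlated at 0.5, two indicators each.
two_factor = implied_covariance(
    loadings=[[0.9, 0.0], [0.8, 0.0], [0.0, 0.7], [0.0, 0.6]],
    factor_cov=[[1.0, 0.5], [0.5, 1.0]],
    error_var=[0.5, 0.5, 0.5, 0.5],
)
two_factor_tetrads = tetrad_differences(two_factor, 0, 1, 2, 3)

print("one factor: ", one_factor_tetrads)
print("two factors:", two_factor_tetrads)
```

In the one-factor case all three differences are zero up to floating-point rounding. In the two-factor case two of the three are nonzero, so the four variables cannot all belong to one cluster, while the third difference still vanishes. This is why a clustering procedure needs to check every tetrad for a candidate group rather than a single one.
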

---

## Strengths

- Identifies latent measurement structure directly from covariances.
- Robust to some violations of linearity because tetrads capture broad patterns, not detailed SEM fits.
- Fast enough to be used routinely before causal discovery.
- Helps remove confounding due to measurement structure.

---

## Limitations

- Requires continuous data or at least near-continuous behavior.
- Needs adequate sample size for reliable tetrad testing.
- Only finds **one-factor** clusters; multi-factor models need FTFC, GFFC, or BPC.
- Sensitive to highly correlated factors or weak loadings.

---

## Key Parameters in Tetrad

| Parameter (camelCase) | Description |
|------------------------|-------------|
| `alpha` | Significance level for tetrad tests. |
| `ess` | Equivalent sample size used in rank and tetrad tests. |
| `verbose` | Whether to print detailed cluster-finding steps. |

---

## Reference

Kummerfeld, E., & Ramsey, J. (2016). *Causal clustering for 1-factor measurement models.* Proceedings of KDD.

---

## Summary

FOFC discovers clusters of indicators generated by a single latent factor by testing tetrad constraints. It is a simple, effective preprocessing step for any causal discovery workflow involving latent variables.