21. FOFC — Find One-Factor Clusters
Type: Latent cluster discovery
Output: A set of clusters, each corresponding to a single latent factor
FOFC (Find One-Factor Clusters) detects groups of observed variables that share a single latent parent. It does this by testing tetrad constraints: algebraic equalities among covariances that hold in the population when a set of variables is generated by the same latent factor. FOFC is usually the first step in pipelines that incorporate latent variables into causal discovery.
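To make the constraint concrete, here is a short derivation under an assumption the text above does not state explicitly: a linear one-factor model $x_i = \lambda_i L + \varepsilon_i$ for four indicators, with noise terms $\varepsilon_i$ independent of $L$ and of each other. The symbols $\lambda_i$, $L$, $\varepsilon_i$ and $\sigma_{ij}$ are introduced here for illustration only. Under that model, every off-diagonal covariance factorizes:

$$
\sigma_{ij} = \operatorname{Cov}(x_i, x_j) = \lambda_i \lambda_j \operatorname{Var}(L), \qquad i \neq j .
$$

Each product of two covariances that together cover all four indices then equals $\lambda_1 \lambda_2 \lambda_3 \lambda_4 \operatorname{Var}(L)^2$, so

$$
\sigma_{12}\sigma_{34} = \sigma_{13}\sigma_{24} = \sigma_{14}\sigma_{23} .
$$

The pairwise differences of these three products are the tetrads that should vanish.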
21.1. Key Idea
Suppose several observed variables are indicators of the same latent factor. Then, for any four of them, certain differences of products of covariances must equal zero in the population. These expressions are called tetrads. FOFC identifies clusters of variables by:
Computing the correlation matrix.
Testing whether all tetrads for a candidate group vanish at significance level alpha.
Applying a “substitution test”: for every variable inside the group, replacing it with any outside variable should break the tetrad relationships.
This ensures the cluster is pure, not accidentally formed by correlations or mixed latent structure.
Growing the cluster by adding additional variables that also satisfy all tetrad tests relative to the existing cluster.
Returning all such clusters as one-factor groups.
Everything is done using statistical tetrad tests rather than fitting full factor models.
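The vanishing-tetrad idea can be checked numerically with a minimal sketch. It is not Tetrad's implementation: it uses exact population covariances implied by a hypothetical linear model instead of a statistical test on data, and the helper names `implied_cov` and `tetrad_differences`, the loadings, and the factor correlation of 0.5 are all assumptions made for illustration.

```python
def implied_cov(loadings, factor_of, factor_cov, noise_var):
    """Population covariance of x_i = loadings[i] * L[factor_of[i]] + e_i,
    with independent noise terms e_i of variance noise_var[i]."""
    p = len(loadings)
    cov = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            shared = factor_cov[factor_of[i]][factor_of[j]]
            cov[i][j] = loadings[i] * loadings[j] * shared
            if i == j:
                cov[i][j] += noise_var[i]
    return cov


def tetrad_differences(cov, i, j, k, l):
    """The three tetrad differences for the quartet (i, j, k, l)."""
    a = cov[i][j] * cov[k][l]
    b = cov[i][k] * cov[j][l]
    c = cov[i][l] * cov[j][k]
    return (a - b, a - c, b - c)


# Hypothetical measurement model: variables 0-3 load on factor 0,
# variables 4-5 load on factor 1; the two factors have covariance 0.5.
loadings = [0.9, 0.8, 0.7, 0.6, 0.8, 0.7]
factor_of = [0, 0, 0, 0, 1, 1]
factor_cov = [[1.0, 0.5], [0.5, 1.0]]
noise_var = [0.5] * 6

cov = implied_cov(loadings, factor_of, factor_cov, noise_var)

# Four indicators of the same factor: all three differences vanish
# (up to floating-point rounding).
pure = tetrad_differences(cov, 0, 1, 2, 3)

# Two indicators from each factor: two of the three differences are nonzero.
mixed = tetrad_differences(cov, 0, 1, 4, 5)

print("pure quartet: ", pure)
print("mixed quartet:", mixed)
```

With real data the covariances are estimated, so the differences are never exactly zero. That is why the procedure relies on a statistical test at level alpha instead of an exact comparison.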
21.2. When to Use
When latent factors are known or suspected.
When performing MimBuild, GIN, or Blocks-Test-TS analyses.
Before running causal discovery, to factor out measurement structure.
When indicators are continuous or approximately continuous.
FOFC works best when latent variables each have at least three indicators.
21.3. Prior Knowledge Support
FOFC does not incorporate background knowledge (forbidden/required edges, tiers, etc.).
It is purely a clustering procedure on the covariance matrix.
21.4. Strengths
Identifies latent measurement structure directly from covariances.
Robust to some violations of linearity because tetrads capture broad patterns, not detailed SEM fits.
Fast enough to be used routinely before causal discovery.
Helps remove confounding due to measurement structure.
21.5. Limitations
Requires continuous data or at least near-continuous behavior.
Needs adequate sample size for reliable tetrad testing.
Only finds one-factor clusters; multi-factor models need FTFC, GFFC, or BPC.
Sensitive to highly correlated factors or weak loadings.
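The sample-size limitation can be illustrated with a small simulation. This is a sketch under stated assumptions, not how Tetrad tests tetrads: it draws data from a hypothetical one-factor model with standard normal factor and noise, computes one sample tetrad difference per replication, and compares its spread across replications at two sample sizes. The function names, loadings, sample sizes, and replication count are all illustrative choices.

```python
import random
import statistics


def simulate_one_factor(n, loadings, rng):
    """Draw n rows from x_i = loading_i * L + e_i (L and e_i standard normal)."""
    rows = []
    for _ in range(n):
        latent = rng.gauss(0.0, 1.0)
        rows.append([lam * latent + rng.gauss(0.0, 1.0) for lam in loadings])
    return rows


def sample_cov(rows, i, j):
    """Unbiased sample covariance between columns i and j."""
    n = len(rows)
    mean_i = sum(r[i] for r in rows) / n
    mean_j = sum(r[j] for r in rows) / n
    return sum((r[i] - mean_i) * (r[j] - mean_j) for r in rows) / (n - 1)


def sample_tetrad(rows):
    """Sample value of cov(0,1)*cov(2,3) - cov(0,2)*cov(1,3)."""
    first = sample_cov(rows, 0, 1) * sample_cov(rows, 2, 3)
    second = sample_cov(rows, 0, 2) * sample_cov(rows, 1, 3)
    return first - second


def tetrad_spread(n, loadings, reps=40, seed=0):
    """Standard deviation of the sample tetrad over reps simulated datasets."""
    rng = random.Random(seed)
    values = [sample_tetrad(simulate_one_factor(n, loadings, rng))
              for _ in range(reps)]
    return statistics.stdev(values)


loadings = [0.9, 0.8, 0.7, 0.6]
spread_small = tetrad_spread(50, loadings)    # small sample
spread_large = tetrad_spread(2000, loadings)  # large sample

print("spread at n=50:  ", spread_small)
print("spread at n=2000:", spread_large)
```

The population tetrad is zero in both cases, so any spread is pure sampling noise. The spread at the larger sample size should come out well below the spread at the smaller one. Shrinking the loadings in this sketch reduces the covariances that carry the signal, which relates to the sensitivity to weak loadings noted above.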
21.6. Key Parameters in Tetrad
| Parameter (camelCase) | Description |
|---|---|
| | Significance level for tetrad tests. |
| | Equivalent sample size used in rank and tetrad tests. |
| | Whether to print detailed cluster-finding steps. |
21.7. Reference
Kummerfeld, E., & Ramsey, J. (2016). Causal clustering for 1-factor measurement models. Proceedings of KDD.
21.8. Summary
FOFC discovers clusters of indicators generated by a single latent factor by testing tetrad constraints. It is a simple, effective preprocessing step for any causal discovery workflow involving latent variables.