Choosing Tests & Scores
Selecting the right conditional-independence test or score is one of the most important modeling decisions in Tetrad. Many search algorithms (PC, FGES, BOSS, GFCI, etc.) allow you to choose a test/score tailored to your data type and modeling assumptions.
This page gives practical guidance on which tests and scores to use for:
Continuous (Gaussian / nearly Gaussian)
Discrete
Mixed continuous/discrete
Non-Gaussian linear models
Nonlinear models (KCI, basis expansions)
Latent-variable block-based searches
The goal is not to list every option, but to highlight the ones that work well in practice in real causal-discovery workflows.
1. Continuous, Approximately Gaussian Data
When variables are continuous, roughly symmetric, and well-modelled by linear relationships:
Recommended Tests
Fisher Z Test
The default in Tetrad and the most reliable choice for continuous data.
Fast, robust, and statistically well-understood.Partial Covariance / Partial Correlation Tests
Equivalent variants used by some algorithms (conceptually the same family as the Fisher Z / partial-correlation approach).
Recommended Scores
Sem BIC Score
The standard linear-Gaussian score.
Best-Fit Algorithms
PC, PC-Max, CPC
FGES (high-dimensional)
BOSS (small–medium dimensional; often gives better precision than FGES)
GRaSP
GFCI / RFCI
Rule of thumb:
If nothing fancy is required, use Fisher Z and Sem BIC.
2. Discrete Data (Binary / Ordinal / Categorical)
Discrete CI testing behaves very differently from continuous modeling. Tetrad offers fast standard tests plus Bayesian alternatives.
Recommended Tests
G² / Chi-Square Test
Best for moderate–large samples.“BDeu-style” Bayesian tests
(Logically aligned with the BDeu Score, especially in small-sample / sparse-table settings.)
Recommended Scores
Discrete BDeu Score
Works well for small–medium models.
Note: For large numbers of discrete variables, CPTs explode.
Best-Fit Algorithms
BOSS (preferred for discrete; handles small–medium models extremely well)
FGES (works, but CPT blowup can be prohibitive; use only when p is modest)
PC / GFCI / RFCI with G² or BDeu-type tests
Rule of thumb:
Finite-state models are rarely high-dimensional; for score-based search on discrete data:
BOSS > FGES.
3. Mixed Continuous/Discrete Data
Mixed data requires special handling. In Tetrad you have three practically useful options:
A. Conditional Gaussian (CG)
CG Independence Test → ConditionalGaussianLrt
CG BIC Score → ConditionalGaussianBicScore
Works when continuous variables are approximately Gaussian within levels of discrete parents.
Statistically principled but slower.
B. Degenerate Gaussian (DGC)
Degenerate Gaussian Test → DegenerateGaussianLrt
Degenerate Gaussian Score → DegenerateGaussianBicScore
Treats discrete variables as “degenerate Gaussians.”
Much faster than full CG for large data.
C. Basis Function (BF) Tests/Scores
Basis Function CI Test → BasisFunctionLrt
Basis Function BIC Score → BasisFunctionBicScore
This approach expands continuous variables using orthogonal polynomials (Legendre/Chebyshev):
works for mixed data
handles nonlinear relationships
extremely fast compared to kernel methods
sample-size scaling ~ constant
4. Non-Gaussian Linear Models
If relationships are linear but noise terms are non-Gaussian (LiNGAM-type settings or visibly skewed residuals):
Recommended Tests
For LiNGAM-style algorithms (DirectLiNGAM, ICA-LiNGAM, etc.), independence is handled internally via ICA or related objective functions; there is no separate CI test to choose in the GUI.
If you still run “ordinary” DAG search (PC, FGES, BOSS, GRaSP) on linear non-Gaussian data:
It is usually fine to keep using Fisher Z as the CI test.
The structure is still identifiable from second-order moments under many conditions, even if the noise is non-Gaussian.
Recommended Scores
Sem BIC Score (heuristic but empirically strong)
Although derived under a Gaussian assumption, Sem BIC often works very well in linear non-Gaussian settings. In particular, the BOSS paperAndrews, B., Ramsey, J., Sanchez-Romero, R., Camchong, J., & Kummerfeld, E. (2023).
Fast scalable and accurate discovery of DAGs using the best order score search and grow shrink trees.
NeurIPS 36, 63945–63956.shows that BOSS + Sem BIC performs comparably to DirectLiNGAM in linear non-Gaussian simulations (see Figure 4a).
PoissonPriorScore (optional structural prior)
This is a structural sparsity prior (Poisson on edges/parents), not a noise model. You can combine it with Sem BIC if you want an explicit probabilistic prior over graph complexity, but it is not specific to linear non-Gaussian noise.
Best-Fit Algorithms
DirectLiNGAM, ICA-LiNGAM, ICA-LiNG-D (internal ICA-type objective)
FASK / FASK-Vote
BOSS or FGES with Sem BIC (heuristic but well supported by experiments)
Pairwise-skewness-based orientation methods
Rule of thumb:
If residuals are visibly skewed or heavy-tailed, it is still quite reasonable to use Sem BIC with BOSS/FGES for structure learning, and to compare against dedicated LiNGAM-style methods when possible.
5. Nonlinear Models
Tetrad provides three practically useful nonlinear CI tests.
A. Kernel Conditional Independence Test (KCI)
Captures arbitrary nonlinear dependencies.
Very powerful but computationally expensive for large N.
Best for small–medium datasets.
B. Random Conditional Independence Test (RCIT)
Captures arbitrary nonlinear dependencies.
Faster than KCI.
Useful for larger datasets.
B. Basis Function Test / Score (Recommended for scalability)
Nonlinear via polynomial/orthogonal expansions.
Post-nonlinear models.
Often matches or outperforms KCI on large N due to superior speed.
Rule of thumb:
Goal |
Recommended |
|---|---|
Best accuracy for nonlinear CI |
|
Best speed + strong accuracy |
6. Latent Variable Workflows (Block-Based Search)
When you run latent clustering (TSC, FOFC, FTFC, GFFC, BPC), you obtain clusters → latent nodes.
You can then run structure discovery over those latent nodes.
Block-Based Tests/Scores
Blocks-Test-TS
Trek-separation test on clusters.Blocks-BIC Score
A block-aware score for FGES, BOSS, GRaSP, SP, etc.
(These are not yet documented as separate test/score pages in this manual.)
Compatible Algorithms
Any algorithm that accepts a test and/or score:
PC (default choice)
GFCI / RFCI
FGES (when clusters are few)
BOSS (very strong for moderate-sized latent structures)
GRaSP or SP (if small)
Typical Workflow
Run clustering (e.g., TSC)
Convert clusters → latent variables
Choose Blocks-Test-TS or Blocks-BIC
Run PC, GFCI, BOSS, FGES, or others
This is the recommended approach for latent causal structure without specifying measurement models.
Summary Table (Practical Defaults)
Setting |
Test |
Score |
Algorithms |
|---|---|---|---|
Continuous linear |
PC, FGES, BOSS*, GFCI |
||
Discrete |
G² or BDeu-style tests |
BOSS*, FGES, PC |
|
Mixed |
PC, FGES, BOSS, GFCI, FCIT |
||
Linear non-Gaussian |
Internal ICA criteria or Fisher Z when using PC/BOSS |
DirectLiNGAM, FASK, BOSS (heuristic), FGES |
|
Nonlinear |
(none / kernel-based) |
PC+KCI, CAM |
|
Nonlinear scalable |
PC+BF, GFCI+BF |
||
Latent blocks |
Blocks-Test-TS |
Blocks-BIC |
PC, GFCI, FGES, BOSS |
* BOSS is recommended over FGES unless the number of variables is very large. GRaSP is a similar algorithm that should be considered as well.
Next Steps
Per-algorithm documentation for parameter definitions