Choosing Tests & Scores

Selecting the right conditional-independence test or score is one of the most important modeling decisions in Tetrad. Many search algorithms (PC, FGES, BOSS, GFCI, etc.) allow you to choose a test/score tailored to your data type and modeling assumptions.

This page gives practical guidance on which tests and scores to use for:

Continuous (Gaussian / nearly Gaussian)
Discrete
Mixed continuous/discrete
Non-Gaussian linear models
Nonlinear models (KCI, basis expansions)
Latent-variable block-based searches

The goal is not to list every option, but to highlight the ones that work well in practice in real causal-discovery workflows.

1. Continuous, Approximately Gaussian Data

When variables are continuous, roughly symmetric, and well modeled by linear relationships:

Recommended Tests

Fisher Z Test
The default in Tetrad and the most reliable choice for continuous data.
Fast, robust, and statistically well-understood.
Partial Covariance / Partial Correlation Tests
Equivalent variants used by some algorithms (conceptually the same family as the Fisher Z / partial-correlation approach).
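For readers who want to see what a Fisher Z test computes, here is a minimal sketch in plain Python. It is a textbook version of the test, not Tetrad's implementation; the function names and the example correlations are illustrative assumptions.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y given a single Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

def fisher_z_p_value(r, n, k):
    """Two-sided p-value for H0: (partial) correlation is zero.

    r: sample (partial) correlation, n: sample size,
    k: number of conditioning variables.
    """
    z = 0.5 * math.log((1 + r) / (1 - r))   # Fisher z-transform
    stat = math.sqrt(n - k - 3) * abs(z)    # approximately N(0, 1) under H0
    return math.erfc(stat / math.sqrt(2))   # equals 2 * (1 - Phi(stat))

# Hypothetical correlations in which X and Y are related only through Z,
# so r_xy equals r_xz * r_yz.
r_given_z = partial_corr(0.30, 0.60, 0.50)
p_marginal = fisher_z_p_value(0.30, n=500, k=0)          # X, Y look dependent
p_conditional = fisher_z_p_value(r_given_z, n=500, k=1)  # independent given Z
```

A small p-value rejects independence. In this example the marginal test rejects while the conditional test does not, which is the pattern a constraint-based search uses to remove the X–Y edge.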

Recommended Scores

Sem BIC Score
The standard linear-Gaussian score.
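To illustrate the idea behind a linear-Gaussian BIC score, the sketch below computes a local score of the form -N ln(residual variance) - c k ln N, where higher is better. This is a generic BIC sketch and is not claimed to match Tetrad's exact formula; `local_bic`, `penalty_discount`, and the simulated data are assumptions made for the example.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def cov(u, v):
    """Sample covariance (divides by N)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

def local_bic(y, parents, penalty_discount=1.0):
    """Local BIC of y regressed on its parents; higher is better."""
    n = len(y)
    resid_var = cov(y, y)
    if parents:
        s_pp = [[cov(p, q) for q in parents] for p in parents]
        s_py = [cov(p, y) for p in parents]
        beta = solve(s_pp, s_py)            # regression coefficients
        resid_var -= sum(b * c for b, c in zip(beta, s_py))
    k = len(parents)
    return -n * math.log(resid_var) - penalty_discount * k * math.log(n)

# Simulated data (illustrative): y depends linearly on x.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(1000)]
y = [2.0 * xi + random.gauss(0, 1) for xi in x]

score_empty = local_bic(y, [])
score_x = local_bic(y, [x])
```

Adding the true parent `x` raises the score sharply because the drop in residual variance far outweighs the ln N penalty for one extra parameter. A score-based search such as FGES or BOSS compares local scores of this kind across candidate parent sets.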

Best-Fit Algorithms

PC, PC-Max, CPC
FGES (high-dimensional)
BOSS (small–medium dimensional; often gives better precision than FGES)
GRaSP
GFCI / RFCI

Rule of thumb:
If nothing fancy is required, use Fisher Z and Sem BIC.

2. Discrete Data (Binary / Ordinal / Categorical)

Discrete CI testing behaves very differently from continuous modeling. Tetrad offers fast standard tests plus Bayesian alternatives.

Recommended Tests

G² / Chi-Square Test
Best for moderate–large samples.
“BDeu-style” Bayesian tests
(Logically aligned with the BDeu Score, especially in small-sample / sparse-table settings.)
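As a rough illustration of what a G² test computes, the sketch below evaluates the statistic and its degrees of freedom for a contingency table, and sums over strata for the conditional case. It is a textbook version, not Tetrad's code; the function names and counts are made up for the example, and converting the statistic to a p-value (chi-square with `df` degrees of freedom) is left to a statistics library.

```python
import math

def g_squared(table):
    """G² statistic and degrees of freedom for an r x c table of counts."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    g2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            if obs > 0:  # empty cells contribute nothing
                expected = row_tot[i] * col_tot[j] / n
                g2 += 2.0 * obs * math.log(obs / expected)
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return g2, df

def conditional_g_squared(strata):
    """Sum G² and df over one table per configuration of the conditioning set."""
    stats = [g_squared(t) for t in strata]
    return sum(s for s, _ in stats), sum(d for _, d in stats)

independent = [[20, 30], [40, 60]]   # rows are proportional
dependent = [[45, 5], [10, 40]]      # strong association

g_ind, df_ind = g_squared(independent)
g_dep, df_dep = g_squared(dependent)
g_cond, df_cond = conditional_g_squared([independent, dependent])
```

With one degree of freedom, the 5% critical value of the chi-square distribution is about 3.84, so the second table rejects independence and the first does not. Because the conditional version needs one table per configuration of the conditioning set, cell counts thin out quickly as conditioning sets grow, which is why the test is best suited to moderate or large samples.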

Recommended Scores

Discrete BDeu Score
Works well for small–medium models.
Note: Conditional probability tables (CPTs) grow exponentially with the number of parents, so scoring becomes expensive for large or dense discrete models.
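The CPT growth noted above can be made concrete with a little arithmetic. The helper below is a hypothetical illustration, not a Tetrad function: it counts the free parameters of one CPT as (child levels - 1) times the number of parent configurations.

```python
from math import prod

def cpt_free_parameters(child_levels, parent_levels):
    """Free parameters in one CPT: (child levels - 1) per parent configuration."""
    return (child_levels - 1) * prod(parent_levels)

# A three-level child whose parents all have three levels.
sizes = {k: cpt_free_parameters(3, [3] * k) for k in (0, 2, 5, 10)}
print(sizes)  # {0: 2, 2: 18, 5: 486, 10: 118098}
```

Ten three-level parents already require more than 100,000 parameters for a single variable, far more than most datasets can estimate reliably.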

Best-Fit Algorithms

BOSS (preferred for discrete; handles small–medium models extremely well)
FGES (works, but CPT blowup can be prohibitive; use only when p is modest)
PC / GFCI / RFCI with G² or BDeu-type tests

Rule of thumb:
Discrete models are rarely high-dimensional, so for score-based search on discrete data, prefer BOSS over FGES.

3. Mixed Continuous/Discrete Data

Mixed data requires special handling. In Tetrad you have three practically useful options:

A. Conditional Gaussian (CG)

CG Independence Test → ConditionalGaussianLrt
CG BIC Score → ConditionalGaussianBicScore

Works when continuous variables are approximately Gaussian within levels of discrete parents.
Statistically principled but slower.

B. Degenerate Gaussian (DG)

Degenerate Gaussian Test → DegenerateGaussianLrt
Degenerate Gaussian Score → DegenerateGaussianBicScore

Treats discrete variables as “degenerate Gaussians.”
Much faster than full CG for large data.
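One way to picture the "degenerate Gaussian" idea is as an embedding step: each discrete variable is replaced by indicator columns, which are then handled with linear-Gaussian machinery alongside the continuous variables. The sketch below shows only that embedding, under the assumption that one level is dropped as a reference to avoid an exactly singular covariance. It is an illustration, not Tetrad's code, and `one_hot_drop_first` is a hypothetical helper.

```python
def one_hot_drop_first(values):
    """Indicator columns for every level except the first (reference) level."""
    levels = sorted(set(values))
    return [[1.0 if v == lev else 0.0 for v in values] for lev in levels[1:]]

color = ["a", "b", "c", "a"]
columns = one_hot_drop_first(color)
# columns[0] marks level "b", columns[1] marks level "c"
```

Because the result is just a wider continuous dataset, covariance-based tests and scores can be applied directly, which is consistent with the speed advantage over full CG noted above.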

C. Basis Function (BF) Tests/Scores

Basis Function CI Test → BasisFunctionLrt
Basis Function BIC Score → BasisFunctionBicScore

This approach expands continuous variables using orthogonal polynomials (Legendre/Chebyshev):

works for mixed data
handles nonlinear relationships
extremely fast compared to kernel methods
run time grows only slowly with sample size
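The basis expansion described above can be sketched as follows. This is an illustrative version only: rescaling each variable to [-1, 1] by its observed range, the name `legendre_features`, and the `degree` parameter are assumptions for the example, not Tetrad's implementation or parameter names.

```python
def legendre_features(values, degree):
    """Expand one non-constant continuous variable into Legendre terms P1..P_degree.

    Values are first rescaled to [-1, 1], the interval on which Legendre
    polynomials are orthogonal. Returns one row of features per sample.
    """
    lo, hi = min(values), max(values)
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]
    rows = []
    for x in scaled:
        p_prev, p_curr = 1.0, x          # P0(x) and P1(x)
        row = [p_curr]
        for n in range(1, degree):
            # Bonnet recurrence: (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}
            p_next = ((2 * n + 1) * x * p_curr - n * p_prev) / (n + 1)
            p_prev, p_curr = p_curr, p_next
            row.append(p_curr)
        rows.append(row)
    return rows

features = legendre_features([0.0, 5.0, 10.0], degree=3)
```

Each original variable becomes `degree` columns, and a linear test or score run on the expanded columns can then pick up nonlinear dependence that the raw variable would hide.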

4. Non-Gaussian Linear Models

If relationships are linear but noise terms are non-Gaussian (LiNGAM-type settings or visibly skewed residuals):

Recommended Tests

For LiNGAM-style algorithms (DirectLiNGAM, ICA-LiNGAM, etc.), independence is handled internally via ICA or related objective functions; there is no separate CI test to choose in the GUI.

If you still run “ordinary” DAG search (PC, FGES, BOSS, GRaSP) on linear non-Gaussian data:

It is usually fine to keep using Fisher Z as the CI test.
The structure is still identifiable from second-order moments under many conditions, even if the noise is non-Gaussian.

Recommended Scores

Sem BIC Score (heuristic but empirically strong)
Although derived under a Gaussian assumption, Sem BIC often works very well in linear non-Gaussian settings. In particular, the BOSS paper

Andrews, B., Ramsey, J., Sanchez-Romero, R., Camchong, J., & Kummerfeld, E. (2023).
Fast scalable and accurate discovery of DAGs using the best order score search and grow shrink trees.
NeurIPS 36, 63945–63956.

shows that BOSS + Sem BIC performs comparably to DirectLiNGAM in linear non-Gaussian simulations (see Figure 4a).
PoissonPriorScore (optional structural prior)
This is a structural sparsity prior (Poisson on edges/parents), not a noise model. You can combine it with Sem BIC if you want an explicit probabilistic prior over graph complexity, but it is not specific to linear non-Gaussian noise.

Best-Fit Algorithms

DirectLiNGAM, ICA-LiNGAM, ICA-LiNG-D (internal ICA-type objective)
FASK / FASK-Vote
BOSS or FGES with Sem BIC (heuristic but well supported by experiments)
Pairwise-skewness-based orientation methods

Rule of thumb:
If residuals are visibly skewed or heavy-tailed, it is still quite reasonable to use Sem BIC with BOSS/FGES for structure learning, and to compare against dedicated LiNGAM-style methods when possible.

5. Nonlinear Models

Tetrad provides three practically useful nonlinear CI tests.

A. Kernel Conditional Independence Test (KCI)

KCI

Captures arbitrary nonlinear dependencies.
Very powerful but computationally expensive for large N.
Best for small–medium datasets.

B. Randomized Conditional Independence Test (RCIT)

RCIT

Captures arbitrary nonlinear dependencies.
Faster than KCI.
Useful for larger datasets.

C. Basis Function Test / Score (Recommended for scalability)

Basis Function CI Test
Basis Function BIC Score
Nonlinear via polynomial/orthogonal expansions.
Also applicable to post-nonlinear models.
Often matches or outperforms KCI in practice on large N, where KCI's computational cost becomes limiting.

Rule of thumb:

Goal	Recommended
Best accuracy for nonlinear CI	KCI
Best speed + strong accuracy	Basis Function Test / Basis Function BIC

6. Latent Variable Workflows (Block-Based Search)

When you run latent clustering (TSC, FOFC, FTFC, GFFC, BPC), you obtain clusters → latent nodes.
You can then run structure discovery over those latent nodes.

Block-Based Tests/Scores

Blocks-Test-TS
Trek-separation test on clusters.
Blocks-BIC Score
A block-aware score for FGES, BOSS, GRaSP, SP, etc.

(These are not yet documented as separate test/score pages in this manual.)

Compatible Algorithms

Any algorithm that accepts a test and/or score:

PC (default choice)
GFCI / RFCI
FGES (when clusters are few)
BOSS (very strong for moderate-sized latent structures)
GRaSP or SP (if small)

Typical Workflow

Run clustering (e.g., TSC)
Convert clusters → latent variables
Choose Blocks-Test-TS or Blocks-BIC
Run PC, GFCI, BOSS, FGES, or others

This is the recommended approach for latent causal structure without specifying measurement models.

Summary Table (Practical Defaults)

Setting	Test	Score	Algorithms
Continuous linear	Fisher Z	Sem BIC	PC, FGES, BOSS*, GFCI
Discrete	G² or BDeu-style tests	BDeu	BOSS*, FGES, PC
Mixed	Degenerate Gaussian Test	Degenerate Gaussian BIC	PC, FGES, BOSS, GFCI, FCIT
Linear non-Gaussian	Internal ICA criteria or Fisher Z when using PC/BOSS	Sem BIC	DirectLiNGAM, FASK, BOSS (heuristic), FGES
Nonlinear	KCI, RCIT	(none / kernel-based)	PC+KCI, CAM
Nonlinear scalable	Basis Function Test	Basis Function BIC	PC+BF, GFCI+BF
Latent blocks	Blocks-Test-TS	Blocks-BIC	PC, GFCI, FGES, BOSS

* BOSS is recommended over FGES unless the number of variables is very large. GRaSP is a similar algorithm that should be considered as well.

Next Steps

Tests & Scores Catalog
Search Algorithms — Short List
Latent Clustering
Per-algorithm documentation for parameter definitions