23. SEM BIC Score
23.1. Summary
The SEM BIC Score is a BIC-type score for linear structural equation models (SEMs) with continuous variables and Gaussian errors. It evaluates a DAG or SEM structure by combining the Gaussian log-likelihood of the data under the model-implied covariance matrix with a penalty on model complexity.
23.2. When to use
Data are continuous and approximately Gaussian.
You are learning a DAG or SEM using algorithms like FGES, BOSS, or GRaSP.
You want a consistent, likelihood-based score that trades off fit and complexity.
23.3. Model class
Linear structural equation models with Gaussian noise.
Scoring a DAG is equivalent to fitting a linear regression of each node on its parents and summing the per-node scores.
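The model class can be illustrated with a small simulation. The following is a minimal sketch, not Tetrad code: the chain X -> Y -> Z, its coefficients, and the helper names `simulate_chain` and `regress_on_parents` are assumptions made up for illustration. It draws data from a linear Gaussian SEM and then fits one regression per node on that node's parents.

```python
import numpy as np

def simulate_chain(n, seed=0):
    """Draw n samples from the (hypothetical) linear Gaussian SEM X -> Y -> Z."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 0.8 * x + rng.normal(size=n)
    z = -0.5 * y + rng.normal(size=n)
    return np.column_stack([x, y, z])

def regress_on_parents(data, child, parents):
    """Least-squares fit of one node on its parents.

    Returns (coefficients, residual variance). Columns are centered,
    which plays the role of an intercept.
    """
    y = data[:, child] - data[:, child].mean()
    if not parents:
        return np.array([]), float(np.mean(y ** 2))
    X = data[:, parents] - data[:, parents].mean(axis=0)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(np.mean(resid ** 2))

data = simulate_chain(5000)
dag = {0: [], 1: [0], 2: [1]}  # node index -> list of parent indices
fits = {node: regress_on_parents(data, node, pa) for node, pa in dag.items()}
for node, (coef, sigma2) in fits.items():
    print(node, coef, round(sigma2, 3))
```

With a sample this large, the fitted coefficients should land close to the simulated values 0.8 and -0.5, and each residual variance close to 1.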
23.4. Score form (conceptual)
The SEM BIC Score is of the form:
BIC = 2 * logL − k * ln(N)
where:
logL is the maximized log-likelihood for the model,
k is the number of free parameters (edges and variances),
N is the sample size.
In Tetrad’s convention, larger BIC values are better.
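As a rough illustration of the conceptual form above, the sketch below computes 2 * logL − k * ln(N) for a DAG by summing per-node Gaussian regression likelihoods. It is not Tetrad's implementation: the function name `sem_bic`, the dictionary encoding of the DAG, and the simulated data are assumptions, and k counts one coefficient per edge plus one error variance per node (means are ignored because they are the same for every graph).

```python
import numpy as np

def sem_bic(data, dag):
    """BIC = 2*logL - k*ln(N) for a linear Gaussian DAG given as {node: [parents]}."""
    n = data.shape[0]
    centered = data - data.mean(axis=0)
    two_log_lik = 0.0
    k = 0
    for node, parents in dag.items():
        y = centered[:, node]
        if parents:
            X = centered[:, parents]
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            resid = y - X @ coef
        else:
            resid = y
        sigma2 = np.mean(resid ** 2)  # maximum-likelihood error variance
        # Twice the maximized Gaussian log-likelihood of this node's regression.
        two_log_lik += -n * (np.log(2.0 * np.pi * sigma2) + 1.0)
        k += len(parents) + 1  # one coefficient per edge plus one variance
    return two_log_lik - k * np.log(n)

# Hypothetical data from the chain X -> Y -> Z.
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)
z = -0.5 * y + rng.normal(size=n)
data = np.column_stack([x, y, z])

true_dag = {0: [], 1: [0], 2: [1]}
empty_dag = {0: [], 1: [], 2: []}
print("true DAG :", sem_bic(data, true_dag))
print("empty DAG:", sem_bic(data, empty_dag))
```

Under the larger-is-better convention, the generating DAG should score well above the empty graph here, because the improvement in fit from the two strong edges far outweighs the penalty for two extra parameters.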
23.5. Parameters
| Parameter (camelCase) | Description |
|---|---|
| | Double ≥ 0.0. The penalty multiplier “c” in the modified BIC-type criterion (for example, a score of the form 2·log-likelihood minus c·k·log(N), where k is the number of free parameters and N is the sample size). Larger values impose a stronger complexity penalty and yield sparser graphs; smaller values allow denser graphs. Default is 2.0. |
| | Double ≥ 0.0. Structure prior coefficient specific to the SEM BIC score. When 0.0 (default), the score uses essentially a flat structure prior. Positive values encode a preference for certain in-degree patterns (for example, sparser graphs), acting as an additional prior on the number of edges or parents per node. |
| | Integer. Choice of SEM BIC rule for how likelihood differences are translated into edge decisions: |
| | Boolean. If |
| | Double. Handles singular or nearly singular covariance matrices. If |
| | Integer > 0, or |
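The effect of the penalty multiplier c can be seen in a short calculation. The sketch below is an illustration under stated assumptions, not Tetrad code: the function name `edge_gain` is hypothetical, the structure prior is taken to be flat, and adding one parent is assumed to add exactly one free parameter. For a Gaussian regression, adding a parent whose partial correlation with the child (given the current parents) is r increases 2·log-likelihood by −N·ln(1 − r²), so the net change in the score is that gain minus c·ln(N).

```python
import math

def edge_gain(n, partial_r, c=2.0):
    """Change in 2*logL - c*k*ln(N) from adding one parent to a node.

    partial_r is the partial correlation between the child and the candidate
    parent given the node's current parents; n is the sample size.
    """
    fit_improvement = -n * math.log(1.0 - partial_r ** 2)
    extra_penalty = c * math.log(n)  # one additional free parameter
    return fit_improvement - extra_penalty

# A weak dependence (r = 0.1) at N = 1000 under different penalty multipliers.
for c in (1.0, 2.0, 4.0):
    gain = edge_gain(1000, 0.1, c)
    print(f"c={c}: gain={gain:.2f} -> {'add edge' if gain > 0 else 'leave out'}")
```

In this setting the weak edge yields a positive gain at c = 1 and a negative gain at c = 2 or c = 4, which is the sense in which larger values produce sparser graphs.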
23.6. Strengths
Well-studied, consistent under standard regularity conditions.
Efficient to compute using regression or covariance matrix factorizations.
Natural choice for continuous linear DAG/SEM learning.
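The efficiency point can be sketched as follows: once the covariance matrix is computed, the residual variance of any node given any parent set comes from a small linear solve (a Schur complement), without revisiting the raw data. This is a minimal sketch with hypothetical names (`residual_variance_from_cov`, `local_score`) and simulated data; additive constants are dropped, and the parameter count per node is an assumption rather than Tetrad's exact rule.

```python
import numpy as np

def residual_variance_from_cov(cov, child, parents):
    """Residual variance of child given parents, via the Schur complement of cov."""
    if not parents:
        return float(cov[child, child])
    s_pp = cov[np.ix_(parents, parents)]  # parent-parent block
    s_py = cov[parents, child]            # parent-child covariances
    return float(cov[child, child] - s_py @ np.linalg.solve(s_pp, s_py))

def local_score(cov, n, child, parents, c=2.0):
    """Per-node score -N*ln(sigma^2) - c*(|parents| + 1)*ln(N), constants dropped."""
    sigma2 = residual_variance_from_cov(cov, child, parents)
    return -n * np.log(sigma2) - c * (len(parents) + 1) * np.log(n)

# Hypothetical data: node 2 depends linearly on nodes 0 and 1.
rng = np.random.default_rng(1)
n = 1000
x = rng.normal(size=n)
w = rng.normal(size=n)
y = 1.5 * x - 0.7 * w + rng.normal(size=n)
data = np.column_stack([x, w, y])

# Computed once, then reused for every candidate parent set.
cov = np.cov(data, rowvar=False, bias=True)

# Cross-check against an explicit regression on the centered data.
centered = data - data.mean(axis=0)
coef, *_ = np.linalg.lstsq(centered[:, [0, 1]], centered[:, 2], rcond=None)
direct = float(np.mean((centered[:, 2] - centered[:, [0, 1]] @ coef) ** 2))

print(residual_variance_from_cov(cov, 2, [0, 1]), direct)
print(local_score(cov, n, 2, [0, 1]), local_score(cov, n, 2, []))
```

The two residual variances agree up to floating-point error, and the local score with both true parents exceeds the score with no parents.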
23.7. Limitations
Assumes linear-Gaussian structure; may mis-score strong nonlinear or non-Gaussian relationships.
Sensitive to outliers and heteroskedasticity.
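The first limitation can be made concrete with a deliberately extreme case. In the sketch below (an illustration only, with made-up data), y is an exact nonlinear function of x, yet x and y are linearly uncorrelated, so a linear-Gaussian edge gain of the form −N·ln(1 − r²) − c·ln(N) sees no improvement in fit and is negative. The value c = 2 and the single-parameter-per-edge count are assumptions.

```python
import numpy as np

n = 2001
x = np.linspace(-3.0, 3.0, n)  # symmetric grid, so x and x**2 are linearly uncorrelated
y = x ** 2                     # y is a deterministic, purely nonlinear function of x

r = np.corrcoef(x, y)[0, 1]    # linear correlation, essentially zero
gain = -n * np.log(1.0 - r ** 2) - 2.0 * np.log(n)  # edge gain with c = 2

print("correlation:", r)
print("edge gain  :", gain)
```

Although y is fully determined by x, the gain is negative, so a search guided by this score would have no reason to add the edge.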