10. Extended BIC (EBIC) Score

10.1. Summary

The Extended BIC (EBIC) Score is a generalization of BIC intended for high-dimensional settings. It adds an extra penalty term that depends on the number of possible edges, favoring sparser graphs more strongly than standard BIC.

10.2. When to use

  • Number of variables is large relative to the sample size.

  • You want stronger sparsity encouragement than standard BIC provides.

  • You are using score-based methods (FGES, BOSS, GRaSP) in high-dimensional regimes.

10.3. Model class

  • Typically applied to linear Gaussian or discrete DAGs, but the EBIC form is generic.

10.4. Score form (conceptual)

A common EBIC form is:

EBIC = 2 * logL − k * ln(N) − 2 * γ * ln(choose(p, k_edges))

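where logL is the maximized log-likelihood, k is the number of free parameters, N is the sample size, γ is a tuning parameter in [0, 1], p is the number of variables, and k_edges is the number of edges. With this sign convention, higher scores are better, and setting γ = 0 recovers ordinary BIC.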
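As a rough illustration, the conceptual form above can be evaluated directly. The following is a minimal Python sketch, not the implementation used by any particular library: the function name `ebic_score` and its argument names are hypothetical, and it assumes the log-likelihood has already been maximized elsewhere.

```python
import math


def ebic_score(log_lik, k, n, gamma, p, k_edges):
    """Evaluate 2*logL - k*ln(N) - 2*gamma*ln(choose(p, k_edges)).

    Hypothetical helper for illustration only; higher values are better.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= k_edges <= p:
        raise ValueError("k_edges must lie in [0, p] so choose(p, k_edges) > 0")
    # Ordinary BIC part: fit term minus the ln(N) complexity penalty.
    bic_part = 2.0 * log_lik - k * math.log(n)
    # Extra EBIC term; math.comb is exact and math.log accepts large integers.
    extra_penalty = 2.0 * gamma * math.log(math.comb(p, k_edges))
    return bic_part - extra_penalty


# Illustrative numbers only (not taken from any data set).
bic_only = ebic_score(log_lik=-100.0, k=5, n=200, gamma=0.0, p=10, k_edges=3)
with_extra = ebic_score(log_lik=-100.0, k=5, n=200, gamma=1.0, p=10, k_edges=3)
print(bic_only, with_extra)
```

With γ = 0 the extra term vanishes and the value equals the BIC part alone; with γ = 1 the same model scores lower by 2 * ln(choose(p, k_edges)).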

10.5. Parameters

Parameters are listed by their camelCase names.

  • ebicGamma: Double in [0, 1]. The gamma parameter for Extended BIC (EBIC). A value of 0 reduces EBIC to ordinary BIC; values closer to 1 add a stronger extra penalty for models with many predictors (useful in high-dimensional settings). Default is 0.8.

  • precomputeCovariances: Boolean. If true, precomputes and caches the covariance (and possibly cross-covariance) matrices used by the score. This speeds up repeated scoring at the cost of additional memory. If false, these quantities are recomputed on the fly, which saves memory but can be slower for large graphs or many score evaluations.

  • singularityLambda: Double. Handles singular or nearly singular covariance matrices. If singularityLambda > 0, that value is added to the diagonal (a ridge term) to stabilize matrix inverses. If singularityLambda < 0, a pseudoinverse is used instead. Default is 0.0. Use a small positive value if you encounter numerical-singularity warnings.

  • effectiveSampleSize: Double > 0, or -1. If -1 (default), the actual sample size N is used in the ln(N) penalty term. If set to a positive value, the score behaves as if that value were the sample size (for example, when treating weighted or subsampled data as having a different effective N).
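The behavior described for effectiveSampleSize and for a positive singularityLambda can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions, not the library's code: the helper names `penalty_sample_size`, `add_ridge`, and `det2` are hypothetical, the matrix is a toy 2 x 2 example, and the pseudoinverse branch (singularityLambda < 0) is not shown.

```python
def penalty_sample_size(n_actual, effective_sample_size=-1.0):
    """Return the N used in the ln(N) penalty (hypothetical helper)."""
    if effective_sample_size == -1:
        return float(n_actual)  # default: use the actual sample size
    if effective_sample_size <= 0:
        raise ValueError("effective sample size must be > 0, or -1")
    return float(effective_sample_size)


def add_ridge(cov, lam):
    """Return a copy of a square matrix with lam added to each diagonal entry."""
    if lam <= 0:
        raise ValueError("the ridge term must be positive")
    return [
        [value + lam if i == j else value for j, value in enumerate(row)]
        for i, row in enumerate(cov)
    ]


def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two perfectly correlated variables give a singular covariance matrix.
singular = [[1.0, 1.0], [1.0, 1.0]]
stabilized = add_ridge(singular, 0.01)
print(det2(singular), det2(stabilized))

# Actual N versus an assumed effective N for subsampled data.
print(penalty_sample_size(500), penalty_sample_size(500, 120.0))
```

The ridge term moves the determinant away from zero, which is what makes the inverse numerically usable; the cost is a small bias in the resulting estimates.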

10.6. Strengths

  • More conservative than BIC, tending to select sparser graphs in high-dimensional settings.

  • Supported by theory in some sparse regression and graphical model contexts.

10.7. Limitations

  • Choice of γ is somewhat problem-dependent.

  • May penalize edges too strongly, and so omit true edges, when N is not small relative to p (that is, outside the high-dimensional regime).
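One way to get a feel for how γ and p drive the penalty is to compare the extra EBIC term with the ordinary BIC penalty for a fixed model size. The sketch below uses the conceptual score form from Section 10.4; the function names are hypothetical, and it assumes one free parameter per edge purely for illustration.

```python
import math


def extra_penalty(gamma, p, k_edges):
    """Extra EBIC term: 2 * gamma * ln(choose(p, k_edges))."""
    return 2.0 * gamma * math.log(math.comb(p, k_edges))


def bic_penalty(k, n):
    """Ordinary BIC complexity penalty: k * ln(N)."""
    return k * math.log(n)


# Illustrative settings: N = 100 samples, 5 edges, one parameter per edge.
n, k_edges = 100, 5
for p in (20, 1000):
    for gamma in (0.0, 0.5, 0.8, 1.0):
        ratio = extra_penalty(gamma, p, k_edges) / bic_penalty(k_edges, n)
        print(f"p={p:5d}  gamma={gamma:.1f}  extra/BIC penalty = {ratio:.2f}")
```

The ratio grows with both γ and p, so a γ that suits a problem with thousands of variables can be much more restrictive than needed when p is modest and N is comparatively large.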

10.8. References

  • Chen, J., & Chen, Z. (2008). Extended Bayesian information criteria for model selection with large model spaces. Biometrika, 95(3), 759–771.