3. BDeu Score
3.1. Summary
The BDeu (Bayesian Dirichlet equivalent uniform) score is a Bayesian score for discrete Bayesian networks. It places a uniform Dirichlet prior on the parameters of each conditional probability table (CPT), with the prior strength set by an equivalent sample size (ESS). As a result, Markov-equivalent DAGs receive the same score (score equivalence).
3.2. When to use
Variables are discrete.
You want a Bayesian score with a tunable prior strength (ESS).
You care about score equivalence across equivalent DAG structures.
3.3. Model class
Discrete Bayes nets with Dirichlet priors over CPT rows.
3.4. Score form (conceptual)
The BDeu score is the log marginal likelihood of the data given the DAG under a Dirichlet-multinomial model with a uniform Dirichlet prior of strength ESS. The score depends on:
counts in each CPT cell,
the prior hyperparameters determined by ESS and uniformity,
the DAG structure.
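
The three ingredients above can be made concrete with a small sketch. It assumes the standard BDeu hyperparameters from Heckerman et al. (1995): for a child with r states and q parent configurations, each CPT cell gets prior count ESS / (r * q) and each CPT row gets ESS / q. The function names `bdeu_local_score` and `bdeu_score`, the row-tuple data layout, and the dict-of-parents DAG encoding are illustrative choices, not the API of any particular library, and no structure prior is included.

```python
from collections import Counter
from math import lgamma, prod


def bdeu_local_score(rows, child, parents, arities, ess=10.0):
    """Log marginal likelihood contribution of one node given its parents.

    rows    : list of tuples of integer-coded values, one tuple per sample
    child   : column index of the child variable
    parents : list of column indices of the parent variables
    arities : number of states of each variable, indexed by column
    ess     : equivalent sample size (prior strength)
    """
    r = arities[child]                      # number of child states
    q = prod(arities[p] for p in parents)   # parent configurations (1 if no parents)
    alpha_cell = ess / (r * q)              # prior count per CPT cell
    alpha_row = ess / q                     # prior count per CPT row

    row_counts = Counter()                  # N_ij: samples per parent configuration
    cell_counts = Counter()                 # N_ijk: samples per (configuration, child state)
    for row in rows:
        config = tuple(row[p] for p in parents)
        row_counts[config] += 1
        cell_counts[(config, row[child])] += 1

    # Rows and cells that were never observed contribute exactly 0,
    # so it is enough to visit the observed ones.
    score = 0.0
    for n_row in row_counts.values():
        score += lgamma(alpha_row) - lgamma(alpha_row + n_row)
    for n_cell in cell_counts.values():
        score += lgamma(alpha_cell + n_cell) - lgamma(alpha_cell)
    return score


def bdeu_score(rows, dag, arities, ess=10.0):
    """Sum of local scores; dag maps each child index to its list of parents."""
    return sum(
        bdeu_local_score(rows, child, parents, arities, ess)
        for child, parents in dag.items()
    )


# Toy data: two binary variables, X in column 0 and Y in column 1.
rows = [(0, 0), (0, 1), (1, 1), (1, 1), (0, 0), (1, 0)]
arities = [2, 2]
x_to_y = {0: [], 1: [0]}
y_to_x = {1: [], 0: [1]}
print(bdeu_score(rows, x_to_y, arities), bdeu_score(rows, y_to_x, arities))
```

The two DAGs in the usage example are Markov equivalent, so the two printed scores should agree up to floating-point rounding.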
3.5. Parameters
| Parameter (camelCase) | Description |
|---|---|
| | Double ≥ 1.0. The prior equivalent sample size (ESS) for the BDeu Dirichlet prior. For each variable, this total prior count is spread uniformly across the cells of its conditional probability table (one cell per combination of parent configuration and child state). Larger values make the prior stronger relative to the data (smoother estimates and stronger regularization); smaller values let the data dominate. Default is 10.0. |
| | Double ≥ 0.0. Structure prior coefficient controlling a binomial-style prior on graph structure (for example, the expected number of parents per node). When set to 0.0 (the default), BDeu uses an essentially uniform structure prior. Increasing this value biases the score toward graphs whose parent counts match the implied binomial prior, so larger values encourage particular sparsity levels. |
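
To illustrate how the equivalent sample size acts as a prior count, the sketch below computes the per-cell prior count and the resulting posterior-mean CPT estimate under the usual Dirichlet-multinomial update. The helper names `bdeu_cell_prior` and `smoothed_probability` are hypothetical and are not taken from any library; the structure prior is not modeled here.

```python
def bdeu_cell_prior(ess, child_arity, parent_arities):
    """Prior count assigned to one CPT cell: ESS / (r * q)."""
    q = 1
    for arity in parent_arities:
        q *= arity                       # number of parent configurations
    return ess / (child_arity * q)


def smoothed_probability(n_cell, n_row, ess, child_arity, parent_arities):
    """Posterior mean of one CPT entry given observed counts.

    n_cell : samples with this parent configuration and this child state
    n_row  : samples with this parent configuration (any child state)
    """
    alpha_cell = bdeu_cell_prior(ess, child_arity, parent_arities)
    alpha_row = alpha_cell * child_arity  # prior count for the whole CPT row
    return (n_cell + alpha_cell) / (n_row + alpha_row)


# Binary child with two binary parents: 2 states x 4 parent configurations.
print(bdeu_cell_prior(10.0, 2, [2, 2]))             # prior count per cell
# All 3 samples in one parent configuration share the same child state.
print(smoothed_probability(3, 3, 1.0, 2, [2, 2]))   # weak prior
print(smoothed_probability(3, 3, 10.0, 2, [2, 2]))  # stronger prior
```

With the stronger prior the estimate is pulled further from the raw frequency of 1.0 toward the uniform value of 0.5, which is the smoothing effect described in the table.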
3.6. Strengths
Score equivalent under standard conditions.
Incorporates prior information via ESS.
Often preferred over pure BIC in small-sample discrete settings.
3.7. Limitations
Sensitive to the choice of ESS; too large or too small values can bias results.
Uniform Dirichlet priors may not reflect domain knowledge.
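
The ESS sensitivity can be seen even on a toy dataset. The sketch below restates a compact local BDeu score (standard hyperparameters ESS / q per row and ESS / (r * q) per cell, no structure prior) so that it runs on its own, then compares the score of Y with and without parent X for several ESS values. The names `local_score` and `edge_advantage` are hypothetical, and the two-sample dataset is invented purely for illustration.

```python
from collections import Counter
from math import lgamma, log, prod


def local_score(rows, child, parents, arities, ess):
    """Log marginal likelihood of one node given its parents under BDeu."""
    r = arities[child]
    q = prod(arities[p] for p in parents)   # 1 when there are no parents
    a_row, a_cell = ess / q, ess / (r * q)
    n_row = Counter(tuple(row[p] for p in parents) for row in rows)
    n_cell = Counter((tuple(row[p] for p in parents), row[child]) for row in rows)
    return (
        sum(lgamma(a_row) - lgamma(a_row + n) for n in n_row.values())
        + sum(lgamma(a_cell + n) - lgamma(a_cell) for n in n_cell.values())
    )


def edge_advantage(rows, arities, ess):
    """Score of Y (column 1) with parent X (column 0) minus its score with no parents."""
    return local_score(rows, 1, [0], arities, ess) - local_score(rows, 1, [], arities, ess)


# Two samples in which Y always equals X.
rows = [(0, 0), (1, 1)]
arities = [2, 2]
for ess in (1.0, 10.0, 100.0):
    # For this particular dataset the difference works out to log((ess + 1) / ess).
    print(ess, edge_advantage(rows, arities, ess), log((ess + 1) / ess))
```

On this dataset the evidence for the edge shrinks as ESS grows. The direction and size of the effect depend on the data, so this shows only that the comparison between structures can change with ESS, not a general rule.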
3.8. References
Heckerman, D., Geiger, D., & Chickering, D. M. (1995). Learning Bayesian networks: The combination of knowledge and statistical data. Machine Learning, 20(3), 197–243.