10. DirectLiNGAM

Type: Non-Gaussian / Moment-based / Orientation
Output: DAG
Assumptions: Linear structural equation model + non-Gaussian independent errors

DirectLiNGAM is a version of LiNGAM (Linear Non-Gaussian Acyclic Model) that identifies a causal ordering directly, without using ICA. It uses asymmetries in regression residuals that arise whenever error terms are non-Gaussian. Once a causal ordering is found, the algorithm constructs the DAG by regressing each variable on its predecessors.


10.1. Key Idea

LiNGAM models assume:

  • The system is linear.

  • The error terms are independent and non-Gaussian.

  • A causal ordering exists in which each variable is a linear function of earlier variables plus a non-Gaussian noise term.

DirectLiNGAM proceeds by:

  1. Finding an exogenous variable
    An exogenous variable (a variable with no parents) has residuals that are independent of all other variables when each variable is regressed on it.

  2. Removing the exogenous variable
    Once identified, its effect is regressed out of the rest of the system.

  3. Repeating the process
    Each iteration finds the next variable in the causal order.

  4. Estimating the DAG
    After the ordering is known, each variable is regressed on all earlier variables. Nonzero coefficients correspond to directed edges.

This produces a fully oriented DAG.


10.2. When to Use

DirectLiNGAM is appropriate when:

  • The underlying model is linear.

  • The noise terms are non-Gaussian (skewed or heavy-tailed).

  • You want a fully oriented DAG, not just a CPDAG.

  • You prefer an algorithm that avoids ICA (which can be unstable).

Especially useful for:

  • Biological data (gene expression)

  • Econometric and social-science data

  • fMRI or other data where non-Gaussianity is expected


10.3. Prior Knowledge Support

The Tetrad implementation supports:

  • Required edges

  • Forbidden edges

  • Temporal tiers and ordering restrictions

These constraints are used while searching for the causal ordering.


10.4. Strengths

  • Recovers a fully directed causal graph.

  • More numerically stable than ICA-LiNGAM.

  • Does not require conditional independence tests.

  • Works well in moderately high dimensions.

  • Deterministic ordering search.


10.5. Limitations

  • Requires linearity.

  • Requires strong enough non-Gaussianity to detect asymmetries.

  • Not suitable when the true structure contains latent variables.

  • Performance degrades if noise terms are nearly Gaussian.

  • Computational cost grows with the number of variables.


10.6. Key Parameters in Tetrad

Parameter (camelCase)

Meaning

useBootstrap

Whether to compute bootstrap DAG estimates.

numBootstrapSamples

Number of bootstrap samples.

knowledge

Background knowledge such as required/forbidden edges or tiers.

verbose

Print detailed logs during ordering and regression.


10.7. Reference

  • Shimizu, S., Hoyer, P. O., Hyvärinen, A., & Kerminen, A. (2011).
    DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model.
    Journal of Machine Learning Research, 12: 1225–1248.

10.7.1. Python implementation by the authors

A maintained Python package containing DirectLiNGAM and related methods:
https://github.com/cdt15/lingam


10.8. Summary

DirectLiNGAM is a non-Gaussian, moment-based method that identifies a causal ordering directly from asymmetries in regression residuals. It is stable, ICA-free, and ideal for linear models with independent non-Gaussian errors.