10. DirectLiNGAM
Type: Non-Gaussian / Moment-based / Orientation
Output: DAG
Assumptions: Linear structural equation model + non-Gaussian independent errors
DirectLiNGAM is a version of LiNGAM (Linear Non-Gaussian Acyclic Model) that identifies a causal ordering directly, without using ICA. It uses asymmetries in regression residuals that arise whenever error terms are non-Gaussian. Once a causal ordering is found, the algorithm constructs the DAG by regressing each variable on its predecessors.
10.1. Key Idea
LiNGAM models assume:
The system is linear.
The error terms are independent and non-Gaussian.
A causal ordering exists in which each variable is a linear function of earlier variables plus a non-Gaussian noise term.
DirectLiNGAM proceeds by:
Finding an exogenous variable
An exogenous variable (a variable with no parents) has residuals that are independent of all other variables when each variable is regressed on it.Removing the exogenous variable
Once identified, its effect is regressed out of the rest of the system.Repeating the process
Each iteration finds the next variable in the causal order.Estimating the DAG
After the ordering is known, each variable is regressed on all earlier variables. Nonzero coefficients correspond to directed edges.
This produces a fully oriented DAG.
10.2. When to Use
DirectLiNGAM is appropriate when:
The underlying model is linear.
The noise terms are non-Gaussian (skewed or heavy-tailed).
You want a fully oriented DAG, not just a CPDAG.
You prefer an algorithm that avoids ICA (which can be unstable).
Especially useful for:
Biological data (gene expression)
Econometric and social-science data
fMRI or other data where non-Gaussianity is expected
10.3. Prior Knowledge Support
The Tetrad implementation supports:
Required edges
Forbidden edges
Temporal tiers and ordering restrictions
These constraints are used while searching for the causal ordering.
10.4. Strengths
Recovers a fully directed causal graph.
More numerically stable than ICA-LiNGAM.
Does not require conditional independence tests.
Works well in moderately high dimensions.
Deterministic ordering search.
10.5. Limitations
Requires linearity.
Requires strong enough non-Gaussianity to detect asymmetries.
Not suitable when the true structure contains latent variables.
Performance degrades if noise terms are nearly Gaussian.
Computational cost grows with the number of variables.
10.6. Key Parameters in Tetrad
Parameter (camelCase) |
Meaning |
|---|---|
useBootstrap |
Whether to compute bootstrap DAG estimates. |
numBootstrapSamples |
Number of bootstrap samples. |
knowledge |
Background knowledge such as required/forbidden edges or tiers. |
verbose |
Print detailed logs during ordering and regression. |
10.7. Reference
Shimizu, S., Hoyer, P. O., Hyvärinen, A., & Kerminen, A. (2011).
DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model.
Journal of Machine Learning Research, 12: 1225–1248.
10.8. Summary
DirectLiNGAM is a non-Gaussian, moment-based method that identifies a causal ordering directly from asymmetries in regression residuals. It is stable, ICA-free, and ideal for linear models with independent non-Gaussian errors.