Redefining the Shortest Path Problem Formulation of the Linear Non-Gaussian Acyclic Model: Pairwise Likelihood Ratios, Prior Knowledge, and Path Enumeration
Published in arXiv preprint arXiv:2404.11922, 2024
Effective causal discovery is essential for learning the causal graph from observational data. The Linear Non-Gaussian Acyclic Model (LiNGAM) operates under the assumption of a linear data generating process with non-Gaussian noise. Its assumption of no unmeasured confounders, however, poses limitations in practical settings.
Empirical research has shown that reformulating LiNGAM as a shortest path problem (LiNGAM-SPP) addresses this limitation. LiNGAM-SPP uses mutual information as its measure of independence, which in turn requires parameter tuning because it relies on kNN-based mutual information estimators.
This paper proposes a threefold enhancement to the LiNGAM-SPP framework:
- Pairwise Likelihood Ratio – Replaces kNN-based mutual information, removing the need for parameter tuning and improving performance across synthetic and real datasets.
- Prior Knowledge Integration – A node-skipping strategy enables incorporating known relative orderings to eliminate invalid causal paths.
- Path Distribution Analysis – The distribution of paths is analyzed to infer graph-level properties like unmeasured confounding and sparsity.
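To make the first enhancement concrete, here is a minimal sketch of a pairwise likelihood ratio between two variables. It assumes the entropy-based approximation of Hyvärinen and Smith's pairwise likelihood ratio, which may differ from the exact estimator used in the paper. The function names and the toy data are hypothetical.

```python
import math
import random
from statistics import fmean

# Constants of the maximum-entropy approximation of differential entropy
# (an assumption here: the form commonly paired with pairwise likelihood ratios).
K1, K2, GAMMA = 79.047, 7.4129, 0.37457


def standardize(v):
    """Rescale a sample to zero mean and unit variance."""
    m = fmean(v)
    s = math.sqrt(fmean([(a - m) ** 2 for a in v]))
    return [(a - m) / s for a in v]


def entropy(u):
    """Approximate differential entropy of a standardized sample."""
    log_cosh = fmean([math.log(math.cosh(a)) for a in u])
    odd_term = fmean([a * math.exp(-a * a / 2) for a in u])
    gaussian_entropy = (1 + math.log(2 * math.pi)) / 2
    return gaussian_entropy - K1 * (log_cosh - GAMMA) ** 2 - K2 * odd_term ** 2


def residual(target, regressor):
    """Least-squares residual; for standardized inputs the slope is the correlation."""
    rho = fmean([t * r for t, r in zip(target, regressor)])
    return [t - rho * r for t, r in zip(target, regressor)]


def pairwise_lr(x, y):
    """Positive values favor x -> y, negative values favor y -> x."""
    x, y = standardize(x), standardize(y)
    r_y = standardize(residual(y, x))  # noise estimate under x -> y
    r_x = standardize(residual(x, y))  # noise estimate under y -> x
    return (entropy(y) + entropy(r_x)) - (entropy(x) + entropy(r_y))


# Toy data (hypothetical): x causes y, with uniform (non-Gaussian) noise.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(5000)]
y = [0.8 * a + rng.uniform(-1, 1) for a in x]
score_xy = pairwise_lr(x, y)
score_yx = pairwise_lr(y, x)
```

Unlike a kNN estimator, this measure has no neighborhood size to choose, and swapping the arguments flips its sign.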
These refinements enhance the practicality, interpretability, and robustness of graph-search-based causal discovery.
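The prior-knowledge and path-enumeration ideas can be illustrated with a small sketch. It assumes that the search graph is the lattice of variable subsets, where each path from the empty set to the full set corresponds to one causal ordering; this structure, along with the function names and the example constraint, is an assumption made for illustration rather than the paper's implementation.

```python
from itertools import combinations


def valid_nodes(n_vars, precedes):
    """Subsets of variables that respect every known pair (a, b): a before b."""
    nodes = []
    for size in range(n_vars + 1):
        for subset in combinations(range(n_vars), size):
            s = frozenset(subset)
            # Node skipping: drop any subset holding b without its predecessor a.
            if all(a in s for a, b in precedes if b in s):
                nodes.append(s)
    return nodes


def enumerate_orderings(n_vars, precedes):
    """List every path from the empty set to the full set through valid nodes."""
    allowed = set(valid_nodes(n_vars, precedes))
    full = frozenset(range(n_vars))
    paths = []

    def extend(current, order):
        if current == full:
            paths.append(tuple(order))
            return
        for v in range(n_vars):
            nxt = current | {v}
            if v not in current and nxt in allowed:
                extend(nxt, order + [v])

    extend(frozenset(), [])
    return paths


# Hypothetical prior knowledge: variable 0 comes before variable 1.
nodes = valid_nodes(3, [(0, 1)])
orderings = enumerate_orderings(3, [(0, 1)])
```

In a full pipeline each edge would carry a cost from the chosen independence measure, so the enumerated paths could be ranked or their cost distribution inspected; the costs are left out here to keep the skipping logic visible.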
Recommended citation: Ong, Hans Jarett J., and Lim, Brian Godwin S. (2024). "Redefining the Shortest Path Problem Formulation of the Linear Non-Gaussian Acyclic Model: Pairwise Likelihood Ratios, Prior Knowledge, and Path Enumeration." arXiv preprint arXiv:2404.11922.