Redefining the Shortest Path Problem Formulation of the Linear Non-Gaussian Acyclic Model: Pairwise Likelihood Ratios, Prior Knowledge, and Path Enumeration

Published in arXiv preprint arXiv:2404.11922, 2024

Effective causal discovery is essential for learning the causal graph from observational data. The Linear Non-Gaussian Acyclic Model (LiNGAM) operates under the assumption of a linear data generating process with non-Gaussian noise. Its assumption of no unmeasured confounders, however, poses limitations in practical settings.

Empirical research has shown that reformulating LiNGAM as a shortest path problem (LiNGAM-SPP) addresses this limitation, using mutual information as a measure of independence. However, LiNGAM-SPP relies on kNN estimators of mutual information, which makes it sensitive to parameter tuning.

This paper proposes a threefold enhancement to the LiNGAM-SPP framework:

  1. Pairwise Likelihood Ratio – Replaces kNN-based mutual information, removing the need for parameter tuning and improving performance across synthetic and real datasets.
  2. Prior Knowledge Integration – A node-skipping strategy enables incorporating known relative orderings to eliminate invalid causal paths.
  3. Path Distribution Analysis – The distribution of paths is analyzed to infer graph-level properties like unmeasured confounding and sparsity.
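
To illustrate item 1, the sketch below computes a pairwise likelihood ratio for two variables from differences of approximate differential entropies, using the maximum-entropy approximation common in the LiNGAM literature. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names `approx_entropy`, `standardize`, and `pairwise_lr` are hypothetical, and the exact estimator used in the paper may differ. Unlike a kNN-based estimator, it has no neighborhood size to tune.

```python
import numpy as np


def approx_entropy(u):
    """Maximum-entropy approximation of differential entropy.

    Assumes u is standardized (zero mean, unit variance).
    """
    k1, k2, gamma = 79.047, 7.4129, 0.37457
    gaussian_entropy = (1.0 + np.log(2.0 * np.pi)) / 2.0
    term1 = k1 * (np.mean(np.log(np.cosh(u))) - gamma) ** 2
    term2 = k2 * np.mean(u * np.exp(-(u ** 2) / 2.0)) ** 2
    return gaussian_entropy - term1 - term2


def standardize(v):
    """Center and scale a 1-D sample to zero mean and unit variance."""
    return (v - v.mean()) / v.std()


def pairwise_lr(x, y):
    """Approximate log-likelihood ratio of x -> y versus y -> x.

    A positive value favors x -> y; a negative value favors y -> x.
    """
    x, y = standardize(x), standardize(y)
    rho = np.mean(x * y)   # sample correlation of the standardized pair
    r_y = y - rho * x      # residual of y regressed on x
    r_x = x - rho * y      # residual of x regressed on y
    h_forward = approx_entropy(x) + approx_entropy(standardize(r_y))
    h_backward = approx_entropy(y) + approx_entropy(standardize(r_x))
    # Lower total entropy means higher likelihood for that direction.
    return h_backward - h_forward


# Illustrative usage on synthetic data where x causes y (uniform noise).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 5000)
y = 0.8 * x + 0.5 * rng.uniform(-1, 1, 5000)
print(pairwise_lr(x, y) > 0)  # a positive ratio favors x -> y
```

The ratio is antisymmetric by construction, so `pairwise_lr(y, x)` is the negative of `pairwise_lr(x, y)`, and only its sign is needed to pick a direction for the pair.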

These refinements enhance the practicality, interpretability, and robustness of graph-search-based causal discovery.

Recommended citation: Ong, Hans Jarett J., and Lim, Brian Godwin S. (2024). "Redefining the Shortest Path Problem Formulation of the Linear Non-Gaussian Acyclic Model: Pairwise Likelihood Ratios, Prior Knowledge, and Path Enumeration." arXiv preprint arXiv:2404.11922.