Accurate and efficient causal discovery, and causal representation learning
Authors
Yin, Naiyu
Issue Date
2024-12
Type
Electronic thesis
Thesis
Language
en_US
Keywords
Electrical engineering
Abstract
Recent advances in causal learning and inference have positioned causality as a key approach to addressing critical challenges in AI, including limited generalization, lack of interpretability, and unfairness. Structural causal models (SCMs) are a foundational framework in this domain, using directed acyclic graphs (DAGs) to represent causal relationships among variables and structural equation models to quantify them. This dissertation addresses two main areas: causal discovery, which seeks to recover a unique DAG from observational data, and causal representation learning, which leverages an SCM to learn representations causally related to the target variable. We develop new theories and algorithms to overcome limitations in these areas.
In causal discovery, we improve both accuracy and efficiency. To enhance accuracy, we introduce the heteroscedastic SCM (HSCM), which extends traditional SCMs by allowing heteroscedastic (non-constant variance) noise. We then propose an algorithm with identifiability guarantees to learn HSCMs. Empirical results show that the DAGs learned by our method are highly accurate, particularly on real-world data. For efficiency, we introduce a method to learn nonlinear DAGs in a projected space, ensuring the acyclicity constraint is met without explicitly imposing it in the original space. This approach significantly reduces computation time while maintaining state-of-the-art accuracy.
In causal representation learning, we address limitations of existing SCMs: limited representation scope, unaddressed latent confounders, and incomplete SCM learning. To broaden causal representations, we define an SCM that incorporates the causal Markov blanket (CMB) features of the target variable, and propose an efficient algorithm with identifiability guarantees for learning CMB representations. Results indicate significant improvement in out-of-distribution (OOD) performance. To handle latent confounders and obtain a complete SCM, we further extend the SCM by including a latent confounder, and apply the structural EM method for full SCM learning. Using the SCM, we introduce interventional inference for domain generalization and counterfactual inference for data generation, demonstrating strong OOD performance and generation of counterfactual images aligned with human interpretation.
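The two causal-discovery ideas named in the abstract can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the function names, the two-variable model, and the noise-scale formula are hypothetical choices made for illustration. The first function evaluates a commonly used polynomial acyclicity measure, the kind of constraint that continuous DAG learners typically impose explicitly; the second draws data from a toy model whose noise scale depends on the parent variable, which is what heteroscedastic noise means here.

```python
import numpy as np


def acyclicity_penalty(weights: np.ndarray) -> float:
    """Return tr((I + A/d)^d) - d, where A is the elementwise square of weights.

    For a d x d weighted adjacency matrix, the value is zero exactly when
    the graph has no directed cycle, and positive otherwise.
    """
    d = weights.shape[0]
    a = weights * weights  # nonnegative edge strengths
    m = np.eye(d) + a / d
    return float(np.trace(np.linalg.matrix_power(m, d)) - d)


def sample_heteroscedastic_pair(n: int, seed: int = 0) -> np.ndarray:
    """Draw n samples from a toy model X1 -> X2 with parent-dependent noise.

    Assumed (hypothetical) structural equations:
        X1 = N1
        X2 = tanh(X1) + (0.1 + 0.5 * |X1|) * N2
    with N1, N2 independent standard normal variables.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    noise_scale = 0.1 + 0.5 * np.abs(x1)  # noise spread grows with |X1|
    x2 = np.tanh(x1) + noise_scale * rng.normal(size=n)
    return np.column_stack([x1, x2])


# Entry [i, j] is the weight of the edge i -> j.
dag = np.array([[0.0, 1.5],
                [0.0, 0.0]])      # single edge X1 -> X2
cyclic = np.array([[0.0, 1.5],
                   [0.7, 0.0]])   # X1 -> X2 and X2 -> X1

print("penalty for the DAG:   ", acyclicity_penalty(dag))
print("penalty for the cycle: ", acyclicity_penalty(cyclic))

samples = sample_heteroscedastic_pair(20000)
residual = samples[:, 1] - np.tanh(samples[:, 0])
near = np.abs(samples[:, 0]) < 0.5
far = np.abs(samples[:, 0]) > 1.0
print("residual variance, |X1| < 0.5:", residual[near].var())
print("residual variance, |X1| > 1.0:", residual[far].var())
```

The penalty vanishes for the acyclic matrix and is positive once the reverse edge closes a cycle, and the residual variance of X2 is visibly larger where |X1| is large. A model that assumes constant noise variance cannot represent that pattern.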
Description
December 2024
School of Engineering
Publisher
Rensselaer Polytechnic Institute, Troy, NY