Identifying Structure and Semantics in Bayesian Network Inference
Bayesian networks are a semantic modeling tool for managing uncertainty in complex domains. While the numerous techniques for exact inference vary in when they apply the multiplication and marginalization (addition) operators, they all center on eliminating variables. These inference algorithms start and end with clear structure and semantics, yet all intermediate distributions, whether normalized or unnormalized, are denoted as potentials.

In this thesis, we reveal the structure and semantics of the intermediate factors constructed during exact inference in discrete Bayesian networks. Our discussion is primarily based on Variable Elimination (VE), a standard approach to Bayesian network inference. We first show that, when evidence is not considered, every multiplication and every addition in VE yields a conditional probability table (CPT). As for semantics, we present several techniques for establishing whether the CPT generated by VE from the given Bayesian network CPTs is the same as the one that would be constructed by brute force from the joint probability distribution defined by the network. The culminating result of this thesis is the Semantics in Inference (SI) algorithm. Four important properties of SI are shown: soundness, completeness, strong completeness, and polynomial time complexity.

This work is important in several ways. First, we have removed potentials from any discussion of exact inference in discrete Bayesian networks. Potentials, being unnormalized probability distributions, lack a clear physical interpretation; CPTs, in contrast, have clear semantic meaning. In addition, we have revealed that d-separation plays an important role in understanding the semantics of inference. Pearl, the founder of Bayesian networks, emphasizes the importance of d-separation with respect to Bayesian network modeling.
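The claim that a VE step yields a CPT can be sketched on the smallest possible example. The following is a minimal illustration, not the thesis's SI algorithm: a two-node chain A → B with hypothetical probability values, where one VE step (multiply, then sum out A) produces a result that is already normalized.

```python
# Toy network A -> B with made-up numbers for illustration only.
# P(A): prior over A
p_a = {0: 0.6, 1: 0.4}

# P(B | A): keyed by (a, b)
p_b_given_a = {(0, 0): 0.9, (0, 1): 0.1,
               (1, 0): 0.2, (1, 1): 0.8}

# One VE step: multiply P(A) * P(B|A), then marginalize (sum out) A.
p_b = {}
for b in (0, 1):
    p_b[b] = sum(p_a[a] * p_b_given_a[(a, b)] for a in (0, 1))

print(p_b)                 # P(B) = {0: 0.62, 1: 0.38}
print(sum(p_b.values()))   # sums to 1: the intermediate factor is a CPT
```

Since evidence is not considered here, the intermediate factor is a proper (normalized) distribution rather than a mere potential, which is the evidence-free case the abstract describes.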
With respect to inference, however, Pearl states only that d-separation can determine the minimum information needed to answer a query posed to a Bayesian network; no claim has ever been made that d-separation can also provide semantics during Bayesian network inference. The above results may serve as a pedagogical aid to newcomers to the field, since Bayesian network inference is widely regarded as difficult to understand. Practical benefits of our findings are introduced, but largely remain for future work.
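Pearl's point that structure determines the minimum information needed for a query can be illustrated with a coarse pruning rule related to d-separation: for a query with no evidence, any variable that is not an ancestor of the query variable is barren and can be dropped before inference. The sketch below uses a hypothetical three-node chain; full d-separation analysis can prune more than this ancestor test.

```python
def ancestors(node, parents):
    """Return `node` together with all of its ancestors in the DAG,
    given a map from each node to its list of parents."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen

# Hypothetical chain A -> B -> C.
parents = {"A": [], "B": ["A"], "C": ["B"]}

# For the query P(B), node C is barren: only A and B are needed.
relevant = ancestors("B", parents)
print(sorted(relevant))  # ['A', 'B']
```

This is only the structural side of the story; the thesis's contribution is the complementary semantic side, namely what the factors built from the surviving CPTs actually mean.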