Publisher's Synopsis
Deep neural networks (DNNs) are currently the dominant technology in Artificial Intelligence (AI) and have shown impressive performance in diverse applications, including autonomous driving, medical diagnosis, text generation, and logical reasoning. However, they lack transparency due to their black-box construction and are vulnerable to environmental and adversarial noise. These issues have raised concerns about their safety and trustworthiness when deployed in the real world. Although standard training optimizes a model's accuracy, it does not account for desirable safety properties such as robustness, fairness, and monotonicity. As a result, researchers have devoted considerable effort to developing automated methods for building safe and trustworthy DNNs. Among the various approaches, abstract interpretation has emerged as the most popular framework for efficiently analyzing realistic DNNs. However, because the computational structure of DNNs differs fundamentally from that of traditional programs, developing efficient DNN analyzers has required tackling research challenges quite different from those encountered in program analysis.

This monograph describes state-of-the-art abstract-interpretation-based approaches for analyzing DNNs, including the design of new abstract domains, the synthesis of novel abstract transformers, abstraction refinement, and incremental analysis. It also discusses how the analysis results can be used to: (i) formally check whether a trained DNN satisfies desired output and gradient-based safety properties, (ii) guide model updates during training towards satisfying safety properties, and (iii) reliably explain and interpret the black-box workings of DNNs.