Human interpretable artificial intelligence applications for microbial-related diseases
Date
2022-09
Authors
Espinoza, Josh L.
Abstract
The human microbiome is a complex ecosystem that is influenced not only by host
genetics but also by environmental stimuli. With advancements in next-generation sequencing
(NGS) technologies, genomics and related meta-omics such as metagenomics,
metatranscriptomics and metaproteomics have become increasingly accessible for
researchers and clinicians to investigate microbial-related diseases. However, analysis
of the outputs of “omics” technologies is often difficult due to variance introduced by
biological complexity, batch effects from laboratory protocols/conditions, and the
sensitivity/calibration of highly sensitive instruments. The biological complexity of “omics”
presents a considerable analytical obstacle as most datasets contain hundreds of
thousands to millions of unique features with unknown connections and nested
hierarchies. In addition to this inherent complexity, the deluge of data generated from
NGS technologies is fundamentally compositional: it conveys only relative information and
therefore cannot be robustly analyzed using conventional statistical approaches.
Furthermore, meta-omics datasets are typically sparse, and the number of biological
features often vastly exceeds the number of biological samples, which can introduce
anomalies in statistical analysis and the downstream findings if not addressed
accordingly, a phenomenon dubbed “the curse of dimensionality”. The complexity,
compositionality, and dimensionality of “omics” datasets make it challenging to derive
clinical meaning and an understanding of the microbial system with respect to a host
phenotype. Although artificial intelligence and machine-learning methods have progressed
substantially in recent years, their applications in domain sciences such as biology, and
by extension “omics” technologies, have been limited in terms of human interpretability.
In many machine-learning paradigms, interpretability is often sacrificed for analytical
performance, or vice versa, but a recent domain-agnostic effort aims to develop
explainable artificial intelligence algorithms that have both high modeling performance
and human interpretability, a major goal of biomedical sciences.
In this dissertation, I develop novel approaches for bridging biological science with
machine-learning methods at the vanguard of scientific development through the initiative
of explainable artificial intelligence. The methods developed are validated on three datasets
pertaining to microbial-related diseases, including antibiotic resistance discovery, acute
malnutrition in West African children, and caries pathology in Australian juvenile twins.
The combination of methods developed is expected to provide the means for clinical
researchers to overcome obstacles in interrogating the complex narratives that determine
health and disease.
Description
Submitted in fulfillment of the requirements of the degree of Doctor of Philosophy of
Applied Science in Biotechnology, Durban University of Technology, Durban, South Africa, 2022.
Keywords
Human microbiome, Artificial intelligence, Microbial-related diseases
DOI
https://doi.org/10.51415/10321/4701