Decoding the stochastic profile of m6A over the entire transcriptome

Conference: BIBE 2022 - The 6th International Conference on Biological Information and Biomedical Engineering
19.06.2022 - 20.06.2022 in Virtual, China

Proceedings: BIBE 2022

Pages: 4
Language: English
Type: PDF

Authors:
Wang, Jiaying; Wei, Zhen; Zhang, Yuxin (Department of Biological Sciences, Xi’an Jiaotong-Liverpool University, Suzhou Jiangsu, China)

Abstract:
N6-methyladenosine (m6A), an abundant eukaryotic mRNA modification, is a crucial epigenetic marker dynamically regulated by demethylases (erasers), methyltransferases (writers), and binding proteins (readers). Hence, decoding the stochastic profile of m6A over the transcriptome is invaluable to our understanding of its biological functions. m6A sites over 1,624,625 DRACH motifs on human exons were summarized from 40 experiments. Four machine learning algorithms, the generalized linear model (GLM), multi-layer perceptron (MLP), extreme gradient boosting (XGBoost), and random forest (RF), were implemented to build Poisson regression models. Compared with the classification models used in previous studies, our model provides a new framework for integrating multiple single-base RNA modification datasets. We demonstrated that the Poisson regressors predict the biological and technical variation between experiments better than classifiers trained with the same features. In addition, for the first time, we used protein binding information for prediction and achieved significantly better performance than models based only on sequence-derived and genome-derived features.
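
To make the Poisson-regression framing concrete, the following is a minimal Python sketch and not the authors' pipeline. It assumes, since the abstract does not say so explicitly, that the response for each DRACH motif is the number of experiments (out of 40) reporting m6A at that site, and that one predictor is a binary flag for overlap with a protein binding site. The function names `find_drach_sites` and `fit_poisson` and the toy data are hypothetical, and the model is a plain log-linear GLM with one feature, fitted by Newton-Raphson.

```python
import math
import re

# DRACH consensus in DNA letters: D = A/G/T, R = A/G, H = A/C/T.
# The lookahead lets overlapping motifs be reported as well.
DRACH = re.compile(r"(?=([AGT][AG]AC[ACT]))")


def find_drach_sites(seq):
    """Return 0-based positions of the central A of every DRACH motif."""
    dna = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    return [m.start() + 2 for m in DRACH.finditer(dna)]


def fit_poisson(xs, ys, n_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood.

    Assumes sum(ys) > 0 and that xs takes at least two distinct values.
    """
    b0 = math.log(sum(ys) / len(ys))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(n_iter):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector (gradient of the log-likelihood)
        g0 = sum(y - mu for y, mu in zip(ys, mus))
        g1 = sum(x * (y - mu) for x, y, mu in zip(xs, ys, mus))
        # Fisher information (negative Hessian), a symmetric 2x2 matrix
        h00 = sum(mus)
        h01 = sum(x * mu for x, mu in zip(xs, mus))
        h11 = sum(x * x * mu for x, mu in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H d = g in closed form
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy data (invented for illustration): x = 1 if the site overlaps a
# hypothetical protein binding site, y = number of experiments reporting m6A.
toy_x = [0, 0, 0, 0, 1, 1, 1, 1]
toy_y = [1, 0, 2, 1, 12, 9, 15, 8]
b0, b1 = fit_poisson(toy_x, toy_y)
# exp(b0) is the expected count without binding; exp(b1) is the rate ratio.
expected_without = math.exp(b0)
rate_ratio = math.exp(b1)
```

With a single binary feature the maximum-likelihood fit has a closed form: `exp(b0)` equals the mean count in the `x = 0` group and `exp(b0 + b1)` the mean count in the `x = 1` group, which gives a quick sanity check on the optimizer. Unlike a classifier trained on a methylated/unmethylated label, the fitted model keeps the count itself, so a site seen in 12 experiments is treated differently from one seen in a single experiment.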