Show simple item record

dc.contributor.author: Al-Barakati, Hussam J.
dc.contributor.author: Thapa, Niraj
dc.contributor.author: Hiroto, Saigo
dc.contributor.author: Roy, Kaushik
dc.contributor.author: Newman, Robert H.
dc.contributor.author: KC, Dukka B.
dc.date.accessioned: 2020-04-25T22:39:11Z
dc.date.available: 2020-04-25T22:39:11Z
dc.date.issued: 2020
dc.identifier.citation: Al-Barakati, Hussam J.; Thapa, Niraj; Hiroto, Saigo; Roy, Kaushik; Newman, Robert H.; Kc, Dukka B. 2020. RF-MaloSite and DL-Malosite: Methods based on random forest and deep learning to identify malonylation sites. Computational and Structural Biotechnology Journal, vol. 18, pp. 852-860 [en_US]
dc.identifier.issn: 2001-0370
dc.identifier.uri: https://doi.org/10.1016/j.csbj.2020.02.012
dc.identifier.uri: https://soar.wichita.edu/handle/10057/17484
dc.description: © 2020 The Authors. Open access. Under a Creative Commons license. [en_US]
dc.description.abstract: Malonylation, which has recently emerged as an important lysine modification, regulates diverse biological activities and has been implicated in several pervasive disorders, including cardiovascular disease and cancer. However, conventional global proteomics analysis using tandem mass spectrometry can be time-consuming, expensive and technically challenging. Therefore, to complement and extend existing experimental methods for malonylation site identification, we developed two novel computational methods for malonylation site prediction, RF-MaloSite and DL-MaloSite, based on random forest and deep learning algorithms, respectively. DL-MaloSite requires the primary amino acid sequence as an input, whereas RF-MaloSite utilizes a diverse set of biochemical, physicochemical and sequence-based features. While systematic assessment of performance metrics suggests that both ‘RF-MaloSite’ and ‘DL-MaloSite’ perform well in all metrics tested, our methods perform particularly well in the areas of accuracy, sensitivity and overall method performance (assessed by the Matthews correlation coefficient, MCC). For instance, RF-MaloSite exhibited MCC scores of 0.42 and 0.40 using 10-fold cross-validation and an independent test set, respectively. Meanwhile, DL-MaloSite was characterized by MCC scores of 0.51 and 0.49 based on 10-fold cross-validation and an independent test set, respectively. Importantly, both methods exhibited efficiency scores that were on par with or better than those achieved by existing malonylation site prediction methods. The identification of these sites may also provide important insights into the mechanisms of crosstalk between malonylation and other lysine modifications, such as acetylation, glutarylation and succinylation. To facilitate their use, both methods have been made freely available to the research community at https://github.com/dukkakc/DL-MaloSite-and-RF-MaloSite. [en_US]
dc.description.sponsorship: National Science Foundation (NSF) grant nos. 2021734, 1564606 and 1901793 (to DK). RHN is supported by an HBCU-UP Excellence in Research Award from NSF (1901793) and an SC1 Award from the National Institutes of Health National Institute of General Medical Sciences (5SC1GM130545). HS was supported by JSPS KAKENHI Grant Numbers JP18H01762 and JP19H04176. [en_US]
dc.language.iso: en_US [en_US]
dc.publisher: Elsevier [en_US]
dc.relation.ispartofseries: Computational and Structural Biotechnology Journal;v.18
dc.subject: Convolutional neural network [en_US]
dc.subject: Deep learning [en_US]
dc.subject: Malonylation [en_US]
dc.subject: Post-translational modification sites [en_US]
dc.subject: Random forest [en_US]
dc.title: RF-MaloSite and DL-Malosite: Methods based on random forest and deep learning to identify malonylation sites [en_US]
dc.type: Article [en_US]
dc.rights.holder: © 2020 The Authors [en_US]
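
The abstract in this record reports performance as Matthews correlation coefficient (MCC) scores. As a reading aid, the sketch below shows how an MCC value is obtained from binary confusion-matrix counts using the standard definition. It is not taken from the authors' repository: the function name and the example counts are hypothetical and chosen only for illustration, and returning 0.0 for an undefined denominator is a common convention assumed here.

```python
import math


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Hypothetical helper for illustration; not part of RF-MaloSite or DL-MaloSite.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: report 0.0 when a row or column total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up counts for a balanced test set (not results from the paper).
print(round(mcc_from_counts(tp=70, tn=75, fp=25, fn=30), 2))  # prints 0.45
```

MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, which is why it is often preferred over accuracy when summarizing a binary site predictor in a single number.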

