Current Biotechnology, 2023, Vol. 13, Issue (5): 671-680. DOI: 10.19586/j.2095-2341.2021.0201
• Reviews •
Haitao CAO, Jing ZHU, Yunpeng MA, Xinghua CUI
Received: 2021-12-29
Accepted: 2022-05-16
Online: 2023-09-25
Published: 2023-10-10
Corresponding author: Jing ZHU
About the author: Haitao CAO, E-mail: 2232060551@qq.com
Haitao CAO, Jing ZHU, Yunpeng MA, Xinghua CUI. Application of Machine Learning in Phenotypic Prediction of Gut Microbiota[J]. Current Biotechnology, 2023, 13(5): 671-680.
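| Disease | Samples | Negative samples | Positive samples | Algorithm | Evaluation metric | Prediction accuracy |
|---|---|---|---|---|---|---|
| Type 2 diabetes | 344 | 170 | 174 | Random forest | AUC | 0.74 |
| Type 2 diabetes | 344 | 170 | 174 | Support vector machine | AUC | 0.66 |
| Type 2 diabetes | 344 | 170 | 174 | Elastic net | AUC | 0.70 |
| Type 2 diabetes | 344 | 170 | 174 | Lasso | AUC | 0.71 |
| Type 2 diabetes | 806 | 423 | 383 | Logistic regression | F1 score | 0.91 |
| Type 2 diabetes | 806 | 423 | 383 | Support vector machine | F1 score | 0.91 |
| Type 2 diabetes | 806 | 423 | 383 | AdaBoost | F1 score | 0.90 |
| Type 2 diabetes | 806 | 423 | 383 | Gradient boosting decision tree | F1 score | 0.87 |
| Type 2 diabetes | 806 | 423 | 383 | K-nearest neighbors | F1 score | 0.86 |
| Type 2 diabetes | 806 | 423 | 383 | Stochastic gradient descent | F1 score | 0.84 |
| Type 2 diabetes | 806 | 423 | 383 | Random forest | F1 score | 0.83 |
| Liver cirrhosis | 232 | 118 | 114 | Random forest | AUC | 0.95 |
| Liver cirrhosis | 232 | 118 | 114 | Support vector machine | AUC | 0.92 |
| Liver cirrhosis | 232 | 118 | 114 | Elastic net | AUC | 0.91 |
| Liver cirrhosis | 232 | 118 | 114 | Lasso | AUC | 0.88 |
| Colorectal cancer | 121 | 48 | 73 | Random forest | AUC | 0.87 |
| Colorectal cancer | 121 | 48 | 73 | Support vector machine | AUC | 0.81 |
| Colorectal cancer | 121 | 48 | 73 | Elastic net | AUC | 0.79 |
| Colorectal cancer | 121 | 48 | 73 | Lasso | AUC | 0.73 |
| Obesity | 253 | 164 | 89 | Random forest | AUC | 0.66 |
| Obesity | 253 | 164 | 89 | Support vector machine | AUC | 0.65 |
| Obesity | 253 | 164 | 89 | Elastic net | AUC | 0.64 |
| Obesity | 253 | 164 | 89 | Lasso | AUC | 0.60 |
| Inflammatory bowel disease | 110 | 25 | 85 | Random forest | AUC | 0.89 |
| Inflammatory bowel disease | 110 | 25 | 85 | Support vector machine | AUC | 0.86 |
| Inflammatory bowel disease | 110 | 25 | 85 | Elastic net | AUC | 0.83 |
| Inflammatory bowel disease | 110 | 25 | 85 | Lasso | AUC | 0.81 |
| Cholangitis | 48 | 24 | 24 | Random forest | AUC | 0.74 |
| Halitosis | 90 | 45 | 45 | Deep learning | AUC | 0.97 |
| Halitosis | 90 | 45 | 45 | Support vector machine | AUC | 0.79 |
| Intestinal polyps | 552 | 316 | 236 | Naive Bayes | AUC | 0.86 |
| Intestinal polyps | 552 | 316 | 236 | Artificial neural network | AUC | 0.87 |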
Table 1 Examples of machine learning algorithms and their prediction accuracy for different diseases