Building a Kokumi Database and Machine Learning-Based Prediction: A Systematic Computational Study on Kokumi Analysis.

Yi He, Kaifeng Liu, Xiangyu Yu, Hengzheng Yang, Weiwei Han
Published in: Journal of Chemical Information and Modeling (2024)
Kokumi is a subtle sensation characterized by fullness, continuity, and thickness. Traditional methods of taste discovery and analysis, including those for kokumi, are labor-intensive and costly, which has made computational methods a critical strategy in molecular taste analysis and prediction. In this study, we undertook a comprehensive analysis, prediction, and screening of kokumi compounds. We categorized 285 kokumi compounds from a previously unreleased kokumi database into five groups based on their molecular characteristics. We then predicted kokumi/non-kokumi and multi-flavor compositions using six structure-taste relationship models: MLP-E3FP, MLP-PLIF, MLP-RDKFP, SVM-RDKFP, RF-RDKFP, and WeaveGNN with atom and bond features. The six predictors performed differently across the two prediction tasks. For kokumi/non-kokumi prediction, the WeaveGNN model achieved the highest AUC (0.94), outperforming the other models (0.87, 0.90, 0.89, 0.92, and 0.78). For multi-flavor prediction, the MLP-E3FP model achieved an AUC of 0.94 and the highest MCC (0.74); the other models reached AUC and MCC values of 0.73 and 0.33; 0.92 and 0.70; 0.95 and 0.73; 0.94 and 0.64; and 0.88 and 0.69. These data highlight the models' proficiency in predicting kokumi molecules. Building on these predictions, we identified kokumi-active compounds through a high-throughput screening of over 100 million molecules, further refined by toxicity and similarity screening. Lastly, we launched a web platform, KokumiPD (https://www.kokumipd.com/), offering a comprehensive kokumi database and online prediction services for users.
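For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how AUC and MCC can be computed for a binary kokumi/non-kokumi classifier. It is not the authors' evaluation code: the function names `roc_auc` and `mcc`, the toy labels, the toy scores, and the 0.5 decision threshold are all assumptions made for illustration, and none of the numbers come from the study.

```python
from math import sqrt


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mcc(y_true, y_pred):
    """Matthews correlation coefficient from the binary confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data (hypothetical): 1 = kokumi, 0 = non-kokumi.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

# Assumed decision threshold of 0.5 for turning scores into class labels.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

print("AUC:", roc_auc(toy_labels, toy_scores))
print("MCC:", mcc(toy_labels, toy_preds))
```

AUC depends only on how the scores rank positives against negatives, while MCC depends on the thresholded predictions, which is one reason two models can tie on AUC yet differ in MCC.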
Keyphrases
  • machine learning
  • primary care
  • healthcare
  • small molecule
  • mental health
  • high throughput
  • oxidative stress
  • big data
  • optical coherence tomography
  • electronic health record