ASAS-NANP symposium: mathematical modeling in animal nutrition - Making sense of big data and machine learning: how open-source code can advance training of animal scientists.
Jameson R Brennan, Hector Manuel Menendez, Krista Ehlert, Luis Orlindo Tedeschi
Published in: Journal of animal science (2023)
Advancements in precision livestock technology have resulted in an unprecedented amount of data being collected on individual animals. Throughout the data analysis chain, many bottlenecks occur, including processing raw sensor data, integrating multiple streams of information, incorporating data into animal growth and nutrition models, developing decision support tools for producers, and training animal science students as data scientists. To realize the promise of precision livestock management technologies, open-source tools and tutorials must be developed to reduce these bottlenecks, which are a direct result of the tremendous time and effort required to create data pipelines from scratch. Open-source programming languages (e.g., R or Python) can provide users with tools to automate many data processing steps for cleaning, aggregating, and integrating data. However, the steps from data collection to training artificial intelligence models and integrating predictions into mathematical models can be tedious for those new to statistical programming, with few examples pertaining to animal science. To address this issue, we outline how open-source code can help overcome many of the bottlenecks that occur in the era of big data and precision livestock technology, with an emphasis on how routine use and publication of open-source code can help facilitate training the next generation of animal scientists. In addition, two case studies are presented with publicly available data and code to demonstrate how open-source tutorials can be utilized to streamline data processing, train machine learning models, integrate with animal nutrition models, and facilitate learning. The National Animal Nutrition Program focuses on providing research-based data on animal performance and feeding strategies. Open-source data and code repositories with examples specific to animal science can help create a reinforcing mechanism aimed at advancing animal science research.
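The cleaning, aggregating, and integrating steps named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper's case studies: the record layout, the function names (`clean`, `aggregate_daily`, `integrate`), the plausibility bounds, and the toy values are all assumptions made up for illustration, and only the Python standard library is used.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw sensor stream: (animal_id, date, intake_kg).
# Values are invented for illustration; None marks a dropped reading.
raw_intake = [
    ("A1", "2023-01-01", 10.0),
    ("A1", "2023-01-01", 12.0),
    ("A1", "2023-01-02", None),   # missing value
    ("A2", "2023-01-01", 9.0),
    ("A2", "2023-01-01", -5.0),   # impossible reading, treated as sensor error
    ("A2", "2023-01-02", 11.0),
]

# Hypothetical second stream: body weight (kg) keyed by (animal_id, date).
weights = {
    ("A1", "2023-01-01"): 450.0,
    ("A2", "2023-01-01"): 430.0,
    ("A2", "2023-01-02"): 431.0,
}


def clean(records, low=0.0, high=50.0):
    """Drop missing readings and readings outside assumed plausible bounds."""
    return [r for r in records if r[2] is not None and low <= r[2] <= high]


def aggregate_daily(records):
    """Average all readings for each (animal_id, date) pair."""
    grouped = defaultdict(list)
    for animal, day, value in records:
        grouped[(animal, day)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


def integrate(daily_intake, weight_by_day):
    """Inner-join the two streams on (animal_id, date)."""
    shared = daily_intake.keys() & weight_by_day.keys()
    return {
        key: {"intake_kg": daily_intake[key], "weight_kg": weight_by_day[key]}
        for key in shared
    }


daily = aggregate_daily(clean(raw_intake))
merged = integrate(daily, weights)
```

In this sketch, `merged` holds one row per animal and day with both measurements, which is the kind of table that could then feed a machine learning model or a nutrition model. A real pipeline would likely rely on a data-frame library and on bounds chosen for the specific sensor.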