After trying Keras and PyTorch, we moved over to using RAPIDSAI as a software platform, and as a result model development moved from days per iteration to minutes<\/figcaption><\/figure>\nA closer look into the technology<\/h3>\n We started by installing TensorFlow on a local server and gathering data on a cattle disease of interest, bovine tuberculosis (bTB). TensorFlow is an open-source software library for machine learning and deep learning tasks, developed by Google. We chose it for its popularity and its focus on training deep neural networks, both in research and development and in production systems. It can run calculations on either the CPU or the GPU; the GPU-enabled version offered faster processing than the CPU version.<\/p>\n
After some time developing models on the local server, we decided to purchase an NVIDIA DGX Station with four Tesla V100 GPUs. The DGX provides plug-and-play AI supercomputing and data centre technology without requiring a data centre or any additional IT infrastructure. It also offered far higher deep learning training performance than CPU-based servers: our models ran ten or more times faster, which sped up model development. After also experimenting with Keras and PyTorch, we moved over to using RAPIDSAI as a software platform, and as a result model development moved from days per iteration to minutes.<\/p>\n
As our work with deep learning (DL) progressed, we realised that the DL process works much better with precisely labelled data. bTB often goes undetected until a cow fails a skin test, so for data recorded immediately before a failed test we could not be sure whether the cow was affected or unaffected. We therefore moved on to a trait where we could be more certain of the labels.<\/p>\n
We focused first on predicting pregnancy from milk mid-infrared (MIR) spectral data, since we had information that allowed us to be certain about each cow’s pregnancy status throughout her life. At any one milk recording, the cow is either pregnant or not, and we can limit our training data to records where that label is known. The commercial partner, NMR, was most interested in a change of state from pregnant to not pregnant, as this often happens later in lactation and is difficult and expensive for the farmer to manage in a herd context. Alerting the farmer to this change of state through routine milk recording would be of great help in deciding early what to do.<\/p>\n
We had three million spectral records on 697,671 cows from more than 4,000 milk-recorded herds. Using a convolutional neural network (CNN), we produced a model with an accuracy of 89%. The commercial partner considered this an acceptable prediction accuracy, and we are now conducting a field trial to work out the mechanism for deploying this new tool as a commercial service to dairy farmers in the UK. The current version of the model predicts pregnancy status with an accuracy of 94%.<\/p>\n
Once we had experience of DL and TensorFlow, we moved back to bTB. To overcome the problem of uncertain labels, we removed from the training dataset all spectral data recorded in the eight months immediately preceding a failed skin test, so that each cow in the dataset was labelled as either healthy or infected. The cows’ data from the eight-month period between being healthy and being infected was retained for later analysis.<\/p>\n
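The label-cleaning step described above can be sketched in a few lines of Python. This is an illustration only, not the code used in the project: the function names, the tuple layout, the approximation of eight months as 240 days, and the rule that records on or after the failed test date count as infected are all assumptions made for the example.

```python
from datetime import date, timedelta

# Assumption: "eight months" is approximated as 8 * 30 days.
UNCERTAIN_WINDOW = timedelta(days=8 * 30)


def label_record(record_date, failed_test_date):
    """Return 'healthy', 'infected' or 'uncertain' for one milk record.

    failed_test_date is None for cows that never failed a skin test.
    Assumption: records on or after the failed test count as infected.
    """
    if failed_test_date is None:
        return "healthy"
    if record_date >= failed_test_date:
        return "infected"
    if record_date >= failed_test_date - UNCERTAIN_WINDOW:
        return "uncertain"
    return "healthy"


def split_records(records):
    """Split (cow_id, record_date, failed_test_date) tuples into a training
    set with firm labels and a held-back set from the uncertain window."""
    training, held_back = [], []
    for cow_id, record_date, failed_test_date in records:
        label = label_record(record_date, failed_test_date)
        if label == "uncertain":
            held_back.append((cow_id, record_date))
        else:
            training.append((cow_id, record_date, label))
    return training, held_back
```

Holding the uncertain records back, rather than discarding them, mirrors the decision to keep the eight-month data for later analysis.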
There were more than 230,000 cows from around 3,000 herds, of which 8,591 cows were infected. The incidence of the disease nationally is around 7%, so the data was very unbalanced between the affected and unaffected classes. Rather than reduce the unaffected dataset to the size of the affected one, we synthesised new spectral data for the affected group using SMOTE (Synthetic Minority Over-sampling Technique). We then used transfer learning, a machine learning technique in which a model already trained for one task is repurposed for a new and different task.<\/p>\n
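To show the idea behind SMOTE, here is a minimal sketch in plain Python. It is not the implementation used in this work: the function names and parameters are hypothetical, and the toy two-value "spectra" stand in for real MIR records. Each synthetic sample is placed at a random point on the line between a minority-class sample and one of its nearest minority-class neighbours.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic new samples by interpolating between minority
    samples and their k nearest minority neighbours (basic SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base_index = rng.randrange(len(minority))
        base = minority[base_index]
        # Rank every other minority sample by distance to the base sample.
        others = [p for i, p in enumerate(minority) if i != base_index]
        others.sort(key=lambda p: squared_distance(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy example: four "affected" records with two features each.
affected = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
extra = smote(affected, n_synthetic=10, k=2)
```

In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would likely be preferable to hand-written code.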
The internal prediction accuracy of this model was 95%. However, the more interesting and useful metric was the sensitivity of the prediction. bTB testing has a sensitivity of around 80%, meaning that some animals that pass the test can still be covertly infected and return to the herd. These cows are thought to be partly responsible for the continuing incidence of bTB in UK herds. The sensitivity and specificity of our model were 96% and 94%, respectively. A high sensitivity means that more of the cows that will eventually fail a skin test are flagged much earlier, allowing them to be removed before they have a chance to infect more cows in the herd.<\/p>\n
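For readers less familiar with these metrics, the short Python sketch below shows how sensitivity and specificity are computed from true and predicted labels. The function name and the toy labels are made up for illustration and are not the project's data; 1 denotes an infected cow and 0 a healthy one.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = infected)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # infected, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # infected, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged
    sensitivity = tp / (tp + fn)  # share of infected cows that are detected
    specificity = tn / (tn + fp)  # share of healthy cows that are cleared
    return sensitivity, specificity


# Toy labels: four infected cows and six healthy cows.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

With unbalanced classes, overall accuracy can look high even when many infected animals are missed, which is why sensitivity is reported separately.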
What this means for the wider industry<\/h3>\n These predictions are now being field trialled with interested farmers and vets, who are analysing our predictions in detail at farm level and using other tests to confirm them. Once these field trials are complete, the next step is to work out how to deploy predictions in a working system. Model development is compute-intensive, but generating predictions from a trained model is not, and we are investigating the use of an NVIDIA Jetson device to run predictions locally.<\/p>\n
Both of the traits described here are important to dairy farmers. EGENES produces breeding values for fertility and TB resistance that allow farmers to select animals with favourable genetics for these traits. Both are based on field data and lose accuracy because of the imprecise nature of the phenotype. Predicting these traits from milk spectra offers the opportunity to increase both the volume and the precision of the collected data, thereby making genetic measures more useful. Genomic breeding values based on DNA are widely used by the dairy industry, and they stand to benefit especially from more precise data.<\/p>\n
To our knowledge, this work is the first application of DL to animal data in agriculture. Initial results are encouraging enough to lead us to consider other traits that might benefit from this approach (e.g. Johne’s disease, BVD, immune status and minerals). Furthermore, as measures of methane emissions from dairy cows become more widespread and available, it may be possible to estimate the amount of methane produced by a cow from milk MIR. The automated prediction of economically and socially important traits at routine milk recording could provide farmers with a robust tool, enabling them to make early management decisions and reduce their environmental impact. Such tools would have the added benefit of providing an effective enabling service, giving farmers the ability to take ownership of the health and fertility of their herd. Wider applications offer many opportunities to gain new scientific insight, from analysis of CT scans and hyperspectral data from cameras to predicting meat yield and quality and using images to better record food sensory data. The use of drones to carry hyperspectral cameras, for example, would also enable the collection of field and plant data at high volumes and at a relatively low cost. Machine learning in agriculture is already opening new avenues in the industry and looks set to open up many more.<\/p>\n
Please note, this article also appears in the fifth edition of our <\/strong><\/em>quarterly publication<\/em><\/strong><\/a>. <\/strong><\/em><\/p>\nMike Coffey, researcher at Scotland’s Rural College, explains how machine learning in agriculture is being used to transform the industry.<\/p>\n
The application of machine learning in agriculture<\/title>\n