PetCaseFinder

Peer-reviewed veterinary case report

Machine learning-based prediction of clinical mastitis in dairy cows: A comparative analysis of 9 algorithms using production and management data.

Journal: Journal of Dairy Science
Year: 2026
Authors: Liu, Chengyuan et al.
Affiliation: College of Veterinary Medicine · China

Abstract

Mastitis represents one of the most formidable challenges in modern dairy farming, posing significant threats to individual cow health and causing substantial economic losses throughout the dairy production chain. Traditional disease diagnosis methods are often reactive and costly, creating an urgent need for advanced predictive technologies. To address these issues, we proposed a novel machine learning-based mastitis prediction system that breaks through conventional diagnostic paradigms by deeply integrating data science with veterinary medicine. We analyzed 177,493 dairy cow records from a large-scale commercial dairy farm in China, implementing 9 distinct machine learning algorithms for model development and evaluation: random forest, multilayer perceptron, support vector machine, decision tree, gradient boosting, adaptive boosting (AdaBoost), linear discriminant analysis, logistic regression, and naive Bayes. The dataset included comprehensive production metrics, physiological parameters, and management variables, with models trained on both standardized and nonstandardized datasets using rigorous cross-validation techniques. Random forest demonstrated superior predictive performance across all evaluation approaches. On synthetic minority oversampling technique (SMOTE)-balanced cross-validation, random forest achieved an F-score of 0.804 and an area under the receiver operating characteristic curve (AUC) of 0.884 (95% CI: 0.883-0.885). When evaluated using the methodologically rigorous approach with SMOTE applied within each cross-validation fold, the model achieved an F-score of 0.542 and an AUC of 0.773, closely matching the held-out test-set results (F-score: 0.558, AUC: 0.785). This consistency between cross-validation and test-set performance validates our evaluation methodology and demonstrates robust model generalization under realistic class imbalance conditions.
Feature importance analysis revealed month_age as the most critical predictor (relative importance: 0.165), followed by milk_yield (0.138), protein_percentage (0.138), and fat_percentage (0.135). The Z-standardization consistently enhanced model performance across all algorithms, with random forest maintaining optimal calibration between predicted probabilities and actual outcomes. Leveraging the predictive model, we quantified key risk factors and developed an early warning system capable of identifying high-risk animals, providing a robust foundation for precision dairy farming and improved mastitis management strategies. This research successfully develops a data-driven technological solution that substantially reduces disease spread risk and economic losses while driving the transformation of dairy farming toward digitalization and precision management.
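
The abstract contrasts cross-validation on SMOTE-balanced data with SMOTE applied inside each cross-validation fold. The following is a minimal sketch of the second approach, not the authors' code. It uses only the Python standard library, and every function name is hypothetical. The oversampler is a simplified stand-in that interpolates between random pairs of minority rows; real SMOTE interpolates toward k-nearest minority neighbours. The point it illustrates is that synthetic rows are generated from the training part of each fold only, so each validation fold keeps its natural class imbalance.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Split row indices into k folds with roughly equal class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, label in enumerate(labels) if label == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def oversample_minority(X, y, seed=0):
    """SMOTE-like stand-in (assumption): add interpolated minority rows
    until the minority class is as large as the majority class."""
    rng = random.Random(seed)
    counts = {c: y.count(c) for c in set(y)}
    minority = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority]
    pool = [row for row, label in zip(X, y) if label == minority]
    X_new, y_new = list(X), list(y)
    for _ in range(deficit):
        a, b = rng.choice(pool), rng.choice(pool)
        t = rng.random()
        # New point lies on the segment between two minority rows.
        X_new.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
        y_new.append(minority)
    return X_new, y_new


def folds_with_inner_oversampling(X, y, k=5, seed=0):
    """Yield (X_train, y_train, X_val, y_val) per fold, oversampling the
    training part only; validation rows are never resampled."""
    folds = stratified_folds(y, k, seed)
    for f, val_idx in enumerate(folds):
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        X_bal, y_bal = oversample_minority(X_tr, y_tr, seed + f)
        yield X_bal, y_bal, [X[i] for i in val_idx], [y[i] for i in val_idx]
```

Oversampling the whole dataset before splitting lets synthetic rows derived from a validation cow's record leak into training, which is one plausible reason the SMOTE-balanced scores in the abstract are higher than the within-fold and test-set scores.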

Original publication: https://pubmed.ncbi.nlm.nih.gov/41397607/