Beyond classical statistics – data fusion coupled with pattern recognition

AIM: Patterns in data obtained from chemical and sensory evaluations of wine are difficult to infer using classical statistics. Coupling data fusion with machine learning techniques can resolve such patterns and may lead to new hypotheses. This study demonstrates the applicability of two pattern recognition approaches using a case study involving Chenin Blanc wines (recently bottled and after two years of storage) from young (35 years) vines.

METHODS: Sensory (sorting (Mafata et al. 2020)) and chemical (NMR: nuclear magnetic resonance; HRMS: high-resolution mass spectrometry; UV-Vis: ultraviolet-visible spectrophotometry) data were collected for the young and aged (two years in the bottle) wines. The data sets were combined using multiple factor analysis (MFA). Exploratory unsupervised cluster analysis was performed by agglomerative hierarchical clustering (AHC) and Fuzzy-k means (Bezdek 1981). Clustering conditions were optimized for both methods, and the cophenetic coefficient was used to identify the more confident clustering method.

RESULTS: Because large data sets were fused, the models were very complex. No consistent clustering patterns emerged when the clustering conditions were varied, signalling high similarity between samples. The samples could not be confidently distinguished from one another even under the optimized conditions. Although Fuzzy-k means gave more confident clustering, it was still not sufficient to solve the classification problem in this sample set.

CONCLUSIONS: Fuzzy-k means was better at resolving the natural grouping of samples. Coupled with data fusion, it could potentially lead to better pattern recognition, especially for oenological chemical and sensory data. The fuzzy approach should be explored further, keeping in mind that it is more sensitive to small differences in the data than classical statistics.
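ILLUSTRATION: The workflow described in the methods (block fusion, hierarchical clustering scored by the cophenetic coefficient, and fuzzy clustering) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names (mfa_fuse, best_linkage, fuzzy_cmeans), the linkage methods compared, the fuzzifier m = 2 and the random stand-in data are all hypothetical. The fusion step reproduces only the MFA block weighting (each standardized block divided by its first singular value) and omits the global PCA that a full MFA performs. The fuzzy step uses the standard fuzzy c-means updates associated with Bezdek (1981), assuming this is the algorithm the abstract calls Fuzzy-k means.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist


def mfa_fuse(blocks):
    """MFA-style weighting: standardize each block, then divide it by its
    first singular value so that no single block dominates the fused table."""
    weighted = []
    for X in blocks:
        Z = (X - X.mean(axis=0)) / X.std(axis=0)  # assumes no constant columns
        s1 = np.linalg.svd(Z, compute_uv=False)[0]
        weighted.append(Z / s1)
    return np.hstack(weighted)


def best_linkage(X, methods=("single", "complete", "average", "ward")):
    """Score each AHC linkage by its cophenetic correlation coefficient."""
    d = pdist(X)  # Euclidean distances between samples
    scores = {m: cophenet(linkage(X, method=m), d)[0] for m in methods}
    return max(scores, key=scores.get), scores


def fuzzy_cmeans(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns memberships U (n x c) and the
    cluster centers from the last update step."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=X.shape[0])  # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        inv = np.maximum(dist, 1e-12) ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


# Random stand-ins for three data blocks measured on the same 12 samples.
rng = np.random.default_rng(42)
blocks = [rng.normal(size=(12, p)) for p in (20, 50, 8)]
fused = mfa_fuse(blocks)
method, scores = best_linkage(fused)
U, centers = fuzzy_cmeans(fused, c=3)
print(method, scores)
print(U.round(2))
```

Memberships close to 1/c across all clusters for most samples would be consistent with the high between-sample similarity reported in the results, whereas memberships close to 0 or 1 would indicate well-separated groups.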

Authors: Mpho MAFATA (1, 2), Jeanne BRAND (1), Astrid BUICA (1)
(1) South African Grape and Wine Research Institute, Department of Viticulture and Oenology, Stellenbosch University, South Africa
(2) School for Data Science and Computational Thinking, Stellenbosch University, South Africa

Email: mafata@sun.ac.za

Keywords: data fusion, pattern recognition, machine learning, artificial intelligence, multiple factor analysis, fuzzy-k means, cluster analysis
