IVES Conference Series, Macrowine 2021

Fluorescence spectroscopy with xgboost discriminant analysis for intraregional wine authentication

Abstract

AIM: This study aimed to use simultaneous measurements of absorbance, transmittance, and fluorescence excitation-emission matrix (A-TEEM) combined with chemometrics as a rapid method to authenticate wines from three vintages within a single geographical indication (GI) according to their subregional variations.

METHODS: The A-TEEM technique (Gilmore, Akaji, & Csatorday, 2017) was applied to analyse experimental Shiraz wines (n = 186) from six subregions of the Barossa Valley, South Australia, from the 2018, 2019 and 2020 vintages. Absorbance spectra and EEM fingerprints of the wines were recorded, and the data were fused for multivariate statistical modelling with extreme gradient boost discriminant analysis (XGBDA), as reported by Ranaweera, Gilmore, Capone, Bastian, and Jeffery (2021), to classify the wines according to subregion. The class score probabilities from the cross-validated (k = 10, Venetian blinds) confusion matrix were used to assess the accuracy of the classification models. A similar procedure was carried out to discriminate subregions within a single vintage year. Basic chemical parameters (alcohol % v/v, pH, titratable acidity, and volatile acidity) were modelled by partial least squares regression (PLSR) using A-TEEM data and reference chemical data.
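The Venetian-blinds cross-validation mentioned above can be illustrated with a minimal sketch. It assumes the simplest form of the scheme, in which every k-th sample goes to the same test fold (a blind thickness of one); the function name `venetian_blind_folds` is hypothetical, and the abstract does not say how the samples were ordered or which software settings were used.

```python
def venetian_blind_folds(n_samples, k=10):
    """Return k (train_idx, test_idx) pairs using Venetian-blinds splitting.

    Assumption: sample i is assigned to test fold i % k, so each fold
    takes every k-th sample from the ordered data set.
    """
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    folds = []
    for f in range(k):
        test_idx = list(range(f, n_samples, k))
        held_out = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held_out]
        folds.append((train_idx, test_idx))
    return folds


# 186 wines and k = 10, as stated in the abstract.
folds = venetian_blind_folds(n_samples=186, k=10)
fold_sizes = [len(test) for _, test in folds]
```

Because the assignment is deterministic, the result depends on sample order; if wines from one subregion are stored contiguously, this interleaving spreads each subregion across all folds.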

RESULTS: The models gave an unprecedented 100% correct classification of wines according to subregion across the three vintages and 98% accuracy for subregion in a single vintage year. Other performance parameters derived from the confusion matrix, including sensitivity, specificity, precision, and F1 score, also reached the highest value (1.0) for each of the subregions. PLSR modelling revealed that A-TEEM data can also be used for rapid assessment of basic wine chemical parameters. Notably, the results confirmed a distinct resolution among subregions despite their close proximity within a single GI, indicating the effect of terroir on intraregional variation.
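As a rough illustration of how the per-class sensitivity, specificity, precision, and F1 score follow from a confusion matrix, the sketch below uses the standard one-vs-rest definitions. The function name `per_class_metrics` and the small count matrices are hypothetical and are not the study's data.

```python
def per_class_metrics(cm):
    """Per-class metrics from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    metrics = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # class c predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp  # other classes predicted as c
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        metrics.append({
            "sensitivity": sensitivity,
            "specificity": specificity,
            "precision": precision,
            "f1": f1,
        })
    return metrics


# Hypothetical counts for three classes: a perfect classifier and one error.
perfect = per_class_metrics([[5, 0, 0], [0, 4, 0], [0, 0, 6]])
one_error = per_class_metrics([[4, 1, 0], [0, 4, 0], [0, 0, 6]])
```

With a purely diagonal matrix every metric equals 1.0 for every class, which is the situation the abstract reports for the three-vintage model.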

CONCLUSIONS: The sensitivity of A-TEEM allied with multivariate statistical analysis of fluorescence data facilitated the accurate classification of Shiraz wines according to the subregion of origin and production year. As a robust analytical method, A-TEEM can help identify the drivers of regional expression of wine and can potentially be developed for use within the supply chain to guarantee the provenance indicated on the label and to provide an assurance of quality. Overall, A-TEEM with XGBDA modelling continues to prove an accurate wine authentication tool that could even be applied at a subregional level.

Publication date: September 7, 2021

Issue: Macrowine 2021

Type: Article

Authors

Ruchira Ranaweera, Department of Wine Science, The University of Adelaide, South Australia, Australia
Adam GILMORE, Horiba Instruments Inc., Piscataway, New Jersey, USA
Dimitra CAPONE, The Australian Research Council Training Centre for Innovative Wine Production, The University of Adelaide
Susan BASTIAN, The Australian Research Council Training Centre for Innovative Wine Production, The University of Adelaide
David JEFFERY, The Australian Research Council Training Centre for Innovative Wine Production, The University of Adelaide

Keywords

geographical indication, authenticity, subregion, excitation-emission matrix, chemometrics, terroir

Related articles…

Differential responses of red and white grape cultivars trained to a single trellis system – the VSP

Commercial grape production relies on training grapevine cultivars onto a variety of trellis systems. Training allows for well-lit leaves and clusters, maximizing fruit quality in addition to facilitating cultivation, harvesting, and disease control. Although grapevines can be trained onto an infinite variety of trellis systems, most red and white cultivars are trained to the standard VSP (Vertical Shoot Positioning) system. However, red and white cultivars respond differently to VSP in fruit composition and growth characteristics, and these responses are not yet fully understood. Therefore, the objective of this study was to examine the influence of the VSP trellis system on fruit composition of three red cultivars (Cabernet Sauvignon, Merlot, and Syrah) and three white cultivars (Chardonnay, Riesling, and Gewurztraminer) grown under uniform conditions in the same vineyard. All cultivars were monitored for maturity and harvested at their maximum physiologically possible sugar concentration to compare fruit quality attributes such as Brix, pH, TA, malic and tartaric acids, glucose and fructose, potassium, YAN, and phenolic compounds including total anthocyanins, anthocyanin profile, and tannins. A distinct pattern in fruit composition was observed in each cultivar. With regard to growth characteristics, Syrah grew vigorously and had the highest cluster weight. Although all cultivars developed pyriform seeds, seed size and weight varied among cultivars, as did mesocarp cell viability, brush morphology, and cane structure. This knowledge of canopy architectural characteristics, assessed through widely used fruit compositional attributes and growth characteristics, will help growers manage vines better in varied situations.

Comparison of imputation methods in long and varied phenological series. Application to the Conegliano dataset, including observations from 1964 over 400 grape varieties

A large varietal collection of over 1700 varieties has been maintained in Conegliano, Italy, since the 1950s. Phenological data on a subset of 400 grape varieties, including wine grapes, table grapes, and raisin grapes, have been acquired at budbreak, flowering, veraison, and ripening since 1964. Despite the efforts in maintaining and acquiring data over such an extensive collection, the dataset has varying degrees of missing cases depending on the variety and the year. Such gaps are ubiquitous in large, long-running phenology datasets. In this work, we evaluated four state-of-the-art methods to estimate missing values in this phenological series: k-Nearest Neighbour (kNN), Multivariate Imputation by Chained Equations (mice), MissForest, and Bidirectional Recurrent Imputation for Time Series (BRITS). For each phenological stage, we evaluated the performance of the methods in two ways. 1) On the full dataset, we randomly held out 10% of the true values for use as a test set and repeated the process 1000 times (Monte Carlo cross-validation). 2) On a reduced and almost complete subset of varieties, we varied the percentage of missing values from 10% to 70% by random deletion. In all cases, we evaluated the performance on the original values using normalized root mean squared error. For the full dataset we also obtained performance statistics by variety and by year. MissForest provided average errors of 17% (3 days) at budbreak, 14% (4 days) at flowering, 14.5% (7 days) at veraison, and 17% (3 days) at maturity. We completed the imputations of the Conegliano dataset, one of the world’s most extensive and varied phenological time series and a stepping stone for future climate change studies in grapes. The dataset is now ready for further analysis, and a rigorous evaluation of imputation errors is included.
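The Monte Carlo hold-out evaluation described above can be sketched as follows. This is a simplified illustration under stated assumptions: the data are reduced to a single one-dimensional series, the imputer is a plain mean fill standing in for the four methods named in the text, and the error is normalized by the standard deviation of the full series, which is one common convention and not necessarily the one the authors used. All names and the example dates are hypothetical.

```python
import math
import random
import statistics


def nrmse(truth, preds, scale):
    """Root mean squared error divided by `scale` (assumed normalisation)."""
    mse = sum((t - p) ** 2 for t, p in zip(truth, preds)) / len(truth)
    return math.sqrt(mse) / scale


def mean_impute(observed, n_missing):
    """Stand-in imputer: fill every gap with the mean of the observed values."""
    return [statistics.mean(observed)] * n_missing


def monte_carlo_holdout(values, impute, frac=0.10, repeats=1000, seed=0):
    """Average NRMSE over repeated random hold-outs of a fraction of the values."""
    rng = random.Random(seed)
    n_hold = max(1, round(frac * len(values)))
    scale = statistics.pstdev(values)  # spread of the full series (assumption)
    scores = []
    for _ in range(repeats):
        held = rng.sample(range(len(values)), n_hold)
        held_set = set(held)
        observed = [v for i, v in enumerate(values) if i not in held_set]
        truth = [values[i] for i in held]
        scores.append(nrmse(truth, impute(observed, n_hold), scale))
    return statistics.mean(scores)


# Hypothetical budbreak dates (day of year) for one variety over 20 seasons.
doy = [98, 102, 95, 107, 101, 99, 110, 96, 104, 100,
       93, 105, 97, 108, 103, 94, 106, 92, 109, 111]
score = monte_carlo_holdout(doy, mean_impute)
```

Fixing the seed makes the repeated hold-outs reproducible, so different imputers passed as `impute` are scored on the same random splits.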

Climate, Viticulture, and Wine … my how things have changed!

The planet is warmer than at any time in our recorded past, and increasing greenhouse emissions and persistence in the climate system mean that continued warming is highly likely. Climate change has already altered the basic framework of growing grapes for wine production worldwide and will likely continue to do so for years to come. The wine sector can continue to play an important role in leading the agricultural sector in addressing climate change. From developing on…

Biodiversity in the vineyard agroecosystem: exploring systemic approaches

Biodiversity conservation and restoration are essential to guarantee the provision of ecosystem services associated with the vineyard agroecosystem, such as climate regulation through carbon sequestration and control of pests and diseases. Most published research dealing with the complexity of vineyard agroecosystems emphasizes the need for innovative approaches, including the integration of information at different temporal and spatial scales and the development of systemic analysis based on modelling. A biodiversity survey was conducted in the Franciacorta wine-growing area (Lombardy, Italy), one of the most important Italian wine-growing regions for sparkling wine production, covering a portion of the territory of 112 ha. The area was divided into several Environmental Units (EUs), each defined as a whole vineyard or portion of a vineyard that is homogeneous in terms of four agronomic characteristics: planting year, planting density, cultivar, and training system. In each EU a set of compartments was identified and characterised by specific variables. The compartments are meteorology, morphology (altitude, slope, aspect, row orientation, and solar irradiance), ecological infrastructures, and management. The landscape surrounding each EU was also characterised in terms of land use in a buffer zone of 500 m. For each component a specific methodology was identified and applied. Different statistical approaches were used to evaluate how to integrate the information related to the different compartments within the EU and to the buffer zone. These approaches were also preliminarily evaluated for their ability to describe the contribution of biodiversity and landscape components to ecosystem services. This methodological exploration provides useful indications for the development of a fully systemic approach to structural and functional biodiversity in vineyard agroecosystems, helping to promote a multifunctional perspective for the whole wine-growing sector.

A better understanding of the climate effect on anthocyanin accumulation in grapes using a machine learning approach

Current climate change is directly threatening the balance of the vineyard at harvest time. The maturation period of the grapes is shifting to the middle of summer, when radiation and air temperature are at their maximum. In this context, the implementation of corrective practices becomes problematic. Unfortunately, our knowledge of the effect of climate on the quality of different grape varieties remains too incomplete to guide these choices. During the Innovine project, original experiments were carried out on Syrah to study the combined effects of normal or high air temperature and varying degrees of exposure of the berries to the sun. Berries subjected to these different conditions were sampled and analyzed throughout the maturation period. Several quality characteristics were determined, including anthocyanin content. The objective of the experiments was to investigate which climatic determinants were most important for anthocyanin accumulation in the berries. Temperature and irradiance data observed over time with a very fine discretization step are called functional data in statistics. We developed the procedure SpiceFP (Sparse and Structured Procedure to Identify Combined Effects of Functional Predictors) to explain the variations of a scalar response variable (a grape berry quality variable, for example) by two or three functional predictors (such as temperature and irradiance) in a context of joint influence of these predictors. Particular attention was paid to the interpretability of the results. Analysis of the data using SpiceFP identified a negative impact of morning combinations of low irradiance (lower than about 100 μmol m⁻² s⁻¹ or 45 μmol m⁻² s⁻¹ depending on the advanced-delayed state of the berries) and high temperature (higher than 25 °C). A slight difference associated with overnight temperature occurred between these effects identified in the morning.