Macrowine 2021

Fluorescence spectroscopy with xgboost discriminant analysis for intraregional wine authentication

Abstract

AIM: This study aimed to use simultaneous measurements of absorbance, transmittance, and fluorescence excitation-emission matrix (A-TEEM) combined with chemometrics as a rapid method to authenticate wines from three vintages within a single geographical indication (GI) according to their subregional variations.

METHODS: The A-TEEM technique (Gilmore, Akaji, & Csatorday, 2017) was applied to analyse experimental Shiraz wines (n = 186) from six subregions of the Barossa Valley, South Australia, from the 2018, 2019 and 2020 vintages. Absorbance spectra and EEM fingerprints of the wines were recorded, and the data were fused for multivariate statistical modelling with extreme gradient boost discriminant analysis (XGBDA), as reported by Ranaweera, Gilmore, Capone, Bastian, and Jeffery (2021), to classify the wines according to subregion. The cross-validated (k = 10, Venetian blinds) confusion matrix score probabilities of the classes were used to assess the accuracy of the classification models. A similar procedure was carried out to discriminate subregions within a single vintage year. Basic chemical parameters (alcohol % v/v, pH, titratable acidity, and volatile acidity) were modelled by partial least squares regression (PLSR) using A-TEEM data and reference chemical data.
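The Venetian blinds cross-validation mentioned above can be illustrated with a minimal sketch. It assumes the common convention of one sample per blind, so that sample i is held out in fold i mod k. The function name `venetian_blind_folds` is hypothetical, and the actual sample ordering and software settings used in the study are not stated in the abstract.

```python
def venetian_blind_folds(n_samples, k=10):
    """Return k (train_idx, test_idx) pairs.

    Assumption: one sample per blind, so sample i is held out
    in fold i % k (every k-th sample forms one test set).
    """
    folds = []
    for fold in range(k):
        test_idx = [i for i in range(n_samples) if i % k == fold]
        train_idx = [i for i in range(n_samples) if i % k != fold]
        folds.append((train_idx, test_idx))
    return folds


# 186 samples and k = 10 are the values given in the abstract.
folds = venetian_blind_folds(n_samples=186, k=10)
fold_sizes = [len(test_idx) for _, test_idx in folds]
```

With these values each sample is held out exactly once. Because the splits are interleaved rather than random, the result depends on how the samples are ordered in the data table.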

RESULTS: The models achieved an unprecedented 100% correct classification of wines according to subregion across the three vintages, and 98% accuracy for subregion within a single vintage year. Other model performance parameters derived from the confusion matrix, including sensitivity, specificity, precision, and F1 score, also reached the highest value (1.0) for each of the subregions. PLSR modelling revealed that A-TEEM data can also be used for rapid assessment of basic wine chemical parameters. Notably, the results confirmed a distinct resolution among subregions despite their relatively close proximity within a single GI, indicating the effect of terroir on intraregional variation.
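As a minimal sketch of how the performance parameters named above follow from a multiclass confusion matrix, the snippet below computes them one class against the rest. The function name `per_class_metrics` and both example matrices are hypothetical illustrations, not data from the study. Rows are assumed to be true classes and columns predicted classes.

```python
def per_class_metrics(cm):
    """Compute one-vs-rest metrics for each class of a square confusion matrix.

    Assumption: cm[r][c] counts samples of true class r predicted as class c.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # missed members of class c
        fp = sum(cm[r][c] for r in range(n)) - tp     # others predicted as c
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results.append({
            "sensitivity": sensitivity,
            "specificity": specificity,
            "precision": precision,
            "f1": f1,
        })
    return results


# Hypothetical 3-class matrices (not the study's data).
perfect = per_class_metrics([[5, 0, 0], [0, 5, 0], [0, 0, 5]])
one_error = per_class_metrics([[5, 0, 0], [1, 4, 0], [0, 0, 5]])
```

A purely diagonal matrix gives 1.0 for every metric in every class. A single misclassification lowers sensitivity for the true class and precision for the predicted class.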

CONCLUSIONS: The sensitivity of A-TEEM allied with multivariate statistical analysis of fluorescence data facilitated the accurate classification of Shiraz wines according to the subregion of origin and production year. As a robust analytical method, A-TEEM can help identify the drivers of regional expression of wine and can potentially be developed for use within the supply chain to guarantee the provenance indicated on the label and to provide an assurance of quality. Overall, A-TEEM with XGBDA modelling continues to be shown as an accurate wine authentication tool that could even be applied at a subregional level.

DOI:

Publication date: September 7, 2021

Issue: Macrowine 2021

Type: Article

Authors

Ruchira Ranaweera, Department of Wine Science, The University of Adelaide, South Australia, Australia

Adam GILMORE, Horiba Instruments Inc., Piscataway, New Jersey, USA

Dimitra CAPONE, The Australian Research Council Training Centre for Innovative Wine Production, The University of Adelaide

Susan BASTIAN, The Australian Research Council Training Centre for Innovative Wine Production, The University of Adelaide

David JEFFERY, The Australian Research Council Training Centre for Innovative Wine Production, The University of Adelaide


Keywords

geographical indication, authenticity, subregion, excitation-emission matrix, chemometrics, terroir

