Macrowine 2021
IVES Conference Series

Beyond classical statistics – data fusion coupled with pattern recognition

Abstract

AIM: Patterns in data obtained from wine chemical and sensory evaluations are difficult to infer using classical statistics. Such patterns can be resolved by coupling data fusion with machine learning techniques, possibly leading to new hypotheses. This study demonstrates the applicability of two pattern recognition approaches using a case study involving Chenin Blanc wines (recently bottled and after two years of storage) from young (35 years) vines.

METHODS: Sensory (sorting (Mafata et al. 2020)) and chemical (NMR: nuclear magnetic resonance, HRMS: high resolution mass spectrometry, and UV-Vis: ultraviolet-visible spectrophotometry) data were collected for the young and aged (two years in the bottle) wines. Data sets were combined using multiple factor analysis (MFA). Exploratory unsupervised cluster analysis was performed by agglomerative hierarchical clustering (AHC) and Fuzzy-k means (Bezdek 1981). Optimal cluster conditions were found for both methods, and the cophenetic coefficient was used to identify the more confident clustering method.
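As an illustration of the fuzzy clustering step named above, the following is a minimal sketch of the fuzzy clustering algorithm described by Bezdek (1981), written in plain Python. It is not the authors' implementation: the function names, the fuzzifier `m = 2`, the fixed iteration count, the starting centroids and the toy coordinates (standing in for sample scores on two fused MFA dimensions) are all assumptions made for the example.

```python
import math


def memberships_for(points, centers, m):
    """Membership of every point in every cluster; each row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for p in points:
        # Guard against a zero distance when a point sits on a centroid.
        dists = [max(math.dist(p, c), 1e-12) for c in centers]
        rows.append([
            1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
            for d_i in dists
        ])
    return rows


def fuzzy_c_means(points, centers, m=2.0, n_iter=50):
    """Alternate membership and centroid updates for a fixed number of steps.

    points  : list of equal-length tuples (hypothetical sample coordinates)
    centers : list of starting centroids (assumed, chosen by the user)
    """
    n_dims = len(points[0])
    for _ in range(n_iter):
        u = memberships_for(points, centers, m)
        new_centers = []
        for k in range(len(centers)):
            # Each centroid is a mean of the points weighted by u ** m.
            weights = [row[k] ** m for row in u]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(n_dims)
            ))
        centers = new_centers
    return centers, memberships_for(points, centers, m)


if __name__ == "__main__":
    # Hypothetical scores for six samples, not data from the study.
    toy = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
           (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
    final_centers, u = fuzzy_c_means(toy, [(0.0, 0.0), (5.0, 5.0)])
    for point, row in zip(toy, u):
        print(point, [round(v, 3) for v in row])
```

Unlike a hard partition, each sample receives a graded membership in every cluster, so samples that are highly similar to one another show up as rows with no dominant value.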

RESULTS: Since large data sets were fused, the models were very complex. There were no consistent clustering patterns when varying clustering conditions, signalling high similarity between samples. The samples could not confidently be distinguished from one another even at the highest optimized conditions. Although Fuzzy-k means gave more confident clustering, it was still not sufficient for solving classification issues in this sample set.

CONCLUSIONS: Fuzzy-k means was better at resolving the natural grouping of samples. Coupled to data fusion, it could potentially lead to better pattern recognition, especially for oenological chemical and sensory data. The fuzzy approach should be explored, keeping in mind it is more sensitive to small differences in the data compared to classical statistics.

DOI:

Publication date: September 7, 2021

Issue: Macrowine 2021

Type: Article

Authors

Mpho Mafata¹˒², Jeanne Brand¹, Astrid Buica¹

¹ South African Grape and Wine Research Institute, Department of Viticulture and Oenology, Stellenbosch University, South Africa
² School for Data Science and Computational Thinking, Stellenbosch University, South Africa

Keywords

data fusion, pattern recognition, machine learning, artificial intelligence, multiple factor analysis, fuzzy-k means, cluster analysis

Related articles…

Evolution of the amino acids content through grape ripening: Effect of foliar application of methyl jasmonate with or without urea

The parameters that determine grape quality, and therefore the optimal harvest time, vary during berry ripening and are affected by climate change, with the widely known problem of the gap between technological and phenolic maturities. However, there are few studies on its effect on grape nitrogen composition. For this reason, the use of an elicitor, methyl jasmonate (MeJ), alone or with urea, is proposed as a tool to reduce this climatic decoupling, making it possible to set the harvest time so as to achieve optimum grape quality. The aim was to study the effect of MeJ and MeJ+Urea foliar applications on the evolution of the amino acid content of Tempranillo throughout grape maturation. Three treatments were applied to the leaves at veraison and 7 days later: control (water), MeJ (10 mM) and MeJ+Urea (10 mM + 6 kg N/ha). Grape samples were taken at five stages of maturation: the day before the first application, the day before the second application, 15 days after the second application (pre-harvest), harvest day, and 15 days after harvest (post-harvest). Amino acids were analysed by HPLC. The evolution of amino acids was similar regardless of the treatment; however, foliar applications influenced the content of nitrogen compounds, i.e., the effect was quantitative rather than qualitative. Most of the amino acids reached their maximum concentration at pre-harvest, with higher values in grapes from the treatments than in the control. In general, no differences in grape amino acid content were observed between the MeJ and MeJ+Urea treatments. Foliar applications of MeJ and MeJ+Urea enhanced the grape amino acid content without affecting the amino acid profile, helping to optimize grape quality and to establish a more complete grape ripening standard. MeJ and MeJ+Urea foliar applications can therefore be a simple agronomic practice, with promising results for enhancing grape quality.

δ13C : A still underused indicator in precision viticulture  

The first demonstration of the interest of the carbon isotope composition of sugars in grapevine, as an integrated indicator of vineyard water status, dates back to 2000 (Gaudillère et al., 1999; Van Leeuwen et al., 2001). Thanks to the isotopic discrimination of carbon that takes place during plant photosynthesis under hydric stress conditions, it is possible to accurately estimate photosynthetic activity. Since then, δ13C has been applied successfully to zonation, terroir studies and vine physiology research, but it is still not widely used by viticulturists. This is surprising considering the impact of global warming on viticulture and the need to improve water management, both of which would justify widespread use of δ13C.
The lack of private laboratories offering the analysis, the cost of the technology, and the long analytical delays have been detrimental to its development. Some laboratories tried to overcome the analytical difficulties of isotopic analysis by using Fourier transform infrared (FTIR) spectroscopy as a fast and cheap alternative to the official OIV method (IRMS). These claimed FTIR models have never been published or peer reviewed and cannot be considered robust. In this work, thanks to the recent acquisition of IRMS technology, new and robust applications of δ13C for viticulture are proposed. These include using the analysis to separate parcels at harvest, the possibility of increasing the precision of hydric stress cartography, and the potential cost reduction compared with Scholander pressure bomb analysis.

Effects of graft quality on growth and grapevine-water relations

Climate change is challenging viticulture worldwide compromising its sustainability due to warmer temperatures and the increased frequency of extreme events. Grafting Vitis vinifera L.

Projected changes in vine phenology of two varieties with different thermal requirements cultivated in La Mancha DO (Spain) under climate change scenarios

The aim of this work was to analyze the variability in phenology of the Tempranillo and Chardonnay cultivars in relation to the climatic characteristics of the La Mancha Designation of Origin, and their potential changes under climate change scenarios. Phenological dates for budbreak, flowering, veraison and harvest were analyzed for the period 2000-2019. The weather conditions at a daily time scale, recorded during the same period, were also evaluated. The thermal requirements to reach each of these phenological stages were calculated and expressed as the GDD accumulated from DOY=60. Changes in phenology were projected for 2050 and 2070 taking into account those values and the projected temperatures and precipitation, simulated under two Representative Concentration Pathway (RCP) scenarios, RCP4.5 and RCP8.5, using an ensemble of models. The average phenological dates during the period under study were April 16th ± 6.6 days and April 5th ± 6.0 days for budbreak, May 31st ± 6.0 days and May 27th ± 5.3 days for flowering, July 26th ± 5.6 days and July 25th ± 5.8 days for veraison, and August 23rd ± 10.8 days and August 17th ± 9.0 days for harvest, for Tempranillo and Chardonnay respectively. The projected changes in temperature imply an average change in the maximum growing season (April-August) temperatures of 1.2 and 1.9°C by 2050, and 1.6 and 2.6°C by 2070, under the RCP4.5 and RCP8.5 scenarios, respectively. A reduction in precipitation is predicted, varying between 15% for 2050 under the RCP4.5 scenario and up to 30% by 2070 under RCP8.5. Under the RCP4.5 scenario, the advance of the phenological dates by 2050 could be 6, 7, 7 and 8 days for Tempranillo and 4, 6, 6 and 9 days for Chardonnay, for budbreak, flowering, veraison and harvest respectively. Under the RCP8.5 emission scenario, the advance could be up to 30% higher.
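The thermal-requirement calculation mentioned above (GDD accumulated from DOY=60) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the base temperature or the averaging rule, so the 10°C base, the (Tmin + Tmax) / 2 daily mean, the function names and the constant synthetic temperatures are all hypothetical.

```python
def accumulate_gdd(tmin, tmax, start_doy=60, base=10.0):
    """Cumulative growing degree days for each day of the year.

    tmin, tmax : daily minimum and maximum temperatures, index 0 = DOY 1
    start_doy  : first day that contributes (DOY=60, as in the abstract)
    base       : base temperature in degrees C (assumed, not from the source)
    """
    total = 0.0
    cumulative = []
    for doy, (lo, hi) in enumerate(zip(tmin, tmax), start=1):
        if doy >= start_doy:
            # Days with a mean below the base temperature add nothing.
            total += max(0.0, (lo + hi) / 2.0 - base)
        cumulative.append(total)
    return cumulative


def doy_reaching(cumulative, requirement):
    """First DOY on which the accumulated GDD meets the requirement."""
    for doy, value in enumerate(cumulative, start=1):
        if value >= requirement:
            return doy
    return None


if __name__ == "__main__":
    # Synthetic year with constant temperatures, purely for illustration.
    baseline = accumulate_gdd([10.0] * 365, [20.0] * 365)
    warmer = accumulate_gdd([12.0] * 365, [22.0] * 365)
    # A stage with a hypothetical requirement of 50 GDD is reached earlier
    # in the warmer series, which is how an advance in phenology is projected.
    print(doy_reaching(baseline, 50.0), doy_reaching(warmer, 50.0))
```

Applying the same requirement to a warmer temperature series and comparing the two dates gives the projected advance, in days, for that stage.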

Mesoclimate impact on Tannat in the Atlantic terroir of Uruguay

The study of climate is relevant as an element conditioning the typicity of a product, its quality and sustainability over the years. The grapevine development and growth and the final grape and wine composition are closely related to temperature, while climate components vary at mesoscale according to topography and/or proximity to large bodies of water. The objective of this work is to assess the mesoclimate of the Atlantic region of Uruguay and to determine the effect of topography and the ocean on temperature and consequently on Tannat grapevine behavior.

Grape must quality and mesoclimatic variability in Fruška Gora wine-growing region, Serbia

The Fruška Gora mountain is a traditional wine-growing region in Serbia situated in the Pannonian Basin. Owing to this position, the vicinity of the Danube River and its concave configuration, it is suitable for grape production. This paper analyses spatial variations in meteorological parameters and grape juice quality within the Fruška Gora wine region over three consecutive vintages (2018-2020). The examined period can be defined as warm with cool nights during September (AVG 18.9°C; GDD 1918°C; CI 12°CF) and with mesoclimatic variability present. The eastern part of the study area was somewhat drier and hotter than other parts of the region. Grape must samples (190 in total) of five cultivars (Cabernet-Sauvignon, Merlot, Chardonnay, Sauvignon blanc and Grašac (Welschriesling)) commonly grown across the region (19 sites) were analysed using Fourier Transform Infrared Technology (FTIR). Among all cultivars, Sauvignon blanc was harvested first, in the East area (DOY=246±5, GDD at harvest=1552±74, 22.2±0.7 °Brix), while the latest harvest was recorded for Cabernet-Sauvignon in the West (DOY=283±5, GDD at harvest=1936±187, 23.4±1.0 °Brix). Both the red and white cultivars had higher acidity and YAN in the grape must when the vines were grown in the North and East compared to the South and West areas. According to the PCA, Grašac showed the lowest variation in grape must chemical composition. Thus, the results confirm that Grašac is the most stable cultivar in Fruška Gora. All monitored cultivars reached technological fruit ripeness by the end of the growing season. However, it was difficult to reach full ripeness of the red cultivars, mostly because of the uncoupling of technological and phenolic ripeness. Thus, Cabernet-Sauvignon had higher variations in GDD sums at harvest compared to other cultivars, which probably increased variations in grape must quality.

First step in the preparation of a soil map of the Protected Designation of Origin Valdepeñas (Central, Spain)

This work is a first step towards a map of vineyard soils. The characterization of the soils of the Protected Designation of Origin (PDO) Valdepeñas will make it possible to group the studied profiles according to their physico-chemical characteristics and the concentrations of the most relevant chemical elements. 90 soil profiles were analysed throughout the territory, and the soils were sampled and described according to FAO (2006) and classified according to and Soil Taxonomy (2014). All samples were air dried and sieved, and some physico-chemical parameters were determined following standard protocols. Major and trace elements were also analysed by X-ray fluorescence. The statistical study was carried out using the SPSS program. Trend maps were made using the ArcGIS program. The studied soils have the following average properties: pH, 8.3; electrical conductivity, 0.20 dS/m (low); clay, 18.8% (medium) and CaCO3, 17.1% (high). The major elements of these soils are Si, followed by Ca and Al, with average contents of 203.7 g/kg, 105.5 g/kg and 74.0 g/kg respectively. In addition, 27 trace elements were studied. Among them, the average values of Ba (361.8 mg/kg), Sr (129.3 mg/kg), Rb (83.4 mg/kg), V (74.2 mg/kg) and Ce (70.6 mg/kg) stand out. The Ba, V and Ce values are higher, and the Sr and Rb values lower, than those found in the literature. The discriminant analysis shows a grouping percentage of 91%. The content of chemical elements together with the physico-chemical characteristics allows the soils to be grouped into 4 groups according to their order in the Soil Taxonomy classification; owing to the importance of the Calcisols in Castilla-La Mancha, it was decided to establish them as their own group even though they do not appear in the Soil Taxonomy classification.

Comparison of imputation methods in long and varied phenological series. Application to the Conegliano dataset, including observations from 1964 over 400 grape varieties

A large varietal collection including over 1700 varieties has been maintained in Conegliano, Italy, since the 1950s. Phenological data on a subset of 400 grape varieties including wine grapes, table grapes, and raisins have been acquired at bud break, flowering, veraison, and ripening since 1964. Despite the efforts in maintaining and acquiring data over such an extensive collection, the data set has varying degrees of missing cases depending on the variety and the year. This is ubiquitous in phenology datasets of significant size and length. In this work, we evaluated four state-of-the-art methods to estimate missing values in this phenological series: k-Nearest Neighbour (kNN), Multivariate Imputation by Chained Equations (mice), MissForest, and Bidirectional Recurrent Imputation for Time Series (BRITS). For each phenological stage, we evaluated the performance of the methods in two ways. 1) On the full dataset, we randomly held out 10% of the true values for use as a test set and repeated the process 1000 times (Monte Carlo cross-validation). 2) On a reduced and almost complete subset of varieties, we varied the percentage of missing values from 10% to 70% by random deletion. In all cases, we evaluated the performance on the original values using normalized root mean squared error. For the full dataset we also obtained performance statistics by variety and by year. MissForest provided average errors of 17% (3 days) at budbreak, 14% (4 days) at flowering, 14.5% (7 days) at veraison, and 17% (3 days) at maturity. We completed the imputations of the Conegliano dataset, one of the world’s most extensive and varied phenological time series and a stepping stone for future climate change studies in grapes. The dataset is now ready for further analysis, and a rigorous evaluation of imputation errors is included.
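The first evaluation scheme described above (repeated random hold-out scored by normalized RMSE) can be sketched in a few lines of Python. This is an assumption-laden illustration, not the authors' pipeline: the abstract does not say how the error is normalized, so dividing by the range of the observed values is a guess, and a simple mean-fill imputer stands in for kNN, mice, MissForest and BRITS. All names and the toy dates are hypothetical.

```python
import math
import random
import statistics


def nrmse(truth, preds, value_range):
    """RMSE divided by the range of the observed values (assumed choice)."""
    mse = sum((t - p) ** 2 for t, p in zip(truth, preds)) / len(truth)
    return math.sqrt(mse) / value_range


def mean_impute(masked):
    """Stand-in imputer: replace every missing value with the observed mean."""
    fill = statistics.mean(v for v in masked if v is not None)
    return [fill if v is None else v for v in masked]


def monte_carlo_holdout(values, impute, frac=0.10, repeats=1000, seed=0):
    """Average NRMSE over repeated random hold-outs of known values."""
    value_range = max(values) - min(values)
    if value_range == 0:
        raise ValueError("values must not all be equal")
    rng = random.Random(seed)
    n_hold = max(1, round(frac * len(values)))
    scores = []
    for _ in range(repeats):
        held = sorted(rng.sample(range(len(values)), n_hold))
        hidden = set(held)
        # Hide the held-out entries, impute, then score only those entries.
        masked = [None if i in hidden else v for i, v in enumerate(values)]
        filled = impute(masked)
        scores.append(nrmse([values[i] for i in held],
                            [filled[i] for i in held],
                            value_range))
    return statistics.mean(scores)


if __name__ == "__main__":
    # Hypothetical flowering dates (day of year) for 50 variety-years.
    dates = [150.0 + (i % 7) for i in range(50)]
    print(round(monte_carlo_holdout(dates, mean_impute), 3))
```

Because the held-out entries are known values, the error is measured against the truth; swapping `mean_impute` for another function with the same signature compares methods under identical splits when the seed is kept fixed.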

De novo Vitis champinii whole genome assembly allows rootstock-specific identification of potential candidate genes for drought and salt tolerance

Vitis champinii cultivars Ramsey and Dog-ridge are main choices of rootstock for adapting viticulture to semi-arid and arid regions thanks to their distinctive tolerance to drought and salinity. However, genetic studies on non-vinifera rootstocks have relied heavily on the grapevine (Vitis vinifera) reference genome, which has hindered the assessment of genetic variation between rootstock species and grapevines. In the present study, this limitation is addressed by introducing a de novo phased genome assembly and annotation of Vitis champinii. This new Vitis champinii genome was employed as a reference for mapping RNA-seq reads from the same species under drought and salt stresses, and for comparison the same reads were also mapped to the Vitis vinifera PN40024.V4 reference genome. A significant increase in alignment rate was gained when mapping Vitis champinii RNA-seq reads to its own genome, compared to the Vitis vinifera PN40024.V4 reference genome, thus revealing the expression levels of genes specific to Vitis champinii. Moreover, differences in coding sequences were observed in orthologous genes between Vitis champinii and Vitis vinifera, which challenges previous differential expression analyses performed between contrasting Vitis genotypes on the same gene from the Vitis vinifera genome. Genes with possible implications in drought and salt tolerance have been identified across the genome of Vitis champinii, and the same genomic data can potentially guide the discovery of candidate genes specific to Vitis champinii for other traits of interest, making it a valuable resource for rootstock breeding designs, especially in view of increased drought and salinity due to climate change.