Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics, environmental analysis and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment can be prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The novel automated approach comprises segmentation of chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient-boosted tree classification on the decomposed tensor [1, 2].
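The first step of this strategy, segmentation of chromatograms along the retention time axis, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `segment_bounds` and `segment_chromatograms`, the fixed window `width`, and the nested-list data layout (samples × scans × m/z channels) are all assumptions made for this example.

```python
from typing import List, Tuple

Scan = List[float]          # intensities over m/z channels for one scan
Chromatogram = List[Scan]   # scans ordered along the retention time axis


def segment_bounds(n_scans: int, width: int) -> List[Tuple[int, int]]:
    """Return (start, stop) scan-index pairs that tile the range 0..n_scans.

    A fixed window width is an assumption of this sketch. The last window
    is shorter when n_scans is not a multiple of width.
    """
    if n_scans < 0 or width <= 0:
        raise ValueError("n_scans must be >= 0 and width must be > 0")
    return [(start, min(start + width, n_scans))
            for start in range(0, n_scans, width)]


def segment_chromatograms(samples: List[Chromatogram],
                          width: int) -> List[List[Chromatogram]]:
    """Cut every sample at the same scan indices.

    Returns one entry per retention time window; each entry holds the
    corresponding slice of every sample (samples x scans x m/z channels).
    """
    if not samples:
        return []
    n_scans = len(samples[0])
    if any(len(sample) != n_scans for sample in samples):
        raise ValueError("all samples must have the same number of scans")
    return [[sample[start:stop] for sample in samples]
            for start, stop in segment_bounds(n_scans, width)]


if __name__ == "__main__":
    # Two hypothetical samples with 5 scans and 2 m/z channels each.
    samples = [
        [[float(10 * s + c) for c in range(2)] for s in range(5)],
        [[float(100 + 10 * s + c) for c in range(2)] for s in range(5)],
    ]
    for i, segment in enumerate(segment_chromatograms(samples, width=2)):
        # Print window index, number of samples, and scans in the window.
        print(i, len(segment), len(segment[0]))
```

Because every sample is cut at the same scan indices, each segment is a small three-way array that could be handed to a multiway decomposition on its own, without first detecting features or aligning retention times across samples.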

To make this novel data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive browser application presented here was built with the open-source Python packages Bokeh and HoloViews. The application will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Related articles…

Impact of geographical location on the phenolic profile of minority varieties grown in Spain. II: red grapevines

Because terroir and cultivar are drivers of wine quality, it is essential to investigate their effects on the polyphenolic profile before promoting the planting of a red minority variety in a specific area. This work, part of the MINORVIN project, focuses on the polyphenolic profile of 7 red grapevine varieties of Vitis vinifera L. (the minority varieties Morate, Sanguina, Santafé, Terriza, Tinta Jeromo and Tortozona Tinta, and Tempranillo) from six typical Spanish viticultural areas: Aragón (A1), Cataluña (A2), Castilla la Mancha (A3), Castilla-León (A4), Madrid (A5) and Navarra (A6), in the 2020 season. Polyphenolic substances were extracted from grapes. Thirty-five compounds were identified and quantified (mg substance/kg fresh berry) by HPLC and grouped into anthocyanins (ANT), flavanols (FLAVA), flavonols (FLAVO), hydroxycinnamic acids (AH), benzoic acids (BA) and stilbenes (ST). Antioxidant activity (AA, mmol TE/g fresh berry) was determined by the DPPH method. The results were submitted to a two-way ANOVA to investigate the influence of variety, area and their interaction for each polyphenolic family, and cluster analysis was used to construct hierarchical dendrograms in search of natural groupings among the samples. Sanguina (A3) had the highest total polyphenol content, while Tempranillo (A5) had the highest ANT content. Sanguina (A2) and (A3) reached the highest values of FLAVO, FLAVA and AA. These last two samples also had the maximum AA. The effects of cultivar and area were significant for all polyphenolic families analyzed. High variability due to variety (>50%) was observed in FLAVA, and the maximum variability due to growing area was detected in AA (86.41%), ANT and FLAVO (51%); the variety*zone interaction was significant only for ANT, FLAVO, ST and AA. Finally, the dendrograms presented five clusters: i) Sanguina (A2); ii) Sanguina (A3); iii) Tempranillo (A5); iv) Tempranillo (A3); Terriza (A3,A5), Morate (A5,A6); v) Santafé (A1,A6); Tortozona Tinta (A1,A3,A6); Tinta Jeromo (A3,A4).

Grapevine yield-gap: identification of environmental limitations by soil and climate zoning in Languedoc-Roussillon region (south of France)

Grapevine yield has historically been overlooked, on the assumption of a strong trade-off between grape yield and wine quality. At present, under the threat of climate change, many vineyards in Southern France are far from the quality label threshold, making grapevine yield-gaps a major subject of concern. Although yield-gaps are well studied in arable crops, we know very little about grapevine yield-gaps. In the present study, we analysed the environmental component of grapevine yield-gaps linked to climate and soil resources in Languedoc-Roussillon. We used SAFRAN data and IGP Pays d’Oc wine yields from 2010 to 2018. We selected climate and soil indicators that proved to have a significant effect on average wine yield-gaps at the municipality scale. The most significant factor for grapevine yield was the Soil Available Water Capacity, followed by the Huglin Index and the Climatic Dryness Index. The Days of Frost, the Soil pH and the Very Hot Days were also significant. Then, we clustered geographical zones with similar indicators, facilitating the identification of resource yield-gaps. We discussed the number of zones with the experts of the IGP Pays d’Oc label, obtaining 7 zones with similar limitations for grapevine yield. Finally, we analysed the main resources causing yield-gaps and the grapevine varieties planted in each zone. Mapping grapevine resource yield-gaps is the first stage in understanding grapevine yield-gaps at the regional scale.

Climate projections over France wine-growing region and its potential impact on phenology

Climate change represents a major challenge for the French wine industry. Climatic conditions in French vineyards have already changed and will continue to evolve. One of the notable effects on grapevine is the advance of the growing season. The aim of this study is to characterise the evolution of agroclimatic indicators (Huglin index, number of hot days, mean temperature, cumulative rainfall and number of rainy days during the growing season) at the scale of French wine-growing regions between 1980 and 2019 using gridded data (8 km resolution, SAFRAN) and for the middle of the 21st century (2046-2065) with 21 GCMs statistically debiased and downscaled to 8 km. A set of three phenological models was used to simulate the budburst (BRIN, Smoothed-Utah), flowering, veraison and theoretical maturity (GFV and GSR) stages for two grape varieties (Chardonnay and Cabernet-Sauvignon) over the whole period studied. All the French wine-growing regions show an increase in both temperatures during the growing season and Huglin index. This increase is accompanied by an advance in the simulated flowering (+3 to +9 days), veraison (+6 to +13 days) and theoretical maturity (+6 to +16 days) stages, which is more noticeable in the north-eastern part of France. The climate projections unanimously show, for all the GCMs considered, a clear increase in the Huglin index (+662 to +771 °C.days compared to the 1980-1999 period) and in the number of hot days (+5.6 to +22.6 days) in all the wine regions studied. Regarding rainfall, the expected evolution remains very uncertain due to the heterogeneity of the climates simulated by the 21 models. Only 4 regions out of 21 show a significant decrease in the number of rainy days during the growing season. The two budburst models diverge strongly in the evolution of this stage, with an average difference of 18 days between the two models across all wine-growing regions. Theoretical maturity is the most impacted stage, with a potential advance of between 23 and 40 days depending on the wine-growing region.

Modelling vine water stress during a critical period and potential yield reduction rate in European wine regions: a retrospective analysis

Most European vineyards are managed under rainfed conditions, where seasonal water deficit has become increasingly important. The flowering-veraison phenophase represents an important period for vine response to water stress, which is seldom thoroughly evaluated. Therefore, we aim to quantify the flowering-veraison water stress levels using the Crop Water Stress Indicator (CWSI) over 1986–2015 for important European wine regions, and to assess the respective potential Yield Loss Rate (YLR). Additionally, we investigate whether an advanced flowering-veraison phase may help alleviate water stress and improve yield. The process-based grapevine model STICS is employed, which has been extensively calibrated for flowering and veraison stages using observed data at 38 locations with 10 different grapevine varieties. Subsequently, the model is implemented at the regional level, considering site-specific calibration results and gridded climate and soil datasets. The findings suggest that wine regions with stronger flowering-veraison CWSI tend to have higher potential YLR. However, contrasting patterns are found between wine regions in France-Germany-Luxembourg and Italy-Portugal-Spain. The former tend to have slight-to-moderate drought conditions (CWSI<0.5) and a negligible-to-moderate YLR (<30%), whereas the latter show severe-to-extreme CWSI (>0.5) and substantial YLR (>40%). Wine regions prone to a high drought risk (CWSI>0.75) are also identified, and these are concentrated in southern Mediterranean Europe. An advanced flowering-veraison phase may have benefited from cooler temperatures and a higher fraction of spring precipitation in wine regions of Italy-Portugal-Spain, resulting in alleviated CWSI and moderate reductions of YLR. For those of France-Germany-Luxembourg, an advanced phase may have reduced flowering-veraison precipitation, but prevalent alleviations of YLR are also found, possibly because the phase shifted towards a cooler part of the growing season with reduced evaporative demand. Overall, such a retrospective analysis might provide new insights towards better management of seasonal water deficit for conventionally vulnerable Mediterranean wine regions, but also for relatively cooler and wetter Central European regions.

The impact of leaf canopy management on eco-physiology, wood chemical properties and microbial communities in root, trunk and cordon of Riesling grapevines (Vitis vinifera L.)

In recent decades, climate change has already required adaptations in vineyard management. Increases in temperature and unexpected weather events cause changes in all phenological stages, requiring new management tools. For example, defoliation can be a useful tool to reduce the sugar content in the berries, creating differences in the wine profiles. In a ten-year field experiment using Riesling (Vitis vinifera L., planted 1986, Geisenheim, Germany), various mechanical defoliation strategies and different intensities were trialed until 2016, before the vineyard was uprooted. Wood was sampled from the plant compartments root, trunk, cordon and shoot for analyses of physicochemical properties (e.g. lignin and element content, pH, diameter), nonstructural carbohydrates and the microbial communities. The aim of the study was to investigate the influence of reduced canopy leaf area on the sink-source allocation into different compartments and potential changes of the fungal and prokaryotic wood-inhabiting community using a metabarcoding approach. Severe summer pruning (SSP) of the canopy and mechanical defoliation (MDC) above the bunch zone decreased the leaf area by 50% compared to control (C). SSP reduced the photosynthetic capacity, which resulted in an altered source-sink allocation and carbohydrate storage. With lower leaf area, fewer carbohydrates are allocated. This resulted, for example, in a decreased trunk diameter. Further, it affected the composition of the grapevine wood microbiota. SSP and MDC management significantly changed the prokaryotic community composition in the wood of the root samples, but had no effect in other compartments. In general, this study found strong compartment effects and weaker management effects on the microbial community composition and associated physicochemical properties. The highest microbial diversities were identified in the wood of the trunk, and several species were recorded for the first time in grapevine.