Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics and environmental and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification on the decomposed tensor [1, 2].
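
As a rough illustration of the segmentation step only, the following Python sketch splits each chromatogram (scans along the retention time axis by m/z channels) into consecutive windows and groups matching windows across samples, giving one three-way array per segment. It is not the authors' implementation: the fixed window length, the function names and the toy data are assumptions made for this example, and the transformation, decomposition and classification steps are not shown.

```python
from typing import List

# One chromatogram: rows are scans (retention time), columns are m/z channels.
Chromatogram = List[List[float]]


def segment_chromatogram(chrom: Chromatogram, window: int) -> List[Chromatogram]:
    """Split one chromatogram into consecutive retention-time windows."""
    if window <= 0:
        raise ValueError("window must be a positive number of scans")
    return [chrom[start:start + window] for start in range(0, len(chrom), window)]


def stack_segments(samples: List[Chromatogram], window: int) -> List[List[Chromatogram]]:
    """Group matching windows across samples.

    The result is indexed as [segment][sample][scan][mz], i.e. one
    samples x scans x m/z array per retention-time segment.
    """
    per_sample = [segment_chromatogram(c, window) for c in samples]
    n_segments = min(len(s) for s in per_sample)
    return [[s[k] for s in per_sample] for k in range(n_segments)]


# Hypothetical toy data: two samples, 6 scans x 3 m/z channels each.
sample_a = [[float(i + j) for j in range(3)] for i in range(6)]
sample_b = [[float(2 * i + j) for j in range(3)] for i in range(6)]

# A window of 4 scans is an arbitrary choice for this sketch.
segments = stack_segments([sample_a, sample_b], window=4)
```

Each element of `segments` could then be passed to a multiway decomposition of the reader's choice; the last segment is shorter here because 6 scans do not divide evenly by 4.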

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open-source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Related articles…

Modelling vine water stress during a critical period and potential yield reduction rate in European wine regions: a retrospective analysis

Most European vineyards are managed under rainfed conditions, where seasonal water deficit has become increasingly important. The flowering-veraison phenophase is an important period for the vine's response to water stress, yet it is seldom thoroughly evaluated. We therefore aim to quantify flowering-veraison water stress levels using the Crop Water Stress Indicator (CWSI) over 1986–2015 for important European wine regions, and to assess the respective potential Yield Loss Rate (YLR). We also investigate whether an advanced flowering-veraison phase may help alleviate water stress and improve yield. The process-based grapevine model STICS is employed, which has been extensively calibrated for the flowering and veraison stages using observed data from 38 locations and 10 different grapevine varieties. The model is then implemented at the regional level, considering site-specific calibration results and gridded climate and soil datasets. The findings suggest that wine regions with stronger flowering-veraison CWSI tend to have higher potential YLR. However, contrasting patterns are found between wine regions in France-Germany-Luxembourg and Italy-Portugal-Spain. The former tend to have slight-to-moderate drought conditions (CWSI<0.5) and a negligible-to-moderate YLR (<30%), whereas the latter show severe-to-extreme CWSI (>0.5) and substantial YLR (>40%). Wine regions prone to a high drought risk (CWSI>0.75) are also identified; they are concentrated in southern Mediterranean Europe. An advanced flowering-veraison phase may have benefited from cooler temperatures and a higher fraction of spring precipitation in wine regions of Italy-Portugal-Spain, resulting in alleviated CWSI and moderate reductions of YLR.
For the regions of France-Germany-Luxembourg, an advanced phase may have reduced flowering-veraison precipitation, but widespread alleviation of YLR is also found, possibly because the phase is shifted towards cooler conditions with reduced evaporative demand. Overall, such a retrospective analysis might provide new insights towards better management of seasonal water deficit for conventionally vulnerable Mediterranean wine regions, but also for relatively cooler and wetter Central European regions.

Frost risk projections in a changing climate are highly sensitive in time and space to frost modelling approaches

Late spring frost is a major challenge for various winegrowing regions across the world, its occurrence often leading to important yield losses and/or plant failure. Despite a significant increase in minimum temperatures worldwide, the spatial and temporal evolution of spring frost risk under a warmer climate remains largely uncertain. Recent projections of spring frost risk for viticulture in Europe throughout the 21st century show that its evolution strongly depends on the model approach used to simulate budburst. Furthermore, the frost damage modelling methods used in these projections are usually not assessed through comparison to field observations and/or frost damage reports.
The present study aims to compare frost risk projections simulated using six spring frost models based on two approaches: a) models considering a fixed damage threshold after the predicted budburst date (e.g. BRIN, Smoothed-Utah, Growing Degree Days, Fenovitis) and b) models considering a dynamic frost sensitivity threshold based on the predicted grapevine winter/spring dehardening process (e.g. Ferguson model). The capability of each model to simulate an actual frost event for Vitis vinifera cv. Chardonnay B was previously assessed by comparing simulated cold thermal stress to reports of events with frost damage in Chablis, the northernmost winegrowing region of Burgundy. Models exhibited scores of κ > 0.65 when reproducing the frost/non-frost damage years and an accuracy ranging from 0.82 to 0.90.
Spring frost risk projections throughout the 21st century were performed for all winegrowing subregions of Bourgogne-Franche-Comté under two CMIP5 concentration pathways (4.5 and 8.5) using statistically downscaled 8×8 km daily air temperature and humidity of 13 climate models. Contrasting results with region-specific spring frost risk trends were observed. Three out of five models show a decrease in the frequency of frost years across the whole study area while the other two show an increase that is more or less pronounced depending on winegrowing subregion. Our findings indicate that the lack of accuracy in grapevine budburst and dehardening models makes climate projections of spring frost risk highly uncertain for grapevine cultivation regions.
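
For readers unfamiliar with the κ and accuracy scores quoted above, the following Python sketch shows how accuracy and Cohen's kappa can be computed from observed and simulated frost/non-frost years. The function name and the ten-year series are hypothetical and do not come from the study; the sketch only assumes that κ refers to Cohen's kappa for a binary classification.

```python
import math
from typing import Sequence, Tuple


def accuracy_and_kappa(observed: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (accuracy, Cohen's kappa) for two equal-length boolean series."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    n = len(observed)

    # Observed agreement: fraction of years where both series agree.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n

    # Agreement expected by chance from the marginal frequencies.
    obs_frost = sum(observed) / n
    pred_frost = sum(predicted) / n
    p_e = obs_frost * pred_frost + (1 - obs_frost) * (1 - pred_frost)

    if math.isclose(p_e, 1.0):
        return p_o, float("nan")  # kappa is undefined when chance agreement is 1
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical example: True marks a year with frost damage.
observed = [True, True, False, False, False, False, True, False, False, False]
simulated = [True, False, False, False, False, False, True, False, False, False]

acc, kappa = accuracy_and_kappa(observed, simulated)
```

Kappa corrects the raw agreement for the agreement expected by chance, which matters when frost years are rare and a model could score a high accuracy simply by predicting no damage.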

Grape berry size is a key factor in determining New Zealand Pinot noir wine composition

Making high-quality but affordable Pinot noir (PN) wine is challenging in most terroirs, and New Zealand’s (NZ) situation is no exception. To increase the probability of making highly typical PN wines, producers choose to grow grapes in cool climates on lower-fertility soils while adopting labour-intensive practices. Stringent yield targets and higher input costs necessarily mean that PN wine cost is high, and profitability lower, in line-priced varietal wine ranges. To understand why higher-yielding vines are perceived to produce wines of lower quality, we have undertaken an extensive study of PN in NZ. Since 2018, we have established a network of twelve trial sites in three NZ regions to find individual vines that produced acceptable commercial yields (above 2.5 kg per vine) and wines of composition comparable to “Icon” labels. Approximately 20% of 660 grape lots (N = 135) were selected from within a narrow juice Total Soluble Solids (TSS) range and made into single-vine wines under controlled conditions. Principal Component Analysis of the vine, berry, juice and wine parameters from three vintages found grape berry mass to be the most effective clustering variable. As berry mass category decreased, there was a systematic increase in the probability of higher berry red colour and total phenolics, with a parallel increase in wine phenolics, a changed aroma fraction and decreased juice amino acids. The influence of berry size on wine composition appears stronger than the individual effects of vintage, region, vineyard or vine yield. Our observations support the hypothesis that it is possible to produce PN wines that fall within an “Icon” benchmark composition range at yields above 2.5 kg per vine, provided that the Leaf Area:Fruit Weight ratio is above 12 cm² per g, mean berry mass is below 1.2 g and juice TSS is above 22 °Brix.
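
The conditions stated in the final sentence above can be written as a simple screening check. The Python sketch below is only an illustration: the class, field and function names and the two example lots are hypothetical, and passing the check says nothing about wine quality beyond the thresholds reported in the abstract.

```python
from dataclasses import dataclass


@dataclass
class VineLot:
    yield_kg_per_vine: float
    leaf_area_cm2_per_g_fruit: float  # Leaf Area:Fruit Weight ratio
    mean_berry_mass_g: float
    juice_tss_brix: float


def meets_reported_conditions(lot: VineLot) -> bool:
    """Check the three provisos reported in the abstract."""
    return (
        lot.leaf_area_cm2_per_g_fruit > 12.0
        and lot.mean_berry_mass_g < 1.2
        and lot.juice_tss_brix > 22.0
    )


def is_commercial_yield(lot: VineLot) -> bool:
    """Yield above the 2.5 kg per vine level treated as commercially acceptable."""
    return lot.yield_kg_per_vine > 2.5


# Hypothetical lots, invented for this example.
lot_small_berries = VineLot(2.8, 14.0, 1.05, 23.1)
lot_large_berries = VineLot(3.1, 13.0, 1.40, 22.5)

candidates = [
    lot for lot in (lot_small_berries, lot_large_berries)
    if is_commercial_yield(lot) and meets_reported_conditions(lot)
]
```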

1H-NMR-based Metabolomics to assess the impact of soil type on the chemical composition of Mediterranean red wines

The aim of this study was to evaluate the effects of different soil types on the chemical composition of Mediterranean red wines through untargeted and targeted 1H-NMR metabolomics. One milliliter of raw wine was analyzed with a Bruker Avance II 400 spectrometer operating at 400.15 MHz. The spectra were recorded by applying the NOESYGPPS1D pulse sequence to suppress the water and ethanol signals. The pH was not modified, to avoid any chemical alteration of the matrix. The input variables for the untargeted analysis were generated by bucketing the spectra. The resulting dataset was preprocessed before unsupervised PCA was performed with the MetaboAnalyst web-based tool suite. Compounds for the targeted analysis were identified by comparison with spectra of pure compounds using the SMA plug-in of the MNova 14.2.3 software. The dataset containing the concentrations (%) of identified compounds was subjected to one-way analysis of variance (ANOVA) to highlight significant differences among the wines. The untargeted analysis, carried out through PCA, revealed a clear differentiation among the wines. The spectral regions contributing most to the separation were attributed to flavonoids, aroma compounds and amino acids. The targeted analysis led to the identification of 68 compounds, whose concentrations were significantly different among the wines. The results were related to the physico-chemical analysis of the soils and showed that: 1) high concentrations of flavan-3-ols and flavonols are correlated with high clay content in soils; 2) high concentrations of anthocyanins, amino acids, and aroma compounds are correlated with neutral and moderately alkaline soil pH; 3) low concentrations of flavonoids and aroma compounds are correlated with high soil organic matter content and acidic pH.
The 1H-NMR metabolomic analysis proved to be an excellent tool to discriminate between wines originating from grapes grown on different soil types and revealed that soils in the Mediterranean area exert a strong impact on the chemical composition of the wines.
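
To illustrate what bucketing a spectrum means in the untargeted workflow described above, the following Python sketch sums intensities into equal-width chemical-shift buckets. It is a minimal sketch under stated assumptions: the abstract does not give the bucket width or the spectral range, so the values used here, the function name and the toy spectrum are all hypothetical.

```python
from typing import List, Sequence


def bucket_spectrum(
    ppm: Sequence[float],
    intensity: Sequence[float],
    lo: float,
    hi: float,
    width: float,
) -> List[float]:
    """Sum intensities into equal-width buckets covering [lo, hi) ppm."""
    if len(ppm) != len(intensity):
        raise ValueError("ppm and intensity must have the same length")
    if width <= 0 or hi <= lo:
        raise ValueError("need width > 0 and hi > lo")

    n_buckets = int(round((hi - lo) / width))
    buckets = [0.0] * n_buckets
    for shift, value in zip(ppm, intensity):
        if lo <= shift < hi:
            # Clamp to the last bucket to guard against rounding at the edge.
            idx = min(int((shift - lo) / width), n_buckets - 1)
            buckets[idx] += value
    return buckets


# Hypothetical toy spectrum (chemical shift in ppm, arbitrary intensity units).
ppm = [0.1, 0.3, 0.6, 0.9, 1.2, 1.7]
intensity = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# A 0.5 ppm bucket width is chosen only to keep the example small.
buckets = bucket_spectrum(ppm, intensity, lo=0.0, hi=2.0, width=0.5)
```

Each bucket then serves as one input variable per wine, so that spectra with slightly shifted peaks can still be compared in a multivariate analysis such as PCA.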

The combined effects of climate, soils, and deficit irrigation on yield and quality of Touriga Nacional under high atmospheric demand in the Douro Region

Global warming is one of the biggest environmental, social and economic threats in several viticultural regions. In the Douro Valley, changes are expected in the coming years, namely an increase in temperature and a decrease in precipitation. These changes are likely to have consequences for the production and quality of wine.
The aim of this study was to explore the effects of different soil characteristics combined with several deficit irrigation strategies, managed through ETc references and predawn leaf water potential thresholds, on the physiology, yield, and qualitative attributes of the Touriga Nacional variety in years of mild to severe water and heat stress.
The studies were conducted over seven years (2015 to 2021) in two plots of a commercial vineyard located at Quinta do Ataíde (Symington Family Estates), planted in 2011 and 2014 at 170 meters elevation, growing under three water regimes: non-irrigated (NI) and two deficit irrigation strategies (30% and 60% ETc) assessed weekly by Ψpd. The site has an annual rainfall below 500 mm, with high atmospheric demand. Climate data were collected from a weather station located on site. Berry ripening was followed weekly for fruit analysis. At harvest, yield, vigour and pruning weight per vine were determined from 90 vines per treatment. Each season at veraison, NDVI was assessed by drone. The physico-chemical properties of the soils in the experimental blocks were analysed and grouped by SWHC. Delta C-13 analyses were also performed per treatment in two years.
Irrigation had a positive effect on yield per vine, mostly due to an increase in berry and cluster weight, and on the fertility index through the years. A significant increase in sugar content, colour and phenols was observed with deficit irrigation in some years, but vine vigour related to soil characteristics had by far the greatest impact on quality.