Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics and environmental and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of chromatograms along the retention time axis, multiway decomposition of the transformed segments, and a supervised machine learning pipeline based on gradient boosted tree classification of the decomposed tensor [1, 2].
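
The segmentation step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each chromatogram is stored as a list of scans, each scan being a list of intensities over m/z channels, and that segments are fixed-width windows of scans. The function names and the `window` parameter are hypothetical. The multiway decomposition and the gradient boosted tree classification are not shown.

```python
from typing import List

Scan = List[float]  # intensities over the m/z channels of one scan


def segment_chromatogram(scans: List[Scan], window: int) -> List[List[Scan]]:
    """Split one chromatogram (scans x m/z) into consecutive
    retention-time windows of `window` scans (the last may be shorter)."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [scans[start:start + window] for start in range(0, len(scans), window)]


def stack_samples(samples: List[List[Scan]], window: int) -> List[List[List[Scan]]]:
    """For each segment index k, collect segment k from every sample.

    result[k] is a three-way array (sample x scan x m/z) that a multiway
    decomposition could then be applied to. Assumption: samples with
    differing scan counts are truncated to the common number of segments.
    """
    per_sample = [segment_chromatogram(s, window) for s in samples]
    n_segments = min(len(p) for p in per_sample)
    return [[p[k] for p in per_sample] for k in range(n_segments)]
```

Because each segment is decomposed as a three-way array across all samples, no peak picking or peak-by-peak alignment between samples is needed in this step.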

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open-source Python packages Bokeh and HoloViews and will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, AgroSup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019
