Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics and environmental and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification on the decomposed tensor [1, 2].
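The first two steps can be illustrated with a minimal sketch. It is not the authors' implementation: the array layout (samples x scans x m/z), the function names, the segment length and the number of components are all assumptions made for illustration, the transformation of segments is omitted, and a truncated SVD of the unfolded segment is used as a simple stand-in for a true multiway decomposition.

```python
import numpy as np


def segment_chromatograms(data, segment_len):
    """Split a (samples, scans, m/z) array into windows along the scan axis.

    The last window is shorter when the scan count is not a multiple
    of segment_len. All names and sizes here are hypothetical.
    """
    n_scans = data.shape[1]
    return [data[:, start:start + segment_len, :]
            for start in range(0, n_scans, segment_len)]


def segment_scores(segment, n_components):
    """Stand-in for the decomposition step (not a true multiway method).

    Each sample's (scans, m/z) slice is unfolded into one row, the
    columns are centred, and a truncated SVD gives n_components
    scores per sample.
    """
    n_samples = segment.shape[0]
    unfolded = segment.reshape(n_samples, -1)
    unfolded = unfolded - unfolded.mean(axis=0)
    u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def build_feature_matrix(data, segment_len, n_components):
    """Concatenate per-segment scores into one (samples, features) matrix."""
    blocks = [segment_scores(seg, n_components)
              for seg in segment_chromatograms(data, segment_len)]
    return np.hstack(blocks)


# Synthetic data: 6 samples, 100 scans, 40 m/z channels.
rng = np.random.default_rng(0)
data = rng.random((6, 100, 40))
features = build_feature_matrix(data, segment_len=30, n_components=2)
print(features.shape)  # (6, 8): 4 segments x 2 components each
```

Because every sample is cut at the same scan positions, no peak detection or alignment is needed to obtain one feature row per sample. A gradient boosted tree classifier could then be trained on such a feature matrix together with the sample class labels.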

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Related articles…

Climate projections over France wine-growing region and its potential impact on phenology

Climate change represents a major challenge for the French wine industry. Climatic conditions in French vineyards have already changed and will continue to evolve. One of the notable effects on grapevine is the advance of the growing season. The aim of this study is to characterise the evolution of agroclimatic indicators (Huglin index, number of hot days, mean temperature, cumulative rainfall and number of rainy days during the growing season) at the scale of French wine-growing regions between 1980 and 2019 using gridded data (8 km resolution, SAFRAN) and for the middle of the 21st century (2046-2065) using 21 GCMs statistically debiased and downscaled to 8 km. A set of three phenological models was used to simulate the budburst (BRIN, Smoothed-Utah), flowering, veraison and theoretical maturity (GFV and GSR) stages for two grape varieties (Chardonnay and Cabernet-Sauvignon) over the whole period studied. All the French wine-growing regions show an increase in both temperatures during the growing season and Huglin index. This increase is accompanied by an advance in the simulated flowering (+3 to +9 days), veraison (+6 to +13 days) and theoretical maturity (+6 to +16 days) stages, which is more noticeable in the north-eastern part of France. The climate projections unanimously show, for all the GCMs considered, a clear increase in the Huglin index (+662 to +771 °C.days compared to the 1980-1999 period) and in the number of hot days (+5.6 to +22.6 days) in all the wine regions studied. Regarding rainfall, the expected evolution remains very uncertain due to the heterogeneity of the climates simulated by the 21 models. Only 4 regions out of 21 show a significant decrease in the number of rainy days during the growing season. The two budburst models diverge strongly in the evolution of this stage, with an average difference of 18 days between the two models across all wine-growing regions.
Theoretical maturity is the most affected stage, with a potential advance of between 23 and 40 days depending on the wine-growing region.
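For readers unfamiliar with the Huglin index mentioned above, the sketch below shows its usual form: a sum, over 1 April to 30 September in the Northern Hemisphere, of the average of (daily mean temperature - 10 °C) and (daily maximum temperature - 10 °C), multiplied by a latitude-dependent day-length coefficient k. The function name, the default k = 1.0 and the clipping of negative daily values to zero are assumptions made for illustration, not details taken from the study.

```python
def huglin_index(t_mean, t_max, k=1.0):
    """Huglin heliothermal index (°C.days) from daily temperatures in °C.

    t_mean, t_max: daily mean and maximum temperatures covering
    1 April to 30 September (Northern Hemisphere).
    k: day-length coefficient; 1.0 is only a placeholder, the real
    value depends on latitude.
    """
    if len(t_mean) != len(t_max):
        raise ValueError("t_mean and t_max must have the same length")
    total = 0.0
    for tm, tx in zip(t_mean, t_max):
        daily = ((tm - 10.0) + (tx - 10.0)) / 2.0
        total += max(daily, 0.0)  # assumption: negative days count as zero
    return k * total


# Hypothetical three-day series, for illustration only.
t_mean = [12.0, 18.0, 8.0]
t_max = [16.0, 26.0, 10.0]
print(huglin_index(t_mean, t_max))  # 16.0 (daily contributions 4, 12 and 0)
```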

Climate change impacts: a multi-stress issue

With the aim of producing premium wines, it is accepted that moderate environmental stresses may contribute to the accumulation of compounds of interest in grapes. However, the ongoing climate change, with the appearance of more limiting production conditions, is a major economic concern for the wine industry. Will it be possible to maintain the vineyards in place and to preserve the current grape varieties, and how should we anticipate the adaptation measures needed to ensure the sustainability of vineyards? In this context, the question of the responses and adaptation of grapevine to abiotic stresses becomes a major scientific issue to tackle. An abiotic stress can be defined as the effect of a specific factor of the physico-chemical environment of the plants (temperature, availability of water and minerals, light, etc.) which reduces growth and, for a crop such as the vine, the yield, the composition of the fruits and the sustainability of the plants. Water stress is on many minds, but a systemic vision is essential for at least two reasons. The first reason is that in natural environments a single factor is rarely limiting, and plants have to deal with a combination of constraints, such as heat and drought, both over time and at a given time. The second reason is that plants, including grapevine, have central mechanisms of stress response, such as redox regulatory pathways, that play an important role in adaptation and survival. Here we will review the most recent studies dealing with this issue to provide a better understanding of the grapevine responses to a combination of environmental constraints and of the underlying regulatory pathways, which may be very helpful in designing better-adapted solutions to cope with climate change.

Underpinning terroir with data: rethinking the zoning paradigm

Agriculture, natural resource management and the production and sale of products such as wine are increasingly data-driven activities. Thus, the use of remote and proximal crop and soil sensors to aid management decisions is becoming commonplace and ‘Agtech’ is proliferating commercially; mapping, underpinned by geographical information systems and complex methods of spatial analysis, is widely used. Likewise, the chemical and sensory analysis of wines draws on multivariate statistics; the efficient winery intake of grapes, subsequent production of wines and their delivery to markets relies on logistics; whilst the sales and marketing of wines is increasingly driven by artificial intelligence linked to the recorded purchasing behaviour of consumers. In brief, there is data everywhere!

Opinions will vary on whether these developments are a good thing. Those concerned with the ‘mystique’ of wine, or the historical aspects of terroir and its preservation, may find them confronting. In contrast, they offer an opportunity to those interested in the biophysical elements of terroir, and efforts aimed at better understanding how these impact on vineyard performance and the sensory attributes of resultant wines. At the previous Terroir Congress, we demonstrated the potential of analytical methods used at the within-vineyard scale in the development of Precision Viticulture, in contributing to a quantitative understanding of regional terroir. For this conference, we take this approach forward with examples from contrasting locations in both the northern and southern hemispheres. We show how, by focussing on the vineyards within winegrowing regions, as opposed to all of the land within those regions, we might move towards a more robust terroir zoning than one derived from a mixture of history, thematic mapping, heuristics and the whims of marketers. Aside from providing improved understanding by underpinning terroir with data, such methods should also promote improved management of the entire wine value chain.

Grapevine sugar concentration model in the Douro Superior, Portugal

Increasingly warm and dry climate conditions are challenging the viticulture and winemaking sector. Digital technologies and crop modelling promise to provide practical answers to those challenges. As viticultural activities strongly depend on harvest date, its early prediction is particularly important, since the success of winemaking practices largely depends upon this key event, which should be based on an accurate and advanced plan of the annual cycle. Herein, we demonstrate the creation of modelling tools to assess grape ripeness through sugar concentration monitoring. The study area, the Portuguese Côa valley wine region, represents an important terroir in the “Douro Superior” subregion. Two varieties (cv. Touriga Nacional and Touriga Franca) grown in five locations across the Côa Region were considered. Sugar accumulation in grapes, with concentrations between 170 and 230 g l-1, was used from 2014 to 2020 as an indicator of technological maturity conditioned by meteorological factors. The climatic time series were retrieved from the EU Copernicus Service, while sugar data were collected by a non-profit organization, ADVID, and by Sogrape, a leading wine company. The software used for calibrating and validating this model framework was the Phenology Modeling Platform (PMP), version 5.5, using Sigmoid and growing degree-day (GDD) models for predictions. The performance was assessed through two metrics, Root Mean Square Error (RMSE) and efficiency coefficient (EFF), while validation was undertaken using leave-one-out cross-validation. Our findings demonstrate that sugar content is mainly dependent on temperature and air humidity. The models achieved a performance of 0.65
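As a rough illustration of the quantities named in this abstract, the sketch below computes a growing degree-day sum and the two performance metrics. It is not the PMP implementation: the function names and the 10 °C base temperature are assumptions, and EFF is assumed here to be the usual modelling efficiency, 1 - SS_residual / SS_total. The observed and predicted values are invented for the example.

```python
import math


def growing_degree_days(t_mean, t_base=10.0):
    """Sum of daily mean temperature above t_base (hypothetical 10 °C base)."""
    return sum(max(t - t_base, 0.0) for t in t_mean)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def efficiency(observed, predicted):
    """Modelling efficiency: 1 - residual sum of squares / total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented sugar concentrations (g/l), for illustration only.
observed = [170.0, 190.0, 210.0, 230.0]
predicted = [175.0, 185.0, 215.0, 225.0]
print(growing_degree_days([8.0, 12.0, 15.0]))  # 7.0
print(rmse(observed, predicted))               # 5.0
print(round(efficiency(observed, predicted), 2))
```

An efficiency of 1 means a perfect fit, while 0 means the model predicts no better than the mean of the observations.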

VINIoT – Precision viticulture service

The VINIoT project pursues the creation of a new technological vineyard monitoring service, which will allow companies in the wine sector in the SUDOE space to monitor plantations remotely and in real time at various levels of precision. The system, which is based on spectral images and an IoT architecture, will allow parameters of interest for viticulture to be assessed and data to be collected at a precise scale (grape, plant, plot or vineyard). In France, three subjects were specifically developed: evaluation of maturity, evaluation of water stress, and detection of flavescence dorée. For the evaluation of maturity, it was decided to work first at the berry scale in the laboratory, then at the bunch scale and finally in the vineyard. The hyperspectral images, as well as the reference analyses used to measure maturity, were acquired in the laboratory after harvesting the berries in a maturity monitoring context. This work focuses on a case study to predict the sugar content of three grape varieties: Syrah, Fer Servadou and Mauzac. A robust method called Roboost-PLSR, developed in the framework of this work (Courand et al., 2022) to improve prediction model performance, was applied to spectra after the acquisition of hyperspectral images. Regarding the evaluation of water stress, to obtain significant variability in water status, work was first carried out on potted plants under 2 different water regimes. The facilities allowed irrigation and micro-climatic conditions to be supervised. Regression models on agronomic variables (stomatal conductance, water potential, …) are being studied. To detect flavescence dorée, the experimental plan consisted of work at the leaf scale, first in the laboratory and then in the field. To detect the disease from hyperspectral imaging, a combination of multivariate curve resolution-alternating least squares (MCR-ALS) and factorial discriminant analysis (FDA) was proposed.
This strategy showed potential for discriminating healthy leaves from leaves infected by flavescence dorée using hyperspectral images (Mas Garcia et al., 2021).