Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many different domains of analytical chemistry such as metabolomics, environmental and food analysis. In contrast to targeted analysis, non-targeted approaches take information of known and unknown compounds into account, are inherently more comprehensive and give a more holistic representation of the sample composition. 

Besides chromatographic techniques coupled to high resolution mass spectrometry such as LC-HRMS, gas chromatography with unit resolution mass spectrometry is still regularly utilized for non-targeted profiling or fingerprinting. This is mainly due to high separation power of GC and a wide availability and low costs of quadrupole mass spectrometers. 

Although several non-targeted approaches have been developed, data processing still remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment can be prone to errors and time-consuming manual corrections are often necessary. We therefore developed an automated strategy to non-targeted GC-MS data avoiding feature detection and retention time alignment. The novel automated approach includes segmentation of chromatograms along the retention time axis, multiway decomposition of transformed segments followed by a supervised machine learning pipeline based on gradient boosted tree classification on the decomposed tensor [1, 2]. 

In order to make this novel data analysis strategy available to scientists without programming background, we developed a convenient browser based application. For the here presented interactive browser application the open source Python packages Bokeh and HoloViews were used. The application will be online freely available soon. 

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Acta Chimica Acta 911 (2016) 42-58 
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

DOI:

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’ Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Contact the author

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Citation

Related articles…

Assessment of climate change impacts on water needs and growing cycle on grapevine in three DOs of NE Spain

This study assessed the suitability of grapevine growing in three DOs (Empordà, Pla de Bages and Penedès) of Catalonia (NE Spain) over the 21st century. For this purpose, an estimation of water needs and agroclimatic and phenological indicators was made. Climate change impacts were estimated at 1 km pixel resolution using temperature and precipitation projections from several general circulation models (GCM) and two climate change scenarios: RCP 4.5 (stabilization scenario) and RCP 8.5 (worst-case scenario). Potential crop evapotranspiration (following FAO procedure) and a daily water balance considering soil water holding capacity were used to estimate actual evapotranspiration of vines and, finally, water needs. Dynamics would be similar in the three DOs studied although the magnitude of impact differs. Water needs would be 2 and 3 times greater (ranging from 0 to more than 1500 m3/ha) than current water needs at both climate change scenarios. Moreover, blooming date would advance from 3 to 6 weeks, harvest date from 1 to 2.5 months, resulting in growing cycles from 10 to 80 days shorter. It should also be noted that frost risk would decrease from 6 to 76%, the number of days with temperatures above 30ºC during ripening would rise from 48 to 500% and tropical nights (minimum temperature >20ºC) at ripening would increase from 28 to 150%, depending on the scenario and the DOs. The impacts of climate change in the three DOs could result in significant limitations for grapevine cultivation and wine production if adaptive strategies are not applied. This result could serve as a basis for the design of specific and particular adaptation strategies to improve and maintain vineyards in the DOs studied and could be extrapolated to similar DOs and regions.

Heatwaves and grapevine yield in the Douro region, crop model simulations

Heatwaves or extreme heat events can be particularly harmful to agriculture. Grapevines grown in the Douro winemaking region are particularly exposed to this threat, due to the specificities of the already warm and dry climatic conditions. Furthermore, climate change simulations point to an increase in the frequency of occurrence of these extreme heat events, therefore posing a major challenge to winegrowers in the Mediterranean type climates. The current study focuses on the application of the STICS crop model to assess the potential impacts of heatwaves in grapevine yields over the Douro valley winemaking region. For this purpose, STICS was applied to grapevines using high-resolution weather, soil and terrain datasets over the Douro. To assess the impact of heatwaves, the weather dataset (1989-2005) was artificially modified, generating periods with anomalously high temperatures (+5 ºC), at certain onset dates and with specific durations (from 5 to 9 days). The model was run with this modified weather dataset and results were compared to the original unmodified runs. The results show that heatwaves can have a very strong impact on grapevine yields, strongly depending on the onset dates and duration of the heatwaves. The highest negative impacts may result in a decrease in the yield by up to -35% in some regions. Despite some uncertainties inherent to the current modelling assessment, the present study highlights the negative impacts of heatwaves on viticultural yields in the Douro region, which is critical information for stakeholders within the winemaking sector for planning suitable adaptation measures.

VINIoT – Precision viticulture service

The project VINIoT pursues the creation of a new technological vineyard monitoring service, which will allow companies in the wine sector in the SUDOE space to monitor plantations in real time and remotely at various levels of precision. The system is based on spectral images and an IoT architecture that allows assessing parameters of interest viticulture and the collection of data at a precise scale (level of grape, plant, plot or vineyard) will be designed. In France, three subjects were specifically developed: evaluation of maturity, of water stress, and detection of flavescence dorée. For the evaluation of maturity, it has been decided first to work at the berry scale in the laboratory, then at the bunch scale and finally in the vineyard. The acquisition of the spectral hyperstal image as well as the reference analyzes to measure the maturity, were carried out in the laboratory after harvesting the berries in a maturity monitoring context. This work focuses on a case study to predict sugar content of three different grape varieties: Syrah, Fer Servadou and Mauzac. A robust method called Roboost-PLSR, developed in the framework of this work (Courand et al., 2022), to improve prediction model performance was applied on spectra after the acquirement of hyperspectral images. Regarding the evaluation of water stress, to work with a significant variability in terms of water status, it has been worked first with potted plants under 2 different water regimes. The facilities have allowed the supervision of irrigation and micro-climatic conditions. The regression models on agronomic variables (stomatal conductance, water potential, …) are studied. To detect flavescence dorée, the experimental plan has consisted of work at leaf scale in the laboratory first, and then in the field. To detect the disease from hyper-spectral imaging, a combination of multivariate curve resolution-alternating least squares (MCR-ALS) and factorial discriminant analysis (FDA) was proposed. This strategy proved the potential towards the discrimination of healthy and infected leaves by flavescence dorée based on the use of hyperspectral images (Mas Garcia et al., 2021).

Adaptation to soil and climate through the choice of plant material

Choosing the rootstock, the scion variety and the training system best suited to the local soil and climate are the key elements for an economically sustainable production of wine. The choice of the rootstock/scion variety best adapted to the characteristics of the soil is essential but, by changing climatic conditions, ongoing climate change disrupts the fine-tuned local equilibrium. Higher temperatures induce shifts in developmental stages, with on the one hand increasing fears of spring frost damages and, on the other hand, ripening during the warmest periods in summer. Expected higher water demand and longer and more frequent drought events are also major concerns. The genetic control of the phenotypes, by genomic information but also by the epigenetic control of gene expression, offers a lot of opportunities for adapting the plant material to the future. For complex traits, genomic selection is also a promising method for predicting phenotypes. However, ecophysiological modelling is necessary to better anticipate the phenotypes in unexplored climatic conditions Genetic approaches applied on parameters of ecophysiological models rather than raw observed data are more than ever the basis for finding, or building, the ideal varieties of the future.

An analytical framework to site-specifically study climate influence on grapevine involving the functional and Bayesian exploration of farm data time series synchronized using an eGDD thermal index

Climate influence on grapevine physiology is prevalent and this influence is only expected to increase with climate change. Although governed by a general determinism, climate influence on grapevine physiology may present variations according to the terroir. In addition, these site-specific differences are likely to be enhanced when climate influence is studied using farm data. Indeed, farm data integrate additional sources of variation such as a varying representativity of the conditions actually experienced in the field. Nevertheless, there is a real challenge in valuing farm data to enable grape growers to understand their own terroir and consequently adapt their practices to the local conditions. In such a context, this article proposes a framework to site-specifically study climate influence on grapevine physiology using farm data. It focuses on improving the analysis of time series of weather data. The analytical framework includes the synchronization of time series using site-specific thermal indices computed with an original method called Extended Growing Degree Days (eGDD). Synchronized time series are then analyzed using a Bayesian functional Linear regression with Sparse Steps functions (BLiSS) in order to detect site-specific periods of strong climate influence on yield development. The article focuses on temperature and rain influence on grape yield development as a case study. It uses data from three commercial vineyards respectively situated in the Bordeaux region (France), California (USA) and Israel. For all vineyards, common periods of climate influence on yield development were found. They corresponded to already known periods, for example around veraison of the year before harvest. However, the periods differed in their precise timing (e.g. before, around or after veraison), duration and correlation direction with yield. Other periods were found for only one or two vineyards and/or were not referred to in literature, for example during the winter before harvest.