Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics, environmental analysis, and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling and fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of the chromatograms along the retention time axis, multiway decomposition of the transformed segments, and a supervised machine learning pipeline based on gradient boosted tree classification of the decomposed tensor [1, 2].
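The first two steps of such a workflow can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the segment length, the number of components, and the random test data are all hypothetical, and a truncated SVD of the sample-unfolded segment is swapped in as a simple stand-in for the multiway decomposition described in [1, 2]. The sketch assumes the raw data are arranged as a three-way array (samples x scans x m/z channels).

```python
import numpy as np


def segment_chromatograms(data, segment_len):
    """Split a (samples, scans, mz) array into consecutive retention-time windows.

    The last window may be shorter if the scan count is not a multiple
    of segment_len.
    """
    n_scans = data.shape[1]
    return [data[:, start:start + segment_len, :]
            for start in range(0, n_scans, segment_len)]


def decompose_segment(segment, n_components):
    """Stand-in for a multiway decomposition (assumption, not the published method).

    Unfolds the segment to (samples, scans * mz), mean-centres each column,
    and returns per-sample scores from a truncated SVD.
    """
    n_samples = segment.shape[0]
    unfolded = segment.reshape(n_samples, -1)
    unfolded = unfolded - unfolded.mean(axis=0)
    u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    k = min(n_components, s.size)
    return u[:, :k] * s[:k]


def build_feature_matrix(data, segment_len, n_components):
    """Concatenate the per-segment sample scores into one feature matrix."""
    scores = [decompose_segment(seg, n_components)
              for seg in segment_chromatograms(data, segment_len)]
    return np.hstack(scores)


# Hypothetical data: 6 samples, 100 scans, 20 m/z channels.
rng = np.random.default_rng(0)
data = rng.random((6, 100, 20))

features = build_feature_matrix(data, segment_len=25, n_components=2)
print(features.shape)  # (6, 8): 4 segments x 2 components per sample
```

Because each sample is described by scores per retention-time segment rather than by a list of aligned peaks, no peak picking or alignment across samples is needed in this sketch. The resulting feature matrix, together with class labels, could then be passed to any gradient boosted tree classifier.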

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open-source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

DOI:

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019
