Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many different domains of analytical chemistry such as metabolomics, environmental and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high resolution mass spectrometry such as LC-HRMS, gas chromatography with unit resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The novel automated approach comprises segmentation of chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification on the decomposed tensor [1, 2].
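
As a rough illustration of the first two steps described above, the following Python sketch cuts a stack of chromatograms into retention time windows and reduces each window to a few per-sample scores. It is a minimal sketch under stated assumptions, not the implementation from [1, 2]: the function names, the segment count, and the component count are hypothetical, the transformation is limited to mean-centering, and a truncated SVD of the unfolded segment (a Tucker1-style model) stands in for the multiway decomposition.

```python
import numpy as np


def segment_chromatograms(data, n_segments):
    """Split a (samples x scans x m/z) array into retention time windows."""
    # np.array_split tolerates scan counts that are not a multiple of n_segments.
    return np.array_split(data, n_segments, axis=1)


def sample_scores(segment, n_components):
    """Stand-in decomposition: unfold along the sample mode, keep leading SVD scores."""
    n_samples = segment.shape[0]
    unfolded = segment.reshape(n_samples, -1)       # samples x (scans * m/z)
    centered = unfolded - unfolded.mean(axis=0)     # assumed transformation
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]   # samples x n_components


def build_feature_matrix(data, n_segments, n_components):
    """Concatenate the per-segment scores into one row of features per sample."""
    segments = segment_chromatograms(data, n_segments)
    return np.hstack([sample_scores(seg, n_components) for seg in segments])


# Synthetic stand-in data: 12 samples, 200 scans, 30 m/z channels.
rng = np.random.default_rng(0)
data = rng.random((12, 200, 30))
features = build_feature_matrix(data, n_segments=8, n_components=2)
print(features.shape)  # (12, 16)
```

The resulting score matrix has one row per sample and could serve as input to a gradient boosted tree classifier, for example scikit-learn's GradientBoostingClassifier. Because no peaks are picked and no retention times are shifted, the sketch reflects the idea of avoiding feature detection and alignment, although the published method may handle retention time shifts within a segment differently.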

To make this novel data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

DOI:

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Related articles…

Impact of yeast derivatives to increase the phenolic maturity and aroma intensity of wine

Using viticultural and enological techniques to increase aromatics in white wine is a prized yet challenging goal for commercial wine producers. Equally difficult are the challenges encountered in hastening phenolic maturity and thereby increasing color intensity in red wines. The ability to alter organoleptic and visual properties of wines plays a decisive role in vintages in which grapes are not able to reach full maturity, which occurs increasingly often as a result of climate change. A new, yeast-based product on the viticultural market may offer the opportunity to enhance the sensory properties of finished wines. The manufacturer’s packaging claims that these yeast derivatives intensify wine aromas of white grape varieties and improve phenolic ripeness of red varieties, but the effects of this application have been little researched until now. The current study applied the yeast derivative, according to the manufacturer’s instructions, to the leaves of both neutral and aromatic white wine varieties, as well as structured red wine varieties. Chemical parameters and volatile aromatics were analyzed in grape musts and finished wines, and all wines were subjected to sensory analysis by a tasting panel. Taken together, the analyses showed that applying the yeast derivative in the vineyard had no effect in any of the varieties examined: it neither intensified white wine aromatics nor improved phenolic ripeness and color intensity in red wine.

Measurement of redox potential as a new analytical winegrowing tool

Excell laboratory has initiated the development of an analytical method based on electrochemistry to evaluate the ability of wines to undergo or resist oxidative phenomena. Electrochemistry is a powerful tool to probe reactions involving electron transfers and offers the possibility of real-time measurements. In that context, the laboratory has implemented electrochemical analysis to assess the oxidation state of different wine matrices and also to evaluate the oxidized or reduced character of leaf and soil. Initially, our laboratory focused on the quantification of compounds involved in plant stress responses, and we were also interested in the microbiological activity of soils. These analyses were compared with the measurement of redox potential (Eh) and pH, two fundamental variables involved in the modulation of plant metabolism. Indeed, the variation of the redox state of the plant reflects its biological activity as well as its capacity to absorb nutrients. The Eh-pH conditions largely determine the metabolic processes in soil and leaf, and our goal is to determine whether this combined analytical approach is sufficiently precise to detect biological changes (plant health, parasitic attack…).

Better understand the soil wet bulb formation with subsurface or aerial drip irrigation in viticulture

The gradual change in rainfall patterns experienced in the vineyards of the south of France, especially around the Mediterranean Sea, means that the vines are increasingly subject to summer drought. Winegrowers have developed the use of irrigation techniques to maintain competitive yields in the production of wines under the Protected Geographical Indication label. In practice, drip irrigation pipes can be installed above the ground or buried in the soil, and at different distances from the vine row. The objective of this study was to examine the profiles of the soil wet bulbs obtained from two drip irrigation systems: aerial drip located under the vine row and subsurface drip placed in the middle of the inter-row. This experiment took place over two consecutive seasons (2020-2021) on a 3.4 ha Viognier plot in the Mediterranean region (PGI Oc, France) on sandy clay soil. Annual rainfall was less than 400 mm. Soil water content probes were installed at different depths (20 – 40 – 60 – 80 cm) and at different lateral distances from the vine row (30 – 60 – 90 – 120 cm) to monitor the formation of the soil wet bulb during irrigation. Mapping and analysis of the data allowed a better understanding and differentiation of water percolation when irrigating with subsurface or aerial drip. For the same amount of water and without differences in vine water status, the wet bulb formed under subsurface drip irrigation was larger than the one formed under aerial drip irrigation.

VineyardFACE: Investigation of a moderate (+20%) increase of ambient CO2 level on berry ripening dynamics and fruit composition

Climate change and rising atmospheric carbon dioxide concentration are a concern for agriculture, including viticulture. Studies on elevated carbon dioxide have already been conducted on grapevines, mainly in greenhouses using potted plants or on field grown vines under higher CO2 enrichment, i.e. >650 ppm. The VineyardFACE, located at Hochschule Geisenheim University, is an open field Free Air CO2 Enrichment (FACE) experimental set-up designed to study the effects of elevated carbon dioxide using field grown vines (Vitis vinifera L. cvs. Riesling and Cabernet Sauvignon). As the carbon dioxide fumigation started in 2014, the long term effects of elevated carbon dioxide treatment on berry ripening parameters and fruit metabolic composition can be investigated.
The present study aims to investigate the effect on fruit composition of a moderate increase (+20%; eCO2) in carbon dioxide concentration, as predicted for 2050, in both Riesling and Cabernet Sauvignon. Berry composition was determined for primary (sugars, organic acids, amino acids) and secondary metabolites (anthocyanins). Special focus was given to monitoring berry diameter and ripening rates throughout three growing seasons. Compared to previous results from the early adaptive phase of the vines [1], our results show little effect of eCO2 treatment on primary metabolite composition in berries. However, total anthocyanin concentration in berry skin was lower under eCO2 treatment in 2020, although the ratio between anthocyanin derivatives did not differ.
[1] Wohlfahrt Y., Tittmann S., Schmidt D., Rauhut D., Honermeier B., Stoll M. (2020) The effect of elevated CO2 on berry development and bunch structure of Vitis vinifera L. cvs. Riesling and Cabernet Sauvignon. Applied Sciences (Basel) 10: 2486

Comparison of imputation methods in long and varied phenological series. Application to the Conegliano dataset, including observations from 1964 over 400 grape varieties

A large varietal collection including over 1700 varieties has been maintained in Conegliano, Italy, since the 1950s. Phenological data on a subset of 400 grape varieties including wine grapes, table grapes, and raisins have been acquired at bud break, flowering, veraison, and ripening since 1964. Despite the efforts in maintaining and acquiring data over such an extensive collection, the data set has varying degrees of missing cases depending on the variety and the year. This is ubiquitous in phenology datasets of significant size and length. In this work, we evaluated four state-of-the-art methods to estimate missing values in this phenological series: k-Nearest Neighbour (kNN), Multivariate Imputation by Chained Equations (mice), MissForest, and Bidirectional Recurrent Imputation for Time Series (BRITS). For each phenological stage, we evaluated the performance of the methods in two ways. 1) On the full dataset, we randomly held out 10% of the true values for use as a test set and repeated the process 1000 times (Monte Carlo cross-validation). 2) On a reduced and almost complete subset of varieties, we varied the percentage of missing values from 10% to 70% by random deletion. In all cases, we evaluated the performance on the original values using normalized root mean squared error. For the full dataset we also obtained performance statistics by variety and by year. MissForest provided average errors of 17% (3 days) at bud break, 14% (4 days) at flowering, 14.5% (7 days) at veraison, and 17% (3 days) at maturity. We completed the imputations of the Conegliano dataset, one of the world’s most extensive and varied phenological time series and a stepping stone for future climate change studies in grapes. The dataset is now ready for further analysis, and a rigorous evaluation of imputation errors is included.
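
The hold-out evaluation described in 1) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the study's code: all function names are hypothetical, column-mean imputation stands in for the four methods compared, the data are synthetic, and the RMSE is normalized by the range of the held-out true values, which may differ from the normalization used in the study.

```python
import numpy as np


def nrmse(true, pred):
    """RMSE divided by the range of the true values (assumed normalization)."""
    rmse = np.sqrt(np.mean((true - pred) ** 2))
    return rmse / (true.max() - true.min())


def mean_impute(table):
    """Stand-in imputer: replace each missing value with its column mean."""
    col_means = np.nanmean(table, axis=0)
    return np.where(np.isnan(table), col_means, table)


def monte_carlo_cv(table, impute, holdout=0.1, repeats=100, seed=0):
    """Repeatedly hide a random share of observed values and score the imputer."""
    rng = np.random.default_rng(seed)
    observed = np.argwhere(~np.isnan(table))          # (row, col) of known values
    n_test = max(1, int(round(holdout * len(observed))))
    errors = []
    for _ in range(repeats):
        picked = observed[rng.choice(len(observed), size=n_test, replace=False)]
        rows, cols = picked[:, 0], picked[:, 1]
        masked = table.copy()
        masked[rows, cols] = np.nan                   # hide the test values
        filled = impute(masked)
        errors.append(nrmse(table[rows, cols], filled[rows, cols]))
    return np.array(errors)


# Synthetic day-of-year table: 40 varieties x 4 years, about 20% missing.
rng = np.random.default_rng(1)
table = rng.normal(120.0, 5.0, size=(40, 4))
table[rng.random(table.shape) < 0.2] = np.nan
errors = monte_carlo_cv(table, mean_impute, repeats=50)
print(errors.mean())
```

Any imputer with the same array-in, array-out interface could be passed in place of `mean_impute`, so the same loop can compare several methods on identical hold-out draws by reusing the seed.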