Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics, environmental analysis and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account. They are therefore inherently more comprehensive and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high resolution mass spectrometry such as LC-HRMS, gas chromatography with unit resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification of the decomposed tensor [1, 2].
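
The segmentation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how segments are chosen, so the fixed window width, the optional overlap, and the function names `segment_bounds` and `segment_samples` are assumptions made for illustration. Each sample is taken to be a matrix of scans (retention time) by m/z channels, and all samples are cut at the same scan indices, so no peak detection or alignment is involved.

```python
from typing import List, Tuple

# One sample: rows are scans along the retention time axis, columns are m/z channels.
Chromatogram = List[List[float]]


def segment_bounds(n_scans: int, width: int, overlap: int = 0) -> List[Tuple[int, int]]:
    """Return (start, stop) scan indices of windows along the retention time axis.

    Fixed-width windows with optional overlap are an assumption of this sketch.
    """
    if width <= 0 or not 0 <= overlap < width:
        raise ValueError("need width > 0 and 0 <= overlap < width")
    step = width - overlap
    bounds = []
    start = 0
    while start < n_scans:
        stop = min(start + width, n_scans)
        bounds.append((start, stop))
        if stop == n_scans:
            break  # the last window reached the end of the chromatogram
        start += step
    return bounds


def segment_samples(samples: List[Chromatogram], width: int, overlap: int = 0):
    """Cut every sample at the same scan indices.

    Returns one three-way block (samples x scans x m/z) per segment.
    """
    n_scans = len(samples[0])
    if any(len(s) != n_scans for s in samples):
        raise ValueError("all samples must have the same number of scans")
    return [[s[a:b] for s in samples] for a, b in segment_bounds(n_scans, width, overlap)]


if __name__ == "__main__":
    # Two toy samples with 5 scans and 3 m/z channels each (made-up numbers).
    toy = [[[float(i + j + k) for k in range(3)] for j in range(5)] for i in range(2)]
    for block in segment_samples(toy, width=2):
        print(len(block), "samples x", len(block[0]), "scans x", len(block[0][0]), "m/z channels")
```

Each block returned by `segment_samples` is a three-way array for one retention time window. In the strategy described above, such blocks would be transformed and decomposed before classification; those steps are not shown here.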

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application was built with the open source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Related articles…

VineyardFACE: Investigation of a moderate (+20%) increase of ambient CO2 level on berry ripening dynamics and fruit composition

Climate change and rising atmospheric carbon dioxide concentration are a concern for agriculture, including viticulture. Studies on elevated carbon dioxide have already been conducted on grapevines, mainly in greenhouses using potted plants or on field grown vines under higher CO2 enrichment, i.e. >650 ppm. The VineyardFACE, located at Hochschule Geisenheim University, is an open field Free Air CO2 Enrichment (FACE) experimental set-up designed to study the effects of elevated carbon dioxide on field grown vines (Vitis vinifera L. cvs. Riesling and Cabernet Sauvignon). As the carbon dioxide fumigation started in 2014, the long term effects of elevated carbon dioxide treatment on berry ripening parameters and fruit metabolic composition can be investigated.
The present study aims to investigate the effect on fruit composition of a moderate increase (+20%; eCO2) in carbon dioxide concentration, as predicted for 2050, for both Riesling and Cabernet Sauvignon. Berry composition was determined for primary (sugars, organic acids, amino acids) and secondary metabolites (anthocyanins). Special focus was placed on monitoring berry diameter and ripening rates throughout three growing seasons. Compared to previous results from the early adaptive phase of the vines [1], our results show little effect of eCO2 treatment on the primary metabolite composition of berries. However, total anthocyanin concentration in berry skin was lower under eCO2 treatment in 2020, although the ratio between anthocyanin derivatives did not differ.
[1] Wohlfahrt Y., Tittmann S., Schmidt D., Rauhut D., Honermeier B., Stoll M. (2020) The effect of elevated CO2 on berry development and bunch structure of Vitis vinifera L. cvs. Riesling and Cabernet Sauvignon. Applied Sciences (Basel) 10: 2486

Use of multispectral satellite for monitoring vine water status in Mediterranean areas

The development of new generations of multispectral satellites such as Sentinel-2 opens up possibilities for vine water status assessment (Cohen et al., 2019). Based on a three-year field campaign, a model for estimating vine Stem Water Potential (SWP) from four satellite bands in the Red, Red-Edge, NIR and SWIR domains was developed (Laroche-Pinel et al., 2021). The model relies on SWP field measurements made with a pressure chamber (Scholander et al., 1965), which is a common, robust and precise method to assess vine water status (Acevedo-Opazo et al., 2008). The model was mainly developed from SWP measurements on Syrah N (Laroche-Pinel, 2021).

A large scale monitoring campaign was organized in different vineyards in the Mediterranean region in 2021. Ten of the most widely planted varieties in this area were monitored (Cabernet sauvignon N, Chardonnay B, Cinsault N, Grenache N, Merlot N, Mourvèdre N, Sauvignon B, Syrah N, Vermentino B, Viognier B). The model was used to produce water status maps from Sentinel-2 images, from the beginning of June (fruit set) up to September (harvest). The average estimated SWP for each vine was compared to actual field SWP measurements made by wine growers or technicians during routine monitoring of irrigation programs. The correlations between mean estimated SWP and mean measured SWP were at the same level as expected from the model (Laroche-Pinel, 2021). The general SWP kinetics were comparable. The estimated SWP would have led to the same irrigation decisions as the measured SWP concerning the date of first irrigation.

Acevedo-Opazo, C., Tisseyre, B., Ojeda, H., Ortega-Farias, S., Guillaume, S. (2008). Is it possible to assess the spatial variability of vine water status? OENO One, 42(4), 203.
Cohen, Y., Gogumalla, P., Bahat, I., Netzer, Y., Ben-Gal, A., Lenski, I., … Helman, D. (2019). Can time series of multispectral satellite images be used to estimate stem water potential in vineyards? In Precision agriculture ’19, The Netherlands: Wageningen Academic Publishers, pp. 445–451.
Laroche-Pinel, E., Duthoit, S., Albughdadi, M., Costard, A. D., Rousseau, J., Chéret, V., & Clenet, H. (2021). Towards vine water status monitoring on a large scale using Sentinel-2 images. Remote Sensing, 13(9), 1837.
Laroche-Pinel, E. (2021). Suivi du statut hydrique de la vigne par télédétection hyper et multispectrale. Thèse INP Toulouse, France.
Scholander, P.F., Bradstreet, E.D., Hemmingsen, E.A., & Hammel, H.T. (1965). Sap pressure in vascular plants: Negative hydrostatic pressure can be measured in plants. Science, 148(3668), 339–346.

Grape berry size is a key factor in determining New Zealand Pinot noir wine composition

Making high quality but affordable Pinot noir (PN) wine is challenging in most terroirs, and New Zealand’s (NZ) situation is no exception. To increase the probability of making highly typical PN wines, producers choose to grow grapes in cool climates on lower fertility soils while adopting labour intensive practices. Stringent yield targets and higher input costs necessarily mean that PN wine cost is high, and profitability lower, in line-priced varietal wine ranges. To understand why higher yielding vines are perceived to produce wines of lower quality, we have undertaken an extensive study of PN in NZ. Since 2018, we have established a network of twelve trial sites in three NZ regions to find individual vines that produced acceptable commercial yields (above 2.5 kg per vine) and wines of composition comparable to “Icon” labels. Approximately 20% of 660 grape lots (N = 135) were selected from within a narrow juice Total Soluble Solids (TSS) range and made into single vine wines under controlled conditions. Principal Component Analysis of the vine, berry, juice and wine parameters from three vintages found grape berry mass to be the most effective clustering variable. As berry mass category decreased, there was a systematic increase in the probability of higher berry red colour and total phenolics, with a parallel increase in wine phenolics, a changed aroma fraction and decreased juice amino acids. The influence of berry size on wine composition appears to be stronger than the individual effects of vintage, region, vineyard or vine yield. Our observations support the hypothesis that it is possible to produce PN wines that fall within an “Icon” benchmark composition range at yields above 2.5 kg per vine, provided that the Leaf Area:Fruit Weight ratio is above 12 cm² per g, mean berry mass is below 1.2 g and juice TSS is above 22 °Brix.

Grapevine varietal diversity as mitigation tool for climate change: Agronomic and oenologic potential of 14 foreign varieties grown in Languedoc region (France)

Climate change effects in Languedoc include an expected rise in temperatures, increased evapotranspiration, and more severe and frequent climatic hazards such as frost, drought periods and heat waves. For winegrowers these phenomena affect both yield and quality, resulting in more frequent unbalanced wines. Research on identified mitigation tools for vineyard management is necessary to improve the resilience of grapevine agrosystems. Varietal assortment is one of them. This study focuses on the agronomic and oenologic potential of 14 foreign varieties grown in the Languedoc region of France. The fourteen varieties were monitored during 2021 from June until harvest on eight different sites; some varieties were present on more than one site, adding up to 21 different modalities: 7 white varieties, Alvarinho B, Assyrtiko B (2), Malvasia Istriana B, Parellada B, Verdejo B, Verdelho B, Xarello B, and 7 black varieties, Saperavi N (2), Touriga nacional N, Baga N, Aleatico N, Montepulciano N (2), Primitivo N (3), Calabrese N (3). Varieties were compared on the following parameters. Phenology was assessed using the information collected in the Database Network of French Vine Conservatories (INRAE-SupAgro-IFV, 2005-2015). The number of inflorescences on shoots from secondary buds, bourillons and suckers was observed to assess post-bud break frost tolerance potential. Grapevine water status was studied through stem water potential measurement, observation of foliage symptoms of drought, and δ13C of the must. To assess disease susceptibility, frequencies and intensities of downy mildew, powdery mildew and black rot attacks were estimated before harvest on leaves and clusters, and botrytis was estimated at harvest. Berry composition was monitored from the end of veraison until harvest. Yield and mean bunch weight were also calculated. Varieties were then ranked on a 1-4 scale for each parameter and compared through PCA.
Forty-two stations of the Mediterranean basin were compared by PCA with the Multicriteria Climatic Classification indicators, in order to test the information collected during the 2021 campaign against the hypothesis that plants coming from dry and hot regions are genetically adapted to such climatic conditions.

Protected Designation of Origin (D.P.O.) Valdepeñas: classification and map of soils

The objective of the work described here is the elaboration of a map of the different types of vineyard soils to guide farmers in the choice of the most productive vine rootstocks and varieties. 90 vineyard soil profiles were analysed across the entire territory of the Valdepeñas Designation of Origin. The sampling was carried out in 2018 (June to October) by means of a sampling grid, followed by photointerpretation and control in the field. The studied soils can be grouped into 9 different soil types (according to FAO 2006 classification): Leptosols, Regosols, Fluvisols, Gleysols, Cambisols, Calcisols, Luvisols and Anthrosols. A map showing the distribution of the different soil types has been made with the ArcGIS program. Regarding the choice of rootstock, Calcisols are soils with a high active limestone content, so the rootstocks used in these soils must be resistant to this parameter; Luvisols are deep soils with high clay content, so they will support vigorous rootstocks. Because the cartographic units are composed of two or more subgroups, which are associated in variable proportions, 9 different soil associations have been established; Unit 1: Leptosols, Cambisols and Luvisols (80%, 15% and 5% respectively); Unit 2: Cambisols with Regosols and Luvisols (40%, 30% and 30% respectively); Unit 3: Cambisols and Gleysols with Regosols (40%, 40% and 20% respectively); Unit 4: Regosols with Cambisols, Leptosols and Calcisols (40%, 30%, 15% and 15% respectively); Unit 5: Cambisols, Leptosols, Calcisols and Regosols (25% each); Unit 6: Luvisols with Cambisols and Calcisols (80%, 10% and 10% respectively); Unit 7: Luvisols and Calcisols with Cambisols (40%, 40% and 20% respectively); Unit 8: Calcisols with Cambisols and Luvisols (80%, 10% and 10% respectively); Unit 9: Anthrosols. This study allows the elaboration of the first map of vineyard soils of this Protected Designation of Origin in Castilla-La Mancha.