Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics and environmental and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account; they are inherently more comprehensive and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification of the decomposed tensor [1, 2].
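
The segmentation step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed window length and the optional overlap are assumptions made for illustration only, since the abstract does not state how segments are defined. A chromatogram is represented here as a list of scans ordered by retention time, each scan holding the intensities of the m/z channels. The multiway decomposition and the gradient boosted tree classification are not shown.

```python
from typing import List, Sequence

# One scan: intensities over the m/z channels at a single retention-time point.
Scan = Sequence[float]


def segment_chromatogram(
    scans: Sequence[Scan], segment_length: int, overlap: int = 0
) -> List[List[Scan]]:
    """Split one chromatogram into windows along the retention time axis.

    `segment_length` and `overlap` are counted in scans; both are
    hypothetical parameters chosen for this sketch.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    if not 0 <= overlap < segment_length:
        raise ValueError("overlap must satisfy 0 <= overlap < segment_length")

    step = segment_length - overlap
    segments: List[List[Scan]] = []
    for start in range(0, len(scans), step):
        segments.append(list(scans[start:start + segment_length]))
        # Stop once the current window has reached the end of the run.
        if start + segment_length >= len(scans):
            break
    return segments


def segment_samples(
    samples: Sequence[Sequence[Scan]], segment_length: int, overlap: int = 0
) -> List[List[List[Scan]]]:
    """Build one three-way array (samples x scans x m/z) per segment.

    All samples must contain the same number of scans, so that a given
    segment index covers the same retention time window in every sample.
    """
    if len({len(sample) for sample in samples}) > 1:
        raise ValueError("all samples must have the same number of scans")

    per_sample = [
        segment_chromatogram(sample, segment_length, overlap)
        for sample in samples
    ]
    # Transpose from [sample][segment] to [segment][sample].
    return [list(group) for group in zip(*per_sample)]
```

Calling `segment_samples(samples, segment_length=4)` on a set of equally long runs returns one nested list per retention time window. Each of these three-way arrays could then be passed to a multiway decomposition, which is why no peak picking or alignment is needed within the windows in this sketch.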

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open-source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Related articles…

Grapevine yield-gap: identification of environmental limitations by soil and climate zoning in Languedoc-Roussillon region (south of France)

Grapevine yield has historically been overlooked, on the assumption of a strong trade-off between grape yield and wine quality. At present, under the threat of climate change, many vineyards in Southern France are far from the quality label threshold, making grapevine yield-gaps a major subject of concern. Although yield-gaps are well studied in arable crops, very little is known about grapevine yield-gaps. In the present study, we analysed the environmental component of grapevine yield-gaps linked to climate and soil resources in Languedoc-Roussillon. We used SAFRAN data and IGP Pays d’Oc wine yields from 2010 to 2018. We selected climate and soil indicators that had a significant effect on average wine yield-gaps at the municipality scale. The most significant factor for grapevine yield was the Soil Available Water Capacity, followed by the Huglin Index and the Climatic Dryness Index. The Days of Frost, the Soil pH, and the Very Hot Days were also significant. We then clustered geographical zones with similar indicators, which facilitated the identification of resource yield-gaps. We discussed the number of zones with the experts of the IGP Pays d’Oc label, obtaining 7 zones with similar limitations for grapevine yield. Finally, we analysed the main resources causing yield-gaps and the grapevine varieties planted in each zone. Mapping grapevine resource yield-gaps is the first stage in understanding grapevine yield-gaps at the regional scale.

Updating the Winkler index: An analysis of Cabernet sauvignon in Napa Valley’s varied and changing climate

This study aims to create an updated, agile viticultural climate index (similar to the Winkler Index) by performing in-depth analyses of current and historical data from industry partners in several major winegrowing regions. The Winkler Index was developed in the early twentieth century based on analysis of various grape-growing regions in California. The index uses heat accumulation (i.e. Growing Degree Days) throughout the growing season to determine which grape varieties are best suited to each region. As viticultural regions are increasingly subject to the complexity and uncertainty of a changing climate, a more rigorous, agile model is needed to aid grape growers in determining which cultivars to plant where. For the first phase of this study, 21 industry partners throughout Napa Valley shared historical phenology, harvest, viticultural practice, and weather data related to their Cabernet sauvignon vineyard blocks. To complement this data, berry samples were collected throughout the 2021 growing season from 50 vineyard blocks located throughout 16 American Viticultural Areas that were then analyzed for basic berry chemistry and phenolics. These blocks have been mapped using a Geographic Information System (GIS), enabling analysis of altitude, vineyard row orientation, slope, and remotely sensed climate data. Sampling sites were also chosen based on their proximity to a weather station. By analyzing historical data from industry partners and data specifically collected for this study, it is possible to identify key parameters for further analysis. Initial results indicate extreme variability at a high spatial resolution not currently accounted for in modern viticultural climate indices and suggest that viticultural practices play a major role. Using the structure of data collection and analyses developed for the first phase, this project will soon be expanded to other wine regions globally, while continuing data collection in Napa Valley.

Impact of changes in pruning practices on vine growth and yield

A gradual decline in vineyards has been observed worldwide over the past twenty years. This might be explained by climate change, changes in practices, or the increase in dieback diseases. To increase the longevity of vines, we studied the impact of different pruning strategies in four adult and four young vineyards located in France and Spain. In France, vineyards were planted with Cabernet franc on 3309C, while Spanish trials were planted with Tempranillo grafted on 110R. Vegetative expression, yield, berry quality and wood vessel conductivity were measured. The distribution of vegetative expression, yield and berry composition between primary and secondary vegetation was quantified. Finally, tomography was used to evaluate the effect of the treatments on sap flows.
First results show that i) respectful pruning leads to 30 to 50% more secondary shoots than aggressive pruning in France and 15 to 20% more in Spain, and ii) there is no major effect on yield over the first two years following the implementation of the new pruning practices, although the proportion of clusters from suckers is higher with the respectful pruning method. On young vines, developing the trunk according to respectful pruning leads to a loss of harvest 2 years after planting. This is due to the removal, on the future trunk, of the green suckers that carry bunches. This operation, carried out in spring rather than during winter pruning, would promote a better leaf/fruit balance when the plant comes into production and could lead to better hydraulic conduction in the vessels of the trunk. Maintaining these trials for several years will provide more robust data to assess the impact of these practices on the vines over the long term.

The use of rootstock as a lever in the face of climate change and dieback of vineyard

As viticulture faces challenges such as climate change and vineyard dieback, the choice of variety and rootstock becomes increasingly crucial. To study rootstock levers in the Bordeaux region, a parcel of Cabernet Sauvignon (CS) was planted with four rootstocks in 2014. Twenty repetitions of each of the following four rootstocks were set up: 101-14 MGt, Nemadex AB, 420A MGt and Gravesac. The number of bunches, yields and pruning weights of the vine shoots were measured individually on 240 vines from 2017 to 2021. Since 2020, nitrogen status (assessed by assimilable nitrogen level), water status (assessed by δ13C) and berry maturity have been measured on 80 samples taken from the 20 repetitions of the four rootstocks. A lower yield was measured for CS grafted onto Nemadex AB, due to a lower number of bunches and a lower berry weight. The differences between the other three rootstocks were small, but CS grafted onto 420A MGt was the most productive. CS grafted onto Nemadex AB had the lowest pruning weight, while 101-14 MGt had the highest. In 2020, δ13C showed more moderate water stress with 101-14 MGt and 420A MGt than with Nemadex AB. Surprisingly, Gravesac was under more stress than 101-14 MGt. The nitrogen status of the berries was better for Nemadex AB, but this was perhaps due to the significantly lower berry weight. Rootstock 101-14 MGt attained the highest accumulation of sugars in the berries, while 420A MGt preserved higher acidity. The parcel is still young, which may explain some of the results. These measurements must therefore be continued over the next several years to fully assess the effects of these rootstocks on the development of the vines and the quality of the production under new climatic conditions.

Heatwaves and grapevine yield in the Douro region, crop model simulations

Heatwaves or extreme heat events can be particularly harmful to agriculture. Grapevines grown in the Douro winemaking region are particularly exposed to this threat because of the already warm and dry climatic conditions. Furthermore, climate change simulations point to an increase in the frequency of these extreme heat events, posing a major challenge to winegrowers in Mediterranean-type climates. The current study focuses on the application of the STICS crop model to assess the potential impacts of heatwaves on grapevine yields in the Douro valley winemaking region. For this purpose, STICS was applied to grapevines using high-resolution weather, soil and terrain datasets over the Douro. To assess the impact of heatwaves, the weather dataset (1989-2005) was artificially modified, generating periods with anomalously high temperatures (+5 °C) at certain onset dates and with specific durations (from 5 to 9 days). The model was run with this modified weather dataset and the results were compared to the original unmodified runs. The results show that heatwaves can have a very strong impact on grapevine yields, strongly depending on their onset dates and duration. The strongest impacts may reduce yield by up to 35% in some regions. Despite some uncertainties inherent to the current modelling assessment, the present study highlights the negative impacts of heatwaves on viticultural yields in the Douro region, which is critical information for stakeholders within the winemaking sector when planning suitable adaptation measures.