Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many different domains of analytical chemistry such as metabolomics, environmental and food analysis. In contrast to targeted analysis, non-targeted approaches take information of known and unknown compounds into account, are inherently more comprehensive and give a more holistic representation of the sample composition. 

Besides chromatographic techniques coupled to high resolution mass spectrometry such as LC-HRMS, gas chromatography with unit resolution mass spectrometry is still regularly utilized for non-targeted profiling or fingerprinting. This is mainly due to high separation power of GC and a wide availability and low costs of quadrupole mass spectrometers. 

Although several non-targeted approaches have been developed, data processing still remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment can be prone to errors and time-consuming manual corrections are often necessary. We therefore developed an automated strategy to non-targeted GC-MS data avoiding feature detection and retention time alignment. The novel automated approach includes segmentation of chromatograms along the retention time axis, multiway decomposition of transformed segments followed by a supervised machine learning pipeline based on gradient boosted tree classification on the decomposed tensor [1, 2]. 

In order to make this novel data analysis strategy available to scientists without programming background, we developed a convenient browser based application. For the here presented interactive browser application the open source Python packages Bokeh and HoloViews were used. The application will be online freely available soon. 

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Acta Chimica Acta 911 (2016) 42-58 
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

DOI:

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’ Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Contact the author

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Citation

Related articles…

Soil quality in Beaujolais vineyard. Importance of pedology and cultural practices

A pedological study was carried out from 2009 to 2017 in Beaujolais vineyard, to improve physical and chemical knowledge of soils. It was completed in 2016 and 2017 by the current study, dealing with microbial aspects, in order to build a reference frame for improved advice in soil management. Microbial biomass was measured on representative plots of the six most common soil types identified in Beaujolais and, for each soil type, on plots with different levels of the main impacting parameters: total organic carbon, pH, cation exchange capacity, extractable copper. A total of 59 soil samples were collected. Confirming the results of various trials carried out in Beaujolais over the past 20 years, the results of the present study showed that the soils were still alive, but exhibited a large variability of biological parameters, which appeared dependant on both pedological and anthropic factors. Therefore, a good interpretation of biological parameters and advice for vine growers must rely on a pedologically-based referential with differentiated main driving factors. For example, the control of pH is of primary importance in granitic soils and in no way organic matter addition can improve soil quality if pH is too low. Conversely, in calcareous soils, biological parameters are more directly affected by direct or indirect (cover crops for example) inputs of organic matter. The use of biological parameters, such as microbial biomass, is of great potential value to improve advice on agro-viticultural practices (soil management, fertilization, liming, etc.), basis of a sustainable wine production on fragile soils.

Organic recycled mulches in sustainable viticulture: assessment of spontaneous plants communities and weed coverage

In recent years, developing more efficient and sustainable viticulture management has been essential due to the impact of climate change in semiarid regions. For this reason, the use of recycled organic mulching (ROM) in the vineyard has become an interesting strategy to cope with water stress, isolated soil from extreme temperatures and improving soil humidity, control the presence of weeds and therefore reduce the inputs of herbicides and improve soil fertility. This work aimed to analyse the effect of three different organic mulches [straw (S), grape pruning debris (GPD) and spent mushroom compost (SMC)] and two traditional soil management techniques [herbicide (H) and interrow (IN)] on weed coverage and the spontaneous plant communities’ presence. Data sampling was collected throughout the vine vegetative cycle of 2021 in La Rioja, Spain. The different soil management techniques had a clear effect on weed coverage and his development during the vine vegetative cycle. SMC and H were the treatments with the highest and the lowest coverage percentage, respectively. IN had a delayed weed emergence at the beginning of the vine vegetative cycle, but finally it reached maximum values nearby SMC. GPD and S had similar effects on weed emergence, reaching 25-30% of the maximum coverage values. A total of 29 herbaceous species were identified during the vegetative cycle, some of them very isolated and occasional. Principal component analysis (PCAs) showed a good association between spontaneous species and treatments, furthermore, specific species-treatment associations were found. Moreover, three clear groups of herbaceous communities were identified by cluster analysis. This study provides interesting information about the effect of different alternative soil management on herbaceous plant coverage and weed species communities which could contribute to making more sustainable viticulture.

Anthocyanin profile is differentially affected by high temperature, elevated CO2 and water deficit in Tempranillo (Vitis vinifera L.) clones

Anthocyanin potential of grape berries is an important quality factor in wine production. Anthocyanin concentration and profile differ among varieties but it also depends on the environmental conditions, which are expected to be greatly modified by climate change in the future. These modifications may significantly modify the biochemical composition of berries at harvest, and thus wine typicity. Among the diverse approaches proposed to reduce the potential negative effects that climate change may have on grape quality, genetic diversity among clones can represent a source of potential candidates to select better adapted plant material for future climatic conditions. The effects of individual and combined factors associated to climate change (increase of temperature, rise of air CO2 concentration and water deficit) on the anthocyanin profile of different clones of Tempranillo that differ in the length of their reproductive cycle were studied. The aim was to highlight those clones more adapted to maintain specific Tempranillo typicity in the future. Fruit-bearing cuttings were grown in controlled conditions under two temperatures (ambient temperature versus ambient temperature + 4ºC), two CO2 levels (400 ppm versus 700 ppm) and two water regimes (well-watered versus water deficit), both in combination or independently, in order to simulate future climate change scenarios. Elevated temperature increased anthocyanin acylation, whereas elevated CO2 and water deficit favoured the accumulation of malvidin derivatives, as well as the acylation and tri-hydroxylation level of anthocyanins. Although the changes in anthocyanin profile observed followed a common pattern among clones, such impact of environmental conditions was especially noticeable in one of the most widely distributed Tempranillo clones, the accession RJ43.

Rapid damage assessment and grapevine recovery after fire

There is increasing scientific consensus that climate changeis the underlying cause of the prolonged dry and hot conditions that have increased the risk of extreme fire weather in many countries around the world. In December 2019, a bushfire event occurred in the Adelaide Hills, South Australia where 25,000 hectares were burnt and in vineyards and surrounding areas various degrees of scorching and infrastructure damage occurred. The ability to coordinate and plan recovery after a fire event relies on robust and timely data. The current practice for measuring the scale and distribution of fire damage is to walk or drive the vineyard and score individual vines based on visual observation. The process is time consuming, subjective, or semi-quantitative at best. After the December 2019 fires, it took many months to access properties and estimate the area of vineyard damaged. This study compares the rapid assessment and mapping of fire damage using high-resolution satellite imagery with more traditional ground based measures. Satellite imagery tracking vineyard recovery in the season following the bushfire is being correlated to field assessments of vineyard productivity such as canopy health and development, fertility and carbohydrate storage. Canopy health in the seasons following the fires correlated to the severity of the initial fire damage. Severely damaged vines had reduced canopy growth, were infertile or had very low fertility as well as lower carbohydrate levels in buds and canes during dormancy, which reduced productivity in the seasons following the bushfire event. In contrast, vines that received minor damage were able to recover within 1-2 years. Tools that rapidly and affordably capture the extent and severity of damage over large vineyard area will allow producers, government and industry bodies to manage decisions in relation to fire recovery planning, coordination and delivery, improving the efficiency and effectiveness of their response.

Underpinning terroir with data: rethinking the zoning paradigm

Agriculture, natural resource management and the production and sale of products such as wine are increasingly data-driven activities. Thus, the use of remote and proximal crop and soil sensors to aid management decisions is becoming commonplace and ‘Agtech’ is proliferating commercially; mapping, underpinned by geographical information systems and complex methods of spatial analysis, is widely used. Likewise, the chemical and sensory analysis of wines draws on multivariate statistics; the efficient winery intake of grapes, subsequent production of wines and their delivery to markets relies on logistics; whilst the sales and marketing of wines is increasingly driven by artificial intelligence linked to the recorded purchasing behaviour of consumers. In brief, there is data everywhere!

Opinions will vary on whether these developments are a good thing. Those concerned with the ‘mystique’ of wine, or the historical aspects of terroir and its preservation, may find them confronting. In contrast, they offer an opportunity to those interested in the biophysical elements of terroir, and efforts aimed at better understanding how these impact on vineyard performance and the sensory attributes of resultant wines. At the previous Terroir Congress, we demonstrated the potential of analytical methods used at the within-vineyard scale in the development of Precision Viticulture, in contributing to a quantitative understanding of regional terroir. For this conference, we take this approach forward with examples from contrasting locations in both the northern and southern hemispheres. We show how, by focussing on the vineyards within winegrowing regions, as opposed to all of the land within those regions, we might move towards a more robust terroir zoning than one derived from a mixture of history, thematic mapping, heuristics and the whims of marketers. Aside from providing improved understanding by underpinning terroir with data, such methods should also promote improved management of the entire wine value chain.