Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics, environmental analysis, and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment can be prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification of the decomposed tensor [1, 2].
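
As a rough illustration of the first two steps, the following Python sketch segments a sample × scan × m/z array along the retention time axis and derives per-sample scores for each segment. It is not the implementation from [1, 2]: the function names, the fixed segment width, and the use of a truncated SVD of the unfolded segment in place of a true multiway decomposition are assumptions made for illustration, and the data are random numbers.

```python
import numpy as np


def segment_chromatograms(data, segment_width):
    """Split a (samples, scans, m/z) array into windows along the scan axis.

    The last window is shorter when the number of scans is not a
    multiple of segment_width.
    """
    n_scans = data.shape[1]
    return [
        data[:, start:start + segment_width, :]
        for start in range(0, n_scans, segment_width)
    ]


def segment_scores(segment, n_components):
    """Per-sample scores from a truncated SVD of the unfolded segment.

    Stand-in (assumption) for a multiway decomposition: the segment is
    unfolded to (samples, scans * m/z), so the scan and m/z modes are
    not modelled separately.
    """
    n_samples = segment.shape[0]
    unfolded = segment.reshape(n_samples, -1)
    u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    k = min(n_components, s.size)
    return u[:, :k] * s[:k]


def build_feature_matrix(data, segment_width, n_components):
    """Concatenate the per-segment scores into one (samples, features) matrix."""
    segments = segment_chromatograms(data, segment_width)
    return np.hstack([segment_scores(seg, n_components) for seg in segments])


# Synthetic example: 6 samples, 50 scans, 20 m/z channels (random, not real data).
rng = np.random.default_rng(0)
data = rng.random((6, 50, 20))
features = build_feature_matrix(data, segment_width=20, n_components=2)
```

The resulting matrix has one row per sample. Together with class labels, it could be passed to a gradient boosted tree classifier, for example scikit-learn's `GradientBoostingClassifier`, to approximate the supervised step.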

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open-source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019
