Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics and environmental and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification of the decomposed tensor [1, 2].
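The first step of the strategy, segmentation along the retention time axis, can be illustrated with a minimal sketch. It assumes a chromatogram stored as a list of scans (one intensity vector per retention time point) and uses fixed-width windows with an optional overlap. The function name, the window width, and the overlap are hypothetical choices for illustration and are not taken from the cited papers; the decomposition and classification steps are not shown.

```python
from typing import List, Sequence, Tuple

Scan = Sequence[float]  # one mass spectrum: intensity per m/z channel


def segment_chromatogram(
    scans: Sequence[Scan], width: int, overlap: int = 0
) -> List[Tuple[int, int, List[Scan]]]:
    """Split a chromatogram (scans ordered by retention time) into windows.

    Each window holds up to `width` scans; consecutive windows share
    `overlap` scans. Returns (start, end_exclusive, scans_in_window) tuples.
    Hypothetical helper, not the authors' implementation.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not 0 <= overlap < width:
        raise ValueError("overlap must satisfy 0 <= overlap < width")
    step = width - overlap
    segments = []
    start = 0
    while start < len(scans):
        end = min(start + width, len(scans))
        segments.append((start, end, list(scans[start:end])))
        if end == len(scans):
            break  # the last window reached the end of the run
        start += step
    return segments


# Toy chromatogram: 10 scans with 3 m/z channels each (made-up values).
toy = [[float(i), float(i) * 2, 0.0] for i in range(10)]
segments = segment_chromatogram(toy, width=4, overlap=1)
bounds = [(start, end) for start, end, _ in segments]
```

Stacking the same window from every sample gives a three-way array (samples × scans × m/z channels) per segment, which is the kind of input a multiway decomposition operates on.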

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open-source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58.
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005.

DOI:

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Related articles…

Protected Designation of Origin (D.P.O.) Valdepeñas: classification and map of soils

The objective of this work is to produce a map of the different types of vineyard soils to guide farmers in choosing the most productive vine rootstocks and varieties. Ninety vineyard soil profiles were analysed across the entire territory of the Valdepeñas Designation of Origin. Sampling was carried out in 2018 (June to October) on a sampling grid, followed by photointerpretation and field checks. The studied soils can be grouped into 9 different soil types (according to the FAO 2006 classification): Leptosols, Regosols, Fluvisols, Gleysols, Cambisols, Calcisols, Luvisols and Anthrosols. A map of the distribution of the soil types was produced with ArcGIS. Regarding the choice of rootstock, Calcisols have a high active limestone content, so rootstocks used on these soils must tolerate it; Luvisols are deep soils with a high clay content, so they will support vigorous rootstocks. Because the cartographic units are composed of two or more subgroups associated in variable proportions, 9 soil associations have been established: Unit 1: Leptosols, Cambisols and Luvisols (80%, 15% and 5% respectively); Unit 2: Cambisols with Regosols and Luvisols (40%, 30% and 30% respectively); Unit 3: Cambisols and Gleysols with Regosols (40%, 40% and 20% respectively); Unit 4: Regosols with Cambisols, Leptosols and Calcisols (40%, 30%, 15% and 15% respectively); Unit 5: Cambisols, Leptosols, Calcisols and Regosols (25% each); Unit 6: Luvisols with Cambisols and Calcisols (80%, 10% and 10% respectively); Unit 7: Luvisols and Calcisols with Cambisols (40%, 40% and 20% respectively); Unit 8: Calcisols with Cambisols and Luvisols (80%, 10% and 10% respectively); Unit 9: Anthrosols. This study makes it possible to produce the first map of vineyard soils of this Protected Designation of Origin in Castilla-La Mancha.

Impact of long term agroecological and conventional practices on subsurface soil microbiota in Macabeu and Xarel·lo vineyards

There is a growing trend toward the transition from conventional to agroecological vineyard management. However, the impact of practices such as reduced tillage, organic fertilization and cover crops on soil microbial diversity, and its relationship with soil physicochemical properties at subsurface depth near the rooting zone, is not well understood. Soil bacterial diversity is an important contributor to plant health, productivity and response to environmental stresses. A field experiment was conducted by sampling the subsurface soil bacterial community (NGS and qPCR) near the root zone of Macabeu and Xarel·lo vineyards located in the Penedès. Three organic (ECO) and three conventional (CON) vineyards, each under its respective management for more than 10 years, were sampled (n=5 per plot). ECO practices did not affect bacterial and fungal abundance but significantly increased ammonia-oxidizing bacteria and alpha-diversity (inverse Simpson index). Interestingly, beta-diversity was significantly affected by the management strategy. ANOSIM tests revealed a significant effect of management (ecological vs conventional) and of plot on the soil microbial structure (ASV abundance). The main phyla found were Proteobacteria, Actinobacteria and Acidobacteria, whose relative abundances were not affected by management. EdgeR analysis revealed a significant increase of Cyanobacteria and a decrease of the Gemmatimonadetes and Firmicutes phyla in ECO. Interestingly, grapevine variety was not correlated with the soil microbial community structure. A Mantel test revealed a strong correlation (Spearman) of some physicochemical parameters with the soil microbiota structure, in order of importance: texture, EC, pH, Ca/Mg, Mg/P, K⁺, Mg²⁺, Ca²⁺, SO₄²⁻, and OM. N-NH4 and NTK, which were higher in the ECO-managed soils, did not correlate significantly with the soil microbiome population.
The results revealed the importance of combining a deep physicochemical characterization of each replicate with the microbial diversity assessment to gain better insights into the relationship between soil microbiome and vineyard management.
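For readers unfamiliar with the alpha-diversity measure mentioned above, the following minimal sketch computes an inverse Simpson index from a vector of ASV counts. It assumes the common proportion-based form, 1 / Σ pᵢ², which may differ from the estimator the authors used; the function name and the example counts are hypothetical.

```python
from typing import Sequence


def inverse_simpson(counts: Sequence[float]) -> float:
    """Inverse Simpson index 1 / sum(p_i^2) from per-taxon counts.

    Assumes the proportion-based form; the abstract does not state
    which variant was applied.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


# Made-up counts: an even community and a fully dominated one.
even_community = inverse_simpson([10, 10, 10, 10])
dominated_community = inverse_simpson([100, 0, 0])
```

The index equals the number of taxa when all are equally abundant and falls toward 1 as a single taxon dominates.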

Variations of soil attributes in vineyards influence their reflectance spectra

Knowledge of the reflectance spectrum of soil is potentially useful because it carries information on soil chemical composition that can be used for planning agricultural practices. Compared with analytical methods such as conventional chemical analysis, reflectance measurement provides non-destructive, economical, near real-time data. This paper reports results from reflectance measurements performed by spectroradiometry on soils from two vineyards in south Brazil. The vineyards are close to each other and lie on different geological formations, but were subjected to the same management. The objective was to detect spectral differences between the two areas and to correlate these differences with variations in their chemical composition, in order to assess the technique’s potential to predict soil attributes from reflectance data. To that end, soil samples were collected from ten selected vine parcels. Chemical analysis yielded data on the concentration of twenty-one soil attributes, and spectroradiometry was performed on the samples. Chemical differences between the two studied areas that were significant at the 95% confidence level were found for six soil attributes, and the average reflectance spectra were separated at this same level along most of the observed spectral domain. Correlations between soil reflectance and concentrations of soil attributes were sought, and for ten soil traits it was possible to define wavelength domains where reflectance and concentrations are correlated at confidence levels from 95% to 99%. Partial Least Squares Regression (PLSR) analyses were performed comparing measured and predicted concentrations, and for fifteen out of 21 soil traits we found Pearson correlation coefficients r > 0.8. These preliminary results, which have to be validated, suggest that variations of concentration in the investigated soil attributes induce differences in reflectance that can be detected by spectroradiometry.
Applications of these observations include the assessment of the chemical content of soils by spectroradiometry as a fast, low-cost alternative to chemical analytical methods.
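The comparison of measured and predicted concentrations reported above relies on the Pearson correlation coefficient. A minimal sketch of that calculation follows; the function name and the concentration values are made up for illustration, and the PLSR model itself is not reproduced.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measured vs. predicted concentrations of one soil attribute.
measured = [1.2, 2.4, 3.1, 4.8, 5.0]
predicted = [1.0, 2.9, 3.0, 4.1, 5.6]
r = pearson_r(measured, predicted)
meets_threshold = r > 0.8  # the threshold quoted in the abstract
```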

Underpinning terroir with data: rethinking the zoning paradigm

Agriculture, natural resource management and the production and sale of products such as wine are increasingly data-driven activities. Thus, the use of remote and proximal crop and soil sensors to aid management decisions is becoming commonplace and ‘Agtech’ is proliferating commercially; mapping, underpinned by geographical information systems and complex methods of spatial analysis, is widely used. Likewise, the chemical and sensory analysis of wines draws on multivariate statistics; the efficient winery intake of grapes, subsequent production of wines and their delivery to markets relies on logistics; whilst the sales and marketing of wines is increasingly driven by artificial intelligence linked to the recorded purchasing behaviour of consumers. In brief, there is data everywhere!

Opinions will vary on whether these developments are a good thing. Those concerned with the ‘mystique’ of wine, or the historical aspects of terroir and its preservation, may find them confronting. In contrast, they offer an opportunity to those interested in the biophysical elements of terroir, and efforts aimed at better understanding how these impact on vineyard performance and the sensory attributes of resultant wines. At the previous Terroir Congress, we demonstrated the potential of analytical methods used at the within-vineyard scale in the development of Precision Viticulture, in contributing to a quantitative understanding of regional terroir. For this conference, we take this approach forward with examples from contrasting locations in both the northern and southern hemispheres. We show how, by focussing on the vineyards within winegrowing regions, as opposed to all of the land within those regions, we might move towards a more robust terroir zoning than one derived from a mixture of history, thematic mapping, heuristics and the whims of marketers. Aside from providing improved understanding by underpinning terroir with data, such methods should also promote improved management of the entire wine value chain.

Adaptability of grapevines to climate change: characterization of phenology and sugar accumulation of 50 varieties, under hot climate conditions

Climate is the major factor influencing the dynamics of the vegetative cycle and can determine the timing of phenological periods. Knowledge of the phenology of varieties, their chronological duration, and their thermal requirements allows not only for better management of interventions in the vineyard, but also for predicting the varieties’ behaviour in a scenario of climate change, giving the wine producer the possibility of selecting the grape varieties best adapted to the climatic conditions of a certain terroir. In 2014, Symington Family Estates, Vinhos, established two grape variety libraries in two places with distinct climate conditions (Douro Superior and Cima Corgo), with the commitment of contributing to a deeper agronomic and oenological understanding of some grape varieties in hot climate conditions. These research vineyards include local varieties that are important in regional and national viticulture, others that have been forgotten over time, and five international reference cultivars. From 2017 to 2021, phenological observations were made three times a week, following a defined protocol, to determine the average dates of budbreak, flowering and veraison. With the climate data of each location, the thermal requirements of each variety and the chronological duration of each phase were calculated. During maturation, berry samples were gathered weekly to study the dynamics of sugar accumulation, among other parameters. The data were analysed by applying phenological and sugar accumulation models available in the literature. The results show significant differences between the varieties for several parameters, from the chronological duration and thermal requirements to complete the various stages of development to the differences between the two locations, confirming the influence of climate on phenology and the stages of maturation in these specific conditions.
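Thermal requirements of a phenological phase are often expressed as accumulated growing degree days. The sketch below assumes the simple daily-mean method with a base temperature of 10 °C, a common convention for grapevine; the abstract does not state which thermal model was applied, and the function name and temperature values are hypothetical.

```python
from typing import Iterable, Tuple


def growing_degree_days(
    daily_min_max: Iterable[Tuple[float, float]], base: float = 10.0
) -> float:
    """Sum of daily mean temperature above `base` (in degrees Celsius).

    Assumed model: daily mean = (Tmin + Tmax) / 2; days with a mean
    below the base temperature contribute zero.
    """
    total = 0.0
    for t_min, t_max in daily_min_max:
        daily_mean = (t_min + t_max) / 2.0
        total += max(0.0, daily_mean - base)
    return total


# Made-up (Tmin, Tmax) values for three days between two phenological stages.
phase_temperatures = [(8.0, 18.0), (10.0, 24.0), (5.0, 11.0)]
gdd = growing_degree_days(phase_temperatures)
```

Summing this quantity between the observed dates of two stages, for example budbreak and flowering, gives one estimate of the heat a variety needs to complete that phase.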