Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics, environmental analysis, and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment can be prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The novel automated approach comprises segmentation of the chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification on the decomposed tensor [1, 2].
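
The segmentation and decomposition steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic data, and the parameter values are hypothetical, and a truncated SVD of the unfolded segment is used only as a simple stand-in for the multiway decomposition, whose exact model is not specified here. The data are assumed to be arranged as a three-way array of samples × scans (retention time) × m/z channels.

```python
import numpy as np


def segment_chromatograms(data, n_segments):
    """Split a (samples, scans, m/z) array into windows along the retention time axis."""
    return np.array_split(data, n_segments, axis=1)


def decompose_segment(segment, n_components):
    """Stand-in for the multiway decomposition (assumption, not the published model).

    The segment is unfolded to (samples, scans * m/z) and reduced by a
    truncated SVD. Returns per-sample scores of shape (samples, n_components).
    """
    n_samples = segment.shape[0]
    unfolded = segment.reshape(n_samples, -1)
    u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def build_feature_table(data, n_segments, n_components):
    """Concatenate the per-segment scores into one (samples, features) table."""
    segments = segment_chromatograms(data, n_segments)
    scores = [decompose_segment(seg, n_components) for seg in segments]
    return np.hstack(scores)


# Synthetic example data: 6 samples, 120 scans, 40 m/z channels (hypothetical values).
rng = np.random.default_rng(0)
data = rng.random((6, 120, 40))

features = build_feature_table(data, n_segments=4, n_components=2)
print(features.shape)  # (6, 8): 4 segments x 2 components per sample
```

Because each segment is decomposed on its own, no peak picking or alignment across samples is needed in this sketch. The resulting feature table, together with class labels, could then be passed to a gradient boosted tree classifier, for example scikit-learn's GradientBoostingClassifier.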

To make this novel data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019
