Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics, environmental analysis, and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach segments chromatograms along the retention time axis, applies multiway decomposition to the transformed segments, and then runs a supervised machine learning pipeline based on gradient boosted tree classification on the decomposed tensor [1, 2].
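The first two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the equal-width segmentation, and the synthetic data are assumptions made for illustration, and a truncated SVD of the unfolded segment stands in for the multiway decomposition described in [1, 2]. The resulting score table is the kind of input a gradient boosted tree classifier would receive.

```python
import numpy as np

def segment_chromatograms(data, n_segments):
    """Split a (samples, scans, m/z) array into equal-width retention-time windows.

    Equal-width windows are an assumption made for this sketch.
    """
    n_samples, n_scans, n_mz = data.shape
    width = n_scans // n_segments              # scans per segment; any remainder is dropped
    trimmed = data[:, : width * n_segments, :]
    # Resulting shape: (segments, samples, scans_per_segment, m/z)
    return trimmed.reshape(n_samples, n_segments, width, n_mz).transpose(1, 0, 2, 3)

def segment_scores(segment, n_components):
    """Stand-in decomposition: truncated SVD of the sample-unfolded segment.

    This is NOT the multiway decomposition used in the original work; it only
    shows how each segment is reduced to a few per-sample scores.
    """
    n_samples = segment.shape[0]
    unfolded = segment.reshape(n_samples, -1)  # one row per sample
    u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def build_feature_table(data, n_segments=4, n_components=2):
    """Concatenate the per-segment scores into one row of features per sample."""
    segments = segment_chromatograms(data, n_segments)
    return np.hstack([segment_scores(seg, n_components) for seg in segments])

# Synthetic stand-in for GC-MS data: 6 samples, 40 scans, 5 m/z channels
rng = np.random.default_rng(0)
synthetic = rng.random((6, 40, 5))
features = build_feature_table(synthetic)      # one row per sample, 4 x 2 columns
```

Because each segment is decomposed on its own, no peak has to be detected or matched across samples; only the segment boundaries are shared.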

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application was built with the open source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

DOI:

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Related articles…

Impact of yeast derivatives to increase the phenolic maturity and aroma intensity of wine

Using viticultural and enological techniques to increase aromatics in white wine is a prized yet challenging goal for commercial wine producers. Hastening phenolic maturity, and thereby increasing color intensity in red wines, is equally difficult. The ability to alter organoleptic and visual properties of wines plays a decisive role in vintages in which grapes do not reach full maturity, which occurs increasingly often as a result of climate change. A new, yeast-based product on the viticultural market may offer the opportunity to enhance the sensory properties of finished wines. The manufacturer's packaging claims that these yeast derivatives intensify wine aromas of white grape varieties and improve the phenolic ripeness of red varieties, but the effects of this application have been little researched until now. The current study applied the yeast derivative, according to the manufacturer’s instructions, to the leaves of both neutral and aromatic white wine varieties, as well as structured red wine varieties. Chemical parameters and volatile aromatics were analyzed in grape musts and finished wines, and all wines were subjected to sensory analysis by a tasting panel. Taken together, the analyses showed that applying the yeast derivative in the vineyard had no effect in any of the varieties examined: it neither intensified white wine aromatics nor improved phenolic ripeness and color intensity in red wine.

Co-design and evaluation of spatially explicit strategies of adaptation to climate change in a Mediterranean watershed

Climate change challenges wine growing systems differently, depending on their biophysical, sociological and economic features. Therefore, there is a need to locally design and evaluate adaptation strategies that combine several technical options and consider local opportunities and constraints (e.g. water access, wine typicity). The case study took place in a typical and heterogeneous Mediterranean vineyard of 1,500 ha in the South of France. We developed a participatory modeling approach to (1) conceptualize local climate change issues and design spatially explicit adaptation strategies with stakeholders, (2) numerically evaluate their effects on phenology, yield and irrigation needs under the high-emissions climate change scenario RCP 8.5, and (3) collectively discuss simulation results. We organized five sets of workshops, with modeling phases in between. We developed a process-based model to evaluate the effects of six technical options (late varieties, irrigation, water saving by reducing canopy size, adjusting cover cropping, reducing density, and shading) with various distributions in the watershed, as well as vineyard relocation. Overall, we co-designed three adaptation strategies. The delayed harvest strategy, based on late varieties, had little effect on decreasing air temperature during ripening. The water constraint limitation strategy would compensate for production losses if disruptive adaptations (e.g. reduced density) were adopted and more land gained access to irrigation. The relocation strategy would foster high premium wine production in the constrained mountainous areas, where grapevine is less affected by climate change. This research shows that a spatial distribution of technical changes gives room for adaptation to climate change, and that collaboration with local stakeholders is key to identifying relevant adaptations.
Further research should explore the potential of adaptation strategies based on soil quality improvement and on water stress tolerant varieties.

Comparison of imputation methods in long and varied phenological series. Application to the Conegliano dataset, including observations from 1964 over 400 grape varieties

A large varietal collection of over 1700 varieties has been maintained in Conegliano, Italy, since the 1950s. Phenological data on a subset of 400 grape varieties, including wine grapes, table grapes, and raisins, have been acquired at budbreak, flowering, veraison, and ripening since 1964. Despite the efforts in maintaining and acquiring data over such an extensive collection, the dataset has varying degrees of missing cases depending on the variety and the year. This is ubiquitous in phenology datasets of significant size and length. In this work, we evaluated four state-of-the-art methods to estimate missing values in this phenological series: k-Nearest Neighbour (kNN), Multivariate Imputation by Chained Equations (mice), MissForest, and Bidirectional Recurrent Imputation for Time Series (BRITS). For each phenological stage, we evaluated the performance of the methods in two ways. 1) On the full dataset, we randomly held out 10% of the true values for use as a test set and repeated the process 1000 times (Monte Carlo cross-validation). 2) On a reduced and almost complete subset of varieties, we varied the percentage of missing values from 10% to 70% by random deletion. In all cases, we evaluated the performance on the original values using normalized root mean squared error. For the full dataset we also obtained performance statistics by variety and by year. MissForest provided average errors of 17% (3 days) at budbreak, 14% (4 days) at flowering, 14.5% (7 days) at veraison, and 17% (3 days) at maturity. We completed the imputations of the Conegliano dataset, one of the world’s most extensive and varied phenological time series and a steppingstone for future climate change studies in grapes. The dataset is now ready for further analysis, and a rigorous evaluation of imputation errors is included.
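The hold-out evaluation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the function names and the synthetic day-of-year table are hypothetical, a simple column-mean fill stands in for kNN, mice, MissForest, and BRITS, and the error is normalized by the standard deviation of the held-out values because the abstract does not say which normalization was used.

```python
import numpy as np

def nrmse(true, pred):
    """RMSE divided by the standard deviation of the true values.

    The normalization choice is an assumption; the abstract does not specify it.
    """
    return float(np.sqrt(np.mean((true - pred) ** 2)) / np.std(true))

def column_mean_impute(table):
    """Stand-in imputer: fill each missing cell with its column (year) mean."""
    filled = table.copy()
    col_means = np.nanmean(filled, axis=0)
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = col_means[cols]
    return filled

def monte_carlo_holdout(table, impute, fraction=0.1, repeats=100, seed=0):
    """Repeatedly hide a random fraction of observed cells and score the imputer."""
    rng = np.random.default_rng(seed)
    observed = np.argwhere(~np.isnan(table))   # (row, col) of every known value
    n_test = max(1, int(round(fraction * len(observed))))
    errors = []
    for _ in range(repeats):
        picked = observed[rng.choice(len(observed), size=n_test, replace=False)]
        rows, cols = picked[:, 0], picked[:, 1]
        masked = table.copy()
        masked[rows, cols] = np.nan            # hide the test cells
        filled = impute(masked)
        errors.append(nrmse(table[rows, cols], filled[rows, cols]))
    return np.array(errors)

# Synthetic day-of-year table: 30 varieties (rows) x 20 years (columns)
rng = np.random.default_rng(1)
doy = 100 + rng.normal(0, 5, size=(30, 1)) + rng.normal(0, 3, size=(1, 20))
errors = monte_carlo_holdout(doy, column_mean_impute)
```

Any imputer with the same call signature can be passed in place of `column_mean_impute`, so several methods can be compared on identical hold-out splits by reusing the same seed.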

Adaptability of grapevines to climate change: characterization of phenology and sugar accumulation of 50 varieties, under hot climate conditions

Climate is the major factor influencing the dynamics of the vegetative cycle and can determine the timing of phenological periods. Knowledge of the phenology of varieties, their chronological duration, and their thermal requirements allows not only better management of interventions in the vineyard, but also prediction of the varieties’ behaviour under climate change, giving the wine producer the possibility of selecting the grape varieties that are best adapted to the climatic conditions of a given terroir. In 2014, Symington Family Estates, Vinhos, established two grape variety libraries in two places with distinct climate conditions (Douro Superior and Cima Corgo), with the commitment of contributing to a deeper agronomic and oenological understanding of some grape varieties in hot climate conditions. These research vineyards contain local varieties that are important in regional and national viticulture, others that have been forgotten over time, and five international reference cultivars. From 2017 to 2021, phenological observations were made three times a week, following a defined protocol, to determine the average dates of budbreak, flowering and veraison. Using the climate data of each location, the thermal requirements of each variety and the chronological duration of each phase were calculated. During maturation, berry samples were gathered weekly to study the dynamics of sugar accumulation, among other parameters. The data were analysed by applying phenological and sugar accumulation models available in the literature. The results show significant differences between the varieties in several parameters, from the chronological duration and thermal requirements needed to complete the various stages of development to the differences between the two locations, confirming the influence of climate on phenology and the stages of maturation in these specific conditions.

Influence of weather and climatic conditions on the viticultural production in Croatia

The research includes an analysis of the impact of weather conditions on phenological development of the vine and grape quality, through monitoring of four experimental cultivars (Chardonnay, Graševina, Merlot and Plavac mali) over two production years. In each experimental vineyard, which were evenly distributed throughout the regions of Slavonia and The Croatian Danube, Croatian Uplands,