Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics and environmental and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account. They are therefore inherently more comprehensive and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach consists of segmentation of the chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification of the decomposed tensor [1, 2].
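The first step named above, segmentation along the retention time axis, can be illustrated with a minimal sketch. It is not the implementation from [1, 2]: the function name `segment_chromatograms`, the fixed window length, the step size, and the toy data are all assumptions made for illustration, and the sketch assumes that every sample was acquired on the same scan grid. Each resulting segment is a small three-way array (samples x scans x m/z channels) that could then be passed to a multiway decomposition.

```python
from typing import List

Scan = List[float]          # intensities over the m/z channels of one scan
Chromatogram = List[Scan]   # scans ordered by retention time


def segment_chromatograms(
    samples: List[Chromatogram],
    window: int,
    step: int,
) -> List[List[Chromatogram]]:
    """Cut every chromatogram into the same retention-time windows.

    Returns one entry per window. Each entry holds, for every sample,
    the scans inside that window (samples x scans x m/z channels).
    The name, window and step are hypothetical choices for this sketch.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # Use the shortest run so that every window exists in every sample.
    n_scans = min(len(chrom) for chrom in samples)
    segments = []
    for start in range(0, n_scans - window + 1, step):
        segments.append([chrom[start:start + window] for chrom in samples])
    return segments


# Toy data (made-up values): 2 samples, 6 scans, 3 m/z channels.
samples = [
    [[float(s + m + k) for m in range(3)] for s in range(6)]
    for k in range(2)
]

# Windows of 4 scans that overlap by 2 scans.
segments = segment_chromatograms(samples, window=4, step=2)
print(len(segments), "segments of", len(segments[0][0]), "scans each")
```

Choosing a step smaller than the window makes neighbouring segments overlap, which is one simple way to avoid cutting a peak at a segment border; whether and how much overlap is appropriate depends on the data.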

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern, Germany

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019
