Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics, environmental analysis, and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification of the decomposed tensor [1, 2].
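The segmentation and decomposition steps can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the data are stacked into a (samples, scans, m/z) array, uses fixed-length retention time windows, skips the transformation of segments, and substitutes a plain CP/PARAFAC model fitted by alternating least squares for whichever multiway model the cited papers apply. All function names, the synthetic data, and the parameter values (segment length, rank) are hypothetical.

```python
import numpy as np


def segment_chromatograms(data, segment_len):
    """Cut a (samples, scans, m/z) array into consecutive retention time windows."""
    n_scans = data.shape[1]
    return [data[:, start:start + segment_len, :]
            for start in range(0, n_scans, segment_len)]


def khatri_rao(u, v):
    """Column-wise Kronecker product of u (J x R) and v (K x R), giving (J*K x R)."""
    return (u[:, None, :] * v[None, :, :]).reshape(-1, u.shape[1])


def cp_als(tensor, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP model to a 3-way array by alternating least squares.

    Returns sample scores a (I x R), elution profiles b (J x R), spectra c (K x R).
    """
    rng = np.random.default_rng(seed)
    i, j, k = tensor.shape
    a, b, c = rng.random((i, rank)), rng.random((j, rank)), rng.random((k, rank))
    # Unfold the tensor once per mode; column order matches khatri_rao above.
    x0 = tensor.reshape(i, j * k)
    x1 = np.moveaxis(tensor, 1, 0).reshape(j, i * k)
    x2 = np.moveaxis(tensor, 2, 0).reshape(k, i * j)
    for _ in range(n_iter):
        # Update each factor by least squares while the other two are held fixed.
        a = x0 @ khatri_rao(b, c) @ np.linalg.pinv((b.T @ b) * (c.T @ c))
        b = x1 @ khatri_rao(a, c) @ np.linalg.pinv((a.T @ a) * (c.T @ c))
        c = x2 @ khatri_rao(a, b) @ np.linalg.pinv((a.T @ a) * (b.T @ b))
    return a, b, c


def reconstruct(a, b, c):
    """Rebuild the 3-way array from its CP factors."""
    return np.einsum("ir,jr,kr->ijk", a, b, c)


def make_demo_data(seed=1):
    """Synthetic example: 6 samples, 40 scans, 12 m/z channels, 4 compounds."""
    rng = np.random.default_rng(seed)
    scans = np.arange(40)
    centers = [6, 13, 26, 33]  # two Gaussian peaks per 20-scan segment
    elution = np.stack(
        [np.exp(-0.5 * ((scans - ctr) / 1.5) ** 2) for ctr in centers], axis=1)
    spectra = rng.random((12, 4))
    conc = rng.uniform(1.0, 5.0, size=(6, 4))
    data = np.einsum("ir,jr,kr->ijk", conc, elution, spectra)
    return data + 1e-3 * rng.standard_normal(data.shape)


data = make_demo_data()
# Concatenate the sample-mode scores of every segment into one feature table.
features = np.hstack(
    [cp_als(seg, rank=2)[0] for seg in segment_chromatograms(data, 20)])
print(features.shape)  # (6, 4): one row per sample, two components per segment
```

Each row of `features` describes one sample without any peak picking or alignment. In a supervised setting, such a table could be passed to a gradient boosted tree classifier, for example scikit-learn's `GradientBoostingClassifier`; the abstract does not name the library used, so that choice is an assumption.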

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019
