Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics, environmental analysis and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography with unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of the chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification on the decomposed tensor [1, 2].
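The segmentation and decomposition steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which multiway model or which transformation of the segments is used, so a plain CP (PARAFAC) model fitted by alternating least squares stands in for the decomposition, the transformation step is omitted, and the data are synthetic. The function names (`segment_along_rt`, `cp_als`, `sample_scores`), the fixed segment width and the rank are all assumptions made for illustration.

```python
import numpy as np


def segment_along_rt(data, width):
    """Split a (samples, scans, m/z) array into fixed-width windows along the scan axis."""
    n_scans = data.shape[1]
    return [data[:, start:start + width, :] for start in range(0, n_scans, width)]


def khatri_rao(a, b):
    """Column-wise Kronecker product of two factor matrices."""
    return np.einsum("ir,jr->ijr", a, b).reshape(-1, a.shape[1])


def cp_als(x, rank, n_iter=50, seed=0):
    """Fit a CP model to a 3-way array by alternating least squares (illustrative stand-in)."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            # Unfold the array so that rows index the mode being updated.
            unfolded = np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfolded @ khatri_rao(first, second) @ np.linalg.pinv(gram)
    return factors


def sample_scores(data, width, rank):
    """Stack the sample-mode factors of all segments into one feature matrix."""
    return np.hstack([cp_als(seg, rank)[0] for seg in segment_along_rt(data, width)])


# Synthetic data: 6 samples, 40 scans, 5 m/z channels, one peak per 20-scan segment.
rng = np.random.default_rng(1)
n_samples, n_scans, n_mz = 6, 40, 5
scan_axis = np.arange(n_scans)
concentrations = rng.uniform(0.5, 2.0, size=(n_samples, 2))
data = np.zeros((n_samples, n_scans, n_mz))
for k, centre in enumerate((10, 30)):
    elution = np.exp(-0.5 * ((scan_axis - centre) / 1.5) ** 2)
    spectrum = rng.uniform(0.1, 1.0, size=n_mz)
    data += np.einsum("i,j,k->ijk", concentrations[:, k], elution, spectrum)

features = sample_scores(data, width=20, rank=1)
print(features.shape)  # (6, 2)
```

Each column of `features` tracks the relative amount of one component across samples (up to sign and scale), without any peak picking or retention time alignment. In a supervised pipeline such as the one described above, a matrix of this kind would be the input to a gradient boosted tree classifier.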

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open-source Python packages Bokeh and HoloViews. It will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

DOI:

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019

Related articles…

A better understanding of the climate effect on anthocyanin accumulation in grapes using a machine learning approach

Current climate change is directly threatening the balance of the vineyard at harvest time. The maturation period of the grapes is shifting to the middle of the summer, when radiation and air temperature are at their maximum. In this context, the implementation of corrective practices becomes problematic. Unfortunately, our knowledge of the effect of climate on the quality of different grape varieties remains too incomplete to guide these choices. During the Innovine project, original experiments were carried out on Syrah to study the combined effects of normal or high air temperature and varying degrees of exposure of the berries to the sun. Berries subjected to these different conditions were sampled and analyzed throughout the maturation period. Several quality characteristics were determined, including anthocyanin content. The objective of the experiments was to investigate which climatic determinants were most important for anthocyanin accumulation in the berries. Temperature and irradiance data, observed over time with a very fine discretization step, are called functional data in statistics. We developed the procedure SpiceFP (Sparse and Structured Procedure to Identify Combined Effects of Functional Predictors) to explain the variations of a scalar response variable (for example, a grape berry quality variable) by two or three functional predictors (such as temperature and irradiance) that influence it jointly. Particular attention was paid to the interpretability of the results. Analysis of the data using SpiceFP identified a negative impact of morning combinations of low irradiance (lower than about 100 μmol m−2 s−1 or 45 μmol m−2 s−1 depending on the advanced-delayed state of the berries) and high temperature (higher than 25 °C). A slight difference associated with overnight temperature occurred between these effects identified in the morning.

Sustainable fertilisation of the vineyard in Galicia (Spain)

Excessive fertilization of the vineyard leads to low-quality grapes, increased costs and a negative impact on the environment. To establish an integrated management system aimed at sustainable fertilization of vineyards, nutritional reference levels were established. For this purpose, 30 representative vineyards of the Albariño variety were studied, in which soil and petiole analyses were carried out for two years and grape yield and quality at harvest were measured. In both years of the study, soil pH, calcium, sodium and cation exchange capacity were positively correlated with calcium content and negatively correlated with manganese in grapes. Irrigated vineyards had higher levels of aluminium in soil and lower levels of calcium in petioles. Climatic conditions differed markedly between the two years: 2019 was colder than usual, whereas 2020 brought marked water stress and high summer temperatures. This resulted in medium-high acidity in grapes in 2019 and low acidity in 2020, with sugar levels being similar in both years. A very marked decrease in must amino nitrogen was observed in 2020, while ammonia nitrogen remained stable. The correlation of acidity and sugar values in grapes with soil and petiole analysis data made it possible to establish reference levels for the nutritional diagnosis of the Albariño variety in this region. Based on these results, an easy-to-use ICT application is currently being created for grapegrowers, aimed at improving the sustainability of the vineyard through reasoned fertilization. This study has now been extended to other Galician vine varieties.

Evaluation of climate change impacts at the Portuguese Dão terroir over the last decades: observed effects on bioclimatic indices and grapevine phenology

In recent decades, growers in the Portuguese Dão winegrowing region (center of Portugal) have been experiencing changes in climate that are influencing grape phenology, berry health and ripening. To study the relationships between climate indices (CI), seasonal weather and grapevine phenology, this work analyzed long-term climate and phenological data collected at the experimental vineyard of the Portuguese Dão research centre between 1958 and 2019 (61 years) for the red variety Touriga Nacional. The trends over time for the classical temperature-based indices (Growing Season Temperature, GST; Growing Degree Days, GDD; Huglin Index, HI; and Cool Night Index, CI) presented a significantly positive slope, while the Dryness Index (DI) showed a negative trend over the last 61 years. Regarding grapevine phenology, an average advance of 4.5 days per decade in the harvest day was observed throughout the last 61 years. Consequently, the weather conditions during the ripening period have changed, showing an increasing trend over time in the average temperature (of higher magnitude in the maximum than in the minimum temperature) and a decrease in the accumulated rainfall. A regression analysis showed that ~50% of the variability in harvest date over the years was explained by the variability of the temperature-based indices. These observed effects of climate change on bioclimatic indices and the corresponding advance of the harvest date can still be considered advantageous for the Dão terroir, as they allow optimal berry ripening to be achieved before the usual equinox rains and therefore avoid the potential negative impacts of rainfall on berry health and composition.

Spatiotemporal patterns of chemical attributes in Vitis vinifera L. cv. Cabernet Sauvignon vineyards in Central California

Spatial variability of vine productivity in winegrapes is important to characterise as both yield and quality are relevant for the production of different wine styles and products. The objectives were to understand how patterns of variability of Cabernet Sauvignon fruit composition changed over time and space, how these patterns could be characterised with indirect measurements, and how spatial patterns of the variation in fruit compositional attributes can aid in improving management. Prior to the 2017 vintage, 125 data vines were distributed across each of four vineyards in the Lodi American Viticultural Area (AVA) of California. Each data vine was sampled at commercial harvest in 2017, 2018, and 2019. Yield components and fruit composition were measured at harvest for each data vine, and maps of yield and fruit composition were produced for eight ‘objective measures of fruit quality’: total anthocyanins, polymeric tannins, quercetin glycosides, malic acid, yeast assimilable nitrogen, β-damascenone, C6 alcohols and aldehydes, and 3-isobutyl-2-methoxypyrazine. Patterns of variation in anthocyanins and phenolic compounds were found to be most stable over time. Given this relative stability, management decisions focused on fruit quality could be based on zonal descriptions of anthocyanins or phenolics to increase profitability in some vineyards. In each vineyard, dormant season pruning weights and soil cores were collected at each location, elevation and soil apparent electrical conductivity surveys were completed, and remotely sensed imagery was captured by fixed wing aircraft and two satellite platforms at major phenological stages. The data collected were used to develop relationships among biophysical data, soil, imagery, and fruit composition. 
The standardised and aggregated samples from four vineyards over three seasons were included in the estimation of ‘common variograms’ to assess how this technique could aid growers in producing geostatistically rigorous maps of fruit composition variability without cumbersome, single season sampling efforts.

Is wine terroir a valid concept under a changing climate?

The OIV[i] defines terroir as a concept referring to an area in which collective knowledge of the interactions between the physical and biological environment (soil, topography, climate, landscape characteristics and biodiversity features) and vitivinicultural practices develops, providing distinctive wine characteristics. Those are perceptible in the taste of wine, which drives consumer preference and, therefore, wine’s value in the marketplace. Geographical indications (GI) are recognized regulatory constructs formalizing and protecting the nexus between wine taste and the terroir generating it. Although they allow for updates, GIs do not treat this nexus as dynamic and do not anticipate change, namely of climate. As climate is a fundamental feature of terroir, it strongly impacts wine characteristics, such as taste. According to the IPCC[ii], many widespread, rapid and unprecedented changes of climate have occurred, some being irreversible over hundreds to thousands of years. Climatic shifts and atmospheric-driven extreme events have been widely reported worldwide. Recent climatic trends are projected to strengthen in upcoming decades, whereas extremes are expected to increase in frequency and intensity, forcing wines away from GI definitions. Geographical shifts of viticultural suitability are projected, often moving into regions and countries different from current ones. Some authors propose adaptation in viticulture, winemaking and product innovation. We show evidence of climate changing wine characteristics in the Douro valley, home of the 270-year-old Port GI. We discuss here the "resist" and "adapt" stances that can be taken when climate changes the nexus between terroir and wine characteristics.
Using the MED-GOLD[iii] dashboard, a tool allowing for easy visual navigation of past and future climates, we demonstrate how policymakers can identify future moments, throughout the 21st century under different emission scenarios, when GI specifications will likely need updates (e.g., boundaries, varieties) to reduce climate-change impacts.