Fully automated non-targeted GC-MS data analysis

Abstract

Non-targeted analysis is applied in many domains of analytical chemistry, such as metabolomics, environmental analysis, and food analysis. In contrast to targeted analysis, non-targeted approaches take information on both known and unknown compounds into account, are inherently more comprehensive, and give a more holistic representation of the sample composition.

Besides chromatographic techniques coupled to high-resolution mass spectrometry, such as LC-HRMS, gas chromatography coupled to unit-resolution mass spectrometry is still regularly used for non-targeted profiling or fingerprinting. This is mainly due to the high separation power of GC and the wide availability and low cost of quadrupole mass spectrometers.

Although several non-targeted approaches have been developed, data processing remains a serious bottleneck. Baseline correction, feature detection, and retention time alignment are prone to errors, and time-consuming manual corrections are often necessary. We therefore developed an automated strategy for non-targeted GC-MS data analysis that avoids feature detection and retention time alignment. The approach comprises segmentation of chromatograms along the retention time axis and multiway decomposition of the transformed segments, followed by a supervised machine learning pipeline based on gradient boosted tree classification on the decomposed tensor [1, 2].
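
The first step of the strategy, segmentation along the retention time axis, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed window width, and the optional overlap are assumptions made for illustration, and the data layout (each sample as a list of scans, each scan as a list of intensities per m/z channel) is likewise assumed. The decomposition and classification steps are not shown.

```python
from typing import List, Tuple

# Assumed layout: one sample is a list of scans,
# one scan is a list of intensities (one value per m/z channel).
Chromatogram = List[List[float]]


def segment_bounds(n_scans: int, width: int, overlap: int = 0) -> List[Tuple[int, int]]:
    """Return (start, stop) scan indices of windows along the retention time axis."""
    if width <= 0 or not 0 <= overlap < width:
        raise ValueError("need width > 0 and 0 <= overlap < width")
    step = width - overlap
    bounds = []
    start = 0
    while start < n_scans:
        stop = min(start + width, n_scans)  # the last window may be shorter
        bounds.append((start, stop))
        if stop == n_scans:
            break
        start += step
    return bounds


def segment_samples(samples: List[Chromatogram], width: int,
                    overlap: int = 0) -> List[List[Chromatogram]]:
    """Cut every sample at the same scan indices.

    Returns one three-way block (samples x scans x m/z channels) per segment,
    so each block can later be decomposed on its own.
    """
    n_scans = len(samples[0])
    if any(len(sample) != n_scans for sample in samples):
        raise ValueError("all samples must have the same number of scans")
    return [[sample[a:b] for sample in samples]
            for a, b in segment_bounds(n_scans, width, overlap)]


# Toy data: 2 samples, 5 scans, 2 m/z channels (hypothetical values).
toy = [
    [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0], [2.0, 1.0], [1.0, 0.0]],
    [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [1.0, 2.0], [0.0, 1.0]],
]
blocks = segment_samples(toy, width=2)
print(len(blocks), [len(block[0]) for block in blocks])
```

Because every sample is cut at the same scan indices, no peak-wise alignment between samples is needed at this stage; small retention time shifts within a window are left to the subsequent decomposition step.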

To make this data analysis strategy available to scientists without a programming background, we developed a convenient browser-based application. The interactive application presented here was built with the open-source Python packages Bokeh and HoloViews. The application will soon be freely available online.

[1] J. Vestner, G. de Revel, S. Krieger-Weber, D. Rauhut, M. du Toit, A. de Villiers, Toward automated chromatographic fingerprinting: A non-alignment approach to gas chromatography mass spectrometry data. Analytica Chimica Acta 911 (2016) 42-58
[2] K. Sirén, U. Fischer, J. Vestner, Automated supervised learning pipeline for non-targeted GC-MS data analysis. Analytica Chimica Acta: X 1 (2019) 100005

Publication date: June 19, 2020

Issue: OENO IVAS 2019

Type: Article

Authors

Jochen Vestner, Kimmo Sirén, Pierre Le Brun, Ulrich Fischer

Institute for Viticulture and Oenology, DLR Rheinpfalz, Breitenweg 71, D-67435 Neustadt, Germany
Institut National Supérieur des Sciences Agronomiques de l’Alimentation et de l’Environnement, Agrosup Dijon, 6 boulevard Docteur Petitjean, 21000 Dijon, France
Department of Chemistry, University of Kaiserslautern, Erwin-Schroedinger-Strasse 52, D-67663 Kaiserslautern

Keywords

metabolomics, non-targeted, GC-MS, exploratory data analysis 

Tags

IVES Conference Series | OENO IVAS 2019
