Understanding the dark side of food: the analysis of processed food by modern mass spectrometry

Posted: 5 March 2014 | Nikolai Kuhnert, School of Engineering and Science, Jacobs University Bremen

Do we actually know what we eat? How well do we really understand the chemical composition of our daily food? These two questions are of utmost importance for consumers, food manufacturers and the scientific community involved in food research, and the answers are anything but straightforward. Our knowledge of the chemical composition of food resembles the Roman god Janus. On one side, a tremendous body of knowledge has been built up over the last decade about the chemical composition of the raw materials used for food production. The database FoodDB contains around 28,000 entries of fully characterised chemical compounds found in our daily diet, and for many food raw materials, food scientists are able to account for nearly every molecule present.

The other side of Janus is processed food. Most of the food consumed by humans undergoes some sort of processing, whether by thermal treatment (baking, roasting, frying, cooking, steaming, etc.), fermentation, pickling, pressure treatment, irradiation or other means. During these processing steps, usually associated with browning or darkening of colour (here referred to as the dark side), the chemical composition of the initial raw material is dramatically altered. The hundreds of components present in the raw material undergo a myriad of chemical transformations, producing thousands of novel products. As a rule of thumb, it can be estimated that up to 50 per cent of the compounds present in the food raw materials are decomposed by chemical reactions to produce new products1. Depending on social and cultural circumstances, up to 80 per cent of the food consumed by humans is estimated to be processed prior to consumption, so the majority of the human diet is processed food.

It is worth pointing out that processing food is a uniquely human activity that distinguishes humans from all other animal species on our planet. Very little is known about the chemical composition of processed food, mainly due to the enormous chemical complexity of the materials obtained after food processing, which sometimes contain several tens of thousands of chemical compounds. Such large compound numbers are usually associated with the presence of an unresolved hump in a chromatographic analysis. These humps, with their associated complexity, have so far constituted an insurmountable hurdle for analytical chemistry. Gaining a better understanding of processed food therefore means developing and applying analytical methods that are capable of coping with such enormous complexity.

Over the last few years, our research group has developed such methods based on modern mass spectrometry. They allow, for the first time, an understanding of the composition of processed food along with an understanding of the chemistry underlying food processing. This article provides a short overview of the methods used and highlights some key findings of our investigations.

Modern mass spectrometry

Mass spectrometry has made enormous advances over the past two decades. The advent of Electrospray Ionisation (ESI) and Matrix Assisted Laser Desorption Ionisation (MALDI) has allowed the soft ionisation of almost any biomolecule, large or small, stable or labile, and its transfer into the gas phase for further analysis. Isolation of ions in the gas phase followed by fragmentation of the intact ions produces fragment ions (this process is referred to as tandem mass spectrometry), which offer the possibility of carrying out detailed structure elucidation. Additionally, the coupling of mass spectrometers to chromatographic equipment and other spectroscopic techniques has made it possible to separate compounds physically and obtain multi-dimensional information on the analytes, all at relatively high sensitivity: routine commercial MS instruments work in the nM range or better.

The key advantage of mass spectrometry in the analysis of processed food, however, is its resolution, in particular when using high resolution mass analysers such as Time of Flight (TOF) or Fourier Transform Ion Cyclotron Resonance (FT-ICR-MS) instruments, which are able to resolve tens of thousands of ions in a single mass spectrometry experiment. The resolution achieved in such experiments is therefore several orders of magnitude higher than that of chromatographic separation or any other spectroscopic method.

A high resolution mass spectrum of a processed food sample typically shows several thousand resolved signals (see Table 1 for representative examples), e.g. around 2,000 for caramel, around 10,000 for a black tea infusion or around 30,000 for a cocoa powder sample. Molecular formulae can be determined for the large majority of the observed signals, taking direct advantage of the high mass accuracy achieved and providing extensive mass lists of all observable analytes. It should be noted that not all analytes present can be ionised with one given ionisation technique; however, the use of complementary ionisation methods, for example positive and negative ion mode with ESI, MALDI and APCI, provides a more complete picture of the content of a given sample. Additionally, mass spectrometry is isomer blind at this stage, so average isomer numbers have to be obtained from subsequent LC-tandem-MS experiments. These average isomer numbers are then multiplied by the number of signals obtained in a high resolution MS experiment, providing an estimate of real compound numbers. Figure 1 shows a typical MS spectrum of a complex mixture, in this case Maillard reaction products.
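The estimate described above is a simple multiplication, sketched here in Python. The function name is hypothetical, and the isomer factor of 3.0 is only inferred from the black tea figures quoted in this article (around 10,000 signals, around 30,000 thearubigin compounds); it is not a measured value.

```python
def estimate_compound_count(resolved_signals: int, average_isomers: float) -> int:
    """Estimate the number of compounds as signals x average isomer number."""
    return round(resolved_signals * average_isomers)


# Black tea: about 10,000 resolved signals.
# The factor 3.0 is an assumption inferred from the article's totals.
black_tea_estimate = estimate_compound_count(10_000, 3.0)
print(black_tea_estimate)  # 30000
```

In practice the average isomer number would come from counting chromatographic peaks per molecular formula in LC-tandem-MS runs.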

Interpretation of high resolution mass spectra

As mentioned above, a high resolution mass spectrum of a processed food sample provides molecular formula information on thousands or even tens of thousands of analytes in a single experiment. In order to interpret such extremely information-rich data, we have adapted and transferred data interpretation strategies pioneered in the field of crude oil analysis (petrolomics) and developed novel data interpretation strategies alongside them. In petrolomics, two approaches have been shown to be of enormous use: elemental ratio analysis (or van Krevelen analysis) and mass defect analysis (or Kendrick analysis)2.

In the van Krevelen analysis, elemental ratios, e.g. H/C or O/C, are calculated for each analyte from the molecular formula information and plotted against one another. The beauty of this approach lies in the fact that certain classes of compounds (e.g. polyphenols, proteins, carbohydrates, lipids, terpenes, phenolglycosides etc.) are characterised by a set of elemental ratio boundaries. All members of a specified class of compounds therefore appear in a restricted zone on the plot, and a tentative classification of the tens of thousands of analytes into distinct classes of compounds becomes possible. Figure 2 shows a typical van Krevelen diagram of an FT-ICR-MS analysis of a cocoa powder extract with compounds of different elemental compositions colour coded.
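As a minimal sketch of the elemental ratio calculation, the following Python snippet turns a molecular formula into one (O/C, H/C) point of a van Krevelen plot. The function names are hypothetical, the parser only handles simple formulae without brackets, and no class boundaries are included because none are given here. The two example formulae (a caffeoylquinic acid, C16H18O9, and sucrose, C12H22O11) are chosen for illustration.

```python
import re


def parse_formula(formula: str) -> dict:
    """Count atoms in a simple formula such as 'C16H18O9' (no brackets)."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def van_krevelen_point(formula: str) -> tuple:
    """Return (O/C, H/C), the usual x and y coordinates of the plot."""
    atoms = parse_formula(formula)
    carbon = atoms["C"]
    return atoms.get("O", 0) / carbon, atoms.get("H", 0) / carbon


for name, formula in [("caffeoylquinic acid", "C16H18O9"),
                      ("sucrose", "C12H22O11")]:
    o_to_c, h_to_c = van_krevelen_point(formula)
    print(f"{name}: O/C = {o_to_c:.3f}, H/C = {h_to_c:.3f}")
```

The polyphenol ester and the carbohydrate land at clearly different coordinates, which is the effect the plot exploits for tentative classification.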

The Kendrick approach makes use of homologous series of compounds, defined as series in which, starting from a given precursor, a defined group of atoms is repeatedly added. The repeated addition of a defined group of atoms with an associated mass increment leads to a constant mass defect within the series. If the plot is normalised to the repeating unit (e.g. in Figure x loss of water, H2O), all members of a homologous series lie on a straight line parallel to the x-axis on the Kendrick plot. Using this approach, homologous series of compounds can be readily identified from the data set. Figure 3 shows a Kendrick diagram from a caramel sample, where homologous series formed by dehydration are shown in colour parallel to the x-axis. From our work, we could show that for nearly all classes of processed food, a relatively small selection of chemical reactions actually takes place on heating (loss of water, addition of water, transesterification, transglycosylation etc.) or fermentation (oxidation, nucleophilic attack by water, carbohydrates, phenols etc.), always leading to homologous series of compounds with repeated sequences of mechanistically identical reactions. Hence, the identification of homologous series immediately suggests the mechanistic pathways for chemical reactions occurring during food processing.
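The constant mass defect within a dehydration series can be checked with a short Python sketch. It assumes one common convention (Kendrick mass = mass x nominal mass of the repeating unit / exact mass of the repeating unit; mass defect = nominal Kendrick mass minus Kendrick mass), uses neutral monoisotopic masses rather than measured ion masses for simplicity, and takes sucrose with two hypothetical dehydration products as the example series. Function names are illustrative only.

```python
import re

# Monoisotopic atomic masses in Da (C, H and O only, enough for this sketch).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
H2O_EXACT = 2 * MONOISOTOPIC["H"] + MONOISOTOPIC["O"]  # about 18.0106 Da
H2O_NOMINAL = 18


def monoisotopic_mass(formula: str) -> float:
    """Neutral monoisotopic mass of a simple CHO formula (no brackets)."""
    total = 0.0
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(number) if number else 1)
    return total


def kendrick_mass_defect(mass: float) -> float:
    """Mass defect on a scale where H2O weighs exactly 18."""
    kendrick_mass = mass * H2O_NOMINAL / H2O_EXACT
    return round(kendrick_mass) - kendrick_mass


# Sucrose and the formulae after losing one and two molecules of water.
series = ["C12H22O11", "C12H20O10", "C12H18O9"]
for formula in series:
    mass = monoisotopic_mass(formula)
    print(formula, round(mass, 4), round(kendrick_mass_defect(mass), 5))
```

All three formulae give the same mass defect, so on a Kendrick plot they would sit on one horizontal line, spaced 18 units apart along the x-axis.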

Further parameters can be derived from the high resolution MS data set and plotted against one another to provide graphical displays with patterns and trends embedded in them. Visual inspection and subsequent rationalisation of these displays provide additional valuable insight into both the chemical composition and the mechanistic chemistry underlying food processing.

Chromatographic coupling to mass spectrometry

Liquid chromatography can be easily coupled to mass spectrometry, and complex mixtures arising from food processing can be analysed this way. In contrast to the high resolution MS analysis by direct infusion described in the previous section, chromatography allows additional separation of isomeric structures and hence an estimate of the number of isomers present in a given sample. Additionally, the chromatographic separation reduces the ion suppression and ion enhancement effects that tend to distort the information obtained in a direct infusion measurement. Complex mixtures (also referred to as UCMs; unresolved complex mixtures) typically show, next to well-defined chromatographic peaks, an unresolved hump, as exemplified in Figure 4.

If coupled to tandem mass spectrometry, additional structural information can be obtained from chromatographic runs. In the absence of authentic reference materials, which constitutes a major problem for processed food analysis, we have suggested that compounds of similar structure display identical fragmentation mechanisms. Hence, tentative structural hypotheses for compounds produced in food processing are first developed from the homologous series analysis described above and are then probed in a second stage by LC-tandem-MS. A search in an extracted ion chromatogram reveals all compounds with a given molecular formula, and after inspection of their fragment spectra, these can be grouped into compound classes displaying identical fragmentation mechanisms. Establishing the correct regio- and stereochemistry for the compounds is of course not possible in the absence of authentic reference materials; to address this problem, we have used computational chemistry to aid compound assignment. Here, our approach is to use DFT calculations to obtain HOMO and LUMO orbital coefficients for the starting materials in food processing, which immediately suggest likely regiochemical outcomes of the reactions occurring.
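The extracted ion chromatogram search can be illustrated with a small Python sketch. It assumes a peak list of (retention time, m/z, intensity) tuples and a mass tolerance in ppm; the peak values, the 5 ppm window and the function name are hypothetical and not taken from any real data set. The target is the deprotonated ion of C16H18O9.

```python
PROTON_MASS = 1.00727646  # Da


def extract_ion_chromatogram(peaks, target_mz, tolerance_ppm=5.0):
    """Keep the peaks whose m/z lies within tolerance_ppm of target_mz."""
    hits = []
    for retention_time, mz, intensity in peaks:
        error_ppm = abs(mz - target_mz) / target_mz * 1e6
        if error_ppm <= tolerance_ppm:
            hits.append((retention_time, mz, intensity))
    return hits


# Neutral monoisotopic mass of C16H18O9, then [M-H]- for negative ion mode.
neutral_mass = 16 * 12.0 + 18 * 1.00782503 + 9 * 15.99491462
target = neutral_mass - PROTON_MASS  # about 353.0878

# Hypothetical peak list: (retention time in min, m/z, intensity).
peaks = [
    (5.2, 353.0879, 1.2e6),
    (6.1, 353.1250, 3.0e5),  # same nominal mass, different formula
    (7.9, 353.0875, 8.0e5),
    (8.4, 353.0881, 4.5e5),
    (9.0, 515.1195, 2.1e5),
]

for retention_time, mz, intensity in extract_ion_chromatogram(peaks, target):
    print(f"{retention_time:4.1f} min  m/z {mz:.4f}  intensity {intensity:.1e}")
```

Three peaks at different retention times survive the filter. They would be treated as isomer candidates sharing one molecular formula, and their fragment spectra would then be compared.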

Structure elucidation by tandem mass spectrometry

MS has the advantage of being able to isolate ions of a chosen m/z value and subsequently fragment the chosen ions, revealing valuable structural information through the analysis of fragment ions. Recent years have shown that this method is much more powerful than previously assumed, with even regio- and stereoisomeric ions yielding distinct fragment spectra that allow unambiguous structure elucidation. As a prime example, all four regioisomeric mono-esters of quinic acid and all six possible regioisomeric di-esters of quinic acid, all generally classified as chlorogenic acids, have been shown to display diagnostic fragment spectra that allow predictive assignment of compounds3. Figure 5 shows the MS2 spectra of all six regioisomers of dicaffeoyl quinic acid as an illustrative example. Similarly, esters of diastereoisomers of quinic acid have been shown to display diagnostic fragment spectra allowing assignment of the stereochemistry of the quinic acid moiety. In the case of quinic acid regioisomers and stereoisomers, all assignments of compounds could be supported with the aid of authentic reference materials obtained by chemical synthesis or by extraction and isolation from natural sources4.

In other work, we have shown that the same principle of structure elucidation can be applied to shikimic acid derivatives5, lactones or proanthocyanidines6, and in as yet unpublished work, we have extended this principle to the unambiguous identification of carbohydrate esters and carbohydrates themselves.

In general, we take the view that tandem mass spectrometry is as information rich as NMR spectroscopy and should in principle be able to provide all information necessary for a complete structure elucidation of even complex structures. Two aspects, however, prevent full exploitation of its tremendous potential: firstly, a lack of standardisation of MS equipment and of the experimental parameters used in practice, and secondly, an insufficient understanding of how MS data are correlated to structure. We expect both to change within the next decade.

Quantification of compounds from food processing

Using our approach, no quantitative data on the analytes present can be deduced from the data obtained, since in a complex sample all analytes compete for ionisation, and the ability to ionise preferentially determines the intensity of the signals observed rather than relative or absolute quantities. Additionally, the effects of ion suppression and enhancement change signal intensities in samples of non-identical composition. Hence, authentic reference materials are always required for quantification. Once calibration curves have been obtained with them, quantification is possible even in very complex samples. Here, either tandem mass spectrometry in the single or multiple reaction monitoring mode (SRM or MRM) or extracted ion chromatograms from high resolution LC-MS data can be employed, guaranteeing a high level of selectivity for the analyte of interest.
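A calibration curve of the kind mentioned above can be sketched in a few lines of Python. The example assumes a linear detector response, fits a straight line by ordinary least squares and inverts it for an unknown sample. The concentrations, peak areas and function names are hypothetical illustration values, not measured data.

```python
def fit_calibration_line(concentrations, peak_areas):
    """Ordinary least-squares fit of peak_area = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(concentrations, peak_areas))
    s_xx = sum((x - mean_x) ** 2 for x in concentrations)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(peak_area, slope, intercept):
    """Invert the calibration line to obtain a concentration."""
    return (peak_area - intercept) / slope


# Hypothetical reference standard: concentration (arbitrary units) vs. peak area.
standard_concentrations = [1.0, 2.0, 5.0, 10.0]
standard_areas = [15.0, 27.0, 63.0, 123.0]

slope, intercept = fit_calibration_line(standard_concentrations, standard_areas)
unknown = quantify(75.0, slope, intercept)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, unknown = {unknown:.2f}")
```

The peak areas would come from an SRM/MRM transition or an extracted ion chromatogram of the analyte. Because of the matrix effects described above, standards are best measured in a matrix as close as possible to the sample.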

Data reduction using multivariate statistical techniques

The sections above described our strategies for carrying out a full analysis of a chromatographically unresolved hump. Alternatively, one has the option to concentrate only on compounds that are important within a sample. The importance can relate to sensory properties, sample origin, food authenticity, processing parameters (improving visual appearance, shelf-life of a product etc.), beneficial health effects or possible IP considerations. If such parameters are related to differences in composition between samples, multivariate statistical techniques can be employed to reduce a data set with minimal loss of information. The most popular method for this kind of data reduction is principal component analysis (PCA). Here all analytes, e.g. those observed in an LC-MS run, are characterised as a triplet of data containing retention time, m/z ratio and intensity. Out of all the data describing the analytes, a set of linear combinations is computed (the principal components), and those that account for most of the variation between samples are selected7.

The results of a PCA are displayed in two plots. Firstly, the scores plot shows how sample groups can be distinguished along two selected PCs (each data point represents one sample or LC-MS run). Secondly, in the loading plot every data point corresponds to one analyte, and the analytes with the largest loadings are the ones chiefly responsible for the differentiation. Hence, analytes that are important for differences between samples can be readily identified irrespective of the complexity of the original sample.
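To make the roles of scores and loadings concrete, here is a minimal pure-Python sketch that extracts only the first principal component by power iteration on the covariance matrix. This is a teaching simplification, not the software used for the published analyses; real LC-MS data would first be aligned into a feature table and handed to a statistics package. The intensity table and the function name are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of PC1 for a samples x analytes table."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in rows]
    # Sample covariance matrix of the analyte intensities (p x p).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    loadings = [1.0] * p  # arbitrary starting vector
    for _ in range(iterations):
        product = [sum(cov[i][j] * loadings[j] for j in range(p))
                   for i in range(p)]
        norm = math.sqrt(sum(x * x for x in product))
        loadings = [x / norm for x in product]
    # Scores: projection of each centred sample onto the loading vector.
    scores = [sum(c[j] * loadings[j] for j in range(p)) for c in centred]
    return loadings, scores


# Hypothetical intensities: rows are samples, columns are three analytes.
samples = ["A1", "A2", "B1", "B2"]
table = [
    [10.0, 5.0, 1.0],
    [11.0, 5.1, 1.2],
    [2.0, 5.0, 1.1],
    [1.0, 4.9, 0.9],
]

loadings, scores = first_principal_component(table)
print("loadings:", [round(x, 3) for x in loadings])
for name, score in zip(samples, scores):
    print(name, round(score, 2))
```

In this toy table the scores separate the A samples from the B samples, and the first analyte carries by far the largest loading, marking it as the one driving the difference.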

As an example, the PCA scores and loading plots of a series of thearubigins from Kenyan and Ceylonese black tea are shown in Figure 6. The scores plot allows clear differentiation between samples; however, the loading plot reveals the extraordinary complexity of the samples, with many hundreds of compounds contributing to sample differentiation.

Selected examples of processed food analysed: black tea

Most of our analytical strategies for the investigation of processed food were developed for black tea chemistry. Black tea is produced from the young shoots of Camellia sinensis or Camellia assamica by a process referred to as fermentation. The green tea leaves are mechanically treated, which mixes their phenolic secondary metabolites stored in the cell vacuoles, mainly six compounds of the catechin class, with the enzyme polyphenol oxidase (PPO). PPO oxidises the B-ring of catechins to an ortho-quinone, which is subsequently attacked by a nucleophile, e.g. another catechin, to form dimeric structures such as theaflavins or theasinensins and a material referred to as thearubigin (TR), first described by E.A.H. Roberts in the 1950s. For structures, please see Figure 7. The chemical composition of the TRs remained enigmatic for a long time and withstood any attempt at meaningful characterisation for almost five decades8.

Using FT-ICR-MS, we could show that the TRs are composed of around 30,000 different compounds (10,000 signals in a mass spectrum multiplied by an average number of isomers), arising through the action of PPO on only six catechins contained within the plant leaf. Using the analytical data interpretation strategies described, we formulated a hypothesis for TR formation, which has become known as the oxidative cascade hypothesis9. PPO produces electrophilic quinones, which are attacked by nucleophilic catechins to form oligomers (up to six catechin moieties). Additionally, water, as the most abundant nucleophile in the green tea leaves, furnishes polyhydroxylated derivatives of oligomeric catechins, which are in a redox equilibrium with their quinone counterparts. On this basis, 90 per cent of the TR constituents could be assigned tentative structures, and for more than 1,000 constituents the structure was subsequently confirmed by tandem-MS. A sound and plausible mechanism for TR formation was introduced, which has so far withstood all further experimental tests10.

Roasted coffee

Green coffee beans contain mainly a class of secondary metabolites termed the chlorogenic acids (CGAs) along with proteins, carbohydrates and lipids. Roasted coffee is produced by heating green coffee beans to temperatures ranging from 180 to 250 °C for a period of 8 to 15 minutes, producing different roasts. An FT-ICR-MS analysis of roasted coffee extracts revealed the presence of around 2,000 signals, which could, by comparison to model roast systems, be classified into products arising from the reactions of all the individual classes of compounds present in the bean, with products arising from CGAs and carbohydrates dominating the complex mixture. Using synthetic chemistry, we could show that CGAs undergo a series of simple organic transformations during roasting, which include acyl-migration, dehydration to form lactones or cyclohexene derivatives, and epimerisation. In a second step, brewing, water undergoes conjugate addition to CGAs, and subsequent beta-elimination yields cis-CGA derivatives11. Figure 8 illustrates the different mechanistic pathways observed for CGAs after thermal treatment in coffee roasting.


Caramel

The products of thermally treated sugars are traditionally referred to as caramel. For caramel made from sucrose, we could show that a typical caramel sample is characterised by the presence of around 3,000 signals in an FT-ICR mass spectrum. Further analysis revealed that the structures encountered are formed by oligomerisation of sucrose and sequential stepwise dehydration. In contrast to CGAs, no epimerisation of sugar moieties could be observed12. Similar work was carried out for the analysis of thermally treated starch, as frequently encountered in bread baking. Starch breaks down into small glucose oligomers, followed by dehydration reactions resembling the chemistry of caramel formation13.


Conclusion

In conclusion, our novel analytical strategies based on modern mass spectrometry allow, for the first time, a detailed insight into processed food, referred to here as the dark side of food. The number of constituents present can be obtained along with molecular formula lists for all constituents. Further data interpretation allows the formulation of structural and mechanistic hypotheses on the chemistry underlying food processing, so that a global picture of processed food composition emerges. We have so far successfully applied this approach to black tea chemistry, coffee roasting and caramel formation. Current work focuses on the investigation of cocoa production and the 100-year-old enigmatic Maillard reaction.

Our methods are not aimed at replacing the more classical approach based on isolation, purification and complete spectroscopic structure elucidation of food constituents. Rather, they constitute a complementary approach to the classical way of food chemistry and should always be attempted in cases where thousands of analytes, owing to poor chromatographic resolution, cannot be properly purified and characterised.

We also hope that regulatory authorities will take notice of our new approach and will request and accept such data for the majority of processed foods, whether new or old, whose chemical composition has remained unknown and mysterious.

From a philosophical point of view, our work has highlighted a completely unexpected chemical complexity and diversity of processed food, raising the general question of how humans actually cope with a daily avalanche of tens of thousands of xenobiotics defining the chemistry and pleasure of processed food. We would like to put forward the hypothesis that the chemical composition of processed food has contributed substantially to the evolutionary success of the human species. Finding out why and how will be the research challenge for the future.


References

  1. “What is under the hump? Mass spectrometry based analysis of processed food. Lessons from the analysis of black tea thearubigins, coffee melanoidines and caramel”, N. Kuhnert, F. Dairpoosh, A. Golon, G. Yassin and R. Jaiswal, Food and Function, 2013, 4, 1130-1147
  2. “On the chemical characterization of black tea thearubigins using mass spectrometry”, N. Kuhnert, J. W. Drynan, J. Obuchowicz and M. Clifford, Rapid Commun. Mass Spectrom. 2010, 24, 3387-3404
  3. “Hierarchical Scheme for the LC–MSn identification of chlorogenic acids”, M. N. Clifford, K. L. Johnston, S. Knight and N. Kuhnert, J. Agr. Food Chem. 2003, 51, 2900-2911
  4. “Discriminating between the six isomers of dicaffeoylquinic acid by LC-MSn”, M. Clifford, S. Knight and N. Kuhnert, J. Agr. Food Chem. 2005, 53, 3821-3832
  5. “Profiling the chlorogenic acids and hydroxyl cinnamoylshikimates in mate (ilex paraguayensis)”, R. Jaiswal, T. Sovdat and N. Kuhnert, J. Agr. Food Chem. 2010, 58, 5471-5484
  6. “Identification and characterization of proanthocyanidines of 16 members of the Rhododendron genus (Ericaceae) by tandem LC-MS”, R. Jaiswal, L. Jayasinghe and N. Kuhnert, J. Mass Spectrom. 2012, 47, 502-515
  7. “Scope and limitations of principal component analysis of high resolution LC-MS data: The analysis of the chlorogenic acid fraction in green coffee beans as a case study”, N. Kuhnert, R. Jaiswal, PO. Eruvichera, M. El-Abassy, B. von der Kammer and A. Materny, Anal. 2011, 3, 144-155
  8. “On the chemistry of small molecular weight polyphenols in black tea”, J. W. Drynan, M. N. Clifford, J. Obuchowicz and N. Kuhnert, Nat. Prod. Rep. 2010, 27, 417-462
  9. “Unravelling the structure of black tea thearubigins”, N. Kuhnert, Arch. Biochem. Biophys. 2010, 501, 37-51
  10. “Oxidative cascade reactions yielding polyhydroxy-theaflavins and theacitrins in the formation of black tea thearubigins: Evidence by tandem LC-MS”, N. Kuhnert, M. N. Clifford and A. Müller, Food and Function, 2010, 1, 180-199
  11. “Investigating the chemical changes of chlorogenic acids during coffee brewing – conjugate addition of water to the olefinic moiety of chlorogenic acids and their quinides”, M. F. Matei, R. Jaiswal and N. Kuhnert, J. Agr. Food Chem. 2012, 60, 12105-12115
  12. “Unravelling the chemical composition of caramel”, A. Golon and N. Kuhnert, J. Agr. Food Chem. 2012, 60, 3266-3274
  13. “Investigating the thermal decomposition of starch and cellulose in model systems and toasted bread”, A. Golon, X. Hernandez, J. Z. Davalos and N. Kuhnert, J. Agr. Food Chem. 2013, 61, 674-684

About the author

Nikolai Kuhnert studied chemistry at the University of Würzburg and received his PhD under the supervision of Professor W. A. Schenk in 1995. Following postdoctoral stays at the Universities of Cambridge and Oxford, he obtained his first faculty position at the University of Surrey. In 2006, he moved to Jacobs University Bremen, where he is now a Full Professor of Analytical and Organic Chemistry. His research interests cover the application of modern mass spectrometry in structure elucidation of natural products, the analysis of complex mixtures from processed food and the chemistry and biological activity of dietary polyphenolics.
