Coupling NIR spectroscopy and chemometrics for the assessment of food quality

Posted: 28 February 2013 | Federico Marini, Department of Chemistry, University of Rome ‘La Sapienza’

Over the last 30 years, increasing attention has been paid to the use of near-infrared (NIR) spectroscopy for different aspects of food quality assessment[1]. The technique requires little or no sample pretreatment and allows rapid, high-throughput, non-invasive and non-destructive analyses; together with its easy on-line applicability, this makes NIR particularly suitable for real-time assessment and control of food quality both in the laboratory and on an industrial scale.

From a physico-chemical standpoint, the near-infrared region of the electromagnetic spectrum corresponds to high-energy vibrational transitions (overtones and combination bands). Unlike the bands observed at mid-infrared frequencies, these give rise to low-intensity bands which are often heavily overlapped and difficult to interpret directly. This is the main reason why the growth of applications of the technique has accompanied and followed the development and diffusion of chemometric methods for processing spectral data, where ‘chemometrics’ indicates the discipline that uses mathematical, statistical and logical methods to solve chemical problems and to extract the maximum possible information from the measured data[2]. Indeed, when the instrumental signals are noisy and/or poorly specific and, in general, when a large number of variables (i.e., spectral intensities at different wavelengths) are recorded for each sample, the information sought is often hidden and hard to unravel by simply looking at the recorded instrumental profiles. In this article, some successful examples of coupling chemometric data processing methods to the NIR spectroscopic analysis of foodstuffs are presented.

Figure 1 NIR spectra of 57 olive oil samples from Sabina PDO and from other geographical origins

Traceability

It has been widely recognised that, for many foodstuffs, origin (geographic, species and production) represents an important quality attribute, to the point that in 1992 the EU introduced rules on designations of origin (Protected Designation of Origin (PDO) and Protected Geographical Indication (PGI)) to protect typical products[3]. In this framework, NIR spectroscopy coupled to suitable classification methods constitutes a promising approach for tracing the origin of foodstuffs in a rapid, relatively cheap and non-destructive way. The aim of chemometric classification methods is to build models that accurately predict a qualitative property of the samples from their experimental fingerprint; in traceability problems, origin is the qualitative property to be predicted. To illustrate this concept, a case study concerning the authentication of olive oil samples from the PDO of Sabina, an oil-producing area in central Italy, is presented[4]. The aim of the research was to build a model able to recognise samples coming from this PDO area on the basis of their spectral fingerprint. To this purpose, olive oil samples from the Sabina area and from other origins, both elsewhere in Italy and in the rest of the world, were collected and analysed by NIR spectroscopy (see Figure 1).

Classification methods belong to the family of so-called supervised algorithms, i.e. algorithms that require a set of samples for which the desired response (in this case, the geographical origin) is known (the training set), as this information is actively used in building the models. Since in this example attention was focused on the PDO of Sabina, only two categories were defined: oils from Sabina and oils from other origins (irrespective of their provenance). After spectral pretreatment by the first derivative, classification of the oils was accomplished by means of an algorithm called Partial Least Squares-Discriminant Analysis (PLS-DA). This method is particularly suitable for high-dimensional data, i.e. data containing many variables, such as spectral fingerprints. At the same time, it provides a parsimonious description of the samples in terms of a few latent (abstract, mathematically constructed) variables, which facilitates the interpretation of the results and a reliable prediction of the origin of unknown samples. These characteristics are well exemplified in Figure 2, which shows the projection of the samples onto the space spanned by the first three latent variables of the model: the training samples are represented as points in the multivariate space and appear clearly separated into two groups, the first corresponding to the extra virgin oils from the PDO Sabina and the second to the oils of other geographical origins.
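
As a rough illustration of the workflow just described (first-derivative pretreatment followed by PLS-DA), the following Python sketch may help. It is not the code used in the study: the spectra are synthetic, the derivative is a plain finite difference, the NIPALS-type function pls1_fit and the 0.5 decision threshold are assumptions made for the example, and all names are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with a NIPALS-type algorithm.

    Returns the column means of X, the mean of y and the regression vector b,
    so that predictions are (X_new - x_mean) @ b + y_mean.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                  # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = Xr @ w                     # scores on the latent variable
        tt = t @ t
        p = Xr.T @ t / tt              # X loadings
        q_a = yr @ t / tt              # regression of y on the scores
        Xr = Xr - np.outer(t, p)       # deflate X and y
        yr = yr - q_a * t
        W.append(w); P.append(p); q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, b

def first_derivative(X):
    """Finite-difference first derivative along the spectral axis (assumption)."""
    return np.gradient(X, axis=1)

# Synthetic two-class "spectra": random offset and slope plus a class-dependent band.
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 200)      # arbitrary spectral axis, not real wavelengths

def simulate(n, band_height):
    offset = rng.normal(1.0, 0.2, size=(n, 1))
    slope = rng.normal(0.0, 0.2, size=(n, 1))
    band = band_height * np.exp(-((axis - 0.6) / 0.05) ** 2)
    noise = rng.normal(0.0, 0.005, size=(n, axis.size))
    return offset + slope * axis + band + noise

X_train = np.vstack([simulate(40, 0.5), simulate(40, 0.2)])
y_train = np.r_[np.ones(40), np.zeros(40)]   # 1 = target class, 0 = other origins
X_test = np.vstack([simulate(20, 0.5), simulate(20, 0.2)])
y_test = np.r_[np.ones(20), np.zeros(20)]

# PLS-DA: regress the 0/1 class code on the pretreated spectra, then threshold.
x_mean, y_mean, b = pls1_fit(first_derivative(X_train), y_train, n_components=2)
y_hat = (first_derivative(X_test) - x_mean) @ b + y_mean
predicted = (y_hat > 0.5).astype(float)
accuracy = float(np.mean(predicted == y_test))
print(f"test accuracy on synthetic data: {accuracy:.2f}")
```

In practice the number of latent variables and the decision rule would be chosen by cross-validation on the training set rather than fixed in advance as here.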

The good separation between the two groups suggests that using the model to predict the origin of unknown samples should result in a high correct classification rate. Accordingly, when the model was applied to the NIR spectra collected on validation samples from both categories (i.e., samples of known origin which were not used in the model building phase), 100 per cent of the oils from Sabina and 95.5 per cent of those from other origins were correctly predicted. Indeed, the positions of the test samples in the model space, reported in Figure 2, fall well inside the areas occupied by their respective categories, which explains the accurate prediction of the origin of the oils used for validation.
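
The per-category figures quoted above are correct classification rates, i.e. the fraction of validation samples of each category assigned to the right class. A minimal Python sketch of that calculation follows; the labels and counts are invented for illustration and are not the data of the study.

```python
def correct_classification_rates(true_labels, predicted_labels):
    """Return {category: fraction of its samples that were predicted correctly}."""
    totals, hits = {}, {}
    for truth, guess in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        hits[truth] = hits.get(truth, 0) + (truth == guess)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical validation set: 5 target-class oils and 10 oils from other origins,
# with one of the latter misclassified.
true_labels = ["Sabina"] * 5 + ["Other"] * 10
predicted_labels = ["Sabina"] * 5 + ["Other"] * 9 + ["Sabina"]

rates = correct_classification_rates(true_labels, predicted_labels)
for label, rate in rates.items():
    print(f"{label}: {100 * rate:.1f} per cent correctly classified")
```

Reporting the rate separately for each category, rather than a single overall accuracy, shows whether errors are false acceptances or false rejections of the protected origin.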

Figure 2 PLS-DA analysis of the olive oil data set: projection of the training and test samples from the two categories (Sabina and Other origins) onto the first three latent variables calculated by the model. Legend: ● Sabina training; ■ Other origins training; ● Sabina test; ■ Other origins test

Quantification of food ingredients

The concepts described in the previous section for qualitative responses can be extended to cases where the variables to be predicted are quantitative, such as the concentrations of food constituents or the values of analytical indices used for quality control (e.g., peroxide value or iodine value). The methods involved are known as multivariate calibration techniques and aim at finding a relationship between a multivariate signal (e.g., the spectral fingerprint) and one or more real-valued dependent variables. Quite often, this relationship is assumed to be as simple as possible, meaning that a linear dependence is postulated. In this framework, the algorithm most often used when there are many highly correlated predictors is Partial Least Squares regression (PLS). As the names suggest, PLS is the algorithm underlying the PLS-DA classification method discussed in the previous section; its main characteristic is that it looks for directions onto which to project the samples that have the highest covariance with the responses to be modelled. This means that PLS provides a parsimonious representation of the data set, projecting the samples onto a low-dimensional space whose axes have the highest correlation with the dependent variables while explaining as much of the variability in the predictor space as possible. As an example, the use of NIR spectroscopy to quantify two important nutritional factors in naked oat samples is presented here[5]. The same spectroscopic fingerprint can be used as the block of independent variables to predict more than one constituent in the same food matrix, allowing a rapid, non-destructive and multi-component analysis.
In this example, this advantage was used to build calibration models for the simultaneous prediction of protein and β-glucan content in naked oat samples. The high content of dietary fibre, and of β-glucan in particular, has triggered increasing interest in the use of oats for human consumption, especially since many recent studies have shown that these nutritional factors can have physiological effects and a positive impact on some of the risk factors for cardiovascular diseases. It must be pointed out that the available analytical methods for the quantification of β-glucan in oat samples are destructive, require long and cumbersome sample preparation and sometimes do not even possess the required accuracy. It is in cases like this that the advantages of coupling NIR spectroscopy to chemometrics are most evident. The approach described here allows not only an accurate and non-destructive determination of the component of interest, β-glucan, but also the use of the same spectral fingerprint to predict, without performing further experiments, another important nutritional factor on the same samples (protein content). The results of the two calibrations are reported in Figure 3, both for the training samples and for the test set used to validate the models; in both cases the PLS models allow an accurate quantification of the variables to be predicted.
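
The covariance criterion underlying PLS regression, mentioned above, can be written compactly for a single response. The notation below is not part of the original article and is introduced only for illustration: X is the column-centred matrix of spectra, y the centred response, w_1 the first weight vector, T and P the score and loading matrices, q the regression coefficients of y on the scores, and E and f the residuals.

```latex
\mathbf{w}_1 = \arg\max_{\lVert \mathbf{w} \rVert = 1} \operatorname{cov}(\mathbf{X}\mathbf{w}, \mathbf{y})
\quad\Longrightarrow\quad
\mathbf{w}_1 = \frac{\mathbf{X}^{\mathsf{T}}\mathbf{y}}{\lVert \mathbf{X}^{\mathsf{T}}\mathbf{y} \rVert},
\qquad
\mathbf{X} = \mathbf{T}\mathbf{P}^{\mathsf{T}} + \mathbf{E},
\qquad
\mathbf{y} = \mathbf{T}\mathbf{q} + \mathbf{f}
```

Subsequent latent variables are obtained in the same way after removing from X and y the part already explained by the previous ones.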

Figure 3 Results of PLS analysis for the quantification of β-glucan (a) and protein (b) content in naked oat samples. Legend: ● Training samples; ■ Test samples

Moreover, when using a calibration method based on the projection of the samples onto latent variables, it is possible to identify the spectral regions which are most correlated with the responses and therefore to interpret the results in terms of significant bands. In particular, an index called VIP, expressing how much each experimental variable contributes to the bilinear calibration model, was computed for this purpose. As a result, the regions between 400 and 700 nanometres, around 1150 nanometres and between 1900 and 2300 nanometres were found to contribute the most to the model for β-glucan. This outcome is particularly relevant, as the last of these regions is reported in the literature to be an important band for polysaccharides, corresponding to OH stretching/deformation and C-O/O-H stretching combination bands. Similar considerations can be made for the model for the prediction of protein content, for which the relevant regions were found to be those between 400 and 700 nanometres, around 1100 and 1500 nanometres and between 2250 and 2498 nanometres, the latter two intervals corresponding to N-H deformation bands.
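
VIP is commonly expanded as ‘variable importance in projection’. One widely used definition for a single-response PLS model weights the squared PLS weights of each variable by the amount of response variance explained by each latent variable. The Python sketch below follows that definition on synthetic data; whether the study used exactly this formula is an assumption, the data are invented, and the threshold VIP > 1 is only a common rule of thumb.

```python
import numpy as np

def pls1_vip(X, y, n_components):
    """VIP scores of a single-response PLS model fitted with a NIPALS-type loop."""
    Xr, yr = X - X.mean(axis=0), y - y.mean()
    n_vars = X.shape[1]
    weighted = np.zeros(n_vars)
    total_ss = 0.0
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)         # unit-length weight vector
        t = Xr @ w
        tt = t @ t
        p = Xr.T @ t / tt
        q = yr @ t / tt
        ss = q ** 2 * tt               # response sum of squares explained by this LV
        weighted += ss * w ** 2
        total_ss += ss
        Xr = Xr - np.outer(t, p)
        yr = yr - q * t
    return np.sqrt(n_vars * weighted / total_ss)

# Synthetic data: the analyte absorbs around variable 70, an interferent around 30.
rng = np.random.default_rng(1)
idx = np.arange(100)
analyte_band = np.exp(-((idx - 70) / 5.0) ** 2)
interferent_band = np.exp(-((idx - 30) / 5.0) ** 2)
conc = rng.uniform(1.0, 5.0, 60)
other = rng.uniform(1.0, 5.0, 60)
X = (np.outer(conc, analyte_band) + np.outer(other, interferent_band)
     + rng.normal(0.0, 0.01, (60, idx.size)))

vip = pls1_vip(X, conc, n_components=2)
most_important = int(np.argmax(vip))
print("variable with the highest VIP:", most_important)
print("variables with VIP > 1:", np.flatnonzero(vip > 1.0))
```

With unit-length weight vectors the squared VIP values average to one over all variables, which is why values above one are often read as above-average contributions.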

When the relationship between the responses and the predictors is not linear, or cannot reasonably be approximated as linear, it is still possible to achieve accurate calibrations in the framework of PLS regression. One option is to use models which are only locally linear rather than linear over the whole data range: whenever an unknown sample has to be analysed, the training samples most similar to it are identified and the regression model is built using only the data from these samples. This approach was applied, for instance, to build a model for the prediction of egg content in egg pasta samples produced under different manufacturing conditions, especially concerning drying time and temperature[6]. When inspecting the effect of the different processing conditions on the NIR fingerprint, it was found that non-linearities in the response could arise from the interaction of drying temperature and egg content; this hypothesis was supported by the relatively high prediction error obtained on the validation set when using a globally linear model. On the other hand, by adopting a local PLS approach, a very accurate prediction of the response could be obtained, the prediction error being almost half of that resulting from standard PLS.
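
The local strategy described above can be sketched in Python as follows. This is only an illustration under stated assumptions: similarity is measured by Euclidean distance between spectra, the number of neighbours is fixed at 15, the data are synthetic with a deliberately non-linear response, and the function names are hypothetical. The criteria actually used in the egg pasta study may differ.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Single-response PLS (NIPALS-type); returns x_mean, y_mean, regression vector."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w
        tt = t @ t
        p = Xr.T @ t / tt
        q_a = yr @ t / tt
        Xr = Xr - np.outer(t, p)
        yr = yr - q_a * t
        W.append(w); P.append(p); q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return x_mean, y_mean, W @ np.linalg.solve(P.T @ W, q)

def local_pls_predict(X_train, y_train, x_new, n_neighbours, n_components):
    """Predict one sample from a PLS model built on its nearest training spectra."""
    distances = np.linalg.norm(X_train - x_new, axis=1)
    nearest = np.argsort(distances)[:n_neighbours]
    x_mean, y_mean, b = pls1_fit(X_train[nearest], y_train[nearest], n_components)
    return float((x_new - x_mean) @ b + y_mean)

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic data: spectra vary linearly with a latent factor c, the response as c**2.
rng = np.random.default_rng(2)
idx = np.arange(100)
band = np.exp(-((idx - 50) / 5.0) ** 2)

def simulate(n):
    c = rng.uniform(0.0, 3.0, n)
    X = np.outer(c, band) + rng.normal(0.0, 0.01, (n, idx.size))
    return X, c ** 2

X_train, y_train = simulate(150)
X_test, y_test = simulate(50)

# One latent variable is enough here because the synthetic spectra have one source of variation.
x_mean, y_mean, b = pls1_fit(X_train, y_train, n_components=1)
global_rmse = rmse(y_test, (X_test - x_mean) @ b + y_mean)
local_pred = np.array([local_pls_predict(X_train, y_train, x, 15, 1) for x in X_test])
local_rmse = rmse(y_test, local_pred)
print(f"global PLS RMSE: {global_rmse:.3f}   local PLS RMSE: {local_rmse:.3f}")
```

The price of the local approach is that a new model is fitted for every unknown sample, and the number of neighbours becomes an additional parameter to optimise.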

Conclusions

The coupling of near-infrared spectroscopy with chemometric data processing techniques provides a valid and versatile tool for tackling different problems related to food authentication and quality control. The examples described in this article show that qualitative and quantitative predictions can be obtained on different samples with high accuracy and almost always without the need for sample pretreatment.

References

1. Y. Ozaki, W.F. McClure, and A.A. Christy (Eds.), Near-Infrared Spectroscopy in Food Science and Technology, John Wiley and Sons, New York, 2003
2. D.L. Massart, B.G.M. Vandeginste, S.N. Deming, Y. Michotte, and L. Kaufman, Chemometrics. A textbook, Elsevier, Amsterdam, 1988
3. European Commission, Regulation (EC) no. 2081/1992, Off. J. Eur. Union L208 (1992) 1–8
4. M. Bevilacqua, R. Bucci, A.D. Magrì, A.L. Magrì, F. Marini, Tracing the origin of extra virgin olive oils by infrared spectroscopy and chemometrics: A case study, Anal. Chim. Acta, 717 (2012) 39–51
5. S. Bellato, V. Del Frate, R. Redaelli, D. Sgrulletta, R. Bucci, A.D. Magrì, F. Marini, Use of Near Infrared Reflectance and Transmittance Coupled to Robust Calibration for the Evaluation of Nutritional Value in Naked Oats, J. Agric. Food Chem., 59 (2011) 4349–60
6. M. Bevilacqua, R. Bucci, S. Materazzi, F. Marini, Application of near infrared (NIR) spectroscopy coupled to chemometrics for dried egg-pasta characterization and egg content quantification, Food Chem., in press. http://dx.doi.org/10.1016/j.foodchem.2012.11.018

Biography

Dr. Federico Marini is a researcher at the University of Rome ‘La Sapienza’, where he also teaches chemometrics at both the undergraduate and graduate levels. His research interests involve the development and application of classification methods, especially in the field of food authentication, nature-inspired methods (artificial neural networks, genetic algorithms, particle swarm optimisation) and multiway analysis.

Issue

Issue 1 2013

Related organisations

University of Rome ‘La Sapienza’
