The pros and cons of unified data for R&D

Posted: 9 December 2021 | Noel Hollingsworth

Uncountable’s CEO, Noel Hollingsworth, explains the process behind unifying data and what the benefits and pitfalls may include.

As R&D teams seek to digitalise their workflows, they run into the same recurring problem: data proliferating across numerous sources, governed by multiple standards and processes. Where digitalisation previously focused on making data available in a computerised form, companies now need to make data usable without friction – whether by scientists or by computers, for example artificial intelligence (AI) software. Unifying different data sources is typically a necessary step towards usability and can deliver a strong return on investment (ROI) – but it isn’t cheap. This article outlines the process of unifying data, highlighting the benefits and downsides.

Unified data

The goal of a unified data system is to unite different data sources in a single system. An example of a non-unified setup might be an electronic lab notebook (ELN) that holds experimental notes, a laboratory information management system (LIMS) that records the results of developmental and production trials, another system for recording consumer data, and yet another for regulatory data. It may be easy to find data in any one of these systems, but it is often very time-consuming to tie the data together across systems and understand how an experimental change recorded in the ELN led to a different observed output behaviour in the LIMS, which in turn led to different customer data.

The process of unifying the data typically works either by implementing a more complete system that replaces multiple existing systems, or by linking the data within a hub system. A mixed approach often works best; it is common for an ELN and a LIMS to be replaced with a modern solution that brings the best of both frameworks, while pricing information is still pulled from a production enterprise resource planning (ERP) system.
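The hub approach can be illustrated with a minimal Python sketch. It assumes that every system can export records carrying a shared experiment identifier; the field names (`experiment_id`, `viscosity_cp`, `liking_score`) and the sample values are hypothetical and are not taken from any particular ELN or LIMS product.

```python
from collections import defaultdict

# Hypothetical exports from three separate systems (illustrative values only).
eln_notes = [
    {"experiment_id": "EXP-001", "change": "raised ingredient A from 2% to 3%"},
    {"experiment_id": "EXP-002", "change": "swapped ingredient B for ingredient C"},
]
lims_results = [
    {"experiment_id": "EXP-001", "viscosity_cp": 1200},
    {"experiment_id": "EXP-002", "viscosity_cp": 950},
]
consumer_feedback = [
    {"experiment_id": "EXP-001", "liking_score": 7.5},
]


def unify(**sources):
    """Merge records from several systems into one view per experiment."""
    unified = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            key = record["experiment_id"]
            # Keep everything except the join key, grouped by originating system.
            fields = {k: v for k, v in record.items() if k != "experiment_id"}
            unified[key][source_name] = fields
    return dict(unified)


hub = unify(eln=eln_notes, lims=lims_results, consumer=consumer_feedback)
print(hub["EXP-001"])
```

With the records linked in one place, the chain from an experimental change to a lab result to consumer feedback can be read from a single entry. The sketch only works because all three exports share an identifier, which is exactly what separate systems often lack.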

Downsides of unified data

The two primary downsides of the unified approach are switching costs and the need for more rigorous data entry. The first is the more obvious – there are existing systems in place that scientists are used to working with and that IT has vetted. A unified approach requires leaving some of these systems behind and tying others into the new framework. Change management will be needed from both R&D and IT. Organisations must ensure that any new provider will work with them and be invested in the success of this changeover. It would be a costly mistake to pay a software provider a large upfront fee and then not work with them afterwards, as a vendor that has already been paid has less incentive to ensure success. Modern approaches such as subscription-based billing help align incentives here.

The less obvious downside is the need for rigour in a unified system. When every scientist works in their own lab notebook page, it is acceptable for one scientist to call something “Ing A”, another “Ingredient A”, and another “Trade Name XYZ”. However, the entire benefit of a unified system lies in standardising this information. The organisation should make clear internally that the goal is to represent each object in a consistent manner, and should check that the provider in question can accommodate this. Features such as access controls over who can edit inputs and outputs, and the ability to merge duplicate records, should be easy to use.
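As a rough illustration of what consistent representation means in practice, the Python sketch below maps the three spellings mentioned above to a single identifier. The alias table and the canonical ID `ingredient_a` are hypothetical; in a real system this mapping would be maintained by whoever holds edit rights over ingredient records.

```python
# Hypothetical alias table: every known spelling maps to one canonical ID.
ALIASES = {
    "ing a": "ingredient_a",
    "ingredient a": "ingredient_a",
    "trade name xyz": "ingredient_a",
}


def canonical_name(raw: str) -> str:
    """Return the canonical ID for a free-text ingredient name."""
    # Normalise case and whitespace before looking the name up.
    key = " ".join(raw.lower().split())
    try:
        return ALIASES[key]
    except KeyError:
        # Rejecting unknown names forces a decision at data entry time
        # instead of letting a new spelling slip in unnoticed.
        raise ValueError(f"Unknown ingredient name: {raw!r}") from None


entries = ["Ing A", "Ingredient A", "  trade name XYZ "]
canonical = {canonical_name(entry) for entry in entries}
print(canonical)
```

Raising an error on unknown names is one possible design choice; a more forgiving system might queue them for review instead.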

Unified data can make data easier to find – especially for bigger teams

Benefits of unified data

Despite these concerns, unifying data can provide a high ROI, especially for organisations with larger teams. The first key benefit is simply the ease of finding data. Finding a formulation that was used in an R&D (not production) experiment, met certain output targets and used certain ingredients often takes a phone call today. When that becomes a 15-second query, scientists can spend more time innovating and less time cleaning data.
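The kind of query described above might look like the following Python sketch once the data sits in one place. The `Formulation` record, its fields and the sample values are assumptions made for illustration; an actual platform would expose its own query interface.

```python
from dataclasses import dataclass


@dataclass
class Formulation:
    name: str
    stage: str          # "R&D" or "production"
    ingredients: set    # canonical ingredient IDs
    outputs: dict       # measured output name -> value


def find_formulations(records, stage, required_ingredients, min_outputs):
    """Return formulations at a given stage that contain all required
    ingredients and meet every minimum output target."""
    matches = []
    for f in records:
        if f.stage != stage:
            continue
        if not required_ingredients <= f.ingredients:
            continue
        # A missing measurement counts as failing the target.
        if all(f.outputs.get(k, float("-inf")) >= v for k, v in min_outputs.items()):
            matches.append(f)
    return matches


# Hypothetical records (illustrative values only).
records = [
    Formulation("F-101", "R&D", {"ingredient_a", "ingredient_b"}, {"viscosity_cp": 1200}),
    Formulation("F-102", "production", {"ingredient_a", "ingredient_b"}, {"viscosity_cp": 1300}),
    Formulation("F-103", "R&D", {"ingredient_a"}, {"viscosity_cp": 800}),
    Formulation("F-104", "R&D", {"ingredient_a", "ingredient_c"}, {"viscosity_cp": 1500}),
]

hits = find_formulations(
    records,
    stage="R&D",
    required_ingredients={"ingredient_a"},
    min_outputs={"viscosity_cp": 1000},
)
names = [f.name for f in hits]
print(names)
```

The filter depends on ingredients being stored under consistent identifiers and on outputs being attached to the same record as the inputs, which is what a unified system is meant to guarantee.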

Once data can be found easily, the next step is ease of visualisation and analysis. Many scientists within R&D organisations work by extracting data from these systems, cleaning it in Excel, and then plotting it in a standalone program. Each step of this process takes time and creates the possibility of introducing errors. Most R&D teams have questions they would like to ask of their data but have never got around to, because simply compiling the information would take too long. These questions can be answered far more easily with a simple querying and filtering process in one place than by reaching across multiple programs.

For many teams, the end goal of a unified system is AI capability. While it is necessary to caution that wins from AI will not be immediate, AI can be a major benefit of a unified data approach. AI software needs clean data to work – no algorithm can compensate for inputs and outputs that are not available together, or for inconsistent input labels. Today, because data is split across multiple systems, AI work often involves one-off projects with lengthy data cleaning phases; these introduce potential for error and shrink the available data pool. A unified system will help ensure that AI work is part of a standard workflow, enabling it to be used more often across an organisation.
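The point about inputs and outputs needing to be available together can be shown with a small Python sketch. It assumes two hypothetical exports keyed by experiment ID and keeps only the experiments that appear in both, reporting how many had to be discarded.

```python
def training_rows(inputs_by_id, outputs_by_id):
    """Pair inputs with outputs and count experiments lost to missing data."""
    shared = sorted(inputs_by_id.keys() & outputs_by_id.keys())
    rows = [(inputs_by_id[i], outputs_by_id[i]) for i in shared]
    # Anything present in only one export cannot be used for training.
    dropped = len(inputs_by_id.keys() | outputs_by_id.keys()) - len(shared)
    return rows, dropped


# Hypothetical exports (illustrative values only).
inputs_by_id = {
    "EXP-001": {"ingredient_a_pct": 2.0},
    "EXP-002": {"ingredient_a_pct": 3.0},
    "EXP-003": {"ingredient_a_pct": 4.0},
}
outputs_by_id = {
    "EXP-001": {"viscosity_cp": 950},
    "EXP-003": {"viscosity_cp": 1400},
    "EXP-004": {"viscosity_cp": 1100},
}

rows, dropped = training_rows(inputs_by_id, outputs_by_id)
print(len(rows), "usable experiments;", dropped, "dropped")
```

In this toy case half of the experiments are unusable, which is the shrinking data pool described above. When inputs and outputs are recorded against the same record from the start, this pairing step is no longer needed.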

Conclusion

Moving to a unified data system requires both effort and attention to rigour within a business. But if organisations accept this, there are many benefits to be reaped – from searching the data more easily, to analysing it in one place, and eventually to applying AI.

About the author

Noel Hollingsworth is the CEO and one of the co-founders of Uncountable, which works to centralise scientific development data across a number of industries, including cosmetics, personal care, flavours and fragrances, and more. Noel has a background in software engineering, and prior to his work at Uncountable was named to Forbes 30 under 30 for his work in machine learning. Today, Noel works directly with Uncountable’s customers to implement their vision for a centralised data platform, helping them to innovate more quickly to meet modern consumers’ ever-changing needs.

Related organisations

Uncountable

