Data is everywhere. The list of terms that begin with "Data" and are bandied about in everyday conversation probably makes it one of the most used words of the day: Data Science, Data Analysis, Data Governance, Data Stewardship, Data Structure, Data Mining, Data Lake, Data Warehouse, and the list goes on. Let's start at the beginning and, for the purpose of this blog, focus only on good old Data Quality. Without a certain level of Data Quality, any follow-up exercise is bound to fail.
What is Data Quality?
At its most basic, data quality refers to the accuracy, completeness, and consistency of the data that is used to make decisions and drive business processes. Data quality can be degraded by both unintentional and intentional drivers. On one side, the complexity and inter-dependencies of multiple systems have an effect; on the other are factors such as human error and manipulation. The recent lawsuit over JPMorgan Chase’s acquisition of the fintech company Frank is a case in point. Imagine a bank the size of JPMorgan missing, in its due diligence, that 93% of customer records were fake, as the lawsuit alleges, in a deal that cost $175 million. Not a small amount.
Why is Data Quality Important?
Data quality is essential for meeting compliance and legal requirements. Many industries have strict regulations in place when it comes to data privacy and security, and companies must ensure that they follow these regulations. Poor data quality can lead to data breaches and other compliance issues, which can result in significant fines and penalties. According to Alation's State of Data Culture Report, 87% of respondents say data quality issues are a barrier to the successful implementation of AI in their organizations.*
Data quality has implications across industries. For example, in Pharmaceuticals and Life Sciences, data quality is of paramount importance in the clinical development process for new medical treatments and drugs. Clinical trials rely on accurate and reliable data to establish the safety and efficacy of new treatments, and poor data quality can have serious consequences for both the development of new treatments and patient safety.
Ensuring data quality in clinical development requires a combination of technology and processes. On the technology side, companies can use electronic data capture (EDC) systems, which are specifically designed for the collection of clinical trial data. These systems allow for data validation, real-time data monitoring, and automated data cleaning, which help to ensure data quality. On the process side, companies can establish robust data management plans, which outline the procedures for data collection, storage, and analysis. This includes establishing roles and responsibilities for data quality management, as well as providing training and education for employees on data quality best practices.
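To make the idea of validation at the point of capture a little more concrete, here is a minimal Python sketch of the kind of "edit checks" an EDC system might run when a form is submitted. It is an illustration only, not how any particular EDC product works: the field names (`systolic_bp`, `consent_date`, `visit_date`), the thresholds, and the `run_edit_checks` helper are all assumptions made for this example.

```python
from datetime import date

# Each rule is (description, predicate). Field names and thresholds are
# hypothetical and chosen only for illustration.
EDIT_CHECKS = [
    ("systolic_bp must be between 60 and 260 mmHg",
     lambda e: 60 <= e["systolic_bp"] <= 260),
    ("visit_date must not precede consent_date",
     lambda e: e["visit_date"] >= e["consent_date"]),
    ("visit_date must not be in the future",
     lambda e: e["visit_date"] <= date.today()),
]


def run_edit_checks(entry):
    """Return the description of every check that the entry fails."""
    failures = []
    for description, passes in EDIT_CHECKS:
        try:
            ok = passes(entry)
        except (KeyError, TypeError):
            ok = False  # a missing or mistyped field also fails the check
        if not ok:
            failures.append(description)
    return failures


# A submitted form with an implausible reading and a date that precedes consent.
entry = {
    "systolic_bp": 300,
    "consent_date": date(2023, 3, 1),
    "visit_date": date(2023, 2, 20),
}
for problem in run_edit_checks(entry):
    print("Query raised:", problem)
```

Keeping the rules in a plain list means that, in a sketch like this, a data manager can review or extend them without touching the checking logic, which mirrors the split between technology and process described above.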
Having established the implications and impact of data quality, let us look at the three important components of Data Quality, namely Accuracy, Completeness and Consistency.
The Three Important Components of Data Quality
- Accuracy: Accurate data is essential for making informed decisions about the safety and efficacy of new treatments. For example, if a clinical trial relies on inaccurate data to measure the effectiveness of a new drug, the drug may be deemed ineffective when it is in fact effective. This can delay or even halt the development of a potential treatment that could help many patients. Another example can be drawn from Supply Chain. If a company relies on inaccurate data to make decisions about inventory levels, it may end up overstocking or understocking certain items, which can lead to wasted resources and lost sales.
- Completeness: Completeness is equally important for data quality in clinical development. Complete data ensures that all relevant information is included and nothing of importance is omitted. This matters most when making decisions about the safety and efficacy of new treatments, as incomplete data can lead to false conclusions and poor decision making. For example, if a trial uses incomplete data to track adverse events, it may miss important information that could help identify serious safety concerns, and the trial may ultimately be terminated even though the medication might have been effective. Another example comes from the sales function: if a company uses incomplete data to track sales trends, it may miss important information that could help it identify new opportunities or address emerging issues.
- Consistency: Consistent data ensures that the same information is being used across different trials and studies, and that it is being used in the same way. This is important for ensuring that different trials are comparing like with like, and that the data can be combined and analyzed effectively. For example, if different trials use inconsistent data to track patient outcomes, it may be difficult to establish a clear picture of the safety and efficacy of a new treatment. Similarly, if a company uses inconsistent data to track customer information, different departments may have different views of the same customer, which can lead to confusion and inefficiencies.
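The three components above can be sketched as simple checks over a handful of records. This is a minimal Python illustration under stated assumptions, not a full data quality framework: the field names (`patient_id`, `age`, `country`), the plausible age range, and the three helper functions are hypothetical. Note also that a range check only catches implausible values; confirming true accuracy still means comparing against a trusted source.

```python
from collections import defaultdict

# Hypothetical required fields for this example.
REQUIRED_FIELDS = ("patient_id", "age", "country")


def completeness_issues(records):
    """Return (index, field) pairs for required fields that are missing or empty."""
    return [
        (i, field)
        for i, rec in enumerate(records)
        for field in REQUIRED_FIELDS
        if rec.get(field) in (None, "")
    ]


def accuracy_issues(records, min_age=0, max_age=120):
    """Return indexes of records whose age is outside an assumed plausible range."""
    return [
        i
        for i, rec in enumerate(records)
        if isinstance(rec.get("age"), (int, float))
        and not min_age <= rec["age"] <= max_age
    ]


def consistency_issues(records, key="patient_id", field="country"):
    """Return keys that carry more than one distinct value for the same field."""
    seen = defaultdict(set)
    for rec in records:
        if rec.get(key) is not None and rec.get(field) not in (None, ""):
            seen[rec[key]].add(rec[field])
    return sorted(k for k, values in seen.items() if len(values) > 1)


records = [
    {"patient_id": "P1", "age": 34, "country": "DE"},
    {"patient_id": "P2", "age": 340, "country": "FR"},      # implausible age
    {"patient_id": "P3", "age": None, "country": "US"},     # missing age
    {"patient_id": "P1", "age": 34, "country": "Germany"},  # same patient, different coding
]

print("Incomplete:", completeness_issues(records))
print("Implausible:", accuracy_issues(records))
print("Inconsistent:", consistency_issues(records))
```

In this sample, one record is incomplete, one holds an implausible value, and one patient is coded two different ways, which is exactly the "different views of the same customer" problem described in the last bullet.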
Ultimately, it all comes down to Garbage In, Garbage Out. Ensuring data quality requires a combination of technology and processes, including electronic data capture systems, data management plans, data governance policies and procedures, and employee training and education. Establishing roles and responsibilities for data quality management is also an important aspect.
At DefineRight, we understand that data quality is not merely a data problem. There are underlying dimensions of Process and Technology, and a very important component of People and change management for good adoption. We use human expertise and tools for each of the underlying dimensions to check data for accuracy, completeness, and consistency, and to help expedite the operational execution of your data quality strategy, with an estimated time savings of 18-25%.