Introduction
Overview
As a data provider, we need to be able to answer whether our data can be trusted for use.
Why:
[1] Clients do not know what lies behind the data we supply.
[2] Data arrives over time and keeps growing, and ingesting it is a day-to-day operation.
With that in mind, Talend describes data trust as follows:
Data trust means having confidence that your organization’s data is healthy and ready to act on.
Trust is one of the keys to making successful use of your data. Combined with culture and agility, it leads an organization to achieve data health. By ensuring trust in data across the organization and across departments, an organization provides its teams the ability to design exceptional customer experiences, improve operations, streamline decision-making, ensure compliance, and drive innovation. But data trust must be earned and quantified. It can’t be taken on faith. Before trusting your organization’s data, you should prove that it can produce reliable analytics to support well-informed business decisions.
The ACCUTUV method is also used by the Data Management Association of the UK.
The six dimensions of data quality are defined below:
| Dimension | Definition | Example |
| --- | --- | --- |
| Accuracy | The degree to which data correctly describes the real-world object or event in question | Say an accounting record uses the US date format MM/DD/YYYY. Data entered using the European DD/MM/YYYY format could lead to an invoice due May 8th not being paid until August 5th. |
| Completeness | The proportion of data stored against the potential for being 100% complete | Blank values indicate that certain data has not been populated. An address record with 300 rows and 12 missing postal codes would have usable data for 288 addresses, and a completeness rate of 288/300, or 96%. |
| Consistency | The absence of difference when comparing two or more representations of an item against a definition | Do an organization’s HR, legal, and finance teams all use one date format, or would the same date appear as 11/12/2022, 12/11/22, and 22-NOV-12 in reports generated by different departments? |
| Timeliness | The degree to which data is current enough to represent reality as needed to support business functions | In a field that represents company earnings, it's vital to have access to the latest data. Is the delay in providing that data on the order of minutes, days, or weeks? |
| Uniqueness | No item, or entity instance, is recorded more than once based upon how that item is identified | Duplication of a single customer's record based on multiple entries, such as A. Lee, Alan R. Lee, and Alan Lee appearing as three individuals with the same address and contact information. |
| Validity, or Conformity | The degree to which data conforms to the syntax (format, type, or range) of its definition | A street address of 1000 Integration Drive is valid, though not necessarily accurate. A street address of H/*27 Integration Drive is not valid. |
Source: What is Data Trust? Definition & Examples [1]
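Several of these dimensions can be measured directly. The Python sketch below is only an illustration of the completeness, uniqueness, and validity examples from the table; it is not part of the cited source. The function names, the field names (`postal_code`, `address`, `phone`), and the street-address pattern are all assumptions made for this example, and a real rule set would depend on the data definition in use.

```python
import re


def completeness(records, field):
    """Return the share of records whose `field` is populated (0.0 to 1.0)."""
    if not records:
        return 0.0
    filled = 0
    for record in records:
        value = record.get(field)
        # None and blank strings both count as "not populated".
        if value is not None and str(value).strip():
            filled += 1
    return filled / len(records)


def duplicate_keys(records, key_fields):
    """Return the identifying keys that appear in more than one record."""
    seen, dupes = set(), set()
    for record in records:
        key = tuple(str(record.get(f, "")).strip().lower() for f in key_fields)
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes


# Hypothetical validity rule: a house number, one space, then a street name.
STREET_PATTERN = re.compile(r"^\d+ [A-Za-z][A-Za-z .'-]*$")


def is_valid_street(value):
    """Check syntax only; a valid address is not necessarily an accurate one."""
    return bool(STREET_PATTERN.match(value))


# Completeness: 300 address rows, 12 of them without a postal code.
addresses = [{"postal_code": "10001"}] * 288 + [{"postal_code": ""}] * 12
print(f"{completeness(addresses, 'postal_code'):.0%}")  # 96%

# Uniqueness: three name variants sharing one address and phone number.
customers = [
    {"name": "A. Lee", "address": "1000 Integration Drive", "phone": "555-0100"},
    {"name": "Alan R. Lee", "address": "1000 Integration Drive", "phone": "555-0100"},
    {"name": "Alan Lee", "address": "1000 Integration Drive", "phone": "555-0100"},
]
print(len(duplicate_keys(customers, ["address", "phone"])))  # 1

# Validity: format check on the two street addresses from the table.
print(is_valid_street("1000 Integration Drive"))   # True
print(is_valid_street("H/*27 Integration Drive"))  # False
```

Identifying customers by address and phone rather than by name is a design choice of this sketch: it follows the definition of uniqueness as depending on how an item is identified.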
Source Reference
[1] What is Data Trust? Definition & Examples