Introduction

Overview

As a data provider, we need to be able to answer whether our data can be trusted for use.

Why:

[1] Clients do not know what lies behind the data we supply.

[2] Data arrives continuously and keeps growing in volume, so ingesting and tracking it is a day-to-day operation.

With that in mind, Talend describes data trust as follows:

Data trust means having confidence that your organization’s data is healthy and ready to act on.

Trust is one of the keys to making successful use of your data. Combined with culture and agility, it leads an organization to achieve data health. By ensuring trust in data across the organization and across departments, an organization provides its teams the ability to design exceptional customer experiences, improve operations, streamline decision-making, ensure compliance, and drive innovation. But data trust must be earned and quantified. It can’t be taken on faith. Before trusting your organization’s data, you should prove that it can produce reliable analytics to support well-informed business decisions.

The Data Management Association of the UK uses the ACCUTUV method. Its six dimensions of data quality are defined below:

Dimension

Definition

Example

Accuracy

The degree to which data correctly describes the real-world object or event in question

Say an accounting record uses the US date format MM/DD/YYYY. Data entered using the European DD/MM/YYYY format could lead to an invoice due May 8th not being paid until August 5th.

Completeness

The proportion of data stored against the potential for being 100% complete

Blank values indicate that certain data has not been populated. An address record with 300 rows and 12 missing postal codes would have usable data for 288 addresses, and a completeness rate of 288/300, or 96%.

Consistency

The absence of difference when comparing two or more representations of an item against a definition

Do an organization’s HR, legal, and finance teams all use one date format, or would the same date appear as 11/12/2022, 12/11/22, and 22-NOV-12 in reports generated by different departments?

Timeliness

The degree to which data is current enough to represent reality as needed to support business functions

In a field that represents company earnings, it's vital to have access to the latest data. What is the delay in providing that data — is it on the order of minutes, days, or weeks?

Uniqueness

No item, or entity instance, is recorded more than once based upon how that item is identified

Duplication of a single customer's record based on multiple entries, such as A. Lee, Alan R. Lee, and Alan Lee appearing as three individuals with the same address and contact information.

Validity, or Conformity

The degree to which data conforms to the syntax (format, type, or range) of its definition

A street address of 1000 Integration Drive is valid, though not necessarily accurate. A street address of H/*27 Integration Drive is not valid.

Source: What is Data Trust? Definition & Examples
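
The examples in the table can be turned into simple measurements. The following is a minimal sketch, not a method taken from Talend or the Data Management Association of the UK. It assumes records are plain Python dictionaries. The function names, field names, sample values, and the street-address rule are all hypothetical and chosen only to mirror the table's examples.

```python
import re
from datetime import datetime


def completeness(rows, field):
    """Share of rows in which `field` is populated (not None or blank)."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if str(row.get(field) or "").strip())
    return filled / len(rows)


def uniqueness(rows, key_fields):
    """Share of rows left after collapsing rows that share the same key."""
    if not rows:
        return 0.0
    distinct = {tuple(row.get(f) for f in key_fields) for row in rows}
    return len(distinct) / len(rows)


def validity(rows, field, rule):
    """Share of rows whose `field` fully matches the syntax rule."""
    if not rows:
        return 0.0
    valid = sum(1 for row in rows if rule.fullmatch(str(row.get(field) or "")))
    return valid / len(rows)


# Assumed syntax rule: a house number followed by one or more alphabetic words.
STREET_RULE = re.compile(r"\d+( [A-Za-z]+)+")

# Completeness: 300 address rows, 12 of them without a postal code.
addresses = [{"postal_code": "" if i < 12 else "10001"} for i in range(300)]
print(f"completeness: {completeness(addresses, 'postal_code'):.0%}")  # 96%

# Uniqueness: three name variants identified by the same address and phone.
customers = [
    {"name": "A. Lee", "address": "1000 Integration Drive", "phone": "555-0100"},
    {"name": "Alan R. Lee", "address": "1000 Integration Drive", "phone": "555-0100"},
    {"name": "Alan Lee", "address": "1000 Integration Drive", "phone": "555-0100"},
]
print(f"uniqueness: {uniqueness(customers, ['address', 'phone']):.0%}")  # 33%

# Validity: one well-formed street address and one malformed address.
streets = [
    {"street": "1000 Integration Drive"},
    {"street": "H/*27 Integration Drive"},
]
print(f"validity: {validity(streets, 'street', STREET_RULE):.0%}")  # 50%

# Accuracy: the same string parses to two different dates depending on format.
raw = "05/08/2022"
as_us = datetime.strptime(raw, "%m/%d/%Y")  # May 8th
as_eu = datetime.strptime(raw, "%d/%m/%Y")  # August 5th
print(as_us.date(), as_eu.date())  # 2022-05-08 2022-08-05
```

Each function returns a ratio between 0 and 1, so the scores can be tracked per dataset over time. Note that the validity check only tests syntax: as the table points out, a valid address is not necessarily an accurate one.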

Source Reference

[1] What is Data Trust? Definition & Examples