Course Title:
Data Quality Assessment - Practical Skills
Geared To:
Data quality practitioners - those in the trenches who are responsible for managing, maintaining,
and delivering high-quality data and for continuously improving its quality.
You Will Learn:
The what, why, when, and how of data quality assessment.
How to identify and use data quality rules.
How to ensure completeness of data quality assessment.
How to construct and use a data quality scorecard.
How to collect, manage, maintain, warehouse, and use data quality metadata.
Summary:
More and more companies initiate data quality programs and form data stewardship groups every
year. The starting point for any such program must be data quality assessment. Yet in the absence of
a comprehensive methodology, measuring data quality remains elusive. It proves
easier to produce hundreds or thousands of data error reports than to make any sense of them.
This course gives a comprehensive treatment of the process and practical challenges of data
quality assessment. It starts with a systematic treatment of the various types of data quality rules
and proceeds to analysis of the results and construction of an aggregated data quality scorecard.
Special attention is paid to the architecture and functionality of the data quality metadata warehouse.
Course Outline:
1. Introduction to Data Quality Assessment
What are the objectives of data quality assessment?
When to perform data quality assessment?
What are the common mistakes in data quality assessment?
What are the steps of data quality assessment?
What are the roles and responsibilities in a data quality assessment team?
2. Using Data Quality Rules for Data Quality Assessment
What are data quality rules?
How to identify rules from data models?
How to identify rules through data profiling?
How to identify rules for state-dependent objects?
How to identify rules for time-dependent data?
How to identify complex data relationships?
How to identify rules using additional data sources?
3. Ensuring Comprehensive Assessment
How to ensure that all data elements are validated?
How to ensure that all data errors are correctly identified?
How to deal with false positives in error identification?
How to deal with uncertainty in error types and error location?
How to deal with redundancy and dependencies between data quality rules?
How to integrate data quality rules with manual data verification?
4. Constructing a Data Quality Scorecard
Why build a data quality scorecard?
How to design a comprehensive dimensional data quality scorecard?
How to calculate aggregate data quality scores?
How does a data quality scorecard help identify root causes of data problems?
How does a data quality scorecard help measure the impact of bad data on business processes?
How does a data quality scorecard help measure the impact of bad data on the corporate bottom line?
5. Building a Data Quality Metadata Warehouse
What is a data quality metadata warehouse?
How to catalogue data quality rules?
How to catalogue data problems?
How to store other relevant metadata?
How to build useful data quality reports?
What analytical functionality is desired?