Towards a framework for dealing with data quality in data warehouses.

Research Output

The popularity of data warehouses (DWs) in recent years confirms the importance of data quality to business success. It is estimated that as much as 75% of the effort spent on building a data warehouse can be attributed to back-end issues, such as readying the data and transporting it into the data warehouse. Improving the efficiency of building a data warehouse depends not only on design and implementation but also on data cleaning, which is a crucial task. For this task, at least two questions need to be answered: How can the time spent on data cleaning be reduced? How can the degree of automation in data cleaning be improved? This paper attempts to answer both questions by presenting a novel framework for managing data cleaning in data warehouses, which focuses on the use of data quality factors and decouples the cleaning process into several sub-processes. An initial test run of the processes in the framework demonstrates that the approach is efficient and scalable for cleaning data in data warehouses.

Type: Book Chapter
Date: 01 January 2006
Publication Status: Published
Publisher: ATINER
Library of Congress: QA75 Electronic computers. Computer science
Dewey Decimal Classification: 004 Data processing & computer science

http://researchrepository.napier.ac.uk/output/249322

Citation

Peng, T. (2005). Towards a framework for dealing with data quality in data warehouses. In P. Petratos (Ed.), Current Computing Developments in E-Commerce, Security, HCI, DB, Collaborative and Cooperative Systems, 241-256. ATINER

Authors

Dr Taoxin Peng

Lecturer
School of Computing Engineering and the Built Environment

0131 455 2748

T.Peng@napier.ac.uk

Keywords

Data warehouses; quality; cleaning; framework; scalable

Available Documents

Files are currently unavailable for download; please contact t.peng@napier.ac.uk to request a copy.