Research Output

A framework for data cleaning in data warehouses.

  Achieving high data quality in data warehouses is a persistent challenge, and data cleaning is a crucial task in meeting it. A set of methods and tools has been developed for this purpose. However, at least two questions remain to be answered: How can the efficiency of data cleaning be improved? How can its degree of automation be improved? This paper addresses these two questions by presenting a novel framework that provides an approach to managing data cleaning in data warehouses, focusing on the use of data quality dimensions and decoupling the cleaning process into several sub-processes. An initial test run of the processes in the framework demonstrates that the approach is efficient and scalable for data cleaning in data warehouses.

  • Type:

    Article

  • Publication Status:

    Published

  • Publisher:

    Springer Verlag

  • ISSN:

    1751-7575

  • Library of Congress:

    QA75 Electronic computers. Computer science

Citation

Peng, T. (2007). A framework for data cleaning in data warehouses. Enterprise Information Systems, 473-478. ISSN 1751-7575

Keywords

data cleaning; data warehouse; performance efficiency; automation; data quality; decoupling; scalable
