Dataleaf Technologies, Inc:  HR System migration, Archiving, Analysis, Modeling, Interim HRIS

[Document ©2001-2005 Dataleaf Technologies, Inc, 486 Concord Street, Carlisle MA 01741. Contact georgestalker@dataleaf.net, (978) 369-7472]

 

Data Quality Automation

Whenever a Dataleaf® data mart is updated, data quality information is compiled, and the information becomes an integral part of the data mart.

A user can instantly retrieve rows showing any data quality problems, or specific problems. "New-problem" and "fixed-problem" records and totals can be viewed in the same way.

automated sanity checking

Trouble Tests are defined at the client level to control monitoring of metadata discrepancies, sanity-check exceptions, and administrative errors.

Typically, HR organizations define 10-50 Trouble Tests in addition to any XML validation, as in the following example:

screen shot

The specification of a Trouble Test may include instructions for load-time "triage": actual alterations made at load time, each accompanied by a log entry, to data which flunks that test. Both original and 'fixed' data values are always retained. Trouble Test definitions without such triage properties are, however, more common.
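As an illustration only, the triage behavior described above might be sketched in Python as follows. This is not Dataleaf® code: the `TroubleTest` class, the `load_row` function, the `_original` column suffix, and the two sample tests are all hypothetical names chosen for the sketch. It shows a test with no triage (the exception is only logged) next to a test with triage (the value is altered, the original is kept, and the change is logged).

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

Row = dict[str, Any]

@dataclass
class TroubleTest:
    """Hypothetical stand-in for a client-level Trouble Test definition."""
    name: str
    column: str
    fails: Callable[[Any], bool]                   # True when the value flunks the test
    triage: Optional[Callable[[Any], Any]] = None  # optional load-time alteration

def load_row(row: Row, tests: list[TroubleTest], log: list[dict]) -> Row:
    """Check one incoming row against every test, applying triage where defined."""
    loaded = dict(row)
    for test in tests:
        value = row.get(test.column)
        if not test.fails(value):
            continue
        entry = {"test": test.name, "column": test.column,
                 "original": value, "triaged": False}
        if test.triage is not None:
            fixed = test.triage(value)
            loaded[test.column] = fixed
            # Assumption: the original value is retained in a sibling column.
            loaded[test.column + "_original"] = value
            entry.update(triaged=True, fixed=fixed)
        log.append(entry)  # every exception is logged, triaged or not
    return loaded

tests = [
    # No triage: the exception is recorded but the data is left alone.
    TroubleTest("negative_salary", "salary",
                lambda v: v is not None and v < 0),
    # With triage: surrounding whitespace is trimmed at load time.
    TroubleTest("untrimmed_dept", "dept",
                lambda v: isinstance(v, str) and v != v.strip(),
                triage=lambda v: v.strip()),
]
log: list[dict] = []
loaded = load_row({"emp_id": 7, "salary": -100, "dept": " HR "}, tests, log)
```

In this sketch the negative salary is loaded unchanged and flagged, while the department value is trimmed and its original form survives in `dept_original`.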

longitudinal data quality testing

The graphic display shows a historical record of data quality problems in a single large database. This particular display shows all DQ problems defined in a group of 30 client-specific Trouble Tests. Dark columns are fixed problems; light columns are new problems; the line marked 'curr' (axis to the right) is the total frequency of all data quality exceptions. The numbers on the X axis are the months of the year. About 2-1/2 years of history are shown.

screen shot

Overall, it appears that data quality exceptions rise sharply every few months with an influx of new records, and -- in the next month -- most of the new problems are corrected.

What is interesting is that -- once such an influx of bad data has been handled -- only a low level of correction needs to occur.

In the illustration above, correction of data quality exceptions (the dark columns in the Dataleaf® illustration) does seem to occur in bursts.
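The three series in the chart described above (new problems, fixed problems, and the 'curr' total) could in principle be derived by comparing each month's set of exceptions with the previous month's. The Python sketch below assumes that each exception is identified by a (record id, test name) pair and that one snapshot is available per month. The function name, the test names, and the sample data are invented for illustration and do not describe how the Dataleaf® software computes its figures.

```python
def monthly_dq_series(snapshots):
    """Derive new/fixed/curr counts from monthly exception snapshots.

    snapshots: list of (month_label, set of exception keys), oldest first.
    An exception key is assumed to be a (record_id, test_name) pair.
    """
    series = []
    previous = set()
    for month, current in snapshots:
        series.append({
            "month": month,
            "new": len(current - previous),    # present now, absent last month
            "fixed": len(previous - current),  # present last month, gone now
            "curr": len(current),              # total exceptions this month
        })
        previous = current
    return series

# Invented sample data: three months of exceptions.
snapshots = [
    ("01", {(1, "missing_hire_date"), (2, "bad_zip")}),
    ("02", {(2, "bad_zip")}),
    ("03", {(2, "bad_zip"), (3, "bad_zip"), (4, "missing_hire_date")}),
]
series = monthly_dq_series(snapshots)
```

With this sample data, month 02 shows one fixed problem and no new ones, and month 03 shows an influx of two new problems.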

routine data quality monitoring

Whenever a Dataleaf® data mart is updated, data quality information is compiled, and the information becomes an integral part of the data mart.

Our business is based on a single, compact set of methods and software components, the Dataleaf® Database technology.