Background: HealthStats NSW (http://www.healthstats.nsw.gov.au/) is a public-facing website providing summarised statistics from more than 10 routinely collected administrative data sources spanning the health care continuum. Currently, close to 20,000 summarised data sets covering a variety of conditions are available in a searchable database. These data sets are updated cyclically, and at each update, changes in definitions, coding standards, and data collection standards and mechanisms pose challenges to existing manual quality assurance (QA) methods.
Aim: To develop automated QA processes to enhance the scalability of public reporting from administrative data sources while ensuring that data are correct, consistent, and satisfy NSW privacy legislation.
Methods: Point-on-point comparisons were implemented for historic (“stable”) data. Poisson regression and ARIMA forecasting techniques were developed to predict expected values for future estimates. These techniques were evaluated against three common types of change (coding, standards and source data attributes) to determine success criteria for automated QA.
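As an illustration only, a point-on-point comparison for historic data could look like the following minimal sketch. The function name `point_on_point_exceptions`, the dictionary layout (period mapped to value) and the tolerance parameter are assumptions made for this example; the abstract does not describe the actual implementation.

```python
from math import isclose


def point_on_point_exceptions(previous, current, rel_tol=0.0):
    """Compare each historic point in `previous` with the same point in `current`.

    Both arguments map a period (for example a year) to a published value.
    Returns a list of (period, old_value, new_value) tuples for points that are
    missing from the new release or differ by more than `rel_tol` (relative).
    Periods that appear only in `current` are new estimates and are ignored here.
    """
    exceptions = []
    for period in sorted(previous):
        old = previous[period]
        new = current.get(period)
        if new is None or not isclose(old, new, rel_tol=rel_tol):
            exceptions.append((period, old, new))
    return exceptions


# Hypothetical values, not real HealthStats NSW data.
previous_release = {2010: 120, 2011: 130, 2012: 125}
current_release = {2010: 120, 2011: 131, 2012: 160, 2013: 128}

flagged = point_on_point_exceptions(previous_release, current_release, rel_tol=0.02)
```

With a 2% relative tolerance, only the 2012 value is flagged in this example; the small revision to 2011 falls inside the tolerance and the new 2013 estimate is left for the forecasting checks.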
Results: Appropriate tolerance bands are required for both “stable” data and future projections. Tolerance bands depend upon the type of change as well as the rarity of the condition. Exception reporting provides a suitable mechanism to scale QA to the large volume of derived data sets.
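One way to see why tolerance bands depend on rarity is to assume that counts are roughly Poisson distributed, so that the variance equals the expected count. The sketch below uses a normal approximation to that assumption (expected count plus or minus `z` standard deviations) and reports only the data sets that fall outside their band. The function names, the choice of `z = 3` and the example figures are hypothetical, and the approximation is crude for very small counts; the abstract does not state how the bands were actually derived.

```python
from math import sqrt


def poisson_band(expected, z=3.0):
    """Approximate tolerance band for a Poisson count with mean `expected`.

    Uses expected +/- z * sqrt(expected), with the lower bound clipped at zero.
    """
    half_width = z * sqrt(expected)
    return max(0.0, expected - half_width), expected + half_width


def exception_report(observed, expected, z=3.0):
    """Return the names of data sets whose observed count lies outside the band."""
    report = []
    for name in sorted(observed):
        low, high = poisson_band(expected[name], z)
        if not (low <= observed[name] <= high):
            report.append(name)
    return report


# Hypothetical expected and observed counts for two conditions.
expected_counts = {"common condition": 10000.0, "rare condition": 9.0}
observed_counts = {"common condition": 10500, "rare condition": 14}

exceptions = exception_report(observed_counts, expected_counts)
```

Under these assumptions, the band for the common condition is 9,700 to 10,300, so a 5% rise is reported as an exception, whereas the band for the rare condition is 0 to 18, so a much larger relative change passes unflagged. Only the exceptions need manual review, which is what allows QA to scale.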
Conclusion: Automated quality assurance is necessary to ensure timely and scalable reporting from administrative data sources. Because several components can change between updates, automated quality assurance processes must be specific enough to detect multiple types of errors, and sensitive enough to detect changes where conditions are rare.