Session 11


Title of session: Quality of multi-source statistics

Chair: Martina Hahn

Room: S4B Lajkonik

Time: 11:30 - 13:00

Date: 27 June

Session 11 - papers & presentations

Presenting Author / Abstract
Johannes Gussenbauer
Title: <<< Confidence intervals for register-based statistics? >>>
Register-based statistics are usually published without a quantitative measure of the estimation error, such as a confidence interval. Based on an indicator proposed in the European project ‘Quality of multi-source statistics’ (KOMUSO), a resampling method is applied to the Austrian register-based labour market statistics with the goal of quantifying the model error of the imputation process and the measurement error of the target variable in the administrative data. A survey data set is used to benchmark this measurement error and to estimate transition probabilities between the states of the categorical target variable. The resampling method is then applied to generate repeated estimates, which are used to compute confidence intervals.
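The resampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical table of register counts and a hypothetical matrix of transition probabilities (the probability of each true state given the state recorded in the register, as might be estimated from a linked survey), and it covers only the measurement error, not the model error of the imputation process. The function names `resample_counts` and `percentile_interval` are invented for illustration.

```python
import random


def resample_counts(register_counts, transition, n_reps=1000, seed=0):
    """Generate repeated estimates of category totals.

    register_counts: {observed_state: number of register records} (hypothetical)
    transition: {observed_state: {true_state: probability}} (hypothetical,
        e.g. estimated from a survey linked to the register)
    """
    rng = random.Random(seed)
    states = list(transition)
    estimates = []
    for _ in range(n_reps):
        totals = dict.fromkeys(states, 0)
        for observed, count in register_counts.items():
            weights = [transition[observed][s] for s in states]
            # Redraw a plausible true state for every record in this category.
            for drawn in rng.choices(states, weights=weights, k=count):
                totals[drawn] += 1
        estimates.append(totals)
    return estimates


def percentile_interval(values, level=0.95):
    """Simple percentile interval from a list of repeated estimates."""
    ordered = sorted(values)
    lower = int((1 - level) / 2 * len(ordered))
    upper = int((1 + level) / 2 * len(ordered)) - 1
    return ordered[lower], ordered[upper]


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    counts = {"employed": 900, "unemployed": 100}
    probs = {
        "employed": {"employed": 0.97, "unemployed": 0.03},
        "unemployed": {"employed": 0.10, "unemployed": 0.90},
    }
    reps = resample_counts(counts, probs)
    print(percentile_interval([r["employed"] for r in reps]))
```

The percentile interval is one of several possible choices; a real application would also need to reflect the uncertainty in the estimated transition probabilities themselves.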
Bart Bakker
Title: <<< Quality evaluation of registered-based research >>>
Due to the increased availability of administrative registers, more and more register-based statistics are published. Recently developed quality frameworks have tried to mimic the Total Survey Error approach for register data. The main indicators are under- and overcoverage, measurement error and linkage error. However, these quality frameworks do not present methods to measure these indicators. In this paper, we present capture-recapture (CRC) methods to estimate the undercoverage of already linked registers. Overcoverage can be determined and corrected for by removing the records of people who do not belong to the target population. Duplicates can be identified by linking the records in the combined registers to each other. Measurement error can be estimated by Structural Equation Models (SEM, for numerical variables) and Latent Class Analysis (LCA, for categorical variables) with a measurement component, provided that another source measuring the same concept is linked to the register data. Linkage error can be estimated using probabilistic linkage methods. However, none of these methods is error-free in itself. Linkage error could also have an impact on the outcomes of CRC, SEM and LCA. The paper shows the interdependencies between these methods.
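As a rough illustration of how capture-recapture can quantify undercoverage, the sketch below uses the classical two-list Lincoln-Petersen estimator. This is a textbook simplification and not necessarily the model used in the paper: it assumes two independent registers, a closed population, equal inclusion probabilities within each register, and error-free linkage. The function name and the counts are hypothetical.

```python
def lincoln_petersen(n_a, n_b, n_both):
    """Two-list capture-recapture estimate of population size.

    n_a, n_b: number of persons found in register A and in register B
    n_both:   number of persons linked in both registers
    Returns (estimated population size, estimated number missed by both lists).
    """
    if n_both <= 0:
        raise ValueError("at least one linked record is required")
    n_hat = n_a * n_b / n_both          # estimated population size
    observed = n_a + n_b - n_both       # persons seen in at least one register
    return n_hat, n_hat - observed      # second value: estimated undercoverage


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the paper.
    print(lincoln_petersen(n_a=800, n_b=600, n_both=500))
```

Because `n_both` comes directly from the record linkage, missed links lower it and inflate the population estimate, which is one way linkage error feeds into the CRC outcome.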
Roberta Varriale
Title: <<< Quality evaluation of statistical processes based on administrative data: a new version of the TSE approach >>>
Over the last decade, National Statistical Institutes (NSIs) have progressively moved from single- to multi-source statistics. By combining different data sources (direct survey, administrative and Big data), NSIs can increase the detail of information, save data production costs and reduce the burden on respondents. The Italian NSI (Istat) has strongly increased the use of administrative archives as the primary source for statistical production. To this aim, a system of statistical registers based on the integrated use of administrative sources is under development, and many statistical processes have been redesigned accordingly. Such a change calls for a tailoring of the current approaches to quality measurement and assessment. While a total quality framework based on the Total Survey Error (TSE) is well developed for surveys in Istat, a quality framework that supports the design, evaluation and monitoring of the new statistical processes, which are based on several types of sources, is still missing. To this end, the adaptation of the TSE recently proposed in the literature for statistical processes using administrative data sources has been taken as a reference. In this paper we illustrate how the proposed quality framework has been tested on a new process: the statistical register Frame-SBS, which supports the Structural Estimates on businesses and represents a milestone in the new system of statistical registers in Istat. As a major result, the paper contains a proposal for an additional quality assessment phase. It is necessary, indeed, to define an evaluation phase during which each single administrative source is evaluated on its own with respect to the specific statistical aims of the process. This phase should support the main decisions about how to integrate the data, that is, identifying the main steps for combining the external sources and outlining the specific steps that deliver the final output.
