Chair: Outi Ahti-Miettinen
Room: S3A Barbakan
Time: 18:45 - 19:30
Date: 27 June
|Title: <<< But are those numbers correct?: Towards criteria to assess the reliability of statistics >>>
Confronted with a new statistic for the first time, how do we know whether to trust it?
Since 1990, several international documents have enunciated principles and practices for ensuring sound statistics. But they do not spell out the inherent attributes of sound statistics. Such attributes can be found in “quality frameworks”, but these often include practical aspects such as accessibility and timeliness that say nothing about whether data are actually correct.
A checklist for assessing the reliability of data could have multiple uses. In particular, it would help establish whether a given statistic should be accepted as knowledge, and thus as suitable for use in hypotheses, or as a sound basis for action or policy. It might also help test whether existing agreed principles suffice to ensure reliable data. This paper is a first attempt to identify characteristics of data that may be accepted as true.
The analysis is in three sections, corresponding to stages of statistical production, namely:
Measurability: this section introduces the idea of an “evidence threshold” separating knowledge from speculation. It emphasises, however, that statistical quantities are found along a continuum ranging from known countable items to model-based projections.
Measure: this identifies characteristics of measures that promote reliability, including conceptual clarity, definitional rigour, and unambiguous application to the target variable.
Measurement: this stresses the value of comprehensive counts, adequate and competent enumerators, the absence of incentives to distort figures, and minimisation of processing steps.
The discussion draws on official and non-official critiques of statistical quality, highlights common errors, and cites practical examples. It aims to initiate discussion of general criteria to help users assess the credibility of statistics, and help compilers maximise the reliability of their output.
|Title: <<< Comparison of datasets as a method of monitoring data quality: A conceptual framework and two applications with banking data >>>
Cross-validation using different datasets reporting comparable data can be a very effective way of monitoring the quality of macro data and complements validation rules applied within a single dataset. In particular, cross-validation can identify data quality issues which would not be found with the validation rules approach, specifically those related to the completeness or double-counting of information. This approach is especially relevant for banking data: because of the need to inform different types of policy related to the banking sector, such as monetary, macro-prudential and micro-prudential policy, various banking datasets have become available at central banks and banking supervisors. These datasets differ with respect to their scope, method of consolidation and granularity. They also differ in their collection method and in the institution which processes the information. Despite these differences, one can formulate logical relationships which should hold between the different aggregates (e.g. at the level of a jurisdiction or of a bank sector, like ‘Significant Institutions’). Two examples are then provided. First, a comparison between the Consolidated Banking Data (CBD), which are used mainly for macro-prudential analysis, and the Supervisory Banking Statistics, which are collected at firm level in the context of the banking supervision exercised by the Single Supervisory Mechanism. Second, we investigate the international interconnectedness of banking systems in the International Banking Statistics of the Bank for International Settlements and the CBD dataset of the European Central Bank. In both cases, we find that these comparisons provide important insights into the underlying data and allow the identification of data quality issues that could not be detected through intra-dataset validation rules.
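The kind of logical-relationship check described above can be sketched in a few lines. This is a minimal illustration, not the authors' actual method: the aggregate figures, jurisdiction codes and the 2% tolerance are all hypothetical.

```python
# Hypothetical sketch of a cross-dataset consistency check. Figures and
# the tolerance are illustrative, not real CBD/SSM data.

def cross_validate(dataset_a, dataset_b, rel_tolerance=0.02):
    """Compare aggregates reported in two datasets for the same keys.

    Returns (key, value_a, value_b) tuples where the relative difference
    exceeds the tolerance, plus keys missing from either side (possible
    completeness or double-counting issues).
    """
    issues = []
    for key in sorted(set(dataset_a) | set(dataset_b)):
        a, b = dataset_a.get(key), dataset_b.get(key)
        if a is None or b is None:
            issues.append((key, a, b))          # missing on one side
        elif abs(a - b) / max(abs(a), abs(b)) > rel_tolerance:
            issues.append((key, a, b))          # aggregates diverge
    return issues

# Total assets (EUR bn) by jurisdiction in two hypothetical collections
cbd = {"DE": 8250.0, "FR": 9100.0, "IT": 3600.0}
ssm = {"DE": 8260.0, "FR": 9850.0}

for key, a, b in cross_validate(cbd, ssm):
    print(key, a, b)
```

Checks of this kind catch exactly the two failure modes the abstract mentions: aggregates that should coincide but diverge, and units present in one collection but absent from the other.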
|Title: <<< Supporting the compilation of quality reports – Improvement of guidelines, provision of a checklist and standard texts >>>
In recent years, the quality unit of Destatis has gained valuable experience at national and international level with the compilation of quality reports, centred on the following questions:
- Which concepts of the quality reports are typically posing problems for the subject matter units?
- How can existing guidelines be improved (in wording and form) in order to better support the compilation of quality reports?
Based on the experience gained at Destatis, the aim of the paper is to present the additional support that could be provided to the compilers of quality reports, beyond the already existing ESS and national guidelines:
- A checklist for quality reports based on the guidelines for quality reporting,
- Extensions and further specifications on the content of the guidelines for quality reporting,
- Provision of standard texts for designated concepts.
|Title: <<< Quality Reporting Improvement Depending on the Generic Statistical Business Process Model (GSBPM): PCBS Experience >>>
PCBS has been working on preparing reports about the data quality of its statistical surveys, aiming to create a general perspective on the extent to which quality indicators are applied in statistical surveys. The contribution of this paper is to address three parts of the quality report based on the Palestinian experience at PCBS:
The first part gives a historical overview of PCBS and its Quality Department in the preparation of quality reports for statistical surveys. Quality reports were initially prepared after the completion of each statistical survey; PCBS then developed and improved "operations and data quality reports" to control quality during survey implementation. These contain many indicators associated with data quality in line with the GSBPM standard and cover many processes and sub-processes in each phase (Specify Needs, Design, Build, Collect, Process, Analyse, Disseminate and Evaluate). These reports help us to decrease non-sampling errors.
Secondly, the paper focuses on operations and data quality reports and their contribution to solving problems that project management may face, as well as to reducing non-sampling errors, thereby improving the data quality of the statistical survey. These reports also make it possible to evaluate the quality of the survey during its different phases and to determine the most important strengths and opportunities for improvement in each phase, by drafting recommendations to improve quality during the current or the next survey cycle and by documenting the results of monitoring data quality during the project cycle.
Finally, we are looking to improve the quality report by utilising measurable quality dimensions based on the GSBPM.
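A report of the kind described, with quality indicators keyed to GSBPM phases, can be sketched as follows. This is a hypothetical illustration: the phase names follow the GSBPM, but the indicator names and values are invented for the example and are not PCBS's actual indicators.

```python
# Hypothetical sketch of an operations-and-data-quality report keyed to
# GSBPM phases; indicator names and values are illustrative only.

GSBPM_PHASES = ("Specify Needs", "Design", "Build", "Collect",
                "Process", "Analyse", "Disseminate", "Evaluate")

def phase_report(indicators):
    """Group observed quality indicators by GSBPM phase so that quality
    can be monitored during survey implementation, phase by phase."""
    report = {phase: [] for phase in GSBPM_PHASES}
    for phase, name, value in indicators:
        if phase not in report:
            raise ValueError(f"unknown GSBPM phase: {phase}")
        report[phase].append((name, value))
    return report

observations = [("Collect", "unit response rate", 0.87),
                ("Process", "editing failure rate", 0.03)]
report = phase_report(observations)
print(report["Collect"])
```

Grouping indicators by phase is what lets weaknesses be traced back to a specific process step and addressed in the current or the next survey cycle.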
|Ana Isabel Sánchez-Luengo Murcia|
|Title: <<< Structural metadata as a key element in the management of microdata >>>
Within the international framework of statistical production standards (GSBPM and GSIM), there is a clear aim of building a modern, metadata-driven production system. INE Spain is putting its efforts into an approach based on two main ideas: on the one hand, to create or reuse metadata at the beginning of the process and maintain them throughout; on the other, to design structural metadata elements as GSIM objects. The model proposed in this paper uses structural metadata, i.e. variables, classifications, concepts and statistical units, as an information system that is key for the storage of microdata. These metadata are stored once in a single institutional repository and should be used and reused as much as possible. Now designed and developed, the model is beginning to be implemented at INE Spain. The structural metadata model makes it possible to build microdata bases and access them easily; access can be made to data with different temporal references and in different sub-processes of the data collection phase. As regards the quality of metadata, a strategy has been put in place whereby the metadata unit is responsible for building, maintaining and improving structural metadata, applying international standards as far as possible while always bearing in mind institutional needs.
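The define-once-reuse-everywhere principle behind such a repository can be sketched briefly. The class and field names below are illustrative, loosely in the spirit of GSIM objects, and are not INE Spain's actual model.

```python
# Minimal sketch of a structural-metadata repository; names are
# hypothetical, loosely inspired by GSIM, not INE Spain's implementation.
from dataclasses import dataclass

@dataclass(frozen=True)
class Classification:
    code: str
    categories: tuple  # allowed category codes

@dataclass(frozen=True)
class Variable:
    name: str
    concept: str                    # the concept the variable measures
    classification: Classification  # shared, institution-wide definition

class MetadataRepository:
    """Single institutional store: each object is defined once and reused."""
    def __init__(self):
        self._variables = {}

    def register(self, var: Variable):
        if var.name in self._variables:
            raise ValueError(f"{var.name} already defined; reuse it instead")
        self._variables[var.name] = var

    def get(self, name: str) -> Variable:
        return self._variables[name]

repo = MetadataRepository()
nace = Classification("NACE_Rev2", ("A", "B", "C"))
repo.register(Variable("economic_activity", "Main economic activity", nace))

# A microdata record can now be validated against the shared definition:
record = {"economic_activity": "C"}
var = repo.get("economic_activity")
assert record[var.name] in var.classification.categories
```

The point of the single repository is that every microdata base built on it validates against the same variable and classification definitions, rather than each survey redefining them.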
|Title: <<< Measuring response burden at the Swiss Federal Statistical Office >>>
The Swiss Federal Statistical Office (SFSO) intends to put in place a procedure for systematically measuring the burden that stems from its different household and business surveys. A first evaluation of survey burden, in terms of time and cost imposed on businesses, was carried out in 2013 by external specialists at the University of St.Gallen, Switzerland. A working group within SFSO has recently examined the existing literature and the measures taken by other national statistical institutes. Its current task is to expand and improve on the 2013 work and thus define the way in which burden will be accounted for. This paper presents an overview of the difficulties encountered when measuring actual response burden, the principal challenges that need to be met and the different options available. It lists measures that have already been adopted to reduce the global survey burden, such as, for instance, mandatory use of administrative data, online questionnaires, and profiling of large and multi-establishment companies. Also worth mentioning is SFSO’s tool for the coordinated selection of samples, which allows the response burden to be spread across most of our surveys. Among other benefits, it makes it possible to track unit selections over time and thus facilitates survey burden evaluation.
|Title: <<< Reduction of response burden by utilising extensively register data and modelling: Cases from new EU data needs in agricultural statistics >>>
In ESS statistics, the use of administrative data is supported, which is also emphasized in the ESS Strategy for Agricultural Statistics for 2020 and beyond. Natural Resources Institute Finland (Luke) has a long experience of using administrative registers as a source for statistics. However, when new data needs are expressed or the administrative data develops, new possibilities arise. Our objective is to replace survey data in forthcoming farm surveys with register data combined with advanced modelling in survey estimation procedures. We examine and demonstrate the recent advances accomplished in the usability of the agricultural registers data and other data sources to reduce both the response burden and direct data collection from the farms.
The case studies we examine are:
1) Greening of agricultural production and crop rotation;
2) Agricultural labour;
3) Number of animals.
The general objectives in our study are:
1) To investigate possibilities for a broader analysis of crop rotation based on the IACS parcel data, including the new geospatial parcel data obtained from the farmers through farm subsidy administration from the year 2015 on.
2) To pilot the use, and examine the quality, of individual-level register information on farm labour that can be linked with identifiers to farm level from a register other than IACS, as a source of FSS data.
3) To evaluate the quality and feasibility of all existing animal registers as sources of data for animal statistics.
4) To examine the possibilities to respond to new ad hoc information needs.
Common advantages of the use of registers are the total coverage of farms, the reduction of survey costs and the avoidance of misinterpretations by farmers when answering the questionnaires, which is a significant factor in the case of the crop rotation variable.
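The register-based approach to the farm labour objective can be sketched in miniature: individual-level register records carrying a farm identifier are aggregated to farm level before linkage. This is a hypothetical illustration; the identifiers, field names and figures are invented, not Luke's actual register layout.

```python
# Illustrative sketch of aggregating an individual-level labour register
# to farm level; farm identifiers and records are hypothetical examples.

labour_register = [
    {"person_id": 1, "farm_id": "F001", "hours": 800},
    {"person_id": 2, "farm_id": "F001", "hours": 1200},
    {"person_id": 3, "farm_id": "F002", "hours": 400},
]

def farm_labour_totals(register):
    """Sum individual-level register records to farm level, as one would
    before linking the totals to FSS farm data in place of survey answers."""
    totals = {}
    for rec in register:
        totals[rec["farm_id"]] = totals.get(rec["farm_id"], 0) + rec["hours"]
    return totals

print(farm_labour_totals(labour_register))  # {'F001': 2000, 'F002': 400}
```

Because the register covers all farms, the aggregation replaces a survey question entirely, which is precisely where the response-burden reduction comes from.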