Speed Talk Session 6


Title of session: Quality & innovative sources and methods in statistics

Chair: Magdalena Six

Room: S4B Lajkonik

Time: 18:45 - 19:30

Date: 27 June

Speed Talk Session 6 - papers & presentations

Presenting Author / Abstract
Cristina Neves
e-mail: cristina.neves@ine.pt
Title: <<< Mobility survey on metropolitan areas - innovation on methods and procedures >>>
Statistics Portugal implemented a mobility survey in metropolitan areas to collect data on the urban trip patterns of the population by area of residence and trip motivation. In Metropolitan Areas (MA), where the complementarity of transport modes is essential, especially for commuters, mobility surveys are important tools for supporting decisions about the transport system, namely the definition of the intermodal network and of ticketing systems. The survey was designed to produce official statistics for the national and European statistical systems and to obtain information harmonized as far as possible with EU recommendations for mobility statistics. Given the specific population and territorial characteristics of the MA, Statistics Portugal adopted an innovative approach to methods and procedures in the methodological study of the survey. A study of territorial division was conducted to define homogeneous accessibility areas, using a cluster analysis to group parishes with similar mobility characteristics. The analysis combined Census data on population and commuting with geographic data on transport infrastructure (rail and metro stations, main road intersections, and inland waterway connections), around which buffers representing their areas of influence were created. These homogeneous areas were taken into account in the sampling design. The sampling design was also innovative in that it allowed a combination of data collection methods, namely computer assisted web interview (CAWI) and computer assisted personal interview (CAPI). A stratified two-stage sample was implemented, with systematic selection for CAWI in a first phase and, in a second phase, a subsample of the first phase selected for CAPI.
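The two-phase selection described in this abstract can be illustrated with a minimal sketch. It is not Statistics Portugal's procedure: the stratum names, the sample sizes and the use of a simple random subsample for the CAPI phase are assumptions made only for illustration, since the abstract does not say how the second-phase subsample was drawn.

```python
import random

def systematic_sample(units, n, rng):
    """Select n units with a random start and a fixed step (systematic sampling)."""
    step = len(units) / n                 # sampling interval, assumes n <= len(units)
    start = rng.uniform(0, step)          # random start within the first interval
    last = len(units) - 1
    return [units[min(int(start + i * step), last)] for i in range(n)]

def two_phase_sample(strata, n_first, n_second, seed=0):
    """Per stratum: systematic first phase (CAWI), then a subsample of it (CAPI).

    The second-phase rule (simple random subsample) is a hypothetical choice.
    """
    rng = random.Random(seed)
    result = {}
    for name, units in strata.items():
        first = systematic_sample(units, n_first, rng)
        second = rng.sample(first, n_second)
        result[name] = {"cawi": first, "capi": second}
    return result

# Hypothetical strata standing in for the homogeneous accessibility areas.
strata = {"area_A": list(range(100)), "area_B": list(range(100, 150))}
sample = two_phase_sample(strata, n_first=10, n_second=4)
```

Drawing the CAPI units only from the first-phase sample keeps the second phase nested in the first, which is the structural point the abstract makes.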
Thomas Riede
e-mail: thomas.riede@destatis.de
Title: <<< Digital transformation as an opportunity to become more relevant >>>
The digital transformation resounds throughout the land: it encompasses the economy, it pervades society as well as everyday life, and it forces public administration, which generally lags behind, to stretch and catch up. Destatis wants to view the digital transformation not as a threat to its existing business model but as an opportunity to overcome the limitations of existing (legal) framework conditions and thus shape its future. Digital transformation, however, is not an end in itself. It serves to better fulfill the strategic goals of Destatis, prime among them the quality and the relevance of statistical data. Destatis has therefore developed a Digital Agenda that links the ambition for the digital transformation to the strategic goals. The Digital Agenda also provides a road map towards this ambition by outlining the key measures that need to be undertaken for a successful transformation. The Digital Agenda will be revised every year in order to adapt the ambition and key measures to new requirements and to reflect the progress achieved. This paper aims to present the Digital Agenda and to outline its effect on the quality and relevance of statistics.
Marcos Rodrigues Pinto
e-mail: marcos.pinto@ibge.gov.br
Title: <<< Data collection quality control using paradata and geolocation >>>
This paper describes the progress made by the Brazilian official statistics office (IBGE - Institute of Geography and Statistics) in assuring quality in its latest census. Management information systems are being extensively improved to deliver critical information to all levels of administration during the data collection phase, producing important data for decision making. Tools such as enumerator tracking by geolocation bring new possibilities to field monitoring, as they reveal the coverage of the enumeration area, which cannot be seen from the household coordinates registered before an interview in a CAPI approach alone. Also, the use of paradata (data about the process of data collection, produced by the enumerator) has shown extremely promising value for quality control and consequent cost saving. The enumerator's behaviour when using the handheld device can give detailed information about quality and productivity. Examples of data currently used in Brazil are the record of coordinates during questionnaire execution, enumerator navigation within the questionnaire, interview length and question sequence. Analysing these data while collection is in progress and sharing them with field supervisors has proved very important, as supervisors can act preventively, checking the work of an enumerator flagged as suspicious (possibly with made-up questionnaires), and make correct decisions. This work shows the potential of paradata and geolocation as a central piece of management information systems for producing high quality official statistics.
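As an illustration of how interview length, one of the paradata items listed in this abstract, might be used to flag enumerators for supervisor review, here is a minimal sketch. The rule (an enumerator's median interview length below half of the overall median), the record layout and the function name are assumptions, not IBGE's actual system.

```python
from statistics import median

def flag_short_interviews(records, ratio=0.5):
    """Return enumerator ids whose median interview length looks suspiciously short.

    records: list of (enumerator_id, interview_seconds) pairs (hypothetical layout).
    ratio:   hypothetical threshold, as a fraction of the overall median length.
    """
    overall = median(sec for _, sec in records)
    by_enumerator = {}
    for enum_id, sec in records:
        by_enumerator.setdefault(enum_id, []).append(sec)
    return sorted(
        enum_id
        for enum_id, secs in by_enumerator.items()
        if median(secs) < ratio * overall
    )

# Made-up paradata: e3 completes interviews far faster than the others.
records = [("e1", 600), ("e1", 640), ("e2", 620), ("e2", 580), ("e3", 90), ("e3", 110)]
suspicious = flag_short_interviews(records)
```

A flag of this kind only marks work for checking by field supervision; it does not by itself establish that a questionnaire was made up.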
Cristina Rovira Trepat
e-mail: crovira@idescat.cat
Title: <<< Building the territorial statistical register: quality control on geocoded administrative data >>>
The Statistical Institute of Catalonia (Idescat) is developing a statistical information system based on administrative registers. These registers usually contain a set of variables that define the spatial location of the microdata through a postal address. One of the subsystems used is the real estate cadastre, which contains around 10 million georeferenced estates. Our aim is to analyse its statistical quality. As a first step in establishing the quality of the positions, a set of polygons was created from the concave or convex hull of the centroids of the estates, grouped by road. One of the tasks carried out is the detection of outliers in these polygons. We focus on analysing the distribution of the distances between the points of each polygon. We begin with the idea of creating an indicator to rank the polygons according to their level of reliability, indicating the possibility that a polygon contains an outlier. Our first concern is to identify the polygons with the most extreme values of the indicator, both those which could contain suspicious points (around 5%) and those we are sure do not contain any (approximately 70%). By comparing the results with those of multivariate analysis and anomaly detection techniques, we aim to verify that they are almost the same. Our second concern is to study the cases found in the intermediate zone, and in particular to determine at which threshold we obtain the optimal ratio of well-classified to badly-classified cases. To evaluate our indicator, we must select a sample of cases and study them manually to ascertain whether they are correct or not. It is therefore in our interest that the sample mainly contains polygons found in the intermediate zone of the indicator.
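The abstract does not define the indicator, so the following is only a hypothetical example of a distance-based score that could rank groups of points by how likely they are to contain an outlier. It uses the ratio of the largest nearest-neighbour distance to the median nearest-neighbour distance; the function name, the coordinates and the choice of score are all assumptions.

```python
from math import dist
from statistics import median

def isolation_score(points):
    """Largest nearest-neighbour distance divided by the median one.

    A score near 1 means evenly spaced points; a large score suggests that
    at least one point lies far from all the others. Needs at least 2 points.
    """
    nearest = [
        min(dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    mid = median(nearest)
    if mid == 0:
        return 0.0 if max(nearest) == 0 else float("inf")
    return max(nearest) / mid

# Made-up centroids along two roads; the second group has one misplaced point.
roads = {
    "road_1": [(0, 0), (1, 0), (2, 0), (3, 0)],
    "road_2": [(0, 0), (1, 0), (2, 0), (3, 0), (10, 40)],
}
ranking = sorted(roads, key=lambda name: isolation_score(roads[name]), reverse=True)
```

Groups at the top of such a ranking would be candidates for containing suspicious points, those at the bottom would be treated as reliable, and the intermediate zone is where manual checking of a sample would be concentrated.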
Pedro Cunha
e-mail: pedro.cunha@ine.pt
Title: <<< Does Big Data mean Big Problems, or Bigger Opportunities? >>>
The traditional approach of designing a survey to produce a deliverable is no longer sustainable. Things have changed in recent years, not only in the perception of how the use of administrative data could lessen the burden on respondents and the cost of surveys, but also because the gathering of information through sensors and other automated devices has proved to be a cheap and useful resource, shortening the time needed to produce statistics and thus increasing their timeliness. Many questions arise when we try to use data, whether administrative or sensor generated, to produce official statistics. Most of them arise because the data were not collected for statistical purposes, and their use can be methodologically complex because they do not meet statistical standards on concepts or definitions. For this reason our approach to Big Data has been cautious and gradual. After the 2011 census operation a national dwelling register was built, and since then this database has been fed with administrative data. We are now considering enriching it with electricity consumption information collected through smart meters. This Big Data source could assist us in determining dwelling occupation. Due to several restrictions on the availability of the Big Data source, it is not possible to establish a connection per household, only to validate the coverage of areas. The resulting impossibility of performing microdata linkage led us to study different ways to improve the quality of the register, to validate areas of under- or over-coverage, and to develop new methodologies for combining sources. Different quality checks have to be applied to the data extracted from Big Data sources, and this paper addresses new combining methods in the specific context of the national dwelling register. Through the presentation of this case study we share the obstacles and gold nuggets found along the road of Big Data exploration.
Daniela Ferrazza
e-mail: ferrazza@istat.it
Title: <<< Local decisions and new guidelines of the Official Statistics. A pilot study on Early School Leavers to explore the quality of integrated administrative data used to support local policies. >>>
The work is based on the experimental use of data from the ArchIMEDe project (ARCHivio Integrato di Microdati Economici e Demografici – i.e. Integrated Archive of Economic and Demographic Microdata) of the Italian Institute of Statistics (ISTAT), created with the aim of expanding the supply of information of regional and local interest through the production of microdata from administrative sources. In a scenario of possible integration of, and/or transition from, survey sources to administrative sources in Italian official statistics, the first step in the study of phenomena is the problematization of operational measures and existing definitions. Although the data used, which come from secondary sources, are not structurally similar to those of ISTAT sample surveys and are not even directly comparable, the new boundaries of official statistics show an increasing interrelationship which, at local level, requires an integrated reading of phenomena through the different sources available in order to obtain the best information and the best quality. The contribution, conceived as part of the SPoT project (data and methodologies for the development of Statistics for Policies on the Territory), proposes a case study of Early School Leavers, defined as young people between 18 and 24 years of age who do not have a secondary school qualification and who are not enrolled in an education cycle. The paper presents the results of some analyses carried out on residents of Lombardy using spatial association measures and hot spot maps. These results highlight both elements that are useful from a substantive point of view for understanding the phenomenon at local level and methodological aspects relating to the strengths and weaknesses of the experimental use of secondary sources in statistics to support local policies.
Enrique Moran Alaez
e-mail: Enrique-Moran@eustat.eus
“Ebalua”, the Basque word for “Assess”, is the provisional brand name of a new application that the Basque Statistics Institute (Eustat) is currently building to help statistical staff assess the quality of the statistics produced within the Basque Statistical Organization. Assessment is seen in Eustat as an internal reflection on the quality of a statistic, carried out by the person responsible for it together with his/her team, so that improvements can be introduced in the next cycle. The application provides the entire set of tools that the statistics team needs to conduct the assessment: the questionnaires to fill in (based on the DESAP form developed by Statistics Lithuania), the assessment report, access to the database of previous assessments, and so on, following the assessment protocol adopted by Eustat. The application is developed on the Internet because it is open to the Basque Statistical Organization, which includes Eustat and the statistical staff of the Basque Government Departments when this staff is part of a statistics team under Eustat’s responsibility. EBALUA is integrated into the information system of Eustat so that it can take several items of programming data from it: the names of the statistics and of the team members, e-mail addresses, planned dates of the assessment, and so on. EBALUA communicates the start and the end of the assessment process to the statistics team and to other people, such as the heads of the departments involved and Eustat’s managers, by sending messages and the assessment reports. EBALUA will store all the information in an Oracle database and produce statistical reports on the assessment process, its duration, delays and main features, giving a precise idea of how far the culture of quality assessment has spread in Eustat.
María Rosalía Vicente
e-mail: mrosalia@uniovi.es
Title: <<< Improving official statistics with credit and debit card data >>>
This paper explores the use of credit and debit card transactions to measure consumption and poverty, the latter understood as the absence or restriction of consumption. These data are available on a timely basis and provide detailed information about spending activity at a finer granularity than that available in traditional household surveys. Accordingly, the exploitation of this data source can provide novel insights into both consumption patterns and poverty. Specifically, we estimate spending activity for the smallest geographical unit possible and investigate differences by age and gender. We also analyze the spatial distribution of these variables and how they relate to the macroeconomic features of each district. Finally, our estimates are compared with the official figures from household surveys.

