The Center for Socio-Political Data (CDSP) of Sciences Po has been providing access to social science data resources since 2005. Our goal over the next year is to evolve our environment towards a more user-friendly, DDI-compliant structure. In the context of the new French law “Loi du numérique” from 2017, we need to question our quantitative data access procedures. We need to evaluate which resources can be downloaded in open access and which ones require restricted access, in order to preserve the confidentiality of the survey respondents. Another element to take into account in order to provide data in a user-friendly manner is improving the quality of our metadata. Our uneven interpretation of DDI elements across our environment needs to be addressed in order to improve the user's data discovery and comparative analysis experience: a metadata harmonization process is currently ongoing (use of controlled vocabularies, metadata re-use…). This presentation highlights the processes, challenges and questions involved in providing easier data access to our user community on a DDI-compliant platform.

The CDSP provides data services and tools for the national social science community: documentation, processing, dissemination and archiving of quantitative and qualitative surveys and data. After a historical overview of the CDSP's missions, the presentation first reviews the various services handling electoral and quantitative data (on Nesstar, Quetelet, Vizlab), before focusing on beQuali, the qualitative survey bank. beQuali provides the dissemination and long-term preservation of qualitative social science surveys. The successive processing steps, as well as the technical, legal and documentation choices and procedures, are reviewed. The call for proposals to deposit a survey in beQuali is open to the whole research community: the application procedures are explained.

Since 2005, the CDSP has documented quantitative surveys and data and made them available to the scientific community. The metadata related to a study, to its results and to the study data themselves are recorded with the Nesstar software. This paper deals with the documentation of series of quantitative surveys repeated over time, and more specifically with metadata at the study-description level. Because of their longitudinal nature, these series raise specific documentation questions. We propose to compare the documentation practices of surveys produced by the CDSP, for which documentation begins at the design stage (annual ELIPSS surveys, Pratiques numériques...), with survey programmes in which the recurring nature is only taken into account after fieldwork has ended (Agoramétrie, Image de la Science...). The aim of this comparison is twofold: first, to identify the elements that seem worth preserving from the design stage onwards with a view to documentation; second, and more generally, to examine the issues raised by taking the data lifecycle into account in documentation.

The working group 'Pérennisation des informations numériques' (long-term preservation of digital information) was set up in 2000 within the Aristote association. At the group's plenary meeting, the CDSP presented the tools and services it offers to the scientific community, as well as the means implemented to preserve these data over the long term.

The French Center for Socio-Political Data has been disseminating surveys provided by external researchers since 2005. We have been using DDI-C to document the databases in Nesstar Publisher. Since 2012, the CDSP has also been the data producer of the ELIPSS panel, whose members are administered a monthly web questionnaire. To date, this has resulted in 15 databases to be disseminated to the interested scientific public. What are the best practices for using DDI effectively in this context? Our first challenge was to identify an approach that addresses the entire lifecycle of data collection, processing and archiving for the early-phase ELIPSS project. To meet its longitudinal information needs, we considered DDI-L. At the same time, shifting to the Lifecycle model raised the issue of migrating the existing databases of the CDSP catalogue to the new standard. We started by drawing up a list of specifications: level of metadata detail, possibility of upgrading DDI-C documented databases to DDI-L, user-friendliness of the interface, etc. We then compared different tools (Colectica, Questasy, DDI Editor, etc.) and selected the one that best fit our needs. We also identified good documentation practices. Our paper presents this strategy.

Harmonised datasets improve data discovery and allow comparative statistical analysis. If metadata is not entirely harmonised across waves and studies, researchers may experience difficulties in finding the data they need, and data managers may encounter difficulties in managing metadata. Taking the DDI Lifecycle specifications into consideration, our data harmonisation process was launched in 2015. We are using an ad hoc database containing the existing variable documentation; after computing a co-occurrence measure between two sequences, we validate or reject the detected similarities. A further step, in progress, is processing the link between data and documentation. This work is done to rebuild the data and metadata lifecycle. The question addressed in this paper is how DDI Lifecycle can be used to support data harmonisation and comparison projects. We will discuss, on the one hand, the benefits for the researchers who access our repository and, on the other hand, the benefits for data managers in the documentation workflow.
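The abstract does not describe how the co-occurrence measure between variable-label sequences is computed; the following is a minimal sketch, assuming a token-level Jaccard-style score with a manual-validation threshold (the function names, threshold and example labels are illustrative, not the CDSP implementation).

```python
# Illustrative sketch only: the abstract does not detail how the co-occurrence
# measure is computed. Here we assume a token-level Jaccard-style score
# between two variable-label sequences; names and the threshold are invented.

def tokenize(label):
    """Lowercase a variable label and split it into a set of word tokens."""
    return set(label.lower().split())

def cooccurrence(labels_a, labels_b):
    """Share of tokens occurring in both label sequences (0.0 to 1.0)."""
    tokens_a = set.union(*(tokenize(l) for l in labels_a)) if labels_a else set()
    tokens_b = set.union(*(tokenize(l) for l in labels_b)) if labels_b else set()
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

# Candidate pairs above a (hypothetical) threshold are queued for human review.
THRESHOLD = 0.6
wave_1 = ["Interest in politics", "Trust in government"]
wave_2 = ["Interest in politics and elections", "Trust in the national government"]
score = cooccurrence(wave_1, wave_2)
print(f"co-occurrence = {score:.2f}, manual review: {score >= THRESHOLD}")
```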

Secondary analysis is gaining a central role in contemporary social research. This type of research uses existing materials, and finding the appropriate data is a crucial step of a research project. Researchers can contact the French Center for Socio-Political Data to obtain quantitative surveys. The CDSP data dissemination system includes platforms that allow data discovery (a question bank, a data discovery webviewer…) and data download. These dissemination platforms are not interoperable: when we publish a new study, we have to publish it on each platform, with a resulting risk of desynchronisation between file versions. Besides this, the metadata is not entirely harmonised across waves and studies, and labels may differ for the same question asked across time. In this context, researchers may encounter difficulties in finding the data they need. The presentation will address:
- How we can make data more available and usable for the scientific community by determining how best to recognise continuities between data within studies, including question and methodological continuities.
- How to create visualisations of all the waves that include a given question and download the ones of interest.
- How, through the use of concepts, we aim to make our catalogue more attractive to the scientific community.
Harmonisation of studies and waves over space and time, and across studies, will give researchers easier and improved access to the data and metadata. The paper will present the programs, interfaces and procedures to be designed to accomplish this purpose, within a workflow constrained by limited human and material means.
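As an illustration of the question-continuity idea above, the following hypothetical example groups question variants from several waves under a shared concept, so that every wave containing a given question can be listed; concept names, wave identifiers and labels are invented for the example.

```python
# Hypothetical illustration of recognising question continuity across waves:
# variants of a question are grouped under a shared concept so that every wave
# containing that question can be listed. All values below are invented.
from collections import defaultdict

records = [
    ("political_interest", "wave_01", "Interest in politics"),
    ("political_interest", "wave_02", "How interested are you in politics?"),
    ("political_trust",    "wave_01", "Trust in government"),
]

by_concept = defaultdict(list)
for concept, wave, label in records:
    by_concept[concept].append((wave, label))

# All waves in which the "political_interest" question appears.
for wave, label in by_concept["political_interest"]:
    print(wave, "-", label)
```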

The French Center for Socio-Political Data presented its reflections on the process of shifting from DDI-C to DDI-L at EDDI14. This year, we will discuss the creation and storage of a DDI-L compliant XML record by capturing the metadata of a nine-wave political study of the ELIPSS panel. Determining how best to recognise continuities between metadata collections within the same study, including question and methodological continuities, has been a primary challenge. To address it, the starting point was the creation of a questions database. As seen at the 2014 DDI workshop in Dagstuhl, the minimum requirements that a metadata system should meet before being able to import/export DDI-L are uniqueness of items, versioning and granularity. To build such a database, we started with simple tools. We first identified metadata in CSV files containing variable-level information, then performed a semi-manual import from these files into the database using import scripts. Once the redundancy had been removed automatically, with a further stage of human control, we generated the structure of the DDI-L compliant XML file. Our paper will present this process and discuss its replication for other DDI-C documented studies.
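As a rough illustration of the CSV-to-XML stage described above, the following sketch assumes a hypothetical variable-level CSV export and emits a simplified XML skeleton; the file name, column names and element names are invented, and the output is not a validated DDI Lifecycle serialisation.

```python
# Simplified sketch of the workflow described above: import variable-level
# metadata from CSV, collapse duplicate question texts, then emit an XML
# skeleton. File names, column names and element names are illustrative;
# the output is NOT a validated DDI Lifecycle serialisation.
import csv
import xml.etree.ElementTree as ET

def load_questions(path):
    """Read variable-level rows and keep one entry per distinct question text."""
    unique = {}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            key = row["question_text"].strip().lower()
            # Duplicates across waves are collapsed automatically; the retained
            # rows then go through the human-control stage mentioned above.
            unique.setdefault(key, row)
    return unique

def to_xml(questions):
    """Build a minimal XML skeleton with one item per unique question."""
    root = ET.Element("QuestionScheme")
    for row in questions.values():
        item = ET.SubElement(root, "QuestionItem", id=row["variable_name"])
        ET.SubElement(item, "QuestionText").text = row["question_text"]
    return root

if __name__ == "__main__":
    questions = load_questions("wave_variables.csv")  # hypothetical CSV export
    ET.ElementTree(to_xml(questions)).write("questions.xml", encoding="utf-8")
```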
