in OMICS: A Journal of Integrative Biology, published in 2020
NECULA Andra
LEIBING Annette
BLASIMME Alexandro
39 views, 0 downloads
The expression “public opinion” has long been part of common parlance. However, its value as a scientific measure has been the topic of abundant academic debates over the past several decades. Such debates have produced more variety and contestation than consensus on the very definition of public opinion, let alone on how to measure it. This study reports on the usefulness of web-based big data digital network analytics in deciphering the distributed meanings and sense making related to controversial biotechnology applications. Using stem cell therapies as a case study, we argue that such digital network analysis can complement traditional opinion polls while avoiding the sampling bias that is typical of them. Although polls cannot account for opinion dynamics, combining them with web-based big data analysis can shed light on three dimensions of public opinion essential for sense making: counts or volume of opinion data, content, and movement of opinions. This approach is particularly promising in the case of ongoing scientific controversies that increasingly overflow into the public sphere, morphing into public political debates. In particular, our study focuses on public controversies over the clinical provision of stem cell therapies. Using web entities specifically addressing stem cell issues, including their dynamic aggregation, the internal architecture of the web corpus we report in this study brings the third dimension of public opinion (movement) into sharper focus. Notably, the corpus of stem cell networks, viewed through web connectivity, presents hot spots of distributed meaning. Large-scale surveys conducted on these issues, such as the Eurobarometer on Biotechnology, reveal that European citizens only accept research on stem cells if it is highly regulated, while the stem cell digital network analysis presented in this study suggests that the distributed meaning is promise centeredness. Although major scientific journals and companies tend to structure public opinion networks, our finding of promise centeredness as a key ingredient of distributed meaning and sense making is consistent with therapeutic tourism, which remains an important facet of the stem cell community despite the lack of material standards. This new approach to digital network analysis has crosscutting corollaries for rethinking the notion of public opinion, be it for electoral preferences or, as we discuss in this study, for new ways to measure, monitor, and democratically govern emerging technologies.

Published 2018-10. Conference: WS.2 2018 International Conference on Web Studies, Paris, France, October 03-05, 2018
35 views, 0 downloads
The emergence and success of web platforms gave rise to a catchphrase in social studies: “Hyperlink is dead!“. Capturing web users in mobile applications and private web platforms in order to offer them a specific user experience (and a business model) has indeed created new silos within the open World Wide Web. The easy availability of user behavioural data through these platforms' APIs reinforced this idea in academic communities by providing scholars with a rich and convenient way to collect user-centric data for their research. After discussing the methodological and ethical aspects of the web divide between platforms and classical websites, we will argue in this communication that hyperlinks, although more complex to collect, manipulate and apprehend, remain invaluable material for using the web as a research field. We will illustrate this with Hyphe, a dedicated web corpus creation tool we developed to mine hypertexts.
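
To give a concrete sense of what collecting hyperlink data involves, here is a minimal sketch using generic Python libraries (requests and BeautifulSoup), not Hyphe itself; the URL is only an example. It resolves relative links and drops fragments, two of the small normalisation steps that make link data more demanding to handle than API-provided user data.

```python
# Minimal sketch of hyperlink collection: fetch a page, extract its outgoing
# links, resolve relative URLs and discard fragments. Generic libraries only,
# not Hyphe's own crawler; the example URL is illustrative.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urldefrag

def outgoing_links(page_url):
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    links = set()
    for a in soup.find_all("a", href=True):
        absolute = urljoin(page_url, a["href"])   # resolve relative links
        absolute, _ = urldefrag(absolute)         # drop #fragments
        if absolute.startswith(("http://", "https://")):
            links.add(absolute)
    return links

print(sorted(outgoing_links("https://medialab.sciencespo.fr/en/")))
```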

in The Routledge Handbook to Developments in Digital Journalism Studies, edited by ELDRIDGE II Scott, published 2018-08
BOUNEGRU Liliana
GRAY Jonathan
26 views, 0 downloads
Networks are classic but under-acknowledged figures of journalistic storytelling. Who is connected to whom and by which means? Which organizations receive support from which others? What resources or information circulate through which channels, and which intermediaries enable and regulate their flows? These are all customary stories and lines of inquiry in journalism, and they all have to do with networks. Additionally, the recent spread of digital media has increasingly confronted journalists with information coming not only in the traditional form of statistical tables, but also of relational databases. Yet journalists have so far made little use of the analytical resources offered by networks. To address this problem, in this chapter we examine how “visual network exploration” may be brought to bear in the context of data journalism in order to explore, narrate and make sense of large and complex relational datasets. We borrow the more familiar vocabulary of geographical maps to show how key graphical variables such as position, size and hue can be used to interpret and characterise graph structures and properties. We illustrate this technique by taking as a starting point a recent example from journalism, namely the Decodex, a catalogue of French information sources compiled by Le Monde. We establish that good visual exploration of networks is an iterative process in which practices to demarcate categories and territories are entangled and mutually constitutive. To enrich investigation, we suggest ways in which the insights of the visual exploration of networks can be supplemented with simple calculations and statistics of the distributions of nodes and links across the network. We conclude with a reflection on the knowledge-making capacities of this technique and how these compare to the insights and instruments that journalists have used in the Decodex project, suggesting that visual network exploration is a fertile area for further exploration and collaboration between data journalists and digital researchers.
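
As an illustration of the “simple calculations and statistics” that can complement visual exploration, the following sketch computes an in-degree distribution and the share of links flowing between two categories of sources. The graph and the category labels are invented toy data, not the Decodex corpus.

```python
# Toy illustration of simple network statistics that can complement visual
# exploration: in-degree distribution and the share of links between two
# made-up categories of sources.
import networkx as nx
from collections import Counter

g = nx.DiGraph()
g.add_nodes_from(["lemonde.fr", "liberation.fr"], category="press")
g.add_nodes_from(["blogA", "blogB", "blogC"], category="blog")
g.add_edges_from([("blogA", "lemonde.fr"), ("blogB", "lemonde.fr"),
                  ("blogC", "liberation.fr"), ("blogA", "blogB"),
                  ("lemonde.fr", "liberation.fr")])

# In-degree distribution: how concentrated is the attention on a few nodes?
print(Counter(dict(g.in_degree()).values()))

# Share of links flowing from one category of node to another.
flows = Counter((g.nodes[s]["category"], g.nodes[t]["category"]) for s, t in g.edges)
for (src, tgt), n in flows.items():
    print(f"{src} -> {tgt}: {n / g.number_of_edges():.0%}")
```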

Hyphe, a web crawler for social scientists developed by the SciencesPo médialab, introduced the novel concept of web entities to provide a flexible and evolving way of grouping web pages in situations where the notion of website is not relevant enough (either too large, for instance with Twitter accounts, newspaper articles or Wikipedia pages, or too constrained to group together multiple domains or TLDs...). This comes with technical challenges, since indexing a graph of linked web entities as a dynamic layer over a large number of URLs is not as straightforward as it may seem. We aim to provide the graph community with some feedback about the design of an on-file index, part Graph, part Trie, named the "Traph", built to solve this peculiar use case. Additionally, we retrace the path we followed, from an old Lucene index, through our experiments with Neo4j, to our conclusion that we needed to develop our own data structure in order to scale up.
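
The sketch below gives a rough, in-memory illustration of the problem the Traph addresses: URLs are reordered into hierarchical stems (scheme, then host components from the TLD down, then path segments), indexed in a trie whose nodes can be flagged as web-entity roots, and page-level hyperlinks are aggregated into an entity-level graph. All names are illustrative placeholders; the actual Traph is an on-file structure designed for a much larger scale.

```python
# Toy in-memory analogue of the Traph idea: a trie over URL stems whose nodes
# can be flagged as web entities, plus page-level links aggregated into an
# entity-to-entity graph. Names are placeholders, not the Hyphe/Traph API.
from collections import defaultdict
from urllib.parse import urlparse


def lru_stems(url):
    """Split a URL into hierarchical stems, most general first."""
    p = urlparse(url)
    host = list(reversed(p.hostname.split(".")))      # e.g. com, twitter
    path = [seg for seg in p.path.split("/") if seg]  # path segments
    return [p.scheme] + host + path


class ToyTraph:
    def __init__(self):
        self.root = {}                 # nested dicts used as a trie over stems
        self.links = defaultdict(set)  # entity name -> set of cited entities

    def declare_entity(self, url_prefix, name):
        """Flag the trie node reached by this URL prefix as a web-entity root."""
        node = self.root
        for stem in lru_stems(url_prefix):
            node = node.setdefault(stem, {})
        node["__entity__"] = name

    def entity_of(self, url):
        """Walk the trie and return the deepest web entity covering this URL."""
        node, entity = self.root, None
        for stem in lru_stems(url):
            if stem not in node:
                break
            node = node[stem]
            entity = node.get("__entity__", entity)
        return entity

    def add_link(self, source_url, target_url):
        """Aggregate a page-level hyperlink into the entity-level graph."""
        src, tgt = self.entity_of(source_url), self.entity_of(target_url)
        if src and tgt and src != tgt:
            self.links[src].add(tgt)


traph = ToyTraph()
traph.declare_entity("https://twitter.com/medialab_scpo", "medialab on Twitter")
traph.declare_entity("https://medialab.sciencespo.fr", "medialab website")
traph.add_link("https://medialab.sciencespo.fr/en/tools/",
               "https://twitter.com/medialab_scpo/status/123")
print(dict(traph.links))
```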

3 views, 0 downloads
The web is big, especially towards the bottom. And it is not very organised, even if it is not chaos either. What is the structure of the web, and how can one find one's way around it? A harder question still: how can relevant information be found and identified without amassing useless terabytes? The web confronts us with challenges that are both methodological and technological. The Sciences Po médialab has developed HYPHE, a web data harvesting robot, also known as a "crawler", tailored to the needs of social science research. It is aimed at sociologists who want to investigate the web as a field of qualitative inquiry and derive quantitative indicators from it. Building on the "layered" model of the web, it guides its user in constructing, iteration after iteration, a corpus of resources and/or actors. The manual work of selecting and qualifying information is rewarded with a network of resources that can be exploited in different ways: by analysing its topology with GEPHI, by exporting its texts to natural language processing software, or by building a dedicated search engine. The médialab offers a presentation of this free and open source software and an introduction to its main concepts. Examples drawn from the work of researchers who have used it will illustrate its possibilities. A demo of HYPHE is also available online: hyphe.medialab.sciences-po.fr
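
The iterative, researcher-in-the-loop corpus building described above can be summarised schematically as follows; every function passed in is a hypothetical placeholder standing in for Hyphe's actual crawling and curation features, not its API.

```python
# Schematic sketch of the iterative corpus-building loop: crawl the entities
# already included, look at the entities they cite, and let the researcher
# decide which of them to include next. All callables are placeholders.
def build_corpus(seeds, crawl, discovered_neighbours, researcher_includes):
    corpus = set(seeds)
    frontier = set(seeds)
    while frontier:
        for entity in frontier:
            crawl(entity)                      # fetch pages, index hyperlinks
        # Entities cited by the corpus but not yet part of it.
        candidates = discovered_neighbours(corpus) - corpus
        # Manual qualification: the researcher keeps only relevant actors.
        frontier = {e for e in candidates if researcher_includes(e)}
        corpus |= frontier
    return corpus
```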

In this article, we present a few lessons we learnt in establishing the Sciences Po médialab. As an interdisciplinary laboratory associating social scientists, code developers and information designers, the médialab is not one of a kind. In recent years, several such initiatives have been established around the world to harness the potential of digital technologies for the study of collective life. If we narrate this particular story, it is because, having lived it from the inside, we can provide an intimate account of the surprises and displacements of digital research. Founding the médialab in 2009, we knew that we were leaving the reassuring traditions of the social sciences to venture into the unexplored territory of digital inscriptions. What we couldn't foresee was how much this encounter would change our research. Buying into the gospel of Big Data, we imagined that the main novelty of digital research came from handling larger amounts of data. We soon realized that the interest of digital inscriptions comes instead from their proliferating diversity. Such diversity encouraged us to reshape our professional alliances, research practices and theoretical perspectives. It also led us to overcome several of the oppositions that used to characterize the social sciences (qualitative/quantitative, situation/aggregation, micro/macro, local/global) and to move in the direction of a more continuous sociology.

In this communication, we present the principles and development stages of the datascape (corpus and visual exploration interface), as well as the constraints and limits encountered in applying this data-exploration method to research. The tool we developed makes it possible to explore the corpus from three distinct entry points (web actors, text of web pages, themes identified as topics) and to switch between them. It is based on two navigation principles. The first, which could be called vertical, moves from the "whole" to the "parts", that is, from the complete network to web entities and then to web pages, but also from topics to the terms that constitute them. Beyond this zooming function within the data (Boullier et al., 2016), the tool offers a second, horizontal navigation principle: the datascape is designed so that, at each step of the navigation, one can circulate between the different attributes of the corpus data, from actors to documents and from documents to topics.
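
A deliberately simplified sketch of these two navigation principles is given below, with made-up class and field names rather than the datascape's actual code: vertical drill-down from the corpus to entities and pages, and horizontal pivoting from an actor to the topics of its pages.

```python
# Simplified data model illustrating vertical (whole -> parts) and horizontal
# (attribute -> attribute) navigation. Class and field names are illustrative.
from dataclasses import dataclass, field

@dataclass
class Page:
    url: str
    text: str
    topics: list[str] = field(default_factory=list)

@dataclass
class WebEntity:
    name: str
    pages: list[Page] = field(default_factory=list)

    def topics(self):
        """Horizontal pivot: from an actor to the topics of its pages."""
        return sorted({t for p in self.pages for t in p.topics})

# Vertical navigation: from the whole corpus to entities, then to their pages.
corpus = [WebEntity("actor A", [Page("https://example.org/p1", "...", ["topic 1", "topic 2"])])]
for entity in corpus:
    for page in entity.pages:
        print(entity.name, "->", page.url, "->", entity.topics())
```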

We defined this website as a datascape (Latour et al., 2012). A datascape is a tool that allows exploring a dataset from different levels of aggregation and from different points of view related to the attributes of each element of the corpus. The philosophy of this datascape is to always be able to qualify actors (web entities) and the terms of potential controversies (topics and the text content of pages). To do this, we have designed a tool that allows following the links between web entities, their pages and the associated topics. We have also included two visualization tools: a graph to locate web entities, and a matrix to explore the links between topics.
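
One simple way to build the matrix of links between topics mentioned above is to count how often pairs of topics co-occur on the same pages, as in this toy sketch; the topic labels are invented for illustration.

```python
# Toy sketch of a topic-by-topic matrix: counting topic co-occurrences across
# pages. The topic labels and data are invented.
from itertools import combinations
from collections import Counter

pages_topics = [
    {"topic A", "topic B"},
    {"topic A", "topic C"},
    {"topic A", "topic B", "topic C"},
]

cooccurrence = Counter()
for topics in pages_topics:
    for pair in combinations(sorted(topics), 2):
        cooccurrence[pair] += 1

for (a, b), n in cooccurrence.most_common():
    print(f"{a} x {b}: {n}")
```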
