France started to compile statistics about its trade in 1716. The "Bureau de la Balance du Commerce" (Balance of Trade Office) centralized local reports of imports and exports by commodity produced by French tax regions. Many statistical manuscript volumes produced by this process have been preserved in French archives. This communication relates how and why we used network technologies to create a research instrument based on the transcriptions of those archives in the TOFLIT18 research project. Our corpus is composed of more than 500k yearly trade transactions, each recording the trade of one commodity between a French local tax region and a foreign country between 1718 and 1838. We used a graph database to model it as a trade network where trade flows are edges between trade partners. We will explain why we had to design a classification system to reduce the heterogeneity of the commodity names and how such a system introduces the need for hyperedges. Since our research instrument aims at providing exploratory data analysis means to researchers, we will present the web application we've built on top of the Neo4j database using JavaScript technologies (Decypher, Express, React, Baobab, SigmaJS). We will finally show how the graph model was not only a convenient way to store and query our data but also a powerful visual object to explore the geographical structures of trade and specialization patterns in traded products. The project is funded by the French Agence Nationale de la Recherche (TOFLIT18).
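To make the flow-as-edge model concrete: property graphs have no native hyperedges, so a classification is typically reified as a node. The minimal sketch below uses hypothetical labels, properties and credentials, not TOFLIT18's actual schema; it stores one yearly flow as an edge between two partners and links the raw commodity name to a normalized class so heterogeneous spellings can be grouped.

```python
# Minimal sketch (hypothetical schema, not TOFLIT18's actual one) of a
# trade flow stored as an edge, with the raw commodity name linked to
# a normalized classification node.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

FLOW_QUERY = """
MERGE (src:Partner {name: $source})
MERGE (dst:Partner {name: $target})
MERGE (raw:Commodity {name: $commodity})
MERGE (cls:Class {name: $normalized})
MERGE (raw)-[:CLASSIFIED_AS]->(cls)
CREATE (src)-[:FLOW {year: $year, value: $value, commodity: $commodity}]->(dst)
"""

with driver.session() as session:
    # Two raw spellings of the same product end up in one class,
    # which is what makes aggregated queries possible later on.
    session.run(FLOW_QUERY, source="Bordeaux", target="Angleterre",
                commodity="vins de Bordeaux", normalized="Wine",
                year=1750, value=12000.0)
    session.run(FLOW_QUERY, source="Marseille", target="Levant",
                commodity="vin", normalized="Wine",
                year=1750, value=800.0)
driver.close()
```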

This article revisits research by Luc Boltanski on the faculty of the IEP de Paris. In that research, Boltanski relies on a tabular representation of social fields to show that the dominant class is characterized above all by its multipositionality, that is, by the tendency of its members to occupy several positions in several fields. By replacing Boltanski's table with a graph of individuals and institutions, we discuss the characteristics and advantages of a sociology of heterogeneous networks.
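As a toy illustration of the move from table to graph (the data below is invented, not Boltanski's): in a bipartite graph of individuals and institutions, multipositionality becomes directly readable as the degree of an individual node.

```python
# Toy illustration (invented data, not Boltanski's) of multipositionality
# read off a bipartite individuals/institutions graph: an individual's
# degree is the number of positions held across fields.
import networkx as nx

positions = [
    ("Dupont", "IEP de Paris"),
    ("Dupont", "Conseil d'État"),
    ("Dupont", "Banque X"),
    ("Martin", "IEP de Paris"),
]
G = nx.Graph()
G.add_edges_from(positions)

for person in sorted({p for p, _ in positions}):
    print(person, "holds", G.degree(person), "positions")
# Dupont, present in three institutions, is "multipositional".
```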

The web is a field of investigation for social sciences, and platform-based studies have long proven their relevance. However, the generic web is rarely studied in itself, though it contains crucial aspects of the embodiment of social actors: personal blogs, institutional websites, hobby-specific media… We realized that some sociologists see existing web crawlers as “black boxes” unsuitable for research even though they are willing to study the broad web. In this paper we present Hyphe, a crawler developed with and for social scientists, with an innovative “curation-oriented” approach. We expose the problems of using web-mining techniques in social science research and how to overcome them with specific features such as step-by-step corpus building and a memory structure allowing researchers to dynamically redefine the granularity of their “web entities”.
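A hedged sketch of the "web entity" idea follows. The rule below is a simplification for illustration; Hyphe's actual memory structure works on reversed, LRU-style URLs and is more involved. The point is that pages are grouped into entities by matching URL prefixes, so redefining an entity's granularity means editing prefix lists, not re-crawling.

```python
# Simplified sketch of curation by URL prefix: each "web entity" is
# defined by one or more prefixes, and a page belongs to the entity
# with the longest matching prefix.

WEB_ENTITIES = {
    "médialab blog": ["https://medialab.sciencespo.fr/blog/"],
    "médialab site": ["https://medialab.sciencespo.fr/"],
}

def entity_for(url: str):
    """Return the web entity owning this URL (longest prefix wins)."""
    best, best_len = None, -1
    for name, prefixes in WEB_ENTITIES.items():
        for prefix in prefixes:
            if url.startswith(prefix) and len(prefix) > best_len:
                best, best_len = name, len(prefix)
    return best

# The blog page falls into the more specific entity; the granularity
# can be changed later by merely editing the prefix lists.
assert entity_for("https://medialab.sciencespo.fr/blog/post-1") == "médialab blog"
assert entity_for("https://medialab.sciencespo.fr/equipe") == "médialab site"
```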

Since its foundation in May 2009, the médialab at Sciences Po has worked to foster the use of digital methods and tools in the social sciences. With the help of existing tools and methods, we experimented with web-mining techniques to extract data on collective phenomena. We also attended the symposiums organised by the two institutions responsible for web archiving in France, the BnF and INA, where we learnt about the difficulties that the use of web archives poses to social scientists. As it happens, our own experience in mining the live web wasn't any easier. Such difficulties, we believe, can be explained by the lack of tools allowing scholars to build for themselves the highly specialized corpora they need out of the wide heterogeneity of the web. The web isn't a well-known document space for scholars or librarians. Its hyperlinked and heterogeneous nature requires new ways of conceiving and building web corpora. And this notion of a web corpus is a necessity for both the live and the archived web: if methods are not adequate for analysing the live web, the problem will not be easier on an archive, where the time dimension adds complexity.

Bruno Latour wrote a book of philosophy (An Inquiry into Modes of Existence). He decided that the paper book was no place for the numerous footnotes, documentation or glossary, and instead gave access to all this information surrounding the book through a web application presenting itself as a reading companion. He also offered the community of readers the possibility to submit contributions to his inquiry by writing new documents to be added to the platform.

The first version of our web application was built on PHP (Yii) and MySQL on the server side. This soon proved to be a nightmare to maintain because of the ultra-relational nature of our data. We refactored it completely to use node.js and Neo4J. We went from a tree system with internal links modeled inside a relational database to a graph of paragraphs included in documents, subchapters, etc., all sharing links between them. On the way, we learned Neo4J thoroughly, from graph data modeling to cypher tricks, and developed our own graphical monitor of cypher queries using sigma.js in order to check the consistency of our data trans-modeling. During this journey, we stumbled upon data model questions: ordered links, the need to group sub-items, constraints on data output from Neo4J, and the limitations of the Neo4J community edition. In the end we feel much more comfortable as developers in our new system. Reasoning about our data has become much easier and, moreover, our users are also happier since the platform's performance has never been better. Our intention is, therefore, to share our experience with the community:
- our application's data needs
- our shift from a MySQL data model to a Neo4J graph model
- our feedback on using a graph database, and more precisely Neo4J, including our custom admin tool [Agent Smith](https://github.com/Yomguithereal/agent-smith)
- a very quick description of the admin tools we built to let the researchers write or modify contents (a markdown web editor)

The research has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) / ERC Grant 'IDEAS' 2010 n° 269567.

Authors:

Guillaume Plique. A graduate of Sciences Po Lille and Waseda University, Guillaume Plique offers the médialab his backend development skills as well as his background in social sciences. He has been working since June 2013 on several projects such as the IPCC mapping and AIME, and develops scrapers aimed at social science researchers. https://github.com/Yomguithereal

Paul Girard. Paul Girard is an information technology engineer specialized in driving collaborations between technology and non-technical domains. He graduated from the cultural industry engineering specialisation of the Université de Technologie de Compiègne in 2004, where he studied the relationships between digital technologies and society and the mechanisms of collaboration. He worked in the research laboratories federation CITU (Paris 1 and Paris 8 universities) from 2005 to 2009, where he participated in research and creation projects: collaborations between artists and engineers working with interactivity, digital pictures, virtual and augmented reality. He joined the médialab laboratory at Sciences Po at its foundation in the spring of 2009, as the digital manager of this digital research laboratory dedicated to fostering the use of digital methods and tools in the social sciences. Since then he has overseen the technical direction of many research projects as collaborations between social sciences, knowledge engineering and information design. His present research fields are digital methods for social sciences, exploratory data analysis and enhanced publication through digital storytelling. https://github.com/paulgirard

Daniele Guido. Daniele Guido is a visual interaction designer interested in data-mining applications, text analysis and network tools. He collaborates with researchers in history and social science, designers and engineers to conceive and develop digital tools for the humanities. He recently joined the Digital Humanities Lab at CVCE in Luxembourg after several years at the Sciences Po médialab in Paris, where he was engaged in the FORCCAST project (forccast.hypotheses.org) and in the AIME project (modesofexistence.org). https://github.com/danieleguido
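To make the "ordered links" question concrete, here is a minimal sketch assuming a hypothetical Document/Paragraph schema (not the actual AIME model): since Neo4j relationships are unordered, one common pattern is to store the rank on the relationship itself and sort on it at query time.

```python
# Minimal sketch (hypothetical labels/properties) of the "ordered links"
# pattern: the position of a paragraph inside a document is kept on the
# :CONTAINS relationship, not on the node, so the same paragraph could
# be reused at another position elsewhere.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def add_paragraph(tx, doc_id, para_id, text, position):
    tx.run(
        """
        MERGE (d:Document {id: $doc_id})
        CREATE (p:Paragraph {id: $para_id, text: $text})
        CREATE (d)-[:CONTAINS {position: $position}]->(p)
        """,
        doc_id=doc_id, para_id=para_id, text=text, position=position,
    )

def read_document(tx, doc_id):
    # Re-assemble the document in reading order.
    result = tx.run(
        """
        MATCH (d:Document {id: $doc_id})-[c:CONTAINS]->(p:Paragraph)
        RETURN p.text AS text ORDER BY c.position
        """,
        doc_id=doc_id,
    )
    return [record["text"] for record in result]

with driver.session() as session:
    session.execute_write(add_paragraph, "doc1", "p1", "First paragraph.", 0)
    session.execute_write(add_paragraph, "doc1", "p2", "Second paragraph.", 1)
    print(session.execute_read(read_document, "doc1"))
driver.close()
```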

A Python library to exchange the webcorpus format

In Le Monde, published on 2012-02-03
OOGHE Benjamin
LAROUSSERIE David

Published in 2012
LECLERCQ Christophe
GUIDO Daniele
Experiments in Art and Technology (E.A.T.) is a renowned example of interdisciplinarity at the intersection of art, science and technology, conceived by Rauschenberg, a co-founder of E.A.T., as a «map of engineers, money and equipment» facilitating collaborations between artist, engineer and industry. Despite the amount of information available, E.A.T.'s production and legacy remain unclear. The project, in collaboration with E.A.T. (represented by Julie Martin), is based on the organization's numerous activities, developed in art and non-art contexts, including realized and unrealized works and projects from the 1960s to the present day, explored through a 'datascape'. In the first place, the aim is to describe as extensively as possible the stories of works of art or projects, from their design and development stages to their different exhibitions and receptions. One of the main challenges is to produce a digital archive displaying the process of collaboration, not merely a catalogue of E.A.T.'s productions. Secondly, the archive will map these works and projects as networks of people, organizations, places and technologies, so as to better understand E.A.T.'s identity using the theories of Science and Technology Studies. The aim is to develop an online archive built as a research tool for humanities scholars in art and social art history. Their work, mapping the material in the archive, will provide the project's added value; historical documents combined into information networks, revealed by various visualizations, will make the archive a veritable exploratory tool.

Published in May 2012
LECLERCQ Christophe
GUIDO Daniele

The development of digital technologies is driving a long and profound transformation of our relationship to knowledge. Whatever label is used to designate this phenomenon, many people are questioning how our academic practices are evolving (Lazer et al. 2009, Ollion & Boelaert 2016).

Exploring the parts to build the wholes

Latour et al. argued in 2012 that new ways of representing and, above all, navigating data would make it possible to revisit the relationship between whole and parts, a debate as old as sociology itself (Latour et al. 2012). According to these authors, each whole is only one particular way of seeing the parts, one shared trait that brings them together. One trait among others. The whole is such a useful, meaningful handle that we forget it hides the reduction of a host of particularities. Yet this whole can now be decomposed, or rather recomposed, dynamically within new means of data exploration called datascapes. Following these intentions, we have since 2012 designed exploratory data analysis tools (Tukey 1977) that multiply the perspectives on a single object.
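A minimal sketch, with invented data, of what "recomposing the whole from the parts" can mean in an exploratory tool: the same individual records can be re-aggregated along any shared trait, each grouping being only one of the possible "wholes".

```python
# Toy example (invented data): each aggregate is only one way of
# grouping the same particulars; a datascape lets the user switch
# between such groupings dynamically instead of fixing one "whole".
import pandas as pd

flows = pd.DataFrame({
    "partner":   ["England", "England", "Levant", "Levant"],
    "commodity": ["wine",    "cloth",   "wine",   "cloth"],
    "value":     [12000,      5000,      800,      2000],
})

# Two different "wholes" recomposed from the same parts:
print(flows.groupby("partner")["value"].sum())    # a geographic whole
print(flows.groupby("commodity")["value"].sum())  # a product whole
```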
