Domain-topic models with chained dimensions: Modeling the evolution of a major oncology conference (1995-2017)
In this paper we introduce a novel approach for the computational analysis of research activities and their dynamics. Named SASHIMI (Symmetrical And Sequential analysis from Hierarchical Inference of Multidimensional Information), our approach provides a multi-level description of the structure of scientific activities that offers numerous advantages over traditional methods such as topic models or network analyses. Our method generates a dual description of corpora in terms of research domains (collections of documents) and topics (collections of words). It also extends this description to clusters of associated dimensions, such as time. Unlike other science-mapping approaches, which rely on specific metadata such as citations, authors, or keywords, SASHIMI requires access only to the textual content of individual documents. We illustrate the analytical power of our method by applying it to an original dataset, namely the 1995-2017 collection of abstracts presented at ASCO, the largest annual oncology research conference. We show that SASHIMI detects significant temporal patterns and identifies the major thematic transformations of oncology that underlie these patterns.