Short Papers

The latest overview of the DH2019 programme is available in ConfTool.

Blackwell, Christopher William (1,3);
Smith, David Neel (2,3)
Discoverable Data Models and Extended Text Properties in the CITE Architecture
The CITE Architecture is a generic framework for identification, retrieval, and alignment of information about things humanists study. The challenge of a _generic_ framework lies in how it can handle the innumerable specific kinds of data likely to appear in any non-trivial digital library. This paper will describe the implementation of *Discoverable Data Models* and _Extended Text Property Types_ serialized in the CEX line-oriented, plain-text format and implemented in applications. Specific examples will be (a) geo-spatial data in different formats, (b) textual data in different markup encodings, and (c) image collections, where the same image may be exposed as a JPG on a filesystem, via the IIIF API, or as a DeepZoom file.
Bellia, Angela
Towards a New Approach in the Study of Intangible Cultural Heritage
Archaeoacoustics is used as a new method for the analysis of historical heritage, enabling the evaluation of the sound quality of an archaeological site by using auralisation techniques, which allow cognitive and physical elements to be reproduced and combined. Research has followed different approaches: from the analysis of the relationship between architecture and acoustics in their current condition, to the anechoic recordings of music to be used in the auralisation of sacred places. Using archaeoacoustics as an emerging discipline that involves the study of ancient sites, this paper aims to assess whether performative spaces in Sicily and South Italy were built in a precise place for their acoustical qualities, and to understand the reasons that led ancient cultures to create these spaces, as well as reconstruct how they experienced them.
Bermúdez-Sabel, Helena (1);
Díez Platas, María Luisa (1);
Ros, Salvador (1);
González-Blanco, Elena (2)
Towards a Common Model for European Poetry: Challenges and Solutions
This paper stems from the analysis of multiple poetic resources that were available on-line, as well as the results of methodological discussions with scholars of European Literature. The goal was to identify the information needs of all these different sources in order to build a common data model for European Poetry. Thus, by implementing a reverse engineering method, we have created the Domain Model for European Poetry, which is an important milestone for making existing poetry resources interoperable. In this paper, we will present some of the challenges we encountered while conceptualizing the information relevant to poetic analysis and how we have worked around them.
Li, Weixuan (1);
Piccoli, Chiara (2);
Heuvel, Charles van den (1,2)
Embracing Complex Interfaces Linking Deep Maps and Virtual Interiors to Big Data of the Dutch Golden Age.
Although Semantic Web technologies are gradually being introduced in the digital humanities and cultural heritage institutions, the representation of linked data is still very abstract and hardly allows for interaction by researchers or other users. Here we present the first experiments with the creation of complex 2D/3D/4D interfaces on top of the Semantic Web that express uncertainties in the data and allow users to interact critically with it in multiple ways. The 2D interface aims to preserve and present the complexities rooted in historical sources through deep mapping. It aims at the visualization and analysis of the migration patterns of creative individuals within Amsterdam during the Dutch Golden Age. The 3D/4D interfaces, which are anchored to the GIS deep map layers, will act as an interactive hub to connect the heterogeneous data that are available on 17th-century creative industries in a spatially coherent context.
Antonietti, Laura
Modelling The Editorial Reading Process: The Case Of Giulio Einaudi Editore
The paper will present the results of the analysis and the modelling of the editorial reading process within the Italian publishing house Einaudi in the aftermath of World War II, an activity that resulted in the production of reading reports, containing summaries and evaluations of the proposed work.

The presentation will highlight the fundamental contribution of tools and methods of DH in the context of my research. I will discuss data modelling and encoding as speculative activities of epistemological value. In fact, by modelling the reading process, I was able to understand and represent its complexity; in addition, by encoding the documents I was able to focus in a consistent and speculative manner on aspects such as textuality, style, and rhetoric.

The potential of DH tools and methods has yet to be exploited and applied to this research field: at present there are no digital editions of reading reports.
Dollman, Melissa
Changing Lanes: A Reanimation Of Shell Oil’s Carol Lane
From 1947 to 1974 Shell Oil Company sponsored a public relations program that engaged single and married women drivers. They especially targeted married women who helped plan leisurely road trips for their families, and single “gals” who wanted to see the country. Over twenty different women portrayed its figurehead, the pseudonymous Carol Lane. I am researching each of these women, as well as Lane's audiences and associates, using prosopographical analysis. On an ArcGIS platform I am developing my eventually public-facing digital dissertation, in which I present data (4000+ magazine and newspaper articles, books, films) that I have collected, input myself, and am utilizing in a number of ways, including data visualization and videographic criticism. I look forward to input on the efficacy of a variety of multimedia factoids I am utilizing and offering, such as maps, census tables, and raw, searchable, related data in a (currently) Airtable database.
McKay, Cory (1);
Cumming, Julie E. (2);
Fujinaga, Ichiro (2)
Lessons Learned in a Large-Scale Project to Digitize and Computationally Analyze Musical Scores
This paper presents insights we have gained from working on the SIMSSA (Single Interface for Music Score Searching and Analysis) project, which seeks to digitize historical musical manuscripts, use machine learning to convert the notation to searchable symbolic representations, automatically tag the results in musicologically meaningful ways, perform statistical analyses on data extracted from the music, and make the resulting data and technologies easily accessible to other researchers. This paper emphasizes insights from this project that are meaningful not only to computational musicology and music information retrieval researchers, but also to those working in the digital humanities in general. We first focus on approaches to dataset construction and machine learning, and then explore approaches to making research data, software and results available, usable and attractive to other researchers in the humanities, including those not yet accustomed to computational approaches.
Ferreira-Lopes, Patricia
Simulating Historical Flows And Connection. The Artistic Transfer During The 15th To 16th Century In The Iberian Peninsula.
The Late Gothic period (fourteenth-sixteenth century) was a phase of transition in Europe – with social, political, economic and cultural changes. Within this framework, Europe was the scene of a significant amount of mobility of artists that in some way materialised the production of architecture without borders: a "Pan-European style" [1] capable of reproducing and adapting models in different places. This paper will present the project ArTNet, “Analysing artistic transfer network. A social and spatiotemporal study of Late Gothic architectural production in the Iberian Peninsula”, which was designed to identify, record and analyse artistic transfer networks transcending the building scale to better understand the process of Late Gothic architecture production in the Iberian Peninsula. An integral view bringing together several factors is being studied through multiscale models, combining HGIS and graph models, and through analyses such as SNA, spatial statistics, map visualisation and spatiotemporal analysis.
Arthur, Paul
Tracing the Development of Digital Humanities in Australia
This paper discusses the development of digital humanities in Australia, with reference to major projects, events and the establishment of the Australasian Association for Digital Humanities (aaDH). It begins by providing an overview of national exemplar projects, events and policies that predate the founding of the association in 2011, as well as significant activities and initiatives that formed a basis for the Australian field. The paper outlines the history of aaDH as a regional association, reflecting on its directions over the past decade, and describes the parallel development of large-scale research infrastructure that has supported the field’s further growth.
Hu, Jiajia
Using Network Analysis to Do Traditional Chinese Phonology Study
Traditional Chinese Phonology, lacking an alphabetic system of phonetic notation such as the IPA, had to deal with large bodies of written material in Chinese characters, and used Chinese characters as a tool to analyze the sounds of words. This gives rise to a significant feature of the field: the relationships among words’ sounds are more important than their phonetic values.

Xìlián (literally: "inter-link") is one of the most important methods in traditional Chinese phonology. Its basic principle is to build networks of Chinese characters that share the same syllabic elements. This paper takes the Xìlián of Fǎnqiè in Guǎngyùn as an example to show how network analysis and visualization software can be used to improve the study of traditional Chinese phonology.
Williamson, Elizabeth
Encoding Early Modern English Drama: Embedding Digital Approaches In Undergraduate Literature Courses.
This paper offers a pedagogical case study whereby students were introduced to the complexities of text encoding as a way to destabilize the revered canonical text and teach digital critical literacy, as part of a first-year Shakespeare module. I will argue that there is a particular synergy between the existing concerns of the Shakespeare course and those of digital publication, where the latter finds a natural fit in conversations on book history, text technologies, and editorial agency. Neglecting to discuss the digital provenance of a text online and the multiplicity of agents involved in producing any text obscures the encoding as well as editorial choices made at every stage of its creation. This is a conversation that we need to begin early on in students’ academic careers, to situate digital critical literacy within an existing tradition of literary criticism.
Vignale, François (2);
Benatti, Francesca (1);
Antonini, Alessio (1)
Reading in Europe - Challenge and Case Studies of READ-IT
This paper aims to present the READ-IT project and the first set of case studies collected by DH and HSS researchers. Use cases are key in the project’s strategy, as they are essential both to the definition and to the validation of the READ-IT data model and framework. The case studies include different sources, such as social media, students’ diaries and letters, from the 18th century up to today, in Czech, French, German, Italian and Dutch. Each of them is supported by a specific dataset and a specific research question. In this context, this original validation process must be able to demonstrate the relevance, robustness and ability of both the general concept and the data model to process a wide variety of sources. This model should then be transferable to other DH projects where the experiential dimension is present.
Shang, Wenyi (1);
Zhang, Jingzhou (2);
Huang, Win-bin (1)
Modelling Poetic Similarity: A Comparative Study of W. B. Yeats and the English Romantic Poets
Observing poetic similarity is a method in comparative studies of poetry, indicating interrelationships among works and poetic influences among authors. This research explores the poetic similarity between the Irish poet W. B. Yeats and four related English Romantic poets, whose influence can be discerned in Yeats’s poetry. Adopting digital approaches, we build a model to quantitatively analyze the poetic similarity among authors in three aspects: intertextuality, formal elements (including rhyme type, meter style and enjambment), and sentiment. After analyzing 1090 poetic works by Yeats and four English Romantic poets and interpreting the results, we find the poetic similarity between Yeats and Blake to be the most significant, corresponding with previous literary critics’ views and leaving room for further explorations. The findings support previous literary studies, and the methods may shed light on further studies concerning Yeats’s complex relationship with Romanticism.
Mei, Ching-Hsuan;
Hung, Jen-Jou
Exploring Intertextuality in the Mahayoga Section of the Rin chen gter mdzod
Although Tibetan scholars have already noticed the phenomenon of textual reuse in treasure literature (gter ma), it remains difficult to conduct large-scale comparative reading, identify repeated sentences and locate their origins. Drawing on previous studies, we estimate that the intertextuality embedded in treasure texts may be denser than has so far been noticed. There has been no systematic analysis of large Tibetan textual collections in academic circles so far, so we propose to apply digital textual analysis technology to deconstruct the great corpus of Tibetan treasure texts, the Mahayoga section of the Rin chen gter mdzod. Considering the amount of data, we use digital technology to compare phrasing and detect reused sentences, so that we can further interpret the so-called intertextuality in Tibetan treasure literature. After a trial period of this research project, we find that this is an achievable goal.
Kräutli, Florian;
Valleriani, Matteo;
Lockhorst, Daan
Calculating Sameness: Identifying Image Reuse In Early Modern Books
We extracted around 16,000 images and diagrams from a corpus of university textbooks published between 1472 and 1650. Some of these images appear several times throughout the corpus. We present how we identify and analyse recurring images using an image hashing algorithm and a data visualisation tool. The reappearance of images, combined with bibliographic metadata, can offer insights into the kind of knowledge that is being taught, which images have been successful, as well as which images might have been exchanged between different printers and publishers.
Waxman, Joshua
A Graph Database of Scholastic Relationships in the Babylonian Talmud
Envisioning the dense scholastic interactions of the Babylonian Talmud as a multi-generational social network, we built a social network graph of the Talmud. We performed Named Entity Recognition on statement-aligned Hebrew and English texts to produce graphs of rabbis (nodes) and their interactions (edges). The new graph database, currently available online, delivers visual and color-coded information about scholastic generation, teacher-student relationships, and both local (page-level) and global (Talmud-level) interactions. By making a wealth of knowledge accessible and highlighting relevant relationships, this tool can provide valuable insights into the complex dynamics of Talmudic discourse.
Wevers, Melvin (1);
Smits, Thomas (2)
Advertising Gender - Using Computer Vision to Trace Gender Displays in Historical Advertisements, 1920-1990
This study applies computer vision techniques to examine the representation of gender in historical advertisements. Using information on the relative size, position, and gaze of men and women in images, we chart gender displays in Dutch newspaper adverts between 1920 and 1990. In this short paper, we operationalize aspects of Erving Goffman's theories on gender displays in two ways. First, Goffman argues that “differences in size will correlate with differences in social weight.” Using facial recognition software, we select adverts that include people and then run gender detection software to estimate whether men or women were represented in the images. This allows us to visually represent the changing faces of men and women in adverts. Part of the process also entails a reflection on the possible bias in these algorithms.
van Lange, Milan Mikolaj
Tantrums and Traitors: a Diachronic Analysis of Emotions in Parliamentary Debates on War Criminals and Collaborators
This study investigates emotions in discussions about the punishment of collaborators and war criminals in the Netherlands by analysing the verbatim minutes of parliament. The application of text mining techniques to this digitised historical text corpus allows for a diachronic perspective on a historical case study. In this paper, I present the historical case, the materials and techniques used, and some insights based on preliminary results. I will also address general advantages and limitations of using text mining in historical research. The aim of this investigation is to explicitly examine and discuss the validity of the use of emotion lexicons in diachronic historical research.
Bonora, Paolo;
Pompilio, Angelo
E Pluribus Unum: a Uniform DL Solution for Historical Data Management, Archiving and Exploitation of Opera
The paper will present the results of the Corago LOD project, promoted by the Department of Cultural Heritage Studies of the University of Bologna, which has applied Semantic Web technologies to digital archives about Opera. While the use of DL for content management within CH is now consolidated, the project studied what the use of formal ontologies means from the end user's point of view. The complexity of the domain, and the complexities deriving from the adoption of the CIDOC CRM and FRBRoo as reference ontologies, motivated the investigation of newer strategies to navigate the knowledge base. The identified solution is the introduction of an abstraction layer where the domain expert defines the way information is going to be presented to end users. We will illustrate how this approach allows the creation of applications that make knowledge easily usable by the human actor while maintaining the interoperability prerogatives of Linked Data.
Kruijt, Anne
VinKo: Language Documentation Through Digital Crowdsourcing
This talk aims to introduce and discuss the VinKo platform as a linguistic fieldwork tool and crowd-sourced database. VinKo is an online platform used to collect data from the Trentino/Alto Adige region (Italy), a linguistically complex region, via innovative crowdsourced methods on a digital platform. It is used to collect oral data from the Germanic and Romance varieties spoken in the area in order to gain further insight into the different aspects of multilingualism and microvariation. The data collection is done through a simple interface, which facilitates easy collaboration with speakers and speech communities, and aims to return all collected data to the community in a meaningful manner. Online use of minority languages plays an important role in increasing their prestige and visibility, and it can contribute greatly to the maintenance of a variety by increasing speakers' awareness of and pride in their own language.
Rybicki, Jan
Analysis of Writer-Text-Translator Social Networks
This paper is an analysis of the connections between writers, their texts and their translations through social network analysis. Data on writers and translators was obtained from UNESCO’s Index Translationum, a large database of existing translations of texts from numerous domains. The data for this study was limited to literary translations into Polish, a total of almost 18,000 individual editions of novels or collections of short stories by 8290 authors and 6582 translators from 155 languages. This produces a complex mesh of writer-to-translator connections, which is analyzed using the Fruchterman-Reingold force-directed algorithm. Interesting phenomena can be observed using such a visualization of otherwise inaccessible links between items in the database.
Homburg, Timo
Paleo Codage - A machine-readable way to describe cuneiform characters paleographically
This publication introduces a new system to describe cuneiform characters in a machine-readable way. It includes paleographic information such as relations of cuneiform wedges to each other so that the computer can reconstruct and analyse the characters described. This system provides the first chance to assign each cuneiform character a unique description, based on the character's shape, which is easily understandable and valid across all cuneiform languages in all epochs. The system is showcased on a subset of characters currently encoded in the Unicode standard, highlighting different positioning variants found in the whole Unicode character set. As a first application, a similarity graph based on the new encoding system has been developed and is presented. In addition, a web application for creating one's own encodings according to the new system is available for testing, and an Android app for character recognition is currently being modified to improve accuracy in character detection.
Palladino, Chiara;
Bergman, James;
Trammell, Caroline;
Mixon, Eleanor;
Fulford, Rebecca
Using Linked Open Data to Navigate the Past: An Experiment in Teaching Archaeology
Linked Open Data is a powerful tool for navigating through the complexity of the inherently multifaceted reality of archaeological sites, which results from the intersections of space, materiality, language, visual culture, history, text, and so on. However, LOD also poses the challenge of how to manage such complexity in a meaningful way. In this paper, we report on an experimental project developed during a Classical Archaeology course in 2018, during which we researched four different Graeco-Roman sites, with the goal of reconstructing the main aspects of their material history through exclusively LOD-based resources.
Dumouchel, Suzanne (1);
Giglia, Elena (2)
CO-OPERAS IN: Integration And Cooperation To Face Fragmentation And Address Complexity In The SSH
Complexity in the Social Sciences and Humanities (SSH) can take the shape of the fragmentation of research fields, across many disciplines and subdisciplines, usually grounded in regional, national and linguistic specific communities. “Big data” does not apply to SSH, where data often need to be very precisely qualified, described, curated and managed: they are smart and small data, which means they have to be specifically managed, all the more so in the perspective of being integrated in the European Open Science Cloud (EOSC) landscape, as a major component of the IFDS. This short paper will present CO-OPERAS (Open Access in the European Research Area Through Scholarly Communication), an Implementation Network (IN) within the GoFAIR initiative, as a tool to face this issue.
Basaraba, Nicole
Creating Complex Digital Narratives for Participatory Cultural Heritage
Many digital humanities projects have produced digitised archives, and museums are increasingly integrating digital media into their visitor experiences. Some of these valuable historic databases have provided public access but can lack sustained public engagement. As Schreibman (2017) argued, public participation in cultural heritage can generate new ideas and could challenge the top-down division between researchers and the public (p. 281). In a move towards this notion, this paper proposes a novel approach to producing interactive digital narratives (IDN) that converge expert-produced content and user-generated content (UGC) with the aim of creating participatory cultural heritage narratives that also maintain narrative control. Examining the potential of IDN formats for creating cultural heritage narratives involves many facets of complexity, and this paper will discuss these complexities in the context of how cultural analytics (Manovich, 2007) can be used to create IDNs.
Vigliensoni, Gabriel (1);
Daigle, Alex (1);
Liu, Eric (1);
Calvo-Zaragoza, Jorge (2);
Regimbal, Juliette (1);
Nguyen, Minh Anh (1);
Baxter, Noah (1);
McLennan, Zoe (1);
Fujinaga, Ichiro (1)
Overcoming the Challenges of Optical Music Recognition of Early Music with Machine Learning
Several centuries of manuscript music sit on the shelves of libraries, churches, and museums around the globe. On-line digitization programs are opening these collections to a global audience, but digital images are only the beginning of true accessibility since the musical content of these images cannot be searched by computers. In the SIMSSA (Single Interface for Music Score Searching and Analysis) project we aim to teach computers to read music and to assemble the data on a single website. However, the automatic retrieval and encoding of music from score images has many complexities. In this paper, we describe our current workflow to perform end-to-end optical music recognition (OMR) of early music sources.
Hulden, Vilja
Labor Witnesses at U.S. Congressional Hearings: Historical Patterns
This paper examines what metadata about Congressional hearings can tell us about shifts in the relative power of workers in American society over time. The metadata contains 941,302 instances of testimony between 1877 and 1990; it includes information about the witnesses appearing before Congressional committees as well as about the subjects of the hearings. This data is juxtaposed with strike and union density data to suggest that labor has been most consistently represented at Congressional hearings when a) it has engaged in electoral politics and b) it has possessed demonstrable strength in civil society as measured not only by isolated incidents but by consistent penetration. Future work hopes to juxtapose these data sets with actual legislative outcomes; merely being heard does not, after all, necessarily translate into being listened to or deferred to. Such juxtapositions could help elucidate whether organizational strength converts to important outcomes as well as presence at hearings.
Grincheva, Natalia
Re-inventing the Past Through Singapore Memory Project: Socio-political Complexities of Digital Crowd-sourcing Techniques
In my presentation I will explore how online audiences experience time in digital museum communities, and how these experiences change their cultural perceptions and identities. The project will focus on an online museum case study, the Singapore Memory Project (2011-2015), an online national initiative for public memory preservation. It was facilitated by the National Libraries and Museums Board in 2011 to collect and provide access to Singapore’s culture through crowd-sourcing.

Employing interviews with governmental officials and museum managers, as well as content discursive analysis of the online memory portal, I analysed in my research how this digital space reconstructed time through museum narratives communicating political messages across borders. I also explored how these narratives were aligned with the national and foreign policy objectives of the country, revealing social and cultural complexities of the memory crowd-sourcing exercise.
Granholm, Patrik
Manuscripta – A Digital Catalogue of Medieval and Early Modern Manuscripts in Sweden
This paper presents the guiding principles and ongoing development of Manuscripta, a digital catalogue of medieval and early modern manuscripts in Sweden, which started as a project-specific database but has since evolved to become a national infrastructure. The manuscript descriptions are encoded in TEI, which is a highly suitable metadata format for detailed, scholarly catalogues since the hierarchical structure of TEI corresponds to the four parts traditionally used in cataloguing: description of contents, codicological description, provenance, and bibliography. The digitised manuscripts are presented using the IIIF API, and the images are available free of restriction under the CC0 public domain dedication. The infrastructure is built entirely using open source software, and the source code, together with the TEI files, is available on GitHub.
Smeets, Roel (1);
Sanders, Eric (1);
van den Bosch, Antal (1,2)
Modelling Conflicts Between Characters in Present-Day Dutch Literary Fiction.
Conflicts have been regarded as one of the driving forces behind narrative action. Bonds and conflicts between characters are indicative of hierarchical oppositions between represented identities. Drawing on extensive metadata of 2137 characters in a corpus of 170 novels, the present paper models conflicts between characters in present-day Dutch literary fiction. Using insights from network theory and narratology, this talk addresses conflicts between two characters (dyads), as well as between three characters (triads). First, a conflict score is introduced through which the more powerful party between two hostile characters can be determined. Second, the amount of social balance between subnetworks of three characters is tested. The results of these two approaches are interpreted in light of existing theories on conflicts in narratives. It will be argued that conflict situations co-shape the ideological representation of characters in literature, which can be mapped on a large scale with our approach.
Thwaites, Denise (1);
Pailthorpe, Baden (2)
Blocumenta: An Experimental Art Project on the Blockchain
From ‘Bitchcoin’ and ‘CryptoKitties’ to distributed ledgers for nuclear non-proliferation, feverish adoption and experimentation with blockchain technology is matched only by the promissory hype that accompanies it. This paper presents the aims and background of _Blocumenta_ - an experimental contemporary arts project that seeks to hold this technology to its promise, exploring if and how distributed artistic communities could be developed through new blockchain funding and archiving systems.

Departing from an initial stage of creative engagement and aesthetic experimentation across arts and tech communities, _Blocumenta_ interrogates whether a distributed, autonomous and trustless contemporary art economy and archive can overcome the centralised power dynamics of the global contemporary art world, enabling new digital resources for writing Art Histories. Combining a cryptocurrency crowdfunding structure with a decentralized app, _Blocumenta_ will establish a distributed autonomous art organisation (DAAO) exploring alternatives to the implicit power dynamics of traditional archives, and the chaos of Web 2.0.
Benardou, Agiatis
Immersive Experiences And Difficult Heritage: Digital Methods As Re-interpreters Of Historically Contested Sites
Immersive experiences describe all forms of perceptual and interactive use of technologies that blur the line between the physical world and a simulated or digital world, i.e. they create a hybrid reality aiming at embracing all spheres of the user’s attention. Lately, methods of immersive experience such as virtual and augmented reality, artificial intelligence, as well as mixed methods (i.e. analogue and digital combined), have all become means of memory re-composition in the cultural heritage domain.

The proposed talk will use the infamous Block 15 of the Haidari Concentration Camp in West Athens, the largest and most notorious concentration camp in wartime Greece, as a case study of a largely neglected site of difficult heritage, and will attempt to show that immersive technologies are best suited to make accessible, highlight and re-interpret both the site and the narrative surrounding it.
Konstantelos, Leo (1);
Pittock, Murray (1);
Benardou, Agiatis (2);
Economou, Maria (1);
Hughes, Lorna (1)
A World of Immersive Experiences: The Scottish Heritage Partnership
The Scottish Heritage Partnership is a nine-month AHRC-funded initiative aiming to address the existing practice and future potential of immersive experiences and technologies in the collections and heritage industry in Scotland. Its key research question revolves around measuring the success of approaches to immersive technologies at major heritage sites in Scotland, both in terms of outcomes against business plan expectations and in terms of visitor response, and the kinds of future development supported by the evidence.

Development of an evidence-based decision-making model is currently under way, and the model will be presented at DH2019. Formulated as a policy and risk assessment document, the model is meant to help heritage institutions identify the kinds of future immersive experiences that are supported by our evidence, and assess how to develop effective, meaningful content into leading-edge, inclusive and impactful immersive experiences.
Du, KeliA Survey On LDA Topic Modeling In Digital Humanities LDA topic modeling is a statistical method that discovers hidden themes and topics from a text corpus. It has been widely applied in digital humanities in the past several years.

In practice, topic modeling is more complex than just training and visualizing topics. Several factors may influence the results of topic modeling. As far as I know, there is no common understanding of how to handle these factors when using topic modeling.

In this paper I therefore propose to look at the approaches from the books of abstracts of the annual international conference of the Alliance of Digital Humanities Organizations between 2011 and 2018, in order to provide a comprehensive overview of how the majority of humanities scholars understands and uses topic modeling.
Thompson, Mark L.;
van der Woude, Joanne
Atlantic Journeys through a Dutch City's Past: Building a Mobile Platform for Urban HeritageFor five centuries, the Dutch city of Groningen has been connected to the Americas and the broader Atlantic World. Yet this profound connection remains little known. In order to address this problem, our consortium is developing a mobile application called “Amerigo” that will identify, map, and demonstrate Groningen’s links to the Atlantic World using techniques of digital mapping, storytelling, and curation. When tourists, residents, scholars or students walk through the city with Amerigo, they will read, hear, see, and co-create the stories of real historical characters from Groningen’s past. At the same time, by drawing on data produced within the application, the researchers will analyze how this diverse audience engages with urban space and culture, fellow participants, and the application itself. This innovative combination of features promises to make Amerigo at once a useful medium for historical research and an experimental platform for studying public engagement with urban heritage.
Garcia-Fernandez, Anne (1);
Cogitore, Isabelle (1,2)
A Digital Laboratory for the Study of Paratexts: The Example of Tacitus On LineWe present a data model and visualization tools for the study of paratexts. Drawing on the corpus of Justus Lipsius's commentaries on the Annals of Tacitus, we argue for the value of solutions tailored to the project's scholarly objectives that nonetheless respect standards and allow the tools to be documented and reused. Our approach rests on the following principles: prior questioning and definition of the nature of the object of study; a commitment to serving the project's scholarly objectives above all; and the implementation of solutions that allow the reuse of the data as well as of the tools and methods.
Tonra, Justin (1);
Davis, Brian (2);
Kelly, David (1);
Khawaja, Waqas (3)
Poetry In Motion: Quantified Self Data And Automated Poetry Generation_Eververse_ is a project which synthesises perspectives from disciplines in the humanities and sciences to develop critical and creative explorations of poetry and poetic identity in the digital age. Deploying tools and methods from poetic theory, data analysis, and Natural Language Generation (NLG), the automatic production of natural language output from a non-linguistic data source, _Eververse_ uses data from quantified self (QS) devices to automatically generate and publish poetry which correlates to the wearer/poet’s varying physical states.
Miyake, MakiApplying Measures of Lexical Diversity to Classification of the Greek New Testament EditionsThe study focuses on decision tree models based on several measures of lexical diversity, aiming at classifying genres of authorship attribution and critical types in various editions of the Greek New Testament.

We use measures of lexical diversity that are not significantly correlated with the number of tokens.

After creating training and test subsets from several editions, we apply two classification algorithms, Classification and Regression Tree and Random Forest.

We then evaluate the classification accuracy obtained with the token-independent measures.
Dumont, Stefan;
Grabsch, Sascha;
Müller-Laackman, Jonas
Four Years of correspSearch – Challenges, Potentials and Lessons of Open Data AggregationOver the last four years our project has successfully aggregated metadata for about 52,000 letters. Most of the data comes from external contributions, drawn from a wide variety of scholarly editions and institutions. All metadata is openly accessible and licensed CC0. With this recap we aim to identify successful recipes and practices for the decentralized aggregation of domain-specific open metadata. Furthermore, we will show the possibilities which arise from the aggregation of such metadata on a bigger scale and discuss ways to manage as well as explore the complex realities of our data.
Tsui, Lik Hang (1);
Chen, Jing (2)
Defining and Debating Digital Humanities in China: New or Old?In the global context, no single unified definition of digital humanities (DH) is possible. The scholarly context in which DH was defined and debated in the Greater China region is starkly different from that in Western academia, owing to the unique features of humanities data in Chinese, especially for texts. With special focus on the context and cultural politics of the conditions in which DH emerged and the contestations that it encountered, our paper unravels the complex issue of DH emerging as a scholarly field in China from a historical standpoint. All in all, the paradigm shift in China is slowly taking place, but it is a delayed one given the amount of preparation that the “prehistory” phase has seen. DH has become a canopy term for Chinese scholars to reconceptualize, recategorize, and repackage old projects and academic practices from the “prehistory” phase.
Galleron, IoanaStylometric Analyses of Character Speeches in French PlaysThis paper tries to answer the question whether literary characters have a style of their own, or if the stylistic specificities of the author that created them are prevalent over all other types of linguistic differences. In order to do so, it puts together a corpus of 8 plays from 1630 to 1740, and extracts speeches of 82 characters. These are further analysed with the stylometric library under R developed by Eder et al. The visualisations thus built show that, more often than not, characters from the same play do not appear together, nor do they display a clear historical split. The paper will identify the main specificities, and will try to propose some explanations of these linguistic dissimilarities.
Aitken, Brian;
Alexander, Marc;
Dallachy, Fraser
(Re)connecting Complex Lexical Data: Updating the Historical Thesaurus of EnglishThe University of Glasgow’s Historical Thesaurus of English (HT) arranges all the recorded words in the last millennium of English into almost a quarter of a million concepts. The work of half a century, it is available online, and a second edition is underway. This edition draws upon editorial work conducted by the Oxford English Dictionary (OED) in its ongoing third edition, and thus a crucial activity in creating the new edition of the Thesaurus is the meshing of the Glasgow database with the separate data held by the OED. This paper describes the processes developed by the HT editorial team to tackle the complex task of linking these datasets, allowing rapid updates to be made to the HT and the OED.
Mishina, Ekaterina"Open List": How to Collect Primary Data on Soviet Terror“Open list” is a public database built on wiki software, which contains information on victims of political terror on the territory of the former USSR from 1917 to 1991. “Open list” works like Wikipedia. Every person in the database has their own page consisting of two parts: a biographical card with personal data and a field for the publication of documents and biographical texts.

Crowdsourcing is a very important part of the project. There are several possible activities for our users: they can find appropriate pages for unsorted photos, parse data from biographies into the fields of the biographical form, or identify and merge duplicate pages. Our volunteers decipher information from archival documents and create new pages in the list, or contribute to existing ones, for the united electronic “Book of memory” of Moscow and the Moscow region, which "Open list" compiles itself.

Our aim is to normalize data in biographical fields to make academic research easier.
Mondaca, Francisco;
Rau, Felix;
Neuefeind, Claes;
Kiss, Börge;
Kölligan, Daniel;
Reinöhl, Uta;
Sahle, Patrick
C-SALT APIs - Connecting and Exposing Heterogeneous Language ResourcesIn this paper, we present a strategy for the integration of existing heterogeneous language resources like texts and dictionaries by connecting these resources and making them available for internal projects and third party applications through APIs. We describe our approach in the context of the C-SALT (Cologne South Asian Languages and Texts) initiative, where projects and resources hosted at the University of Cologne covering South Asian languages are presented. To illustrate the potential use of our setup, we first introduce VedaWeb, a web-based platform that provides access to ancient Indian texts written in Vedic Sanskrit, the oldest form of ancient Indo-Aryan. Then we describe the C-SALT APIs for Dictionaries. These APIs make several large Pāli and Sanskrit dictionaries available. Building on that, we present the architecture behind these APIs and finally we summarize by analyzing the potential role of APIs in Digital Humanities projects.
Povroznik, Nadezhda GeorgievnaDocumentation of Digital Heritage Information Resources: Expanding Access for Research and EducationThis paper discusses the latest approaches to developing information systems for digital cultural heritage on a global scale, including the creation of catalogs and infrastructure for resource documentation. Digital cultural heritage resources are diverse in content, origin, purpose, scale, technology and user audience. Documentation systems are essential to facilitate advanced digital humanities research and to provide greater user access to digital heritage information resources. Such a documentation system has been developed. The platform includes a wide range of characteristics related to describing information resources for digital cultural heritage.

The resource meta-description structure includes 39 fields that represent 3 groups of data:

1) Data on the creators of the information resource;

2) General information about the information resource;

3) Content description metadata.

The proposed method and solutions expand the possibilities for finding thematically similar information resources, and provide a global model to make such resources more accessible for research and education.
Alassi, Sepideh (1);
Schweizer, Tobias (1);
Hawkins, Michael (2);
Iliffe, Robert (2);
Rosenthaler, Lukas (1);
Mattmüller, Martin (3);
Harbrecht, Helmut (3)
Newton virtually meets Euler and BernoulliToday there are many online digital editions available, but each is presented on its own platform without any connection to other editions. Having access to the data of similar digital editions, together with a graph representing the semantic connections of data, through one single platform will facilitate historians' research dramatically. In our project we aim to provide such a platform without locally storing all digital editions.

The generic features which will be developed for this project will enable the user to access the data of other digital editions on demand, perform search queries on all connected digital editions, annotate the data, etc. As a prototype, we have chosen to connect the base platform, _Bernoulli Euler Online (BEOL)_, to _The Newton Project_, because both these projects contain digital editions of early modern mathematics which are significantly related and the data in both projects are in structured XML format.
Anderson, Carrie J. (1);
Kehoe, Marsely L. (2)
Batavia and the Gold Coast: Mapping Textile Circulation in the Dutch Global MarketThe Dutch Republic established its importance on the world stage through its early successes in global trade, becoming for a time the preeminent circulator of luxury and wholesale goods for the European market. Our project is a collaboration between specialists in the Dutch East India Company (VOC) and the Dutch West India Company (WIC), who have both worked to establish the centrality of trade and the circulation of goods to Dutch Golden Age art history, and now join forces to bring the previously siloed considerations of these companies, East and West, together, through the examination of different modes of textile circulation. Our project, Batavia and the Gold Coast: Mapping Textile Circulation in the Dutch Global Market, seeks to make connections between economic, social, and visual data—which so often exist as discrete epistemological categories—through the development of an open-access database and an interactive map.
Meneses, Luis (1);
Martin, Jonathan (2);
Furuta, Richard (3);
Siemens, Ray (1)
A Framework to Quantify the Signs of Abandonment in Online Digital Humanities ProjectsPrevious work presented at Digital Humanities 2017 and 2018 has explored the abandonment and the average lifespan of online projects in the digital humanities. We believe that managing and characterizing the degradation of online digital humanities projects is a complex problem that demands further analysis. In this abstract, we go one step further by exploring the distinctive signs of abandonment that these projects share, in order to quantify the planned obsolescence of online digital humanities projects.

For this purpose, we have created a framework that collectively quantifies the signs of abandonment in online digital humanities projects. Our study incorporates the retrieved HTTP response codes, number of redirects, a detailed examination of the contents and links returned by traversing the base node, external resources, HTTP headers and linked files. We intend this study to be a step towards better preservation mechanisms and towards adopting strategies for the planned obsolescence of digital humanities projects.
Smith, David NeelA Corpus-linguistic Approach to the Analysis of Latin MorphologyThis paper presents a corpus-linguistic approach to the analysis of Latin morphology using an automated system for building morphological parsers. It differs from current approaches in two ways.

First, to address the complexity of Latin as it is documented across millennia, it supports parsers tailored to particular corpora. Building parsers is straightforward. A lexicon and set of inflectional rules are maintained in delimited text files following a specified orthography. From these files, the build system composes source code and compiles a parser.

Second, a corpus selected for research or teaching can be fully characterized morphologically in terms of citable results. The output of the parser uses URNs to identify lexical entities, morphological forms, and the inflectional rules and stems used to match lexical entity and form with a surface token.

This approach is illustrated by applications to corpus-linguistic research, validation of digital editing, and Latin pedagogy.
Ullyot, MichaelThe Augmented Criticism Lab’s Sonnet DatabaseSonnets are reducible neither to formal definitions (14 lines of rhymed iambic pentameter) nor to generic definitions (first-person reflection or argument, often with a volta): the many exceptions to these ‘rules’ make sonnets both flexible and prodigious. But how would our understanding of sonnets change were we to move beyond major authors (Petrarch, Shakespeare, Rilke) and anthologized selections to read every sonnet printed in European literary languages?

For instance, is the sonnet a form, or a genre? What rhyme scheme do most sonnets use? Who is the most innovative sonneteer, and who the most typical? And what does the ground truth of all those sonnets do to our standard definitions of this form-genre hybrid?

This paper describes a database that aims to compile every extant sonnet, in order to quantify their features through time.
Jänicke, Stefan (1);
John, Markus (2);
Geßner, Annette (1)
The Value of Tag Cloud Visualizations for Textual AnalysisIn digital humanities applications, tag clouds are often used as a means of distant reading. By dissolving the structure of texts, the frequencies of different words can be determined and visualized with font size. However, there are crucial theoretical problems in the design of tag clouds that call into question their benefit for text analysis tasks. In this paper, we evaluate the value of several tag cloud visualization techniques that have been designed to support research tasks in various digital humanities scenarios. We base our analysis on the King James Bible, the most influential English translation.
Grossner, Karl;
Mostern, Ruth M.
World-Historical GazetteerThis paper reports on World-Historical Gazetteer (WHGazetteer), a three-year project funded by the US National Endowment for the Humanities, now two-thirds complete. WHGazetteer is a scholarly infrastructure project intended to support historical research across many disciplines. It is principally a web-based software system for aggregating open access data about places and linking it with data about historical entities associated with those places.
Ries, ThorstenBorn-Digital Archives A Digital Forensic Perspective on the Historicity of Born-digital Primary RecordsThe proposed paper will scope the complexity of born-digital archives from a digital forensic, historical and philological perspective. Personal digital archives, institutional repositories, web archives, email archives and social media archives create(d) digital primary records that the historical humanities struggle to fully recognize as documents in their own right. The historicity of the forensic materiality and structure of the born-digital record is a concept still to be methodologically and theoretically understood in the humanities and in archival science.

The purpose of this paper is to argue that forensic materiality and analysis are methodologically relevant for the critical appraisal and understanding of production processes of born-digital sources in the humanities as a whole, including history, social history, political and culture studies (including literature, art history etc.).
Peng, Yi-Fan (1,2);
Liu, Chao-Lin (1)
Some GIS-Based Analysis of the Complete Taiwan PoemsWe analyze the poets and their poems in the collection of the Complete Taiwan Poems. This collection mainly includes classical Chinese poems that were produced between 1661 and 1945 in Taiwan. We focus on the spatiotemporal analysis of the poets and poems, and provide three application examples in this proposal. The examples include the analysis of the distribution of birthplaces of the poets of different time periods, the distribution of place names in poems of different time periods, and the temporal distribution of place names that were mentioned in poems of a specific poet.
Iashchenko, IuliiaRemember How: The Place of Visualization in Preserving the Memory of Repressions of the USSR Against the Volga GermansThis article discusses some of the possibilities of visualization and representation of historical events that are associated with repression against the Volga Germans in the USSR during the Second World War. In particular, special attention is paid to the role of historical electronic maps and the use of geo-information technologies in the preservation of historical memory.

At present, issues of ethnic and political repression in the USSR are studied in a rather fragmentary way, both in Russia and in Europe as a whole. Today, the problem of repression against the Volga Germans has two aspects: first, the necessary direct study of repressive practices and deportation processes; and second, the preservation of the memory of victims of repression and the creation of dialogue in society. Achieving these goals is impossible without the use of information technology, especially when it comes to preserving memory and representing historical events in public space.
Vandenbunder, Jeremie;
Bendjaballah, Selma;
Garcia, Guillaume;
Cadorel, Sarah;
Groshens, Emilie;
Fromont, Emilie;
Juillard, Emeline
Training in Humanities and Social Science Methods with beQualiThis paper presents beQuali, a bank of qualitative surveys, and its contributions to methods training in the humanities and social sciences. We first give some context about the survey bank and describe its various activities. We then highlight the possible uses of beQuali for methodological training and the issues they raise in terms of pedagogy, data access, and the contextualization of those data. Ultimately, this short presentation aims to show, through the example of beQuali, the possibilities that digital technology opens up for exploring and exploiting the complex corpora that qualitative surveys constitute, in the context of university teaching.
Morent, StefanSacred Sound – Sacred Space: In Search Of Lost SoundThe project investigates the interaction of the architecture of sacred spaces with sound, and the relations between concepts of sacred spaces and their socio-cultural construction and religious experience as well as the shaping of liturgical forms.

Such complex systems of relations are particularly demanding if sacred buildings no longer exist, or at least not in their original form.

New approaches of research are provided by recently refined methods of virtual reconstruction of historical acoustics based on reconstructed 3D-models of the architecture.

This research project will explore the contextualization of liturgical singing in its original sound space.

The innovative character of the research project consists in the combination of musicological, liturgical and ritual studies with techniques of Digital Humanities.

Investigations will be conducted on the churches of Cluny II and III, St. Peter and Paul at Hirsau, St. Gall and the UNESCO World Heritage monastery Maulbronn.
Wiering, FransA Mobile Website To Support Teachers In Discussing Terrorism In The ClassroomThis paper describes a project that aims to increase societal resilience against terrorism in Dutch primary and secondary education. It supports teachers by providing them with reliable and compact information, practical support for discussing the topic in class, interpretation of recent developments, and local support. The focus of the paper is on the mobile website that was developed for this project using human-centred design methodology.
Vogeler, GeorgImplementing the Assertive Edition for Historians – Some samplesHistorians have long considered historical documents as information carriers. Editing documents for historical research thus often meant creating modernized or abridged texts and offering a wide range of tools to access the content of the texts (abstracts, rich indices or commentaries). In the digital world this approach has resulted in a type of digital scholarly edition which still lacks an accepted term: drawing from the Semantic Web it could be named “semantic edition”, in the context of “fact-checking” it might be termed “factual edition”, or, in terms of logical reasoning, the term could be “assertive edition” (Vogeler 2018). The paper will present implementation samples and discuss the key issues of the technology stack involved.
Junginger, Pauline (1);
Ostendorf, Dennis (2);
Avila Vissirini, Barbara (2);
Voloshina, Anastasia (3);
Kreiseler, Sarah (4);
Dörk, Marian (2)
Close-Up Cloud: Gaining A Sense Of Overview From Many DetailsWe present a visualization technique for the exploration of digitized cultural collections that proposes a novel approach towards the overview. Overview and detail are typically positioned as opposites in the visual representation of information spaces. However, when visualizing image collections annotated by art historians, there is an opportunity to reveal the visual details of individual images while at the same time exposing iconographic patterns prevalent within a collection. As part of an iterative research and design process in collaboration with a museum of arts and crafts, we have devised a visualization technique that arranges detailed close-ups into frequency-based collages. The resulting visual interface is designed for open-ended exploration of digitized glass plate negatives without requiring prior knowledge about the collection or the need for entering search queries. We implemented the concept as a web-based interface and evaluated the potential of the approach.
van Cranenburgh, Andreas (1);
Koolen, Corina (2)
The Literary Pepsi Challenge: intrinsic and extrinsic factors in judging literary qualityIn this paper we develop a new survey, based on fragments from the novels used in the National Reader Survey, to collect evidence that text-intrinsic characteristics play a role in ratings of literary quality, and investigate exceptions where we suspect various biases may play a role (cf. Koolen, 2018). The results will tell us more about how perceptions of literariness are formed and which particular textual aspects play a role. They will also enable a direct comparison between the performance of humans and a computer model on this task.
Ros, Ruben;
van Eijnatten, Joris
Disentangling a Trinity: A Digital Approach to Modernity, Civilization and Europe in Dutch Newspapers (1840-1990)This paper fleshes out the relations between the conceptual trinity of modernity, civilization and Europe (MCE) using digital history techniques. Based on a dataset of four Dutch newspapers that span the period 1840-1990, we show how conceptual connections between the members of the MCE trinity are highly restricted. N-gram frequency measures and collocations are employed to map the word usage and, building on recent advancements in diachronic vector semantics, we use word embeddings to study changing relations between modernity, civilization and Europe. These methods show how the trinity is characterized by intermittent and alternating connections, but not by perennial semantic boundaries. Given that these results differ from research based on elite discourse, this paper demonstrates the need for digital research into conceptual interrelationships.
Neuefeind, Claes (1);
Schildkamp, Philip (1);
Mathiak, Brigitte (1);
Marčić, Aleksander (2);
Hentschel, Frank (2);
Harzenetter, Lukas (3);
Breitenbücher, Uwe (3);
Barzen, Johanna (3);
Leymann, Frank (3)
Sustaining the Musical Competitions Database: a TOSCA-based Approach to Application Preservation in the Digital HumanitiesThis contribution presents an approach to the preservation of web-based research applications in the DH (e.g. databases, digital editions, interactive visualizations, or virtual research environments). Our approach is based on TOSCA, an OASIS standard for modeling, provisioning, and managing cloud applications in a standardized and provider-independent way.

We describe the key concepts of our approach in the context of an exemplary use case application, where the application's topology is modeled in a TOSCA-compliant way. Our use case is the Musical Competitions Database, a web application providing comprehensive information about music related competitions from 1820 to 1870.

With this contribution, we want to trigger a discussion about the applicability of methods and technologies of professional cloud deployment and provisioning strategies to problems of long-term availability of research software in the DH-community.
Puren, Marie;
Vernus, Pierre
Improving the understanding and preservation of European Silk Heritage. Producing accessible and reusable Cultural Heritage data with the SILKNOW ontology in CIDOC-CRMSilk was a major factor for progress in Europe, mostly along the Western Silk Road’s network of production and market centres. Silk, however, has become a seriously endangered heritage. Although many European specialized museums are devoted to its preservation, they usually lack size and resources to establish networks or connections with other collections. The H2020 SILKNOW project aims to produce an intelligent computational system in order to improve our understanding of European silk heritage.

This computational system is modelled and trained using datasets mapped according to the SILKNOW ontology. In this paper, we will present how we have defined this data model, and how we have specified the entities to be represented by the ontology and the existing relationships between these entities. The design and implementation of the SILKNOW ontology representing the model is based on CIDOC-CRM.
Li, HuiDishes on the menu: Turning Historic Menu into Menu NetworkHistoric menus contain abundant information about changing regional tastes, the ingredients of popular dishes, the arrangements of different meals, and the fascinating stories behind the menus. However, research on the modeling, measurement, and analysis of menu networks is still at its very beginning.

In this paper, we propose a menu network, closely resembling today's social networks, based on the metadata and content of menus. We formalize and standardize the basic elements found in most menus, and introduce our menu network, which integrates temporal, geographical, economic and textual information into a graph structure.
Cornell, Deborah A (1);
Callaghan, Samantha (2)
Transatlantic Collaboration in Digital HumanitiesCollaboration is fundamental to digital humanities work, and DH researchers and practitioners spend significant effort, time and resources on collaborative processes. Additionally, collaboration is frequently necessary and actively encouraged by funders (AHRC, 2019; NEH-DHAG, 2019), and yet little formal discourse and attention is given to this topic in DH publications and project reports (Griffin and Hayler, 2018; Lawrence, 2006). In an attempt to address this lack of dialogue, this short paper introduces a project that aims to map and document the collaboration of multiple diverse partners in a large-scale distributed digital humanities project.
Bourgatte, MichaelTaking the Context of Use and the Technical Context into Account in Developing the Celluloid Video Annotation ServiceWe present here the results of a research project entitled *Celluloid*, whose object is the uses of *video annotation* in educational or research contexts. Defined in this very broad sense, the history of annotation practices is as old as that of intellectual production. This practice has nevertheless been extended and renewed with the emergence of digital tools. Within the *Celluloid* project, we started from the observation of the *practices of teacher-researchers*, then analysed and compared the features offered by several video annotation platforms. This work brought to light the difficulties that teachers and researchers encounter in carrying out educational or research projects that make use of video. It also revealed that the ergonomic and technological choices made by developers hinder collaborative dynamics. On the basis of these analyses, we therefore developed a suitable tool.
Boukhaled, Mohamed Amine;
Fagard, Benjamin;
Poibeau, Thierry
A Predictive Approach to Semantic Change ModellingThe availability of large textual corpora spanning several centuries makes it possible to observe the evolution of language over time. This observation can be targeted towards the search for general laws of language evolution.

In this contribution, we propose a computational model that can predict the semantic evolution of words over time. Computational modeling of language change is a relatively new discipline in the world of Digital Humanities, which includes early works that aimed at characterizing the evolution through statistical and mathematical modeling and more recent and advanced works involving robotics and large-scale computer simulations.

Semantic change modelling, on which we focus in this work, aims to model at a macro scale all kinds of changes affecting the meaning of lexical items over time. Our aim is to capture systemic change in word meaning in an empirical model that can also predict such change, making it falsifiable.
Ilvanidou, MariaAnd The First One Now Will Later Be Last, For The Times They Are A-changin': Modeling Land Communication In Roman CreteThe present contribution has a twofold aim: on the one hand, it will seek to demonstrate how the use of digital tools and methods enabled the reconstruction of the Roman road network in Crete back in 2005, while on the other hand it will showcase how rapid developments in digital tools often render research in the field of the Humanities outdated and obsolete.

An initiative such as the integration, connection and modeling of complex data on Roman road networks in the digital domain was indeed quite innovative back then. An analogue approach would still have been up to date and re-usable. Sustainability of this Roman roads modeling project has proven to be next to impossible. Therefore, one could argue that what the digital so generously offered my work, it has taken back rather fiercely.
Vaara, Ville;
Hill, Mark;
Tolonen, Mikko
Publishers, Printers and Booksellers - Implications of Properly Structured Metadata for Digital HistoryWith any computational analysis of a large historical dataset, there is a strong temptation to approach the dataset as a holistic representation of the language and intellectual landscape of its era. However, as digital history projects are often criticised for naive and historically lax approaches to sources resulting in simplification of complex phenomena, such an approach will not hold water. In this paper we demonstrate how proper use of metadata is necessary for serious corpus control and digital source criticism.

This work makes two specific contributions to the history of the book and digital history. First, we present a general methodological approach for creating a historical biographical database out of a bibliographical catalogue. Second, we demonstrate solutions for forming a uniform dataset from a noisy and heterogeneous starting point.
Fragkiadakis, Manolis (1,2,3)To Sign or not to Sign: Automated Generation of Annotation Slots for Sign Language Videos using Machine LearningOver the last few years, various corpus projects documenting sign languages have started all over the world. During the annotation process of these corpora, the researcher has to determine the precise time a sign occurs and properly gloss it. Consequently, the annotation process is extremely labor intensive, but it is a precondition for a reliable quantitative analysis of the corpora.

The aim of this project is to develop a tool that automatically annotates the signs and their phonological features in a video. The first step towards automatic annotation is to recognise the exact time frame in which a sign occurs. To remove the redundant information from the raw video, a pose estimation framework (namely OpenPose) has been used. The extracted hand locations have been used to train four different classifiers. The result of this process is a tool that uses XGBoost to accurately predict the span of a sign and automatically create the annotation slot.
Islam, Jumayel (1);
Xiao, Lu (2);
Mercer, Robert E. (1);
High, Steven (3)
Tension Analysis in Survivor Interviews: A Computational ApproachTension is an emotional experience that can occur in different contexts. This phenomenon can originate from a conflict of interest or uneasiness during an interview. In some contexts, such experiences are associated with negative emotions such as fear or distress. People tend to adopt different hedging strategies in such situations to avoid criticism or evade questions.

In this work, we analyze several survivor interview transcripts to determine the characteristics that play crucial roles in tense situations. We discuss key components of tension experiences and propose a natural language processing model which can effectively combine these components to identify tension points in text-based oral history interviews. The model provides a framework that can be used in future research on tension phenomena in oral history interviews.
Thomae, Martha E.;
Cumming, Julie E.;
Fujinaga, Ichiro
Taking Digital Humanities to Guatemala, a Case Study in the Preservation of Colonial Musical HeritageThe goal of this project is the preservation and dissemination of Guatemala’s colonial musical heritage by applying music information retrieval tools to a group of Guatemalan manuscript sources while maintaining the original sources in their homeland. These sources are written in mensural notation, a music notation style used in Europe throughout the Late Middle Ages and the Renaissance. In this paper, we will present the step-by-step process that will overcome the accessibility barriers of this repertoire: (i) lack of high-quality digital images, (ii) notation style, and (iii) layout. This process will result in the digitization and encoding of the repertoire as musical scores in a machine-readable file format, which will facilitate its dissemination, study by musicologists, and appreciation by the general public. We expect this research to be used as a model for the digitization of the mensural repertoire of other countries with a colonial past.
Kiessling, Benjamin (1,2)Kraken - an Universal Text Recognizer for the HumanitiesKraken is a language-agnostic optical character recognition engine that can be applied to both printed and handwritten texts with relatively modest training effort. It includes a number of features making it of special interest to digitization work in the humanities.
Liu, Chao-Lin;
Chang, Wei-Ting;
Zheng, Ti-Yong;
Chiu, Po-Sen
Toward Building Chronicles from Biographies in Local Gazetteers: An Application of Syntactic and Dependency ParsingStatements in a typical personal chronicle are short and informative. The goal of creating chronicles from biographies led us to the very challenging task of sentence compression. We used the biographies of the Taipei gazetteers as the testbed, and abbreviated sentences based on the results of constituency and dependency parsing of Stanford tools. We shortened the original sentences by heuristically dropping some nodes in the parsing results. Although we have gathered invaluable experience in this preliminary exploration, we accepted only 45% of the shortened sentences that our program produced.
Weinfurtner, Anne;
Dorner, Wolfgang;
Graf, Simon
How to Better Find Historical Photographs in an Archive - Geographic Driven Reverse Search for PhotographsThis contribution presents web-map-based retrieval techniques that allow spatial data included in historical photographs to be stored, with the aim of better satisfying the information demand of spatially oriented and object-centric disciplines.
Ströbel, Phillip Benjamin;
Clematide, Simon
Improving OCR of Black Letter in Historical Newspapers: The Unreasonable Effectiveness of HTR Models on Low-Resolution ImagesWe showcase the usefulness of Handwritten Text Recognition (HTR) models when it comes to the recognition of black letter in historical newspapers. We illustrate how simple the production of a ground truth, the training and the evaluation of such HTR models are with the help of the integrated platform Transkribus. Our paper highlights that a model trained on only a limited amount of data achieves state-of-the-art performance and beats commercial software like ABBYY FineReader by oftentimes large margins. We are particularly interested in how HTR models trained on medium-resolution data perform on high-resolution images and we are able to show that the performance is comparable, which means that costly and time-consuming re-digitisation processes are not required in order to improve OCR quality. Moreover, we investigate the transferability to other newspapers. In short, our findings demonstrate how digital humanists can improve their source material for text mining with a reasonable effort.
Bon, Bruno (1);
Alexandre, Renaud (1);
Nowak, Krzysztof (2);
Vangone, Laura (1)
VELUM : Towards Innovative Ways of Visualising, Exploring and Linking Resources for Medieval LatinThis paper aims to present the _Velum_ project, which is a first step towards an innovative digital environment for the study of the language and culture of medieval Europe. The medieval civilization can only be investigated by means of the study of traces that have survived to our times. The best source of our knowledge is the texts, preserved in huge quantity and variety. Written mainly in Medieval Latin, they have not benefited from recent advances in computational linguistics.

To challenge this situation we will build a large and balanced corpus of Medieval Latin texts composed between 500 and 1500 AD all across Europe. It will be annotated with PoS, lemma, time and place labels. Tools allowing efficient statistical analysis and data visualization will be developed. The texts and tools will be made freely available to the scientific community, to help researchers answer their own questions or discover new ones.
Rajan, Vinodh (1);
Stiehl, H. Siegfried (1,2)
Advanced Manuscript Analysis Portal (AMAP): An Interactive Visual Language Environment for Manuscript StudiesApplication of digital methods in the fields of Digital Paleography and Manuscript Studies has long been a challenging task, as one requires programming experience to use the methods and create computational solutions. Recently, Visual Language-based applications like AppInventor have gained a lot of attention. By using an intuitive visual syntax, they let non-programmers create computational solutions easily. However, Visual Language (VL) environments do not exist for Manuscript Studies.

In this context, we introduce AMAP, a Visual Language environment for programming with DIA methods. It offers a largely self-usable toolbox that humanists can use to build solutions themselves. We initially outline the need and motivations for developing AMAP and further elaborate on the design and implementation of AMAP along with its potential applications.
Barget, Monika Renate;
Schreibman, Susan;
MacCarron, Pádraig
Complexities And Compromise – User-Centred Interfaces For Public Humanities projectsDrawing on recent experiences of the Letters 1916-1923 team in re-designing their project website, this paper will elaborate how changing user expectations, academic standards and special requirements of source material can be reconciled in the creation of database driven interfaces designed for public humanities projects. Digital Humanities scholars agree that interfaces are “part of the design” and need to visually tell the project’s story. But despite extensive theoretical discourse on user-centered designs, many DH projects still tend to fall short in practice. This paper will explore these issues theoretically and practically by describing some of the initial problems and design choices made in the course of the recent Letters 1916-1923 re-launch, thus contributing to an on-going discourse. The results of onsite user testing, in particular, have shown how user expectations have transformed since the first release of the website in 2015, and these findings may benefit similar public humanities projects.
Atanassova, Rossitza IlievaA National Library’s digitisation guide for Digital HumanistsThis short paper will give practical advice about the Library’s digitisation planning process for scholars who wish to use digitised resources in their research. The information will help scholars understand the institutional context, the roles involved in digitisation, the preparation stages and documentation required, typical timelines and the decision-making that happens at different stages. With this knowledge it is hoped that DH scholars will be better prepared for the process and will factor it into their research funding proposals. They will also gain an understanding of the Library’s considerations and policy for making existing digitised resources available for reuse and how scholars could request this for their projects. In making the policy and processes at the institution more transparent, the presentation will expose some of the hidden labour undertaken by cultural heritage staff to enable Digital Humanities (DH) research.
Bardiot, Clarisse (1);
Broadwell, Peter (2);
Oiva, Mila (3);
Suarez, Pablo (4);
Wevers, Melvin (5)
The Leonardo Code: Deciphering 50 Years of Artistic/Scientific Collaboration in the Texts and Images of Leonardo Journal, 1968-2018Leonardo (1968-present), published by MIT Press, is the leading international peer-reviewed publication on the relationship between art, science and technology, making it an ideal dataset to analyze the emergence of such complex collaborations over time. To identify and analyze both the visible and latent interaction patterns, the research employs different granularities of data (article texts, images, publication dates, authors, their places of affiliation and disciplines) as part of a multimodal approach. Using a convolutional neural network, we examined the features of the images to analyze the modes of representing (and actually doing) art, science or engineering. We paired these features with information extracted using text mining to examine the relationships between the visual and the textual over time.
Škvrňák, Jan (1);
Škvrňák, Michael (2);
Ochab, Jeremi K. (3)
How To Detect Coup d’État 800 Years LaterThe thirteenth century in the Czech lands is undoubtedly the most interesting period for the nobility. In the first half of the century, an almost invariable group of noble families became established around the monarchs, so that the impossibility of political upheaval led to the uprising of part of the nobility and the civil war of 1248-1249.

Having collected data on approx. 2300 noblemen from 568 charters, we use social network analysis to describe polarization within the nobility and to explain which noblemen joined the uprising in the ranks of Přemysl Otakar II (Ottokar II) and how their position in the social network influenced their chances of being appointed to high-ranking positions within the kingdom after the coronation of Přemysl Otakar II.
Backman, Agnieszka;
Petrulevich, Alexandra
Norse World – The Complexities Of Spatiality In East Norse Medieval TextsThe Norse perception of the world project sees East Norse (Old Swedish and Old Danish) literature as a mine of information on how foreign lands were visualised in the Middle Ages: What places were written about and where? How do place names link different texts?

The study of spatial thinking and knowledge in medieval Scandinavia and its development as an area of enquiry have been hampered by a dearth of information on place names in literary texts. Any research aiming to uncover what pre-modern Scandinavians understood about places abroad requires as a minimum an index of foreign place names in East Norse literature, an infrastructure that has not existed until now. To answer questions like those above and more, the project has created the online resource Norse World, consisting of a MySQL database, compatible with both GeoJSON and JSON-LD, with interactive search and mapping using the Leaflet library.
Vaara, Ville (1);
Ijaz, Ali (1);
Tiihonen, Iiro (1);
Kanner, Antti (1);
Säily, Tanja (1);
Lahti, Leo (2)
The Emerging Paradigm of Bibliographic Data ScienceIn order to facilitate research use of library catalogues, we recently proposed the concept of _bibliographic data science_. This aims to improve data reliability and completeness through systematic and reproducible harmonization, deriving from the paradigms of open science and data science. We have constructed a comprehensive bibliographic data science ecosystem that facilitates semi-automatic harmonization and enrichment of bibliographic entries. The work is based on an iterative process where research use often leads to new enhancements in data processing. The overall ecosystem integrates a number of distinct workflows that are dedicated to harmonizing specific subsets of the data collections. Further algorithmic tools support the integration with other data sources, such as full text collections, and final statistical analysis, visualization, and summarization of the data. As such, bibliographic data science can advance the methodological and conceptual basis in book history and digital humanities.
Takeda, Joey (1);
Roberts-Smith, Jennifer (2)
One More Time With Feeling: Revisiting XPointers to Address the Complexities of Promptbook EncodingTEI has long supported the use of XPointers (Grosso et al. 2003), but they are seldom implemented or recommended as a method of linking TEI documents (Cayless 2013). We make the case that they may still be a viable option for TEI projects, by means of a real-world example in which XPointers are necessary: the Waterloo-based Stratford Festival Online (SFO) project, which aims to encode the Festival’s world-class collection of theatrical promptbooks (Malone 2013; 2018). To represent the complex ontologies of the contents of promptbooks, our research team is developing an approach that uses two data files linked by stand-off markup and XPointers, one for the verbal text that a stage manager uses as a timeline during a performance, and the other for the non-verbal events the stage manager enacts or monitors (Roberts-Smith, Kaethler, Malone et al. forthcoming). This short paper is illustrated by a sample implementation in XSLT.
Molineaux, Benjamin JosephThe Corpus of Historical Mapudungun:This paper presents the challenges and prospects of building the Corpus of Historical Mapudungun, focusing on the difficulties of materials and methods used to reconstruct the history of a Native American Language. Special focus is placed on sound change – particularly epenthesis – in a language with abundant complex morphological structure (aka polysynthesis).
Gabay, Simon (1);
Riguet, Marine (2);
Barrault, Loïc (3)
A Workflow For On The Fly Normalisation Of 17th c. FrenchAlthough neural machine translation (NMT) has proven to be the most efficient solution for normalising pre-orthographic texts, the amount of training data required remains an obstacle. In this paper, we address for the first time the case of normalising modern French and we propose a workflow to create the parallel corpus that an NMT solution requires.
Uyola, RosalieDHK12: Open-Access Digital Humanities Curricula for K-12 SchoolsDHK12 is a free, unrestricted, online open-access collaborative network of educators who seek to:

use digital tools to make the humanities come to life for students

draw on the scholarship of women and people of color to diversify curricula

support students as they do the work of historians by creating knowledge.

This year, DHK12 is in the process of launching two digital projects -- built by students, the people who will need the tools to deal with the complexities of the future -- that will contribute to public scholarship in digital indigenous studies, digital black studies, Africana and diasporic studies, digital queer studies, and digital feminist studies. With these interdisciplinary and transnational, student-driven digital archive projects (currently in progress; please see full abstract for project descriptions), we will build complex models of memory and commemoration, analysing our data with computational methods and communicating the results to a broader public.
Karrouche, NorahStill Waters Run Deep. Including Minority Voices in the Oral History Archive Through Digital PracticeWhile digital archiving practices in the Netherlands have provided better access to oral history collections, the effort has also demonstrated that the voices heard in those oral history projects are predominantly white. This paper argues that the composition of the Dutch oral history archive is in dire need of revision and seeks to generate a dialogue on how to remedy this silence. In a discipline that has traditionally prided itself on its emancipatory potential, ethnic minorities and formerly colonized peoples in particular have received relatively little attention. First, I closely examine the state of the art of digital oral history in the Netherlands. Second, I explore how digital research infrastructures and repositories can contribute to a more inclusive archive through close collaboration with community archives.
Bukula, Andiswa;
Steyn, Juan
The Complexities Of The Representation Of Xhosa Protagonists, Represented By Male And Female Authors In IsiXhosa Dramas Using Computational MethodologiesThis paper will report on research that has already been carried out manually for a Master's study and that will now use computational methods to analyse the data. The research will look at the representation of women protagonists in isiXhosa dramas as presented by male and female authors, and at how the authors' sex influences the ways in which they depict these women. The computational tools utilized will include Voyant Tools and analysis using regular expressions, as well as testing the feasibility of BookNLP when used with conjunctively written languages.
Mimno, David (1);
Martin, Meredith (2);
Algee-Hewitt, Mark (3)
The Sonnet StretcherWe present a tool for viewing patterns in the position of words in poems. Given a collection of 10,000 English-language sonnets, we stretch each poem to fit a standardized square, with each line fully justified. We then create a visualization for each distinct word showing the position of all instances of that word. For example, we find that "start" and "apart" appear almost always at the end of lines, and that "start" rarely occurs in the first line. The visualization allows scholars to gain an abstracted view of poetry without losing the poets' individual choices about word placement. This tool can help scholars generate and test theories about the interplay of rhyme, meter, syntax, and emphasis.
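The stretching step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a word's horizontal position is given by its index within a fully justified line (ignoring word widths), that line positions are spread evenly from the top to the bottom of a unit square, and the function names and toy lines are hypothetical.

```python
from collections import defaultdict


def stretch_poem(lines):
    """Map every word of one poem to an (x, y) position in the unit square.

    Assumption: position is based on word index, not on printed word width.
    """
    rows = [line.split() for line in lines if line.strip()]
    positions = []
    for row_index, words in enumerate(rows):
        # Vertical position: first line at 0.0, last line at 1.0.
        y = row_index / (len(rows) - 1) if len(rows) > 1 else 0.0
        for word_index, word in enumerate(words):
            # Full justification: first word at 0.0, last word at 1.0.
            x = word_index / (len(words) - 1) if len(words) > 1 else 0.0
            positions.append((word.strip('.,;:!?"\'').lower(), x, y))
    return positions


def positions_by_word(poems):
    """Collect the stretched positions of every distinct word across poems."""
    index = defaultdict(list)
    for lines in poems:
        for word, x, y in stretch_poem(lines):
            if word:
                index[word].append((x, y))
    return index


# Invented toy lines, used only to exercise the functions.
poems = [
    ["We two must start", "and live apart"],
    ["Hearts start", "then drift apart"],
]
index = positions_by_word(poems)
```

Plotting the points in `index["start"]` or `index["apart"]` on a square then gives one visualization per word; in this toy input both words sit at the right-hand edge (x = 1.0), that is, at line ends.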
Lavorel, MarieThe Living Archives Of Rwandan exiles And Genocide Survivors In Canada: Une Nouvelle Façon D’explorer, Sur Une Plateforme Numérique, Les Récits De Vies De Survivants De ViolenceThe Centre for Oral History and Digital Storytelling at Concordia University in Montreal, in collaboration with the Association des Parents et amis des victimes du génocide des Tutsis au Rwanda (Page-Rwanda), which represents survivors of the 1994 genocide now living in Montreal, decided in 2016 to create a digital platform for sharing and exploring 31 video interviews with survivors.

This research-creation project in the digital humanities (funded by the Social Sciences and Humanities Research Council of Canada, 2016-2019) is an opportunity not only to renew collaborative research in oral and digital history but also to develop new ways of using and transmitting video archives of life stories.

This “living archives” platform responds to a pressing need to develop new methods for accessing, sharing, visualizing, mapping, listening to and analysing the recorded stories of survivors of mass violence.
Eisazadeh, Negin (1,2);
Bordalejo, Barbara (1)
Digital Documentation of Abandoned Heritage. The Case of Château de NoisyUrban exploration or urbex is the exploration of human-made spaces that are generally inaccessible and hidden away from the general public. Recording the visit of these ‘forgotten’ spaces through photography is a main component of this phenomenon which has resulted in a wealth of urban exploration photos and videos of abandoned sites.

This research focuses on the documentation and information management of abandoned heritage sites and looks into the potential of the rich collection of existing digital urbex resources for their preservation by exploring their content and new means of representation and engagement. These photos and videos can shed light on these unknown places and, with the right utilization, can not only document and digitally preserve some aspects of this valuable heritage but also bring public attention to heritage sites that may still be saved from deterioration and revived.
Larrousse, Nicolas;
Marchand, Joel
A Techno-Human Mesh for Humanities in France: Dealing with preservation complexityNowadays, as the use of digital data for research in Humanities has become the norm, researchers are dealing with a huge amount of data. As a consequence, the risk of data loss is increasing. Another difficulty is to provide full access to this flood of data to users often located in distant areas. These problems can no longer be addressed individually by researchers or even at a laboratory level: it is therefore necessary to use a technical infrastructure with specific skills to provide stable preservation services.

This paper will present the implementation of a preservation system in France, branded “Huma-Num-Box”, which aims to address these challenges. This solution is proposed by Huma-Num, the French national infrastructure dedicated to Digital Humanities.
Andrews, Tara Lee;
Safaryan, Anahit;
Atayan, Tatevik
Continuous Integration Systems for Critical Edition: The Chronicle of Matthew of EdessaWe present here a project to prepare the digital critical edition of the Armenian-language Chronicle of Matthew of Edessa, which is due to finish its first stage in April 2019.

One of the central features of our project was to adopt a continuous integration (CI) system in order to manage the work across these stages in a sensible manner. The primary challenge we then had to overcome was the need to ensure that the data was cleanly maintained from beginning to end, as the nature of CI design does not allow for modifications in the middle of the pipeline. While our implementation of the CI pipeline does not carry us all the way to the finished edition, we believe that this would be a very desirable future direction.
Lorenzini, Matteo;
Rospocher, Marco;
Tonelli, Sara
Computer Assisted Curation of Digital Cultural Heritage RepositoriesThe objective of metadata curatorship is to ensure that users can effectively and efficiently access objects of interest from a repository, digital library, catalogue, etc. using well-assigned metadata values aligned with an appropriately chosen schema. However, we often face problems related to the low quality of metadata used for the description of digital resources, for example wrong definitions, inconsistencies, or resources with incomplete descriptions. There may be many reasons for this, all completely valid: in many cases, for example, those who host a digital repository have few human resources to work on improving metadata, and often data providers are not themselves the metadata creators.

Taking as reference the framework developed by Bruce and Hillmann (2004), in this paper we present our ongoing work, which aims at defining computable metrics to assess metadata quality and automate the metadata quality check process.
Van Galen, Quintus (1);
Hall, Mark (2);
Nicholson, Bob (1)
Durchdruck im Fokus: Visualising the Spatiality of Articles in Historical NewspapersThis paper presents a tool for visualising the positioning of historical newspaper articles within their original source. It extracts the article bounding box from the metadata, normalises the coordinates of said bounding box, and plots these on a heatmap. This tool allows researchers to investigate the complex context of the source material on which they rely, by giving them an understanding of the editorial context in which the article appeared.
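The three steps named in this abstract (extract the bounding box, normalise it, plot it on a heatmap) can be sketched roughly as follows. This is an illustrative sketch, not the tool itself: the box format (left, top, right, bottom in pixels), the grid size, the function names and the sample values are all assumptions, and the heatmap is represented as a plain grid of counts rather than a rendered image.

```python
def normalise_box(box, page_width, page_height):
    """Scale a pixel box (left, top, right, bottom) to the 0..1 range."""
    left, top, right, bottom = box
    return (left / page_width, top / page_height,
            right / page_width, bottom / page_height)


def add_to_heatmap(grid, box):
    """Increment every grid cell that the normalised box reaches.

    Simplification: an edge lying exactly on a cell boundary is also
    counted in the following cell.
    """
    rows, cols = len(grid), len(grid[0])
    left, top, right, bottom = box
    first_col = min(int(left * cols), cols - 1)
    last_col = min(int(right * cols), cols - 1)
    first_row = min(int(top * rows), rows - 1)
    last_row = min(int(bottom * rows), rows - 1)
    for r in range(first_row, last_row + 1):
        for c in range(first_col, last_col + 1):
            grid[r][c] += 1


def build_heatmap(articles, rows=4, cols=4):
    """Accumulate article boxes from pages of any size into one grid."""
    grid = [[0] * cols for _ in range(rows)]
    for box, (width, height) in articles:
        add_to_heatmap(grid, normalise_box(box, width, height))
    return grid


# Hypothetical articles: (pixel box, (page width, page height)).
articles = [
    ((0, 0, 400, 400), (2000, 2000)),        # top-left corner
    ((0, 0, 240, 320), (1200, 1600)),        # top-left, different scan size
    ((1600, 1600, 1900, 1900), (2000, 2000)),  # bottom-right corner
]
heatmap = build_heatmap(articles)
```

Normalising before counting is what lets pages scanned at different sizes contribute to the same grid; the resulting counts could then be passed to any plotting library for display.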
Impett, Leonardo LaurenceEarly Modern Computer VisionComputer vision necessarily embodies a theory of vision (primarily a neuroscientific one): conversely, important discoveries in the theory of vision have come from computer vision algorithms. This paper describes a project, Early Modern Computer Vision, which therefore attempts to prototype a computer vision (that is to say, a way for machines to read images) which is based on Italian theories of optics, vision and visual art of the 16th century, as an experimental apparatus to investigate those theories. I present a passage by Michael Baxandall in which he suggests something similar in the 1990s (though he didn't attempt technical implementation), and sketch an initial prototype for an Early Modern Computer Vision: a digital colour-space based on Giovanni Paolo Lomazzo's Temple of Painting (1590).
Alshanqiti, Ahmed MohammedWaqf Libraries And The Digital AgeThis short paper arises from my ongoing PhD thesis, which aims to introduce and study the concept of a Digital Waqf Library. The paper argues that the current rules and guidelines of the concept of Waqf need to be reviewed and updated in order to adapt to the digital age and new digital innovations.
Sowerby, Zachary DavidEncoding Ancient Greek MusicThe goal of this project is to create a digital framework by which to study the entire surviving corpus of the music of Ancient Greece. Surviving examples of the notation, although scattered and quite fragmentary, provide a rich picture of ancient song, with information concerning lyric, pitch, rhythm, meter, section, dynamics, and instrumentation. This open-source project uses aligned digital diplomatic editions of the surviving notated music sources to perform computational analysis on the different aspects of this complex corpus to further the study of this once-lost tradition.

Existing initiatives for encoding music are not designed to handle either the ancient notation or the competing music theories written by ancient musicologists. The music encoding system developed specially for this project was designed both to create a one-to-one semantic correspondence with the source material and to be machine-readable. The code is designed to coordinate and manipulate the data for analysis.
Cummings, James;
Kirkley, Laura;
Sousa Garcia, Tiago;
Turner, Mark
New Approaches to Women’s Writing Virtual Research EnvironmentThe New Approaches to Women’s Writing (NEWW) Network brings together scholars from across the globe to research women writers’ transnational collaborations and reception histories from the early modern era to the twentieth century. The aim is not only to recuperate national histories of women’s writing but also to establish how feminist ideas were disseminated as texts crossed national and linguistic borders. This short paper seeks to introduce the NEWW network and its pilot virtual research environment as it seeks to develop this further.
Wendell, Augustus;
Ozludil, Burcak
Agent-Based Modeling in Art History: Simulating an Insane AsylumThis short paper reports on the development of a framework and system that incorporates agent-based modeling (ABM) in art/architecture historical research and scholarship. ABM is a computational process simulating agents and their behaviors; the relationships between agents; and the interaction between agents and their environments. In our prototype of the Istanbul Toptasi Insane Asylum (functioned 1876-1924), we model medical and daily routines of the asylum inhabitants. This setting presents both a computational and philosophical challenge considering that agents are typically assumed to be active, autonomous individuals with decision-making capability in a non-restricted environment. In contrast, the asylum is a highly regulated environment with unpredictable (and arguably irrational) agents. The asylum presents a productive case study of applying ABM to art/architectural history as the movement of agents provides insight into the functioning of a nineteenth century imperial medical facility.
van den Heuvel, Henk (1);
Draxler, Christoph (2);
van Hessen, Arjan (3);
Corti, Louise (4);
Scagliola, Stefania (5);
Calamai, Silvia (6);
Karouche, Norah (7)
A Transcription Portal for Oral History Research and Beyond
Over the past two years, a number of researchers from various backgrounds have been working on exploiting digital techniques and tools for working with oral history (OH) data. The Transcription Chain (TC) can be considered a chain of distinct software tools that ingest audio and/or video documents and output time-stamped transcriptions (TT). In this proposal, we see a TC as a set of web-based tools running on one or more servers on the internet. A TC typically uses different tools that run on different servers in different countries. The TC was implemented as an OH transcription portal by developers of the Bavarian Archive for Speech Signals (BAS) in Munich. In this contribution we address the implementation of the portal (and its URL), the first experiences as reported in a follow-up CLARIN workshop in Munich, and our future plans for the portal.
Hughes, Lorna (1);
Benardou, Agiatis (2)
“The Ties That Bind”: The Creation, Use, And Sustainability Of Community Generated Histories
The use of digital content, tools and methods allows new insights into historical research, through enriched engagement with primary sources via digitization and datafication, advanced approaches to data analysis and visualization, and immersive approaches. In response to these digital opportunities, research on the First World War has seen a digital 'big bang': the period from 1914-19 has greater digital coverage than any other historical period.

This paper will address the theme of community generated content (CGC) in the context of digital First World War initiatives across Europe, and explore the value and digital legacy of community generated content that is, methodologically, of significance to broader issues around using and sustaining digital histories. It will discuss the sustainability and use of CGC in historiography, and as a disruption in the research life cycle.
Larrousse, Nicolas (1);
Jacobs, Christophe (2);
Jacobson, Michel (1);
Kagan, Gilles (1);
Marchand, Joel (1);
Masset, Cyril (1)
“Un Manuscrit Naturellement” Rescuing a library buried in digital sand
This long story about the preservation of human thought began during the Middle Ages, with the creation of manuscripts by copyist monks.

An agreement was signed with the Ministry of Culture and IRHT to digitize all the manuscripts stored in French public libraries. In 2018, this corpus is among the most important digitized medieval sources, representing more than 6000 manuscripts, and it is still a work in progress!

The fantasy of digital immortality is widely shared, but in reality, digital resources are highly fragile. In short, over many years, we have built a very safe and costly digital necropolis progressively covered by layers of digital sand rather than a clean, organized library. This paper will present the successive operations carried out during the preservation project for this very valuable collection of manuscripts.
Schwartz, Michelle (1);
Crompton, Constance (2)
Where Our Responsibilities Lie: People, Method, and Digital Cultural History
As cultural historians, should our responsibility be to the people in our historical data set or to our methodology? Is it possible to, as they say, have it both ways, and if so, what do Digital Humanities methods offer us as we seek to responsibly represent political history? In digitizing and digitally remixing primary source data, should we value consistency in data collection or value recovering information that the original methodology could not capture? We plan to report on the data collecting practices of a TEI-based Canadian history project. In this short presentation, we will report back on the findings of this research into methodological best practice and will demonstrate the affordances of the alpha version of our public history site.
Schwartz, Daniel L.
Syriac Persons, Events, and Relations: A Linked Open Factoid-based Prosopography
This paper explores the development of a prosopographical database for the field of Syriac studies called SPEAR: Syriac Persons, Events, and Relations. Syriac is a dialect of Aramaic that was used in the Near East between the 3rd and 8th centuries and continues to be used liturgically by Christians in the Middle East and India as well as expatriate communities in Europe and North America. This project employs a factoid-based approach to prosopography. Where most factoid-based prosopographies organize data in a relational database, SPEAR encodes prosopographical data from primary source texts in TEI XML using a customized schema designed to facilitate linking this prosopographical data to other linked data resources and serializing it into RDF. SPEAR shows how a prosopography project can employ TEI, field-specific scholarly standards, and Linked Open Data to produce a highly structured and semantically rich database that maintains close ties to the texts from which it is derived.
Camps, Jean-Baptiste (2);
Ing, Lucence (2);
Spadini, Elena (1)
Collating Medieval Vernacular Texts: Aligning Witnesses, Classifying Variants
This paper presents the first results of ongoing research on the automatic categorization of variants within automatic collation. Performing the normalization phase using NLP tools, instead of doing it manually, not only speeds up the task, but also allows the identification of fine-grained categories.

The case studies show the strong and weak points of this proposal and of the different technical solutions for its implementation, in particular with regard to NLP tools.

This study also makes it possible to compare the alignment results obtained using no normalization, manual normalization, automatic normalization, and fuzzy match parameters.

Finally, this research forces us to reflect upon the importance of having software components which are open and modular, in order to modify and improve them and to include them in computational pipelines.
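The comparison between alignment with and without normalization can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses Python's standard `difflib` as a stand-in for a collation engine, and the `NORMAL_FORMS` table, the `normalize` and `aligned_pairs` helpers, and the two witness readings are all invented for illustration (in practice an NLP tool such as a lemmatizer would supply the normalized forms).

```python
from difflib import SequenceMatcher

# Hypothetical normalization table standing in for the output of an NLP tool;
# the spellings below are invented for illustration.
NORMAL_FORMS = {"roys": "roi", "rois": "roi", "chevaliers": "chevalier"}


def normalize(token):
    """Map a token to its (assumed) normalized form, lowercasing as a fallback."""
    return NORMAL_FORMS.get(token.lower(), token.lower())


def aligned_pairs(witness_a, witness_b, key=lambda t: t):
    """Return the (token_a, token_b) pairs matched when tokens are compared through `key`."""
    a = [key(t) for t in witness_a]
    b = [key(t) for t in witness_b]
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    pairs = []
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            pairs.append((witness_a[block.a + offset], witness_b[block.b + offset]))
    return pairs


witness_a = "li roys vit le chevalier".split()
witness_b = "li rois vit le chevaliers".split()

raw_pairs = aligned_pairs(witness_a, witness_b)
normalized_pairs = aligned_pairs(witness_a, witness_b, key=normalize)

# Pairs aligned only thanks to normalization are candidate variants to classify.
variants = [(x, y) for x, y in normalized_pairs if x != y]
```

Without normalization only the identical tokens align; with it, the differing spellings are aligned as well and surface in `variants`, where a further step could assign them to categories.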
Logan, Peter M (1);
Greenberg, Jane (2);
Grabus, Samantha (2)
Knowledge Representation: Old, New, and Automated Indexing
Historical documents are probably the most common source material for digital humanities projects. And yet many of these projects do not adopt controlled terminologies to represent their content and aid search, discoverability, and use.

Automated indexing programs exist and may represent a solution to this problem, but they also raise other questions: when generating subject metadata terms for historical documents, should you use current controlled vocabularies, like the _Library of Congress Subject Headings_? Or will you get different results by using an older controlled vocabulary from the same time period as the documents?

Our presentation describes the results of our experiments comparing the output of current and historical vocabularies to automatically index historical documents, and discusses our findings.
Adelmann, Benedikt (1);
Andresen, Melanie (1);
Begerow, Anke (2);
Franken, Lina (1);
Gius, Evelyn (1);
Vauth, Michael (3)
Evaluation of a Semantic Field-Based Approach to Identifying Text Sections about Specific Topics
With the increasing availability of large corpora, humanist scholars gain opportunities to choose their material in a more data-driven way. How can we identify texts or text sections relevant to our research question if we abandon prior knowledge as a determining factor? In this paper, we explore the potential of semantic fields for finding text sections about a topic of interest. Additionally, we want to address the major issue of evaluating a task involving a great deal of interpretation.
Seydi, Masoumeh (1);
Romanov, Maxim (2)
Al-Ṯurayyā, the Gazetteer and the Geospatial Model of the Early Islamic World
From a historian's perspective, information about places whose locations cannot be determined with certainty, alternative names for places, the evolution of names over time, and the specific historical contexts in which names were used is of great importance. Places of cultural meaning or administrative units meet the needs of historians better than the physiographic landforms on which many existing digital gazetteers and data models focus. Al-Ṯurayyā provides an extensive gazetteer of the early Islamic Empire with over 2,000 toponyms and almost as many route sections from Georgette Cornu’s Atlas, where the primary attributes of collected objects are their geographical coordinates and their place in the Empire’s administrative hierarchy. Beyond the gazetteer, al-Ṯurayyā implements a spatial model that visualizes settlements, routes, itineraries, regions, and networks; additionally, it can perform specific queries meant to help analyze specific historical events and phenomena through the resulting visualizations.
Houston, Natalie M
An Evaluation of Rhyme Detection Using Historical Dictionaries
As part of a larger project in distant reading nineteenth-century British poetry, a method for detecting line-end rhymes was devised that utilizes rhyme dictionaries published in the eighteenth and nineteenth centuries. This method was proposed in order to account for historical debates about the definition of poetic rhymes in English as well as historical changes in pronunciation. This paper describes an evaluation of this approach that compares it to a method commonly used in computational analysis, which is based on the CMU Pronouncing Dictionary, in order to understand what significant differences occur.
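The idea of dictionary-based line-end rhyme detection can be sketched as follows. This is an illustrative assumption rather than the method evaluated in the paper: the `RHYME_CLASSES` entries, the helper names, and the sample stanza are all invented, and a real historical rhyme dictionary would be far larger and organized differently.

```python
import re

# Hypothetical excerpt of a historical rhyme dictionary: each word maps to the
# rhyme class it is listed under. The entries are invented for illustration.
RHYME_CLASSES = {
    "love": "OVE", "prove": "OVE", "move": "OVE",
    "day": "AY", "away": "AY",
    "wind": "IND", "mind": "IND",
}


def last_word(line):
    """Return the final word of a verse line, lowercased, or None if there is none."""
    words = re.findall(r"[a-z']+", line.lower())
    return words[-1] if words else None


def rhyme_scheme(lines):
    """Label each line with a letter; lines in the same rhyme class share a letter.

    Lines whose end word is missing from the dictionary are labelled '?'.
    """
    labels = {}
    scheme = []
    for line in lines:
        rhyme_class = RHYME_CLASSES.get(last_word(line))
        if rhyme_class is None:
            scheme.append("?")
            continue
        if rhyme_class not in labels:
            labels[rhyme_class] = chr(ord("a") + len(labels))
        scheme.append(labels[rhyme_class])
    return "".join(scheme)


# Invented stanza for demonstration purposes.
stanza = [
    "She would not stay to hear my love,",
    "Nor wait for what the years might prove;",
    "The night came on and closed the day,",
    "And all her footsteps died away.",
]
scheme = rhyme_scheme(stanza)
```

Here "love" and "prove" count as a rhyme because the dictionary lists them together, whereas a method based on modern pronunciation would be expected to treat them differently; that kind of divergence is what such an evaluation looks for.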
Mol, Angus A. A.
Gaming Genres: Using Crowd-Sourced Tags to Explore Family Resemblances in Steam Games.
As is the case in the production and consumption of other forms of (entertainment) media, a video game’s genre classification is a topic of much debate among creators, critics, and consumers. These language games are not new: Wittgenstein already used the complex classification of games to discuss generalities in language in his Philosophical Investigations. This paper will provide the results of an ongoing project that puts Wittgenstein’s concept of game families into practice as a way to explore the complexity of genre in this medium. To this end, data has been collected from the digital distribution platform Steam. A user-based tag recommender system is used to explore game families and genres through network community detection algorithms. To illustrate ongoing work, two case studies will be presented: one of games that are tagged as “historical” and another highlighting Steam’s 100 best-selling games.
Carbe', Emmanuela (1);
Giannelli, Nicola (2)
A Digital Platform for the “Latin Silk Road”: Issues and Perspectives in Building a Multilingual Corpus for Textual Analysis
This contribution briefly illustrates the project of a digital platform for a large corpus of Latin texts and documents from medieval and early modern times, its architecture and some of its early results: the preliminary goal is a digital library platform with freely searchable text and access to all metadata related to each resource. Further tools will be added to these preliminary, elementary ones in order to support more granular text analysis, including its semantic dimension. Front-end projects for authenticated users of the digital platform will also be proposed.
Zuanni, Chiara
Data in Museums: Digital Practices and Contemporary Heritage
This short paper focuses on new forms of curation emerging in museums in relation to digital data. It explores the acquisition, collection, management, use, and preservation of born-digital data in museums, while discussing how these practices affect museum notions of digital heritage.

The paper will first highlight recent examples of collecting and exhibiting strategies targeting digital data, and social media data in particular; secondly, it will discuss how user-generated content adds new layers to the online lives of museum collections and can therefore be included in object biographies; thirdly, it will focus on the curatorial challenges that the inclusion of digital data within collection management systems poses to existing data models and vocabularies. In doing so, the paper will consider issues of digital preservation, ethics, and heritagisation, cutting across the fields of digital humanities, museology, and critical heritage studies.
LAMÉ, Marion (1,2);
PONCHIO, Federico (3);
PITTET, Perrine (4);
MARLET, Olivier (1)
OpenTermAlign: A Web Interface for Aligning Heterogeneous Archaeological Vocabularies
The terminology alignment tools available to archaeologists and their collaborators in the chain of production and online publication of scientific knowledge support different ways of working. It is thus possible to align a source terminology to a given target terminology (BBTalk to the Backbones metathesaurus), to terminologies with homogeneous structures (e.g. between thesauri, with the VISTA tool), or to different structures (e.g. 3M, from an XML structure to the formal ontology of the CIDOC-CRM). For aligning terminologies that are loosely structured or unstructured, that lack a standardized organization of knowledge (e.g. clusters of keywords), and that are produced according to archaeologists' working processes, which require a degree of linguistic freedom, there is to date no intermediate tool that offers the same functionality and compatibility with these tools while supporting, in a simple way, the alignment process and the dialogue between the different actors (archaeologists and knowledge organization experts, computer scientists).
Wu, Shang-Yun (1,2);
Wu, Cheng-Han (1);
Pai, Pi-Ling (3);
Wang, Yu-Chun (5);
Tsai, Richard Tzong-Han (1,3);
Fan, I-Chun (3,4)
Climate Event Classification Based on Historical Meteorological Records and Its Presentation on A Spatio-Temporal Research Platform
To trace the occurrence and impact of climate disasters, many clues can be found in the rich records left by historical materials. "China's Three Thousand Years of Meteorological Records" extracts meteorological descriptions from 8,228 historical sources and organizes these descriptions by regions and dates. The "East Asian Historical Climate Database" is compiled based on chorographies and official histories. This study develops an event classification method based on the meteorological records of the early Qing Dynasty in this database. By representing classical Chinese texts as word embedding vectors and clustering them with the k-means algorithm, we overcome the difficulty of analyzing classical Chinese and of not having enough training data. We then integrate the classification results with the map and timeline to develop a spatio-temporal search interface, which helps climatologists access and analyze data according to the three dimensions of time, area and event categories.
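As a rough illustration of clustering records by their embedding vectors, the sketch below runs a plain k-means over toy data. Everything in it is assumed for demonstration: the two-dimensional `vectors` and the English `descriptions` are invented stand-ins for embeddings of classical Chinese records, the `kmeans` helper is a textbook implementation, and the study's actual embedding model and parameters are not stated in the abstract.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors seed the centroids so the run is deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


# Invented record descriptions with made-up 2-dimensional "embeddings".
descriptions = [
    "heavy rain, the river flooded",
    "no rain for months",
    "flood destroyed the fields",
    "severe drought, crops failed",
]
vectors = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]

clusters = kmeans(vectors, k=2)
events = dict(zip(descriptions, clusters))
```

Because k-means is unsupervised, it needs no labelled training data; the resulting cluster ids still have to be interpreted and named (for example as flood or drought events) by a researcher.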
Stell, John
Qualitative Space of Poetry
We report on experiences using qualitative spatial representation (a technique from artificial intelligence) to represent spatial form in poetry. This allows the two-dimensional structure of text to be represented computationally by describing spatial relations between lines and between blocks of text.
Higgins, Devin;
Calvert, Scout;
Nicholson, Shawn
Disciplinary Topologies: Using dissertations to map deviant interdisciplines
With this proposal we explore the question: How can we characterize disciplines by looking at the discursive flows between scholars in university departments, and thus describe interdisciplinarity amid shifting topologies of knowledge? Taking as a provocation the premise that “every field of knowledge is the centre of all knowledge” (Frye 10), we explore paths of connectedness between disciplines, as constructed from a dataset of approximately 5,000 theses and dissertations (ETDs), in order to elucidate the boundaries, shapes, and concentrations of disciplinary knowledge in the making.
Casties, Robert
A Database of Islamic Scientific Manuscripts: Challenges of Past and Future
I will present the database of the Islamic Scientific Manuscript Initiative (ISMI) which aims to make accessible information on all Islamic manuscripts in the exact sciences (astronomy, mathematics, optics, mathematical geography, and related disciplines), whether in Arabic, Persian, Turkish, or other languages from the 9th to the 19th century.

The first version of the database was built in 2006 using a flexible graph-like data model that developed and expanded over time.

The database and its web presentation are now being migrated to new standard tools like a Drupal web frontend, a CIDOC-CRM based data model and a ResearchSpace based backend.

The new Drupal frontend is already online offering access to more than 6900 witnesses of 2300 texts and an experimental area with access to the graph database.

The ISMI project aims to be a continuing and growing resource in the future and we invite all interested to participate.
Nijboer, Harm (1);
Brouwer, Judith (1);
Bok, Marten Jan (2)
Unthinking Rubens and Rembrandt: Counterfactual Analysis and Digital Art History
In this paper we analyze the centrality of Rubens and Rembrandt in their artistic communities. We argue that counterfactual analysis is key to understanding their roles in the art worlds of Amsterdam and Antwerp in the seventeenth century.
Akdag Salah, Alkim Almila (1);
Ocak, Meral (2);
Kaya, Heysem (3);
Kavcar, Evrim (4);
Salah, Albert Ali (1)
Hidden in a Breath: Tracing the Breathing Patterns of Survivors of Traumatic Events
Many people experience a traumatic event during their lifetime. In some extraordinary situations, such as natural disasters, war, massacres, terrorism or mass migration, the traumatic event is shared by a community and the effects go beyond those directly affected. Today, thanks to recorded interviews and testimonials, many archives and collections exist that are open to researchers in trauma studies and Holocaust studies, and to historians, among others. These archives act as vital testimonials for oral history, politics and human rights. As such, they are usually either transcribed or meticulously indexed. In this project, we look at the nonverbal signals emitted by victims of various traumatic events and seek to render these as novel representations that are capable of representing the trauma without the explicit (and often highly politicized) content. In particular, we propose to detect breathing and silence patterns during the speeches of trauma patients for visualization and sonification.
Pytlowany, Anna
A European-Hindustani Dictionary? Reflections on Methods
This presentation is the first report on the project “Hindi Lexicography and the Cosmopolitan Cultural Encounter between Europe and India around 1700” from Uppsala University (UU). The primary goal of the project is to produce an online dictionary (Latin-Hindustani-French) on the basis of the unpublished 'Thesaurus Linguae Indianae' by François-Marie de Tours. The shortcomings of the Uppsala project will guide the design of an extended cross-linked online dictionary of early modern Hindustani based on little-known wordlists and vocabularies compiled by European merchants and missionaries in 17th-century India. The novelty of the approach resides in combining multilingual sources describing a foreign language to create a ‘pan-European perspective’, which may offer new comparative insights for the historical linguistics of target languages. If successful, this approach can be applied to other early modern vocabularies constituting unique and valuable descriptions of non-European languages.
McKee, Sarah E.
DH And The Evolving Monograph
This presentation will begin with a brief overview of the history and theoretical underpinnings of the Digital Publishing in the Humanities initiative at Emory University. This four-year experiment seeks to find best practices for supporting faculty in the development of digital monographs, including securing funding and collaborating with publishers to create works that extend beyond the form of a traditional book. The focus will then shift to several case studies of digital monographs currently under development by Emory faculty.
Sanders Garcia, Ashley
From the Margins to the Center: A Method to Mine and Model Complex Relational Data from French Language Historical Texts
This project employs the spaCy Python library to build an information extraction system to mine personal relational data in French-language sources. As a test corpus, it uses four digitized, OCRed, and hand-cleaned nineteenth-century French chronicles of Ottoman Algerian history in order to model socio-political networks and uncover the positions and roles of women in this society. The challenge is to extract not only named entities and their relations to one another, but also unnamed persons and their relationships. Those who remain unnamed are most often women, servants, slaves, and Indigenous people – the very people about whom scholars are most anxious to know more. This short presentation will share the complete information extraction code, its accuracy, the resulting visualizations, a brief analysis from the case study, and additional use cases that extend far beyond the initial case study to other languages and textual sources.
Peroni, Silvio
The Open Citations Movement
Purpose: This article introduces the benefits of releasing a huge set of open citation data as public domain material.

Findings: The open citations movement has received extensive media coverage since the launch of the I4OC, and several projects and datasets have been released so far to leverage the open citation data available online.

Implications: The open citation data available is still far from being competitive with well-known proprietary citation databases such as Scopus and Web of Science. However, several federated and interlinked open citation databases have recently been released and are accessible and interoperable with each other by means of Web technologies.

Value: Open citation data is a positive disruption in the world of scholarly communication, since it changes entirely how we approach science, its evolution, and all the related context, such as research assessment evaluations, science of science, bibliometrics, and future scientific discoveries.
Ijaz, Ali Zeeshan (1);
Roivainen, Hege (1);
Lahti, Leo (2)
Analytical Edition Detection In Bibliographic Metadata
Analytical bibliography's aim is to understand books and other printed objects as artifacts and how they were produced. Bibliographic metadata can represent important historical trends and resolve issues such as the ordering of editions.

In this paper, we present a state-of-the-art analytical approach for determining editions and their ordering. By providing harmonized data and information on historical developments in book production, this will be a great aid for projects aiming to do large-scale text mining. Contemporary text mining approaches do not utilize edition-level information to the fullest extent and are therefore limited in their scope.

Using the ESTC metadata, we have developed harmonizing techniques that convert free-form text into more coherent entries for statistical analysis. Furthermore, a new gold standard was developed for validation purposes, with multiple layers of information. The use of this data would significantly enhance the understanding of early modern publishing.
Sturgeon, Stephen
The Digital Humanities Certificate Option: What's At Stake?
This short paper will first trace the roots of the digital humanities certificate option as it is now most commonly conceived to Lisa Spiro’s 2010 post on 'Opening Up Digital Humanities Education', summarize how her ideas were then developed by scholars such as Lynne Siemens and Kara Kennedy in journal papers, and move on to examining how these ideas were put into different kinds of practice at institutions like Texas A&M University, the University of Nebraska-Lincoln, and the University of Virginia. This short paper will then turn to my own experience as the developer of a digital humanities workshop series who then participated in an attempt to standardize it for university accreditation as a certificate program, focusing on issues of labor equity, fair intellectual representation, the complexity that conversations about these can take on, and conditions for librarians and campus partners to agree to when developing curricula together.
Lavorel, Marie (1);
Bourgatte, Michael (2)
Annotating Audiovisual Content: The Story of a Collaboration between Montreal and Paris
A team led by Michael Bourgatte within the Department of Digital Humanities at the Institut Catholique de Paris has considered how to deploy an open-source annotation service that can be used by both faculty researchers and students. The result is Celluloid, a collaborative video annotation platform.

Under the coordination of researcher Marie Lavorel at Concordia University in Montreal, a digital living-archives platform dedicated to survivors of the Rwandan genocide living in Montreal is currently being developed. One of its aims is to create or adapt technological tools to navigate, annotate, visualize, and map a corpus of video interviews, which forms the core unit of the platform.

In this context, Michael Bourgatte and Marie Lavorel got in touch to explore the possibilities for adapting the Celluloid tool to the platform and to test the tool in different contexts of use, in both academic and community settings.
Steiner, Christian
Cooking Recipes of the Middle Ages: Corpus, Analysis, Visualization
Cooking traditions, whether they are regional or in a larger context, are one of the most distinguishable items of European culture and an important part of European identities. But how did they become what we know today? How did they develop and what were their influences?

Cooking recipes are culturally charged transient texts, which are best diachronically and spatially analyzed by strongly relying on digital humanities methods. In my presentation I will explain the core of our digital research strategy, the Semantic Web and the idea of Linked Open Data, and why we chose this focus. I will also talk about the possible analysis methods that will arise from the described workflow and the technical environment (GAMS) we are placing the project in.
Mandell, Laura C (1);
Tarpley, Bryan (1);
Brown, Susan (2);
Laiacona, Nicholas (3);
Moore, Shawn (4);
Pratt, Lynda (5)
Cutting the Gordian Knot: Sustaining Digital Scholarly Editions
After explaining the Advanced Research Consortium (ARC) and the ARCScholar Digital Publishing Cooperative, we describe a collaboration with Texas A&M University Libraries (TAMU-L) to develop a model for library acquisitions of digital scholarly editions at low cost to the library. We argue that ARC's work with TAMU-L offers a model for sustaining digital editions in perpetuity, and ultimately, creating ways for future scholars to discover editorial commentary via the semantic web.
Hladík, Radim (1,2);
Štechová, Markéta (3)
Semantics of Shame in Social Media Discussions of Reality TV Fans
We examine the discussions on the Facebook Page of the Czech Reality TV show Výměna manželek (Wife Swap). The commercial broadcaster TV Nova acquired the originally British program for the Czech market in 2005. In 2018, the show is in its 10th season and consistently ranks among the most popular prime-time programs. We intend to find out if the show’s viewers active on social media partake in the shaming of lower-class participants on the show. Specifically, we map the semantic space of “shame” in the comments associated with negative sentiment, interrogate the space for class-based content, and compare it with the alternatives. We find that the viewers engage in the shaming of reality show characters by affirming personal hygiene as the demarcation line between acceptable and unacceptable poverty.
Farrell, Jeremy
From Reductionism to Complexity: A Digital Corpus for Sufism
One of the great promises of digital corpora is the possibility of conducting rigorous empirical investigation of complex data sets related to the evolution of societies, including those in the pre-modern age. The realization of this ambition is subject, however, to several well-known historiographical and technical barriers. As for the first concern, critiques of narrative modes of history have called into question their reliability as a basis for conducting research into the complexities of large-scale societal change. With respect to technical challenges, the demands of harnessing the multitude of available data are rarely met by traditional database solutions. This paper proposes a solution to these dilemmas by compiling non-narrative data structures found in literature produced by members of early Sufism into a non-relational database in order to facilitate the computational analysis of the formation of this religious movement.
Casenave, Joana
Putting Information into Discourse and Reading Paths in the Digital Critical Edition
Traditional critical editions, printed on paper and grouped in specialized series, follow strict codes for structuring content. In the move to the digital medium, these conventions partly disappear. On screen, the task is therefore to take up the norms developed for print and adapt them to the particularities of the digital format. Editors who work on digital scholarly artifacts must, on the one hand, take up certain semiotic codes of the printed book and adapt them to the new format and, on the other hand, invent new codes specific to the digital environment. It is therefore particularly important to guide users in their understanding of the edition. But how do editors put into discourse the information they wish to present to readers? In this paper, we will focus on one aspect of this representation and discursive presentation of information: the development of specific reading paths.