CyberResearch on the Ancient Near East and Neighboring Regions (vol.1): Abstracts

Part 1: Archaeology

A Conceptual Framework for Archaeological Data Encoding

Sveta Matskevich and Ilan Sharon

Field recording systems are the vocabulary and syntax of descriptions and low-level abstractions by which we characterize often fleeting and unrepeatable primary archaeological observations. Recording is the basis of all high-level constructs of archaeological theory. However, recording methods and archaeological theory have had an uneven relationship through the history of the discipline. In the world of postmodern archaeology, we have had to abandon the search for the Holy Grail—a single methodology that all archaeologists will accept as “best,” despite geographical, chronological, and increasing theoretical rifts. Yet in an increasingly networked world, there is a need for recording systems that, if they cannot mend these rifts, are at least able to communicate across them. This paper presents a meta-model for field recording that conceptually suits all theoretical approaches and allows for the integration of data without losing its conceptual features. The case studies, drawn from the Tel Dor project (in Israel), show how the model integrates datasets from several excavation projects held on the site during the last hundred years, and how it handles complex post-processual archaeological concepts such as multivocality and uncertainty.

Keywords: archaeological recording systems, archaeological theory, conceptual modeling, low-range theory, Tel Dor

Landscape Archaeology and Artificial Intelligence: the Neural Hypersurface of the Mesopotamian Urban Revolution

Marco Ramazzotti, Paolo Massimo Buscema & Giulia Massini

Today, there is a constant debate over computer semiotics as a discipline aiming to establish the function of the logical operators of programming on the basis of structured and complex semantic units. Semiotic analyses centered on redefining the analytical object are also one of the main trends in computer science and, in particular, in the sector interested in encoding physical systems into connective artificial networks of nodes or cells. Thus, the so-called artificial adaptive systems are synthetic representations of the observed reality that must undergo interrogation processes or the most advanced analytical tools for learning and modeling complex data-set configurations. Given these basic coordinates, it seems clear that simulating the dynamic and complex behavior of the high variability of the natural and cultural factors in networks thus conceived equals tracking down, selecting, and separately recreating a wide variety of functions associating variables, a wide variety of inferences controlling their semantic structure, and an equally wide variety of causes producing their transformation. In this specific sense, the application of artificial intelligence models to the Mesopotamian Urban Revolution Landscape has value: it recreates a possible world of other associations of meaning from the body of incomplete sources and scattered information, exhibits the nuances and complex interrelationships, and, furthermore, helps the researcher to codify other, unforeseen—or even hidden—interrelationships.

Keywords: Artificial Adaptive Systems (AAS), Artificial Intelligence (AI), landscape archaeology, Mesopotamian Urban Revolution Landscape (MURL), Ubaid, Uruk

Part 2: Objects

Data Description and the Integrated Study of Ancient Near Eastern Works of Art: The Potential of Cylinder Seals

Alessandro di Ludovico

The problem of documenting and systematically describing the material witnesses of the artistic cultures of the ancient Near East is an especially urgent one, considering the increasingly endangered nature of the heritage in Western Asiatic regions. As shown in experiments carried out in the 1960s and 1970s, there are many factors that make the task a very challenging (if not almost impossible) one, even if one focuses exclusively on specific fields. Despite this, some sub-categories of Western Asiatic artistic products could be represented and described in open archives in a formal way that would be sufficient for the largest part of the investigations and needs of scholars. At the core of this contribution is a discussion of the exemplary potential of cylinder seals in the field of archival encoding. Textual encodings, on the one hand, and different kinds of presence/absence encodings, on the other, can be used, as will be shown here, to develop descriptions, classifications, interpretations, and comparisons concerning the representations depicted in cylinder seal intaglios and their impressions. The use of such encodings on specific glyptic categories, such as “presentation scenes,” will serve as a concrete example through experiments in statistical analyses (specifically, correspondence analysis) carried out using the software package SPAD 5.5. In these experiments, the encoding strategies were determined before the data was processed. They therefore provided a starting point for the systematic description and representation of both the basic data (the artifacts) and the outcomes.

Keywords: archives and open data, cylinder seals, formally encoding figurative languages, presentation scenes, Ur III period
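
To make the idea of a presence/absence encoding concrete, the following is a minimal sketch in Python. The motif inventory, the seal names, and their contents are invented for illustration and are not taken from the chapter. The comparison uses a simple Jaccard coefficient as a stand-in; it does not implement correspondence analysis or reproduce anything done in SPAD.

```python
# Hypothetical motif inventory; a real one would come from the glyptic corpus.
MOTIFS = ["seated_figure", "interceding_goddess", "worshipper", "crescent", "inscription"]

# Hypothetical seal descriptions, invented for illustration only.
seals = {
    "seal_A": {"seated_figure", "worshipper", "inscription"},
    "seal_B": {"seated_figure", "interceding_goddess", "worshipper", "crescent"},
    "seal_C": {"seated_figure", "worshipper", "crescent"},
}


def encode(motifs_present, inventory=MOTIFS):
    """Return a 0/1 vector marking which inventory motifs occur on a seal."""
    return [1 if motif in motifs_present else 0 for motif in inventory]


def jaccard(a, b):
    """Motifs shared by both seals divided by motifs present on either seal."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


# One row per seal, one column per motif: the presence/absence matrix.
matrix = {name: encode(present) for name, present in seals.items()}
```

A matrix of this shape is the kind of input that a multivariate technique could take; fixing the inventory before encoding mirrors the abstract's point that the encoding strategy is decided before the data is processed.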

A Quantitative Method for the Creation of Typologies for Qualitatively Described Objects

Shannon Martino and Matthew Martino

Typologies are fraught with subjectivity; one attribute is considered relevant, while another may go unnoticed entirely. The typical typology, however, is created through a process inspired by leaps of intuition. In the interest of omitting the inherent subjectivity from this process, archaeologists sometimes use statistical methods, yet such methods were originally created for the analysis of quantitative data, rather than for the qualitative data that archaeologists usually use to characterize objects. Here we will present a new computer program developed specifically to produce typologies. We will show how it can be applied to both figurines and ceramics using the findings from the Early Bronze Age site of Demircihüyük in northwestern Turkey. This method was not only time-saving, given the large datasets, but it also helped to assuage concerns of subjectivity, illuminate areas where the data was incomplete, and highlight the diversity of certain production aspects. We will explain the necessity of a collaborative process between programmer and archaeologist and show how our program can quantify some of the remaining subjectivity of a typological analysis.

Keywords: archaeology, cluster analysis, figurines, multivariate analysis, pottery, typology

Part 3: Texts

A Qualitative Approach Using Digital Analyses for the Study of Action in Narrative Texts: KTU 1.1-6 from the Scribe ʾIlimilku of Ugarit as a Case Study

Vanessa Bigot Juloux

What shared data within a narrative corpus can we use to analyze the actions and agency of an animated entity (AE)? Can the same data give us insight into an author’s intentions? And how should this data be encoded for digital analysis to show relevant criteria? Based on Donald Davidson’s philosophy of action, Gertrude Anscombe’s concept of intention, and Gricean pragmatics, I set up analytical taxonomies to reveal significant evidence (hitherto untapped due to a lack of shared analytical variables among scholars) for Ugaritic narrative studies. These taxonomies are divided into three main groups: (1) primary data: verb and AE; (2) objective variables: semantic verbal categories, AE, spheres (inside/outside a spatial delimitation), roles, and contexts (e.g., battle, assembly); and (3) subjective variables: types and levels of emotions, types of intentionality, levels of desire, and the consequences of action. All the data is encoded for digital analysis. I follow a seven-step process for the interpretation of the transcription; each step takes into account variables from these analytical taxonomies. This qualitative method is used to sort relevant information into categorical data, so that it can be used for further data manipulation (text mining and refining methods for quantification). This will allow me to set up a new kind of hermeneutics: namely, the hermeneutics of action for ethno-anthropological purposes.

Keywords: analytical taxonomy, markup tagging, philosophy of action, pragmatics, qualitative method, TEI-XML, text mining, Ugaritic

Network Analysis for Reproducible Research on Large Administrative Cuneiform Corpora

Emilie Pagé-Perron

As network analysis slowly but surely gains a foothold in Assyriology, it opens the door to quantitative analysis in a field that has traditionally been almost exclusively qualitative. This chapter discusses how network analysis can provide scholars with an additional and innovative approach to the study of Mesopotamian social history, especially in the case of larger administrative corpora. In textual studies, network analysis is a digital methodology that focuses on the relationships among entities in the written record and enables these relationships to be studied at a larger scale than is normally feasible with traditional philological methods. Additionally, a fortunate side effect of using network analysis is the increased potential for the reproducibility of the research.

Keywords: cuneiform, digital cuneiform, disambiguation, Mesopotamia, Natural Language Processing (NLP), network analysis, research reproducibility
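
As a rough illustration of how relationships among entities can be turned into a network, the sketch below builds a co-occurrence graph from a handful of documents using only the Python standard library. The tablet identifiers and the names assigned to them are invented for illustration, and the sketch assumes that names have already been extracted and disambiguated, which in practice is a substantial task of its own. It is not the chapter's own pipeline.

```python
from collections import Counter
from itertools import combinations

# Hypothetical tablets and the personal names attested on each (invented data).
tablets = {
    "tablet_1": ["Ur-Nanna", "Lu-Utu", "Abba"],
    "tablet_2": ["Ur-Nanna", "Lu-Utu"],
    "tablet_3": ["Abba", "Dada"],
}


def cooccurrence_edges(docs):
    """Count how many documents mention each unordered pair of names."""
    edges = Counter()
    for names in docs.values():
        # sorted(set(...)) removes repeats and gives each pair a stable order.
        for a, b in combinations(sorted(set(names)), 2):
            edges[(a, b)] += 1
    return edges


def degree(edges):
    """Number of distinct neighbours for each name in the edge list."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {name: len(others) for name, others in neighbours.items()}


edges = cooccurrence_edges(tablets)
degrees = degree(edges)
```

Because the edge list is derived entirely from the input by a deterministic function, rerunning the script on the same data yields the same network, which is one modest sense in which such a workflow supports reproducibility.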

Semantic Domains in Akkadian Texts

Saana Svärd, Heidi Jauhiainen, Aleksi Sahala, and Krister Lindén

This paper examines the possibilities offered by language technology for analyzing semantic fields in Akkadian. Our research group used the existing electronic Open Richly Annotated Cuneiform Corpus (Oracc). In addition to more traditional Assyriological methods, this paper explores two technological methods: Pointwise Mutual Information (PMI) and Word2vec. The theoretical basis of our research lies in its emic approach. In lexical semantics, an emic approach is an endeavor to examine syntagmatic and paradigmatic relationships of words. We concentrated on three sample lexemes: sisû, “horse”; qabû, “to speak”; and danānu, “to be strong, powerful.” First, we built a suggestion for both syntagmatic and paradigmatic fields with the help of the Assyrian Dictionary of the Oriental Institute of Chicago, known as the Chicago Assyrian Dictionary (CAD). The results gleaned from analyzing the CAD were then compared to the results gleaned by PMI and Word2vec. These particular methods were chosen because PMI can be used to analyze the syntagmatic relationships, and Word2vec can be used to analyze the paradigmatic relationships of our test words. These first results of our research group offer some promise that quantitative data on the connections between individual lexemes can indeed help Assyriologists. At the very least, the detailed suggestions for semantic domains generated with the help of PMI and Word2vec enable Assyriologists to develop their philological work in a fruitful direction. However, automatically differentiating between paradigmatic and syntagmatic semantic fields will need more work.

Keywords: Akkadian, language technology, lexical semantics, linguistics, methodology, PMI, Word2vec
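
For readers unfamiliar with Pointwise Mutual Information, the sketch below shows the standard definition, PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ), computed over a toy corpus in Python. The corpus is invented: it reuses the three sample lexemes named in the abstract together with placeholder tokens ("w1", "w2"), and it treats each line as the co-occurrence window, which is an assumption made here for simplicity and not a description of the authors' settings. Word2vec is not covered.

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus (invented): each inner list stands for one line of lemmatized text.
lines = [
    ["sisû", "w1"],
    ["sisû", "w1", "qabû"],
    ["qabû", "w2"],
    ["danānu", "w2"],
]


def pmi_table(corpus):
    """PMI in bits for every pair of words that share at least one line."""
    n = len(corpus)
    word_counts = Counter()
    pair_counts = Counter()
    for line in corpus:
        types = sorted(set(line))  # count each word once per line
        word_counts.update(types)
        pair_counts.update(combinations(types, 2))
    table = {}
    for (a, b), joint in pair_counts.items():
        p_joint = joint / n
        p_a = word_counts[a] / n
        p_b = word_counts[b] / n
        table[(a, b)] = math.log2(p_joint / (p_a * p_b))
    return table


def pmi(table, a, b):
    """Look up a pair regardless of argument order; None if never co-occurring."""
    return table.get(tuple(sorted((a, b))))


table = pmi_table(lines)
```

In this toy corpus, pmi(table, "sisû", "w1") is positive because the two tokens always appear together, whereas pmi(table, "sisû", "qabû") is zero because they co-occur exactly as often as chance would predict. Pairs that never share a line are absent from the table, so no logarithm of zero is taken.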

Using Quantitative Methods for Measuring Inter-Textual Relations in Cuneiform

M. Willis Monroe

This project employs known methods used in the digital humanities and text processing to look for patterns of associations in a corpus of cuneiform astrological documents. The material is fragmentary, damaged, and difficult to read and analyze with traditional methods. Digital tools allow the corpus to be quickly processed and analyzed in furtherance of a larger research goal. The article serves as an introduction to these methods and as a discussion of their application to a non-traditional text corpus.

Keywords: astrology, cluster analysis, cuneiform, text analysis, zodiac

Part 4: Online Publishing, Digital Archiving, and Preservation

On the Problem of the Epigraphic Interoperability of Digitized Texts of the Mediterranean and Near Eastern Regions in First Millennium BCE

Doğu Kaan Eraslan

This paper focuses on the integration of the encoding schemes for digitizing first-millennium BCE texts from the Mediterranean and Near Eastern regions. It first gives a brief survey of the encoding schemes used to document these texts and identifies a key problem, from a computational perspective, i.e., the lack of interoperability of the encoding schemes. The reasons behind this problem are discussed, and a solution is proposed: using vector spaces for representing signs in order to express them in a common unit. Possible tools to achieve this solution and how these tools can be adapted to ongoing projects are also discussed. The project used as a case study concerns the quadrilingual vase of Darius.

Keywords: C-ATF, CAL Code, digital epigraphy, EpiDoc, Manuel de Codage, vector spaces

Digital Philology in the Ras Shamra Tablet Inventory Project: Text Curation through Computational Intelligence

Miller C. Prosser

The Ras Shamra Tablet Inventory (RSTI) is a research project co-directed by Miller C. Prosser and Dennis Pardee. A primary goal of the project is to create reliable digital editions of the texts in the Ras Šamra-Ugarit corpus within a research database environment. More than just a data store of texts translated from their ancient languages, RSTI serves as a tool for addressing research questions. To this end, the project also seeks to integrate archaeological data from the excavations at Ras Šamra, including published archaeological plans, grid and square systems, and any other information freely available. Using the Online Cultural and Historical Research Environment (OCHRE), we are currently adding and curating data with the help of various workflow wizards. To add new data to RSTI, we begin with a standard text transliteration saved as a Microsoft Word document or in another common document format. We load this document into OCHRE, which uses intelligent functions to atomize the linear transcription into individual signs or letters. As part of this process, OCHRE validates these signs and letters to make sure there are no typographical errors. Once the text is added to the database, analytical wizards guide the user through the tasks of finding words in project dictionaries, adding grammatical properties to the words, and identifying people and places in the texts. The importation and curation steps both employ processes developed specifically for the task of knowledge representation of philological data.

Keywords: archaeology, database, Online Cultural and Historical Research Environment (OCHRE), philology, Ras Shamra, Ras Šamra, Ugarit
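
The general idea of atomizing a linear transliteration and validating its letters can be sketched as follows in Python. This is an illustrative toy and not OCHRE's implementation: the function names, the use of "." as a word divider, and the deliberately incomplete letter inventory are all assumptions introduced here.

```python
# Hypothetical and deliberately incomplete inventory of permitted letters.
ALLOWED = set("bgdhwzyklmnspqrtu")


def atomize(line, divider="."):
    """Split a transliterated line into words, then each word into letters."""
    words = [word.strip() for word in line.split(divider) if word.strip()]
    # list(word) splits by code point; letters written with combining
    # diacritics would need normalization first, which this sketch omits.
    return [list(word) for word in words]


def validate(atoms, allowed=ALLOWED):
    """Return (word_index, letter_index, letter) for each unexpected character."""
    return [
        (i, j, letter)
        for i, word in enumerate(atoms)
        for j, letter in enumerate(word)
        if letter not in allowed
    ]


atoms = atomize("mlk . ugrt")          # illustrative input line
problems = validate(atomize("mlk . ugr7"))  # same line with a typing slip
```

Storing each letter as its own item is what makes it possible to report exactly where a suspect character sits, which is the practical benefit of validating at the level of signs instead of whole lines.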

Publishing Sumerian Literature on the Semantic Web

Terhi Nurmikko-Fuller

Literary Sumerology in particular, and Assyriology in general, are in a position to be relevant and significant in the ongoing development of semantic web (SW) technologies through the promotion of common data formats and exchange protocols. The rich philological material of the ancient Near East can be used to evaluate the robustness and flexibility of models and schemas designed from the perspective of other disciplines, with the aim of upper-level (and thus universal) applicability. Assyriological research can in turn be supported by knowledge from other data-streams and has become increasingly relevant in interdisciplinary research agendas; data is disseminated further and becomes better known in the public domain. This chapter outlines the main practical considerations and terminologies needed for understanding and applying the Linked Data (LD) online publication paradigm to philological, museological, curatorial, and archaeological data. Three existing ontologies (the CIDOC CRM, FRBRoo, and Ontomedia) are evaluated in the context of their suitability for representing information contained in the Electronic Text Corpus of Sumerian Literature (ETCSL), an existing online resource that provides access to some 400 transliterated and translated composite texts.

Keywords: Linked Data, literature, ontologies, RDF, semantic web, Sumerian
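
At its simplest, Linked Data expresses statements as subject-predicate-object triples. The Python sketch below serializes two such statements in N-Triples syntax. Every identifier in it is a placeholder under the example.org namespace: the property names are invented and are not terms from CIDOC CRM, FRBRoo, or Ontomedia, and the text identifier is not an ETCSL identifier.

```python
# Placeholder namespace; a real project would use its own persistent IRIs.
BASE = "http://example.org/"


def iri(local_name):
    """Wrap a local name in angle brackets as an N-Triples IRI."""
    return f"<{BASE}{local_name}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Hypothetical statements about a hypothetical composite text.
triples = [
    (iri("text/1"), iri("vocab/hasTitle"), literal("A composite text")),
    (iri("text/1"), iri("vocab/hasLanguage"), iri("language/sumerian")),
]


def to_ntriples(statements):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


output = to_ntriples(triples)
```

Replacing the invented predicates with terms from an established ontology is where the evaluation described in the abstract comes in: the triple structure stays the same, while the choice of vocabulary determines how well the statements interoperate with other datasets.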