Ontology for Informatics Research Artifacts (IRAO)

Viet Bach Nguyen, Vojtěch Svátek  | PRAGUE UNIVERSITY OF ECONOMICS AND BUSINESS

Background

Informatics and computer science researchers usually contribute to scientific knowledge by delivering tangible outputs, namely research artifacts. Typical examples of informatics research artifacts are software prototypes, datasets, ontologies, methodologies, and frameworks. Computer science conferences such as ISWC and ESWC have started to use resource tracks to allow for resource papers that describe these artifacts. According to our recent survey [2], no existing ontology has paid particular attention to this topic. Our goal is to create a new ontology to fill this gap.

Structure of IRAO ontology

diagram

Motivation

  • The occurrence of different kinds of artifacts in research publications can be traced over time for subdisciplines or venues, thus providing a broad picture of trends in informatics research.
  • Networks of complementary or competing artifacts (such as software tools developed using a given methodology and applied to specific datasets backed by ontologies) can be connected, allowing researchers to navigate rapidly from one to another and to find a reuse target (and even associated publications) more easily.
  • Industrial companies can retrieve artifacts that they might consider transforming into deployed products.

Design & features

Following the NeOn ontology engineering methodology, we formulated a set of competency questions (CQs) as part of our ontology requirements specification document (ORSD) [3] to elicit the relevant concepts. Both the CQs and the ORSD can be found in our GitHub repository. The features of the IRAO ontology are:

  • basic information about research artifacts, required for representing the artifact data gathered from repositories of theses and publications and from software and data repositories, including authorship, publication date, research field, topic, identifiers, etc.,
  • types of research artifacts in terms of what they are useful for and how to use them,
  • their development status, e.g., alpha, beta, release, or numbered version,
  • their quality attributes, such as accessibility, use of an open standard, or design principles,
  • relationships between different types of artifacts, e.g., a dataset is described by a data model, a software tool uses a framework, etc.

Description

IRAO consists of four parts that model the mentioned features.

  • The artifact classification part lists out possible types (subclasses) of artifacts, e.g., Dataset, Framework, Vocabulary, and Methodology.
  • The meta information part includes relationships such as hasAuthor and hasPublication, whose ranges are Researcher and Publication, respectively.
  • The property hasDevelopmentStatus points to information about the maturity of the artifact. The properties hasAccessibility and hasDesignQuality are used to provide the artifact with classifying tags.
  • The last part of our ontology deals with the relationships between different types of artifacts to describe situations such as when a research project can produce.
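
The four parts above can be illustrated with a minimal sketch in plain Python, where each statement is a (subject, predicate, object) triple. The class and property names Dataset, Framework, Vocabulary, Methodology, hasAuthor, hasPublication, hasDevelopmentStatus, and hasAccessibility are taken from the description; the superclass name ResearchArtifact, the prefixes irao: and ex:, the property ex:isDescribedBy, and all individuals are assumptions made only for illustration and may differ from the published ontology.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Assumptions: "irao:ResearchArtifact", the "ex:" individuals, and
# "ex:isDescribedBy" are hypothetical names, not taken from IRAO itself.
TRIPLES = [
    # Part 1: artifact classification (types as subclasses)
    Triple("irao:Dataset", "rdfs:subClassOf", "irao:ResearchArtifact"),
    Triple("irao:Framework", "rdfs:subClassOf", "irao:ResearchArtifact"),
    Triple("irao:Vocabulary", "rdfs:subClassOf", "irao:ResearchArtifact"),
    Triple("irao:Methodology", "rdfs:subClassOf", "irao:ResearchArtifact"),
    # Part 2: meta information about one example artifact
    Triple("ex:myDataset", "rdf:type", "irao:Dataset"),
    Triple("ex:myDataset", "irao:hasAuthor", "ex:alice"),
    Triple("ex:myDataset", "irao:hasPublication", "ex:paper1"),
    # Part 3: development status and classifying tags
    Triple("ex:myDataset", "irao:hasDevelopmentStatus", "ex:beta"),
    Triple("ex:myDataset", "irao:hasAccessibility", "ex:openAccess"),
    # Part 4: a relationship between two artifacts (property name assumed)
    Triple("ex:myVocabulary", "rdf:type", "irao:Vocabulary"),
    Triple("ex:myDataset", "ex:isDescribedBy", "ex:myVocabulary"),
]


def objects(subject: str, predicate: str) -> list[str]:
    """Return all objects asserted for the given subject and predicate."""
    return [t.obj for t in TRIPLES
            if t.subject == subject and t.predicate == predicate]


def artifacts_of_type(cls: str) -> list[str]:
    """Return all individuals directly typed with the given class."""
    return [t.subject for t in TRIPLES
            if t.predicate == "rdf:type" and t.obj == cls]


if __name__ == "__main__":
    # A competency-question-style lookup: who authored each dataset?
    for artifact in artifacts_of_type("irao:Dataset"):
        print(artifact, "authors:", objects(artifact, "irao:hasAuthor"))
```

In practice the same statements would be expressed in OWL and queried with SPARQL; the list-of-triples form is used here only to keep the sketch free of external dependencies.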

Implementation

The Informatics Research Artifact Ontology (IRAO) was implemented in OWL using the Protégé editor. We also used OnToology [1] to build the ontology automatically, using recommended metadata properties for self-documentation. IRAO is listed in the Linked Open Vocabularies (LOV) portal.

Visualization