Category Archives: data modeling

State of the Snap-Nation

With the end of the pilot project scarily in sight it is time to review where we are and where we hope to be by the end of December.

The big news is that (hopefully) the first set of SNAP identifiers are now frozen!

What this means is that for the first 5 datasets have now been ingested and had SNAP identifiers linked to each of the persons and those identifiers are fixed. There may still be a few tweaks to the RDF descriptive data coming in from the projects but the identifiers will remain the same. Continue reading State of the Snap-Nation

SNAP and VIAF

We’ve had a couple of meetings with Karen Smith-Yoshimura and Thomas Hickey, of the Scholars Contributions to VIAF group, to discuss possible collaborations, exchange of information, and mutual benefits of sharing standards between the SNAP:DRGN project and VIAF (the Virtual International Authority File, a federated authority list of persons from library catalogs, mostly from author or subject fields).

We considered two main questions: Continue reading SNAP and VIAF

Looking Towards an API for SNAP:DRGN

During the first SNAP:DRGN workshop a breakout group was convened to discuss the potential API for the project. Rather than come up with a specific API during that session, we instead focused on creating a “wish list” of applications and functions that we wanted to support. We were then able to abstract the functions that would be needed to support the list. Continue reading Looking Towards an API for SNAP:DRGN

You Aren’t Gonna Need It

We’ve been discussing lately how to merge person records in SNAP, so that when we encounter partner projects that each have a record for the same person, SNAP can provide a useful service by combining those into single, merged records, and we can start to get an idea of the requirements for performing operations like merges on our data. This discussion has proved something of a rabbit hole. Continue reading You Aren’t Gonna Need It

The Old Classes vs Properties Debate (or Relationships Are Hard, Part 2)

One of the decisions that has to be made when creating an ontology is which concepts you encode as classes and which you encode as properties of those classes. One of the difficulties is that there is no overarching ‘right answer’ (although there are wrong ones) to how you should model your domain, in has to be decided on a case-by-case basis of what works best for the type of world view that you are trying to encapsulate within your model. This post is a request for feedback to help us decide which model works best for both the project and the wider community. Continue reading The Old Classes vs Properties Debate (or Relationships Are Hard, Part 2)

Why SNAP IDs?

One question that came up during the workshop a couple of weeks ago was: if partner projects already assign their own URIs/ids to their person/name/etc. records, then why should SNAP assign its own identifiers? There are two answers to that, one very practical, and the other a bit more philosophical.

  1. SNAP IDs will be URIs themselves, and when dereferenced in a browser, or by an application, will return a result. Either a web page listing what SNAP knows about the record in question, or RDF data about it. We can’t do this in a practical way without assigning our own identifiers.
  2. On a more theoretical level, we think that any updates made to data post-ingest shouldn’t be made directly on our partners’ data. We believe, for example, that while SNAP might assert an identity between two person records coming from two partner datasets, it will be up to the partners whether they accept that identification.

Continue reading Why SNAP IDs?

SNAP at Digital Humanities 2014

The SNAP Project is proud to announce the Ontologies for Prosopography: Who’s Who? or, Who was Who? one-day workshop developed in conjunction with the People of the Founding Era project based at the University of Virginia. The workshop will give the opportunity for SNAP to present our data model to a wider audience and engage with the researchers working on similar problems other periods and geographic areas. Continue reading SNAP at Digital Humanities 2014

Tensions

Last week, Faith gave a great overview of some of the issues involved in describing the relationships between people. This week, I’m going to come at the problem from the other side, looking at what data we have, and how SNAP plans to represent them.

Our initial datasets include Trismegistos People (TM, described by Mark), the Lexicon of Greek Personal Names (LGPN, described by Sebastian), and a set of names (article headwords) from the Prosopographia Imperii Romani, 2nd edition (PIR²) put together by Tom Elliott. TM has web pages that document the references to names and people found in papyri, many of which are hosted at Papyri.info, as well as resources describing the names and person; references, names, and persons all have unique identifiers. LGPN comes at the problem of modeling people from a different angle. They start with persons and add names and references; persons and names have unique identifiers. From PIR², we have only persons, with a “principal” name and identifying number (the article number) attached to them. Continue reading Tensions