Why SNAP IDs?

One question that came up during the workshop a couple of weeks ago was: if partner projects already assign their own URIs/ids to their person/name/etc. records, then why should SNAP assign its own identifiers? There are two answers to that, one very practical, and the other a bit more philosophical.

  1. SNAP IDs will be URIs themselves, and when dereferenced in a browser or by an application they will return a result: either a web page listing what SNAP knows about the record in question, or RDF data about it. We can’t do this in a practical way without assigning our own identifiers.
  2. On a more theoretical level, we think that any updates made to data post-ingest shouldn’t be made directly on our partners’ data. We believe, for example, that while SNAP might assert an identity between two person records coming from two partner datasets, it will be up to the partners whether they accept that identification.
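As a rough illustration of the first point, here is a minimal Python sketch of how a server might decide between the two kinds of result for the same SNAP URI. It is an assumption about one possible approach (inspecting the HTTP Accept header), not a description of the actual SNAP service. The function name and the list of RDF media types are hypothetical, and quality values such as q=0.9 are deliberately ignored to keep it short.

```python
# Hypothetical sketch: choose between an HTML page and RDF data for one URI.
# The media types listed here are an assumption, not SNAP's actual configuration.
RDF_MEDIA_TYPES = {"text/turtle", "application/rdf+xml", "application/ld+json"}


def choose_representation(accept_header):
    """Return "rdf" if the client asks for an RDF media type, else "html".

    Simplification: q-values are ignored, so any RDF media type that
    appears in the header wins, regardless of its stated preference.
    """
    for part in accept_header.split(","):
        # Drop parameters such as ";q=0.9" and normalise case and whitespace.
        media_type = part.split(";")[0].strip().lower()
        if media_type in RDF_MEDIA_TYPES:
            return "rdf"
    # Browsers, and clients that state no preference, get the web page.
    return "html"
```

With this sketch a browser sending "text/html,application/xhtml+xml" would get the web page, while an application sending "text/turtle" would get RDF data, both from the same identifier.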

The practical

When a person (or person-like entity) record in the SNAP triplestore is queried by URI via a web browser, we would expect this URI to dereference to an HTML page giving more information about the person recorded. The title of the page would give the immediate source of the information: i.e. the contributing dataset, or datasets in the case of a merged or co-referenced person. Other information about the person (names, associated dates and places, primary text attestations, etc.) would also be listed in a simple and standard layout, as would relationships to other persons and any other assertions about the person that SNAP knows about. An example (completely fictional) entry might therefore look something like:

TM 1234 = LGPN V5a-567
SNAP Person id: 10002
Apollonius/Apollonios/Ἀπολλώνιος
c. II cent
Aphrodisias
Father of TM 1233 (SNAP pid 10001), Diogenes/Διογένης
Attested in: PHI 256884; BGU.12.16024

In addition to this information, which may differ from, and in some cases supplement, the information in the contributing databases, we can imagine other information and services being added to this page. For example, a feed showing external projects that have linked to this person as annotations to names in their texts or archaeological objects; or a Social Network Analysis visualization of persons, places, texts, etc. within two steps of relationship to this person. All of these SNAP-specific services will only be possible if we have SNAP identifiers to dereference to pages containing this information.

The theoretical

When the SNAP system ingests data from a partner (Trismegistos, for example) we’ll get data from them that looks like:

<http://www.trismegistos.org/person/414#this>
   a lawd:Person ;
   dc:publisher <http://www.trismegistos.org> ;
   lawd:hasName <http://www.trismegistos.org/name/5663#this> ;
   lawd:hasAttestation <http://www.trismegistos.org/ref/1662#this> .
    
<http://www.trismegistos.org/name/5663#this>
   a lawd:PersonalName ;
   dc:publisher <http://www.trismegistos.org> ;
   lawd:primaryForm "Σαραπίων"@grc ;
   lawd:primaryForm "Sarapion"@en ;
   lawd:hasAttestation <http://www.trismegistos.org/ref/1662#this> .

and SNAP will assign a new person ID, like http://data.snapdrgn.net/person/1234, to the lawd:Person http://www.trismegistos.org/person/414#this. The theoretical reason for this is that SNAP plans to add functionality for identifying persons who appear in multiple datasets and for annotating those persons. As we noted above, we think those sorts of updates shouldn’t be applied directly to the Trismegistos resource by us. If you contribute data to SNAP, we feel strongly that we shouldn’t change that data. You should be free, of course, to accept new facts or assertions that emerge in SNAP and are relevant to your data back into your dataset, but those shouldn’t be forced on you, nor should it be made to look as if your project asserts something it doesn’t. There are a couple of possible ways to achieve this, but one very simple one is to create a derived resource to which new facts and assertions may be added. SNAP IDs allow us to preserve the integrity of contributed datasets while still building upon them.
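The derived-resource idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, triples are modelled as plain tuples rather than with an RDF library, and linking the two resources with the W3C PROV-O property prov:wasDerivedFrom is our assumption here, not something this post commits SNAP to.

```python
# Hypothetical sketch of a derived SNAP resource; not SNAP's actual ingest code.
SNAP_PERSON_BASE = "http://data.snapdrgn.net/person/"
# Assumption: PROV-O's wasDerivedFrom is used to link back to the partner record.
PROV_WAS_DERIVED_FROM = "http://www.w3.org/ns/prov#wasDerivedFrom"


def derive_snap_person(partner_uri, snap_id):
    """Mint a SNAP URI and return it with one triple linking it to the partner record."""
    snap_uri = SNAP_PERSON_BASE + str(snap_id)
    triples = [(snap_uri, PROV_WAS_DERIVED_FROM, partner_uri)]
    return snap_uri, triples


def add_assertion(triples, snap_uri, predicate, obj):
    """Attach a new statement to the SNAP resource, never to the partner's URI."""
    triples.append((snap_uri, predicate, obj))


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples lines, for inspection."""
    return "\n".join("<%s> <%s> <%s> ." % t for t in triples)


snap_uri, triples = derive_snap_person(
    "http://www.trismegistos.org/person/414#this", 1234
)
print(to_ntriples(triples))
```

Because every later statement takes the SNAP URI as its subject, the partner's URI only ever appears as the object of the derivation link, so nothing is asserted on the partner's behalf.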
