One of the decisions that has to be made when creating an ontology is which concepts you encode as classes and which you encode as properties of those classes. One of the difficulties is that there is no overarching ‘right answer’ (although there are wrong ones) to how you should model your domain; it has to be decided on a case-by-case basis according to what works best for the world view that you are trying to encapsulate within your model. This post is a request for feedback to help us decide which model works best for both the project and the wider community.
In the previous post we considered three patterns that we could use to describe relationships. Further discussion has led us to discard the third, event-driven option, both in a drive towards simplicity and, more importantly, because it has the furthest conceptual distance from the information we want to represent. The source material is diverse in both type and style, but if we consider what is normally captured in prosopographical data, and why, we would expect something like:
Επιγόνη daughter of Επίγονος (from Thasos) http://www.lgpn.ox.ac.uk/id/V1-37074
There are a number of events that we can hypothesise from these types of statements, in this case that Επίγονος fathered a girl, Επιγόνη. This fits in with the logical rules that it is possible to create in structured data: when person A fathered/gave birth to a girl B, then B is the daughter of A. While epigraphs, like that in the example, are unlikely to go into further detail, other sources may have specific descriptions of some events, moving them from the realm of the assumed (we assume that Επίγονος fathered Επιγόνη and that the statement above is not, for example, the result of an adoption or of his being cuckolded) to the evidenced (the trustworthiness of that evidence is an issue for a different day/post).
For those people familiar with CIDOC CRM, this is basically the model that they employ, and it is a good one, allowing a rich and detailed encoding of the biographical history of the person (or object). However, much of this information is well beyond the scope of what SNAP sets out to model. If it wasn’t, then we could just use CIDOC CRM, a well-known and common standard, and all go home early for tea. One of the guiding principles behind SNAP is that we are only encoding the minimum information necessary to name/identify an individual entity. We need to know that Επιγόνη is the daughter of Επίγονος only insofar as that is part of her significant identity. So while we would encourage projects to encode this level of information in their own data, events are beyond the scope of SNAP, which leaves us with two other possibilities.
Defining every possible relationship via properties is arguably the simplest way that we could encode the information we need:
[Επιγόνη] -- daughter-of --> [Επίγονος]
There are two potential downsides to this. Firstly, the number of properties expands pretty fast. Not only do we have the basic property tree with
but each of those needs to have versions for ‘acknowledged’, ‘claimed’, ‘foster’, ‘adopted’, ‘step’. And then there is the extended family: even if we only go as far as the grandparent/grandchild relationship along with the basic aunt-of/uncle-of (interestingly, there is no collective gender-neutral word for this relationship), cousin-of (and no gender-specific term for this one) and nephew-of/niece-of, we still have to add in maternal and paternal versions (although we can probably be forgiven for dropping the ‘acknowledged’, ‘claimed’ etc.). Added to these we need the important non-“blood” relationships: formalised intimate relationships (i.e. recognised marriage), non-formalised intimate relationships (i.e. mistresses), slave-of, master-of, freedman-of, patron-of, client-of…
All in all that gets to approximately 90 relationships, plus a few more if we start including things like disciple-of and teacher-of.
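To get a feel for how quickly the list grows, here is a minimal sketch in Python. The base relationships, the qualifier lists and the naming scheme are illustrative assumptions rather than SNAP's actual vocabulary, and the total it produces is not meant to reproduce the figure of roughly 90 above.

```python
from itertools import product

# Hypothetical names for illustration only; this is not the SNAP property list.
CORE = ["son-of", "daughter-of", "father-of", "mother-of"]
QUALIFIERS = ["acknowledged", "claimed", "foster", "adopted", "step"]
EXTENDED = ["aunt-of", "uncle-of", "cousin-of", "nephew-of", "niece-of"]
AXES = ["maternal", "paternal"]


def qualified(prefixes, bases):
    """Return one property name per (prefix, base) pair."""
    return [f"{prefix}-{base}" for prefix, base in product(prefixes, bases)]


def property_names():
    """Unqualified properties plus every qualified variant of them."""
    return CORE + qualified(QUALIFIERS, CORE) + EXTENDED + qualified(AXES, EXTENDED)


names = property_names()
print(len(names))
```

Even this toy list of nine base names expands to 39 properties, and that is before grandparents, marriage, slavery or patronage are added.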
This is not necessarily a problem in itself, although it does get a bit messy. It is at least nicely organised into a hierarchy and there are plenty of opportunities for adding disjoint and inverse property restrictions. However, what we gain in the simplicity of the direct link we lose in sacrificing the possibility of relating additional information to the connection, such as provenance, reference or certainty. If we model the relationship as a concept (i.e. a Class) rather than as a property connecting two entities, then we immediately open up more possibilities.
There are three obvious ways to do this:
[Entity1] --<generic-linking-property>--> [Relationship Class] --<relationship-specification>--> [Entity2]
e.g. [Επιγόνη] --has-relationship--> [AcknowledgedRelationship] --daughter-of--> [Επίγονος]
[Entity1] --<generic-linking-property>--> [Relationship] --<generic-linking-property>--> [Entity2] --<generic-type-linking-property>--> [RelationshipSpecification]
e.g. [Επιγόνη] --has-relationship--> [AcknowledgedRelationship] --relationship-with--> [Επίγονος] --relationship-type--> [Daughter]
[Entity1] --<generic-linking-property>--> [Relationship Classes] --<generic-linking-property>--> [Entity2]
e.g. [Επιγόνη] --has-relationship--> [AcknowledgedRelationship, Daughter] --relationship-with--> [Επίγονος]
The first two could just as easily be modelled the other way around, depending on where we prefer to put the emphasis:
[Επιγόνη] --has-relationship--> [Daughter] --acknowledged-with--> [Επίγονος]
[Επιγόνη] --has-relationship--> [Daughter] --relationship-with--> [Επίγονος] --relationship-type--> [AcknowledgedRelationship]
This is important because any additional information such as provenance, reference or certainty would be attached to the intermediary class, and the choice comes down to whether we see the hierarchy as being:
We can cut out some of this discussion by dropping the additional property and dual-classing the instance, as shown in the third example. Expanding on that, our class hierarchy would look like:
- HereditaryFamily (If anyone can think of a better term I am open to suggestions)
- ExtendedFamily
- HereditaryFamily (If anyone can think of a better term I am open to suggestions)
- RelationshipQualifier (all disjoint with everything except HereditaryFamily classes)
- Half (disjoint with everything except Sibling classes)
- RelationshipAxis (all disjoint with everything except ExtendedFamily classes)
- Inlaw (disjoint with everything except HereditaryFamily and ExtendedFamily classes)
Disjoints would be defined for the gender-specific classes (Son/Daughter, Mother/Father, Aunt/Uncle etc.) and for those that are impossible without the use of time travel (Child/Parent, Ancestor/Descendant etc.), but given the period we are dealing with (Romans and Egyptians – I’m looking at you) it would be unwise to add any additional disjoints that we might otherwise consider between related people.
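As a minimal sketch of how such disjoints could be checked against a dual-classed relationship instance, consider the Python below. The `DISJOINT_PAIRS` table and the `Sibling` and `Spouse` class names are assumptions for illustration; a real implementation would more likely declare the disjoints as axioms in the ontology and leave the checking to a reasoner.

```python
# Hypothetical disjoint pairs; the ontology itself would declare these as axioms.
DISJOINT_PAIRS = {
    frozenset({"Son", "Daughter"}),
    frozenset({"Mother", "Father"}),
    frozenset({"Aunt", "Uncle"}),
    frozenset({"Child", "Parent"}),
    frozenset({"Ancestor", "Descendant"}),
}


def disjoint_violations(classes):
    """Return, in sorted order, each disjoint pair fully present in `classes`."""
    present = set(classes)
    return sorted(tuple(sorted(pair)) for pair in DISJOINT_PAIRS if pair <= present)


def is_consistent(classes):
    """True when no two classes on the relationship instance are disjoint."""
    return not disjoint_violations(classes)


# A relationship typed as both acknowledged and daughter is fine.
print(is_consistent(["AcknowledgedRelationship", "Daughter"]))
# Sibling and Spouse are deliberately not disjoint (hypothetical class names).
print(is_consistent(["Sibling", "Spouse"]))
# Son and Daughter on the same relationship instance is flagged.
print(disjoint_violations(["Son", "Daughter"]))
```

Keeping the table small reflects the caution above: only the gendered pairs and the logically impossible pairs are listed.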
Of the options that use classes instead of, or in addition to, properties, the dual-classed approach is the simplest. It tends to be bad design when you end up making everything a Class, which is what we have ended up doing here. Equally, we can go too far in the opposite direction in search of “simplicity” and the desire to have as few classes as possible. The intermediary options offer a combination of properties and classes but also raise some questions as to where we want the emphasis of the encoding to lie. These are questions that we feel it would be better to open up to discussion by the wider community rather than just making an executive decision.
Option 1: All properties
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;daughter-of <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436> .
Option 2a: Combination of Classes and Properties (classes define the relationship, properties the specific relationship)
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;has-relationship [ a &snap;AcknowledgedRelationship; &snap;daughter-of <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436>] .
Option 2b: Combination of Classes and Properties (classes define the specific relationship, properties the relationship type)
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;has-relationship [ a &snap;Daughter; &snap;acknowledged-with <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436>] .
Option 3a: Combination of Classes and Properties (emphasis on classes but with properties explicitly linking rather than dual classing, main class is the relationship)
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;has-relationship [ a &snap;AcknowledgedRelationship; &snap;relationship-with <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436>; &snap;relationship-type &snap;Daughter] .
Option 3b: Combination of Classes and Properties (emphasis on classes but with properties explicitly linking rather than dual classing, main class is the specific relationship)
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;has-relationship [ a &snap;Daughter; &snap;relationship-with <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436>; &snap;relationship-type &snap;Acknowledged] .
Option 4: All classes
<http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074> &snap;has-relationship [ a &snap;AcknowledgedRelationship; a &snap;Daughter; &snap;relationship-with <http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436>] .
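To make the trade-off between the two extremes concrete, the sketch below builds Option 1 and Option 4 as plain Python tuples rather than with an RDF library. The `snap:` prefix, the `snap:certainty` property and the `_:rel1` node label are assumptions for illustration only; none of them is defined by SNAP in this post.

```python
EPIGONE = "http://clas-lgpn2.classics.ox.ac.uk/id/V1-37074"
EPIGONOS = "http://clas-lgpn2.classics.ox.ac.uk/id/V1-40436"


def option_1():
    """All properties: one direct triple, with nowhere to hang extra detail."""
    return [(EPIGONE, "snap:daughter-of", EPIGONOS)]


def option_4(node="_:rel1", certainty=None):
    """All classes: a dual-classed intermediary node links the two people."""
    triples = [
        (EPIGONE, "snap:has-relationship", node),
        (node, "rdf:type", "snap:AcknowledgedRelationship"),
        (node, "rdf:type", "snap:Daughter"),
        (node, "snap:relationship-with", EPIGONOS),
    ]
    if certainty is not None:
        # Hypothetical property: the annotation attaches to the relationship node.
        triples.append((node, "snap:certainty", certainty))
    return triples


def related_people(triples, person):
    """Follow has-relationship, then relationship-with, starting from `person`."""
    nodes = {o for s, p, o in triples if s == person and p == "snap:has-relationship"}
    return [o for s, p, o in triples if s in nodes and p == "snap:relationship-with"]


print(len(option_1()), len(option_4()))
print(related_people(option_4(certainty="high"), EPIGONE))
```

The class-based form costs four triples instead of one and needs a two-step lookup, but it gives provenance, reference or certainty an obvious subject to attach to.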
I hope this post has clearly laid out the options as we see them and I’d like to invite your opinions and suggestions as to which way we go.