This article is related to the GitLab project: https://gitlab.forge.orange-labs.fr/dhbt3263/ziggy-enabler
This project aims to build a solid SDK to help services, clients and other platforms build their own injector in order to populate Thing'in with their objects.
This library will help you to :
About data injection: every time we inject data into Thing'in through an enabler, we follow a three-step process, which can be seen as the good old Extract, Transform, Load (ETL) applied to already existing systems:
Right now we do not support the ingest step; it is solely your job. Most of the time this step is very specific to your own needs: you may extract data from many different sources and apply specific normalisation to aggregate them, and we currently lack the tools to help you at this point. We will put examples into the project to give you ideas about how we work on our side when we crawl open data but, once again, this is specific to your needs.
Once you have built your unique aggregated set of data, we will ask you to find (or build if necessary) an ontology to describe your data and a mapping which we will use to "semantify" your JSON data as OWL-Turtle. At this point, give the mapping and the data to the enabler: you have done your homework, and the enabler will take care of the rest.
Inside the enabler we make use of a mapper which transforms your JSON data into Turtle on the fly before inserting it into Thing'in. The idea behind using a mapper and automatic conversion is that, given enough time and adjustments, we will have a mapper expressive enough to handle most injection cases and, most importantly, practical enough to be used by basically everybody.
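As a rough illustration of where your work ends and the enabler's begins, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function names, the merge-by-`id` strategy, the payload shape and the `send` callback are hypothetical and do not describe the enabler's real API.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]


def ingest(sources: List[List[Record]], key: str = "id") -> List[Record]:
    """Ingest step (your job): merge records from several sources by a shared key."""
    merged: Dict[object, Record] = {}
    for source in sources:
        for record in source:
            merged.setdefault(record[key], {}).update(record)
    return list(merged.values())


def hand_over(records: List[Record], mapping: Record,
              send: Callable[[Record], None]) -> Record:
    """Give the mapping and the data to the enabler.

    `send` stands in for the real call to the enabler, which is not
    described in this article; the payload shape is hypothetical too.
    """
    payload = {"mapping": mapping, "data": records}
    send(payload)
    return payload


# Two hypothetical sources describing the same displays.
open_data = [{"id": "d1", "name": "Lobby screen"}]
inventory = [{"id": "d1", "state": 1}, {"id": "d2", "state": 0}]

outbox: List[Record] = []
payload = hand_over(ingest([open_data, inventory]), {"skeleton": []}, outbox.append)
```

The transformation to Turtle and the insertion into Thing'in are deliberately absent from the sketch: they happen inside the enabler.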
Right now we can handle data in simple forms like lists, and structured data like trees. Unfortunately we don't have the tools to manipulate graphs yet; we are working on it.
First things first, if you're not familiar with the Semantic World or you want a little reminder, go check this article : [[The semantic world in Easy Mode]]
So, as said in the article, you must see your data as a graph, meaning you have a bunch of vertices connected together through edges. You can also have no edges at all; that is fine. It is just awkward to use a graph database to store data with no relations, but we will manage. Every vertex is an instance of a class with its own set of attributes and edges.
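To make the graph view concrete, here is a small Python sketch of a vertex as "an instance of a class with its own attributes and edges". The `Vertex` type and the sample identifiers (`b1`, `r12`) are hypothetical; only the Dogont class and property IRIs come from the mapping example used later in this article.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Vertex:
    iri: str        # unique identifier of the individual
    class_iri: str  # the ontology class this vertex is an instance of
    attributes: Dict[str, object] = field(default_factory=dict)
    # Outgoing edges as (object property IRI, target vertex IRI) pairs.
    edges: List[Tuple[str, str]] = field(default_factory=list)


DOGONT = "http://elite.polito.it/ontologies/dogont.owl#"

building = Vertex(DOGONT + "Building-b1", DOGONT + "Building", {"id": "b1"})
room = Vertex(DOGONT + "Room-r12", DOGONT + "Room", {"id": "r12"})

# One edge: the room is in the building. The building has no edge at all,
# which is allowed.
room.edges.append((DOGONT + "isIn", building.iri))
```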
You have a lot of things to look for, I know, and it looks really bothersome to you right now; I know that too. That is why we created this service for you: the ''O''ntology ''L''ookup ''S''ervice (''OLS''). OLS is a search engine which will help you find ontologies that are already integrated into Thing'in and ready to use. You can find it in the portal at Explore > Explore Ontology Lookup Service.
Pay attention to the keywords you use and make sure to try different kinds of vocabulary to find concepts which may interest you; the terminology used in some ontologies is sometimes awkward, so make sure not to miss anything. As most of you are expected to populate connected objects (it is the goal of Thing'in after all), as soon as you need to describe devices like sensors or connected objects, go check saref.owl and dogont.owl.
A few pieces of advice: once you find an ontology which seems to meet your needs, ask yourself the following questions in the given order:
Do I have every class I need for my data?
For each of those classes, can I translate each of my attributes into datatype properties? And can I translate each of my relations into object properties?
Make sure to check the range of the datatype properties: the range of a datatype property is the datatype expected for its value (for the datatype property uuid in iot.owl, the range is a string because we expect to store a string with that property). The same can be said for object properties: when you link two instances together with an object property, check that you have the right to do so given its domain and range.
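The range and domain checks can be pictured with a short Python sketch. The two tables below are hand-written, hypothetical extracts (the `ex:` names are invented; only `uuid` with a string range comes from the text above), and the check is deliberately naive: it ignores class hierarchies, which a real ontology reasoner would take into account.

```python
# Hypothetical extract of an ontology, written by hand for the example.
# Datatype property -> Python type matching its range.
DATATYPE_RANGES = {"uuid": str, "ex:state": int}
# Object property -> (domain class, range class).
OBJECT_PROPERTIES = {"ex:locatedIn": ("ex:Room", "ex:Storey")}


def value_fits_range(prop: str, value: object) -> bool:
    """Check that a value has the datatype expected by the property's range."""
    expected = DATATYPE_RANGES[prop]
    # bool is a subclass of int in Python, so reject it explicitly.
    if expected is int and isinstance(value, bool):
        return False
    return isinstance(value, expected)


def link_is_allowed(prop: str, source_class: str, target_class: str) -> bool:
    """Check a link against the property's domain and range (exact match only)."""
    domain, range_ = OBJECT_PROPERTIES[prop]
    return source_class == domain and target_class == range_
```

Running such checks on a sample of your data before injection is a cheap way to spot attributes that do not fit the ontology you picked.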
Lastly, make sure you use the ontology the way it was meant to be used; sometimes you will see an ontology which seems perfectly fine for you but in fact you are using it wrong. When we were injecting data to describe a building with the dogont.owl ontology, we were using the #Floor class to link together rooms on the same ... floor, but that was wrong. Let's take a very simple conversation:
A :"I need to speak to Michael, where is his office ?"
B :"Third floor, second door on the left"
Semantically speaking, it is perfectly fine to use the word "floor" here, but in the Dogont ontology a floor is specific to a room, not to a ''storey'', and that is exactly what we were missing. We were not respecting the semantics defined in the Dogont ontology; it was only afterwards that we came to the conclusion (because of the way the #Floor class is described) that we were wrong. So make sure you do not make the same mistake as we did: check for any possible synonym of the class you use and take your time to really understand how the ontology works.
In case you have one or more ontologies but are still missing concepts in the end, and those ontologies do not belong to you, you will have to build a new ontology based on those you found useful. Hopefully you will not have to start from scratch and the amount of work will be limited. On the contrary, if you have nothing to start from, it will be a bit slower.
To be done.
If you are here, we consider you have an ontology ready to use; what is left for you is to write a proper mapping file. To make things easier, we will start from an example:
{
"skeleton": [
{
"_mapping_id": "display",
"room": {
"_mapping_id": "room",
"floor": {
"_mapping_id": "storey",
"building": {
"_mapping_id": "building"
}
},
"building": {
"_mapping_id": "building"
}
},
"floor": {
"_mapping_id": "storey",
"building": {
"_mapping_id": "building"
}
},
"building": {
"_mapping_id": "building"
}
}
],
"display": {
"_id": {
"static": "http://orange-labs.fr/fog/ont/og-display.owl#Display-",
"param": "id"
},
"_class": {
"field_dependent": false,
"value": "http://orange-labs.fr/fog/ont/og-display.owl#Display"
},
"_object_properties": [
{
"generate_id": true,
"_mapping_id": "room",
"field": "room",
"object_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#isIn"
},
{
"generate_id": true,
"_mapping_id": "storey",
"field": "floor",
"object_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#isIn"
},
{
"generate_id": true,
"_mapping_id": "building",
"field": "building",
"object_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#isIn"
}
],
"id": {
"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasID",
"type": "string"
},
"type": {
"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasType",
"type": "string"
},
"name": {
"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasName",
"type": "string"
},
"lte": {
"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasLTE",
"type": "string"
},
"idefacilities": {
"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasIDFacilities",
"type": "string"
},
"state": {
"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasState",
"type": "integer"
}
},
"room": {
"_id": {
"static": "http://elite.polito.it/ontologies/dogont.owl#Room-",
"param": "id"
},
"_class": {
"field_dependent": false,
"value": "http://elite.polito.it/ontologies/dogont.owl#Room"
},
"_object_properties": [
{
"generate_id": true,
"_mapping_id": "storey",
"field": "floor",
"object_property_ori": "http://elite.polito.it/ontologies/dogont.owl#isIn"
},
{
"generate_id": true,
"_mapping_id": "building",
"field": "building",
"object_property_ori": "http://elite.polito.it/ontologies/dogont.owl#isIn"
}
],
"id": {
"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasID",
"type": "string"
}
},
"building": {
"_id": {
"static": "http://elite.polito.it/ontologies/dogont.owl#Building-",
"param": "id"
},
"_class": {
"field_dependent": false,
"value": "http://elite.polito.it/ontologies/dogont.owl#Building"
},
"_object_properties": [],
"id": {
"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasID",
"type": "string"
}
},
"storey": {
"_id": {
"static": "http://elite.polito.it/ontologies/dogont.owl#Storey-",
"param": "id"
},
"_class": {
"field_dependent": false,
"value": "http://elite.polito.it/ontologies/dogont.owl#Storey"
},
"_object_properties": [
{
"generate_id": true,
"_mapping_id": "building",
"field": "building",
"object_property_ori": "http://elite.polito.it/ontologies/dogont.owl#isIn"
}
],
"id": {
"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasID",
"type": "string"
}
}
}
It may look intimidating, but once you break it into several parts, it is not that big.
At first, you have the "skeleton" block which describes how the JSON payload you will give to the mapper will be structured:
"skeleton": [
{
"_mapping_id": "display",
"room": {
"_mapping_id": "room",
"floor": {
"_mapping_id": "storey",
"building": {
"_mapping_id": "building"
}
},
"building": {
"_mapping_id": "building"
}
},
"floor": {
"_mapping_id": "storey",
"building": {
"_mapping_id": "building"
}
},
"building": {
"_mapping_id": "building"
}
}
],
The value of the skeleton starts with a list ''skeleton:[]'', which means the payload will be a list of items.
Each of those items will be a JSON Object: ''skeleton:[{}]''.
Every time you enter a new JSON Object in the skeleton, you must set which sub mapping to apply with ''_mapping_id'', for example ''skeleton:[{ '_mapping_id' : 'display' }]'', and every sub mapping must be defined at the same level as the skeleton declaration: ''{ 'skeleton':[{ '_mapping_id' : 'display' }], 'display':{}}''.
Lastly, you can embed multiple mappings into each other. If we follow the whole chain for the display, we have ''display > room > storey > building''. It doesn't necessarily mean the data from the upper mapping will be linked with the data in the lower mapping; that has to be defined in the sub mappings.
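The skeleton rules above lend themselves to a quick sanity check before you hand a mapping to the enabler. The following Python sketch is not part of the project; the helper names are hypothetical. It lists every chain of ''_mapping_id'' values and reports the ids that are used in the skeleton but have no sub mapping defined next to it.

```python
from typing import Dict, Iterator, List, Optional, Set


def mapping_chains(node: Dict, prefix: Optional[List[str]] = None) -> Iterator[List[str]]:
    """Yield every chain of _mapping_id values from this node down to a leaf."""
    chain = (prefix or []) + [node["_mapping_id"]]
    children = [value for key, value in node.items() if key != "_mapping_id"]
    if not children:
        yield chain
    for child in children:
        yield from mapping_chains(child, chain)


def missing_sub_mappings(mapping: Dict) -> Set[str]:
    """Return the _mapping_id values used in the skeleton but never defined."""
    used = {mapping_id
            for root in mapping["skeleton"]
            for chain in mapping_chains(root)
            for mapping_id in chain}
    return used - set(mapping)


# Shortened version of the example; the "building" sub mapping is left out
# on purpose to show what the check reports.
mapping = {
    "skeleton": [{
        "_mapping_id": "display",
        "room": {
            "_mapping_id": "room",
            "floor": {"_mapping_id": "storey",
                      "building": {"_mapping_id": "building"}},
        },
        "building": {"_mapping_id": "building"},
    }],
    "display": {}, "room": {}, "storey": {},
}

chains = list(mapping_chains(mapping["skeleton"][0]))
missing = missing_sub_mappings(mapping)
```

Here `chains` contains the ''display > room > storey > building'' chain followed by ''display > building'', and `missing` contains only `building`.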
Let's take for example the sub mapping of display.
"display" : {
"_id" : {
"static" : "http://orange-labs.fr/fog/ont/og-display.owl#Display-",
"param" : "id"
},
"_class" : {
"field_dependent" : false,
"value": "http://orange-labs.fr/fog/ont/og-display.owl#Display"
},
"_object_properties" : [
{ "generate_id" : true, "_mapping_id": "room", "field" : "room", "object_property_ori" : "http://orange-labs.fr/fog/ont/og-display.owl#isIn" },
{ "generate_id" : true, "_mapping_id": "storey", "field" : "floor", "object_property_ori" : "http://orange-labs.fr/fog/ont/og-display.owl#isIn" },
{ "generate_id" : true, "_mapping_id": "building", "field" : "building", "object_property_ori" : "http://orange-labs.fr/fog/ont/og-display.owl#isIn" }
],
"id": {"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasID", "type": "string"},
"type": {"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasType", "type": "string"},
"name": {"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasName", "type": "string"},
"lte": {"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasLTE", "type": "string"},
"idefacilities": {"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasIDFacilities", "type": "string"},
"state": {"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasState", "type": "integer"}
}
Every attribute which starts with an underscore (''"_"'') is ''mandatory''. Those are:
*_id : Defines the way you will build the IRI (Internationalized Resource Identifier) of the current item
*_class : Defines which class you will assign to the current item
*_object_properties : Defines the bindings with other items and how to retrieve their corresponding IRIs
Every other attribute is dedicated to handling a datatype property. You may have already figured this out, but if we do not handle graphs yet, it is because of the structure of the mapping, which takes a tree approach to describing data. Soon enough it will be reworked to handle cyclic data, and thus graphs.
"_id" : {
"static" : "http://orange-labs.fr/fog/ont/og-display.owl#Display-",
"param" : "id"
}
Here, the ''static'' attribute lets you enter a prefix to use for the item, and ''param'' names an attribute of the item to use in order to make each item unique. Remember folks, no duplicates are allowed in the Semantic World when it comes to IRIs. In this particular case, ''id'' is an attribute which already exists in the dataset we are trying to convert; it could be a totally different field and, if needed, you may have to generate your own ids in the ingest step.
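As a sketch of how ''static'' and ''param'' could combine, the Python below simply concatenates the prefix with the value of the chosen field and looks for collisions. Plain concatenation is an assumption suggested by the trailing dash in the prefix; the function names are hypothetical and are not the mapper's own.

```python
from collections import Counter
from typing import Dict, List


def build_iri(id_rule: Dict[str, str], item: Dict[str, object]) -> str:
    """Concatenate the static prefix with the item's unique field (assumed rule)."""
    return id_rule["static"] + str(item[id_rule["param"]])


def duplicate_iris(id_rule: Dict[str, str], items: List[Dict[str, object]]) -> List[str]:
    """Return the IRIs that would be generated more than once."""
    counts = Counter(build_iri(id_rule, item) for item in items)
    return [iri for iri, count in counts.items() if count > 1]


id_rule = {
    "static": "http://orange-labs.fr/fog/ont/og-display.owl#Display-",
    "param": "id",
}
items = [{"id": "42"}, {"id": "43"}, {"id": "42"}]

iri = build_iri(id_rule, items[0])
duplicates = duplicate_iris(id_rule, items)
```

A check like `duplicate_iris` is worth running at the end of your ingest step: if it returns anything, the chosen field is not unique and you need to generate your own ids.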
"_class" : {
"field_dependent" : false,
"value": "http://orange-labs.fr/fog/ont/og-display.owl#Display"
}
There are two possibilities when it comes to setting the class of an item: either it is static or dynamic. If it is static, you know that whatever the case, the item will always be of the same ontology class. If it is dynamic, you need to know which class applies in which situation, and you must add markers beforehand in the ingest step. A dynamic class configuration would look like this:
"_class" : {
"field_dependent" : true,
"field" : "model",
"map": {
"fitbit:charge 2": "http://orange-labs.fr/fog/ont/datashare.owl#FitBit",
"hue:LCT001": "http://orange-labs.fr/fog/ont/datashare.owl#PhilipsHue",
"hue:bridge": "http://orange-labs.fr/fog/ont/datashare.owl#PhilipsBridge",
"hue:PHDL00": "http://orange-labs.fr/fog/ont/datashare.owl#PhilipsSensor",
"lifx:lifx_color_a19": "http://orange-labs.fr/fog/ont/datashare.owl#Lifx",
"netatmo:NAWeatherStation": "http://orange-labs.fr/fog/ont/datashare.owl#WeatherStationSet",
"netatmo:NAModule1": "http://orange-labs.fr/fog/ont/datashare.owl#WeatherStationMain",
"netatmo:NAMain": "http://orange-labs.fr/fog/ont/datashare.owl#WeatherStationModule"
}
}
This is the configuration we use when injecting objects taken from Datashare. The item has a "model" attribute and we do a simple mapping: if the value of model is "fitbit:charge 2", we apply the class "http://orange-labs.fr/fog/ont/datashare.owl#FitBit".
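The static and dynamic cases can be summed up in a few lines of Python. This is a sketch of the rule as described here, not the mapper's code: `resolve_class` is a hypothetical name, and raising an error for an unmapped marker is an assumption, since the article does not say what the mapper does in that case.

```python
from typing import Dict


def resolve_class(class_rule: Dict, item: Dict) -> str:
    """Return the ontology class IRI to assign to an item."""
    if not class_rule["field_dependent"]:
        # Static case: every item gets the same class.
        return class_rule["value"]
    # Dynamic case: look up the marker found in the item.
    marker = item[class_rule["field"]]
    try:
        return class_rule["map"][marker]
    except KeyError:
        # Assumed behaviour: refuse items whose marker is not mapped.
        raise ValueError(f"no class mapped for {class_rule['field']}={marker!r}") from None


static_rule = {
    "field_dependent": False,
    "value": "http://orange-labs.fr/fog/ont/og-display.owl#Display",
}
dynamic_rule = {
    "field_dependent": True,
    "field": "model",
    "map": {
        "fitbit:charge 2": "http://orange-labs.fr/fog/ont/datashare.owl#FitBit",
        "hue:bridge": "http://orange-labs.fr/fog/ont/datashare.owl#PhilipsBridge",
    },
}

display_class = resolve_class(static_rule, {"id": "42"})
tracker_class = resolve_class(dynamic_rule, {"model": "fitbit:charge 2"})
```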
"_object_properties" : [
{ "generate_id" : true, "_mapping_id": "room", "field" : "room", "object_property_ori" : "http://orange-labs.fr/fog/ont/og-display.owl#isIn" },
{ "generate_id" : true, "_mapping_id": "storey", "field" : "floor", "object_property_ori" : "http://orange-labs.fr/fog/ont/og-display.owl#isIn" },
{ "generate_id" : true, "_mapping_id": "building", "field" : "building", "object_property_ori" : "http://orange-labs.fr/fog/ont/og-display.owl#isIn" }
],
Whenever you build an object property between two items, you need to ask yourself: is the IRI of the targeted item generated or not? If it is generated, you will have to find it, or at least give a way to reconstruct it, by specifying which mapping to apply to which attribute. The mapper will descend into the appropriate field with the given mapping and regenerate the IRI in order to build the object property.
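The "descend and regenerate" step for generated IRIs can be sketched as follows. The helper names, the triple representation and the choice to skip a field that is absent from the item are assumptions made for the example; only the shape of the rules comes from the mapping shown above.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def build_iri(mapping: Dict, mapping_id: str, item: Dict) -> str:
    """Rebuild an item's IRI from the _id rule of its sub mapping (assumed rule)."""
    id_rule = mapping[mapping_id]["_id"]
    return id_rule["static"] + str(item[id_rule["param"]])


def object_property_triples(mapping: Dict, mapping_id: str, item: Dict) -> List[Triple]:
    """Build (subject, object property, target) triples for generated targets."""
    subject = build_iri(mapping, mapping_id, item)
    triples = []
    for rule in mapping[mapping_id]["_object_properties"]:
        if not rule["generate_id"]:
            continue  # static targets are not covered by this sketch
        target_item = item.get(rule["field"])
        if target_item is None:
            continue  # assumed: a missing field simply produces no link
        # Descend into the field and regenerate the target's IRI.
        target = build_iri(mapping, rule["_mapping_id"], target_item)
        triples.append((subject, rule["object_property_ori"], target))
    return triples


mapping = {
    "display": {
        "_id": {"static": "http://orange-labs.fr/fog/ont/og-display.owl#Display-",
                "param": "id"},
        "_object_properties": [
            {"generate_id": True, "_mapping_id": "room", "field": "room",
             "object_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#isIn"},
        ],
    },
    "room": {
        "_id": {"static": "http://elite.polito.it/ontologies/dogont.owl#Room-",
                "param": "id"},
        "_object_properties": [],
    },
}

item = {"id": "42", "room": {"id": "r12"}}
triples = object_property_triples(mapping, "display", item)
```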
In some cases it can be static, simply because you have immutable individuals which are described directly in an ontology as OWL named individuals (owl:NamedIndividual). To bind to those individuals, do the following:
"_object_properties" : [
{ "generate_id" : true, "_mapping_id": "room", "field" : "room", "object_property_ori" : "http://orange-labs.fr/fog/ont/og-display.owl#isIn" },
{ "generate_id" : true, "_mapping_id": "storey", "field" : "floor", "object_property_ori" : "http://orange-labs.fr/fog/ont/og-display.owl#isIn" },
{ "generate_id" : true, "_mapping_id": "building", "field" : "building", "object_property_ori" : "http://orange-labs.fr/fog/ont/og-display.owl#isIn" },
{
"generate_id" : false,
"field" : ''"capabilities.data"'',
"object_property_ori" : "https://w3id.org/saref#IsUsedFor",
"map": {
"/devices/battery/level": "http://orange-labs.fr/fog/ont/datashare.owl#BatteryLevelProperty",
"/me/activity/steps": "http://orange-labs.fr/fog/ont/datashare.owl#StepsProperty",
"/me/activity/distance": "http://orange-labs.fr/fog/ont/datashare.owl#DistanceProperty",
"/me/activity/energy": "https://w3id.org/saref#Energy",
"/me/activity/elevation": "http://orange-labs.fr/fog/ont/datashare.owl#ElevationProperty",
"/me/body/heart/rate": "http://orange-labs.fr/fog/ont/datashare.owl#RateProperty",
"/me/sleep": "http://orange-labs.fr/fog/ont/datashare.owl#TimeProperty",
"/outdoor/air/temperature": "https://w3id.org/saref#Temperature",
"/outdoor/air/humidity": "https://w3id.org/saref#Humidity",
"/indoor/air/temperature": "https://w3id.org/saref#Temperature",
"/indoor/air/co2": "http://orange-labs.fr/fog/ont/datashare.owl#Co2Property",
"/indoor/air/humidity": "https://w3id.org/saref#Humidity",
"/indoor/air/pressure": "https://w3id.org/saref#Pressure",
"/indoor/noise": "http://orange-labs.fr/fog/ont/datashare.owl#NoiseProperty",
"/lights/state": "https://w3id.org/saref#Light"
}
}
],
Once again, this is an example taken from Datashare. Note the capabilities.data field: it is a specific case where you have a JSON Object embedded in an item, but there is no need to apply a specific mapping to it because that would mean building a separate individual. In this case the structure looks like this:
{
"capabilities" : {
"data" : [ "val1", "val2", "val3" ]
}
}
To handle that, use a dot (''.'') in the field name to mark the access to an attribute which is inside an embedded JSON object.
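The dotted access and the static ''map'' lookup can be sketched together in Python. The function names are hypothetical, and silently skipping values that have no entry in the map is an assumption, not documented behaviour of the mapper.

```python
from functools import reduce
from typing import Any, Dict, List


def get_by_path(item: Dict[str, Any], path: str) -> Any:
    """Follow a dotted path such as "capabilities.data" into embedded objects."""
    return reduce(lambda node, key: node[key], path.split("."), item)


def static_targets(rule: Dict[str, Any], item: Dict[str, Any]) -> List[str]:
    """Translate the values found at rule["field"] into named individual IRIs."""
    values = get_by_path(item, rule["field"])
    if not isinstance(values, list):
        values = [values]
    # Assumed: values without an entry in the map are ignored.
    return [rule["map"][value] for value in values if value in rule["map"]]


rule = {
    "generate_id": False,
    "field": "capabilities.data",
    "object_property_ori": "https://w3id.org/saref#IsUsedFor",
    "map": {
        "/me/activity/steps": "http://orange-labs.fr/fog/ont/datashare.owl#StepsProperty",
        "/lights/state": "https://w3id.org/saref#Light",
    },
}
item = {"capabilities": {"data": ["/me/activity/steps", "/lights/state", "/unknown"]}}

targets = static_targets(rule, item)
```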
"id": {
"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasID",
"type": "string"
},
Nothing much to say: here we apply the datatype property ''"http://orange-labs.fr/fog/ont/og-display.owl#hasID"'' to the attribute ''"id"'' and indicate that we expect to find a string value.
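To give an idea of what such a rule could produce, the Python sketch below turns the datatype rules of a sub mapping into Turtle predicate/object pairs. The exact output of the real mapper is not shown in this article, so the format, the `xsd:` typing and the helper names are assumptions.

```python
from typing import Dict, List

# Assumed correspondence between the mapping's "type" and XSD datatypes.
XSD_TYPES = {"string": "xsd:string", "integer": "xsd:integer"}


def escape(value: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def datatype_pairs(sub_mapping: Dict, item: Dict) -> List[str]:
    """Build '<property> "value"^^type' pairs for the fields present in the item."""
    pairs = []
    for field, rule in sub_mapping.items():
        # Attributes starting with "_" are not datatype rules.
        if field.startswith("_") or field not in item:
            continue
        value = item[field]
        text = str(int(value)) if rule["type"] == "integer" else escape(str(value))
        pairs.append(f'<{rule["datatype_property_ori"]}> "{text}"^^{XSD_TYPES[rule["type"]]}')
    return pairs


sub_mapping = {
    "_object_properties": [],
    "id": {"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasID",
           "type": "string"},
    "state": {"datatype_property_ori": "http://orange-labs.fr/fog/ont/og-display.owl#hasState",
              "type": "integer"},
}

pairs = datatype_pairs(sub_mapping, {"id": "d1", "state": 1})
```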
"_location": {
"latitude": "gpscoord2",
"longitude": "gpscoord1"
},
"_visibility": 0,