Integration of function executions within RML
Paper accepted at ESWC 2017

About

Mapping languages allow us to define how Linked Data is generated from raw data, but only if the raw data values can be used as is to form the desired Linked Data. Since complex data transformations remain out of scope for mapping languages, these steps are often implemented as custom solutions, or with systems separate from the mapping process. The former remain case-specific and are often coupled with the mapping, whereas the latter are not reusable across systems.

Hence, we propose an approach where data transformations

  1. are defined declaratively and
  2. are aligned with the mapping languages.

We align data transformations described using the Function Ontology (FnO) with mappings of data to Linked Data described using the RDF Mapping Language (RML). Our approach is not case-specific: data transformations are independent of their implementation and thus interoperable, while the functions are decoupled and reusable. This allows developers to improve the generation framework, whilst contributors can focus on the actual Linked Data, as there are no more dependencies, either between the transformations and the generation framework or between the transformations and their implementations.

This approach has been used to map and transform DBpedia in a declaratively defined and aligned way.

Example

We generate Linked Data using an RML document that generates ex:Person resources.

                    
<#Person_Mapping>
    rml:logicalSource <#LogicalSource> ;      # Specify the data source
    rr:subjectMap <#SubjectMap> ;             # Specify the subject
    rr:predicateObjectMap <#NameMapping> .    # Specify the predicate-object-map

<#NameMapping>
    rr:predicate dbo:title ;                  # Specify the predicate
    rr:objectMap [
        rml:reference "name"                  # Specify the reference within the data source
    ] .
                    
                

However, the original data values do not yet conform to the requested Linked Data format: the names are not correctly cased. Thus, we employ a function within the RML document.

The FnO description of the function toUppercase is as follows:

                    
grel:toUppercase a fno:Function ;
    fno:name "upper case" ;
    dcterms:description "return the input string in upper case" ;
    fno:expects ( [ fno:predicate grel:stringInput ] ) ;
    fno:returns ( [ fno:predicate grel:stringOutput ] ) .
                    
                

The execution of such a function would be described as follows:

                    
:exe a fno:Execution ;
    fno:executes grel:toUppercase ;
    grel:stringInput "This is an input STRING." ;
    grel:stringOutput "THIS IS AN INPUT STRING." .
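
To make the execution description more concrete, the following minimal Python sketch treats the function IRI as a key into a registry of implementations and builds a record that mirrors the fno:Execution above. The registry, the `execute` helper, and the dictionary layout are assumptions made for illustration only; FnO describes functions and executions, and leaves the binding to actual code to the processor.

```python
# Hypothetical registry: maps a function IRI (written as a prefixed name)
# to an implementation. This binding is an assumption of the sketch.
IMPLEMENTATIONS = {
    "grel:toUppercase": lambda string_input: string_input.upper(),
}


def execute(function_iri, string_input):
    """Return a dict mirroring the fno:Execution description."""
    implementation = IMPLEMENTATIONS[function_iri]
    return {
        "fno:executes": function_iri,
        "grel:stringInput": string_input,
        "grel:stringOutput": implementation(string_input),
    }


execution = execute("grel:toUppercase", "This is an input STRING.")
print(execution["grel:stringOutput"])  # THIS IS AN INPUT STRING.
```

Because the description only names the function and its parameters, the implementation behind the registry entry could be swapped without changing the description.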
                    
                

To connect this function with the RML mapping document, we make use of a fnml:FunctionMap. The fnml namespace is hosted at http://semweb.mmlab.be/ns/fnml#.

                    
<#Person_Mapping>
    rml:logicalSource <#LogicalSource> ;                  # Specify the data source
    rr:subjectMap <#SubjectMap> ;                         # Specify the subject
    rr:predicateObjectMap <#NameMapping> .                # Specify the predicate-object-map

<#NameMapping>
    rr:predicate dbo:title ;                              # Specify the predicate
    rr:objectMap <#FunctionMap> .                         # Specify the object-map

<#FunctionMap>
    fnml:functionValue [                                  # The object is the result of the function
        rml:logicalSource <#LogicalSource> ;              # Use the same data source for input
        rr:predicateObjectMap [
            rr:predicate fno:executes ;                   # Execute the function…
            rr:objectMap [ rr:constant grel:toUppercase ] # grel:toUppercase
        ] ;
        rr:predicateObjectMap [
            rr:predicate grel:stringInput ;
            rr:objectMap [ rml:reference "name" ]         # Use as input the "name" reference
        ]
    ] .

Instead of being used directly as the object, the name value is first passed as the grel:stringInput parameter to the grel:toUppercase function. The output of that function is then used as the object within <#NameMapping>.
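
The evaluation order can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample record, the `FUNCTIONS` registry, the dictionary encoding of the function map, and the helpers `evaluate_function_map` and `generate_triples` are all hypothetical and do not reflect how an actual RML processor is implemented. The parameter name follows the FnO description of grel:toUppercase.

```python
# Hypothetical implementations, keyed by function IRI (prefixed name).
FUNCTIONS = {
    "grel:toUppercase": lambda params: params["grel:stringInput"].upper(),
}

# Simplified, hypothetical encoding of <#FunctionMap>: which function to
# execute and which data reference feeds each parameter.
FUNCTION_MAP = {
    "executes": "grel:toUppercase",
    "parameters": {"grel:stringInput": "name"},
}


def evaluate_function_map(function_map, record):
    """Resolve the references against one record, then call the function."""
    params = {
        predicate: record[reference]
        for predicate, reference in function_map["parameters"].items()
    }
    return FUNCTIONS[function_map["executes"]](params)


def generate_triples(records):
    """Yield (subject, predicate, object) tuples for <#NameMapping>."""
    for record in records:
        obj = evaluate_function_map(FUNCTION_MAP, record)
        yield (record["id"], "dbo:title", obj)


# Made-up sample data; the "id" field stands in for the subject map.
records = [{"id": "ex:person1", "name": "jane doe"}]
triples = list(generate_triples(records))
print(triples)  # [('ex:person1', 'dbo:title', 'JANE DOE')]
```

The mapping side only knows the function IRI and the parameter predicates, so the same function map could be evaluated by any processor that has an implementation registered for grel:toUppercase.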

A full example of a DBpedia mapping file can be found below.


@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.

<#WikiSource> rml:source "http://en.wikipedia.org/wiki/Mapping_en:Infobox_country?oldid=35237";
    rml:referenceFormulation <http://semweb.mmlab.be/ns/ql#wikitext>;
    rml:iterator "Infobox" .

<http://en.dbpedia.org/resource/Mapping_en:Infobox_country> rml:logicalSource <#WikiSource>;
  rr:subjectMap [ rr:constant "http://en.dbpedia.org/resource/{{wikititle}}" ];
  rr:predicateObjectMap [
    rr:predicate <http://dbpedia.org/ontology/demonym>;
    rr:objectMap [
      <http://semweb.mmlab.be/ns/fnml#functionValue> [
        rr:subjectMap [];
        rr:predicateObjectMap [
          rr:predicate <http://dbpedia.org/function/propertyParameter>;
          rr:objectMap [ rml:reference "demonym" ]
        ], [
          rr:predicate <http://dbpedia.org/function/dataTypeParameter>;
          rr:objectMap [ rr:constant "rdf:langString" ]
        ], [
          rr:predicate <http://w3id.org/function/ontology#executes>;
          rr:objectMap [ rr:constant <http://dbpedia.org/function/simplePropertyFunction> ]
        ];
        rml:logicalSource <#WikiSource>
      ]
    ]
  ] .