Tutorial: Jena Semantic Web Framework

Today I was poking around the Jena RDF Framework, mostly to figure out how one would use an OWL reasoner there.

The Data

My test ontology is fairly simple:

@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix :     <http://www.whatever.com/#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .

:Person
      a       owl:Class .

:Opus
      a       owl:Class .

:Painting
      a       owl:Class ;
      rdfs:subClassOf :Opus .

:MasterPiece
      a       owl:Class ;
      rdfs:subClassOf :Painting .

:hasCreated
      a       owl:ObjectProperty .

:hasPainted
      a       owl:ObjectProperty ;
      rdfs:subPropertyOf :hasCreated .

:Genius
      a       owl:Class ;
      rdfs:subClassOf :Person ;
      owl:equivalentClass
              [ a       owl:Restriction ;
                owl:onProperty :hasCreated ;
                owl:someValuesFrom :MasterPiece
              ] .

I defined the Genius class only to give the reasoner something to classify: it is a subclass of Person, and anything which hasCreated a MasterPiece has to end up in it. To make it a bit harder on the inferencer, all instance data uses hasPainted, a specialization of hasCreated:

:VanGogh
      a       :Person ;
      :hasPainted :TheOldMill ;
      :hasPainted :SunFlowers .

:TheOldMill
      a       :MasterPiece .

:SunFlowers
      a       :Painting .

Obviously, TheOldMill is also an instance of Painting, and since it is a MasterPiece I expected the inferencer to tell me that VanGogh is a Genius.
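The chain of inferences expected here can be spelled out by hand. What follows is only a minimal plain-Java sketch of the three steps involved (subproperty, subclass, someValuesFrom restriction). It does not use Jena at all, and the class GeniusSketch with its helper methods is made up for illustration:

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone sketch: NOT Jena code, just the three inference
// steps the ontology calls for, applied by hand to the instance data.
class GeniusSketch {

    // Asserted schema links: child -> parent.
    static final Map<String, String> SUB_CLASS_OF = Map.of(
        "Painting", "Opus",
        "MasterPiece", "Painting",
        "Genius", "Person");
    static final Map<String, String> SUB_PROPERTY_OF = Map.of(
        "hasPainted", "hasCreated");

    // Asserted instance data: rdf:type per individual, plus the statements
    // as (subject, property, object) triples.
    static final Map<String, String> TYPE_OF = Map.of(
        "VanGogh", "Person",
        "TheOldMill", "MasterPiece",
        "SunFlowers", "Painting");
    static final List<String[]> STATEMENTS = List.of(
        new String[] {"VanGogh", "hasPainted", "TheOldMill"},
        new String[] {"VanGogh", "hasPainted", "SunFlowers"});

    // Walk a child -> parent map upwards; true if 'ancestor' is reached.
    static boolean reaches(Map<String, String> parents, String start, String ancestor) {
        for (String cur = start; cur != null; cur = parents.get(cur)) {
            if (cur.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    // Subclass step: an instance of a class is an instance of its superclasses.
    static boolean isInstanceOf(String individual, String cls) {
        return reaches(SUB_CLASS_OF, TYPE_OF.get(individual), cls);
    }

    // Restriction step: Genius is anything which hasCreated some MasterPiece,
    // where hasPainted counts as hasCreated (subproperty step).
    static boolean isGenius(String individual) {
        for (String[] s : STATEMENTS) {
            if (s[0].equals(individual)
                    && reaches(SUB_PROPERTY_OF, s[1], "hasCreated")
                    && isInstanceOf(s[2], "MasterPiece")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("TheOldMill is a Painting: " + isInstanceOf("TheOldMill", "Painting"));
        System.out.println("VanGogh is a Genius: " + isGenius("VanGogh"));
    }
}
```

A real OWL reasoner works on triples and rules rather than hard-wired maps, so this only shows which conclusions are expected, not how Jena derives them.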

Inspecting an Ontology

Since I am completely new to Jena (and also new to the boring language), I thought it would be wiser to start with simple tests, such as loading the ontology first and then trying to inspect it:

import java.util.*;
import com.hp.hpl.jena.ontology.*;
import com.hp.hpl.jena.rdf.model.*;
import com.hp.hpl.jena.util.*;
...
OntModel onto = ModelFactory.createOntologyModel(
                   OntModelSpec.OWL_MEM, null );

With that I first created a completely empty ontology object, one which sits in memory (so it does not persist) and which does not include any cleverness in terms of inferencing. That is exactly what the OWL_MEM recipe within the OntModelSpec says.

After creating it, Jena would allow me to add classes and property definitions via code, but it is clearly more comfortable to take it from a document:

onto.read( "file:art.n3", "N3" );

Jena treats this onto object not only as an ontology, but also as an RDF model, which it clearly is. In this sense I could use all the model methods for introspection. In addition to these there are also methods which convey the ontology aspect, for instance listHierarchyRootClasses:

import com.hp.hpl.jena.util.iterator.Filter;
...
Iterator i = onto.listHierarchyRootClasses()
        .filterDrop( new Filter() {
            public boolean accept( Object o ) {
                return ((Resource) o).isAnon();
            }} );

while (i.hasNext()) {
    System.out.println(i.next().toString());
}

While this iterator business looks a bit arcane (in Perl this would be all one single line), it seems to be the Java way to

  • extract the list of all top-level classes (those which are direct subclasses of owl:Thing), and
  • filter out those which are anonymous classes.

Here I simply chose to output what I found.
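For comparison, here is the same drop-what-matches idea in plain Java with a lambda, which gets much closer to a one-liner. This is a minimal sketch without Jena: the class DropAnonymous is made up, the nodes are plain strings, and marking anonymous nodes with a leading "_:" is an assumption of the sketch, not something Jena's isAnon() relies on:

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-alone sketch: plain Java collections, no Jena involved.
class DropAnonymous {

    // Assumption for this sketch: anonymous (blank) nodes are labelled "_:...".
    static boolean isAnon(String node) {
        return node.startsWith("_:");
    }

    // Keep every node for which isAnon() is false, preserving the order.
    static List<String> dropAnonymous(List<String> nodes) {
        return nodes.stream()
                    .filter(node -> !isAnon(node))
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> roots = List.of(
            "http://www.whatever.com/#Person",
            "_:b0",   // made-up label standing in for an anonymous class
            "http://www.whatever.com/#Opus");
        for (String root : dropAnonymous(roots)) {
            System.out.println(root);
        }
    }
}
```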

It took me some reading to figure out how I could get hold of the object for a particular OWL class. The scary monster

static final String artNS   = "http://www.whatever.com/#";
...
OntClass opus = (OntClass) onto.getResource (artNS + "Opus")
                               .as( OntClass.class );

can be thankfully reduced to a less insane

OntClass opus = onto.getOntClass (artNS + "Opus");

The reason I wanted this object is that I now can directly test (or ask) for particular subclasses

if (opus.hasSubClass (onto.getOntClass (artNS + "Painting"))) {
    ...

or for other interesting things about an OntClass.

I was not overly surprised to see that this did not write out hello subclass here:

if (opus.hasSubClass (onto.getOntClass (artNS + "MasterPiece"))) {
    System.out.println("hello subclass");
}

For that the onto would have to do at least taxonomic reasoning, something I have not yet turned on.

Taxonomic Reasoning

This also holds true for finding instances. To experiment with these, I could create some like this:

onto.createIndividual (artNS + "SunFlowers",
                       onto.getOntClass (artNS + "MasterPiece"));

or better load them again from a file:

onto.read( "file:paintings.n3", "N3" );

As expected, listing all instances of Opus will not reveal much:

for (Iterator bs = opus.listInstances(); bs.hasNext(); ) {
     System.out.println("instance " + bs.next().toString());
}

Actually nothing at all. This makes perfect sense if you need the ontology as is, i.e. the pure, asserted information.

As soon as you want Jena to honor the subclass transitivity, you have to create your onto slightly differently in the first place:

OntModel onto = ModelFactory.createOntologyModel(
                    OntModelSpec.OWL_MEM_MICRO_RULE_INF, null );

The OWL_MEM_MICRO_RULE_INF from the OntModelSpec makes Jena use the micro OWL rules inference engine which seems sufficient for doing that trick, as

for (Iterator bs = opus.listInstances(); bs.hasNext(); ) {
     System.out.println("instance " + bs.next().toString());
}

now returns all instances of Opus, direct or indirect ones.
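The difference between the asserted view and the one with subclass transitivity can be sketched in plain Java. This is not how Jena implements it: the class SubclassClosure and its methods are made up, and two small maps stand in for the model:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-alone sketch: plain Java maps instead of a Jena OntModel.
class SubclassClosure {

    // Asserted rdfs:subClassOf links from the ontology: child -> parent.
    static final Map<String, String> SUB_CLASS_OF = Map.of(
        "Painting", "Opus",
        "MasterPiece", "Painting");

    // Asserted rdf:type of each individual.
    static final Map<String, String> TYPE_OF = Map.of(
        "TheOldMill", "MasterPiece",
        "SunFlowers", "Painting");

    // True if sub equals cls or reaches cls by following subClassOf links upwards.
    static boolean isSameOrSubClass(String sub, String cls) {
        for (String cur = sub; cur != null; cur = SUB_CLASS_OF.get(cur)) {
            if (cur.equals(cls)) {
                return true;
            }
        }
        return false;
    }

    // Instances of cls: only directly asserted ones, or also inherited ones.
    static Set<String> instancesOf(String cls, boolean transitive) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, String> e : TYPE_OF.entrySet()) {
            boolean match = transitive
                ? isSameOrSubClass(e.getValue(), cls)
                : e.getValue().equals(cls);
            if (match) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("asserted only: " + instancesOf("Opus", false));
        System.out.println("with closure:  " + instancesOf("Opus", true));
    }
}
```

With the transitive flag off, Opus has no instances, because nothing is typed as Opus directly. With it on, both paintings show up.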

Jena provides several more levels of sophistication regarding reasoning, but I stopped there.

Real Reasoners

Instead I wanted to test whether a real reasoner would find VanGogh to be a Genius. To set up this experiment one has to follow the architecture in

http://kill.devc.at/system/files/reasoning.jpg

To quote the Jena documentation:

Applications normally access the inference machinery by using the ModelFactory to associate a data set with some reasoner to create a new Model. Queries to the created model will return not only those statements that were present in the original data but also additional statements than can be derived from the data using the rules or other inference mechanisms implemented by the reasoner.

So, to get one of these reasoners, the ReasonerRegistry is the place to go:

Reasoner reasoner = ReasonerRegistry.getOWLMiniReasoner();

To tailor the generic reasoner to the rules of our ontology both have to be combined to create a new reasoner:

reasoner = reasoner.bindSchema (onto);

To use that together with instance data, we have to load the data into a separate model:

Model instances = ModelFactory.createDefaultModel();
instances.read ("file:art.n3", "N3");

I also found an alternative way to load such a model by using the FileManager utility:

import com.hp.hpl.jena.util.FileManager;
...
FileManager.get().loadModel("file:art.n3");

That is actually quite clever as it supports encoding, caching and mapping of URIs to files to avoid constant dereferencing of documents over the Internet.

Now it is time to duct-tape the reasoner and the instance data together:

InfModel model = ModelFactory.createInfModel (reasoner, instances);

The model is virtual in the sense that it contains not only the triples from the instance data, but also all inferred ones as mandated by the reasoner. Interestingly enough, any instance data within your onto object is also honored. In our case we had already loaded instance data there.

As a first cautious experiment we walk through the triples. Not all of them, but only those which emerge from the subject VanGogh:

Resource r = model.getResource (artNS + "VanGogh");
for (StmtIterator sti = model.listStatements(r, null, (RDFNode) null);
     sti.hasNext(); ) {
     Statement stmt = sti.nextStatement();
     System.out.println(" - " + PrintUtil.print(stmt));
}

To make the PrintUtil visible, I had to load

import com.hp.hpl.jena.util.*;

first. As output I get exactly what I wanted to see (URIs shortened):

- (#VanGogh #hasPainted #SunFlowers)
- (#VanGogh #hasPainted #TheOldMill)
- (#VanGogh rdf:type #Person)
- (#VanGogh #hasCreated #SunFlowers)
- (#VanGogh #hasCreated #TheOldMill)
- (#VanGogh rdf:type owl:Thing)
- (#VanGogh rdf:type -54b8320:1166309f0d2:-7ffb)
- (#VanGogh rdf:type rdfs:Resource)
- (#VanGogh rdf:type #Genius)
- (#VanGogh owl:sameAs #VanGogh)

Well, at least mostly.

Using SPARQL

Iterating with a statement template through the whole triple set is not overly comfortable, so let us try to test the SPARQL support of Jena. Setting up a query is slightly cumbersome if you are not conditioned to the usual Java drivel:

import com.hp.hpl.jena.query.*;

QueryExecution qe = QueryExecutionFactory.create (
                   "SELECT ?opus "+
                   "WHERE { "  +
                   "   <http://www.whatever.com/#VanGogh> " +
                   "   <http://www.whatever.com/#hasCreated> ?opus}",
           model);

(I will probably never understand why all the imports have to be at the beginning of the class file and why Java does not seem to have multi-line strings.)
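On the multi-line string point: Java 15 and later do have text blocks. The following is a minimal sketch of how the query string could be written that way. The class QueryText is made up, no Jena is involved, and it needs a Java 15 or newer compiler:

```java
// Hypothetical stand-alone sketch: only builds the query text, runs no query.
class QueryText {

    // Text block: the common leading indentation is stripped by the compiler,
    // and every line break inside the block becomes a "\n" in the string.
    static final String QUERY = """
        SELECT ?opus
        WHERE {
           <http://www.whatever.com/#VanGogh>
           <http://www.whatever.com/#hasCreated> ?opus
        }
        """;

    public static void main(String[] args) {
        System.out.print(QUERY);
    }
}
```

The resulting String is an ordinary one, so it could be passed wherever the concatenated version is used.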

In any case, the query execution thing can be executed to give a ResultSet iterator:

ResultSet rs = qe.execSelect();

All elements of this iterator are bindings for SPARQL variables. To access them is quite straightforward:

for (ResultSet rs = qe.execSelect() ; rs.hasNext() ; ) {
     QuerySolution binding = rs.nextSolution();
     System.out.println("Opus: " + binding.get("opus"));
}

And it also returns all the instances, direct or indirect.

More importantly, the reasoner also figured out correctly that VanGogh is a Genius, so the SPARQL query

SELECT ?person
WHERE {
  ?person a  <http://www.whatever.com/#Genius>
}

worked:

Genius: http://www.whatever.com/#VanGogh

Goodie.

Regarding the iterator

Regarding the iterator business: This really is almost unbearably ugly. I hate writing this sort of Java code, but often there is no more reasonable way.

Another thing: import in Java is actually a misnomer. It doesn't really import anything, it just adds another namespace to be searched for class names during compilation. That's all.

Anonymous | Wed, 11/21/2007 - 23:47

Oh, and by the way, I'm Lars

Oh, and by the way, I'm Lars Marius Garshol. I didn't really intend to be anonymous, but I seem to have no choice.

Anonymous | Wed, 11/21/2007 - 23:48

Re: Anonymous

I switched on Subject: and stuff in Drupal.

rho | Fri, 11/23/2007 - 18:40

Jena 2.5.3 does not return VanGogh as a Genius

I have not been able to get Jena to return VanGogh as a Genius.

Any idea why this might be? I am using Jena 2.5.3.

Benito (not verified) | Wed, 07/23/2008 - 03:19

Re: Jena 2.5.3 does not return VanGogh as a Genius

I wrote this while testing 2.5.4.

I cannot see anything in the release notes which would affect the behaviour, so I would suspect a problem with your program.

rho | Wed, 07/23/2008 - 06:08

Are there any other queries that may show up the problem?

Perhaps there is some other SWRL that may reveal the problem. Any suggestions? All the other queries and outputs look identical.

Benito (not verified) | Thu, 07/24/2008 - 02:56