CatBert on F{}OWL Ontologies (Pilot)

My CatBert has become quite recalcitrant lately. From what I understood from its musings, it is all because of the bad jokes I used to crack about it. And also the cruel schroedinger'sch thought experiments I conducted.

Virtually conducted, mind you. But people never believe anything I say.

http://kill.devc.at/system/files/angry-cat.jpg

After a long, heated debate I conceded CatBert the right to vent its anger, even publicly on this very blog. So what follows is CatBert's first public statement. Apologies for the talk-down style, but as you know: a cat is a cat is a cat.

And hence CatBert writes:


We cat-a-log-icians know that most humans have a rather skewed perception of the universe. While in daily life that thankfully stays at the subliminal level, it becomes annoyingly apparent when one looks at human-made ontologies.

Random Example Ontology

Let us, for instance, look at the gastronomy ontology which has been created as part of the eb-semantics initiative. It is supposed to further eCommerce along the Semantic Web paradigm.

Side note: You may have to fix the broken OWL file before importing it into any tool.

Un-RDFish Relationalism

Some example instance data on that page reveals facts about a well-known Cafe in Vienna:

:Cafe_Einstein a gastro:Inn .

That by itself looks harmless, but then I find

:Cafe_Einstein gastro:town     "Wien" .
:Cafe_Einstein gastro:street   "Rathausplatz" .
:Cafe_Einstein gastro:streetNr "4" .

Now, it may be unwise to name properties town or street as these nouns could mislead someone to believe that they are meant as classes. But it surely is unwise to not model this obvious N-ary relation with a blank node, say of type address. Amazingly enough, the ontology contains a property hasAddress. Eeek.
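For the slower humans, a minimal sketch of what the blank-node variant might look like. Only gastro:hasAddress is taken from the ontology; the ex: class and property names are hypothetical, and all namespace URIs are placeholders, since the real ones are not given here.

```turtle
# Placeholder namespaces, not the real ontology URIs.
@prefix :       <http://example.org/data#> .
@prefix gastro: <http://example.org/gastro#> .
@prefix ex:     <http://example.org/sketch#> .

:Cafe_Einstein a gastro:Inn ;
    gastro:hasAddress [
        a ex:Address ;                  # hypothetical class
        ex:locality    "Wien" ;         # hypothetical properties, named so
        ex:streetName  "Rathausplatz" ; # that they do not read like classes
        ex:houseNumber "4"
    ] .
```

The address parts now hang together on one node, so a second address could be added without the street of one getting mixed up with the town of the other.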

This relational database style of data modelling happily continues:

:Cafe_Einstein gastro:wlan      "false" .
:Cafe_Einstein gastro:billard   "false" .

That is not overly pretty, as it raises the question of what the difference is between the property existing with a boolean value, the property not existing (welcome to OWA), and the property existing with conflicting values, such as true and false. The ontology itself remains silent on that.

And does this really mean what it says?

:Cafe_Einstein gastro:liveMusic "true" .

Namely, that there is live music? All the time? Just like the billiard table, or the WLAN? And what if I wanted to add information about the band?
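One possible way out, sketched under heavy assumptions: treat live music as an event hosted by the cafe, so that the band and the time have somewhere to live. Every ex: term, the date, and the band label below are made up for illustration; none of them comes from the gastronomy ontology.

```turtle
# Placeholder namespaces and invented terms, for illustration only.
@prefix :     <http://example.org/data#> .
@prefix ex:   <http://example.org/sketch#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

:Cafe_Einstein ex:hostsEvent [
    a ex:LiveMusicEvent ;
    ex:startsAt  "2009-01-31T20:00:00"^^xsd:dateTime ;  # made-up date
    ex:performer [ a ex:Band ; rdfs:label "Placeholder Band" ]
] .
```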

Also the simplistic

:Cafe_Einstein gastro:noSmoking "false" .
:Cafe_Einstein gastro:noSmoking "true" .

is exactly that: simplistic, at least according to the current legislation, which allows businesses to have separate smoking and non-smoking areas.

And, hurray, there are also data properties gastro:website and gastro:picture. If you do not know why this is funny, then, well, you are obviously no cat-alogician.

Ungeographical

Then it loosely associates the cafe with a city:

:Cafe_Einstein gastro:belongsToRegion eb:Wien .

Clearly this is something geographical, so one might have expected a geo:isLocatedIn property or similar. But for some reason that did not happen. Instead, the property gastro:belongsToRegion is really highly specialized:

gastro:belongsToRegion rdfs:domain gastro:Inn .
gastro:belongsToRegion rdfs:range  gastro:Region .

reducing the chance of a successful merge even further.

Finally Wien gets its label. But not via the predefined rdfs:label; no, via another dedicated name property:

eb:Wien a gastro:Region .
eb:Wien gastro:name "Wien" .

Sheesh.
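What a more merge-friendly version might look like, as a rough sketch: rdfs:label is the standard RDF Schema property, while geo:isLocatedIn is only the hypothetical property mentioned above, and all namespace URIs except the rdfs one are placeholders.

```turtle
# Placeholder namespaces; only rdfs: is the real one.
@prefix :       <http://example.org/data#> .
@prefix gastro: <http://example.org/gastro#> .
@prefix eb:     <http://example.org/eb#> .
@prefix geo:    <http://example.org/geo#> .   # hypothetical vocabulary
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .

:Cafe_Einstein geo:isLocatedIn eb:Wien .

eb:Wien a gastro:Region ;
    rdfs:label "Wien"@de .
```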

Atypical Taxonometry

But my hair gets all spiky whenever I see properties named hasWhateverKind or hasWhateverCategory:

:Cafe_Einstein gastro:hasInnCategory gastro:Cafe .

In the light that the whole point of these OWL ontologies is to categorize things into classes, a hasCategory reveals some, uhm, massive misconceptions.

What I would have understood is a straightforward

:Cafe_Einstein a gastro:Cafe .

the Cafe being a subclass of Inn. But, no, it actually gets worse, when you consult the OWL ontology. There the Cafe is an instance of a category:

gastro:Cafe     a gastro:InnCategory .
gastro:Pizzeria a gastro:InnCategory .

All this bypasses the whole taxonometric mechanism of OWL. Urgh, I mean, miau.
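For comparison, a sketch of the same information expressed with plain classes, which is all the taxonometry OWL needs. The class names are those used above; the namespace URIs for gastro and the instance data are placeholders.

```turtle
# Placeholder namespaces for gastro: and the default prefix.
@prefix :       <http://example.org/data#> .
@prefix gastro: <http://example.org/gastro#> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .

gastro:Inn      a owl:Class .
gastro:Cafe     a owl:Class ; rdfs:subClassOf gastro:Inn .
gastro:Pizzeria a owl:Class ; rdfs:subClassOf gastro:Inn .

# One triple instead of hasInnCategory; a reasoner can then
# infer that :Cafe_Einstein is also a gastro:Inn.
:Cafe_Einstein a gastro:Cafe .
```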

And the Ontology?

Ok, first there is a harmless redundancy, namely

Inn rdfs:subClassOf Thing .
Inn rdfs:subClassOf LocationOfSalesOrServiceProvisioning .

But I got seriously confused by the following (good being the prefix for the external GoodRelations ontology; I will deal with that one later):

gastro:Company rdfs:subClassOf good:BusinessEntity .

In which way, then, is the Inn connected with the Company? Or is it? If not, why not?

Then the ontology goes on to say something about a FoodOrBeverage class:

gastro:Beverage rdfs:subClassOf gastro:FoodOrBeverage .
gastro:Food     rdfs:subClassOf gastro:FoodOrBeverage .

Are these two now meant to be disjoint? Probably, but then I also would miss a covering axiom, so that both subclasses together exhaust their superclass.
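In case it is not obvious what such axioms would look like, here is a sketch. It assumes that Food and Beverage really are meant to be disjoint and exhaustive, which the ontology does not say; the gastro namespace URI is a placeholder.

```turtle
# Placeholder namespace for gastro:.
@prefix gastro: <http://example.org/gastro#> .
@prefix owl:    <http://www.w3.org/2002/07/owl#> .

# Disjointness: nothing is both a Food and a Beverage.
gastro:Food owl:disjointWith gastro:Beverage .

# Covering axiom: every FoodOrBeverage is a Food or a Beverage.
gastro:FoodOrBeverage owl:equivalentClass [
    a owl:Class ;
    owl:unionOf ( gastro:Food gastro:Beverage )
] .
```

The equivalence also implies the two subclass statements above, so those would become redundant.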

But outright humanly-bizarre is that

gastro:FrenchCuisine a gastro:Cuisine .
gastro:Cuisine rdfs:subClassOf FoodOrBeverageCategory .

It makes my whiskers curl.

Only Dog Food?

If this is meant as a water-tight industry ontology, then it has missed its mark by several underwater miles. It

  • uses confusing concepts,
  • introduces its own homemade taxonometric terms,
  • and has not a single DL axiom to beat at least some semantics into it.

Worse, it has all the hallmarks of good old relational DB modelling. Welcome to the 1980s. Now there is a challenge for ontology mediation.

Humans. Pah.

And now back to that slave who is supposed to feed me.



Cuisine Type

Very interesting comments on the gastro ontology. Can you explain in more detail

gastro:FrenchCuisine a gastro:Cuisine .
gastro:Cuisine rdfs:subClassOf FoodOrBeverageCategory .

how you would design this concept? Do you see cuisine as a physical object and therefore not a subclass of FoodOrBeverageCategory?

Christoph Grün (not verified) | Fri, 01/23/2009 - 14:13

Re: Cuisine Type

how you would design this concept? Do you see cuisine as a physical object and therefore not a subclass of FoodOrBeverageCategory?

Took me a while to get this out of CatBert (he sleeps a lot). He says that a gastro:Cuisine never can be something physical. He says physical things are overrated anyway.

He continues to lecture me that gastro:Cuisine characterizes a collection of ConsumableThings, which - he continues - links that directly into the OWL Guide.

A collection is - of course - modelled in OWL as a class, so

gastro:Cuisine rdfs:subClassOf food:ConsumableThing .

French cuisine would select a subset of the cuisine, so this is another subclassing.
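Put together, my reading of CatBert's proposal would be roughly the following sketch. The namespace URIs are placeholders (food: stands for the food ontology of the OWL Guide), and the dish instance is invented for illustration.

```turtle
# Placeholder namespaces; the instance is made up.
@prefix :       <http://example.org/data#> .
@prefix gastro: <http://example.org/gastro#> .
@prefix food:   <http://example.org/food#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .

gastro:Cuisine       rdfs:subClassOf food:ConsumableThing .
gastro:FrenchCuisine rdfs:subClassOf gastro:Cuisine .

# A concrete dish would then be an instance of the subclass.
:some_coq_au_vin a gastro:FrenchCuisine .
```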

Then he made a long pause, indicating that something wise is supposed to follow:

  • "Modelling with OWL is modelling with classes. This is the only valuable concept one has. Of course, you can stray from the Golden Path and model your own collections, categories, etc. But then you throw out of the window the only valuable asset you have in OWL."

Maybe CatBert has a point here. But maybe this is exactly OWL's Achilles heel. That happens when you model the world with Duplo.

rho | Sun, 01/25/2009 - 11:22

Categories

Kind regards to your cat, the next time it is awake. I have another question, and I hope it is not too bored by it. If I have subclasses of accommodation, such as:
- Accommodation
- Apartment
- BedAndBreakfast
- Hostel
- Hotel
- etc.

your cat proposes not to model categories as classes of their own, but as restrictions on the existing classes (subsets). Is it a good idea to define the different star categories (1-star, 2-star, etc.) as direct subclasses of Accommodation?
such as:
- Accommodation
- Apartment
- BedAndBreakfast
- Hostel
- Hotel
- Accommodation1Star
- Accommodation2Star
- etc...

... if we do not consider the different star-categories for the different types (hotel has another star category than hostel, for example).

Thanks a lot!

Christoph Grün (not verified) | Mon, 01/26/2009 - 12:00

Re: Categories

I talked briefly with CatBert, just before he disappeared into the garden to hunt. Hunting bad ontologies, is my guess.

your cat proposes not to model categories as classes of their own, but as restrictions on the existing classes (subsets). Is it a good idea to define the different star categories (1-star, 2-star, etc.) as direct subclasses of Accommodation?

CatBert says "that's the way", but "you have to be clever".

What he means, I think, is that Accommodation1Star should not be explicitly subclassed under Accommodation, but rather defined via a

  • there exists at least one "rating with one star"

definition (using DL).

The rating property would then go from Accommodation to a value partition (or a defined set). Implicitly, the above class would then become a subclass of Accommodation.

There is, of course, the issue of having several rating properties for one particular accommodation, but rating could be made functional...
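Until he is back, here is my own rough sketch of what I think he means. All names in the ex: namespace are my assumptions, and the value partition is modelled as a simple enumerated class.

```turtle
# Everything under ex: is hypothetical.
@prefix ex:   <http://example.org/accommodation#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:Accommodation a owl:Class .

# The "defined set" of possible ratings.
ex:StarRating a owl:Class ;
    owl:oneOf ( ex:OneStar ex:TwoStars ex:ThreeStars ) .

# Functional: at most one rating per accommodation.
ex:hasRating a owl:ObjectProperty , owl:FunctionalProperty ;
    rdfs:domain ex:Accommodation ;
    rdfs:range  ex:StarRating .

# "There exists a rating with one star", as a defined class.
ex:Accommodation1Star a owl:Class ;
    owl:equivalentClass [
        a owl:Restriction ;
        owl:onProperty ex:hasRating ;
        owl:hasValue   ex:OneStar
    ] .
```

Because of the rdfs:domain, anything with a rating is an Accommodation, so a reasoner should place Accommodation1Star under Accommodation without being told. For the functional property to actually flag two conflicting ratings, the rating individuals would also have to be declared mutually different.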

I'll check with CatBert when (if) he comes back.

rho | Mon, 01/26/2009 - 21:27

Cuisine

I'm afraid that you misunderstood Catbert here. His initial characterization is that "a cuisine can never be something physical". But then you go on to make Cuisine a subclass of food:ConsumableThing, which contradicts what Catbert just said. Every instance of ConsumableThing is physical, so every instance of any of its subclasses (such as gastro:Cuisine) is physical as well.

I agree with Catbert. The last bite of eggplant I ate last night was not an instance of Cuisine. Cuisine turns out to be a meta-class; its instances are classes. This cannot be modeled in OWL-DL, but can in OWL-Full. Instances of Cuisine would be FrenchCuisine, EthiopianCuisine, ThaiCuisine, etc. Instances of THOSE would be physical objects.

Meta-classes are computationally expensive to deal with. They are one reason why creating ontologies is difficult. The tools for creating them are too restrictive, so people mis-ontologize.

dougf (not verified) | Tue, 03/10/2009 - 20:01

Categories

I agree that this ontology stinks. No question about it.

However, sometimes you do want to separate categorization from the formal types of your ontology. I do this, for example, for XML tools. The reason is that very often these categories are kind of fuzzy and fluid, and so treating them as categories can make life easier.

All tools really are Tools, no question about that. But a lot of them are very difficult to pin down with any precision, and don't really belong to any clear-cut category. And when none of the subclasses you could define under Tool add any extra properties or structural rules it's hard to see what benefit one would get from blowing up the size of the ontology.

But the way categories are handled in this ontology stinks anyway.

Lars Marius Garshol (not verified) | Fri, 01/23/2009 - 16:29

Re: Categories

All tools really are Tools, no question about that. .... And when none of the subclasses you could define ... add any extra properties or structural rules it's hard to see what benefit one would get ....

I know that CatBert would tilt his head now and would remark that subclasses in OWL do not hurt either and that they could help in merging situations.

But a proliferation of classes itself is useless, as long as there is no sufficient criterion to classify something into them. CatBert nonchalantly refers to the Tractatus (thesis 7).

But then he would also say that these problems are all symptoms of OWL being a single-paradigm formalism. In the same way, Java programs look pretty contorted, because the OOish nature forces the programmer to warp every piece of information along this one single paradigm. In the same way, Haskell programs can look absurd.

CatBert (this is why I still love him) would conclude that languages such as Perl may look dirty and confusing to humans. But this is because they are multi-paradigmatic.

I have to say, I'm pretty impressed with CatBert. As annoying as he is most of the time.

rho | Sun, 01/25/2009 - 11:51

gastro:Company rdfs:subClassOf good:BusinessEntity

Hi,

as for your question:

This is because the ebSemantics ontology imports the GoodRelations ontology (http://purl.org/goodrelations/), and GoodRelations links Location(s)OfSalesProvisioning and BusinessEntities via the respective Offering instance only.

Initially, it was not planned to use BusinessEntity and LocationOfSalesProvisioning without also expressing an Offering, e.g. just to model a company and its shops. The next service update of GoodRelations, however, will include a respective property so that one can link a shop directly to the business entity operating it.

I agree that some parts of the ebsemantics ontology are not modeled in perfect beauty. However, for the general criticism, keep in mind two things:

1. A good, business-relevant ontology must find a compromise between existing conceptual structures in the domain (e.g. database schemas in popular software or popular XML schema definitions), because otherwise users cannot easily populate your ontology. If you make too subtle modeling choices, you would require owners of legacy data to lift the existing resources in a very labor-intensive process to your ideal structures in perfect beauty. And you should first give evidence that -- on a Web scale -- that effort is justified by later savings.

See
http://kill.devc.at/system/files/iswc-lightning-talk-hepp3-small.png

for an explanation.

Simple example: If many databases store first name and last name in a single string, then a clean ontology that requires both to be separate elements is very hard to populate - at least automatically.

2. As for the lack of disjointness axioms etc.: Well, these usually do no harm, but the exact contribution of DL reasoning to data integration on a Web scale is still pretty unclear.

The most popular Web vocabularies like FOAF etc. are on the lightweight side of the game, and there may be some reasons for that.

See
http://pingthesemanticweb.com/stats/namespaces.php

for a popularity ranking of namespaces / SW vocabularies

and

"Possible Ontologies: How Reality Constrains the Development of Relevant Ontologies" in: IEEE Internet Computing, Vol. 11, No. 1, pp. 90-96, available at

http://www.heppnetz.de/files/IEEE-IC-PossibleOntologies-published.pdf

for a discussion of that effect.

Best
Martin Hepp
http://www.heppnetz.de

Martin Hepp (not verified) | Mon, 01/26/2009 - 16:00

Re: gastro:Company rdfs:subClassOf good:BusinessEntity

Martin,

Thanks for all your links. I (rho) will pass this on to the catalogical CatBert. I know that he was also quite unhappy with the GoodRelations ontology itself. So there might be a followup post.

OTOH CatBert may be on a quest for ontological beauty.

PS: I (rho) have re-edited your comment to use a smaller version of the referenced image.

rho | Mon, 01/26/2009 - 16:19

Good Relations Ontology

Dear Martin,
in your GoodRelations ontology you defined the concept ProductOrService. Do you have a definition of those two concepts? When is something a product and when a service? Thanks a lot, Christoph

Christoph Grün (not verified) | Mon, 01/26/2009 - 17:11

Good Relations

I came across a definition of Product (cf. Classifying Services to Gain Strategic Marketing Insights, Lovelock 2001) and Ziv Baida (http://www.baida.nl/Assets/papers/2006_Baida_PhD_BW.pdf) from IBM.

Product is defined as “the core output of any type of industry”; goods can be described as “physical objects or devices”, and “services are actions or performances” (Lovelock 2001).

Offering in the GoodRelations ontology is defined as the announcement by a Business Entity to provide a certain Business Function for a certain Product or Service Instance to a specified target audience. In the tourism domain, time or season plays an important role, because an offering of a hotel room can be targeted at a certain customer and may depend on the season (winter/summer). Do you plan to relate the concept of time to the concept "offering"?

Anonymous (not verified) | Mon, 01/26/2009 - 21:13

ProductOrService

A definition of ProductOrService is e.g. at

http://www.heppnetz.de/ontologies/goodrelations/v1#ProductOrService

(since all GoodRelations URIs are dereferenceable ;-) )

It should be pretty much compatible with Baida's definition.

GoodRelations does not enforce a distinction between "Product" and "Service", because making this mandatory creates a lot of practical problems for business users of the ontology, while the real (non-academic) benefit is limited.

"In the tourism domain, time or season plays an important role, because an offering of a hotel room can be targeted at a certain customer and may depend on the season (winter/summer). Do you plan to relate the concept of time to the concept "offering"?"

As for the constraints regarding the target audience, countries, or period of time of validity, this is explicitly supported by the following elements:

gr:eligibleCustomerTypes
gr:eligibleRegions

and

gr:validFrom
gr:validThrough

See also the primer at

http://www.heppnetz.de/projects/goodrelations/primer/

For more background on the modeling choices underlying GoodRelations, I also recommend the full technical report, available at

http://www.heppnetz.de/projects/goodrelations/GoodRelations-TR-final.pdf

Best
Martin

Martin Hepp (not verified) | Fri, 01/30/2009 - 07:22

ProductOrService

gr:ProductOrService seems to be owl:sameAs openCyc:Product, which is described in its comment as
"Each instance of Product is a TemporalThing that is, or was at one time, offered for sale or performed as a commercial service, or was produced with the intent of being offered for sale. Positive examples of Product include:

- a barrel of crude oil being shipped to a customer;
- a completely assembled automobile in a factory;
- that same automobile sitting in a dealer's sales lot;
- that same automobile after it has been purchased and driven for ten years;
- a ripe tomato in the produce department of a grocery store;
- that same tomato later thrown away because it has become rotten before it could be sold;
- a professional masseuse or masseur giving a massage to a customer;
- a professional plumber installing a new sink for a customer.

Negative examples of Product include:

- some natural crude oil lying in the ground;
- an automobile prototype developed by a car company for testing purposes;
- a tomato grown in someone's backyard garden for personal use;
- a person giving a massage to his or her spouse, who has had a stressful day at work;
- a do-it-yourselfer installing a new sink in his or her house.

Some important specializations of Product are ProductForSale, PartiallyTangibleProduct, and ServiceProduct."

OpenCyc has ServiceProduct as a subclass of ServiceEvent, which the home massage would be an instance of.

gr:Product would be equivalent to openCyc:PartiallyTangibleProduct -- the class of physical objects that are offered for sale or rent, have been so offered, or were produced with the intent of so offering them.

gr:Service would be equivalent to openCyc:ServiceProduct, i.e., actions done to help some party in exchange for remuneration.

dougf (not verified) | Tue, 03/10/2009 - 19:44

ad: must find a compromise

1. A good, business-relevant ontology must find a compromise between existing conceptual structures in the domain ... [otherwise] ... you would require owners of legacy data to lift the existing resources in a very labor-intensive process ...

Well, CatBert gave me the "look" on this question. Saying "\rho, you do not need me for answering that one, do you?".

And he's right. It is a bogus argument.

Obviously the process of ontologisation is inherently always a trade-off (compromise) between the partners involved in a project or domain.

But excusing bad ontologies with the technological challenge of uplifting compromises the whole idea of ontologisation in the first place. I mean, the fact that the Perl script doing it would be 5 lines longer cannot be made responsible for watering down something as long-lived as an ontology.

And in the case under discussion, this is not really rocket science.

And you should first give evidence that -- on a Web scale -- that effort is justified by later savings.

Who on earth can seriously do that? And who seriously ever considers even doing that? And who seriously ever considers even asking to do that?

rho | Tue, 01/27/2009 - 14:26

Compromises in Ontology Engineering

I mean, the fact that the Perl script doing it would be 5 lines longer cannot be made responsible for watering down something as long-lived as an ontology.

Anybody who aims for real impact and not just academic papers should do that. I do not mean to provide a formal proof, but *some evidence*.

PS: I think the technical issues of GoodRelations in this thread are now clarified - I am not going to continue a discussion on the foundations of SW technology and ontology engineering in here. If there is interest in that, please contact me directly.

Martin Hepp (not verified) | Fri, 01/30/2009 - 07:27

ad the lack of disjointness axioms

2. As for the lack of disjointness axioms etc.: Well, these usually do no harm, but the exact contribution of DL reasoning to data integration on a Web scale is still pretty unclear.

Ok, CatBert's response is completely cryptic to me. Maybe you get what he means, I'm just typing:

People in a car set out to drive up a long slippery slope called the "semantic web". The car now seems to have lost traction. Wheels are still turning and the car is moving more sideways than upwards. The people in the car call the side movement "web of data".

I definitely need a dumber cat.

rho | Tue, 01/27/2009 - 14:39