Tutorial: TM Semantic Visualisation (Part III)

(continued from Part II)

The landscape we looked at last time consisted only of mountain ranges and valleys, formed by the intensity and extensity of certain words (or word groups) within a set of documents, all of which are captured in a topic map.

http://kill.devc.at/system/files/bipolar-linked-small.jpg

Where Are The Documents?

One interesting question is how documents fit into the landscape. Every document can be characterized by the intensities of the words within it, and these intensities can in turn be interpreted within the map.

http://kill.devc.at/system/files/synth-with-docs-small.jpg

In the picture above, every circle corresponds to one document. The position of the circle marks the best-fitting place for the document. The size of the circle signals the uncertainty of that position: the larger the circle, the less precisely a position can be called home.

I copied this idea from SOMs (self-organizing maps). In real-world maps, though, there may well be several "best" positions for one particular document. More about that later.
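To make the placement idea a bit more concrete, here is a minimal sketch in JavaScript. It is not the code behind the pictures. It assumes that the landscape is a grid of cells, that each cell and each document is described by a word-to-intensity table, and that the best-fitting place is the cell with the smallest Euclidean distance to the document (the SOM notion of a best-matching unit). The uncertainty measure, the function names and the words `aaa` and `bbb` are all made up for the illustration.

```javascript
// Hypothetical sketch: place a document on a grid of landscape cells.
// Each cell and each document is a plain object mapping word -> intensity.

// Euclidean distance over the union of words in both vectors.
function distance(a, b) {
  const words = new Set([...Object.keys(a), ...Object.keys(b)]);
  let sum = 0;
  for (const w of words) {
    const d = (a[w] || 0) - (b[w] || 0);
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Returns the best-fitting cell plus an uncertainty between 0 and 1.
// Assumption: uncertainty = best distance / runner-up distance, so it
// approaches 1 when a second cell fits almost as well as the first.
function placeDocument(doc, cells) {
  const ranked = cells
    .map((cell) => ({ x: cell.x, y: cell.y, dist: distance(doc, cell.words) }))
    .sort((p, q) => p.dist - q.dist);
  const best = ranked[0];
  const runnerUp = ranked[1];
  let uncertainty = 0;
  if (runnerUp) {
    // A runner-up at distance 0 means an exact tie with the best cell.
    uncertainty = runnerUp.dist > 0 ? best.dist / runnerUp.dist : 1;
  }
  return { x: best.x, y: best.y, uncertainty };
}

// Three made-up cells along one row of the map.
const cells = [
  { x: 0, y: 0, words: { aaa: 1.0, bbb: 0.0 } },
  { x: 1, y: 0, words: { aaa: 0.5, bbb: 0.5 } },
  { x: 2, y: 0, words: { aaa: 0.0, bbb: 1.0 } },
];

const placed = placeDocument({ aaa: 0.9, bbb: 0.1 }, cells);
console.log(placed.x, placed.y, placed.uncertainty.toFixed(2));
```

A document dominated by `aaa` lands on the `aaa` cell with a small circle. A document sitting exactly halfway between two cells gets an uncertainty of 1, which is one way to picture the "several best positions" case.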

Ok, But What Documents?

Seeing the document positions line up along the major landscape features is fine, and it helps to understand how the landscape is formed, but in the end the user wants to

  • see a preview of the document, and
  • then have full access to it.

For this purpose I created a bit of HTML and jQuery code:

http://kill.devc.at/system/files/synth-hover-small.jpg

When you move the mouse over one of the circles, a window with terms will fade in. The terms in it are ordered by relative importance within the document, with the most frequent first. You will notice that if a document is placed along a prominent site, the first term(s) will match the terms of the site.
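The ordering of the terms in the preview can be sketched as follows. Again, this is only an illustration and not the code in synth.html: the function names, the term counts and the percentage display are assumptions, and the `#` link stands in for the fake link to the full document.

```javascript
// Hypothetical sketch: build the term list for a document's hover preview.
// `counts` maps each term to how often it occurs in the document.

// Rank terms by their share of all term occurrences, most frequent first.
// Ties are broken alphabetically so the order is stable.
function rankTerms(counts, limit = 5) {
  const total = Object.values(counts).reduce((sum, n) => sum + n, 0);
  return Object.entries(counts)
    .map(([term, n]) => ({ term, share: total > 0 ? n / total : 0 }))
    .sort((a, b) => b.share - a.share || a.term.localeCompare(b.term))
    .slice(0, limit);
}

// Escape the characters that matter inside HTML text and attributes.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render the ranked terms as an HTML list followed by a document link.
function previewHtml(counts, href, limit = 5) {
  const items = rankTerms(counts, limit)
    .map(({ term, share }) =>
      `<li>${escapeHtml(term)} (${Math.round(share * 100)}%)</li>`)
    .join("");
  return `<ul>${items}</ul><a href="${escapeHtml(href)}">full document</a>`;
}

// Made-up counts for one document; "#" is a placeholder link.
const html = previewHtml({ bbb: 2, aaa: 6, ccc: 2 }, "#");
console.log(html);
```

In a page, the resulting markup could be put into a hidden element that is shown when the mouse enters a circle, for example with jQuery's `.hover()` and `.fadeIn()`. How the circles are represented in the DOM is left open here.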

If you are interested in the full document, you can activate the link (read: you click). But as all my documents currently live inside the synthetic topic map, I just use a fake link for the moment.

You can play around with it yourself. It seems to work in Firefox and Safari, but may need more love (or hate) for other browsers.

Attachment                  Size
synth-with-docs-small.jpg   33.92 KB
synth-with-docs.jpg         62.27 KB
synth-with-docs.png         581.56 KB
jquery.js.txt               55.91 KB
synth.html                  34.75 KB
synth-hover.jpg             67.43 KB
synth-hover-small.jpg       34.9 KB
Opera etc

I've followed this series with interest, but have to say that I'm still uncertain about what uses this visualization has. I think my problem is mostly due to the use of meaningless examples, which makes it harder to grok what is going on.

And regarding browsers: the jQuery code appears not to work at all in Opera 10.

Lars Marius Garshol (not verified) | Sat, 10/03/2009 - 11:38

Re: Uses and Opera problems

> I've followed this series with interest, but have to say that I'm still uncertain about what uses this visualization has.

I have been silent on the value proposition of these maps. Intentionally so, because I am still uncertain how far this can be taken. There are many ideas not yet tried out. I have to look for more financing, too.

My main agenda is to find out (a) whether a landscape can be computed to reflect the quantity of ontological concepts and (b) whether documents can have a natural place in it. Once this is somewhat robust, a map is simply an alternative way to find information, alongside full-text search or faceted navigation.

> I think my problem is mostly due to the use of meaningless examples, which makes it harder to grok what is going on.

Ok, here I have the advantage that I also test this on live material, mostly on the MapReduce map. The problem with live material for you (as a reader), though, is that you cannot have a complete overview of the map and all the documents it refers to. So a synthetic (abstract) map is probably more adequate for the moment.

The other problem is that there is a lot of statistical magic going on, something which does not immediately translate into the landscape. I guess this will have the same problem as Google PageRank, where it is also quite unclear why some web sites rank higher than others. At one point my blog ranked on page 1 for Animal Sex. Just because of my despair.com posters.

> And regarding browsers: the jQuery code appears not to work at all in Opera 10.

Yes, this is annoying as I use Opera myself. There are numerous reported problems with jQuery and Opera (and MSIE8), but I hope they will be resolved.

rho | Sat, 10/03/2009 - 12:41

Uses

Keep at it. I'll wait for further instalments in the series in the hope that it will all make sense in the end. :)

Lars Marius Garshol (not verified) | Sat, 10/03/2009 - 13:33