Perl TM Tutorial: Persistency (Part VIII)

(Followup to part VII)

One of the most important things people need is a way to read Topic Map content from disk, and maybe write it back later. This is usually referred to as persistency, but the whole persistency business can get quite convoluted. The Perl TM distribution tries to help keep things organized.

Serializably Materialized

When map content is stored in a text file, say in XTM 2.0 format, then we can do the following to connect this with an in-memory object:

my $tm = new TM::Materialized::XTM (file => 'somemap.xtm');

This line creates a map object, but it is important to understand that it is you who has to say when the content of the file is synchronized into memory:

$tm->sync_in;

If the map file on disk is modified while your process runs, you can resynchronize as often as you like. If you trigger a synchronisation and the file has not actually been modified, then nothing happens. This is achieved with timestamps. Unfortunately, timestamps under UNIX have a rather coarse granularity (one second), so a change made within the same second as the last synchronisation may go unnoticed.
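The timestamp check can be pictured roughly as follows. This is only a sketch in plain Perl, not the code inside the TM distribution; the helper name needs_sync_in and the way the last synchronisation time is passed in are assumptions made for illustration. The last case simulates a write that lands in the same second as the previous synchronisation.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: true if the file was modified after the
# time of our last synchronisation.
sub needs_sync_in {
    my ($file, $last_sync) = @_;
    my $mtime = (stat $file)[9];          # modification time, whole seconds
    die "cannot stat $file: $!" unless defined $mtime;
    return $mtime > $last_sync;
}

my ($fh, $file) = tempfile (UNLINK => 1);
print $fh "# map content\n";
close $fh;

my $mtime = (stat $file)[9];

# never synchronized before: the content has to be loaded
my $first  = needs_sync_in ($file, 0);

# synchronized at the file's modification time: nothing to do
my $second = needs_sync_in ($file, $mtime);

# simulate a write within the same second: the timestamp stays the
# same, so the change is not detected
utime $mtime, $mtime, $file;
my $third  = needs_sync_in ($file, $mtime);

print "first: ",  ($first  ? "sync" : "skip"), "\n";
print "second: ", ($second ? "sync" : "skip"), "\n";
print "third: ",  ($third  ? "sync" : "skip"), "\n";
```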

In the same vein, it is your responsibility to specify when the memory content goes into the file:

$tm->sync_out;

As you would expect, this works exactly the same with AsTMa, LTM (or later CTM) files:

my $atm = new TM::Materialized::AsTMa (file => 'somemap.atm');
my $ltm = new TM::Materialized::LTM   (file => 'somemap.ltm');

All maps which can be serialized into one of the above formats can be loaded not only from text files, but also from strings:

my $atm = new TM::Materialized::AsTMa (inline => '# AsTMa here');

or also - more generally - from resources specified via a URL:

my $ltm = new TM::Materialized::LTM (url => 'http://far.far.away/map.ltm');

Serializables

If you glimpse into one of these packages you will notice that all of them inherit from TM::Materialized::Stream. What is more remarkable is that each of the packages loads a trait for its format. For instance, for AsTMa:

use base qw (TM::Materialized::Stream);
use Class::Trait qw(TM::Serializable::AsTMa);

It is the trait TM::Serializable::AsTMa which provides the serialization (and deserialization) functionality. Otherwise the package is empty.

One consequence of the above is that you can build your own homegrown persistency package, possibly one which always takes maps from an AsTMa file and which serializes them only into XTM:

use base qw (TM::Materialized::Stream);
use Class::Trait ('TM::Serializable::AsTMa' => {
                         exclude => [ "serialize" ]
                   },
                  'TM::Serializable::XTM' => {
                         exclude => [ "deserialize" ]
                   }
                  );

The exclude options just ensure that only the methods we want are imported into our new package.
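To see the effect of such a selection without installing anything, here is a minimal sketch in plain Perl. It does not use Class::Trait or the real TM traits; the packages My::Format::A, My::Format::B and My::Map are stand-ins invented for this example, and the methods are installed by hand to mimic what remains after the exclusions.

```perl
use strict;
use warnings;

# Two stand-in "format" packages (hypothetical, not the real TM traits)
package My::Format::A;
sub serialize   { return "A-text" }
sub deserialize {
    my ($self, $text) = @_;
    $self->{content} = "parsed as A: $text";
    return $self;
}

package My::Format::B;
sub serialize   { my ($self) = @_; return "<b>$self->{content}</b>" }
sub deserialize { die "never installed into My::Map" }

package My::Map;
sub new { my ($class) = @_; return bless {}, $class }

# Install only the wanted methods: reading comes from format A,
# writing comes from format B.
{
    no strict 'refs';
    *{"My::Map::deserialize"} = \&My::Format::A::deserialize;
    *{"My::Map::serialize"}   = \&My::Format::B::serialize;
}

package main;

my $map = My::Map->new;
$map->deserialize ('some content');
my $out = $map->serialize;
print "$out\n";
```

With Class::Trait the same selection is declared rather than coded, which is why the real packages can stay almost empty.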

Should you want to implement a new serializable format, say CTM, then you will simply implement the methods serialize (and possibly deserialize) in TM::Serializable::CTM.

Synchronizably Materialized

The above implementations deal with persistence methods which are all based on serializable formats. There are other storage methods which are not serializable, but which can also offer sync_in and sync_out functionality.

A map stored in an MLDBM file is an example of this class. The whole content can be transferred between a file (in DBM format) and the in-memory object on your command:

use TM::Materialized::MLDBM;
my $tm = new TM::Materialized::MLDBM (file => '/tmp/map');

$tm->sync_in;
# time passes ....

$tm->sync_in;  # why not?
# more time passes ....

$tm->sync_out;

Another example of a non-serializable store is a relational database, at least when we maintain the processing model of synchronizing the whole map.

Should someone volunteer to write a TM::Materialized::DBI, then the methods sync_in and sync_out would have to be implemented.
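What implementing these two methods amounts to can be sketched with the core Storable module instead of a database. Everything here is an assumption for illustration: the class My::Synced::Map is hypothetical, it does not inherit from anything in the TM distribution, and a plain hash of topics stands in for the map content.

```perl
use strict;
use warnings;
use Storable   ();
use File::Temp qw(tempdir);

# Hypothetical stand-in for a synchronizable map
package My::Synced::Map;

sub new {
    my ($class, %opts) = @_;
    return bless { file => $opts{file}, topics => {} }, $class;
}

# copy the whole content from the store into memory
sub sync_in {
    my ($self) = @_;
    $self->{topics} = Storable::retrieve ($self->{file})
        if -e $self->{file};
    return $self;
}

# copy the whole in-memory content into the store
sub sync_out {
    my ($self) = @_;
    Storable::store ($self->{topics}, $self->{file});
    return $self;
}

package main;

my $dir  = tempdir (CLEANUP => 1);
my $file = "$dir/map.sto";

my $tm = My::Synced::Map->new (file => $file);
$tm->{topics}{vienna} = 'Vienna';
$tm->sync_out;

# a second object picks up the content on its own sync_in
my $copy = My::Synced::Map->new (file => $file);
$copy->sync_in;
print "loaded: $copy->{topics}{vienna}\n";
```

A database-backed version would replace the two Storable calls with reading and writing all rows, but the contract stays the same: the whole map moves in one step, and only when asked.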

Resourceable Materialized

As we have seen above, not all resources are serializable. Also not all resources are synchronizable.

When you have an implementation which is capable of storing topic maps in a relational database, then using the full synchronisation mechanism is pretty wasteful. You would rather use the database for every access, be it for reading or writing.

In this sense, the map object is connected to an external resource (the database). Hence the name resourceable.

While the Perl TM distribution does not (yet) offer an implementation for relational databases, it has one using MLDBM, again:

{
  use TM::Materialized::MLDBM2;
  my $tm = new TM::Materialized::MLDBM2 (file => '/tmp/map.dbm');

  # modify the map here.....

} # $tm goes out of scope here

Here the $tm object serves only as a handle to the backend. Any access into the map will consult the file map.dbm, be it for reading or writing.

When the handle goes out of scope, the connection to the backend is released, but only after all changes have been safely written to disk. So there is no need for any explicit synchronisation.
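The end-of-scope behaviour relies on a general Perl mechanism: an object's DESTROY method runs when the last reference to it disappears. The following sketch shows only that part, using the core Storable module; the class My::Handle is hypothetical and, unlike a real resourceable backend, it keeps its data in memory until the handle is destroyed instead of consulting the file on every access.

```perl
use strict;
use warnings;
use Storable   ();
use File::Temp qw(tempdir);

# Hypothetical handle class: it writes its content when destroyed
package My::Handle;

sub new {
    my ($class, %opts) = @_;
    my $data = (-e $opts{file}) ? Storable::retrieve ($opts{file}) : {};
    return bless { file => $opts{file}, data => $data }, $class;
}

# Perl calls DESTROY when the last reference to the object goes away
sub DESTROY {
    my ($self) = @_;
    Storable::store ($self->{data}, $self->{file});
}

package main;

my $dir  = tempdir (CLEANUP => 1);
my $file = "$dir/map.sto";

{
    my $h = My::Handle->new (file => $file);
    $h->{data}{vienna} = 'Vienna';
}   # $h goes out of scope here, DESTROY writes the file

# read the file back independently of the handle
my $stored = Storable::retrieve ($file);
print "on disk: $stored->{vienna}\n";
```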


Work supported by the Austrian Research Centers.
