Update: Perl TM::Corpus 0.03 - Persistency
I have just uploaded a new version of TM::Corpus to CPAN.
One notable addition is TM::Corpus::MLDBM, which lets you store the document corpus persistently on disk. That definitely helps, as you would not want to download megabytes over and over again. MLDBM stands for multi-level DBM; it sits on top of DBM files such as Berkeley DB, which you have probably known for years.
Maybe I will also add a DBI (read: relational) backend to TM::Corpus at some later time.
The other addition is that some functionality (Plucene search, in this case) is exposed as a plugin so that it can be used in the tm workbench. I will show you later how this can be used, once the language has settled down.
Debianized version attached.
Work supported by the Austrian Research Centers.