Announce: Perl TM::Corpus

TM::Corpus is an experimental package that extends a topic map with all the documents it references. A map corpus is then all the internal and external content (text and data) that a map covers.

Usage is simple:

use TM;
my $tm = ...; # some map

use TM::Corpus;
my $co = TM::Corpus->new (map => $tm) # bind to the map
         ->update                     # link in all content from the map
         ->harvest;                   # link in content external to the map

Once you have such a map corpus, applications can run all sorts of text mining operations on it.

One obvious application is full-text search, which is bundled as the trait TM::Corpus::SearchAble::Plucene.
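
To give a feel for what full-text search over a corpus means, here is a minimal plain-Perl sketch. It does not use TM::Corpus or Plucene, and it makes no claim about their API: the %corpus hash, the identifiers in it, and the build_index and search helpers are all hypothetical stand-ins chosen for illustration. It builds a small inverted index and returns the documents that contain every query word.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a harvested corpus: identifier => text.
# This is NOT the TM::Corpus data structure, only an illustration.
my %corpus = (
    'http://example.org/doc1' => 'Topic maps link subjects to documents',
    'http://example.org/doc2' => 'Full text search over harvested documents',
    'tm:internal-occurrence'  => 'An inline note about topic maps',
);

# Build an inverted index: lowercased word => { document id => 1 }.
sub build_index {
    my ($docs) = @_;
    my %index;
    for my $id (keys %$docs) {
        for my $word ($docs->{$id} =~ /(\w+)/g) {
            $index{ lc $word }{$id} = 1;
        }
    }
    return \%index;
}

# Return the sorted ids of documents that contain every query word.
sub search {
    my ($index, @words) = @_;
    my %hits;
    for my $word (@words) {
        my $postings = $index->{ lc $word } || {};
        $hits{$_}++ for keys %$postings;
    }
    my @found = grep { $hits{$_} == @words } keys %hits;
    return sort @found;
}

my $index = build_index(\%corpus);
print "$_\n" for search($index, 'topic', 'maps');
```

A real search engine such as Plucene adds tokenization rules, ranking, and a persistent index on disk; the sketch only shows the basic lookup idea.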


Work supported by the Austrian Research Centers.
