Re: XML::Index

Markus Heller Sun, 12 Dec 2004 17:06:14 -0800

Hi Smylers,

> > XML::Index::DataGuide
>
> Personally I don't think I would have guessed what "data guide" is and
> understood the purpose of your module just from the name -- but that
> might just be me, and I can't think of a better name.
>
> > XML::Index::CADG (Content Aware DG)

An indexing technology called "DataGuide" has existed for some time. It works
basically like this: you have an index tree that leads to keyword occurrences
in documents, and you have a flat (inverted file) index that also leads to
keyword occurrences in documents. The problem is that a big, expensive join
between the two has to be made at query time, and this slows down query
processing linearly in the number of documents in your whole index system,
or depending on the properties of your query.
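
A toy sketch of that two-index layout, in Python for brevity. The names
(path_index, inverted_file, search) and the sample data are made up for
illustration; this is not the code of any existing DataGuide implementation.
It only shows why both result sets have to be fetched and joined at query time.

```python
# Toy illustration only: names and data layout are hypothetical.

# Index tree, flattened here to: label path -> set of (doc_id, node_id)
path_index = {
    "/book/title": {(1, 10), (2, 20)},
    "/book/author": {(1, 11), (2, 21)},
}

# Flat inverted file: keyword -> set of (doc_id, node_id)
inverted_file = {
    "perl": {(1, 10), (3, 30)},
    "xml": {(2, 20), (2, 21)},
}


def search(path, keyword):
    """Return the nodes on `path` that contain `keyword`.

    Both hit sets are fetched in full and then joined (intersected),
    so the work grows with the size of the two sets.
    """
    structural_hits = path_index.get(path, set())
    keyword_hits = inverted_file.get(keyword, set())
    return sorted(structural_hits & keyword_hits)
```

Here the join is a plain set intersection; in a real system the two hit lists
can be very large, which is where the cost comes from.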


The CADG processes documents in a special way: it generates an "annotated"
index tree that allows all the irrelevant paths to be pruned, and thus speeds
up semistructured search by a factor of up to 600, depending, of course, on
the type of the query.
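
One possible reading of the "annotated" tree, again as a toy Python sketch.
The class names, the `keywords_below` annotation and the `visited` counter are
assumptions made for this example; the real CADG annotations may look quite
different. The point is only that a subtree can be skipped as soon as its
annotation shows the keyword cannot occur below it.

```python
# Toy illustration only: class and attribute names are hypothetical.

class Node:
    def __init__(self, label):
        self.label = label
        self.children = {}            # child label -> Node
        self.hits = {}                # keyword -> set of doc ids at this node
        self.keywords_below = set()   # annotation: keywords in this subtree


class AnnotatedIndex:
    def __init__(self):
        self.root = Node("")
        self.visited = 0              # nodes visited by the last search

    def add(self, doc_id, path, keywords):
        """Index `keywords` found at label `path` (a list) in document `doc_id`."""
        node = self.root
        node.keywords_below.update(keywords)
        for label in path:
            if label not in node.children:
                node.children[label] = Node(label)
            node = node.children[label]
            node.keywords_below.update(keywords)
        for keyword in keywords:
            node.hits.setdefault(keyword, set()).add(doc_id)

    def search(self, keyword):
        """Return sorted (path, doc_id) pairs for `keyword`."""
        self.visited = 0
        results = []
        self._walk(self.root, [], keyword, results)
        return sorted(results)

    def _walk(self, node, path, keyword, results):
        if keyword not in node.keywords_below:
            return                    # prune: keyword cannot occur below here
        self.visited += 1
        for doc_id in node.hits.get(keyword, ()):
            results.append(("/" + "/".join(path), doc_id))
        for label, child in node.children.items():
            self._walk(child, path + [label], keyword, results)
```

With three small documents indexed, a search for "perl" only descends into the
one branch whose annotation mentions it:

```python
idx = AnnotatedIndex()
idx.add(1, ["book", "title"], ["perl"])
idx.add(2, ["book", "author"], ["xml"])
idx.add(3, ["article", "title"], ["xml"])
hits = idx.search("perl")   # the "author" and "article" branches are pruned
```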

> If that is a specific type of data guide then it should be named
> XML::Index::DataGuide::CA (or whatever) to indicate that -- otherwise
> there's nothing linking the "DG" in the second module with the first
> one.
>
> This does mean that very specific modules do end up with rather long
> names, but generally they don't have to be typed very often (the use
> line, plus in the constructor for OO modules), and in the long run a
> meaningful name is worth more than a few keystrokes.

I don't object to long names. However, I think that if we open such a
namespace, the technology (CADG) should come next in the name. Below that we
should provide the corresponding methods (the constructor, "add", "search",
and possibly some maintenance methods).

The term "CADG" does not originate from me but from the consortium of
authors (colleagues) from the Institute of Computational Linguistics and the
Institute of Computing Sciences of Munich University.

Regards,
Markus
