Re: Sentence boundary storage

Grant Ingersoll Mon, 31 Oct 2005 05:44:03 -0800

Inline below

Chris Hostetter wrote:

: Actually, I was thinking of writing something along the lines of
: Span*BoundaryQuery where it would be more explicit than what was
: described below.  You could say SpanSentence and say you want the terms

I'm not clear on how such a SpanSentence class would work -- the index
must contain info about where sentence boundaries are, which means users
would need a special analyzer/tokenizer to create Terms for those
boundaries, and would need to tell the SpanSentence class what those
tokens are.

Right, I was thinking of providing it as a package of code that would store the tokens needed at indexing time, etc., probably by extending the StandardTokenizer/Analyzer. The Span classes would need to take in the appropriate information and create the underlying SpanNotQuery, etc., as discussed in the previous email.
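
To make the idea concrete, here is a rough sketch in plain Java. It deliberately does not use the Lucene API: the marker token "_SENT_", the class name SentenceSpanSketch, and both methods are hypothetical names chosen for illustration. tokenize() stands in for an analyzer that stores a boundary token at indexing time, and inSameSentence() stands in for the effect of wrapping a near query in a SpanNotQuery that excludes the boundary term. The sentence detection is naive (any '.', '!' or '?' ends a sentence), so abbreviations such as "e.g." would be split incorrectly.

```java
import java.util.ArrayList;
import java.util.List;

public class SentenceSpanSketch {
    // Hypothetical marker token emitted into the token stream at each sentence end.
    static final String BOUNDARY = "_SENT_";

    // Index-time step: emit lowercased word tokens, plus a boundary marker
    // after every '.', '!' or '?' (a naive assumption about sentence ends).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (Character.isLetterOrDigit(c)) {
                word.append(Character.toLowerCase(c));
                continue;
            }
            // Any non-word character ends the current word, if there is one.
            if (word.length() > 0) {
                tokens.add(word.toString());
                word.setLength(0);
            }
            if (c == '.' || c == '!' || c == '?') {
                tokens.add(BOUNDARY);
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    // Query-time step: true if terms a and b both occur somewhere between
    // two consecutive boundary markers, i.e. no marker lies between them.
    static boolean inSameSentence(List<String> tokens, String a, String b) {
        boolean seenA = false;
        boolean seenB = false;
        for (String t : tokens) {
            if (t.equals(BOUNDARY)) {
                // A new sentence starts, so forget matches from the previous one.
                seenA = false;
                seenB = false;
                continue;
            }
            if (t.equals(a)) seenA = true;
            if (t.equals(b)) seenB = true;
            if (seenA && seenB) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("Lucene indexes text. Spans match positions.");
        System.out.println(tokens);
        System.out.println(inSameSentence(tokens, "lucene", "text"));
        System.out.println(inSameSentence(tokens, "text", "spans"));
    }
}
```

In a real index the marker would be an ordinary term, and the same-sentence check would be expressed with span queries rather than a loop over tokens.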

It sounds like maybe you could write some convenience methods to construct
the SpanQuery structure for you, but I don't see any practical way to make
a generic SpanSentence class.

: codify what is discussed below into a few convenience Span queries, or
: maybe we should just write it up better and put on the wiki or something...

If you implement it in an actual application (instead of just theorizing
about it like Doug and I have done) then I definitely think it would make a
useful HOWTO if you have time to write one up...

        http://wiki.apache.org/jakarta-lucene/HowTo

I have added a fair amount to the current Lucene Demo for my ApacheCon talk in December, which will be freely available then, and I might consider putting in a proof of concept/demo of how to do such a thing. I will try to write it up when I get the chance.


-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

