MemoryIndex or RAMDirectory, but score using term statistics from a corpus given during preprocessing?

2010-10-28 Thread Joseph Turian
How do I use MemoryIndex or RAMDirectory, but score using term statistics from a corpus given during preprocessing? Let's say I want to use a MemoryIndex or RAMDirectory to store a *single* document, and then run a query against it, and get the score of the query using just this one document. I kn

Re: Query Formalism for Texts with Program Code

2010-10-28 Thread Erick Erickson
Typically one simply escapes the symbols that have special meaning in your syntax. In your example, hot\(dog\) would tell the parser to interpret the ( and ) characters as text rather than as part of the query language. Lucene uses JavaCC to parse queries following grammar rules
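
The escaping idea described in this reply can be sketched in a few lines. This is a minimal illustration, not Lucene's implementation: the function name escape_query_text and the SPECIAL_CHARS set are hypothetical, and the character list is taken from the symbols named in the original question, plus the backslash itself so that escaped text stays unambiguous.

```python
# Hypothetical set of characters the custom query language treats as syntax.
# Assumption: taken from the symbols named in the original question, with the
# backslash added so that an escaped string can be decoded unambiguously.
SPECIAL_CHARS = set("(!)^*?{}<\\")


def escape_query_text(text: str) -> str:
    """Prefix every special character with a backslash so a parser reads it as text."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_query_text("hot(dog)"))  # prints hot\(dog\)
```

For Lucene's own query syntax, the static QueryParser.escape method plays a similar role, so a hand-written helper like this one would only be needed for a custom query language.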

Lucene 3.0.3 Release Date

2010-10-28 Thread Shay Banon
Hi, It seems the current 3.0 branch has accumulated some important bug fixes, especially for the possible index corruption bug. Is there a date for a formal 3.0.3 release? -shay.banon

Query Formalism for Texts with Program Code

2010-10-28 Thread Jan Burse
Dear All, I was setting up a web search with a query language that uses (, !, ), ^, *, ?, {, } and < in its syntax. For example: hot dog: Looks for documents with hot and dog in close vicinity. (hot dog): Looks for documents with hot or dog in them. This all

Re: Email Indexing

2010-10-28 Thread Bill Janssen
Hasan Diwan wrote: > On 27 October 2010 18:16, Troy Wical wrote: > > Depends on what you're trying to index, I suppose. Maildir or mbox? For some > > time now, off and on, I have been working to index an ezmlm mailing list > > archive. In the end, I went with Swish-E and have made quite a bit of