Re: Changing wildcard characters

2008-02-25 Thread Chris Hostetter
: is it possible to change the wildcard characters which are used by : QueryParser? : : Or do I have to replace them myself in the query string? You can change the grammar and regenerate QueryParser.java, but there is no programmatic mechanism for this. -Hoss
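The alternative raised in the question, replacing the characters in the query string before it reaches QueryParser, could look roughly like the sketch below. It is only an illustration: the class name WildcardTranslator and the choice of '%' and '_' as custom wildcards are assumptions, not anything from Lucene, and the sketch does not treat quoted phrases or other query syntax specially.

```java
/**
 * Hypothetical helper (not part of Lucene): rewrites a query string so that
 * caller-chosen wildcard characters become the '*' and '?' wildcards of the
 * standard query syntax.
 */
final class WildcardTranslator {
    private WildcardTranslator() {}

    static String translate(String query, char customMulti, char customSingle) {
        StringBuilder out = new StringBuilder(query.length() + 8);
        for (int i = 0; i < query.length(); i++) {
            char c = query.charAt(i);
            if (c == '\\' && i + 1 < query.length()) {
                // Copy an existing backslash escape through unchanged.
                out.append(c).append(query.charAt(++i));
            } else if (c == customMulti) {
                out.append('*');            // custom multi-character wildcard
            } else if (c == customSingle) {
                out.append('?');            // custom single-character wildcard
            } else if (c == '*' || c == '?') {
                // A literal '*' or '?' must no longer act as a wildcard.
                out.append('\\').append(c);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, WildcardTranslator.translate("te%t_", '%', '_') yields te*t?, which can then be handed to the parser as usual.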

Re: When does QueryParser creates PhraseQueries

2008-02-25 Thread Daniel Noll
On Tuesday 26 February 2008 01:05:27 [EMAIL PROTECTED] wrote: > Hi all, > > I have the behaviour that when I search with Luke (version 0.7.1, Lucene > version 2.2.0) inside an arbitrary field, the QueryParser creates a > PhraseQuery when I type in > ~ termA/termB (no "...") > When

RE: Rebuilding Document from index?

2008-02-25 Thread Itamar Syn-Hershko
Hello again, If I wanted to do this programmatically, how would I do it (retrieve a list of all terms in a field for a specific document, preferably in alphabetical order and with frequency data)? Thanks, Itamar. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECT

Re: regex expressions within phrase queries

2008-02-25 Thread Jim Bogan
Thanks for the advice, Chris. What I am working on now is extracting the matching phrases. The current code for MultiPhraseQuery and SpanQueries just returns all matching terms, not matching phrases. I implemented some code matching up the TermPositions, but this is pretty slow. Is there any way

RE: Transactions in Lucene

2008-02-25 Thread spring
> I don't think creating an IndexWriter is very expensive at all. Ah ok. I tested it. Creating an IndexWriter on an index with 10,000 docs (about 15 MB) takes about 200 ms. This is a very cheap operation for me ;) I only saw the many calls in init() which read files and so on and therefore I to

Re: Transactions in Lucene

2008-02-25 Thread Michael McCandless
I don't think creating an IndexWriter is very expensive at all. Especially compared to creating an IndexReader, for example. Mike <[EMAIL PROTECTED]> wrote: For what time is the 2.4 release planned? Not really sure at this point ... Hm. Digging into IndexWriter#init it seems that this is

RE: Transactions in Lucene

2008-02-25 Thread spring
> > For what time is the 2.4 release planned? > > Not really sure at this point ... Hm. Digging into IndexWriter#init it seems that this is a really expensive operation and thus my self made "commit" too. Isn't it?

Re: Transactions in Lucene

2008-02-25 Thread Michael McCandless
<[EMAIL PROTECTED]> wrote: For what time is the 2.4 release planned? Not really sure at this point ... Mike

RE: Transactions in Lucene

2008-02-25 Thread spring
> In 2.4, commit() sets the rollback point. So abort() will > roll index > back to the last time you called commit() (or to when the writer was > opened if you haven't called commit). > > In 2.3, your only choice is to close & re-open the writer to reset > the rollback point. OK, thank yo

Re: Transactions in Lucene

2008-02-25 Thread Michael McCandless
<[EMAIL PROTECTED]> wrote: Then, you can call close() to commit the changes to the index, or abort() to rollback the index to the starting state (when the writer was opened). As I understand the docs, the index will get rolled back to the state as it was when the index was opened. How can

RE: Transactions in Lucene

2008-02-25 Thread spring
> Then, you can call close() to commit the changes to the index, or > abort() to rollback the index to the starting state (when the writer > was opened). As I understand the docs, the index will get rolled back to the state as it was when the index was opened. How can I achieve a rollback which o

When does QueryParser creates PhraseQueries

2008-02-25 Thread duiduder
Hi all, I have the behaviour that when I search with Luke (version 0.7.1, Lucene version 2.2.0) inside an arbitrary field, the QueryParser creates a PhraseQuery when I type in ~ termA/termB (no "...") When I read the documentation a

Re: Out of Memory Exception

2008-02-25 Thread Grant Ingersoll
What's your heap size? -Grant On Feb 25, 2008, at 3:04 AM, Jawahar Lal wrote: Hi, I am using Lucene 2.0. I am doing full-text indexing of PDF files. To extract the text from the PDFs I am using the PDFBox library. When I start indexing the PDF files I get an out-of-memory exception. This is because files

Re: Timeout for search threads

2008-02-25 Thread Grant Ingersoll
See https://issues.apache.org/jira/browse/LUCENE-997 On Feb 24, 2008, at 11:52 PM, Anshum wrote: Hi, Is there a way I could add a timeout value for each search instance while searching? Currently I encounter issues while a bottleneck causing query continues to execute for over 1000 seconds(
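Independently of whatever the linked issue provides, a caller-side timeout can be sketched with plain java.util.concurrent: run the search in a worker thread and stop waiting after a deadline. This is a generic technique, not the mechanism from LUCENE-997, and the name TimedSearch is hypothetical. Note that cancelling only interrupts the worker; a search that ignores interruption keeps consuming CPU in the background even though the caller has moved on.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: waits for a search task for a bounded time only. */
final class TimedSearch {
    private TimedSearch() {}

    /** Returns the task's result, or an empty Optional if the deadline passed. */
    static <T> Optional<T> runWithTimeout(Callable<T> search, long timeoutMillis) {
        // Daemon thread, so a search that ignores interruption cannot block JVM exit.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timed-search");
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(search);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker; it must cooperate to actually stop
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new IllegalStateException("search failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for search", e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller would wrap the actual search call in the Callable and treat an empty result as "timed out", for example by returning a partial or error response to the user.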

Re: Security filtering from external DB

2008-02-25 Thread Gabriel Landais
Gabriel Landais wrote: How to create a Filter for a field in a Collection<String>? First, split the Collection<String> into a Collection<Collection<String>> with at most BooleanQuery.maxClauseCount items in each collection. For each collection: create a BooleanQuery with a TermQuery for each String. perform a search with a HitCollect
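The splitting step described above can be sketched in plain Java. This is a minimal illustration, assuming the values are Strings; the class name ClauseChunker and the maxClauses parameter are hypothetical, and in a real application the limit would presumably be read from BooleanQuery.getMaxClauseCount() rather than hard-coded.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical helper: splits values into chunks no larger than a clause limit. */
final class ClauseChunker {
    private ClauseChunker() {}

    static List<List<String>> split(Collection<String> values, int maxClauses) {
        if (maxClauses <= 0) {
            throw new IllegalArgumentException("maxClauses must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String value : values) {
            if (current.size() == maxClauses) {
                // Current chunk is full: store it and start a new one.
                chunks.add(current);
                current = new ArrayList<>();
            }
            current.add(value);
        }
        if (!current.isEmpty()) {
            chunks.add(current); // last, possibly shorter, chunk
        }
        return chunks;
    }
}
```

Each returned chunk can then back one BooleanQuery with one TermQuery per String, so no single query exceeds the clause limit.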

Out of Memory Exception

2008-02-25 Thread Jawahar Lal
Hi, I am using Lucene 2.0. I am doing full-text indexing of PDF files. To extract the text from the PDFs I am using the PDFBox library. When I start indexing the PDF files I get an out-of-memory exception. This is because the files are about 10 MB in size. I tried different values for mergefactor, maxmergefactor an