Use of In-like query and performance implications

2005-03-02 Thread Paul Smith
Suppose I had a query of the form (assuming this gets converted to Lucene-speak): Org=X or Folder In(g,h,i,j,...) or Public=true. Having fields for Org and Public is easy, but the In-like clause on the Folder field (which would be converted into ORs) could potentially contain quite a few values. my q
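The conversion described in this post can be sketched as follows. This is only an illustration built on plain strings: the class and method names are hypothetical, the values are assumed to be single tokens with no query-syntax special characters, and the output relies on the query parser's field-grouping form `field:(a OR b)`. Lucene's BooleanQuery also enforces a maximum clause count (1024 by default), so a very long In list may require raising that limit.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper: expands an SQL-style IN list into OR'd clauses
// written in Lucene query-parser syntax.
public class InClauseSketch {

    /** Turns field IN (v1, v2, ...) into field:(v1 OR v2 OR ...). */
    public static String expandIn(String field, List<String> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("IN list must not be empty");
        }
        StringJoiner ors = new StringJoiner(" OR ", field + ":(", ")");
        for (String value : values) {
            // Assumption: each value is one token needing no escaping.
            ors.add(value);
        }
        return ors.toString();
    }

    /** Builds the full query from the post: Org=X or Folder In(...) or Public=true. */
    public static String buildQuery(String org, List<String> folders) {
        return "Org:" + org + " OR " + expandIn("Folder", folders) + " OR Public:true";
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("X", Arrays.asList("g", "h", "i", "j")));
    }
}
```

Each folder value becomes one clause of the boolean query, so the cost of the query grows with the length of the list.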

Re: Remove document fails

2005-03-02 Thread Mauro Verrocchio
Try IndexReader.unlock(directory). It's in the javadoc. Cheers, Mauro. On Wed, 02 Mar 2005 22:40:27 +0100, Gusenbauer Stefan <[EMAIL PROTECTED]> wrote: > Volodymyr Bychkoviak wrote: > > > maybe you have an open IndexWriter at the same time you are trying to > > delete a document. > > > > Alex Ki

Re: How to set individual boost factor to each word in a phrase query ?

2005-03-02 Thread Daniel Naber
On Wednesday 02 March 2005 22:43, Robichaud, Jean-Philippe wrote: > "some^0.81 list^0.12 of^0.5 words^0.99" You could try "some list of words"^0 AND (some^0.81 list^0.12 of^0.5 words^0.99) Will only match documents that contain the phrase, but score the terms (but also those terms in the docum

Indexing sit (stuff it) files

2005-03-02 Thread Luke Shannon
Hello; I've almost completed my zip file indexer. I used the following to get an InputStream for each file in the archive: ZipFile zip = new ZipFile(new File(fileLocation)); ZipEntry zipEntry; Enumeration files = zip.entries(); while (files.hasMoreElements
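The zip-walking loop mentioned in this post might look like the sketch below. It uses only java.util.zip from the JDK, which reads zip archives and not StuffIt (.sit) files; the class and method names are hypothetical, and the byte-counting loop is only a placeholder for whatever text extraction and indexing the real indexer does.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: visits every file entry in a zip archive and
// opens an InputStream for it, as an indexer would before parsing.
public class ZipEntryWalker {

    /** Returns "name (N bytes)" for each non-directory entry. */
    public static List<String> listEntries(File archive) throws IOException {
        List<String> seen = new ArrayList<>();
        try (ZipFile zip = new ZipFile(archive)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue; // directories carry no content to index
                }
                try (InputStream in = zip.getInputStream(entry)) {
                    // Placeholder for real work: just count the bytes read.
                    long size = 0;
                    byte[] buffer = new byte[8192];
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        size += n;
                    }
                    seen.add(entry.getName() + " (" + size + " bytes)");
                }
            }
        }
        return seen;
    }
}
```

The try-with-resources blocks close both the archive and each entry stream even if reading fails partway through.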

How to set individual boost factor to each word in a phrase query ?

2005-03-02 Thread Robichaud, Jean-Philippe
Hi everyone. I've been playing with Lucene a lot in the past few months for an important project. We are using the raw score returned by Lucene (we created a custom similarity) as a part of a confidence score calculation. My problem is exactly what the subject line of this email says: How to s

Re: Remove document fails

2005-03-02 Thread Gusenbauer Stefan
Volodymyr Bychkoviak wrote: maybe you have an open IndexWriter at the same time you are trying to delete a document. Alex Kiselevski wrote: Hi, I have a problem doing IndexReader.delete(int doc) and it fails on a lock error.

Re: warm up lucene, especially sort by cache

2005-03-02 Thread Doug Cutting
Morus Walter wrote: So if you use sort, doing one sort after creating the index might be useful. Yes, this is a good way to pre-load lots of things. For reading relevant parts of the index into OS caches, I'd rather use the most commonly searched terms, than the most frequent ones. If the index was

Re: Fast access to a random page of the search results.

2005-03-02 Thread Stanislav Jordanov
Thank you guys, there's a good chance that I will have the management persuaded to drop the 'random access requirement'. As you surely know, the management (usually) tends to be frantically optimistic. True to this trend, our management suggested to us (the R&D team) that: "... it is time to assume th

Re: possible concurrent actions table

2005-03-02 Thread Otis Gospodnetic
I think the table is still up to date, but the most comprehensive explanation of concurrency rules that I know of is in Chapter 2 of Lucene in Action: http://www.lucenebook.com/search?query=concurrency+rules As you can see from the 1st hit snippet, it really all boils down to eliminating concurr

Re: How exactly is 'Lucene' pronounced?

2005-03-02 Thread Charles Stover
Erik Hatcher wrote: Loo seen Lou seen On Mar 2, 2005, at 10:15 AM, Stanislav Jordanov wrote: How exactly is 'Lucene' pronounced? Some of my colleagues pronounce it like "Liu-sin" (accent on the second syllable). I pronounce it like "Lu-sen" (accent on the second syllable). What's the right way to d

Re: help with boolean expression

2005-03-02 Thread Yonik Seeley
> I agree that the current behavior is awkward. Is it worth breaking > backwards compatibility to correct this with the patch applied? IMHO, definitely. The current behavior is so surprising that I doubt that no one is relying on it. I didn't really see this behavior explicitly pointed out in "

Re: How exactly is 'Lucene' pronounced?

2005-03-02 Thread Erik Hatcher
Loo seen Lou seen On Mar 2, 2005, at 10:15 AM, Stanislav Jordanov wrote: How exactly is 'Lucene' pronounced? Some of my colleagues pronounce it like "Liu-sin" (accent on the second syllable). I pronounce it like "Lu-sen" (accent on the second syllable). What's the right way to do it?

How exactly is 'Lucene' pronounced?

2005-03-02 Thread Stanislav Jordanov
How exactly is 'Lucene' pronounced? Some of my colleagues pronounce it like "Liu-sin" (accent on the second syllable). I pronounce it like "Lu-sen" (accent on the second syllable). What's the right way to do it?

RE: help with boolean expression

2005-03-02 Thread Morus Walter
Omar Didi writes: > I checked the code for the patch and I had no clue how to use it. > Can you please give me some instructions? I guess it just patches QueryParser.jj, so running patch < {patchfile} in the directory where QueryParser.jj is found should do (on the command line, of course). If you're on

Re: possible concurrent actions table

2005-03-02 Thread Volodymyr Bychkoviak
Friso van Vollenhoven wrote: Hi, I found this table at jGuru: http://www.jguru.com/forums/view.jsp?EID=910778 Since the table seems to be about 2.5 years old, I was wondering if it is still correct. It says that I can concurrently delete and read a document. So if there are two threads using one

RE: help with boolean expression

2005-03-02 Thread Omar Didi
I checked the code for the patch and I had no clue how to use it. Can you please give me some instructions? Thanks.

possible concurrent actions table

2005-03-02 Thread Friso van Vollenhoven
Hi, I found this table at jGuru: http://www.jguru.com/forums/view.jsp?EID=910778 Since the table seems to be about 2.5 years old, I was wondering if it is still correct. It says that I can concurrently delete and read a document. So if there are two threads using one IndexReader instance, they

RE: help with boolean expression

2005-03-02 Thread Morus Walter
Omar Didi writes: > Thank you so much Erik and Morus, I have a clear idea now how it works. I > will try to implement custom code that adds the parentheses to boolean > expressions with some rules about operator precedence. > I'd rather suggest that you patch QP instead. Adding parenthesis be

RE: help with boolean expression

2005-03-02 Thread Omar Didi
Thank you so much Erik and Morus, I have a clear idea now how it works. I will try to implement custom code that adds the parentheses to boolean expressions with some rules about operator precedence. Omar

Re: help with boolean expression

2005-03-02 Thread Daniel Naber
On Wednesday 02 March 2005 12:25, Erik Hatcher wrote: > I agree that the current behavior is awkward.  Is it worth breaking > backwards compatibility to correct this with the patch applied? I'd vote for fixing this as long as the current QueryParser is still available in Lucene core, maybe as Ol

Re: help with boolean expression

2005-03-02 Thread Erik Hatcher
I'm deep into implementing a custom (not generalizable, sorry) query parser and am evaluating this very issue now. Lucene indeed does some funny stuff with boolean operators. Output the toString of your resulting Query objects to see the details, or have a look at the Bugzilla issue that Morus menti

Re: Fast access to a random page of the search results.

2005-03-02 Thread Stanislav Jordanov
You're right: nHits - nHits == 0 :) But I did the right tests; it just happened that I sent you the wrong source. I mean I performed the tests accessing the proper last doc: doc(nHits - 1), then I switched to accessing the first hit, just to make sure (once again) there is essential difference in

Re: Large Index managing

2005-03-02 Thread Volodymyr Bychkoviak
This is solved by keeping the document keys in a set rather than a list: even if a document is updated twice, its delete and add will each appear only once. Miles Barr wrote: On Wed, 2005-03-02 at 05:49, Otis Gospodnetic wrote: Or you can just buffer your update requests and delete in batch and then add in batch. Or you cou
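A minimal sketch of the set-based buffer suggested in this reply, assuming documents are identified by a string key. The class and method names are hypothetical, and the flush step itself (delete the batch of keys, then add the fresh documents) is left out because it depends on how the index is accessed.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical buffer of pending updates. A set collapses repeated
// requests for the same document key into a single entry.
public class PendingUpdates {

    // LinkedHashSet keeps first-request order while rejecting duplicates.
    private final Set<String> keys = new LinkedHashSet<>();

    /** Records that the document with this key must be re-indexed. */
    public synchronized void requestUpdate(String key) {
        keys.add(key);
    }

    /** Returns the buffered keys as one batch and empties the buffer. */
    public synchronized List<String> drain() {
        List<String> batch = new ArrayList<>(keys);
        keys.clear();
        return batch;
    }
}
```

With a list, two updates to the same document between flushes would produce two deletes and two adds; with the set, the batch contains that key once.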

Re: Large Index managing

2005-03-02 Thread Miles Barr
On Wed, 2005-03-02 at 05:49, Otis Gospodnetic wrote: > Or you can just buffer your update requests and delete in batch and > then add in batch. > > Or you could keep that IndexReader on the large index and use it to > delete objects, while doing adds on a RAMDirectory. Then, when you are > done,

Re: warm up lucene, especially sort by cache

2005-03-02 Thread Morus Walter
Chris Lu writes: > 1. Need an efficient way to pick up the most frequent words in an index. > I think this can be done, any example will be appreciated. I don't see an alternative to looping through all terms and looking at their frequency. > 2. search by the most frequent words, with sort by opti
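The loop over all terms described in this reply can keep only the N most frequent ones with a bounded heap. The sketch below is an assumption-laden illustration: it takes an already collected map of term to document frequency (in the Lucene API of that period the values would presumably come from enumerating IndexReader.terms() and reading docFreq() for each term, which is not shown), and the class and method names are hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper: picks the n terms with the highest document
// frequency in a single pass over all terms.
public class TopTerms {

    public static List<String> mostFrequent(Map<String, Integer> docFreqs, int n) {
        List<String> result = new ArrayList<>();
        if (n <= 0) {
            return result;
        }
        // Min-heap on frequency: the weakest kept term sits on top.
        Comparator<Map.Entry<String, Integer>> byFreq = Map.Entry.comparingByValue();
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(n, byFreq);
        for (Map.Entry<String, Integer> e : docFreqs.entrySet()) {
            heap.offer(new AbstractMap.SimpleImmutableEntry<>(e));
            if (heap.size() > n) {
                heap.poll(); // drop the least frequent of the n + 1 candidates
            }
        }
        // The heap drains in ascending order; reverse for most frequent first.
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }
}
```

The heap never holds more than n + 1 entries, so memory stays small even when the index has millions of distinct terms.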

warm up lucene, especially sort by cache

2005-03-02 Thread Chris Lu
1. Need an efficient way to pick up the most frequent words in an index. I think this can be done, any example will be appreciated. 2. search by the most frequent words, with sort by options. Is this the only way to warm up lucene? For large indexes, the first sort-by search is slow. Chris