incremental indexing - efficiency

2005-03-22 Thread sunil goyal
Hello all, I am trying to use Lucene for incremental indexing on the order of a million records daily on a single machine (P4 2.4 GHz, 1 GB RAM). I get messages updated every few minutes, based on which I need to update the index. I am using a StandardAnalyzer and writing documents usin

Re: PHP-Lucene Integration

2005-03-22 Thread Dawid Weiss
Your implementation and ideas sound very interesting, Owen. Can we see the system anywhere in public (and play with it?) We are hoping the institute can afford to have us work on true clustering techniques such as Carrot2 uses. (Thanks to Dawid and all the Poznan University folks whose papers w

PHP-Lucene Integration

2005-03-22 Thread Owen Densmore
[Sorry if this is received twice .. I tried earlier but didn't see it in the list!] A while back I asked folks how they deployed Lucene in a PHP environment. This summarizes how we proceeded with doing so. The response to the initial question was quite helpful. Kelvin Tan mentioned "How about

DirectLink followed by redirect

2005-03-22 Thread Erik Hatcher
A common pattern I've been implementing is to throw a RedirectException in my listener methods from form submits or DirectLinks. I want to ensure the browser's URL is always bookmarkable and refreshable. A DirectLink to add an item to a shopping cart would add the item again when refreshed.

Re: NumberTools

2005-03-22 Thread Chuck Williams
Doug Cutting writes (3/22/2005 10:05 AM): Chuck Williams wrote: If there is going to be any generalization to built-in sorting representations, I'd like to suggest two things be included: 1. Fix issue 34028 (delete the one word "final") Done. Thank you! 2. Include a provision for query-time

Re: NumberTools

2005-03-22 Thread John Patterson
Doug Cutting writes: > I'd like to see benchmarks that demonstrate the improvement before we > consider including such a patch. You're making a lot of assumptions > about where time is spent performing numeric searching and sorting. > Sort and RangeFilter are already pretty effici

Re: NumberTools

2005-03-22 Thread Doug Cutting
Chuck Williams wrote: If there is going to be any generalization to built-in sorting representations, I'd like to suggest two things be included: 1. Fix issue 34028 (delete the one word "final") Done. 2. Include a provision for query-time parameters Can you provide a proposal? Doug --

Re: NumberTools

2005-03-22 Thread Chuck Williams
John Patterson writes (3/22/2005 12:56 AM): It would be great if this could be incorporated into Lucene as it will make numeric searches much more efficient. I will soon need to store simple geographical data in my index to do a "find the nearest x" type of search. I just added "find the neares

Re: NumberTools

2005-03-22 Thread Doug Cutting
John Patterson wrote: It would be great if this could be incorporated into Lucene as it will make numeric searches much more efficient. I'd like to see benchmarks that demonstrate the improvement before we consider including such a patch. You're making a lot of assumptions about where time is sp

lucene incremental indexing - efficiency

2005-03-22 Thread sunil goyal
Hello all, I am trying to use Lucene for incremental indexing on the order of a million records daily on a single machine (P4 2.4 GHz, 1 GB RAM). I get messages updated every few minutes, based on which I need to update the index. I am using a StandardAnalyzer and writing documents usin

Re: new added documents not showing

2005-03-22 Thread roy-lucene-user
Pasha, in short, that is all I'm trying to do. Wasn't an issue really before. Otis, not sure what Luke is. But the documents appear after we optimize. Roy. On Mon, 21 Mar 2005 18:20:32 -0800 (PST), Otis Gospodnetic wrote: > * Replies will be sent through Spamex to java-use

Re: boosting?

2005-03-22 Thread Stefan Groschupf
Erik, thanks, I see. Stefan On 22.03.2005 at 02:38, Erik Hatcher wrote: Stefan, Boosts are not stored directly, necessarily. Each field has an associated normalization factor, into which the boost is multiplied. This value is precomputed at indexing time, so getting the boost isn't possible un

Re: NumberTools

2005-03-22 Thread John Patterson
Chris Hostetter writes: > I haven't worked through the math to prove to myself that your algorithm > is a viable way of expressing any Integer as a 4 byte String; such that > any two Integers sort lexicographically correct as strings ... but let's > assume that I have, and that it works

Re: NumberTools

2005-03-22 Thread Chris Hostetter
: > I can see in FieldDocSortedHitQueue where the case statement deals with : > the various types of SortField, but at that point it's comparing FieldDoc : > objects whose fields[i] is expected to already be an "Integer" object. : > where is that "Integer" object parsed from the String value of th