Re: Memory Usage

2008-07-03 Thread Keith Watson
Thanks very much for this; I'll give it a shot. Keith. On 4 Jul 2008, at 00:02, Paul Smith wrote: (there are around 6,000,000 posts on the message board database) Date encoded as yyMMdd: appears to be using around 30M Date encoded as yyMMddHHmmss: appears to be using more than 400M! I g

Re: Memory Usage

2008-07-03 Thread Paul Smith
(there are around 6,000,000 posts on the message board database) Date encoded as yyMMdd: appears to be using around 30M Date encoded as yyMMddHHmmss: appears to be using more than 400M! I guess I would have understood if I was seeing the usage double for sure, or even a little more; no idea
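
The snippet above is cut off before any cause is given. One plausible factor, stated here as an assumption and not as something the thread confirms, is that a second-resolution encoding produces far more distinct terms than a day-resolution one. The following minimal sketch uses synthetic timestamps (the class name, start date, and spacing are all hypothetical) to compare the number of distinct values under each pattern:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.HashSet;
import java.util.Set;

class DateTermCount {
    /**
     * Counts the distinct encoded strings produced when `count` synthetic
     * timestamps, spaced `stepSeconds` apart, are formatted with `pattern`.
     */
    static int distinctTerms(String pattern, int count, long stepSeconds) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern);
        // Hypothetical start date; the real posts' dates are not known.
        LocalDateTime start = LocalDateTime.of(2008, 1, 1, 0, 0, 0);
        Set<String> terms = new HashSet<>();
        for (int i = 0; i < count; i++) {
            terms.add(fmt.format(start.plusSeconds(i * stepSeconds)));
        }
        return terms.size();
    }

    public static void main(String[] args) {
        int posts = 100_000;   // synthetic sample, far smaller than the 6,000,000 posts mentioned
        long spacing = 37;     // assumed seconds between posts
        System.out.println("yyMMdd       -> " + distinctTerms("yyMMdd", posts, spacing));
        System.out.println("yyMMddHHmmss -> " + distinctTerms("yyMMddHHmmss", posts, spacing));
    }
}
```

If distinct-term count is indeed what drives the memory figure, coarsening the encoded resolution to what the application actually needs would be the first thing to try.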

Re: Memory Usage

2005-11-17 Thread Daniel Noll
Doug Cutting wrote: Daniel Noll wrote: Doug Cutting wrote: Daniel Noll wrote: I actually did throw a lot of terms in, and eventually chose "one" for the tests because it was the slowest query to complete of them all (hence I figured it was already spending some fairly long time in I/O, a

Re: Memory Usage

2005-11-17 Thread Marvin Humphrey
On Nov 17, 2005, at 4:16 PM, Daniel Noll wrote: Doug Cutting wrote: Daniel Noll wrote: I actually did throw a lot of terms in, and eventually chose "one" for the tests because it was the slowest query to complete of them all (hence I figured it was already spending some fairly long tim

Re: Memory Usage

2005-11-17 Thread Doug Cutting
Daniel Noll wrote: Doug Cutting wrote: Daniel Noll wrote: I actually did throw a lot of terms in, and eventually chose "one" for the tests because it was the slowest query to complete of them all (hence I figured it was already spending some fairly long time in I/O, and would be penalised t

Re: Memory Usage

2005-11-17 Thread Daniel Noll
Doug Cutting wrote: Daniel Noll wrote: I actually did throw a lot of terms in, and eventually chose "one" for the tests because it was the slowest query to complete of them all (hence I figured it was already spending some fairly long time in I/O, and would be penalised the most.) Every oth

Re: Memory Usage

2005-11-17 Thread Doug Cutting
Daniel Noll wrote: I actually did throw a lot of terms in, and eventually chose "one" for the tests because it was the slowest query to complete of them all (hence I figured it was already spending some fairly long time in I/O, and would be penalised the most.) Every other query was around 7ms

Re: Memory Usage

2005-11-16 Thread Daniel Noll
Doug Cutting wrote: Daniel Noll wrote: Timings were obtained by performing the same search 1,000 times and averaging the total time. This was then performed five times in a row to get the range that's displayed below. Memory usage was obtained using a 20-second sleep after loading the index,

Re: Memory Usage

2005-11-16 Thread Doug Cutting
Daniel Noll wrote: Timings were obtained by performing the same search 1,000 times and averaging the total time. This was then performed five times in a row to get the range that's displayed below. Memory usage was obtained using a 20-second sleep after loading the index, and then using the Win
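
The timing procedure described above (run the same search 1,000 times, average, then repeat five times to get a range) can be sketched roughly as follows. This is not the harness used in the thread; the class name is hypothetical and the `Runnable` stands in for whatever search call is being measured.

```java
class SearchTiming {
    /** Runs `search` `iterations` times and returns the mean time per run in milliseconds. */
    static double averageMillis(Runnable search, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            search.run();
        }
        return (System.nanoTime() - start) / 1e6 / iterations;
    }

    /** Repeats the averaged measurement `rounds` times and returns {min, max} of the averages. */
    static double[] range(Runnable search, int iterations, int rounds) {
        double min = Double.MAX_VALUE;
        double max = 0.0;
        for (int r = 0; r < rounds; r++) {
            double avg = averageMillis(search, iterations);
            min = Math.min(min, avg);
            max = Math.max(max, avg);
        }
        return new double[] { min, max };
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the real search call being measured.
        Runnable placeholderSearch = () -> {
            long sum = 0;
            for (int i = 0; i < 10_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        double[] r = range(placeholderSearch, 1000, 5);
        System.out.printf("average per search: %.4f ms to %.4f ms%n", r[0], r[1]);
    }
}
```

Reporting the range of the five averages, as the original message does, gives a rough sense of run-to-run noise that a single average would hide.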

Re: Memory Usage

2005-11-15 Thread Daniel Noll
Marvin Humphrey wrote: The formatting of the results turned up a little screwy in my email reader, so here's a reformatted version... I noticed the same thing on Thunderbird, although viewing the source showed that the original was okay, and KMail didn't seem to have the same issue. Howeve

Re: Memory Usage

2005-11-15 Thread Marvin Humphrey
Good stuff, Daniel... Thanks for taking the time to tabulate the results and present them. If your results hold, it may have a significant impact on my application. I'm working on a Perl/XS port, and I think a lot of people who want to run it won't be running mod_perl, so startup times

Re: Memory Usage

2005-11-15 Thread Daniel Noll
Doug Cutting wrote: Marvin Humphrey wrote: You *can't* set it on the reader end. If you could set it, the reader would get out of sync and break. The value is set per-segment at write time, and the reader has to be able to adapt on the fly. It would actually not be too hard to change

RE: Memory Usage

2005-11-15 Thread Vanlerberghe, Luc
Sent: Monday, 14 November 2005 18:19 To: java-user@lucene.apache.org Subject: Re: Memory Usage Marvin Humphrey wrote: You *can't* set it on the reader end. If you could set it, the reader would get out of sync and break. The value is set per-segment at write

Re: Memory Usage

2005-11-14 Thread Marvin Humphrey
On Nov 14, 2005, at 9:19 AM, Doug Cutting wrote: It would actually not be too hard to change things so that there was such a parameter that could be set on an IndexReader. It would determine the fraction of entries in the .tii file that are kept in RAM. So if the parameter were, e.g., 10
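
The idea quoted above, keeping only a fraction of the term index entries in RAM, can be illustrated with a small stand-alone sketch. It is not Lucene code: the class, the `keepEvery` parameter, and the in-memory array standing in for the on-disk term dictionary are all hypothetical, and the truncated message does not say how the proposed parameter would actually be interpreted. The writer-side `indexInterval` samples every Nth term; the assumed reader-side `keepEvery` then retains only every Kth of those samples, trading a longer sequential scan for less memory.

```java
import java.util.ArrayList;
import java.util.List;

class SparseTermIndex {
    private final String[] sortedTerms;                     // stands in for the full on-disk term dictionary
    private final List<Integer> inRam = new ArrayList<>();  // positions of the sampled terms held in memory
    private final int step;                                 // terms covered by one in-memory entry

    SparseTermIndex(String[] sortedTerms, int indexInterval, int keepEvery) {
        this.sortedTerms = sortedTerms;
        this.step = indexInterval * keepEvery;
        for (int pos = 0; pos < sortedTerms.length; pos += step) {
            inRam.add(pos);
        }
    }

    int entriesInRam() {
        return inRam.size();
    }

    /** Returns the position of `term` in the dictionary, or -1 if it is absent. */
    int lookup(String term) {
        // Binary search for the last sampled term that is <= the target.
        int lo = 0, hi = inRam.size() - 1, block = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedTerms[inRam.get(mid)].compareTo(term) <= 0) {
                block = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (block < 0) {
            return -1; // target sorts before the first term
        }
        // Sequential scan of at most `step` terms, standing in for the disk read.
        int from = inRam.get(block);
        int to = Math.min(sortedTerms.length, from + step);
        for (int pos = from; pos < to; pos++) {
            int c = sortedTerms[pos].compareTo(term);
            if (c == 0) return pos;
            if (c > 0) break;
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] terms = new String[10_000];
        for (int i = 0; i < terms.length; i++) {
            terms[i] = String.format("t%06d", i); // zero-padded so the array is already sorted
        }
        for (int keepEvery : new int[] { 1, 10 }) {
            SparseTermIndex idx = new SparseTermIndex(terms, 128, keepEvery);
            System.out.println("keepEvery=" + keepEvery
                    + " entriesInRam=" + idx.entriesInRam()
                    + " lookup(t005000)=" + idx.lookup("t005000"));
        }
    }
}
```

Because the samples are taken from data that is already written, a reader-side knob of this kind would not change the on-disk format, which is what distinguishes it from changing the write-time interval.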

Re: Memory Usage

2005-11-14 Thread Doug Cutting
Marvin Humphrey wrote: You *can't* set it on the reader end. If you could set it, the reader would get out of sync and break. The value is set per-segment at write time, and the reader has to be able to adapt on the fly. It would actually not be too hard to change things so that there was

Re: Memory Usage

2005-11-14 Thread Marvin Humphrey
On Nov 13, 2005, at 10:22 PM, Daniel Noll wrote: Okay, I've gone and revised how things are fitting together in our app. It seems that we already call optimize() at the end of all the processing, before which I could figure out what kind of value we should be using and call this setter m

Re: Memory Usage

2005-11-13 Thread Daniel Noll
Chris Hostetter wrote: : I think though, that I will need a setter on the reader, rather than the writer. That is, I don't know what factor we want until I know how large the index is. And I don't know how large the index will be at the time of creating the writer, but I can just ask for

Re: Memory Usage

2005-11-13 Thread Marvin Humphrey
On Nov 13, 2005, at 6:27 PM, Chris Hostetter wrote: I believe if you really want to determine settings like this after building the index, you'll need to do an initial build of the index using best-guess values -- then if the calculations you do once the index is built aren't close enough to your

Re: Memory Usage

2005-11-13 Thread Chris Hostetter
: I think though, that I will need a setter on the reader, rather than the writer. That is, I don't know what factor we want until I know how large the index is. And I don't know how large the index will be at the time of creating the writer, but I can just ask for maxDoc() at the time o

Re: Memory Usage

2005-11-13 Thread Marvin Humphrey
On Nov 13, 2005, at 6:11 PM, Daniel Noll wrote: Now, to figure out how to set it. There's no setter that I can see... then again it may be in trunk, and just not in the version we're stuck on for the time being. I haven't checked 1.4.3, but yes, I'm looking at the subversion trunk. It's

Re: Memory Usage

2005-11-13 Thread Daniel Noll
Marvin Humphrey wrote: You want indexInterval. Here's an excerpt from the docs in TermInfosWriter. Excellent, that looks like exactly what we're after. Now, to figure out how to set it. There's no setter that I can see... then again it may be in trunk, and just not in the version we're

Re: Memory Usage

2005-11-09 Thread Marvin Humphrey
On Nov 9, 2005, at 4:48 PM, Daniel Noll wrote: My question is: is this 1/128 figure set in stone, or can it be changed without major consequences? You want indexInterval. Here's an excerpt from the docs in TermInfosWriter. // TODO: the default values for these two parameters // shoul