Thanks very much for this; I'll give it a shot.
Keith.
On 4 Jul 2008, at 00:02, Paul Smith wrote:
(there are around 6,000,000 posts on the message board database)
Date encoded as yyMMdd: appears to be using around 30M
Date encoded as yyMMddHHmmss: appears to be using more than 400M!
I g
(there are around 6,000,000 posts on the message board database)
Date encoded as yyMMdd: appears to be using around 30M
Date encoded as yyMMddHHmmss: appears to be using more than 400M!
I guess I would have understood if I was seeing the usage double for
sure, or even a little more; no idea
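One way to reason about the gap between the two encodings is to count how many distinct terms each one produces, since any structure kept per unique term grows with that count. The sketch below is only an illustration under stated assumptions: the post timestamps are synthetic (one post every 1,000 seconds from an arbitrary start date), `distinct_terms` is a made-up helper, and nothing here calls Lucene or measures real memory.

```python
from datetime import datetime, timedelta


def distinct_terms(timestamps, pattern):
    """Count the distinct strings produced by formatting each timestamp."""
    return len({ts.strftime(pattern) for ts in timestamps})


# Synthetic data (an assumption): 10,000 posts, one every 1,000 seconds.
start = datetime(2008, 1, 1)
posts = [start + timedelta(seconds=1000 * i) for i in range(10_000)]

day_terms = distinct_terms(posts, "%y%m%d")           # analogous to yyMMdd
second_terms = distinct_terms(posts, "%y%m%d%H%M%S")  # analogous to yyMMddHHmmss

print(day_terms, second_terms)
```

With this synthetic data the day-resolution field collapses to one term per calendar day, while the second-resolution field ends up with one term per post.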
Doug Cutting wrote:
Daniel Noll wrote:
Doug Cutting wrote:
Daniel Noll wrote:
I actually did throw a lot of terms in, and eventually chose "one"
for the tests because it was the slowest query to complete of them
all (hence I figured it was already spending some fairly long time
in I/O, a
On Nov 17, 2005, at 4:16 PM, Daniel Noll wrote:
Doug Cutting wrote:
Daniel Noll wrote:
I actually did throw a lot of terms in, and eventually chose
"one" for the tests because it was the slowest query to complete
of them all (hence I figured it was already spending some fairly
long tim
Daniel Noll wrote:
Doug Cutting wrote:
Daniel Noll wrote:
I actually did throw a lot of terms in, and eventually chose "one"
for the tests because it was the slowest query to complete of them
all (hence I figured it was already spending some fairly long time in
I/O, and would be penalised t
Doug Cutting wrote:
Daniel Noll wrote:
I actually did throw a lot of terms in, and eventually chose "one"
for the tests because it was the slowest query to complete of them
all (hence I figured it was already spending some fairly long time in
I/O, and would be penalised the most.) Every oth
Daniel Noll wrote:
I actually did throw a lot of terms in, and eventually chose "one" for
the tests because it was the slowest query to complete of them all
(hence I figured it was already spending some fairly long time in I/O,
and would be penalised the most.) Every other query was around 7ms
Doug Cutting wrote:
Daniel Noll wrote:
Timings were obtained by performing the same search 1,000 times and
averaging the total time. This was then performed five times in a row
to get the range that's displayed below. Memory usage was obtained
using a 20-second sleep after loading the index,
Daniel Noll wrote:
Timings were obtained by performing the same search 1,000 times and
averaging the total time. This was then performed five times in a row
to get the range that's displayed below. Memory usage was obtained
using a 20-second sleep after loading the index, and then using the
Win
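The timing procedure described above (1,000 searches averaged, repeated five times to get a range) can be sketched roughly as follows. This is a minimal harness, not the code Daniel used: `placeholder_search` is a hypothetical stand-in for the real query, and the memory measurement is not covered.

```python
import time


def average_seconds(search, repeats=1000):
    """Run `search` `repeats` times and return the mean wall-clock time per call."""
    started = time.perf_counter()
    for _ in range(repeats):
        search()
    return (time.perf_counter() - started) / repeats


def timing_range(search, runs=5, repeats=1000):
    """Repeat the averaged measurement `runs` times; return (fastest, slowest)."""
    averages = [average_seconds(search, repeats) for _ in range(runs)]
    return min(averages), max(averages)


def placeholder_search():
    # Hypothetical stand-in for the real query; replace with the search under test.
    sorted(range(100))


low, high = timing_range(placeholder_search)
print(f"{low * 1000:.3f}ms - {high * 1000:.3f}ms")
```

Reporting the range rather than a single average makes run-to-run variation visible, which matters when the differences being compared are small.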
Marvin Humphrey wrote:
The formatting of the results turned up a little screwy in my email
reader, so here's a reformatted version...
I noticed the same thing on Thunderbird, although viewing the source
showed that the original was okay, and KMail didn't seem to have the
same issue. Howeve
Good stuff, Daniel...
Thanks for taking the time to tabulate the results and present them.
If your results hold, it may have a significant impact on my
application. I'm working on a Perl/XS port, and I think a lot of
people who want to run it won't be running mod_perl, so startup times
Doug Cutting wrote:
Marvin Humphrey wrote:
You *can't* set it on the reader end. If you could set it, the
reader would get out of sync and break. The value is set
per-segment at write time, and the reader has to be able to adapt on
the fly.
It would actually not be too hard to change
Sent: Monday, 14 November 2005 18:19
To: java-user@lucene.apache.org
Subject: Re: Memory Usage
Marvin Humphrey wrote:
> You *can't* set it on the reader end. If you could set it, the reader
> would get out of sync and break. The value is set per-segment at write
On Nov 14, 2005, at 9:19 AM, Doug Cutting wrote:
It would actually not be too hard to change things so that there
was such a parameter that could be set on an IndexReader. It would
determine the fraction of entries in the .tii file that are kept in
RAM. So if the parameter were, e.g., 10
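The idea of keeping only a fraction of the .tii entries in RAM can be pictured with a toy model. Everything below is an assumption made for illustration: `load_index_terms`, the `divisor` parameter, and the value 4 are invented here and are not taken from the message or from any Lucene API, and a plain list stands in for the on-disk term index.

```python
def load_index_terms(tii_terms, divisor):
    """Keep every `divisor`-th entry of the on-disk term index in memory."""
    if divisor < 1:
        raise ValueError("divisor must be at least 1")
    return tii_terms[::divisor]


# Hypothetical stand-in for the entries of a .tii file.
tii_terms = [f"term{i:05d}" for i in range(1000)]

in_ram = load_index_terms(tii_terms, 4)
print(len(tii_terms), len(in_ram))
```

The trade-off in such a scheme is that fewer entries in memory means a longer scan between the nearest retained entry and the term being looked up.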
Marvin Humphrey wrote:
You *can't* set it on the reader end. If you could set it, the reader
would get out of sync and break. The value is set per-segment at write
time, and the reader has to be able to adapt on the fly.
It would actually not be too hard to change things so that there was
On Nov 13, 2005, at 10:22 PM, Daniel Noll wrote:
Okay, I've gone and revised how things are fitting together in our
app. It seems that we already call optimize() at the end of all
the processing, before which I could figure out what kind of value
we should be using and call this setter m
Chris Hostetter wrote:
: I think though, that I will need a setter on the reader, rather than the
: writer. That is, I don't know what factor we want until I know how
: large the index is. And I don't know how large the index will be at the
: time of creating the writer, but I can just ask for
On Nov 13, 2005, at 6:27 PM, Chris Hostetter wrote:
I believe if you really want to determine settings like this after
building the index, you'll need to do an initial build of the index using
best guess values -- then if the calculations you do once the index is
built aren't close enough to your
: I think though, that I will need a setter on the reader, rather than the
: writer. That is, I don't know what factor we want until I know how
: large the index is. And I don't know how large the index will be at the
: time of creating the writer, but I can just ask for maxDoc() at the time
: o
On Nov 13, 2005, at 6:11 PM, Daniel Noll wrote:
Now, to figure out how to set it.
There's no setter that I can see... then again it may be in trunk,
and just not in the version we're stuck on for the time being.
I haven't checked 1.4.3, but yes, I'm looking at the subversion
trunk. It's
Marvin Humphrey wrote:
You want indexInterval. Here's an excerpt from the docs in
TermInfosWriter.
Excellent, that looks like exactly what we're after. Now, to figure out
how to set it.
There's no setter that I can see... then again it may be in trunk, and
just not in the version we're
On Nov 9, 2005, at 4:48 PM, Daniel Noll wrote:
My question is: is this 1/128 figure set in stone, or can it be
changed without major consequences?
You want indexInterval. Here's an excerpt from the docs in
TermInfosWriter.
// TODO: the default values for these two parameters
// shoul
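To make the 1/128 figure concrete, here is a toy model of a sparse term index: every 128th term of a sorted term list is held in memory, and a lookup does a binary search over those entries followed by a short sequential scan. This is a sketch under assumptions, not Lucene's TermInfosWriter or reader code; `build_sparse_index` and `find_term` are hypothetical names and the term list is synthetic.

```python
from bisect import bisect_right


def build_sparse_index(sorted_terms, interval=128):
    """Return (term, position) for every `interval`-th term in the sorted list."""
    return [(sorted_terms[i], i) for i in range(0, len(sorted_terms), interval)]


def find_term(sorted_terms, sparse_index, target, interval=128):
    """Locate `target` by binary search in the sparse index, then a short scan."""
    keys = [term for term, _ in sparse_index]  # rebuilt per call to keep the sketch short
    slot = bisect_right(keys, target) - 1
    if slot < 0:
        return -1  # target sorts before the first term
    start = sparse_index[slot][1]
    for pos in range(start, min(start + interval, len(sorted_terms))):
        if sorted_terms[pos] == target:
            return pos
    return -1


terms = [f"term{i:06d}" for i in range(10_000)]  # synthetic, already sorted
index = build_sparse_index(terms)
print(len(index), find_term(terms, index, "term004321"))
```

A larger interval shrinks the in-memory index but lengthens the scan that follows the binary search; a smaller interval does the opposite.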