I hope that helps. If you find anything interesting, do post it somewhere.
I'm afraid I'm a little bit far away from New Orleans at the moment.
Regards.
2008/11/4 Todd Benge <[EMAIL PROTECTED]>
Thanks Pablo.
I'll be flying to New Orleans tomorrow for ApacheCon and would love
the opportunity to talk with others about architectures others are
using.
Todd
On 11/4/08, PabloS <[EMAIL PROTECTED]> wrote:
Sure Todd,
The idea basically consists of the following:
- Subclassing FieldSortedHitQueue and calling super with an empty
SortField array: this disables caching because the comparators are retrieved
during construction
- Creating a new SortComparatorSource that creates the sort comparators
sim
Pablo,
Would you mind adding a little more detail about how you're working
around the problem?
I'm still evaluating our different options so am interested in what you did.
Todd
On Mon, Nov 3, 2008 at 2:37 PM, PabloS <[EMAIL PROTECTED]> wrote:
Thanks hossman, but I've already 'solved' the problem without the need to
patch lucene. I had to code a bit around Lucene's visibility restrictions
but I've managed to completely skip the field caching mechanism and add
ehcache to it.
At the moment it seems to be working quite well, although not
: I'm having a similar problem with my application, although we are using
: Lucene 2.3.2. The problem we have is that we are required to sort on most of
: the fields (20 at least). Is there any way of changing the cache being used?
there is a patch in Jira that takes a completely different approa
Thanks for the quick reply :). For now, I'd settle for just storing cache
values in soft references so at least the GC would be able to free up some
space when it needs to.
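A minimal sketch of the soft-reference idea mentioned above, in plain Java (8 or later) with no Lucene classes. `SoftValueCache` and its methods are hypothetical names used only for illustration. How to plug such a cache into Lucene's sorting is not shown here and would depend on the workaround discussed elsewhere in this thread.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical cache whose values the GC may reclaim under memory pressure. */
class SoftValueCache<K, V> {
    private final ConcurrentHashMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    /**
     * Returns the cached value for key, or calls loader to rebuild it if the
     * value was never loaded or has been cleared by the garbage collector.
     */
    V get(K key, Function<K, V> loader) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            // Only a soft reference is stored, so the cache itself never
            // prevents the value from being collected.
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

The trade-off is that a cleared entry has to be rebuilt on the next access, so under sustained memory pressure the cost moves from heap space to repeated loading.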
I think I'll just try to override the default sorting mechanism by
subclassing FieldSortedHitQueue. I'll let you know how i
20 fields on a huge index? Wow - not sure there is a ton you can do with
that...anyone have any suggestions for that one? Distributed should help
I suppose, but that's a lot of sort fields for a large index.
If LUCENE-831 ever gets off the ground you will be able to change the
cache used, and p
Hi,
I'm having a similar problem with my application, although we are using
Lucene 2.3.2. The problem we have is that we are required to sort on most of
the fields (20 at least). Is there any way of changing the cache being used?
I can't seem to find a way, since the cache is being accessed using
to write a more optimized custom field cache then the above
code may be a useful start point.
Cheers,
Mark
- Original Message
From: Mark Miller <[EMAIL PROTECTED]>
To: "java-user@lucene.apache.org"
Sent: Thursday, 30 October, 2008 10:37:48
Subject: Re: OutOfMemory Problems Lucene 2.4 / Tomcat
Michael's got some great points (he's the Lucene master), especially
possibly turning off norms if you can, but for an index like that I'd
recommend Solr. Solr sharding can be scaled to billions (min a billion
or two anyway) with few limitations (of course there are a few). Plus
it has further
The terms index (*.tii), which is loaded entirely into RAM, can
consume an unexpectedly large amount of memory when there are an
unusually high number of terms. If you are not using compound file
format, can you look at the size of *.tii?
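One way to check this is sketched below, assuming the index lives in an ordinary filesystem directory and is not in compound file format (with compound files the terms index sits inside the .cfs file, so nothing would be found). `TiiSize` and `totalTiiBytes` are hypothetical names, and the code uses only the JDK, not any Lucene API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class TiiSize {
    /** Sums the sizes, in bytes, of all *.tii files directly inside indexDir. */
    static long totalTiiBytes(Path indexDir) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir, "*.tii")) {
            for (Path file : files) {
                total += Files.size(file);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Index directory is taken from the first argument; defaults to the current directory.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(totalTiiBytes(indexDir) + " bytes in *.tii files under " + indexDir);
    }
}
```

With around 100 indices, running this once per index directory and adding up the results gives a rough idea of how much of the heap the terms indexes could account for.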
If this is what is affecting you, one simple wor
Thanks Mark. I appreciate the help.
I thought our memory may be low but wanted to verify whether there is
any way to control memory usage. I think we'll likely upgrade the
memory on the machines but that may just delay the inevitable.
Wondering if anyone else has encountered similar issues wit
The term, terminfo, indexreader internals stuff is prob on the low end
compared to the size of your field caches (needed for sorting). If you
are sorting by String I think the space needed is 32 bits x number of
docs + an array to hold all of the unique terms. So checking 300 million
docs (I kn
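The estimate above (32 bits per document plus an array of the unique terms) can be worked through as a small calculation. This is only a back-of-the-envelope sketch: `StringSortCacheEstimate` is a hypothetical helper, and the unique-term count and average bytes per term in `main` are assumed figures, not numbers from this thread. For the per-document part alone, 300 million documents come to 300,000,000 x 4 bytes, about 1.2 GB per String sort field across all indices.

```java
class StringSortCacheEstimate {
    /** Per-document part: one 32-bit (4-byte) entry for every document. */
    static long ordinalBytes(long numDocs) {
        return numDocs * 4L;
    }

    /** Unique-terms part: avgBytesPerTerm is an assumed average, including object overhead. */
    static long termBytes(long uniqueTerms, long avgBytesPerTerm) {
        return uniqueTerms * avgBytesPerTerm;
    }

    /** Rough total for one String sort field. */
    static long totalBytes(long numDocs, long uniqueTerms, long avgBytesPerTerm) {
        return ordinalBytes(numDocs) + termBytes(uniqueTerms, avgBytesPerTerm);
    }

    public static void main(String[] args) {
        long numDocs = 300_000_000L;     // document count mentioned in the thread
        long uniqueTerms = 10_000_000L;  // assumption for illustration only
        long avgBytesPerTerm = 60L;      // assumption for illustration only
        System.out.println("per-document array: " + ordinalBytes(numDocs) + " bytes");
        System.out.println("total estimate:     "
                + totalBytes(numDocs, uniqueTerms, avgBytesPerTerm) + " bytes");
    }
}
```

The result applies to each String field that is sorted on, so sorting on several fields multiplies the figure accordingly.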
There's usually only a couple sort fields and a bunch of terms in the
various indices. The terms are user entered on various media so the
number of terms is very large.
Thanks for the help.
Todd
How many fields are you sorting on? Lots of unique terms in those
fields?
- Mark
On Oct 29, 2008, at 6:03 PM, "Todd Benge" <[EMAIL PROTECTED]> wrote:
Hi,
I'm the lead engineer for search on a large website using Lucene for
search.
We're indexing about 300M documents in ~ 100 indices.