Thanks Chris, your suggestion is very appropriate and I am happy to share my
work with the Lucene community.
Regards,
Shashi
On Tue, Mar 24, 2009 at 7:15 PM, Chris Hostetter
wrote:
>
> : This is perfect, exactly what I was looking for. Thanks much Andrzej!
>
> if you code that up and it works o
: This is perfect, exactly what I was looking for. Thanks much Andrzej!
if you code that up and it works out well, contributing your code as a
Jira attachment could help it become a re-usable tool for others in the
future.
(a simple command line that takes the directory of the index, a
value
This is perfect, exactly what I was looking for. Thanks much Andrzej!
On Mon, Mar 23, 2009 at 1:43 AM, Andrzej Bialecki wrote:
> Shashi Kant wrote:
>
>> Is there an "elegant" approach to partitioning a large Lucene index (~1TB)
>> into smaller sub-indexes other than the obvious method of re-ind
Shashi Kant wrote:
Is there an "elegant" approach to partitioning a large Lucene index (~1TB)
into smaller sub-indexes other than the obvious method of re-indexing into
partitions?
Any ideas?
Try the following:
* open your index, and mark all documents as deleted except 1/Nth that
should fill
Is there an "elegant" approach to partitioning a large Lucene index (~1TB)
into smaller sub-indexes other than the obvious method of re-indexing into
partitions?
Any ideas?
Thanks,
Shashi
On 4-Jul-07, at 5:31 AM, Ndapa Nakashole wrote:
I am considering using Lucene in my mini Grid-based search engine. I would
like to partition my index by term as opposed to partition by document. From
what I have read in the mailing list so far, it seems like partition by term
is impossible
Ndapa Nakashole a écrit :
> I am considering using Lucene in my mini Grid-based search engine. I would
> like to partition my index by term as opposed to partition by document. From
> what I have read in the mailing list so far, it seems like partition by term
> is impossible with Lucene. am
I am considering using Lucene in my mini Grid-based search engine. I would
like to partition my index by term as opposed to partition by document. From
what I have read in the mailing list so far, it seems like partition by term
is impossible with Lucene. Am I right to conclude this? I know Nutch
nd
plan to put Apache in front of them (as long as we can prove the 2
parts of the mirror stay in sync, initially we'll just set apache to
favor 1 server, with manual failover until we're completely sure).
We have plans to be implemented eventually that include an Index
partit
: Since this isn't in production yet, I'd rather be proven wrong now
: rather than later! :)
it sounds like what you're doing makes a lot of sense given your
situation, and the nature of your data.
the one thing you might not have considered yet, which doesn't have to
make a big difference in yo
Many thanks for confirming the principles should work fine. It is a
load off my mind! :)
On index update, a small Event is triggered into a Buffer, that is
periodically (every 30 seconds) processed to coalesce them, then
ensure that any open IndexSearcher in the cache is closed.
On 12/07
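The buffering step described above (update events collected, then coalesced on a timer before cached searchers are closed) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name IndexUpdateBuffer, the use of index names as events, and the closeSearcher callback are all hypothetical and not taken from the poster's code.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Consumer;

/** Collects "index updated" events and coalesces duplicates until the next flush. */
class IndexUpdateBuffer {
    // A set, so repeated updates to the same index collapse into one entry.
    private final Set<String> pending = new LinkedHashSet<>();
    // Hypothetical callback: closes (or evicts) the cached searcher for one index.
    private final Consumer<String> closeSearcher;

    IndexUpdateBuffer(Consumer<String> closeSearcher) {
        this.closeSearcher = closeSearcher;
    }

    /** Called by the indexing code whenever an index has been updated. */
    synchronized void indexUpdated(String indexName) {
        pending.add(indexName);
    }

    /** Drains the buffer and closes each affected searcher once; returns how many were closed. */
    int flush() {
        Set<String> batch;
        synchronized (this) {
            batch = new LinkedHashSet<>(pending);
            pending.clear();
        }
        for (String indexName : batch) {
            closeSearcher.accept(indexName);
        }
        return batch.size();
    }
}
```

A timer (for example a ScheduledExecutorService firing every 30 seconds) would call flush(); keeping the callback outside the lock means slow close operations do not block new events.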
If you want really real-time updates of search results, then yes.
However, maybe you can live with near-real-time results, in which case
you can add some logic to your application to check for index version
only every N requests/minutes/hours.
Otis
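A minimal sketch of the "check only every N requests" idea, assuming the index version can be read through some cheap call. Here versionReader and searcherFactory are hypothetical stand-ins for whatever the application uses to read the index version and open a searcher; neither is a Lucene API.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hands out a cached searcher and re-checks the index version only every N requests. */
class ThrottledSearcherHolder<S> {
    private final LongSupplier versionReader;  // hypothetical: returns the current index version
    private final Supplier<S> searcherFactory; // hypothetical: opens a fresh searcher
    private final int checkEvery;              // N: requests between version checks
    private int requestsSinceCheck = 0;
    private long knownVersion;
    private S searcher;

    ThrottledSearcherHolder(LongSupplier versionReader, Supplier<S> searcherFactory, int checkEvery) {
        this.versionReader = versionReader;
        this.searcherFactory = searcherFactory;
        this.checkEvery = checkEvery;
        this.knownVersion = versionReader.getAsLong();
        this.searcher = searcherFactory.get();
    }

    /** Returns the cached searcher, reopening it only if a periodic check sees a new version. */
    synchronized S get() {
        requestsSinceCheck++;
        if (requestsSinceCheck >= checkEvery) {
            requestsSinceCheck = 0;
            long current = versionReader.getAsLong();
            if (current != knownVersion) {
                knownVersion = current;
                // A real application would also close the old searcher once it is idle.
                searcher = searcherFactory.get();
            }
        }
        return searcher;
    }
}
```

The same shape works for a time-based check: replace the request counter with a timestamp comparison.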
--- Aalap Parikh <[EMAIL PROTECTED]> wrote:
>I don't really know a lot about what gets loaded into memory when you
>make/use a new searcher, but the one thing I've learned from experience is
>that the FieldCache (which gets used when you sort on a field) contains
>every term in the field you are sorting on, and an instance of
>FieldCache
Paul - I'm doing the same (smaller indices) for Simpy.com for similar
reasons (fast, independent and faster reindexing, etc.). Each index
has its own IndexSearcher, and they are kept in a LRU data structure.
Before each search the index version is checked, and a new IndexSearcher
created in case th
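The arrangement described above (one searcher per small index, kept in an LRU structure, with a version check before each search) might look something like this. It is a sketch under assumptions, not the Simpy.com code: versionOf and open are hypothetical callbacks, and a plain LinkedHashMap in access order stands in for whatever LRU structure is actually used.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.ToLongFunction;

/** LRU cache of per-index searchers that reopens a searcher when its index version changes. */
class SearcherLruCache<S> {
    /** A cached searcher together with the index version it was opened against. */
    private static final class Cached<T> {
        final T searcher;
        final long version;
        Cached(T searcher, long version) { this.searcher = searcher; this.version = version; }
    }

    private final ToLongFunction<String> versionOf; // hypothetical: current version of a named index
    private final Function<String, S> open;         // hypothetical: opens a searcher for a named index
    private final LinkedHashMap<String, Cached<S>> cache;

    SearcherLruCache(final int capacity, ToLongFunction<String> versionOf, Function<String, S> open) {
        this.versionOf = versionOf;
        this.open = open;
        // accessOrder=true keeps the least recently used entry first in iteration order.
        this.cache = new LinkedHashMap<String, Cached<S>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Cached<S>> eldest) {
                // A real application would close eldest.getValue().searcher here.
                return size() > capacity;
            }
        };
    }

    /** Returns a searcher for the index, reopening it if the index version has changed. */
    synchronized S get(String indexName) {
        long current = versionOf.applyAsLong(indexName);
        Cached<S> hit = cache.get(indexName); // also marks the entry as recently used
        if (hit == null || hit.version != current) {
            hit = new Cached<>(open.apply(indexName), current);
            cache.put(indexName, hit);
        }
        return hit.searcher;
    }

    synchronized boolean isCached(String indexName) { return cache.containsKey(indexName); }

    synchronized int size() { return cache.size(); }
}
```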
Hello,
We are already using this design in production for an email job application
system.
Each client (company) has an account and may have multiple users.
When a new client is created, a new Lucene index is automatically created once
job applications arrive for this account.
Job applicati
On 11/07/2005, at 10:43 AM, Chris Hostetter wrote:
: > Generally speaking, you only ever need one active Searcher, which all of
: > your threads should be able to use. (Of course, Nathan says that in his
: > code base, doing this causes his JVM to freeze up, but I've never seen
: >
: > Generally speaking, you only ever need one active Searcher, which
: > all of
: > your threads should be able to use. (Of course, Nathan says that
: > in his
: > code base, doing this causes his JVM to freeze up, but I've never seen
: > this myself).
: >
: Thanks for your response Chris. Do y
On 11/07/2005, at 9:15 AM, Chris Hostetter wrote:
: Nathan's point about pooling Searchers is something that we also
: addressed by a LRU cache mechanism. In testing we also found that
Generally speaking, you only ever need one active Searcher, which all of
your threads should be able to u
: Nathan's point about pooling Searchers is something that we also
: addressed by a LRU cache mechanism. In testing we also found that
Generally speaking, you only ever need one active Searcher, which all of
your threads should be able to use. (Of course, Nathan says that in his
code base, doin
Nathan, first apologies for somewhat hijacking your thread, but I
believe my question to be very related.
Nathan's Scenario 1 is the one we're effectively employing (or in the
process of setting up). Rather than 1 Index To Rule Them All, I have
decided to partition the index structure. Us