Re: Index Partitioning

2009-03-25 Thread Shashi Kant
Thanks Chris, your suggestion is very appropriate and I am happy to share my work with the Lucene community. Regards, Shashi On Tue, Mar 24, 2009 at 7:15 PM, Chris Hostetter wrote: > > : This is perfect, exactly what I was looking for. Thanks much Andrzej! > > if you code that up and it works o

Re: Index Partitioning

2009-03-24 Thread Chris Hostetter
: This is perfect, exactly what I was looking for. Thanks much Andrzej! if you code that up and it works out well, contributing your code as a Jira attachment could help it become a re-usable tool for others in the future. (a simple command line that takes the directory of the index, a value

Re: Index Partitioning

2009-03-23 Thread Shashi Kant
This is perfect, exactly what I was looking for. Thanks much Andrzej! On Mon, Mar 23, 2009 at 1:43 AM, Andrzej Bialecki wrote: > Shashi Kant wrote: > >> Is there an "elegant" approach to partitioning a large Lucene index (~1TB) >> into smaller sub-indexes other than the obvious method of re-ind

Re: Index Partitioning

2009-03-22 Thread Andrzej Bialecki
Shashi Kant wrote: Is there an "elegant" approach to partitioning a large Lucene index (~1TB) into smaller sub-indexes other than the obvious method of re-indexing into partitions? Any ideas? Try the following: * open your index, and mark all documents as deleted except 1/Nth that should fill
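
A minimal sketch of the marking step described above, written in Python rather than against the Lucene API. It assumes documents are numbered 0..num_docs-1 and that partition membership is chosen round-robin by document number. The message does not say how the 1/Nth slice is picked, so the assignment rule and both function names are hypothetical.

```python
def docs_to_delete(num_docs, num_parts, part):
    """Return the document numbers to mark deleted so that only
    partition `part` (one of num_parts slices) stays live."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    if not 0 <= part < num_parts:
        raise ValueError("part must be in [0, num_parts)")
    # Assumption: document d belongs to partition (d % num_parts).
    return [d for d in range(num_docs) if d % num_parts != part]


def surviving_docs(num_docs, num_parts, part):
    """Return the document numbers left live after the marking step."""
    deleted = set(docs_to_delete(num_docs, num_parts, part))
    return [d for d in range(num_docs) if d not in deleted]
```

Running the marking step once per partition leaves every document live in exactly one pass, which is the property any assignment rule used here would need.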

Index Partitioning

2009-03-21 Thread Shashi Kant
Is there an "elegant" approach to partitioning a large Lucene index (~1TB) into smaller sub-indexes other than the obvious method of re-indexing into partitions? Any ideas? Thanks, Shashi

Re: Index partitioning by term

2007-07-04 Thread Mike Klaas
On 4-Jul-07, at 5:31 AM, Ndapa Nakashole wrote: I am considering using Lucene in my mini Grid-based search engine. I would like to partition my index by term as opposed to partition by document. From what I have read in the mailing list so far, it seems like partition by term is impossible

Re: Index partitioning by term

2007-07-04 Thread Mathieu Lecarme
Ndapa Nakashole wrote: > I am considering using Lucene in my mini Grid-based search engine. I would like to partition my index by term as opposed to partition by document. From what I have read in the mailing list so far, it seems like partition by term is impossible with Lucene. am

Index partitioning by term

2007-07-04 Thread Ndapa Nakashole
I am considering using Lucene in my mini Grid-based search engine. I would like to partition my index by term as opposed to partition by document. From what I have read in the mailing list so far, it seems like partition by term is impossible with Lucene. Am I right to conclude this? I know Nutch

Re: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-12 Thread Paul Smith
nd plan to put Apache in front of them (as long as we can prove the 2 parts of the mirror stay in sync, initially we'll just set apache to favor 1 server, with manual failover until we're completely sure). We have plans to be implemented eventually that include an Index partit

Re: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-12 Thread Chris Hostetter
: Since this isn't in production yet, I'd rather be proven wrong now : rather than later! :) it sounds like what you're doing makes a lot of sense given your situation, and the nature of your data. the one thing you might not have considered yet, which doesn't have to make a big difference in yo

Re: Re[2]: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-11 Thread Paul Smith
Many thanks for confirming the principles should work fine. It is a load off my mind! :) On index update, a small Event is triggered into a Buffer, that is periodically (every 30 seconds) processed to coalesce them, then ensure that any open IndexSearcher in the cache is closed. On 12/07

Re: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-11 Thread Otis Gospodnetic
If you want really real-time updates of search results, then yes. However, maybe you can live with near-real-time results, in which case you can add some logic to your application to check for index version only every N requests/minutes/hours. Otis --- Aalap Parikh <[EMAIL PROTECTED]> wrote:
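
One way the "check only every N requests" logic could look, as a sketch in Python with no Lucene calls. The class name and the `current_version` callable are hypothetical stand-ins for however the application reads the index version.

```python
class ThrottledVersionCheck:
    """Consult the index version only on every Nth request."""

    def __init__(self, every_n, current_version):
        if every_n < 1:
            raise ValueError("every_n must be at least 1")
        self._every_n = every_n
        # Hypothetical callable returning the current index version.
        self._current_version = current_version
        self._count = 0
        self._seen = current_version()

    def should_reopen(self):
        """Return True when a check is due and the version has changed."""
        self._count += 1
        if self._count < self._every_n:
            return False  # not due yet: skip the version lookup entirely
        self._count = 0
        latest = self._current_version()
        if latest != self._seen:
            self._seen = latest
            return True
        return False
```

The trade-off is the one named in the message: results can lag behind the index by up to N requests in exchange for fewer version lookups.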

Re: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-11 Thread Aalap Parikh
>I don't really know a lot about what gets loaded into memory when you >make/use a new searcher, but the one thing i've learned from experience >is >that the FieldCache (which gets used when you sort on a field) contains >every term in the field you are sorting on, and an instance of >FieldCache

Re: Re[2]: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-11 Thread Otis Gospodnetic
Paul - I'm doing the same (smaller indices) for Simpy.com for similar reasons (fast, independent and faster reindexing, etc.). Each index has its own IndexSearcher, and they are kept in a LRU data structure. Before each search the index version is checked, and new IndexSearcher created in case th
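
A sketch of the searcher cache described above, in Python with no Lucene calls. It assumes the truncated sentence means a searcher is recreated when the index version differs from the one it was opened against. `SearcherCache`, `open_searcher` and `current_version` are hypothetical names, not part of any library.

```python
from collections import OrderedDict


class SearcherCache:
    """LRU cache holding one searcher per index, refreshed on version change."""

    def __init__(self, capacity, open_searcher, current_version):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._open = open_searcher        # hypothetical: index name -> searcher
        self._version = current_version   # hypothetical: index name -> version
        self._entries = OrderedDict()     # index name -> (version, searcher)

    def get(self, index_name):
        version = self._version(index_name)
        entry = self._entries.get(index_name)
        if entry is None or entry[0] != version:
            # Missing or stale: open a fresh searcher for this index.
            entry = (version, self._open(index_name))
            self._entries[index_name] = entry
        self._entries.move_to_end(index_name)  # mark as most recently used
        while len(self._entries) > self._capacity:
            # Evict the least recently used entry; a real implementation
            # would also close the evicted searcher here.
            self._entries.popitem(last=False)
        return entry[1]
```

Checking the version on every `get` matches the "before each search" wording; the capacity bounds how many searchers stay open at once.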

Re[2]: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-11 Thread Sven Duzont
Hello, We are already using this design in production for an email job application system. Each client (company) has an account and may have multiple users. When a new client is created, a new lucene index is automatically created when new job-applications arrive for this account. Job applicati

Re: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-10 Thread Paul Smith
On 11/07/2005, at 10:43 AM, Chris Hostetter wrote: : > Generally speaking, you only ever need one active Searcher, which : > all of : > your threads should be able to use. (Of course, Nathan says that : > in his : > code base, doing this causes his JVM to freeze up, but I've never seen : >

Re: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-10 Thread Chris Hostetter
: > Generally speaking, you only ever need one active Searcher, which : > all of : > your threads should be able to use. (Of course, Nathan says that : > in his : > code base, doing this causes his JVM to freeze up, but I've never seen : > this myself). : > : Thanks for your response Chris. Do y

Re: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-10 Thread Paul Smith
On 11/07/2005, at 9:15 AM, Chris Hostetter wrote: : Nathan's point about pooling Searchers is something that we also : addressed by a LRU cache mechanism. In testing we also found that Generally speaking, you only ever need one active Searcher, which all of your threads should be able to u

Re: Index Partitioning ( was Re: Search deadlocking under load)

2005-07-10 Thread Chris Hostetter
: Nathan's point about pooling Searchers is something that we also : addressed by a LRU cache mechanism. In testing we also found that Generally speaking, you only ever need one active Searcher, which all of your threads should be able to use. (Of course, Nathan says that in his code base, doin

Index Partitioning ( was Re: Search deadlocking under load)

2005-07-08 Thread Paul Smith
Nathan, first apologies for somewhat hijacking your thread, but I believe my question to be very related. Nathan's Scenario 1 is the one we're effectively employing (or in the process of setting up). Rather than 1 Index To Rule Them All, I have decided to partition the index structure. Us