Re: Clustering with Lucene?

2011-04-26 Thread vivek sar
Thanks Dawid. I was trying to give an example, but this is not exactly our text. Our fields include things like "user name", "IP Address", "Application Name", "Port 3", "Byte Count" - all network related stuff. So, if a user searches on a certain IP address then we would need to group the result by u

Re: Clustering with Lucene?

2011-04-26 Thread vivek sar
? > > With the sizes you report Carrot2 won't work for you, I'm afraid, but > Mahout may. Still, there's plenty of algorithms and preprocessing > options to consider, so if you provide more background somebody may > push you in the right direction. > > Dawid >

Clustering with Lucene?

2011-04-26 Thread vivek sar
Hi, I've been researching clustering with Lucene. Here is what I've found so far, 1) Lucene clustering with Carrot2 - http://download.carrot2.org/head/manual/#section.getting-started.lucene - but this seems suitable only for a smaller index (a few hundred documents) - http://downlo

Re: background merge hit exception

2009-02-25 Thread vivek sar
Hi, We ran into the same issue (corrupted index) using Lucene 2.4.0. There was no outage or system reboot - not sure how it could get corrupted. Here is the exception, Caused by: java.io.IOException: background merge hit exception: _io5:c66777491 _nh9:c10656736 _taq:c2021563 _s8m:c1421051 _uh5:

Re: Background merge hit exception

2008-09-18 Thread vivek sar
are running in JRE 1.5+ >environment, you can set the default exception handler for >threads to do your own logging: > > > > http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Thread.html#setDefaultUncaughtExceptionHandler(java.lang.Thread.UncaughtExceptionHandler)
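The suggestion quoted above (installing a default uncaught-exception handler so that failures in background threads get logged) can be sketched as follows. This is only an illustration: the class name, the thread name, and the simulated failure are hypothetical and not taken from the thread, and the lambda syntax needs Java 8 or later, whereas the original discussion concerned JRE 1.5.

```java
import java.util.concurrent.atomic.AtomicReference;

public class UncaughtHandlerSketch {
    // Remembers the most recent uncaught exception so callers can inspect it.
    static final AtomicReference<Throwable> lastError = new AtomicReference<>();

    // Installs a JVM-wide handler for exceptions that escape any thread's run().
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            lastError.set(error);
            // Replace with the application's own logging (assumption: stderr is enough here).
            System.err.println("Uncaught exception in thread "
                    + thread.getName() + ": " + error);
        });
    }

    public static void main(String[] args) throws InterruptedException {
        install();
        // Hypothetical background thread that fails, standing in for a merge thread.
        Thread worker = new Thread(() -> {
            throw new IllegalStateException("simulated background failure");
        }, "background-worker");
        worker.start();
        worker.join(); // the handler has run by the time the thread has terminated
    }
}
```

The handler only sees exceptions that nothing else caught, so it complements rather than replaces error handling inside the indexing code.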

Background merge hit exception

2008-09-17 Thread vivek sar
Hi, We have been running Lucene 2.3 for the last few months with our application and all of a sudden we have hit the following exception, java.lang.RuntimeException: java.io.IOException: background merge hit exception: _2uxy:c11345949 _2uxz:c150 _2uy0:c150 _2uy1:c150 _2uy2:c150 _2uy3:c150 _2uy

java.lang.IllegalArgumentException: Segment is too large

2008-03-30 Thread vivek sar
Hi, I'm using the Lucene 2.3.0 build and have the following merge parameters, mergeFactor = 100 maxMergeDocs = 9 maxBufferedDocs = 1 maxRAMBufferSizeMB = 200 After running with these settings for a month without problems, all of a sudden I'm getting the following exception, java.lang.IllegalArgumentE

Re: DefaultIndexAccessor

2008-02-28 Thread vivek sar
7:14 PM, Mark Miller <[EMAIL PROTECTED]> wrote: > > > vivek sar wrote: > > Mark, > > > > Just for my clarification, > > > > 1) Would you have indexStop and indexStart methods? If that's the case > > then I don't have to call clos

Re: DefaultIndexAccessor

2008-02-28 Thread vivek sar
e and your > code should work as it did. I'll be sure to add this to the test cases. > > > Just as a personal interest question, what has led you to setup your > index this way? Adding partitions as it grows that is. > > > > - Mark > > vivek

Re: DefaultIndexAccessor

2008-02-28 Thread vivek sar
that after the Executor gets shutdown it is > not reopened in the open method. I can certainly change this, but I need > to look for any other issues as well. I will add an open after a > shutdown test to investigate. I am going to think about the issue > further and I will get back

Re: DefaultIndexAccessor

2008-02-28 Thread vivek sar
te? Any thing else I can check? Thanks, -vivek On Thu, Feb 28, 2008 at 1:26 PM, vivek sar <[EMAIL PROTECTED]> wrote: > Mark, > > We deployed our indexer (using defaultIndexAccessor) on one of the > production site and getting this error, > > Caused by: java.util.concur

Re: DefaultIndexAccessor

2008-02-28 Thread vivek sar
running your latest IndexAccessor-021508 code. Any ideas (it's kind of urgent for us)? Thanks, -vivek On Fri, Feb 15, 2008 at 6:50 PM, vivek sar <[EMAIL PROTECTED]> wrote: > Mark, > > Thanks for the quick fix. Actually, it is possible that there might > had been simulta

Re: DefaultIndexAccessor

2008-02-15 Thread vivek sar
iller <[EMAIL PROTECTED]> wrote: > Here is the fix: https://issues.apache.org/jira/browse/LUCENE-1026 > > > vivek sar wrote: > > Mark, > > > >There seems to be some issue with DefaultMultiIndexAccessor.java. I > > got following NPE exception, > > >

Re: DefaultIndexAccessor

2008-02-15 Thread vivek sar
ease which sub > Searcher. It's all rather simple, and I am struggling to see another > possibility beyond returning a foreign MultiSearcher somehow. > > I will keep looking and keep you posted. In the mean time, do you have > any other data or code snippets to share? > > >

Re: DefaultIndexAccessor

2008-02-15 Thread vivek sar
Mark, There seems to be some issue with DefaultMultiIndexAccessor.java. I got the following NPE, 2008-02-13 07:10:28,021 ERROR [http-7501-Processor6] ReportServiceImpl - java.lang.NullPointerException at org.apache.lucene.indexaccessor.DefaultMultiIndexAccessor.release(Defa

Luke for Lucene 2.3?

2008-01-29 Thread vivek sar
Hi, Has anyone tried Luke v0.7.1 with the latest Lucene build, v2.3? I'm getting "Unknown format version: -4" error when opening Lucene 2.3 index with Luke 0.7.1. Is there any upgraded version of Luke anywhere? I also read something about web-based Luke, but can't find it in the contrib in 2.3,

Re: Archiving Index using partitions

2008-01-24 Thread vivek sar
m/ -- Lucene - Solr - Nutch > > - Original Message > From: vivek sar <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent: Monday, January 21, 2008 3:06:50 PM > Subject: Archiving Index using partitions > > Hi, > > As a requirement I need to be abl

Re: Using RangeFilter

2008-01-24 Thread vivek sar
I have a field set as NO_NORM; does it have to be untokenized to be able to sort on it? On Jan 21, 2008 12:47 PM, Antony Bowesman <[EMAIL PROTECTED]> wrote: > vivek sar wrote: > > I need to be able to sort on optime as well, thus need to store it . > > Lucene's default

Archiving Index using partitions

2008-01-21 Thread vivek sar
Hi, As a requirement I need to be able to archive any indexes older than 2 weeks (due to space and performance reasons). That means I would need to maintain weekly indexes. Here are my questions, 1) What's the best way to partition indexes using Lucene? 2) Is there a way I can partition document
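One way to picture the weekly partitioning asked about above is to derive an index directory name from each document's date and to archive a partition once its last day is more than two weeks old. The sketch below uses only the Java standard library (java.time, Java 8+). The naming scheme `index-YYYY-Www`, the ISO week definition, and both method names are assumptions for illustration, not something proposed in the thread.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;
import java.util.Locale;

public class WeeklyPartitionSketch {
    // Maps a document date to a hypothetical per-week index directory name,
    // using ISO weeks (Monday to Sunday).
    static String partitionName(LocalDate date) {
        int year = date.get(IsoFields.WEEK_BASED_YEAR);
        int week = date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
        return String.format(Locale.ROOT, "index-%04d-W%02d", year, week);
    }

    // A partition is archivable when its last day is more than two weeks before today.
    static boolean isArchivable(LocalDate lastDayOfPartition, LocalDate today) {
        return lastDayOfPartition.isBefore(today.minusWeeks(2));
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2008, 1, 21);
        System.out.println(partitionName(today));
        System.out.println(isArchivable(LocalDate.of(2008, 1, 6), today));
    }
}
```

With a scheme like this, new documents always go to the current week's index, and searches only need to open the partitions whose weeks overlap the queried date range.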

Re: Optimize for large index size

2008-01-20 Thread vivek sar
ergeDocs=9 -- do you really mean maxMergeDocs and not maxBufferedDocs? > > Larg(er) maxBufferedDocs will speed up indexing. > > Otis > > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > - Original Message > From: vivek sar <[EMAIL PROTECT

Re: Using RangeFilter

2008-01-19 Thread vivek sar
y not just index them and not > store them if index size is a concern? > > Otis > > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > - Original Message > From: vivek sar <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Sent:

Using RangeFilter

2008-01-19 Thread vivek sar
Hi, I have a requirement to filter out documents by date range. I'm using RangeFilter (in combination with FilteredQuery) to do this. I was under the impression that the filtering is done on documents, thus I'm just storing the date values, but not indexing them. As every new document would have a new d
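As the replies in this thread suggest, the date values need to be indexed for a range filter to match them, because the filter works against indexed terms compared as strings. A common approach is to encode each date as a fixed-width string whose lexicographic order equals its chronological order. The sketch below shows that encoding with the Java standard library only; it does not call Lucene, and the class, method names, and `yyyyMMddHHmmss` format are assumptions chosen for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class SortableDateTermSketch {
    // Fixed-width, most-significant-field-first, so string order matches time order.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss", Locale.ROOT);

    // Encodes a timestamp as the term that would be indexed for the date field.
    static String toTerm(LocalDateTime time) {
        return time.format(FORMAT);
    }

    // Inclusive string-range check, the same comparison a term range would apply.
    static boolean inRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String lower = toTerm(LocalDateTime.of(2008, 1, 1, 0, 0, 0));
        String upper = toTerm(LocalDateTime.of(2008, 1, 31, 23, 59, 59));
        String doc = toTerm(LocalDateTime.of(2008, 1, 19, 10, 30, 0));
        System.out.println(doc + " in range: " + inRange(doc, lower, upper));
    }
}
```

Coarser resolution (for example day or hour instead of second) reduces the number of unique terms, which is relevant to the index-size concern raised in the question.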

Re: Optimize for large index size

2008-01-18 Thread vivek sar
). Thanks, -vivek On Jan 18, 2008 2:37 AM, Michael McCandless <[EMAIL PROTECTED]> wrote: > > vivek sar wrote: > > > Hi, > > > > We are using Lucene 2.2. We have an index of size 70G (within 3-4 > > days) and growing. We run optimize pretty frequently (o

Optimize for large index size

2008-01-18 Thread vivek sar
Hi, We are using Lucene 2.2. We have an index of size 70G (within 3-4 days) and growing. We run optimize pretty frequently (once every hour, due to the large number of index updates, which can be up to 100K new documents every minute). I have seen every now and then the optimize takes 3-4 hours t

Re: restoring a corrupt index?

2007-11-13 Thread vivek sar
We have seen similar exceptions (with Lucene 2.2) when we were making the following mistakes, 1) Not closing the old searchers and re-creating a new one for every new search (fixed it by closing the searcher every time; if you want you could use only one searcher instance as well) 2) Not having any jvm sh

Re: Help with Lucene Indexer crash recovery

2007-10-05 Thread vivek sar
r from it. Thanks, -vivek On 10/5/07, Michael McCandless <[EMAIL PROTECTED]> wrote: > "vivek sar" <[EMAIL PROTECTED]> wrote: > > > We are using Lucene 2.3. > > Do you mean Lucene 2.2? Your stack trace seems to line up with 2.2, > and 2.3 isn't quite

Help with Lucene Indexer crash recovery

2007-10-04 Thread vivek sar
Hi, We are using Lucene 2.3. The problem we are facing is quite a few times if our application is stopped (killed or crash) while Indexer is doing its job, the next time when we bring up the application the Indexer fails to run with the following exception, 2007-10-04 12:29:53,089 ERROR [PS thre