Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Chris Collins
Yeah, I think the bug is related to an array copy that expects 1k blocks (if I recall, it was RAMDirectory or something like that). C --- Kevin Burton <[EMAIL PROTECTED]> wrote: > Chris Collins wrote: > > >Well I am currently looking at merging too. In my application merging will > >occur again

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Chris Collins
Don't forget that when a document is indexed it starts life in its own segment. If you have a min merge of 4k you could have an awful lot of 1-doc segments on the segment stack. That's why I run out of memory. If that is the case that each of these at some point has a buffer of 8k or say 64k you

Re: Lucene in clustered environment (Tomcat)

2005-06-10 Thread Nader Henein
Considering you have all your servers on one machine a simple memory failure and the whole thing goes south. But you're right, we have an independent Lucene index sitting next to each one of our webservers on each machine, but they are all updated from a central location powered and organized by an

Re: Question on lucene sandbox highlighter

2005-06-10 Thread Erik Hatcher
On Jun 10, 2005, at 11:28 AM, Terence Lai wrote: Hi all, I have a couple of questions regarding the Highlighter. Question 1: === I downloaded the highlighter source files. When I compiled the code, I got the following error: org/apache/lucene/search/highlight/TokenSou

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Kevin Burton
Peter A. Friend wrote: I changed that value to 8k, and based on the truss output from an index run, it is working. Haven't gotten much beyond that to see if it causes problems elsewhere. The value also needs to be altered on the read end of things. Ideally, this will be made settable via

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Kevin Burton
Chris Collins wrote: Well I am currently looking at merging too. In my application merging will occur against a filer (read as higher latency device). I am currently working on how to stage indices on local disk before moving to a filer. Assume I must move to a filer eventually for whatever c

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Peter A. Friend
On Jun 10, 2005, at 9:33 AM, Chris Collins wrote: How many documents did you try to index? Only about 4000 at the moment. I am using a relatively large minMergeDoc that causes me to run out of memory when I make such a change. (I am using 1/2 gb of heap btw). I was running out of mem

Re: view index file

2005-06-10 Thread Aalap Parikh
Hi, Use Luke. It's an excellent tool and everybody in the Lucene community uses that. http://www.getopt.org/luke/ Aalap. --- avrootshell <[EMAIL PROTECTED]> wrote: > Hi, > >I'm curious to know,if there is any way to view > the .cfs file(the > index file created). > > Someone plz shred s

Re: view index file

2005-06-10 Thread Nader Henein
The one utility I've come across for browsing Lucene indices is Luke (I use it successfully to debug index issues). Check it out: http://www.getopt.org/luke/ Hope this answers your question. Nader Henein avrootshell wrote: Hi, I'm curious to know,if there is any way to view

view index file

2005-06-10 Thread avrootshell
Hi, I'm curious to know if there is any way to view the .cfs file (the index file created). Could someone please shed some light on this? TIA.

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Chris Collins
How many documents did you try to index? I am using a relatively large minMergeDoc that causes me to run out of memory when I make such a change. (I am using 1/2 gb of heap btw). I believe changing it in the outputstream object means that a lot of in memory only objects use that size too...I assu

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Peter A. Friend
On Jun 9, 2005, at 11:52 PM, Chris Collins wrote: In that case I have a different performance issue, that is that FSInputStream and FSOutputStream inherit the buffer size of 1k from OS and IS. This would be useful to increase, to reduce the number of RPCs to the filer when doing merges ...

Usenet Bridge or LSA Support?

2005-06-10 Thread Mike Winter
Pardon me if this has been asked before, but I was wondering if there exists a Lucene -> Usenet bridge or support for latent semantic scoring? Thanks for any information.

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Chris Collins
Hi John, your comments are correct. But based on the fact that we know our box has almost 80MB per second of sustainable bandwidth and very low latency to disk, and observing that the IO we are doing in Lucene is small in comparison a 1 second I am reasonably confident that this time spent is not f

Question on lucene sandbox highlighter

2005-06-10 Thread Terence Lai
Hi all, I have a couple of questions regarding the Highlighter. Question 1: === I downloaded the highlighter source files. When I compiled the code, I got the following error: org/apache/lucene/search/highlight/TokenSources.java [19:1] cannot resolve symbol symbol : clas

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread John Haxby
Chris Collins wrote: Ok, that part isn't surprising. However, only about 1% of 30% of the merge was spent in the OS.flush call (not very IO bound at all with this controller). On Linux, at least, measuring the time taken in OS.flush is not a good way to determine if you're I/O bound -- all tha

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Chris Collins
Yes, that would line up with being pretty much CPU bound. So if you were to have 2 Xeons with HT then you kinda have almost 4 resources (threads) of execution you could take advantage of. So from my current tests where I have multiple threads producing work for an index and one index writer (o

SimilarityDelegator examples ?

2005-06-10 Thread Robichaud, Jean-Philippe
Hi Everyone, I've been using Lucene a lot and I would like to know how the SimilarityDelegator should be used. I would like to override only the lengthNorm member of the DefaultSimilarity and I understand that this is exactly the purpose of SimilarityDelegator ? Am I right? Does this class

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Bill Au
That's not true in my case. The CPU never went over 50%. I/O wait is often greater than the CPU and can be as high as 90%. Bill On 6/10/05, Kevin Burton <[EMAIL PROTECTED]> wrote: > Bill Au wrote: > > >Optimize is disk I/O bound. So I am not sure what multiple CPUs will buy > >you. > > > > > > N

Re: Optimizing indexes with mulitiple processors?

2005-06-10 Thread Chris Collins
Kevin, I would be curious to know more about your merging issues. As I mentioned, I am concerned about merge time, and in my case it's against a filer that of course has high latency. The other issue is that I effectively index things with a primary key. I need to ensure an efficient way of prevent