Multiple query criteria search

2006-11-30 Thread spinergywmy
Hi guys, I'm trying to search indices by entering multiple queries. For instance, I have 4 textboxes with search criteria such as AND, OR, NOT and Exact phrase. I'm using the QueryParser operator and then adding the query into a BooleanQuery, but I don't think my search result was correc

indexing performance issue

2006-11-30 Thread spinergywmy
Hi guys, I have posted this question before, and this time I found that it could be a pdfbox problem; the pdfbox I downloaded doesn't use log4j.jar. Indexing the approx. 2.13 MB PDF file took me 17s, and the total time to upload a file is 18s. So, is there any way or other software than pdfbox

Re: 2.1-dev memory leak?

2006-11-30 Thread Michael McCandless
Otis Gospodnetic wrote: Hi, Is anyone running Lucene trunk/HEAD version in a serious production system? Anyone noticed any memory leaks? I'm asking because I recently bravely went from 1.9.1 to 2.1-dev (trunk from about a week ago) and all of a sudden my application that was previously consu

Re: indexing performance issue

2006-11-30 Thread Grant Ingersoll
http://lucene.apache.org/java/docs/contributions.html lists several PDF alternatives, but I can't speak to their performance. I am sure if you googled PDF converters you could find a fair number of hits. Perhaps w/ some more details about your app we might be able to find a workaround. We

Re: indexing performance issue

2006-11-30 Thread spinergywmy
Hi Grant, Thanks for the tips. I will take your advice and look into the link that you sent me. In my scenario, every time a user uploads a single file, I need to index that particular file. Previously it was because the previous version of pdfbox integrated with the log4j.jar file and

Re: indexing performance issue

2006-11-30 Thread Grant Ingersoll
On Nov 30, 2006, at 10:54 AM, spinergywmy wrote: Hi Grant, Thanks for the tips. I will take your advice and look into the link that you sent me. In my scenario, every time a user uploads a single file, I need to index that particular file. Previously was because the

Re: Multiple query criteria search

2006-11-30 Thread Chris Hostetter
: I'm actually trying on search indices by entering multiple queries, for
: instance, I have 4 textboxes with search criteria such as AND, OR, NOT and
: Exact phrase. I'm using queryParser operator and then add the query into
: booleanquery, therefore I don't think my search result was correct.
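One way to picture the four-textbox setup described above is to turn each box into clauses of the classic Lucene query syntax (`+` for required, `-` for prohibited, quotes for a phrase) and hand the resulting string to the query parser. The following is only a rough sketch in plain Java with no Lucene dependency: the class `CriteriaQueryBuilder` and its methods are hypothetical names, it assumes the parser's default operator is OR, and it does not escape special query characters in user input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: builds a query string
// from four search boxes (all words, any word, none, exact phrase).
class CriteriaQueryBuilder {

    // Splits a textbox value on whitespace, dropping empty pieces.
    static List<String> words(String input) {
        List<String> out = new ArrayList<>();
        if (input == null) {
            return out;
        }
        for (String w : input.trim().split("\\s+")) {
            if (!w.isEmpty()) {
                out.add(w);
            }
        }
        return out;
    }

    static String build(String all, String any, String none, String phrase) {
        List<String> clauses = new ArrayList<>();
        // AND box: every word is required.
        for (String w : words(all)) {
            clauses.add("+" + w);
        }
        // OR box: at least one word of the group must match,
        // so the group as a whole is required.
        List<String> anyWords = words(any);
        if (!anyWords.isEmpty()) {
            clauses.add("+(" + String.join(" ", anyWords) + ")");
        }
        // NOT box: every word is prohibited.
        for (String w : words(none)) {
            clauses.add("-" + w);
        }
        // Exact phrase box: quoted and required.
        if (phrase != null && !phrase.trim().isEmpty()) {
            clauses.add("+\"" + phrase.trim() + "\"");
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(build("lucene index", "pdf word", "draft", "memory leak"));
    }
}
```

Grouping the OR words as one required clause matters: if they were added as plain optional clauses next to required ones, they would only influence scoring and not filter anything. A query made up solely of NOT clauses matches nothing, so the NOT box probably needs at least one of the other boxes filled in.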

Re: 2.1-dev memory leak?

2006-11-30 Thread Otis Gospodnetic
Hi, Wow, that was fast - java-user support is just as fast as I heard! ;) I'll try your patch shortly. Like I said, the bug may be in my application. Here is a clue. Memory usage increases with the number of open files (file descriptors) on the system, and lsof gives: COMMAND PID USER

Re: 2.1-dev memory leak?

2006-11-30 Thread Chris Hostetter
: IndexSearchers open. The other ones I "let go" without an explicit
: close() call. The assumption is that the old IndexSearchers "expire",
: that they get garbage collected, as I'm no longer holding references to
: them.

yeah ... that just seems really bad in general, i would try to explicitl
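The difference between letting old searchers "expire" and closing them explicitly can be sketched without Lucene at all. In the example below, `FakeSearcher` is a hypothetical stand-in for anything that holds file descriptors, and `SearcherSwapSketch` is an assumed holder class; neither is a Lucene API. The point is only that the holder closes the old instance at the moment it swaps in a new one, instead of waiting for the garbage collector.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder that swaps searchers and closes the old one explicitly.
class SearcherSwapSketch {

    // Stand-in for a searcher that keeps file descriptors open until close().
    static class FakeSearcher implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();
        private boolean closed;

        FakeSearcher() {
            OPEN.incrementAndGet();
        }

        @Override
        public void close() {
            // Idempotent: a second close() does not decrement again.
            if (!closed) {
                closed = true;
                OPEN.decrementAndGet();
            }
        }
    }

    private FakeSearcher current = new FakeSearcher();

    // Replace the current searcher and release the old one right away.
    synchronized void reopen() {
        FakeSearcher old = current;
        current = new FakeSearcher();
        old.close();
    }

    synchronized void shutdown() {
        current.close();
    }

    static int openCount() {
        return FakeSearcher.OPEN.get();
    }

    public static void main(String[] args) {
        SearcherSwapSketch holder = new SearcherSwapSketch();
        for (int i = 0; i < 3; i++) {
            holder.reopen();
        }
        System.out.println("open searchers: " + openCount());
        holder.shutdown();
    }
}
```

This sketch ignores searches that are still in flight on the old instance when `reopen()` runs; a real application would likely need some form of reference counting before calling `close()`.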

Re: 2.1-dev memory leak?

2006-11-30 Thread Yonik Seeley
On 11/30/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: IndexSearchers open. The other ones I "let go" without an explicit
: close() call. The assumption is that the old IndexSearchers "expire",
: that they get garbage collected, as I'm no longer holding references to
: them.
yeah ... that j

Re: 2.1-dev memory leak?

2006-11-30 Thread Michael McCandless
Yonik Seeley wrote:
On 11/30/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: IndexSearchers open. The other ones I "let go" without an explicit
: close() call. The assumption is that the old IndexSearchers "expire",
: that they get garbage collected, as I'm no longer holding references to
: t

Re: 2.1-dev memory leak?

2006-11-30 Thread Michael McCandless
Otis Gospodnetic wrote: Wow, that was fast - java-user support is just as fast as I heard! ;) Well let's withhold judgment until we see if that tool really works correctly :) I'll try your patch shortly. Like I said, the bug may be in my application. Here is a clue. Memory usage increases

Re: indexing performance issue

2006-11-30 Thread Antony Bowesman
Grant Ingersoll wrote: On Nov 30, 2006, at 10:54 AM, spinergywmy wrote: In my scenario, every time a user uploads a single file, I need to index that particular file. Previously it was because the previous version of pdfbox integrated with the log4j.jar file and I believe it is the log4j.j

Re: indexing performance issue

2006-11-30 Thread Antony Bowesman
spinergywmy wrote: I have posted this question before, and this time I found that it could be a pdfbox problem; the pdfbox I downloaded doesn't use log4j.jar. Indexing the approx. 2.13 MB PDF file took me 17s, and the total time to upload a file is 18s. Re: PDFBox. I have a 2.5Mb test file that

any ideas on this type of analyzer?

2006-11-30 Thread Van Nguyen
I've been trying to brainstorm on this but could not figure out a way to go about this.

Let's say I'm searching for "batman". I want results that include:
batman
bat man
bat-man
etc.

or if I search screwdriver, I would want results to include:
screwdriver
screw drivers
etc.

Re: any ideas on this type of analyzer?

2006-11-30 Thread Dennis Watson
NGramAnalyzer should do this. I think there is one in the contribs area or in LUA.

Dennis

On Thursday 30 November 2006 17:25, Van Nguyen wrote:
> I've been trying to brainstorm on this but could not figure out a way to
> go about this.
>
> Let's say I'm searching for "batman". I want res
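To see why an n-gram approach could help with "batman" / "bat man" / "bat-man", here is a small standalone illustration in plain Java. It is not Lucene's analyzer: `NGramSketch`, `normalize` and `ngrams` are hypothetical names, and stripping spaces and hyphens before cutting the grams is an extra assumption made here so that the three spellings produce the same grams.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of character n-grams; not a Lucene class.
class NGramSketch {

    // Lowercases and drops everything except ASCII letters and digits,
    // so "bat man" and "bat-man" both become "batman".
    static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
    }

    // Returns all overlapping character n-grams of the normalized text.
    static List<String> ngrams(String text, int n) {
        String s = normalize(text);
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= s.length(); i++) {
            grams.add(s.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("bat-man", 3));
        System.out.println(ngrams("screw drivers", 3));
    }
}
```

With grams like these indexed instead of whole words, a search for "screwdriver" shares every one of its trigrams with "screw drivers", so the two can match even though the tokens differ. The cost is a larger index and more false positives, which is a trade-off to weigh for the data at hand.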

Re: 2.1-dev memory leak?

2006-11-30 Thread Yonik Seeley
On 11/30/06, Michael McCandless <[EMAIL PROTECTED]> wrote:
I tested this in 1.9.1, 1.9.2, 2.0.0, and trunk, and all of these versions would run out of descriptors. So I'm at a loss so far on where the regression is here ...

Right... no regression, just a Java limitation.

-Yonik http://incubat

Re: 2.1-dev memory leak?

2006-11-30 Thread Otis Gospodnetic
Yeah, in this case, I'm running out of memory, and open file descriptors are, I think, just an indicator that IndexSearchers are not getting closed properly. I've already increased the open file descriptors limit, but I'm limited to 2GB of RAM on a 32-bit box. I'll try explicitly closing searc

Re: 2.1-dev memory leak?

2006-11-30 Thread Otis Gospodnetic
Hi Mike, Thanks for looking into this. I think your stress test may match my production environment. I think System.gc() never guarantees anything will happen, it's just a hint. I've got the following in one of my classes now. Maybe you can stick it in your stress test and look at that last n

Full text searching on documents saved in database as BLOB

2006-11-30 Thread Inderjeet Kalra
Hi, I have a query related to full text searching on documents saved in a database as BLOBs. In our application, we are planning to save our documents in the database as BLOBs, and we have a requirement to search a document on its metadata and the content of the document i.e. search within doc