RE: Searching for similar documents

2005-07-20 Thread Derek Westfall
Your solution below is undoubtedly my problem. I didn't even consider the need to create all those directory levels. I'm sure that will solve it! -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Wednesday, July 20, 2005 1:31 PM To: java-user@lucene.apache.org Subject

Re: StackOverflowError when index pdf files

2005-07-20 Thread Ben Litchfield
Yes, this sounds like an issue with PDFBox. Can you determine whether it is a single PDF document and post an issue on the PDFBox SourceForge site? Thanks, Ben Litchfield On Wed, 20 Jul 2005, Otis Gospodnetic wrote: > It sounds like the problem may stem from your PDF parser > > Otis > > --- [EM

Re: Searching for similar documents

2005-07-20 Thread Erik Hatcher
You'll want to re-think that re-JARing approach for the long term, as you'll want to upgrade Lucene at some point I suspect. But congrats on hacking it! Erik On Jul 20, 2005, at 5:44 PM, Derek Westfall wrote: Okay, I figured out how to use JAR, extracted all the files from lucene-1.

RE: Searching for similar documents

2005-07-20 Thread Derek Westfall
Okay, I figured out how to use JAR, extracted all the files from lucene-1.4.3.jar, added the MoreLikeThis classes in the appropriate folder, recreated and replaced the JAR. Since Lucene is my first exposure to Java I am pretty proud of myself at this point. The only thing that still wasn't working

Re: StackOverflowError when index pdf files

2005-07-20 Thread Otis Gospodnetic
It sounds like the problem may stem from your PDF parser. Otis --- [EMAIL PROTECTED] wrote: > > Hi, > I've the error "java.lang.StackOverflowError" when I try to index > text files > that I got from transforming pdf files through pdfbox API. > When I index normal text repository, I haven't th

StackOverflowError when index pdf files

2005-07-20 Thread Gayo . Diallo
Hi, I get the error "java.lang.StackOverflowError" when I try to index text files that I produced by converting PDF files with the PDFBox API. When I index a normal text repository, I don't get this error. Can someone help me? Thanks, Gayo - sent via

Re: QueryParser handling of backslash characters

2005-07-20 Thread Jeff Davis
That fix works perfectly, as far as I can tell. As for the unit test, it should actually be: assertEquals("192.168.0.15\\public", discardEscapeChar ("192.168.0.15public")); Jeff On 7/20/05, Eyal <[EMAIL PROTECTED]> wrote: > I think this should work: > > (Written in C# originall

Re: New line

2005-07-20 Thread Otis Gospodnetic
How you tokenize your input is up to you. It sounds like you want a custom Analyzer that has a tokenizer that knows about newline characters and does whatever you need it to do when a newline character is encountered (e.g. stop reading or whatever). The search part of Lucene has no notion of newl

Re: Too many open files error using tomcat and lucene

2005-07-20 Thread jian chen
Hi, Dan, I think the problem you mentioned has been discussed many times on this mailing list. The bottom line is that you'd better use the compound file format to store indexes. I am not sure Lucene 1.3 has that available, but, if possible, can you upgrade to Lucene 1.4.3? Cheers,

Re: Too many open files error using tomcat and lucene

2005-07-20 Thread Daniel Naber
On Wednesday 20 July 2005 22:49, Dan Pelton wrote: > We are getting the following error in our tomcat error log. > /dsk1/db/lucene/journals/_clr.f7 (Too many open files) > java.io.FileNotFoundException: /dsk1/db/lucene/journals/_clr.f7 (Too > many open files) See http://wiki.apache.org/jakarta-l

Re: Using QueryParser with a single field

2005-07-20 Thread Erik Hatcher
On Jul 19, 2005, at 8:10 AM, Eyal wrote: Hi, In my client application I allow the user to build a query by selecting a field from a combobox and entering a value to search by. I want the user to enter free text queries for each field, but I don't want to parse it myself so I thought I'd u

Re: New line

2005-07-20 Thread [EMAIL PROTECTED]
Chris, If I understand your question correctly, you are asking why Lucene's search output does not return the two lines as two distinct lines? If you are returning the Lucene search output to the web and want the newline \n to be displayed as such, you need to replace the character
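The snippet above is cut off before it names the replacement, so the following is only a minimal sketch of one common way to do it, assuming the stored text is shown inside an HTML page. The class name `NewlineToHtml` and the method `toHtml` are hypothetical, not part of Lucene, and the choice of `<br>` as the replacement is an assumption.

```java
// Hypothetical helper, not part of Lucene: prepares stored text for display in HTML.
final class NewlineToHtml {

    // Escapes the basic HTML metacharacters, then turns each line break into <br>.
    static String toHtml(String text) {
        String escaped = text.replace("&", "&amp;")
                             .replace("<", "&lt;")
                             .replace(">", "&gt;");
        // Normalize Windows line endings first so each break yields exactly one <br>.
        return escaped.replace("\r\n", "\n").replace("\n", "<br>\n");
    }

    public static void main(String[] args) {
        System.out.println(toHtml("first line\nsecond line"));
    }
}
```

Escaping before inserting the `<br>` tags keeps the tags themselves from being escaped; this only affects display and does not change how the text is tokenized at index time.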

Re: BOOLEAN OPERATOR HOWTO

2005-07-20 Thread Erik Hatcher
On Jul 19, 2005, at 8:31 AM, Karthik N S wrote: Given a search word = 'erik OR hatcher AND otis OR gospodnetic', is it possible to return a count of occurrences for each of the words within the searched documents? This would give me each word's term frequency. How can I achieve this? Wow - I

RE: QueryParser handling of backslash characters

2005-07-20 Thread Eyal
I think this should work: (Written in C# originally - so someone please check if it compiles - I don't have a java compiler here) private String discardEscapeChar(String input) { char[] caSource = input.toCharArray(); char[] caDest = new char[caSource.length]; int j = 0
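The code in the message above is truncated in this listing, so here is a self-contained sketch of the idea rather than Eyal's actual fix. It assumes the intended behaviour is that an unescaped backslash is dropped and the character following it is kept literally, so two consecutive backslashes collapse into one. The class name `EscapeDemo` is hypothetical, and this version tracks an `escaped` flag instead of the `caSource`/`caDest` arrays used in the original.

```java
// Hypothetical stand-alone sketch; not the code from the thread or from Lucene itself.
final class EscapeDemo {

    // Removes escape backslashes while keeping the characters they escape.
    static String discardEscapeChar(String input) {
        StringBuilder out = new StringBuilder(input.length());
        boolean escaped = false; // true when the previous char was an unescaped backslash
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' && !escaped) {
                escaped = true;  // drop the escape character itself
                continue;
            }
            out.append(c);       // keep the escaped (or ordinary) character
            escaped = false;
        }
        return out.toString();   // a trailing lone backslash is dropped
    }

    public static void main(String[] args) {
        // The Java literal below contains two backslashes; the output contains one.
        System.out.println(discardEscapeChar("192.168.0.15\\\\public"));
    }
}
```

Using a flag rather than looking back at the previous character means runs of three or more backslashes are paired off from the left, which a simple previous-character check would not do.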

Too many open files error using tomcat and lucene

2005-07-20 Thread Dan Pelton
We are getting the following error in our tomcat error log. /dsk1/db/lucene/journals/_clr.f7 (Too many open files) java.io.FileNotFoundException: /dsk1/db/lucene/journals/_clr.f7 (Too many open files) at java.io.RandomAccessFile.open(Native Method) We are using the following lucene-1.3-f

Re: Searching for similar documents

2005-07-20 Thread Erik Hatcher
On Jul 20, 2005, at 1:47 PM, Derek Westfall wrote: I hope you will forgive the newbie question but do I have to add the MoreLikeThis.class file to the Lucene-1.4.3.JAR for it to work? I put the .class file in my \wwwroot\web-inf\classes folder If you put it in the right package directory under

Re: QueryParser handling of backslash characters

2005-07-20 Thread Erik Hatcher
On Jul 19, 2005, at 11:19 AM, Jeff Davis wrote: Hi, I'm seeing some strange behavior in the way the QueryParser handles consecutive backslash characters. I know that backslash is the escape character in Lucene, and so I would expect "" to match fields that have two consecutive backslashes

Re: New line

2005-07-20 Thread christopher may
When my text file is being searched, it seems every line is blending together. So I need the index searcher to see a newline character or field separator in the text file. What can be used in the text file to separate my lines? From: Otis Gospodnetic <[EMAIL PROTECTED]> Reply-To: java-user@lucene.ap

RE: Searching for similar documents

2005-07-20 Thread Derek Westfall
I hope you will forgive the newbie question but do I have to add the MoreLikeThis.class file to the Lucene-1.4.3.JAR for it to work? I put the .class file in my \wwwroot\web-inf\classes folder and I am getting an error I don't understand when trying to instantiate the object from Cold Fusion. I al

Re: Re: wild card with keyword fileld

2005-07-20 Thread Ian Lea
Rahul, it looks like you've got the args mixed up in your qp calls. I think it should be: QueryParser qp = new QueryParser("keywords", analyzer); qp.setLowercaseWildcardTerms(false); Query query = qp.parse(line); -- Ian. On 20 Jul 2005 14:06:32 -, Rahul D Thakare <[EMAIL PROTECTED]> wrote

Re: Re: wild card with keyword fileld

2005-07-20 Thread Rahul D Thakare
Erik/Ian, I tried using query.parse(String), but it didn't return any results. Also, query.toString() returns mainboard:keywords if I give the keyword as mainboard. Please see the changed code again. PerFieldAnalyzerWrapper analyzer = new PerFieldAnalyzerWrapper( new StandardAnalyzer() ); analyze

Re: wild card with keyword fileld

2005-07-20 Thread Erik Hatcher
On Jul 20, 2005, at 1:22 AM, Rahul D Thakare wrote: Hi Ian, Yes, I did implement Erik's suggestion last week, but it didn't help. Also, just to note it, I did mention the parse(String) method in the e-mail referenced below! :) Erik I am using a demo program from Lucene.jar to

Re: wild card with keyword fileld

2005-07-20 Thread Erik Hatcher
On Jul 20, 2005, at 1:22 AM, Rahul D Thakare wrote: /* QueryParser qp = new QueryParser(line,analyzer); qp.setLowercaseWildcardTerms(false); Query query = qp.parse(line, "keywords", analyzer); */ Query query = QueryParser.parse(line, "keywords", analyzer); You've been bitten, as many othe

Re: Re: wild card with keyword fileld

2005-07-20 Thread Ian Lea
What does query.toString() show in each case? I still think you should try lowercasing everything, if only to see if it helps. If it does you could either keep it or figure out what you need to do. -- Ian. On 20 Jul 2005 05:22:29 -, Rahul D Thakare <[EMAIL PROTECTED]> wrote: > > > >