search that span over consecutive documents

2005-07-07 Thread Daniel Moldovan
Hello everyone, My application must index a lot of books that are stored in xml files. Each xml file represents a page of the book and this way each page becomes a lucene Document. Each page is organized in different sections and finally each section contains lines. What I need to do is g

Re: Lucene faster on JDK 1.5?

2005-07-07 Thread Otis Gospodnetic
Nothing significant, but I've been using 1.5 on Simpy.com (lots of Lucene behind it) for over a year now, and I'm happy with it. Otis --- [EMAIL PROTECTED] wrote: > Are people seeing a significant speed performance with Lucene when > they upgrade > to JDK 1.5? > > -

Re: non-lexical comparisons

2005-07-07 Thread jian chen
Yeah, RDBMS makes sense. In this case, would it be better to simply store those in a relational database and just use Lucene to do indexing for the text? Cheers, Jian On 7/7/05, Leos Literak <[EMAIL PROTECTED]> wrote: > I know the answer, but just for curiosity: > > have you guys ever thought

RE: Search Timeout - abort a search

2005-07-07 Thread Jason Polites
You could do it asynchronously. That is, run the actual Lucene search in a separate thread; the calling thread then simply waits up to a maximum time for the search thread to complete, then queries the status of the search thread to get the results obtained
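The asynchronous approach described above can be sketched in plain Java. This is only an illustration under stated assumptions: the `Search` interface and the `TimedSearch` class are hypothetical stand-ins, not Lucene classes, and hits are modelled as integers pushed into a shared list. A real search would have to feed its hits into such a list itself and respond to interruption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class TimedSearch {

    /** Hypothetical stand-in for a search that reports hits one at a time. */
    interface Search {
        void run(List<Integer> sink) throws InterruptedException;
    }

    /**
     * Runs the search on a worker thread, waits at most maxMillis,
     * and returns a snapshot of whatever hits were collected by then.
     */
    static List<Integer> searchWithTimeout(Search search, long maxMillis)
            throws InterruptedException {
        List<Integer> collected = Collections.synchronizedList(new ArrayList<Integer>());
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            // The lambda returns a value, so it is submitted as a Callable
            // and may throw the checked InterruptedException.
            worker.submit(() -> {
                search.run(collected);
                return null;
            });
            worker.shutdown(); // accept no further tasks
            worker.awaitTermination(maxMillis, TimeUnit.MILLISECONDS);
        } finally {
            worker.shutdownNow(); // interrupt the search if it is still running
        }
        synchronized (collected) {
            return new ArrayList<Integer>(collected); // partial or complete results
        }
    }
}
```

Note that interruption is cooperative: a search that never checks for it keeps running in the background even after the caller has returned with partial results.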

RE: FileNotFoundException segments

2005-07-07 Thread Jason Polites
if ((indexFile = new File(indexDir)).exists() && indexFile.isDirectory()) { exists = false; Isn't this backwards? Couldn't you just do: indexFile = new File(indexDir); exists = (indexFile.exists() && indexFile.isDirectory()); -Original Message- From: bib_lucene bib [mailto:

Loading large index into RAM

2005-07-07 Thread yahootintin . 11533894
Is it possible to use a RAMDirectory to load a 5 GB index into RAM on Linux? I have access to a server with 6 GB of RAM and will try it next week but I've heard that Java on Linux may only support up to 2 GB of RAM per process. Anyone already tried this? Thanks.

Lucene faster on JDK 1.5?

2005-07-07 Thread yahootintin . 11533894
Are people seeing a significant speed performance with Lucene when they upgrade to JDK 1.5?

Re: FileNotFoundException segments

2005-07-07 Thread bib_lucene bib
This is a new directory, created just before this step. I am uploading files to this directory. The file is getting uploaded fine. Any ideas? Muetze303 <[EMAIL PROTECTED]> wrote: probably the dir exists, but the index inside the dir is broken or not complete and you are trying to use it instead o

Re: Help on the ParallelMultiSearcher.rewrite(Query) method

2005-07-07 Thread Erik Hatcher
Terence - you need to do the rewrite using the appropriate IndexReader for a single index. You can use query.rewrite(IndexReader). Erik On Jul 7, 2005, at 3:04 PM, Terence Lai wrote: Hi all, I am currently using Lucene 1.4.2. Since my search documents are huge, I divide the search i

Re: non-lexical comparisons

2005-07-07 Thread Jeff Davis
Have you considered left-padding your numbers with zeros to make each number a string of the same length? e.g., the number 5 would be indexed/queried as "00005", which can be correctly compared to 10 ("00010"), 2345 ("02345"), etc. in a lexical comparison... Jeff On 7/7/05, Leos Literak <[EMAIL
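A minimal sketch of the zero-padding idea, assuming non-negative integers and a fixed width chosen up front (the `PaddedNumbers` class and its `pad` helper are hypothetical names for illustration). The same padding has to be applied both when indexing a value and when building a query term.

```java
import java.util.Locale;

final class PaddedNumbers {

    /**
     * Left-pads a non-negative number with zeros to a fixed width so that
     * lexical (string) order matches numeric order.
     */
    static String pad(int value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        String padded = String.format(Locale.ROOT, "%0" + width + "d", value);
        if (padded.length() > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " digits");
        }
        return padded;
    }
}
```

Without padding, "5" sorts after "10" as a string; with a width of 5, `pad(5, 5)` sorts before `pad(10, 5)`, so range comparisons on the padded terms behave numerically.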

Help on the ParallelMultiSearcher.rewrite(Query) method

2005-07-07 Thread Terence Lai
Hi all, I am currently using Lucene 1.4.2. Since my search documents are huge, I divide the search index into different index directories and use the ParallelMultiSearcher to perform the search. Currently, I am working on the highlighting feature using the Lucene Sandbox Highlighter. One of the

non-lexical comparisons

2005-07-07 Thread Leos Literak
I know the answer, but just out of curiosity: have you guys ever thought about non-lexical comparison support? For example, I started to index the number of replies in a discussion, so I can find questions without an answer, with one reply, two comments, etc. But I cannot simply express that I want to find q

Re: Search Timeout - abort a search

2005-07-07 Thread Paul Elschot
On Thursday 07 July 2005 16:06, Dan Armbrust wrote: > Has anyone ever written code to make it possible to return from a > search, after a given amount of time, returning the results that have > been collected so far (but not necessarily all of them)? > > The only thing that I can see to do throu

Search Timeout - abort a search

2005-07-07 Thread Dan Armbrust
Has anyone ever written code to make it possible to return from a search, after a given amount of time, returning the results that have been collected so far (but not necessarily all of them)? The only thing that I can see to do through the public Lucene APIs would be to do the search using a

Re: FileNotFoundException segments

2005-07-07 Thread Muetze303
probably the dir exists, but the index inside the dir is broken or not complete and you are trying to use it instead of creating a new one?! bib_lucene bib wrote: Hi All can someone please help me on the error in my web application... I am using tomcat , the path for index dir is obtained fr

Re: Boosting SpanQueries

2005-07-07 Thread Paul Libbrecht
Enclosing it in a BooleanQuery where it is the only clause, and which itself carries a boost, would seem to work for me... paul On 7 Jul 05, at 11:04, Vincent Le Maout wrote: a way to implement something as boosting allowing to enhance the score of documents containing a particular word of a span

Re: IOException IndexReader out of date

2005-07-07 Thread Dirk Hennig
Volodymyr Bychkoviak wrote: the open method is a static method which returns a new IndexReader. Maybe this is the problem. I did not see the wood for the trees! That's it! The correct program looks like this: - indexReader = IndexReade
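The pitfall identified in this exchange, calling a static factory method through an instance variable and discarding its result, can be reproduced without Lucene. The sketch below uses a hypothetical stand-in `Reader` class, not Lucene's IndexReader, purely to show why the existing reader stays stale.

```java
final class StaticFactoryPitfall {

    /** Hypothetical stand-in for a reader; NOT Lucene's IndexReader. */
    static final class Reader {
        final String path;

        private Reader(String path) {
            this.path = path;
        }

        /** Static factory: returns a NEW reader and never touches an existing one. */
        static Reader open(String path) {
            return new Reader(path);
        }
    }

    /** Mirrors the buggy pattern: the freshly opened reader is thrown away. */
    static Reader wrong(Reader existing, String path) {
        existing.open(path); // compiles as a static call; the result is discarded
        return existing;     // still the old, possibly out-of-date reader
    }

    /** Mirrors the fix: keep the reader that the factory returns. */
    static Reader right(String path) {
        return Reader.open(path);
    }
}
```

Most compilers and IDEs can warn about a static method accessed through an instance reference, which would have flagged the original call.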

Re: IOException IndexReader out of date

2005-07-07 Thread Volodymyr Bychkoviak
Dirk Hennig wrote: Volodymyr Bychkoviak wrote: the problem is that the index was modified between the indexReader.open(index); and indexReader.delete(hitId); method calls. That would explain the exception. But how? It can be modified by another IndexReader or IndexWriter. The program is exactl

Re: IOException IndexReader out of date

2005-07-07 Thread Dirk Hennig
Volodymyr Bychkoviak wrote: the problem is that the index was modified between the indexReader.open(index); and indexReader.delete(hitId); method calls. That would explain the exception. But how? The program is exactly as I wrote it! --

Re: IOException IndexReader out of date

2005-07-07 Thread Volodymyr Bychkoviak
the problem is that the index was modified between the indexReader.open(index); and indexReader.delete(hitId); method calls. regards, Volodymyr Bychkoviak Dirk Hennig wrote: Hello, When I try to use this to remove several documents from the index -

IOException IndexReader out of date

2005-07-07 Thread Dirk Hennig
Hello, When I try to use this to remove several documents from the index - indexReader.open(index); while (!removeStack.empty()) { int hitId = ((Integer)(removeStack.pop())).intValue(); indexReader.delete(hitId); } indexR

Re: Boosting SpanQueries

2005-07-07 Thread Vincent Le Maout
Hi, I ran into this problem a few months ago: trying to boost some words in a SpanQuery seems to have no effect, which was confirmed by looking at the source code (no reference to boost in the scoring methods, at least as far as Lucene 1.4.3 is concerned, correct me if I am wrong). So my first quest

FileNotFoundException segments

2005-07-07 Thread bib_lucene bib
Hi all, can someone please help me with the error in my web application... I am using Tomcat; the path for the index dir is obtained from a JSP page using application.getRealPath("/")+"download/compName". I want to index when the file gets uploaded. I am getting this error... java.io.FileNotFoundEx

Re: no results for date field

2005-07-07 Thread Leos Literak
Erik Hatcher wrote: QueryParser attempts to parse the from and to strings as simple date formats with the default locale. You would use something like "created:[01/01/04 TO 07/05/05]". Please read up on the issues that the default date handling poses though. Simply indexing with