date:20051208

Re: JVM Crash in Lucene

2005-12-08 Thread Yonik Seeley

The only problems I've had with 1.5 JVM crashes and Lucene were related to stack overflow... try increasing the stack size and see if anything different happens. My crashes happened while trying to use Luke to open a 4GB index with thousands of indexed fields. -Yonik
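As a rough illustration of the stack-size suggestion above: on HotSpot JVMs the default thread stack size can be raised with the `-Xss` launch option, and a single thread can also request a larger stack through the four-argument `Thread` constructor. The sketch below is only an assumption-laden demo, not code from this thread: the class and method names are made up, it uses lambda syntax from newer Java versions than the 1.5 JVM discussed here, and the JVM is allowed to treat the requested size as a hint.

```java
class StackSizeDemo {
    // Plain recursion: each call adds one frame to the current thread's stack.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    // Runs depth(n) on a thread created with the requested stack size.
    // Returns n on success, or -1 if the stack overflowed.
    static int runWithStack(long stackBytes, int n) {
        final int[] result = {-1};
        Thread worker = new Thread(null, () -> {
            try {
                result[0] = depth(n);
            } catch (StackOverflowError e) {
                result[0] = -1;
            }
        }, "deep-recursion", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        // Request a 64 MB stack for a recursion 100,000 frames deep.
        System.out.println(runWithStack(64L * 1024 * 1024, 100_000));
    }
}
```

If a crash disappears when the same work runs with a larger stack, that points toward stack exhaustion rather than a problem in the index itself.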

Re: JVM Crash in Lucene

2005-12-08 Thread Chris Hostetter

: I'm relatively new to Lucene. When I run my app, I get a JVM error. : This gets called a lot, but only fails every once in a while (maybe 1 in : 100 calls?) I'm not that familiar with TermFreqVectors, and I have no idea what indexManager is, but I'm surprised this works at all ... I thought calli

JVM Crash in Lucene

2005-12-08 Thread Dan Gould

Hi-- I'm relatively new to Lucene. When I run my app, I get a JVM error. This gets called a lot, but only fails every once in a while (maybe 1 in 100 calls?) I filed a report with Sun, but I don't expect to hear anything from them. So, I was wondering if any Lucene experts have run across th

RE: delete and optimize

2005-12-08 Thread Dan Liu

There IS a difference between something being marked as deleted and something actually being deleted, because documents marked as deleted can be undeleted. The document is marked as deleted even before the reader is closed. There is an example in "Lucene in Action". /dan -Original Message- From: Dan Q

RE: delete and optimize

2005-12-08 Thread Dan Quaroni

I'm confused by what you mean - there is no difference between something being marked as deleted and deleted. (Since it's not removed from the index until optimization) I've found that unless I close(), the document isn't even marked for deletion. And if I recall, I think I also had to close

Re: pdf and highlighting

2005-12-08 Thread Erik Hatcher

On Dec 8, 2005, at 10:51 AM, Sonja Löhr wrote: Thank you both, I found it (I really asked a bit too early, sorry) The highlighter works correctly if I use my custom Analyzer during indexing (and for QueryParser), BUT when preparing the TokenStream to feed the highlighter, I must NOT use it.

RE: delete and optimize

2005-12-08 Thread Dan Liu

The document is marked as "deleted" when reader.delete(i) is called. It is actually deleted from the index when reader.close() is called. The deleted documents seem to be put in a separate file with extension ".del" in the index folder. When optimization happens after deletion, the ".del" file is gone, and Document

Re: delete and optimize

2005-12-08 Thread Michael D. Curtin

Mordo, Aviran (EXP N-NANNATEK) wrote: Optimization also purges the deleted documents, thus reducing the size (in bytes) of the index. Until you optimize, documents stay in the index, only marked as deleted. Deleted documents' space is reclaimed during optimization, 'tis true. But it can also be
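The distinction discussed in this thread (documents first marked as deleted, then physically purged when the index is optimized) can be sketched with a toy in-memory model. This is not Lucene code and does not use the Lucene API; `ToyIndex` and all of its methods are hypothetical names chosen for illustration, with `maxDoc` and `numDocs` named only by analogy with the counters on Lucene's `IndexReader`.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: stored documents plus a bit set of deletion marks.
class ToyIndex {
    private List<String> docs = new ArrayList<>();
    private BitSet deleted = new BitSet();

    void add(String doc) {
        docs.add(doc);
    }

    // Marks a document as deleted; its storage is untouched.
    void markDeleted(int docId) {
        if (docId < 0 || docId >= docs.size()) {
            throw new IndexOutOfBoundsException("no such document: " + docId);
        }
        deleted.set(docId);
    }

    // Marks can be cleared as long as nothing has been purged yet.
    void undeleteAll() {
        deleted.clear();
    }

    // Number of stored slots, including documents marked as deleted.
    int maxDoc() {
        return docs.size();
    }

    // Number of live documents.
    int numDocs() {
        return docs.size() - deleted.cardinality();
    }

    // Rewrites storage without the marked documents, reclaiming their space.
    void optimize() {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                kept.add(docs.get(i));
            }
        }
        docs = kept;
        deleted = new BitSet();
    }
}
```

In this model a marked document still occupies a slot (`maxDoc` is unchanged) and can be restored with `undeleteAll`; only `optimize` makes the removal permanent and shrinks the storage.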

RE: delete and optimize

2005-12-08 Thread Mordo, Aviran (EXP N-NANNATEK)

Optimization also purges the deleted documents, thus reducing the size (in bytes) of the index. Until you optimize, documents stay in the index, only marked as deleted. -Original Message- From: Dan Liu [mailto:[EMAIL PROTECTED] Sent: Thursday, December 08, 2005 2:00 PM To: java-user@lucene.

RE: delete and optimize

2005-12-08 Thread Dan Liu

The document is indexed first. This is required by the application. According to "Lucene in Action", optimization is "to merge multiple index files together in order to reduce their number and thus minimize the time it takes to read at search time". Approach1 does deletion on an optimized index. S

Re: Merging with IndexWriter.addIndexes(...)

2005-12-08 Thread Doug Cutting

J.J. Larrea wrote: So... I notice that both IndexWriter.addIndexes(...) merge methods start and end with calls to optimize() on the target index. I'm not sure whether that is causing the unpacking and repacking I observe, but it does make me wonder whether they truly need to be there: I don't recall

RE: delete and optimize

2005-12-08 Thread Mordo, Aviran (EXP N-NANNATEK)

Well, the best way in my opinion is to: 1) open the IndexReader and delete some documents from the same index 2) close the IndexReader 3) open the IndexWriter and index documents 4) optimize the IndexWriter and close the IndexWriter For best performance you want the optimization to be

Re: words with more than 1 hyphen ?

2005-12-08 Thread Beady Geraghty

Thanks for the advice. It is hard to say whether the usability folks want to distinguish between "/usr/include" as opposed to "usr include". Actually, I am sure that they would, but whether they would accept "usr include" is the right question to ask :-) I'll have to sort it out with them :-( Tha

delete and optimize

2005-12-08 Thread Dan Liu

Hi, What is the difference between the following approaches? Approach1 1) open the IndexWriter and index documents 2) optimize the IndexWriter and close the IndexWriter 3) open the IndexReader and delete some documents from the same index 4) close the IndexReader Approach2

Re: words with more than 1 hyphen ?

2005-12-08 Thread Erik Hatcher

On Dec 8, 2005, at 10:15 AM, Beady Geraghty wrote: Since someone suggested hyphen, the next request is underscore. I can see more and more of these requests. Also, people might like to search for "/usr/include/wchar.h" (hence, the slash) and apostrophe etc. There really isn't a set of re
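To make the trade-off in this thread concrete, here is a minimal sketch in plain Java (no Lucene classes) contrasting two tokenization policies: one that breaks on every character that is not a letter or digit, and one that breaks only on whitespace. The class and method names are hypothetical, and real Lucene analyzers behave differently in detail; the sketch only shows why "/usr/include" can end up indexed as "usr include".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class TokenizeDemo {
    // Breaks on any run of characters that are not letters or digits,
    // so hyphens, slashes, dots and apostrophes all act as separators.
    static List<String> splitOnNonLetters(String text) {
        return split(text, "[^\\p{L}\\p{N}]+");
    }

    // Breaks only on whitespace, so punctuation stays inside the token.
    static List<String> splitOnWhitespace(String text) {
        return split(text, "\\s+");
    }

    private static List<String> split(String text, String separatorRegex) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split(separatorRegex)) {
            if (!token.isEmpty()) { // skip the empty piece before a leading separator
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String path = "/usr/include/wchar.h";
        System.out.println(splitOnNonLetters(path)); // [usr, include, wchar, h]
        System.out.println(splitOnWhitespace(path)); // [/usr/include/wchar.h]
    }
}
```

Whichever policy is chosen, the same one has to be applied at index time and at query time, otherwise a query for the full path will not line up with the indexed tokens.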

RE: pdf and highlighting

2005-12-08 Thread Sonja Löhr

Thank you both, I found it (I really asked a bit too early, sorry) The highlighter works correctly if I use my custom Analyzer during indexing (and for QueryParser), BUT when preparing the TokenStream to feed the highlighter, I must NOT use it. TokenStream tStream = new GermanAnalyzer().tokenSt

Re: words with more than 1 hyphen ?

2005-12-08 Thread Beady Geraghty

Thank you for your answer. I would like to not give you a "general" question so that I can understand more. But, I have random requests from people. For example, this request for hyphens originated from a colleague who is French, and she believes that hyphen is important, though, I don't know w

RE: Lucene performance bottlenecks

2005-12-08 Thread Dalton, Jeffery

Andrzej, I think you did a great job elucidating my thoughts as well. I heartily concur with everything you said. Andrzej Bialecki Wrote: > Hmm... Please define what "adequate" means. :-) IMHO, > "adequate" is when for any query the response time is well > below 1 second. Otherwise the serv

Re: Top n Searches

2005-12-08 Thread msftblows

I had to do the same thing and I used Log4J...that will do the trick for you. -Original Message- From: Cheolgoo Kang <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Thu, 8 Dec 2005 17:51:23 +0900 Subject: Re: Top n Searches Hi, You first save those search keywords entered by

Re: pdf and highlighting

2005-12-08 Thread Erik Hatcher

Wow, those were some great details. But, as I hope you've seen with some other recent issues, things become so much clearer when you can isolate the issues. This is one reason that test-driven development with unit tests is so amazingly helpful. If you could isolate a single PDF going th

Index merging

2005-12-08 Thread Paul . Illingworth

Hello all, Whilst merging one index into another using IndexWriter.addIndexes(IndexReader[]) I got the following error. (index _file_path)\_5z.fnm (The system cannot find the file specified) It would appear that this occurred during the adding of the indexes. The indexes I was merging to an

RE: pdf and highlighting

2005-12-08 Thread mark harwood

> if it comes from PdfBox, the wrong text is > highlighted. Wrong in what sense? A couple of things to consider from looking at your code. * It is preferable to pass a rewritten query to the highlighter (pass the same rewritten query to searcher if you want to avoid query rewriting costs twice).

RE: pdf and highlighting

2005-12-08 Thread Sonja Löhr

Hi, Erik and the other experts! I'll try to collect some code fragments. Many things are configurable and I wrote a Crawler for indexing, but the rest is very close to the examples in "Lucene in Action". I hope I chose the appropriate snippets. The analyzer I use is created once and stored in a

Re: pdf and highlighting

2005-12-08 Thread Erik Hatcher

Sonja, Do you have an example, or at least some relevant code, that would help the community in helping resolve this? Erik On Dec 8, 2005, at 4:24 AM, Sonja Löhr wrote: Hi, all! I have a question concerning analysis and highlighting. I'm indexing multiple document formats (up to

pdf and highlighting

2005-12-08 Thread Sonja Löhr

Hi, all! I have a question concerning analysis and highlighting. I'm indexing multiple document formats (up to now, only html and pdf occurred), and use the highlighter from the Lucene sandbox. The documents' text is extracted via JTidy and PDFBox, respectively, then in both indexing and search anal

Re: Lucene performance bottlenecks

2005-12-08 Thread Andrzej Bialecki

(Moving the discussion to nutch-dev, please drop the cc: when responding) Doug Cutting wrote: Andrzej Bialecki wrote: It's nice to have these couple percent... however, it doesn't solve the main problem; I need 50 or more percent increase... :-) and I suspect this can be achieved only by som

Re: Top n Searches

2005-12-08 Thread Cheolgoo Kang

Hi, You first save those search keywords entered by users into some kind of storage like a database system or even into a dedicated Lucene index. So it's a database and web issue, not a Lucene one. And, as you know, Lucene does not provide this functionality out of the box. Good luck! On 12/8/0
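The approach described above (store every search keyword somewhere, then count) can be sketched in a few lines of plain Java. This is only an in-memory illustration under stated assumptions: `SearchLog`, `record` and `topN` are hypothetical names, and a real application would persist the counts in a database, a log file or a dedicated index, as the message suggests.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class SearchLog {
    private final Map<String, Integer> counts = new HashMap<>();

    // Records one user query, normalized so "Lucene " and "lucene" count together.
    void record(String query) {
        String key = query.trim().toLowerCase(Locale.ROOT);
        if (!key.isEmpty()) {
            counts.merge(key, 1, Integer::sum);
        }
    }

    // Returns up to n queries, most frequent first; ties are broken alphabetically.
    List<String> topN(int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }
}
```

Calling `record` once per incoming search and `topN(10)` when building the report is enough for a single process; with several web servers the counting would have to move into shared storage.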

Re: Confused about ... [SOLVED]

2005-12-08 Thread Alan Chandler

On Wednesday 07 Dec 2005 22:23, Chris Hostetter wrote: > -- the real issue is that your query should match a certain set of > documents, if there is a document you've added to the index that you > expect to see in that result but isn't there, then use Luke or > something like it to verify: > 1)

29 matches
