Re: Query suggestion

2006-10-10 Thread Rida Benjelloun
Hi, You can look at Nutch and its Ontology plugin. Best regards --- Rida Benjelloun, DocuLibre inc. Website: http://www.doculibre.com Email: [EMAIL PROTECTED] --- On 10/10/06

Re: corrupt index: .fdx and stored norms

2006-10-10 Thread Nick Puz
Hi Doron, sorry, I forgot to include the env... Here is the env for the test machine that creates the index: - lucene 1.9.1 - ibm jdk 1.5 sr2 - RedHat enterprise linux release 4 (kernel 2.6.9-34ELsmp on x86_64) - 8GB RAM - 2 dual core xeon 3.4ghz. - indexes on local disks, locks also on local (/tmp) From the

Re: Strange Spellchecker behaviour

2006-10-10 Thread Doron Cohen
I believe this was fixed in http://issues.apache.org/jira/browse/LUCENE-593 - Doron Björn Ekengren <[EMAIL PROTECTED]> wrote on 10/10/2006 02:12:23: > Hello, I have found that the spellchecker behaves a bit strangely. My > spell indexer class below doesn't work if I use the spellfield > string set

Re: corrupt index: .fdx and stored norms

2006-10-10 Thread Doron Cohen
I meant ~182K files ... > Nick, could you provide additional info: > (1) Env info - Lucene version, Java version, OS, JVM args (e.g. -XmNNN), > etc... > (2) is this reproducible? By the file sizes there seem to be ~182 indexed > docs when the problem occurs, so, if this is reproducible it would hop

Re: corrupt index: .fdx and stored norms

2006-10-10 Thread Doron Cohen
Nick, could you provide additional info: (1) Env info - Lucene version, Java version, OS, JVM args (e.g. -XmNNN), etc... (2) is this reproducible? By the file sizes there seem to be ~182 indexed docs when the problem occurs, so, if this is reproducible it would hopefully not take too long. If reprod

Re: Lucene in Action examples complie problem

2006-10-10 Thread Doron Cohen
I wonder if this should be in the FAQ entry "How do i get code written for Lucene 1.4.x to work with Lucene 2.x", or perhaps we could just add a link there to your post here - http://www.nabble.com/Lucene-in-Action-examples-complie-problem-tf2418478.html#a6743189 Erik Hatcher <[EMAIL PROTECTED]> wrote o

Re: What is the advantage of setting using compund file to false

2006-10-10 Thread Yonik Seeley
On 10/10/06, Simon Willnauer <[EMAIL PROTECTED]> wrote: > On the other hand, the advantage of compound file appears in searching. In my testing from some time ago, this wasn't the case. Non-compound indexes were faster to search too. If you get too many open files, - configure your OS to increa

corrupt index: .fdx and stored norms

2006-10-10 Thread Nick P
Hi, I sent this 30 min ago and it didn't seem to go through, so I'm trying again. I apologize if two copies finally arrive. I am working on the development of a product that is using Lucene. A corrupt index was reported by testers and it is in an odd state. The indexes are built in batches (to mul

Re: Lucene in Action examples complie problem

2006-10-10 Thread Erik Hatcher
I have long been meaning to publish an updated codebase for Lucene in Action compatible with Lucene 2.0. I adjusted the code a while ago, but I haven't published it yet (and I think the plan is to wait until LIA2 is finished to do so). I did make notes when I made these changes, and I'm p

Re: Lucene in Action examples complie problem

2006-10-10 Thread Doron Cohen
Field.Text() was deprecated in Lucene 1.9 and then removed in 2.0. The book examples were not updated for 2.0 yet. You should now use Field(String, String, Field.Store, Field.Index). To have the same behavior as old Field.Text use: Field(name, value, Field.Store.YES, Field.Index.TOKENIZED). For

Re: What is the advantage of setting using compund file to false

2006-10-10 Thread Doron Cohen
A bit of clarification: a Lucene index is made of multiple "segments". Compound format: stores each segment in a single file - fewer files created/opened. Non-compound format: stores each segment in multiple files - more files created/opened. Non-compound is likely to be faster for indexing. Optimizing t

corrupt index: .fdx and stored norms

2006-10-10 Thread Nick Puz
I am working on the development of a product that is using Lucene. A corrupt index was reported by testers and it is in an odd state. The indexes are built in batches (to multiple ram indexes in parallel) and then eventually merged into a disk index with IndexWriter.addIndexes(Directory[]). Someho

Lucene in Action examples complie problem

2006-10-10 Thread Serhiy Polyakov
Hi, I started to study Lucene following the book Lucene in Action. I am trying to compile the book examples downloaded from the book site: http://www.manning.com/hatcher2/ When I try to compile the first example (Indexer.java) it gives me the following error: LuceneInAction\src\lia\meetlucene\In

Query suggestion

2006-10-10 Thread Louis Sicard
Hi, I am designing a search engine based on Lucene 2.0 and I would like to add a specific feature to help users in their search by suggesting a number of queries related to their search, e.g. a user searches for "Java" and we suggest "Building web applications in Java" and "SOAP tec

Re: Performing a like query

2006-10-10 Thread Rahil
Hi Erick, I think you've made some really important observations. Steven has provided a good regular expression to help with words and non-words. For the moment I have reverted to my analyser and am going to try doing some clever pattern matching later. Also, I'll try using a different analyser

Re: What is the advantage of setting using compund file to false

2006-10-10 Thread Supriya Kumar Shyamal
Hi Simon, Thanks for your very good, detailed explanation; it really cleared my doubts. Thanks once again, Regards supriya Simon Willnauer wrote: > Hi, In Lucene there are two types of index structure: compound index and multi-file index. In multi-file index, when new documents are inserted t

Re: What is the advantage of setting using compund file to false

2006-10-10 Thread Simon Willnauer
Hi, In Lucene there are two types of index structure: compound index and multi-file index. In a multi-file index, when new documents are inserted into an index, they are stored in a separate segment; this increases the number of files in the index structure. Therefore, multi-file index has more files than

What is the advantage of setting using compund file to false

2006-10-10 Thread Supriya Kumar Shyamal
Hello All, I have a question regarding the use of the compound file for an index: what are the advantages & disadvantages of enabling use of the compound file (which is the default, I think) or disabling the use of it? Thanks, supriya -- Mit freundlichen Grüßen / Regards Supriya Kumar Shyamal Software Develope

Strange Spellchecker behaviour

2006-10-10 Thread Björn Ekengren
Hello, I have found that the spellchecker behaves a bit strangely. My spell indexer class below doesn't work if I use the spellfield string set in the constructor directly, but it does work if I use the intern() value. The problem resides in the hasNext() method of LuceneIterator where an object c
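
For readers following this thread: the preview above is cut off, so the exact check inside LuceneIterator is not visible here. The sketch below only illustrates the general Java behaviour that would produce such a symptom, assuming (this is an assumption, not something the message confirms) that the field name is compared by reference (==) rather than with equals(). The class and method names are hypothetical and do not come from Lucene.

```java
// Hypothetical demo class; not part of Lucene or of the original message.
public class InternDemo {

    // Identity comparison: true only when both variables point to the same object.
    // Assumption: a check of this kind is what makes intern() matter in the thread.
    public static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "word";            // string literals are interned by the JVM
        String built = new String("word");  // a distinct object with equal contents

        // Equal contents, different objects: the identity check fails.
        System.out.println(sameReference(literal, built));          // false
        System.out.println(literal.equals(built));                  // true

        // intern() returns the pooled instance, so the identity check succeeds.
        System.out.println(sameReference(literal, built.intern())); // true
    }
}
```

If that assumption holds, passing an interned field name works around the problem, while comparing with equals() would remove the dependency on interning altogether.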

Re: How to search with empty content

2006-10-10 Thread Erik Hatcher
On Oct 10, 2006, at 3:04 AM, Kumar, Samala Santhosh (TPKM) wrote: > I went through the Lucene API for MatchAllDocsQuery, but I didn't understand how to use it. Can you please explain it to me with a small code example? Hits hits = searcher.search(new MatchAllDocsQuery()); Erik > regards santhosh -Ori

RE: How to search with empty content

2006-10-10 Thread Kumar, Samala Santhosh (TPKM)
I went through the Lucene API for MatchAllDocsQuery, but I didn't understand how to use it. Can you please explain it to me with a small code example? regards santhosh -Original Message- From: Scott [mailto:[EMAIL PROTECTED] Sent: Monday, October 09, 2006 8:32 PM To: java-user@lucene.apache.org Subject: R