Re: add word filtering?

2006-03-27 Thread abdul muhaimin
No. I'm sorry I didn't convey my question very well. Anyway, thanks a lot for the info. What I really meant is that I want to filter out some words, for example "violence" and "hatred", from the search engine results. Consequently Lucene will display some alternative results for the above attempted s

How to write to and read from the same index

2006-03-27 Thread Nick Atkins
I'm using Lucene running on Tomcat to index a large amount of email data, and as the indexer runs through the mailbox creating, merging and deleting documents, it does lots of searches at the same time to see if the document exists. Actually all my modification operations are done "in batch" every x

Re: Lucene indexing on Hadoop distributed file system

2006-03-27 Thread Doug Cutting
Igor Bolotin wrote: Does it make sense to change TermInfosWriter.FORMAT in the patch? Yes. This should be updated for any change to the format of the file, and this certainly constitutes a format change. This discussion should move to [EMAIL PROTECTED] Doug

to OR or not

2006-03-27 Thread Amol Bhutada
Hi everybody, I am using Lucene in almost every web application I am working on. It's simply great software. I have developed an advanced search with Lucene 1.4. Now I am looking to develop a fuzzy search, i.e., get one search string from the user and search across all fields of member docume

Re: Lucene indexing on Hadoop distributed file system

2006-03-27 Thread Igor Bolotin
Does it make sense to change TermInfosWriter.FORMAT in the patch? Igor On 3/27/06, Doug Cutting <[EMAIL PROTECTED]> wrote: > Igor Bolotin wrote: > > If somebody is interested - I can post our changes in TermInfosWriter and SegmentTermEnum code, although they are pretty trivial. > Pleas

Re: delte documents into index

2006-03-27 Thread Tom Hill
On Saturday 25 March 2006 00:39, Tom Hill wrote: > IndexModifier won't work in multithreaded scenario, at least as far as I can tell. Yes it does, but you need to use one IndexModifier object from all classes (see the javadoc). Regards Daniel I stand corrected (after going back and reading

Re: PhraseQuery with synonyms or having n tokens at the same tokenposition.

2006-03-27 Thread Daniel Naber
On Monday 27 March 2006 11:17, Ramana Jelda wrote: > I have indexed name: "sony dsc-d cybershot" as following tokens provided token positions. > 1: [sony:0->4] > 2: [dsc:5->10] > 3: [dscd:5->10] > 4: [d:5->10] > 5: [cybershot:11->20] If the first number is the token position, the tokens

Re: Lucene indexing on Hadoop distributed file system

2006-03-27 Thread Andrzej Bialecki
Doug Cutting wrote: Igor Bolotin wrote: If somebody is interested - I can post our changes in TermInfosWriter and SegmentTermEnum code, although they are pretty trivial. Please submit this as a patch attached to a bug report. I contemplated making this change to Lucene myself, when writing

Re: Phrase Query query

2006-03-27 Thread Otis Gospodnetic
Richard, WhitespaceTokenizer (the tokenizer that WhitespaceAnalyzer uses) really just tokenizes on space characters: /** Collects only characters which do not satisfy * [EMAIL PROTECTED] Character#isWhitespace(char)}.*/ protected boolean isTokenChar(char c) { return !Character.isWhite
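The rule quoted above (a character is part of a token exactly when `Character.isWhitespace` is false for it) can be illustrated without Lucene. The following is only a plain-Java sketch of that rule; the class name `WhitespaceSplitSketch` and the `tokenize` helper are hypothetical and are not Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
public class WhitespaceSplitSketch {

    // Same test as the quoted isTokenChar: anything that is not whitespace is a token character.
    static boolean isTokenChar(char c) {
        return !Character.isWhitespace(c);
    }

    // Splits text into maximal runs of token characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // Whitespace ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Punctuation is not whitespace, so it stays attached to the neighbouring word.
        for (String token : tokenize("Hello, this is some text.")) {
            System.out.println(token);
        }
    }
}
```

Under this rule, punctuation and quote characters are never stripped, so a token such as `text.` is different from `text`.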

Re: Does Optimize preserve index order?

2006-03-27 Thread Yonik Seeley
On 3/24/06, chan kang <[EMAIL PROTECTED]> wrote: > What I want to do is to show the results in > chronological order. (btw, the index contains the time field) > One solution I have thought up was: > 1. index the whole set > 2. read in all the time field values > 3. re-index the whole set according

Re: Lucene indexing on Hadoop distributed file system

2006-03-27 Thread Doug Cutting
Igor Bolotin wrote: If somebody is interested - I can post our changes in TermInfosWriter and SegmentTermEnum code, although they are pretty trivial. Please submit this as a patch attached to a bug report. I contemplated making this change to Lucene myself, when writing Nutch's FsDirectory, b

Re: span query scoring vs boolean query scoring

2006-03-27 Thread Doug Cutting
Vincent Le Maout wrote: Am I missing something? Is it intended or is it a bug? Looks like a bug. Can you please submit a bug report, and, ideally, attach a patch? Thanks, Doug

Re: span query scoring vs boolean query scoring

2006-03-27 Thread Doug Cutting
Vincent Le Maout wrote: Am I missing something? Is it intended or is it a bug? Looks like a bug. Can you submit a patch? Doug

Phrase Query query

2006-03-27 Thread Richard Gunderson
Hi, I'm using PhraseQuery in conjunction with WhitespaceAnalyzer but it's giving me slightly unusual results. If I have a text file containing the text (quotes are just for clarity): "Hello this is some text" I don't find any results when I search. But if I put spaces before and after the phras

RE: Get All Entries

2006-03-27 Thread StefanH
It works perfectly. After installing 1.9, I have the MatchAllDocsQuery. Thanks!

RE: Get All Entries

2006-03-27 Thread Satuluri, Venu_Madhav
I believe there's a MatchAllDocsQuery class from Lucene 1.9 onwards. You can run this query to get all documents. If you are not using 1.9, to my knowledge, you would have to add a redundant field that would be true for all documents and query on that field. Something like Field.Keyword("AllDocsTrue"

Get All Entries

2006-03-27 Thread StefanH
Hello everyone, I have 6000 entries in my Lucene DB and if I search for entries with "00*" in the Number field it works fine. But additionally I must get all entries no matter which number they have. A term like "*" doesn't work. How can I get all entries? The code of my search is:

RE: add word filtering?

2006-03-27 Thread Satuluri, Venu_Madhav
Are you asking that common words not be searched? For this, you can use StopFilter to prevent words from being indexed and searched. Alternatively, you can use StandardAnalyzer, which in addition to removing stop words also does more sophisticated tokenizing. Venu
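The idea behind stop-word removal can be sketched in plain Java: tokens found in a stop list are dropped before they are indexed or searched. This is not Lucene's StopFilter. The class `StopWordSketch`, the method `removeStopWords`, and the sample stop list are hypothetical, and the sketch assumes the stop list is stored in lowercase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration class; not part of Lucene.
public class StopWordSketch {

    // Returns the tokens that are not in the stop list, in their original order.
    // Assumes stopWords contains lowercase entries only.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample stop list chosen for illustration only.
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "of", "a"));
        List<String> tokens = Arrays.asList("The", "history", "of", "Lucene");
        System.out.println(removeStopWords(tokens, stopWords));
    }
}
```

For matching to stay consistent, the same stop list would have to be applied to both the indexed text and the query text.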

add word filtering?

2006-03-27 Thread abdul muhaimin
Hi all, I'm really new to Lucene. In fact I just found it when I googled a few days ago. Never thought that Java had this kind of excellent library for free. I would like to ask a few questions, namely where to add code if we would like to filter certain text from being searched, and filter certain

PhraseQuery with synonyms or having n tokens at the same tokenposition.

2006-03-27 Thread Ramana Jelda
Hi, PhraseQuery is not working as I wanted when indexed with synonyms. E.g., I have indexed name: "sony dsc-d cybershot" as the following tokens with the given token positions. 1: [sony:0->4] 2: [dsc:5->10] 3: [dscd:5->10] 4: [d:5->10] 5: [cybershot:11->20] So "dsc-d" is tokenized into 3 tokens "dsc

Re: Re-creating IndexSearcher after update

2006-03-27 Thread Nick Atkins
Luc, I tried adding your DelayCloseIndexSearcher to my project (a Tomcat app where the index is repeatedly searched and frequently updated) and as soon as an index modify occurs (by a separate thread) and I call closeWhenDone() in the main thread I get IllegalStateException("closeWhenDone() alread