Hello,
I am using Lucene to build an index from roughly 10 million documents,
about 4 TB in total.
After some trial runs indexing a subset of the documents, I am trying
to figure out a hosting service configuration for creating a full index
from the entire data set.
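(For reference, a minimal sketch of the IndexWriter settings that usually matter when batch-building a very large index; the index path and the values below are placeholders, not recommendations for this data set.)

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class BulkIndexSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder path; in practice this points at a large, fast disk.
        IndexWriter writer = new IndexWriter("/data/index", new StandardAnalyzer(), true);
        // Buffer more documents in RAM before flushing a segment to disk.
        writer.setMaxBufferedDocs(1000);
        // A higher merge factor means fewer, larger merges during the build.
        writer.setMergeFactor(30);
        // ... addDocument() loop over the source documents goes here ...
        writer.optimize();  // optional, and expensive at this scale
        writer.close();
    }
}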
On 17-Dec-07, at 11:39 AM, Beyer,Nathan wrote:
Would using Field.Index.UN_TOKENIZED be the same as tokenizing a field
into one token?
Indeed.
-Mike
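(To illustrate the point above, a small sketch; the field names are made up.)

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class UntokenizedFieldSketch {
    static Document buildDoc(String value) {
        Document doc = new Document();
        // Analyzed: the analyzer splits the value into several tokens.
        doc.add(new Field("title", value,
                Field.Store.YES, Field.Index.TOKENIZED));
        // Not analyzed: the whole value is indexed as exactly one token,
        // which is effectively the same as tokenizing it into one token.
        doc.add(new Field("titleExact", value,
                Field.Store.YES, Field.Index.UN_TOKENIZED));
        return doc;
    }
}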
-Original Message-
From: Mike Klaas [mailto:[EMAIL PROTECTED]
Sent: Monday, December 17, 2007 12:53 PM
To: java-user@lucene.apache.org
I'm working on a project where we are indexing content for several different
languages - English, Spanish, French and German. I have built separate
indexes for each language using the proper Analyzer for each
language (StandardAnalyzer for English, FrenchAnalyzer for French, etc.). We
have a require
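(A rough sketch of the per-language setup described above: one index per language, each opened with that language's analyzer. FrenchAnalyzer and GermanAnalyzer are in the contrib analyzers jar; the paths are placeholders.)

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.de.GermanAnalyzer;
import org.apache.lucene.analysis.fr.FrenchAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class PerLanguageIndexes {
    static IndexWriter openWriter(String path, Analyzer analyzer) throws Exception {
        // One physical index per language, built with that language's analyzer.
        return new IndexWriter(path, analyzer, true);
    }

    public static void main(String[] args) throws Exception {
        IndexWriter en = openWriter("/indexes/en", new StandardAnalyzer());
        IndexWriter fr = openWriter("/indexes/fr", new FrenchAnalyzer());
        IndexWriter de = openWriter("/indexes/de", new GermanAnalyzer());
        // Spanish could use e.g. SnowballAnalyzer("Spanish") from contrib-snowball.
        // ... route each document to the writer for its language ...
        en.close(); fr.close(); de.close();
    }
}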
Hi Sirish,
A few hours ago I sent a reply to your message. If my
understanding is correct, you indexed a doc with text
as
Health and Safety
and you used phrase
Health Safety
to create a phrase query. If that is the case, this is
normal since you used StandardAnalyzer to tokenize the
input tex
Would using Field.Index.UN_TOKENIZED be the same as tokenizing a field
into one token?
-Original Message-
From: Mike Klaas [mailto:[EMAIL PROTECTED]
Sent: Monday, December 17, 2007 12:53 PM
To: java-user@lucene.apache.org
Subject: Re: thoughts/suggestions for analyzing/tokenizing class na
Please do not hijack the thread. When starting a new topic, do NOT
use "reply to", start an entirely new e-mail. Otherwise your topic often
gets ignored by people who are uninterested in the original thread.
Best
Erick
On Dec 17, 2007 5:57 AM, anjana m <[EMAIL PROTECTED]> wrote:
> how to i use
Either index them as a series of tokens:
org
org.apache
org.apache.lucene
org.apache.lucene.document
org.apache.lucene.document.Document
or index them as a single token, and use prefix queries (this is what
I do for reverse domain names):
classname:(org.apache org.apache.*)
Note that "class
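(A minimal sketch of the single-token approach: index the fully qualified name untokenized and search it with a term query plus a prefix query, as in the classname:(org.apache org.apache.*) example above. The field name is made up.)

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.TermQuery;

public class ClassNameQuerySketch {
    public static void main(String[] args) {
        // Index time: keep the whole class name as one token.
        Document doc = new Document();
        doc.add(new Field("classname", "org.apache.lucene.document.Document",
                Field.Store.YES, Field.Index.UN_TOKENIZED));

        // Search time: exact match on "org.apache" OR any name starting with "org.apache."
        BooleanQuery q = new BooleanQuery();
        q.add(new TermQuery(new Term("classname", "org.apache")),
                BooleanClause.Occur.SHOULD);
        q.add(new PrefixQuery(new Term("classname", "org.apache.")),
                BooleanClause.Occur.SHOULD);
        System.out.println(q);
    }
}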
Hi anjana m,
You're going to have lots of trouble getting a response, for two reasons:
1. You are replying to an existing thread and changing the subject. Don't do
that. When you have a question, start a new thread by creating a new email
instead of replying.
2. You are not telling the list
Hi,
Do you mean that your query phrase is "Health Safety",
but docs with "Health and Safety" were returned?
If that is the case, the reason is that StandardAnalyzer
filters out "and" (also "or", "in", and others) as stop
words during indexing, and the QueryParser filters those
words out as well.
Best reg
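(A small self-contained illustration of the stop-word effect described above; the field name and text are made up.)

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Query;

public class StopWordSketch {
    public static void main(String[] args) throws Exception {
        QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
        // "and" is a stop word, so both strings parse to the same phrase query,
        // which is why "Health Safety" finds documents indexed as "Health and Safety".
        Query withStopWord = parser.parse("\"Health and Safety\"");
        Query withoutStopWord = parser.parse("\"Health Safety\"");
        System.out.println(withStopWord);     // typically: contents:"health safety"
        System.out.println(withoutStopWord);  // typically: contents:"health safety"
    }
}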
I have the following code for search:
BooleanQuery bQuery = new BooleanQuery();
Query queryAuthor;
queryAuthor = new TermQuery(new Term(IFIELD_LEAD_AUTHOR,
author.trim().toLowerCase()));
bQuery.add(queryAuthor, BooleanClause.Occur.MUST);
...
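(For completeness, a sketch of how a query like the one above is usually executed; the index path and the "leadAuthor" field name are placeholders standing in for IFIELD_LEAD_AUTHOR.)

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

public class AuthorSearchSketch {
    public static void main(String[] args) throws Exception {
        String author = "Smith";
        BooleanQuery bQuery = new BooleanQuery();
        Query queryAuthor = new TermQuery(
                new Term("leadAuthor", author.trim().toLowerCase()));
        bQuery.add(queryAuthor, BooleanClause.Occur.MUST);

        IndexSearcher searcher = new IndexSearcher("/path/to/index");
        Hits hits = searcher.search(bQuery);
        System.out.println(hits.length() + " hits");
        searcher.close();
    }
}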
Good point.
I don't want the sub-package names on their own to match.
Text (class name)
- "org.apache.lucene.document.Document"
Queries that would match
- "org.apache", "org.apache.lucene.document"
Queries that DO NOT match
- "apache", "lucene", "document"
-Nathan
-Original Message-
On 15-Dec-07, at 3:14 PM, Beyer,Nathan wrote:
I have a few fields that use package names and class names and I've
been
looking for some suggestions for analyzing these fields.
A few examples -
Text (class name)
- "org.apache.lucene.document.Document"
Queries that would match
- "org.apache" ,
I don't consider sending this kind of message to the list pollution.
It's good to take a step back from time to time and remember that
almost all of us volunteer here, even if we get paid to work w/
Lucene. I am constantly amazed at the Lucene community and what it
has to offer in the way
Hi,
I have got invaluable help from several people of this list.
Unfortunately I couldn't guess the email of some of you.
So, many thanks to all who have helped me.
Merry Christmas and a Happy New Year to you all.
(Perhaps someone comes up with a means to say 'thank you'
without 'polluting' the l
On Dec 17, 2007, at 5:14 AM, qvall wrote:
So does it mean that if my query doesn't support prefix or wild-char
queries then I don't need to use rewrite() for highlighting?
As long as the terms you want highlighted are extractable from the
Query instance, all is fine.
However, it wouldn't
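(A sketch of the rewrite-before-highlighting step being discussed; Highlighter and QueryScorer live in the contrib highlighter jar, and the analyzer, field name, and method shape here are assumptions.)

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;

public class HighlightSketch {
    // Prefix/wildcard/fuzzy queries only expose concrete terms after being
    // rewritten against the index; plain term and phrase queries can skip this.
    static String highlight(Query query, IndexReader reader, String text)
            throws Exception {
        Query rewritten = query.rewrite(reader);
        Highlighter highlighter = new Highlighter(new QueryScorer(rewritten));
        return highlighter.getBestFragment(new StandardAnalyzer(), "contents", text);
    }
}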
On Dec 17, 2007, at 3:31 AM, Helmut Jarausch wrote:
FuzzyQuery (in the 2.2.0 API) may take 3 arguments,
term, minimumSimilarity, and prefixLength
Is there any syntax to specify the 3rd argument
in a query term for QueryParser?
(I haven't found any in the current docs)
No, there isn't. But you
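(Since the query syntax has no slot for it, the prefix length has to be supplied programmatically; a minimal sketch with made-up field and values.)

import org.apache.lucene.index.Term;
import org.apache.lucene.search.FuzzyQuery;

public class FuzzyPrefixSketch {
    public static void main(String[] args) {
        // Arguments: term, minimumSimilarity, prefixLength.
        // Candidate terms must share their first 2 characters with "lucene".
        FuzzyQuery q = new FuzzyQuery(new Term("contents", "lucene"), 0.6f, 2);
        System.out.println(q);
    }
}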
Hi,
We are working with a web server and 10 search servers; these 10 servers
have index fragments on them. All available fragments of these search servers
are bound at their start-up time. A remote ParallelMultiSearcher is used
for searching on these indices. When a search request comes in, first it
l
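(Roughly the setup being described, as far as it can be reconstructed: each search server exports its fragment over RMI as a RemoteSearchable, and the web server wraps the remote Searchables in a ParallelMultiSearcher. Host names and binding names are placeholders.)

import java.rmi.Naming;

import org.apache.lucene.search.ParallelMultiSearcher;
import org.apache.lucene.search.Searchable;

public class DistributedSearchSketch {
    public static void main(String[] args) throws Exception {
        // Each search server binds at startup with something like:
        //   Naming.rebind("//localhost/fragment",
        //       new RemoteSearchable(new IndexSearcher("/path/to/fragment")));

        // The web server looks the fragments up and searches them in parallel.
        Searchable frag1 = (Searchable) Naming.lookup("//search1/fragment");
        Searchable frag2 = (Searchable) Naming.lookup("//search2/fragment");
        ParallelMultiSearcher searcher =
                new ParallelMultiSearcher(new Searchable[] { frag1, frag2 });
        // Hits hits = searcher.search(query);
        searcher.close();
    }
}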
Hey, I am not able to compile; packages are not found.
I downloaded the Lucene package.
Please help me.
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.Query;
import org.apache.lucene.document.Field;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.index.IndexWriter;
import org.apach
See in Lucene FAQ:
"Are Wildcard, Prefix, and Fuzzy queries case sensitive?"
On Dec 17, 2007 11:27 AM, Helmut Jarausch <[EMAIL PROTECTED]>
wrote:
> Hi,
>
> please help I am totally puzzled.
>
> The same query, once with a direct call to FuzzyQuery
> succeeds while the same query with QueryParse
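(The relevant FAQ point: terms inside wildcard, prefix, and fuzzy queries are not run through the analyzer. By default QueryParser lowercases such terms, while a hand-built FuzzyQuery uses the term exactly as given, so the two can easily disagree when the indexed terms are not lowercase. A small made-up illustration:)

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.Query;

public class FuzzyCaseSketch {
    public static void main(String[] args) throws Exception {
        // Hand-built: the term keeps its original case.
        Query direct = new FuzzyQuery(new Term("field", "MegaBit"));

        // Parsed: expanded-term queries are lowercased by default,
        // so this becomes a fuzzy query on "megabit".
        QueryParser parser = new QueryParser("field", new StandardAnalyzer());
        Query parsed = parser.parse("MegaBit~");

        System.out.println(direct);  // e.g. field:MegaBit~0.5
        System.out.println(parsed);  // e.g. field:megabit~0.5
    }
}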
How do I use Lucene search to search files on the local system?
On Dec 17, 2007 2:11 PM, Helmut Jarausch <[EMAIL PROTECTED]>
wrote:
> Hi,
>
> according to the LiA book the FuzzyQuery distance is computed as
>
> 1 - distance / min(textlen, targetlen)
>
> Given
> def addDoc(text, writer):
>doc = D
So does it mean that if my query doesn't support prefix or wild-char
queries then I don't need to use rewrite() for highlighting?
Hi,
Please help, I am totally puzzled.
The same query, once with a direct call to FuzzyQuery
succeeds while the same query with QueryParser fails.
What am I missing?
Sorry, I'm using pylucene (with lucene-java-2.2.0-603782)
#!/usr/bin/python
import lucene
from lucene import *
lucene.initVM(luce
Hi,
according to the LiA book the FuzzyQuery distance is computed as
1 - distance / min(textlen, targetlen)
Given
def addDoc(text, writer):
    doc = Document()
    doc.add(Field("field", text,
                  Field.Store.YES, Field.Index.TOKENIZED))
    writer.addDocument(doc)
addDoc("
Hi,
FuzzyQuery (in the 2.2.0 API) may take 3 arguments,
term, minimumSimilarity, and prefixLength
Is there any syntax to specify the 3rd argument
in a query term for QueryParser?
(I haven't found any in the current docs)
Many thanks for a hint,
Helmut Jarausch
Lehrstuhl fuer Numerische Mathematik