Re: General Approach: Analyzer versus Query

2006-07-10 Thread James Pine
Would Lucene's FuzzyQuery be useful in this case? I suppose it would depend on how meaningful the sequences of numbers are. http://lucene.apache.org/java/docs/api/org/apache/lucene/search/FuzzyQuery.html --- Chris Hostetter <[EMAIL PROTECTED]> wrote: > > : I could (1) up front, put in both vers

Re: Reducing the boost for a particular Term

2006-07-10 Thread Chris Hostetter
: particular author using the query. That is for documents returned by : querying: (content:"miracle cure"), I would like to reduce the : relevancy of authorid:3024 : +(content:"miracle cure") +(authorid:3024^0.5 ((-authorid:3024))^10.0) : => The boosted optional prohibited term seems to have no e

Reducing the boost for a particular Term

2006-07-10 Thread Chun Wei Ho
I have an index from which I have a number of documents from authors, but would like to drop the relevance/score for documents from one particular author using the query. That is, for documents returned by querying: (content:"miracle cure"), I would like to reduce the relevancy of authorid:3024. How

Re: question regarding Field.Index.UN_TOKENZED

2006-07-10 Thread Chris Hostetter
: I'm storing a field in an index with that option : (Field.Index.UN_TOKENIZED). The key to understanding your problem is to realize that... UN_TOKENIZED == Not Analyzed ...personally, I think the name of the constant is misleading. : The String that is being stored is: NORTH SAFETY PR
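
A minimal sketch, in plain Java rather than the Lucene API, of why case matters for a field indexed this way: the value is kept as one term exactly as supplied, so a lowercase prefix cannot match an uppercase term. The class and method names are hypothetical and only illustrate the comparison.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class UntokenizedSketch {
    // Hypothetical stand-in for an index that stores un-analyzed values verbatim.
    public static List<String> prefixMatches(List<String> indexedTerms, String prefix) {
        return indexedTerms.stream()
                .filter(term -> term.startsWith(prefix)) // case-sensitive, like raw term matching
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The whole value is a single term; no analyzer lowercased or split it.
        List<String> terms = Arrays.asList("NORTH SAFETY PRODUCT");
        System.out.println(prefixMatches(terms, "NORTH")); // [NORTH SAFETY PRODUCT]
        System.out.println(prefixMatches(terms, "north")); // []
    }
}
```

Under this model, either the query term has to be supplied in the stored case, or the value has to be normalized (for example lowercased) before it is indexed.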

Re: General Approach: Analyzer versus Query

2006-07-10 Thread Chris Hostetter
: I could (1) up front, put in both versions of the numbers or (2) during : query, play with the number and search both ways. What's the best : practice approach? In the immortal words of Erik Hatcher... "It Depends :)" #1 takes up more space on disk and in memory, and makes it impossibl

Re: Some obvious questions that I'll be happy to put on the WIKI

2006-07-10 Thread Chris Hostetter
Furash: welcome to Lucene. I suspect you'll find it extremely advantageous to pick up a copy of "Lucene in Action"; it has a lot of great examples that may help you understand some of these questions... : token string). But what I want to do is something like IF this is just : a string of more t

question regarding Field.Index.UN_TOKENZED

2006-07-10 Thread Van Nguyen
I'm storing a field in an index with that option (Field.Index.UN_TOKENIZED). The String that is being stored is: NORTH SAFETY PRODUCT (all uppercase) When I try a wildcard query against that field, it only produces results if the query term is capitalized. I'm using the StandardAnalyz

Re: modify existing non-indexed field

2006-07-10 Thread Doron Cohen
The lock time out exception is caused by trying to open multiple IndexWriter objects in parallel - each of the 5 threads is creating its own IndexWriter object in each invocation of addAndIndex(). This cannot work - I think that chapter 2.9 of "Lucene in Action" is essential reading for fixing this

RE: BooleanQuery question

2006-07-10 Thread Van Nguyen
That worked... thanks! -Original Message- From: Michael D. Curtin [mailto:[EMAIL PROTECTED] Sent: Thursday, July 06, 2006 1:04 PM To: java-user@lucene.apache.org Subject: Re: BooleanQuery question Van Nguyen wrote: > I just want results that have: > > ID: 1234 OR 2344 OR 2323 > > LOCA

Re: What are norms?

2006-07-10 Thread Yonik Seeley
Norms are stored per indexed field. For each document, the norm is the product of the lengthNorm and the index-time boost. It's really more of an implementation detail that you shouldn't need to know about unless you have a lot of indexed fields and want to omit them for memory reasons. See DefaultSimilarity
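
The product described above can be sketched in plain Java. This is an illustration only, not Lucene's code: it assumes the 1/sqrt(numTerms) length normalization that DefaultSimilarity uses, treats the index-time boost as a single number, and ignores the compact encoding Lucene applies when it stores the value. The class and method names are hypothetical.

```java
public class NormSketch {
    // Length normalization: fields with more terms get a smaller factor.
    public static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Norm for one field of one document: lengthNorm times the index-time boost.
    public static float norm(int numTerms, float indexTimeBoost) {
        return lengthNorm(numTerms) * indexTimeBoost;
    }

    public static void main(String[] args) {
        System.out.println(norm(4, 1.0f));  // 0.5
        System.out.println(norm(16, 2.0f)); // 0.5 (longer field, offset by a boost of 2)
    }
}
```

Because one such value exists per document per indexed field, the memory cost grows with the number of indexed fields, which is why omitting norms can matter.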

Re: How do you use a different analyzer by field?

2006-07-10 Thread Otis Gospodnetic
Use PerFieldAnalyzerWrapper - Original Message From: Furash Gary <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, July 10, 2006 11:56:04 AM Subject: How do you use a different analyzer by field? Maybe I'm approaching this wrong (apologies) and didn't search correctly thr
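
A rough model of the idea behind PerFieldAnalyzerWrapper, written in plain Java so it runs without Lucene on the classpath: a default analyzer plus per-field overrides looked up by field name. The real class takes a default Analyzer in its constructor and registers overrides with addAnalyzer(fieldName, analyzer); everything else here (the class name, the use of Function as a stand-in for an analyzer, the field names) is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

public class PerFieldSketch {
    private final Function<String, List<String>> defaultAnalyzer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    public PerFieldSketch(Function<String, List<String>> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Register an override for one field name.
    public void addAnalyzer(String fieldName, Function<String, List<String>> analyzer) {
        perField.put(fieldName, analyzer);
    }

    // Use the field's own analyzer if one was registered, otherwise the default.
    public List<String> analyze(String fieldName, String text) {
        return perField.getOrDefault(fieldName, defaultAnalyzer).apply(text);
    }

    public static void main(String[] args) {
        // Default: lowercase and split on whitespace.
        PerFieldSketch wrapper = new PerFieldSketch(
                text -> Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+")));
        // Hypothetical "partnum" field: keep the whole value as a single token.
        wrapper.addAnalyzer("partnum", text -> Collections.singletonList(text));

        System.out.println(wrapper.analyze("title", "North Safety")); // [north, safety]
        System.out.println(wrapper.analyze("partnum", "AB 12"));      // [AB 12]
    }
}
```

The point of the wrapper is that the index is still created with one analyzer object; that object simply delegates by field name.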

General Approach: Analyzer versus Query

2006-07-10 Thread Furash Gary
For some things, it's obvious that you would have to put them both on the front end (during indexing) and on the back end. E.g., if you want to do a soundex search, you'd want to encode the words with their soundex version during index creation, and when you query, encode the user's search input as

What are norms?

2006-07-10 Thread Furash Gary
I'm guessing they're neither the guy from Cheers nor the sociology term ;-) The examples have you creating them before you do searches. What are they? The javadoc doesn't really explain their function (or at least not in a way I could figure out). Thanks. G ---

How do you use a different analyzer by field?

2006-07-10 Thread Furash Gary
Maybe I'm approaching this wrong (apologies) and didn't search correctly through the archives (mea culpa), but... If I want to apply a different analyzer to different fields in the document, how do I do that? It seems like when you create the index you pass it an analyzer, and that's the one you'

RE: Lucene-In-Action book - any details?..

2006-07-10 Thread Vladimir Olenin
Thanks, Erik. Vlad -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Monday, July 10, 2006 11:45 AM To: java-user@lucene.apache.org Subject: Re: Lucene-In-Action book - any details?.. Vlad, You're right - LIA 1st edition was written for Lucene 1.4.x. All of the c

Re: Lucene-In-Action book - any details?..

2006-07-10 Thread Erik Hatcher
Vlad, You're right - LIA 1st edition was written for Lucene 1.4.x. All of the code works fine with Lucene 1.9 (with deprecation warnings that can be safely ignored), but documented tweaks are necessary to get the LIA code to compile with Lucene 2.0. The coverage of the basics and archite

Lucene-In-Action book - any details?..

2006-07-10 Thread Vladimir Olenin
Hi, Can anyone please advise which version of Lucene the 'Lucene in Action' book is based on? I've looked at various releases (http://gulus.usherbrooke.ca/pub/appl/apache/lucene/java/archive/), and it seems like there was a big gap between the 1.4 and 1.9 releases (over a year), with 1.4 relea

Some obvious questions that I'll be happy to put on the WIKI

2006-07-10 Thread Furash Gary
Big fan of Lucene already. Just looking for some advice, with apologies in advance if it's already been answered on the list and I just didn't search right. 1. Let's say I want to store a term in MORE than one way: e.g., I want to store the soundex version of a word and the real version of a word.

Re: FileDocument : cannot resolve symbol

2006-07-10 Thread James liu
I fixed it by adding "import org.apache.lucene.demo.FileDocument;". Thank you for your answer. 2006/7/10, Chris Hostetter <[EMAIL PROTECTED]>: : when i try javac Package: org.apache.lucene.demo; : name is IndexFiles.java : : it show me : FileDocument ,error info : cannot resolve symbol : : : how c

Re: TermQuery doesn't support non-english charecters

2006-07-10 Thread Erik Hatcher
On Jul 9, 2006, at 8:29 AM, dan2000 wrote: yes, myField is a tokenized field. I've used ChineseAnalyzer. here is an example text ?? Let me explain exactly what I want. myField is a tokenized field: new Field("key",key, Field.Store.YES, Field.Index.TOKENIZED) I sometimes need to find

Re: modify existing non-indexed field

2006-07-10 Thread dan2000
Here is the simplified code that causes the problem (Lock obtain timed out). MyIndexer is used for indexing and searching. IndexTest starts 5 threads for indexing and 100 threads for searching. MyIndexer.java public class MyIndexer { File m_IndexFile; IndexReader m_IndexReader; Directory

Re: java.lang.OutOfMemoryError when search in large index

2006-07-10 Thread Chris Hostetter
These two threads may be educational... http://www.nabble.com/OutOfMemory-error-while-sorting-tf1785358.html#a4862471 http://www.nabble.com/MemoryUsage-of-sorting-tf1861157.html#a5083259 : Date: Sat, 8 Jul 2006 15:50:30 +0800 : From: yuexiang zhang <[EMAIL PROTECTED]> : Reply-To: java-user@luce

Re: FileDocument : cannot resolve symbol

2006-07-10 Thread Chris Hostetter
: when i try javac Package: org.apache.lucene.demo; : name is IndexFiles.java : : it show me : FileDocument ,error info : cannot resolve symbol : : : how can i compile it? The supported method for compiling Lucene is to use the ant build.xml file. If you take a look at the BUILD.txt file you'll