Re: Lucene query question

2006-05-09 Thread Otis Gospodnetic
Mike, Do you really want to tokenize your emails? StandardAnalyzer may in fact recognize email addresses and leave them as one token, but it would probably be better practice to make that email field UN_TOKENIZED. Most of the time when people have trouble finding a Document they _know_ is in

Lucene query question

2006-05-09 Thread Mike Richmond
I am new to Lucene, but the behavior that I am seeing does not make sense to me. I am using the latest version of Lucene (1.9.1) and executing the code below, which creates an index with a single document and only one field (named "test") with a value of "[EMAIL PROTECTED]". If

Parallel

2006-05-09 Thread [EMAIL PROTECTED]
Hi, first, sorry if this is a stupid question... :-) I have 3 separate indexes and I use a ParallelMultiSearcher to search them. Now I would like to limit the number of hits found; for example, I would like to get the first 10 hits from each index. How can I do this? Any suggestions? Thanks

Re: grouping search results

2006-05-09 Thread Chris Hostetter
: Damn. That's no good, then. What about doing it the opposite way: : make a QueryFilter for each category (these could be cached between : search sessions), and use those to filter the results from searching : for the user's query? Would that actually be any faster than the : original idea of con

Re: grouping search results

2006-05-09 Thread Mike Baranczak
On May 9, 2006, at 2:08 PM, Chris Hostetter wrote: : redundant work. My next idea was to create a QueryFilter from the : user's query, and run a search for each category with this filter and : a term query. Since the QueryFilter is supposed to cache results, : this should theoretically be m

Re: grouping search results

2006-05-09 Thread Chris Hostetter
: redundant work. My next idea was to create a QueryFilter from the : user's query, and run a search for each category with this filter and : a term query. Since the QueryFilter is supposed to cache results, : this should theoretically be more efficient. So my questions to the if you did an appro

Re: grouping search results

2006-05-09 Thread karl wettin
On Tue, 2006-05-09 at 13:46 -0400, Mike Baranczak wrote: > The documents in my index will contain a "category" field. (We can > assume that the number of possible categories will be small - 10 or > so max - and that they'll be known in advance.) I need to be able to > present the search resul

Re: Retrieving field or Document using document id.

2006-05-09 Thread karl wettin
On Tue, 2006-05-09 at 13:53 -0400, varun sood wrote: > Hi, > I have "Doc. Id" of the document stored in the database. Now I want to > query database on that "Doc. Id" (which will always return one document). > How can I do this? Are you aware that the document number created by Lucene is conside

Retrieving field or Document using document id.

2006-05-09 Thread varun sood
Hi, I have "Doc. Id" of the document stored in the database. Now I want to query database on that "Doc. Id" (which will always return one document). How can I do this? To avoid confusion, I am talking about the "Doc. Id" which Lucene automatically creates for every document and hence is unique fo

grouping search results

2006-05-09 Thread Mike Baranczak
The documents in my index will contain a "category" field. (We can assume that the number of possible categories will be small - 10 or so max - and that they'll be known in advance.) I need to be able to present the search results to the end user like this: - top 10 results in category "x":
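The presentation asked for above (the best few results under each category) can be sketched as a single pass over one ranked result list. This is only an illustration under stated assumptions: `ScoredHit` and `CategoryGrouper` are hypothetical names, not Lucene classes, and the input is assumed to be already sorted by descending score. The QueryFilter-based ideas discussed in the replies are a different approach and are not shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical hit record for illustration; not a Lucene class. */
final class ScoredHit {
    final String docKey;
    final String category;
    final float score;

    ScoredHit(String docKey, String category, float score) {
        this.docKey = docKey;
        this.category = category;
        this.score = score;
    }
}

final class CategoryGrouper {
    /**
     * Buckets hits by category, keeping at most perCategory hits per bucket.
     * Assumes rankedHits is sorted by descending score, so the first hits
     * seen for a category are its best ones.
     */
    static Map<String, List<ScoredHit>> topPerCategory(List<ScoredHit> rankedHits, int perCategory) {
        // LinkedHashMap keeps categories in order of first appearance.
        Map<String, List<ScoredHit>> groups = new LinkedHashMap<>();
        for (ScoredHit hit : rankedHits) {
            List<ScoredHit> bucket = groups.computeIfAbsent(hit.category, k -> new ArrayList<>());
            if (bucket.size() < perCategory) {
                bucket.add(hit);
            }
        }
        return groups;
    }
}
```

With about 10 known categories, the pass could stop early once every bucket is full; the sketch omits that to stay short.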

Re: How do I know the memory size of my RAMDirectory ?

2006-05-09 Thread mark harwood
public int getRAMSize(RAMDirectory ramDir) throws IOException { String []segs=ramDir.list(); int totalSize=0; for(int i=0;i
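The method above is cut off in this listing, so what follows is not a reconstruction of it. It is a minimal sketch of the general idea it starts with: list the files in a directory and add up their lengths, then compare the total against a limit such as the 100 MB mentioned in the question. `FileStore` is a hypothetical stand-in for the two directory operations involved (`list()` and a per-file length lookup), and `MapFileStore` is a toy implementation that exists only so the sketch runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the two directory operations the sketch needs. */
interface FileStore {
    String[] list();

    long fileLength(String name);
}

/** Toy in-memory store used only to exercise the sketch. */
final class MapFileStore implements FileStore {
    private final Map<String, Long> lengths = new LinkedHashMap<>();

    void put(String name, long length) {
        lengths.put(name, length);
    }

    @Override
    public String[] list() {
        return lengths.keySet().toArray(new String[0]);
    }

    @Override
    public long fileLength(String name) {
        return lengths.get(name);
    }
}

final class DirectorySize {
    /** Sums the length of every listed file; long avoids int overflow on large totals. */
    static long totalBytes(FileStore store) {
        long total = 0;
        for (String name : store.list()) {
            total += store.fileLength(name);
        }
        return total;
    }

    /** True once the summed size has reached the given byte limit. */
    static boolean reachedLimit(FileStore store, long limitBytes) {
        return totalBytes(store) >= limitBytes;
    }
}
```

Recomputing the sum on every check is linear in the number of files, which is usually small for an index; a caller that checks very often might cache the total instead.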

How do I know the memory size of my RAMDirectory ?

2006-05-09 Thread Ariel Isaac Romero
Hi everybody: How do I know the memory size of my RAMDirectory? I need to control the memory size of my RAM directory so that I can serialize the index to disk when the RAM directory reaches 100 MB. I have a distributed environment and I really need to find a way; I must control the size of the inde

re :Range queries

2006-05-09 Thread Nadav Har'El
"Kinnar Kumar Sen, Noida" <[EMAIL PROTECTED]> wrote on 09/05/2006 12:57:16 PM: > When I am trying RANGE QUERY on my index it works fine for a small > index but when the index is large such as 0 - 100 it gives an > exception > > Boolean Clause Exception I have set the 1024 value in boole

Re: Subject indexing and seraching documents with multiple languages

2006-05-09 Thread Grant Ingersoll
[EMAIL PROTECTED] wrote: Grant, considering the answer from Karl, it seems that we have the choice to put all the documents in one index or use an index for each language. You are using an index for each language. We are currently discussing the pros and cons for both solutions. Thus we would be

RE: Range queries

2006-05-09 Thread mark harwood
Typically the 3 most important things to remember when using numerical range queries are: 1) Use a filter instead. 2) Use a filter instead. 3) Use a filter instead. Seriously, number rangeQueries are normally a bad idea because: a) they can produce "too many term" errors (your current problem) b

Re: Subject indexing and seraching documents with multiple languages

2006-05-09 Thread karl wettin
On Tue, 2006-05-09 at 10:18 +0200, [EMAIL PROTECTED] wrote: > > considering the answer from Karl, it seems that we have the choice to > put all the documents in one index or use an index for each language. > You are using an index for each language. We are currently discussing > the pros and cons f

RE: Range queries

2006-05-09 Thread Ramana Jelda
Hi, You can use BooleanQuery.setMaxClauseCount() to set your required maximum clause count. But of course too large a clause count is not advisable. Maybe you need to find some strategy to reduce this range. Ex: reduce dates to 20060505 instead of including the timestamp, like 20060505120530 :) All t
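The granularity suggestion above can be illustrated with a small sketch. A range query is limited by the number of distinct terms that fall inside the range, so indexing `20060505` instead of `20060505120530` collapses many timestamp terms into one day term. The class and method names below are hypothetical, and the term counting is a plain string comparison written for illustration, not a call into Lucene.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class DateGranularity {
    /** Truncates a yyyyMMddHHmmss timestamp string to its yyyyMMdd day prefix. */
    static String toDay(String timestamp) {
        if (timestamp.length() < 8) {
            throw new IllegalArgumentException("expected at least yyyyMMdd: " + timestamp);
        }
        return timestamp.substring(0, 8);
    }

    /** Counts distinct terms within [lower, upper], compared as strings. */
    static int distinctTermsInRange(List<String> terms, String lower, String upper) {
        Set<String> inRange = new TreeSet<>();
        for (String term : terms) {
            if (term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0) {
                inRange.add(term);
            }
        }
        return inRange.size();
    }
}
```

String comparison only matches date order here because every value has the same fixed-width, zero-padded layout; values of mixed width would need padding first.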

re :Range queries

2006-05-09 Thread Kinnar Kumar Sen, Noida
Hi, when I try a RANGE QUERY on my index, it works fine for a small index, but when the index is large, such as 0 - 100, it gives an exception: Boolean Clause Exception. I have changed the 1024 value in boolean to Integer.MAX_VALUE, but now it gives an out of memory exception. Can somebody sugg

RE: Synonyms ...

2006-05-09 Thread Ziv Gome
You are free to take a look at the thread about synonym query from March, initiated by Andrew Schetinin and myself. This code (suggestion) tries to handle synonyms as a query expansion, rather than injection at indexing time, while fixing the problems a simple expansion creates (mainly results of IDF).

Re: Subject indexing and seraching documents with multiple languages

2006-05-09 Thread pbatcoi
Karl, no, you didn't misunderstand. We have to admit that we were not aware of the possibility to use different analyzers for the documents in an index. It seems that we were working too close to the examples and did not spend enough time to RTFM. Thank you for the hint! Grant, considering the an

Re: Scoring without floating point calculations

2006-05-09 Thread Paul Elschot
On Tuesday 09 May 2006 01:39, Otis Gospodnetic wrote: > Ah, this is pretty disheartening. Regardless, I'm about to dive into this, > so if you have any tips or experiences to share, I'm all eyeballs. > > Otis > > - Original Message > From: Ken Krugler <[EMAIL PROTECTED]> > To: java-use