Mike,
Do you really want to tokenize your emails? StandardAnalyzer may in fact
recognize email addresses and leave them as one token, but it would probably be
better practice to make that email field UN_TOKENIZED.
Most of the time when people have trouble finding a Document they _know_ is in
I am new to Lucene, but the behavior that I am seeing does not seem to
make sense to me. I am using the latest version of Lucene (1.9.1) and
executing the following code below which creates an index with a
single document and only one field (named "test") with a value of
"[EMAIL PROTECTED]".
If
Hi,
first, sorry if this is a stupid question... :-)
I have 3 separate indexes and I use a ParallelMultiSearcher to search them. Now I
would like to limit the number of hits found; for example, I would like to
get the first 10 hits from each index.
How can i do this? Any suggestions?
Thanks
: Damn. That's no good, then. What about doing it the opposite way:
: make a QueryFilter for each category (these could be cached between
: search sessions), and use those to filter the results from searching
: for the user's query? Would that actually be any faster than the
: original idea of con
On May 9, 2006, at 2:08 PM, Chris Hostetter wrote:
: redundant work. My next idea was to create a QueryFilter from the
: user's query, and run a search for each category with this filter and
: a term query. Since the QueryFilter is supposed to cache results,
: this should theoretically be m
: redundant work. My next idea was to create a QueryFilter from the
: user's query, and run a search for each category with this filter and
: a term query. Since the QueryFilter is supposed to cache results,
: this should theoretically be more efficient. So my questions to the
if you did an appro
On Tue, 2006-05-09 at 13:46 -0400, Mike Baranczak wrote:
> The documents in my index will contain a "category" field. (We can
> assume that the number of possible categories will be small - 10 or
> so max - and that they'll be known in advance.) I need to be able to
> present the search resul
On Tue, 2006-05-09 at 13:53 -0400, varun sood wrote:
> Hi,
> I have "Doc. Id" of the document stored in the database. Now I want to
> query database on that "Doc. Id" (which will always return one document).
> How can I do this?
Are you aware that the document number created by Lucene is conside
Hi,
I have "Doc. Id" of the document stored in the database. Now I want to
query database on that "Doc. Id" (which will always return one document).
How can I do this?
To avoid confusion, I am talking about the "Doc. Id" which Lucene
automatically creates for every document and hence is unique fo
The documents in my index will contain a "category" field. (We can
assume that the number of possible categories will be small - 10 or
so max - and that they'll be known in advance.) I need to be able to
present the search results to the end user like this:
- top 10 results in category "x":
public int getRAMSize(RAMDirectory ramDir) throws IOException
{
    String[] segs = ramDir.list();
    int totalSize = 0;
    for(int i=0;i
Hi everybody:
How do I know the memory size of my RAMDirectory?
I need to control the memory size of my RAMDirectory so that I can serialize the
index to disk when the RAM directory reaches 100 MB.
I have a distributed environment
I really need to find a way, I must control the size of the inde
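
One possible way to approach the question above is to sum the lengths of the files the directory reports and compare the total with a threshold. The sketch below is only an illustration: FileStore is a hypothetical stand-in for the list() and fileLength(String) methods of a Lucene Directory, RamSizeCheck and the 100 MB constant are made-up names, and the sum of file lengths only approximates the memory actually held.

```java
// Hypothetical stand-in for the two Directory methods used here (list and
// fileLength); it is not a Lucene type. A RAMDirectory could be wrapped in it.
interface FileStore {
    String[] list();
    long fileLength(String name);
}

final class RamSizeCheck {
    // Assumed threshold taken from the question: flush at 100 MB.
    static final long FLUSH_THRESHOLD_BYTES = 100L * 1024 * 1024;

    private RamSizeCheck() {}

    /** Sums the reported length of every file in the store, in bytes. */
    static long totalBytes(FileStore store) {
        long total = 0; // long, so large indexes do not overflow an int
        for (String name : store.list()) {
            total += store.fileLength(name);
        }
        return total;
    }

    /** True once the summed file lengths reach the threshold. */
    static boolean shouldFlush(FileStore store) {
        return totalBytes(store) >= FLUSH_THRESHOLD_BYTES;
    }
}
```

The check could be called after each batch of added documents, writing the index to disk whenever shouldFlush returns true.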
"Kinnar Kumar Sen, Noida" <[EMAIL PROTECTED]> wrote on 09/05/2006 12:57:16
PM:
> When I try a range query on my index, it works fine for a small
> index, but when the index is large, such as 0 - 100, it throws an
> exception
>
> Boolean Clause Exception I have set the 1024 value in boole
[EMAIL PROTECTED] wrote:
Grant,
considering the answer from Karl, it seems that we have the choice to put all
the documents in one index or use an index for each language. You are using
an index for each language. We are currently discussing the pros and cons
for both solutions. Thus we would be
Typically the 3 most important things to remember when
using numerical range queries are:
1) Use a filter instead.
2) Use a filter instead.
3) Use a filter instead.
Seriously, number rangeQueries are normally a bad idea
because:
a) they can produce "too many term" errors (your
current problem)
b
On Tue, 2006-05-09 at 10:18 +0200, [EMAIL PROTECTED] wrote:
>
> considering the answer from Karl, it seems that we have the choice to
> put all the documents in one index or use an index for each language.
> You are using an index for each language. We are currently discussing
> the pros and cons f
Hi,
You can use BooleanQuery.setMaxClauseCount() to set your required
maximum clause count.
But of course, too large a clause count is not advisable.
Maybe you need to find some strategy to reduce this range. E.g., reduce date
values to day precision like 20060505 instead of including a timestamp like 20060505120530 :)
All t
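
The date suggestion above could be sketched as follows. A range query is expanded into one clause per distinct term in the range, so indexing dates at day precision leaves far fewer terms to expand. DayGranularity and its methods are hypothetical helpers for illustration, not part of Lucene, and they assume the field values are yyyyMMddHHmmss strings.

```java
// Hypothetical helper, not a Lucene class: trims yyyyMMddHHmmss timestamps to
// day precision (yyyyMMdd) before they are indexed.
final class DayGranularity {
    private DayGranularity() {}

    /** Returns the first 8 characters (yyyyMMdd) of the timestamp. */
    static String toDay(String timestamp) {
        if (timestamp == null || timestamp.length() < 8) {
            throw new IllegalArgumentException("expected at least yyyyMMdd: " + timestamp);
        }
        return timestamp.substring(0, 8);
    }

    /** Counts how many distinct day-level terms the timestamps collapse into. */
    static int distinctDays(String[] timestamps) {
        java.util.Set<String> days = new java.util.HashSet<String>();
        for (String t : timestamps) {
            days.add(toDay(t));
        }
        return days.size();
    }
}
```

Three timestamps spread over two days, for example, collapse into two terms instead of three.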
Hi
When I try a range query on my index, it works fine for a small
index, but when the index is large, such as 0 - 100, it throws a
Boolean Clause exception. I have changed the 1024 value in boolean to
integer.max, but now it gives an out of memory exception. Can somebody
sugg
You are free to take a look at the thread about synonym queries from March,
initiated by Andrew Schetinin and myself. This code (suggestion) tries
to handle synonyms as query expansion, rather than injection at
indexing time, while fixing the problems a simple expansion creates (mainly
results of IDF).
Karl,
no, you didn't misunderstand. We have to admit that we were not aware of the
possibility to use different analyzers for the documents in an index. It
seems that we were working too close to the examples and did not spend enough
time to RTFM. Thank you for the hint!
Grant,
considering the an
On Tuesday 09 May 2006 01:39, Otis Gospodnetic wrote:
> Ah, this is pretty disheartening. Regardless, I'm about to dive into this,
> so if you have any tips or experiences to share, I'm all eyeballs.
>
> Otis
>
> - Original Message
> From: Ken Krugler <[EMAIL PROTECTED]>
> To: java-use