karl wettin wrote:
On 27 Apr 2007, at 14:11, Erik Hatcher wrote:
On Apr 27, 2007, at 6:39 AM, karl wettin wrote:
On 27 Apr 2007, at 12:36, Erik Hatcher wrote:
Unless someone has some other tricks I'm not aware of, that is.
I guess it would be possible to add start/stop-tokens such as ^ and
$ to
Just in case norms info cannot be spared, note that since Lucene 2.1 norms
are maintained in a single file, no matter how many fields there are.
However, due to a bug in 2.1, this did not prevent the too-many-open-files
problem. That bug has already been fixed but is not yet released. For more details
on th
Erick,
Thanks for your explanation. I was thinking of using a HitCollector. The search
interface we are facing now is actually pretty simple. One of the searches
requires a maximum of 500 results with a page size of 500 (basically
return the first 500). The second one requires a max of 250 and a page size of 25
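For a fixed cap like "first 500 only", the TopDocs API avoids paging through a Hits object entirely. A minimal sketch against the Lucene API of that era (the index path, field name, and query term here are made up for illustration):

```java
// Open a searcher over an existing index (path is illustrative).
IndexSearcher searcher = new IndexSearcher("/path/to/index");
Query query = new TermQuery(new Term("contents", "lucene"));

// search(Query, Filter, int) collects only the top n documents,
// so the 500-result cap is enforced inside the search itself.
TopDocs topDocs = searcher.search(query, null, 500);
for (int i = 0; i < topDocs.scoreDocs.length; i++) {
    Document doc = searcher.doc(topDocs.scoreDocs[i].doc);
    // render one row of the result page here
}
searcher.close();
```

For the 250/25 case, the same call with n = 250 collects everything the interface can ever show, and paging is then just slicing scoreDocs.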
On 27 Apr 2007, at 14:11, Erik Hatcher wrote:
On Apr 27, 2007, at 6:39 AM, karl wettin wrote:
On 27 Apr 2007, at 12:36, Erik Hatcher wrote:
Unless someone has some other tricks I'm not aware of, that is.
I guess it would be possible to add start/stop-tokens such as ^
and $ to the indexed text:
: From what I read in the Lucene docs, these .f files store the
: normalization factor for the corresponding field. What exactly is this
: used for and more importantly, can this be disabled so that the files
: are not created in the first place?
field norms are primarily used for length normali
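If norms really can be spared for a field, they can be switched off at indexing time so no norm byte is written for it. A hedged sketch against the 2.x-era Field API (note the caveat in the comment; field name and value are made up):

```java
// Caveat: in this era Field.Index.NO_NORMS also means the value is
// indexed as a single, untokenized token. For analyzed fields without
// norms, later 2.x releases added a setOmitNorms(true) on the field.
Document doc = new Document();
doc.add(new Field("category", "sports",
                  Field.Store.YES, Field.Index.NO_NORMS));
```

Disabling norms trades away index-time boosts and length normalization for that field, so it only makes sense where those don't affect ranking.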
: > Wow, I did not know Lucene 2.1 can do all of this. The problem is that I'm
: > currently using 2.0. Is there something similar to what you just mentioned
: > in dealing with 2.0 indexes--backing up piecewise? Thanks again.
:
: Hmm, OK. Pre-2.1 Lucene will overwrite at least the file "segmen
: I have a question about filter caching. I have a lot of QueryFilters
: that I use when searching that filter on a single field. Sometimes
: alone I use them by themselves, but mostly I use them in some
: combination using ChainedFilter. Does the caching take advantage of
: only the final filte
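One way to make each sub-filter's cache independent of how it is later combined is to wrap each one in a CachingWrapperFilter before chaining. A sketch, assuming the contrib ChainedFilter and made-up field names:

```java
// Each wrapped filter caches its BitSet per IndexReader, so the
// cached bits are reused even when the filters are combined
// differently from query to query.
Filter status = new CachingWrapperFilter(
    new QueryFilter(new TermQuery(new Term("status", "active"))));
Filter region = new CachingWrapperFilter(
    new QueryFilter(new TermQuery(new Term("region", "emea"))));

// The ChainedFilter itself is cheap; the expensive per-filter
// bit computation is what the wrappers cache.
Filter combined = new ChainedFilter(
    new Filter[] { status, region }, ChainedFilter.AND);
```

Worth checking for your version: QueryFilter in this era may already cache per reader on its own, in which case the extra wrapper is redundant but harmless.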
: In order to do this, we tried subclassing the SnowballAnalyzer... it
: doesn't work yet, though. Here is the code of our custom class:
At first glance, what you've got seems fine. Can you elaborate on what you
mean by "it doesn't work"?
Perhaps the issue is that the SnowballStemmer can't hand
: If a BooleanQuery is created as the addition of two TermQuery
...
: The score for this BooleanQuery is doubled (around 6) when the compared
: document has the field "pets" with both of these values, but we want the
: score to be only 3, even though there is more than one match.
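If the goal is "score of the best matching clause, not the sum", DisjunctionMaxQuery does exactly that. A sketch with made-up terms for the "pets" field:

```java
// A document's score is the maximum of the matching clauses
// (plus tieBreaker * the others). With a tieBreaker of 0.0f,
// matching both "dog" and "cat" scores the same as matching one.
DisjunctionMaxQuery dmq = new DisjunctionMaxQuery(0.0f);
dmq.add(new TermQuery(new Term("pets", "dog")));
dmq.add(new TermQuery(new Term("pets", "cat")));
```

A small nonzero tieBreaker (e.g. 0.1f) still prefers multi-value matches slightly, if that's ever wanted later.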
Hello,
What would be the best strategy to support an index with thousands or even
hundreds of thousands of individual field names?
I have client applications that create a lot of key/value type data. I use the
key as the document field name, so I end up with _a lot_ of .f files and
eventually my
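One common way to avoid one Lucene field per client key (and, before 2.1, one norms file per field) is to fold the key into the term of a single shared field. A hedged sketch; the field name "kv" and the "key=value" encoding are just one possible convention:

```java
// All key/value pairs share one field; the key lives inside the term.
Document doc = new Document();
doc.add(new Field("kv", "color=blue",
                  Field.Store.NO, Field.Index.UN_TOKENIZED));
doc.add(new Field("kv", "size=large",
                  Field.Store.NO, Field.Index.UN_TOKENIZED));

// Looking up a key/value pair is then a single TermQuery:
Query q = new TermQuery(new Term("kv", "color=blue"));
```

The trade-off is that per-field features (per-key boosts, sorting on a key, range queries over a key's values without a prefix scan) become harder, so this fits best when lookups are exact-match.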
"larry hughes" <[EMAIL PROTECTED]> wrote:
> Wow, I did not know Lucene 2.1 can do all of this. The problem is that I'm
> currently using 2.0. Is there something similar to what you just mentioned
> in dealing with 2.0 indexes--backing up piecewise? Thanks again.
Hmm, OK. Pre-2.1 Lucene will
Never mind. The sorting was working correctly. I was just misinterpreting
the results I was seeing.
-Theo
Theodan wrote:
>
> Hello.
>
> I am trying to sort my query results on a String field called "AssetType"
> and then on the relevancy score, but I need a particular ordering of the
> po
Thanks Mike,
Wow, I did not know Lucene 2.1 can do all of this. The problem is that I'm
currently using 2.0. Is there something similar to what you just mentioned
in dealing with 2.0 indexes--backing up piecewise? Thanks again.
LH
Michael McCandless-3 wrote:
>
>
>
> "larry hughes" <[EMAI
"larry hughes" <[EMAIL PROTECTED]> wrote:
> I'm pondering on long term maintenance issues with Lucene indexes
> and would like to know of anyone's suggestions or recommendations to
> backing up these indexes. My goal is to have a weekly, or even
> daily, snapshot of the current index to make su
<4> is also easy
From the javadoc:
"*Caution:* Iterate only over the hits needed. Iterating over all hits is
generally not desirable and may be the source of performance issues."
So an iterator should be fine for all documents, even those > 100. But do be
aware that the entire query gets r
I'm pondering on long term maintenance issues with Lucene indexes and would
like to know of anyone's suggestions or recommendations to backing up these
indexes. My goal is to have a weekly, or even daily, snapshot of the
current index to make sure it is recoverable if the index gets corrupted. I
Regarding your first question (the easy one), there is
some information here:
http://www.gossamer-threads.com/lists/lucene/java-user/44312
--- Tony Qian <[EMAIL PROTECTED]> wrote:
> All,
>
> After playing around with Lucene, we decided to
> replace old full-text search
> engine with Lucene.
All,
After playing around with Lucene, we decided to replace our old full-text
search engine with Lucene. I got "Lucene in Action" a week ago and have
finished reading most of the book. I have several questions.
1) Since the book was written two years ago and Lucene has made a lot of
changes, is there
On Apr 27, 2007, at 6:39 AM, karl wettin wrote:
On 27 Apr 2007, at 12:36, Erik Hatcher wrote:
Unless someone has some other tricks I'm not aware of, that is.
I guess it would be possible to add start/stop-tokens such as ^ and
$ to the indexed text: "^ the $" and place a phrase query with 0 slo
On 27 Apr 2007, at 12:36, Erik Hatcher wrote:
Unless someone has some other tricks I'm not aware of, that is.
I guess it would be possible to add start/stop-tokens such as ^ and $
to the indexed text: "^ the $" and place a phrase query with 0 slop.
But that might screw up SpanFirstQuery, etc.?
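The boundary-token idea above can be sketched as a slop-0 PhraseQuery, assuming the analyzer is set up so that "^" and "$" survive as tokens around every indexed value (that analyzer setup is the part not shown here, and the field name is made up):

```java
// With values indexed as "^ <tokens...> $", a document whose field
// contains only "the" is the exact phrase ^ the $.
PhraseQuery pq = new PhraseQuery();
pq.add(new Term("title", "^"));
pq.add(new Term("title", "the"));
pq.add(new Term("title", "$"));
pq.setSlop(0); // exact adjacency; 0 is also the default
```

The concern about span queries is real: the boundary tokens occupy positions 0 and n+1, which shifts what "first position" means for anything position-sensitive.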
On Apr 27, 2007, at 6:08 AM, karl wettin wrote:
On 27 Apr 2007, at 08:21, Kun Hong wrote:
I just want that one document which
contains no other words than "the". Is it possible using Lucene
query?
Take a look at SpanFirstQuery. Perhaps you would need to implement a
SpanLastQuery too.
Perhaps
On 27 Apr 2007, at 08:21, Kun Hong wrote:
I just want that one document which
contains no other words than "the". Is it possible using Lucene query?
Take a look at SpanFirstQuery. Perhaps you would need to implement a
SpanLastQuery too.
Perhaps the easiest way to go about it would be a RegexQuery that
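For reference, the SpanFirstQuery half of the suggestion is straightforward; the sketch below (field name made up) matches documents whose very first token is "the". There is no SpanLastQuery in core, which is why the end of the field needs a different trick:

```java
// SpanFirstQuery(match, end) matches spans that end within the
// first `end` positions, so end = 1 means "the first token".
SpanTermQuery the = new SpanTermQuery(new Term("contents", "the"));
SpanFirstQuery firstIsThe = new SpanFirstQuery(the, 1);
```

Combined with something that constrains the last position (a stored length, an end marker token, or a custom span query), this pins down a field that contains only "the".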