Thanks, this is a useful class for the future...
Koji Sekiguchi-2 wrote:
>
> John Seer wrote:
>> Hello,
>> Is there any way that a single document can have different analyzers
>> for different fields?
>>
>> I think one way of doing it is to create a custom analyzer which will do
>> field-specific a
On 4/10/2009 12:56 PM, Steven Bethard wrote:
> I need to have a scoring model of the form:
>
> s1(d, q)^a1 * s2(d, q)^a2 * ... * sN(d, q)^aN
>
> where "d" is a document, "q" is a query, "sK" is a scoring function, and
> "aK" is the exponential boost factor for that scoring function. As a
> si
John Seer wrote:
Hello,
Is there any way that a single document can have different analyzers
for different fields?
I think one way of doing it is to create a custom analyzer which will do
field-specific analysis...
Any other suggestions?
There is PerFieldAnalyzerWrapper
http://hudson.z
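A minimal sketch of how that wrapper is used (the field names and per-field analyzers below are just placeholders, assuming the 2.4-era API):

import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

public class PerFieldExample {
    public static void main(String[] args) throws Exception {
        // Default analyzer for all fields, with overrides for specific fields.
        PerFieldAnalyzerWrapper analyzer =
                new PerFieldAnalyzerWrapper(new StandardAnalyzer());
        analyzer.addAnalyzer("id", new KeywordAnalyzer());       // whole value as one token
        analyzer.addAnalyzer("tags", new WhitespaceAnalyzer());  // split on whitespace only
        // The wrapper is passed to the IndexWriter like any other analyzer.
        IndexWriter writer = new IndexWriter(FSDirectory.getDirectory("/tmp/index"),
                analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);
        writer.close();
    }
}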
Hello,
Is there any way that a single document can have different analyzers
for different fields?
I think one way of doing it is to create a custom analyzer which will do
field-specific analysis...
Any other suggestions?
Hello,
I have 3 terms and I want to match them in order, so I tried a wildcard
query.
Terms: A C F
Doc: name:A B C D E F
query: name:A*C*F
I am not getting any results back.
Any suggestions?
Thanks for your help in advance.
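One thing worth noting: a WildcardQuery only matches against a single indexed term, so "A*C*F" can only hit if the name value was indexed as one untokenized term rather than six separate tokens. A rough sketch of that indexing (the Field flags here are assumptions, not from the original post):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.WildcardQuery;

public class WildcardSketch {
    public static void main(String[] args) {
        Document doc = new Document();
        // NOT_ANALYZED keeps "A B C D E F" as a single term, which the
        // wildcard pattern can then match across the embedded spaces.
        doc.add(new Field("name", "A B C D E F",
                Field.Store.YES, Field.Index.NOT_ANALYZED));
        WildcardQuery q = new WildcardQuery(new Term("name", "A*C*F"));
    }
}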
On 4/10/2009 1:08 PM, Jack Stahl wrote:
> Perhaps you'd find it easier to implement the equivalent:
>
> log(s1(d, q))*a1 + ... + log(sN(d, q))*aN
Yes, that's fine too - that's actually what I'd be optimizing anyway.
But how would I do that? If I took the query boost route, how do I get a
TermQue
You got a lot of answers and questions about your index structure. Now
another idea, maybe this helps you to speed up your RangeFilter:
What type of range do you want to query? From your index statistics, it
looks like a numeric/date field from which you filter very large ranges. If
the values are
Perhaps you'd find it easier to implement the equivalent:
log(s1(d, q))*a1 + ... + log(sN(d, q))*aN
On Fri, Apr 10, 2009 at 12:56 PM, Steven Bethard wrote:
> I need to have a scoring model of the form:
>
> s1(d, q)^a1 * s2(d, q)^a2 * ... * sN(d, q)^aN
>
> where "d" is a document, "q" is a que
I need to have a scoring model of the form:
s1(d, q)^a1 * s2(d, q)^a2 * ... * sN(d, q)^aN
where "d" is a document, "q" is a query, "sK" is a scoring function, and
"aK" is the exponential boost factor for that scoring function. As a
simple example, I might have:
s1 = TF-IDF score matching
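If the multiplicative combination needs to be wired up directly, one hedged option (assuming the 2.4 contrib-queries layout) is CustomScoreQuery: the subquery below stands in for the TF-IDF component and a field-backed ValueSourceQuery for a second score; the field names and exponents are made up.

import org.apache.lucene.index.Term;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.function.CustomScoreQuery;
import org.apache.lucene.search.function.FloatFieldSource;
import org.apache.lucene.search.function.ValueSourceQuery;

public class PowerCombinedScore {
    public static Query build(final double a1, final double a2) {
        Query s1 = new TermQuery(new Term("body", "lucene"));  // s1: normal TF-IDF score
        ValueSourceQuery s2 =
                new ValueSourceQuery(new FloatFieldSource("popularity"));  // s2: per-doc value
        return new CustomScoreQuery(s1, s2) {
            public float customScore(int doc, float subQueryScore, float valSrcScore) {
                // s1(d, q)^a1 * s2(d, q)^a2
                return (float) (Math.pow(subQueryScore, a1) * Math.pow(valSrcScore, a2));
            }
        };
    }
}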
On Fri, Apr 10, 2009 at 3:06 PM, Mark Miller wrote:
> 24 segments is bound to be quite a bit slower than an optimized index for
> most things
I'd be curious just how true this really is (in general)... my guess
is the "long tail of tiny segments" gets into the OS's IO cache (as
long as the syste
On Fri, Apr 10, 2009 at 3:11 PM, Mark Miller wrote:
> Mark Miller wrote:
>>
>> Michael McCandless wrote:
>>>
>>> which is why I'm baffled that Raf didn't see a speedup on
>>> upgrading.
>>>
>>> Mike
>>>
>>
>> Another point is that he may not have such a nasty set of segments - Raf
>> says he has 2
On Fri, Apr 10, 2009 at 3:14 PM, Mark Miller wrote:
> Raf wrote:
>>
>> We have more or less 3M documents in 24 indexes and we read all of them
>> using a MultiReader.
>>
>
> Is this a multireader containing multireaders?
Let's hear Raf's answer, but I think likely "yes". But this shouldn't
be a
Raf wrote:
We have more or less 3M documents in 24 indexes and we read all of them
using a MultiReader.
Is this a multireader containing multireaders?
--
- Mark
http://www.lucidimagination.com
Mark Miller wrote:
Michael McCandless wrote:
which is why I'm baffled that Raf didn't see a speedup on
upgrading.
Mike
Another point is that he may not have such a nasty set of segments -
Raf says he has 24 indexes, which sounds like he may not have the
logarithmic sizing you normally see
Michael McCandless wrote:
which is why I'm baffled that Raf didn't see a speedup on
upgrading.
Mike
Another point is that he may not have such a nasty set of segments - Raf
says he has 24 indexes, which sounds like he may not have the
logarithmic sizing you normally see. If you have somewh
Michael McCandless wrote:
On Fri, Apr 10, 2009 at 2:32 PM, Mark Miller wrote:
I had thought we would also see the advantage with multi-term queries - you
rewrite against each segment and avoid extra seeks (though not nearly as
many as when enumerating every term). As Mike pointed out to me
On Fri, Apr 10, 2009 at 2:32 PM, Mark Miller wrote:
> I had thought we would also see the advantage with multi-term queries - you
> rewrite against each segment and avoid extra seeks (though not nearly as
> many as when enumerating every term). As Mike pointed out to me back when
> though : we st
When I did some profiling I saw that the slow down came from tons of
extra seeks (single segment vs multisegment). What was happening was,
the first couple segments would have thousands of terms for the field,
but as the segments logarithmically shrank in size, the number of terms
for the segme
On Fri, Apr 10, 2009 at 1:20 PM, Raf wrote:
> Hi Mike,
> thank you for your answer.
>
> I have downloaded lucene-core-2.9-dev and I have executed my tests (both on
> multireader and on consolidated index) using this new version, but the
> performance is very similar to the previous ones.
> The bi
On Fri, Apr 10, 2009 at 11:03 AM, Yonik Seeley
wrote:
> On Fri, Apr 10, 2009 at 10:48 AM, Michael McCandless
> wrote:
>> Unfortunately, in Lucene 2.4, any query that needs to enumerate Terms
>> (Prefix, Wildcard, Range, etc.) has poor performance on Multi*Readers.
>
> Do we know why this is, and
Hello,
I was working with Lucene Snowball 2.3.2 and I switched to 2.4.0.
After the switch I ran into a case where Lucene doesn't do lemmatization
correctly. So far I have found only one case: spa - spas. "spas" is not
getting lemmatized at all...
BTW, I saw the same behavior on Solr 1.3.
Anybody have any
Hi Mike,
thank you for your answer.
I have downloaded lucene-core-2.9-dev and I have executed my tests (both on
multireader and on consolidated index) using this new version, but the
performance is very similar to the previous ones.
The big index is 7-8 times faster than the multireader version.
Raf
Chris Hostetter wrote:
: The second stage index failed an optimization with a disk full exception
: (I had to move it to another lucene machine with a larger disk partition
: to complete the optimization. Is there a reason why a 22 day index would
: be 10x the size of an 8 day index when the do
On Fri, Apr 10, 2009 at 10:48 AM, Michael McCandless
wrote:
> Unfortunately, in Lucene 2.4, any query that needs to enumerate Terms
> (Prefix, Wildcard, Range, etc.) has poor performance on Multi*Readers.
Do we know why this is, and if it's fixable (the MultiTermEnum, not
the higher level query o
Unfortunately, in Lucene 2.4, any query that needs to enumerate Terms
(Prefix, Wildcard, Range, etc.) has poor performance on Multi*Readers.
I think the only workaround is to merge your indexes down to a single
index.
But, Lucene trunk (not yet released) has fixed this, so that searching
through
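A rough sketch of that 2.4 workaround, merging the separate indexes into one (the directory paths are placeholders):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MergeIndexes {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter(FSDirectory.getDirectory("/indexes/merged"),
                new StandardAnalyzer(), true, IndexWriter.MaxFieldLength.UNLIMITED);
        Directory[] parts = new Directory[24];
        for (int i = 0; i < parts.length; i++) {
            parts[i] = FSDirectory.getDirectory("/indexes/part" + i);
        }
        writer.addIndexesNoOptimize(parts);  // merge the existing indexes in
        writer.optimize();                   // optionally collapse to a single segment
        writer.close();
    }
}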
Hi,
we are experiencing some problems using RangeFilters and we think there are
some performance issues caused by MultiReader.
We have more or less 3M documents in 24 indexes and we read all of them
using a MultiReader.
If we do a search using only terms, there are no problems, but if we add
to
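A stripped-down sketch of the setup being described (the paths, field names and range bounds are invented for illustration):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.RangeFilter;
import org.apache.lucene.search.TermQuery;

public class MultiReaderRange {
    public static void main(String[] args) throws Exception {
        IndexReader[] readers = new IndexReader[24];
        for (int i = 0; i < readers.length; i++) {
            readers[i] = IndexReader.open("/indexes/part" + i);
        }
        IndexSearcher searcher = new IndexSearcher(new MultiReader(readers));
        // Term-only queries are fine; adding the RangeFilter forces a term
        // enumeration over the MultiReader, which is where 2.4 gets slow.
        RangeFilter filter = new RangeFilter("date", "20090101", "20091231", true, true);
        searcher.search(new TermQuery(new Term("body", "lucene")), filter, 10);
    }
}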
Thanks Otis,
Yes, we figured that out! Since we do not intend to migrate to 2.4 yet,
we used the syns2index source code from svn. The problem is now taken
care of.
This part is for all:
This brings us to the next question:
1. Is there some contrib code available for using hypernyms and such,
2009/4/10 Matthew Hall :
> I think I would tackle this in a slightly different manner.
>
> When you are creating this index, make sure that that field has a
> default value. Make sure this value is something that could never appear
> in the index otherwise. Then, when you go to place this field into
Hi
I have been playing around with the SpellChecker class and so far it looks
really good. While developing a testcase to show it working I came across a
couple of issues which I have resolved but I'm not certain if this is the
correct approach. I would therefore be grateful if anyone could tell
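For reference, a bare-bones use of the contrib SpellChecker (the index path and field name are assumptions):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.spell.LuceneDictionary;
import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.store.RAMDirectory;

public class SpellCheckerSketch {
    public static void main(String[] args) throws Exception {
        SpellChecker spell = new SpellChecker(new RAMDirectory());
        IndexReader reader = IndexReader.open("/path/to/index");
        // Build the spell index from the terms of an existing field.
        spell.indexDictionary(new LuceneDictionary(reader, "contents"));
        String[] suggestions = spell.suggestSimilar("lucine", 5);
    }
}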
I think I would tackle this in a slightly different manner.
When you are creating this index, make sure that that field has a
default value. Make sure this value is something that could never appear
in the index otherwise. Then, when you go to place this field into the
index, either write out your
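A hedged sketch of that idea at indexing time (the field name and the sentinel token are made up):

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class SentinelField {
    static Document build(String value) {  // value may be null
        Document doc = new Document();
        // Index a token that can never occur naturally when the real value
        // is missing, so "no value" documents stay queryable.
        doc.add(new Field("category",
                value != null ? value : "__EMPTY__",
                Field.Store.YES, Field.Index.NOT_ANALYZED));
        return doc;
    }
}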
Hi,
I found the API in another post on the net.
new Sort(new SortField(null, SortField.DOC, true))
The trick is to set the field to null.
Thanks for the help.
Preetham Kajekar wrote:
Hi Uwe,
Thanks for your response. However, I could not find the API in
SortField and Sort to achieve this.
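Passing that null-field SortField.DOC sort into a search would look roughly like this (the index path and query are placeholders):

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TopDocs;

public class ReverseDocOrder {
    public static void main(String[] args) throws Exception {
        IndexSearcher searcher = new IndexSearcher("/path/to/index");
        Sort reverseDocOrder = new Sort(new SortField(null, SortField.DOC, true));
        TopDocs hits = searcher.search(new MatchAllDocsQuery(), null, 10, reverseDocOrder);
    }
}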
Hi Uwe,
Thanks for your response. However, I could not find the API in
SortField and Sort to achieve this. SortField can be wrapped inside a
Sort, but you cannot specify that the order should be reversed.
Thx,
~preetham
Uwe Schindler wrote:
It should, do not use Sort.INDEX_ORDER, create a SortField wit
This (reversing a SortField.FIELD_DOC) should work... if it doesn't it's a bug.
SortField.FIELD_DOC and SortField.FIELD_SCORE are "first class"
SortField objects.
Mike
On Fri, Apr 10, 2009 at 5:31 AM, Uwe Schindler wrote:
> It should, do not use Sort.INDEX_ORDER, create a SortField with indexor
Actually it's perfectly fine for two threads to enter that code
fragment (you obtain a write lock to protect the code so that "there
can be only one").
Second off, even if you didn't have your write lock, the code should
still be safe in that no index corruption is possible. Multiple
threads may
It should; do not use Sort.INDEX_ORDER, create a SortField with the
index-order type and the reverse parameter; the SortField can be wrapped
inside a Sort instance and voila. I am not sure if it works, but it should. Same with
score.
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.theta
Hi,
I just realized it was a bug in my code.
On a related note, is it possible to sort based on reverse index order?
Thanks,
~preetham
Uwe Schindler wrote:
Hello Preetham,
never heard of this. What Lucene version do you use?
To check it out, try the search in a different way:
Combine the two ind
dir is a local variable inside a method, so it's not getting reused.
Should I synchronise the whole method? I think that would slow things
down in a concurrent environment.
Thanks for your response.
Chris Hostetter wrote:
: My code looks like this:
:
: Directory dir = null;
: try {
:di
Hello Preetham,
never heard of this. What Lucene version do you use?
To check it out, try the search in a different way:
Do not combine the two indexes into a MultiSearcher; instead, open an
IndexReader for each index and combine both readers into a MultiReader. This
MultiReader can be used like a conven
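Roughly, that suggestion looks like this (the paths, query and sort field are placeholders):

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TermQuery;

public class MultiReaderSort {
    public static void main(String[] args) throws Exception {
        IndexReader r1 = IndexReader.open("/path/to/index1");
        IndexReader r2 = IndexReader.open("/path/to/index2");
        IndexSearcher searcher =
                new IndexSearcher(new MultiReader(new IndexReader[] { r1, r2 }));
        Sort sort = new Sort(new SortField("myField", SortField.STRING));
        searcher.search(new TermQuery(new Term("body", "lucene")), null, 10, sort);
    }
}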
Hi,
I am using a MultiSearcher to search 2 indexes. As part of my query, I
am sorting the results based on a field (which is NOT_ANALYZED).
However, I seem to be getting hits only from one of the indexes. If I
change to Sort.INDEX_ORDER, I seem to be getting results from both. Is
this a known p