Anyone using a Lucene nightly build dated later than 11/20 will want
to upgrade to the next (future) nightly build that will be dated 12/21
http://issues.apache.org/jira/browse/LUCENE-754
Keep in mind that nightly builds are developer builds and not always
stable (though we try our best) :-)
-Y
I appreciate your help Hoss. That has cleared up some things for me. The
problem remains that I would like to be able to switch between the hits
per doc Similarity and the default Similarity on any given search. I
was hoping that I could index with DefaultSimilarity and store the norms
for nor
> Take a look at TermDocs and TermEnum.
I need to get the frequency of each word in each of the documents I have
indexed.
This is what I could do with TermEnums and TermDocs. For each Term from
TermEnum, I have instantiated a TermDocs and for each doc, I am trying to
get the frequency of the Ter
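The traversal described in this message (walk every term, then every document containing it, and read the frequency) can be sketched without Lucene at all. What follows is a minimal model in plain Java, not Lucene's API: `TermFrequencySketch` and `termFrequencies` are hypothetical names, and lowercasing plus whitespace tokenization is an assumption. In Lucene itself the corresponding pieces would be `IndexReader.terms()` for the `TermEnum`, `IndexReader.termDocs(Term)` for the `TermDocs`, and `TermDocs.doc()` / `TermDocs.freq()` inside the inner loop.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TermFrequencySketch {
    // Builds term -> (docId -> frequency), the same shape a term-major
    // TermEnum/TermDocs traversal would visit. Tokenization is an assumption.
    static Map<String, Map<Integer, Integer>> termFrequencies(List<String> docs) {
        Map<String, Map<Integer, Integer>> postings = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String token : docs.get(docId).toLowerCase().split("\\s+")) {
                if (token.isEmpty()) {
                    continue; // skip empty tokens from leading whitespace
                }
                postings.computeIfAbsent(token, t -> new TreeMap<>())
                        .merge(docId, 1, Integer::sum);
            }
        }
        return postings;
    }

    public static void main(String[] args) {
        List<String> docs = List.of("lucene index lucene", "index search");
        // Outer loop plays the role of TermEnum, inner loop the role of TermDocs.
        termFrequencies(docs).forEach((term, perDoc) ->
            perDoc.forEach((docId, freq) ->
                System.out.println(term + " doc=" + docId + " freq=" + freq)));
    }
}
```

The term-major order matters: iterating terms first and documents second matches how the postings are laid out, whereas asking "what are the terms of document N" is a different access pattern.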
On Tuesday 19 December 2006 23:05, Scott Sellman wrote:
> new
> BooleanClause.Occur[]{BooleanClause.Occur.SHOULD,
> BooleanClause.Occur.SHOULD}
Why do you explicitly specify these operators?
> q.add(keywordQuery, BooleanClause.Occur.MUST); //true, false);
You seem to wra
: Foolish me...override a static method...silly silly. Still, I think
: there must be some way. I don't care about the field
: normalization...there must be some way to make it return a constant 1
: when using a new Similarity class.
as discussed: norms are a value explicitly stored in your index
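To make the point about stored norms concrete, here is a toy model in plain Java. It is not Lucene code: `NormSketch`, `indexNorm` and `score` are hypothetical names, and the formulas are simplified assumptions loosely shaped like a length norm of 1/sqrt(numTerms) and a tf of sqrt(freq). The norm is computed once at index time and saved; the search-time scorer only reads the saved value, so switching the term-frequency factor per search does not touch it.

```java
public class NormSketch {
    // Index time: a length norm is derived from the field length and stored.
    static float indexNorm(int numTermsInField) {
        return (float) (1.0 / Math.sqrt(numTermsInField));
    }

    // Search time: combine a term-frequency factor with the STORED norm.
    // rawCounts = true stands in for a "hits per doc" style of scoring.
    static float score(int termFreq, float storedNorm, boolean rawCounts) {
        float tf = rawCounts ? termFreq : (float) Math.sqrt(termFreq);
        return tf * storedNorm;
    }

    public static void main(String[] args) {
        float stored = indexNorm(4); // written once, at index time
        System.out.println(score(4, stored, false)); // damped tf
        System.out.println(score(4, stored, true));  // raw tf, same stored norm
    }
}
```

In this model the tf factor can be alternated freely between searches, but getting a different norm means running `indexNorm` again, which corresponds to re-indexing.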
I am not sure if this is a problem with Lucene or if I am building my
Query object improperly. It seems to me, when performing a search that
should exclude certain terms, MultiFieldQueryParser doesn't filter out
documents when it should. Consider the following example to clarify
what I am talking
On 12/19/06, John Song <[EMAIL PROTECTED]> wrote:
How to define default fields? Is it done during index time or during search
time? Strangely, I can't find out any information on how default fields are
defined?
"default" field is simply a QueryParser concept (see its
constructors). It doe
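As a rough illustration of "default field" being a search-time parsing concept, here is a toy resolver in plain Java. It is not how QueryParser is implemented; `DefaultFieldSketch` and `resolve` are hypothetical names, and the `field:text` splitting is a deliberate simplification. The idea is only that a clause without an explicit field prefix is assigned the field handed to the parser.

```java
public class DefaultFieldSketch {
    // Returns {field, text}. A clause written as "field:text" keeps its field;
    // anything else falls back to the default field supplied by the caller.
    static String[] resolve(String clause, String defaultField) {
        int colon = clause.indexOf(':');
        if (colon > 0) {
            return new String[] { clause.substring(0, colon), clause.substring(colon + 1) };
        }
        return new String[] { defaultField, clause };
    }

    public static void main(String[] args) {
        String[] a = resolve("title:lucene", "contents");
        String[] b = resolve("lucene", "contents");
        System.out.println(a[0] + " -> " + a[1]);
        System.out.println(b[0] + " -> " + b[1]);
    }
}
```

Nothing about the index changes here: the same documents can be searched with a different default field simply by passing a different value at search time.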
Foolish me...override a static method...silly silly. Still, I think
there must be some way. I don't care about the field
normalization...there must be some way to make it return a constant 1
when using a new Similarity class.
Doron Cohen wrote:
"Mark Miller" <[EMAIL PROTECTED]> wrote on 19/12
I see your point, but I have to ask whether this is a practical or a
theoretical problem? If it's a practical one, perhaps you'd be willing to
talk about the issue you're actually trying to solve and maybe we can come
up with a solution within the current framework. I know others on the list
have
Thanks for the tip Doron,
What if I replace the decode static method in Similarity so that it
returns 1 always for the HitPerDocSimiliarity? This would not require a
re-index right?
Doron Cohen wrote:
"Mark Miller" <[EMAIL PROTECTED]> wrote on 19/12/2006 09:21:00:
LIA mentioned somethin
Karl Koch wrote:
Are there any other papers that regard the combination of coordination level matching and TFxIDF as advantageous?
We independently developed coordination-level matching combined with
TFxIDF when I worked at Apple. This is documented in:
http://www.informatik.uni-trier.de/~
Hi:
How to define default fields? Is it done during index time or during search
time? Strangely, I can't find out any information on how default fields are
defined?
thanks,
john
Steven Rowe wrote:
"2.1" is much more likely to be the label used for the next release than
"2.0.1".
The roadmap in Jira shows 21 issues scheduled for 2.0.1. If there is in
fact no intent to merge these into the 2.0 branch, these should probably
be retargeted for 2.1.0, and the 2.0.1 versio
The real problem is that there is a single term dictionary for all the
fields. If the term dictionary is very large, search performance degrades,
even when the search is for a single term. So it would be desirable to be
able to index the fields of
Antonio Bruno wrote:
> But using the docId directly would make the searches much faster and
> more efficient. Think of the possibility of applying a first
> CachingWrapperFilter F1 to one index and a second
> CachingWrapperFilter F2 to another index, then computing (F1 AND
> F2) and
"Mark Miller" <[EMAIL PROTECTED]> wrote on 19/12/2006 09:21:00:
> LIA mentioned something about needing to rebuild the
> index if you change Similarities. That does not make
> sense to me yet. It would seem you could alternate them.
> What does scoring have to do with indexing?
For this part of yo
Could I use another Similarity that returned 1 for most of the scoring terms
and the actual term frequency (rig the equation)? Could I then alternate the
DefaultSimilarity and HitsPerDocSimilarity per search? LIA mentioned
something about needing to rebuild the index if you change Similarities.
Th
We'd primarily like to see a release of the LockFactory implementation.
This functionality will help us better control our locking, but we
want to depend on actual releases, not interim builds/snapshots.
Any news on this now that this thread is a couple months old?
-Mark Diggory
George Ar
Hello Guido,
Wednesday, April 5, 2006, 5:23:37 PM, you wrote:
GN> On 05.04.2006, at 17:15 Uhr, Bill Janssen wrote:
>> Or, as I suggested a couple of days ago, a 1.9.2 release could be
>> offered.
GN> Would be a good idea, because the current nightly builds have a lot
GN> of deprecated metho
But you can do something very similar and very quickly using a unique ID
(not the Lucene ID) that's shared across the indexes (assuming I'm reading
your issue correctly). Then use TermDocs/TermEnum and create your filters
that way.
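The shared-ID idea above can be sketched as follows, in plain Java rather than against Lucene's classes. `SharedIdFilterSketch`, `toUniqueIds` and `and` are hypothetical names, and the per-index arrays mapping docId to unique ID are an assumption standing in for whatever a TermDocs/TermEnum pass over the ID field would produce. Each index has its own docId numbering, so the per-index bit sets are first translated into unique IDs and only then intersected.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

public class SharedIdFilterSketch {
    // Translates index-local matching docIds into the shared unique IDs.
    static Set<String> toUniqueIds(BitSet matches, String[] uniqueIdByDocId) {
        Set<String> ids = new HashSet<>();
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            ids.add(uniqueIdByDocId[doc]);
        }
        return ids;
    }

    // (F1 AND F2) expressed over unique IDs instead of docIds.
    static Set<String> and(Set<String> f1, Set<String> f2) {
        Set<String> result = new HashSet<>(f1);
        result.retainAll(f2);
        return result;
    }

    public static void main(String[] args) {
        String[] indexA = { "u1", "u2", "u3" }; // docId -> unique ID in index A
        String[] indexB = { "u3", "u1" };       // docId -> unique ID in index B
        BitSet f1 = new BitSet();
        f1.set(0);
        f1.set(2);                              // matches u1 and u3 in index A
        BitSet f2 = new BitSet();
        f2.set(0);                              // matches u3 in index B
        System.out.println(and(toUniqueIds(f1, indexA), toUniqueIds(f2, indexB)));
    }
}
```

Note that docId 0 means a different document in each index here, which is exactly why intersecting the raw bit sets directly would give a wrong answer.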
I predict endless problems with user (programmer) errors if Lucen
Hi,
I'm working on learning Lucene for my job, and the book one of my professors
purchased for myself and her is Lucene In Action, which is a good book but
it is based on version 1.4.3 (I believe). I am beginning to grasp a lot of
the basic concepts behind Lucene and have a basic searching and i
But using the docId directly would make the searches much faster and more
efficient. Think of the possibility of applying a first
CachingWrapperFilter F1 to one index and a second CachingWrapperFilter F2 to
another index, then computing (F1 AND F2) and even extracting the info of