avoid this issue.
Philippe
On Thu, Sep 5, 2019 at 5:29 PM Uwe Schindler wrote:
> Hi,
>
> this issue is known and cannot be solved by just patching Lucene; it
> affects the whole Lucene infrastructure. A change on this would break
> almost any app out there so it needs to be done on
likely just moving classes in distinct packages
which should not be too complex.
Philippe
[1] https://github.com/AdoptOpenJDK/jsplitpkgscan
7;696 | >= 8'300'672
org.apache.lucene.index.TermBuffer | 9'408 | 526'848 | >= 5'811'240
org.apache.lucene.document.Field | 14'350 | 803'600 | >= 3'454'360
Dear Lucene group,
I wrote my own Scorer by extending Similarity. The scorer works quite
well, but I would like to ignore the fieldnorm value. Is this somehow
possible during search time? Or do I have to add a field indexed with
no_norm?
Best,
Philippe
pening the index leads to deletion of these documents.
However, is there a possibility to avoid this? Or do I have to re-index
all documents again?
Best,
Philippe
-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apac
Hi all,
I want to rank my query by the number of tokens in a field. What would
be the best way to implement such a ranking?
Regards,
Philippe
For
the "TITLE"-Field?
Regards,
Philippe
ector.
However, I did not fully understand your first idea. During indexing I
can store the TermVectors on disk. What do I have to do during
retrieval? I mean, does lucene automatically profit from the
TermVectors? Or do I have to use something different instead of getValues().
Regards,
Philipp
ieving and iterating the scoredocs is quite costly. So is
there a better/faster way to perform this?
Cheers,
Philippe
rrent solution uses "CachingWrapperFilter" wrapped around a
"QueryWrapperFilter". However, I wanted to know if anybody is aware of a
faster solution?
Regards,
Philippe
Hi Paul,
thanks for the code. It is much faster than the implementation before.
Cheers,
Philippe
On 26.07.2010 16:25, Paul Libbrecht wrote:
On 26 Jul 2010 at 16:01, Michael McCandless wrote:
You can make a custom Collector? I.e., it'd just increment a counter
for each hit.
As
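The counting idea quoted above can be sketched roughly as follows. This is a minimal, library-free illustration and not the Lucene API: `HitCallback`, `CountingCallback`, and `CountOnlyDemo.search` are hypothetical names, and the `BitSet` merely stands in for the set of documents a query matches. In real code the counter would presumably live in an implementation of the Collector type of the Lucene version in use.

```java
import java.util.BitSet;

/** Hypothetical stand-in for a collector callback; NOT the real Lucene API. */
interface HitCallback {
    void collect(int docId);
}

/** Counts hits without storing, scoring, or sorting them. */
final class CountingCallback implements HitCallback {
    private int count;

    @Override
    public void collect(int docId) {
        count++; // one increment per matching document, nothing else retained
    }

    int getCount() {
        return count;
    }
}

final class CountOnlyDemo {
    /** Simulates a searcher that hands every matching doc id to the callback. */
    static void search(BitSet matches, HitCallback callback) {
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            callback.collect(doc);
        }
    }

    public static void main(String[] args) {
        BitSet matches = new BitSet();
        matches.set(2);
        matches.set(5);
        matches.set(9);

        CountingCallback counter = new CountingCallback();
        search(matches, counter);
        System.out.println(counter.getCount()); // prints 3
    }
}
```

Because the callback keeps only an integer, no priority queue of top hits is built, which is where the saving over retrieving all TopDocs would come from.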
Hi,
for some queries I'm only interested in the number of matching
documents. Is there a better/faster way to perform such a query, instead
of retrieving all TopDocs and counting the number of totalHits [1]?
And is it possible/worthwhile to "deactivate" ranking?
Cheers,
Well,
that's difficult at the moment as I can only reproduce this error
in a few cases. But I will try to produce such an example.
Cheers,
Philippe
On 22.07.2010 12:34, Ian Lea wrote:
No, I don't have an explanation. Perhaps a minimal self-contained
program or
ame() say in each case? q.toString()?
searcher.explain(q, n)?
What version of lucene?
--
Ian.
On Wed, Jul 21, 2010 at 10:25 PM, Philippe wrote:
Hi,
I just performed two queries which, in my opinion, should lead to the same
document rankings. However, the document rankings differ between thes
;
1.)
Query q = parser.parse("lucene");
2.)
Query q = parser.parse("TITLE:lucene OR BOOK:lucene");
Regards,
philippe
Hi Yonik,
On 19.07.2010 16:21, Yonik Seeley wrote:
On Mon, Jul 19, 2010 at 9:53 AM, Philippe wrote:
is there a possibility to retrieve the lengthNorm for all (or a specific)
fields in a specific document?
See IndexReader: public abstract byte[] norms(String field) throws
Hi,
is there a possibility to retrieve the lengthNorm for all (or a
specific) fields in a specific document?
Regards,
Philippe
Problem is that the list of relevant ids
is often changing...
Regards,
Philippe
Ian Lea wrote:
I'm not sure I understand what you are asking, but if you search for
id:5 id:6 then I think doc2 will be ranked higher, because it contains
both fields.
Or are you saying you want to rank bas
"ID"-fields (5,6,7)
I'm more interested in documents containing the ids (5,6) so Document2
should be ranked higher than Document1.
What would be the best way to perform this?
Regards,
Philippe
That's really wonderful. Everything gets cleaner now.
Thanks, I mean really Thanks, for all the hard work that goes into Lucene
code + Doc + Processes + Mailing list. Lucene is really something I
refer to others as "what (open source) software development should be".
I'll go with lucene_2_1 !
Jp
I would move to 2.1, but that's just me!
-Grant
On Mar 14, 2007, at 5:12 PM, Jean-Philippe Robichaud wrote:
> Hello Dear Lucene Users!
>
>
>
> Back in the old days (well, last year) the lucene/java/trunk
> subversion
> path was always stable enough for everyone to use
Hello Dear Lucene Users!
Back in the old days (well, last year) the lucene/java/trunk subversion
path was always stable enough for everyone to use in production code.
Now, with the 2.0/2.1/2.2 branches, is it still the case?
In December, I 'ported' my app to use the lucene 2.0 release.
[sorry for the long delay for my answer, we are having some issues with our
mail server...]
Thanks for your comment. Yes it would make sense if the log files were not
so big. In fact, I'm only indexing a subset of the log information.
Because I store the information in Lucene, it is easier and f
age-
From: Mike Klaas [mailto:[EMAIL PROTECTED]
Sent: Friday, October 20, 2006 5:00 PM
To: java-user@lucene.apache.org
Subject: Re: "Catalog" backend for document stored fields?
On 10/20/06, Robichaud, Jean-Philippe
<[EMAIL PROTECTED]> wrote:
> 3- Any ideas on how
ould need to be modified? FSDirectory? Document?
3- Any ideas on how else I could do this? I'm fully open to
discussion!
Thanks for your help!
Jp
_
JEAN-PHILIPPE ROBICHAUD
Speech Scientist Professional Services
NUANCE
Thanks for the Field.setOmitNorms(true) tip!
Regarding the Similarity implementation I am trying to do, somehow it does
not work.
Here's what I understand:
Scorer implementation uses the method defined in Similarity, to compute
score. (the formula expressed in
"http://lucene.apache.org/java/docs
in 1.9?
I am just starting to read on Similarity, weights etc.
Can someone give me a heads up?
Thanks!
Philippe Deslauriers
re the OFFSETS and POSITIONS used for? Do I need it for Highlighting?
Can I create the TermFreqVector on the fly for a document, or do I have to
include them in the index?
Philippe
Hello here,
I'd like to do some geographical searches. Can somebody tell me
where to go?
At the moment I put the longitude and latitude in the index as I would
put some text. Then I did some range queries like this:
queryString=foo AND country:United States AND
[EMAIL PROTECTED]:[-74.0
Hi Everyone,
I have a special scenario where I frequently want to insert duplicate
documents in the index. For example, I know that I want 400 copies of the
same document. (I use the docboost for something else so I can't just add one
document and set the docboost to 400).
I would like to hac
Hi everyone.
I need a special query type that looks like a phrase query but with special
logic inside (like allowing inversions of certain terms only and not of
others, special score manipulation on certain 'events', ...) I wonder what
approach I should take? How does someone build a custom q
Hi Everyone.
I'm currently in a situation where I have multiple IndexSearchers open at
the same time, each on a different index. They are kept inside an
"IndicesManager" that exports getSearcherAtLocation/FreeSearcher methods. I
would like to be able to log the "path" used by a searcher I'm about to
"c
It may be simpler and more effective to use the Hits object and keep the
number of times each host was actually "returned" to the user and skip it if
the limit has been reached. This way, if your users just look at the 10-20
highest hits, you will save a lot of processing time, especially if you
Ok, I know that usually, the scores returned by Lucene do not "really" mean
anything. But in my case, they do, I play with the similarity and bla bla
bla... Now my concern is that Query.setBoost() does not always seem to
affect the score. I've built a simple test (code completely at the e
Hi Everyone,
I've been using Lucene a lot and I would like to know how the
SimilarityDelegator should be used. I would like to override only the
lengthNorm member of the DefaultSimilarity and I understand that this is
exactly the purpose of SimilarityDelegator? Am I right? Does this class
What about:
http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/src/java/org/apache/lucene/index/ParallelReader.java?rev=169859&view=markup
Jp
-Original Message-
From: Bruce Ritchie [mailto:[EMAIL PROTECTED]
Sent: Monday, May 30, 2005 11:26 AM
To: java-user@lucene.apache.org
Subject: RE:
oug Cutting [mailto:[EMAIL PROTECTED]
Sent: Wednesday, May 04, 2005 5:10 PM
To: java-user@lucene.apache.org
Subject: Re: PerFieldSimilarity
Robichaud, Jean-Philippe wrote:
> How cool, I did not know that... that may help me... If I understand you
> correctly, I can create a boolean que
java-user@lucene.apache.org
Subject: Re: PerFieldSimilarity
Robichaud, Jean-Philippe wrote:
> Again, I can change the similarity of the reader at run-time and issue
> specific queries, summing the score myself, but that is pretty inefficient.
You can also specify a Similarity implementation
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 03, 2005 7:40 PM
To: java-user@lucene.apache.org
Subject: Re: PerFieldSimilarity
On May 3, 2005, at 5:57 PM, Robichaud, Jean-Philippe wrote:
> Hi Everyone,
>
> I've been searching the archive without success
ril 27, 2005 12:30 PM
To: java-user@lucene.apache.org
Subject: Re: Implementation of a ScoreObject ?
Robichaud, Jean-Philippe wrote:
>Probably the simplest/ideal schema of the ScoreObject would be something
>like a hashtable with Term being the keys and a TermScoreObject the value.
>The
Hi Everyone,
I've been searching the archive without success to answer this one: is it
possible to specify one similarity class per field, just like we can do with
an analyzer? I know I can change the similarity of the searcher, but that
forces me to break some complex queries into different
Hi Everyone,
Lucene is incredible for a lot of reasons. I've been using it
for the past months and it served me quite well. I'm using the subversion
snapshots, which I update every now and then. Almost every functionality I
need is already present and well implemented, but sadly
Hi Guys,
It is somewhat difficult to suggest something useful without more
details. If you are pretty sure of the quality of the query, then here is my
suggestion:
Index the documents with an extra field called "last_word" that will
contain the last word in the document. So from your exa
Hi everyone.
I've been playing with Lucene a lot in the past few months for an important
project. We are using the raw score returned by Lucene (we created a custom
similarity) as a part of a confidence score calculation. My problem is
exactly what the subject line of this email says: How to s