Re: Improvements to the Explanation class

2019-01-12 Thread Vadim Gindin
Hi all, I think it is a good idea. I have a similar situation and I had to store additional data (feature values) as a string and parse it later. So I'd be glad if your proposal is implemented. Regards, Vadim Gindin On Fri, Jan 11, 2019 at 7:17 PM Sambhav Kothari (BLOOMBERG/ L

Re: Camel case search with Lucene

2018-10-04 Thread Vadim Gindin
Hi Ira. If you want to use a camel case query, for example, search "redHotChilly" instead of "red hot chilly", you should use your own pattern tokenizer to split the query with a regex pattern. Regards Vadim Gindin On Thu, Oct 4, 2018 at 11:58 AM Gordin, Ira wrote: > Hi
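The regex split suggested in this reply can be sketched in plain Java. This is only a minimal illustration of the pattern idea, not Lucene's tokenizer API: the class name CamelCaseSplitter and the lowercasing step are assumptions added for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): splits a camel case query into terms.
final class CamelCaseSplitter {
    // Zero-width boundary between a lowercase letter and the uppercase letter after it.
    private static final Pattern CAMEL_BOUNDARY = Pattern.compile("(?<=[a-z])(?=[A-Z])");

    static List<String> split(String query) {
        List<String> tokens = new ArrayList<>();
        for (String part : CAMEL_BOUNDARY.split(query)) {
            if (!part.isEmpty()) {
                // Lowercasing is an assumption: it mimics a typical lowercase filter.
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(split("redHotChilly")); // prints [red, hot, chilly]
    }
}
```

In a real analysis chain the same boundary pattern would have to be applied on both the index side and the query side, otherwise the split terms will not line up.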

Re: Question about BytesRef and BinaryDocValues

2018-08-24 Thread Vadim Gindin
(postingsEnum) } return null; } After that you get the payload in CustomFieldScorer.score() in the following way: postingsEnum.nextPosition(); BytesRef payload = postingsEnum.getPayload(); Regards, Vadim Gindin On Fri, Aug 24, 2018 at 10:16 AM Kevin Manuel wrote: >

Re: Question about BytesRef and BinaryDocValues

2018-08-23 Thread Vadim Gindin
Hi Kevin! I think that your field is "analyzed", so your field value is split into two terms, "hey" and "tom". So a doc value is written for each of them. Regards Vadim Gindin Fri, Aug 24, 2018, 5:19 Kevin Manuel: > Hi, > > I'm using lucene version 4.3

Re: CustomQuery.bulkScorer isn't called from BooleanQuery with filter block

2018-07-26 Thread Vadim Gindin
1:11 AM Adrien Grand wrote: > Hello Vadim, > > It looks like your query only supports bulkScorer() and not scorer()? > Unfortunately this is illegal: queries must implement scorer(). Today, > conjunctions never use the bulkScorer API. > > On Wed, Jul 25, 2018 at 18:47, Va

CustomQuery.bulkScorer isn't called from BooleanQuery with filter block

2018-07-25 Thread Vadim Gindin
seems they should work together. And that is why bulkScorer isn't called. Is there a way to integrate CustomQuery.bulkScorer() with possible adjacent filters? Regards, Vadim Gindin

Re: Explain flag in CustomQuery

2018-06-27 Thread Vadim Gindin
a *search* action. I'll probably ask that in Elasticsearch forum. Thanks :) Regards Vadim Gindin On Tue, Jun 26, 2018 at 1:48 AM Mikhail Khludnev wrote: > Vadim, > Why wouldn't you ask in Elastic forum? > > On Mon, Jun 25, 2018 at 11:39 PM Vadim Gindin > wrote: >

Explain flag in CustomQuery

2018-06-25 Thread Vadim Gindin
afReader. Could you advise me? Regards, Vadim Gindin

Postings.getPayload() returns null

2018-03-23 Thread Vadim Gindin
ostingsEnum.ALL); int pos = postingsEnum.nextPosition(); BytesRef payload = postingsEnum.getPayload(); // assert payload.bytesEquals(new BytesRef(new byte[]{1})); // TODO: use payload in scoring formula fldScorers.add(new ConstTermScorer(this, t, fieldScores.get(field) * termScores.get(t.text()), postingsEnum)); } } } Regards, Vadim Gindin

Re: Read DocValue twice

2018-02-22 Thread Vadim Gindin
We don't have a solution for this. > > Caching the scorer doesn't work since scorers can only be iterated once. > > On Thu, Feb 22, 2018 at 12:11, Vadim Gindin > wrote: > > > I'd like to use "explain" mechanism to output some additional

Re: Read DocValue twice

2018-02-22 Thread Vadim Gindin
most effective way to do this? Is it possible to speed up "explain", for example with scorer caching? - Lucene uses a single Scorer (for the entire segment) when calling the score() method. What about explain()? - Are iterators really readable only once? Regards, Vadim Gindin On T

Re: Read DocValue twice

2018-02-21 Thread Vadim Gindin
PM, Adrien Grand wrote: > This might not solve all problems, but you should stop caching the weight > in the query and stop caching the scorer in the weight: just create a new > scorer in calls to explain(). > > On Wed, Feb 21, 2018 at 14:05, Vadim Gindin > wrote: > >

Re: Read DocValue twice

2018-02-21 Thread Vadim Gindin
. On Tue, Feb 20, 2018 at 8:03 PM, Vadim Gindin wrote: > It is probably not possible to attach files to an email. Here they > are: > > ConstTermScorer.java > <http://lucene.472066.n3.nabble.com/file/t493564/ConstTermScorer.java> > PrizeDisjunctionScorer.java >

Re: Read DocValue twice

2018-02-20 Thread Vadim Gindin
It is probably not possible to attach files to an email. Here they are: ConstTermScorer.java PrizeDisjunctionScorer.java PhraseQuery.java <

Re: Read DocValue twice

2018-02-20 Thread Vadim Gindin
core() and in explanation(). Isn't it? Regards, Vadim Gindin On Mon, Feb 19, 2018 at 10:55 PM, Adrien Grand wrote: > Yes, this is the problem. This doc ID is a special sentinel value that > means that the iterator is exhausted. I don't have enough context to know > what the ex
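The sentinel behaviour described in the quoted reply can be illustrated with a small forward-only iterator in plain Java. This is a sketch under stated assumptions, not Lucene code: the class ForwardOnlyDocs and its data are hypothetical, and only the sentinel value (Integer.MAX_VALUE, the value of Lucene's DocIdSetIterator.NO_MORE_DOCS) is taken from the library.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a forward-only doc ID iterator (not a Lucene class).
final class ForwardOnlyDocs {
    // Same value Lucene uses for DocIdSetIterator.NO_MORE_DOCS.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs; // sorted doc IDs, illustrative data only
    private int idx = -1;     // -1 means "not positioned yet"

    ForwardOnlyDocs(int... sortedDocs) {
        this.docs = sortedDocs;
    }

    int docID() {
        if (idx < 0) {
            return -1;
        }
        return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    // Moves forward only; once exhausted it keeps returning the sentinel.
    int nextDoc() {
        if (idx < docs.length) {
            idx++;
        }
        return docID();
    }

    public static void main(String[] args) {
        Supplier<ForwardOnlyDocs> factory = () -> new ForwardOnlyDocs(3, 7);

        ForwardOnlyDocs scoring = factory.get();
        while (scoring.nextDoc() != NO_MORE_DOCS) {
            // a scoring pass would consume each document here
        }
        // Reusing 'scoring' now only ever yields the sentinel.
        System.out.println(scoring.nextDoc() == NO_MORE_DOCS);

        // A second pass (for example, for an explanation) needs a fresh iterator.
        ForwardOnlyDocs explaining = factory.get();
        System.out.println(explaining.nextDoc());
    }
}
```

The factory mirrors the advice elsewhere in this thread to create a new scorer for explain() rather than cache the one used for scoring.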

Re: Read DocValue twice

2018-02-19 Thread Vadim Gindin
.doc and > reader.maxDoc() are before you call advanceExact? > > What do you mean by "I reuse the same DisiPriorityQueue of scorers in > score() and explain()"? This shouldn't be possible. > > On Mon, Feb 19, 2018 at 15:23, Vadim Gindin > wrote: > > > I

Re: Read DocValue twice

2018-02-19 Thread Vadim Gindin
; /** > > * Iterates to the next value in the current document. Do not call > > this more than {@link #docValueCount} times > > * for the document. > > */ > > > > public abstract long nextValue() throws IOException; > > > > > > Questions: > > 1) Why can't I read the values twice? > > 2) How can I manage this situation? > > 3) Can it work for NumericDocValues? > > > > Regards, > > Vadim Gindin > > >

Read DocValue twice

2018-02-19 Thread Vadim Gindin
Why can't I read the values twice? 2) How can I manage this situation? 3) Can it work for NumericDocValues? Regards, Vadim Gindin

Custom explain implementation - how to transfer the data

2018-01-19 Thread Vadim Gindin
Assume I have a scorer. During the execution of the score() method, I cache the document ID and scoring details in a Map. Later, in the explain(docID) method, I look up the scoring details in that map by docID. Is this a correct scheme? If not, how should it be implemented correctly? Regards, Vadim Gindin

Re: Wrong ID in explain() method.

2017-12-31 Thread Vadim Gindin
Yes, thanks a lot for your help. Do you mean that the ID of the category must not be passed to explain? If so, why does it happen? Regards Vadim Gindin On Dec 29, 2017 at 14:22, "Mikhail Khludnev" wrote: > Responded on the elastic forum. Have you seen it? > > > On

Re: Query in a doc context

2017-12-31 Thread Vadim Gindin
Thanks Mikhail! I'll look there. Happy New Year :) Regards Vadim Gindin On Dec 31, 2017 at 2:21, "Mikhail Khludnev" wrote: > Literally it's done in Solr (excuse moi) via > q=field1:(foo bar baz)^=3 field2:(foo bar baz)^=4 field3:(foo bar baz)^=5 > but

Re: Wrong ID in explain() method.

2017-12-28 Thread Vadim Gindin
rved word for Lucene? Regards, Vadim Gindin On Wed, Dec 27, 2017 at 12:43 PM, Vadim Gindin wrote: > Hi all. > > I've written a simple plugin that implements custom scoring logic and > extends `Explanation`. I have some real index data that looks like this: > >

Wrong ID in explain() method.

2017-12-26 Thread Vadim Gindin
the real installation, but in the test case it works fine. 1. ID=342 and others are passed to the explain(id) method. Note that it is not a document ID; it is the ID of the nested object (category). Why does this happen? 2. I have a test case based on ESIntegTestCase. It works fine with this document. But this document is not found in the real index. Regards, Vadim Gindin

Re: Query in a doc context

2017-12-26 Thread Vadim Gindin
ings: like explanation extending and composing sum scores. Regards, Vadim Gindin On Fri, Dec 15, 2017 at 10:33 PM, Mike Dinescu (DNQ) wrote: > Got it. I misunderstood the question (actually I'm still not convinced I > fully understand what you're looking for). It might be g

Re: Query in a doc context

2017-12-14 Thread Vadim Gindin
Mike, I don't need a full doc match. I need a multi-field match, and later I need to know which fields matched for a document so that I can calculate other multi-field-oriented metrics. Regards, Vadim Gindin On Thu, Dec 14, 2017 at 8:46 PM, Mike Dinescu (DNQ) wrote: > Apolog

Re: Query in a doc context

2017-12-14 Thread Vadim Gindin
Thanks, Mikhail. Could you explain your points in more detail? Vadim On Thu, Dec 14, 2017 at 7:08 PM, Mikhail Khludnev wrote: > Hello, Vadim. > > Please find inline. > > On Thu, Dec 14, 2017 at 11:43 AM, Vadim Gindin > wrote: > > > Hi all. > > > > As

Re: Tracking that all query terms are matched in one document

2017-12-14 Thread Vadim Gindin
eplaced with > BulkScorer or so. Anyway, you need to find a way to prevent term-at-time > scoring, when FakeScorer is injected. > You need to make it score doc-at-time. As I told you, it's far way. > > On Wed, Dec 13, 2017 at 11:55 AM, Vadim Gindin > wrote: > > > Hi M

Re: Terminology. LeafReader -> TermEnum -> PostingsEnum

2017-12-14 Thread Vadim Gindin
ssume that I need to keep some coefficients along with tokens to use them later in scoring. For example, if the matched token is a synonym, I could multiply the query score by 0.75. Regards, Vadim Gindin On Thu, Dec 14, 2017 at 2:15 PM, Vadim Gindin wrote: > Hi All > > I have a ques

Terminology. LeafReader -> TermEnum -> PostingsEnum

2017-12-14 Thread Vadim Gindin
lementations of that interface. Why is it used in LeafReader? What is the principal difference between these 20 implementations, and which of them are really useful? Regards, Vadim Gindin

Query in a doc context

2017-12-14 Thread Vadim Gindin
Hi all. As far as I understand, all queries (or most of them?) are single-field oriented. They may implement different search/score logic, but they are intended for a single field. For example, a simple TermQuery or PhraseQuery. If I need to implement the search through different fields I should use Bo

Re: Tracking that all query terms are matched in one document

2017-12-13 Thread Vadim Gindin
void this? Thanks, Vadim Gindin On Fri, Dec 8, 2017 at 2:01 PM, Vadim Gindin wrote: > Thanks for your help. I'll try that. > > On Tue, Dec 5, 2017 at 4:18 PM, Mikhail Khludnev wrote: > >> Vadim, >> You can create a collector which checks Scorer.getChildren() >

Re: Tracking that all query terms are matched in one document

2017-12-08 Thread Vadim Gindin
t to avoid this if it's possible. However, Elastic does something > like this with named queries or so. > I talked about this a few years ago > https://www.youtube.com/watch?v=sGVyUdNGBgw > > On Tue, Dec 5, 2017 at 12:36 PM, Vadim Gindin > wrote: > > > I'm not sure

Re: Tracking that all query terms are matched in one document

2017-12-05 Thread Vadim Gindin
value is a list of the terms that matched this document. I need to store the document ID and the terms that matched that document somewhere. Could somebody suggest an appropriate place? Regards, Vadim Gindin On Tue, Dec 5, 2017 at 12:04 PM, Vadim Gindin wrote: > For e

Re: Tracking that all query terms are matched in one document

2017-12-04 Thread Vadim Gindin
stQuery(bq, queryBoost); Vadim On Tue, Dec 5, 2017 at 9:24 AM, Michael Sokolov wrote: > Well how did you make the original query? > > On Dec 4, 2017 12:05 PM, "Vadim Gindin" wrote: > > > Yes, thanks. My question is exactly about how to create "another extra >

Re: Tracking that all query terms are matched in one document

2017-12-04 Thread Vadim Gindin
multiplied. > > On Dec 4, 2017 5:22 AM, "Vadim Gindin" wrote: > > > Thanks, Michael! > > > > Yes, I'm sure. Could you explain your proposal in more detail? > > > > Regards, > > Vadim Gindin > > > > On Mon, Dec 4, 2017 at 3:1

Re: Scorer.iterator() - how to implement correctly

2017-12-04 Thread Vadim Gindin
ally multiplied by the query boost. Now it works. Thanks a lot! Regards, Vadim Gindin On Mon, Dec 4, 2017 at 3:17 PM, Adrien Grand wrote: > It is correct... but ConstantScoreQuery is the way to go with your > use-case. It should not return scores of 0 unless you are misusing the API > in s

Re: Tracking that all query terms are matched in one document

2017-12-04 Thread Vadim Gindin
Thanks, Michael! Yes, I'm sure. Could you explain your proposal in more detail? Regards, Vadim Gindin On Mon, Dec 4, 2017 at 3:18 PM, Michael Sokolov wrote: > You could combine a Boolean and query with the same terms, as an optional > clause. Are you sure about the requirement to m

Re: Tracking that all query terms are matched in one document

2017-12-04 Thread Vadim Gindin
Sorry, I accidentally sent an unfinished message :). Could somebody advise me on how to implement the following? Regards Vadim Gindin On Mon, Dec 4, 2017 at 3:12 PM, Vadim Gindin wrote: > Hi all. > > I need to track that all query terms are matched in one docu

Tracking that all query terms are matched in one document

2017-12-04 Thread Vadim Gindin
Hi all. I need to track that all query terms are matched in one document. When all terms are matched, I need to multiply the score of that document by a constant coefficient.
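The requirement stated in this message can be sketched in plain Java, independent of any Lucene API. The class AllTermsBoost, its method signature, and the idea of collecting matched terms into a Set are assumptions made purely for illustration; how the matched terms are gathered inside Lucene is exactly what the thread goes on to discuss.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not a Lucene class): boosts a score when every query term matched.
final class AllTermsBoost {
    static float score(float baseScore, Set<String> queryTerms,
                       Set<String> matchedTerms, float coefficient) {
        // Apply the coefficient only if no query term is missing from the matches.
        return matchedTerms.containsAll(queryTerms) ? baseScore * coefficient : baseScore;
    }

    public static void main(String[] args) {
        Set<String> query = new HashSet<>(Arrays.asList("red", "hot"));
        Set<String> allMatched = new HashSet<>(Arrays.asList("red", "hot", "chilly"));
        Set<String> partMatched = new HashSet<>(Arrays.asList("red"));

        System.out.println(score(2.0f, query, allMatched, 1.5f));  // boosted
        System.out.println(score(2.0f, query, partMatched, 1.5f)); // unchanged
    }
}
```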

Re: Scorer.iterator() - how to implement correctly

2017-12-04 Thread Vadim Gindin
43 PM, Vadim Gindin wrote: > Hi Adrien. > > ConstantScoreQuery: I tried that earlier. There is a problem. It > returns score = 0.0 for my configuration with Boolean. I've debugged and > found that it happens because of the following: > > @Override > public Wei

Re: Scorer.iterator() - how to implement correctly

2017-12-03 Thread Vadim Gindin
r.iterator()?* Many thanks for your help! Regards Vadim Gindin On Fri, Dec 1, 2017 at 1:11 PM, Adrien Grand wrote: > There are many implementations because each query typically needs a custom > DocIdSetIterator implementation. It looks like your use-case doesn't need a > custom query

Scorer.iterator() - how to implement correctly

2017-11-30 Thread Vadim Gindin
score - 3f, field "vendor" - score - 5f. I create a subquery for each field and specify its score using a custom query that is almost the same as TermQuery except for Weight.Scorer. Any help is appreciated. Regards, Vadim Gindin

Re: COST vs SCORE vs WEIGHT

2017-11-30 Thread Vadim Gindin
then public DocIdSetIterator iterator() { return iterator; } Is that a correct implementation? Are there other ways to implement it? Thanks a lot for your response Regards, Vadim Gindin On Thu, Nov 30, 2017 at 8:56 PM, Adrien Grand wrote: > Hi Vadim, > > A Weight is the speciali

COST vs SCORE vs WEIGHT

2017-11-30 Thread Vadim Gindin
? Regards, Vadim Gindin

Re: Custom scoring algorithm and Explanation extending.

2017-11-22 Thread Vadim Gindin
Thanks a lot! On Mon, Nov 20, 2017 at 11:22 PM, Adrien Grand wrote: > Hi Vadim, > > On Thu, Nov 16, 2017 at 18:09, Vadim Gindin wrote > : > > > 1. I would like to use my custom scoring algorithm. Does it make sense to > use > > Lucene with other scoring algor

Custom scoring algorithm and Explanation extending.

2017-11-16 Thread Vadim Gindin
"explain" that uses Lucene's Explanation class under the hood. But this class covers only scoring aspects. I would like to include matching logic details there. It seems a good place but this class is final. Regards, Vadim Gindin

Extending Explanation class information

2017-11-16 Thread Vadim Gindin
ribe matching/querying documents by concrete query? Regards, Vadim Gindin