RE: Migration Lucene 4.7.0 -->6.0.1 - NumericUtils

2016-12-05 Thread Ludovic Bertin
Hi Mike,

Sorry for the delay.
For some use cases, fields are indexed as numeric values, but the number of 
distinct values is small (<50 terms).
We have a search GUI where users can add criteria on predefined fields. For 
those fields, where the number of terms is limited, we offer the available 
values in a dropdown.

What would you recommend? Using String instead of numeric values?

Thanks.
Ludovic



-----Original Message-----
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Thursday, 22 September 2016 22:34
To: Lucene Users; Ludovic Bertin
Subject: Re: Migration Lucene 4.7.0 -->6.0.1 - NumericUtils

LegacyNumericUtils is the right solution for your index for now, but
longer term you should migrate to dimensional points instead, which
are a more efficient way to index and range search numerics.

But: why do you need all distinct values of a field?  In general this
is a very dangerous method to offer.
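
One reading of this warning is that a field can hold far more distinct terms
than the caller expects, so an unbounded enumeration may become very large.
The sketch below is not from the thread and uses no Lucene classes: a plain
Iterator<String> stands in for TermsEnum, and the class name, method name and
limit are hypothetical. It only illustrates capping the enumeration, which may
be enough when a field is known to have fewer than 50 values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper: Iterator<String> stands in for Lucene's TermsEnum. */
public class BoundedDistinctValues {

    /** Collects at most maxValues terms and fails fast if the field has more. */
    public static List<String> collect(Iterator<String> terms, int maxValues) {
        List<String> values = new ArrayList<>();
        while (terms.hasNext()) {
            if (values.size() == maxValues) {
                // Another term exists beyond the cap: refuse instead of growing.
                throw new IllegalStateException(
                        "more than " + maxValues + " distinct values");
            }
            values.add(terms.next());
        }
        return values;
    }

    public static void main(String[] args) {
        Iterator<String> terms = Arrays.asList("EUR", "CHF", "USD").iterator();
        System.out.println(collect(terms, 50));
    }
}
```

Failing fast rather than silently truncating is a design choice here: a
dropdown built from a partial list would hide values without any warning.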

Mike McCandless

http://blog.mikemccandless.com


On Tue, Sep 20, 2016 at 9:02 AM, Ludovic Bertin wrote:
> Hi there,
>
> I'm migrating an application from Lucene 4.7.0 to Lucene 6.0.1.
> I'm facing a problem with this piece of code:
>
> public List<Object> getDistinctValues(IndexReader reader, EventField field)
>         throws IOException {
>
>     List<Object> values = new ArrayList<>();
>     Fields fields = MultiFields.getFields(reader);
>     if (fields == null) return values;
>
>     Terms terms = fields.terms(field.name());
>     if (terms == null) return values;
>
>     TermsEnum iterator = terms.iterator();
>
>     // Java type the field's terms should be decoded into
>     Class<?> type = field.getJavaType();
>     BytesRef value = iterator.next();
>
>     while (value != null) {
>         if (type == Long.class) {
>             values.add(LegacyNumericUtils.prefixCodedToLong(value));
>         } else if (type == Integer.class) {
>             values.add(LegacyNumericUtils.prefixCodedToInt(value));
>         } else if (type == Boolean.class) {
>             values.add(LegacyNumericUtils.prefixCodedToInt(value) == 1 ? TRUE : FALSE);
>         } else if (type == Date.class) {
>             values.add(new Date(LegacyNumericUtils.prefixCodedToLong(value)));
>         } else if (type == String.class) {
>             values.add(value.utf8ToString());
>         } else {
>             log.warn("getDistinctValues: ignoring field " + field + " of type " + type);
>         }
>
>         value = iterator.next();
>     }
>
>     return values;
> }
>
> The aim of this method is to get all terms present in a field, directly in the 
> correct Java type. I have only just noticed that NumericUtils is an internal 
> class, so I should not have used it.
> For now, I can use LegacyNumericUtils, but that's a very temporary solution.
>
> What is the best approach to achieve my goal?
>
> Thanks in advance for any help.
> Ludovic
>
>  DISCLAIMER 
> This message is intended only for use by the person to
> whom it is addressed. It may contain information that is
> privileged and confidential. Its content does not constitute
> a formal commitment by Bank Lombard Odier & Co Ltd or any
> of its branches or affiliates. If you are not the intended recipient
> of this message, kindly notify the sender immediately and
> destroy this message. Thank You.
> *



Re: Apply Lucene Query on Bits

2016-12-05 Thread Mikhail Khludnev
Hello Hendrik,

I looked through the sources and found nothing better than copying a few
pieces from org.apache.lucene.search.MultiTermQueryConstantScoreWrapper,
especially the scorer(DocIdSet) method of the anonymous ConstantScoreWeight
created in
org.apache.lucene.search.MultiTermQueryConstantScoreWrapper.createWeight(...).

On Sun, Dec 4, 2016 at 12:04 PM, Hendrik Dev wrote:

> Hi,
>
> How can I apply an org.apache.lucene.search.Query to a given
> org.apache.lucene.util.Bits object?
>
> Background: I have a subclass of
> org.apache.lucene.index.FilterLeafReader where I want to filter the
> live docs by applying a query to the "Bits".
>
> According to the javadoc, I also need to override numDocs() if I override
> getLiveDocs(). So the question also extends to how to filter the
> number of documents based on a query (within a FilterLeafReader).
>
> Thx
> Hendrik
>
>
> --
> Hendrik Saly (salyh, hendrikdev22)
> @hendrikdev22
> PGP: 0x22D7F6EC
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


-- 
Sincerely yours
Mikhail Khludnev


Re: query parser of SpanNearQuery

2016-12-05 Thread Mikhail Khludnev
Hello,

You can check the ComplexPhrase and Surround query parsers.

On Mon, Dec 5, 2016 at 8:12 AM, Yonghui Zhao wrote:

> It seems the Lucene query parser doesn't support SpanNearQuery.
> Is there any query parser that supports SpanNearQuery?
>



-- 
Sincerely yours
Mikhail Khludnev


RE: query parser of SpanNearQuery

2016-12-05 Thread Allison, Timothy B.
Not part of Lucene, but take a look at LUCENE-5205 [1], which I actively 
maintain on GitHub [2].

You can also integrate it via Maven [3].

See the JIRA issue for an overview of the query syntax, and let me know if you 
have any questions.



[1] https://issues.apache.org/jira/browse/LUCENE-5205
[2] https://github.com/tballison/lucene-addons 
[3] http://search.maven.org/#artifactdetails%7Corg.tallison.lucene%7Clucene-5205%7C6.3-0.1%7Cjar



-----Original Message-----
From: Yonghui Zhao [mailto:zhaoyong...@gmail.com] 
Sent: Monday, December 5, 2016 12:13 AM
To: java-user@lucene.apache.org
Subject: query parser of SpanNearQuery

It seems the Lucene query parser doesn't support SpanNearQuery.
Is there any query parser that supports SpanNearQuery?


Re: Apply Lucene Query on Bits

2016-12-05 Thread Adrien Grand
Do I get it right that you have a query that defines a set of visible
documents, and you want to make sure that your FilterReader only sees those
documents?

If this is the case, then you could use FixedBitSet.or to load the
Scorer.iterator() into a FixedBitSet, and then maintain two caches:
 - one from the core cache key to the bit set of visible documents,
 - one from the core and deletes cache key to the number of documents in
   the index; this numDocs can be computed by iterating the bit set of
   visible documents and counting how many of them are not deleted.
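
The counting step described above can be sketched as follows. This is only an
illustration under stated assumptions: java.util.BitSet stands in for Lucene's
FixedBitSet and Bits so that the example runs without Lucene, the class and
method names are hypothetical, and the two caches are left out.

```java
import java.util.BitSet;

/**
 * Sketch only: java.util.BitSet stands in for FixedBitSet / Bits.
 * "visible" holds the documents matched by the query; "liveDocs" holds the
 * documents that are not deleted (null means the segment has no deletions).
 */
public class VisibleDocsSketch {

    /** Live docs as seen through the filter: visible AND not deleted. */
    public static BitSet filteredLiveDocs(BitSet visible, BitSet liveDocs) {
        BitSet result = (BitSet) visible.clone();
        if (liveDocs != null) {
            result.and(liveDocs);
        }
        return result;
    }

    /** numDocs: iterate the visible bits and count those that are not deleted. */
    public static int numDocs(BitSet visible, BitSet liveDocs) {
        int count = 0;
        for (int doc = visible.nextSetBit(0); doc >= 0; doc = visible.nextSetBit(doc + 1)) {
            if (liveDocs == null || liveDocs.get(doc)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        BitSet visible = new BitSet(8);
        visible.set(1);
        visible.set(3);
        visible.set(4);
        visible.set(6);

        BitSet live = new BitSet(8);
        live.set(0, 8);   // all eight documents start out live
        live.clear(3);    // document 3 has been deleted

        System.out.println(filteredLiveDocs(visible, live));
        System.out.println(numDocs(visible, live));
    }
}
```

In a real FilterLeafReader the first result would back getLiveDocs() and the
second numDocs(), each cached under the keys described above.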

On Sun, Dec 4, 2016 at 10:04, Hendrik Dev wrote:

> How can I apply an org.apache.lucene.search.Query to a given
> org.apache.lucene.util.Bits object?
>
> Background: I have a subclass of
> org.apache.lucene.index.FilterLeafReader where I want to filter the
> live docs by applying a query to the "Bits".
>
> According to the javadoc, I also need to override numDocs() if I override
> getLiveDocs(). So the question also extends to how to filter the
> number of documents based on a query (within a FilterLeafReader).
>


RE: Apply Lucene Query on Bits

2016-12-05 Thread Uwe Schindler
Hi,

You may also want to look at the implementation of PKIndexSplitter, which does 
exactly what you want (applying a FilterReader / FilterCodecReader that hides 
documents matched by a query):
https://github.com/apache/lucene-solr/blob/master/lucene/misc/src/java/org/apache/lucene/index/PKIndexSplitter.java#L127-L170

Uwe

-
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de






Re: LeafCollector

2016-12-05 Thread Matt Hicks
Interesting. Are there any examples of how to actually use
DiversifiedTopDocsCollector?

On Fri, Dec 2, 2016 at 8:53 AM Adrien Grand wrote:

> Maybe you could use DiversifiedTopDocsCollector?
> https://lucene.apache.org/core/6_2_0/misc/org/apache/lucene/search/DiversifiedTopDocsCollector.html
>
> On Thu, Dec 1, 2016 at 23:08, Michael McCandless wrote:
>
> Lucene used to have a DuplicateFilter to do this, but we removed it
> recently ... see https://issues.apache.org/jira/browse/LUCENE-6633 for
> some discussion as to why.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, Dec 1, 2016 at 2:39 PM, Matt Hicks wrote:
> > I'm trying to write a LeafCollector that filters out duplicates for a
> > specific field. However, the JavaDoc for `collect` says not to call
> > `IndexSearcher.doc` or `IndexReader.document`. How am I supposed to
> > determine the value of a field and then exclude it?


Facet DrillDown Exclusion

2016-12-05 Thread Matt Hicks
I'm currently drilling down by adding a facet path, but I'd like to be able to
do the same as a NOT query. Is there any way to do an exclusion drill-down
on a facet, excluding docs that match the facet while including all others?

Thanks