Re: Lucene Facets performance problems (version 4.7.2)

2016-02-25 Thread Erick Erickson
You haven't given us much to go on. What is the cardinality of the fields you're faceting on? What does your query look like? How are you measuring time? What is the output if you add &debug=true? In short, your question is far too vague to give any meaningful information, there could be any of a

Lucene Facets performance problems (version 4.7.2)

2016-02-25 Thread Simona Russo
Hi all, we use the Lucene *Facet* library, version *4.7.2*. We have an *index* with *45 million* documents (size about 15 GB) and a *taxonomy* index with *57 million* documents (size about 2 GB). The total *facet search* time reaches *15 seconds*! Is it possible to improve this time? Is the

Re: Is there a way to share IndexReader data sensibly across independent callers?

2016-02-25 Thread Trejkaz
So it turns out I still have problems. I wanted to return a proxy reader that the caller could close like normal. I wanted to do this for two reasons: 1. This: try (IndexReader reader = sharer.acquireReader(...)) { ... } Looks much nicer than this: IndexReader reader =

Re: Lucene 5.5.0 StopFilter Error

2016-02-25 Thread Jake Clawson
Thanks for the quick response. I checked everything you pointed out, and the following got the code working for me: > In your code you indirectly called reset twice on the Tokenizer. First direct > and then implicit > through the filter. I removed the tokenizer.reset() i

Re: Lucene 5.5.0 StopFilter Error

2016-02-25 Thread Uwe Schindler
You must build the whole stream, including all filters, first and then consume it. So first create the Tokenizer, then wrap it with the filter. Once all this is done, you can consume the topmost filter using the workflow. You don't need the Tokenizer anymore (you can remove its reference). The filter delegates
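The advice above (build the whole chain first, then reset and consume only the outermost filter) can be illustrated with a small library-free sketch in plain Java. The names TokenSource, WordTokenizer, StopWordFilter and consume are hypothetical stand-ins, not Lucene classes, and the rule that a second reset() throws is a simplifying assumption used to model the problem described in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class StreamChainSketch {
    /** Hypothetical stand-in for a token stream; NOT a Lucene interface. */
    interface TokenSource {
        void reset();
        String next(); // returns null when the stream is exhausted
    }

    /** Splits text on whitespace. Assumption: a second reset() is an error. */
    static final class WordTokenizer implements TokenSource {
        private final String text;
        private Iterator<String> words;
        private boolean resetCalled;

        WordTokenizer(String text) { this.text = text; }

        @Override public void reset() {
            if (resetCalled) throw new IllegalStateException("reset() called twice");
            resetCalled = true;
            words = Arrays.asList(text.trim().split("\\s+")).iterator();
        }

        @Override public String next() {
            if (words == null) throw new IllegalStateException("reset() not called");
            return words.hasNext() ? words.next() : null;
        }
    }

    /** Wraps another source and drops stop words; reset() is delegated to the input. */
    static final class StopWordFilter implements TokenSource {
        private final TokenSource input;
        private final Set<String> stopWords;

        StopWordFilter(TokenSource input, Set<String> stopWords) {
            this.input = input;
            this.stopWords = stopWords;
        }

        @Override public void reset() { input.reset(); }

        @Override public String next() {
            String w;
            while ((w = input.next()) != null) {
                if (!stopWords.contains(w.toLowerCase(Locale.ROOT))) return w;
            }
            return null;
        }
    }

    /** Resets and drains only the outermost stream of the chain. */
    static List<String> consume(TokenSource top) {
        top.reset(); // the single reset; it propagates down the chain
        List<String> out = new ArrayList<>();
        for (String w = top.next(); w != null; w = top.next()) out.add(w);
        return out;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "a"));
        TokenSource chain = new StopWordFilter(new WordTokenizer("the quick brown fox"), stop);
        System.out.println(consume(chain));
    }
}
```

In this model, calling reset() on the tokenizer directly and then again through the filter fails on the second call, which is why the caller keeps only the reference to the outermost filter.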

Lucene 5.5.0 StopFilter Error

2016-02-25 Thread Jake Clawson
I am trying to use StopFilter in Lucene 5.5.0. I tried the following: package lucenedemo; import java.io.StringReader; import java.util.ArrayList; import java.util.Arrays; import java.util.Collections; import java.util.HashSet; import java.util.List; import java.util.Set; import java.util.Iterato

Re: Spaces in regular expressions

2016-02-25 Thread Kudrettin Güleryüz
Thank you, I had looked at that article a little, some time ago. I was thinking I may have to change some lower level Lucene classes to be able to work like that. Plus I don't have much clue if that would break things. I am primarily looking for a Lucene solution at this point. On Thu, Feb 25, 20

CachingWrapperQuery deprecated

2016-02-25 Thread Otmar Caduff
Hi, I just switched from Lucene 5.1 to Lucene 5.5 and noticed that org.apache.lucene.search.CachingWrapperQuery has been deprecated. Up until now I used that class to solve the following problem: I have a query covering multiple fields. In the end, I have to give feedback on which fields matched

Re: Weird ClassCastException running lucene 5.2.1 on Java 1.8.

2016-02-25 Thread Krishnamurthy, Kannan
Hi Uwe, Thanks for the pointers. We didn't see this error after upgrading to 8u74. Thanks, Kannan. From: Uwe Schindler Sent: Wednesday, February 24, 2016 4:16 AM To: java-user@lucene.apache.org Subject: RE: Weird ClassCastException running lucene 5.2.

Re: Spaces in regular expressions

2016-02-25 Thread Greg Bowyer
Possibly not helpful but some time ago Russ Cox implemented a code search at Google. His design is documented here https://swtch.com/~rsc/regexp/regexp4.html On Wed, Feb 24, 2016, at 08:01 AM, Kudrettin Güleryüz wrote: > I appreciate the pointers Jack. More on that, where can I read more on > ena

Re: Grouping Lucene result

2016-02-25 Thread Koji Sekiguchi
Hi Taher, Solr has a result grouping function. I think it has two steps. First, it tries to find how many groups there are in the result and chooses the top groups (say 10 groups) using a priority queue. Second, it provides 10 priority queues, one per group, and searches again to collect second or a
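One reading of the two steps described above can be sketched in plain Java over an in-memory list of scored hits. This is an illustration only, not Solr's or Lucene's grouping code: the Hit class, the method names, and the choice to rank a group by its best-scoring hit are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class TwoPassGrouping {
    /** Hypothetical scored hit carrying the value of the grouping field. */
    static final class Hit {
        final String group;
        final int docId;
        final double score;
        Hit(String group, int docId, double score) {
            this.group = group;
            this.docId = docId;
            this.score = score;
        }
    }

    /** Pass 1: rank groups by their best hit and keep the top N with a bounded min-heap. */
    static List<String> topGroups(List<Hit> hits, int topN) {
        Map<String, Double> best = new HashMap<>();
        for (Hit h : hits) best.merge(h.group, h.score, Math::max);

        Comparator<Map.Entry<String, Double>> byScoreAsc =
            (a, b) -> Double.compare(a.getValue(), b.getValue());
        PriorityQueue<Map.Entry<String, Double>> heap = new PriorityQueue<>(byScoreAsc);
        for (Map.Entry<String, Double> e : best.entrySet()) {
            heap.offer(e);
            if (heap.size() > topN) heap.poll(); // evict the weakest group
        }
        List<String> result = new ArrayList<>();
        while (!heap.isEmpty()) result.add(heap.poll().getKey());
        Collections.reverse(result); // best group first
        return result;
    }

    /** Pass 2: one bounded priority queue per top group, filled by scanning the hits again. */
    static Map<String, List<Hit>> topDocsPerGroup(List<Hit> hits, List<String> groups, int docsPerGroup) {
        Comparator<Hit> byScoreAsc = (a, b) -> Double.compare(a.score, b.score);
        Map<String, PriorityQueue<Hit>> queues = new LinkedHashMap<>();
        for (String g : groups) queues.put(g, new PriorityQueue<>(byScoreAsc));

        for (Hit h : hits) {
            PriorityQueue<Hit> q = queues.get(h.group);
            if (q == null) continue; // hit belongs to a group that did not make the cut
            q.offer(h);
            if (q.size() > docsPerGroup) q.poll();
        }
        Map<String, List<Hit>> out = new LinkedHashMap<>();
        for (Map.Entry<String, PriorityQueue<Hit>> e : queues.entrySet()) {
            List<Hit> docs = new ArrayList<>();
            while (!e.getValue().isEmpty()) docs.add(e.getValue().poll());
            Collections.reverse(docs); // best hit first
            out.put(e.getKey(), docs);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
            new Hit("a", 1, 0.9), new Hit("b", 2, 0.8), new Hit("a", 3, 0.5),
            new Hit("c", 4, 0.3), new Hit("b", 5, 0.7), new Hit("a", 6, 0.6));
        List<String> groups = topGroups(hits, 2);
        for (Map.Entry<String, List<Hit>> e : topDocsPerGroup(hits, groups, 2).entrySet()) {
            StringBuilder ids = new StringBuilder();
            for (Hit h : e.getValue()) ids.append(' ').append(h.docId);
            System.out.println(e.getKey() + ":" + ids);
        }
    }
}
```

Scanning the hits twice keeps memory bounded by the number of top groups times the documents kept per group, at the cost of a second pass over the results.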