Re: Questions for facets search

2014-08-12 Thread Ralf Heyde
For the first question: at the Solr level, I guess you could select (only) that document by its unique ID. Then you have the facets for that particular document. But this costs one additional query per document. Sent from my BlackBerry 10 smartphone.   Original Message   From: Sheng Sent: Tuesday, 12 August 2

Questions for facets search

2014-08-12 Thread Sheng
I actually have 2 questions: 1. Is it possible to get the facet label for a particular document? The reason we want this is we'd like to allow users to see tags for each hit in addition to the taxonomy for his/her search. 2. Is it possible to re-index the facet cache without reindexing the whole

RE: BitSet in Filters

2014-08-12 Thread Uwe Schindler
Hi, in general you cannot cache Filters, but you can cache their DocIdSets (CachingWrapperFilter, for example, does this). Lucene queries are executed per segment, which means that when you index new documents or update existing ones, Lucene creates new index segments. Older ones *never* change, so a Do
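
To illustrate the per-segment idea described in this message, here is a minimal sketch in plain Java. It does not use Lucene's API: the class name PerSegmentBitSetCache, the generic segment key, and the compute function are all hypothetical stand-ins. The only assumption taken from the message is that a segment never changes once written, so a bit set computed for it can be reused until the segment itself goes away.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/**
 * Hypothetical sketch, not Lucene's API: caches one BitSet per segment key.
 * Assumes segments are immutable, so a cached BitSet never becomes stale.
 */
class PerSegmentBitSetCache<S> {
    // Weak keys: an entry can be collected once its segment is no longer referenced.
    private final Map<S, BitSet> cache = new WeakHashMap<>();
    private final Function<S, BitSet> compute;
    private int computations = 0;

    PerSegmentBitSetCache(Function<S, BitSet> compute) {
        this.compute = compute;
    }

    /** Returns the cached bits for this segment, computing them on first use only. */
    synchronized BitSet get(S segment) {
        BitSet bits = cache.get(segment);
        if (bits == null) {
            bits = compute.apply(segment);
            computations++;
            cache.put(segment, bits);
        }
        return bits;
    }

    /** Number of times the (expensive) compute function actually ran. */
    synchronized int computations() {
        return computations;
    }
}
```

With this shape, adding documents only costs a computation for the newly created segment; the bit sets of older segments are served from the cache.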

Re: BitSet in Filters

2014-08-12 Thread Sandeep Khanzode
Hi Erick, I have mentioned everything that is relevant, I believe :). However, just to give more background: Assume documents of the order of more than 300 million, and multiple concurrent users running search. I may front Lucene with ElasticSearch, and ES basically calls Lucene TermFilters. My

Re: Can't get case insensitive keyword analyzer to work

2014-08-12 Thread Milind
Thanks Christoph, So it seems that tokenized has been conflated with analyzed. I just looked at the Javadocs and that's what it mentions. I had read it earlier, but it hadn't registered. I wonder why it's not called setAnalyzed. Thanks again. On Tue, Aug 12, 2014 at 3:07 AM, Christoph Kaser < c

Re: BitSet in Filters

2014-08-12 Thread Erick Erickson
bq: Unless, I can cache these filters in memory, the cost of constructing this filter at run time per query is not practical. Why do you say that? Do you have evidence? Because lots and lots of Solr installations do exactly this and they run fine. So I suspect there's something you're not telling

Re: Problem of calling indexWriterConfig.clone()

2014-08-12 Thread Michael McCandless
IWC.clone is/was buggy ... just stop calling it and make a new IWC from scratch each time in your application. Mike McCandless http://blog.mikemccandless.com On Tue, Aug 12, 2014 at 8:37 AM, Sheng wrote: > I think what you suggest probably will work, and I appreciate that. What I > am a little

RE: escaping characters

2014-08-12 Thread Uwe Schindler
See Javadocs of QueryParser: NOTE: You must specify the required Version compatibility when creating QueryParser: - As of 3.1, QueryParserBase.setAutoGeneratePhraseQueries(boolean) is false by default. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u

Re: escaping characters

2014-08-12 Thread Jack Krupansky
The default changed to "false" in Lucene 3.1. Before that it was "true". -- Jack Krupansky -Original Message- From: Chris Salem Sent: Tuesday, August 12, 2014 8:34 AM To: java-user@lucene.apache.org Subject: RE: escaping characters Thanks! That worked. We recently upgraded from 2.9

Problem of calling indexWriterConfig.clone()

2014-08-12 Thread Sheng
I think what you suggest probably will work, and I appreciate that. What I am a little concerned about is whether IndexWriterConfig is completely stateless or not, meaning if I clone from the very original IndexWriterConfig, will I lose some info from the breakpoint? Maybe I don't need to worry about it, s

RE: escaping characters

2014-08-12 Thread Chris Salem
Thanks! That worked. We recently upgraded from 2.9 to 4.9, was true the default in 2.9? -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Monday, August 11, 2014 5:54 PM To: java-user@lucene.apache.org Subject: Re: escaping characters You need to manually e

Re: Can't get case insensitive keyword analyzer to work

2014-08-12 Thread Jack Krupansky
And unfiltered. So even if you use the keyword tokenizer that only generates a single token, you still want token filtering, such as lower case. -- Jack Krupansky -Original Message- From: Christoph Kaser Sent: Tuesday, August 12, 2014 3:07 AM To: java-user@lucene.apache.org Subject: R
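
As a rough illustration of the point above (one token for the whole value, plus a lowercase filter), here is a plain-Java sketch. It is not Lucene code: the class KeywordLowercase and its methods are hypothetical, and it only mimics the effect of a keyword tokenizer followed by lowercasing, applied identically at index time and at query time.

```java
import java.util.Locale;

/**
 * Hypothetical sketch, not Lucene's API: mimics "keyword tokenizer + lowercase filter".
 * The whole input stays a single token; only its case is normalized.
 */
class KeywordLowercase {
    /** Treats the entire input as one token and lowercases it. */
    static String analyze(String input) {
        return input.toLowerCase(Locale.ROOT);
    }

    /** Matches only if both sides were normalized the same way. */
    static boolean matches(String indexedValue, String queryValue) {
        return analyze(indexedValue).equals(analyze(queryValue));
    }
}
```

The match succeeds regardless of case but still requires the full value to agree, because the input is never split into several tokens.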

Re: Problem of calling indexWriterConfig.clone()

2014-08-12 Thread Michael McCandless
We've removed IndexWriterConfig.clone as of 4.9: https://issues.apache.org/jira/browse/LUCENE-5708 Cloning of those complex / expert classes was buggy and too hairy to get right. You just have to make a new IWC every time you make an IW. Mike McCandless http://blog.mikemccandless.com On

Re: Can't get case insensitive keyword analyzer to work

2014-08-12 Thread Christoph Kaser
Hello Milind, if you don't set the field to be tokenized, no analyzer will be used and the field's contents will be stored "as-is", i.e. case sensitive. It's the analyzer's job to tokenize the input, so if you use an analyzer that does not separate the input into several tokens (like the Keywo