Re: What is the proper use of stop words in Lucene?

2014-04-23 Thread Ahmet Arslan
Hi, I think your final goal is not fully related to stop word elimination. I would use synonyms instead of setEnablePositionIncrements. Alternatively, assuming that you have a list of stop words, you may simulate the previous behavior of setEnablePositionIncrements(false) via org.apache.lucene.analysis.Ma

CheckIndex fails with missing .si file?

2014-04-23 Thread Ryan McKinley
I am trying to debug an issue we are seeing with various deployments of Solr with Lucene 4.6. We are seeing many errors like: ERROR: could not read any segments file in directory java.nio.file.NoSuchFileException: /Users/ryan/Downloads/indexV2/v0/index/_ of.si The file listing looks like this:

What is the proper use of stop words in Lucene?

2014-04-23 Thread Chris Tomlinson
Hello, I've written several times now on the list with this question / problem and no one has yet replied so I don't know if the question is too wrong-headed or if there is simply no one reading the list that can comment on the question. The question that I'm trying to get answered is what is t

Re: Getting multi-values to use in filter?

2014-04-23 Thread Rob Audenaerde
Thanks for all the questions, gives me an opportunity to clarify it :) I want the user to be able to give a (simple) formula (so I don't know it beforehand) and use that formula in the search. The Javascript expressions are really powerful in this use case, but have the single-value limitation.

Re: Getting multi-values to use in filter?

2014-04-23 Thread Shai Erera
A NumericDocValues field can only hold one value. Have you thought about encoding the values in a BinaryDocValues field? Or are you talking about multiple fields (different names), each has its own single value, and at search time you sum the values from a different set of fields? If it's one fiel
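The BinaryDocValues suggestion above can be illustrated without any Lucene wiring. What follows is a minimal sketch, assuming the values of one document are packed as fixed-width 8-byte longs; the class name `PackedLongs` and its methods are hypothetical and are not part of Lucene. In an actual index the byte array would presumably be stored in a BinaryDocValuesField and read back per document as a BytesRef (bytes, offset, length); that part is omitted here.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not a Lucene class: packs several long values
// into one byte array so they could be stored in a single binary field.
public class PackedLongs {

    /** Packs the given values, 8 bytes per value, in big-endian order. */
    public static byte[] encode(long... values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
        for (long v : values) {
            buf.putLong(v);
        }
        return buf.array();
    }

    /** Sums the packed values found in bytes[offset, offset + length). */
    public static long sum(byte[] bytes, int offset, int length) {
        if (length % Long.BYTES != 0) {
            throw new IllegalArgumentException("length must be a multiple of 8");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes, offset, length);
        long total = 0;
        while (buf.hasRemaining()) {
            total += buf.getLong();
        }
        return total;
    }

    public static void main(String[] args) {
        // The example from the original question: field=60 and field=40.
        byte[] packed = encode(60L, 40L);
        System.out.println(sum(packed, 0, packed.length) == 100L);
    }
}
```

A filter or collector could then compare the decoded sum against the requested value for each candidate document. Fixed-width packing is chosen here only for simplicity; a variable-length encoding would save space at the cost of a slightly more involved decoder.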

Re: Getting multi-values to use in filter?

2014-04-23 Thread Rob Audenaerde
Hi Shai, all, I am trying to write that Filter :). But I'm a bit at a loss as to how to efficiently grab the multi-values. I can access the context.reader().document() that accesses the stored fields, but that seems slow. For single-value fields I use a compiled JavaScript Expression with simplebinding

Re: Getting multi-values to use in filter?

2014-04-23 Thread Shai Erera
You can do that by writing a Filter which returns matching documents based on a sum of the field's value. However I suspect that is going to be slow, unless you know that you will need several such filters and can cache them. Another approach would be to write a Collector which serves as a Filter,

Re: Getting multi-values to use in filter?

2014-04-23 Thread Rob Audenaerde
Hi Mike, Thanks for your reply. I think it is not so much an invalid use case for Lucene. Lucene already has (experimental) support for Dynamic Range Facets, expressions (javascript expressions, geospatial haversin etc. etc). These are all computed on the fly; and work really well. They just depe

Re: Getting multi-values to use in filter?

2014-04-23 Thread Michael Sokolov
This isn't really a good use case for an index like Lucene. The most essential property of an index is that it lets you look up documents very quickly based on *precomputed* values. -Mike On 04/23/2014 06:56 AM, Rob Audenaerde wrote: Hi all, I'm looking for a way to use multi-values in a f

Getting multi-values to use in filter?

2014-04-23 Thread Rob Audenaerde
Hi all, I'm looking for a way to use multi-values in a filter. I want to be able to search on sum(field)=100, where field has values in one document: field=60 field=40 In this case 'field' is a LongField. I examined the code in the FieldCache, but that seems to focus on single-valued fields o