Difference between StoredField vs Other Fields with Field.Store.YES

2015-03-11 Thread Gimantha Bandara
Hi all, Is there a difference between using StoredField and using other types of fields with Field.Store.YES? Another question: is it good practice to use NumericDocValuesField instead of the usual fields (IntField, LongField, StringField, etc.) with Field.Store.NO? -- Gimantha Bandara Sof

Re: get the DocsEnum in lucene4.10.3

2015-03-11 Thread wangdong
Can anybody help me? I am confused about the API in Lucene 4.10.3. I want to get the DocsEnum object and iterate over the docs and their frequencies for a specific term. I already have the IndexReader and IndexSearcher. What can I do? Thanks in advance! andrew ---

Re: Sampled Hit counts using Lucene Facets.

2015-03-11 Thread Shai Erera
OK yes then sampling isn't the right word. So what you would want to have is an API like "count facets in N buckets between a range of [min..max] values". That would create the ranges for you and then you would be able to use the RangeFacetCounts as usual. Would you like to open a JIRA issue and post

Re: Sampled Hit counts using Lucene Facets.

2015-03-11 Thread Gimantha Bandara
Hi Shai, Yes, bucketing is the word :) IMO it would be better if bucketing were moved to a utility class. I'll create a JIRA issue and provide a patch. Thanks! On Wed, Mar 11, 2015 at 4:33 PM, Shai Erera wrote: > OK yes then sampling isn't the right word. So what you would want to have > is API li
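
For readers following this thread, the bucketing idea ("N buckets between [min..max]") can be sketched in plain Java. This is only an illustration under stated assumptions: RangeBuckets and split are hypothetical names, not part of Lucene and not the patch discussed above, and the arithmetic assumes (max - min + 1) * n fits in a long.

```java
/**
 * Hypothetical helper (not a Lucene class): splits the inclusive range
 * [min..max] into n contiguous, non-overlapping buckets.
 */
final class RangeBuckets {

    private RangeBuckets() {}

    /**
     * Returns n rows of {lowInclusive, highInclusive} that together cover
     * [min..max] exactly. Assumes (max - min + 1) * n does not overflow a long.
     */
    static long[][] split(long min, long max, int n) {
        if (n < 1 || max < min) {
            throw new IllegalArgumentException("need n >= 1 and max >= min");
        }
        long span = max - min + 1; // number of distinct values in the range
        if (n > span) {
            throw new IllegalArgumentException("more buckets than values");
        }
        long[][] buckets = new long[n][2];
        for (int i = 0; i < n; i++) {
            // Integer division spreads any remainder across the buckets.
            buckets[i][0] = min + span * i / n;
            buckets[i][1] = min + span * (i + 1) / n - 1;
        }
        return buckets;
    }
}
```

For example, RangeBuckets.split(0, 99, 4) yields the pairs {0, 24}, {25, 49}, {50, 74} and {75, 99}. Each pair could then be turned into an inclusive range for range faceting (RangeFacetCounts, as mentioned above); that wiring is left out here because it depends on the Lucene version in use.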

Re: get the DocsEnum in lucene4.10.3

2015-03-11 Thread Ian Lea
Take a look at the first section of https://lucene.apache.org/core/4_10_3/MIGRATE.html. There's probably something there that will help you. -- Ian. On Wed, Mar 11, 2015 at 11:03 AM, wangdong wrote: > Can anybody help me? > > >> I am confused about the api in lucene 4.10.3. >> >> I want to ge

Re: Filtering question

2015-03-11 Thread Ian Lea
Can you use a BooleanFilter (or ChainedFilter in 4.x) alongside your BooleanQuery? Seems more logical and I suspect would solve the problem. Caching filters can be good too, depending on how often your data changes. See CachingWrapperFilter. -- Ian. On Tue, Mar 10, 2015 at 12:45 PM, Chris Bamf

Re: Difference between StoredField vs Other Fields with Field.Store.YES

2015-03-11 Thread Ian Lea
> Is there a difference between using StoredField and using other types of > fields with Field.Store.YES? It will depend on what the other type of field is. As the javadoc for Field states, the xxxField classes are sugar. If you are doing standard things on standard data it's generally easier to

Re: Filtering question

2015-03-11 Thread Shai Erera
I don't see that you use acceptDocs in your MyNDVFilter. I think it would return false for all userB docs, but you should confirm that. Anyway, because you use an NDV field, you can't automatically skip unrelated documents, but rather your code would look something like: for (int i = 0; i < reade

Re: Lucene index

2015-03-11 Thread Michael McCandless
Lucene itself is not a graph database, but maybe look at http://neo4j.com which I think can index node properties into a Lucene index. For synonyms maybe look at Lucene's unit tests for SynonymFilter?: https://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_5_0_0/lucene/analysis/common/src/te

Re: Filtering question

2015-03-11 Thread Chris Bamford
Hi Shai, I thought that might be what acceptDocs was for, but in my case it is null and I get an NPE if I try your suggestion. What am I doing wrong? I'd really like to understand this stuff. Thanks, Chris > On 11 Mar 2015, at 13:05, Shai Erera wrote: > > I don't see that you use acceptDo

Re: Filtering question

2015-03-11 Thread Chris Bamford
Additional: I'm on Lucene 4.10.2. If I use a BooleanFilter as per Ian's suggestion, I still get a null acceptDocs passed to my NDV filter. > On 11 Mar 2015, at 17:19, Chris Bamford wrote: > > Hi Shai > > I thought that might be what acceptDocs was for, but in my cas

Re: AW: Lucene 4.x -> 5 : IllegalStateException while sorting

2015-03-11 Thread shamik
I'm facing a similar issue. I have a field which is used for result grouping. I did a rolling update from 4.7 to 5.0 and started getting the error on any group-by query --> "SolrDispatchFilter null:java.lang.IllegalStateException: unexpected docvalues type NONE for field 'ADSKDedup' (expected=

RE: Filtering question

2015-03-11 Thread Uwe Schindler
Hi, BooleanQuery: -- Clause 1: TermQuery -- Clause 2: FilteredQuery - Branch 1: MatchAllDocsQuery() - Branch 2: MyNDVFilter Why does it look like this? Clause 2 should simply be: ConstantScoreQuery(MyNDVFilter) In that case the BooleanQuery will execute more effectively, in case of 2 MU

RE: Filtering question

2015-03-11 Thread Uwe Schindler
Hi, In fact the FilteredQuery(MatchAllDocsQuery,...) with the filter should have been rewritten to a ConstantScoreQuery already, but for some unknown reason, Mike McCandless removed it in https://issues.apache.org/jira/browse/LUCENE-5418 Because of this it's better to do it like I said before (u

Lucene MMapDirectory: Mapping failure

2015-03-11 Thread Rahul Kotecha
Hi All, We have multiple indexes on our Linux system, each of which is a few gigabytes in size. We are facing a few issues while opening an IndexReader for some of those indexes. java.io.IOException: Map failed ,% STACK: sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:758)

Re: Filtering question

2015-03-11 Thread Chris Bamford
Hi Uwe Thanks for the suggestion, I tried to use a BooleanQuery with clause1 = termquery and clause2 = ConstantScoreQuery(MyNDVFilter), joined by SHOULD. I also applied the term filter at the top level (as before). Unfortunately it doesn't work in that the MyNDVFilter still receives null accep

RE: Lucene MMapDirectory: Mapping failure

2015-03-11 Thread Uwe Schindler
Hi, your ulimit settings look fine! One possible reason this may fail: could it be that you forget to close indexes while reopening them? This could keep mmapped files open for a very long time or possibly mmap them multiple times. As your number of open files limit is very large, it could take

RE: Filtering question

2015-03-11 Thread Uwe Schindler
Hi, > Thanks for the suggestion, I tried to use a BooleanQuery with clause1 = > termquery and clause2 = ConstantScoreQuery(MyNDVFilter), joined by > SHOULD. I also applied the term filter at the top level (as before). > Unfortunately it doesn't work in that the MyNDVFilter still receives null > ac