Re: ArrayIndexOutOfBoundsException: -65536 during full-import from old index

2017-09-05 Thread bebe1437
I figured out the problem: I wrote a custom NGramFilter which takes the token's length as the default maxGramSize, and some documents are full of nonsense data like 'xakldjfklajsdfklajdslkf'. When a token is too long for the NGramFilter to handle, it crashes the IndexWriter. -- Sent from: http://lucen
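
A rough sketch of why tying maxGramSize to the token length is risky, and one possible guard. This is plain Java, not Lucene's NGramTokenFilter: `NGramSketch`, `ngrams`, `cappedNgrams`, and the cap of 4 are all hypothetical names and values chosen for illustration. Without a cap, the number of grams grows quadratically with the token length.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a custom n-gram filter; this is NOT Lucene's NGramTokenFilter.
final class NGramSketch {

    // Emits every substring of token whose length lies in [minGram, maxGram].
    static List<String> ngrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < token.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= token.length(); len++) {
                grams.add(token.substring(start, start + len));
            }
        }
        return grams;
    }

    // Guard (an assumption, not the poster's fix): maxGram follows the token
    // length only up to a fixed cap, so long junk tokens stay cheap.
    static List<String> cappedNgrams(String token, int minGram, int maxGramCap) {
        int maxGram = Math.min(token.length(), maxGramCap);
        return ngrams(token, minGram, maxGram);
    }

    public static void main(String[] args) {
        String junk = "xakldjfklajsdfklajdslkf"; // sample token quoted in the message
        System.out.println("uncapped grams: " + ngrams(junk, 1, junk.length()).size());
        System.out.println("capped grams:   " + cappedNgrams(junk, 1, 4).size());
    }
}
```

Whether a cap, a length filter ahead of the n-gram step, or cleaning the source data is the better remedy depends on the index; the sketch only shows the size difference.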

Re: How to load all document fields, together with facet fields?

2017-09-05 Thread Michael McCandless
You'll just have to add additional StoredField instances for all those facet fields as well. The FacetField is consumed as an inverted field and not directly stored, though you could do some work and reconstruct it from the binary doc values that the facets store. Mike McCandless http://blog.mike

Re: What is the fastest way to loop over all documents in an index?

2017-09-05 Thread Michael McCandless
You can call MultiFields.getLiveDocs(IndexReader) to get the bitset identifying which documents are not deleted. Mike McCandless http://blog.mikemccandless.com On Tue, Sep 5, 2017 at 2:54 PM, Mikhail Khludnev wrote: > You can call searcher.search() with MatchAllDocsQuery passing own collector
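
A minimal sketch of the loop this suggests, written without Lucene on the classpath. The `Bits` interface here is a local stand-in for org.apache.lucene.util.Bits, and `LiveDocsLoop`, `liveDocIds`, and `fromArray` are hypothetical helper names. In a real index the two inputs would presumably come from MultiFields.getLiveDocs(reader) and reader.maxDoc(); a null bitset means the index has no deletions.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for org.apache.lucene.util.Bits so the sketch compiles on its own.
interface Bits {
    boolean get(int index);
    int length();
}

final class LiveDocsLoop {

    // Collects every doc ID in [0, maxDoc) that is not deleted.
    // liveDocs == null is treated as "no deletions", so every ID is live.
    static List<Integer> liveDocIds(Bits liveDocs, int maxDoc) {
        List<Integer> ids = new ArrayList<>();
        for (int docId = 0; docId < maxDoc; docId++) {
            if (liveDocs == null || liveDocs.get(docId)) {
                ids.add(docId); // with Lucene, reader.document(docId) would be loaded here
            }
        }
        return ids;
    }

    // Hypothetical helper: wraps a boolean array as Bits for the demo.
    static Bits fromArray(final boolean[] live) {
        return new Bits() {
            @Override public boolean get(int index) { return live[index]; }
            @Override public int length() { return live.length; }
        };
    }

    public static void main(String[] args) {
        Bits liveDocs = fromArray(new boolean[] {true, false, true, true});
        System.out.println(liveDocIds(liveDocs, 4)); // doc 1 is marked deleted
        System.out.println(liveDocIds(null, 4));     // null bitset: nothing deleted
    }
}
```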

Re: Open IndexWriter to prior commit

2017-09-05 Thread Michael McCandless
DirectoryReader.listCommits is what you are looking for! Mike McCandless http://blog.mikemccandless.com On Tue, Sep 5, 2017 at 4:22 PM, Bryan Bende wrote: > I was reading this blog post about Lucene transactions (thank you to > Mike for writing this): > http://blog.mikemccandless.com/2012/03/t

Open IndexWriter to prior commit

2017-09-05 Thread Bryan Bende
I was reading this blog post about Lucene transactions (thank you to Mike for writing this): http://blog.mikemccandless.com/2012/03/transactional-lucene.html I'm interested in the part that references distributed transactions and says: "if Lucene completed its 2nd phase commit but the database's

Re: What is the fastest way to loop over all documents in an index?

2017-09-05 Thread Mikhail Khludnev
You can call searcher.search() with a MatchAllDocsQuery, passing your own collector implementation, which will be notified about every non-deleted doc via collect(docId). On Tue, Sep 5, 2017 at 3:09 AM, Jean Claude van Johnson < vanjohnsonjeancla...@gmail.com> wrote: > Hi there, > > I have a use case where I need

Re: Re: What is the fastest way to loop over all documents in an index?

2017-09-05 Thread Ishan Chattopadhyaya
I believe that's the case. Leave the deleted docs out, though (which can be computed by intersecting with some other bitset). On Tue, Sep 5, 2017 at 2:04 PM, Ahmet Arslan wrote: > > Hi Ishan, > > I saw the following loop suggested for this task on Stack Overflow. > > for (int i=0; i > How ca

Re: Document updating and SortedSetDocValuesFacetField

2017-09-05 Thread Bryan Bende
I came across this same exception when I performed a query that faceted on a field that had no documents in the index with that field. One simple case was attempting to perform faceting on an empty index. Is it possible that no documents in your index have a value for "facet_category" at the ti

Re: Re: What is the fastest way to loop over all documents in an index?

2017-09-05 Thread Ahmet Arslan
Hi Ishan, I saw the following loop suggested for this task on Stack Overflow. for (int i=0; i wrote: Maybe IndexReader#document(), looping over docids, is the best here? http://lucene.apache.org/core/6_6_0/core/org/apache/lucene/index/IndexReader.html#document-int- On Tue, Sep 5, 2017 a