I opened https://issues.apache.org/jira/browse/LUCENE-5090

Kaze, if you could try out that patch and see if it throws a better
exception in your case that would be great ...

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jul 3, 2013 at 4:16 PM, Michael McCandless
<luc...@mikemccandless.com> wrote:
> Hmm, not good.
>
> One trickiness with SSDVA is that you must create a new
> SortedSetDocValuesReaderState every time you open a new IndexReader.
>
> If you don't do this correctly, e.g. you use the SSDVReaderState from
> an old reader, then it can lead to exceptions like this.
>
> Is it possible that's happening in your case?
>
> We should add a check for this in the code so you get a better
> exception ... I'll open an issue.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, Jul 3, 2013 at 2:52 PM, Kaze <kaze_da...@hotmail.com> wrote:
>> Hello,
>>
>> I'm a novice Lucene user and just started using it to do some prototyping
>> for my project.
>>
>> I noticed SortedSetDocValues was introduced in 4.3.0 that allows faceted
>> search without a dedicated taxonomy index. I've successfully used it to
>> perform faceting on a small index (~3000 documents, ~400 bytes per doc).
>> But when I loaded a bigger index (~50000 documents), I started getting
>> ArrayIndexOutOfBounds exception when SortedSetDocValuesAccumulator performs
>> aggregation.
>>
>> Specifically, it errors out on line 139 where it tries to migrate segment
>> ordinals to global ordinals. I've poked around and did some debugging; the
>> following is my finding.
>>
>> The smaller index only had one segment when initially loaded, while the
>> bigger one had multiple. My test suite consists of some searches on the
>> index with occasional updates to the index. The error only happens when I
>> do a faceted search immediately following an update to the index.
>>
>> Then I tried forcing a merge of the segments for the larger index as the
>> final step of initial indexing. So when I initially loaded the index
>> afterwards, there was only one segment. This time there were no errors,
>> even though it was the same set of documents. Interestingly, even though
>> segments are created as I do updates on the index as part of my test suite,
>> no errors crop up afterwards. I can add that I've only seen issues with 3
>> or more segments, while 2 seems to work. I don't know why this would be
>> the case but these are my observations.
>>
>> Let me know if there is some standard way to report bugs that I should
>> follow. I've checked out the JIRA page for Lucene, but it looked more like
>> a "find bugs, create issue, fix it, upload patch", where the issue creator
>> fixes the bug. I have a long ways to go before I understand the low level
>> implementation to apply a fix :(
>>
>> Thanks
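
For anyone finding this thread later: the advice above (build a new SortedSetDocValuesReaderState every time a new IndexReader is opened) might be sketched roughly as below. This is only an illustration of the pairing idea, not code from Lucene or from the patch. ReaderAndState, ReaderStateHolder and swapIfChanged are hypothetical names, and the sketch uses plain generics instead of the Lucene classes so it compiles on its own. In a real application R would presumably be the reader type, S the SortedSetDocValuesReaderState, and the factory would wrap the state's constructor (which can throw IOException, so it would need wrapping to fit java.util.function.Function).

```java
import java.util.function.Function;

/** Immutable pair: a reader plus the state that was built from exactly that reader. */
final class ReaderAndState<R, S> {
    final R reader;
    final S state;

    ReaderAndState(R reader, S state) {
        this.reader = reader;
        this.state = state;
    }
}

/**
 * Hypothetical holder that publishes reader and state together, so a search
 * can never combine a freshly opened reader with state from an older one.
 */
final class ReaderStateHolder<R, S> {
    private final Function<R, S> stateFactory;
    private volatile ReaderAndState<R, S> current;

    ReaderStateHolder(R initialReader, Function<R, S> stateFactory) {
        this.stateFactory = stateFactory;
        this.current = new ReaderAndState<>(initialReader, stateFactory.apply(initialReader));
    }

    /** Searches should take one snapshot and use both fields from it. */
    ReaderAndState<R, S> current() {
        return current;
    }

    /**
     * Call after attempting a reopen. A null argument (assumed here to mean
     * "nothing changed") or the same reader instance keeps the existing pair;
     * any other reader gets a freshly built state.
     */
    synchronized boolean swapIfChanged(R maybeNewReader) {
        if (maybeNewReader == null || maybeNewReader == current.reader) {
            return false;
        }
        current = new ReaderAndState<>(maybeNewReader, stateFactory.apply(maybeNewReader));
        return true;
    }
}

final class ReaderStatePairingDemo {
    public static void main(String[] args) {
        // Strings stand in for readers; the "state" just records its source reader.
        ReaderStateHolder<String, String> holder =
            new ReaderStateHolder<>("reader-1", r -> "state-for-" + r);

        holder.swapIfChanged("reader-2"); // simulated reopen after an index update

        ReaderAndState<String, String> snapshot = holder.current();
        System.out.println(snapshot.reader + " / " + snapshot.state);
    }
}
```

The design choice is that the state is never stored or cached separately from its reader: each search takes one snapshot from current() and passes both the reader and the state from that same snapshot to the faceting code.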