OK I have a small test case showing the issue! I opened https://issues.apache.org/jira/browse/LUCENE-7491
Thanks for reporting this, Hans.

Mike McCandless

http://blog.mikemccandless.com

On Tue, Oct 11, 2016 at 12:08 PM, Hans Lund <ha.l...@gmail.com> wrote:
> hmm you're right - when it revealed a bug in our indexing code I stopped
> wondering ;-) but now I tried to create small tests to show the behavior -
> until now without success. I'm pretty sure that I can reproduce it by
> re-introducing our index bug, unfortunately it occurs after some hours
> parsing and indexing wikipedia dumps - but from there I'll try simplifying a
> test reproducing the setup.
>
> The setup we use is quite straightforward, using MMapDirectory and a NRT
> setup - the only tailored functionality is our own IndexDeletionPolicy using
> an added timestamp in userdata for the index commit, keeping a number of
> snapshots but honoring a max retention period, not that I suspect it to be
> the cause - but if fieldinfos from another snapshot are used in the merge
> that could cause problems
>
> Hans Lund
>
> On Tue, Oct 11, 2016 at 12:07 PM, Michael McCandless
> <luc...@mikemccandless.com> wrote:
>>
>> Hmm, that should be "OK" from Lucene's standpoint.
>>
>> I mean, it should not result in strange merge exceptions later on.
>>
>> I think there's a bug somewhere in Lucene's efforts to pretend it's
>> fully schema-less ... I'll try to reproduce this.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Tue, Oct 11, 2016 at 4:38 AM, Hans Lund <ha.l...@gmail.com> wrote:
>> > Turned out to be much simpler - we had added a new 'dynamic' field to
>> > a stats doc: a count of articles based on identified language code.
>> > Having a set of test documents in German, English, Swedish - no one had
>> > suspected the obvious: that the language detection categorized a single
>> > document as being Indonesian, making the stats count id:1.
>> >
>> > I realized that the debug output I added printed everything other than
>> > the interesting field (iterating over already added fields - not the
>> > field causing the error later on ;-)
>> >
>> > On Mon, Oct 10, 2016 at 4:32 PM, Adrien Grand <jpou...@gmail.com> wrote:
>> >
>> >> It looks like the field infos of your index went out of sync with data
>> >> stored in the files about points.
>> >>
>> >> Can you run CheckIndex on your index (potentially with the `-fast`
>> >> option so that it only verifies checksums)? It could be that one of
>> >> these two parts of the index got corrupted.
>> >>
>> >> Since you were able to modify the way add(IndexableField) is
>> >> implemented, I'm wondering if you are running a fork of Lucene? If yes,
>> >> maybe you did some changes that triggered this bug?
>> >>
>> >> Otherwise is your application:
>> >> - using IndexWriter.addIndexes?
>> >> - customizing merging in some way, eg. by wrapping the merge readers?
>> >>
>> >> On Tue, Oct 4, 2016 at 4:40 PM, Hans Lund <ha.l...@gmail.com> wrote:
>> >>
>> >> > After upgrading to 6.2 we are having problems during merges (after
>> >> > running for a while).
>> >> >
>> >> > When the problem occurs it's always complaining about the same field -
>> >> > and throws:
>> >> >
>> >> > java.lang.IllegalArgumentException: field="id" did not index point values
>> >> >   at org.apache.lucene.codecs.lucene60.Lucene60PointsReader.getBKDReader(Lucene60PointsReader.java:126)
>> >> >   at org.apache.lucene.codecs.lucene60.Lucene60PointsReader.size(Lucene60PointsReader.java:224)
>> >> >   at org.apache.lucene.codecs.lucene60.Lucene60PointsWriter.merge(Lucene60PointsWriter.java:169)
>> >> >   at org.apache.lucene.index.SegmentMerger.mergePoints(SegmentMerger.java:173)
>> >> >   at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:122)
>> >> >   at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4312)
>> >> >   at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3889)
>> >> >
>> >> > To figure out where we messed up - I have added some ugly logging to
>> >> > Document:
>> >> >
>> >> > public final void add(IndexableField field) {
>> >> >   if ("id".equals(field.name())
>> >> >       && field.fieldType().pointDimensionCount() != 0) {
>> >> >     System.err.println("Point value detected");
>> >> >     for (IndexableField i : fields) {
>> >> >       System.err.println(i);
>> >> >     }
>> >> >   }
>> >> >   fields.add(field);
>> >> > }
>> >> >
>> >> > In hope to intercept the document we messed up.
>> >> >
>> >> > But to my surprise toString on the suspected field just says
>> >> > (contains a URN):
>> >> >
>> >> > indexed,omitNorms,indexOptions=DOCS<id:urn:wiki:doc:YEL:57028#1-1>
>> >> >
>> >> > So any hints as to why field.fieldType().pointDimensionCount() != 0
>> >> >
>> >> > and any suggestions what might cause this?
>> >> >
>> >> > Regards
>> >> > Hans Lund
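
The root cause Hans describes in the thread (a dynamic per-language count field that ended up named `id` once a document was detected as Indonesian, clashing with the existing `id` field) can be guarded against in application code. The following is only a minimal sketch under stated assumptions: `DynamicFieldGuard`, the `lang_count_` prefix, and the reserved-name set are hypothetical and do not come from the thread or from Lucene; only the field name `id` and the language code `id` are taken from the discussion.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: derives names for dynamic
// per-language count fields so they cannot clash with fixed schema fields.
final class DynamicFieldGuard {

    // Assumption: the fixed field names the application already uses.
    // Only "id" is known from the thread.
    private static final Set<String> RESERVED =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("id")));

    // Assumption: any prefix that no fixed field name starts with would do.
    private static final String PREFIX = "lang_count_";

    private DynamicFieldGuard() {}

    /** True if using the raw language code as a field name would clash. */
    static boolean collides(String languageCode) {
        return RESERVED.contains(languageCode);
    }

    /** Namespaced field name for the per-language article count. */
    static String countFieldName(String languageCode) {
        if (languageCode == null || languageCode.isEmpty()) {
            throw new IllegalArgumentException("language code must not be empty");
        }
        return PREFIX + languageCode;
    }
}
```

With this sketch, `DynamicFieldGuard.collides("id")` is true for the Indonesian language code, while `DynamicFieldGuard.countFieldName("id")` yields `lang_count_id`, which no longer shares a name with the document `id` field.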