You might be hitting a rounding error. When this happens, how many deleted documents are there in the remaining segments? 1?
The calculation for whether to merge the segment is:

    double pctDeletes = 100. * ((double) deleted_docs_in_segment / (double) doc_count_in_segment_including_deleted_docs);
    if (pctDeletes > forceMergeDeletesPctAllowed) {merge the segment}

At any rate, calling findForcedMerges instead will purge all deleted docs no matter what.

NOTE: as of 7.5, the behavior has changed in that both of these methods respect the maximum segment size by default. Prior to 7.5, either of these could produce a single segment from all the segments that were merged (all of them in forceMerge, all with > n% deleted docs in forceMergeDeletes). If you require a single segment to result, you can specify the maxSegmentCount as 1.

See LUCENE-7976 for all the gory details of this change if you're curious.

Best,
Erick

On Fri, Sep 28, 2018 at 5:41 AM Rob Audenaerde <rob.audenae...@gmail.com> wrote:
>
> Hi all,
>
> We build an FST on the terms of our index by iterating the terms of the
> readers for our fields, like this:
>
>     for (final LeafReaderContext ctx : leaves) {
>         final LeafReader leafReader = ctx.reader();
>
>         for (final String indexField : indexFields) {
>             final Terms terms = leafReader.terms(indexField);
>             // If the field does not exist in this reader, then we get null, so check for that.
>             if (terms != null) {
>                 final TermsEnum termsEnum = terms.iterator();
>
> However, the building of the FST sometimes seems to find terms that are
> from documents that are deleted. This is what we expect, checking the
> javadocs.
>
> So, now we switched the IndexWriter to a config with a TieredMergePolicy
> with: setForceMergeDeletesPctAllowed(0).
>
> When calling indexWriter.forceMergeDeletes(true) we expect that there will
> be no more deletes. However, the deleted terms still sometimes appear. We
> use DirectoryReader.openIfChanged() to refresh the reader before
> iterating the terms.
>
> Are we forgetting something?
>
> Thanks in advance.
> Rob Audenaerde
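
For reference, the merge check described in the reply can be sketched as standalone Java. This is only an illustration of the formula as stated in this thread, not Lucene's actual source: the class name DeletesPctCheck and the methods pctDeletes and shouldMerge are hypothetical, and no Lucene classes are involved. The point to notice is that the comparison is a strict greater-than, so with a threshold of 0 any segment holding at least one deleted document should qualify, while a segment sitting exactly at the threshold does not.

```java
// Hypothetical standalone sketch of the check described in the reply;
// it does not call into Lucene and the names are made up for illustration.
public class DeletesPctCheck {

    // Percentage of deleted docs in a segment, computed as in the reply:
    // 100 * deleted / (doc count including deleted docs).
    static double pctDeletes(int deletedDocs, int docCountIncludingDeleted) {
        return 100.0 * ((double) deletedDocs / (double) docCountIncludingDeleted);
    }

    // A segment is selected for merging only when the percentage is
    // strictly greater than the allowed threshold.
    static boolean shouldMerge(int deletedDocs, int docCountIncludingDeleted,
                               double forceMergeDeletesPctAllowed) {
        return pctDeletes(deletedDocs, docCountIncludingDeleted) > forceMergeDeletesPctAllowed;
    }

    public static void main(String[] args) {
        // Threshold 0: a single deleted doc in a large segment still qualifies.
        System.out.println(shouldMerge(1, 1_000_000, 0.0)); // true
        // Threshold 0: a segment with no deletes is left alone.
        System.out.println(shouldMerge(0, 1_000, 0.0));     // false
        // Threshold 10: 5% deleted is below the threshold.
        System.out.println(shouldMerge(5, 100, 10.0));      // false
        // Threshold 10: 20% deleted is above the threshold.
        System.out.println(shouldMerge(20, 100, 10.0));     // true
    }
}
```

Under these assumptions, a segment with even one deleted document passes the check when the threshold is 0, which is why the count of deleted documents left in the remaining segments is the useful thing to look at.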