s1monw commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1423637103
##########
lucene/core/src/java/org/apache/lucene/index/IndexingChain.java:
##########
@@ -219,15 +222,33 @@ private Sorter.DocMap maybeSortSegment(SegmentWriteState state) throws IOExcepti
}
LeafReader docValuesReader = getDocValuesLeafReader();
-
+ Function<IndexSorter.DocComparator, IndexSorter.DocComparator> comparatorWrapper = in -> in;
+
+ if (state.segmentInfo.getHasBlocks() && indexSort.getParentField() != null) {
+ final DocIdSetIterator readerValues =
+ docValuesReader.getNumericDocValues(indexSort.getParentField());
+ BitSet parents = BitSet.of(readerValues, state.segmentInfo.maxDoc());
+ comparatorWrapper =
+ in ->
+ (docID1, docID2) ->
+ in.compare(parents.nextSetBit(docID1), parents.nextSetBit(docID2));
+ }
+ assert state.segmentInfo.getHasBlocks() == false
+ || indexSort.getParentField() != null
+ || indexCreatedVersionMajor < Version.LUCENE_10_0_0.major
+ : "parent field is not set but the index has blocks. indexCreatedVersionMajor: "
+ + indexCreatedVersionMajor;
List<IndexSorter.DocComparator> comparators = new ArrayList<>();
for (int i = 0; i < indexSort.getSort().length; i++) {
SortField sortField = indexSort.getSort()[i];
IndexSorter sorter = sortField.getIndexSorter();
if (sorter == null) {
throw new UnsupportedOperationException("Cannot sort index using sort field " + sortField);
}
- comparators.add(sorter.getDocComparator(docValuesReader, state.segmentInfo.maxDoc()));
+
+ IndexSorter.DocComparator docComparator =
Review Comment:
I also thought a bit about other uses of this field that we should evaluate.
One of the main things that worries me is the fact that our delete API doesn't
give the guarantees that it should, IMO. Today you can delete a parent without
its children, which in turn silently and erroneously merges the orphaned
children into the adjacent block, and searches will return broken results.
With this field we can fix the way deletes are applied so that deleting a
parent also deletes all of its children, which is the right thing to do in
this case. There might be more use cases for this down the road, mainly for
index consistency.
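
To make the idea concrete, here is a minimal sketch of what "deleting a parent
also deletes its children" could look like. It is not the Lucene
implementation: it uses java.util.BitSet rather than Lucene's BitSet, the
class and method names (BlockDeleteSketch, deleteBlock) are hypothetical, and
it assumes the usual block layout in which children are indexed immediately
before their parent.

```java
import java.util.BitSet;

public class BlockDeleteSketch {

  /**
   * Marks the parent at parentDoc and all of its children as deleted.
   * Assumption: children precede their parent, so a block covers the docs
   * after the previous parent up to and including parentDoc.
   */
  public static void deleteBlock(BitSet parents, BitSet liveDocs, int parentDoc) {
    if (!parents.get(parentDoc)) {
      throw new IllegalArgumentException("doc " + parentDoc + " is not a parent");
    }
    // previousSetBit returns -1 when there is no earlier parent (including
    // when called with -1), so the first block starts at doc 0.
    int firstDocInBlock = parents.previousSetBit(parentDoc - 1) + 1;
    // The end index of clear(from, to) is exclusive, hence parentDoc + 1.
    liveDocs.clear(firstDocInBlock, parentDoc + 1);
  }

  public static void main(String[] args) {
    // Two blocks: docs 0-2 (parent is doc 2) and docs 3-5 (parent is doc 5).
    BitSet parents = new BitSet(6);
    parents.set(2);
    parents.set(5);

    BitSet liveDocs = new BitSet(6);
    liveDocs.set(0, 6);

    // Deleting parent 5 also removes its children 3 and 4; the first block
    // is left untouched.
    deleteBlock(parents, liveDocs, 5);
    System.out.println(liveDocs);
  }
}
```

Without the cascade, clearing only doc 5 would leave docs 3 and 4 live, and a
next-set-bit lookup on the live parents would no longer find their real
parent, which is the kind of broken block described above.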
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]