Hi all,

Is it possible to page through results when using an
EarlyTerminatingSortingCollector?

I'm using the following code with Lucene 5.5.0 to read results in pages of
10 documents each:

/** The Lucene field name */
private static final String FIELD_NAME = "id";

/** The Lucene field type */
private static final FieldType FIELD_TYPE = new FieldType();
static {
    FIELD_TYPE.setTokenized(true);
    FIELD_TYPE.setOmitNorms(true);
    FIELD_TYPE.setIndexOptions(IndexOptions.DOCS);
    FIELD_TYPE.setNumericType(FieldType.NumericType.INT);
    FIELD_TYPE.setDocValuesType(DocValuesType.NUMERIC);
    FIELD_TYPE.setStored(true);
    FIELD_TYPE.freeze();
}

public static void main(String[] args) throws Exception {

    // Sort to be used both with merge policy and queries
    Sort sort = new Sort(
            new SortedNumericSortField(FIELD_NAME, SortField.Type.INT));

    // Create directory
    RAMDirectory directory = new RAMDirectory();

    // Setup merge policy
    TieredMergePolicy tieredMergePolicy = new TieredMergePolicy();
    SortingMergePolicy sortingMergePolicy =
            new SortingMergePolicy(tieredMergePolicy, sort);

    // Setup index writer
    IndexWriterConfig indexWriterConfig =
            new IndexWriterConfig(new SimpleAnalyzer());
    indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
    indexWriterConfig.setMergePolicy(sortingMergePolicy);
    IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);

    // Index values
    for (int i = 1; i <= 1000; i++) {
        Document document = new Document();
        document.add(new IntField(FIELD_NAME, i, FIELD_TYPE));
        indexWriter.addDocument(document);
    }

    // Force index merge to ensure early termination
    indexWriter.forceMerge(1, true);
    indexWriter.commit();

    // Create index searcher
    IndexReader reader = DirectoryReader.open(directory);
    IndexSearcher searcher = new IndexSearcher(reader);

    // Paginated read
    int pageSize = 10;
    FieldDoc pageStart = null;
    while (true) {

        System.out.println(String.format(
                "\nCollecting page starting at: %s", pageStart));

        Query query = new MatchAllDocsQuery();

        TopFieldCollector tfc = TopFieldCollector.create(
                sort, pageSize, pageStart, true, false, false);
        EarlyTerminatingSortingCollector collector =
                new EarlyTerminatingSortingCollector(tfc, sort, pageSize, sort);
        searcher.search(query, collector);
        ScoreDoc[] scoreDocs = tfc.topDocs().scoreDocs;
        for (ScoreDoc scoreDoc : scoreDocs) {
            pageStart = (FieldDoc) scoreDoc;
            Document document = searcher.doc(scoreDoc.doc);
            System.out.println(String.format(
                    "FOUND %s -> %s", document, scoreDoc));
        }

        System.out.println(String.format(
                "Terminated early: %s", collector.terminatedEarly()));

        if (scoreDocs.length < pageSize) {
            break;
        }
    }

    // Close
    reader.close();
    indexWriter.close();
    directory.close();
}


However, the query for the second page doesn't return any results. I do
get the expected results when I don't wrap the TopFieldCollector in the
EarlyTerminatingSortingCollector.
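
One possible explanation, offered as an assumption rather than confirmed
Lucene behaviour: the wrapping collector may count every document it passes
to the inner collector, including those the paging cursor then discards, and
stop once it has seen pageSize of them. The toy model below uses plain Java
only, with no Lucene classes; the class PagingSketch and the method
collectPage are hypothetical names made up for illustration. It simulates a
single segment sorted ascending by id and shows how such counting would leave
the second page empty.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: it mimics the suspected interaction, not Lucene's code. */
public class PagingSketch {

    /**
     * Collects one page from a simulated segment holding ids 1..maxId in
     * ascending order.
     *
     * @param after          last id of the previous page, or null for page one
     * @param earlyTerminate if true, stop once pageSize documents have been
     *                       seen, whether or not the paging filter kept them
     *                       (ASSUMED behaviour of the wrapping collector)
     */
    public static List<Integer> collectPage(int maxId, int pageSize,
                                            Integer after, boolean earlyTerminate) {
        List<Integer> hits = new ArrayList<>();
        int seen = 0;
        for (int id = 1; id <= maxId; id++) {
            // Inner (paging) collector: drop ids at or before the cursor,
            // and keep at most pageSize of the remaining ids.
            boolean pastCursor = (after == null) || (id > after);
            if (pastCursor && hits.size() < pageSize) {
                hits.add(id);
            }
            // Outer (early-terminating) wrapper: counts every id it forwards.
            seen++;
            if (earlyTerminate && seen >= pageSize) {
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // First page, wrapped: ids 1..10
        System.out.println(collectPage(1000, 10, null, true));
        // Second page, wrapped: empty, because ids 1..10 used up the budget
        System.out.println(collectPage(1000, 10, 10, true));
        // Second page, unwrapped: ids 11..20
        System.out.println(collectPage(1000, 10, 10, false));
    }
}
```

If this model is right, the wrapped second page stops after ids 1 to 10,
all of which lie at or before the cursor, so nothing is collected. That would
match what I see, but it does not prove that Lucene works this way.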

Is there something I am missing? Is EarlyTerminatingSortingCollector not
compatible with paging?

Thanks in advance,

-- 
Andrés de la Peña

Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // @stratiobd <https://twitter.com/StratioBD>
