cpoerschke commented on a change in pull request #123: URL: https://github.com/apache/solr/pull/123#discussion_r632990383
########## File path: solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/FieldValueFeature.java ########## @@ -106,18 +106,29 @@ public FieldValueFeatureWeight(IndexSearcher searcher, /** * Return a FeatureScorer that uses docValues or storedFields if no docValues are present + * * @param context the segment this FeatureScorer is working with * @return FeatureScorer for the current segment and field * @throws IOException as defined by abstract class Feature */ @Override public FeatureScorer scorer(LeafReaderContext context) throws IOException { if (schemaField != null && !schemaField.stored() && schemaField.hasDocValues()) { - return new DocValuesFieldValueFeatureScorer(this, context, - DocIdSetIterator.all(DocIdSetIterator.NO_MORE_DOCS), schemaField.getType()); + + FieldInfo fieldInfo = context.reader().getFieldInfos().fieldInfo(field); + DocValuesType docValuesType = fieldInfo != null ? fieldInfo.getDocValuesType() : DocValuesType.NONE; + + if (DocValuesType.NUMERIC.equals(docValuesType) || DocValuesType.SORTED.equals(docValuesType)) { + return new DocValuesFieldValueFeatureScorer(this, context, + DocIdSetIterator.all(DocIdSetIterator.NO_MORE_DOCS), schemaField.getType(), docValuesType); + // If type is NONE, this segment has no docs with this field. That's not a problem, because we won't call score() anyway Review comment: Earlier today I had a thought on this code area here and wanted to note it down (for further pondering next week) but had to repair my bicycle's puncture first. Anyway ... The _"If type is NONE, this segment has no docs with this field. That's not a problem ..."_ comment here is very helpful. I wonder though if the _"because we won't call score() anyway"_ part remains accurate after the refactorings. In the branch's initial commit I think it was accurate i.e. via the `if (docValues != null` at the start of the `score()` implementation the inside of the `if` block wasn't called. But now there's something more interesting going on here: * If there's a segment whose documents all don't have the field then instead of a `DocValuesFieldValueFeatureScorer` the `FieldValueFeatureScorer` is used via the fall-through. * Conceptually that jumps out as untidy or unintuitive i.e. if there is no field for all the documents in the segment then (say) a `NoFieldValueFeatureScorer` is more obvious. * In terms of performance use of the `FieldValueFeatureScorer` should be more costly i.e. reading of the field would be attempted and nothing would be found. * In terms of test coverage this would not be caught since the default value is returned, albeit inefficiently, by the `FieldValueFeatureScorer` implementation. * In terms of use cases this code path may or may not be an edge case e.g. if a `FieldValueFeature` is for a field that is sparsely populated then a good proportion of segments might encounter the code path and reasons for sparseness may include (say) historical documents not having the field. _If_ that reasoning makes sense then a next step here could be to create and use a `NoFieldValueFeatureScorer` class. What do you think? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org