[ https://issues.apache.org/jira/browse/LUCENE-7354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15353400#comment-15353400 ]
Steve Rowe commented on LUCENE-7354:
------------------------------------
{quote}
In MoreLikeThis.java, circa line 763, when calling addTermFrequencies on a
Field object, we are incorrectly calling toString on the Field object, which
puts the Field attributes (indexed, stored, et. al) into the String that is
returned.
{quote}
I don't see this. When I run {{CloudMLTQParserTest}} without your patch and
look at {{MoreLikeThis.retrieveTerms()}}, where {{String.valueOf(fieldValue)}}
is called (by pulling the value of that expression out into a variable and
breaking there in the debugger), I only see the actual field values, with no
indexed, stored, et al.
Indexed, stored, et al. are Field*Type* attributes, not Field attributes,
right?
In {{CloudMLTQParser.parse()}}, where the filtered doc is composed, your patch
has a nocommit (the only one I see in the patch). {{Field.stringValue()}}
returns {{value.toString()}}, but only if the value is a String or a Number,
and otherwise null, so it's definitely possible to not have a string value for
binary fields or geo fields. I guess the question is whether people want to
use non-text/non-scalar fields for MLT:
{code:java}
for (String field : fieldNames) {
  Collection<Object> fieldValues = doc.getFieldValues(field);
  if (fieldValues != null) {
    Collection<String> strings = new ArrayList<>(fieldValues.size());
    for (Object value : fieldValues) {
      if (value instanceof Field) {
        String sv = ((Field) value).stringValue();
        if (sv != null) {
          strings.add(sv);
        } // TODO: nocommit: what to do when we don't have StringValue? I
          // don't think it is possible in this case, but need to check on this
      } else {
        strings.add(value.toString());
      }
    }
    filteredDocument.put(field, strings);
  }
}
{code}
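To make the null case concrete, here is a minimal, self-contained sketch.
{{FakeField}} is a hypothetical stand-in, not Lucene's {{Field}}; it only
mimics the {{stringValue()}} behavior described above (a string for String or
Number values, null otherwise). Skipping null values is just one possible
policy here, not necessarily what the patch should do:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class StringValueSketch {

  // Hypothetical stand-in for a Lucene Field; NOT the real class.
  static final class FakeField {
    private final Object value;

    FakeField(Object value) {
      this.value = value;
    }

    // Assumption: mirrors the behavior described in the comment above,
    // i.e. a string for String or Number values and null for anything else.
    String stringValue() {
      if (value instanceof String || value instanceof Number) {
        return value.toString();
      }
      return null;
    }
  }

  // Same shape as the inner loop in the patch, with the null case made explicit.
  static List<String> toStrings(Collection<Object> fieldValues) {
    List<String> strings = new ArrayList<>(fieldValues.size());
    for (Object value : fieldValues) {
      if (value instanceof FakeField) {
        String sv = ((FakeField) value).stringValue();
        if (sv != null) {
          strings.add(sv);
        }
        // else: no string value (e.g. binary); skipped in this sketch
      } else {
        strings.add(value.toString());
      }
    }
    return strings;
  }

  public static void main(String[] args) {
    Collection<Object> values = Arrays.<Object>asList(
        new FakeField("lucene"),          // text value
        new FakeField(42),                // numeric value
        new FakeField(new byte[] {1, 2}), // binary value, stringValue() is null
        "plain");                         // not a field at all
    System.out.println(toStrings(values)); // prints [lucene, 42, plain]
  }
}
```

Under these assumptions the binary value is silently dropped, so a document
whose MLT fields are all binary would contribute no terms at all.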
> MoreLikeThis incorrectly does toString on Field object
> ------------------------------------------------------
>
> Key: LUCENE-7354
> URL: https://issues.apache.org/jira/browse/LUCENE-7354
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 6.0.1, 5.5.1, master (7.0)
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Attachments: LUCENE-7354-mlt-fix
>
>
> In MoreLikeThis.java, circa line 763, when calling addTermFrequencies on a
> Field object, we are incorrectly calling toString on the Field object, which
> puts the Field attributes (indexed, stored, et. al) into the String that is
> returned.
> I'll put up a patch/fix shortly.