So, I thought your response meant that I could eliminate my code:

    String[] fields = new String[1];
    fields[0] = "EVERYTHING"; // use the single "big" field in the index
    mlt.setFieldNames(fields);

But, if I comment out that code, my unit test fails. If I include it, it passes. I'm using MLT as follows:

    _query = new BooleanClause(mlt.like(new InputStreamReader(is), "EVERYTHING"),
                               BooleanClause.Occur.MUST);

"is" is the input stream. Did I miss something in your response?

Scott

-----Original Message-----
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Wednesday, September 21, 2011 6:59 PM
To: java-user@lucene.apache.org
Subject: Re: MoreLikeThis Interface changes

On Wed, Sep 21, 2011 at 5:17 PM, Scott Smith <ssm...@mainstreamdata.com> wrote:
> I'm updating my lucene code from 3.0 to 3.4. There's a change in the MLT
> interface I'm confused about. I used the MLT.like(InputStream) method. It
> now appears I should change to the MLT.like(InputStreamReader, fieldname)
> method. Easy enough to create an InputStreamReader from an InputStream.

Yes, requiring a reader is to ensure that MLT is using the encoding you want.

>
> So, my question is regarding the addition of the fieldname parameter.
> There's also a call called MLT.setFieldNames(String[]). This would seem to
> be redundant except the setFieldNames() allows you to specify multiple fields
> and like() doesn't. Am I allowed to specify null as the fieldname in like()
> (documentation doesn't say you can). It seems like you shouldn't need to do
> both. But there's a difference in functionality between the two (since one
> allows multiple fields and the other doesn't).

A Reader has no fields :)

The fieldName is only for passing to the Analyzer (@param fieldName field passed to the analyzer to use when analyzing the content).

This is because some Analyzers (e.g. PerFieldAnalyzerWrapper) analyze content differently according to different fields.

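On the encoding point above, here is a minimal sketch using only java.io (no Lucene involved) of why an explicit Reader matters: the same bytes decode to different text depending on the charset given to InputStreamReader, so whatever analyzes the text would see different content. The class name ReaderEncodingDemo and the sample string are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReaderEncodingDemo {

    // Decodes the whole stream into a String using the given charset.
    static String decode(InputStream is, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(is, charset)) {
            char[] buf = new char[256];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "caf\u00e9" is "cafe" with an acute accent; in UTF-8 the last letter takes two bytes.
        byte[] bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        String asUtf8 = decode(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        String asLatin1 = decode(new ByteArrayInputStream(bytes), StandardCharsets.ISO_8859_1);

        // Same bytes, different text: the Latin-1 decoding yields one extra character.
        System.out.println("UTF-8 length:      " + asUtf8.length());
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
    }
}
```

Passing the charset explicitly, rather than relying on the single-argument InputStreamReader constructor and the platform default, keeps the decoded text the same across machines.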
Previously, MoreLikeThis would use what was in the setFieldNames parameter, iteratively, like this:

    for (field : fieldNames) {
        analyzer.analyze(field, reader);
    }

However, MoreLikeThis also had a bug where it would never close() the reader. As you can see, this logic was completely bogus, as you can only consume a reader once. Effectively, the reader would be analyzed by fieldNames[0], then MLT would analyze an exhausted reader with fieldNames[1]...fieldNames[n].

When we fixed MLT to close its resources correctly (around 3.2), it exposed this second bug. If you tried to pass a reader with multiple values in fieldNames, you would get an IOException because it tried to re-consume a closed reader.

Now, when supplying a reader, you should instead pass in the fieldName explicitly so that it analyzes the content the way you want. For backwards compatibility, the deprecated method uses fieldNames[0] only.

--
lucidimagination.com
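
The reader-exhaustion problem described above can be reproduced with plain java.io, without Lucene. The sketch below is only an illustration, not the MoreLikeThis code: drain() stands in for "analyze the reader", and the field names "title" and "body" are hypothetical. The first loop shows a second pass over an open reader finding nothing left; the second part shows a pass over a closed reader failing with an IOException.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReaderReuseDemo {

    // Stand-in for analysis: reads everything left in the reader.
    static String drain(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String[] fieldNames = {"title", "body"}; // hypothetical field names

        // Reader never closed: only the first field sees any content,
        // every later pass reads from an already exhausted reader.
        Reader reader = new StringReader("some document text");
        for (String field : fieldNames) {
            System.out.println(field + " -> \"" + drain(reader) + "\"");
        }

        // Reader closed after the first pass: the next pass throws.
        Reader closed = new StringReader("some document text");
        drain(closed);
        closed.close();
        try {
            drain(closed);
        } catch (IOException e) {
            System.out.println("second pass failed with IOException");
        }
    }
}
```

Either way, one reader can feed only one analysis pass, which fits with like() taking a single fieldName alongside the reader.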