On Tue, 28 Apr 2009, Max Lynch wrote:

> I am trying to get a list of all terms that matched a document.  So, if I
> search for "John Smith", I want to know if I found "John Smith" specifically
> in the document.  I can use the lucene results but I need to do more
> processing based on exactly what was found.  I am using a highlighter and
> formatter for this, but if I use the QueryScorer it breaks up the phrase
> into "John" and "Smith", but only if the whole name was found.  I have
> uncovered that maybe the SpanScorer would preserve the whole phrase, but
> when I try to use it I get NotImplementedError.  Has it not been interfaced
> yet?  Is it a difficult thing to do?

If you are trying to use the highlighter package's SpanScorer class, it may be clashing by name with the org.apache.lucene.search.spans.SpanScorer class, which is what lucene.SpanScorer currently resolves to:

  >>> import lucene
  >>> lucene.initVM(lucene.CLASSPATH)
  >>> lucene.SpanScorer.class_
  <Class: class org.apache.lucene.search.spans.SpanScorer>

But without a specific example of what you're trying to do, it's mostly just guesswork here.

If I guessed this right, it shouldn't be too hard to enhance JCC so that specific classes involved in a name clash can be renamed in Python. The clash arises because Java packages are flattened out in Python, though not in the underlying generated C++.
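
To illustrate the kind of clash I mean, here is a small pure-Python model of flattening Java packages into a single namespace. It is only a sketch and not how JCC is implemented: the flatten() helper, its renames argument and the HighlighterSpanScorer name are all made up for this example, and I am assuming the highlighter's class lives at org.apache.lucene.search.highlight.SpanScorer. In this model the first class seen keeps the simple name, which may not match what JCC actually does.

```python
def flatten(qualified_names, renames=None):
    """Map simple class names to fully qualified Java names in one flat namespace.

    renames is a hypothetical {qualified_name: python_name} override table;
    it is not an existing JCC option.
    """
    renames = renames or {}
    flat = {}     # python name -> qualified Java name that owns it
    clashes = {}  # python name -> every qualified name competing for it
    for qname in qualified_names:
        # Default Python name is the last component of the Java name.
        name = renames.get(qname, qname.rsplit(".", 1)[-1])
        if name in flat:
            # Name already taken: record the clash, keep the first owner.
            clashes.setdefault(name, [flat[name]]).append(qname)
        else:
            flat[name] = qname
    return flat, clashes


CLASSES = [
    "org.apache.lucene.search.spans.SpanScorer",
    # Assumed location of the highlighter's class.
    "org.apache.lucene.search.highlight.SpanScorer",
]

# Without renaming, only one class can be reached as SpanScorer.
flat, clashes = flatten(CLASSES)

# With a per-class rename, both classes become reachable.
renamed, remaining = flatten(
    CLASSES,
    renames={"org.apache.lucene.search.highlight.SpanScorer": "HighlighterSpanScorer"},
)
```

Here flat holds a single SpanScorer entry and clashes lists both Java classes under that name, while the renamed mapping exposes each class under its own Python name and leaves no clash.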

Could you please include a piece of code that reproduces the problem?
Thanks!

Andi..
