I am having trouble using ShingleFilter/NGramTokenizer from a Pig UDF. When
the UDF constructs a StandardTokenizer, it fails with 'Could not find
implementing class for
org.apache.lucene.analysis.tokenattributes.OffsetAttribute', and I can't
figure out how to fix it. Can anyone lend a hand?

The (brief) code is here:
https://github.com/rjurney/datafu/blob/lucene/src/java/datafu/pig/text/lucene/NGramTokenize.java

The error is:

ren: null at []]: java.lang.IllegalArgumentException: Could not find implementing class for org.apache.lucene.analysis.tokenattributes.OffsetAttribute
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:338)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:378)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:298)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: java.lang.IllegalArgumentException: Could not find implementing class for org.apache.lucene.analysis.tokenattributes.OffsetAttribute
    at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.getClassForInterface(AttributeSource.java:94)
    at org.apache.lucene.util.AttributeSource$AttributeFactory$DefaultAttributeFactory.createAttributeInstance(AttributeSource.java:67)
    at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:276)
    at org.apache.lucene.analysis.standard.StandardTokenizer.<init>(StandardTokenizer.java:171)
    at datafu.pig.text.lucene.NGramTokenize.exec(NGramTokenize.java:48)
    at datafu.pig.text.lucene.NGramTokenize.exec(NGramTokenize.java:33)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:330)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextDataBag(POUserFunc.java:374)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:309)
    ... 9 more

-- 
Russell Jurney twitter.com/rjurney
russell.jur...@gmail.com
 datasyndrome.com
