Hi,

I'm trying to extract payloads from an index for specific tokens in the following way (with a sample document number and term inserted):

    Terms terms = reader.getTermVector(16504, "term");
    TokenStream tokenstream = TokenSources.getTokenStream(terms);
    while (tokenstream.incrementToken()) {
        OffsetAttribute offset = tokenstream.getAttribute(OffsetAttribute.class);
        int start = offset.startOffset();
        int end = offset.endOffset();
        String token = tokenstream.getAttribute(CharTermAttribute.class).toString();
        PayloadAttribute payloadAttr = tokenstream.addAttribute(PayloadAttribute.class);
        BytesRef payloadBytes = payloadAttr.getPayload();
        ...
    }

This works fine for the OffsetAttribute and the CharTermAttribute, but unfortunately payloadAttr.getPayload() always returns null, for all documents and all tokens. However, I know that the payloads are stored in the index, as I can retrieve them through a SpanQuery with Spans.getPayload().

I actually expect every token to carry a payload, as my custom tokenizer implementation has the following lines:

    public class KoraTokenizer extends Tokenizer {
        ...
        private PayloadAttribute payloadAttr = addAttribute(PayloadAttribute.class);
        ...
        public boolean incrementToken() {
            ...
            payloadAttr.setPayload(new BytesRef(payloadString));
            ...
        }
        ...
    }

I've asserted that the payloadString variable is never an empty String and, as I said above, I can retrieve the payloads with Spans.getPayload(). So what am I doing wrong in my tokenstream.addAttribute(PayloadAttribute.class) call?

BTW, I used tokenstream.getAttribute() before, as for the other attributes, but this threw an IllegalArgumentException, so I followed the recommendation given in the documentation and replaced it with addAttribute().

Thanks!
Carsten

--
Institut für Deutsche Sprache | http://www.ids-mannheim.de
Projekt KorAP | http://korap.ids-mannheim.de
Tel.
+49-(0)621-43740789 | schno...@ids-mannheim.de
Korpusanalyseplattform der nächsten Generation
Next Generation Corpus Analysis Platform
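
The getAttribute()/addAttribute() behaviour described in the message can be reproduced with a small self-contained model. This is only a sketch under one assumption: that the stream rebuilt from the term vector never registers a PayloadAttribute itself. The classes MiniAttributeSource, MiniTerm and MiniPayload are hypothetical stand-ins, not Lucene classes. They only mimic the documented contract that getAttribute() throws an IllegalArgumentException for an unregistered attribute, while addAttribute() returns the existing instance or creates a new, empty one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration only; these are NOT Lucene classes.
class MiniTerm { String term = ""; }
class MiniPayload { byte[] payload; }   // stays null until a producer sets it

// Simplified model of an attribute container with get/add semantics.
class MiniAttributeSource {
    private final Map<Class<?>, Object> attributes = new HashMap<>();

    // Returns the registered instance, or registers and returns a new one.
    <T> T addAttribute(Class<T> type, Supplier<T> factory) {
        return type.cast(attributes.computeIfAbsent(type, k -> factory.get()));
    }

    // Returns the registered instance, or throws if none was ever registered.
    <T> T getAttribute(Class<T> type) {
        Object attribute = attributes.get(type);
        if (attribute == null) {
            throw new IllegalArgumentException("no attribute " + type.getSimpleName());
        }
        return type.cast(attribute);
    }
}

class AttributeSketch {
    public static void main(String[] args) {
        // Producer side: assumed to register only the term, never a payload.
        MiniAttributeSource stream = new MiniAttributeSource();
        stream.addAttribute(MiniTerm.class, MiniTerm::new).term = "token";

        // Consumer, first attempt: getAttribute fails for the missing type.
        try {
            stream.getAttribute(MiniPayload.class);
        } catch (IllegalArgumentException e) {
            System.out.println("getAttribute failed: " + e.getMessage());
        }

        // Consumer, second attempt: addAttribute succeeds, but the new
        // instance was never filled by the producer, so the payload is null.
        MiniPayload payloadAttr = stream.addAttribute(MiniPayload.class, MiniPayload::new);
        System.out.println("payload is null: " + (payloadAttr.payload == null));
    }
}
```

If the assumption holds, the switch from getAttribute() to addAttribute() only silences the exception: the consumer ends up holding an attribute instance that the producing stream never writes to, which would match the null payloads reported above.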