Are you _sure_ you're looking at tokens and not stored data? That can sometimes be confusing.
admin/<core>/schema browser might help here.

Best,
Erick

On Thu, Sep 11, 2014 at 1:33 PM, suleman mubarik (JIRA) wrote:
>
> [ https://issues.apache.org/jira/browse/LUCENE-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14130661#comment-14130661 ]
>
> suleman mubarik edited comment on LUCENE-5943 at 9/11/14 8:33 PM:
> ------------------------------------------------------------------
>
> Here is other example
> if input is this "I love <pizza hut>"
> then i get tokens "i", "love" ,"pizza", "hut" and offsets (0,1), (2,6),
> (7,11), (12,14)
> if HTMLStripCharFilter remove text between angle brackets then i should get
> "i", "love" and not "i", "love" ,"pizza", "hut"
>
> here is other example "I love <html>"
> tokens i get "i", "love" ,"html"
> I am on Lucene 4.8
>
>
> was (Author: sulemanmubarik):
> Here is other example
> if input is this "I love <pizza hut>"
> then i get tokens "i", "love" ,"pizza", "hut" and offsets (0,1), (2,6),
> (7,11), (12,14)
> if HTMLStripCharFilter remove text between angle brackets then i should get
> "i", "love" and not "i", "love" ,"pizza", "hut"
> I am on Lucene 4.8
>
>> HTML strip filter removes text between < and >
>> ----------------------------------------------
>>
>> Key: LUCENE-5943
>> URL: https://issues.apache.org/jira/browse/LUCENE-5943
>> Project: Lucene - Core
>> Issue Type: Bug
>> Components: core/index
>> Environment: Production
>> Reporter: suleman mubarik
>>
>> If I have this as input “I love <pizza hut> so much”
>> When I apply html striper it removes “pizza hut” and I get tokens "i",
>> "love" ,"so", "much"
>> And these are offsets I get back ((0,1), (2,6), (20,22), (23,27))
>> Html strip filter should return "i", "love" ,"pizza", "hut", "so", "much"
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
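
To make the tokens-versus-stored-data distinction concrete, here is a rough Python sketch. It is a toy model only, not Lucene code: the regex `TAG` and the function `analyze` are hypothetical stand-ins, and the tag matching is far simpler than whatever HTMLStripCharFilter really does. It assumes the behaviour described in the original issue report (the angle-bracketed span is dropped before tokenizing) and shows that the stored value stays verbatim while the token offsets still point into the original text.

```python
import re

# Toy stand-in for tag detection (assumption: anything between < and > is a tag).
TAG = re.compile(r"<[^<>]*>")

def analyze(text):
    """Return (term, start, end) tuples with offsets into the ORIGINAL text."""
    # Blank out tag-like spans with spaces of equal length, so every
    # remaining character keeps its original position.
    masked = TAG.sub(lambda m: " " * len(m.group()), text)
    # Whitespace tokenization plus lowercasing, as a minimal analysis chain.
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(r"\S+", masked)]

doc = "I love <pizza hut> so much"
stored = doc            # what a stored field would hand back: the raw input
tokens = analyze(doc)   # what would end up in the index

print(stored)  # I love <pizza hut> so much
print(tokens)  # [('i', 0, 1), ('love', 2, 6), ('so', 19, 21), ('much', 22, 26)]
```

If the stored value is what is being inspected, "pizza hut" will still be visible there even though no such tokens were produced, which is the kind of mix-up the question above is about. The offsets printed by this sketch come from its own counting and are not meant to match the numbers quoted in the issue.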
