[jira] [Commented] (LUCENE-8527) Upgrade JFlex to 1.7.0

Steve Rowe (JIRA) Fri, 07 Dec 2018 16:24:33 -0800


    [ 
https://issues.apache.org/jira/browse/LUCENE-8527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16713437#comment-16713437
 ]


Steve Rowe commented on LUCENE-8527:
------------------------------------

[~rcmuir] mentioned on LUCENE-8125 that StandardTokenizer should give emoji 
sequences the {{<EMOJI>}} token type; see the logic in the {{icu}} module's 
{{BreakIteratorWrapper}}.

JFlex 1.7.0 supports Unicode 9.0, which, if I'm interpreting the discussion at 
http://www.unicode.org/L2/L2016/16315r-handling-seg-emoji.pdf correctly, does 
not (fully) include emoji sequence support (though that doc lists customized 
rules that would handle such sequences properly under Unicode 9.0).
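
To make concrete what "emoji sequence support" means here, the following is a minimal, self-contained sketch. It is not the rule set from the linked doc, and it is not Lucene or JFlex code: the class {{EmojiSequenceSketch}} and its {{group}} method are hypothetical names, and the grouping logic is a deliberate simplification that only attaches ZERO WIDTH JOINER (U+200D), VARIATION SELECTOR-16 (U+FE0F) and the emoji modifiers U+1F3FB..U+1F3FF to the preceding code point. It groups per code point, so it is not a word tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a simplified grouping of emoji sequences.
class EmojiSequenceSketch {
    static final int ZWJ = 0x200D;   // ZERO WIDTH JOINER
    static final int VS16 = 0xFE0F;  // VARIATION SELECTOR-16

    // Emoji modifiers (skin tones) occupy U+1F3FB..U+1F3FF.
    static boolean isEmojiModifier(int cp) {
        return cp >= 0x1F3FB && cp <= 0x1F3FF;
    }

    // Splits text into groups of code points, keeping a group open when the
    // current code point extends the previous one (ZWJ, VS16, modifier) or
    // when the previous code point was a ZWJ.
    static List<String> group(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean joinNext = false;
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean extendsPrevious = cp == ZWJ || cp == VS16 || isEmojiModifier(cp);
            if (current.length() > 0 && !extendsPrevious && !joinNext) {
                out.add(current.toString());
                current.setLength(0);
            }
            current.appendCodePoint(cp);
            joinNext = (cp == ZWJ);
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // MAN + ZWJ + WOMAN + ZWJ + GIRL: one emoji ZWJ sequence, five code points.
        String family = new StringBuilder()
            .appendCodePoint(0x1F468).appendCodePoint(ZWJ)
            .appendCodePoint(0x1F469).appendCodePoint(ZWJ)
            .appendCodePoint(0x1F467).toString();
        System.out.println(group(family).size()); // prints 1
        System.out.println(group("ab").size());   // prints 2
    }
}
```

A segmenter without rules of this kind could break at the joiners, which is the situation the customized rules in that doc are meant to address.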

Should we include the (post-9.0) customized rules for Unicode 9.0?


> Upgrade JFlex to 1.7.0
> ----------------------
>
>                 Key: LUCENE-8527
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8527
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: general/build, modules/analysis
>            Reporter: Steve Rowe
>            Priority: Minor
>
> JFlex 1.7.0, supporting Unicode 9.0, was released recently: 
> [http://jflex.de/changelog.html#jflex-1.7.0].  We should upgrade.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
