[
https://issues.apache.org/jira/browse/LUCENE-5482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13916137#comment-13916137
]
Ahmet Arslan commented on LUCENE-5482:
--------------------------------------
This is similar to ClassicFilter, which removes 's from the end of words. But
ClassicFilter is useful for English only and does not help with Turkish,
because it removes only 's and 'S. In Turkish, many different character
sequences may follow an apostrophe, e.g. 'nin, 'a, 'ü, etc.
In Turkish, the apostrophe is used to separate suffixes from proper names
(names of continents, seas, rivers, lakes, mountains, and uplands, as well as
proper names related to religion and mythology). For example, Van Gölü’ne
(meaning: to Lake Van).
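
To make the proposed behavior concrete, here is a minimal sketch of the
truncation step in plain Java, with no Lucene dependency. The class and method
names (ApostropheSketch, stripAfterApostrophe) are hypothetical and are not
part of any patch on this issue. Treating both the ASCII apostrophe (') and the
typographic one (’, U+2019) as the cut point is an assumption, based on the
Van Gölü’ne example above using the typographic form.

```java
final class ApostropheSketch {
    private ApostropheSketch() {}

    /**
     * Returns the token truncated at its first apostrophe, dropping the
     * apostrophe itself. Tokens without an apostrophe are returned unchanged.
     * A token that starts with an apostrophe becomes the empty string.
     */
    static String stripAfterApostrophe(String token) {
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            // Assumption: accept both ' (U+0027) and the typographic U+2019.
            if (c == '\'' || c == '\u2019') {
                return token.substring(0, i);
            }
        }
        return token;
    }

    public static void main(String[] args) {
        // "Gölü’ne" written with Unicode escapes to avoid source-encoding issues.
        System.out.println(stripAfterApostrophe("G\u00f6l\u00fc\u2019ne"));
        System.out.println(stripAfterApostrophe("Ankara'nin"));
        System.out.println(stripAfterApostrophe("kitap"));
    }
}
```

A real TokenFilter would presumably apply the same scan to each token's term
buffer and shorten the term in place rather than allocating a new String.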
> improve default TurkishAnalyzer
> -------------------------------
>
> Key: LUCENE-5482
> URL: https://issues.apache.org/jira/browse/LUCENE-5482
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/analysis
> Affects Versions: 4.7
> Reporter: Ahmet Arslan
> Priority: Minor
> Labels: Turkish, analysis
> Fix For: 4.8
>
>
> Add a TokenFilter that strips characters after an apostrophe (including the
> apostrophe itself).