[jira] [Comment Edited] (LUCENE-8453) Add example settings to Korean analyzer components' javadocs

Tomoko Uchida (JIRA) Fri, 10 Aug 2018 18:43:40 -0700


    [ 
https://issues.apache.org/jira/browse/LUCENE-8453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16576941#comment-16576941
 ]


Tomoko Uchida edited comment on LUCENE-8453 at 8/11/18 1:42 AM:
----------------------------------------------------------------

So here is my proposal for the javadocs' example settings (see my pull request)  :)

For KoreanTokenizerFactory:
{code:xml}
<fieldType name="text_ko" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KoreanTokenizerFactory"
      decompoundMode="discard"
      userDictionary="user.txt"
      userDictionaryEncoding="UTF-8"
      outputUnknownUnigrams="false"
    />
  </analyzer>
</fieldType>
{code}
For KoreanPartOfSpeechStopFilterFactory:
{code:xml}
<fieldType name="text_ko" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KoreanTokenizerFactory"/>
    <filter class="solr.KoreanPartOfSpeechStopFilterFactory"
            tags="E,J"/>
  </analyzer>
</fieldType>
{code}
For KoreanReadingFormFilterFactory:
{code:xml}
<fieldType name="text_ko" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KoreanTokenizerFactory"/>
    <filter class="solr.KoreanReadingFormFilterFactory"/>
  </analyzer>
</fieldType>
{code}
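
As a rough sketch only (this combined example is not part of the pull request), the three components above could probably be chained in a single field type. The field type name text_ko_combined is hypothetical, and the attribute values are simply carried over from the examples above rather than being recommendations:
{code:xml}
<!-- Hypothetical combined field type; the name "text_ko_combined" is an assumption -->
<fieldType name="text_ko_combined" class="solr.TextField">
  <analyzer>
    <!-- Tokenize Korean text; attribute values copied from the tokenizer example -->
    <tokenizer class="solr.KoreanTokenizerFactory"
      decompoundMode="discard"
      outputUnknownUnigrams="false"
    />
    <!-- Drop tokens whose part-of-speech tag is in the list (same tags as above) -->
    <filter class="solr.KoreanPartOfSpeechStopFilterFactory"
            tags="E,J"/>
    <!-- Apply the reading form filter to the remaining tokens -->
    <filter class="solr.KoreanReadingFormFilterFactory"/>
  </analyzer>
</fieldType>
{code}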

Update:
I added brief descriptions for each parameter (please see the pull request),
although, unfortunately, Kuromoji's documentation lacks such descriptions.


> Add example settings to Korean analyzer components' javadocs
> ------------------------------------------------------------
>
>                 Key: LUCENE-8453
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8453
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: general/javadocs
>            Reporter: Tomoko Uchida
>            Priority: Minor
>
> Korean analyzer (nori) javadoc needs example schema settings.
> I'll create a patch.



