[
https://issues.apache.org/jira/browse/SOLR-13593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878369#comment-16878369
]
Tomoko Uchida commented on SOLR-13593:
--------------------------------------
I've opened a draft pull request:
[https://github.com/apache/lucene-solr/pull/761]. (Not yet tested.) I'm new to
Solr schema handling, so please feel free to comment if I missed something.
The patch accepts SPI names both when loading a bundled managed-schema and when
calling the REST API.
managed-schema example:
{code:xml}
<fieldType name="text_fa_spi" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- for ZWNJ -->
    <charFilter spi="persian"/>
    <tokenizer spi="standard"/>
    <filter spi="lowercase"/>
    <filter spi="arabicNormalization"/>
    <filter spi="persianNormalization"/>
    <filter spi="stop" ignoreCase="true" words="lang/stopwords_fa.txt" />
  </analyzer>
</fieldType>
{code}
REST API example:
{code:bash}
curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field-type" : {
    "name":"myNewTxtField",
    "class":"solr.TextField",
    "positionIncrementGap":"100",
    "analyzer" : {
      "charFilters":[{
        "spi":"htmlStrip"
      }],
      "tokenizer":{
        "spi":"whitespace" },
      "filters":[{
        "spi":"lowercase"
      }]}}
}' http://localhost:8983/solr/techproducts/schema
{code}
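For anyone scripting this rather than using curl, here is a minimal Python sketch that builds the same payload. It assumes the `spi` key exactly as proposed in this draft (which may still change), and the helper names `build_add_field_type` and `make_request` are made up for illustration. It only constructs the request and does not send it.

```python
import json
import urllib.request


def build_add_field_type(name, tokenizer, char_filters=(), filters=()):
    """Build an add-field-type command that refers to components by SPI name.

    The "spi" key follows the draft proposal in this comment (an assumption,
    not a released Solr feature).
    """
    analyzer = {}
    if char_filters:
        analyzer["charFilters"] = [{"spi": n} for n in char_filters]
    analyzer["tokenizer"] = {"spi": tokenizer}
    if filters:
        analyzer["filters"] = [{"spi": n} for n in filters]
    return {
        "add-field-type": {
            "name": name,
            "class": "solr.TextField",
            "positionIncrementGap": "100",
            "analyzer": analyzer,
        }
    }


def make_request(payload, url="http://localhost:8983/solr/techproducts/schema"):
    """Wrap the payload in a POST request object without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-type": "application/json"},
        method="POST",
    )


payload = build_add_field_type(
    "myNewTxtField", "whitespace", char_filters=["htmlStrip"], filters=["lowercase"]
)
request = make_request(payload)
# Passing `request` to urllib.request.urlopen() would send it to a running Solr.
```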
> Allow to specify analyzer components by their SPI names in schema definition
> ----------------------------------------------------------------------------
>
> Key: SOLR-13593
> URL: https://issues.apache.org/jira/browse/SOLR-13593
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Schema and Analysis
> Reporter: Tomoko Uchida
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Each analysis factory now has an explicitly documented SPI name, which is
> stored in its static "NAME" field (LUCENE-8778).
> Solr uses the factories' simple class names in schema definitions (like
> class="solr.WhitespaceTokenizerFactory"), but we should also be able to use
> the more concise SPI names (like name="whitespace").
> e.g.:
> {code:xml}
> <fieldtype name="myfieldtype" class="solr.TextField">
>   <analyzer>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" />
>     <filter class="solr.PorterStemFilterFactory" />
>   </analyzer>
> </fieldtype>
> {code}
> would be
> {code:xml}
> <fieldtype name="myfieldtype" class="solr.TextField">
>   <analyzer>
>     <tokenizer name="whitespace"/>
>     <filter name="keywordMarker" protected="protwords.txt" />
>     <filter name="porterStem" />
>   </analyzer>
> </fieldtype>
> {code}
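To make the before/after in the description concrete, here is a rough Python sketch that rewrites a class-based field type into the name-based form. The lookup table `SPI_NAMES` and the helper `use_spi_names` are hypothetical. The table holds only the three pairs that appear in the example above, and any other factory would need its name read from that factory's NAME field.

```python
import xml.etree.ElementTree as ET

# Class-name -> SPI-name pairs copied from the example in the description.
# This is not a complete list; other factories are left untouched.
SPI_NAMES = {
    "solr.WhitespaceTokenizerFactory": "whitespace",
    "solr.KeywordMarkerFilterFactory": "keywordMarker",
    "solr.PorterStemFilterFactory": "porterStem",
}

# Only analyzer components are rewritten, never the <fieldtype> element itself.
COMPONENT_TAGS = {"charFilter", "tokenizer", "filter"}


def use_spi_names(fieldtype_xml):
    """Replace class="..." with name="..." on known analyzer components."""
    root = ET.fromstring(fieldtype_xml)
    for elem in root.iter():
        if elem.tag not in COMPONENT_TAGS:
            continue
        spi = SPI_NAMES.get(elem.get("class"))
        if spi is None:
            continue  # unknown or missing class: keep the element as it is
        remaining = {k: v for k, v in elem.attrib.items() if k != "class"}
        elem.attrib.clear()
        elem.set("name", spi)  # put the SPI name first, then the other attributes
        elem.attrib.update(remaining)
    return ET.tostring(root, encoding="unicode")


legacy = """<fieldtype name="myfieldtype" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" />
    <filter class="solr.PorterStemFilterFactory" />
  </analyzer>
</fieldtype>"""

print(use_spi_names(legacy))
```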