[
https://issues.apache.org/jira/browse/SOLR-13593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878792#comment-16878792
]
Tomoko Uchida commented on SOLR-13593:
--------------------------------------
I updated the pull request.
{quote}I am not so happy about the "spi" name, I'd prefer "name". What's
exactly the problem with using "name"? The Solr plugin stuff should not be
affected by this.
{quote}
+1. The PR now uses "name" to specify SPI names (just as in my first proposal):
{code:xml}
<!-- managed-schema file -->
<fieldType name="text_fa_spi" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<!-- for ZWNJ -->
<charFilter name="persian"/>
<tokenizer name="standard"/>
<filter name="lowercase"/>
<filter name="arabicNormalization"/>
<filter name="persianNormalization"/>
<filter name="stop" ignoreCase="true" words="lang/stopwords_fa.txt" />
</analyzer>
</fieldType>
{code}
{code:bash}
# REST API
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field-type" : {
"name":"myNewTxtField",
"class":"solr.TextField",
"positionIncrementGap":"100",
"analyzer" : {
"charFilters":[{
"name":"htmlStrip"
}],
"tokenizer":{
"name":"whitespace" },
"filters":[{
"name":"lowercase"
}]}}
}' http://localhost:8983/solr/techproducts/schema
{code}
--
{quote}Another suggestion, not sure if it's already implemented: When
persisting a managed schema after modification, it should use the provider
names only and no longer persist class names.
{quote}
I had not noticed that.
It seems that Solr persists the factory's original properties as-is, including
its class name ("class"). So I changed the property handling logic in
{{o.a.s.schema.FieldType}} to discard the "class" property when an SPI name is
given, and to preserve "name" in the original properties instead, which keeps
the managed-schema consistent.
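As a rough illustration of that rule only (this is not the code in the PR; the class {{SpiNameProps}} and the method {{propsToPersist}} are hypothetical names made up for this sketch), the idea is: if a component was declared with an SPI name, drop "class" from the properties that get persisted and keep "name".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Solr's actual FieldType code.
public class SpiNameProps {

    /**
     * Returns the properties to persist for one analysis component.
     * Assumption: "name" holds the SPI name and "class" holds the factory class name.
     */
    public static Map<String, String> propsToPersist(Map<String, String> original) {
        // Copy so the caller's map is left untouched; keep insertion order.
        Map<String, String> result = new LinkedHashMap<>(original);
        String spiName = result.get("name");
        if (spiName != null && !spiName.isEmpty()) {
            // An SPI name was given: persist "name" only, not the class name.
            result.remove("class");
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("name", "stop");
        props.put("class", "solr.StopFilterFactory");
        props.put("ignoreCase", "true");
        System.out.println(propsToPersist(props)); // prints {name=stop, ignoreCase=true}
    }
}
```

Components that are still declared with "class" only pass through unchanged in this sketch, so existing schemas would be persisted as before.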
> Allow to specify analyzer components by their SPI names in schema definition
> ----------------------------------------------------------------------------
>
> Key: SOLR-13593
> URL: https://issues.apache.org/jira/browse/SOLR-13593
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Schema and Analysis
> Reporter: Tomoko Uchida
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Now each analysis factory has an explicitly documented SPI name, which is stored
> in the static "NAME" field (LUCENE-8778).
> Solr uses the factories' simple class names in schema definitions (like
> class="solr.WhitespaceTokenizerFactory"), but we should also be able to use
> the more concise SPI names (like name="whitespace").
> e.g.:
> {code:xml}
> <fieldtype name="myfieldtype" class="solr.TextField">
> <analyzer>
> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt" />
> <filter class="solr.PorterStemFilterFactory" />
> </analyzer>
> </fieldtype>
> {code}
> would be
> {code:xml}
> <fieldtype name="myfieldtype" class="solr.TextField">
> <analyzer>
> <tokenizer name="whitespace"/>
> <filter name="keywordMarker" protected="protwords.txt" />
> <filter name="porterStem" />
> </analyzer>
> </fieldtype>
> {code}