Create LanguageIdentifierUpdateProcessor
----------------------------------------
Key: SOLR-1979
URL: https://issues.apache.org/jira/browse/SOLR-1979
Project: Solr
Issue Type: New Feature
Components: update
Reporter: Jan Høydahl
Priority: Minor
We need the ability to detect the language of arbitrary text in order to act upon
it, for example by indexing the content into language-aware fields. Another use case
is filtering or faceting by language on unstructured content.
To do this, we should wrap the [Nutch
LanguageIdentifier|http://nutch.apache.org/apidocs-1.1/org/apache/nutch/analysis/lang/LanguageIdentifier.html]
in an UpdateProcessor. The processor should be configured like this:
{code:xml}
<processor class="org.apache.solr.update.processor.LanguageIdentifierUpdateProcessorFactory">
  <str name="inputFields">title,teaser,body</str>
  <str name="isoOutputField">language</str>
  <str name="fullOutputField">language_display</str>
</processor>
{code}
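As a rough illustration of what such a processor might do with these three parameters, here is a minimal plain-Java sketch. It is not the Solr UpdateProcessor API and does not call the Nutch LanguageIdentifier: the class name LanguageIdSketch, the Map-based document, and the pluggable detector function are all hypothetical stand-ins. Deriving the display name from java.util.Locale is also an assumption about how fullOutputField could be populated.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch: mirrors the proposed configuration, not a real Solr class.
class LanguageIdSketch {
    private final List<String> inputFields;
    private final String isoOutputField;
    private final String fullOutputField;
    // Assumed detector: maps text to an ISO 639 code such as "en".
    // In the proposal this role would be played by the Nutch LanguageIdentifier.
    private final Function<String, String> detector;

    LanguageIdSketch(String inputFieldsCsv, String isoOutputField,
                     String fullOutputField, Function<String, String> detector) {
        // Split the comma-separated "inputFields" value, ignoring blanks.
        this.inputFields = Arrays.stream(inputFieldsCsv.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
        this.isoOutputField = isoOutputField;
        this.fullOutputField = fullOutputField;
        this.detector = detector;
    }

    // Concatenates the configured input fields, detects the language,
    // and writes the ISO code and a display name into the document.
    void process(Map<String, Object> doc) {
        String text = inputFields.stream()
                .map(doc::get)
                .filter(Objects::nonNull)
                .map(Object::toString)
                .collect(Collectors.joining(" "));
        if (text.isEmpty()) {
            return; // nothing to analyse, leave the document untouched
        }
        String iso = detector.apply(text);
        if (iso == null || iso.isEmpty()) {
            return; // detector could not decide
        }
        doc.put(isoOutputField, iso);
        doc.put(fullOutputField,
                Locale.forLanguageTag(iso).getDisplayLanguage(Locale.ENGLISH));
    }
}
```

With a detector that returns "en", a document containing title and body text would gain language=en and language_display=English; a document with none of the input fields is left unchanged.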