: 1. Solr, like Lucene, should be able to work with an older analyzers
: module for backwards compatibility purposes.
While I don't disagree with you, Solr's "philosophy" has generally
discouraged the use of "Analyzer" classes in favor of more discrete
Tokenizer & TokenFilter pieces -- direct support for Analyzers is mainly a
result of wanting to allow an easy transition for people that already
have custom analyzers they wrote for direct use in Lucene. The more
fine-grained analysis chain approach that Solr encourages makes it easier
for people to debug what is going on, and allows for more customization of
the individual stages of the "Analyzer" that gets built on the fly.
That said: if we can make it easier to use Analyzers, I'm all for it -- I
just don't want to set things up in a way that people choose to use
XyzAnalyzer from an analyzer module, when they could get the exact same
behavior by chaining together XTokenizer, YTokenFilter, and ZTokenFilter
(from the same module) and in the latter case have more transparent
debugging and fine-grained configuration controls.
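
To make the debugging point concrete, here is a rough sketch in plain Java.
Everything in it is a hypothetical stand-in (tokenize, lowerCaseFilter,
stopFilter and analyzeWithTrace are made-up names, not the Lucene/Solr API);
it only illustrates why a chain of small stages is easier to inspect than one
opaque Analyzer: the output of every stage can be looked at on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical stand-ins, NOT the Lucene/Solr API: each stage is a plain
// function over a token list so every intermediate result can be inspected.
final class AnalysisChainSketch {

    /** Stand-in for "XTokenizer": splits the input on whitespace. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Stand-in for "YTokenFilter": lowercases every token. */
    static UnaryOperator<List<String>> lowerCaseFilter() {
        return tokens -> {
            List<String> out = new ArrayList<>();
            for (String t : tokens) {
                out.add(t.toLowerCase(Locale.ROOT));
            }
            return out;
        };
    }

    /** Stand-in for "ZTokenFilter": drops any token found in stopWords. */
    static UnaryOperator<List<String>> stopFilter(Set<String> stopWords) {
        return tokens -> {
            List<String> out = new ArrayList<>();
            for (String t : tokens) {
                if (!stopWords.contains(t)) {
                    out.add(t);
                }
            }
            return out;
        };
    }

    /** Runs the tokenizer and then each filter, keeping every stage's output. */
    static List<List<String>> analyzeWithTrace(
            String text, List<UnaryOperator<List<String>>> filters) {
        List<List<String>> trace = new ArrayList<>();
        List<String> current = tokenize(text);
        trace.add(current);
        for (UnaryOperator<List<String>> filter : filters) {
            current = filter.apply(current);
            trace.add(current);
        }
        return trace;
    }

    public static void main(String[] args) {
        List<UnaryOperator<List<String>>> chain =
                List.of(lowerCaseFilter(), stopFilter(Set.of("the")));
        List<List<String>> trace = analyzeWithTrace("The Quick Fox", chain);
        for (int i = 0; i < trace.size(); i++) {
            System.out.println("stage " + i + ": " + trace.get(i));
        }
    }
}
```

Swapping, removing, or reconfiguring one stage only touches the list passed
to analyzeWithTrace, which is the kind of per-stage control a monolithic
XyzAnalyzer would hide.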
: So with this idea, analyzers are just a Solr plugin, and the default
: Solr install includes the ones it does today, so most users would not
: see the difference. But if a user wants Polish, Smart Chinese, or
: improved Unicode support, they would be able to drop in one of the
: additional analyzer modules easily.
:
: The factories for Solr serve as a buffer to hide the implementation
: details, and I think they should be part of these analyzer modules, so
Just to be clear: what you are suggesting is that the
module-analyzer-XXX.jar artifact of modules/analyzers/XXX should not only
contain the Tokenizers & TokenFilters that relate to XXX, but also the
Factories Solr uses to initialize them -- so a user only needs to add
that module-analyzer-XXX.jar to their Solr lib dir to get all the
functionality, instead of needing module-analyzer-XXX.jar plus some
solr-analyzer-XXX-glue.jar
...am I understanding that correctly?
I'm all in favor of this -- anticipating that some of the stuff in
IndexSchema might eventually get "promoted" up into a Lucene
contrib/module is the key reason why we made sure a few years back to
prevent FieldTypes/TokenizerFactories/TokenFilterFactories from being
"aware" of the SolrCore or the IndexSchema classes -- instead all they are
allowed to know about is the concept of a "ResourceLoader" for accessing
external file resources (ie: via a classpath and/or effective directory).
So refactoring the factory APIs + the ResourceLoader into a new module
should be relatively straightforward (knock on wood)
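
As a rough illustration of that decoupling, here is a minimal sketch in plain
Java. The ResourceLoader interface, StopWordFilterFactory, the "words"
argument and inMemoryLoader below are all simplified, hypothetical stand-ins
written for this example, not the actual Solr classes; the point is only that
the factory sees its init args plus a loader and nothing else, so it has no
dependency on anything like SolrCore or IndexSchema.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, NOT the real Solr interfaces.
final class ResourceLoaderSketch {

    /** The only piece of the outside world a factory is allowed to see. */
    interface ResourceLoader {
        InputStream openResource(String name) throws IOException;
    }

    /** Hypothetical factory configured from init args plus a loader. */
    static final class StopWordFilterFactory {
        private final Set<String> stopWords = new HashSet<>();

        /** Reads the file named by the (assumed) "words" argument via the loader. */
        void init(Map<String, String> args, ResourceLoader loader) {
            String name = args.get("words");
            if (name == null) {
                throw new IllegalArgumentException("missing 'words' argument");
            }
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                    loader.openResource(name), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    line = line.trim();
                    // Skip blank lines and '#' comment lines.
                    if (!line.isEmpty() && !line.startsWith("#")) {
                        stopWords.add(line);
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        /** Drops every token that appears in the loaded stop word set. */
        List<String> filter(List<String> tokens) {
            List<String> out = new ArrayList<>();
            for (String t : tokens) {
                if (!stopWords.contains(t)) {
                    out.add(t);
                }
            }
            return out;
        }
    }

    /** In-memory loader for the demo; a real one might use a classpath or directory. */
    static ResourceLoader inMemoryLoader(Map<String, String> files) {
        return name -> {
            String content = files.get(name);
            if (content == null) {
                throw new FileNotFoundException(name);
            }
            return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
        };
    }

    public static void main(String[] args) {
        ResourceLoader loader =
                inMemoryLoader(Map.of("stopwords.txt", "# comment\nthe\na\n"));
        StopWordFilterFactory factory = new StopWordFilterFactory();
        factory.init(Map.of("words", "stopwords.txt"), loader);
        System.out.println(factory.filter(List.of("the", "quick", "fox")));
    }
}
```

Because the factory only depends on the loader abstraction, it could live in
the same jar as the Tokenizers & TokenFilters it configures, with whatever
hosts it supplying the loader.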
: 2. example schema definitions (even snippets) for Solr users as a
: documentation artifact, so they know how to use this stuff.
+1
-Hoss