[
https://issues.apache.org/jira/browse/SOLR-11917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341679#comment-16341679
]
Hoss Man commented on SOLR-11917:
---------------------------------
_The following notes were compiled over many months and iteratively
tweaked/revised. It's likely that in some cases my comments overlook
comments/ideas/patches related to some of these concepts that were posted
after I wrote them and that I just haven't noticed since._
_Also: Jira says my notes are too long for one comment, so I have to break them
up into sections_
----
h1. High Level Goals / *U*secases
Talking to various customers about their pain points, and reading up on various
jiras, led me to a handful of Text/String related *U*secases that all seemed
to have solutions that could either overlap, or be in close proximity,
when it came to implementation:
* *U0*: "I want sane defaults when sorting on multivalued fields, not an error"
** Low hanging fruit already implemented for PrimitiveFieldType subclasses in
SOLR-11854
* *U1*: (SOLR-8362) Add docValues support to TextField (or some new subclasses
of TextField) – Because...
** *U1.1*: "I want to be able to (efficiently) sort on the original input of a
TextField (using docValues)"
** *U1.2*: "I want to be able to (efficiently) facet on (docValues built from)
the indexed terms of a TextField"
** *U1.3*: "I want to be able to (efficiently) sort/facet on docValues built
from analyzed terms using a completely different analyzer than what I use for
searching"
*** Example: StandardAnalyzer for searching, but lowercased docValues for
sorting.
* *U2*: Choose Query Analysis Aspects At Query Time – Because...
** *U2.1*: "I want to be able to do multi-language indexing/querying easily so
it only looks like one 'field' name." (SOLR-6492)
** *U2.2*: "I want to be able to have lots of arbitrary analyzers I pick
between arbitrarily at query time and maybe shoot myself in the foot, but it's
ok, I'm an expert and I have special needs." (SOLR-5053)
*** NOTE: the description of SOLR-5053 does also list multi-lang as a
motivation, but some of the examples – like "ignore synonyms" – are definitely
broader in scope than this.
> A Potential Roadmap for robust multi-analyzer TextFields w/various options
> for configuring docValues
> ----------------------------------------------------------------------------------------------------
>
> Key: SOLR-11917
> URL: https://issues.apache.org/jira/browse/SOLR-11917
> Project: Solr
> Issue Type: Wish
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Hoss Man
> Assignee: Hoss Man
> Priority: Major
>
> A while back, I was tasked at my day job to brainstorm & design some "smarter
> field types" in Solr. In particular to think about:
> # How to simplify some of the "special things" people have to know about
> Solr behavior when creating their schemas
> # How to reduce the number of situations where users have to copy/clone one
> "logical field" into multiple "schema fields" in order to meet different use cases
> The main result of this thought exercise is a handful of usecases/goals that
> people seem to have - many of which are already tracked in existing jiras -
> along with a high level design/roadmap of potential solutions for these goals
> that can be implemented incrementally to leverage some common changes (and
> what those changes might look like).
> My intention is to use this jira as a place to share these ideas for broader
> community discussion, and as a central linkage point for the related jiras.
> (details to follow in a very looooooong comment)
> ----
> NOTE: I am not (at this point) personally committing to following through on
> implementing every aspect of these ideas :)