[
https://issues.apache.org/jira/browse/LUCENE-2458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892007#action_12892007
]
Robert Muir commented on LUCENE-2458:
-------------------------------------
{quote}
Robert, it was your commit that changed the default behavior of Solr, and I
disagree with that change.
Technically, I could VETO - but I don't believe I have ever done a code-change
veto, and I don't want to start now
{quote}
Yonik, I would rather you just VETO than heavy-commit the wrong changes.
For example, if you said "Robert, it's annoying for users with the LUCENE_31
version in their solrconfig: I don't feel they have enough flexibility yet
without setting the version to LUCENE_30. I feel that the parameter setting
in SOLR-2015 should be incorporated into this issue", I mean, that's
completely constructive!
{quote}
Instead, I'll try and be constructive by going to work on SOLR-2015 so we can
at least configure it per-field.
{quote}
Man, I am willing to help with that also. Though I am not particularly a Solr
queryparser expert, I think we should expose these options to users that want
them, instead of requiring them to depend on version-specific defaults.
Just let me know how I can help; I want constructive progress.
> queryparser makes all CJK queries phrase queries regardless of analyzer
> -----------------------------------------------------------------------
>
> Key: LUCENE-2458
> URL: https://issues.apache.org/jira/browse/LUCENE-2458
> Project: Lucene - Java
> Issue Type: Bug
> Components: QueryParser
> Reporter: Robert Muir
> Assignee: Robert Muir
> Priority: Blocker
> Fix For: 3.1, 4.0
>
> Attachments: LUCENE-2458.patch, LUCENE-2458.patch, LUCENE-2458.patch,
> LUCENE-2458.patch
>
>
> The queryparser automatically makes *ALL* CJK, Thai, Lao, Myanmar, Tibetan,
> ... queries into phrase queries, even though you didn't ask for one, and
> there isn't a way to turn this off.
> This completely breaks lucene for these languages, as it treats all queries
> like 'grep'.
> Example: if you query for f:abcd with StandardAnalyzer, where a,b,c,d are
> Chinese characters, you get a phrase query of "a b c d". If you use the CJK
> analyzer, it's no better: it's a phrase query of "ab bc cd", and if you use
> the SmartChinese analyzer, you get a phrase query like "ab cd". But the user
> didn't ask for one, and they cannot turn it off.
> The reason is that the code to form phrase queries is not internationally
> appropriate and assumes whitespace tokenization. If more than one token comes
> out of whitespace-delimited text, it's automatically a phrase query no matter
> what.
> The proposed patch fixes the core queryparser (with all backwards compat
> kept) to only form phrase queries when the double quote operator is used.
> Implementing subclasses can always extend the QP and auto-generate whatever
> kind of queries they want that might completely break search for languages
> they don't care about, but core general-purpose QPs should be language
> independent.
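
To make the behavior in the issue description concrete, here is a minimal, self-contained sketch. It does not use Lucene: the class PhraseQuerySketch, its methods, and the autoPhrase flag are hypothetical names for illustration only, Latin letters stand in for Chinese characters as in the example above, and emitting separate term clauses in the non-phrase case is an assumption about what a non-phrase query would look like, not a description of the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a query parser; none of these names are Lucene APIs.
class PhraseQuerySketch {

    // Overlapping bigrams, roughly what the issue reports for the CJK analyzer:
    // "abcd" -> ab, bc, cd. Works on Java chars, so surrogate pairs are not handled.
    static List<String> bigrams(String text) {
        List<String> out = new ArrayList<>();
        if (text.length() == 1) {
            out.add(text);
        }
        for (int i = 0; i + 1 < text.length(); i++) {
            out.add(text.substring(i, i + 2));
        }
        return out;
    }

    // quoted:     the user wrote the double quote operator around the text.
    // autoPhrase: the behavior the issue complains about, where any
    //             multi-token result becomes a phrase query.
    static String buildQuery(String field, List<String> tokens,
                             boolean quoted, boolean autoPhrase) {
        if (tokens.isEmpty()) {
            return "";
        }
        if (tokens.size() == 1) {
            return field + ":" + tokens.get(0);
        }
        if (quoted || autoPhrase) {
            // Phrase query: all tokens must match in order.
            return field + ":\"" + String.join(" ", tokens) + "\"";
        }
        // Assumed alternative: one independent term clause per token.
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(field).append(':').append(token);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> tokens = bigrams("abcd");
        // Unquoted input with automatic phrase generation (the reported behavior).
        System.out.println(buildQuery("f", tokens, false, true));
        // Unquoted input with phrases formed only on explicit quotes.
        System.out.println(buildQuery("f", tokens, false, false));
        // Quoted input still produces a phrase query.
        System.out.println(buildQuery("f", tokens, true, false));
    }
}
```

In this sketch the unquoted query f:abcd yields f:"ab bc cd" when autoPhrase is on and f:ab f:bc f:cd when it is off, while a quoted query yields the phrase form either way.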