[ 
https://issues.apache.org/jira/browse/SOLR-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892117#action_12892117
 ] 

Robert Muir commented on SOLR-2015:
-----------------------------------

bq. Aside: some of these people would like multiple languages in the same 
field, which is part of the reason why I always felt that the information 
about how two tokens are related should be produced by the tokenizer/filter 
creating such tokens.

I don't think we should design our APIs around such hacks, especially unproven 
ones. I don't think automatic phrase generation actually helps English at all, 
and no one has shown results anywhere that it does. The reason I don't think 
it helps is that any improvement in precision is accompanied by a decrease in 
recall: e.g. in this example from the user list, not using the phrase query 
would find the document, but using the phrase query doesn't. 
http://www.lucidimagination.com/search/document/bacf34995067e3cb/worddelimiterfilter_and_phrase_queries
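
To make the precision/recall trade-off concrete, here is a minimal sketch in 
plain Python. It is not Lucene or Solr code, and the query and document 
strings are hypothetical rather than taken from the linked thread. It assumes 
the analyzer splits "mp3-player" into two tokens, and compares "all terms must 
occur" matching with phrase matching:

```python
import re


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_all_terms(query_tokens, doc_tokens):
    """Boolean AND: every query token occurs somewhere in the document."""
    return all(token in doc_tokens for token in query_tokens)


def matches_phrase(query_tokens, doc_tokens):
    """Phrase match: the query tokens occur adjacently and in the same order."""
    n = len(query_tokens)
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


# Hypothetical data: the query is split into ["mp3", "player"].
query = tokenize("mp3-player")
doc = tokenize("player for mp3 files")

print(matches_all_terms(query, doc))  # the non-phrase query finds the document
print(matches_phrase(query, doc))     # the auto-generated phrase query does not
```

The document is arguably relevant, but it is only found when the split tokens 
are not forced into a phrase.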

Furthermore, I don't think we should try to build complicated support for 
multiple languages. Instead we should support simple, proven approaches that 
actually work, such as simple language-independent tokenization or n-gram 
analysis, rather than trying to support fine-grained detection and fancy stuff 
that overly complicates APIs and only provides worse results: 
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.111.6844
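
As a rough illustration of what language-independent n-gram analysis means, 
the sketch below produces overlapping character n-grams without any language 
detection. It is a toy in plain Python, not the behavior of any particular 
Lucene tokenizer; the function name and the default of n=2 are assumptions 
made for the example:

```python
def char_ngrams(text, n=2):
    """Return overlapping character n-grams for each whitespace-separated word.

    Words shorter than n are kept whole so that they remain searchable.
    """
    grams = []
    for word in text.lower().split():
        if len(word) < n:
            grams.append(word)
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


# The same code path handles text in any script; no language detection is needed.
print(char_ngrams("Solr"))
print(char_ngrams("東京都"))
```

The same function is applied at index time and at query time, so matching 
reduces to comparing n-grams regardless of the language of the text.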


> add a config hook for autoGeneratePhraseQueries
> -----------------------------------------------
>
>                 Key: SOLR-2015
>                 URL: https://issues.apache.org/jira/browse/SOLR-2015
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 3.1, 4.0
>            Reporter: Koji Sekiguchi
>            Assignee: Yonik Seeley
>            Priority: Blocker
>             Fix For: 3.1, 4.0
>
>         Attachments: SOLR-2015.patch, SOLR-2015.patch, SOLR-2015.patch
>
>
> Now that LUCENE-2458 is committed, a hook for autoGeneratePhraseQueries would be 
> convenient in some situations.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

