Hi James,

On 10/23/2008 at 8:30 AM, James liu wrote:
> public class AnalyzerTest {
>    @Test
>    public void test() throws ParseException {
>        QueryParser parser = new MultiFieldQueryParser(new String[]{"title", 
> "body"}, new StandardAnalyzer());
>        Query query1 = parser.parse("中文"); 
>              Query query2 = parser.parse("中 文");
>             System.out.println(query1);
>             System.out.println(query2);
>    }
> }
> 
> output :
> 
> title:"中 文" body:"中 文"
> (title:中 body:中) (title:文 body:文)
> 
> why they not  same?

MultiFieldQueryParser extends QueryParser.

QueryParser first breaks up its input on whitespace, and then sends each chunk 
to the analyzer.

query1 contains no whitespace, so the entire string is sent to 
StandardAnalyzer, which produces one token per CJK character, for a total of 
two tokens.  When QueryParser receives more than one token back from the 
analyzer, it produces a PhraseQuery (unless tokens share the same position, but 
this is not applicable to your situation).  MultiFieldQueryParser specializes 
this action by producing one PhraseQuery per Field.

query2 contains a space, so the following is performed for each 
whitespace-separated chunk (in query2, this means one character per chunk):

The chunk is sent to the analyzer, which produces a single token for each CJK 
character - one token is produced because each chunk contains a single 
character.  When QueryParser receives a single token back from the analyzer, it 
produces a TermQuery.  MultiFieldQueryParser specializes this action by 
producing one TermQuery per Field, and then groups these per-Field TermQuery's 
together in a BooleanQuery, denoted in the output as a parenthesized group.  

Steve

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to