Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-13 Thread Teodor Sigaev
> Precise definition for "latin" in C locale please. Are you saying that
> single byte encoding with range 0-7f is "latin"? If so, it seems they
> are exactly the same as ASCII.

p_islatin returns true for ASCII alpha characters.

--
Teodor Sigaev E-mail: [EMAIL PROTECTED

Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-13 Thread Tatsuo Ishii
> > I know. My guess is the parser does not read the stop word file at
> > least with default configuration.
>
> Parser should not read the stopword file: that is the dictionaries' job.

I'll come up with more detailed info, explaining why stopword file is not read.

> > So if a character is not ASCII, it

Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-13 Thread Teodor Sigaev
> I know. My guess is the parser does not read the stop word file at
> least with default configuration.

Parser should not read the stopword file: that is the dictionaries' job.

> So if a character is not ASCII, it returns 0 even if p_isalpha returns
> 1. Is this what you expect?

No, p_islatin should return
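A minimal sketch of the behaviour under discussion, assuming p_islatin amounts to an ASCII-letter test. The name `sketch_is_ascii_alpha` is hypothetical and this is not the tsearch2 source; it only illustrates why any byte above 0x7f yields 0 here, whatever a locale-aware alphabetic test might report for it.

```c
#include <stdbool.h>

/* Hypothetical helper, not the tsearch2 code: true only for ASCII letters,
 * independent of the current locale. */
bool sketch_is_ascii_alpha(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
```

With this definition a byte such as 0xE9 is rejected even in a locale where an alphabetic test would accept it.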

Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-12 Thread Tatsuo Ishii
> > Currently tsearch2 does not accept non ascii stop words if locale is
> > C. Included patches should fix the problem. Patches against PostgreSQL
> > 8.2.3.
>
> I'm not sure about correctness of patch's description.
>
> First, p_islatin() function is used only in words/lexemes parser, not
> st

Re: [HACKERS] tsearch2: enable non ascii stop words with C locale

2007-02-12 Thread Teodor Sigaev
> Currently tsearch2 does not accept non ascii stop words if locale is
> C. Included patches should fix the problem. Patches against PostgreSQL
> 8.2.3.

I'm not sure about correctness of patch's description.

First, p_islatin() function is used only in words/lexemes parser, not stop-word code. Second
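To illustrate the separation described in this message (the parser classifies characters, while stop words are handled elsewhere), here is a rough sketch of a stop-word check done on the dictionary side. The function `sketch_is_stopword` and its in-memory list are assumptions made for illustration, not the tsearch2 API. It compares raw bytes, so it does not depend on any character classification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical dictionary-side check, not the tsearch2 API: a byte-wise
 * comparison against an in-memory stop list. No isalpha-style test is
 * involved, so non-ASCII entries are compared like any other bytes. */
bool sketch_is_stopword(const char *lexeme, const char *const *stoplist, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (strcmp(lexeme, stoplist[i]) == 0)
            return true;
    }
    return false;
}
```

A caller would pass the lexeme produced by the parser together with the loaded list, for example `sketch_is_stopword(word, list, count)`.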