On 11.02.2016 03:33, Tom Lane wrote:
Artur Zakirov writes:
[ tsearch_aff_parse_v1.patch ]
I've pushed this with some corrections --- notably, I did not like the
lack of buffer-overrun prevention, and it did the wrong thing if a line
had more than one trailing space character.
We still need
On 11.02.2016 01:19, Tom Lane wrote:
I wrote:
Artur Zakirov writes:
I think this is not a bug. It is a normal behavior. In Mac OS sscanf()
with the %s format reads the string one character at a time. The size of
letter 'х' is 2. And sscanf() separates it into two wrong characters.
That argu
On 02/10/16 23:55, Tom Lane wrote:
> Yeah, I got that --- what seems squishier is that none of the other C1
> control characters are considered whitespace?
That seems to be exactly the case:
http://www.unicode.org/Public/5.2.0/ucd/PropList.txt
09..0D, 20, 85, and A0 are the only whitespace char
Chapman Flack writes:
> On 02/10/16 17:19, Tom Lane wrote:
>> I also verified that in UTF8-based locales, isspace() thinks that 0x85 and
>> 0xA0, and no other high-bit-set values, are spaces. Not sure exactly why
> Unicode NEXT LINE (NEL) and NO-BREAK SPACE, respectively.
Yeah, I got that --- w
On 02/10/16 17:19, Tom Lane wrote:
> I also verified that in UTF8-based locales, isspace() thinks that 0x85 and
> 0xA0, and no other high-bit-set values, are spaces. Not sure exactly why
Unicode NEXT LINE (NEL) and NO-BREAK SPACE, respectively.
http://unicode.org/standard/reports/tr13/tr13-5.ht
Artur Zakirov writes:
> [ tsearch_aff_parse_v1.patch ]
I've pushed this with some corrections --- notably, I did not like the
lack of buffer-overrun prevention, and it did the wrong thing if a line
had more than one trailing space character.
We still need to look at other uses of *scanf(), but
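The two problems named above (no buffer-overrun prevention, and mishandling of more than one trailing space) can be illustrated with a small field splitter. This is only a sketch and not the committed code: `split_fields`, `is_ascii_space` and `FIELD_SIZE` are hypothetical names, and the 32-byte limit is an arbitrary choice for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FIELD_SIZE 32   /* illustrative limit, not taken from the patch */

/* ASCII-only whitespace test: never matches a byte of a UTF-8 sequence. */
static int
is_ascii_space(unsigned char c)
{
    return c == ' ' || (c >= '\t' && c <= '\r');
}

/*
 * Split line into at most maxfields whitespace-separated fields.
 * Each field is truncated to FIELD_SIZE - 1 bytes so that a long token
 * cannot overrun its buffer. Returns the number of fields found.
 */
static int
split_fields(const char *line, char fields[][FIELD_SIZE], int maxfields)
{
    int nfields = 0;

    assert(maxfields > 0);
    for (;;)
    {
        size_t len = 0;

        /* Skip any run of separators, including trailing ones. */
        while (*line && is_ascii_space((unsigned char) *line))
            line++;
        if (*line == '\0' || nfields == maxfields)
            break;

        /* Copy one field, dropping bytes that do not fit. */
        while (*line && !is_ascii_space((unsigned char) *line))
        {
            if (len < FIELD_SIZE - 1)
                fields[nfields][len++] = *line;
            line++;
        }
        fields[nfields][len] = '\0';
        nfields++;
    }
    return nfields;
}
```

Because separators are skipped before each field rather than after it, any number of trailing spaces yields no extra empty field. Note that the byte-wise truncation in this sketch could cut a multibyte character in half; a real parser would need to truncate on a character boundary or reject the line.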
Larry Rosenman writes:
> If you want, file a bug at https://bugs.freebsd.org/bugzilla
Probably not much point; the commit log shows pretty clearly that they
have been thinking about the code's behavior with multibyte characters,
so I assume they've intentionally decided to keep it like this.
On 2016-02-10 17:00, Tom Lane wrote:
Larry Rosenman writes:
On 2016-02-10 16:19, Tom Lane wrote:
I looked into the OS X sources, and found that indeed you are right:
*scanf processes the input a byte at a time, and applies isspace() to
each byte separately, even when the locale is such that th
Larry Rosenman writes:
> On 2016-02-10 16:19, Tom Lane wrote:
>> I looked into the OS X sources, and found that indeed you are right:
>> *scanf processes the input a byte at a time, and applies isspace() to
>> each byte separately, even when the locale is such that that's a
>> clearly insane thing
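The byte-at-a-time behaviour described above can be simulated without depending on any platform's libc. In the sketch below, `bytewise_isspace` is a hypothetical stand-in for an isspace() that reports 0x85 and 0xA0 as spaces, and `scan_token` is a hypothetical imitation of a `%s` conversion; neither is taken from the OS X sources. The letter 'х' is encoded in UTF-8 as the bytes 0xD1 0x85, so the second byte is mistaken for whitespace.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-byte test that treats 0x85 (NEL) and 0xA0 (NBSP)
 * as whitespace, as reported for isspace() in UTF-8 locales. */
static int
bytewise_isspace(unsigned char c)
{
    return (c >= 0x09 && c <= 0x0D) || c == 0x20 || c == 0x85 || c == 0xA0;
}

/* ASCII-only test: never matches a byte of a multibyte sequence. */
static int
ascii_isspace(unsigned char c)
{
    return (c >= 0x09 && c <= 0x0D) || c == 0x20;
}

/*
 * Copy the first whitespace-delimited token of src into dst (at most
 * dstsize - 1 bytes), scanning one byte at a time.
 * Returns the number of bytes copied.
 */
static size_t
scan_token(const char *src, char *dst, size_t dstsize,
           int (*is_space)(unsigned char))
{
    size_t n = 0;

    assert(dstsize > 0);
    while (*src && is_space((unsigned char) *src))
        src++;
    while (*src && !is_space((unsigned char) *src) && n < dstsize - 1)
        dst[n++] = *src++;
    dst[n] = '\0';
    return n;
}
```

Scanning the bytes `"\xD1\x85\xD0\xBE"` with `bytewise_isspace` stops after the lone byte 0xD1, which is consistent with the reported error `invalid byte sequence for encoding "UTF8": 0xd1`. With `ascii_isspace` the whole four-byte token is copied.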
On 2016-02-10 16:19, Tom Lane wrote:
I wrote:
Artur Zakirov writes:
I think this is not a bug. It is a normal behavior. In Mac OS sscanf()
with the %s format reads the string one character at a time. The size of
letter 'х' is 2. And sscanf() separates it into two wrong characters.
That arg
I wrote:
> Artur Zakirov writes:
>> I think this is not a bug. It is a normal behavior. In Mac OS sscanf()
>> with the %s format reads the string one character at a time. The size of
>> letter 'х' is 2. And sscanf() separates it into two wrong characters.
> That argument might be convincing if
On 10.02.2016 18:51, Teodor Sigaev wrote:
Hmm. Here
src/backend/access/transam/xlog.c read_tablespace_map()
using %s in scanf looks suspicious. I don't fully understand but it
looks like it tries to read oid as string. So, it should be safe in
the usual case
Next, _LoadBlobs() reads filename (fname)
Artur Zakirov writes:
> I agree that previous patch is wrong. Instead of using new
> parse_ooaffentry() function maybe better to use sscanf() with %ls
> format. The %ls format is used to read a wide character string.
No, that way is going to give you worse portability problems than what
we have
It seems that *scanf() with the %s format occurs only here:
- check.c - get_bin_version()
- server.c - get_major_server_version()
- filemap.c - isRelDataFile()
- pg_backup_directory.c - _LoadBlobs()
- xlog.c - do_pg_stop_backup()
- mac.c - macaddr_in()
I think here sscanf() does not work with the UTF-
On 09.02.2016 20:13, Tom Lane wrote:
I do not like this patch much. It is basically "let's stop using sscanf()
because it seems to have a bug on one platform". There are at least two
things wrong with that approach:
1. By my count there are about 80 uses of *scanf() in our code. Are we
going
Artur Zakirov writes:
>> I think the NIImportOOAffixes() in spell.c should be corrected to avoid
>> this bug.
> I have attached a patch. It adds new functions parse_ooaffentry() and
> get_nextentry() and fixes a couple comments.
I do not like this patch much. It is basically "let's stop using
On 28.01.2016 17:42, Artur Zakirov wrote:
On 27.01.2016 15:28, Artur Zakirov wrote:
On 27.01.2016 14:14, Stas Kelvich wrote:
Hi.
I tried that and can confirm the strange behaviour. The problem seems to be
with the small Cyrillic letter ‘х’. (simplest obscene language filter? =)
That can be reproduced with
On 27.01.2016 15:28, Artur Zakirov wrote:
On 27.01.2016 14:14, Stas Kelvich wrote:
Hi.
I tried that and can confirm the strange behaviour. The problem seems to be
with the small Cyrillic letter ‘х’. (simplest obscene language filter? =)
That can be reproduced with a simpler test
Stas
The test program was
On 27.01.2016 14:14, Stas Kelvich wrote:
Hi.
I tried that and can confirm the strange behaviour. The problem seems to be
with the small Cyrillic letter ‘х’. (simplest obscene language filter? =)
That can be reproduced with a simpler test
Stas
The test program was corrected. Now it uses the wchar_t type. And
Hi.
I tried that and can confirm the strange behaviour. The problem seems to be
with the small Cyrillic letter ‘х’. (simplest obscene language filter? =)
That can be reproduced with a simpler test
Stas
test.c
Description: Binary data
> On 27 Jan 2016, at 13:59, Artur Zakirov wrote:
>
> On 27.01.2016 13
On 27.01.2016 13:46, Shulgin, Oleksandr wrote:
Not sure why the file uses "SET KOI8-R" directive then?
This directive is used only by the Hunspell program. PostgreSQL ignores this
directive and assumes that the input affix and dictionary files are in the
UTF-8 encoding.
What error message do you g
On Wed, Jan 27, 2016 at 10:59 AM, Artur Zakirov
wrote:
> Hello.
>
> When a user tries to create a text search dictionary for the Russian
> language on Mac OS, the following error message appears:
>
> CREATE EXTENSION hunspell_ru_ru;
> + ERROR: invalid byte sequence for encoding "UTF8": 0xd1
>
Hello.
When a user tries to create a text search dictionary for the Russian
language on Mac OS, the following error message appears:
CREATE EXTENSION hunspell_ru_ru;
+ ERROR: invalid byte sequence for encoding "UTF8": 0xd1
+ CONTEXT: line 341 of configuration file
"/Users/stas/code/postg