On Thu, Apr 6, 2017 at 1:33 AM, Heikki Linnakangas <hlinn...@iki.fi> wrote:
> Attached is a new version. Notable changes since yesterday:
>
> * Implemented the rest of the SASLPrep, mapping some characters to spaces,
> leaving out others, and checking for prohibited characters and bidirectional
> strings.
>
> * Moved things around. There's now a separate directory, src/common/unicode,
> which contains the perl scripts and the test code. Those are not needed to
> build from source, as the pre-generated tables are put in
> src/include/common. Similar to the scripts in src/backend/utils/mb/Unicode,
> really.
>
> * Renamed many things from utf_* to unicode_*, since they don't deal with
> utf-8 input anymore.
>
> This is starting to shape up, but still some cleanup work to do. I will
> continue tomorrow..

Thanks for the new patch, it is looking nice. However, I was not able to
compile it, as saslprep.h is missing from what you have sent.

There is for example this portion in the new tables:
+static const Codepoint prohibited_output_chars[] =
+{
+	0xD800, 0xF8FF, /* C.3, C.5 */

----- Start Table C.5 -----
D800-DFFF; [SURROGATE CODES]
----- End Table C.5 -----

This indicates a range of values. Wouldn't it be better to split this
table in two, one for the ranges of codepoints and another one for the
single entries?

+	0x1D173, 0x1D17A, /* C.2.2 */
This is for musical symbols. It seems to me that checking for a range is
what is intended here.
--
Michael
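
For illustration, here is a minimal sketch of the kind of range check
meant above. It is not taken from the patch: the names codepoint_t,
cp_range, example_ranges and in_ranges are made up for this example, and
the table holds only the two entries quoted above, each treated as an
inclusive first/last pair. It assumes the ranges are sorted and do not
overlap, so a binary search can be used.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for this sketch; the patch has its own Codepoint type. */
typedef uint32_t codepoint_t;

/* One inclusive range of code points. */
typedef struct
{
	codepoint_t first;
	codepoint_t last;
} cp_range;

/*
 * Example table only: the two entries quoted above, read as ranges.
 * Must be sorted by 'first' and non-overlapping.
 */
static const cp_range example_ranges[] = {
	{0xD800, 0xF8FF},		/* C.3, C.5 */
	{0x1D173, 0x1D17A}		/* C.2.2 */
};

/* Binary search: returns true if cp falls inside any of the n ranges. */
static bool
in_ranges(codepoint_t cp, const cp_range *ranges, size_t n)
{
	size_t		lo = 0;
	size_t		hi = n;

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (cp < ranges[mid].first)
			hi = mid;
		else if (cp > ranges[mid].last)
			lo = mid + 1;
		else
			return true;
	}
	return false;
}
```

With a layout like this, a single code point could be stored as a range
whose first and last values are equal, or kept in a separate table of
single entries as suggested above.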