Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-19 Thread Arjen Nienhuis
>> That's fine when not every code point is used, but it's different for
>> GB18030 where almost all code points are used. Using a plain array
>> saves space and saves a binary search.
>
> Well, it doesn't save any space: if we get rid of the additional linear
> ranges in the lookup table, what rem
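The trade-off being argued here (sorted pairs plus binary search versus a plain array indexed by code point) can be sketched roughly as follows. This is an illustration only, not PostgreSQL's C code or its .map file layout: the slice U+4E00..U+4E0F, the reference values taken from Python's own gb18030 codec, and the names `lookup_sorted` and `lookup_dense` are all assumptions made for the example.

```python
import bisect

# Small illustrative slice of the BMP (an assumption for this sketch).
BASE, END = 0x4E00, 0x4E10

# Reference values come from Python's gb18030 codec, not from any .map file.
pairs = [(cp, chr(cp).encode("gb18030")) for cp in range(BASE, END)]

# Layout 1: sorted (code point, bytes) pairs searched with binary search.
keys = [cp for cp, _ in pairs]

def lookup_sorted(cp):
    """Return the GB18030 bytes for cp via binary search, or None."""
    i = bisect.bisect_left(keys, cp)
    if i < len(keys) and keys[i] == cp:
        return pairs[i][1]
    return None

# Layout 2: plain array indexed by (cp - BASE); no keys are stored.
dense = [gb for _, gb in pairs]

def lookup_dense(cp):
    """Return the GB18030 bytes for cp via direct indexing, or None."""
    if BASE <= cp < END:
        return dense[cp - BASE]
    return None
```

The dense layout drops the stored keys and the search, but it only pays off when nearly every slot in the range is occupied, which is the property claimed for GB18030 in the quoted text.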

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-15 Thread Robert Haas
On Fri, May 15, 2015 at 3:18 PM, Tom Lane wrote:
> However, I'm not that excited about changing it. We have not heard field
> complaints about these converters being too slow. What's more, there
> doesn't seem to be any practical way to apply the same idea to the other
> conversion direction, wh

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-15 Thread Tom Lane
Arjen Nienhuis writes:
> On Fri, May 15, 2015 at 4:10 PM, Tom Lane wrote:
>> According to that, about half of the characters below U+ can be
>> processed via linear conversions, so I think we ought to save table
>> space by doing that. However, the remaining stuff that has to be
>> processed

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-15 Thread Arjen Nienhuis
On Fri, May 15, 2015 at 4:10 PM, Tom Lane wrote:
> Arjen Nienhuis writes:
>> GB18030 is a special case, because it's a full mapping of all unicode
>> characters, and most of it is algorithmically defined.
>
> True.
>
>> This makes UtfToLocal a bad choice to implement it.
>
> I disagree with that

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-15 Thread Tom Lane
Arjen Nienhuis writes:
> GB18030 is a special case, because it's a full mapping of all unicode
> characters, and most of it is algorithmically defined.

True.

> This makes UtfToLocal a bad choice to implement it.

I disagree with that conclusion. There are still 3+ characters that need to be
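As one concrete instance of the "algorithmically defined" part: GB18030 encodes U+10000 through U+10FFFF as four-byte codes running linearly from 90 30 81 30 to E3 32 9A 35, so no lookup table is needed for that range. The sketch below is in Python for brevity and is not the patch under discussion; the function names and the `GB_LINEAR_BASE` constant are hypothetical names introduced for the example.

```python
# Four-byte GB18030 codes use bytes in the ranges
# 0x81..0xFE, 0x30..0x39, 0x81..0xFE, 0x30..0x39 (126 * 10 * 126 * 10 slots).
# Linear index of bytes 90 30 81 30, which encode U+10000:
GB_LINEAR_BASE = (0x90 - 0x81) * 10 * 126 * 10  # 189000

def ucs_to_gb18030_supplementary(cp: int) -> bytes:
    """Encode a supplementary-plane code point with the linear rule."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point outside U+10000..U+10FFFF")
    linear = cp - 0x10000 + GB_LINEAR_BASE
    linear, b4 = divmod(linear, 10)
    linear, b3 = divmod(linear, 126)
    b1, b2 = divmod(linear, 10)
    return bytes([0x81 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])

def gb18030_supplementary_to_ucs(b: bytes) -> int:
    """Decode a four-byte GB18030 code from the supplementary range."""
    if len(b) != 4:
        raise ValueError("expected exactly four bytes")
    b1, b2, b3, b4 = b
    if not (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
            and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
        raise ValueError("not a well-formed four-byte GB18030 code")
    linear = (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10 + (b4 - 0x30)
    cp = linear - GB_LINEAR_BASE + 0x10000
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code is outside the supplementary-plane range")
    return cp
```

Both directions are pure arithmetic here, which is why only the remaining, non-linear part of the mapping needs a table at all.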

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-15 Thread Arjen Nienhuis
On Thu, May 14, 2015 at 11:04 PM, Tom Lane wrote:
> I wrote:
>> Robert Haas writes:
>>> On Wed, May 6, 2015 at 11:13 AM, Alvaro Herrera
>>> wrote:
>>>> Maybe not, but at the very least we should consider getting it fixed in
>>>> 9.5 rather than waiting a full development cycle. Same as in

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-14 Thread Tom Lane
I wrote:
> Robert Haas writes:
>> On Wed, May 6, 2015 at 11:13 AM, Alvaro Herrera
>> wrote:
>>> Maybe not, but at the very least we should consider getting it fixed in
>>> 9.5 rather than waiting a full development cycle. Same as in
>>> https://www.postgresql.org/message-id/20150428131549.ga25..

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Tom Lane
Robert Haas writes:
> On Wed, May 6, 2015 at 11:13 AM, Alvaro Herrera
> wrote:
>> Maybe not, but at the very least we should consider getting it fixed in
>> 9.5 rather than waiting a full development cycle. Same as in
>> https://www.postgresql.org/message-id/20150428131549.ga25...@momjian.us
>>

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Robert Haas
On Wed, May 6, 2015 at 11:13 AM, Alvaro Herrera wrote:
>> It's a behavior change, so I don't think we would consider a back-patch.
>
> Maybe not, but at the very least we should consider getting it fixed in
> 9.5 rather than waiting a full development cycle. Same as in
> https://www.postgresql.or

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Alvaro Herrera
Robert Haas wrote:
> On Wed, May 6, 2015 at 10:55 AM, Alvaro Herrera
> wrote:
> > Robert Haas wrote:
> >> On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis
> >> wrote:
> >> > Can someone look at this patch. It should fix bug #12845.
> >> >
> >> > The current tests for conversions are very minimal.

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Robert Haas
On Wed, May 6, 2015 at 10:55 AM, Alvaro Herrera wrote:
> Robert Haas wrote:
>> On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis
>> wrote:
>> > Can someone look at this patch. It should fix bug #12845.
>> >
>> > The current tests for conversions are very minimal. I expanded them a
>> > bit for this

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Alvaro Herrera
Robert Haas wrote:
> On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis wrote:
> > Can someone look at this patch. It should fix bug #12845.
> >
> > The current tests for conversions are very minimal. I expanded them a
> > bit for this bug.
> >
> > I think the binary search in the .map files should be

Re: [HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-06 Thread Robert Haas
On Tue, May 5, 2015 at 9:04 AM, Arjen Nienhuis wrote:
> Can someone look at this patch. It should fix bug #12845.
>
> The current tests for conversions are very minimal. I expanded them a
> bit for this bug.
>
> I think the binary search in the .map files should be removed but I
> leave that for a

[HACKERS] Patch for bug #12845 (GB18030 encoding)

2015-05-05 Thread Arjen Nienhuis
Hi,

Can someone look at this patch. It should fix bug #12845.

The current tests for conversions are very minimal. I expanded them a
bit for this bug.

I think the binary search in the .map files should be removed but I
leave that for another patch.

0001-Have-GB18030-handle-more-than-2-byte-Unic