Hi,

> On Mon, 20 Apr 2026 at 19:09, Tom Lane <[email protected]> wrote:
>
>> Seems to me the correct thing here is to make it work like the other
>> cases, ie perform pg_server_to_any().  I have exactly no sympathy for
>> the argument about the RFC saying it must be UTF-8, not least because
>> that's not in fact what is implemented (what if the server encoding
>> isn't UTF-8?).
>
> Agreed. I initially thought rejecting the option was the safer route
> given the RFC, but as you pointed out, we aren't enforcing
> UTF-8 strictly on the server side anyway.
>
>> Rejecting this option altogether doesn't improve anything, not
>> functionally, not specs-compliance-wise, nor according to the
>> principle of least surprise.
>
> Makes sense. Implementing the conversion properly
> keeps JSON format consistent with how the text and CSV formats behave.
>
>> No, you don't get to punt this till later.  Once we ship v19 there's
>> going to be a strong expectation of backwards compatibility.
>>
>> The idea of sending UTF-8 to a client that's set client_encoding to
>> something else would be risible, if it weren't a security hazard.
>
> I agree sending unconverted bytes to a mismatched
> client encoding is clearly a security hazard that needs addressing. Did
> not consider the backward compatibility part, my bad.
>
> Was trying out adding pg_server_to_any() to the json_buf after
> composite_to_json() returns,
> correctly covering both explicit ENCODING option specifications and
> implicit client_encoding mismatches.
>
> Let me send a patch with code and associated test cases.

Attached is the patch, with a round-trip test case. Please review and let
me know whether it is heading in the right direction.

Regards,
Ayush
Attachment: v3-0001-Apply-encoding-conversion-in-COPY-TO-FORMAT-JSON.patch
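
For illustration only, here is a minimal Python sketch of the mismatch discussed in the thread. It is not PostgreSQL code and not the attached patch. The helper `convert_server_to_client` is a hypothetical stand-in for what a server-to-client conversion step does conceptually, and the encodings (UTF-8 server, LATIN1 client) and the sample row are assumptions chosen for the example.

```python
import json


def convert_server_to_client(payload: bytes, server_encoding: str,
                             client_encoding: str) -> bytes:
    """Hypothetical stand-in for a server-to-client encoding conversion."""
    if server_encoding.lower() == client_encoding.lower():
        # Nothing to do when both sides agree on the encoding.
        return payload
    return payload.decode(server_encoding).encode(client_encoding)


# Assumed sample row containing one non-ASCII character.
row = {"name": "café"}

# JSON text as the server would hold it, in an assumed UTF-8 server encoding.
server_bytes = json.dumps(row, ensure_ascii=False).encode("utf-8")

# Without conversion, a LATIN1 client decodes the raw UTF-8 bytes and sees
# different text from what the server stored.
misread = server_bytes.decode("latin-1")

# With conversion, the client decodes the payload in its own encoding and
# recovers the original row.
converted = convert_server_to_client(server_bytes, "utf-8", "latin-1")
round_trip = json.loads(converted.decode("latin-1"))
```

The unconverted path still parses as JSON on the client, which is part of what makes the mismatch easy to miss: the data is silently different rather than visibly broken.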
