On Thursday, December 7, 2023, Joe Conway <m...@joeconway.com> wrote:
> On 12/7/23 08:35, Daniel Verite wrote: > >> Joe Conway wrote: >> >> The attached should fix the CopyOut response to say one column. I.e. it >>> ought to look something like: >>> >> >> Spending more time with the doc I came to the opinion that in this bit >> of the protocol, in CopyOutResponse (B) >> ... >> Int16 >> The number of columns in the data to be copied (denoted N below). >> ... >> >> this number must be the number of columns in the source. >> That is for COPY table(a,b,c) the number is 3, independently >> on whether the result is formatted in text, cvs, json or binary. >> >> I think that changing it for json can reasonably be interpreted >> as a protocol break and we should not do it. >> >> The fact that this value does not help parsing the CopyData >> messages that come next is not a new issue. A reader that >> doesn't know the field separator and whether it's text or csv >> cannot parse these messages into fields anyway. >> But just knowing how much columns there are in the original >> data might be useful by itself and we don't want to break that. >> > > Ok, that sounds reasonable to me -- I will revert that change. > > The other question for me is, in the CopyData message, this >> bit: >> " Messages sent from the backend will always correspond to single data >> rows" >> >> ISTM that considering that the "[" starting the json array is a >> "data row" is a stretch. >> That might be interpreted as a protocol break, depending >> on how strict the interpretation is. >> > > If we really think that is a problem I can see about changing it to this > format for json array: > > 8<------------------ > copy > ( > with ss(f1, f2) as > ( > select 1, g.i from generate_series(1, 3) g(i) > ) > select ss from ss > ) to stdout (format json, force_array); > [{"ss":{"f1":1,"f2":1}} > ,{"ss":{"f1":1,"f2":2}} > ,{"ss":{"f1":1,"f2":3}}] > 8<------------------ > > Is this acceptable to everyone? > > Or maybe this is preferred? > 8<------------------ > [{"ss":{"f1":1,"f2":1}}, > {"ss":{"f1":1,"f2":2}}, > {"ss":{"f1":1,"f2":3}}] > 8<------------------ > > Or as long as we are painting the shed, maybe this? > 8<------------------ > [{"ss":{"f1":1,"f2":1}}, > {"ss":{"f1":1,"f2":2}}, > {"ss":{"f1":1,"f2":3}}] > 8<------------------ > Those are all the same breakage though - if truly interpreted as data rows the protocol is basically written such that the array format is not supportable and only the lines format can be used. Hence my “format 0 doesn’t work” comment for array output and we should explicitly add format 2 where we explicitly decouple lines of output from rows of data. That said, it would seem in practice format 0 already decouples them and so the current choice of the brackets on their own lines is acceptable. I’d prefer to keep them on their own line. I also don’t know why you introduced another level of object nesting here. That seems quite undesirable. David J.