On Fri, 13 Mar 2020 at 08:48, movead...@highgo.ca <movead...@highgo.ca> wrote:
> Thanks for the reply.
>
> >Why have you used size? Shouldn't we use int64?
> Yes, thanks for the point, I have changed the patch.
>
> Thanks for the patch.
>
> >If get_bit()/set_bit() accept the second argument as int32, it can not
> >be used to set bits whose number does not fit 32 bits. I think we need
> >to change the type of the second argument as well.
> Because int32 can cover the length of bytea that PostgreSQL support,

I think the second argument indicates the bit position, which would be
max bytea length * 8. If the max bytea length covers the whole of int32,
the second argument needs to be wider, i.e. int64.

Some more comments on the patch:

 struct pg_encoding
 {
-	unsigned	(*encode_len) (const char *data, unsigned dlen);
+	int64		(*encode_len) (const char *data, unsigned dlen);
 	unsigned	(*decode_len) (const char *data, unsigned dlen);
 	unsigned	(*encode) (const char *data, unsigned dlen, char *res);
 	unsigned	(*decode) (const char *data, unsigned dlen, char *res);

Why not use a return type of int64 for the rest of the functions here as
well?

 	res = enc->encode(VARDATA_ANY(data), datalen, VARDATA(result));

 	/* Make this FATAL 'cause we've trodden on memory ... */
-	if (res > resultlen)
+	if ((int64) res > resultlen)

If we change the return type of all those functions to int64, we won't
need this cast.

Right now we are using int64 because a bytea can be 1GB, but what if we
increase that limit tomorrow? Will int64 still be sufficient? That may be
unlikely in the near future, but it is nevertheless a possibility. Should
we then define a new datatype which resolves to int64 right now but
really depends upon the bytea length? I am not suggesting that we have to
do it right now, but it may be something to think about.

 hex_enc_len(const char *src, unsigned srclen)
 {
-	return srclen << 1;
+	return (int64)(srclen << 1);

Why not convert srclen to int64 as well? That would also change the
pg_encoding member definitions.
But if the encoded length requires int64 to fit the possible values, the
same would be true for the source lengths. Why can't the source length
also be int64? If we still want the cast, I think it should be
((int64) srclen << 1) rather than casting the result, since the shift
itself can overflow before the cast is applied.

 	/* 3 bytes will be converted to 4, linefeed after 76 chars */
-	return (srclen + 2) * 4 / 3 + srclen / (76 * 3 / 4);
+	return (int64)((srclen + 2) * 4 / 3 + srclen / (76 * 3 / 4));

Similar comments as above: (srclen + 2) * 4 can overflow in 32-bit
arithmetic before the result is widened.

 SELECT set_bit(B'0101011000100100', 16, 1);	-- fail
 ERROR:  bit index 16 out of valid range (0..15)
+SELECT get_bit(
+    set_bit((repeat('Postgres', 512 * 1024 * 1024 / 8))::bytea, 0, 0)
+    ,0);
+ get_bit
+---------
+       0
+(1 row)

It might help to add a test where we pass something greater than 1G as
the second argument, but it may be difficult to write such a test case.

--
Best Wishes,
Ashutosh