On Wed, 18 Mar 2020 at 08:18, movead...@highgo.ca <movead...@highgo.ca> wrote:
> > Hello thanks for the detailed review, > > >I think the second argument indicates the bit position, which would be > max bytea length * 8. If max bytea length covers whole int32, the second > argument >needs to be wider i.e. int64. > Yes, it makes sence and followed. > > I think we need a similar change in byteaGetBit() and byteaSetBit() as well. > > > Some more comments on the patch > > struct pg_encoding > > { > >- unsigned (*encode_len) (const char *data, unsigned dlen); > >+ int64 (*encode_len) (const char *data, unsigned dlen); > > unsigned (*decode_len) (const char *data, unsigned dlen); > > unsigned (*encode) (const char *data, unsigned dlen, char *res); > > unsigned (*decode) (const char *data, unsigned dlen, char *res); > > Why not use return type of int64 for rest of the functions here as well? > > res = enc->encode(VARDATA_ANY(data), datalen, VARDATA(result)); > > /* Make this FATAL 'cause we've trodden on memory ... */ > >- if (res > resultlen) > >+ if ((int64)res > resultlen) > > > >if we change return type of all those functions to int64, we won't need > this cast. > I change the 'encode' function, it needs an int64 return type, but keep > other > > functions in 'pg_encoding', because I think it of no necessary reason. > > Ok, let's leave it for a committer to decide. > >Right now we are using int64 because bytea can be 1GB, but what if we > increase > >that limit tomorrow, will int64 be sufficient? That may be unlikely in > the near > >future, but nevertheless a possibility. Should we then define a new > datatype > >which resolves to int64 right now but really depends upon the bytea > length. I > >am not suggesting that we have to do it right now, but may be something to > >think about. > I decide to use int64 because if we really want to increase the limit, it > should be > the same change with 'text', 'varchar' which have the same limit. So it may > need > a more general struct. Hence I give up the idea. > > Hmm, Let's see what a committer says. Some more review comments. + int64 res,resultlen; We need those on separate lines, possibly. + byteNo = (int32)(n / BITS_PER_BYTE); Does it hurt to have byteNo as int64 so as to avoid a cast. Otherwise, please add a comment explaining the reason for the cast. The comment applies at other places where this change appears. - int len; + int64 len; Why do we need this change? int i; > > > >It might help to add a test where we could pass the second argument > something > >greater than 1G. But it may be difficult to write such a test case. > Add two test cases. > > + +select get_bit( + set_bit((repeat('Postgres', 512 * 1024 * 1024 / 8))::bytea, 1024 * 1024 * 1024 + 1, 0) + ,1024 * 1024 * 1024 + 1); This bit position is still within int4. postgres=# select pg_column_size(1024 * 1024 * 1024 + 1); pg_column_size ---------------- 4 (1 row) You want something like postgres=# select pg_column_size(512::bigint * 1024 * 1024 * 8); pg_column_size ---------------- 8 (1 row) -- Best Wishes, Ashutosh