Re: A bug when use get_bit() function for a long bytea string

Ashutosh Bapat Thu, 26 Mar 2020 06:33:17 -0700

On Wed, 18 Mar 2020 at 08:18, movead...@highgo.ca <movead...@highgo.ca>
wrote:


>
> Hello thanks for the detailed review,
>
> >I think the second argument indicates the bit position, which would be
> max bytea length * 8. If max bytea length covers whole int32, the second
> argument >needs to be wider i.e. int64.
> Yes, it makes sence and followed.
>
>
I think we need a similar change in byteaGetBit() and byteaSetBit() as well.


>
> > Some more comments on the patch
> > struct pg_encoding
> > {
> >- unsigned (*encode_len) (const char *data, unsigned dlen);
> >+ int64 (*encode_len) (const char *data, unsigned dlen);
> >  unsigned (*decode_len) (const char *data, unsigned dlen);
> >  unsigned (*encode) (const char *data, unsigned dlen, char *res);
> >  unsigned (*decode) (const char *data, unsigned dlen, char *res);
> > Why not use return type of int64 for rest of the functions here as well?
> >  res = enc->encode(VARDATA_ANY(data), datalen, VARDATA(result));
> >  /* Make this FATAL 'cause we've trodden on memory ... */
> >- if (res > resultlen)
> >+ if ((int64)res > resultlen)
> >
> >if we change return type of all those functions to int64, we won't need
> this cast.
> I change the 'encode' function, it needs an int64 return type, but keep
> other
>
> functions in 'pg_encoding', because I think it of no necessary reason.
>
>
Ok, let's leave it for a committer to decide.


> >Right now we are using int64 because bytea can be 1GB, but what if we
> increase
> >that limit tomorrow, will int64 be sufficient? That may be unlikely in
> the near
> >future, but nevertheless a possibility. Should we then define a new
> datatype
> >which resolves to int64 right now but really depends upon the bytea
> length. I
> >am not suggesting that we have to do it right now, but may be something to
> >think about.
> I decide to use int64 because if we really want to increase the limit,  it
> should be
> the same change with 'text', 'varchar' which have the same limit. So it may
> need
> a more general struct. Hence I give up the idea.
>
>
Hmm, Let's see what a committer says.

Some more review comments.
+   int64       res,resultlen;

We need those on separate lines, possibly.

+   byteNo = (int32)(n / BITS_PER_BYTE);
Does it hurt to have byteNo as int64 so as to avoid a cast. Otherwise,
please
add a comment explaining the reason for the cast. The comment applies at
other
places where this change appears.

-       int         len;
+       int64       len;
Why do we need this change?
        int         i;


>
>
> >It might help to add a test where we could pass the second argument
> something
> >greater than 1G. But it may be difficult to write such a test case.
> Add two test cases.
>
>
+
+select get_bit(
+        set_bit((repeat('Postgres', 512 * 1024 * 1024 / 8))::bytea, 1024 *
1024 * 1024 + 1, 0)
+       ,1024 * 1024 * 1024 + 1);

This bit position is still within int4.
postgres=# select pg_column_size(1024 * 1024 * 1024 + 1);
 pg_column_size
----------------
              4
(1 row)

You want something like
postgres=# select pg_column_size(512::bigint * 1024 * 1024 * 8);
 pg_column_size
----------------
              8
(1 row)

-- 
Best Wishes,
Ashutosh

Re: A bug when use get_bit() function for a long bytea string

Reply via email to