On Thu, Jun 4, 2026 at 3:22 PM Branko Čibej <[email protected]> wrote:

> On 4. 6. 2026 13:48, Ivan Zhakov wrote:
>
> Hi,
>
> I was reviewing r1931334 nomination and noticed that svn__adler32() uses
> apr_off_t for buffer size. The apr_off_t stands for an offset in a file. It
> can be larger or even smaller than maximum addressable size (apr_size_t).
>
>
>
> The diff code uses off_t because it deals with file sizes. The svn_adler32
> wrapper probably inherited that. According to POSIX.1 [1] there is no way
> that off_t can be smaller than size_t but it can certainly be larger. It's
> perfectly normal to have a system with 32-bit size_t and 64-bit off_t, so
> in that respect, your proposed change is a regression.
>
> The other major change here is that off_t is signed but size_t is unsigned.
>
> Taken together, the patch looks OK but -1 to the concept. I do not see any
> improvement. The assumption that's the motivation for the patch, that off_t
> can be smaller than size_t, is incorrect.
>
> -- Brane
>
>
> I have prepared patch that fixes this. Patch changes the diff code which
> very complicated, so review will be much appreciated.
>
> Log message:
> [[[
> Use apr_size_t instead of apr_off_t for buffer size in svn__adler32().
>
> The apr_off_t stands for an offset in a file. It can be larger or even
> smaller
> than maximum addressable size (apr_size_t).
>
> Update all callers.
>
> * subversion/include/private/svn_adler32.h
> * subversion/libsvn_subr/adler32.c
>   (svn__adler32): Use apr_size_t instead of apr_off_t for SIZE argument.
>
> * subversion/libsvn_diff/diff.h
> * subversion/libsvn_diff/util.c
>   (svn_diff__normalize_buffer): Use apr_size_t instead of apr_off_t for
>    LENGTGP argument.
>
>
> LENGTHP?
>

>
> * subversion/libsvn_diff/diff_file.c
>   (datasource_get_next_token): Use apr_size_t instead of apr_off_t for
>    local variable LENGTH.
>   (token_compare): Rewrite a code a bit to make it clear that size is
>    apr_size_t.
>
> * subversion/libsvn_diff/diff_memory.c
>   (datasource_get_next_token, token_compare): Use apr_size_t instead of
>    apr_off_t for local variables to match source type.
> ]]]
>
>
>
> Looking at the patch itself:
>
> I don't see the need to complicate things by introducing a 'read_length'
> variable in subversion/libsvn_diff/diff_file.c. It's not really an
> optimisation and to my eyes it makes the code less clear, not more.
>
> -- Brane
>
> [1] see attached image.
>


I would argue and support this change. We use size_t whenever we deal with
memory (including buffers, offsets in for loops, and anything of this
kind). off_t is only applicable for files in the context of a file system.
I'm pretty sure that most systems that work on 32 bit architectures (if
there are still any) allow for file sizes more than the amount of
addressable space. So these types could be different and size_t is IMO more
correct.

Okay this is a situation where a smaller type could be converted to a
larger one which will technically never be an issue but it doesn't mean
it's correct...

-- 
Timofei Zhakov

Reply via email to