On Tue, Jun 10, 2025 at 8:00 PM Branko Čibej <br...@apache.org> wrote:

> On 11. 6. 25 01:19, Nathan Hartman wrote:
>
> On Tue, Jun 10, 2025 at 12:26 PM Branko Čibej <br...@apache.org> wrote:
>
>> On 10. 6. 25 17:32, Timofei Zhakov wrote:
>>
>> On Tue, Jun 10, 2025 at 5:24 PM Branko Čibej <br...@apache.org> wrote:
>>
>>> On 10. 6. 25 16:01, Timofei Zhakov wrote:
>>>
>>> On Mon, Jun 9, 2025 at 7:26 PM Branko Čibej <br...@apache.org> wrote:
>>>
>>>> On 9. 6. 25 18:31, rin...@apache.org wrote:
>>>>
>>>> Author: rinrab
>>>> Date: Mon Jun  9 16:31:15 2025
>>>> New Revision: 1926293
>>>>
>>>> URL: http://svn.apache.org/viewvc?rev=1926293&view=rev
>>>> Log:
>>>> Do not include apr_xlate.h into svn_utf.h, but manually wrap the
>>>> SVN_APR_*_CHARSET constants into APR_*_CHARSET.
>>>>
>>>>
>>>>
>>>> -1. You can't presume that this will never change in APR.
>>>>
>>>> Fix the cmake build to handle include paths correctly. This is not that
>>>> fix. Please revert.
>>>>
>>>> -- Brane
>>>>
>>>
>>> I want to make sure that you entirely understand this change.
>>>
>>> SVN_APR_*_CHARSET family has nothing similar with APR_*_CHARSET since they 
>>> are being converted into APR constants through the get_apr_xlate_charset() 
>>> function.
>>>
>>> This also means we can add our own shortcuts to SVN_APR_*_CHARSET, which 
>>> I'm planning to do for UTF-8.
>>> As long as consumers of our library are not passing APR_*_CHARSET to our 
>>> utf routines, we are all fine.
>>>
>>>
>>> Yes. "As long as". This stuff has been in our public header since before
>>> Subversion was an ASF project. APR is and always has been part of
>>> Subversion's public interface.
>>>
>>> It doesn't make us dependent on apr's constants. Please withdraw your veto.
>>>
>>>
>>> You're changing a public header in a backward-incompatible way because
>>> the CMake build has broken include paths.
>>>
>>> You don't see anything wrong with that? Looks to me like you got your
>>> priorities wrong.
>>>
>>> -- Brane
>>>
>>>
>> Okay, I see. You are talking about removing this include, not about the
>> whole change, right? I agree that this is wrong for backward compat to
>> remove this from the private header. However, the remaining part
>> (with SVN_APR_*_CHARSET) is correct. It's still an improvement IMO, which
>> we can use later to add new charset shortcuts (like I said for utf8).
>>
>>
>> No, it is not correct, because this is *not* a private header. It's
>> public interface.
>>
>> The proper fix would be to put something like this in CMakeLists.txt:
>>
>> include_directories($<TARGET_PROPERTY:apr::apr,INCLUDE_DIRECTORIES>
>>                     $<TARGET_PROPERTY:apr::aprutil,INCLUDE_DIRECTORIES>)
>>
>>
>> Right now, your changes for utf-8 argument parsing has spilled into our
>> public API in an incompatible way, just to avoid fixing cmake.
>>
>>
>> -- Brane
>>
>
>
> Hi,
>
> Timofei, your tremendous efforts are very much appreciated.
>
> Brane, your vigilance and care is very much appreciated too.
>
> Okay, so there's a speed bump here. Please, let's all be patient and
> try to gain some clarity, and let's try to stay positive. We've seen
> quite a bit of what *shouldn't* be done; let's try to focus on what
> *should* be done...
>
> Yes, obviously I agree that we should not change public APIs, nor
> mask any problem by patching the wrong thing, such as fixing a build
> problem. Btw I think build system issues are not the motivation for
> this change...
>
> It looks to me like this patch is meant to be one step in a sequence,
> but we haven't seen the rest of the sequence, because we've paused
> here. So we're not understanding the full picture in context.
>
> So, if we could understand what the rest of the sequence will be? What
> problem are we trying to solve? What is the intended solution?
>
> From my reading of the diff:
>
> It looks like the intention is to "subclass" and extend APR's charset
> stuff by doing two things:
>
> (1) defining SVN_APR_LOCALE_CHARSET and SVN_APR_DEFAULT_CHARSET
>     "on top of" APR_LOCALE_CHARSET and APR_DEFAULT_CHARSET, to set the
>     stage for adding further definitions (which we haven't seen yet),
>     and
>
> (2) adding the function get_apr_xlate_charset(), which currently is a
>     no-op, but appears as though it intends to be extended later to
>     examine CHARSET and either pass it through unchanged or translate
>     it to a different code.
>
> Am I correct above the above? If not, where am I mistaken?
>
> In APR, APR_LOCALE_CHARSET and APR_DEFAULT_CHARSET are #define'd to
> 0 and 1, respectively, typecast to (const char *). The reason for this
> typecasting is so that you can either pass one of the above names, or
> pass a string, like "ISO-8559-1".
>
> Several things are not clear to me:
>
> (1) How does APR intend for applications to extend beyond these two
>     codes?
>
>
> It doesn't. These are magic pointers for the only two encodings that the
> application can't know directly.
>
>     Perhaps it's one of these possibilities:
>
>     (a) APR intends for applications to use either APR_LOCALE_CHARSET
>         and APR_DEFAULT_CHARSET or use strings. Nothing else. Or,
>
>
> This.
>
>     (b) APR intends for applications to add their own defines, and
>         perhaps use the above defines and/or strings also. Or,
>
>
> Sure applications can define their own string constants, just not override
> those two, which are special.
>
>     (c) APR hasn't considered this question. Or,
>
>
> Yes, we did. :)
>


Thanks for your response. This is extremely helpful!

    (d) APR hasn't considered this question because it should never
>         be necessary to do so. If so, what *are* we supposed to do?
>
>     Brane's insights here will be particularly appreciated.
>
>
> I frankly don't understand why we'd need any more than what APR gives us.
> Subversion certainly doesn't want to deal with anyhting explicit other than
> UTF-8 (and, maybe in the future, the other UTF's that tend to be used in
> text files, such as UTF-16-LE on Windows).
>

>
> Those constants are intended to make it easy for the application to create
> a translator *from* whatever encoding the locale give and the source
> compiler give us. Everything else, as far as we're concerned, is either
> utf-8 (internally) or something specific about a file's contents that we
> can get from svn:mime-type. Not that we currently do, as far as I'm aware.
>


Ack.

(2) Perhaps more importantly, does APR lack something that SVN needs
>     in this area? If so, what?
>
> Now, a suggestion. A picture is worth a thousand words, and in
> software development, code is pictures. So perhaps things would be
> more clear to everyone by doing something like:
>
> (1) Revert the change on trunk.
>
> (2) Fix the CMake build if indeed it's broken right now; I saw
>     Brane's response but I haven't checked. Fixing it will remove
>     anything CMake-related from the equation, so we can focus on the
>     charset issue without doubts.
>
>
> The CMake build is broken with regard to include directories, that's a
> given, and it's easy to fix. Perhaps not in such a cmake-purist way that
> some people around here appear to advocate, so let me just drop this
> reminder here: a build tool is a *tool*, meant to make life easier for our
> users and *us*. The point is not to try to get some sort of literary award
> for purism.
>


Yes, after my earlier reply I saw the GHA notification about the build
breaking on the revision that added back the #include in question. So
that's obviously something to be fixed. I wonder why this particular
#include in this specific place? After all, we're including other APR
headers elsewhere. But I haven't had time to analyze this yet. Probably a
bisect would find the culprit without much manual analysis. I can't figure
it out by wading through the notification emails. If it continues to be
broken tomorrow I'll try to help.

(3) Create a branch and make this change + the further changes that
>     are supposed to build upon it on the branch...
>
> Now, with a branch and with all the changes together, we will all be
> a lot smarter because we will see everything in context. Right now,
> we're seeing one change in a vacuum and making judgments without the
> full information.
>
> Even if the branch turns out to approach the problem from the wrong
> direction, at least we'll be able to see the full picture and get some
> better feedback about how to do it right.
>
> Again, I agree about not changing the public APIs. That's what we
> should *not* do. I'm trying to get at: what *should* we do?
>
>
> As I said, I see no need for any extra public constants just to get our
> command-line tools to parse their arguments correctly. All we need is for
> that is APR_LOCALE_CHARSET, or rather SVN_APR_LOCALE_CHARSET, which our
> conversion functions already use.
>
> If, on the other hand, this is not the case -- then I'd really like to see
> some discussion and design proposal here on the list. Not just throwing
> code at a branch or at trunk with no real explanation and apparently no
> plan, because right now I have absolutely no clue what's on the branch,
> what's on trunk, what will be merged or if will be merged at all.
>


I admit I'm a bit confused also, though partly that's because a lot has
been happening here with Timofei's efforts. We've gotten accustomed to
things being kind of dormant for a while. I'm frankly glad to see that
change.

And yes, I also don't know why this change is on the trunk rather than the
branch, why the need for our own defines, etc. But I'm going to assume
there is a plan and I'll wait to hear from Timofei before reaching any
conclusions. I'm +1 to discussing the aim and intention of this change and
how it fits in with whatever it tries to achieve. Once we understand that,
we can all be more helpful, and, again, insights from your extensive
knowledge of the APR/SVN interaction, and any other guidance you can offer,
will be greatly appreciated.


Originally the idea was to fix a known bug. I can't help but feel that this
> bug fixing has gone slightly out of control, and the goal now is something
> else -- but I don't know what.
>
> -- Brane
>


Cheers,
Nathan

Reply via email to