On 11. 6. 25 01:19, Nathan Hartman wrote:
On Tue, Jun 10, 2025 at 12:26 PM Branko Čibej <br...@apache.org> wrote:
On 10. 6. 25 17:32, Timofei Zhakov wrote:
On Tue, Jun 10, 2025 at 5:24 PM Branko Čibej <br...@apache.org>
wrote:
On 10. 6. 25 16:01, Timofei Zhakov wrote:
On Mon, Jun 9, 2025 at 7:26 PM Branko Čibej
<br...@apache.org> wrote:
On 9. 6. 25 18:31, rin...@apache.org wrote:
Author: rinrab
Date: Mon Jun 9 16:31:15 2025
New Revision: 1926293
URL:http://svn.apache.org/viewvc?rev=1926293&view=rev
<http://svn.apache.org/viewvc?rev=1926293&view=rev>
Log:
Do not include apr_xlate.h into svn_utf.h, but manually wrap the
SVN_APR_*_CHARSET constants into APR_*_CHARSET.
-1. You can't presume that this will never change in APR.
Fix the cmake build to handle include paths correctly.
This is not that fix. Please revert.
-- Brane
I want to make sure that you entirely understand this change.
SVN_APR_*_CHARSETfamily has nothing similar with APR_*_CHARSETsince
they are being converted into APR constants through the
get_apr_xlate_charset()function.
This also means we can add our own shortcuts to SVN_APR_*_CHARSET,
which I'm planning to do for UTF-8.
As long as consumers of our library are not passing APR_*_CHARSETto our
utf routines, we are all fine.
Yes. "As long as". This stuff has been in our public header
since before Subversion was an ASF project. APR is and always
has been part of Subversion's public interface.
It doesn't make us dependent on apr's constants. Please
withdraw your veto.
You're changing a public header in a backward-incompatible
way because the CMake build has broken include paths.
You don't see anything wrong with that? Looks to me like you
got your priorities wrong.
-- Brane
Okay, I see. You are talking about removing this include, not
about the whole change, right? I agree that this is wrong for
backward compat to remove this from the private header. However,
the remaining part (with SVN_APR_*_CHARSET) is correct. It's
still an improvement IMO, which we can use later to add new
charset shortcuts (like I said for utf8).
No, it is not correct, because this is *not* a private header.
It's public interface.
The proper fix would be to put something like this in CMakeLists.txt:
include_directories($<TARGET_PROPERTY:apr::apr,INCLUDE_DIRECTORIES>
$<TARGET_PROPERTY:apr::aprutil,INCLUDE_DIRECTORIES>)
Right now, your changes for utf-8 argument parsing has spilled
into our public API in an incompatible way, just to avoid fixing
cmake.
-- Brane
Hi,
Timofei, your tremendous efforts are very much appreciated.
Brane, your vigilance and care is very much appreciated too.
Okay, so there's a speed bump here. Please, let's all be patient and
try to gain some clarity, and let's try to stay positive. We've seen
quite a bit of what *shouldn't* be done; let's try to focus on what
*should* be done...
Yes, obviously I agree that we should not change public APIs, nor
mask any problem by patching the wrong thing, such as fixing a build
problem. Btw I think build system issues are not the motivation for
this change...
It looks to me like this patch is meant to be one step in a sequence,
but we haven't seen the rest of the sequence, because we've paused
here. So we're not understanding the full picture in context.
So, if we could understand what the rest of the sequence will be? What
problem are we trying to solve? What is the intended solution?
From my reading of the diff:
It looks like the intention is to "subclass" and extend APR's charset
stuff by doing two things:
(1) defining SVN_APR_LOCALE_CHARSET and SVN_APR_DEFAULT_CHARSET
"on top of" APR_LOCALE_CHARSET and APR_DEFAULT_CHARSET, to set the
stage for adding further definitions (which we haven't seen yet),
and
(2) adding the function get_apr_xlate_charset(), which currently is a
no-op, but appears as though it intends to be extended later to
examine CHARSET and either pass it through unchanged or translate
it to a different code.
Am I correct above the above? If not, where am I mistaken?
In APR, APR_LOCALE_CHARSET and APR_DEFAULT_CHARSET are #define'd to
0 and 1, respectively, typecast to (const char *). The reason for this
typecasting is so that you can either pass one of the above names, or
pass a string, like "ISO-8559-1".
Several things are not clear to me:
(1) How does APR intend for applications to extend beyond these two
codes?
It doesn't. These are magic pointers for the only two encodings that the
application can't know directly.
Perhaps it's one of these possibilities:
(a) APR intends for applications to use either APR_LOCALE_CHARSET
and APR_DEFAULT_CHARSET or use strings. Nothing else. Or,
This.
(b) APR intends for applications to add their own defines, and
perhaps use the above defines and/or strings also. Or,
Sure applications can define their own string constants, just not
override those two, which are special.
(c) APR hasn't considered this question. Or,
Yes, we did. :)
(d) APR hasn't considered this question because it should never
be necessary to do so. If so, what *are* we supposed to do?
Brane's insights here will be particularly appreciated.
I frankly don't understand why we'd need any more than what APR gives
us. Subversion certainly doesn't want to deal with anyhting explicit
other than UTF-8 (and, maybe in the future, the other UTF's that tend to
be used in text files, such as UTF-16-LE on Windows).
Those constants are intended to make it easy for the application to
create a translator *from* whatever encoding the locale give and the
source compiler give us. Everything else, as far as we're concerned, is
either utf-8 (internally) or something specific about a file's contents
that we can get from svn:mime-type. Not that we currently do, as far as
I'm aware.
(2) Perhaps more importantly, does APR lack something that SVN needs
in this area? If so, what?
Now, a suggestion. A picture is worth a thousand words, and in
software development, code is pictures. So perhaps things would be
more clear to everyone by doing something like:
(1) Revert the change on trunk.
(2) Fix the CMake build if indeed it's broken right now; I saw
Brane's response but I haven't checked. Fixing it will remove
anything CMake-related from the equation, so we can focus on the
charset issue without doubts.
The CMake build is broken with regard to include directories, that's a
given, and it's easy to fix. Perhaps not in such a cmake-purist way that
some people around here appear to advocate, so let me just drop this
reminder here: a build tool is a *tool*, meant to make life easier for
our users and *us*. The point is not to try to get some sort of literary
award for purism.
(3) Create a branch and make this change + the further changes that
are supposed to build upon it on the branch...
Now, with a branch and with all the changes together, we will all be
a lot smarter because we will see everything in context. Right now,
we're seeing one change in a vacuum and making judgments without the
full information.
Even if the branch turns out to approach the problem from the wrong
direction, at least we'll be able to see the full picture and get some
better feedback about how to do it right.
Again, I agree about not changing the public APIs. That's what we
should *not* do. I'm trying to get at: what *should* we do?
As I said, I see no need for any extra public constants just to get our
command-line tools to parse their arguments correctly. All we need is
for that is APR_LOCALE_CHARSET, or rather SVN_APR_LOCALE_CHARSET, which
our conversion functions already use.
If, on the other hand, this is not the case -- then I'd really like to
see some discussion and design proposal here on the list. Not just
throwing code at a branch or at trunk with no real explanation and
apparently no plan, because right now I have absolutely no clue what's
on the branch, what's on trunk, what will be merged or if will be merged
at all.
Originally the idea was to fix a known bug. I can't help but feel that
this bug fixing has gone slightly out of control, and the goal now is
something else -- but I don't know what.
-- Brane