On Sun, 8 June 2025 at 20:55, Nathan Hartman <hartman.nat...@gmail.com> wrote:

> On Tue, May 20, 2025 at 2:07 PM Branko Čibej <br...@apache.org> wrote:
>
>> On 18. 5. 25 21:48, Branko Čibej wrote:
>>
>> XML has the unenviable distinction of being *both* almost unreadable for
>> humans *and* very finicky to parse for machines.
>>
>>
>> There's one other nasty problem with XML: it can't represent every
>> character. There's a test for that, xml_unsafe_author2() in prop_tests.py
>> and discussion at
>>
>>   https://issues.apache.org/jira/browse/SVN-4415
>>
>> but the really painful part is that our command-line client is quite happy
>> to produce invalid XML. Yeah, the *expected output* in that test case is
>> invalid XML, heh. I've been thinking about how to solve this; we can't use
>> &#*xx*; character entities, we can't use <![CDATA[...]]> sections – both
>> are transparent to invalid XML chars. Of course I'm talking about our XML
>> output here; we could base64- or quoted-printable-encode values that are
>> not valid XML, and we wouldn't be breaking any existing use cases.
>>
>> Well, that's for command-line output. An XML patch format has similar
>> issues. Any patch format does, but XML is especially nasty in that respect.
>>
>> I created SVN-4919 to track this in the client and to annotate the test.
>>
>> -- Brane
>>
>
>
> I know we've been discussing an XML-based format for xpatch, including the
> pros & cons of being XML-based...
>
> And then I came across this:
>
> [1] https://diffx.org/
>
> This is a page that proposes enhancing the unidiff format in a backwards-
> and forwards-compatible way while remaining human readable; it proposes
> calling the format Extensible Diff, or DiffX.
>
> I have done only a cursory skimming through the site and though I have not
> done a thorough analysis, I think this is interesting enough to at least
> look through and consider.
>
> I'll give it a more careful reading a bit later and will organize my
> thoughts about it; for now, I just wanted to point out that this exists.
>
> Thoughts/feedback?
>

Apart from the arguments Brane has already brought up (good points, btw!), I
would argue for implementing a format that already exists elsewhere. However,
I don't think DiffX has support anywhere other than in the existing
(commercial) Review Board product. As the web page rightly points out, each
VCS has its own conventions, and it might be difficult to represent some
actions in a portable way. So we might end up extending the format in an
incompatible way.

That said, we should also compare it against using XML for xpatch. Is it
easier to integrate DiffX into the existing diff engine than to output a
completely new format? Or can we represent how Subversion actually applies
changes more easily if we create the format from scratch? Sorry I can't be
of much help here; it is too technical for me, but I'm interested in
learning from the discussion.

Cheers,
Daniel
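
To make Brane's point about unrepresentable characters concrete, here is a
minimal Python sketch. It is not Subversion code: the "encoding" attribute
and the helper names (is_xml_safe, author_element, read_author, parses) are
hypothetical, chosen only for illustration. It shows that neither a raw
control character nor a numeric character reference survives an XML 1.0
parser, while a base64-encoded value round-trips.

```python
import base64
import re
import xml.etree.ElementTree as ET

# Anything outside the XML 1.0 "Char" production cannot appear in a document.
_XML10_INVALID = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def is_xml_safe(text):
    """Return True if every character in text is a legal XML 1.0 character."""
    return _XML10_INVALID.search(text) is None


def author_element(value):
    """Build an <author> element, base64-encoding values XML cannot carry.

    The "encoding" attribute is an assumption made for this sketch only.
    """
    elem = ET.Element("author")
    if is_xml_safe(value):
        elem.text = value
    else:
        elem.set("encoding", "base64")
        elem.text = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return elem


def read_author(elem):
    """Inverse of author_element()."""
    if elem.get("encoding") == "base64":
        return base64.b64decode(elem.text).decode("utf-8")
    return elem.text or ""


def parses(xml_data):
    """Return True if the standard-library parser accepts xml_data."""
    try:
        ET.fromstring(xml_data)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    value = "foo\x01bar"  # contains U+0001, which XML 1.0 forbids

    # ElementTree serializes the raw character without complaint ...
    raw = ET.Element("author")
    raw.text = value
    print(parses(ET.tostring(raw)))  # ... but the result does not parse

    # A numeric character reference to the same character is rejected too.
    print(parses("<author>foo&#1;bar</author>"))

    # The base64 variant is well-formed and round-trips.
    encoded = ET.tostring(author_element(value))
    print(parses(encoded), read_author(ET.fromstring(encoded)) == value)
```

A consumer that does not know about the extra attribute would see the base64
text rather than the original value, so anything along these lines would
still need to be documented as part of the output format.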
