On Sun, Jun 8, 2025 at 8:57 PM Nathan Hartman <hartman.nat...@gmail.com>
wrote:

> On Tue, May 20, 2025 at 2:07 PM Branko Čibej <br...@apache.org> wrote:
>
>> On 18. 5. 25 21:48, Branko Čibej wrote:
>>
>> XML has the unenviable distinction of being *both* almost unreadable for
>> humans *and* very finicky to parse for machines.
>>
>>
>> There's one other nasty problem with XML: it can't represent every
>> character. There's a test for that, xml_unsafe_author2() in prop_tests.py
>> and discussion at
>>
>>   https://issues.apache.org/jira/browse/SVN-4415
>>
>> but the really painful part is that our command-line client is quite happy
>> to produce invalid XML. Yeah, the *expected output* in that test case is
>> invalid XML, heh. I've been thinking about how to solve this; we can't use
>> &#*xx*; character entities, we can't use <![CDATA[...]]> sections – both
>> are transparent to invalid XML chars. Of course I'm talking about our XML
>> output here; we could base64- or quoted-printable-encode values that are
>> not valid XML, and we wouldn't be breaking any existing use cases.
>>
>> Well, that's for command-line output. An XML patch format has similar
>> issues. Any patch format does, but XML is especially nasty in that respect.
>>
>> I created SVN-4919 to track this in the client and to annotate the test.
>>
>> -- Brane
>>
>
>
> And then I came across this:
>
> [1] https://diffx.org/
>
> This is a page that proposes enhancing the unidiff format in a backwards-
> and forwards-compatible way while remaining human readable; it proposes
> calling the format Extensible Diff, or DiffX.
>
> I have only skimmed the site and have not done a thorough analysis, but I
> think this is interesting enough to at least look through and consider.
>
> I'll give it a more careful reading a bit later and will organize my
> thoughts about it; for now, I just wanted to point out that this exists.
>
> Thoughts/feedback?
>
> Nathan
>
>
Hi,

This looks pretty interesting at first glance. Wow, someone has already
tried to extend the plain unidiff format.

However, I see several issues in using this exact format.

1. The initial idea of xpatch was to save the whole *base* content and
properties so that we can use our conflict resolver. Technically, I think we
could put the base content into a custom section, which any other unidiff
parser should ignore, but which Subversion would handle by applying the diff
body onto it. This way we would get both versions (*left* and *right*) of
the file. But that still makes the format too complicated. Also, there is a
chance we couldn't implement this while parsing patches from a stream.
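
To make the base-plus-diff idea in point 1 a bit more concrete, here is a
rough Python sketch. It is neither Subversion code nor the DiffX format: the
entry layout and the names apply_unified_diff and both_sides are made up
for illustration. It only shows that carrying the base next to an ordinary
unified diff is enough to recover both sides.

```python
import difflib
import re

# Matches a unified-diff hunk header such as "@@ -1,3 +1,4 @@".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def apply_unified_diff(base_lines, diff_lines):
    """Apply unified-diff hunks to base_lines and return the new lines.

    Simplified on purpose: no fuzz, no offset search, no conflict handling.
    """
    result = []
    pos = 0          # index of the next unconsumed line in base_lines
    in_hunk = False  # False while we are still in the "---"/"+++" headers
    for line in diff_lines:
        match = HUNK_RE.match(line)
        if match:
            start = int(match.group(1))
            count = int(match.group(2)) if match.group(2) is not None else 1
            # With zero old lines, the header names the line *before* the
            # insertion point; otherwise it names the first line of the hunk.
            hunk_pos = start if count == 0 else start - 1
            result.extend(base_lines[pos:hunk_pos])  # copy untouched lines
            pos = hunk_pos
            in_hunk = True
        elif not in_hunk:
            continue                                 # file header lines
        elif line.startswith(" "):
            result.append(base_lines[pos])           # context line
            pos += 1
        elif line.startswith("-"):
            pos += 1                                 # removed line
        elif line.startswith("+"):
            result.append(line[1:])                  # added line
    result.extend(base_lines[pos:])                  # tail after last hunk
    return result


def both_sides(entry):
    """Return (left, right) for a hypothetical entry holding base and diff."""
    left = list(entry["base"])
    right = apply_unified_diff(left, entry["diff"])
    return left, right


if __name__ == "__main__":
    base = ["alpha", "beta", "gamma"]
    new = ["alpha", "BETA", "gamma", "delta"]
    entry = {
        "path": "file.txt",  # hypothetical field names
        "base": base,
        "diff": list(difflib.unified_diff(
            base, new, "a/file.txt", "b/file.txt", lineterm="")),
    }
    left, right = both_sides(entry)
    print(left)
    print(right)
```

Note that in this sketch the whole base has to be in memory before the
first hunk can be applied.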

2. I don't think this format was designed around real-world cases. I mean,
they designed an abstract format without providing any use cases. They even
say as much on their website:

*What supports DiffX today?*
DiffX is still in a specification and prototype phase. We are adding
support in Review Board and RBTools.

So, I think we should still be looking at designing our own format.

PS: when will 1.15 be out? ;))

-- 
Timofei Zhakov
