On Monday, August 1, 2022 at 2:14:41 PM UTC Graham Higgins wrote:

> However, the nt serializer code has been through many subsequent changes, 
> I doubt any of the 2009-vintage code remains.


After some forensic work in the commit history, I can confirm that it is 
intentional that RDFLib's N-Triples serialization doesn't conform to the 
W3C N-Triples serialization standard ¹.

It *used to* conform, up until Dec 2021 when the encoding was changed from 
ASCII to UTF-8 ², obviating the need for \-escaping.

The way it used to work (with ASCII encoding) was that a non-ASCII 
character in the input would trigger an XML processing exception, which 
deferred to an error-correcting handler that replaced the non-ASCII 
character with its \-escaped equivalent. ³
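
For anyone who wants to see the escape-on-error idea in isolation, here is a 
minimal sketch using Python's standard codecs error-handler hook. It is not a 
copy of the RDFLib code: the handler name "nt_escape_sketch" and the function 
names are made up for this example, and I am assuming the \uXXXX / \UXXXXXXXX 
forms described in the 2004 N-Triples text.

```python
import codecs


def nt_escape_errors(err):
    """Codec error handler: swap unencodable characters for backslash escapes."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    # The slice of the input that the ASCII codec could not encode
    bad = err.object[err.start:err.end]
    # \uXXXX up to U+FFFF, \UXXXXXXXX for code points beyond that
    escaped = "".join(
        "\\u%04X" % ord(ch) if ord(ch) <= 0xFFFF else "\\U%08X" % ord(ch)
        for ch in bad
    )
    # Return the replacement text and the position at which to resume encoding
    return escaped, err.end


# "nt_escape_sketch" is a hypothetical handler name used only in this example
codecs.register_error("nt_escape_sketch", nt_escape_errors)


def encode_ascii_nt(text):
    """Encode to ASCII, escaping anything outside the ASCII range."""
    return text.encode("ascii", "nt_escape_sketch")


literal = '"caf\u00e9 \U0001F600"'
print(encode_ascii_nt(literal))   # b'"caf\\u00E9 \\U0001F600"'
print(literal.encode("utf-8"))    # b'"caf\xc3\xa9 \xf0\x9f\x98\x80"'
```

The second print shows why the switch to UTF-8 removes the need for the 
handler: every character encodes directly, so the error path is never taken.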
 
¹ “N-Triples strings are sequences of US-ASCII character productions 
encoding [UNICODE] character strings. The characters outside the US-ASCII 
range and some other specific characters are made available by \-escape 
sequences” 
<https://www.w3.org/TR/2004/REC-rdf-testcases-20040210/>
² https://github.com/RDFLib/rdflib/commit/6c47908c026112adecc2fc31dda5c34bf9894a00
³ https://github.com/RDFLib/rdflib/blob/d971a5d5026016598ccdaeae242ef11849a2d09f/rdflib/plugins/serializers/nt.py#L80

Cheers,
Graham
