On Monday, August 1, 2022 at 2:14:41 PM UTC Graham Higgins wrote:
> However, the nt serializer code has been through many subsequent changes,
> I doubt any of the 2009-vintage code remains.

After some forensic work in the commit history, I can confirm that it is intentional that RDFLib's N-Triples serialization doesn't conform to the W3C N-Triples serialization standard as defined in 2004 ¹. It *used to* conform, up until Dec 2021, when the output encoding was changed from ASCII to UTF-8 ², obviating the need for \-escaping.

The way it used to work (with ASCII encoding) was that a non-ASCII character in the input would trigger an XML processing exception, which deferred to an error-correcting handler that replaced the non-ASCII character with its \-escaped equivalent ³.

¹ “N-Triples strings are sequences of US-ASCII character <https://www.w3.org/TR/2004/REC-rdf-testcases-20040210/#character> productions encoding [UNICODE <https://www.w3.org/TR/2004/REC-rdf-testcases-20040210/#Unicode>] character strings <http://www.w3.org/TR/charmod/#def-character-string>. The characters outside the US-ASCII range and some other specific characters are made available by \-escape sequences”

² https://github.com/RDFLib/rdflib/commit/6c47908c026112adecc2fc31dda5c34bf9894a00

³ https://github.com/RDFLib/rdflib/blob/d971a5d5026016598ccdaeae242ef11849a2d09f/rdflib/plugins/serializers/nt.py#L80

Cheers,

Graham

--
http://github.com/RDFLib
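
For anyone who wants to see the old ASCII-plus-escapes behaviour in isolation, here is a minimal sketch in plain Python. It is not the RDFLib code itself: the handler name `nt_escape_sketch` and the helper `to_ascii_nt` are made up for illustration, and it assumes the mechanism can be approximated with a codec error handler registered through the standard `codecs` module. Encoding to ASCII fails on a non-ASCII character, and the handler substitutes the `\uXXXX` or `\UXXXXXXXX` escape described in footnote ¹.

```python
import codecs


def nt_escape_handler(err):
    """Hypothetical error handler: swap unencodable characters for \\-escapes."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    # The slice of the input that could not be encoded as ASCII.
    bad = err.object[err.start:err.end]
    escaped = "".join(
        "\\u%04X" % ord(c) if ord(c) <= 0xFFFF else "\\U%08X" % ord(c)
        for c in bad
    )
    # Return the replacement text and the position at which to resume encoding.
    return escaped, err.end


# Register under an illustrative name (an assumption, not RDFLib's own name).
codecs.register_error("nt_escape_sketch", nt_escape_handler)


def to_ascii_nt(text):
    """Encode text as ASCII bytes, escaping anything outside US-ASCII."""
    return text.encode("ascii", "nt_escape_sketch")


print(to_ascii_nt("café"))  # b'caf\\u00E9'
```

With UTF-8 output the same string is written out as-is, which is why the escaping step is no longer needed.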