Of course, this is a Python 2 -> 3 change. Yes, Python 3 handles non-ASCII 
characters "better" now, i.e. it no longer needs to escape everything. The W3C 
spec was simply written for less capable encoding systems.

So, unless there's a strong practical reason to revert, I suggest we keep this 
behaviour.

Etienne, would you be interested in creating a small PR for either the flag or 
the documentation?
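
For anyone who wants to compare the two outputs without touching rdflib, here 
is a rough standard-library sketch of the idea behind Graham's flag. The 
handler name "nt_escape_sketch" and the function itself are made up for this 
example; they are not rdflib's own code, only an approximation of encoding to 
ASCII with an escaping error handler.

```python
import codecs


def nt_escape_sketch(err):
    """Hypothetical handler: replace unencodable characters with N-Triples escapes."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    chunk = err.object[err.start:err.end]
    # 4 hex digits for the BMP, 8 hex digits for anything above it
    escaped = "".join(
        "\\u%04X" % ord(c) if ord(c) <= 0xFFFF else "\\U%08X" % ord(c)
        for c in chunk
    )
    return escaped, err.end


# The name is an assumption for this sketch, not rdflib's registered handler.
codecs.register_error("nt_escape_sketch", nt_escape_sketch)

row = '<urn:aap> <urn:noot> "miës" .\n'

utf8_bytes = row.encode()  # current behaviour: plain UTF-8
ascii_bytes = row.encode("ascii", "nt_escape_sketch")  # escaped, ASCII-only

print(utf8_bytes.decode("utf-8"), end="")
print(ascii_bytes.decode("ascii"), end="")
```

The first print shows the literal "miës" form and the second the "mi\u00EBs" 
form, matching the two outputs Graham describes below.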

------- Original Message -------
On Tuesday, August 2nd, 2022 at 05:13, Graham Higgins <gjhigg...@gmail.com> 
wrote:

> On Monday, August 1, 2022 at 5:43:54 PM UTC Etienne Posthumus wrote:
>
>> Thanks for the excellent spelunking Graham.
>
> Happy to help, thanks for the kind words.
>
>> Is it common practice nowadays for most serializers to just do UTF-8 and not 
>> do \-escape sequences anymore? I guess if this has been the behaviour in 
>> rdflib for years now and no-one complains too much, we can just assume it is 
>> OK and keep on doing it.
>
> I don't know about "common practice" but I treat Jena's behaviour as a useful 
> ad hoc yardstick, if it passes muster with Andy Seaborn then it's probably 
> the right way to go.
>
>> Maybe it is a good idea for us to add a line in the docs that the rdflib 
>> serializer intentionally deviates from the spec.
>
> Yes, either document the difference or, given that known-working code still 
> exists, perhaps just enabling strictness by setting an *args flag might be a 
> viable solution ... something along the lines of:
>
> diff --git a/rdflib/plugins/serializers/nt.py b/rdflib/plugins/serializers/nt.py
> index 913dbedf..b73f223f 100644
> --- a/rdflib/plugins/serializers/nt.py
> +++ b/rdflib/plugins/serializers/nt.py
> @@ -38,7 +38,11 @@ class NTSerializer(Serializer):
>              )
>
>          for triple in self.store:
> -            stream.write(_nt_row(triple).encode())
> +            stream.write(
> +                _nt_row(triple).encode("ascii", "_rdflib_nt_escape")
> +                if "w3c" in args
> +                else _nt_row(triple).encode()
> +            )
>
>  class NT11Serializer(NTSerializer):
>
> Which, on casual testing, behaves as desired, producing “<urn:aap> <urn:noot> 
> "mi\u00EBs" .” with the flag set and “<urn:aap> <urn:noot> "miës" .” when not 
> set.
>
> What does the team think?
>
> Cheers,
> Graham
>
> --
> http://github.com/RDFLib
> ---
> You received this message because you are subscribed to the Google Groups 
> "rdflib-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to rdflib-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/rdflib-dev/1b1503b0-dc7a-40a5-963e-0875a6f4b843n%40googlegroups.com.

