[rdflib-dev] Re: long turtle format and Git

Graham Higgins Sat, 12 Mar 2022 07:39:37 -0800

On Saturday, March 12, 2022 at 11:28:32 AM UTC Nicholas Car wrote:

> Any feedback on this format or on RDF text files and version control in 
> general would be great.
>


Although it's not advertised as such, the architecture of the BerkeleyDB 
RDFLib Store implementation preserves the order in which triples are added 
(as a side-effect of key indexing), and so all of the RDFLib Stores based 
on key-value back-ends (BerkeleyDB, LevelDB, SQLiteLSM) provide reliably 
repeatable re-serialization suitable for unambitious efforts. I've not yet 
checked, but I imagine that the corresponding (again, index-using) 
AbstractSQL-based RDFLib Stores (atm, only rdflib-sqlalchemy) also exhibit 
the same repeatable serialization property.

Otherwise, if you're just interested in manageable diffs of large but 
straightforward graphs, serializing as ntriples/nquads and sorting the 
serialization is also a viable strategy, I've found 
<https://github.com/DOACC/individuals/commit/1515db20e936a085d36a40210cb63237b6f2e837>.
 
However, this approach is unsuitable for graphs containing blank nodes, as 
BNode serialization *will* differ and, afaik, the only mooted solution to 
this is Digital Bazaar's URDNA2015 RDF Dataset Canonicalization 
<https://json-ld.github.io/rdf-dataset-canonicalization/spec/> proposal, 
also now a topic in the RDFLib GitHub discussions section 
<https://github.com/RDFLib/rdflib/discussions/1545>.


