Hi all, I'm not sure whether this is the right place for this question, but AFAIK the RDF Python community does not have a general community mailing list the way RDF.js does?
I was wondering whether there are any libraries or efforts that use RDFLib to build ETL pipelines for constructing RDF from various sources? I could definitely use something like that, but couldn't really find anything yet. The RML-based tools don't work that well for my use cases (JSON records), and they lack some transparency for debugging and interop with other libraries when producing triples.

I was already starting to think about a possible API and how it could leverage Dask or Spark to really scale up. I'm not a Python/data engineering expert, so this might come across as naive:

```
Mapping()              # lazy execution pipeline object
  .load("file1.json")  # creates a graph from a direct JSON mapping
  .construct(query1)   # creates a new graph containing a mapping from the file1.json graph
  .construct(query2)   # creates a new graph containing a mapping from the file1.json graph
  .load("file2.json")  # creates a graph from a direct JSON mapping
  .construct(query3)   # creates a new graph containing a mapping from the file2.json graph
  .collect()           # aggregates all constructed graphs into one
  .check(shacl)        # validates the collected graph against the SHACL shapes
  .run()               # actually runs the pipeline
```

Best,
Miel

--
http://github.com/RDFLib
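P.S. To make the idea a bit more concrete, here is a minimal, hypothetical sketch of how such a lazy pipeline object could work: method calls only record steps, and nothing executes until `.run()`. The `Mapping` class and its methods are made up for illustration; the actual RDF machinery (RDFLib graphs, SPARQL CONSTRUCT, SHACL validation, Dask/Spark dispatch) is stubbed out to show just the lazy-execution pattern.

```python
# Hypothetical sketch: a lazy ETL pipeline object. Each chained call
# records a step; .run() would replay the recorded plan against a real
# backend (rdflib, Dask, Spark). Here we only replay the step names.

class Mapping:
    def __init__(self):
        self._steps = []  # recorded (step_name, args) pairs

    def _record(self, name, *args):
        self._steps.append((name, args))
        return self  # return self so calls can be chained

    def load(self, path):
        return self._record("load", path)

    def construct(self, query):
        return self._record("construct", query)

    def collect(self):
        return self._record("collect")

    def check(self, shapes):
        return self._record("check", shapes)

    def run(self):
        # A real implementation would execute each step here;
        # this stub just returns the planned step names.
        return [name for name, _ in self._steps]


plan = (Mapping()
        .load("file1.json")
        .construct("query1")
        .collect()
        .run())
print(plan)  # ['load', 'construct', 'collect']
```

Because the plan is just data until `.run()`, the same structure could in principle be handed to Dask or Spark for parallel execution, or inspected step by step for debugging.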
