Hi all,

I'm not sure whether this is the right place for this question, but AFAIK 
the RDF Python community does not have a general community mailing list 
like RDF.js does.

I was wondering whether there are any libraries or efforts that use RDFLib to 
create ETL pipelines for constructing RDF from various sources. I could 
definitely use something like that, but I couldn't find anything yet. 
The RML-based tools don't work that well for my use case (JSON 
records), and they lack transparency for debugging and interop with other 
libraries when producing triples.

I have already started to think about a possible API and how it could 
leverage Dask or Spark to really scale up. I'm not a Python or data 
engineering expert, so this might come across as naive.

```
(
    Mapping()             # lazy execution pipeline object
    .load("file1.json")   # creates a graph from a direct JSON mapping
    .construct(query1)    # creates a new graph mapped from the file1.json graph
    .construct(query2)    # creates a new graph mapped from the file1.json graph
    .load("file2.json")   # creates a graph from a direct JSON mapping
    .construct(query3)    # creates a new graph mapped from the file2.json graph
    .collect()            # aggregates all constructed graphs into one
    .check(shacl)         # validates the aggregated graph against the SHACL shapes
    .run()                # actually runs the pipeline
)
```
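
To make the lazy-execution idea a bit more concrete, here is a minimal, standard-library-only sketch of how such a chain could record steps and defer all work until run(). Everything in it is an assumption for illustration only: the Mapping class and its methods are hypothetical, triples are plain tuples in a Python set standing in for an rdflib Graph, construct() takes a Python function where a real version would take a SPARQL CONSTRUCT query, and check() takes a simple predicate where a real version would call a SHACL validator.

```python
import json

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class Mapping:
    """Hypothetical lazy pipeline: methods only record steps; run() executes them."""

    def __init__(self, base="http://example.org/"):
        self._base = base  # assumed base IRI for the direct mapping
        self._steps = []

    def _add(self, step):
        self._steps.append(step)
        return self  # returning self is what enables method chaining

    def load(self, json_text):
        # Direct mapping: one subject per record, one predicate per JSON key.
        def step(state):
            graph = set()
            for i, record in enumerate(json.loads(json_text)):
                for key, value in record.items():
                    graph.add((f"{self._base}record/{i}", f"{self._base}{key}", str(value)))
            state["current"] = graph
        return self._add(step)

    def construct(self, fn):
        # fn stands in for a CONSTRUCT query: source graph in, new triples out.
        def step(state):
            state["constructed"].append(set(fn(state["current"])))
        return self._add(step)

    def collect(self):
        # Merge every constructed graph into a single result graph.
        def step(state):
            state["result"] = set().union(*state["constructed"])
        return self._add(step)

    def check(self, is_valid):
        # is_valid stands in for SHACL validation of the merged graph.
        def step(state):
            if not is_valid(state["result"]):
                raise ValueError("constructed graph failed validation")
        return self._add(step)

    def run(self):
        # Only here are the recorded steps actually executed, in order.
        state = {"current": set(), "constructed": [], "result": set()}
        for step in self._steps:
            step(state)
        return state["result"]


def names_to_labels(graph):
    # Rewrite every ".../name" triple into an rdfs:label triple.
    return {(s, RDFS_LABEL, o) for s, p, o in graph if p.endswith("/name")}


records = '[{"name": "Ada"}, {"name": "Grace"}]'

pipeline = (
    Mapping()
    .load(records)
    .construct(names_to_labels)
    .collect()
    .check(lambda g: len(g) > 0)
)
result = pipeline.run()
```

Because the steps are stored as plain callables, the same recorded plan could in principle be handed to a scheduler such as Dask instead of being run in a simple loop, which is the part I'm least sure about.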

Best,

Miel
