Turtle files have structure that a line-by-line filter such as grep will 
break. The problem is not only the missing prefixes: many Turtle lines only 
make sense as part of a group, for example a subject followed by several 
predicate-object lines.

If you want to sample an RDF file line by line, first serialise it to 
N-Triples, where every line is a complete, self-contained triple, and then 
filter that with grep or a similar tool.
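
As a rough sketch of the filtering step, assuming the data has already been 
converted to N-Triples by some tool that does not need to hold the whole graph 
in memory (which tool is not covered here), something like the following 
streams the file and keeps only matching triples. The function names and file 
paths are made up for illustration.

```python
from typing import Iterable, Iterator


def filter_ntriples(lines: Iterable[str], needle: str) -> Iterator[str]:
    """Yield N-Triples statements whose text contains `needle`."""
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and comments; every other line is one full triple.
        if not stripped or stripped.startswith("#"):
            continue
        if needle in stripped:
            yield stripped + "\n"


def sample_file(src_path: str, dst_path: str, needle: str) -> int:
    """Stream src_path into dst_path, keeping matching triples.

    Returns the number of triples kept. Only one line is held in memory
    at a time, so the size of the input file does not matter.
    """
    kept = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for triple in filter_ntriples(src, needle):
            dst.write(triple)
            kept += 1
    return kept
```

For example, sample_file("yagoTransitiveType.nt", "sample.nt", "wordnet_") 
would do the same job as the grep command in the original question, but on a 
file where dropping lines cannot break the syntax. The .nt file names here are 
hypothetical.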

With a sampled N-Triples file, you can then convert back to Turtle. To preserve 
the original prefixes, re-add them to the graph before serialising by calling 
g.bind("prefix", Namespace("namespace")) for each one.

Regards, Nick

------- Original Message -------
On Monday, October 23rd, 2023 at 9:07 AM, Abhay Kujur <agpkuju...@gmail.com> 
wrote:

> Hello,
>
> I am working on a large ttl file of 20 GB. When I try to read it in using 
> rdflib, I get the error
>
> killed
>
> I am trying to create a smaller file from this file using grep command.
>
> The sample data is 
> [yagoTransitiveType.ttl](https://resources.mpi-inf.mpg.de/yago-naga/yago3.1/yagoTransitiveType.txt)
>
> grep "wordnet_" yagoTransitiveType.ttl >wordnet_yagoTransitiveType.ttl
>
> The problem is that the resulting file doesn't contain the initial prefixes 
> such as yago: and others, so rdflib is not able to parse the ttl file.
>
> import rdflib
> g = rdflib.Graph()
> g.parse('yagoTransitiveType.ttl', format='ttl')
>
> How can I fix the issue either by adding 10 lines after running grep command 
> or any other way?
>
> --
> http://github.com/RDFLib
> ---
> You received this message because you are subscribed to the Google Groups 
> "rdflib-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to rdflib-dev+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/rdflib-dev/43db0536-07ae-4e22-b073-9442ed08a5b0n%40googlegroups.com.

