If you are having trouble with RDFLib, you could use the Jena command-line tool RIOT - https://jena.apache.org/documentation/io/. It can convert large files, such as your 20 GB file, between formats.
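As a rough sketch of the workflow discussed in this thread (convert to N-Triples with RIOT, filter line by line, then convert the sample back to Turtle with the prefixes re-bound), something like the following Python might work. The assumptions are mine: `riot` is on your PATH and accepts `--output=ntriples` (check `riot --help` for your Jena version), the function and file names are made up for illustration, and the namespace URI is a placeholder you would replace with the real `@prefix` lines from your file.

```python
import subprocess


def riot_to_ntriples(ttl_path, nt_path):
    """Convert Turtle to N-Triples with Jena's riot.

    Assumes the `riot` script is on PATH and supports --output=ntriples.
    """
    with open(nt_path, "wb") as out:
        subprocess.run(
            ["riot", "--output=ntriples", str(ttl_path)],
            stdout=out,
            check=True,  # raise if riot exits with an error
        )


def filter_ntriples(nt_path, sample_path, needle):
    """Copy every line containing `needle`, reading one line at a time.

    N-Triples holds one complete triple per line, so this is safe in a way
    that grepping Turtle is not. Returns the number of lines kept.
    """
    kept = 0
    with open(nt_path, encoding="utf-8") as src, \
            open(sample_path, "w", encoding="utf-8") as dst:
        for line in src:
            if needle in line:
                dst.write(line)
                kept += 1
    return kept


def ntriples_to_turtle(sample_path, ttl_path, prefixes):
    """Load the (small) sample with rdflib and write Turtle with prefixes."""
    import rdflib  # imported here so the filtering step does not need rdflib

    g = rdflib.Graph()
    g.parse(str(sample_path), format="nt")
    for prefix, namespace in prefixes.items():
        g.bind(prefix, rdflib.Namespace(namespace))
    g.serialize(destination=str(ttl_path), format="turtle")


if __name__ == "__main__":
    riot_to_ntriples("yagoTransitiveType.ttl", "yagoTransitiveType.nt")
    filter_ntriples("yagoTransitiveType.nt", "wordnet_sample.nt", "wordnet_")
    # Placeholder namespace: copy the real URIs from the original file.
    ntriples_to_turtle(
        "wordnet_sample.nt",
        "wordnet_sample.ttl",
        {"yago": "http://example.org/yago/"},
    )
```

Only the last step loads anything into RDFLib, and by then the file should be small enough to fit in memory.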
Use RDFLib or RIOT on a machine with lots of RAM.

Nick

------- Original Message -------
On Monday, October 23rd, 2023 at 7:08 PM, Abhay Kujur <agpkuju...@gmail.com> wrote:

> Hello.
>
> Thank you for the suggestions. The problem is handling large files. Can
> you please suggest an efficient way to transform a TTL file to N-Triples?
>
> On Monday, October 23, 2023 at 2:14:14 AM UTC+2 ni...@kurrawong.net wrote:
>
>> Turtle files have structure to them that a line-by-line sampler such as grep
>> will break. It's not just about the prefixes but other parts of the Turtle
>> file too, since many lines only make sense in groups of lines.
>>
>> If you want to sample an RDF file line-by-line, you need to serialise the
>> file into N-Triples and then filter that, using some mechanism.
>>
>> With a sampled N-Triples file, you can then convert back to Turtle. To
>> preserve the original prefixes, you can re-add them to the graph when
>> serialising using g.bind("prefix", Namespace("namespace")) for each one.
>>
>> Regards, Nick
>>
>> ------- Original Message -------
>>
>> On Monday, October 23rd, 2023 at 9:07 AM, Abhay Kujur <agpku...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> I am working on a large ttl file of 20 GB. I tried to read it in using
>>> rdflib, but I am getting an error:
>>>
>>> killed
>>>
>>> I am trying to create a smaller file from this file using the grep command.
>>>
>>> The sample data is
>>> [yagoTransitiveType.ttl](https://resources.mpi-inf.mpg.de/yago-naga/yago3.1/yagoTransitiveType.txt)
>>>
>>> grep "wordnet_" yagoTransitiveType.ttl >wordnet_yagoTransitiveType.ttl
>>>
>>> The problem is that the filtered file doesn't include the initial prefix
>>> declarations, such as yago: and others, so rdflib is not able to parse
>>> the ttl file.
>>>
>>> import rdflib
>>> g = rdflib.Graph()
>>> g.parse('yagoTransitiveType.ttl', format='ttl')
>>>
>>> How can I fix the issue, either by adding 10 lines after running the grep
>>> command or in any other way?
--
http://github.com/RDFLib
---
You received this message because you are subscribed to the Google Groups "rdflib-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rdflib-dev+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rdflib-dev/UFm0xHjxAB4P78Tu7xOLAM8cl7Y9zWRZaHAHie8jPWz-TGRNEDErYekKHVe9e7RvjUq8r3OUnc0VWX5tQvlL5MfKlZUb0OjQ5tALWGg4gL8%3D%40kurrawong.net.