Hi all,

For documentation purposes: the main reason why loading *over the platform* is slow is that Marmotta needs to make sure all data stays consistent under concurrent access (i.e. two clients try to add triples for the same resource, or even the same triple), which is the assumed default case for all Web data.

If you know in advance that no one else will access the triple store during the import (e.g. by shutting down Marmotta), you can use the KiWiLoader, as pointed out by Raffaele and Sergio. The KiWiLoader applies many typical performance improvements for database bulk loading (like dropping indexes before import, assuming no one else will create the same triple or node during import, and keeping large in-memory batches before writing out). With the KiWiLoader we have managed to import both DBpedia and Freebase in reasonable time.
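To make the bulk-loading tricks mentioned above a bit more concrete, here is a minimal sketch in Python against an in-memory SQLite database. This is not the KiWiLoader code and does not use any Marmotta API; the table layout, the index name `idx_triples_spo`, the function `bulk_load` and the batch size are all assumptions made up for illustration. It only shows the general pattern: drop the index, insert in large batches without per-row existence checks, and rebuild the index once at the end.

```python
import sqlite3


def bulk_load(conn, triples, batch_size=10_000):
    """Insert (subject, predicate, object) tuples in large batches.

    Assumes exclusive access to the database for the whole import, so
    no per-row "does this triple already exist?" lookup is done.
    Returns the number of rows inserted.
    """
    cur = conn.cursor()
    # Hypothetical schema, only for this sketch.
    cur.execute("CREATE TABLE IF NOT EXISTS triples (s TEXT, p TEXT, o TEXT)")
    # Drop the index up front so it is not maintained row by row.
    cur.execute("DROP INDEX IF EXISTS idx_triples_spo")

    batch = []
    loaded = 0
    for triple in triples:
        batch.append(triple)
        if len(batch) >= batch_size:
            # Write out one full in-memory batch.
            cur.executemany("INSERT INTO triples VALUES (?, ?, ?)", batch)
            loaded += len(batch)
            batch.clear()
    if batch:
        # Write out the final, partially filled batch.
        cur.executemany("INSERT INTO triples VALUES (?, ?, ?)", batch)
        loaded += len(batch)

    # Rebuild the index once, after all rows are in place.
    cur.execute("CREATE INDEX idx_triples_spo ON triples (s, p, o)")
    conn.commit()
    return loaded
```

Calling `bulk_load(sqlite3.connect(":memory:"), rows)` with any iterable of 3-tuples is enough to try it out. The important precondition is the same as for the KiWiLoader: nothing else may write to the store while the import runs, otherwise skipping the consistency checks is not safe.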
Usage is actually

    java -jar marmotta-loader-kiwi.jar

for KiWi (the documentation on the webpage is a bit too simplified). To a somewhat limited extent, the loader also exists for other backends like Titan, HBase and BerkeleyDB.

Greetings,
Sebastian

2014-07-03 16:13 GMT+02:00 Sergio Fernández <sergio.fernan...@salzburgresearch.at>:

> Hi Adam,
>
> On 03/07/14 15:39, Adam Flinton wrote:
>
>> Loading into marmotta (with the default h2 db backend) seems to take many
>> many hours.
>
> H2 is just for demo purposes. For such an amount of data you have to switch
> to a proper database; PostgreSQL is the recommended one.
>
>> Would it be any quicker using the client library?
>
> Plus using a direct loader: http://marmotta.apache.org/kiwi/loader
>
>> Are there any tips & tricks e.g. turning off things like versioning which
>> are not required for the initial load?
>
> Of course versioning has an impact on importing, but not a very significant
> one. Leave it enabled if you need it.
>
> Cheers,
>
> --
> Sergio Fernández
> Senior Researcher
> Knowledge and Media Technologies
> Salzburg Research Forschungsgesellschaft mbH
> Jakob-Haringer-Straße 5/3 | 5020 Salzburg, Austria
> T: +43 662 2288 318 | M: +43 660 2747 925
> sergio.fernan...@salzburgresearch.at
> http://www.salzburgresearch.at