Hi all,

For documentation purposes: the main reason why loading *over the platform*
is slow is that Marmotta needs to make sure all data stays consistent under
concurrent access (i.e. two clients trying to add triples for the same
resource, or even the same triple) - which is the assumed default case for
all Web data. If you know in advance that no one else will access the
triple store (e.g. because you have shut down Marmotta), you can use the
KiWiLoader, as pointed out by Raffaele and Sergio. The KiWiLoader applies
many typical performance improvements for database bulk loading (such as
dropping indexes before the import, assuming no one else will create the
same triple or node during the import, and keeping large in-memory batches
before writing them out). With the KiWiLoader we have managed to import both
DBpedia and Freebase in reasonable time.
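
To illustrate the general bulk-loading ideas mentioned above (this is not
the KiWiLoader code, just a minimal sketch in Python against SQLite): the
table name, index name and the bulk_load function are made up for the
example, and it assumes that nothing else touches the database while it
runs.

```python
import sqlite3


def bulk_load(conn, triples, batch_size=10_000):
    """Insert (subject, predicate, object) tuples and return the row count.

    Hypothetical sketch: assumes exclusive access to the database, so no
    per-triple existence checks or locking are done.
    """
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS triples (s TEXT, p TEXT, o TEXT)")

    # Drop the index first so inserts do not pay for index maintenance.
    cur.execute("DROP INDEX IF EXISTS idx_triples_spo")

    batch = []
    total = 0
    for triple in triples:
        batch.append(triple)
        # Keep rows in memory and write them out one large batch at a time.
        if len(batch) >= batch_size:
            cur.executemany("INSERT INTO triples VALUES (?, ?, ?)", batch)
            total += len(batch)
            batch.clear()

    # Write whatever is left over from the last, partially filled batch.
    if batch:
        cur.executemany("INSERT INTO triples VALUES (?, ?, ?)", batch)
        total += len(batch)

    # Rebuild the index once, after all rows are in place.
    cur.execute("CREATE INDEX idx_triples_spo ON triples (s, p, o)")
    conn.commit()
    return total
```

Calling bulk_load(sqlite3.connect(":memory:"), rows) with any iterable of
3-tuples is enough to try it out. The shortcuts are only safe because of
the single-writer assumption, which is exactly what the platform cannot
assume for normal Web access.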

The actual usage is java -jar marmotta-loader-kiwi.jar for the KiWi backend
(the documentation on the web page is a bit too simplified). To a somewhat
limited extent, the loader also exists for other backends such as Titan,
HBase and BerkeleyDB.

Greetings,

Sebastian


2014-07-03 16:13 GMT+02:00 Sergio Fernández <
sergio.fernan...@salzburgresearch.at>:

> Hi Adam,
>
>
> On 03/07/14 15:39, Adam Flinton wrote:
>
>> Loading into marmotta (with the default h2 db backend) seems to take many
>> many hours.
>>
>
> H2 is just for demo purposes. For such an amount of data you have to
> switch to a proper database; PostgreSQL is the recommended one.
>
>
>> Would it be any quicker using the client library?
>>
>
> In addition, use a direct loader: http://marmotta.apache.org/kiwi/loader
>
>
>> Are there any tips & tricks e.g. turning off things like versioning which
>> are not required for the initial load?
>>
>
> Of course versioning has an impact on importing, but not a very
> significant one. Leave it enabled if you need it.
>
> Cheers,
>
> --
> Sergio Fernández
> Senior Researcher
> Knowledge and Media Technologies
> Salzburg Research Forschungsgesellschaft mbH
> Jakob-Haringer-Straße 5/3 | 5020 Salzburg, Austria
> T: +43 662 2288 318 | M: +43 660 2747 925
> sergio.fernan...@salzburgresearch.at
> http://www.salzburgresearch.at
>
