Hello there,

I am new to Riak, but we are thinking of migrating some of our data from
MySQL into it and using it to back part of our website.

Temporarily we would need to keep the data in sync whilst we make other
changes, so for some time we would be running Riak in parallel with MySQL
and synchronising the data between them. There are two processes we need to
create:

1) full data import
2) synchronising changes to the data

We use Solr, which has a very usable DataImportHandler for getting many
millions of MySQL rows indexed; we also use it for delta imports based on
lists of unique IDs. Is there any similar technique for Riak? We have 16
million documents and counting, so we would rather not open a socket and
push them over HTTP. Currently the DataImportHandler selects and indexes
them in about two hours, which, as we don't do this often, we can live with.
Incremental synchronisation would involve much smaller sets of documents
(<1000 per 10 minutes), so I am less worried there.
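
For the incremental side, the rough shape I have in mind is just selecting
whatever changed since the last run and PUTting it over Riak's HTTP
interface, something like the sketch below (bucket, table and column names
are placeholders, and I haven't run this against a real cluster):

import json

import MySQLdb
import requests

# Classic /riak/<bucket>/<key> HTTP interface; host and bucket are placeholders.
RIAK_URL = "http://127.0.0.1:8098/riak/documents"

def sync_changes(since):
    # Push rows changed since the last run, keyed by the MySQL primary key.
    db = MySQLdb.connect(host="localhost", user="app", passwd="secret", db="app")
    cur = db.cursor()
    cur.execute("SELECT id, title, body FROM documents WHERE updated_at >= %s",
                (since,))
    for doc_id, title, body in cur.fetchall():
        requests.put("%s/%s" % (RIAK_URL, doc_id),
                     data=json.dumps({"title": title, "body": body}),
                     headers={"Content-Type": "application/json"})
    cur.close()
    db.close()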

I have seen the PBC API, which looks promising, but I'd still need to fetch
the rows and push them myself. Does the node you connect to handle the
consistent hashing in this case? Are there any benchmarks for this?
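
To be concrete about "fetch the rows and push", this is roughly the shape of
full import I'm imagining with the Python client over PBC (bucket name,
connection details and columns are guesses on my part, the exact client
constructor may differ by version, and none of this is benchmarked):

import MySQLdb
import MySQLdb.cursors
import riak

def full_import():
    # Server-side cursor so all 16 million rows aren't held in memory at once.
    db = MySQLdb.connect(host="localhost", user="app", passwd="secret", db="app",
                         cursorclass=MySQLdb.cursors.SSCursor)
    client = riak.RiakClient(protocol='pbc', host='127.0.0.1', pb_port=8087)
    bucket = client.bucket('documents')

    cur = db.cursor()
    cur.execute("SELECT id, title, body FROM documents")
    for doc_id, title, body in cur:
        # One PBC put per row, keyed by the MySQL primary key.
        bucket.new(str(doc_id), data={'title': title, 'body': body}).store()
    cur.close()
    db.close()

If there's a bulk or streaming load path that avoids a round trip per
document, that's exactly the kind of thing I'm hoping exists.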

Is there anything else out there for migrating this amount of data?

Jonathan
