I was writing a new mapreduce query to look at users over time, and ran
it over a single user in production. After that, other mapreduce jobs
over users started returning results from my new map phase, some of the
time. After five minutes of this, I had to restart every node in the
cluster to get it to stop.
Every node has {map_cache_size, 0} in riak_kv.
The map phase that screwed things up was:
    function(v) {
      o = JSON.parse(v.values[0].data);

      // Age of account in days
      age = Math.round(
        (Date.now() - Date.iso8601(o.created_at)) /
        (1000 * 60 * 60 * 24)
      );

      return [['t_user_scores', v.key, age]];
    }
It looks like one node started running that phase instead of the
requested phase for subsequent jobs. It *should* have run the following
phase, but didn't:
    function(v) {
      o = JSON.parse(v.values[0].data);
      return [{
        key: v.key,
        name: o.name,
        thumbnail: o.thumbnail
      }];
    }
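To make the symptom easier to spot, here is a minimal sketch that runs both map functions against a made-up user object outside Riak. The sample record, the names ageMap and profileMap, and the explicit `now` argument are assumptions for illustration only. Date.parse stands in for Date.iso8601, which is not standard JavaScript and is assumed to be a helper loaded into the Riak JS VMs.

```javascript
// Hypothetical stand-in for the object Riak passes to a JS map function.
var sample = {
  bucket: 'users',
  key: 'user-1',
  values: [{
    data: JSON.stringify({
      name: 'Example User',
      thumbnail: 'thumb.png',
      created_at: '2010-01-01T00:00:00Z'
    })
  }]
};

// The new phase: emits [bucket, key, keydata] triples.
// `now` is passed in (an assumption) so the result is reproducible.
function ageMap(v, now) {
  var o = JSON.parse(v.values[0].data);
  var age = Math.round(
    (now - Date.parse(o.created_at)) / (1000 * 60 * 60 * 24)
  );
  return [['t_user_scores', v.key, age]];
}

// The phase that should have run: emits plain profile objects.
function profileMap(v) {
  var o = JSON.parse(v.values[0].data);
  return [{ key: v.key, name: o.name, thumbnail: o.thumbnail }];
}

var now = Date.parse('2010-01-11T00:00:00Z');
var ageResult = ageMap(sample, now);
var profileResult = profileMap(sample);

console.log(JSON.stringify(ageResult));
console.log(JSON.stringify(profileResult));
```

The two result shapes are easy to tell apart: a job that asked for the profile phase but returns arrays starting with 't_user_scores' was served by the wrong map function.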
Now I'm scared to run MR jobs. Could it be an issue with returning
keydata? Anybody else seen this before?
--Kyle