Re: Inconsistent map/reduce results

Dan Reverri Thu, 31 Mar 2011 09:20:31 -0700

Hi Keith,

The cache parameter was renamed in 0.14: "vnode_cache_entries" is now
"map_cache_size". Setting this parameter to 0 will disable the cache.
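
As a rough sketch, the app.config entry would look something like the
following. It assumes map_cache_size belongs in the riak_kv section, the same
place vnode_cache_entries went on 0.13; all other settings are omitted.

```erlang
%% app.config (excerpt only; other sections and riak_kv settings omitted).
%% Assumption: map_cache_size sits in the riak_kv section, as
%% vnode_cache_entries did on 0.13.
{riak_kv, [
    %% 0 disables the MapReduce map cache on 0.14
    {map_cache_size, 0}
]}
```

The node will most likely need a restart to pick up the change.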


Regarding the empty MapReduce results, I'll try to reproduce the issue
locally and narrow down the cause.

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
d...@basho.com


On Tue, Mar 29, 2011 at 6:16 PM, Keith Dreibelbis <kdrei...@gmail.com> wrote:

> Followup to this (somewhat old) thread...
>
> I had resolved my problem by putting the vnode_cache_entries=0 thing in
> app.config, doing what Grant said.  But sometime later it began failing
> again.  I was getting misses of 25%-50% on records that should have been
> found by map reduce but weren't.  At that point I tried Rohman's suggestion
> of using a random seed, and that worked around the problem successfully.
> But this isn't a very satisfying fix.
>
> So the vnode_cache_entries=0 thing doesn't really fix it after all?  Is
> there something else to put in the config that would make this work
> properly, without the random seed hack?  BTW since the original thread I
> have upgraded from 0.13 to 0.14, and the bug is still there.
>
>
> Keith
>
>
> On Thu, Mar 10, 2011 at 6:56 PM, Antonio Rohman Fernandez <
> roh...@mahalostudio.com> wrote:
>
>> If you want to avoid caching (without changing the configuration), you can
>> put a random value in your map function, your reduce function, or both.
>> That does the trick for me, because the query is then always different:
>>
>> $seed = randomStringHere;
>>
>> {"map":{"language":"javascript","source":"function(v,k,a) {
>> seed='.$seed.'; x=Riak.mapValuesJson(v)[0]; return [v.values[0].data]; }"}}
>>
>> Rohman
>>
>> On Thu, 10 Mar 2011 17:47:49 -0800, Keith Dreibelbis <kdrei...@gmail.com>
>> wrote:
>>
>> Thanks for the prompt response, Grant.  I made the configuration change
>> you suggested, and it fixed my problem.
>> Some followup questions:
>> - Is it possible to configure this dynamically on a per-bucket basis, or
>>   just per-server like it is now?
>> - Is this fixed in a newer version?
>>
>> On Thu, Mar 10, 2011 at 2:56 PM, Grant Schofield <gr...@basho.com> wrote:
>>
>>> There are currently some bugs in the mapreduce caching system. The best
>>> thing to do would be to disable the feature. On 0.13 you can do this by
>>> editing or adding the vnode_cache_entries entry in the riak_kv section of
>>> your app.config. The entry would look like:
>>> {vnode_cache_entries, 0},
>>>
>>>  Grant Schofield
>>> Developer Advocate
>>> Basho Technologies
>>>
>>>   On Mar 10, 2011, at 4:16 PM, Keith Dreibelbis wrote:
>>>
>>> Hi riak-users,
>>>
>>> I'm trying to do a map/reduce query from java on a 0.13 server, and get
>>> inconsistent results.  What I'm doing should be pretty simple.  I'm hoping
>>> someone will notice an obvious error in here, or have some insight:
>>>
>>> This is an automated test.  I'm doing a simple query where I'm trying
>>> to get the keys for records with a certain field value.  In SQL it would
>>> look like "SELECT id FROM table WHERE age = '32'".  In java I'm invoking it
>>> like this:
>>>
>>>    MapReduceResponse r = riak.mapReduceOverBucket(getBucket())
>>>        .map(JavascriptFunction.anon(func), true)
>>>        .submit();
>>>
>>> where riak is a RiakClient, getBucket() returns the name of the bucket,
>>> and func is a string that looks like:
>>>
>>>    function(value, keyData, arg) {
>>>        var data = Riak.mapValuesJson(value)[0];
>>>        if (data.age == "32")
>>>            return [value.key];
>>>        else
>>>            return [];
>>>    }
>>>
>>> No reduce phase.  All entries in the example bucket are json and have
>>> an age field.  This initially works correctly, it gets back the matching
>>> records as expected.  It also works in curl.  It's an automated test, so
>>> each time I run this, it is using a different bucket.  After about a dozen
>>> queries, this starts to fail.  It returns an empty result, when it should
>>> have found records.  It fails in curl at the same time.
>>>
>>> I initially suspected this might have something to do with doing map
>>> reduce too soon after writing, and the write not being available on all
>>> nodes.  However, I changed the bucket schema entries for w,r,rw,dw from
>>> "quorum" to "all", and this still happens (is there another bucket setting I
>>> missed?). In addition, I only have 3 nodes (I'm using the dev123 example),
>>> and am running curl long enough afterwards.
>>>
>>> Here's the strange part that makes me suspicious.  If I make
>>> insignificant changes to the query, for example change the double quotes to
>>> single quotes, add whitespace or extra parentheses, etc, then it suddenly
>>> works again.  It will work on an existing bucket, and on subsequent tests,
>>> but again only about a dozen times before it starts failing again. Same
>>> behavior in curl.  This makes me suspect that the server is doing some
>>> incorrect caching around this js function, based on the function string.
>>>
>>> Any explanation about what's going on?
>>>
>>> Keith
>>>  _______________________________________________
>>> riak-users mailing list
>>> riak-users@lists.basho.com
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>     --
>>
>> Antonio Rohman Fernandez <http://mahalostudio.com>
>> CEO, Founder & Lead Engineer
>> roh...@mahalostudio.com
>>
>> Projects:
>> MaruBatsu.es <http://marubatsu.es>
>> PupCloud.com <http://pupcloud.com>
>> Wedding Album <http://wedding.mahalostudio.com>
>>
>
