I'm still struggling with some unexpected results when running the tests for 
the Java support I'm writing for Grails and Spring Data.

As an example, I ran the full Gorm TCK test suite against my local Riak server 
(0.13.0) and had 3 failures out of 103. Not bad, though my goal is 0 failures 
in 103 tests. :) The weirdness started when I manually re-ran one of the tests 
that had failed during the full run: it passed with no errors. So the same test 
gave me a different result on its own than it did as part of the batch.

Another thing that's actually a little concerning is the following pair of 
calls to Riak's Map/Reduce. I log all the M/R jobs I execute, so I took the 
JavaScript from the test that failed and ran it by hand. Sure enough, no 
results, which means a failed test. In the second run, I added another call to 
ejsLog() (logging the map output) so I could see what's going on, and I got a 
different result:

+-( ~ ):> curl -v -H "Content-Type: application/json" http://localhost:8098/mapred -d @-
{"inputs":"grails.gorm.tests.TestEntity","query":[{"map":{"language":"javascript","source":"function(v){ejsLog('/tmp/mapred.log', 'map input: '+JSON.stringify(v)); var row = Riak.mapValuesJson(v); row[0].id = v.key; return row; }"}}]}

> POST /mapred HTTP/1.1
> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
> Host: localhost:8098
> Accept: */*
> Content-Type: application/json
> Content-Length: 234
> 
< HTTP/1.1 200 OK
< Server: MochiWeb/1.1 WebMachine/1.7.2 (participate in the frantic)
< Date: Tue, 30 Nov 2010 20:57:36 GMT
< Content-Type: application/json
< Content-Length: 2
< 

[]

+-( ~ ):> curl -v -H "Content-Type: application/json" http://localhost:8098/mapred -d @-
{"inputs":"grails.gorm.tests.TestEntity","query":[{"map":{"language":"javascript","source":"function(v){ejsLog('/tmp/mapred.log', 'map input: '+JSON.stringify(v)); var row = Riak.mapValuesJson(v); row[0].id = v.key; ejsLog('/tmp/mapred.log', 'map output: '+JSON.stringify(row)); return row; }"}}]}

> POST /mapred HTTP/1.1
> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
> Host: localhost:8098
> Accept: */*
> Content-Type: application/json
> Content-Length: 297
> 
< HTTP/1.1 200 OK
< Server: MochiWeb/1.1 WebMachine/1.7.2 (participate in the frantic)
< Date: Tue, 30 Nov 2010 20:58:08 GMT
< Content-Type: application/json
< Content-Length: 75
< 

[{"child":"-5040867138647877277","name":"Bob","id":"-7706240328526746461"}]

It's as if changing the JavaScript just enough to produce a different hash got 
me a different result.

What I suspect is happening with the test suite is that M/R scripts are run 
against data sets that aren't yet complete because the tests load several 
entries into the database, then immediately try to query them back out. It 
looks like if that M/R script is run against this incomplete data, I'll keep 
getting the same incorrect result until the M/R script's hash changes. This is 
a complete WAG. All I know is that the code does what it's supposed to when 
Riak returns the results it's supposed to. :) Is there caching I'm running into 
here?
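
One way I could probe the caching guess is to make the source text of every 
request unique, so that no two runs share a hash. The following is only a 
sketch in plain Java: the class MapSourceNonce and the method withNonce are 
names made up for illustration, and it assumes that whatever might be cached is 
keyed on the source text, which I have not confirmed.

```java
import java.util.UUID;

public class MapSourceNonce {

    /**
     * Returns the given JavaScript function source with a unique comment
     * inserted right after the first '{'. The function's behavior is
     * unchanged, but its text (and so any hash of it) differs on each call.
     * Hypothetical helper, not part of any Riak client API.
     */
    public static String withNonce(String jsFunctionSource) {
        int brace = jsFunctionSource.indexOf('{');
        if (brace < 0) {
            throw new IllegalArgumentException(
                    "expected a function body: " + jsFunctionSource);
        }
        String nonce = "/* run " + UUID.randomUUID() + " */";
        return jsFunctionSource.substring(0, brace + 1)
                + nonce
                + jsFunctionSource.substring(brace + 1);
    }
}
```

If the failing tests start passing once each map function goes through 
something like withNonce(), that would point toward caching; if they still 
fail, the incomplete data set is the more likely culprit.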

If you could help me figure out why some tests are prone to this problem when 
others aren't, I'd be very appreciative. It seems to be related to tests that 
save multiple objects in a loop. We had talked about getting close to doing an 
M1 release of this sometime soon. I'm still not 100% comfortable with moving 
forward on that until I can get consistent and clean test runs. Having tests 
fail randomly doesn't instill a tremendous amount of confidence in me. :/

If the problem is that eventual consistency is biting me, how do you work 
around that in a test suite with dozens of tests that hit the server as fast 
as they can?
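
One generic option, sketched here under assumptions, is to poll: re-run the 
query until it returns the expected number of rows or a timeout passes. The 
class EventuallyQuery, the method awaitResults, and its parameters are made up 
for illustration; the Callable stands in for whatever actually runs the M/R 
query, and the test has to know how many rows to expect.

```java
import java.util.List;
import java.util.concurrent.Callable;

public class EventuallyQuery {

    /**
     * Calls the query repeatedly until it returns at least expectedCount rows
     * or timeoutMillis has elapsed, sleeping pollMillis between attempts.
     * Returns the last result either way, so the caller's own assertions
     * still decide whether the test passes.
     */
    public static <T> List<T> awaitResults(Callable<List<T>> query,
                                           int expectedCount,
                                           long timeoutMillis,
                                           long pollMillis) throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<T> rows = query.call();
        while (rows.size() < expectedCount
                && System.currentTimeMillis() < deadline) {
            Thread.sleep(pollMillis);
            rows = query.call();
        }
        return rows;
    }
}
```

A test would wrap its query in a Callable and assert on the returned list as 
usual. Note that if results really are cached per script, polling with the 
same script would not help by itself.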

Thanks for all the help so far! :)

Jon Brisbin
Portal Webmaster
NPC International, Inc.