On 3 Jun 2011, at 00:55, Jacques wrote:

> I'm horrible at character encoding but I don't think so once we use those
> strings in the Map Reduce json object. Unless we remove a bunch of
> characters from any character sets, I believe we would choke the json
> parsing on the riak side since ultimately the job would be read as a utf8
> string.
Good point. I didn't think far enough downstream.

> <<snip>>
>
>> I've noticed that there is a secondary erlang format that can be passed
>> for map reduce jobs, must I use that? If so, does anyone have an example
>> of generating one of these from within Java?
>
> The java PB client doesn't currently support the
> application/x-erlang-binary content-type for map/reduce jobs. I think that
> only the erlang pb client does.
>
> I understand that the current client doesn't support this. I was more
> thinking of using Jinterface to generate the erlang version of the map
> reduce job. I haven't worked with it and really don't have any knowledge
> around erlang types but figured it might be possible. I guess the question
> was whether anybody thought this was feasible.

I think that is feasible. Use OtpOutputStream and OtpInputStream to
encode/decode. I'm playing with Jinterface right now to write a basho_bench
driver for the Java client, so I have that head on. I'll give it some time
this weekend.

> <<snip>>
>
> I was asking on the Riak side. Our real need is multi-get. We're looking
> for regular pulls of 20 random bucket/key values. We need to minimize the
> latency of each of the pulls. I know that we could split this into 20
> separate sockets. (ick... This gets ugly when we're talking about many
> simultaneous pulls from multiple servers. I'd rather not create pools of
> 100 sockets per requesting server if I could avoid it.) I was wondering
> whether, if we sent four requests in a row to riak on the same socket, it
> would work on them all at once or serially. I'm guessing serially.

Serially.

> I'm sure I can make it work utilizing the path you referenced above and
> base85 or base64. It is a simple return-all-values map job. Based on what
> you're saying it seems like we have three options:
>
> - Use a string based binary encoding for our bucket/key names (e.g. base64)
> - Use a truck load of sockets.
>   (How will Riak perform if we are generating, let's say, 200-500
>   connections per riak node?)
> - Try to figure out encoding an erlang version of the map reduce job using
>   JInterface (assuming the MapReduce api supports binary buckets and keys
>   if using an erlang content-type).
>
> Am I missing any options? I'm inclined to see if option 3 is available
> unless someone says that is a fool's dream.

I like the sound of option 3 also. I'll have a look at it this weekend and
get back to you.

> Thanks,
> Jacques
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
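P.S. Option 1 is cheap enough to try straight away. A minimal round-trip
sketch of base64-wrapping the key material (I'm using java.util.Base64 here;
commons-codec's Base64 works the same way): the encoded ASCII form is what
you store under and what goes in the JSON job's inputs list, and you decode
on the way back out.

```java
import java.util.Base64;

public class KeyCodec {
    // Turn arbitrary binary key material into a JSON-safe ASCII string.
    // This encoded form is the name you store under in Riak and the name
    // you put in the map/reduce "inputs" list.
    static String encodeKey(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Recover the original bytes after the job returns.
    static byte[] decodeKey(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        byte[] raw = {0, (byte) 0xC3, (byte) 0xFF, 10}; // bytes JSON can't carry raw
        String safe = encodeKey(raw);
        // One [bucket, key] input pair for the JSON map/reduce job:
        String input = "[\"mybucket\",\"" + safe + "\"]";
        System.out.println(input);
    }
}
```

Note the standard alphabet includes "/" and "+", which are fine inside a
JSON string but would need the URL-safe variant (Base64.getUrlEncoder()) if
the names ever travel through the REST interface's URL path.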
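And on option 3, before committing to Jinterface it's worth seeing how small
the wire format actually is: term_to_binary output is a version byte (131)
followed by tagged terms, e.g. a binary is tag 109 (BINARY_EXT) plus a
4-byte big-endian length plus the bytes. A hand-rolled sketch that encodes a
bucket name as an Erlang binary, using only the JDK (Jinterface's
OtpErlangBinary/OtpOutputStream produce the same bytes for you, including
the OtpExternal.versionTag):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class ExtTerm {
    static final int VERSION_TAG = 131; // first byte of any term_to_binary result
    static final int BINARY_EXT  = 109; // tag for an Erlang binary, <<...>>

    // Equivalent of erlang:term_to_binary(<<Bytes>>): version byte,
    // BINARY_EXT tag, 4-byte big-endian length, then the raw bytes.
    static byte[] encodeBinary(byte[] bytes) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(VERSION_TAG);
        out.writeByte(BINARY_EXT);
        out.writeInt(bytes.length); // DataOutputStream writes big-endian
        out.write(bytes);
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] term = encodeBinary("bucket".getBytes("UTF-8"));
        // term_to_binary(<<"bucket">>) gives <<131,109,0,0,0,6,"bucket">>
        for (byte b : term) System.out.printf("%d ", b & 0xFF);
        System.out.println();
    }
}
```

A real map/reduce job is a tuple/list of such terms, and at that point
OtpErlangTuple plus OtpOutputStream.write_any is the saner route than
hand-rolling tags, but the point stands that binaries (and so binary bucket
and key names) are first-class in this format, with no utf8 restriction.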