First of all, when dealing with large CSV files I always use split to shrink them down to a manageable size. That way you can do all sorts of parallel loading tricks via bash. Do a Google search for "unix split".
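Roughly what I have in mind, as a sketch rather than a tested recipe. The function names split_csv and load_chunks, the chunk size, and the job count are made up for illustration; the loader is whatever script you already have, such as the ./load_data1m escript from your message. One assumption to flag: that escript throws away the first line of its input (the tl(...) call), so the sketch repeats the CSV header at the top of every chunk to avoid losing one data row per chunk.

```shell
#!/usr/bin/env bash
# Sketch only: names, chunk size, and job count are illustrative assumptions.

# split_csv FILE LINES_PER_CHUNK PREFIX
# Splits FILE into PREFIXaa.csv, PREFIXab.csv, ... and repeats the header
# line at the top of every chunk, so a loader that discards the first line
# of its input does not drop a data row from each chunk.
split_csv() {
  local file="$1" lines="$2" prefix="$3" header chunk
  header=$(head -n 1 "$file")
  # Everything after the header goes through split, which reads stdin via "-".
  tail -n +2 "$file" | split -l "$lines" - "$prefix"
  # split's default suffixes are two characters long (aa, ab, ...).
  for chunk in "$prefix"??; do
    [ -e "$chunk" ] || continue
    { printf '%s\n' "$header"; cat "$chunk"; } > "$chunk.csv"
    rm "$chunk"
  done
}

# load_chunks JOBS LOADER PREFIX
# Runs LOADER once per chunk file, with up to JOBS running at the same time.
# Assumes chunk paths contain no whitespace.
load_chunks() {
  local jobs="$1" loader="$2" prefix="$3"
  printf '%s\n' "$prefix"*.csv | xargs -P "$jobs" -n 1 "$loader"
}
```

Something like `split_csv Customercalls1m.csv 100000 chunk_` followed by `load_chunks 4 ./load_data1m chunk_` would then run four loaders at once. The 100000 and the 4 are arbitrary starting points, so tune them against what your nodes can take.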
@siculars
http://siculars.posterous.com

Sent from my rotary phone.

On Jul 12, 2012 2:23 AM, <sangeetha.pattabiram...@cognizant.com> wrote:
>
> Thanks Sicular,
>
> Supervisor riak_core_vnode_sup had child undefined started with
> riak_core_vnode:start_link() at undefined exit with reason
> {timeout,{gen_server,call,[<0.1164.0>,stop]}} in context shutdown_error…
>
> I am trying to insert CSV data of 434 MB onto a 2 node Riak cluster and am
> getting this error. Could you please share the reason for it?
> Thanks in advance. Attached app.config for your reference.
>
> This is my other question; I need your valuable suggestions, please.
>
> Dear team,
>
> FYI: we have a 4 quad core Intel processor on each server in the 2 node
> cluster, with more than 1 TB of storage.
>
> I have constructed the 2 node physical machine Riak cluster with n_val 2,
> and my app.config and vm.args are attached for your reference.
>
> Please tell me where the bulk inserted data in the Riak db gets stored on
> the local file system. It is taking a huge time to load even a small size.
> How do I tune it to perform at large scale, since we deal with big data of
> a few hundred GBs?
>
> Cmd used: time ./load_data1m Customercalls1m.csv
>
> ./load_data100m CustomerCalls100m (got this error, so changed the default
> config in app.config from 8 MB to 3072 MB)
>
> escript: exception error: no match of right hand side value {error,enoent}
>
> Size                      Load time    No of mappers   Js-max-vm-mem    Js-thread-stack
>                                        on app.config   on app.config
> 100k (10,lakhrows), 5 MB  20m39.625s   48              3 GB (3072 MB)   3 GB (3072 MB)
> 1 million rows, 54 MB     198m42.375s  48              3 GB (3072 MB)   3 GB (3072 MB)
>
> (Both memory settings were changed from the default 8 MB since the input
> data is large.)
>
> Why this much time for small data?
>
> ./load_data script used:
>
> #!/usr/local/bin/escript
> main([Filename]) ->
>     {ok, Data} = file:read_file(Filename),
>     Lines = tl(re:split(Data, "\r?\n", [{return, binary},trim])),
>     lists:foreach(fun(L) -> LS = re:split(L, ","), format_and_insert(LS)
>                   end, Lines).
>
> format_and_insert(Line) ->
>     JSON = io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,\"starttime\":~s,\"endtime\":~s,\"status\":~s}", Line),
>     Command = io_lib:format("curl -X PUT http://10.232.5.169:8098/riak/CustomerCalls100k/~s -d '~s' -H 'content-type: application/json'", [hd(Line),JSON]),
>     io:format("Inserting: ~s~n", [hd(Line)]),
>     os:cmd(Command).
>
> Thanks in advance! Waiting for the reply. Please, anyone, help: I am stuck
> with bulk loading. Also please make clear to me how Riak splits the data
> and loads it onto the cluster.
>
> Thanks & regards
> sangeetha
>
> This e-mail and any files transmitted with it are for the sole use of
> the intended recipient(s) and may contain confidential and privileged
> information. If you are not the intended recipient(s), please reply to the
> sender and destroy all copies of the original message. Any unauthorized
> review, use, disclosure, dissemination, forwarding, printing or copying of
> this email, and/or any action taken in reliance on the contents of this
> e-mail is strictly prohibited and may be unlawful.
>
> Regards
> Sangeetha
>
> From: Alexander Sicular [mailto:sicul...@gmail.com]
> Sent: Thursday, July 12, 2012 11:30 AM
> To: Pattabi Raman, Sangeetha (Cognizant)
> Cc: riak-users@lists.basho.com
> Subject: Re: reg. storageon Localfile system
>
> check your installations app.config file in the etc folder.
>
> On Jul 11, 2012 11:18 PM, <sangeetha.pattabiram...@cognizant.com> wrote:
>
> > Hi team,
> >
> > I am very new to Riak. Can anyone say, when loading onto a 2 node Riak
> > db cluster, where it puts the data on the local file system? For
> > example, we can see this with autosharding in MongoDB or with Hadoop
> > HDFS over shards in cluster mode. Where is it in Riak?
> >
> > FYI: I use curl for the loading.
> >
> > -module(time).
> > -export([starttime/0]).
> > -export([main/1]).
> > -export([format_and_insert/1]).
> > -export([endtime/0]).
> >
> > starttime() ->
> >     starttime = {{Year,Month,Day},{Hour,Min,Sec}} = erlang:localtime().
> >
> > main([Filename]) ->
> >     {ok, Data} = file:read_file(Filename),
> >     Lines = tl(re:split(Data, "\r?\n", [{return, binary},trim])),
> >     lists:foreach(fun(L) -> LS = re:split(L, ","), format_and_insert(LS)
> >                   end, Lines).
> >
> > format_and_insert(Line) ->
> >     JSON = io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,\"starttime\":~s,\"endtime\":~s,\"status\":~s}", Line),
> >     Command = io_lib:format("curl -X PUT http://10.232.5.169:8098/riak/goog1/~s -d '~s' -H 'content-type: application/json'", [hd(Line),JSON]),
> >     io:format("Inserting: ~s~n", [hd(Line)]),
> >     os:cmd(Command).
> >
> > Thanks & regards
> > sangeetha
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com