G'day Christian!
I have two questions and would be grateful for your help with them.
Thanks in advance.
1. Riak took 7733m32.525s (about 5.4 days) to load the 35 million row data set
(1.8 GB). The load uses a single curl process and only one node for storage.
Is there a way for the script below to run two curl processes, so that the
second node in the 2-node Riak cluster is used as well? At the moment only one
node is used.
2. We deal with big data, and I need to load up to 500 million rows (35 GB). My
task is to compare various loads on different databases (Hadoop, MongoDB,
Cassandra and now Riak). I have finished loading 500 million rows on Hadoop,
MongoDB and Cassandra, but I am stuck on Riak because loading takes so long.
Could you please help me change your script below so that it also uses the
second node?
Regards
sangeetha
-----Original Message-----
From: Christian Dahlqvist [mailto:christ...@whitenode.com]
Sent: Tuesday, October 09, 2012 3:57 PM
To: Pattabi Raman, Sangeetha (Cognizant)
Cc: sh...@mcewan.id.au; riak-users@lists.basho.com
Subject: Re: riak memstore clarification on enomem error
On 09/10/2012 10:39, sangeetha.pattabiram...@cognizant.com wrote:
Thanks, Shane.
The load script I used is as follows (basically a curl call):
#!/usr/local/bin/escript
main([Filename]) ->
    {ok, Data} = file:read_file(Filename),
    Lines = tl(re:split(Data, "\r?\n", [{return, binary}, trim])),
    lists:foreach(fun(L) -> LS = re:split(L, ","), format_and_insert(LS) end,
                  Lines).

format_and_insert(Line) ->
    JSON = io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,\"starttime\":~s,\"endtime\":~s,\"status\":~s}",
                         Line),
    Command = io_lib:format("curl -X PUT http://127.0.0.1:8098/riak/CustCalls35m/~s -d '~s' -H 'content-type: application/json'",
                            [hd(Line), JSON]),
    io:format("Inserting: ~s~n", [hd(Line)]),
    os:cmd(Command).
You are right, Shane. After loading, I confirmed this by querying the 35 million
row dataset (1.8 GB) for its first, middle and last rows (id values 1, 15000000
and 35000000). This confirmed that the data is stored in the CustCalls35m
bucket in Riak.
Regards
Sangeetha
-----Original Message-----
From: riak-users [mailto:riak-users-boun...@lists.basho.com] On Behalf
Of Shane McEwan
Sent: Tuesday, October 09, 2012 3:00 PM
To: riak-users@lists.basho.com
Subject: Re: riak memstore clarification on enomem error
G'day Sangeetha.
On 09/10/12 07:40, sangeetha.pattabiram...@cognizant.com wrote:
Dear Team,
I have 64 GB of RAM. Loading the 35 million row dataset (1.8 GB) consumes
nearly 40-45 GB of RAM during the startup of the Erlang script, but when I try
to load the 40 million row dataset (2.1 GB) I get the following error:
escript: exception error: no match of right hand side value {error,enomem}
The error message is coming from escript and not Riak. It's just a guess but
could it be that the script you're using to load your data into Riak is trying
to load all the data into memory before sending it to Riak?
Can you break your dataset into smaller chunks and load them separately?
Or send the data to Riak as you read it from the dataset without storing it all
in memory?
2. Is there a provision to make use of swap memory in the Riak config?
Using swap in this situation is almost always a bad idea. Your script will end
up running so slowly you will be waiting for days, maybe months, for your data
to load.
Shane.
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
This e-mail and any files transmitted with it are for the sole use of the
intended recipient(s) and may contain confidential and privileged information.
If you are not the intended recipient(s), please reply to the sender and
destroy all copies of the original message. Any unauthorized review, use,
disclosure, dissemination, forwarding, printing or copying of this email,
and/or any action taken in reliance on the contents of this e-mail is strictly
prohibited and may be unlawful.
Hi,
That script does indeed load all lines into memory before processing them one
by one. Try something like this instead:
#!/usr/local/bin/escript
main([Filename]) ->
    {ok, IoDev} = file:open(Filename, [read, raw, binary, {read_ahead, 65536}]),
    process_file(IoDev).

process_file(IoDev) ->
    case file:read_line(IoDev) of
        {ok, Data} ->
            Line = strip_and_split(Data),
            JSON = io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,\"starttime\":~s,\"endtime\":~s,\"status\":~s}",
                                 Line),
            Command = io_lib:format("curl -X PUT http://127.0.0.1:8098/riak/CustCalls35m/~s -d '~s' -H 'content-type: application/json'",
                                    [hd(Line), JSON]),
            io:format("Inserting: ~s~n", [hd(Line)]),
            os:cmd(Command),
            process_file(IoDev);
        eof ->
            ok;
        {error, Reason} ->
            io:format("Error processing file: ~p~n", [Reason]),
            error
    end.

strip_and_split(Line) ->
    [L | _] = re:split(Line, "\n"),
    re:split(L, ",").
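
To illustrate how the streaming script above might be adapted to the two-node
question raised in this thread, here is a minimal sketch that alternates PUT
requests between two Riak HTTP endpoints. It is not from the original thread:
the second address (127.0.0.2:8098) is only a placeholder for the other node,
and riak_nodes/0, pick_node/1 and put_command/2 are hypothetical helper names.
Requests are still issued one at a time, so this spreads the writes across the
nodes but is unlikely to shorten the load much on its own.

```erlang
#!/usr/bin/env escript
%% Sketch only: alternate PUT requests between two Riak nodes.

%% ASSUMPTION: the second address is a placeholder, not taken from the thread.
riak_nodes() ->
    ["127.0.0.1:8098", "127.0.0.2:8098"].

main([Filename]) ->
    {ok, IoDev} = file:open(Filename, [read, raw, binary, {read_ahead, 65536}]),
    Result = process_file(IoDev, 0),
    ok = file:close(IoDev),
    Result.

%% N counts the lines read so far and selects the target node.
process_file(IoDev, N) ->
    case file:read_line(IoDev) of
        {ok, Data} ->
            Fields = strip_and_split(Data),
            os:cmd(put_command(pick_node(N), Fields)),
            process_file(IoDev, N + 1);
        eof ->
            ok;
        {error, Reason} ->
            io:format("Error processing file: ~p~n", [Reason]),
            error
    end.

%% Round-robin: line 0 goes to the first node, line 1 to the second, and so on.
pick_node(N) ->
    Nodes = riak_nodes(),
    lists:nth((N rem length(Nodes)) + 1, Nodes).

%% Same curl command as in the script above, with the host made variable.
put_command(Node, Fields) ->
    JSON = io_lib:format("{\"id\":\"~s\",\"phonenumber\":~s,\"callednumber\":~s,\"starttime\":~s,\"endtime\":~s,\"status\":~s}",
                         Fields),
    io_lib:format("curl -X PUT http://~s/riak/CustCalls35m/~s -d '~s' -H 'content-type: application/json'",
                  [Node, hd(Fields), JSON]).

%% Drop the trailing newline (and optional carriage return), then split on commas.
strip_and_split(Line) ->
    [L | _] = re:split(Line, "\r?\n"),
    re:split(L, ",").
```

Another option, in line with Shane's suggestion of breaking the dataset into
smaller chunks, would be to split the input file in two and run one copy of the
script per half, each pointed at a different node, so that two curl processes
are active at the same time.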
Best Regards,
Christian