Re: May allow_mult cause DoS?

2013-12-18 Thread Russell Brown
Hi,

Can you describe your use case a little? Maybe it would be easier for us to 
help.

On 18 Dec 2013, at 04:32, Viable Nisei  wrote:

> On Wed, Dec 18, 2013 at 8:32 AM, Erik Søe Sørensen  wrote:
> It really is not a good idea to use siblings to represent 1-to-many 
> relations. That's not what it's intended for, nor what it's optimized for...
> Ok, understood.
>  
> Can you tell us exactly why you need Bitcask rather than LevelDB? 2i would 
> probably do it.
> 1) According to 
> http://docs.basho.com/riak/latest/ops/running/backups/#LevelDB-Backups , it's 
> a real pain to implement backups with leveldb.
> 2) According to 
> http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/ , reads may 
> be slower compared to bitcask, which is critical for us
> 
> Otherwise, storing a list of items under each key could be a solution, 
> depending of course on the number of items per key. (But do perform conflict 
> resolution.)
> Why is any conflict resolution required? As far as I understood, with 
> allow_mult=true riak should just collect all the values written to a key 
> without any additional work? What design decision leads to exponential 
> slowdown and crashes when multiple values are allowed for a single key?.. So, 
> what's the REAL purpose of allow_mult=true if it's a bad idea to use it for 
> unlimited values per single key?

The real purpose of allow_mult=true is so that writes are never dropped. In the 
case where your application concurrently writes to the same key on two 
different nodes, or on two partitioned nodes, Riak keeps both values. Other 
data stores will lose one of the writes based on timestamp, serialise your 
writes (slow) or simply refuse to accept one or more of them.

It is the job of the client to aggregate those multiple writes into a single 
value when it detects the conflict on read. Conflict resolution is required 
because your data is opaque to Riak. Riak doesn’t know that you’re storing 
lists of values, or JPEGs or JSON. It can’t possibly know how to resolve two 
conflicting values unless it knows the semantics of the values. Riak _does_ 
collect all the values written to a key, but it does so as a temporary measure, 
it expects your application to resolve them to a single value. How many are you 
writing per Key?
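To make the resolution step concrete, here is a minimal, client-agnostic sketch in Python. It assumes the application stores JSON lists under each key; the merge rule (set union) is application-specific and only one possible choice:

```python
# Client-side sibling resolution sketch. Riak hands back every conflicting
# value; the application merges them, because only it knows the semantics.
# Here the values are assumed to be JSON lists and the merge is set union.
import json

def resolve_siblings(siblings):
    """Merge conflicting JSON-list values into one deduplicated list."""
    merged = set()
    for raw in siblings:
        merged.update(json.loads(raw))
    return json.dumps(sorted(merged))

# Two concurrent writers each appended a different item to the same key:
a = json.dumps(["tx-1", "tx-2"])
b = json.dumps(["tx-1", "tx-3"])
print(resolve_siblings([a, b]))  # ["tx-1", "tx-2", "tx-3"]
```

The resolved value is then written back along with the vector clock obtained on the read, which collapses the siblings into a single value.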

Riak’s sweet spot is highly write-available applications. If you have the time, 
read the Amazon Dynamo paper[1], as it explains the _problems_ Riak solves as 
well as the way in which it solves them. If you don’t have these problems, 
maybe Riak is not the right datastore for you. Solving these problems comes 
with some developer complexity costs. You’ve run into one of them. We have many 
customers who think the trade-off is worth it: that the high availability and 
low-latency makes up for having eventual consistency.

> 
> Ok, documentation contains the following paragraph:
>  
> > Sibling explosion occurs when an object rapidly collects siblings without 
> > being reconciled. This can lead to a myriad of issues. Having an enormous 
> > object in your node can cause reads of that object to crash the entire 
> > node. Other issues are increased cluster latency as the object is 
> > replicated and out of memory errors.
>  
> But there is no indication whether this relates to allow_mult=false, allow_mult=true, or both.

Sorry, but I don’t understand what you mean by this statement. The point of 
allow_mult=true is so that writes are not arbitrarily dropped. It allows Riak 
nodes to continue to be available to take writes even if they can’t communicate 
with each other. Have a look at Kyle Kingsbury’s Jepsen[2] post on Riak.

> 
> So, the only solution is leveldb+2i?

Maybe. Or maybe just use the client as it is intended to resolve sibling values 
and send that value and a vector clock back to Riak. Or maybe roll your own 
indexes like in this blog post[3]. With Riak 2.0 there are a few data types 
added to Riak that are not opaque. Maybe Riak’s Sets would suit your purpose 
(depending on the size of your Set.)

There is a wealth of data modelling experience at Basho and on this list. The 
more information you give us about your problem, (rather than describing what 
you perceive to be Riak’s shortcomings), the more likely you are to be able to 
benefit from that experience.

You’re fighting the database at the moment, rather than working with it. The 
properties of Riak buy you some wonderful things (high availability, partition 
tolerance, low latency) but you have to want / need those properties, and then 
you have to accept that there is a data modelling / developer complexity price 
to pay. We don’t think that price is too high. We have many customers who 
agree. We’re always working to lower that price (see Strong Consistency, 
Yokozuna, Data Types etc in Riak 2.0[4].)

You seem to have had a very negative first experience of Riak (and Basho.) I 
think that is because you misunderstand what it is for and how it should be 
used. I'm very keen to fix that. If it turns out that Riak

Improved C++ client, python3 bindings

2013-12-18 Thread Pedro Larroy
Hi

I'm improving the C++ client.

Adding python3 bindings with boost_python

and creating an easy C++ class to use the client with less effort.

https://github.com/larroy/riak_python3

Pedro
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: accessing CRDTs in riak 2.0

2013-12-18 Thread Russell Brown
Hi James,

We’re working on docs. There are some edocs at the top of riak_kv_wm_crdt that 
describe the HTTP API, that I’ve put on DropBox here 
https://www.dropbox.com/s/bcdn2q2owgv4jxl/riak_kv_wm_crdt.html, though we are 
still pre-freeze on this code, so APIs may change.

As for the PB messages, it might be best at this time to look at the proto file 
here https://github.com/basho/riak_pb/blob/develop/src/riak_dt.proto

Sorry that that is all there is right now. When we have examples, I’ll get them 
on the list. In the meantime, there is the riak-erlang-client and the 
riak-erlang-http-client, as others have said.

Cheers

Russell

On 17 Dec 2013, at 20:08, James Moore  wrote:

> mainly a spec for the straight http or pb APIs, as far as I understand the 
> only client with explicit support right now is erlang.
> 
> thanks!
> 
> --James
> 
> 
> On Tue, Dec 17, 2013 at 3:06 PM, Brian Roach  wrote:
> Hi James,
> 
> Do you mean via the Erlang client, or one of the other client libs, or ... ?
> 
> Thanks,
> - Roach
> 
> On Tue, Dec 17, 2013 at 12:42 PM, James Moore  wrote:
> > Hey all,
> >
> > I'm working on testing out some of the CRDT features but haven't been able
> > to sort through the incantations to store/query any CRDT other than a
> > counter.  any tips?
> >
> > thanks,
> >
> > --James
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> 
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com





Re: /riak-cs/stats broken

2013-12-18 Thread Hector Castro
Hi Dan,

Apologies for the delayed response.

I reproduced the HTTP 403 when `admin_auth_enabled` is set to `false`,
but successfully retrieved stats with it set to `true`:

$ riak-cs version
1.4.2
$ grep "admin_auth_enabled" /etc/riak-cs/app.config
 {admin_auth_enabled, true},
$ ./s3curl.pl --id admin --contentType application/json -- -s
--proxy1.0 localhost:8080 http://s3.amazonaws.com/riak-cs/stats
{"legend":["meter_count","meter_rate","latency_mean","latency_median","latency_95","latency_99"],"block_get":[0,0.0,0.0,0.0,0.0,0.0],"block_get_retry":[0,0.0,0.0,0.0,0.0,0.0],"block_put":[0,0.0,0.0,0.0,0.0,0.0],"block_delete":[0,0.0,0.0,0.0,0.0,0.0],"service_get_buckets":[0,0.0,0.0,0.0,0.0,0.0],"bucket_list_keys":[0,0.0,0.0,0.0,0.0,0.0],"bucket_create":[0,0.0,0.0,0.0,0.0,0.0],"bucket_delete":[0,0.0,0.0,0.0,0.0,0.0],"bucket_get_acl":[0,0.0,0.0,0.0,0.0,0.0],"bucket_put_acl":[0,0.0,0.0,0.0,0.0,0.0],"object_get":[0,0.0,0.0,0.0,0.0,0.0],"object_put":[0,0.0,0.0,0.0,0.0,0.0],"object_head":[0,0.0,0.0,0.0,0.0,0.0],"object_delete":[0,0.0,0.0,0.0,0.0,0.0],"object_get_acl":[0,0.0,0.0,0.0,0.0,0.0],"object_put_acl":[0,0.0,0.0,0.0,0.0,0.0],"legend":["workers","overflow","size"],"request_pool":[127,0,1],"bucket_list_pool":[5,0,0]}%

The following issue tracks the scenario where the stats endpoint returns a
403 when `admin_auth_enabled` is set to `false`:

https://github.com/basho/riak_cs/issues/719

--
Hector


On Thu, Dec 12, 2013 at 12:30 PM, Sajner, Daniel G  wrote:
> Hi.
>
>
>
> Running riak-cs-1.4.2 here and the /riak-cs/stats endpoint appears to be
> broken.  I get 403 error regardless of the configuration settings.
>
>
>
> Here’s a snippet from app.config:
>
>
>
>   %% Port and IP address to listen on for system
>
>   %% administration tasks. Uncomment the following lines
>
>   %% to use a separate IP and port for administrative
>
>   %% API calls.
>
>   {admin_ip, "127.0.0.1"},
>
>   {admin_port, 8000 } ,
>
>   {admin_auth_enabled, false },  <-- This should let me access the
> /riak-cs/stats endpoint without authorization
>
>
>
>
>
> I’ve changed admin_auth_enabled to true, but get the same results passing an
> authorization header.
>
>
>
> Anyone else run into this issue?
>
>
>
> Thanks,
>
> Dan
>
>
>
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>



Client interface

2013-12-18 Thread Nicholas Wieland
Hi, I'm a riak newbie and I'm planning to use it for my next project.
It would be very useful to have a way to see inside buckets graphically
(like phpmyadmin), a project similar to riak-control but for data.
Does anybody know of something like this? Something basic is ok, I don't
need anything fancy.

TIA,
  ngw


Re: Client interface

2013-12-18 Thread Shane McEwan
On 18/12/13 16:29, Nicholas Wieland wrote:

Hi, I'm a riak newbie and I'm planning to use it for my next project.
It would be very useful to have a way to see inside buckets graphically
(like phpmyadmin), a project similar to riak-control but for data.
Does anybody know of something like this? Something basic is ok, I don't
need anything fancy.


If you're using LevelDB as your backend you could try Levelweb: 
https://github.com/hij1nx/levelweb


Shane.




Re: Client interface

2013-12-18 Thread Sean Cribbs
Sorry to burst your bubble, Shane, but the way we encode Riak keys in
LevelDB means levelweb is going to be mostly useless here.

Nicholas: It's not ideal (list-buckets and list-keys caveats apply), but
for development purposes, you can use rekon: https://github.com/basho/rekon


On Wed, Dec 18, 2013 at 10:34 AM, Shane McEwan  wrote:

> On 18/12/13 16:29, Nicholas Wieland wrote:
>
>> Hi, I'm a riak newbie and I'm planning to use it for my next project.
>> It would be very useful to have a way to see inside buckets graphically
>> (like phpmyadmin), a project similar to riak-control but for data.
>> Does anybody know of something like this? Something basic is ok, I don't
>> need anything fancy.
>>
>
> If you're using LevelDB as your backend you could try Levelweb:
> https://github.com/hij1nx/levelweb
>
> Shane.
>
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>



-- 
Sean Cribbs 
Software Engineer
Basho Technologies, Inc.
http://basho.com/


Re: 404 Error: Object Not Found

2013-12-18 Thread Ari King
Hi Hector,

Hopefully this one resolves it:
>
> I just setup a single node on Ubuntu 12.04 using Vagrant and attempted
> to walk through your steps. In the process, I noticed that you are
> issuing the following `curl` command and receiving a 404:
>
> $ curl -v -XPUT http://10.0.2.15:8098/sessions/GA123D971 -H
> "content-type: application/json" -d '{"username": "Ari", "token":
> "20131122-GA123D971-1210"}'
>
> The path for storing a key named "GA123D971" in a bucket named
> "sessions" should be:
>
> $ curl -v -XPUT http://10.0.2.15:8098/buckets/sessions/keys/GA123D971
> -H "content-type: application/json" -d '{"username": "Ari", "token":
> "20131122-GA123D971-1210"}'
>
> The response to this request should contain an HTTP 204 No Content status.


Unfortunately, I get the following error:

* upload completely sent off: 49 out of 49 bytes
< HTTP/1.1 500 Internal Server Error
< Vary: Accept-Encoding
< Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
< Date: Wed, 18 Dec 2013 18:13:31 GMT
< Content-Type: text/plain
< Content-Length: 22

Error:
all_nodes_down
* Connection #0 to host 192.168.2.25 left intact
* Closing connection #0

Any ideas?


Fwd: May allow_mult cause DoS?

2013-12-18 Thread Viable Nisei
-- Forwarded message --
From: Viable Nisei 
Date: Thu, Dec 19, 2013 at 2:11 AM
Subject: Re: May allow_mult cause DoS?
To: Russell Brown 


Hi.

Thank you very much for your descriptive and informative answer.

On Wed, Dec 18, 2013 at 3:29 PM, Russell Brown  wrote:

> Hi,
>
> Can you describe your use case a little? Maybe it would be easier for us
> to help.
>
Yeah, let me describe an abstract case equivalent to ours. Say we have a
CUSTOMER object, a STORE object, and a TRANSACTION object; each TRANSACTION has
one three-state attribute STATE={ACTIVE, COMPLETED, ROLLED_BACK}.

We should be able to list all the TRANSACTIONs of a given CUSTOMER (so we need
a 1-to-many relation; this list should not be long, 10^2-10^3 records, but we
should be able to obtain it quickly). We should also be able to list all the
TRANSACTIONs in a given STATE made in a given STORE (these lists may be very
long, up to 10^8 records), but they may be computed with some latency.
Predictable latency is certainly preferred but is not a show-stopper. So,
that's all.

Another pain point is races and/or operation atomicity, but that's not so
important at the moment.


> On 18 Dec 2013, at 04:32, Viable Nisei  wrote:
>
> > On Wed, Dec 18, 2013 at 8:32 AM, Erik Søe Sørensen 
> wrote:
> > It really is not a good idea to use siblings to represent 1-to-many
> relations. That's not what it's intended for, nor what it's optimized for...
> > Ok, understood.
> >
> > Can you tell us exactly why you need Bitcask rather than LevelDB? 2i
> would probably do it.
> > 1) According to
> http://docs.basho.com/riak/latest/ops/running/backups/#LevelDB-Backups ,
> it's a real pain to implement backups with leveldb.
> > 2) According to
> http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/ , reads
> may be slower compared to bitcask, which is critical for us
> >
> > Otherwise, storing a list of items under each key could be a solution,
> depending of course on the number of items per key. (But do perform
> conflict resolution.)
> > Why is any conflict resolution required? As far as I understood, with
> allow_mult=true riak should just collect all the values written to a key
> without any additional work? What design decision leads to exponential
> slowdown and crashes when multiple values are allowed for a single key?.. So,
> what's the REAL purpose of allow_mult=true if it's a bad idea to use it for
> unlimited values per single key?
>
> The real purpose of allow_mult=true is so that writes are never dropped.
> In the case where your application concurrently writes to the same key on
> two different nodes, or on two partitioned nodes, Riak keeps both values.
> Other data stores will lose one of the writes based on timestamp, serialise
> your writes (slow) or simply refuse to accept one or more of them.
>
OK, but the documentation doesn't make these points really clear.


>
> It is the job of the client to aggregate those multiple writes into a
> single value when it detects the conflict on read. Conflict resolution is
> required because your data is opaque to Riak. Riak doesn’t know that you’re
> storing lists of values, or JPEGs or JSON. It can’t possibly know how to
> resolve two conflicting values unless it knows the semantics of the values.
> Riak _does_ collect all the values written to a key, but it does so as a
> temporary measure, it expects your application to resolve them to a single
> value. How many are you writing per Key?
>
As I said before, we need really many values in our 1-to-many sets - up to
10^8. Also, why not implement a separate bucket mode that just collects all the
written values? Anyway, the current allow_mult implementation looks very
dangerous. The documentation should also be clearer: the "sibling
explosion" paragraph should state that this applies to allow_mult=true as
well.


> Riak’s sweetspot is highly write available applications. If you have the
> time read the Amazon Dynamo paper[1], as it explains the _problems_ Riak
> solves as well as the way in which it solves them. If you don’t have these
> problems, maybe Riak is not the right datastore for you. Solving these
> problems comes with some developer complexity costs. You’ve run into one of
> them. We have many customers who think the trade-off is worth it: that the
> high availability and low-latency makes up for having eventual consistency.
>
Yeah, OK, but what does riak<2.0 really allow? FTS looks unscalable (am I
right? is there any way to speed it up?), listing all bucket keys is not for
production, 2i is not implemented for bitcask (anyway, we'll try it on
leveldb), and links are "implemented as hacks in the java driver". So, riak<2.0
with bitcask is only a good distributed 1-1 hashmap with mapred support.

>
> > Ok, documentation contains the following paragraph:
> >
> > > Sibling explosion occurs when an object rapidly collects siblings
> without being reconciled. This can lead to a myriad of issues. Having an
> enormous object in yo

strange error with riak 1.3.1 on OS X

2013-12-18 Thread José G. Quenum
Hi all,
I am using riak 1.3.1-x86_64 on OS X Mavericks. After installing the riak 
server I decided to first test if it was working fine. So I issued a few cURL 
commands. First to add with curl -i -v -XPUT 
http://127.0.0.1:8098/riak/users_development/kemy -H "Content-Type: 
application/json" -d '{"bar":"baz"}'
and it returned a 204 code result. Then I decided to fetch the same data with 
curl -i -v -XGET http://127.0.0.1:8098/riak/users_development/kemy
Surprisingly it returns a 500 Internal server error:
< HTTP/1.1 500 Internal Server Error
HTTP/1.1 500 Internal Server Error
< Vary: Accept-Encoding
Vary: Accept-Encoding
* Server MochiWeb/1.1 WebMachine/1.9.2 (someone had painted it blue) is not 
blacklisted
< Server: MochiWeb/1.1 WebMachine/1.9.2 (someone had painted it blue)
Server: MochiWeb/1.1 WebMachine/1.9.2 (someone had painted it blue)
< ETag: "2HyL34gyPBEVz0eVHxraXE"
ETag: "2HyL34gyPBEVz0eVHxraXE"
< Date: Wed, 18 Dec 2013 11:06:21 GMT
Date: Wed, 18 Dec 2013 11:06:21 GMT
< Content-Type: text/html
Content-Type: text/html
< Content-Length: 1009
Content-Length: 1009

< 
500 Internal Server ErrorInternal 
Server ErrorThe server encountered an error while processing this 
request:[{erlang,localtime_to_universaltime,[{{2013,12,18},{12,6,17}},true],[]},
 {calendar,local_time_to_universal_time_dst,1,
   [{file,"calendar.erl"},{line,282}]},
 {httpd_util,rfc1123_date,1,[{file,"httpd_util.erl"},{line,344}]},
 {webmachine_decision_core,decision,1,
   [{file,"src/webmachine_decision_core.erl"},
{line,543}]},
 {webmachine_decision_core,handle_request,2,
   [{file,"src/webmachine_decision_core.erl"},
{line,33}]},
 {webmachine_mochiweb,loop,1,[{file,"src/webmachine_mochiweb.erl"},{line,97}]},
 {mochiweb_http,parse_headers,5,[{file,"src/mochiweb_http.erl"},{line,180}]},
* Connection #0 to host 127.0.0.1 left intact
 
{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]mochiweb+webmachine
 web server
So it can store data but cannot fetch it. The whole thing got more confusing
when I realized that I could delete the object; when I then try to fetch it, I
get a 404 error code, which is expected.
I really have no clue why this is happening. Does anyone have any idea?
thanks in advance,
José


Re: strange error with riak 1.3.1 on OS X

2013-12-18 Thread Luke Bakken
I think you're hitting this timezone bug:

http://erlang.org/pipermail/erlang-questions/2013-January/071698.html

Could you update your Riak version, which should update the Erlang being used?
--
Luke Bakken
CSE
lbak...@basho.com


On Wed, Dec 18, 2013 at 3:33 AM, "José G. Quenum"  wrote:
> Hi all,
> I am using riak 1.3.1-x86_64 on OS X Mavericks. After installing the riak
> server I decided to first test if it was working fine. So I issued a few
> cURL commands. First to add with curl -i -v -XPUT
> http://127.0.0.1:8098/riak/users_development/kemy -H "Content-Type:
> application/json" -d '{"bar":"baz"}'
> and it returned a 204 code result. Then I decided to fetch the same data
> with curl -i -v -XGET http://127.0.0.1:8098/riak/users_development/kemy
> Surprisingly it returns a 500 Internal server error:
> < HTTP/1.1 500 Internal Server Error
> HTTP/1.1 500 Internal Server Error
> < Vary: Accept-Encoding
> Vary: Accept-Encoding
> * Server MochiWeb/1.1 WebMachine/1.9.2 (someone had painted it blue) is not
> blacklisted
> < Server: MochiWeb/1.1 WebMachine/1.9.2 (someone had painted it blue)
> Server: MochiWeb/1.1 WebMachine/1.9.2 (someone had painted it blue)
> < ETag: "2HyL34gyPBEVz0eVHxraXE"
> ETag: "2HyL34gyPBEVz0eVHxraXE"
> < Date: Wed, 18 Dec 2013 11:06:21 GMT
> Date: Wed, 18 Dec 2013 11:06:21 GMT
> < Content-Type: text/html
> Content-Type: text/html
> < Content-Length: 1009
> Content-Length: 1009
>
> <
> 500 Internal Server
> ErrorInternal Server ErrorThe server
> encountered an error while processing this
> request:[{erlang,localtime_to_universaltime,[{{2013,12,18},{12,6,17}},true],[]},
>  {calendar,local_time_to_universal_time_dst,1,
>[{file,"calendar.erl"},{line,282}]},
>  {httpd_util,rfc1123_date,1,[{file,"httpd_util.erl"},{line,344}]},
>  {webmachine_decision_core,decision,1,
>[{file,"src/webmachine_decision_core.erl"},
> {line,543}]},
>  {webmachine_decision_core,handle_request,2,
>[{file,"src/webmachine_decision_core.erl"},
> {line,33}]},
>
> {webmachine_mochiweb,loop,1,[{file,"src/webmachine_mochiweb.erl"},{line,97}]},
>
> {mochiweb_http,parse_headers,5,[{file,"src/mochiweb_http.erl"},{line,180}]},
> * Connection #0 to host 127.0.0.1 left intact
>
> {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]mochiweb+webmachine
> web server
> So it can store data but cannot fetch them. The whole thing got confusing
> when I realized that I could delete. And then when I try to fetch it returns
> a 404 error code, which is expected.
> I have really no clue why this is happening. Does anyone have any idea?
> thanks in advance,
> José
>
> ___
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>



SV: May allow_mult cause DoS?

2013-12-18 Thread Rune Skou Larsen
Save the transaction list inside the customer object keyed by customerid. Index 
this object with 2i on storeids for each contained tx.

If some customer objects grow too big, you can move old txs into archive 
objects keyed by customerid_seqno. For your low latency customer reads, you 
probably only need the newest txs anyway.
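The model above could be sketched against Riak's HTTP interface as follows. This only builds the request (so it runs without a live node); the bucket, key, and index names are illustrative, and the `x-riak-index-<name>_bin` header is Riak's HTTP secondary-index convention (2i requires the LevelDB backend):

```python
# Sketch: store the transaction list inside the customer object and tag it
# with a 2i index entry per store. Bucket/key/index names are hypothetical.
import json
import urllib.request

def build_customer_put(customer_id, txs, store_ids, host="localhost:8098"):
    req = urllib.request.Request(
        f"http://{host}/buckets/customers/keys/{customer_id}",
        data=json.dumps({"transactions": txs}).encode(),
        method="PUT",
    )
    req.add_header("Content-Type", "application/json")
    # A comma-separated _bin index; later, GET
    # /buckets/customers/index/storeid_bin/<store> returns the keys of all
    # customers indexed under that store.
    req.add_header("x-riak-index-storeid_bin", ",".join(store_ids))
    return req

req = build_customer_put("cust-42", ["tx-1", "tx-2"], ["store-7", "store-9"])
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` assumes a node listening on the hypothetical `localhost:8098`.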

That's just one idea. Trifork will be happy to help you find a suitable model 
for your use cases.

We usually do this by stress-testing a simulation with realistic data 
sizes/shapes and access patterns. It's fastest if we come onsite for a couple 
of days and work with you to set it up, but we can also help you offsite.

Write me if you're interested, then we can do a call.

Rune Skou Larsen
Trifork, Denmark


- Reply message -
From: "Viable Nisei" 
To: "riak-users@lists.basho.com" 
Subject: May allow_mult cause DoS?
Date: Wed., Dec. 18, 2013 20:13






Re: accessing CRDTs in riak 2.0

2013-12-18 Thread James Moore
That seems to have done the trick along with the datatype property on
bucket-type creation

Cheers!

--James


On Wed, Dec 18, 2013 at 9:47 AM, Russell Brown  wrote:

> Hi James,
>
> We’re working on docs. There are some edocs at the top of riak_kv_wm_crdt
> that describe the HTTP API, that I’ve put on DropBox here
> https://www.dropbox.com/s/bcdn2q2owgv4jxl/riak_kv_wm_crdt.html, though we
> are still pre-freeze on this code, so APIs change.
>
> As for the PB messages, it might be best at this time to look at the proto
> file here https://github.com/basho/riak_pb/blob/develop/src/riak_dt.proto
>
> Sorry that that is all there is right now. When we have examples, I’ll get
> them on the list. In the meantime, there is the riak-erlang-client and the
> riak-erlang-http-client, as others have said.
>
> Cheers
>
> Russell
>
> On 17 Dec 2013, at 20:08, James Moore  wrote:
>
> > mainly a spec for the straight http or pb APIs, as far as I understand
> the only client with explicit support right now is erlang.
> >
> > thanks!
> >
> > --James
> >
> >
> > On Tue, Dec 17, 2013 at 3:06 PM, Brian Roach  wrote:
> > Hi James,
> >
> > Do you mean via the Erlang client, or one of the other client libs, or
> ... ?
> >
> > Thanks,
> > - Roach
> >
> > On Tue, Dec 17, 2013 at 12:42 PM, James Moore 
> wrote:
> > > Hey all,
> > >
> > > I'm working on testing out some of the CRDT features but haven't been
> able
> > > to sort through the incantations to store/query any CRDT other than a
> > > counter.  any tips?
> > >
> > > thanks,
> > >
> > > --James
> > >
> > > ___
> > > riak-users mailing list
> > > riak-users@lists.basho.com
> > > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> > >
> >
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>


Ruby riak client: timeout and retry?

2013-12-18 Thread Adam Greene
hey folks,

We have an issue where one of the nodes is taking > 5 secs to respond to a
ping request.  This caused (on our side) haproxy health check to time out,
and... well... a cascade of issues.

So we are fixing that up on our end, but it raised questions about Riak
timeouts on the client side and retry logic.  In the Ruby riak-client 1.4.2
as well as master, there doesn't seem to be any timeout logic for protocol
buffers.  Am I mistaken?  A lot of the machinery seems to be there
(Riak::Client#recover_from as well as your extensions to TCPSocket), but it
is not wired together.

Ideally we could set this globally and then override it on a per-call basis
(i.e., have ping return in < 250 ms or error).

Does a feature like this make sense from your perspective and is it on the
road map?  This also may be something that we can help with (in the form of
a PR on github).

Thanks for your time,
adam
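Until the client wires that together, one workaround is wrapping calls in Ruby's stdlib Timeout module with a simple retry loop. This is only a sketch: `with_timeout_and_retry` is a hypothetical helper, not part of the riak-client API, and the real `client.ping` is stood in for by a lambda.

```ruby
require 'timeout'

# Hypothetical helper (not part of riak-client): run a block with a
# deadline, retrying a bounded number of times before giving up.
def with_timeout_and_retry(seconds:, retries: 2)
  attempts = 0
  begin
    Timeout.timeout(seconds) { yield }
  rescue Timeout::Error
    attempts += 1
    retry if attempts <= retries
    raise
  end
end

# Stand-in for client.ping; a real call would hit the PB socket.
ping = -> { true }
result = with_timeout_and_retry(seconds: 0.25) { ping.call }
```

A per-call override then falls out naturally: pass a different `seconds:` at each call site on top of a global default. Note that `Timeout` works by raising into the block from a watchdog thread, so a native socket-level timeout in the client would still be the cleaner fix.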



Re: Maps with multiple buckets

2013-12-18 Thread Bryce Verdier
So in playing with things a little bit, I don't think that this list of 
bucket-key pairs is going to work for me.


I'm using Riak counters to keep tabs on various customer IDs as they 
travel through our system. So when Bob first shows up, he's seen by one 
set of servers, which adds 1 to the counter for Bob within one bucket. 
When he interacts with us, we'll see Bob again in another service, which 
adds 1 to another counter for Bob within another bucket.


so:
buckets/initial/counters/Bob => 1
and:
buckets/interact/counters/Bob => 1

Currently I'm using 2 MR queries to get the list of counts for all 
customers from both buckets and combine these data sets within the 
client. I'm trying to see if it's possible to do this within 1 query, 
maybe returning something like:

{"Bob": [1,1]}

in JSON.

I know that riak_kv_counter:value() requires a RiakObject to get the 
data. In the case of an MR, I know the key and that it's in another 
bucket. Is it possible to get the RiakObject from those two items?
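For comparison, the client-side combine can stay quite small. Below is a plain-Ruby sketch with the two MR result sets stubbed as hashes; the bucket roles and the `{"Bob" => [1, 1]}` shape follow the example above, and nothing here touches the Riak API.

```ruby
# Stubbed results of the two MR queries, one hash per bucket,
# mapping customer key => counter value.
initial  = { "Bob" => 1, "Alice" => 3 }
interact = { "Bob" => 1 }

# Combine into customer => [initial_count, interact_count],
# defaulting a missing counter to 0.
combined = (initial.keys | interact.keys).each_with_object({}) do |key, acc|
  acc[key] = [initial.fetch(key, 0), interact.fetch(key, 0)]
end
# combined == { "Bob" => [1, 1], "Alice" => [3, 0] }
```

Defaulting to 0 matters: a customer seen by only one service would otherwise be dropped from (or raise in) the combined view.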





On 12/17/2013 05:09 PM, Jeremiah Peschka wrote:


The allowable inputs to an MR map phase include a list of bucket key 
pairs. If you know your keys in advance the problem is solved.


Can you describe a bit more about how you're using MR? Is this an ad 
hoc query? A predictable report? Time based?


---
sent from a tiny portion of the hive mind...
in this case, a phone

On Dec 17, 2013 4:51 PM, "Bryce Verdier" wrote:


Hi All,

I have a question concerning map-reduce. I have two buckets with
counters enabled that have similar keys to track two different
metrics. At the moment in order to combine these two datasets
together I have to make 2 different map-reduce queries and combine
the data within the client. I'm wondering if/how it might be
possible to combine both of these queries into one. I'm thinking
that Links are a possibility, but I'm not sure if it would or how
viable a solution it would be.

Any and all advice is welcomed.

Thanks in advance,
Bryce






Re: Maps with multiple buckets

2013-12-18 Thread John Daily
Alex Moore and I provided some general time series advice and links on 
StackOverflow recently: 
http://stackoverflow.com/questions/19384686/what-is-the-most-efficient-way-to-store-time-series-in-riak-with-heavy-reads

Broadly speaking, issuing dynamic queries via MapReduce is going to be less 
desirable than building responses to the questions you’re going to ask later, 
as the data arrives. As you’ve already seen, writing MapReduce queries is 
rather painful, and Riak’s max performance/availability/scalability is achieved 
when serving key/value requests.
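As a sketch of that write-time approach: instead of running MR at read time, update one precomputed rollup per customer as each event arrives, so a later read is a single key/value fetch. The in-memory hash below stands in for a Riak bucket, and `record_event` is an illustrative name, not an existing API.

```ruby
# Stand-in for a Riak bucket of per-customer rollups; in Riak each
# entry would be one object fetched/stored by customer key.
ROLLUPS = Hash.new { |h, k| h[k] = { "initial" => 0, "interact" => 0 } }

# Update the rollup at write time instead of aggregating at read time.
def record_event(customer, kind)
  ROLLUPS[customer][kind] += 1
end

record_event("Bob", "initial")
record_event("Bob", "interact")

# A later read is one key lookup, no MapReduce:
bob = ROLLUPS["Bob"]  # => { "initial" => 1, "interact" => 1 }
```

In Riak itself, concurrent updates to such a rollup object would still need conflict resolution (or per-field counters) to avoid lost writes.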

-John


On Dec 18, 2013, at 5:02 PM, Bryce Verdier  wrote:

> So in playing with things a little bit, I don't think that this list of 
> bucket-key pairs is going to work for me.
> 
> I'm using Riak counters to keep tabs on various customer IDs as they travel 
> through our system. So when Bob first shows up, he's seen by one set of 
> servers, which adds 1 to the counter for Bob within one bucket. When he 
> interacts with us, we'll see Bob again in another service, which adds 1 to 
> another counter for Bob within another bucket.
> 
> so:
> buckets/initial/counters/Bob => 1
> and:
> buckets/interact/counters/Bob => 1
> 
> Currently I'm using 2 MR queries to get the list of counts for all customers 
> from both buckets and combine these data sets within the client. I'm trying 
> to see if it's possible to do this within 1 query, maybe returning something 
> like:
> {"Bob": [1,1]}
> in JSON.
> 
> I know that riak_kv_counter:value() requires a RiakObject to get the data. In 
> the case of an MR, I know the key and that it's in another bucket. Is it 
> possible to get the RiakObject from those two items?
> 
> 
> 
> 
> On 12/17/2013 05:09 PM, Jeremiah Peschka wrote:
>> The allowable inputs to an MR map phase include a list of bucket key pairs. 
>> If you know your keys in advance the problem is solved.
>> 
>> Can you describe a bit more about how you're using MR? Is this an ad hoc 
>> query? A predictable report? Time based?
>> 
>> ---
>> sent from a tiny portion of the hive mind...
>> in this case, a phone
>> 
>> On Dec 17, 2013 4:51 PM, "Bryce Verdier"  wrote:
>> Hi All,
>> 
>> I have a question concerning map-reduce. I have two buckets with counters 
>> enabled that have similar keys to track two different metrics. At the moment 
>> in order to combine these two datasets together I have to make 2 different 
>> map-reduce queries and combine the data within the client. I'm wondering 
>> if/how it might be possible to combine both of these queries into one. I'm 
>> thinking that Links are a possibility, but I'm not sure if it would or how 
>> viable a solution it would be.
>> 
>> Any and all advice is welcomed.
>> 
>> Thanks in advance,
>> Bryce
>> 
> 



Re: May allow_mult cause DoS?

2013-12-18 Thread Viable Nisei
Hi

On Thu, Dec 19, 2013 at 3:07 AM, Rune Skou Larsen  wrote:

> Save the transaction list inside the customer object keyed by customerid.
> Index this object with 2i on storeids for each contained tx.
>
Not such a good idea. Transactions may be running in parallel, but there are
no atomic operations in Riak, nor lock managers or an UPDATE operation that
knows about the blob structure. The risk of a race condition is not high,
but it exists.
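With allow_mult=true that race is survivable rather than avoided: concurrent writers produce siblings, and the reader merges them. A minimal sketch of such a merge for a per-customer transaction list, in plain Ruby with assumed data shapes (arrays of tx ids, not a riak-client API):

```ruby
# Each sibling is one concurrently-written version of the customer's
# transaction list. A set-union merge keeps every concurrent append,
# so neither writer's transaction is lost.
def resolve_tx_siblings(siblings)
  siblings.flatten(1).uniq
end

siblings = [
  %w[tx1 tx2],  # value written via node A
  %w[tx1 tx3],  # concurrent value written via node B
]
merged = resolve_tx_siblings(siblings)
# merged == ["tx1", "tx2", "tx3"]
```

Note that a union merge handles concurrent adds but not deletes; removing a transaction would need tombstones or a proper set CRDT.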


> If some customer objects grow too big, you can move old txs into archive
> objects keyed by customerid_seqno. For your low latency customer reads, you
> probably only need the newest txs anyway.
>
Yeah, we've considered approaches similar to this, but rejected them due to
race conditions. We've also considered some kind of DLM (like ZooKeeper),
but if we need a DLM, we'll just use hadoop/cassandra/hbase...

> That's just one idea. Trifork will be happy to help you find a suitable
> model for your use cases.
>
Ok, but such an idea doesn't look like anything mind-blowing... we have
considered this idea and many other approaches. Also, what would the answer
be for the STORE-TRANSACTION binding? Just mapred?..


> We usually do this by stress-testing a simulation with realistic data
> sizes/shapes and access patterns.
>
Same for us. We use tsung (scripts are generated, and tsung is slightly
automated with some pieces of Erlang code) and some custom multithreaded
scenarios like the ones I mentioned in the original message.


> It's fastest if we come onsite for a couple of days and work with you to
> set it up, but we can also help you offsite.
>
> Write me if you're interested, then we can do a call.
>
I'm interested, but for now it looks like there is no perfect solution (the
only untested approach left is custom indexing on the Riak side), so I'm not
really sure if we should pay just to confirm that there is no real
solution...


On Thu, Dec 19, 2013 at 3:07 AM, Rune Skou Larsen  wrote:

> Save the transaction list inside the customer object keyed by customerid.
> Index this object with 2i on storeids for each contained tx.
>
> If some customer objects grow too big, you can move old txs into archive
> objects keyed by customerid_seqno. For your low latency customer reads, you
> probably only need the newest txs anyway.
>
> That's just one idea. Trifork will be happy to help you find a suitable
> model for your use cases.
>
> We usually do this by stress-testing a simulation with realistic data
> sizes/shapes and access patterns. It's fastest if we come onsite for a
> couple of days and work with you to set it up, but we can also help you
> offsite.
>
> Write me if you're interested, then we can do a call.
>
> Rune Skou Larsen
> Trifork, Denmark
>
>
> - Reply message -
> Fra: "Viable Nisei" 
> Til: "riak-users@lists.basho.com" 
> Emne: May allow_mult cause DoS?
> Dato: ons., dec. 18, 2013 20:13
>
>
>
>
>
> -- Forwarded message --
> From: Viable Nisei <vsni...@gmail.com>
> Date: Thu, Dec 19, 2013 at 2:11 AM
> Subject: Re: May allow_mult cause DoS?
> To: Russell Brown <russell.br...@me.com>
>
>
> Hi.
>
> Thank you for your descriptive and so informative answer very much.
>
> On Wed, Dec 18, 2013 at 3:29 PM, Russell Brown wrote:
> Hi,
>
> Can you describe your use case a little? Maybe it would be easier for us
> to help.
> Yeah, let me describe some abstract case equivalent to ours. We have a
> CUSTOMER object, a STORE object and a TRANSACTION object; each TRANSACTION
> has one tribool attribute STATE={ACTIVE, COMPLETED, ROLLED_BACK}.
>
> We should be able to list all the TRANSACTIONs of a given CUSTOMER, for
> example (so we need a 1-to-many relation; this list should not be long,
> 10^2-10^3 records, but we should be able to obtain it quickly). We should
> also be able to list all the TRANSACTIONs of a given STATE made in a given
> STORE (these lists may be very long, up to 10^8 records, but may be
> computed with some latency; predictable latency is preferred, but its
> absence is not a show-stopper). So, that's all.
>
> Another pain is races and/or atomicity of operations, but that's not so
> important at the moment.
>
>
> On 18 Dec 2013, at 04:32, Viable Nisei <vsni...@gmail.com> wrote:
> > On Wed, Dec 18, 2013 at 8:32 AM, Erik Søe Sørensen wrote:
> > It really is not a good idea to use siblings to represent 1-to-many
> relations. That's not what it's intended for, nor what it's optimized for...
> > Ok, understood.
> >
> > Can you tell us exactly why you need Bitcask rather than LevelDB? 2i
> would probably do it.
> > 1) According to
> http://docs.basho.com/riak/latest/ops/running/backups/#LevelDB-Backups ,
> it's a real pain to implement backups with LevelDB.
> > 2) According to
> http://docs.basho.com/riak/latest/ops/advanced/backends/leveldb/ , reads
> may be slower compared to Bitcask, which is critical for us.
> >
> > Otherwise, storing a list of items under each key could be a solution,
> depending of course on the