Re: Scale up or out?

2012-06-26 Thread Aphyr
On 06/26/2012 07:24 AM, Eric Anderson wrote: Hey all, Question about EC2 (or scale in general): I'm building a decent cluster, to handle 15-20k inserts/s and 5-10k gets per second. (for a rough idea of what I'm doing). I've been playing with a 15-node cluster of m2.xlarge systems, but I am wonde

Bitcask - large keydirs

2011-03-10 Thread Aphyr
TLDR: hey, what about using extendible hashing for bitcask keydirs? Constant-time lookups with two disk seeks end-to-end, much larger keyspaces than currently supportable, but without the total rehashing cost. Also avoids the O(log N) insertion/search/deletion costs of b-trees. At length: I'v
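
As a rough illustration of the extendible hashing idea mentioned above, here is a minimal in-memory sketch in Python. It is not Bitcask code and does not touch disk; the class names, the bucket capacity, and the use of Python's built-in hash() are all assumptions made for the example. It only shows the property the post relies on: when a bucket overflows, only that bucket is split and redistributed, and the directory doubles without rehashing every key.

```python
class Bucket:
    """A fixed-capacity bucket with its own local depth (hypothetical helper)."""

    def __init__(self, depth):
        self.depth = depth  # number of low hash bits this bucket distinguishes
        self.items = {}


class ExtendibleHash:
    """Directory of buckets indexed by the low bits of the key's hash."""

    def __init__(self, bucket_capacity=4):
        self.capacity = bucket_capacity
        self.global_depth = 0
        self.directory = [Bucket(0)]

    def _index(self, key):
        # Use the low global_depth bits of the hash as the directory slot.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._index(key)].items.get(key, default)

    def put(self, key, value):
        # Note: this loops forever if more than `capacity` distinct keys
        # share an identical hash; a real implementation needs overflow handling.
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < self.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # Double the directory; the new half mirrors the old half.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        high_bit = 1 << (bucket.depth - 1)
        # Repoint the slots whose newly significant bit is set.
        for i, candidate in enumerate(self.directory):
            if candidate is bucket and i & high_bit:
                self.directory[i] = sibling
        # Redistribute only the overflowing bucket's items.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            target = sibling if hash(k) & high_bit else bucket
            target.items[k] = v


table = ExtendibleHash(bucket_capacity=4)
for i in range(1000):
    table.put(i, str(i))
print(table.get(123), len(table.directory))
```

A lookup touches one directory slot and one bucket, which is where the constant-time claim comes from; an on-disk variant would map those two steps to the two seeks described in the post.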

A script to check bitcask keydir sizes

2011-03-16 Thread Aphyr
I'm trying to track some basic metrics so we can plan for cluster capacity, monitor transfers, etc. Figured this might be of interest to other riak admins. Apologies if my erlang is nonidiomatic, I'm still learning. :) #!/usr/bin/env escript %%! -name riakstatuscheck -setcookie riak main([])

Re: Riak.mapValuesJson crashes on \r\n

2011-03-23 Thread Aphyr
Newline parsing is broken in JSON2.js shipped with Riak. Drop a more recent version of JSON2.js in the directory referred to by js_source_dir in app.config's riak_kv section, e.g. {js_source_dir, "/etc/riak/js/"} and reload the node. --Kyle On 03/23/2011 01:31 PM, Michael Ossareh wrote: Gre

Re: JSON with newlines

2011-04-15 Thread Aphyr
Yes, it's because the JSON2.js included with Riak has a bug around newlines. Dropping an updated JSON2.js in your js_source_dir will fix it. --Kyle On 04/15/2011 12:20 AM, Matt Ranney wrote: I'm using Riak Search 0.14.0-1, and it seems like JSON docs with otherwise legal \r characters in them

Re: This sure looks like a bug...?

2011-04-18 Thread Aphyr
I actually had a question about that page. Why is it that when there is a conflict we can only get the conflicting versions of the data? If I'm going to try to resolve the conflict intelligently, I really want the common ancestor as well so that I can try to do a 3-way merge. Good call. If an a

Re: 'not found' after join

2011-05-02 Thread Aphyr
I'd like to chime in here by noting that it would be incredibly nice if the client could distinguish between a record that is missing because the vnode is unavailable, and a record that truly does not exist. My consistency-repair system was running during partition handoff, determined that seve

Re: Authentication

2011-05-03 Thread Aphyr
Any system which presents plaintext is vulnerable; it is simply a matter of complexity. Once you've compromised a layer which processes plaintext, all layers below it are essentially moot, as the Playstation network recently discovered. The only scheme which will defend against data compromise

Re: How many links does it take til you get to the center of the ... ?

2011-05-04 Thread Aphyr
HA! I just ran into this limit! 5,000 links like /tablet_users/whoever riaktag=following will cause your user objects to take upwards of 4 seconds to return, on our beefy cluster, with a variance of ~3 seconds. (Over HTTP, ruby riak_client.) I had to move them into the JSON body, which makes t

Re: Links vs Key Filters for Performance

2011-05-05 Thread Aphyr
The key filter still has to walk the entire keyspace, which will make fetches an O(n) operation as opposed to O(1). --Kyle On 05/05/2011 03:35 PM, Andrew Berman wrote: I was curious if anyone has any thoughts on what is more performant, links or key filters in terms of secondary links. For ex

Re: Links vs Key Filters for Performance

2011-05-05 Thread Aphyr
I suppose if you had a really small number of keys in Riak it might be faster, but you're almost certainly better off maintaining a second object and making the lookup constant time. Here's an example: --Kyle On 05/05/

Re: Millions of buckets?

2011-05-11 Thread Aphyr
Since buckets are essentially key prefixes, I think buckets will probably not make this faster. Maybe one of the riak-search experts knows why your search is taking so long. --Kyle On 05/11/2011 12:00 PM, alexeypro wrote: Generally the problem there that I may end up with N buckets, where N i

Re: Production Backup Strategies

2011-05-13 Thread Aphyr
In the exciting event that your application or riak goes rogue and deletes everything, bitcask will allow you to recover amazing, life-saving amounts of data from its log-structured format. ASK ME HOW I KNOW. :-P Uh, more typically, I've heard that FS-level snapshots of /var/lib/riak or simpl

Mapreduce crosstalk

2011-05-17 Thread Aphyr
I was writing a new mapreduce query to look at users over time, and ran it over a single user in production. After that, other mapreduce jobs over users started returning results from my new map phase, some of the time. After five minutes of this, I had to restart every node in the cluster to g

Bitcask bindings for Ruby

2011-05-18 Thread Aphyr
g vclocks. --Kyle ___ riak-users mailing list

Re: Riak Client Resources, Deleting a Key Doesn't Remove it from bucket.keys

2011-05-26 Thread Aphyr
Agreed. In fact, jrecursive pointed out to me last week that vnode operations are synchronous. That means that when you call list-keys, not only is it going to take a long time (right now upwards of 5 minutes) to complete, but while each vnode is returning its list of keys *it blocks any other

Re: Riak Client Resources, Deleting a Key Doesn't Remove it from bucket.keys

2011-05-26 Thread Aphyr
In software products that have containment metaphors, how often do we see a function return a cached value rather than the up-to-date value, especially for products that manage shared data? Pretty frequently, actually. Every Ruby ORM I've used caches associations by default. Even when listing is

Re: Best practice for using erlang modules in riak?

2011-06-02 Thread Aphyr
On 06/02/2011 01:49 PM, Sylvain Niles wrote: Is there any open source code out there using erlang functions via ripple or rest that I can look at to see a fully functional flow? I made a lot of stupid mistakes getting this to work. Leaving add_paths commented out, not using arrays in arguments

Re: Question: Object Not Saved After Save/Delete/Save

2011-06-03 Thread Aphyr
Riak can't use the vclock for conflict resolution on a fresh object, i.e. one without a vclock. Deletes are writes. You should use get or reload before writing to help Riak sequence your writes correctly. On top of this, Riak has some weirdness around very quick sequences of deletes/writes due

Bitcask-ruby update

2011-06-11 Thread Aphyr
Bitcask-ruby now implements the keydir and knows how to use hintfiles. It's now capable of loading 62,000 keys (from a 535mb bitcask) in 1.5 seconds. We're using this at Showyou to list keys and run various analytics without blocking Riak.

Re: Newbie Ripple

2011-06-21 Thread Aphyr
2. You could write x = Klass.find(key) if x.nil? x = end get_or_new doesn't save, so perhaps Klass.find(key) || Risky (another Ruby Riak model layer) offers Klass.get_or_new(key) 3. control the bucket on which the document is stored/retri

Re: mr_queue gone wild

2011-06-30 Thread Aphyr
The mr_queue is a bitcask, so you should expect it to grow monotonically until compaction. The file size is not an indication of the number of pending jobs. You can read the contents using any bitcask utility. For example, using $ bitcask --no-riak /var

Lots of bitcask files for a vnode, unable to merge

2011-06-30 Thread Aphyr
One of the vnodes on one of my hosts has a *lot* of bitcask data/hint files, and makes a new one every 3 minutes. In the logs, I get =ERROR REPORT 30-Jun-2011::20:24:14 === Failed to merge ["/var/lib/riak/bitcask/794976964837219653749465284983368790965189869568", [], ...HUGE LIST OF DATA

Re: Namespace in Ripple?

2011-07-01 Thread Aphyr
class TcWeb::Root include Ripple::Document self.bucket_name = 'roots' # or tcweb_roots, whatever ... end On 07/01/2011 08:25 AM, Thomas Fee wrote: I'm currently using Ripple with the application name prepended to the typename in an effort to artificially create a namespace for app, to not colli

Re: Riak crashing due to "eheap_alloc: Cannot allocate xxxx bytes of memory"

2011-07-05 Thread Aphyr
/2011 09:28 PM, Jeff Pollard wrote: Thanks to some help from Aphyr + Sean Cribbs on IRC, we narrowed the issue down to us having several multiple-hundred-megabyte sized documents and one 1.1 gig document. Deletion of those documents has now kept the cluster running quite happily for 3+ hours now,

Re: Bitcask merge

2011-08-25 Thread Aphyr
Have you checked that your bitcask maximum file size is small enough? Bitcask will only merge *inactive* files, so if your active file limit is 500MB and your active file is 320, you won't merge. --Kyle On 08/25/2011 06:43 AM, raghwani sohil wrote: I have deleted all the keys from all buckets

SF talk: Scaling at Showyou

2011-09-06 Thread Aphyr
Hello all, Wanted to say that John Mullerleile and I will be giving a talk on high-volume, high-availability technologies at We'll discuss building an application with Riak, Solr, distributed queuing, and metrics, and present some new open-source tools we've built to tackl

Re: Bitcask folder 0

2011-09-14 Thread Aphyr
Yep, that's partition 0. Partitions are spaced evenly around the hash range, which is [0, 2^160) and cover hashes starting at their name. If you have two partitions, they'll be {0, N/2}. three partitions: {0, N/3, 2N/3}. --Kyle On 09/14/2011 10:56 AM, Jeremy Raymond wrote: In /var/lib/riak/b
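
The even spacing described above can be worked through in a few lines of Python. This is a sketch only: the function names are made up for the example, and it assumes the partition count is a power of two so that the division of the 2^160 range is exact.

```python
RING_SIZE = 2 ** 160  # the hash range [0, 2^160) from the message above


def _check_power_of_two(num_partitions):
    # Assumption for this sketch: partition counts are powers of two,
    # so RING_SIZE divides evenly.
    if num_partitions <= 0 or (num_partitions & (num_partitions - 1)) != 0:
        raise ValueError("num_partitions must be a power of two")


def partition_starts(num_partitions):
    """Return the starting hash (the partition's name) of every partition."""
    _check_power_of_two(num_partitions)
    step = RING_SIZE // num_partitions
    return [i * step for i in range(num_partitions)]


def owning_partition(hash_value, num_partitions):
    """Return the name of the partition that covers hash_value."""
    _check_power_of_two(num_partitions)
    step = RING_SIZE // num_partitions
    return (hash_value // step) * step


print(partition_starts(2) == [0, RING_SIZE // 2])
```

Each partition covers the hashes from its own name up to, but not including, the next partition's name, which is why a directory named 0 appears for the first partition.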

Re: Riak security

2011-09-30 Thread Aphyr
We've been over this several times on riak-users, which suggests to me a blog post might help. I'll try to draft something. On 09/30/2011 11:00 AM, Kyle Quest wrote: This is a pretty common situation with the NoSQL databases. They have no security and the standard answer is that it's your job t

Re: Riak security

2011-09-30 Thread Aphyr
On 09/30/2011 01:28 PM, Kyle Quest wrote: Having separate nodes for reads and writes provides an opportunity for better isolation and control even when the requests are forwarded to different vnodes... I humbly suggest this is a bad idea. Varying behavior between nodes a.) is a headache to c

Re: Riak security

2011-09-30 Thread Aphyr
On 09/30/2011 02:50 PM, Kyle Quest wrote: I'm not here to define a perfect infrastructure for securing NoSQL databases and Riak and go into implementation details... It's not my intention because I simply don't have time to dedicate to this big project and it's impossible to come up with a perfec

Systems Security: a Primer

2011-10-02 Thread Aphyr
As promised, a brief overview of designing secure applications, with a quick rundown of how you might expose Riak to the world. --Kyle Kingsbury ___ riak-users mailing list riak-users@lists.bas

Re: Have Riak servers in separate cluster behind a load balancer, or on same machines as web server?

2011-10-04 Thread Aphyr
Option C: Deploy your web servers with a list of hosts to connect to. Have the clients fail over when a riak node goes down. Lower latency without sacrificing availability. If you're using protobufs, this may not be as big of an issue. --Kyle On 10/04/2011 02:04 PM, O'Brien-Strain, Eamonn wro
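
A minimal sketch of the client-side failover in Option C, in Python. Nothing here is a real Riak client API: with_failover and the operation callback are hypothetical, and a node being down is modelled simply as the callback raising ConnectionError.

```python
def with_failover(hosts, operation):
    """Call operation(host) on each host in order and return the first success."""
    last_error = None
    for host in hosts:
        try:
            return operation(host)
        except ConnectionError as err:
            last_error = err  # this node looks down; try the next one
    raise ConnectionError(f"all hosts failed: {hosts}") from last_error


def fake_get(host):
    # Stand-in for a real request; pretends the first node is down.
    if host == "riak1":
        raise ConnectionError("riak1 unreachable")
    return f"value from {host}"


print(with_failover(["riak1", "riak2", "riak3"], fake_get))
```

Because each web server carries its own host list, there is no load balancer hop in the request path, which is the latency argument made above.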

Re: Have Riak servers in separate cluster behind a load balancer, or on same machines as web server?

2011-10-04 Thread Aphyr
Internode times in our datacenter at SL are indistinguishable from loopback; TCP/IP processing dominates. HTTP, on the other hand, involves either in-depth connection management/multiplexing, or TCP/IP setup/teardown latency at either end of a request. In read-write heavy apps, protobufs outper

Re: Have Riak servers in separate cluster behind a load balancer, or on same machines as web server?

2011-10-04 Thread Aphyr
re. Not how the wire is managed. (and with that said, the Python client managed the wire in the most horrible ways imaginable for the HTTP Client; I've since fixed that on my branch) On Oct 4, 2011 11:37 PM, "Aphyr" wrote: > Internode times in our data

Re: Need help with Mapreduce

2011-10-05 Thread Aphyr
Like it says, the request that you submitted isn't JSON. MR functions belong in the source attribute of the JSON document, not floating outside it. --Kyle On 10/05/2011 05:19 PM, urvi wrote: I am trying to use this function to get the highest number from given date. My map is working fine bu

Re: Riak 1.0, Clojure and the Java Client

2011-10-10 Thread Aphyr
On 10/07/2011 04:23 PM, Tim Robinson wrote: I just read the Statebox page you linked as an example and have a hard time thinking I would want to use this. While automation is always nice, the overhead is an unnecessary burden. Since Clojure provides coordinated/transactional data structures, it's

Re: network-based access control

2011-10-17 Thread Aphyr
Yes; front Riak with a proxy which performs the appropriate access control. Note that you'll have to ban (or have a javascript/erlang interpreter to identify/contain incorrect access) mapreduce through this proxy as well. --Kyle On 10/17/2011 10:39 AM, Simon Chen wrote: Hi folks, Is it poss

Do not expose Riak to the Internet

2011-10-19 Thread Aphyr
With Eric Redmond's permission*, I am releasing a proof-of-concept exploit which uses Riak's mapreduce API to execute arbitrary Erlang code and obtain shell access. Please stop doing this. --Kyle * Eric (http://crudco

Re: Do not expose Riak to the Internet

2011-10-19 Thread Aphyr
On 10/19/2011 04:36 PM, Nate Lawson wrote: You can call 'os:cmd' to shell out from a M-R job. You can't do that directly in MySQL. No, but you can do other interesting things. Writing binaries to the filesystem will get you quite a ways. :) --Kyle __

Re: Do not expose Riak to the Internet

2011-10-20 Thread Aphyr
failsafes.** The answer is not to ban mapreduce (or distributed code execution of any kind). The answer is to avoid running code from people in dark alleys on a system you care about.*** :) --Kyle *!/aphyr/status/124275497042591746/photo/1/large ** e.g. http://www.cvedetai

Re: Key Filter Timeout

2011-10-23 Thread Aphyr
prepare to stop using key filters, bucket listing, and key listing early. Our current strategy is to store the keys in Redis, and synchronize them with post-commit hooks and a process that reads over bitcask. With ionice 3, it's fairly low-impact. may be

Re: Severe problems when adding a new node

2011-10-28 Thread Aphyr
I was waiting for Basho to write an official notice about this, but it's been three days and I really don't want anyone else to go through this shitshow. 1.0.1 contains a race condition which can cause vnodes to crash during partition drop. This crash will kill the entire riak process. On our

Re: key name conventions

2011-10-28 Thread Aphyr
Yep, two buckets: one for users, one for users_by_id. Or, you could use secondary indexes, and not worry about keeping the ids in sync. For ID generation, UUIDs will work, SHA1s will work, or you could use an ID generation s
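
The two-bucket layout can be sketched with plain dictionaries standing in for the buckets. This is an illustration in Python, not Riak client code: which bucket is keyed by name and which by ID is an assumption, as are the function names; the ID comes from the standard uuid module, one of the options mentioned above.

```python
import uuid

users = {}        # stands in for the "users" bucket: name -> user record
users_by_id = {}  # stands in for "users_by_id": id -> name


def create_user(name, profile):
    """Write the user record and the ID lookup record together."""
    user_id = str(uuid.uuid4())
    users[name] = {"id": user_id, **profile}
    users_by_id[user_id] = name  # must be kept in sync by the application
    return user_id


def find_by_id(user_id):
    """Two constant-time fetches: id -> name, then name -> record."""
    name = users_by_id.get(user_id)
    return None if name is None else users[name]


new_id = create_user("alice", {"email": "alice@example.com"})
print(find_by_id(new_id)["email"])
```

The cost of this design is the second write and the need to keep both records in sync, which is the bookkeeping that secondary indexes would remove.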

Re: atomically updating multiple keys

2011-10-30 Thread Aphyr
One easy way to solve an atomic a->b relationship in an eventually consistent way is to require the existence of both A and B in order to consider the write valid, to write A first, and use a message queue to retry writes until both A and B exist. There are other approaches for agreement betwee
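
The write-A-then-retry approach described above might look roughly like this in Python. It is a sketch under stated assumptions: a dict stands in for the store, a deque stands in for a durable message queue, a failed write is modelled as OSError, and every function name is hypothetical.

```python
from collections import deque

store = {}         # stand-in for the key/value store
pending = deque()  # stand-in for a durable message queue


def process_pending(put):
    """Retry queued pairs once; re-enqueue any pair whose writes fail."""
    for _ in range(len(pending)):
        a, b = pending.popleft()
        try:
            for key, value in (a, b):  # A is always written before B
                if key not in store:
                    put(key, value)
        except OSError:
            pending.append((a, b))  # try again on the next pass


def write_pair(a, b, put):
    """Record the intent first, then attempt the writes."""
    pending.append((a, b))
    process_pending(put)


def read_pair(key_a, key_b):
    """The write only counts as valid once both A and B exist."""
    if key_a in store and key_b in store:
        return store[key_a], store[key_b]
    return None


def simple_put(key, value):
    store[key] = value


write_pair(("order:1", "placed"), ("inventory:1", "reserved"), simple_put)
print(read_pair("order:1", "inventory:1"))
```

Readers never act on A alone, so a half-finished write is invisible rather than inconsistent, and the queue guarantees it is eventually completed.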

Re: safely resolving conflicts on read

2011-11-02 Thread Aphyr
On 11/02/2011 10:40 AM, Justin Karneges wrote: Thanks everyone for these replies (and also Aphyr, off-list). It has helped me confirm my suspicions and sounds like I'm on the right track. For one of my keys, I am doing sort of a manual "last write wins" by having the reader s

Re: best practices for testing eventual consistency?

2011-11-15 Thread Aphyr
The fastest thing is probably to generate conflicts right below the conflict resolution system. If you are worried you can't predict the conflicts at all, go ahead and perform multiple reads and writes at overlapping times. No need for excessive load; controlling the timing alone should be suff
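
Generating conflicts "right below the conflict resolution system" can be as simple as handing the resolver hand-built siblings. The Python sketch below assumes a set-union merge purely as an example resolver; the function names are hypothetical and no Riak client is involved.

```python
from itertools import permutations


def resolve(siblings):
    """Example resolver: merge conflicting versions by set union."""
    merged = set()
    for sibling in siblings:
        merged |= set(sibling)
    return sorted(merged)


def check_resolver():
    """Feed the resolver siblings directly, as if clients had written concurrently."""
    siblings = [["a", "b"], ["b", "c"], ["d"]]
    expected = ["a", "b", "c", "d"]
    # The store makes no promise about sibling order, so try every order.
    for ordering in permutations(siblings):
        assert resolve(list(ordering)) == expected
    # Resolving an already-resolved value must not change it.
    assert resolve([expected]) == expected
    return True


print(check_resolver())
```

Tests like this run in milliseconds and cover the merge logic exhaustively; the overlapping reads and writes suggested above are then only needed to confirm that real conflicts reach the resolver at all.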

Re: Social network data / Graph properties of Riak

2011-11-18 Thread Aphyr
Depending on whether you think it will be more efficient to store the graph or its dual, consider each node a vertex and write the adjacency list as a part of its data. You can store whatever weights, etc. you need on the edges there. Don't use links; they're just a thin layer on top of mapred

Re: Social network data / Graph properties of Riak

2011-11-18 Thread Aphyr
On 11/18/2011 11:50 AM, Jeroen van Dijk wrote: And I also didn't include the riak user list for this reply: On Fri, Nov 18, 2011 at 7:04 PM, Aphyr wrote: Depending on whether you think it will be more efficient to store the graph or its dual, co

Re: slow 2 node cluster

2011-11-20 Thread Aphyr
On 11/20/2011 05:19 AM, Catalin Constantin wrote: The connection between servers is 10MBytes / sec not 10Mbit / sec. Are you sure? To my knowledge almost no ethernet gear runs at 10 MB/s. It's almost always 10, 100, 1000, or 1 Mb/s. It may be your n_val. If it's the default (3), one of y

Re: slow 2 node cluster

2011-11-20 Thread Aphyr
On 11/20/2011 12:14 PM, Catalin Constantin wrote: I am 100% sure the transfer rate is 10MBytes / second. This is not the problem. In ten years of network administration I have never encountered an ethernet device with a wire rate of 10 MBps. I have, however, encountered frequent confusion ove

Re: slow 2 node cluster

2011-11-20 Thread Aphyr
On 11/20/2011 01:34 PM, Catalin Constantin wrote: To make it simple. No more networking. Just one node (with n = 1) and local tests. The producing of data is a simple CSV file read (ruled out too cause this is fast). Read from the same disk? If you're interleaving every write with a read from

Re: Is Riak appropriate for website metrics?

2011-11-28 Thread Aphyr
For limited mapreduce (where you know the keys in advance) riak would be a fine choice. 500 million keys, n val 3 is readily achievable on commodity hardware; say four nodes with 128GB SSDs. If large-scale mapreduce (more than a few hundred thousand keys) is important, or listing keys is criti

Re: Is Riak appropriate for website metrics?

2011-11-28 Thread Aphyr
uld love to hear more about Mecha if you're willing to share. Feel free to contact me off-list. thanks again, -mike On 11/28/11 2:24 PM, Aphyr wrote: For limited mapreduce (where you know the keys in advance) riak would be a fine choice. 500 million keys, n val 3 is readily achievable on comm

Re: Riak Recap for November 23 - 27

2011-11-28 Thread Aphyr
6) Q --- Are people running riak natively on osx (for development) or running on a vm that matches production? (from kenperkins via #riak) A --- Anyone? (We had a similar thread on the list several months back about this but I figured it couldn't hurt to open it up to more discussion.) We

Re: Riak Client Pooling in python

2011-12-08 Thread Aphyr
I don't know about Python, but we've been attacking this problem in the Ruby client. You might find these useful: Re-entrant threadsafe resource pooling: Node configuration/error tracking:

Re: standby cluster experiment

2011-12-19 Thread Aphyr
On 12/09/2011 11:53 AM, John Loehrer wrote: I am currently evaluating riak. I'd like to be able to do periodic snapshots of /var/lib/riak using LVM without stopping the node. According to a response on this ML you should be able to copy the data directory for eleveldb backend.

Re: Absolute consistency

2012-01-05 Thread Aphyr
On 01/05/2012 11:44 AM, Tim Robinson wrote: Ouch. I'm shocked that is not considered a major bug. At minimum that kind of stuff should be front and center in their wiki/docs. Here I am thinking n 2 on a 3 node cluster means I'm covered when in fact I am not. It's the whole reason I gave Riak

Re: Absolute consistency

2012-01-05 Thread Aphyr
On 01/05/2012 12:12 PM, Tim Robinson wrote: Thank you for this info. I'm still somewhat confused. Why would anyone ever want 2 copies on one physical PC? Correct me if I am wrong, but part of the sales pitch for Riak is that the cost of hardware is lessened by distributing your data across a clu

Re: Absolute consistency

2012-01-05 Thread Aphyr
On 01/05/2012 12:53 PM, Tim Robinson wrote: So with the original thread where with N=3 on 3 nodes. The developer believed each node was getting a copy. When in fact 2 copies went to a single node. So yes, there's redundancy and the "shock" value can go away :) My apologies. That said, I have no

Re: Adding a new machine to a three node cluster cause partition handoff problems

2012-01-10 Thread Aphyr
There's a code snippet in riak 1.0.1 or 1.0.2 release notes which addresses this. Sorry can't find it for you, network here is useless. :( Ivaylo Panitchkov wrote: > >Hello All, > >We have a cluster of three machines (Debian 6.0, 4GB RAM, >riak_1.0.2-1_amd64.deb, n_val: 3) that serves an appli

Re: Pending transfers when joining 1.0.3 node to 1.0.0 cluster

2012-01-18 Thread Aphyr
If partition transfer is blocked awaiting [] (as opposed to [kv_vnode] or whatever), There's a snippet in there that might be helpful. --Kyle On Jan 18, 2012, at 1:43 PM, Fredrik Lindström wrote: > After some digging I found a sug

Re: Pending transfers when joining 1.0.3 node to 1.0.0 cluster

2012-01-18 Thread Aphyr
Did you try riak_core_ring_manager:force_update() and force_handoffs() on the old partition owner as well as the new one? Can't recall off the top of my head which one needs to execute that handoff. --Kyle On Jan 18, 2012, at 2:08 PM, Fredrik Lindström wrote: > Thanks for the respon

Re: Pending transfers when joining 1.0.3 node to 1.0.0 cluster

2012-01-18 Thread Aphyr
he owner of 0 partitions. riak-admin ring_status lists various pending ownership handoffs, all of them are between our 3 original nodes. The new node is not mentioned anywhere. I'm really curious about the current state of our cluster. It does look rather exciting :) /F --

Re: Riak for eCommerce

2012-01-21 Thread Aphyr
Side question: dynamo exposes both partial and fully consistent reads. Does anyone know what the conflict semantics are? Last write wins? Actual mvcc? Ahmed Al-Saadi wrote: >I suppose this speaks to DynamoDB's consistent read feature that Vishal >pointed out (though I believe statebox is more

Re: Is Riak a good solution for this problem?

2012-02-12 Thread Aphyr
On 02/12/2012 03:27 AM, Marco Monteiro wrote: I'm considering Riak for the statistics of a site that is approaching a billion page views per month. The plan is to log a little information about each page view and then to query that data. Honestly, I wouldn't use stock Riak for this; the MR

Re: A couple of questions about Riak

2012-02-16 Thread Aphyr
On 02/16/2012 01:07 AM, Jerome Renard wrote: Hello, I am really interested in Riak but I would like to know if my goals can be achieved with it for my project. The use case is the following : - I need to support 10 000 writes/second minimum. Object size will be from 1kb to 5kb Definitely. 1

Re: 1.1 upgrade woes

2012-02-22 Thread Aphyr
I also discovered MR issues during a rolling upgrade to 1.1.0 last night. We had so many MR errors that the 1.1 node crashed altogether, and I had to roll it back to 1.0.3. Basho support is working on that problem. 2012-02-22 00:56:16.429 [error] <0.1615.0> gen_server riak_pipe_vnode_master t

Re: Riak for Messaging Project Question

2012-02-22 Thread Aphyr
On 02/22/2012 02:10 PM, wrote: 1. Is Riak a good fit for this solution going up to and beyond 20 million users (i.e. terabytes upon terabytes added per year)? The better question might be: what do you actually plan to do with that much data? 2. I plan to use 2i, whic

Re: Questions on configuring public and private ips for riak on ubuntu

2012-03-04 Thread Aphyr
ngs does anyone want to help with the question. Thanks, Tim -Original Message- From: "Aphyr" Sent: Sunday, March 4, 2012 10:41pm To: "Tim Robinson" Subject: Re: Questions on configuring public and private ips for riak on ubuntu I can get SSH access over Riak'

Re: Need quick fix for > lineno":466, "message":"SyntaxError: syntax error", "source":"()

2012-03-06 Thread Aphyr
On 03/06/2012 08:11 AM, Ivaylo Panitchkov wrote: Hello guys, We are in production and noticed ALL of the M/R requests failing right after a bulk delete with the following response returned back: lineno":466,"message":"SyntaxError: syntax error","source":"() The problem is now persistent even

Re: Riak problems on Ubuntu (novice user)

2012-03-06 Thread Aphyr
First, for security reasons, don't run Riak on a public IP. Access it through an application proxy, a VPN, or an SSH tunnel if you need. Second, when you change the name of a node, you need to run riak-admin reip r...@my.old.ip ... to update the ring file with the new node name.

Re: Listing buckets causes nodes to misbehave

2012-03-15 Thread Aphyr
Yup, list-keys and list-buckets do this for us too, since Riak 0.14. Bitcask, 6 nodes, physical hardware, 1024 partitions, 100-300 million keys with n_val 3. --Kyle On 03/15/2012 11:17 AM, Armon Dadgar wrote: We are currently running Riak 1.1 in production, using LevelDB with snappy compres

Re: Riak Adoption - What can we do better?

2012-04-21 Thread Aphyr
On 04/21/2012 09:07 AM, Les Mikesell wrote: On Fri, Apr 20, 2012 at 5:00 PM, Kyle Kingsbury wrote: OK, so how about Statebox? We use timestamps to ameliorate the GC problem so long as a given time window. Our hosts are running NTP so it's all cool, ya? Wrong. One of your hosts is not running