I was looking at the baseball MR example on the blog.
http://basho.com/blog/technical/2011/01/20/Baseball-Batting-Averages-Riak-Map-Reduce/
One thing I was wondering was how the file split mechanism is aware of record
lengths. It doesn't look like the author is using any particular split functio
We're working on a test where we want to add batches of keys to Riak. We're
using the LevelDB backend with 1.0. One suggestion we heard was to take down a
node, batch insert directly into the backing store, and then bring the node up
again.
Could someone give us some more details on this? How w
use one of the clients to load the data
>
> 3. archive the data dir on all nodes
>
> Next time you want to stand up a cluster with the same data simply unarchive
> the data dirs before starting the cluster.
>
> -Ryan
>
>
> On Fri, Oct 7, 2011 at 9:29 PM, Nate Laws
A related question -- does the secondary index implementation make some attempt
to cluster "nearby" integer keys for range queries? In other words, if I have
an integer secondary index on a set of keys, is this taken into account in the
partition function?
Since you have to query the full cover
nd setting w to a low value.
>
> Jeremiah Peschka
> Founder, Brent Ozar PLF
>
> On Oct 11, 2011 8:50 AM, "Nate Lawson" wrote:
> It's not just for testing. We found that loading/updating keys was somewhat
> slow, even with a low replication count. So we're ex
The fact that it allows any client to execute arbitrary code as the database
user, on the database server? You can call 'os:cmd' to shell out from a M-R
job. You can't do that directly in MySQL.
I think this requirement should be extended to: "Don't allow clients to connect
who aren't equivalen
On Oct 20, 2011, at 8:35 PM, Antonio Rohman Fernandez wrote:
> Hi All,
>
> Imagine that I store "some_item" with key "some_key" into a "products" bucket
> with a category "whatever" and a price "300".
> I save it using cURL like this:
>
> $data = 'some_item';
> $key = 'some_key';
>
I know Bitcask has the expiry_secs option for expiring keys, but what about
LevelDB? We're thinking of using Luwak as a file cache frontend to S3, and it
would be nice for older entries to be deleted in LRU order as we store newer
files. This could be implemented as a storage quota also (high/lo
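
For illustration only, here is a minimal Python sketch of quota-driven LRU eviction of the kind this question describes. Nothing in it is a Riak, Luwak, or LevelDB API: the `QuotaLRU` class and its `high_bytes` and `low_bytes` thresholds are hypothetical names, and the two-threshold policy (start evicting above one size, stop below another) is an assumption about how such a quota might work.

```python
from collections import OrderedDict


class QuotaLRU:
    """In-memory stand-in for a size-bounded cache (hypothetical, not a Riak API)."""

    def __init__(self, high_bytes, low_bytes):
        if low_bytes > high_bytes:
            raise ValueError("low_bytes must not exceed high_bytes")
        self.high = high_bytes      # eviction starts once usage exceeds this
        self.low = low_bytes        # eviction stops once usage is at or below this
        self.items = OrderedDict()  # ordered oldest access -> newest access
        self.used = 0               # total bytes of stored values

    def get(self, key):
        value = self.items.get(key)
        if value is not None:
            # A read counts as a use, so move the entry to the "newest" end.
            self.items.move_to_end(key)
        return value

    def put(self, key, value):
        old = self.items.pop(key, None)
        if old is not None:
            self.used -= len(old)
        self.items[key] = value
        self.used += len(value)
        if self.used > self.high:
            self._evict()

    def _evict(self):
        # Drop least recently used entries, always keeping the newest one.
        while self.used > self.low and len(self.items) > 1:
            _, dropped = self.items.popitem(last=False)
            self.used -= len(dropped)


cache = QuotaLRU(high_bytes=10, low_bytes=8)
cache.put("a", b"aaaa")
cache.put("b", b"bbbb")
cache.get("a")           # touch "a" so that "b" becomes the oldest entry
cache.put("c", b"cccc")  # usage reaches 12 bytes, which triggers eviction
print(sorted(cache.items), cache.used)
```

In a real deployment the access order and sizes would have to be tracked somewhere durable, which this sketch does not attempt.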
On Nov 7, 2011, at 1:23 PM, andrew cooke wrote:
> Apologies if this is a dumb idea, or I am asking in the wrong place. I'm
> muddling around trying to understand various bits of technology while piecing
> together a possible project. So feel free to tell me I'm wrong :o)
>
> I am considering ho
On Nov 7, 2011, at 5:45 PM, Greg Pascale wrote:
> Hi,
>
> I'm thinking about using 2i for a certain piece of my system, but I'm worried
> that the document-based partitioning may make it suboptimal.
>
> The issue is that the secondary fields I want to query over (email and
> username) are uniq
olution like this. Suddenly a user create
> operation requires n writes to be considered a success. If one fails, I need
> to delete the others, etc… It quickly becomes a pain.
>
> I don't know what you mean by "some relationship between the keys"
>
> --
> Greg
'll never have half of a user or some dangling index or anything. The
> validity checks at read-time ensure this. But, some periodically run task
> that cleans up your DB with MapReduce operations would be smart.
>
> Maybe there's a better way to do this, but I thought I
We have been looking into ways to cluster keys to benefit from the LevelDB
backend's prefix compression. If we were doing a batch of lookups and the keys
from a given document could be grouped on a partition, they could be read with
less disk IO. However, consistent hashing in Riak currently spr
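
To make the key-spreading effect concrete, here is a small Python sketch that hashes keys onto a ring of equal partitions. It is a simplification under stated assumptions: Riak hashes the bucket/key pair with SHA-1 onto a 160-bit ring, but the exact byte encoding used below (`bucket/key` as UTF-8), the ring size of 64, and the `partition_for` helper are illustrative choices, not Riak's actual code.

```python
import hashlib

RING_SIZE = 64          # assumed number of partitions
RING_SPACE = 2 ** 160   # size of the SHA-1 output space


def partition_for(bucket, key, ring_size=RING_SIZE):
    """Return the index of the partition that owns this bucket/key pair."""
    # Assumption: hash "bucket/key" as UTF-8; Riak's real encoding differs.
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    # Each partition owns an equal, contiguous slice of the hash space.
    return position // (RING_SPACE // ring_size)


# 100 keys that all share the prefix "doc42:".
keys = [f"doc42:field{i:03d}" for i in range(100)]
partitions = {partition_for("docs", k) for k in keys}
print(f"{len(keys)} keys sharing a prefix landed on {len(partitions)} partitions")
```

Because the hash scatters keys regardless of their prefix, related keys end up on many partitions, so they cannot sit next to each other in a single backend's sorted key space.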
On Nov 9, 2011, at 3:33 PM, Elias Levy wrote:
> On Wed, Nov 9, 2011 at 3:29 PM, Phil Stanhope wrote:
> Tread carefully here ... by forcing locality ... you will sacrifice high
> availability by algorithmically creating a bias and a single point of failure
> in the cluster.
>
> You don't have
On Nov 9, 2011, at 3:49 PM, Nate Lawson wrote:
> On Nov 9, 2011, at 3:33 PM, Elias Levy wrote:
>
>> On Wed, Nov 9, 2011 at 3:29 PM, Phil Stanhope wrote:
>> Tread carefully here ... by forcing locality ... you will sacrifice high
>> availability by algorithmically crea
On Nov 10, 2011, at 8:25 AM, Nitish Sharma wrote:
> Hi,
> I am trying to install Riak's python client library using Pip. But it throws
> an IOError while installing: IOError: [Errno 2] No such file or directory:
> 'protobuf/setup.py'. Apparently, a lot of guys are facing the same problem.
> The pr
On Nov 10, 2011, at 4:04 PM, Greg Stein wrote:
> On Thu, Nov 10, 2011 at 11:51, Nate Lawson wrote:
>> ...
>> BTW, are there any plans for the Riak python client to use the protobuf C
>> library directly via ctypes? The pure python implementation of protobuf
>> se
On Nov 16, 2011, at 5:50 AM, David Smith wrote:
> On Tue, Nov 15, 2011 at 6:01 PM, Jeremy Raymond wrote:
>> I've seen issues when leveldb runs out of file handles. The leveldb
>> log then fills with error messages.
>
> Hmm -- this could be. However, I would expect that the Erlang VM would
> even
On Nov 16, 2011, at 9:57 AM, Rusty Klophaus wrote:
> Now that you've had a few weeks to investigate and experiment with
> Secondary Indexes, I'm hoping to hear about your experiences to help
> us focus future development efforts most effectively:
> • Have you tried Secondary Indexes?
>
On Nov 16, 2011, at 11:41 AM, Rusty Klophaus wrote:
>> 2. We need a guaranteed order of inputs from a 2I query. If we select on a
>> range, each key we get on a given node in the M-R job should be ordered
>> according to the 2I values. Of course we understand that keys won't be
>> ordered acros