On Tue, Jan 5, 2016 at 5:52 PM, Jonathan Haddad wrote:
> You could keep a "num_buckets" value associated with the client's account,
> which can be adjusted accordingly as usage increases.
>
Yes, but the adjustment problem is tricky when there are multiple
concurrent writers. What happens when you...
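To make that concern concrete (my illustration, not Jim's wording): if the bucket is derived as hash(object_id) % num_buckets, as in the sketch under Jonathan's message below, then a reader and a writer that momentarily disagree about num_buckets also disagree about which partitions hold a customer's rows. All names here are hypothetical.

    # Illustration (not from the thread): why adjusting num_buckets under
    # concurrent readers and writers is awkward when the bucket is derived
    # as hash(object_id) % num_buckets.
    import uuid

    def bucket_for(object_id, num_buckets):
        return object_id.int % num_buckets

    oid = uuid.UUID("c0ffee00-0000-4000-8000-000000000007")

    # A writer that already sees num_buckets bumped from 4 to 8 stores the
    # row under bucket 7 ...
    written_bucket = bucket_for(oid, 8)            # -> 7
    # ... while a reader still holding the old value fans out over buckets
    # 0-3 only, so it never visits bucket 7 and misses the new row.
    stale_reader_buckets = range(4)
    print(written_bucket in stale_reader_buckets)  # False

Growing num_buckets safely would therefore need some coordination, or a read path that tolerates both the old and the new count during the change, which seems to be the problem Jim is raising.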
You could keep a "num_buckets" value associated with the client's account,
which can be adjusted accordingly as usage increases.
On Tue, Jan 5, 2016 at 2:17 PM Jim Ancona wrote:
> On Tue, Jan 5, 2016 at 4:56 PM, Clint Martin <
> clintlmar...@coolfiretechnologies.com> wrote:
>
>> What sort of data...
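A minimal sketch (not from the thread; all names are hypothetical) of what Jonathan's suggestion could look like: a composite partition key of (customer_id, bucket), with the bucket derived from the object's UUID and the account's current num_buckets.

    # Hypothetical bucketing sketch: the partition key becomes
    # (customer_id, bucket), the object UUID stays the clustering column,
    # and num_buckets is read from the customer's account record.
    import uuid

    def bucket_for(object_id, num_buckets):
        # Deterministic: the same object maps to the same bucket for as
        # long as the account's num_buckets stays unchanged.
        return object_id.int % num_buckets

    def partition_key(customer_id, object_id, num_buckets):
        return (customer_id, bucket_for(object_id, num_buckets))

    oid = uuid.uuid4()
    print(partition_key("customer-42", oid, num_buckets=4))

Reads for the customer would then query buckets 0 through num_buckets - 1 and merge the results client-side.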
On Tue, Jan 5, 2016 at 4:56 PM, Clint Martin <
clintlmar...@coolfiretechnologies.com> wrote:
> What sort of data is your clustering key composed of? That might help some
> in determining a way to achieve what you're looking for.
>
Just a UUID that acts as an object identifier.
>
> Clint
> On Jan 5, 2016 2:28 PM, "Jim Ancona" wrote: ...
What sort of data is your clustering key composed of? That might help some
in determining a way to achieve what you're looking for.
Clint
On Jan 5, 2016 2:28 PM, "Jim Ancona" wrote:
> Hi Nate,
>
> Yes, I've been thinking about treating customers as either small or big,
> where "small" ones have
Hi Nate,
Yes, I've been thinking about treating customers as either small or big,
where "small" ones have a single partition and big ones have 50 (or
whatever number I need to keep sizes reasonable). There's still the problem
of how to handle a small customer who becomes too big, but that will happen...
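A sketch of that two-tier idea (my own, with hypothetical names): num_buckets stays 1 for small customers and jumps to a fixed larger count for big ones, and the read path simply enumerates one partition per bucket.

    # Hypothetical two-tier bucketing: most customers get one partition,
    # "big" customers are spread across BIG_CUSTOMER_BUCKETS partitions.
    BIG_CUSTOMER_BUCKETS = 50

    def num_buckets_for(customer_id, big_customers):
        return BIG_CUSTOMER_BUCKETS if customer_id in big_customers else 1

    def partitions_to_read(customer_id, big_customers):
        # One query per (customer_id, bucket) partition, merged client-side.
        n = num_buckets_for(customer_id, big_customers)
        return [(customer_id, b) for b in range(n)]

    big = {"megacorp"}
    print(len(partitions_to_read("small-shop", big)))  # 1
    print(len(partitions_to_read("megacorp", big)))    # 50

Promoting a customer from 1 bucket to 50 would still mean either rewriting its existing rows into the new buckets or having readers check both bucket counts during the transition, which is the open problem Jim mentions.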
Hi Jack,
Thanks for your response. My answers inline...
On Tue, Jan 5, 2016 at 11:52 AM, Jack Krupansky wrote:
> Jim, I don't quite get why you think you would need to query 50 partitions
> to return merely hundreds or thousands of rows. Please elaborate. I mean,
> sure, for that extreme 100th percentile...

In this case, 99% of my data could fit in a single 50 MB partition. But if
I use the standard approach, I have to split my partitions into 50 pieces
to accommodate the largest data. That means that to query the 700 rows for
my median case, I have to read 50 partitions instead of one.
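Spelling that read amplification out with the numbers from Jim's example (a rough sketch, not a measurement):

    # Median customer from Jim's example: ~700 rows. Splitting every
    # customer into 50 buckets (sized for the largest customer) multiplies
    # the partition reads needed to fetch those rows.
    median_rows = 700
    buckets = 50

    unsplit_partition_reads = 1
    split_partition_reads = buckets                   # 50 reads
    rows_per_split_partition = median_rows / buckets  # ~14 rows each

    print(unsplit_partition_reads, split_partition_reads, rows_per_split_partition)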
Jim, I don't quite get why you think you would need to query 50 partitions
to return merely hundreds or thousands of rows. Please elaborate. I mean,
sure, for that extreme 100th percentile, yes, you would query a lot of
partitions, but for the 90th percentile it would be just one. Even the 99th
percentile...
Thanks for responding!
My natural partition key is a customer id. Our customers have widely
varying amounts of data. Since the vast majority of them have data that's
small enough to fit in a single partition, I'd like to avoid imposing
unnecessary overhead on the 99% just to avoid issues with the...
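For concreteness, one possible shape of the bucketed table (a guess at the schema, not Jim's actual design), with the customer id plus a bucket as the composite partition key and the object UUID as the clustering column:

    # Hypothetical CQL DDL, kept as a string so the examples stay in Python.
    BUCKETED_TABLE_DDL = """
    CREATE TABLE objects_by_customer (
        customer_id text,
        bucket      int,     -- derived from the object UUID and num_buckets
        object_id   uuid,    -- the clustering key mentioned in the thread
        payload     blob,
        PRIMARY KEY ((customer_id, bucket), object_id)
    );
    """
    print(BUCKETED_TABLE_DDL)

Small customers would only ever use bucket 0 and pay no extra read cost; only customers flagged as big fan out over multiple buckets.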
You should endeavor to use a repeatable method of segmenting your data.
Swapping partitions every time you "fill one" seems like an anti-pattern to
me, but I suppose it really depends on what your primary key is. Can you
share some more information on this?
In the past I have utilized the consistent...