Hey Ikai,
  That's what I was actually remembering, that a tablet would be
around 100 or 200 MB.  I couldn't remember where I'd read that though
-- thanks for the link.


Robert




On Thu, Feb 2, 2012 at 22:06, Ikai Lan (Google) <[email protected]> wrote:
> "A Bigtable cluster stores a number of tables. Each table consists of
> a set of tablets, and each tablet contains all data associated with a
> row range. Initially, each table consists of just one tablet. As a
> table grows, it is automatically split into multiple tablets, each
> approximately 100-200 MB in size by default."
>
> Also: 100-200 MB is the default configuration. It might not be what we have
> configured. This is *really really* detailed, though.
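If that 100-200 MB default holds, the "how many index entries fit in one tablet" question becomes simple arithmetic. A rough sketch — the ~100-byte average index-entry size below is purely my assumption (an index row holds roughly a kind name, the indexed property value, and an entity key), not a documented figure:

```python
# Back-of-the-envelope guess at index entries per tablet, using the
# 100-200 MB default tablet size quoted above.
TABLET_SIZE_BYTES = 150 * 1024 * 1024   # midpoint of the 100-200 MB default
ENTRY_SIZE_BYTES = 100                  # assumed average index-entry size

entries_per_tablet = TABLET_SIZE_BYTES // ENTRY_SIZE_BYTES
print(entries_per_tablet)  # on these assumptions, roughly 1.5 million entries
```

So under these assumptions a tablet holds on the order of a million-plus index entries; the real number depends entirely on how big your indexed values and keys are.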
>
> --
> Ikai Lan
> Developer Programs Engineer, Google App Engine
> plus.ikailan.com
>
>
>
> On Thu, Feb 2, 2012 at 7:04 PM, Ikai Lan (Google) <[email protected]> wrote:
>>
>> Okay, looks like these whitepapers are at research.google.com now:
>>
>> http://research.google.com/archive/bigtable.html
>>
>> --
>> Ikai Lan
>> Developer Programs Engineer, Google App Engine
>> plus.ikailan.com
>>
>>
>>
>> On Thu, Feb 2, 2012 at 7:03 PM, Ikai Lan (Google) <[email protected]>
>> wrote:
>>>
>>> Robert, I'll see what I can do. No promises on an ETA. It isn't in one of
>>> the white papers?
>>>
>>> http://labs.google.com/papers/bigtable.html
>>>
>>> Oh what the heck ... the link is broken. Let me see what's up.
>>>
>>> --
>>> Ikai Lan
>>> Developer Programs Engineer, Google App Engine
>>> plus.ikailan.com
>>>
>>>
>>>
>>> On Thu, Feb 2, 2012 at 1:56 PM, Robert Kluin <[email protected]>
>>> wrote:
>>>>
>>>> Yeah Ikai is completely correct.  I should have noted more clearly
>>>> that this is not something I even waste time worrying about until I
>>>> think I'm actually hitting it, which is not often.  In the few cases
>>>> where I do think I've bumped into it, it is a
>>>> writing-thousands-of-entities-per-second type of thing -- which is
>>>> not very common.
>>>>
>>>> It is interesting that sharding is determined by access patterns.  Is
>>>> that something you can elaborate on at all?  ;)
>>>>
>>>>
>>>> Robert
>>>>
>>>>
>>>>
>>>> On Thu, Feb 2, 2012 at 16:14, Ikai Lan (Google) <[email protected]>
>>>> wrote:
>>>> > Thanks for the answers, Robert.
>>>> >
>>>> > Shard size isn't determined by amount of data, but by access patterns.
>>>> > An
>>>> > example of an anti-pattern that will cause a shard size imbalance
>>>> > would be
>>>> > an entity write every time a user takes an action - but you never do
>>>> > anything with this data. Since the data just kind of accumulates, the
>>>> > shard
>>>> > never splits (unless it hits some hardware bound, which I've never
>>>> > really
>>>> > seen happen yet with GAE data).
>>>> >
>>>> > As a final note, it takes a LOT of writes before this sort of thing
>>>> > happens,
>>>> > and I sometimes regret writing that blog post because anytime you
>>>> > write a
>>>> > blog post about scalability patterns, it invites people to prematurely
>>>> > implement them (Brett Slatkin's video generated an endless number of
>>>> > questions from people doing sub 1 QPS). We've done launches on the
>>>> > YouTube/Google homepage
>>>> >
>>>> > (http://blog.golang.org/2011/12/from-zero-to-go-launching-on-google.html)
>>>> > that haven't required us to make these changes because they did fine
>>>> > under
>>>> > load testing. I'd invest more energy in figuring out the right way to
>>>> > load
>>>> > test, than trying to figure out the bottlenecks when you hit limits
>>>> > with
>>>> > real data.
>>>> >
>>>> > --
>>>> > Ikai Lan
>>>> > Developer Programs Engineer, Google App Engine
>>>> > plus.ikailan.com
>>>> >
>>>> >
>>>> >
>>>> > On Wed, Feb 1, 2012 at 9:19 PM, Robert Kluin <[email protected]>
>>>> > wrote:
>>>> >>
>>>> >> So I'd say don't worry about it unless you actually hit this problem.
>>>> >> If you do know you'll hit it, see if you have a way to "shard" the
>>>> >> timestamp, by account, user, or region, etc..., to relieve some of
>>>> >> the
>>>> >> pressure.  If you must have a global timestamp, I'd say keep it as
>>>> >> simple as possible, until you hit the issue.  At that point you can
>>>> >> figure out a fix.
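The "shard the timestamp" idea above can be sketched roughly like this. Everything here is illustrative, not an App Engine API: the point is just that writes fan out over N rows instead of one, and reading the "global" value means combining the shards.

```python
import random

# Hypothetical sketch of sharding a hot indexed timestamp: keep
# NUM_SHARDS rows (here further split per account, as suggested above)
# and write to one of them at random, so consecutive writes don't all
# hit the same index row.
NUM_SHARDS = 20  # assumed shard count; size it to your write rate

def timestamp_shard_key(account_id):
    """Pick a random shard row for this account's timestamp write."""
    return "ts-%s-%d" % (account_id, random.randrange(NUM_SHARDS))

def latest_timestamp(shard_values):
    """Reading the 'global' timestamp means taking the max over shards."""
    return max(shard_values)
```

The read side gets more expensive (you fetch all N shards), which is the usual trade-off with write sharding.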
>>>> >>
>>>> >> When I have timestamps on high write-rate entities that are
>>>> >> non-critical, for example "expiration" times that are used only for
>>>> >> cleanup, I'll sometimes add a random jitter of several hours to
>>>> >> spread
>>>> >> the writes out a bit.  I'd be surprised if changing it by a few
>>>> >> seconds helped much -- but it could.  Keep in mind, there will
>>>> >> already
>>>> >> be some degree of randomness since the instance clocks have some
>>>> >> slight variation.  If you're hitting this issue, I'd give it a shot
>>>> >> though.  If it works it could at least buy you some time to get a
>>>> >> better fix.
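Robert's jitter trick for non-critical expiration times, in sketch form. The four-hour window is just an assumed example; tune it to your write rate:

```python
import random
from datetime import datetime, timedelta

# Spread non-critical "expiration" timestamps over a window of several
# hours so consecutive writes land on different index rows.
JITTER_WINDOW = timedelta(hours=4)  # assumed window size

def jittered_expiration(base):
    """Return base plus a uniformly random offset in [0, JITTER_WINDOW)."""
    offset = random.uniform(0, JITTER_WINDOW.total_seconds())
    return base + timedelta(seconds=offset)
```

Since the jitter only pushes cleanup later, never earlier, nothing expires before it should -- it just expires up to a few hours late, which is fine for cleanup-only timestamps.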
>>>> >>
>>>> >> I don't think there is a fixed number of rows per shard.  I think it
>>>> >> is split up by data size, and I don't think the exact number is
>>>> >> publicly documented.  Maybe you can roughly figure it out via
>>>> >> experimentation.
>>>> >>
>>>> >>
>>>> >> Robert
>>>> >>
>>>> >>
>>>> >> On Wed, Feb 1, 2012 at 02:28, WGuerlich <[email protected]> wrote:
>>>> >> > I know, I'm going to hit the write limit with a timestamp I need to
>>>> >> > update
>>>> >> > on every write and which needs to be indexed.
>>>> >> >
>>>> >> > As an alternative to sharding: What do you think about adding time
>>>> >> > jitter to
>>>> >> > the timestamp, that is, changing time randomly by a couple seconds?
>>>> >> > In
>>>> >> > my
>>>> >> > application the timestamp being off by a couple of seconds wouldn't
>>>> >> > pose a
>>>> >> > problem.
>>>> >> >
>>>> >> > Now what I need to know is: How many index entries can I expect to
>>>> >> > go
>>>> >> > into
>>>> >> > one tablet? This is needed to estimate the amount of jitter
>>>> >> > necessary to
>>>> >> > avoid hitting the same tablet on every write.
>>>> >> >
>>>> >> > Any insights on this?
>>>> >> >
>>>> >> > Wolfram
>>>> >> >
>>>> >> > --
>>>> >> > You received this message because you are subscribed to the Google
>>>> >> > Groups
>>>> >> > "Google App Engine" group.
>>>> >> > To view this discussion on the web visit
>>>> >> > https://groups.google.com/d/msg/google-appengine/-/r0SVTq6i4iEJ.
>>>> >> >
>>>> >> > To post to this group, send email to
>>>> >> > [email protected].
>>>> >> > To unsubscribe from this group, send email to
>>>> >> > [email protected].
>>>> >> > For more options, visit this group at
>>>> >> > http://groups.google.com/group/google-appengine?hl=en.
>>>> >>
>>>> >>
>>>> >
>>>>
>>>>
>>>
>>
>
