Thanks for the answers, Robert.

Shard size isn't determined by the amount of data, but by access patterns. An
example of an anti-pattern that will cause a shard size imbalance would be
writing an entity every time a user takes an action, but never doing
anything with that data afterwards. Since the data just kind of accumulates,
the shard never splits (unless it hits some hardware bound, which I haven't
seen happen yet with GAE data).

As a final note, it takes a LOT of writes before this sort of thing
happens, and I sometimes regret writing that blog post because anytime you
write a blog post about scalability patterns, it invites people to
prematurely implement them (Brett Slatkin's video generated an endless
number of questions from people doing sub-1 QPS). We've done launches on
the YouTube/Google homepage (
http://blog.golang.org/2011/12/from-zero-to-go-launching-on-google.html)
that haven't required us to make these changes because they did fine under
load testing. I'd invest more energy in figuring out the right way to load
test, then trying to figure out the bottlenecks when you hit limits with
real data.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com



On Wed, Feb 1, 2012 at 9:19 PM, Robert Kluin <[email protected]> wrote:

> So I'd say don't worry about it unless you actually hit this problem.
> If you do know you'll hit it, see if you have a way to "shard" the
> timestamp, by account, user, or region, etc..., to relieve some of the
> pressure.  If you must have a global timestamp, I'd say keep it as
> simple as possible, until you hit the issue.  At that point you can
> figure out a fix.
>
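
A minimal sketch of the "shard the timestamp" idea from the paragraph above,
in plain Python rather than against the datastore API. The names NUM_SHARDS,
shard_for, sharded_timestamp and shard_ranges are hypothetical, and the
"NN|timestamp" string layout is just one assumed way to do it. The point is
that the indexed value no longer increases monotonically across all writes,
at the cost of one range query per shard when reading.

```python
import hashlib
from datetime import datetime

NUM_SHARDS = 16  # hypothetical value; the two-digit prefix assumes < 100 shards

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"  # fixed width, so strings sort chronologically


def shard_for(account_id, num_shards=NUM_SHARDS):
    """Map an account id to a stable shard number."""
    digest = hashlib.md5(account_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def sharded_timestamp(account_id, when, num_shards=NUM_SHARDS):
    """Build the value to index in place of the bare timestamp."""
    return "%02d|%s" % (shard_for(account_id, num_shards), when.strftime(TS_FORMAT))


def shard_ranges(start, end, num_shards=NUM_SHARDS):
    """Return one (low, high) string range per shard for a time-window query."""
    return [
        ("%02d|%s" % (s, start.strftime(TS_FORMAT)),
         "%02d|%s" % (s, end.strftime(TS_FORMAT)))
        for s in range(num_shards)
    ]
```

A query for a time window then becomes one range scan per entry returned by
shard_ranges(), with the results merged in application code.
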
> When I have timestamps on high write-rate entities that are
> non-critical, for example "expiration" times that are used only for
> cleanup, I'll sometimes add a random jitter of several hours to spread
> the writes out a bit.  I'd be surprised if changing it by a few
> seconds helped much -- but it could.  Keep in mind, there will already
> be some degree of randomness since the instance clocks have some
> slight variation.  If you're hitting this issue, I'd give it a shot
> though.  If it works it could at least buy you some time to get a
> better fix.
>
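
A rough sketch of the jitter approach described above for non-critical
"expiration" timestamps. The function name jittered_expiration and the
three-hour default are assumptions made for illustration ("several hours" is
all the message says), and the jitter here only ever pushes the expiration
later, which is a design choice of this sketch rather than something stated
in the thread.

```python
import random
from datetime import datetime, timedelta


def jittered_expiration(now, ttl, max_jitter=timedelta(hours=3), rng=random):
    """Return now + ttl plus a random offset in [0, max_jitter].

    Spreading the stored value over a window means consecutive writes
    no longer carry nearly identical index values.
    """
    offset_seconds = rng.uniform(0, max_jitter.total_seconds())
    return now + ttl + timedelta(seconds=offset_seconds)


# Example: an entity written now that should be cleaned up in about a day.
expires_at = jittered_expiration(datetime.utcnow(), timedelta(days=1))
```

Since these timestamps are only used for cleanup, an entity living a few
hours longer than its nominal TTL should be harmless.
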
> I don't think there is a fixed number of rows per shard.  I think it
> is split up by data size, and I don't think the exact number is
> publicly documented.  Maybe you can roughly figure it out via
> experimentation.
>
>
> Robert
>
>
> On Wed, Feb 1, 2012 at 02:28, WGuerlich <[email protected]> wrote:
> > I know, I'm going to hit the write limit with a timestamp I need to
> > update on every write and which needs to be indexed.
> >
> > As an alternative to sharding: What do you think about adding time
> > jitter to the timestamp, that is, changing the time randomly by a couple
> > of seconds? In my application, the timestamp being off by a couple of
> > seconds wouldn't pose a problem.
> >
> > Now what I need to know is: How many index entries can I expect to go
> > into one tablet? This is needed to estimate the amount of jitter
> > necessary to avoid hitting the same tablet on every write.
> >
> > Any insights on this?
> >
> > Wolfram
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Google App Engine" group.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msg/google-appengine/-/r0SVTq6i4iEJ.
> >
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/google-appengine?hl=en.
>
