Okay, looks like these whitepapers are at research.google.com now: http://research.google.com/archive/bigtable.html
--
Ikai Lan
Developer Programs Engineer, Google App Engine
plus.ikailan.com

On Thu, Feb 2, 2012 at 7:03 PM, Ikai Lan (Google) <[email protected]> wrote:

> Robert, I'll see what I can do. No promises on an ETA. It isn't in one of
> the white papers?
>
> http://labs.google.com/papers/bigtable.html
>
> Oh what the heck ... the link is broken. Let me see what's up.
>
> --
> Ikai Lan
> Developer Programs Engineer, Google App Engine
> plus.ikailan.com
>
> On Thu, Feb 2, 2012 at 1:56 PM, Robert Kluin <[email protected]> wrote:
>
>> Yeah Ikai is completely correct. I should have noted more clearly
>> that this is not something I even waste time worrying about until I
>> think I'm actually hitting it, which is not often. In the few cases
>> where I do think I've bumped into it, it is a writing thousands of
>> entities per second type of thing -- which is not very common.
>>
>> It is interesting that sharding is determined by access patterns. Is
>> that something you can elaborate on at all? ;)
>>
>> Robert
>>
>> On Thu, Feb 2, 2012 at 16:14, Ikai Lan (Google) <[email protected]> wrote:
>> > Thanks for the answers, Robert.
>> >
>> > Shard size isn't determined by amount of data, but by access patterns.
>> > An example of an anti-pattern that will cause a shard size imbalance
>> > would be an entity write every time a user takes an action - but you
>> > never do anything with this data. Since the data just kind of
>> > accumulates, the shard never splits (unless it hits some hardware
>> > bound, which I've never really seen happen yet with GAE data).
>> >
>> > As a final note, it takes a LOT of writes before this sort of thing
>> > happens, and I sometimes regret writing that blog post because anytime
>> > you write a blog post about scalability patterns, it invites people to
>> > prematurely implement them (Brett Slatkin's video generated an endless
>> > number of questions from people doing sub 1 QPS).
>> > We've done launches on the YouTube/Google homepage
>> > (http://blog.golang.org/2011/12/from-zero-to-go-launching-on-google.html)
>> > that haven't required us to make these changes because they did fine
>> > under load testing. I'd invest more energy in figuring out the right
>> > way to load test, then trying to figure out the bottlenecks when you
>> > hit limits with real data.
>> >
>> > --
>> > Ikai Lan
>> > Developer Programs Engineer, Google App Engine
>> > plus.ikailan.com
>> >
>> > On Wed, Feb 1, 2012 at 9:19 PM, Robert Kluin <[email protected]> wrote:
>> >>
>> >> So I'd say don't worry about it unless you actually hit this problem.
>> >> If you do know you'll hit it, see if you have a way to "shard" the
>> >> timestamp, by account, user, or region, etc..., to relieve some of the
>> >> pressure. If you must have a global timestamp, I'd say keep it as
>> >> simple as possible, until you hit the issue. At that point you can
>> >> figure out a fix.
>> >>
>> >> When I have timestamps on high write-rate entities that are
>> >> non-critical, for example "expiration" times that are used only for
>> >> cleanup, I'll sometimes add a random jitter of several hours to spread
>> >> the writes out a bit. I'd be surprised if changing it by a few
>> >> seconds helped much -- but it could. Keep in mind, there will already
>> >> be some degree of randomness since the instance clocks have some
>> >> slight variation. If you're hitting this issue, I'd give it a shot
>> >> though. If it works it could at least buy you some time to get a
>> >> better fix.
>> >>
>> >> I don't think there is a fixed number of rows per shard. I think it
>> >> is split up by data size, and I don't think the exact number is
>> >> publicly documented. Maybe you can roughly figure it out via
>> >> experimentation.
>> >>
>> >> Robert
>> >>
>> >> On Wed, Feb 1, 2012 at 02:28, WGuerlich <[email protected]> wrote:
>> >> > I know, I'm going to hit the write limit with a timestamp I need to
>> >> > update on every write and which needs to be indexed.
>> >> >
>> >> > As an alternative to sharding: What do you think about adding time
>> >> > jitter to the timestamp, that is, changing time randomly by a couple
>> >> > seconds? In my application the timestamp being off by a couple
>> >> > seconds wouldn't pose a problem.
>> >> >
>> >> > Now what I need to know is: How many index entries can I expect to
>> >> > go into one tablet? This is needed to estimate the amount of jitter
>> >> > necessary to avoid hitting the same tablet on every write.
>> >> >
>> >> > Any insights on this?
>> >> >
>> >> > Wolfram
>> >> >
>> >> > --
>> >> > You received this message because you are subscribed to the Google
>> >> > Groups "Google App Engine" group.
>> >> > To view this discussion on the web visit
>> >> > https://groups.google.com/d/msg/google-appengine/-/r0SVTq6i4iEJ.
>> >> >
>> >> > To post to this group, send email to [email protected].
>> >> > To unsubscribe from this group, send email to [email protected].
>> >> > For more options, visit this group at
>> >> > http://groups.google.com/group/google-appengine?hl=en.
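
For anyone reading this thread later, the two ideas discussed above (adding random jitter to a non-critical timestamp, and "sharding" the timestamp by account, user, or region) can be sketched roughly as follows. This is only an illustration in plain Python, not Datastore or Bigtable API code: the function names `jittered` and `sharded_key`, the `|` separator, and the example shard id are all made up for the sketch. Whether a given jitter window actually spreads writes across tablets is not something the thread establishes, so treat the window size as something to find by load testing.

```python
import random
from datetime import datetime, timedelta, timezone


def jittered(ts, max_jitter, rng=random):
    """Return ts shifted by a random offset in [-max_jitter, +max_jitter].

    Hypothetical helper: only suitable for non-critical timestamps
    (e.g. cleanup/expiration times) where being off is acceptable.
    """
    span = max_jitter.total_seconds()
    return ts + timedelta(seconds=rng.uniform(-span, span))


def sharded_key(ts, shard_id):
    """Prefix the timestamp with a shard id (account, user, region, ...).

    Hypothetical helper: the prefix means consecutive writes from
    different shards no longer sort next to each other in an index.
    """
    return "%s|%s" % (shard_id, ts.strftime("%Y-%m-%dT%H:%M:%S.%f"))


if __name__ == "__main__":
    now = datetime(2012, 2, 1, 2, 28, tzinfo=timezone.utc)
    # Jitter of several hours, as described for expiration times.
    print(jittered(now, timedelta(hours=3)))
    # Timestamp value scoped to an assumed region shard.
    print(sharded_key(now, "region-eu"))
```

The trade-off with the sharded form is that a query over a global time range has to be issued once per shard id and merged, which is why keeping a plain timestamp until the write limit is actually hit remains the simpler choice.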
