Oh man, I didn't realize that Eishay's work got turned into a whole Google Code project. Awesome to see what happens when you're curious, pursue something, then write and tweet about it.
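
For anyone skimming the quoted thread below, here is a minimal sketch of the "do the serialization lazily" idea Jeff describes. It is plain standard-library Python, not App Engine code: the LazyDict class and its method names are hypothetical, pickle and zlib stand in for whatever serializer and compressor you actually use, and the bytes in `blob` are assumed to be what you would keep in a Blob or Text property.

```python
import pickle
import zlib


class LazyDict:
    """Hypothetical wrapper: keeps serialized bytes, decodes on first use."""

    def __init__(self, blob, compressed=False):
        self._blob = blob              # bytes as they would sit in a Blob property
        self._compressed = compressed
        self._value = None             # decoded dict, filled in lazily
        self.decode_count = 0          # bookkeeping for the demo only

    @classmethod
    def from_dict(cls, data, compress=False):
        # Serialize once, up front; optionally compress the result.
        raw = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
        if compress:
            raw = zlib.compress(raw)
        return cls(raw, compressed=compress)

    @property
    def blob(self):
        return self._blob

    def get(self):
        # Pay the deserialization cost only when the dict is actually needed.
        # Note: pickle.loads is unsafe on untrusted bytes.
        if self._value is None:
            raw = zlib.decompress(self._blob) if self._compressed else self._blob
            self._value = pickle.loads(raw)
            self.decode_count += 1
        return self._value


if __name__ == "__main__":
    data = {"key%d" % i: i for i in range(15000)}
    plain = LazyDict.from_dict(data)
    packed = LazyDict.from_dict(data, compress=True)
    print("plain bytes:", len(plain.blob), "compressed bytes:", len(packed.blob))
    print("decoded yet?", packed.decode_count > 0)
    print("entries:", len(packed.get()))
```

Fetching the object costs nothing beyond carrying the bytes around; the decode happens once, on the first get(). The sketch does not write changes back: after mutating the dict you would need to re-serialize it before a put.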

On Thu, Apr 1, 2010 at 12:47 PM, Jeff Schnitzer <[email protected]> wrote:
> It's an interesting thread. To summarize, you're discussing the best
> way to serialize a 15,000-element dictionary in an entity.
>
> This doesn't seem to be all that closely related to the datastore.
> The datastore api costs are the same no matter how you store it (58ms)
> but the serialization costs vary widely depending on what technique
> you use - pickling, protobuf, expando (you can put 15k properties in
> an expando??).
>
> I guess having a 'select fields' (or 'suppress fields') instruction
> could let you avoid the serialization costs for specific fields, but
> it would require a dramatically more complicated API. On the other
> hand, you can pretty easily address this as you describe; saving your
> data as Text or Blob and serializing or deserializing it on-demand.
>
> At least in the numbers you posted, there weren't any bandwidth issues
> - if I'm reading it correctly, all of the cost seems to have been
> produced by the serialization process. It doesn't matter if you fetch
> the Blob data or not, it only matters when you try to convert it into
> a dictionary. Do the serialization lazily and your problem is solved.
>
> BTW, this is related information that always deserves a link when the
> subject comes up:
>
> http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking
>
> Jeff
>
> On Thu, Apr 1, 2010 at 12:03 PM, Eli Jones <[email protected]> wrote:
> > Here are the numbers from a test I did just comparing the process of
> > stuffing the protobuf of an entity into a BlobProperty:
> >
> > http://groups.google.com/group/google-appengine-python/browse_thread/thread/4d6dde610addd8ef/a3037abd34ed03f6?#a3037abd34ed03f6
> >
> > It's 40% to 70% faster and cheaper to do a get, change, put on the
> > protobuf version.. (see link for model definitions and test code and
> > results).
> >
> > Using compression with the protobuf made it a little faster and
> > cheaper.. but the benefit was not as drastic as the protobuf change.
> >
> > On Thu, Apr 1, 2010 at 2:02 PM, Jeff Schnitzer <[email protected]> wrote:
> >>
> >> Do you have any quantitative numbers? I'd really like to know how
> >> much this saves.
> >>
> >> Jeff
> >>
> >> On Thu, Apr 1, 2010 at 9:25 AM, Eli Jones <[email protected]> wrote:
> >> > Compression will probably show the biggest benefit when you have a
> >> > significant difference between the un-compressed and compressed
> >> > sizes and the amount of data you're bringing over the wire is
> >> > large enough to cause a noticeable slowdown..
> >> >
> >> > So..if all you are doing is pulling over one entity with
> >> > properties that add up to 200 bytes total.. compression is just
> >> > going to slow you down...
> >> >
> >> > If you are pulling over an entity that has properties of 200
> >> > kilobytes total, the time it takes to get the entity, decompress
> >> > it, change it, compress it and put it back to the datastore will
> >> > be faster than just getting, changing and putting a non-compressed
> >> > version.
> >> >
> >> > Granted my assumptions in this case are based on tests I did
> >> > against two Models: a decompressed version with two large
> >> > un-indexed properties and then a compressed Model that had one
> >> > property = a compressed protobuf of the decompressed Model. (The
> >> > protobuf before compression was 390KB in size.. after compression
> >> > it was 40KB). Even without using compression on the protobuf of
> >> > the big model.. just putting one large property (containing a
> >> > protobuf of the two property model) was much faster than directly
> >> > putting the two prop Model (even with indexed = false for the
> >> > props).
> >> >
> >> > After adding compression to the mix, the roundtrip of getting,
> >> > decompressing, changing, compressing and putting took less time
> >> > and less cpu overall than getting, changing, putting the
> >> > un-compressed protobuf.. never mind comparing it to putting a
> >> > larger entity with multiple defined (but un-indexed) properties.
> >> >
> >> > In the end, compression may be less important than just stuffing
> >> > an entire entity into a BlobProperty as a protobuf. (I seem to
> >> > remember that the benefit from compressing the large protobuf was
> >> > nowhere near as drastic as the benefit from turning the entire
> >> > entity into a protobuf to be stuffed into one prop.)
> >> >
> >> > On Thu, Apr 1, 2010 at 2:01 AM, Jeff Schnitzer <[email protected]> wrote:
> >> >>
> >> >> On Wed, Mar 31, 2010 at 9:57 PM, Robert Kluin <[email protected]> wrote:
> >> >> >
> >> >> > Although I have not personally tested with _really_ large
> >> >> > entities, I see very little difference in performance based on
> >> >> > the size of the entity. We have some models with 20 or 30
> >> >> > string and float fields that seem to perform similar to models
> >> >> > with 5 or 6 string fields. There have been a number of threads
> >> >> > discussing this in the past. I think a post had some
> >> >> > benchmarks in December or January.
> >> >>
> >> >> This has been my experience as well. Additional indexes cost a
> >> >> lot, but additional unindexed properties seem to be almost "free"
> >> >> in the datastore.
> >> >>
> >> >> I would ask of anyone asking for select at a property level: Have
> >> >> you run any performance tests of your application with big vs
> >> >> small entities? Are you sure it matters?
> >> >>
> >> >> Jeff
> >> >>
> >> >> --
> >> >> You received this message because you are subscribed to the
> >> >> Google Groups "Google App Engine" group.
> >> >> To post to this group, send email to [email protected].
> >> >> To unsubscribe from this group, send email to
> >> >> [email protected]<google-appengine%[email protected]>.
> >> >> For more options, visit this group at
> >> >> http://groups.google.com/group/google-appengine?hl=en.

--
Ikai Lan
Developer Programs Engineer, Google App Engine
http://googleappengine.blogspot.com | http://twitter.com/app_engine
