It's an interesting thread. To summarize, you're discussing the best way to serialize a 15,000-element dictionary in an entity.
This doesn't seem to be all that closely related to the datastore. The datastore API costs are the same no matter how you store it (58ms), but the serialization costs vary widely depending on what technique you use - pickling, protobuf, expando (you can put 15k properties in an expando??). I guess having a 'select fields' (or 'suppress fields') instruction could let you avoid the serialization costs for specific fields, but it would require a dramatically more complicated API. On the other hand, you can pretty easily address this as you describe: save your data as Text or Blob and serialize or deserialize it on demand.

At least in the numbers you posted, there weren't any bandwidth issues - if I'm reading it correctly, all of the cost seems to have been produced by the serialization process. It doesn't matter whether you fetch the Blob data or not; it only matters when you try to convert it into a dictionary. Do the deserialization lazily and your problem is solved.

BTW, this is related information that always deserves a link when the subject comes up:

http://code.google.com/p/thrift-protobuf-compare/wiki/Benchmarking

Jeff

On Thu, Apr 1, 2010 at 12:03 PM, Eli Jones <[email protected]> wrote:
> Here are the numbers from a test I did just comparing the process of
> stuffing the protobuf of an entity into a BlobProperty:
> http://groups.google.com/group/google-appengine-python/browse_thread/thread/4d6dde610addd8ef/a3037abd34ed03f6?#a3037abd34ed03f6
> It's 40% to 70% faster and cheaper to do a get, change, put on the protobuf
> version.. (see link for model definitions and test code and results).
> Using compression with the protobuf made it a little faster and cheaper..
> but the benefit was not as drastic as the protobuf change.
>
> On Thu, Apr 1, 2010 at 2:02 PM, Jeff Schnitzer <[email protected]> wrote:
>>
>> Do you have any quantitative numbers? I'd really like to know how
>> much this saves.
>>
>> Jeff
>>
>> On Thu, Apr 1, 2010 at 9:25 AM, Eli Jones <[email protected]> wrote:
>> > Compression will probably show the biggest benefit when you have a
>> > significant difference between the un-compressed and compressed sizes
>> > and the amount of data you're bringing over the wire is large enough
>> > to cause a noticeable slowdown..
>> >
>> > So..if all you are doing is pulling over one entity with properties
>> > that add up to 200 bytes total.. compression is just going to slow
>> > you down...
>> >
>> > If you are pulling over an entity that has properties of 200 kilobytes
>> > total, the time it takes to get the entity, decompress it, change it,
>> > compress it and put it back to the datastore will be faster than just
>> > getting, changing and putting a non-compressed version.
>> >
>> > Granted my assumptions in this case are based on tests I did against
>> > two Models: a decompressed version with two large un-indexed
>> > properties and then a compressed Model that had one property = a
>> > compressed protobuf of the decompressed Model. (The protobuf before
>> > compression was 390KB in size.. after compression it was 40KB). Even
>> > without using compression on the protobuf of the big model.. just
>> > putting one large property (containing a protobuf of the two property
>> > model) was much faster than directly putting the two prop Model (even
>> > with indexed = false for the props).
>> >
>> > After adding compression to the mix, the roundtrip of getting,
>> > decompressing, changing, compressing and putting took less time and
>> > less cpu overall than getting, changing, putting the un-compressed
>> > protobuf.. never mind comparing it to putting a larger entity with
>> > multiple defined (but un-indexed) properties.
>> >
>> > In the end, compression may be less important than just stuffing an
>> > entire entity into a BlobProperty as a protobuf. (I seem to remember
>> > that the benefit from compressing the large protobuf was nowhere near
>> > as drastic as the benefit from turning the entire entity into a
>> > protobuf to be stuffed into one prop.)
>> >
>> > On Thu, Apr 1, 2010 at 2:01 AM, Jeff Schnitzer <[email protected]>
>> > wrote:
>> >>
>> >> On Wed, Mar 31, 2010 at 9:57 PM, Robert Kluin <[email protected]>
>> >> wrote:
>> >> >
>> >> > Although I have not personally tested with _really_ large entities,
>> >> > I see very little difference in performance based on the size of the
>> >> > entity. We have some models with 20 or 30 string and float fields
>> >> > that seem to perform similar to models with 5 or 6 string fields.
>> >> > There have been a number of threads discussing this in the past. I
>> >> > think a post had some benchmarks in December or January.
>> >>
>> >> This has been my experience as well. Additional indexes cost a lot,
>> >> but additional unindexed properties seem to be almost "free" in the
>> >> datastore.
>> >>
>> >> I would ask of anyone asking for select at a property level: Have you
>> >> run any performance tests of your application with big vs small
>> >> entities? Are you sure it matters?
>> >>
>> >> Jeff
>> >>
>> >> --
>> >> You received this message because you are subscribed to the Google
>> >> Groups "Google App Engine" group.
>> >> To post to this group, send email to [email protected].
>> >> To unsubscribe from this group, send email to
>> >> [email protected].
>> >> For more options, visit this group at
>> >> http://groups.google.com/group/google-appengine?hl=en.
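
For anyone who wants to try the lazy approach suggested at the top of this message, here is a minimal sketch in plain Python. It is not datastore code: `LazyDict`, `from_dict`, `get` and `decode_count` are hypothetical names made up for illustration, and the bytes it holds stand in for whatever you would keep in a Blob property. It assumes pickle as the serializer (only safe for data you wrote yourself) with optional zlib compression, as discussed in the quoted messages. The point is only that carrying the bytes around costs nothing until something actually asks for the dictionary.

```python
import pickle
import zlib


class LazyDict:
    """Holds serialized bytes and decodes them only on first access."""

    def __init__(self, blob, compressed=False):
        self._blob = blob              # raw bytes, as they would sit in a Blob property
        self._compressed = compressed  # whether the bytes were zlib-compressed
        self._value = None             # decoded dictionary, filled in on demand
        self.decode_count = 0          # illustration only: number of decodes performed

    @classmethod
    def from_dict(cls, value, compress=False):
        # Serialize once, at write time.
        blob = pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)
        if compress:
            blob = zlib.compress(blob)
        return cls(blob, compressed=compress)

    @property
    def blob(self):
        # The bytes you would store; reading them triggers no decoding.
        return self._blob

    def get(self):
        # Pay the deserialization cost the first time only, then reuse the result.
        if self._value is None:
            raw = zlib.decompress(self._blob) if self._compressed else self._blob
            self._value = pickle.loads(raw)
            self.decode_count += 1
        return self._value


if __name__ == "__main__":
    data = {"key%d" % i: i for i in range(15000)}
    stored = LazyDict.from_dict(data, compress=True)

    # Simulate fetching the entity: we only have the bytes, nothing is decoded yet.
    fetched = LazyDict(stored.blob, compressed=True)
    print(fetched.decode_count)
    print(fetched.get()["key42"])
    print(fetched.decode_count)
```

Requests that never call `get()` never pay for deserialization, which is the whole saving; whether compression is worth adding on top depends on payload size, as Eli notes above.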
