Re: [web2py] Re: Speed of operations with Storage object vs plain dictionary

Michele Comitini Thu, 04 Oct 2012 15:52:27 -0700

The test is interesting, but the comparison is a bit like apples vs oranges.


Remember that a Storage object *is* a dict, so in Niphlod's test the
following would have to be done:

def set_dict():
    st = dict()
 ...

You will see that the d[key] notation is faster than attribute access.
So the next step is to test Storage vs dict with both using the d[key]
notation. There should be a little overhead, but not as much as
attribute vs key access.
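
As a rough sketch of that next step (not code from this thread), the snippet below times d[key] reads on a plain dict and on a dict subclass, with attribute reads added for comparison. The Storage class here is a hypothetical stand-in, assumed to behave like gluon.storage.Storage only as far as this test needs; swap in the real import to measure web2py's own class. The three keys and the repetition count are arbitrary choices.

```python
from timeit import timeit


class Storage(dict):
    """Hypothetical stand-in for gluon.storage.Storage (an assumption,
    not web2py's code): a dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        # Called only after normal attribute lookup has failed.
        return self.get(name)


def fill(container):
    """Populate any dict-like container using d[key] notation."""
    for key in ('test1', 'test2', 'test3'):
        container[key] = key
    return container


def read_items(d):
    """Key access: works the same on dict and on Storage."""
    return d['test1'] + d['test2'] + d['test3']


def read_attrs(st):
    """Attribute access: goes through Storage.__getattr__."""
    return st.test1 + st.test2 + st.test3


if __name__ == '__main__':
    plain = fill(dict())
    stored = fill(Storage())
    cases = [
        ('dict    d[key]', read_items, plain),
        ('Storage d[key]', read_items, stored),
        ('Storage d.key', read_attrs, stored),
    ]
    for label, func, arg in cases:
        seconds = timeit(lambda: func(arg), number=100000)
        print('%-16s %.4f' % (label, seconds))
```

The first two rows isolate the cost of the subclass itself; the third adds the cost of the attribute fallback.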

mic

2012/10/4 monotasker <scotti...@gmail.com>
>
> Thanks very much Niphlod. I was mostly wondering whether this was common 
> knowledge or not. But I've also never taken the time (pun intended) to learn 
> how to time functions, so your sample code is really helpful. It has gone in 
> my snippets files.
>
> Your test confirms what I thought - that for the kind of apps I'm writing 
> right now the speed difference in using Storage objects and the dal is too 
> small to affect user experience. And as you say the design and maintenance 
> gains are much more significant. If I'm going to speed up my app I'm much 
> better served by looking at the usual webdev suspects -- number of file and 
> image downloads, resolution of image files, server optimization, etc. Good to 
> know.
>
> Ian
>
>
> On Thursday, October 4, 2012 3:52:24 PM UTC-4, Niphlod wrote:
>>
>> Not directed only at monotasker: aren't you all tired of reading benchmarks
>> when the source code is there?
>> Just set up your own test and see how it performs. Let's test it. Below is the
>> code for a simple test.
>>
>> PS: my results are
>> set storage takes  1.86146702766
>> set dict takes 0.960257053375
>> get storage takes 6.44219303131
>> get dict takes 1.02610206604
>>
>> This is for 1 million repetitions. I can accept this time difference if my
>> code is more readable.
>> To make a simple analogy, let's say I lose 7 seconds by using Storage
>> instead of a plain dict, for 1 million repetitions. Let's say I can skip 100
>> repetitions per page served (one function in my controller) by using dict
>> instead of Storage. I'm going to gain 7 seconds every 10 thousand pages
>> served.
>> Let's assume that in the same function I have 3 simple queries. Let's be
>> super-optimistic: I have a super-fast db (emphasis on super-fast). Every
>> query returns in 20ms (please note that average db response times are
>> higher).
>> I can easily gain 20ms if I cut one simple query from it (let's say I do a
>> join instead of 2 separate fetches). For every page I'm cutting 20ms. That
>> means that for 1000 pages I gain 20 seconds. For 10000 the gain is 200
>> seconds. Lose 7 seconds in the process by choosing Storage over dict? So be
>> it!
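
To put the two savings in that analogy side by side, here is a small arithmetic sketch. The numbers are the ones stated in the quoted message (7 seconds per million operations, 100 operations per page, one 20 ms query removed per page); the variable and function names are made up for illustration.

```python
# Figures from the quoted analogy; none of them are new measurements.
STORAGE_PENALTY_S = 7.0        # extra seconds per 1,000,000 Storage operations
OPS_PER_MILLION = 1000000
OPS_SAVED_PER_PAGE = 100       # Storage operations replaced by dict per page
QUERY_SAVING_MS = 20           # one 20 ms query removed per page


def saved_seconds(pages):
    """Return (seconds saved by using dict, seconds saved by cutting a query)."""
    dict_gain = pages * OPS_SAVED_PER_PAGE * STORAGE_PENALTY_S / OPS_PER_MILLION
    query_gain = pages * QUERY_SAVING_MS / 1000.0
    return dict_gain, query_gain


if __name__ == '__main__':
    for pages in (1000, 10000):
        dict_gain, query_gain = saved_seconds(pages)
        print('%6d pages: dict saves %.1f s, one fewer query saves %.1f s'
              % (pages, dict_gain, query_gain))
```

Under these assumptions the query saving is roughly 30 times the dict saving at any page count, since both scale linearly with the number of pages.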
>>
>>
>> from gluon.storage import Storage
>>
>> def set_storage():
>>     st = Storage()
>>     st.test1 = 'test1'
>>     st.test2 = 'test2'
>>     st.test3 = 'test3'
>>     st.test4 = 'test4'
>>     st.test5 = 'test5'
>>     st.test6 = 'test6'
>>     st.test7 = 'test7'
>>     st.test8 = 'test8'
>>     st.test9 = 'test9'
>>     st.test10 = 'test10'
>>     return st
>>
>> def set_dict():
>>     st = dict()
>>     st['test1'] = 'test1'
>>     st['test2'] = 'test2'
>>     st['test3'] = 'test3'
>>     st['test4'] = 'test4'
>>     st['test5'] = 'test5'
>>     st['test6'] = 'test6'
>>     st['test7'] = 'test7'
>>     st['test8'] = 'test8'
>>     st['test9'] = 'test9'
>>     st['test10'] = 'test10'
>>     return st
>>
>> def get_storage(st):
>>     # attribute access on the Storage object
>>     return (st.test1 + st.test2 + st.test3 + st.test4 + st.test5 +
>>             st.test6 + st.test7 + st.test8 + st.test9 + st.test10 ==
>>             'test1test2test3test4test5test6test7test8test9test10')
>>
>> def get_dict(st):
>>     # key access on the plain dict
>>     return (st['test1'] + st['test2'] + st['test3'] + st['test4'] +
>>             st['test5'] + st['test6'] + st['test7'] + st['test8'] +
>>             st['test9'] + st['test10'] ==
>>             'test1test2test3test4test5test6test7test8test9test10')
>>
>> if __name__ == '__main__':
>>     from timeit import Timer
>>     t0 = Timer(setup='from __main__ import set_storage',
>>               stmt='set_storage()')
>>     t1 = Timer(setup='from __main__ import set_dict',
>>               stmt='set_dict()')
>>
>>     t2 = Timer(setup="from __main__ import set_storage, get_storage; "
>>                      "st = set_storage()",
>>                stmt="get_storage(st)")
>>     t3 = Timer(setup="from __main__ import set_dict, get_dict; "
>>                      "st = set_dict()",
>>                stmt="get_dict(st)")
>>
>>     print 'set storage takes ', t0.timeit(number=1000000)
>>     print 'set dict takes', t1.timeit(number=1000000)
>>     print 'get storage takes', t2.timeit(number=1000000)
>>     print 'get dict takes', t3.timeit(number=1000000)
>>
>>
>>
>>
>> On Thursday, October 4, 2012 7:53:13 PM UTC+2, monotasker wrote:
>>>
>>> Has anyone looked at the speed differences between operations performed 
>>> with a Storage object and the equivalent object with a dictionary? I wonder 
>>> how these would compare?
>>>
>>> bob = MyStorageObject.name
>>>
>>> bob = MyDictionary['name']
>>>
>>> I suspect that the difference with one lookup would be trivial, but I'm 
>>> wondering whether it is enough that it could make a noticeable difference 
>>> if we're working with a long list of nested Storage objects or nested 
>>> dicts. E.g.:
>>>
>>> allrows = db(db.mytable.id > 0).select()
>>> allrows.find(lambda row: [n for n in row.tags[0].names
>>>                           if n in list_of_names])
>>>
>>> allrows_list = allrows.as_list()
>>> allrows_list = [d for d in allrows_list
>>>                 if [n for n in d['tags'][0]['names'] if n in list_of_names]]
>>>
>>> Does anyone have an idea whether there will be much speed difference?
>>>
>>> Ian
>
