I meant to skip count.
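
To make that concrete, here is a rough sketch of the keys-only delete from the quoted message below, with the leading count() dropped and a running tally kept instead. It is not the actual web2py adapter code. The names delete_by_keys, fake_store and fake_delete are made up for illustration, and the datastore is replaced by a plain dict so the snippet runs anywhere. On GAE the keys argument would presumably be query.run(keys_only=True) and delete_fn would be the datastore delete function. Batching is my own addition (the quoted proposal passes the iterator straight to delete), and batch_size is an arbitrary value.

```python
from itertools import islice


def delete_by_keys(keys, delete_fn, batch_size=500):
    """Delete by key in batches; return how many keys were handed to delete_fn.

    keys      -- any iterable of keys (assumption: on GAE this would be
                 query.run(keys_only=True))
    delete_fn -- callable that takes a list of keys (assumption: on GAE this
                 would be the datastore delete function)
    """
    keys = iter(keys)
    deleted = 0
    while True:
        # Pull at most batch_size keys; entities are never loaded.
        batch = list(islice(keys, batch_size))
        if not batch:
            break
        delete_fn(batch)
        deleted += len(batch)  # tally replaces the separate count() call
    return deleted


# In-memory stand-in for the datastore (hypothetical data).
fake_store = {k: "session-%d" % k for k in range(1, 1201)}


def fake_delete(batch):
    for k in batch:
        del fake_store[k]


# Pretend the even-numbered sessions are the expired ones.
expired = (k for k in list(fake_store) if k % 2 == 0)
removed = delete_by_keys(expired, fake_delete, batch_size=500)
```

With this shape delete() can still return a number, and the only per-row cost left is the keys-only iteration plus the delete itself.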

On Saturday, 20 October 2012 15:28:56 UTC-5, Massimo Di Pierro wrote:
>
> How about adding a GAE-only parameter to the GAE adapter_args that tells 
> it to skip fetch?
>
> On Saturday, 20 October 2012 11:25:51 UTC-5, howesc wrote:
>>
>> It appears that the most efficient way to delete on app engine is to:
>>  - build a query object, like we are doing now
>>  - call run with keys_only=True, which returns an iterator (
>> https://developers.google.com/appengine/docs/python/datastore/queryclass#Query_run
>> )
>>  - pass that iterator to the datastore delete method (
>> https://developers.google.com/appengine/docs/python/datastore/functions#delete
>> )
>>
>> this avoids the cost of loading the rows into memory, decreases the 
>> likelihood of timeout, and has the cost of 1 datastore small operation per 
>> row.  but it does prevent us from getting a count of rows deleted.
>>
>> the way we do it now:
>>  - run count() on the query.  this has a cost (time and money) of 
>> iterating over all the rows that match the query on GAE (1 datastore small 
>> operation per row)
>>  - run fetch(limit=1000) and call delete() successively until no more 
>> rows.  this has the cost of running a full query (at least 1 datastore read 
>> operation per row) and loading the result set into memory and then deleting 
>> the results.
>>
>> in my case i'm timing out on the count() call so i don't even start the 
>> delete.  from an efficiency standpoint i'd rather have more rows deleted 
>> for less cost than get a count... but this may not be acceptable for all. 
>> at a minimum i think we should switch to use keys_only=True for the fetch, 
>> skip the leading count() call, and just sum the number of keys each fetch 
>> returns.  we may also consider catching the datastore timeout error and 
>> trying to handle a partial delete more gracefully (or continue to let the 
>> user catch the error).
>>
>> what is the "right" approach for web2py?  if the approach with count is 
>> correct, could i propose a gae bulk_delete method that does not return 
>> count but uses my first method?
>>
>> thanks for the input!
>>
>> cfh
>>
>> On Saturday, October 20, 2012 7:58:56 AM UTC-7, Massimo Di Pierro wrote:
>>>
>>> Delete should return the number of deleted records. What is your 
>>> proposal?
>>>
>>> On Wednesday, 17 October 2012 17:30:22 UTC-5, howesc wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I'm trying to clean up old expired sessions... but i waited a long 
>>>> time to get to this and now my GAE delete is just timing out.  Reading the 
>>>> GAE docs, there appear to be some improvements that we can make to the 
>>>> query delete method on GAE that will make it faster and cheaper.  what we 
>>>> lose then is the count of the number of rows deleted.
>>>>
>>>> my question is, does having a db(db.table.something==True).delete() 
>>>> that does not return a count break the web2py API contract, or break 
>>>> anyone's applications?
>>>>
>>>> thanks,
>>>>
>>>> christian
>>>>
>>>
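On the "handle a partial delete more gracefully" idea raised in the thread above, one possible shape is sketched here. This is only an illustration under assumptions: FakeTimeout stands in for whatever timeout error the datastore raises, delete_until_timeout and flaky_delete are invented names, and a set plays the role of the datastore. The caller gets back the number of keys deleted so far plus a flag saying whether the job finished, so it can re-run the same query later.

```python
class FakeTimeout(Exception):
    """Stand-in for the datastore timeout error (hypothetical name)."""


def delete_until_timeout(keys, delete_fn, timeout_exc, batch_size=100):
    """Delete in batches; on timeout, stop and report partial progress.

    Returns (deleted_count, finished).
    """
    batch, deleted = [], 0
    try:
        for key in keys:
            batch.append(key)
            if len(batch) >= batch_size:
                delete_fn(batch)
                deleted += len(batch)
                batch = []
        if batch:  # flush the final, possibly short, batch
            delete_fn(batch)
            deleted += len(batch)
    except timeout_exc:
        # Only batches that returned normally are counted.
        return deleted, False
    return deleted, True


# Demo: 250 keys, and the third delete call "times out".
store = set(range(250))
calls = {"n": 0}


def flaky_delete(batch):
    calls["n"] += 1
    if calls["n"] == 3:
        raise FakeTimeout()
    store.difference_update(batch)


first = delete_until_timeout(sorted(store), flaky_delete, FakeTimeout)
second = delete_until_timeout(sorted(store), flaky_delete, FakeTimeout)
```

In this demo the first call stops after two full batches and the second call cleans up the remainder. On a real datastore the reported count should probably be treated as a lower bound, since a batch that raised may still have been partly applied.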
