How about adding a GAE-only parameter to the GAE adapter_args that tells it 
to skip the fetch?
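
Roughly what I have in mind, as a sketch only. FakeDatastore and FakeQuery are in-memory stand-ins I made up so this runs outside GAE; they only imitate the run/fetch with keys_only and the datastore delete described in the quoted message below. The skip_fetch flag and the gae_delete name are hypothetical, not existing web2py options.

```python
# Sketch only: FakeDatastore and FakeQuery are in-memory stand-ins for the
# GAE datastore and Query object, so the example runs anywhere.
# skip_fetch and gae_delete are hypothetical names, not web2py API.

class FakeDatastore(object):
    """Holds rows keyed by id, imitating a tiny slice of the datastore."""

    def __init__(self, rows):
        self.rows = dict(rows)

    def delete(self, keys):
        # Accepts any iterable of keys (a list or an iterator).
        for key in list(keys):
            self.rows.pop(key, None)


class FakeQuery(object):
    """Selects the keys of rows for which predicate(row) is true."""

    def __init__(self, store, predicate):
        self.store = store
        self.predicate = predicate

    def _matching_keys(self):
        rows = self.store.rows
        return [k for k in sorted(rows) if self.predicate(rows[k])]

    def run(self, keys_only=True):
        # Only the keys-only form is modelled here; returns an iterator.
        return iter(self._matching_keys())

    def fetch(self, limit, keys_only=True):
        # Only the keys-only form is modelled here; returns a list.
        return self._matching_keys()[:limit]


def gae_delete(store, query, skip_fetch=False, batch_size=1000):
    """Delete the rows matching query.

    skip_fetch=True: stream keys straight into delete and return None,
    since no count is available on that path.
    skip_fetch=False: fetch keys in batches and sum the batch sizes, so a
    count is returned without a leading count() call.
    """
    if skip_fetch:
        store.delete(query.run(keys_only=True))
        return None
    deleted = 0
    while True:
        keys = query.fetch(batch_size, keys_only=True)
        if not keys:
            return deleted
        store.delete(keys)
        deleted += len(keys)
```

With the flag off, delete() still returns a count, so existing applications keep working; with it on, the caller knowingly gives up the count in exchange for the cheaper path.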

On Saturday, 20 October 2012 11:25:51 UTC-5, howesc wrote:
>
> It appears that the most efficient way to delete on app engine is to:
>  - build a query object, like we are doing now
>  - call run with keys_only=True, which returns an iterator
>    (https://developers.google.com/appengine/docs/python/datastore/queryclass#Query_run)
>  - pass that iterator to the datastore delete method
>    (https://developers.google.com/appengine/docs/python/datastore/functions#delete)
>
> this avoids the cost of loading the rows into memory, decreases the 
> likelihood of timeout, and has the cost of 1 datastore small operation per 
> row.  but it does prevent us from getting a count of rows deleted.
>
> the way we do it now:
>  - run count() on the query.  this has a cost (time and money) of 
> iterating over all the rows that match the query on GAE (1 datastore small 
> operation per row)
>  - run fetch(limit=1000) and call delete() successively until no more 
> rows.  this has the cost of running a full query (at least 1 datastore read 
> operation per row) and loading the result set into memory and then deleting 
> the results.
>
> in my case i'm timing out on the count() call so i don't even start the 
> delete.  from an efficiency standpoint i'd rather have more rows deleted 
> for less cost than get a count....but this may not be acceptable for all. 
>  at a minimum i think we should switch to use keys_only=True for the fetch, 
> and skip the leading count() call and just sum the number of keys returned by 
> each fetch.  we may also consider catching the datastore timeout error and 
> trying to handle a partial delete more gracefully (or continue to let the 
> user catch the error).
>
> what is the "right" approach for web2py?  if the approach with count is 
> correct, could i propose a gae bulk_delete method that does not return 
> count but uses my first method?
>
> thanks for the input!
>
> cfh
>
> On Saturday, October 20, 2012 7:58:56 AM UTC-7, Massimo Di Pierro wrote:
>>
>> Delete should return the number of deleted records. What is your proposal?
>>
>> On Wednesday, 17 October 2012 17:30:22 UTC-5, howesc wrote:
>>>
>>> Hi all,
>>>
>>> I'm trying to clean up old expired sessions.....but i waited a long time 
>>> to get to this and now my GAE delete is just timing out.  Reading the GAE 
>>> docs, there appear to be some improvements that we can make to the query 
>>> delete method on GAE that will make it faster and cheaper.  what we lose 
>>> then is the count of the number of rows deleted.
>>>
>>> my question is, does having a db(db.table.something==True).delete() that 
>>> does not return a count break the web2py API contract, or break anyone's 
>>> applications?
>>>
>>> thanks,
>>>
>>> christian
>>>
>>
