>> I'd run across a stackexchange answer 
>> <https://stackoverflow.com/a/75037925/2057516> that suggested setting a 
>> statement timeout (specific to postgres) in a django view.  If I did that 
>> for the validation view, it wouldn't necessarily 100% correspond to a 
>> blocking issue, but since the overall goal is the prevent a gateway timeout, 
>> I'm fine with that.
> 
> When you do a SET statement_timeout is session/connection specific.
> 
> For what that means in Django see:
> 
> https://docs.djangoproject.com/en/5.1/ref/databases/
> 
> Personally I think this will introduce more complexity then it's worth.
> 
> Further down the page at above link it shows how you can use a psycopg(2) 
> pool, but that is another layer added to an already complex setup.

TBF, adding a celery strategy to the codebase to queue jobs is already 
comparatively very complex to start out.  It's a pretty heavy handed solution 
to what seems to be a relatively rare occurrence (though rather recurrent when 
it does happen).  By that metric, I would think that queuing jobs at all is too 
complex to be worth it.  So I'm not sure that catching a timeout exception to 
queue a job instead of queuing the job at the outset is that much more complex. 
 I can just create a view that inherits from the current view (to which the 
statement timeout is applied) which is *not* subject to the statement timeout 
when I want to queue a longer running validation due to the block.  I think I 
should be able to redirect the request to that derived class view.  Not sure 
exactly how ATM, but it shouldn't be more than 1 line of code/statement, I 
would think.

Regarding statement timeouts in general, I found a pretty great article 
<https://hakibenita.com/9-django-tips-for-working-with-databases#statement-timeout>
 that suggests setting site-wide statement timeouts as a general rule, which I 
think makes a lot of sense.  In fact, some of the pages on our site (that I 
didn't write) are not going to scale well as the DB grows.  I already feel like 
the samples page is way too slow.  They should probably all gracefully handle 
timeouts so that users don't encounter raw 500 errors.  So whether or not I 
decide to "queue on statement timeout", I think using statement timeouts is 
worthwhile.

I take your point though.  How many more lines of code do you think would make 
the strategy worthwhile?  I'd be willing to bet I could keep it under 20 
compared to straight up queuing every validation.  In fact, I'm leaning toward 
implementing your suggestion of queuing everything as a "phase 1" of this 
endeavor, and then implement my proposed "queue on timeout" idea as phase 2 
(perhaps as a configurable option - at least until I'm satisfied it works well).

>> I don't know if that stack exchange suggestion does what it claims to do or 
>> not.  I haven't tried it.  But they key claim of interest here is that the 
>> timeout is specific to the view.  I.e. any other views or background 
>> processes wouldn't be subject to the timeout.  I'm suspicious as to whether 
>> this is actually true.  My novice understanding of Django's cursors is that 
>> it maintains a pool of connections, and I'm not 
> 
> Actually cursors are attributes of connections.

Yeah, thus my suspicion that it may not apply to only the view in the example.  
I suspected their "proof" that it didn't affect other views was due to randomly 
getting a connection to which that statement timeout had not yet been applied.  
*However*, I just read this article 
<https://medium.com/nerd-for-tech/djangos-database-connection-management-ad39a298d2e3>
 which describes Django's connection management and I realized that Django's 
default behavior (which is what we use) is to create a new connection for every 
request, so I think that, with the caveat that it wouldn't work with persistent 
connections, the stack exchange claim is actually correct.

I would just have to figure out how best to apply the timeout to everything 
except those queued validations.  And I think maybe a context manager might be 
the simplest way to do it.

Anyway, thanks so much for your help.  This discussion has been very useful, 
and I think I will proceed at first, exactly how you suggested, by queuing 
every validation job (using celery).  Then I will explore whether or not I can 
apply the "on timeout" strategy in a small patch.

Incidentally, during our Wednesday meeting this week, we actually opened our 
public instance to the world for the first time, in preparation for the 
upcoming publication.  This discussion is about the data submission interface, 
but that interface is actually disabled on the public-facing instance.  The 
other part of the codebase that I was primarily responsible for was the 
advanced search.  Everything else was primarily by other team members.  If you 
would like to check it out, let me know what you think: 
http://tracebase.princeton.edu <http://tracebase.princeton.edu/>

Cheers,
Rob


Robert William Leach
Research Software Engineer
133 Carl C. Icahn Lab
Lewis-Sigler Institute for Integrative Genomics
Princeton University
Princeton, NJ 08544

Reply via email to