On 2/7/25 10:20, Robert Leach wrote:



Anyway, thanks so much for your help.  This discussion has been very useful, and I think I will proceed at first exactly as you suggested, by queuing every validation job (using Celery).  Then I will explore whether or not I can apply the "on timeout" strategy in a small patch.

Incidentally, during our Wednesday meeting this week, we actually opened our public instance to the world for the first time, in preparation for the upcoming publication.  This discussion is about the data submission interface, but that interface is actually disabled on the public-facing instance.  The other part of the codebase that I was primarily responsible for was the advanced search.  Everything else was primarily by other team members.  If you would like to check it out, let me know what you think: http://tracebase.princeton.edu

I would have to hit the books again to understand all of what is going on here. One quibble with the Download tab: there is no indication of the size of the datasets. I generally like to know what I am getting into before I start a download. Also, is there explicit throttling going on? I am seeing 10.2kb/sec, whereas from https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page I downloaded a 47.65M file at 41.9MB/s.


Cheers,
Rob


Robert William Leach
Research Software Engineer
133 Carl C. Icahn Lab
Lewis-Sigler Institute for Integrative Genomics
Princeton University
Princeton, NJ 08544


--
Adrian Klaver
adrian.kla...@aklaver.com


