That's a good idea, but I'll have to think about it a bit. It seems relatively straightforward, but I'd be doing this in bash, so I'd like to come up with an implementation that is not overly complicated. Do you have a job offhand that shows the issue?
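
As a rough illustration of the idea only, not the harness's actual code, a retry wrapper in bash might look like the sketch below. The names run_with_retry, RETRY_CODES, MAX_TRIES, and RETRY_DELAY are hypothetical, and the exit codes 96 and 97 are placeholders for whatever values a resource-starved GPU run really returns.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: rerun a command when its exit code is on a retry list.
# All names and default values here are illustrative assumptions, not
# settings taken from the test harness.

RETRY_CODES="${RETRY_CODES:-96 97}"   # placeholder "resource unavailable" codes
MAX_TRIES="${MAX_TRIES:-3}"           # total attempts, including the first
RETRY_DELAY="${RETRY_DELAY:-5}"       # seconds to wait between attempts

run_with_retry() {
  local attempt=1 rc code retryable
  while :; do
    # Run the command; on failure, $? in the else branch is its exit code.
    if "$@"; then
      return 0
    else
      rc=$?
    fi

    # Check whether this exit code is one we are willing to retry.
    retryable=no
    for code in $RETRY_CODES; do
      if [ "$rc" -eq "$code" ]; then
        retryable=yes
        break
      fi
    done

    # Give up on non-retryable codes or once the attempts are used up,
    # passing the original exit code back to the caller.
    if [ "$retryable" = no ] || [ "$attempt" -ge "$MAX_TRIES" ]; then
      return "$rc"
    fi

    echo "exit code $rc: retry $attempt of $((MAX_TRIES - 1)) in ${RETRY_DELAY}s" >&2
    sleep "$RETRY_DELAY"
    attempt=$((attempt + 1))
  done
}
```

A test would then be launched as, for example, `run_with_retry ./mytest args` (where ./mytest is a stand-in name), and any exit code not on the list still fails right away, so genuine test failures are not hidden.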

Scott


On 9/4/20 10:27 AM, Barry Smith wrote:
   Scott,

    How difficult would it be for the test harness to run a failed test again 
if the return code has specific values, instead of erroring out?

    I am thinking in particular about GPUs, but it is general. If the GPU 
doesn't have the resources available, it will error out, crashing the entire 
job in the pipeline and requiring the job to be retried from the GUI, wasting 
everyone's time.

    In theory it seems like it should be pretty straightforward but, of course, 
unforeseen issues can make it difficult. Just check the program's error code 
and, if it is one of certain values, run the program again, or wait a few seconds and run

   Barry


Issues are still broken, hence posting here.

--
Tech-X Corporation               [email protected]
5621 Arapahoe Ave, Suite A       Phone: (720) 974-1841
Boulder, CO 80303                Fax:   (303) 448-7756
