On Tue, 8 Aug 2023 17:54:03 GMT, Marius Hanl <mh...@openjdk.org> wrote:

>> Hi,
>> 
>> I did open the bug report. Some notes to this PR:
>> 
>> My colleagues and I are able to reproduce this bug regularly, even though it 
>> takes sometimes up to 3 or 4 weeks until the D3DERR_DEVICEHUNG error shows 
>> up. We are currently evaluating two versions of fixes, but until now we do 
>> not have any results. I will post them as soon as I got them.
>> 
>> Version 1 (this version): Based on the observation, that the 
>> TestCooperativeLevel/CheckDeviceState method returns D3D_OK again after 
>> about 20 - 60 seconds, the reinitialize is called after the first time the 
>> state is returning D3D_OK. The 'isHung' flag stores the information until 
>> then.
>> 
>> Version 2: calls reinitialize directly after D3DERR_DEVICEHUNG has been 
>> returned. Basically
>> if (hr == D3DERR_DEVICEREMOVED || hr == D3DERR_DEVICEHUNG  ) { .. }
>> 
>> I did not modify the validatePresent method, as for our workaround (see 
>> ticket) it was not necessary. At least the native call swapchain->present 
>> does not return that error code 
>> (https://learn.microsoft.com/en-us/windows/win32/api/d3d9/nf-d3d9-idirect3dswapchain9-present).
>>  I did not look decisively into all the native calls behind 
>> D3DRTTexture#readPixels.
>> 
>> As I said I will post the results (prism.verbose output) for the 2 versions 
>> later as a base for discussions.
>
> As I also worked/checked these classes in 
> https://github.com/openjdk/jfx/pull/1200, I now have a much better 
> understanding of the code (and the communication with Direct3d9) and agree, 
> this looks like the right thing to do in this situation.

Thank you @Maran23 for taking a look into this! Sadly it did not work out as 
expected.

Our observations with the current proposal ("Version 1"): When the error occurred, 
the state did not go into D3D_OK again, but it stayed at D3DERR_DEVICEHUNG. The 
D3DERR_DEVICEHUNG error message was printed over and over (for > 1 day).

For "Version 2" (directly calling reinitialize after the error): One day we 
saw the Prism initialization text 3 times in a row and the app was running 
fine. As it happened during a holiday week, we are unsure if there were 3 
distinct errors or if there was only one error that took 3 attempts/iterations. 
(We turned application logging off to not pollute the console log.)
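
To make the difference between the two variants concrete, here is a minimal, 
self-contained sketch. It is not the code from the PR: the class names, the 
FakePipeline counter and the Hr enum are hypothetical stand-ins, and the sketch 
does not model whether a reinitialize actually clears the hung state.

```java
import java.util.List;

class DeviceHungSketch {
    // Simplified stand-ins for the Direct3D 9 result codes named in this thread.
    enum Hr { D3D_OK, D3DERR_DEVICEREMOVED, D3DERR_DEVICEHUNG }

    // Hypothetical stand-in for the pipeline; it only counts reinitialize calls.
    static class FakePipeline {
        int reinitCount = 0;
        void reinitialize() { reinitCount++; }
    }

    // "Version 1": remember the hang and reinitialize on the first D3D_OK afterwards.
    static class Version1 {
        private final FakePipeline pipeline;
        private boolean isHung = false;

        Version1(FakePipeline pipeline) { this.pipeline = pipeline; }

        void onDeviceState(Hr hr) {
            if (hr == Hr.D3DERR_DEVICEHUNG) {
                isHung = true;               // only record the hang, do nothing yet
            } else if (hr == Hr.D3D_OK && isHung) {
                isHung = false;              // state recovered, now reinitialize
                pipeline.reinitialize();
            } else if (hr == Hr.D3DERR_DEVICEREMOVED) {
                pipeline.reinitialize();
            }
        }
    }

    // "Version 2": treat a hung device like a removed one and reinitialize at once.
    static class Version2 {
        private final FakePipeline pipeline;

        Version2(FakePipeline pipeline) { this.pipeline = pipeline; }

        void onDeviceState(Hr hr) {
            if (hr == Hr.D3DERR_DEVICEREMOVED || hr == Hr.D3DERR_DEVICEHUNG) {
                pipeline.reinitialize();
            }
        }
    }

    public static void main(String[] args) {
        // A device state that never leaves D3DERR_DEVICEHUNG.
        List<Hr> stuck = List.of(Hr.D3DERR_DEVICEHUNG, Hr.D3DERR_DEVICEHUNG,
                                 Hr.D3DERR_DEVICEHUNG);

        FakePipeline p1 = new FakePipeline();
        Version1 v1 = new Version1(p1);
        FakePipeline p2 = new FakePipeline();
        Version2 v2 = new Version2(p2);

        for (Hr hr : stuck) {
            v1.onDeviceState(hr);
            v2.onDeviceState(hr);
        }

        System.out.println("Version 1 reinitialize calls: " + p1.reinitCount);
        System.out.println("Version 2 reinitialize calls: " + p2.reinitCount);
    }
}
```

With a state that stays at D3DERR_DEVICEHUNG, the "Version 1" path never reaches 
reinitialize, which is consistent with the error being printed over and over.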

It is especially unfortunate that the current proposal did not work out, 
because it means either that we did not analyze the failure correctly or that 
the error behavior is not that predictable. Either way, we are now re-running 
our workaround version with the 5-minute loop (see bug ticket), this time with 
timestamps, as well as "Version 2", also with additional timestamp information.
We are going to rerun both versions multiple times to get somewhat reliable 
information. I'll post the findings here.

-------------

PR Comment: https://git.openjdk.org/jfx/pull/1199#issuecomment-1704297025
