Hi Francisco, In all due respect, I am surprised with the volume of justification you put into some changes.
On 28 November 2015 at 11:52, Francisco Jerez <curroje...@riseup.net> wrote: > Francisco Jerez <curroje...@riseup.net> writes: > >> Tom Stellard <thomas.stell...@amd.com> writes: >> >>> When probing for devices, clover will call pipe_loader_probe() twice. >>> The first time to retrieve the number of devices, and then second time >>> to retrieve the device structures. >>> >>> We currently assume that the return value of both calls will be the >>> same, but this will not be the case if a device happens to disappear >>> between the two calls. >>> >> >> Assuming that the two sequential calls will give the same result is bad, >> but the likelihood of a device disappearing between the two calls is so >> small it's unlikely to happen more than once in a million years. If Err... have to love this estimate :-P With some recent chances that probability increases ever so slightly, as we push the "can we find the module, does it have the entry point" from .create_screen and .configure further up to probe screen. >> this problem happens deterministically to you there's almost certainly a >> more serious bug in pipe-loader that will cause it to return an >> incorrect device count during the first call. >> Care to give an example. I'm not sure where/how this is possible. >> Indeed, pipe_loader_sw_probe() seems to return 1 unconditionally if ndev >> is zero, so it will give a different device count depending on ndev if >> pipe_loader_sw_probe_null() happens to fail. Seems like a bug we don't >> want to simply paper over. >> Can you elaborate what exactly the problem here ? If we ask "how many we have" sure we can provide that [fixed] number. On the other hand, if we want more information about it (them), there is no guarantee that we'll succeed. >> Anyway I'm not saying that clover's assumption shouldn't be fixed too -- >> See comment below. >> >>> This patch removes this assumption and checks the return value of the >>> second pipe_loader_probe() call to ensure it does not try to initialize >>> devices that no longer exits. >>> >>> CC: <mesa-sta...@lists.freedesktop.org> >>> --- >>> src/gallium/state_trackers/clover/core/platform.cpp | 5 +++-- >>> 1 file changed, 3 insertions(+), 2 deletions(-) >>> >>> diff --git a/src/gallium/state_trackers/clover/core/platform.cpp >>> b/src/gallium/state_trackers/clover/core/platform.cpp >>> index 328b71c..689d692 100644 >>> --- a/src/gallium/state_trackers/clover/core/platform.cpp >>> +++ b/src/gallium/state_trackers/clover/core/platform.cpp >>> @@ -28,9 +28,10 @@ platform::platform() : adaptor_range(evals(), devs) { >>> int n = pipe_loader_probe(NULL, 0); >>> std::vector<pipe_loader_device *> ldevs(n); >>> >>> - pipe_loader_probe(&ldevs.front(), n); >>> + n = pipe_loader_probe(&ldevs.front(), n); >>> >>> - for (pipe_loader_device *ldev : ldevs) { >>> + for (int i = 0; i < n; ++i) { >>> + pipe_loader_device *ldev = ldevs[i]; >> >> There are many reasons to prefer range loops to induction variables >> (including iterators), and there's no need to change it here: Because >> ldevs is zero-initialized you can just wrap the push_back(create()) call >> around an 'if (ldev)' to make this a one-liner. >> > Let me be more explicit about what could go wrong with this change in > its current form: The return value of pipe_loader_probe() is the number > of devices available on the system, so basically the same race condition > is still possible if 'n' increases from the first call to the second, > and in that case you'll end up reading uninitialized memory past the end > of the array -- Worse than a null dereference. Range loops make this > kind of situation impossible. > If I understood you correctly, you're saying that on the second call can return count greater than the initial one. Errm... this looks like a bug in the pipe-loader. I don't think it's normal to claim that X devices are available only to return information for fewer devices. OK, calling it a bug might not be politically correct, although it is a misleading design decision, at the very least. -Emil _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev