Hi Yuri

On Wed, 30-11-2016 at 15:02 +0200, Yuri Benditovich wrote:
> Hello Javier,
>
> After implementing the pushing thread in the current qxl-wddm-dod, I
> measured the CPU consumption in the PresentDisplayOnly call and in the
> thread that pushes drawables to the device. The results show that the
> time to push drawables is negligible compared with the time spent
> copying dirty rects to device memory (on average the ratio is
> ~1/500), and in the typical case the pushing thread serves only a
> single 'present' operation and then waits for the next one.
> I tried it on Win10 with 2-3 GB of memory and 1-2 CPUs under regular
> user activity (opening windows, redrawing, scrolling, etc.)
>
> Am I missing something?

No, sounds ok to me. I wanted to experiment with multithreading, but I
never got this far with my tests. So it's quite possible that you get no
performance improvement over using just one thread.

>
> Thanks,
> Yuri
>
> On Fri, Nov 25, 2016 at 11:11 AM, Javier Celaya
> <[email protected]> wrote:
> > Hello Yuri
> >
> > On Fri, 25-11-2016 at 01:08 +0200, Yuri Benditovich wrote:
> > > I'm porting to [qxl-wddm-dod] the set of flexvdi changes
> > > related to the execution of 'present display only' events
> > > in a separate thread. There are 2 questions below on which
> > > I'd like to know your opinion.
> > > I see 2 aspects there:
> > > - reliability
> > > - performance
> > >
> > > Reliability:
> > > I see in the flexvdi mailing list an existing report of a
> > > BSOD upon system shutdown. A possible cause is a lack of
> > > synchronization between system flows, hardware availability and
> > > worker thread state
> > > (the last patch in flexvdi, 'Terminate working thread on exit',
> > > introduces a termination procedure, but nobody calls it,
> > > as far as I can see).
> > > The lack of synchronization may also cause races in
> > > power management flows and (possibly) on changing the
> > > operating mode.
> > > Question 1:
> > > Do you have any additional recommendations on which
> > > flows should be specially checked for races with the
> > > rendering thread?
> >
> > Unfortunately, the truth is, we have not thoroughly tested our code to
> > remove these races yet. The clients this driver was intended for are
> > still stuck using Windows XP/7, and our development is stalled. So, I
> > cannot think of any situation you should check that you do not know
> > about yet.
> >
> > > Performance:
> > > It looks like the change should not affect total CPU consumption for
> > > the rendering; it splits more or less the same operations over
> > > 2 different threads.
> > > It is still possible that the change can improve the
> > > overall user experience due to faster indication of operation
> > > completion to the OS.
> >
> > We were not trying to reduce total CPU consumption. After all, the
> > driver just copies rects from main memory to VRAM and passes them to
> > the spice server; there is little to reduce there. Rather, we tried to
> > increase the throughput of graphic operations, by not locking the
> > DirectX subsystem while we wait for the spice server to accept new
> > drawables. That is, we do not mind using more CPU if that results in
> > painting faster.
> > On the other hand, I was thinking that maybe we could get the DirectX
> > subsystem to provide the rects already in VRAM if we described it as a
> > linear memory segment on driver initialization. In that way, the copying
> > operation could also be removed. However, I am not sure if this actually
> > works or even how to do it; it is just an idea.
> >
> > > Question 2:
> > > Do you have any ideas on how to make a quantitative
> > > evaluation of this possible improvement of user experience?
> > > I am thinking about:
> > > - finding scenarios where we receive rendering calls
> > >   (PresentDisplayOnly) while the worker thread is still processing
> > >   the previous operation. If they exist, this can mean that some
> > >   bottleneck in GDI has been solved.
> > > - writing or getting a tool that loads the graphics adapter with
> > >   heavy operations (like continuously moving a window, scrolling,
> > >   etc.) with CPU consumption measurement
> >
> > We used a simple tool to measure the performance: it creates a window
> > and continuously issues WM_PAINT events where the full background is
> > filled with color, then measures the number of processed events per
> > second (not CPU). It is quite naive, but it provides a good starting
> > reference, since the tool, with the XDDM QXL driver in Windows 7,
> > outputs almost twice as many paint events as executing it in Windows 8
> > with the WDDM QXL driver. There are other measurements you can try to
> > obtain, like how much time it takes until a paint event gets to the
> > spice server queue, ready to be sent to the client (although I'm not
> > sure how to measure it). This delay affects the user perception of
> > performance.
> >
> > > Please share your thoughts.
> > > Thanks,
> > > Yuri
> > >
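
For reference, the "processed events per second" measurement described in the quoted text can be sketched roughly as below. This is not the actual tool, which creates a Win32 window and issues WM_PAINT events; here the paint operation is replaced by an arbitrary callback so the counting logic can be run anywhere. The names measure_event_rate, process_event, duration_s and clock are hypothetical and chosen only for this illustration.

```python
import time


def measure_event_rate(process_event, duration_s=1.0, clock=time.perf_counter):
    """Run process_event() back to back for at least duration_s seconds.

    Returns the number of completed events per second. The clock is
    injectable (an assumption of this sketch) so the loop can be
    checked with a fake time source.
    """
    start = clock()
    count = 0
    while True:
        process_event()  # stands in for one full-window repaint
        count += 1
        elapsed = clock() - start
        if elapsed >= duration_s:
            # Divide by the real elapsed time, not the requested one,
            # because the last event may overshoot the window.
            return count / elapsed


def fake_paint():
    """Hypothetical stand-in for a repaint: burn a little CPU."""
    sum(range(1000))


if __name__ == "__main__":
    rate = measure_event_rate(fake_paint, duration_s=0.5)
    print(f"{rate:.0f} events/s")
```

Running the same loop against two driver setups and comparing the two rates gives the kind of relative figure mentioned above. It measures throughput only, not the delay until a paint reaches the spice server queue.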
_______________________________________________ Spice-devel mailing list [email protected] https://lists.freedesktop.org/mailman/listinfo/spice-devel
