Hi Yuri
On Wed, 30-11-2016 at 15:02 +0200, Yuri Benditovich wrote:
> Hello Javier,
> 
> After implementing the pushing thread in the current qxl-wddm-dod, I
> measured the CPU consumption in the PresentDisplayOnly call and in the
> thread that pushes drawables to the device. The results show that the
> time to push drawables is negligible compared to the time spent copying
> dirty rects to the device memory (on average the ratio is about
> 1/500), and in the typical case the pushing thread serves only a single
> 'present' operation and then waits for the next one.
> I tried it on Win10 with 2-3 GB of memory and 1-2 CPUs under regular
> user activity (opening windows, redrawing, scrolling, etc.).
> 
> Am I missing something?
No, sounds ok to me. I wanted to experiment with multithreading, but I
never got this far with my tests. So it's fairly possible that you get
no performance improvement over using just one thread.
>
> Thanks,
> Yuri
> 

> On Fri, Nov 25, 2016 at 11:11 AM, Javier Celaya
> <[email protected]> wrote:
> > Hello Yuri
> >
> > On Fri, 25-11-2016 at 01:08 +0200, Yuri Benditovich wrote:
> > > I'm porting to [qxl-wddm-dod] the set of flexvdi changes
> > > related to execution of 'present display only' events
> > > in a separate thread. There are 2 questions below that I'd like
> > > to ask, to get your opinion.

> > > I see 2 aspects here:
> > > - reliability
> > > - performance
> > >
> > > Reliability:
> > > I see in the flexvdi mailing list an existing report of a
> > > BSOD upon system shutdown. A possible cause is a lack of
> > > synchronization between system flows, hardware availability and
> > > worker thread state (the last patch in flexvdi, 'Terminate working
> > > thread on exit', introduces a termination procedure, but nobody
> > > calls it, as far as I can see).
> > > The lack of synchronization may also cause races in
> > > power management flows and (possibly) when changing the
> > > operating mode.

> > > Question 1:
> > > Do you have any additional recommendations on which
> > > flows should be specially checked for races with the
> > > rendering thread?
> >
> > Unfortunately, the truth is, we have not thoroughly tested our code to
> > remove these races yet. The clients this driver was intended for are
> > still stuck using Windows XP/7, and our development is stalled. So, I
> > cannot think of any situation you should check that you do not know
> > about yet.
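
The shutdown race discussed above (a termination procedure that exists but is never called) can be illustrated with a minimal sketch. This is not the flexvdi or qxl-wddm-dod code: `PushWorker`, `submit` and `stop` are hypothetical names, and Python threads stand in for kernel threads. The point is only the ordering: every teardown path calls `stop()`, and the device is released only after the worker has been joined.

```python
import queue
import threading

_STOP = object()  # sentinel that tells the worker loop to exit


class PushWorker:
    """Hypothetical stand-in for a thread that pushes drawables to a device."""

    def __init__(self, push):
        self._push = push              # callable that hands one drawable to the device
        self._queue = queue.Queue()
        self._lock = threading.Lock()
        self._stopped = False
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def submit(self, drawable):
        # Refuse new work once shutdown has begun, so nothing reaches
        # the device after it has been released.
        with self._lock:
            if self._stopped:
                return False
            self._queue.put(drawable)
            return True

    def stop(self):
        # Intended to be called from every teardown path
        # (shutdown, power-down, mode change). Safe to call twice.
        with self._lock:
            if not self._stopped:
                self._stopped = True
                self._queue.put(_STOP)
        # Work queued before the sentinel is still pushed; only after
        # join() returns is it safe to release the device.
        self._thread.join()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is _STOP:
                break
            self._push(item)
```

Because the sentinel is enqueued under the same lock that guards `submit`, no drawable can be queued behind it, which is the property the missing call to the termination procedure would otherwise leave unguarded.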
> > > Performance:
> > > It looks like the change should not affect the total CPU consumption
> > > of the rendering; it splits more or less the same operations across
> > > 2 different threads. It is still possible that the change can
> > > improve the overall user experience due to faster indication of
> > > operation completion to the OS.
> >
> > We were not trying to reduce total CPU consumption. After all, the
> > driver just copies rects from main memory to VRAM and passes them to
> > the spice server; there is little to reduce there. Rather, we tried to
> > increase the throughput of graphic operations, by not locking the
> > DirectX subsystem while we wait for the spice server to accept new
> > drawables. That is, we do not mind using more CPU if that results in
> > painting faster.
> > On the other hand, I was thinking that maybe we could get the DirectX 
> > subsystem to provide the rects already in VRAM if we described it as a 
> > linear memory segment on driver initialization. In that way, the copying 
> > operation could also be removed. However, I am not sure whether this
> > actually works, or even how to do it; it is just an idea.
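
The decoupling described above (the present call returns without waiting for the device to accept drawables) can be sketched as follows. This is only an illustration under stated assumptions, not the driver's implementation: `AsyncPresenter`, `present`, `drain` and the `device_ready` event are hypothetical names. The `overlapped` counter is also an assumption: it counts present calls that arrive while earlier work is still pending, which is one way to look for the overlap scenario raised in this thread.

```python
import queue
import threading


class AsyncPresenter:
    """Hypothetical model of a present path decoupled from device pushes."""

    def __init__(self, device_ready):
        self._device_ready = device_ready   # threading.Event: device can accept drawables
        self._queue = queue.Queue()
        self._lock = threading.Lock()
        self._pending = 0                   # frames queued or being pushed
        self.overlapped = 0                 # presents that found earlier work pending
        self.pushed = []                    # frames accepted by the (simulated) device
        threading.Thread(target=self._run, daemon=True).start()

    def present(self, frame):
        # The copy of dirty rects would happen here; the call then
        # returns immediately instead of waiting for the device.
        with self._lock:
            if self._pending:
                self.overlapped += 1
            self._pending += 1
        self._queue.put(frame)

    def drain(self):
        # Block until every queued frame has been pushed.
        self._queue.join()

    def _run(self):
        while True:
            frame = self._queue.get()
            self._device_ready.wait()       # the only place that waits on the device
            self.pushed.append(frame)
            with self._lock:
                self._pending -= 1
            self._queue.task_done()
```

If `overlapped` stays at zero under a realistic workload, the worker always finishes before the next present arrives, and the extra thread is unlikely to improve throughput.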
> > 
> > > Question 2:
> > > Do you have any ideas on how to make a quantitative
> > > evaluation of this possible improvement in user experience?

> > > I am thinking about:
> > > - finding scenarios in which we receive rendering calls
> > >   (PresentDisplayOnly) while the worker thread is still processing
> > >   the previous operation. If they exist, this can mean that some
> > >   bottleneck in GDI is resolved.
> > > - writing or obtaining a tool that loads the graphics
> > >   adapter with heavy operations (such as continuous moving of a
> > >   window, scrolling, etc.) while measuring CPU consumption
> >
> > We used a simple tool to measure the performance: it creates a window
> > and continuously issues WM_PAINT events where the full background is
> > filled with color, then measures the number of processed events per
> > second (not CPU). It is quite naive, but it provides a good starting
> > reference, since the tool, with the XDDM QXL driver in Windows 7,
> > outputs almost twice as many paint events as when run in Windows 8
> > with the WDDM QXL driver. There are other measurements you can try to
> > obtain, like how much time it takes until a paint event gets to the
> > spice server queue, ready to be sent to the client (although I'm not
> > sure how to measure it). This delay affects the user perception of
> > performance.
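
The events-per-second measurement described above can be sketched in a few lines. This is not the tool mentioned in the thread: `measure_paint_rate` is a hypothetical name, and the `paint` callable is a placeholder for whatever issues one full-window repaint and returns once it has been processed. The clock is injectable only so the loop can be checked without real timing.

```python
import time


def measure_paint_rate(paint, duration_s, clock=time.perf_counter):
    """Call paint() repeatedly for about duration_s seconds; return paints per second."""
    start = clock()
    now = start
    count = 0
    while now - start < duration_s:
        paint()          # placeholder: one full-background repaint
        count += 1
        now = clock()
    # Divide by the time actually elapsed, which may slightly exceed duration_s.
    return count / (now - start)
```

Running the same `paint` workload against two driver builds and comparing the returned rates gives the kind of rough baseline the thread describes; it measures throughput only, not CPU consumption or end-to-end latency to the client.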
> > >
> > > Please share your thoughts.

> > > Thanks,
> > > Yuri

_______________________________________________
Spice-devel mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/spice-devel
