Re: [Intel-gfx] gem clflush optimization for media encoding

2011-06-23 Thread Zou, Nanhai
>>-----Original Message-----
>>From: Jesse Barnes [mailto:jbar...@virtuousgeek.org]
>>Sent: June 24, 2011 1:20
>>To: Zou, Nanhai
>>Cc: Keith Packard; intel-gfx@lists.freedesktop.org; Anholt, Eric
>>Subject: Re: [Intel-gfx] gem clflush optimization for media enc

Re: [Intel-gfx] gem clflush optimization for media encoding

2011-06-23 Thread Jesse Barnes
On Wed, 22 Jun 2011 12:29:21 +0800 "Zou, Nanhai" wrote:
> map_gtt in current gem is super slow.
> I've tried map_gtt but it seems that the speed is unacceptable.
>
> >>> Since it is CPU read only surface, clflush is not needed at all.
> >>
> >>You'd still have to invalidate cache l

Re: [Intel-gfx] gem clflush optimization for media encoding

2011-06-22 Thread Chris Wilson
On Wed, 22 Jun 2011 09:20:35 -0700, Keith Packard wrote:
> On Wed, 22 Jun 2011 08:29:24 +0200, Daniel Vetter wrote:
>
> > The important thing is that you may never use the cpu mappings with
> > these functions (for objects of similar size). Because libdrm reuses
> > bos without checking their do

Re: [Intel-gfx] gem clflush optimization for media encoding

2011-06-22 Thread Keith Packard
On Wed, 22 Jun 2011 08:29:24 +0200, Daniel Vetter wrote:
> The important thing is that you may never use the cpu mappings with
> these functions (for objects of similar size). Because libdrm reuses
> bos without checking their domain, you'll get tons of unnecessary
> clflush even on objects that

Re: [Intel-gfx] gem clflush optimization for media encoding

2011-06-22 Thread Keith Packard
On Wed, 22 Jun 2011 12:29:21 +0800, "Zou, Nanhai" wrote:
> As I understand,
> with movnti + sfence, data should surely reach memory. Cache should be
> coherent in this case.

I wouldn't mind seeing additional experiments in this area, but when Eric and I tried this a couple of years ago, w

Re: [Intel-gfx] gem clflush optimization for media encoding

2011-06-21 Thread Daniel Vetter
2011/6/22 Zou, Nanhai:
> map_gtt in current gem is super slow.
> I've tried map_gtt but it seems that the speed is unacceptable.

map_gtt should be pretty fast for large things on the upload side. For the gpu->cpu download, have you tried pread? btw, the counterpart (pwrite) also beat

Re: [Intel-gfx] gem clflush optimization for media encoding

2011-06-21 Thread Zou, Nanhai
s.freedesktop.org
>>Cc: Anholt, Eric
>>Subject: Re: [Intel-gfx] gem clflush optimization for media encoding
>>
>>
>>
>>>>-----Original Message-----
>>>>From: Keith Packard [mailto:kei...@keithp.com]
>>>>Sent: June 22, 2011 12:14
>

Re: [Intel-gfx] gem clflush optimization for media encoding

2011-06-21 Thread Zou, Nanhai
>>-----Original Message-----
>>From: Keith Packard [mailto:kei...@keithp.com]
>>Sent: June 22, 2011 12:14
>>To: Zou, Nanhai; intel-gfx@lists.freedesktop.org
>>Cc: Anholt, Eric
>>Subject: Re: [Intel-gfx] gem clflush optimization for media encoding
>>

Re: [Intel-gfx] gem clflush optimization for media encoding

2011-06-21 Thread Keith Packard
On Wed, 22 Jun 2011 11:13:09 +0800, "Zou, Nanhai" wrote:
> If I upload input buffer with movnti or movntdq (bypass cache) +
> sfence(clear write combine buffer) in the end, clflush should
> not be needed.

Alas, neither of these will flush existing cached data, so you must still
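
For readers following the exchange above, here is a minimal userspace sketch in C of the two operations being compared: an upload using non-temporal stores (movnti) followed by sfence, and a separate clflush loop over a range. It is not code from the thread or from the i915 driver. The function names `upload_nt` and `flush_range` and the 64-byte `CACHE_LINE` value are assumptions made for illustration, and the sketch takes no position on whether the flush can be skipped for GEM objects, which is the question under discussion.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: _mm_stream_si32, _mm_clflush, _mm_sfence, _mm_mfence */
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; real code should query the CPU (e.g. via CPUID). */
#define CACHE_LINE 64

/* Hypothetical helper: write back and invalidate every cache line that
 * overlaps [addr, addr + len), with fences around the clflush loop. */
void flush_range(const void *addr, size_t len)
{
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)addr + len;

    _mm_mfence();
    for (; p < end; p += CACHE_LINE)
        _mm_clflush((const void *)p);
    _mm_mfence();
}

/* Hypothetical helper: copy 32-bit words with non-temporal stores (movnti),
 * then issue sfence so the write-combining buffers are drained. */
void upload_nt(int32_t *dst, const int32_t *src, size_t count)
{
    /* movnti of a 32-bit value expects a naturally aligned destination. */
    assert(((uintptr_t)dst & 3) == 0);

    for (size_t i = 0; i < count; i++)
        _mm_stream_si32((int *)&dst[i], (int)src[i]);
    _mm_sfence();
}
```

In this sketch a caller would run `flush_range(dst, size)` once before the first `upload_nt` into a buffer that may already have lines in the CPU cache, which is the situation Keith's reply is concerned with. Later uploads into the same buffer then use only the streaming path.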