On 08/14/2014 09:15 PM, Jerome Glisse wrote:
> On Thu, Aug 14, 2014 at 08:47:16PM +0200, Daniel Vetter wrote:
>> On Thu, Aug 14, 2014 at 8:18 PM, Jerome Glisse <j.glisse at gmail.com> wrote:
>>> Sucks because you can not do weird synchronization like one i depicted in
>>> another mail in this thread and for as long as cmdbuf_ioctl do not give you
>>> fence|syncpt you can not such thing cleanly in non hackish way.
>> Actually i915 can soon will do that that.
> So you will return fence|syncpoint with each cmdbuf_ioctl ?
>
>>> Sucks because you have a fence object per buffer object and thus overhead
>>> grow with the number of objects. Not even mentioning fence lifetime issue.
>>>
>>> Sucks because sub-buffer allocation is just one of many tricks that can not
>>> be achieved properly and cleanly with implicit sync.
>>>
>>> ...
>> Well I heard all those reasons and I'm well of aware of them. The
>> problem is that with current hardware the kernel needs to know for
>> each buffer how long it needs to be kept around since hw just can't do
>> page faulting. Yeah you can pin them but for an uma design that
>> doesn't go down well with folks.
> I am not thinking with fancy hw in mind, on contrary i thought about all
> this with the crappiest hw i could think of, in mind.
>
> Yes you can get rid of fence and not have to pin memory with current hw.
> What matter for unpinning is to know that all hw block are done using the
> memory. This is easily achievable with your beloved seqno. Have one seqno
> per driver (one driver can have different block 3d, video decoding, crtc,
> ...) each time a buffer is use as part of a command on one block inc the
> common seqno and tag the buffer with that number. Have each hw block write
> the lastest seqno that is done to a per block location. Now to determine
> is buffer is done compare the buffer seqno with the max of all the signaled
> seqno of all blocks.
>
> Cost 1 uint32 per buffer and simple if without locking to check status of
> a buffer.

Hmm?
The trivial and first use of fence objects in the Linux DRM was
triggered by the fact that a 32-bit seqno wraps pretty quickly and a
32-bit solution just can't be made robust.

Now a 64-bit seqno will probably be robust for the foreseeable future,
but when it comes to implementing that on 32-bit hardware and comparing
it to a simple fence object approach,

/Thomas
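
To make the wraparound concern concrete, here is a minimal sketch, assuming a single free-running 32-bit counter as in the quoted proposal. The helper names (seqno_passed_naive, seqno_passed_wrap, seqno_examples) are hypothetical and not taken from any DRM driver. It shows that a plain comparison fails as soon as the counter wraps, and that the usual modular-difference comparison only holds while the two values are less than 2^31 apart, which is one way to read "can't be made robust".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not from any real driver. */

/* Naive check: treats seqnos as plain integers, so it gives the wrong
 * answer once the 32-bit counter has wrapped past zero. */
bool seqno_passed_naive(uint32_t signaled, uint32_t seqno)
{
    return signaled >= seqno;
}

/* Wrap-tolerant check using modular arithmetic. It is only meaningful
 * while signaled and seqno are less than 2^31 increments apart. */
bool seqno_passed_wrap(uint32_t signaled, uint32_t seqno)
{
    return (uint32_t)(signaled - seqno) < 0x80000000u;
}

void seqno_examples(void)
{
    /* Buffer tagged just before the wrap, hardware signaled just after:
     * the work is finished, but the naive check says it is not. */
    assert(!seqno_passed_naive(5u, 0xFFFFFFF0u));
    assert(seqno_passed_wrap(5u, 0xFFFFFFF0u));

    /* A buffer left idle for 2^31 increments: the work finished long
     * ago, yet the wrap-tolerant check now reports it as pending. */
    assert(!seqno_passed_wrap(0x8000000Au, 10u));
}
```

The second case is the awkward one: a buffer that sits idle long enough needs some extra mechanism (for example, periodically retiring old tags) before its stored seqno becomes ambiguous.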