The synchronization for non-coherent persistent mappings can also be done using:
glMemoryBarrier(GL_CLIENT_MAPPED_BUFFER_BARRIER_BIT);

In which case you don't know the range either.

However, I fully support the addition of coherent persistent mappings to GL. It's perfect for uploading data without the GL API overhead.

Marek

On Thu, Feb 6, 2014 at 12:49 AM, Jose Fonseca <jfons...@vmware.com> wrote:
> I hadn't looked at GL_ARB_buffer_storage. I need to read it more closely, but at a glance it looks like GL_MAP_PERSISTENT_BIT alone is okay (the app must call FlushMappedBufferRange to guarantee coherence), but if GL_MAP_COHERENT_BIT is set we are indeed faced with the same issue... :-(
>
> Even worse, being part of GL 4.4 and there being no way for the implementation to fail GL_MAP_COHERENT_BIT mappings, there is no way to avoid supporting it...
>
> Jose
>
> Note to self: my time would be better spent reviewing extensions before they are ratified than ranting after the fact...
>
>
> ----- Original Message -----
>> However, GL_ARB_buffer_storage (OpenGL 4.4) with GL_MAP_PERSISTENT_BIT isn't much different. The only difference I see between ARB_buffer_storage and AMD_pinned_memory is that AMD_pinned_memory allows mapping CPU memory into the GPU address space permanently, while ARB_buffer_storage allows mapping GPU memory into the CPU address space permanently. At the end of the day, both the GPU and the CPU can read and modify the same buffer, and all they need to use for synchronization is fences.
>>
>> Marek
>>
>> On Wed, Feb 5, 2014 at 8:10 PM, Jose Fonseca <jfons...@vmware.com> wrote:
>> >
>> > ----- Original Message -----
>> >>
>> >> ----- Original Message -----
>> >> > On 05.02.2014 18:08, Jose Fonseca wrote:
>> >> > > I honestly hope that GL_AMD_pinned_memory doesn't become popular. It would have been alright if it wasn't for this bit in http://www.opengl.org/registry/specs/AMD/pinned_memory.txt which says:
>> >> > >
>> >> > >     2) Can the application still use the buffer using the CPU address?
>> >> > >
>> >> > >     RESOLVED: YES. However, this access would be completely non-synchronized to the OpenGL pipeline, unless explicit synchronization is being used (for example, through glFinish or by using sync objects).
>> >> > >
>> >> > > And I'm imagining apps which are streaming vertex data doing precisely that...
>> >> > >
>> >> >
>> >> > I don't understand your concern; this is exactly the same behavior GL_MAP_UNSYNCHRONIZED_BIT has, and apps are supposedly using that properly. How does apitrace handle it?
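To make the pattern under discussion concrete, below is a minimal sketch of streaming vertex data through a non-coherent persistent mapping. It assumes an OpenGL 4.4 core-profile context with a loader such as glad already initialized; the buffer size, the stream_init()/stream_frame() helper names, and the omitted vertex-array setup are illustrative, not anything from the thread.

/*
 * Minimal sketch: non-coherent persistent mapping (GL_ARB_buffer_storage /
 * GL 4.4).  Assumes a core-profile context and an already-initialized
 * loader such as glad; helper names and sizes are illustrative, and
 * vertex-array/attribute setup is omitted.
 */
#include <glad/glad.h>
#include <stdint.h>
#include <string.h>

#define STREAM_SIZE (4 * 1024 * 1024)

static GLuint stream_vbo;
static void  *stream_ptr;   /* valid for the lifetime of the buffer */
static GLsync stream_fence; /* guards reuse of the mapped region */

void stream_init(void)
{
    glGenBuffers(1, &stream_vbo);
    glBindBuffer(GL_ARRAY_BUFFER, stream_vbo);

    /* Immutable storage that may stay mapped while the GL uses it. */
    glBufferStorage(GL_ARRAY_BUFFER, STREAM_SIZE, NULL,
                    GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT);

    /* Map once.  Without GL_MAP_COHERENT_BIT the application must make its
     * writes visible explicitly: glMemoryBarrier() below, or
     * glFlushMappedBufferRange() if GL_MAP_FLUSH_EXPLICIT_BIT were used. */
    stream_ptr = glMapBufferRange(GL_ARRAY_BUFFER, 0, STREAM_SIZE,
                                  GL_MAP_WRITE_BIT | GL_MAP_PERSISTENT_BIT);
}

void stream_frame(const void *vertices, GLsizeiptr size, GLsizei vertex_count)
{
    /* Don't overwrite data the GPU may still be reading. */
    if (stream_fence) {
        glClientWaitSync(stream_fence, GL_SYNC_FLUSH_COMMANDS_BIT, UINT64_MAX);
        glDeleteSync(stream_fence);
    }

    memcpy(stream_ptr, vertices, (size_t)size);

    /* Make the CPU writes visible to subsequent GL commands.  As noted
     * above, the driver learns nothing about which bytes changed, unlike
     * an explicit glFlushMappedBufferRange(). */
    glMemoryBarrier(GL_CLIENT_MAPPED_BUFFER_BARRIER_BIT);

    glDrawArrays(GL_TRIANGLES, 0, vertex_count);

    /* Signal when the GPU has consumed this frame's data. */
    stream_fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
}

A real streamer would sub-allocate ranges and keep several fences in flight rather than serializing on a single one; the sketch only shows where the explicit synchronization calls go.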
>> >>
>> >> GL_AMD_pinned_memory is nothing like GL_ARB_map_buffer_range's GL_MAP_UNSYNCHRONIZED_BIT:
>> >>
>> >> - When an app touches memory returned by glMapBufferRange(GL_MAP_UNSYNCHRONIZED_BIT), it will communicate back to the OpenGL driver which bytes it actually touched via glFlushMappedBufferRange (unless the app doesn't care about performance and doesn't call glFlushMappedBufferRange at all, which is silly, as it forces the OpenGL driver to assume the whole range changed).
>> >>
>> >>   In this case, the OpenGL driver (hence apitrace) should get all the information it needs about which bytes were updated between glMap/glUnmap.
>> >>
>> >> - When an app touches memory bound via GL_AMD_pinned_memory outside glMap/glUnmap, there are _no_ hints whatsoever. The OpenGL driver might not care, as the memory is shared between the CPU and the GPU, so all is good as far as it is concerned, but all the changes the app makes are invisible at the API level, hence apitrace will not be able to catch them unless it resorts to onerous heuristics.
>> >>
>> >> So while both extensions allow unsynchronized access, lack of synchronization is not my concern. My concern is that GL_AMD_pinned_memory allows *hidden* access to GPU memory.
>> >
>> > Just for the record, the challenges GL_AMD_pinned_memory presents to Apitrace are very similar to those of old-fashioned OpenGL user array pointers: an app is free to change the contents of the memory pointed to by user array pointers at any point in time, except during a draw call. This means that before every draw call, Apitrace needs to scavenge all the user memory pointers and write their contents to the trace file, just in case the app changed them.
>> >
>> > In order to support GL_AMD_pinned_memory, for every draw call Apitrace would also need to walk over the bound GL_AMD_pinned_memory buffers (and nowadays there are loads of binding points!), check whether the data changed, and serialize it into the trace file if it did...
>> >
>> > I never cared much about the performance of Apitrace with user array pointers: it is an old paradigm; only old apps use it, or programmers who don't particularly care about performance -- either way, a performance-conscious app developer would use VBOs and hence never hit the problem at all. My displeasure with GL_AMD_pinned_memory is that it essentially flips everything on its head -- it encourages a paradigm which apitrace will never be able to handle properly.
>> >
>> > People often complain that OpenGL development tools are poor compared with Direct3D's. An important fact they often miss is that the Direct3D API is several orders of magnitude more tool-friendly: it's clear that the Direct3D API cares about things like allowing all state to be queried back, whereas OpenGL is more fire-and-forget and never look back -- the main concern in OpenGL is ensuring that state can go from the app to the driver fast, but little thought is given to ensuring that one can read the whole state back, or that one can intercept all state as it goes between the app and the driver...
>> >
>> > In this particular case, if the answer to "Can the application still use the buffer using the CPU address?" had been a NO, the world would be a much better place.
>> >
>> > Jose
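For contrast with the hidden writes discussed above, here is a minimal sketch of the GL_MAP_UNSYNCHRONIZED_BIT plus glFlushMappedBufferRange pattern Jose describes, where the application tells the driver exactly which bytes it touched. It assumes a GL 3.0+ (GL_ARB_map_buffer_range) context with a loader such as glad; the stream_subdata() helper and its use of GL_ARRAY_BUFFER are illustrative.

/*
 * Minimal sketch of the explicit-flush pattern contrasted above with
 * AMD_pinned_memory: the app declares precisely which bytes it modified.
 * Assumes GL_ARB_map_buffer_range (GL 3.0+); names are illustrative.
 */
#include <glad/glad.h>
#include <string.h>

/* Write `size` bytes at `offset` into an already-allocated VBO without an
 * implicit stall, then declare exactly which range was modified. */
void stream_subdata(GLuint vbo, GLintptr offset,
                    const void *data, GLsizeiptr size)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);

    void *ptr = glMapBufferRange(GL_ARRAY_BUFFER, offset, size,
                                 GL_MAP_WRITE_BIT |
                                 GL_MAP_UNSYNCHRONIZED_BIT | /* no implicit wait  */
                                 GL_MAP_FLUSH_EXPLICIT_BIT); /* we flush ourselves */
    if (!ptr)
        return;

    memcpy(ptr, data, (size_t)size);

    /* This call is the hint that AMD_pinned_memory (and a coherent
     * persistent mapping) lacks: the driver -- and a tracer such as
     * apitrace -- now knows precisely which bytes changed.  The offset
     * here is relative to the start of the mapped range. */
    glFlushMappedBufferRange(GL_ARRAY_BUFFER, 0, size);

    glUnmapBuffer(GL_ARRAY_BUFFER);
}

Because GL_MAP_UNSYNCHRONIZED_BIT skips the driver's own synchronization, the application remains responsible (typically via fences) for not overwriting a range the GPU is still reading; the point of the sketch is only that the modified range is visible at the API level.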