Hi,
I've written a piece of software that fetches tiles from a single
MBTiles raster using multiple threads. The tiles go into a common cache,
which a "view" thread uses to assemble them on the screen into a
seamless map.

Each fetcher thread calls GDALOpenEx() and then RasterIO(GF_Read,,,) to
get the individual RGB bands out of the PNG tiles stored in the source
MBTiles db. At the lowest level,
GDALGPKGMBTilesLikePseudoDataset::ReadTile() does a sqlite3 "SELECT
tile_data FROM tiles..." with a db context that is unique to each
fetcher thread.
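
To illustrate the "one db context per fetcher thread" idea, here is a
minimal sketch. It uses Python's standard sqlite3 module directly rather
than going through GDAL, so it is not what my app or the GDAL driver
actually does. The table layout follows the MBTiles "tiles" table, the
function names (make_demo_mbtiles, fetch_tiles, load_all, run_demo) are
hypothetical, and the tile payloads are placeholder bytes, not real PNGs.

```python
import os
import sqlite3
import tempfile
import threading


def make_demo_mbtiles(path, n=4):
    """Create a tiny MBTiles-like file with fake payloads (assumption: not PNGs)."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE tiles (zoom_level INTEGER, tile_column INTEGER, "
        "tile_row INTEGER, tile_data BLOB)"
    )
    rows = [(2, x, y, f"tile-{x}-{y}".encode()) for x in range(n) for y in range(n)]
    con.executemany("INSERT INTO tiles VALUES (?, ?, ?, ?)", rows)
    con.commit()
    con.close()


def fetch_tiles(path, keys, cache, cache_lock):
    """Fetcher thread body: private sqlite connection, shared tile cache."""
    con = sqlite3.connect(path)  # opened and used only by this thread
    try:
        for z, x, y in keys:
            row = con.execute(
                "SELECT tile_data FROM tiles WHERE zoom_level = ? "
                "AND tile_column = ? AND tile_row = ?",
                (z, x, y),
            ).fetchone()
            if row is not None:
                with cache_lock:  # only the shared cache needs a lock
                    cache[(z, x, y)] = row[0]
    finally:
        con.close()


def load_all(path, keys, n_threads=4):
    """Split the tile keys across n_threads fetchers and return the filled cache."""
    cache, cache_lock = {}, threading.Lock()
    chunks = [keys[i::n_threads] for i in range(n_threads)]
    threads = [
        threading.Thread(target=fetch_tiles, args=(path, chunk, cache, cache_lock))
        for chunk in chunks
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache


def run_demo():
    """Build a throwaway file, read every tile concurrently, return the cache."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.mbtiles")
        make_demo_mbtiles(path)
        keys = [(2, x, y) for x in range(4) for y in range(4)]
        return load_all(path, keys)


if __name__ == "__main__":
    print(len(run_demo()), "tiles cached")
```

The point of the sketch is that no connection is ever shared between
threads; the only shared state is the cache, which is guarded by a lock.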

To recap, a multithreaded raster read from a single raster source is
doable, probably made easier by the sqlite raster container that I use
in my app.

Best regards,
Deyan




On Mon, Jun 3, 2024 at 8:04 PM Even Rouault via gdal-dev
<gdal-dev@lists.osgeo.org> wrote:
>
> Andrew,
>
> what would be the purpose of thread-safe access: just making it thread-safe
> without any particular requirement on how efficient this would be (1), or
> hoping for true concurrent access, ideally with close to linear scalability
> in the number of threads (2)?
>
> If (1), then we could add a GDALMutexedDataset class, similar to
> https://github.com/OSGeo/gdal/blob/master/ogr/ogrsf_frmts/generic/ogrmutexeddatasource.h
> which exists on the vector side (only used by the FileGDB driver, because
> the underlying SDK is not even re-entrant) and which uses the decorator
> pattern around all public API entry points to call the underlying dataset
> under a mutex. One could imagine having a GDAL_OF_THREADSAFE open flag that
> GDALOpen() would use to return such an instance. It shouldn't be too hard to
> implement, but is probably not that useful IMHO. I anticipate that most users
> would have higher expectations than a mutex-based implementation.
>
> If (2), it seems to me that it would require a huge effort, and the
> programming language we use (C++) offers hardly any safety belt to make sure
> we don't make mistakes, the main ones being forgetting to lock things that
> should be locked, or deadlock situations. If we go down that route, I'm not
> even sure how we could reliably identify all the parts of the code that must
> be modified.
>
> Neither the GDAL raster core nor any driver is designed to be thread-safe. In
> the core, at least gcore/gdalarraybandblockcache.cpp and
> gcore/gdalhashsetbandblockcache.cpp, which interact with the block cache,
> would have to be made thread-safe, and "just" adding a lock would defeat the
> aim of linear scalability. The change in GDALDataset::RasterIO() I made in
> https://github.com/OSGeo/gdal/commit/7f3a0e582eb189744bc7cb8e4a751135edaecaf5
> isn't thread-safe either (though it would be easy to make thread-safe).
>
> Once GDAL raster code is ready, the main challenge is making the drivers
> themselves thread-safe. Raster drivers may read directly from a VSILFILE*
> handle, which isn't thread-safe when using the standard Seek() + Read() pair
> (a few VSIVirtualFileSystem implementations have a PRead() method, which is
> thread-safe, but not all). Or they rely on some instance of a "reader"
> returned by a third-party library (libtiff, libjpeg, libpng, sqlite3, etc.),
> which in most cases also uses a VSILFILE*. None of these are thread-safe
> (except sqlite3, which can be made thread-safe by passing a flag at
> sqlite3_open() time; that basically applies strategy (1) by protecting
> all calls with a mutex). Perhaps using thread-specific instances of VSILFILE*
> and of third-party "reader" objects could be a way of solving this. But
> realistically, doing a pass over all GDAL drivers would be a multi-man-month
> to multi-man-year type of effort. A realistic plan should be designed to allow
> combining (1) and (2): (2) for a few select drivers, and (1) as a fallback
> for most drivers that wouldn't be updated.
>
> Even
>
> On 03/06/2024 at 15:44, Andrew Bell via gdal-dev wrote:
>
> Hi,
>
> I am aware that there isn't thread-safe raster access with the current GDAL 
> interface for various reasons. Given the state of processors, I was wondering 
> if it would be valuable to take a look at providing the ability to do Raster 
> I/O (at least reads) in a thread-safe way. This could be done through a new 
> set of API calls or perhaps by modifications to what currently exists -- I 
> don't know what makes sense at this point. I would be happy to spend some 
> time looking at this if there is interest, but I would also like to learn 
> from existing experience as to what kinds of things that I'm surely not 
> considering would have to be dealt with.
>
> Thanks,
>
> --
> Andrew Bell
> andrew.bell...@gmail.com
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev@lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.
>
