Hi,

I've written a piece of software in which multiple threads fetch tiles from a single MBTiles raster. The tiles go into a common cache, which a "view" thread uses to assemble them on screen into a seamless map.
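As a rough illustration only (this is not the actual application code), the fetcher/cache arrangement could be sketched in Python with just the standard library. The table layout, the placeholder tile payloads, and the names make_demo_mbtiles, TileCache, fetcher and run_demo are all assumptions made for the example; a plain sqlite3 connection per thread stands in for whatever per-thread handle the real fetchers hold.

```python
import os
import queue
import sqlite3
import tempfile
import threading


def make_demo_mbtiles(path, n=4):
    """Create a tiny MBTiles-like file holding n x n placeholder tiles at zoom 2."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE tiles (zoom_level INTEGER, tile_column INTEGER, "
        "tile_row INTEGER, tile_data BLOB)"
    )
    rows = [(2, x, y, f"tile-{x}-{y}".encode()) for x in range(n) for y in range(n)]
    con.executemany("INSERT INTO tiles VALUES (?, ?, ?, ?)", rows)
    con.commit()
    con.close()


class TileCache:
    """Common cache shared by all threads; one lock guards the dict."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tiles = {}

    def put(self, key, data):
        with self._lock:
            self._tiles[key] = data

    def get(self, key):
        with self._lock:
            return self._tiles.get(key)

    def __len__(self):
        with self._lock:
            return len(self._tiles)


def fetcher(path, requests, cache):
    """Worker: opens its OWN connection, so no database state is shared."""
    con = sqlite3.connect(path)
    try:
        while True:
            key = requests.get()
            if key is None:  # sentinel: no more work for this thread
                break
            row = con.execute(
                "SELECT tile_data FROM tiles WHERE zoom_level=? "
                "AND tile_column=? AND tile_row=?",
                key,
            ).fetchone()
            if row is not None:
                cache.put(key, row[0])
    finally:
        con.close()


def run_demo(n_threads=4, n=4):
    """Fill the shared cache from one file using n_threads fetcher threads."""
    cache = TileCache()
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.mbtiles")
        make_demo_mbtiles(path, n)
        requests = queue.Queue()
        for x in range(n):
            for y in range(n):
                requests.put((2, x, y))
        for _ in range(n_threads):
            requests.put(None)  # one sentinel per fetcher
        threads = [
            threading.Thread(target=fetcher, args=(path, requests, cache))
            for _ in range(n_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return cache
```

Calling run_demo() returns the filled cache, which a "view" thread could then query with cache.get((zoom, column, row)). The only shared mutable state is the cache dict behind its lock; each fetcher's connection is created and used in that fetcher's thread alone.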
Each fetcher thread uses GDALOpenEx() and then RasterIO(GF_Read,,,) to get individual RGB bands out of the PNG tiles stored in the source MBTiles db. At the lowest level, GDALGPKGMBTilesLikePseudoDataset::ReadTile() does a sqlite3 "SELECT tile_data FROM tiles..." with a db context which is unique to every fetcher thread.

To recap: a multithreaded raster read on a single raster source is doable, probably made easier by the sqlite raster container that I use in my app.

Best regards,
Deyan

On Mon, Jun 3, 2024 at 8:04 PM Even Rouault via gdal-dev <gdal-dev@lists.osgeo.org> wrote:
>
> Andrew,
>
> what would be the purpose of thread-safe access: just making it thread-safe
> without any particular requirement on how efficient this would be (1), or
> hope for true concurrent access with ideally close to linear scalability with
> the number of threads (2) ?
>
> If (1), then we could add a GDALMutexedDataset class, similarly to
> https://github.com/OSGeo/gdal/blob/master/ogr/ogrsf_frmts/generic/ogrmutexeddatasource.h
> which exists on the vector side (just used by the FileGDB driver due to the
> fact that the underlying SDK is not even re-entrant), which uses the
> decorator pattern around all public API entry points to call the underlying
> dataset under a mutex. One could imagine having a GDAL_OF_THREADSAFE open
> flag that GDALOpen() would use to return such an instance. Shouldn't be too
> hard to implement, but probably not that useful IMHO. I can anticipate most
> users would have higher expectations than a mutex-based implementation.
>
> If (2), it seems to me that it would require a huge effort, and the
> programming language we use (C++) offers hardly any safety belt to make sure
> we don't make mistakes, the main ones being forgetting to lock things that
> should be locked, or deadlock situations.
> If we go into doing that, I'm not
> even sure how we can reliably identify all parts of the code that must be
> modified.
>
> Neither GDAL raster core nor any driver is designed to be thread-safe. For
> core, at least gcore/gdalarraybandblockcache.cpp and
> gcore/gdalhashsetbandblockcache.cpp which interact with the block cache
> should be made thread-safe, and "just" adding a lock would defeat the aim to
> achieve linear scalability. The change in GDALDataset::RasterIO() I did in
> https://github.com/OSGeo/gdal/commit/7f3a0e582eb189744bc7cb8e4a751135edaecaf5
> isn't thread-safe either (would be easy to make thread-safe though).
>
> Once GDAL raster code is ready, the main challenge is making drivers
> themselves thread-safe. Raster drivers may directly read from a VSILFILE*
> handle, which isn't thread-safe when using the standard Seek() + Read() pair
> (a few VSIVirtualFileSystem implementations have a PRead() method, which is
> thread-safe, but not all). Or they rely on using some instance of a "reader"
> returned by a third-party library (libtiff, libjpeg, libpng, sqlite3, etc.)
> (which in most cases also uses a VSILFILE*), none of which are thread-safe
> (except sqlite3, which can be made thread-safe by passing a flag at
> sqlite3_open() time, which basically applies strategy (1) by protecting
> all calls with a mutex). Perhaps using thread-specific instances of VSILFILE*
> and third-party "reader" objects could be a way of solving this. But
> realistically doing a pass in all GDAL drivers would be a multi-man-month to
> multi-man-year type of effort. A realistic plan should be designed to allow
> combining (1) and (2): (2) for a few select drivers, and (1) as a fallback
> for most drivers that wouldn't be updated.
>
> Even
>
> On 03/06/2024 at 15:44, Andrew Bell via gdal-dev wrote:
>
> Hi,
>
> I am aware that there isn't thread-safe raster access with the current GDAL
> interface for various reasons.
> Given the state of processors, I was wondering
> if it would be valuable to take a look at providing the ability to do Raster
> I/O (at least reads) in a thread-safe way. This could be done through a new
> set of API calls or perhaps by modifications to what currently exists -- I
> don't know what makes sense at this point. I would be happy to spend some
> time looking at this if there is interest, but I would also like to learn
> from existing experience as to what kinds of things that I'm surely not
> considering would have to be dealt with.
>
> Thanks,
>
> --
> Andrew Bell
> andrew.bell...@gmail.com
>
> _______________________________________________
> gdal-dev mailing list
> gdal-dev@lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/gdal-dev
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.