Laurent and list;

On Mon, Jul 7, 2025 at 2:44 AM celati Laurent via QGIS-User < [email protected]> wrote:

> Dear all,
>
> I'm working with QGIS and PostGIS. As input, I have 25 polygonal layers
> covering a large area (multicities area). One of these data is a very large
> dataset (1 million objects). The other 24 are much smaller (a maximum of a
> hundred objects).
>
> For information, I should point out that some of these polygonal datasets
> are in "multi-part features" mode and others in "single-part features"
> mode. I imagine this may ultimately have a slight impact on the
> method/result. These 25 polygonal .shp files have highly variable,
> non-homogeneous/non-harmonized data structures. Each layer has a "data_id"
> field that allows to define/link/reference, for each feature, its
> membership in the layer. For example, all values in the "data_id" field for
> the first layer have a value of '1'. For the second layer, the field values
> are '2', etc.
>
> My goal would be to be able to apply/adapt the existing QGIS geoprocessing
> tool called "Multiple Union":
>
> https://docs.qgis.org/3.40/en/docs/user_manual/processing_algs/qgis/vectoroverlay.html#union-multiple
>
> Below a screenshot from the QGIS documentation :
>
> [image: image.png]
>
> My goal would be to have an output file:
>
>    - Which would be the result of the union/overlay of the 25 input
>    data. To use the terms of the QGIS documentation, the processing should
>    check for overlaps between features within the 25 layers and create
>    separate features for the overlapping and non-overlapping parts. This
>    "multiple union" geoprocessing seems interesting for my goal where there is
>    no overlap (a, NULL; b, NULL; c, NULL).
>
>    - For areas where there is an overlap, the QGIS union geoprocessing
>    creates as many identical overlapping features as there are features
>    participating in this overlap. This doesn't bother me. But since,
>    ultimately, I'd like a field in the result/output file to allow, for each
>    feature, to retrieve the list of input layers that participate/contribute
>    to this result feature (in order to retrieve the origin/source of the
>    data). I was wondering/thinking it might be better if only one feature was
>    created per overlapping area?
>
>    - I'd like a field in the result file to allow, for each feature, to
>    retrieve the list of input layers that participate/contribute to this
>    result feature. In order to retrieve the origin/source of the data.
>
>    - Ideally, a field that allows you to retrieve the number (COUNT) of
>    layers that contribute to this feature (at least 1 layer, at most 25
>    layers).
>
>    - Regarding the non-geometric attributes/fields, I would like to be
>    able to specify the selection/define the list of fields I ultimately want
>    to keep. I don't want to keep all of the fields, but rather just some of
>    the fields for each of the 25 input layers.
>
> I imagine it's recommended to do this processing in PostGIS rather than
> QGIS? I can, if necessary, import my 25 SHP files into a PostGIS database.
> I also imagine it's important to keep in mind that the "multi-part
> features" / "single-part pieces/features" mode of the input layers can
> affect the result. If I'm using a PostGIS query, I was thinking it might be
> helpful to force all features to be in single-part mode (using the PostGIS
> 'st_dump' function?).
>

If this were my task, I would surely do it all in PostGIS.

Back in my forestry days, we had a very similar problem to solve - in a
large area (hundreds of thousands to millions of hectares), a very detailed
forest vegetation layer and 10+ layers of policy, administration and
operational concerns, all to be combined via a union process to generate a
"resultant" that could be fed into a long-term model to calculate
sustainable flows of forest products, silvicultural treatments and so on.

Some of the general principles I recall are:

   - the need to buffer line and point concerns into reasonable polygons
   (not overly dense number of vertices, for example)
   - the need to have single part polygons
   - the need to simplify unnecessarily complicated polygon shapes (again,
   reduce unnecessarily dense vertices)
   - the need to normalize all layers to reduce the cost of writing the
   results of the unions back to the database (for example, polygon
   non-spatial attributes on separate table linked to the spatial component)
   - the need to have a spatial indexing strategy and to review EXPLAIN
   costs to check that the strategy works
   - the need to investigate the performance requirements of repeated
   operations that simultaneously read and write large tables, possibly
   requiring PostgreSQL configuration tuning
   - the usual best approach being
      - union all the smaller tables together first,
      - then union that product with the detailed forest vegetation layer,
      - then consider pulling the required subset of attributes onto the
      resultant polygon records; or alternatively create a view

Some observations about your goals:

   - I suppose you can create this resultant as overlapping polygons or
   alternatively as polygons "cut into" each other (what ESRI used as the
   original definition of union)
   - I have always worked with the "cut into" version, which results in a
   final table with the cut-up geometry for each overlap area and all the IDs
   of the constituent pieces as individual attributes
   - in that case your COUNT is just the number of non-null ID columns for
   each record
   - with a normalized set of input tables the ID column for each input
   layer allows you to retrieve the information for the input layer
   - with a normalized set of input tables, you need to specify the non-ID
   fields that you keep, either by appending those fields to the resultant
   table or creating a view or just doing table joins

I'm going to stop here for now.

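P.S. To make the COUNT observation above a bit more concrete, here is a minimal sketch, not a tested recipe. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL and leaves geometry out entirely. The table name resultant, the key piece_id and the three ID columns veg_id, zoning_id and parks_id are all hypothetical; a real resultant would carry one ID column per input layer (25 in Laurent's case). The CASE expression is plain SQL and should carry over to PostgreSQL; I believe num_nonnulls() is another option there.

```python
import sqlite3

# Hypothetical ID columns, one per input layer (the real table would have 25).
LAYER_ID_COLUMNS = ["veg_id", "zoning_id", "parks_id"]


def layer_count_sql(id_columns, table="resultant"):
    """Return SQL selecting each piece with its count of non-null layer IDs."""
    count_expr = " + ".join(
        f"CASE WHEN {col} IS NOT NULL THEN 1 ELSE 0 END" for col in id_columns
    )
    cols = ", ".join(id_columns)
    return (
        f"SELECT piece_id, {cols}, {count_expr} AS layer_count "
        f"FROM {table} ORDER BY piece_id"
    )


def summarize(conn, id_columns):
    """Return (piece_id, layer_count, contributing_columns) for every piece."""
    rows = conn.execute(layer_count_sql(id_columns)).fetchall()
    summary = []
    for row in rows:
        piece_id, ids, layer_count = row[0], row[1:-1], row[-1]
        # A non-null ID means that layer contributed to this cut-up piece.
        sources = [col for col, value in zip(id_columns, ids) if value is not None]
        summary.append((piece_id, layer_count, sources))
    return summary


# In-memory stand-in for the resultant table; geometry is omitted on purpose.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE resultant (piece_id INTEGER PRIMARY KEY, "
    "veg_id INTEGER, zoning_id INTEGER, parks_id INTEGER)"
)
conn.executemany(
    "INSERT INTO resultant VALUES (?, ?, ?, ?)",
    [
        (1, 101, None, None),  # covered by the vegetation layer only
        (2, 101, 7, None),     # vegetation and zoning overlap
        (3, 102, 7, 3),        # all three layers overlap
    ],
)

for piece_id, layer_count, sources in summarize(conn, LAYER_ID_COLUMNS):
    print(piece_id, layer_count, sources)
```

The list of non-null ID columns also serves as the "which layers contributed" field, and each ID can be joined back to its normalized input table to pull only the attributes you want to keep.
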
-- Chris Hermansen · clhermansen "at" gmail "dot" com C'est ma façon de parler.
