Thanks, Jon, for your suggestion of GeoPandas.
Unfortunately, I'm not allowed to use new external dependencies.

I tried doing all the steps in a single SQLite file instead of using
several intermediate shapefiles. The first results looked good, so I wrote
a script that dissolves an increasingly large number of shapes.
Later I realized the performance gain came from forgetting
'-explodecollections' in the new script. This makes a huge difference. For
now, I'll keep the multipart polygons.

I converted these commands to a C# unit test:
// Convert fishnet shapefile to SQLite:
ogr2ogr -f SQLite taskmap.sqlite "Fishnet.shp" -nln fishnet -nlt POLYGON -dsco SPATIALITE=YES -lco SPATIAL_INDEX=NO -gt unlimited --config OGR_SQLITE_CACHE 4096 --config OGR_SQLITE_SYNCHRONOUS OFF
// Add field:
ogrinfo taskmap.sqlite -sql "ALTER TABLE fishnet ADD COLUMN randField real"
// Fill random values:
ogrinfo taskmap.sqlite -sql "UPDATE fishnet SET randField = ABS(RANDOM() % 10)"
// Create index:
ogrinfo taskmap.sqlite -sql "CREATE INDEX randfield_idx ON fishnet (randField)"
// Combined dissolve and export:
ogr2ogr -f "ESRI Shapefile" -overwrite taskmap.shp taskmap.sqlite -sql "SELECT ST_Union(geometry) as geom, randField FROM fishnet GROUP BY randField" -gt unlimited --config OGR_SQLITE_CACHE 4096 --config OGR_SQLITE_SYNCHRONOUS OFF

Some timings:
1,677 shapes --> 0.3s
4,810 shapes --> 1.8s
18,415 shapes --> 21.4s
72,288 shapes --> 5m 54s
285,927 shapes --> 25m
1,139,424 shapes --> 6h 47m
4,557,696 shapes --> still running after 34h
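
One rough way to read these timings is to fit a power law t = a * n^b in
log-log space and look at the exponent b. The sketch below is illustrative
only: it is plain Python with the standard library (not part of the C# test),
the helper names fit_power_law and predict_seconds are hypothetical, and the
extrapolation assumes the trend of the finished runs simply continues, which
may not hold.

```python
import math

# Timings copied from the list above: (number of shapes, seconds).
# The 4,557,696-shape run is left out because it had not finished.
timings = [
    (1_677, 0.3),
    (4_810, 1.8),
    (18_415, 21.4),
    (72_288, 5 * 60 + 54),
    (285_927, 25 * 60),
    (1_139_424, 6 * 3600 + 47 * 60),
]


def fit_power_law(samples):
    """Least-squares fit of log(t) = log(a) + b * log(n); returns (a, b)."""
    xs = [math.log(n) for n, _ in samples]
    ys = [math.log(t) for _, t in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict_seconds(a, b, n):
    """Evaluate the fitted model t = a * n**b for n shapes."""
    return a * n ** b


a, b = fit_power_law(timings)
estimate_hours = predict_seconds(a, b, 4_557_696) / 3600
print(f"fitted exponent: {b:.2f}")
print(f"extrapolated time for 4,557,696 shapes: {estimate_hours:.0f} h")
```

An exponent well above 1 means the dissolve scales worse than linearly with
the number of shapes, so the largest run cannot be expected to finish in a
time proportional to its size.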

My application needs to handle about 4 million shapes, so running for days
is not an option.

I noticed my script is using only a fraction of my resources: 30% of RAM
(12 GB) and 22-28% of CPU (8 cores).
How can I let GDAL use more resources? Would that speed up the process?

I also read about CascadedUnion in GEOS. Can I use it with GDAL/OGR?
If so, how?
And would it help to enable the GPU? If so, do I need a special build? I'm
currently using the Windows 64-bit build from gisinternals.com.

Thanks again for any pointers and/or suggestions.

Paul
_______________________________________________
gdal-dev mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/gdal-dev
