On 2012-04-11 19:17:27 +0000, David Shean said:
Michael,
Scott is right. Not sure if this is the preferred approach, but I
accomplished this for large datasets by specifying buffer sizes for
ReadAsArray. The doc I consulted is here:
http://gdal.org/python/osgeo.gdal_array-module.html#BandReadAsArray.
I used masked arrays to exclude nodata values - you may not need to
worry about with this.
-David
Excerpt from my script:
src_ds = gdal.Open(src_fn, gdal.GA_ReadOnly)
b = src_ds.GetRasterBand(1)
ndv = b.GetNoDataValue()
ns = src_ds.RasterXSize
nl = src_ds.RasterYSize
#Don't want to load the entire dataset for stats computation
#This is maximum dimension for reduced resolution array
max_dim = 1024.
scale_ns = ns/max_dim
scale_nl = nl/max_dim
scale_max = max(scale_ns, scale_nl)
if scale_max > 1:
nl = round(nl/scale_max)
ns = round(ns/scale_max)
#The buf_size parameters determine the final array dimensions
bm = numpy.ma.masked_equal(numpy.array(b.ReadAsArray(buf_xsize=ns,
buf_ysize=nl)), ndv)
Holy excrements. I didn't know about this functionality, most likely
because I only have one-banded datasets and the ds.ReadAsArray does not
offer buf_xsize and buf_ysize!
I just tried it out and it's so endlessly fast? My compliments to the
GDAL team!
And muchas gracias to you and Scott for pointing it out to me.
This will be a happy summer when I finally implement my private fast
image viewer… ;)
Best regards,
Michael
On Apr 11, 2012, at 11:17 AM, Scott Arko wrote:
Hi Michael,
I may be missing your question, but why aren't you just using
ReadAsArray? It has an option to return a smaller array from the input
array. Now, I'm not sure how it does the resampling (you could look to
see), but you can make a call like
data =
banddata.ReadAsArray(0,0,filehandle.RasterXSize,filehandle.RasterYSize,xsize,ysize)
where xsize and ysize are smaller than the true RasterXSize or
RasterYSize. I haven't looked at this in a while, but I'm pretty sure
this will work. Did I miss the point of what you were asking?
Thanks,
Scott
On Wed, Apr 11, 2012 at 6:31 AM, K.-Michael Aye <[email protected]> wrote:
Dear all,
is there a Python API for downsampling a huge dataset?
What I would like to do:
* get my dataset
* read out RasterXSize and RasterYSize
* calculate how many lines and rows I need to skip to get a quick
overview image, e.g. 10 lines to skip.
* Have a ReadAsArray interface where I can say something like this:
** data = ds.ReadAsArray(xoffset, yoffset, 10000, 10000, skipping=10)
which in numpy terms would give me every 10nth line like this: array[:,:,10]
I really don't need quality at all, just speed, for a rough overview
for further zooming in with lassos, as the images I deal with sometimes
have more than 200 MPixels.
Is this possible in Python?
I was thinking now, maybe one could use numpy's memmap somehow for
this, don't know much about it, though…
Thanks for any hints!
Best regards,
Michael
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev
_______________________________________________
gdal-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/gdal-dev