Michael,
Scott is right. Not sure if this is the preferred approach, but I accomplished
this for large datasets by specifying buffer sizes for ReadAsArray. The doc I
consulted is here:
http://gdal.org/python/osgeo.gdal_array-module.html#BandReadAsArray.
I used masked arrays to exclude nodata values - you may not need to worry about
this.
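
For illustration, here is a minimal sketch of the masked-array step on its own, without GDAL. The nodata value of -9999 and the tiny array are made up for the example; in practice the value would come from GetNoDataValue() and the array from ReadAsArray.

```python
import numpy as np

# Hypothetical nodata value and a tiny array standing in for a raster band
ndv = -9999.0
a = np.array([[1.0, 2.0, ndv],
              [4.0, ndv, 6.0]])

# Mask every cell equal to the nodata value
am = np.ma.masked_equal(a, ndv)

# Statistics on the masked array ignore the masked cells
valid_mean = am.mean()    # mean of the four valid cells only
valid_count = am.count()  # number of unmasked cells
```

Calling a.mean() on the plain array would instead include the two -9999 cells and give a meaningless result.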
-David
Excerpt from my script:
from osgeo import gdal
import numpy

src_ds = gdal.Open(src_fn, gdal.GA_ReadOnly)
b = src_ds.GetRasterBand(1)
ndv = b.GetNoDataValue()
ns = src_ds.RasterXSize
nl = src_ds.RasterYSize
#Don't want to load the entire dataset for stats computation
#This is the maximum dimension for the reduced-resolution array
max_dim = 1024.
scale_ns = ns/max_dim
scale_nl = nl/max_dim
scale_max = max(scale_ns, scale_nl)
if scale_max > 1:
    #Buffer sizes must be integers
    nl = int(round(nl/scale_max))
    ns = int(round(ns/scale_max))
#The buf_xsize/buf_ysize parameters determine the final array dimensions
bm = numpy.ma.masked_equal(b.ReadAsArray(buf_xsize=ns, buf_ysize=nl), ndv)
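
If it helps, the same idea can be wrapped into a reusable function. This is only a sketch under a few assumptions: the names overview_shape and read_overview are made up for this example, only band 1 is read, and a band without a nodata value is returned with nothing masked. GDAL is imported inside read_overview so the size calculation can be tried without GDAL installed.

```python
def overview_shape(ns, nl, max_dim=1024):
    """Return (buf_xsize, buf_ysize) with neither side above max_dim,
    keeping the aspect ratio of the ns x nl raster."""
    scale = max(ns / float(max_dim), nl / float(max_dim))
    if scale <= 1:
        # Raster already fits; read it at full resolution
        return int(ns), int(nl)
    # Buffer sizes must be integers and at least one pixel
    return max(1, int(round(ns / scale))), max(1, int(round(nl / scale)))


def read_overview(src_fn, max_dim=1024):
    """Read band 1 of src_fn at reduced resolution as a masked array."""
    # Imported here so overview_shape works without GDAL installed
    from osgeo import gdal
    import numpy as np

    src_ds = gdal.Open(src_fn, gdal.GA_ReadOnly)
    b = src_ds.GetRasterBand(1)
    ns, nl = overview_shape(src_ds.RasterXSize, src_ds.RasterYSize, max_dim)
    a = b.ReadAsArray(buf_xsize=ns, buf_ysize=nl)
    ndv = b.GetNoDataValue()
    if ndv is None:
        # No nodata value defined: return a masked array with nothing masked
        return np.ma.masked_array(a)
    return np.ma.masked_equal(a, ndv)
```

For a 20000 x 10000 raster, overview_shape(20000, 10000) gives (1024, 512), so ReadAsArray returns a 512-row by 1024-column array.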
On Apr 11, 2012, at 11:17 AM, Scott Arko wrote:
> Hi Michael,
>
>
> I may be missing your question, but why aren't you just using ReadAsArray?
> It has an option to return a smaller array from the input array. Now, I'm
> not sure how it does the resampling (you could look to see), but you can make
> a call like
>
> data = banddata.ReadAsArray(0, 0, filehandle.RasterXSize, filehandle.RasterYSize, xsize, ysize)
>
> where xsize and ysize are smaller than the true RasterXSize or RasterYSize.
> I haven't looked at this in a while, but I'm pretty sure this will work. Did
> I miss the point of what you were asking?
>
>
> Thanks,
> Scott
>
>
> On Wed, Apr 11, 2012 at 6:31 AM, K.-Michael Aye <[email protected]>
> wrote:
> Dear all,
>
> is there a Python API for downsampling a huge dataset?
> What I would like to do:
>
> * get my dataset
> * read out RasterXSize and RasterYSize
> * calculate how many lines and rows I need to skip to get a quick overview
> image, e.g. 10 lines to skip.
> * Have a ReadAsArray interface where I can say something like this:
> ** data = ds.ReadAsArray(xoffset, yoffset, 10000, 10000, skipping=10)
>
> which in numpy terms would give me every 10th line like this: array[::10]
>
> I really don't need quality at all, just speed, for a rough overview for
> further zooming in with lassos, as the images I deal with sometimes have more
> than 200 MPixels.
>
> Is this possible in Python?
> I was thinking now, maybe one could use numpy's memmap somehow for this,
> don't know much about it, though…
>
> Thanks for any hints!
>
> Best regards,
> Michael
>
>
> _______________________________________________
> gdal-dev mailing list
> [email protected]
> http://lists.osgeo.org/mailman/listinfo/gdal-dev