Hello,

First of all, sorry for the duplicate. I accidentally sent the same message 
twice and got responses to both, so I'm including here the person who 
responded to the other message.


On Wednesday, May 15, 2013 at 20:43, Merlin Moncure wrote:

> On Wed, May 15, 2013 at 11:31 AM, Jorge Arévalo
> <jorgearev...@libregis.org> wrote:
> > Hello,
> >  
> > I'd like to know what's the best way to reduce the number of server rounds 
> > in a libpq C app that fetches BLOBs from a remote PostgreSQL server.
> >  
> > About 75% of the time my app uses is spent querying database. I basically 
> > get binary objects (images). I have to fetch all the images from a table. 
> > This table can be really big (in number of rows) and each image can be big 
> > too.
>  
> #1 thing to make sure of when getting big blobs is that you are
> fetching data in binary. If you are not, do so before changing
> anything else (I wrote a library to help do that, libpqtypes).

Yes, I get data in binary. But thanks for the reference to the library.

>  
> > I guess I should go for cursors. If I understood the concept of "cursor", 
> > basically the query is executed, a ResultSet is generated inside the 
> > database server, and the client receives a "pointer" to this ResultSet. You 
> > can get all the rows by moving this pointer over the ResultSet, calling the 
> > right functions. But you still have to go to the database for each chunk of 
> > data. Am I right?
>  
> Cursors are a way to page through a query result without fetching all
> the data at once. This would be most useful if you are processing one
> row at a time on the client side. But if the client needs all the
> data held in memory, cursors will only help in terms of reducing the
> temporary memory demands while doing the transfer. So it's hard to
> say if it's worth using them until you describe the client side
> requirements a little better.
>  
Ok, more context here. The images stored in the database are really PostGIS 
Raster images. They are stored in this serialized format [1] and read in this 
other one [2], as binary objects (you can also get a string representation of 
the format, but it's slower).

My goal is to make the GDAL PostGIS Raster driver [3] faster, and my approach 
is: "instead of one server round trip per user data request, try to get the 
whole raster data with as few queries as possible".

The GDAL library raises one read operation per chunk of data in the case of 
tiled raster formats. PostGIS Raster is a (special) tiled raster format, but 
one read operation per data chunk means one query (= one trip to the server), 
and that's expensive. Even the simplest app (transforming one raster table 
into a GeoTIFF file) spends 75% of its time in database queries because of 
this architecture.

Plus, each read request requires an intersection between the desired area 
and the data stored in the database (using ST_Intersects [4]). An 
intersection in the database is more expensive than one in memory.

So, my goal is to save server round trips. My thought was: using a cursor, I 
still have to go to the database, but my data is stored in a ResultSet and I 
just need to seek over it, instead of raising expensive queries. Am I wrong? 
Is there a better approach?
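To make the cursor trade-off concrete: each FETCH is still one round trip, but 
you choose the batch size, so you pay one trip per batch of N rows rather than 
one query per tile. A minimal sketch, assuming the same placeholder table and 
column names as before (not the driver's real schema):

```c
#include <libpq-fe.h>

/* Page through a big tile table with a binary cursor:
 * one round trip per FETCH of 100 rows, not one per tile. */
static void fetch_with_cursor(PGconn *conn)
{
    PGresult *res;

    res = PQexec(conn, "BEGIN");
    PQclear(res);

    /* BINARY cursor so the tiles arrive in binary format */
    res = PQexec(conn,
        "DECLARE tiles BINARY CURSOR FOR "
        "SELECT rid, rast FROM raster_table");
    PQclear(res);

    for (;;) {
        res = PQexec(conn, "FETCH 100 FROM tiles");
        if (PQresultStatus(res) != PGRES_TUPLES_OK ||
            PQntuples(res) == 0) {
            PQclear(res);
            break;
        }
        for (int i = 0; i < PQntuples(res); i++) {
            const char *rast = PQgetvalue(res, i, 1);
            int len = PQgetlength(res, i, 1);
            /* ... process one tile ... */
            (void) rast; (void) len;
        }
        PQclear(res);
    }

    res = PQexec(conn, "CLOSE tiles");
    PQclear(res);
    res = PQexec(conn, "COMMIT");
    PQclear(res);
}
```

So the cursor doesn't avoid trips to the server entirely; it only lets you 
amortize them over a batch, while keeping client memory bounded.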

Eduardo, many thanks for your response. I think blob data compression 
doesn't apply here. And out-db storage is an option of the PostGIS Raster 
format, but it's not mandatory.


[1] http://trac.osgeo.org/postgis/browser/trunk/raster/doc/RFC1-SerializedFormat
[2] http://trac.osgeo.org/postgis/browser/trunk/raster/doc/RFC2-WellKnownBinaryFormat
[3] http://trac.osgeo.org/gdal/wiki/frmts_wtkraster.html
[4] http://postgis.net/docs/manual-2.0/ST_Intersects.html


Best regards,
Jorge


> merlin  




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
