Hello,

First of all, sorry for the duplicate. I accidentally sent the same message twice and got responses to both of them, so I'm including here the person who responded to the other message.
On Wednesday, May 15, 2013 at 20:43, Merlin Moncure wrote:

> On Wed, May 15, 2013 at 11:31 AM, Jorge Arévalo
> <jorgearev...@libregis.org> wrote:
> > Hello,
> >
> > I'd like to know the best way to reduce the number of server round
> > trips in a libpq C app that fetches BLOBs from a remote PostgreSQL server.
> >
> > About 75% of the time my app uses is spent querying the database. I
> > basically get binary objects (images). I have to fetch all the images
> > from a table. This table can be really big (in number of rows), and each
> > image can be big too.
>
> #1 thing to make sure of when getting big blobs is that you are
> fetching data in binary. If you are not, do so before changing
> anything else (I wrote a library to help do that, libpqtypes).

Yes, I get the data in binary. But thanks for the reference to the library.

> > I guess I should go for cursors. If I understood the concept of "cursor"
> > correctly, the query is executed, a result set is generated inside the
> > database server, and the client receives a "pointer" to this result set.
> > You can get all the rows by moving this pointer over the result set,
> > calling the right functions. But you still have to go to the database
> > for each chunk of data. Am I right?
>
> Cursors are a way to page through a query result without fetching all
> the data at once. This would be most useful if you are processing one
> row at a time on the client side. But if the client needs all the
> data held in memory, cursors will only help in terms of reducing the
> temporary memory demands while doing the transfer. So it's hard to
> say if it's worth using them until you describe the client-side
> requirements a little better.

Ok, more context here. The images stored in the database are really PostGIS Raster images.
So they are stored in this format [1] and read in this other one [2], as binary objects (you can get a string representation of the format too, but it is slower). My point is to make the GDAL PostGIS Raster driver [3] faster, and my approach is: instead of one server round trip per user data request, try to get the whole raster data with the fewest possible queries.

The GDAL library raises a read operation for each chunk of data in the case of tiled raster formats. PostGIS Raster is a (special) tiled raster format, but one read operation per data chunk means one query (= one trip to the server), and that's expensive. Even the simplest app (transforming one raster table into a GeoTIFF file) spends 75% of its time in database queries, because of this architecture. Plus, each read request requires an intersection between the desired area and the data stored in the database (using ST_Intersects [4]). An intersection in the database is more expensive than one in memory.

So, my goal is to save server round trips. My thought was: using a cursor, I still have to go to the database, but my data is stored in a result set, and I just need to seek over it, instead of raising expensive queries. Am I wrong? Is there a better approach?

Eduardo, about your response, many thanks. I think blob data compression doesn't apply here. And out-db storage is an option of the PostGIS Raster format, but it's not mandatory.

[1] http://trac.osgeo.org/postgis/browser/trunk/raster/doc/RFC1-SerializedFormat
[2] http://trac.osgeo.org/postgis/browser/trunk/raster/doc/RFC2-WellKnownBinaryFormat
[3] http://trac.osgeo.org/gdal/wiki/frmts_wtkraster.html
[4] http://postgis.net/docs/manual-2.0/ST_Intersects.html

Best regards,
Jorge

> merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
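P.S. For anyone finding this thread later, here is roughly what the cursor approach being discussed looks like in libpq: declare a cursor once (the expensive filtering query runs once), then FETCH rows in batches, requesting binary results so the raster bytes arrive unencoded. This is only a minimal sketch under assumptions, not the actual driver code: the table name `rasters`, column `rast`, conninfo string, and batch size of 100 are all hypothetical placeholders, and error handling is abbreviated.

```c
/* Sketch: batch-fetch binary blobs through a cursor to cut round trips.
 * Hypothetical table "rasters" with a "rast" column; adjust to taste.
 * Build with something like: cc demo.c -I$(pg_config --includedir) -lpq */
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=gisdb");  /* assumed conninfo */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    /* Cursors only exist inside a transaction */
    PGresult *res = PQexec(conn, "BEGIN");
    PQclear(res);
    res = PQexec(conn, "DECLARE rast_cur CURSOR FOR SELECT rast FROM rasters");
    PQclear(res);

    for (;;) {
        /* Last argument = 1 asks for binary results; 100 rows per trip */
        res = PQexecParams(conn, "FETCH 100 FROM rast_cur",
                           0, NULL, NULL, NULL, NULL, 1);
        if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) == 0) {
            PQclear(res);
            break;
        }
        for (int i = 0; i < PQntuples(res); i++) {
            const char *blob = PQgetvalue(res, i, 0); /* raw bytes, no escaping */
            int len = PQgetlength(res, i, 0);
            /* ... hand blob/len to the reading code here ... */
            printf("row: %d bytes\n", len);
            (void) blob;
        }
        PQclear(res);
    }

    res = PQexec(conn, "CLOSE rast_cur");
    PQclear(res);
    res = PQexec(conn, "COMMIT");
    PQclear(res);
    PQfinish(conn);
    return 0;
}
```

Each FETCH is still one round trip, but it moves a whole batch of tiles instead of one, and any ST_Intersects filtering can be paid once in the DECLARE query rather than once per chunk.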