Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Andrew Dunstan
On 12/16/2010 03:52 PM, Tom Lane wrote: Andrew Dunstan writes: On 12/16/2010 03:13 PM, Robert Haas wrote: So how bad would it be if we committed this new format without support for splitting large relations into multiple files, or with some stub support that never actually gets used, and fix

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Andres Freund
On Thursday 16 December 2010 23:34:02 Heikki Linnakangas wrote:
> On 17.12.2010 00:29, Andres Freund wrote:
> > On Thursday 16 December 2010 19:33:10 Joachim Wieland wrote:
> >> On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas wrote:
> >>> As soon as we have parallel pg_dump, the n

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 17.12.2010 00:29, Andres Freund wrote: On Thursday 16 December 2010 19:33:10 Joachim Wieland wrote: On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas wrote: As soon as we have parallel pg_dump, the next big thing is going to be parallel dump of the same table using multiple processes.

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Andres Freund
On Thursday 16 December 2010 19:33:10 Joachim Wieland wrote:
> On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas wrote:
> > As soon as we have parallel pg_dump, the next big thing is going to be
> > parallel dump of the same table using multiple processes. Perhaps we
> > should prepare for

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Tom Lane
Andrew Dunstan writes:
> On 12/16/2010 03:13 PM, Robert Haas wrote:
>> So how bad would it be if we committed this new format without support
>> for splitting large relations into multiple files, or with some stub
>> support that never actually gets used, and fixed this later? Because
>> this is

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Andrew Dunstan
On 12/16/2010 03:13 PM, Robert Haas wrote: So how bad would it be if we committed this new format without support for splitting large relations into multiple files, or with some stub support that never actually gets used, and fixed this later? Because this is starting to sound like a bigger pr

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 16.12.2010 22:13, Robert Haas wrote: So how bad would it be if we committed this new format without support for splitting large relations into multiple files, or with some stub support that never actually gets used, and fixed this later? Because this is starting to sound like a bigger project

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Robert Haas
On Thu, Dec 16, 2010 at 2:29 PM, Tom Lane wrote:
> Heikki Linnakangas writes:
>> On 16.12.2010 20:33, Joachim Wieland wrote:
>>> How exactly would you "just split the table in chunks of roughly the
>>> same size" ?
>
>> Check pg_class.relpages, and divide that evenly across the processes.
>> That

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Tom Lane
Heikki Linnakangas writes:
> On 16.12.2010 20:33, Joachim Wieland wrote:
>> How exactly would you "just split the table in chunks of roughly the
>> same size" ?
> Check pg_class.relpages, and divide that evenly across the processes.
> That should be good enough.

Not even close ... relpages coul
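
For readers following the exchange, the arithmetic behind "divide that evenly across the processes" can be sketched as below. This is only an illustration of the proposal as quoted, not code from the patch: the function name `split_page_ranges` and the half-open page ranges are assumptions made for this sketch, and the reply above objects that relpages is not a good enough basis for the split.

```python
from typing import List, Tuple


def split_page_ranges(relpages: int, workers: int) -> List[Tuple[int, int]]:
    """Divide pages [0, relpages) into `workers` contiguous half-open ranges.

    Hypothetical helper: `relpages` stands in for the value read from
    pg_class.relpages, which is only an estimate of the table's size.
    """
    if relpages < 0 or workers < 1:
        raise ValueError("relpages must be >= 0 and workers >= 1")

    base, extra = divmod(relpages, workers)
    ranges = []
    start = 0
    for i in range(workers):
        # The first `extra` workers take one additional page each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    for first, last in split_page_ranges(10, 3):
        print(f"worker dumps pages [{first}, {last})")
```

Because the ranges are computed from an estimate, a real implementation would also need a rule for pages that exist beyond the estimated count, for example by leaving the last range open-ended.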

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 16.12.2010 20:33, Joachim Wieland wrote: On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas wrote: As soon as we have parallel pg_dump, the next big thing is going to be parallel dump of the same table using multiple processes. Perhaps we should prepare for that in the directory archive f

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Joachim Wieland
On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas wrote:
> As soon as we have parallel pg_dump, the next big thing is going to be
> parallel dump of the same table using multiple processes. Perhaps we should
> prepare for that in the directory archive format, by allowing the data of a
> single

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 16.12.2010 19:58, Robert Haas wrote: On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas wrote: One more thing: the motivation behind this patch is to allow parallel pg_dump in the future, so we should make sure this patch caters well for that. As soon as we have parallel pg_dump, the

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Robert Haas
On Thu, Dec 16, 2010 at 12:48 PM, Heikki Linnakangas wrote:
> One more thing: the motivation behind this patch is to allow parallel
> pg_dump in the future, so we should make sure this patch caters well for
> that.
>
> As soon as we have parallel pg_dump, the next big thing is going to be
> par

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 16.12.2010 17:23, Heikki Linnakangas wrote: On 16.12.2010 12:12, Greg Smith wrote: There's a number of small things that I'd like to see improved in new rev of this code ... In addition to those: ... One more thing: the motivation behind this patch is to allow parallel pg_dump in the fut

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Heikki Linnakangas
On 16.12.2010 12:12, Greg Smith wrote:
Moving onto the directory archive part of this patch, the feature seems to work as advertised; here's a quick test case:

createdb pgbench
pgbench -i -s 1 pgbench
pg_dump -F d -f test
pg_restore -k test
pg_restore -l test
createdb copy
pg_restore -d copy tes

Re: [HACKERS] directory archive format for pg_dump

2010-12-16 Thread Greg Smith
Moving onto the directory archive part of this patch, the feature seems to work as advertised; here's a quick test case:

createdb pgbench
pgbench -i -s 1 pgbench
pg_dump -F d -f test
pg_restore -k test
pg_restore -l test
createdb copy
pg_restore -d copy test

The copy made that way looked good.

Re: [HACKERS] directory archive format for pg_dump

2010-12-07 Thread Joachim Wieland
On Thu, Dec 2, 2010 at 2:52 PM, Heikki Linnakangas wrote:
> Ok, committed, with some small cleanup since the last patch I posted.
>
> Could you update the directory-format patch on top of the committed version,
> please?

Thanks for committing the first part. Here is the updated and rebased direct

Re: [HACKERS] directory archive format for pg_dump

2010-12-03 Thread Heikki Linnakangas
On 02.12.2010 23:12, Alvaro Herrera wrote: Excerpts from Heikki Linnakangas's message of jue dic 02 16:52:27 -0300 2010: Ok, committed, with some small cleanup since the last patch I posted. I think the comments on _ReadBuf and friends need to be updated, since they are not just for headers an

Re: [HACKERS] directory archive format for pg_dump

2010-12-02 Thread Alvaro Herrera
Excerpts from Heikki Linnakangas's message of jue dic 02 16:52:27 -0300 2010:
> Ok, committed, with some small cleanup since the last patch I posted.

I think the comments on _ReadBuf and friends need to be updated, since they are not just for headers and TOC stuff anymore. I'm not sure if they we

Re: [HACKERS] directory archive format for pg_dump

2010-12-02 Thread Heikki Linnakangas
Ok, committed, with some small cleanup since the last patch I posted. Could you update the directory-format patch on top of the committed version, please? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com

Re: [HACKERS] directory archive format for pg_dump

2010-12-01 Thread Heikki Linnakangas
On 02.12.2010 04:35, Joachim Wieland wrote: There is one thing however that I am not in favor of, which is the removal of the "sizeHint" parameter for the read functions. The reason for this parameter is not very clear now without LZF but I have tried to put in a few comments to explain the situa

Re: [HACKERS] directory archive format for pg_dump

2010-12-01 Thread Joachim Wieland
On Wed, Dec 1, 2010 at 9:05 AM, Heikki Linnakangas wrote:
> Forgot attachment. This is also available in the above git repo.

I have quickly checked your modifications, on the one hand I like the reduction of functions, I would have said that we have AH around all the time and so we could just all

Re: [HACKERS] directory archive format for pg_dump

2010-12-01 Thread Heikki Linnakangas
On 01.12.2010 16:03, Heikki Linnakangas wrote: On 29.11.2010 22:21, Heikki Linnakangas wrote: I combined those, and the Free/Flush steps, and did a bunch of other editorializations and cleanups. Here's an updated patch, also available in my git repository at git://git.postgresql.org/git/users/he

Re: [HACKERS] directory archive format for pg_dump

2010-12-01 Thread Heikki Linnakangas
On 29.11.2010 22:21, Heikki Linnakangas wrote: On 29.11.2010 07:11, Joachim Wieland wrote: On Mon, Nov 22, 2010 at 3:44 PM, Heikki Linnakangas wrote: * wrap long lines * use extern in function prototypes in header files * "inline" some functions like _StartDataCompressor, _EndDataCompressor, _

Re: [HACKERS] directory archive format for pg_dump

2010-11-29 Thread Heikki Linnakangas
On 29.11.2010 07:11, Joachim Wieland wrote: On Mon, Nov 22, 2010 at 3:44 PM, Heikki Linnakangas wrote: * wrap long lines * use extern in function prototypes in header files * "inline" some functions like _StartDataCompressor, _EndDataCompressor, _DoInflate/_DoDeflate that aren't doing anythin

Re: [HACKERS] directory archive format for pg_dump

2010-11-29 Thread Robert Haas
On Mon, Nov 29, 2010 at 10:49 AM, Heikki Linnakangas wrote:
> On 29.11.2010 07:11, Joachim Wieland wrote:
>> On Mon, Nov 22, 2010 at 3:44 PM, Heikki Linnakangas wrote:
>>> * wrap long lines
>>> * use extern in function prototypes in header files
>>> * "inline" some functions like _Star

Re: [HACKERS] directory archive format for pg_dump

2010-11-29 Thread Heikki Linnakangas
On 29.11.2010 07:11, Joachim Wieland wrote: On Mon, Nov 22, 2010 at 3:44 PM, Heikki Linnakangas wrote: * wrap long lines * use extern in function prototypes in header files * "inline" some functions like _StartDataCompressor, _EndDataCompressor, _DoInflate/_DoDeflate that aren't doing anythin

Re: [HACKERS] directory archive format for pg_dump

2010-11-22 Thread Heikki Linnakangas
On 22.11.2010 19:07, Tom Lane wrote: Heikki Linnakangas writes: But I'm not actually sure we should be preventing mix & match of files from different dumps. It might be very useful to do just that sometimes, like restoring a recent backup, with the contents of one table replaced with older data

Re: [HACKERS] directory archive format for pg_dump

2010-11-22 Thread Tom Lane
Heikki Linnakangas writes:
> But I'm not actually sure we should be preventing mix & match of files
> from different dumps. It might be very useful to do just that sometimes,
> like restoring a recent backup, with the contents of one table replaced
> with older data. A warning would be ok, thou

Re: [HACKERS] directory archive format for pg_dump

2010-11-22 Thread Heikki Linnakangas
On 20.11.2010 06:10, Joachim Wieland wrote: 2010/11/19 José Arthur Benetasso Villanova: The md5.c and kwlookup.c reuse using a link doesn't look nice either. This way you need to compile twice, among other things, but I think that it's temporary, right? No, it isn't. md5.c is used in the same

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Joachim Wieland
Hi Jose,

2010/11/19 José Arthur Benetasso Villanova:
> The dir format generated in my database 60 files, with different
> sizes, and it looks very confusing. Is it possible to use the same
> trick as pigz and pbzip2, creating a concatenated file of streams?

What pigz is parallelizing is the actu
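
As background to the question about "a concatenated file of streams", here is a minimal sketch of that general idea using gzip members. It is an assumption-laden illustration, not how pg_dump, pigz, or pbzip2 are implemented: the function name `compress_concatenated` and the sample chunks are made up for this example. Each chunk is compressed independently, the results are appended in order, and a standard gzip reader decodes the whole file as one stream.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor
from typing import List


def compress_concatenated(chunks: List[bytes], workers: int = 4) -> bytes:
    """Compress each chunk as its own gzip member and concatenate them.

    Hypothetical helper for illustration only. Executor.map returns results
    in input order, so the members end up in the original chunk order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        members = list(pool.map(gzip.compress, chunks))
    return b"".join(members)


if __name__ == "__main__":
    # Made-up sample data standing in for pieces of a dump.
    chunks = [b"first table rows\n", b"second table rows\n", b"third\n"]
    blob = compress_concatenated(chunks)

    # A multi-member gzip file decompresses to the members' joined contents.
    restored = gzip.decompress(blob)
    print(restored == b"".join(chunks))
```

The trade-off is that a single concatenated file gives up the per-table files that the directory format produces, which is the property the rest of the thread relies on for parallel dump and restore.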

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Joachim Wieland
On Fri, Nov 19, 2010 at 11:53 PM, Tom Lane wrote:
> Dimitri Fontaine writes:
> > I think I'd like to see a separate patch for the new compression
> > support. Sorry about that, I realize that's extra work…
>
> That part of the patch is likely to get rejected outright anyway,
> so I *strongly* re

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Tom Lane
Dimitri Fontaine writes:
> I think I'd like to see a separate patch for the new compression
> support. Sorry about that, I realize that's extra work…

That part of the patch is likely to get rejected outright anyway, so I *strongly* recommend splitting it out. We have generally resisted adding

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Joachim Wieland
Hi Dimitri,

thanks for reviewing my patch!

On Fri, Nov 19, 2010 at 2:44 PM, Dimitri Fontaine wrote:
> I think I'd like to see a separate patch for the new compression
> support. Sorry about that, I realize that's extra work…

I guess it wouldn't be a very big deal but I also doubt that it makes

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Alvaro Herrera
Excerpts from José Arthur Benetasso Villanova's message of vie nov 19 18:28:03 -0300 2010:
> The md5.c and kwlookup.c reuse using a link doesn't look nice either.
> This way you need to compile twice, among other things, but I think
> that it's temporary, right?

Not sure what you mean here, but

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread José Arthur Benetasso Villanova
Hi Dimitri and Joachim. I've looked at the patch too, and I want to share some thoughts as well. I've used http://wiki.postgresql.org/wiki/Reviewing_a_Patch to guide my review. Submission review: I've applied and compiled the patch successfully using the current master. Usability review: The dir form

Re: [HACKERS] directory archive format for pg_dump

2010-11-19 Thread Dimitri Fontaine
Hi,

Sharing some thoughts after a first round of reviewing, where I only had time to read the patch itself.

Joachim Wieland writes:
> Since the compression is currently all down in the custom format backup code,
> the first thing I've done was refactoring the compression functions into a
> s