Re: [HACKERS] COPY enhancements

2009-10-20 Thread Tom Lane
Emmanuel Cecchet writes: > Tom Lane wrote: >> The key word in my sentence above is "arbitrary". You don't know what >> a datatype input function might try to do, let alone triggers or other >> functions that COPY might have to invoke. They might do things that >> need to be cleaned up after, and

Re: [HACKERS] COPY enhancements

2009-10-20 Thread Emmanuel Cecchet
Tom, Emmanuel Cecchet writes: Tom Lane wrote: There aren't any. You can *not* put a try/catch around arbitrary code without a subtransaction. Don't even think about it Well then why the tests provided with the patch are working? Because they carefully exercise only a tiny frac

Re: [HACKERS] COPY enhancements

2009-10-19 Thread Robert Haas
On Mon, Oct 19, 2009 at 11:21 AM, Alvaro Herrera wrote: > Gokulakannan Somasundaram wrote: > >> Actually this problem is present even in today's transaction id scenario and >> the only way we avoid is by using freezing. Can we use a similar approach? >> This freezing should mean that we are fre

Re: [HACKERS] COPY enhancements

2009-10-19 Thread Alvaro Herrera
Gokulakannan Somasundaram wrote: > Actually this problem is present even in today's transaction id scenario and > the only way we avoid is by using freezing. Can we use a similar approach? > This freezing should mean that we are freezing the sub-transaction in order > to avoid the sub-transacti

Re: [HACKERS] COPY enhancements

2009-10-18 Thread Gokulakannan Somasundaram
Actually I thought of a solution for the wrap-around sometime back. Let me try to put my initial thoughts into it. Maybe it would get refined over conversation. Transaction wrap-around failure Actually this problem is present even in today's transaction id scenario and the only way we avoid is b

Re: [HACKERS] COPY enhancements

2009-10-13 Thread Tom Lane
Emmanuel Cecchet writes: > Tom Lane wrote: >> There aren't any. You can *not* put a try/catch around arbitrary code >> without a subtransaction. Don't even think about it. >> > Well then why the tests provided with the patch are working? Because they carefully exercise only a tiny fraction of

Re: [HACKERS] COPY enhancements

2009-10-13 Thread Dimitri Fontaine
Emmanuel Cecchet writes: > Tom was also suggesting 'refactoring COPY into a series of steps that the > user can control'. What would these steps be? Would that be per row and > allow to discard a bad tuple? The idea is to have COPY usable from a general SELECT query so that the user control what

Re: [HACKERS] COPY enhancements

2009-10-13 Thread Emmanuel Cecchet
Tom Lane wrote: Emmanuel Cecchet writes: - speed with error logging best effort: no use of sub-transactions but errors that can safely be trapped with pg_try/catch (no index violation, There aren't any. You can *not* put a try/catch around arbitrary code without a subtransaction. D

Re: [HACKERS] COPY enhancements

2009-10-13 Thread Tom Lane
Emmanuel Cecchet writes: > - speed with error logging best effort: no use of sub-transactions but > errors that can safely be trapped with pg_try/catch (no index violation, There aren't any. You can *not* put a try/catch around arbitrary code without a subtransaction. Don't even think about i

Re: [HACKERS] COPY enhancements

2009-10-13 Thread Emmanuel Cecchet
Tom Lane wrote: Ultimately, there's always going to be a tradeoff between speed and flexibility. It may be that we should just say "if you want to import dirty data, it's gonna cost ya" and not worry about the speed penalty of subtransaction-per-row. But that still leaves us with the 2^32 limit

Re: [HACKERS] COPY enhancements

2009-10-12 Thread Simon Riggs
On Thu, 2009-10-08 at 11:01 -0400, Tom Lane wrote: > So as far as I can see, the only form of COPY error handling that > wouldn't be a cruel joke is to run a separate subtransaction for each > row, and roll back the subtransaction on error. Of course the > problems > with that are (a) speed, (b)
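For readers following the thread, the subtransaction-per-row idea quoted above can be sketched roughly as follows. This is only an illustration, not the patch under review: it uses SQLite savepoints from Python's standard library as a stand-in for PostgreSQL subtransactions, and the table name `target`, its columns, and the function name `copy_with_row_savepoints` are made up for the example. The connection is assumed to be opened with `isolation_level=None` so that transaction control is explicit.

```python
import sqlite3


def copy_with_row_savepoints(conn, rows):
    """Load rows one by one, wrapping each row in its own savepoint.

    Returns a list of (line_number, row, error_text) for rejected rows.
    The table "target" and its columns are hypothetical.
    """
    rejected = []
    conn.execute("BEGIN")
    for lineno, row in enumerate(rows, start=1):
        # One savepoint per row: this is the costly part debated in the thread.
        conn.execute("SAVEPOINT copy_row")
        try:
            conn.execute("INSERT INTO target (id, label) VALUES (?, ?)", row)
        except sqlite3.Error as exc:
            # Undo whatever the failed row did, then remember why it failed.
            conn.execute("ROLLBACK TO copy_row")
            rejected.append((lineno, row, str(exc)))
        finally:
            # The savepoint is released in both the success and failure case.
            conn.execute("RELEASE copy_row")
    conn.execute("COMMIT")
    return rejected
```

The per-row savepoint is what makes the approach safe against arbitrary errors, and it is also where the speed concern raised in the message comes from.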

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Greg Smith
On Fri, 9 Oct 2009, Tom Lane wrote: what do we do with rows that fail encoding conversion? For logging to a file we could/should just decree that we write out the original, allegedly-in-the-client-encoding data. I'm not sure what we do about logging to a table though. The idea of storing by

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Tom Lane
Hannu Krosing writes: > On Thu, 2009-10-08 at 11:32 -0400, Robert Haas wrote: >> Another possible approach, which isn't perfect either, is the idea of >> allowing COPY to generate a single column of output of type text[]. >> That greatly reduces the number of possible error cases, > maybe make i

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Tom Lane
Simon Riggs writes: > Another thing that has occurred to me is that RI checks are currently > resolved at end of statement and could end up rejecting any/all rows > loaded. If we break down the load into subtransaction pieces we would > really want the RI checks on the rows to be performed during

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Tom Lane
Simon Riggs writes: > On Thu, 2009-10-08 at 12:21 -0400, Tom Lane wrote: >> So really we have to find some way to only expend one XID per failure, >> not one per row. > I discovered a few days back that ~550 subtransactions is sufficient to > blow max_stack_depth. 1 subtransaction per error doesn

Re: [HACKERS] COPY enhancements

2009-10-09 Thread Hannu Krosing
On Thu, 2009-10-08 at 11:32 -0400, Robert Haas wrote: > Another possible approach, which isn't perfect either, is the idea of > allowing COPY to generate a single column of output of type text[]. > That greatly reduces the number of possible error cases, maybe make it bytea[] to further reduce e

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Simon Riggs
On Fri, 2009-10-09 at 00:15 +0100, Simon Riggs wrote: > On Thu, 2009-10-08 at 12:21 -0400, Tom Lane wrote: > > > > You'd eat a sub-sub-transaction per row, and start a new sub-transaction > > every 2^32 rows. > > > > However, on second thought this really doesn't get us anywhere, it just > > move

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Simon Riggs
On Thu, 2009-10-08 at 12:21 -0400, Tom Lane wrote: > Robert Haas writes: > > On Thu, Oct 8, 2009 at 11:50 AM, Tom Lane wrote: > >> I wonder whether we could break down COPY into sub-sub > >> transactions to work around that... > > > How would that work? Don't you still need to increment the com

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Simon Riggs
On Thu, 2009-10-08 at 18:23 -0400, Bruce Momjian wrote: > Dimitri Fontaine wrote: > > Simon Riggs writes: > > > It will be best to have the ability to have a specific rejection reason > > > for each row rejected. That way we will be able to tell the difference > > > between uniqueness violation er

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Bruce Momjian
Robert Haas wrote: > > That was a compliment on your project management skills. Keeping the CF > > work moving forward steadily is both unglamorous and extremely valuable, and > > I don't think anyone else even understands why you've volunteered to handle > > so much of it. But I know I appreciat

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Bruce Momjian
Robert Haas wrote: > Each of those features deserves a separate discussion to decide > whether we want it and how best to implement it. Personally, I think > we should skip (C), at least as a starting point. Instead of logging > to a table, I think we should consider making COPY return the tuples

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Andrew Dunstan
Bruce Momjian wrote: What would be _cool_ would be to add the ability to have comments in the COPY files, like \#, and then the copy data lines and errors could be adjacent. (Because of the way we control COPY escaping, adding \# would not be a problem. We have \N for null, for example.)

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Bruce Momjian
Dimitri Fontaine wrote: > Simon Riggs writes: > > It will be best to have the ability to have a specific rejection reason > > for each row rejected. That way we will be able to tell the difference > > between uniqueness violation errors, invalid date format on col7, value > > fails check constrain

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 1:26 PM, Tom Lane wrote: > Robert Haas writes: >> On Thu, Oct 8, 2009 at 12:21 PM, Tom Lane wrote: >>> Another approach that was discussed earlier was to divvy the rows into >>> batches.  Say every thousand rows you sub-commit and start a new >>> subtransaction.  Up to tha

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Tom Lane
Robert Haas writes: > On Thu, Oct 8, 2009 at 12:21 PM, Tom Lane wrote: >> Another approach that was discussed earlier was to divvy the rows into >> batches.  Say every thousand rows you sub-commit and start a new >> subtransaction.  Up to that point you save aside the good rows somewhere >> (mayb

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Greg Smith
On Thu, 8 Oct 2009, Rod Taylor wrote: 1) Having copy remember which specific line caused the error. So it can replace lines 1 through 487 in a subtransaction since it knows those are successful. Run 488 in its own subtransaction. Run 489 through ... in a new subtransaction. This is the standa
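The "remember the failing line and re-run the good prefix" scheme described in this message might look something like the sketch below. It is an assumption-laden illustration: SQLite savepoints stand in for PostgreSQL subtransactions, the table `target` and the function `copy_with_replay` are hypothetical names, and the connection is assumed to use `isolation_level=None`. In this scheme a savepoint is rolled back once per bad row rather than once per row.

```python
import sqlite3

# Hypothetical target table; not taken from the patch.
INSERT_SQL = "INSERT INTO target (id, label) VALUES (?, ?)"


def copy_with_replay(conn, rows):
    """Load rows in long runs; on failure, roll back and replay the good prefix.

    Returns a list of (line_number, row, error_text) for rejected rows.
    """
    rejected = []
    start = 0
    conn.execute("BEGIN")
    while start < len(rows):
        conn.execute("SAVEPOINT copy_run")
        failed_at = None
        for i in range(start, len(rows)):
            try:
                conn.execute(INSERT_SQL, rows[i])
            except sqlite3.Error as exc:
                failed_at = i
                rejected.append((i + 1, rows[i], str(exc)))
                break
        if failed_at is None:
            # The rest of the input loaded cleanly in a single run.
            conn.execute("RELEASE copy_run")
            break
        # Discard the whole run, then re-insert the rows known to be good.
        conn.execute("ROLLBACK TO copy_run")
        conn.executemany(INSERT_SQL, rows[start:failed_at])
        conn.execute("RELEASE copy_run")
        # Skip the bad row and continue with a fresh run.
        start = failed_at + 1
    conn.execute("COMMIT")
    return rejected
```

The trade-off in this sketch is that clean input costs one savepoint in total, while each bad row costs a rollback plus a replay of the rows before it in the current run.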

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Greg Smith
On Thu, 8 Oct 2009, Tom Lane wrote: It may be that we should just say "if you want to import dirty data, it's gonna cost ya" and not worry about the speed penalty of subtransaction-per-row. This goes along with the response I gave on objections to adding other bits of overhead into COPY. If

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 12:21 PM, Tom Lane wrote: > Robert Haas writes: >> On Thu, Oct 8, 2009 at 11:50 AM, Tom Lane wrote: >>> I wonder whether we could break down COPY into sub-sub >>> transactions to work around that... > >> How would that work?  Don't you still need to increment the command c

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Kevin Grittner
Tom Lane wrote: > Hmm, if we were willing to break COPY into multiple *top level* > transactions, that would avoid my concern about XID wraparound. > The issue here is that if the COPY does eventually fail (and there > will always be failure conditions, eg out of disk space), then some > of the

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 12:37 PM, Tom Lane wrote: > "Joshua D. Drake" writes: >> Couldn't you just commit each range of subtransactions based on some >> threshold? > >> COPY foo from '/tmp/bar/' COMMIT_THRESHOLD 100; > >> It counts to 1mil, commits starts a new transaction. Yes there would be

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Tom Lane
"Joshua D. Drake" writes: > Couldn't you just commit each range of subtransactions based on some > threshold? > COPY foo from '/tmp/bar/' COMMIT_THRESHOLD 100; > It counts to 1mil, commits starts a new transaction. Yes there would be > 1million sub transactions but once it hits those clean,
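To make the commit-threshold proposal quoted here concrete, a rough sketch of committing the load in separate top-level transactions is given below. It is illustrative only: `COMMIT_THRESHOLD` is the option name proposed in the message, not an existing COPY option, and the Python function `copy_in_committed_chunks`, the table `target`, and the use of SQLite (with `isolation_level=None`) are assumptions made for the example.

```python
import sqlite3

# Hypothetical target table; not taken from the patch.
INSERT_SQL = "INSERT INTO target (id, label) VALUES (?, ?)"


def copy_in_committed_chunks(conn, rows, commit_threshold):
    """Load rows in chunks, committing each chunk as its own transaction.

    Returns (rows_committed, error); error is None when everything loaded.
    """
    if commit_threshold < 1:
        raise ValueError("commit_threshold must be at least 1")
    committed = 0
    for start in range(0, len(rows), commit_threshold):
        chunk = rows[start:start + commit_threshold]
        conn.execute("BEGIN")
        try:
            conn.executemany(INSERT_SQL, chunk)
        except sqlite3.Error as exc:
            # Only the current chunk is undone; earlier chunks stay committed.
            conn.execute("ROLLBACK")
            return committed, exc
        conn.execute("COMMIT")
        committed += len(chunk)
    return committed, None
```

The early return shows the consequence of this design: when a later chunk fails, the earlier chunks are already committed and visible, so a failed load leaves part of the data in the table.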

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Tom Lane
Robert Haas writes: > On Thu, Oct 8, 2009 at 11:50 AM, Tom Lane wrote: >> I wonder whether we could break down COPY into sub-sub >> transactions to work around that... > How would that work? Don't you still need to increment the command counter? Actually, command counter doesn't help because i

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Rod Taylor
> Yeah. I think it's going to be hard to make this work without having > standalone transactions. One idea would be to start a subtransaction, > insert tuples until one fails, then rollback the subtransaction and > start a new one, and continue on until the error limit is reached. > I've found p

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Joshua D. Drake
On Thu, 2009-10-08 at 11:59 -0400, Robert Haas wrote: > On Thu, Oct 8, 2009 at 11:50 AM, Tom Lane wrote: > >> Another possible approach, which isn't perfect either, is the idea of > >> allowing COPY to generate a single column of output of type text[]. > >> That greatly reduces the number of possi

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 11:50 AM, Tom Lane wrote: >> Another possible approach, which isn't perfect either, is the idea of >> allowing COPY to generate a single column of output of type text[]. >> That greatly reduces the number of possible error cases, and at least >> gets the data into the DB whe

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Tom Lane
Robert Haas writes: > Subcommitting every single row is going to be really painful, > especially after Hot Standby goes in and we have to issue a WAL record > after every 64 subtransactions (AIUI). Yikes ... I had not been following that discussion, but that sure sounds like a deal-breaker. For

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 11:29 AM, Alvaro Herrera wrote: > Robert Haas wrote: > >> Some defective part of my brain enjoys seeing things run smoothly more >> than it enjoys being lazy. > > Strangely, that seems to say you'd make a bad Perl programmer, per Larry > Wall's three virtues. Don't worry

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 11:01 AM, Tom Lane wrote: > Robert Haas writes: >> Lest there be any unclarity, I am NOT trying to shoot down this >> feature with my laser-powered bazooka. > > Well, if you need somebody to do that Well, I'm trying not to demoralize people who have put in hard work, howev

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Alvaro Herrera
Robert Haas wrote: > Some defective part of my brain enjoys seeing things run smoothly more > than it enjoys being lazy. Strangely, that seems to say you'd make a bad Perl programmer, per Larry Wall's three virtues. -- Alvaro Herrera http://www.CommandPrompt.co

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Tom Lane
Robert Haas writes: > Lest there be any unclarity, I am NOT trying to shoot down this > feature with my laser-powered bazooka. Well, if you need somebody to do that --- I took a quick look through this patch, and it is NOT going to get committed. Not in anything approximately like its current fo

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Kevin Grittner
Robert Haas wrote: > It seems quite odd to me that when COPY succeeds but there are > errors, the transaction commits. The only indication that some of > my data didn't end up in the table is that the output says "COPY n" > where n is less than the total number of rows I attempted to copy. >

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 8:34 AM, Dimitri Fontaine wrote: > Robert Haas writes: >> I'm a little mystified by this response since I spent several >> paragraphs following the one that you have quoted here explaining how >> I think we should approach the problem of providing the features that >> are c

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Dimitri Fontaine
Robert Haas writes: > I'm a little mystified by this response since I spent several > paragraphs following the one that you have quoted here explaining how > I think we should approach the problem of providing the features that > are currently all encapsulated under the mantle of "error logging".

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Simon Riggs
On Wed, 2009-10-07 at 22:30 -0400, Robert Haas wrote: > On Fri, Sep 25, 2009 at 10:01 AM, Emmanuel Cecchet wrote: > > Robert, > > > > Here is the new version of the patch that applies to CVS HEAD as of this > > morning. > > > > Emmanuel > > I took a look at this patch tonight and, having now rea

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Robert Haas
On Thu, Oct 8, 2009 at 4:42 AM, Dimitri Fontaine wrote: > Robert Haas writes: >> What's really bad about this is that a flag called "error_logging" is >> actually changing the behavior of the command in a way that is far >> more dramatic than (and doesn't actually have much to do with) error >> l

Re: [HACKERS] COPY enhancements

2009-10-08 Thread Dimitri Fontaine
Robert Haas writes: > What's really bad about this is that a flag called "error_logging" is > actually changing the behavior of the command in a way that is far > more dramatic than (and doesn't actually have much to do with) error > logging. It's actually making a COPY command succeed that would

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Fri, Sep 25, 2009 at 10:01 AM, Emmanuel Cecchet wrote: > Robert, > > Here is the new version of the patch that applies to CVS HEAD as of this > morning. > > Emmanuel I took a look at this patch tonight and, having now read through some of it, I have some more detailed comments. It seems quite

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Wed, Oct 7, 2009 at 7:52 PM, Greg Smith wrote: > On Wed, 7 Oct 2009, Robert Haas wrote: > >> On Wed, Oct 7, 2009 at 3:17 AM, Greg Smith wrote: >> >>> I doubt taskmaster Robert is going to let this one linger around with >>> scope creep for too long before being pushed out to the next CommitFes

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Greg Smith
On Wed, 7 Oct 2009, Robert Haas wrote: On Wed, Oct 7, 2009 at 3:17 AM, Greg Smith wrote: I doubt taskmaster Robert is going to let this one linger around with scope creep for too long before being pushed out to the next CommitFest. I can't decide whether to feel good or bad about that ap

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Greg Smith
On Wed, 7 Oct 2009, Emmanuel Cecchet wrote: I think there is a misunderstanding about what the current patch is about...the patch does NOT include logging errors into a file (a feature we can add later on (next commit fest?)) I understand that (as one of the few people who has read the patch

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Wed, Oct 7, 2009 at 11:45 AM, Emmanuel Cecchet wrote: > You are suggesting then that it is the COPY command that aborts the > transaction. That would only happen if you had set a limit on the number of > errors that you want to accept in a COPY command (in which case you know > that there is so

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Emmanuel Cecchet
Greg Smith wrote: Absolutely, that's the whole point of logging to a file in the first place. What needs to happen here is that when one is aborted, you need to make sure that fact is logged, and with enough information (the pid?) to tie it to the COPY that failed. Then someone can crawl the

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Emmanuel Cecchet
Robert Haas wrote: On Wed, Oct 7, 2009 at 11:39 AM, Emmanuel Cecchet wrote: Robert Haas wrote: On Wed, Oct 7, 2009 at 9:12 AM, Emmanuel Cecchet wrote: Hi all, I think there is a misunderstanding about what the current patch is about. The patch includes 2 things: - error logg

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Emmanuel Cecchet
The roadmap I would propose for the current list of enhancements to COPY is as follows: 1. new syntax for COPY options (already committed) 2. error logging in a table 3. auto-partitioning (just relies on basic error logging, so can be scheduled anytime after 2) 4. error logging in a file manu

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Wed, Oct 7, 2009 at 11:39 AM, Emmanuel Cecchet wrote: > Robert Haas wrote: >> >> On Wed, Oct 7, 2009 at 9:12 AM, Emmanuel Cecchet >> wrote: >> >>> >>> Hi all, >>> >>> I think there is a misunderstanding about what the current patch is >>> about. >>> The patch includes 2 things: >>> - error log

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Emmanuel Cecchet
Robert Haas wrote: On Wed, Oct 7, 2009 at 9:12 AM, Emmanuel Cecchet wrote: Hi all, I think there is a misunderstanding about what the current patch is about. The patch includes 2 things: - error logging in a table for bad tuples in a COPY operation (see http://wiki.postgresql.org/wiki/Error

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Andrew Dunstan
Tom Lane wrote: Andrew Dunstan writes: Emmanuel Cecchet wrote: If you prefer to postpone the auto-partitioning to the next commit fest, I can strip it from the current patch and re-submit it for the next fest (but it's just 2 isolated methods really easy to review). I ce

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Tom Lane
Andrew Dunstan writes: > Emmanuel Cecchet wrote: >> If you prefer to postpone the auto-partitioning to the next commit >> fest, I can strip it from the current patch and re-submit it for the >> next fest (but it's just 2 isolated methods really easy to review). > I certainly think this should b

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Wed, Oct 7, 2009 at 9:12 AM, Emmanuel Cecchet wrote: > Hi all, > > I think there is a misunderstanding about what the current patch is about. > The patch includes 2 things: > - error logging in a table for bad tuples in a COPY operation (see > http://wiki.postgresql.org/wiki/Error_logging_in_CO

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Simon Riggs
On Wed, 2009-10-07 at 15:33 +0200, Dimitri Fontaine wrote: > Simon Riggs writes: > > It will be best to have the ability to have a specific rejection reason > > for each row rejected. That way we will be able to tell the difference > > between uniqueness violation errors, invalid date format on c

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Dimitri Fontaine
Simon Riggs writes: > It will be best to have the ability to have a specific rejection reason > for each row rejected. That way we will be able to tell the difference > between uniqueness violation errors, invalid date format on col7, value > fails check constraint on col22 etc.. In case that he

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Andrew Dunstan
Emmanuel Cecchet wrote: If you prefer to postpone the auto-partitioning to the next commit fest, I can strip it from the current patch and re-submit it for the next fest (but it's just 2 isolated methods really easy to review). I certainly think this should be separated out. In general it

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Emmanuel Cecchet
Hi all, I think there is a misunderstanding about what the current patch is about. The patch includes 2 things: - error logging in a table for bad tuples in a COPY operation (see http://wiki.postgresql.org/wiki/Error_logging_in_COPY for an example; the error message, command and so on are autom

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Robert Haas
On Wed, Oct 7, 2009 at 3:17 AM, Greg Smith wrote: > I know this patch is attracting more reviewers lately, is anyone tracking > the general architecture of the code yet?  Emmanuel's work is tough to > review just because there's so many things mixed together, and there's other > inputs I think sho

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Simon Riggs
On Wed, 2009-10-07 at 03:17 -0400, Greg Smith wrote: > On Mon, 5 Oct 2009, Josh Berkus wrote: > > Also, presumably, if you abort a COPY because of errors, you > > probably want to keep the errors around for later analysis. No? > > Absolutely, that's the whole point of logging to a file in the f

Re: [HACKERS] COPY enhancements

2009-10-07 Thread Greg Smith
On Mon, 5 Oct 2009, Josh Berkus wrote: I think that this was the original idea but we should probably rollback the error logging if the command has been rolled back. It might be more consistent to use the same hi_options as the copy command. Any idea what would be best? Well, if we're logging

Re: [HACKERS] COPY enhancements

2009-10-06 Thread Emmanuel Cecchet
I just realized that I forgot to CC the list when I answered to Josh... resending! Josh, >> I think that this was the original idea but we should probably rollback >> the error logging if the command has been rolled back. It might be more >> consistent to use the same hi_options as the copy c

Re: [HACKERS] COPY enhancements

2009-10-05 Thread Josh Berkus
Emmanuel, > I think that this was the original idea but we should probably rollback > the error logging if the command has been rolled back. It might be more > consistent to use the same hi_options as the copy command. Any idea what > would be best? Well, if we're logging to a file, you wouldn't

Re: [HACKERS] COPY enhancements

2009-10-05 Thread Emmanuel Cecchet
Hi Selena, This is my first pass at the error logging portion of this patch. I'm going to take a break and try to go through the partitioning logic as well later this afternoon. caveat: I'm not familiar with most of the code paths that are being touched by this patch. Overall: * I noticed '\se

Re: [HACKERS] COPY enhancements

2009-10-04 Thread Emmanuel Cecchet
The problem comes from the foo_malformed_terminator.data file. It is supposed to have a malformed terminator that was not catch by patch. The second line should look like: 2 two^M If it does not, you can edit it with emacs, go at the end of the second line and press Ctrl+q followed by Ctrl+m

Re: [HACKERS] COPY enhancements

2009-10-04 Thread Selena Deckelmann
Hi! On Fri, Sep 25, 2009 at 7:01 AM, Emmanuel Cecchet wrote: > Here is the new version of the patch that applies to CVS HEAD as of this > morning. Cool features! This is my first pass at the error logging portion of this patch. I'm going to take a break and try to go through the partitioning l

Re: [HACKERS] COPY enhancements

2009-10-04 Thread Jeff Davis
On Fri, 2009-09-25 at 10:01 -0400, Emmanuel Cecchet wrote: > Robert, > > Here is the new version of the patch that applies to CVS HEAD as of this > morning. I just started looking at this now. It seems to fail "make check", diffs attached. I haven't looked into the cause of the failure yet. Reg

Re: [HACKERS] COPY enhancements

2009-09-25 Thread Emmanuel Cecchet
Robert, Here is the new version of the patch that applies to CVS HEAD as of this morning. Emmanuel On Fri, Sep 18, 2009 at 12:14 AM, Emmanuel Cecchet wrote: Here is a new version of error logging and autopartitioning in COPY based on the latest COPY patch that provides the new syntax fo

Re: [HACKERS] COPY enhancements

2009-09-24 Thread Emmanuel Cecchet
Yes, I have to update the patch following what Tom already integrated of the COPY patch. I will get a new version posted as soon as I can. Emmanuel Robert Haas wrote: On Fri, Sep 18, 2009 at 12:14 AM, Emmanuel Cecchet wrote: Here is a new version of error logging and autopartitioning in C

Re: [HACKERS] COPY enhancements

2009-09-24 Thread Robert Haas
On Fri, Sep 18, 2009 at 12:14 AM, Emmanuel Cecchet wrote: > Here is a new version of error logging and autopartitioning in COPY based on > the latest COPY patch that provides the new syntax for copy options (this > patch also includes the COPY option patch). > > New features compared to previous v

Re: [HACKERS] COPY enhancements

2009-09-19 Thread Bruce Momjian
Tom Lane wrote: > Josh Berkus writes: > > It's not as if we don't have the ability to measure performance impact. > > It's reasonable to make a requirement that new options to COPY > > shouldn't slow it down noticeably if those options aren't used. And we > > can test that, and even make such te

Re: [HACKERS] COPY enhancements

2009-09-14 Thread Andrew Dunstan
Emmanuel Cecchet wrote: Greg Smith wrote: On Fri, 11 Sep 2009, Emmanuel Cecchet wrote: I guess the problem with extra or missing columns is to make sure that you know exactly which data belongs to which column so that you don't put data in the wrong columns which is likely to happen if thi

Re: [HACKERS] COPY enhancements

2009-09-14 Thread Emmanuel Cecchet
Greg Smith wrote: On Fri, 11 Sep 2009, Emmanuel Cecchet wrote: I guess the problem with extra or missing columns is to make sure that you know exactly which data belongs to which column so that you don't put data in the wrong columns which is likely to happen if this is fully automated. All

Re: [HACKERS] COPY enhancements

2009-09-13 Thread Andrew Dunstan
Tom Lane wrote: Josh Berkus writes: It's not as if we don't have the ability to measure performance impact. It's reasonable to make a requirement that new options to COPY shouldn't slow it down noticeably if those options aren't used. And we can test that, and even make such testing part

Re: [HACKERS] COPY enhancements

2009-09-13 Thread Tom Lane
Josh Berkus writes: > It's not as if we don't have the ability to measure performance impact. > It's reasonable to make a requirement that new options to COPY > shouldn't slow it down noticeably if those options aren't used. And we > can test that, and even make such testing part of the patch re

Re: [HACKERS] COPY enhancements

2009-09-13 Thread Josh Berkus
Tom, > [ shrug... ] Everybody in the world is going to want their own little > problem to be handled in the fast path. And soon it won't be so fast > anymore. I think it is perfectly reasonable to insist that the fast > path is only for "clean" data import. Why? No, really. It's not as if we

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Greg Smith
On Sat, 12 Sep 2009, Tom Lane wrote: Everybody in the world is going to want their own little problem to be handled in the fast path. And soon it won't be so fast anymore. I think it is perfectly reasonable to insist that the fast path is only for "clean" data import. The extra overhead is

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Tom Lane
Andrew Dunstan writes: > At the same time, I think it's probably not a good thing that users who > deal with very large amounts of data would be forced off the COPY fast > path by a need for something like input support for non-rectangular > data. [ shrug... ] Everybody in the world is going

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Andrew Dunstan
Tom Lane wrote: Andrew Dunstan writes: Right. What I proposed would not have been terribly invasive or difficult, certainly less so than what seems to be our direction by an order of magnitude at least. I don't for a moment accept the assertion that we can get a general solution for the

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Tom Lane
Andrew Dunstan writes: > Right. What I proposed would not have been terribly invasive or > difficult, certainly less so than what seems to be our direction by an > order of magnitude at least. I don't for a moment accept the assertion > that we can get a general solution for the same effort. A

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Andrew Dunstan
Greg Smith wrote: After some thought, I think that Andrew's feature *is* generally applicable, if done as IGNORE COLUMN COUNT (or, more likely, column_count=ignore). I can think of a lot of data sets where column count is jagged and you want to do ELT instead of ETL. Exactly, the ELT approach

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Greg Smith
On Fri, 11 Sep 2009, Emmanuel Cecchet wrote: I guess the problem with extra or missing columns is to make sure that you know exactly which data belongs to which column so that you don't put data in the wrong columns which is likely to happen if this is fully automated. Allowing the extra col

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Greg Smith
On Fri, 11 Sep 2009, Josh Berkus wrote: I've been thinking about it, and can't come up with a really strong case for wanting a user-defined table if we settle the issue of having a strong key for pg_copy_errors. Do you have one? No, but I'd think that if the user table was only allowed to be

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Heikki Linnakangas
Josh Berkus wrote: >> The user-defined table for rejects is obviously exclusive of the system >> one, either of those would be fine from my perspective. > > I've been thinking about it, and can't come up with a really strong case > for wanting a user-defined table if we settle the issue of having

Re: [HACKERS] COPY enhancements

2009-09-12 Thread Heikki Linnakangas
Josh Berkus wrote: >> The performance of every path to get data into the database besides COPY >> is too miserable for us to use anything else, and the current >> inflexibility makes it useless for anything but the cleanest input data. > > One potential issue we're facing down this road is that cu

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas
On Fri, Sep 11, 2009 at 6:56 PM, Emmanuel Cecchet wrote: > Robert Haas wrote: >> >> http://developer.postgresql.org/pgdocs/postgres/sql-explain.html >> > > Just out of curiosity, it looks like I could write something like: > EXPLAIN (ANALYZE TRUE, COSTS FALSE, VERBOSE TRUE, COSTS TRUE) statement >

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Josh Berkus
On 9/11/09 3:56 PM, Emmanuel Cecchet wrote: > Robert Haas wrote: >> http://developer.postgresql.org/pgdocs/postgres/sql-explain.html >> > Just out of curiosity, it looks like I could write something like: > EXPLAIN (ANALYZE TRUE, COSTS FALSE, VERBOSE TRUE, COSTS TRUE) statement > > What is the

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Emmanuel Cecchet
Robert Haas wrote: http://developer.postgresql.org/pgdocs/postgres/sql-explain.html Just out of curiosity, it looks like I could write something like: EXPLAIN (ANALYZE TRUE, COSTS FALSE, VERBOSE TRUE, COSTS TRUE) statement What is the expected behavior if someone puts multiple time the same

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote: > Integrating hstore into core and then > making COPY able to execute a subquery to get its options is certainly > not easier than a straightforward grammar modification; it's taking a > small project and turning it into several big ones. To be honest,

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Emmanuel Cecchet
Greg Smith wrote: The full set of new behavior here I'd like to see allows adjusting: -Accept or reject rows with extra columns? -Accept or reject rows that are missing columns at the end? --Fill them with the default for the column (if available) or NULL? -Save rejected rows? --To a single syst

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Andrew Dunstan
Josh Berkus wrote: Greg, The performance of every path to get data into the database besides COPY is too miserable for us to use anything else, and the current inflexibility makes it useless for anything but the cleanest input data. One potential issue we're facing down this road is

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Josh Berkus
Greg, > The performance of every path to get data into the database besides COPY > is too miserable for us to use anything else, and the current > inflexibility makes it useless for anything but the cleanest input data. One potential issue we're facing down this road is that current COPY has a du

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane
Robert Haas writes: > On Fri, Sep 11, 2009 at 5:32 PM, Tom Lane wrote: >> Why?  We'd certainly still support the old syntax for existing options, >> just as we did with EXPLAIN. > None of the syntax proposals upthread had that property, which doesn't > mean we can't do it. However, we'd need so

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Robert Haas
On Fri, Sep 11, 2009 at 5:32 PM, Tom Lane wrote: > Robert Haas writes: >> The biggest problem I have with this change is that it's going to >> massively break anyone who is using the existing COPY syntax. > > Why?  We'd certainly still support the old syntax for existing options, > just as we did

Re: [HACKERS] COPY enhancements

2009-09-11 Thread Tom Lane
Robert Haas writes: > The biggest problem I have with this change is that it's going to > massively break anyone who is using the existing COPY syntax. Why? We'd certainly still support the old syntax for existing options, just as we did with EXPLAIN. regards, tom lane
