[BUGS] Strange AFTER UPDATE trigger behavior
POSTGRESQL BUG REPORT TEMPLATE

Your name : mason
Your email address : mason (at) vanten.com

System Configuration
Architecture (example: Intel Pentium) : AMD, multi proc, not sure exactly
Operating System (example: Linux 2.0.26 ELF) : NetBSD
PostgreSQL version (example: PostgreSQL-7.3.4): PostgreSQL-7.3.4

Please enter a FULL description of your problem:

The following scenario is the expected behavior: A function performs two
operations: 1) it updates a column on a table, causing a trigger to fire
which will insert a row into a new table. 2) The original function then
updates that newly inserted row with some additional data.

Actual behavior: If the trigger is placed BEFORE UPDATE on the table with
the changing column, this works as planned. If the trigger is instead placed
AFTER UPDATE on the table, then the function cannot find the newly inserted
row to update after the trigger has fired, even though the row is very
clearly in the table.

Please describe a way to repeat the problem. Please try to provide a concise
reproducible example, if at all possible:

-- To see the two types of behavior just switch the word AFTER with BEFORE
-- and run again.

CREATE TABLE foo (
    id serial PRIMARY KEY
) WITHOUT OIDS;

CREATE TABLE bar (
    id serial PRIMARY KEY REFERENCES foo,
    data text
) WITHOUT OIDS;

CREATE OR REPLACE FUNCTION insert_rec() RETURNS TRIGGER AS '
BEGIN
    INSERT INTO bar VALUES (NEW.id, ''text'');
    RETURN NEW;
END;
' LANGUAGE 'plpgsql' SECURITY DEFINER;

CREATE TRIGGER foo_trigger AFTER UPDATE ON foo
    FOR EACH ROW EXECUTE PROCEDURE insert_rec();

INSERT INTO foo VALUES (1);
INSERT INTO foo VALUES (2);
INSERT INTO foo VALUES (3);
INSERT INTO foo VALUES (4);

CREATE OR REPLACE FUNCTION test() RETURNS boolean AS '
BEGIN
    UPDATE foo SET id = 10 WHERE id = 1;
    UPDATE bar SET data = ''changed'' WHERE id = 10;
    UPDATE foo SET id = 20 WHERE id = 2;
    UPDATE bar SET data = ''changed'' WHERE id = 20;
    RETURN true;
END;
' LANGUAGE 'plpgsql' SECURITY DEFINER;

SELECT test();
SELECT * FROM foo;
SELECT * FROM bar;

DROP TABLE foo CASCADE;
DROP TABLE bar CASCADE;

If you know how this problem might be fixed, list the solution below:
- ??

-- Mason Glaves, Senior Programmer, Vanten K.K. [EMAIL PROTECTED]
Tel: 03-5919-0266 http://www.vanten.com Fax: 03-5919-0267

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
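For anyone reproducing this, a small diagnostic sketch (not part of the original report; the function name test_debug is made up) that makes the visibility question easier to see by counting the trigger-inserted row right after the UPDATE that fires the trigger:

CREATE OR REPLACE FUNCTION test_debug() RETURNS boolean AS '
DECLARE
    n integer;
BEGIN
    -- fire the trigger by updating foo, then check whether the row the
    -- trigger inserted into bar is visible to this function yet
    UPDATE foo SET id = 10 WHERE id = 1;
    SELECT count(*) INTO n FROM bar WHERE id = 10;
    RAISE NOTICE ''rows visible in bar for id 10: %'', n;
    RETURN true;
END;
' LANGUAGE 'plpgsql';

SELECT test_debug();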
Re: [BUGS] BUG #5015: MySQL migration wizard does not start
On Wed, Aug 26, 2009 at 02:55:37PM +, Ken Smith wrote: > I also noted > that if I try to run from the commandline that you use '/' in some of the > program paths causing the OS to say it cannot find the file - "C:/Program > Files/Java/jdk1.5.0_16/jre/bin/java.exe". I'm pretty sure that Windows has known about forward slashes being path delimiters since Windows 95 and the start of Win32. I know I normally use forward slashes in XP when I'm at work! -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] inconsistent composite type null handling in plpgsql out variable
On Fri, Aug 28, 2009 at 02:06:02PM -0400, Merlin Moncure wrote: > 3) If we decide the sql standard is correct, so that (null, null) is > null == true, then we should observe rule 1 and make things work in > consistent way. This means, for example, that null::foo and (null, > null)::foo should not be distinct. The more awkward case (to me anyway) is that the standard says (1,NULL) IS NULL should evaluate to TRUE. I'd never noticed the ROW / RECORD dichotomy before; could one of these be made SQL compatible and the other use more sane semantics? -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] inconsistent composite type null handling in plpgsql out variable
On Mon, Aug 31, 2009 at 07:26:59PM +0200, Pavel Stehule wrote:
> 2009/8/31 Sam Mason :
> > The more awkward case (to me anyway) is that the standard says (1,NULL)
> > IS NULL should evaluate to TRUE.
>
> what?
>
> only (NULL, NULL) IS NULL is true

Bah, sorry you're right! I was rattling my favorite tin and getting mixed up with the behavior of IS NOT NULL, the negation of which would say this row is null. I.e.:

SELECT NOT (1,NULL) IS NOT NULL;

evaluates to TRUE. I think the consensus is that we should continue to follow the spec on this, but I was getting confused as to which operator contains the EXISTS and which the FORALL. I.e. a value "v" IS NULL iff all elements of "v" are 'the null value', whereas "v" IS NOT NULL iff no element of "v" is 'the null value'; neither is simply the negation of the other.

> p.s. what isn't consistent (maybe - there are more possible
> interpretations) is
>
> (NULL, NULL) IS DISTINCT FROM NULL is true

Yup, I'd agree with Merlin that a ROW consisting entirely of 'null values' should itself be 'the null value' (to use the terminology from the copy of the SQL spec I'm reading). I think this should also work recursively:

SELECT ROW(ROW(NULL)) IS DISTINCT FROM NULL;

should return FALSE, in my understanding.

--
  Sam  http://samason.me.uk/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
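For reference, the behavior being discussed can be checked directly from psql (a quick sketch; results as I'd expect them on recent 8.x releases):

SELECT (NULL, NULL) IS NULL;       -- true: every field is the null value
SELECT (1, NULL) IS NULL;          -- false: one field is not null
SELECT (1, NULL) IS NOT NULL;      -- false: one field *is* null
SELECT NOT (1, NULL) IS NOT NULL;  -- true: the negation mentioned above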
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Tue, Sep 01, 2009 at 04:36:25PM +, Joseph Shraibman wrote:
> Description: CASE returns ELSE value always when type is "char"

I think it's just silently truncating the literal to a single character.

> [local]:playpen=> select c.relkind, CASE c.relkind WHEN 'r' THEN 'table'
> WHEN 'v' THEN 'view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's'
> THEN 'special' WHEN 'c' THEN 'composite type' WHEN 't' THEN 'toast' ELSE
> c.relkind
> playpen-> END from (select 'r'::"char" AS relkind) c;
>  relkind | relkind
> ---------+---------
>  r       | t

Here, 'r' maps to the "char" literal 'table' which PG interprets as the value 't'--i.e. PG silently chops off the 'able'. The bug would seem to be in your code, but PG could maybe throw an error to tell you this is what is happening?

A possible fix would be to have your ELSE clause as:

  c.relkind::text

As that way the other branches would be interpreted as text and they wouldn't be getting chopped off along the way.

--
  Sam  http://samason.me.uk/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
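A stripped-down way to see just the truncation on its own (a sketch, not from the original report):

SELECT 'table'::"char";  -- yields 't': everything after the first character is silently dropped
SELECT 'view'::"char";   -- yields 'v'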
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Tue, Sep 01, 2009 at 05:49:25PM +0100, Sam Mason wrote:
> PG could maybe throw an error to tell you this is
> what is happening?

Would something like the included patch be accepted?

--
  Sam  http://samason.me.uk/

*** src/backend/utils/adt/char.c~	2009-01-01 17:23:49.0 +
--- src/backend/utils/adt/char.c	2009-09-02 10:11:13.0 +0100
***************
*** 34,39 ****
--- 34,45 ----
  {
  	char	   *ch = PG_GETARG_CSTRING(0);
  
+ 	if (ch[1])
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ 				 errmsg("only a single character is supported in \"char\" literals: \"%s\"",
+ 						ch)));
+ 
  	PG_RETURN_CHAR(ch[0]);
  }

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
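For illustration, with a patch along these lines applied one would expect the earlier example to fail up front instead of silently truncating (the wording below is taken from the errmsg in the patch):

SELECT 'table'::"char";
-- ERROR:  only a single character is supported in "char" literals: "table"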
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Wed, Sep 02, 2009 at 03:50:05PM +0100, Greg Stark wrote: > Perhaps we should stop thinking of "unknown" as, er, "unknown" and > think of it as "text literal". A text literal has implicit casts to > every data type but a normal text string has to be explicitly cast. How does that help things? You seem to be keeping the semantics and only changing the name, when it's the semantics that you seem to be complaining about. I'm pretty sure it's correct to treat it as "unknown", other type-systems do this and it all works well. The thing that makes things ambiguous is the ad-hoc polymorphism that exists in function calls and operators. With PG's type system you know almost nothing about any types involved in an arbitrary expression, operators are better than functions (given the types of the arguments you know the return type, with default parameters even this knowledge doesn't exist with functions) but still leave things far too open to have any rigor without spelling out types in full everywhere. For example: CREATE FUNCTION add(int,int) RETURNS int LANGUAGE sql AS $$ SELECT $1 + $2; $$; CREATE FUNCTION add(int,int,int DEFAULT NULL) RETURNS text LANGUAGE sql AS $$ SELECT ($1 + $2)::text; $$; What type should it attribute to the result of: SELECT add(1,2); In fact it doesn't seem to want to play ball at all. Even given the apparently unambiguous: SELECT 1+add(1,2); or SELECT 'hi'||add(1,2); It doesn't get anywhere. No need for "text 'hi'" in the second one because || isn't defined for values of integer type. -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
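To make it concrete where the ambiguity bites (a sketch, not suggesting a fix): a call that can only match one of the two definitions still resolves; it's the two-argument call that has nowhere to go:

SELECT add(1, 2, 3);               -- only the three-argument version can match; returns '3' as text
DROP FUNCTION add(int, int, int);  -- with one overload gone...
SELECT 1 + add(1, 2);              -- ...the two-argument call resolves again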
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Wed, Sep 02, 2009 at 12:36:00PM -0400, Robert Haas wrote: > On Wed, Sep 2, 2009 at 11:55 AM, Sam Mason wrote: > > In fact it doesn't seem to want to play ball at all. Even given the > > apparently unambiguous: > > > > SELECT 1+add(1,2); > > or > > SELECT 'hi'||add(1,2); > > > > It doesn't get anywhere. No need for "text 'hi'" in the second one > > because || isn't defined for values of integer type. > > Right. This is exactly the sort of thing that languages with real > type inference have no problem handling. Of course, ML for example > doesn't allow overloading precisely because (AIUI) it makes type > inference difficult. It would be awesome if we could make this work > though. Difficult, but not intractable. Haskell has done this sort of thing for quite a while; although it handles ad-hoc polymorphism differently than PG does. You basically end up saying how much polymorphism you want to allow for each function, for example the return type of an operator (in PG) is determined exactly by the type of its arguments. In Haskell you would have the power to say, if you so wanted, that the type of an operator's RHS is determined exactly by the type of its LHS and return type, or even, in the most general case, that it's parametrized over all three types. Obviously the more types you leave free the more typing you have to do specifying all the types as the type-inference has less leverage to work with. I've been trying to think about how to apply a more modern type system to PG for a while and hence my comments about how things like NULL rows should be handled are based on this and may come across as rather dogmatic sometimes, it's about the only way I can get things to hold together without introducing much more complexity than seems strictly necessary. -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Wed, Sep 02, 2009 at 01:19:07PM -0400, Tom Lane wrote: > Sam Mason writes: > > ... For example: > > > CREATE FUNCTION add(int,int) RETURNS int LANGUAGE sql > > AS $$ SELECT $1 + $2; $$; > > > CREATE FUNCTION add(int,int,int DEFAULT NULL) RETURNS text LANGUAGE sql > > AS $$ SELECT ($1 + $2)::text; $$; > > > What type should it attribute to the result of: > > > SELECT add(1,2); > > > In fact it doesn't seem to want to play ball at all. Even given the > > apparently unambiguous: > > > SELECT 1+add(1,2); > > or > > SELECT 'hi'||add(1,2); > > > It doesn't get anywhere. > > Well, no, because our type resolution is bottom-up; it does not consider > context when trying to resolve the overloaded "add()" function > reference. "Unknown" is the only part of the system that allows for any > delay at all in identifying the type of a construct, and even that is > limited to a literal and its first-level surrounding context. OK, I got distracted and my example was bad. > It's interesting that you want to go in 100% the opposite direction from > Kevin, who seems to want to eliminate type inference altogether. Maybe > our current compromise isn't too bad, if it makes everybody unhappy in > opposite directions ;-) The current compromise has worked for a while so there's no immediate reason to change it. I wasn't interpreting Kevin's request directly as I don't think he really means what he's saying and really wants an unambiguous way of writing literals in queries. I think it's easier to get this without specifying types everywhere and introducing constraints elsewhere to maintain what compatibility we've got with the SQL standard. If we did follow Kevin's request directly, should we also be specifying the type of NULL? -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Wed, Sep 02, 2009 at 12:54:03PM -0500, Kevin Grittner wrote: > Sam Mason wrote: > > If we did follow Kevin's request directly, should we also be > > specifying the type of NULL? > > I don't *think* the SQL standard requires that, and barring that I > don't see any compelling reason to type NULL. The SQL standard certainly doesn't require it. It's just that you were requiring the types of literals that happened to be enclosed in quotes to have their type ascribed, so why not the NULL literal? > One problem I do see > with the current scheme, however, is that NULL *does* get typed to > text when it makes no sense. In my view, a CASE expression which has > only NULL for its return values, or an abbreviated form of CASE, such > as COALESCE or NULLIF, should be evaluated exactly the same as if they > were replaced by NULL itself. For example, COALESCE(NULL, NULL) > currently yields NULL::text. In my view that's wrong. I view it as a > bug, but that seems to be a hard sell here. Yes, that's because PG does a bottom-up solve of the type constraints. I think it should really result in an unknown type as well. > Likewise, I think that in the query which started this thread, the > cast to "char" is not sensible. I'm not sure how that could be > resolved, but it doesn't seem correct to me. All branches unify with the "char" type (i.e. they're all "char" or unknown) and hence the result of the expression is determined to be of type "char". -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
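On 8.4 this can be watched directly (a sketch; pg_typeof() only exists from 8.4 onwards):

SELECT pg_typeof(COALESCE(NULL, NULL));  -- text: the bare NULLs have been forced to text
SELECT pg_typeof(NULL);                  -- unknown: a lone NULL literal keeps its unknown type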
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Wed, Sep 02, 2009 at 01:27:35PM -0500, Kevin Grittner wrote: > Sam Mason wrote: > > you were requiring the types of literals that happened to be > > enclosed in quotes to have their type ascribed, so why not the NULL > > literal? > > Well, unless things have changed in recent versions of the standard > and I've missed the change, a series of characters enclosed in > apostrophes is what the standard calls a "character string literal" > and defines it to be be related to character based types such as > varchar. As far as I'm aware, considering it to be undefined is a > PostgreSQL extension. If you can point to something in the standard > to show where I'm mistaken, I'll look it over. I'll go looking for > something to back my memories on the topic, too, since my memory seems > to be less reliable than it once was. Sorry, I was referring to your explicit naming of types as in the following: http://archives.postgresql.org/message-id/4a9e337802250002a...@gw.wicourts.gov reading it back again I'm not sure if that's what you meant now. Email is hard work isn't it! -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Wed, Sep 02, 2009 at 01:55:28PM -0500, Kevin Grittner wrote: > So the behavior of the "char" type is anomalous in this regard? Other > character-based types behave like varchar (which has the behavior I > would expect here)? That is encouraging. Why isn't the behavior of > "char" in this regard considered a bug to be fixed? I think there are lots of implicit casts going on that muddy the water with respect to what's going on at a type level. Things get promoted to TEXT easily in PG. I posted a patch in this thread: http://archives.postgresql.org/message-id/20090902091654.gl5...@samason.me.uk -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Wed, Sep 02, 2009 at 01:37:20PM -0500, Kevin Grittner wrote:
> That still seems to be the case in the draft of the 2003 standard I
> have:
>
> <general literal> ::=
>     <character string literal>
>   | <national character string literal>
>   | <Unicode character string literal>
>   | <binary string literal>
>   | <datetime literal>
>   | <interval literal>
>   | <boolean literal>
>
> <character string literal> ::=
>   [ <introducer> <character set specification> ]
>   <quote> [ <character representation> ... ] <quote>
>   [ { <separator> <quote> [ <character representation> ... ] <quote> }... ]
>
> The ball's in your court to show something in the standard to say that
> a character string literal is ever *not* to be taken as a character
> string.

Huh, you're right. I'd always thought '2001-01-01' was a valid date literal; it seems the standard has required it to be prefixed by DATE at least back to SQL92.

--
  Sam  http://samason.me.uk/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
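In other words, the standard's way of writing a typed date literal, which PG accepts too, is the DATE-prefixed form (a quick sketch):

SELECT DATE '2001-01-01';  -- a datetime literal: the DATE keyword supplies the type
SELECT '2001-01-01';       -- on its own this is just a character string literal as far as the spec is concerned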
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Wed, Sep 02, 2009 at 12:54:03PM -0500, Kevin Grittner wrote: > Sam Mason wrote: > > If we did follow Kevin's request directly, should we also be > > specifying the type of NULL? > > I don't *think* the SQL standard requires that, and barring that I > don't see any compelling reason to type NULL. I've just realized that either I'm missing your point entirely (it's happened before :) or this ignores the point entirely. PG wants to assign types to every expression, whether this expression will evaluate to a NULL value at run-time or not is immaterial in this regard. I think SQL wants to do the same, but I don't have as much conviction as Tom here. Once we're ascribing types to expressions then whether it happens to contain the literal "1", "'txt'" or "NULL" we're committed to giving it some type---the only question is which one. We thus need to type expressions consisting of just NULL constants. A fun puzzle to base any inductive solution on is what type to ascribe to the following: CREATE VIEW v (c) AS SELECT NULL; PG allows it, but the resulting view seems somewhat unusable. -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
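One way to poke at what PG actually decided for that view (a sketch; I'd expect the column to come out as the pseudo-type "unknown", but it's worth checking on your own version):

CREATE VIEW v (c) AS SELECT NULL;
SELECT atttypid::regtype
  FROM pg_attribute
 WHERE attrelid = 'v'::regclass AND attname = 'c';
DROP VIEW v;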
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Wed, Sep 02, 2009 at 02:41:32PM -0500, Kevin Grittner wrote: > Sam Mason wrote: > > > I'd always thought '2001-01-01' was a valid date literal, seems the > > standard has required it to be prefixed by DATE at least back to > > SQL92. > > Yep. I don't know if it would be remotely feasible, but the > implementation which seems like it would be "standard-safe" but still > give reasonable concessions to those wanting to skip the extra > keystrokes of declaring the type of literals which are not character > based would be to go with the suggestion of having a character string > literal type, and change the semantics such that if there is a valid > interpretation of the statement with the character string literal > taken as text, it should be used; if not, resolve by current "unknown" > rules. Probably not feasible, but it seems likely it would make > everyone reasonably happy if it could be done. Sounds as though that'll introduce more ambiguity into the system than there is already. > That leaves the issue of NULL being forced to type text in the absence > of any type info in CASE, COALESCE, and NULLIF. If there were a way > to say that these could return unknown type, that would be solved. > That doesn't seem as though it would be likely to be massively > difficult, although I could be wrong about that. Would be nice and I'd love it to work like this, but it's quite a big change I think. Currently, once PG has decided on a type it sticks with it against all further evidence. Another example: SELECT NULL AS c UNION SELECT '1' UNION SELECT 2; Once you're doing the above you're into the world of full parametric polymorphism and you're having to do much more complicated things at the type level. When my free time becomes infinite I'll have a chance! -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Wed, Sep 02, 2009 at 02:59:54PM -0500, Kevin Grittner wrote: > Sam Mason wrote: > > CREATE VIEW v (c) AS > > SELECT NULL; > > > > PG allows it, but the resulting view seems somewhat unusable. > > I'm not sure whether the only place the standard doesn't require a > cast is on assignment, but this is one place that the standard clearly > does require a cast, and I'm all for that. I'm probably missing something obvious again, but where does it say that? Bear in mind that my simple NULL could be an arbitrarily complex expression, I've just chosen a simple NULL for pedagogic reasons. I can only see a requirement that the chosen type must be compatible. That seems to leave it open to arbitrarily choosing any type in this case. -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Fri, Sep 04, 2009 at 10:59:48AM -0500, Kevin Grittner wrote:
> Tom Lane wrote:
> > I certainly don't want to have "char" emulate the misbegotten
> > decision to have explicit and implicit coercions behave differently.
> > So it looks to me like the argument to make "char" work like char(1)
> > doesn't actually help us much to decide if an error should be thrown
> > here or not.

On the whole, throwing an error seems better from a usability perspective. I'm all for the error being thrown; no particular feelings about whether it only happens during "explicit" casts or everywhere. There's always the substring() function if the user wants it.

> I feel that the behavior of "char" in at least this case should match
> char(1) (or just plain char):

Hum, I'm not sure if that's useful behavior. As far as I can tell, you seem to be wanting in-memory representations of "string like types" to all use the same representation and only use the actual types when saving to/from disk. This would give behavior that is consistent with what you're asking for below.

> test=# select case when true then 'xxx' else 'a'::"char" end from t;
>  case
> ------
>  x
> (1 row)

With the patch I gave, or something like it, this would throw an error because 'xxx' is being used to initialize a value of "char" type.

> test=# select case when true then 'xxx' else 'a'::char(1) end from t;
>  case
> ------
>  xxx
> (1 row)

This gives back 'xxx' because the types character and bpchar have an implicit cast defined between them. The result is thus of type bpchar which places no restrictions on the length (that would apply here anyway), with 'a' being cast from character to bpchar implicitly.

> test=# select case when true then 'xxx' else 'a'::char end from t;
>  case
> ------
>  xxx
> (1 row)

This does the same as above.

> Much as the reason for the behavior of "char" may seem clear when
> inside the code looking out, it is astonishing for someone writing
> application code.

I think things would be clearer if an error was thrown in the constructor of "char" types when the length wasn't correct. Implicit casts are muddying the waters as well, which doesn't aid understanding of what's going on here.

--
  Sam  http://samason.me.uk/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Fri, Sep 04, 2009 at 12:26:19PM -0500, Kevin Grittner wrote:
> Sam Mason wrote:
> > Kevin Grittner wrote:
> >> test=# select case when true then 'xxx' else 'a'::"char" end from t;
> >
> > 'xxx' is being used to initialize a value of "char" type.
>
> As I read the semantics of the CASE predicate, it returns one of the
> given values. 'x' is not one of the given values, regardless of type.

You seem to be confused about the difference between literals and values. Maybe a different example:

SELECT '1'::int;

I get 1 back from that, and not '1'. This is because '1' is the literal that is parsed into a value of type integer and then the query is run and this same value is asked to convert itself back into a literal to be written out to the screen.

Back to your example; you're asking PG to interpret the literal 'xxx' as a "char" and it does that (but doesn't give any error back when it chucks data away). This behavior may be confusing because for text types the literal is exactly the same as the value itself, but this is only a very specific behavior of text types. For example, '{"1"}', '{1}' and even '{"+001"}' are all literal representations of identical integer arrays.

> I don't think an error is the right thing, I think returning the
> specified value is the right thing. I don't think it's a good thing
> that the type system decides that the result type for this case
> predicate is "char" and that 'xxx' needs to be coerced to that type.

I fail to see how an error isn't the right thing; if we try with some other types let's see if you think any of these should succeed.

SELECT CASE WHEN TRUE THEN text 'xxx' ELSE 0 END;
SELECT CASE WHEN TRUE THEN text 'xxx' ELSE TRUE END;
SELECT CASE WHEN TRUE THEN text 'xxx' ELSE '{1}'::INT[] END;
SELECT CASE WHEN TRUE THEN text 'xxx' ELSE array [1] END;

"char" is no different other than, by default, it happens to look a lot like any value of text type. It's a different type (that happens to have some implicit casts to confuse things) and hence I can't see why invalid literals should not be thrown out.

--
  Sam  http://samason.me.uk/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
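The array claim can be checked directly, since all three literals are supposed to parse to the same value (a quick sketch):

SELECT '{"1"}'::int[] = '{1}'::int[];     -- true: different literals, identical value
SELECT '{"+001"}'::int[] = '{1}'::int[];  -- true as well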
Re: [BUGS] BUG #5028: CASE returns ELSE value always when type is "char"
On Fri, Sep 04, 2009 at 02:01:07PM -0500, Kevin Grittner wrote:
> > "char" is no different other than, by default, it happens to look a
> > lot like any value of text type.
>
> So much so that it has the same name as a text type (wrapped in
> quotes) and behaves a lot like one:

You're getting bitten by implicit casts. Try creating a view of any of your examples and seeing what the resulting types are, or in 8.4 an EXPLAIN VERBOSE seems to show how it has typed the expression.

--
  Sam  http://samason.me.uk/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
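For example, something along these lines shows what type the CASE expression was resolved to (a sketch; case_test is just a throwaway name):

CREATE VIEW case_test AS
  SELECT CASE WHEN true THEN 'xxx' ELSE 'a'::"char" END AS c;
\d case_test   -- column c shows up as "char": the whole CASE was resolved to that type
DROP VIEW case_test;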
Re: [BUGS] BUG #5053: domain constraints still leak
On Mon, Sep 14, 2009 at 11:16:23AM -0400, Robert Haas wrote:
> I haven't read the code in this area, but for what it's worth, I guess
> I lean toward the view that treating a row of NULLs as being the same
> thing as an undecorated NULL does not make very much sense.

I agree; when compared to most languages it doesn't. When compared to the semantics of the other operators in SQL it gets better. I personally think PG should strive to be internally consistent rather than consistent with other (non-SQL based) languages.

> If I have
> a table row which contains (1, NULL, NULL) and I update the first
> column to be NULL, I feel like I now have (NULL, NULL, NULL), not just
> NULL. Every other programming language I'm aware of makes this
> distinction - for good reasons - and I don't really see any reason why
> SQL should do anything different.

I'm not aware of any other language that does the automatic "lifting" (to borrow nomenclature from Haskell) that SQL does, allowing NULL to appear in *every* type. Java, for example, has null references, but these are very different creatures from nulls in databases--the programmer has to explicitly deal with them all the time and also they only apply to references. Taken another way, each object in a normal imperative language has its own identity, but in a database two rows that "look" the same are the same. Thirdly, IS NULL is defined to look "inside" composite values to see if they're "really" null. It's these differences in semantics that seem to make it all OK.

> Under that view, null::test is not itself a test, but denotes the
> absence of one.

OK, but how can you distinguish NULL from ROW(NULL,NULL)?

SELECT v IS NULL, v.a, v.b FROM (SELECT NULL, NULL) v(a,b);

Would appear to return the same thing if ROW(NULL,NULL) evaluated to NULL or not. The only time it would show up is when you're trying to save the value into a table and I think this would tend to do the right thing more often. For example:

INSERT INTO t (id,rv)
  SELECT f.id, b FROM foo f LEFT JOIN bar b ON (f.id = b.id);

Would fail if any bars didn't exist, whereas the current behavior is to insert a row with rv containing all null values. You can't test for this case because IS NULL would return the "wrong" thing as it looks inside composites.

--
  Sam  http://samason.me.uk/

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
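To see why IS NULL can't be used to detect the missing-bar case (a sketch using the foo/bar names from the INSERT example above; not part of the original mail):

SELECT f.id, b IS NULL AS missing_or_all_null
  FROM foo f LEFT JOIN bar b ON (f.id = b.id);
-- IS NULL comes back true both for a bar row that genuinely doesn't exist and for
-- one whose columns all happen to be null, so the two cases can't be told apart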
Re: [BUGS] BUG #5053: domain constraints still leak
On Mon, Sep 14, 2009 at 10:54:07AM -0500, Kevin Grittner wrote: > Sam Mason wrote: > > the deeper problem seems to be that the table was created as: > > > > create table test (a tstdom); > > > > and not as: > > > > create table test (a tstdom not null); > > Given that tstdom is declared as NOT NULL, is this difference > considered a *feature* or is it an implementation quirk? That's why I pointed it out! Based on my reading of the SQL spec (and reading about Codd's descriptions of domains) I'd say it was a bug/implementation quirk. -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #5053: domain constraints still leak
On Mon, Sep 14, 2009 at 10:22:34AM -0400, Tom Lane wrote: > Robert Haas writes: > > It seems like regardless of this discussion you oughtn't to be able to > > store a NULL into that table column. But maybe I'm just confused. > > The system is relying on the not-unreasonable assumption that if it > extracts a column of type X from someplace, what it has is a valid value > of type X. Yup; the deeper problem seems to be that the table was created as: create table test (a tstdom); and not as: create table test (a tstdom not null); which is how you seem to be treating it. > Depending on what we decide about the whole composite-null > mess, maybe we will be forced to abandon that assumption ... but I'm > sure not going to do so until my back is to the wall. There seems to be a little space yet! This whole issue seems only distantly related to the treatment of null rows to me. I think PG has got its semantics confused along the way somewhere and things need tweaking. The only way I can get it all to work nicely in my head is if ROW(NULL) evaluates to a NULL value (and not a row containing a NULL value, as it does at the moment) and the NULL/NOT NULL constraint on the CREATE DOMAIN is used somehow for the nullness constraint of any columns using this domain. It's the second part that seems to be more critical, but there are various ways of interpreting the meaning. I'm tempted to say that the nullness specified in the domain puts a bound on the amount of nullness available--i.e. it would be impossible to create a nullable column from a domain that specified NOT NULL. The reason it's only a "limit" is that it seems useful to be able to say that a normally nullable domain can't be null for this column in this table. Not sure if this is what people want though. You then get into fun cases like: create domain tstdom as integer; create domain tstdom2 as tstdom; -- Sam http://samason.me.uk/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
[BUGS] unable to fail over to warm standby server
by hand using `pg_ctl -D /data/pgdata-8.3/ start` -- this command returned a "server starting" message, but the server never appeared in the process list.

7.) The contents of the newly created postgresql.log were:

tail postgresql-2010-01-18_210946.log
2010-01-18 21:09:46 UTC ()LOG: database system was interrupted while in recovery at log time 2010-01-18 20:10:59 UTC
2010-01-18 21:09:46 UTC ()HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2010-01-18 21:09:46 UTC ()LOG: starting archive recovery
2010-01-18 21:09:46 UTC ()LOG: restore_command = 'pg_standby -t /tmp/pgsql.trigger.5432 /home/postgres/wal_archive %f %p %r 2>>pg_standby.log'
2010-01-18 21:09:46 UTC ()FATAL: could not restore file "0002.history" from archive: return code 65280
2010-01-18 21:09:46 UTC ()LOG: startup process (PID 9284) exited with exit code 1
2010-01-18 21:09:46 UTC ()LOG: aborting startup due to startup process failure

8.) Sysadmin noticed the 'recovery.conf' file was still in the db root directory (it had not been renamed recovery.done). The recovery.conf file was renamed to recovery.conf.old and running `pg_ctl -D /data/pgdata-8.3/ start` was attempted again.

9.) The server did not come up (again). This time the contents of the new postgresql.log file were:

[postg...@prod-db-2 pg_log]$ tail -n 100 postgresql-2010-01-18_211132.log
2010-01-18 21:11:32 UTC ()LOG: database system was interrupted while in recovery at log time 2010-01-18 20:10:59 UTC
2010-01-18 21:11:32 UTC ()HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2010-01-18 21:11:32 UTC ()LOG: could not open file "pg_xlog/00023C8200A3" (log file 15490, segment 163): No such file or directory
2010-01-18 21:11:32 UTC ()LOG: invalid primary checkpoint record
2010-01-18 21:11:32 UTC ()LOG: could not open file "pg_xlog/00023C820049" (log file 15490, segment 73): No such file or directory
2010-01-18 21:11:32 UTC ()LOG: invalid secondary checkpoint record
2010-01-18 21:11:32 UTC ()PANIC: could not locate a valid checkpoint record
2010-01-18 21:11:32 UTC ()LOG: startup process (PID 9328) was terminated by signal 6: Aborted
2010-01-18 21:11:32 UTC ()LOG: aborting startup due to startup process failure

10.) At this point a lot of time was spent on the #postgresql IRC channel looking for help diagnosing this problem. Before the issue was resolved, the primary server was brought back online. Soon after we decided to create a new base backup on the standby server and aborted trying to recover.
FYI: The last few lines of pg_standby.log were:

[postg...@prod-db-2 pgdata-8.3]$ tail -n 300 pg_standby.log
trigger file found
cp: cannot stat `/home/postgres/wal_archive/0002.history': No such file or directory
cp: cannot stat `/home/postgres/wal_archive/0002.history': No such file or directory
cp: cannot stat `/home/postgres/wal_archive/0002.history': No such file or directory
cp: cannot stat `/home/postgres/wal_archive/0002.history': No such file or directory
cp: cannot stat `/home/postgres/wal_archive/0002.history': No such file or directory
cp: cannot stat `/home/postgres/wal_archive/0002.history': No such file or directory
cp: cannot stat `/home/postgres/wal_archive/0002.history': No such file or directory
trigger file found
ERROR: could not remove "/tmp/pgsql.trigger.5432": Operation not permitted
trigger file found
ERROR: could not remove "/tmp/pgsql.trigger.5432": Operation not permitted

This file was not looked at until after the attempt to recover was aborted. Clearly the permissions on /tmp/pgsql.trigger.5432 were a problem, but we don't see how that would explain the error messages, which seem to indicate that data on the standby server was corrupted.

If you need additional information we have saved console logs and log files from both the primary and secondary servers. Thanks for reading this far and any help in tracking down the cause of this unexpected failure.

cheers,
- Mason

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] unable to fail over to warm standby server
Hello Heikki --

Thank you for investigating this issue and clearing up this mystery.

I do not believe it is obvious that the postgres process needs to be able to remove the trigger file. My naive assumption was that the trigger file was merely a flag to signal that recovery mode needed to be stopped. If I were to guess what those steps would be, I would assume the following:

- detect the presence of the trigger file
- stop the postgres process safely (e.g. pg_ctl ... stop)
- rename recovery.conf to recovery.done
- restart the postgres process (e.g. pg_ctl ... start)

It is not obvious that the trigger file needs to be removed. And if permissions prevent it from being removed, the last thing that should happen is for the database to become corrupted. At minimum the pg_standby documentation should make this requirement clear. I suggest language to the effect of the following:

> Note it is critical the trigger file be created with permissions that allow
> the postgres process to remove the file. Generally this is best done by
> creating the file from the postgres user account. Data corruption may result
> if the trigger file permissions prevent deletion of the trigger file.

Of course the best solution is to avoid this issue entirely. Something as easy to miss as file permissions should not cause data corruption, especially in the process meant to fail over from a crashing primary database.

thanks,
Mason Hale
http://www.onespot.com
direct +1 800.618.0768 ext 701

On Thu, Jan 28, 2010 at 3:49 AM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:

> Mason Hale wrote:
> > ERROR: could not remove "/tmp/pgsql.trigger.5432": Operation not permitted
> > trigger file found
> >
> > ERROR: could not remove "/tmp/pgsql.trigger.5432": Operation not permitted
> >
> > This file was not looked at until after the attempt to recover was
> > aborted. Clearly the permissions on /tmp/pgsql.trigger.5432 were a problem,
> > but we don't see how that would explain the error messages, which seem
> > to indicate that data on the standby server was corrupted.
>
> Yes, that permission problem seems to be the root cause of the troubles.
> If pg_standby fails to remove the trigger file, it exit()s with whatever
> return code the unlink() call returned:
>
> > 	/*
> > 	 * If trigger file found, we *must* delete it. Here's why: When
> > 	 * recovery completes, we will be asked again for the same file from
> > 	 * the archive using pg_standby so must remove trigger file so we can
> > 	 * reload file again and come up correctly.
> > 	 */
> > 	rc = unlink(triggerPath);
> > 	if (rc != 0)
> > 	{
> > 		fprintf(stderr, "\n ERROR: could not remove \"%s\": %s", triggerPath, strerror(errno));
> > 		fflush(stderr);
> > 		exit(rc);
> > 	}
>
> unlink() returns -1 on error, so pg_standby calls exit(-1). -1 is out of
> the range of normal return codes, and apparently gets mangled into the
> mysterious 65280 code you saw in the logs. The server treats that as a
> fatal error, and dies.
>
> That seems like a bug in pg_standby, but I'm not sure what it should do
> if the unlink() fails. It could exit with some other exit code, so that
> the server wouldn't die, but the lingering trigger file could cause
> problems, as the comment explains. If it should indeed cause FATAL, it
> should do so in a more robust way than the exit(rc) call above.
>
> BTW, this changed in PostgreSQL 8.4; pg_standby no longer tries to
> delete the trigger file (so that problematic block of code is gone), but
> there's a new recovery_end_command option in recovery.conf instead, where
> you're supposed to put 'rm '. I think in that
> configuration, the standby would've started up, even though removal of
> the trigger file would've still failed.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>
Re: [BUGS] unable to fail over to warm standby server
> On Fri, Jan 29, 2010 at 12:03 AM, Mason Hale wrote:
> > Of course the best solution is to avoid this issue entirely. Something as
> > easy to miss as file permissions should not cause data corruption,
> > especially in the process meant to fail over from a crashing primary
> > database.
>
> I believe that such a file permission problem does nothing but
> shut down the standby by a FATAL error, and wouldn't cause data
> corruption. So if you remove the trigger file with a wrong
> permission after the shutdown, you can restart a recovery well
> by just starting the standby postgres.

Perhaps my wording of "data corruption" was too harsh?

While I did not remove the trigger file, I did rename recovery.conf to recovery.conf.old. That file contained the restore_command configuration that identified the trigger file. So that rename should have eliminated the problem. But it didn't. Even after making this change and taking the trigger file out of the equation my database failed to come online.

Maybe that wasn't data corruption. Maybe the issue was repairable. I just know that with my 3+ years of experience working with Postgres and the help of the #postgresql IRC channel, I was not able to revive the database at a time when I desperately needed it to work. The failover process failed for me at the worst possible time.

I will surely be careful about trigger file permissions in the future. I just shared my experience so that future DBA's who might make the same mistake in a similar situation don't have to deal with the same unexpected results.

- Mason
Re: [BUGS] unable to fail over to warm standby server
Hello Fujii --

Thanks for the clarification. It's clear my understanding of the recovery process is lacking.

My naive assumption was that Postgres would recover using whatever files were available and if it had run out of files it would stop there and come up. And that if recovery.conf were renamed it would stop copying files from the wal_archive into pg_xlog. Thus without the recovery.conf file, the database would just come up, without expecting or waiting on additional files.

I see my assumption was wrong, but I think you can agree that it is not surprising someone could expect things to work this way if they aren't directly familiar with the code. I think you can also see how seeing the message "If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target" in the log would lead me to believe my database was corrupted.

It is good to know that if I had left recovery.conf in place and just removed the trigger file the issue would have resolved itself. I'm happy to hear the database was not, in fact, corrupted by this error.

Perhaps it's best to chalk this up to a scenario that creates a confusing, hard-to-diagnose issue -- one that easily looks like corruption, but thankfully is not. Hopefully if anyone tuning into this thread experiences or hears of similar fail-over problems in the future (say on IRC), they'll remember to check the permissions on the trigger file.

Thanks again,
Mason

On Fri, Jan 29, 2010 at 10:02 AM, Fujii Masao wrote:
> On Fri, Jan 29, 2010 at 11:49 PM, Mason Hale wrote:
> > While I did not remove the trigger file, I did rename recovery.conf to
> > recovery.conf.old.
> > That file contained the restore_command configuration that identified the
> > trigger file. So that rename should have eliminated the problem. But it
> > didn't. Even after making this change and taking the trigger file out of the
> > equation my database failed to come online.
>
> Renaming of the recovery.conf doesn't resolve the problem at all. Instead,
> the sysadmin had to remove only the trigger file with a wrong permission
> and just restart postgres.
>
> >> 9.) The server did not come up (again). This time the contents of the
> >> new postgresql.log file were:
> >>
> >> [postg...@prod-db-2 pg_log]$ tail -n 100 postgresql-2010-01-18_211132.log
> >> 2010-01-18 21:11:32 UTC ()LOG: database system was interrupted while in recovery at log time 2010-01-18 20:10:59 UTC
> >> 2010-01-18 21:11:32 UTC ()HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
> >> 2010-01-18 21:11:32 UTC ()LOG: could not open file "pg_xlog/00023C8200A3" (log file 15490, segment 163): No such file or directory
> >> 2010-01-18 21:11:32 UTC ()LOG: invalid primary checkpoint record
> >> 2010-01-18 21:11:32 UTC ()LOG: could not open file "pg_xlog/00023C820049" (log file 15490, segment 73): No such file or directory
> >> 2010-01-18 21:11:32 UTC ()LOG: invalid secondary checkpoint record
> >> 2010-01-18 21:11:32 UTC ()PANIC: could not locate a valid checkpoint record
> >> 2010-01-18 21:11:32 UTC ()LOG: startup process (PID 9328) was terminated by signal 6: Aborted
> >> 2010-01-18 21:11:32 UTC ()LOG: aborting startup due to startup process failure
>
> You seem to focus on the above trouble. I think that this happened because
> recovery.conf was deleted and restore_command was not given.
In fact, the > WAL file (e.g., pg_xlog/00023C8200A3) required for recovery > was unable to be restored from the archive because restore_command was > not supplied. Then recovery failed. > > If the sysadmin had left the recovery.conf and removed the trigger file, > pg_standby in restore_command would have restored all WAL files required > for recovery, and recovery would advance well. > > Hope this helps. > > Regards, > > -- > Fujii Masao > NIPPON TELEGRAPH AND TELEPHONE CORPORATION > NTT Open Source Software Center >
Re: [BUGS] unable to fail over to warm standby server
> > > >> If the sysadmin had left the recovery.conf and removed the trigger file, > >> pg_standby in restore_command would have restored all WAL files required > >> for recovery, and recovery would advance well. > > > > That may be true, but it's certainly seems unfortunate that we don't > > handle this case a bit more gracefully. > > Yeah. I don't think we can do much better in versions <= 8.3, though we > should change that unlink() call to cause the FATAL error in a more > explicit way; it seems accidental and possibly non-portable as it is. > > Adding the note to the docs that Mason suggested is a good idea. > > Given that this situation did NOT actually cause corruption, rather the error message was mangled such that it suggested corruption, I offer this revised suggestion for update to the documentation: Important note: It is critical the trigger file be created with permissions > allowing the postgres process to remove the file. Generally this is best > done by creating the file from the postgres user account. Failure to do so > will prevent completion of WAL file recovery and the server from coming back > online successfully. Best regards, Mason
[BUGS] BUG #5459: Unable to cancel query while in send()
The following bug has been logged online:

Bug reference: 5459
Logged by: Mason Hale
Email address: ma...@onespot.com
PostgreSQL version: 8.3.8
Operating system: Redhat EL 5.1-64 bit
Description: Unable to cancel query while in send()
Details:

ISSUE: unable to cancel queries using pg_cancel_backend() that are in a send() function call, waiting on client receipt of data.

EXPECTED RESULT: expect to be able to cancel most/all queries using pg_cancel_backend() as superuser, perhaps with some wait time, but not an hour or more.

= SYMPTOM =

A SELECT query was running over 18 hours on our PostgreSQL 8.3.8 server. Verified that it was not waiting on any locks via pg_stat_activity. Attempted to cancel the query using pg_cancel_backend(), which returned 't'. However more than an hour later the process was still active, using about 6% of CPU and 5% of RAM.

Terminating the client process that was running the query (from another server) did not cause the query process on the pgsql server to stop. In this case the client was connecting via an ssh tunnel through an intermediate 'gateway' server. Connection path was:

CLIENT --> SSH GATEWAY --> DB SERVER

= DIAGNOSIS =

Diagnosed this issue with help from 'andres' in #postgresql IRC. Per his request, attached to 'stuck' process using gdb, generating the following outputs:

- Initial backtrace: http://pgsql.privatepaste.com/6f15c7e363
- ('c', then ctrl+c, then 'bt full') x 4: http://pgsql.privatepaste.com/3d3261659a
- Stepping several times with 'n': http://pgsql.privatepaste.com/0f302125a8

'andres' reported that interrupts were not checked in send() and probably should be, and suggested opening this bug report.

Additional investigation of the ssh tunnel connection revealed the connection on the intermediate gateway server was stuck in a FIN_WAIT2 state (as reported by netstat). The other end of the connection on the pgsql server was reported as CLOSE_WAIT by netstat. Killing the ssh tunnel process on the gateway server cleared the connection and the long-running query process on the db server terminated very soon after.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
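For reference, roughly what was run from a second session to try to cancel the query (a sketch; the pid 12345 is made up, and on 8.3 the process id column is procpid):

SELECT procpid, waiting, query_start, current_query
  FROM pg_stat_activity
 ORDER BY query_start;            -- locate the long-running backend and confirm waiting is false

SELECT pg_cancel_backend(12345);  -- returned true here, but the backend stuck inside send() never noticed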
[BUGS] BUG #2739: INTERSECT ALL not working
The following bug has been logged online:

Bug reference: 2739
Logged by: Mason Hale
Email address: [EMAIL PROTECTED]
PostgreSQL version: 8.1.5
Operating system: GNU/Linux 2.6.9-42.0.3.ELsmp
Description: INTERSECT ALL not working
Details:

'INTERSECT ALL' does not return duplicate rows in a query. The query below should return 10 rows, but it returns 5 rows on my system:

( SELECT tablename FROM pg_tables LIMIT 5 )
INTERSECT ALL
( ( SELECT tablename FROM pg_tables LIMIT 5 )
  UNION ALL
  ( SELECT tablename FROM pg_tables LIMIT 5 ) )

Note, the above is a simplified query meant to demonstrate the problem. This same behavior occurs (and was discovered) in real-world situations with user-defined tables. This is nearly a deal-stopper for our application.

Please reply to let me know the status of this report.

Thanks,
Mason Hale

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [BUGS] BUG #2739: INTERSECT ALL not working
Tom -

Many thanks for the quick reply. I feel honored to receive email from you after seeing your name so many times in my web searches on Postgres topics.

That's not how I understood INTERSECT ALL to work. But it's clear the spec is right and my understanding is wrong. This is not a bug.

Unfortunately INTERSECT ALL as spec'd and implemented doesn't quite give me what I need. So back to the drawing board for me...

best regards,
Mason

On 11/6/06, Tom Lane <[EMAIL PROTECTED]> wrote:

"Mason Hale" <[EMAIL PROTECTED]> writes:
> The query below should return 10 rows,

Not by my reading of the spec. SQL92 7.10 saith:

  b) If a set operator is specified, then the result of applying
     the set operator is a table containing the following rows:

     i) Let R be a row that is a duplicate of some row in T1 or of
        some row in T2 or both. Let m be the number of duplicates of R
        in T1 and let n be the number of duplicates of R in T2, where
        m >= 0 and n >= 0

     iii) If ALL is specified, then...
        3) If INTERSECT is specified, then the number of duplicates
           of R that T contains is the minimum of m and n.

You have m = 1, n = 2 for each distinct row at the INTERSECT step,
ergo you get one copy out.

			regards, tom lane
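For anyone else caught out by this, a small demonstration of the min(m,n) rule Tom quotes (a sketch using throwaway temp tables, not part of the original thread):

CREATE TEMP TABLE t1 AS SELECT 1 AS x UNION ALL SELECT 1;                     -- m = 2 copies of the row
CREATE TEMP TABLE t2 AS SELECT 1 AS x UNION ALL SELECT 1 UNION ALL SELECT 1;  -- n = 3 copies
SELECT x FROM t1 INTERSECT ALL SELECT x FROM t2;                              -- min(2, 3) = 2 rows come back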
[BUGS] BUG #3723: dropping an index that doesn't refer to table's columns
The following bug has been logged online:

Bug reference: 3723
Logged by: Sam Mason
Email address: [EMAIL PROTECTED]
PostgreSQL version: 8.2.5
Operating system: Linux
Description: dropping an index that doesn't refer to table's columns
Details:

Hi, I've just discovered that an index that doesn't refer to any of its table's columns isn't automatically dropped when its table is. The test case for me is:

CREATE TABLE foo ( id INTEGER NOT NULL );
CREATE UNIQUE INDEX foo_id ON foo (id);
CREATE UNIQUE INDEX foo_exp ON foo ((1));
DROP TABLE foo;
-- foo_id will have gone, but foo_exp will still be there
\di foo_id
\di foo_exp

AndrewSN suggested the following query to show indexes that have missing tables:

SELECT indexrelid::regclass FROM pg_index i LEFT JOIN pg_class c ON i.indrelid=c.oid WHERE c.oid IS NULL;

He also showed me which system catalogs to change in order to delete these indexes which I'm happy with at the moment.

Thanks,
  Sam

p.s. the reason for creating this strange index was to ensure that a maximum of one row was inserted into the table---I can do this different ways for now.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
Re: [BUGS] BUG #3723: dropping an index that doesn't refer to table's columns
On Tue, Nov 06, 2007 at 10:00:43AM -0500, Tom Lane wrote:
> Heikki Linnakangas <[EMAIL PROTECTED]> writes:
> > Not sure that's enough of a use case to justify not banning it...
>
> Yeah, it probably is.

It's reasonably easy to do this instead:

CREATE TABLE foo (
  one INTEGER NOT NULL UNIQUE CHECK (one = 1)
);

The bug (for me) was that it's possible to get the database into an inconsistent state, with no warning or obvious way to back out.

  Sam

---(end of broadcast)---
TIP 6: explain analyze is your friend
[BUGS] BUG #3724: Duplicate values added to table despite unique index
The following bug has been logged online:

Bug reference: 3724
Logged by: Mason Hale
Email address: [EMAIL PROTECTED]
PostgreSQL version: 8.2.5
Operating system: Redhat Linux (kernel: Linux 2.6.18-8.1.15.el5PAE)
Description: Duplicate values added to table despite unique index
Details:

I discovered this issue when an update statement was resulting in a duplicate key violation error, even though the update did not change any of the columns in the unique index (and neither did the one trigger on this table).

Here is the table description:

prod_2=> \d topic_version_page
            Table "bdu.topic_version_page"
        Column        |            Type             |   Modifiers
----------------------+-----------------------------+----------------
 topic_version_id     | integer                     | not null
 page_id              | integer                     | not null
 link_score           | double precision            |
 created_at           | timestamp without time zone | default now()
 updated_at           | timestamp without time zone | default now()
 is_entry_page        | boolean                     | default false
 linking_entry_count  | integer                     | default 0
Indexes:
    "index_topic_version_page_on_topic_version_id_and_page_id" UNIQUE, btree (topic_version_id, page_id)
    "index_topic_version_page_on_link_score" btree (link_score)
    "index_topic_version_page_on_topic_version_id_and_created_at" btree (topic_version_id, created_at)
Foreign-key constraints:
    "topic_version_page_topic_version_id_fkey" FOREIGN KEY (topic_version_id) REFERENCES topic_version(id) ON DELETE CASCADE
Triggers:
    topic_version_page_updated_at_trigger BEFORE UPDATE ON topic_version_page FOR EACH ROW EXECUTE PROCEDURE update_updated_at_timestamp()

Note that there is a unique index on (topic_version_id, page_id). Now look at the result of this query:

prod_2=> select page_id, count(*) from topic_version_page where topic_version_id = 263 group by 1 having count(*) > 1;
  page_id  | count
-----------+-------
 161335682 |     2
 194359108 |     2
(2 rows)

Here we have two rows that violate the unique index constraint. Looking at the rows in more detail:

prod_2=> select * from topic_version_page where topic_version_id = 263 and page_id in (161335682, 194359108);
 topic_version_id |  page_id  | link_score |         created_at         |         updated_at         | is_entry_page | linking_entry_count
------------------+-----------+------------+----------------------------+----------------------------+---------------+---------------------
              263 | 161335682 |          0 | 2007-10-13 02:40:49.864219 | 2007-11-01 15:58:57.268593 | f             |                   5
              263 | 194359108 |          0 | 2007-10-25 13:34:20.654336 | 2007-10-25 13:34:20.654336 | f             |                   1
              263 | 194359108 |          0 | 2007-10-25 13:34:20.654336 | 2007-11-04 13:08:03.011292 | f             |                   2
              263 | 161335682 |          0 | 2007-10-13 02:40:49.864219 | 2007-11-04 13:08:03.011292 | f             |                   6
(4 rows)

We can see that each duplicate within a pair was created at the same time (or at least within the same transaction), but that each pair was created at a different time. My expectation is that with the unique index in place this should not be able to happen.

Other info that may be useful: This database was created by taking a dump from a 8.2.4 database on 11/3 and restoring into a new 8.2.5 database (on different hardware). The created_at timestamps seem to indicate that these duplicates were created prior to the dump being created. However running the same query on the original 8.2.4 database returns zero rows:

prod_1=> select page_id, count(*) from topic_version_page where topic_version_id = 263 group by 1 having count(*) > 1;
 page_id | count
---------+-------
(0 rows)

prod_1=>

Finally, I restored the *same* dump file in another 8.2.5 database instance and that database does not have the duplicate rows either.

From this it seems to me that the following occurred: Some subsequent update operation resulted in two identical rows being created in the same table.
This likely happened at '2007-11-04 13:08:03.011292' which is the updated_at timestamp on two of the duplicate rows (and is after the restore). This looks like a bug related to update operations and unique indexes, but I could of course be wrong. Thanks for looking into it. Mason ---(end of broadcast)--- TIP 6: explain analyze is your friend
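[A sketch of the kind of probe used later in this thread, reusing the column names from the report above; it is not part of the original message. The suspect indexes are disabled so the rows come straight from the heap, and ctid is included so each physical copy of a duplicate can be told apart (and, if need be, removed individually).

  SET enable_indexscan = off;
  SET enable_bitmapscan = off;
  -- List every physical copy of any (topic_version_id, page_id) pair that
  -- appears more than once, ordered so the copies sit next to each other.
  SELECT ctid, xmin, topic_version_id, page_id, created_at, updated_at
    FROM topic_version_page
   WHERE (topic_version_id, page_id) IN (
             SELECT topic_version_id, page_id
               FROM topic_version_page
              GROUP BY 1, 2
             HAVING count(*) > 1)
   ORDER BY topic_version_id, page_id, updated_at;
]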
Re: [BUGS] BUG #3724: Duplicate values added to table despite unique index
On Nov 6, 2007 11:16 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Mason Hale" <[EMAIL PROTECTED]> writes:
> > However running the same query on the original 8.2.4 database returns zero
> > rows:
>
> > prod_1=> select page_id, count(*) from topic_version_page where
> > topic_version_id = 263 group by 1 having count(*) > 1;
> >  page_id | count
> > ---------+-------
> > (0 rows)
>
> Is that still true if you do the query with enable_indexscan = off
> and enable_bitmapscan = off?  If you see matching rows then, I'd
> suspect a corrupt index in the 8.2.4 installation --- does REINDEXing
> it succeed?

With enable_indexscan = off and enable_bitmapscan = off on the 8.2.4 instance, it still returns zero rows.

prod_1=> set enable_indexscan = off;
SET
prod_1=> set enable_bitmapscan = off;
SET
prod_1=> select page_id, count(*) from topic_version_page where topic_version_id = 263 group by 1 having count(*) > 1;
 page_id | count
---------+-------
(0 rows)

> For that matter, do you still see dups if you prevent use of the index
> in the 8.2.5 query?  Maybe it's that index that is corrupt.

Unfortunately, I'm not able to test that at this point. To get our production db (the 8.2.5 instance) back in operation I deleted the extra duplicate rows, so that the update statement would complete. If I need to reindex the table in the 8.2.5 database I'll need to wait until this evening, barring any emergencies. (It is a 20GB table.)

Mason
Re: [BUGS] BUG #3724: Duplicate values added to table despite unique index
On Nov 6, 2007 1:06 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Mason Hale" <[EMAIL PROTECTED]> writes:
> >> For that matter, do you still see dups if you prevent use of the index
> >> in the 8.2.5 query?  Maybe it's that index that is corrupt.
>
> > Unfortunately, I'm not able to test that at this point.
> > To get our production db (the 8.2.5 instance) back in operation I deleted
> > the extra duplicate rows, so that the update statement would complete.
>
> Mph.  I'm afraid the evidence is mostly gone then, and we probably won't
> be able to figure out what happened.

Sorry about that. But I had to get things back up and running.

> However, it would be worth checking two things:
>
> 1. Can you confirm that the rows that got duplicated were in fact
> present (in only one copy) in the 8.2.4 DB?

Yes, they are present:

prod_1=> select * from topic_version_page where topic_version_id = 263 and page_id in (161335682, 194359108);
 topic_version_id |  page_id  | link_score |         created_at         |         updated_at         | is_entry_page | linking_entry_count
------------------+-----------+------------+----------------------------+----------------------------+---------------+---------------------
              263 | 161335682 |          0 | 2007-10-13 02:40:49.864219 | 2007-11-01 15:58:57.268593 | f             |                   5
              263 | 194359108 |          0 | 2007-10-25 13:34:20.654336 | 2007-11-04 01:01:50.512446 | f             |                   2
(2 rows)

prod_1=>

> 2. Can you check that there are still 1 (rather than 0) copies of the
> rows in the 8.2.5 DB?  One possible theory about this is that what you
> had was (two instances of) two index entries pointing at the same heap
> row, in which case a DELETE that you thought removed only one copy would
> have deleted both.

Yes, we have 1 of each row (I kept the most recently updated version of each):

prod_2=> select * from topic_version_page where topic_version_id = 263 and page_id in (161335682, 194359108);
 topic_version_id |  page_id  | link_score |         created_at         |         updated_at         | is_entry_page | linking_entry_count
------------------+-----------+------------+----------------------------+----------------------------+---------------+---------------------
              263 | 194359108 |          0 | 2007-10-25 13:34:20.654336 | 2007-11-04 13:08:03.011292 | f             |                   2
              263 | 161335682 |          0 | 2007-10-13 02:40:49.864219 | 2007-11-06 16:01:36.393953 | f             |                   7
(2 rows)

prod_2=>

Mason
Re: [BUGS] BUG #3724: Duplicate values added to table despite unique index
> >> 2. Can you check that there are still 1 (rather than 0) copies of the
> >> rows in the 8.2.5 DB?
>
> > Yes, we have 1 of each row (I kept the most recently updated version of
> > each):
>
> Ah, I forgot that the rows were obviously not identical because of the
> differing updated_at values.
>
> Tell us more about the updating process --- is it likely that there
> could be conflicting concurrent updates on a row?  What does your
> app do in such cases?

We do some application-layer locking to prevent the same 'topic-version' from being updated at the same time. But there could be a bug somewhere that lets that happen, in which case concurrent updates of the same row could occur. So I guess I would say it is unlikely, but possible.

If there is a concurrent update of the same row, we're relying on Postgres to handle that. If that results in a deadlock or any other error, then we catch and log the error, mark the update job as failed (in another db table) and try again. In almost every case, the same request will complete successfully on a second try.

In this case, every update to the rows in question was resulting in a unique index violation, which produced a near-constant stream of errors being logged.
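[If concurrent writers do occasionally slip past the application-layer lock, one way to serialize them inside the database is a row lock on the parent record before the dependent updates run. This is only a sketch, borrowing names from this thread rather than the real schema; the UPDATE shown is a hypothetical example of the work done under the lock.

  BEGIN;
  -- Take a row-level lock on the topic_version being worked on; a second
  -- transaction doing the same SELECT ... FOR UPDATE blocks here until this
  -- one commits or rolls back.
  SELECT 1 FROM topic_version WHERE id = 263 FOR UPDATE;

  -- The updates that must not interleave with another worker go here, e.g.:
  UPDATE topic_version_page
     SET linking_entry_count = linking_entry_count + 1
   WHERE topic_version_id = 263 AND page_id = 161335682;
  COMMIT;
]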
[BUGS] Duplicate values found when reindexing unique index
Hello --

I noticed recently some errors in my postgres log like the following:

ERROR: could not open segment 9 of relation 1663/16830/1384385 (target block 776929292): No such file or directory

These are occurring at a rate of 1 every 2-3 days, but that rate has been increasing. After Googling around, I found this error could indicate index corruption.

I found that the relation with oid 1384385 was the following index:

"index_entry_on_guid_and_feed_id" UNIQUE, btree (guid, feed_id) -- on a table named entry.

When I tried to rebuild that index, I got the following error:

prod_2=# reindex index index_entry_on_guid_and_feed_id;
ERROR: could not create unique index
DETAIL: Table contains duplicated values.

Now, if that index was in fact corrupted, then perhaps it is not such a surprise that the duplicate values are there. The thing that makes this more mysterious is that we have another unique index on entry(feed_id, guid) -- and I have no reason to believe it is corrupted (although I guess it still could be).

Please note: I have previously reported another issue with duplicate values making it into a table despite having a unique index in place (bug #3724). This issue involves a completely different table.

We can stay running despite this error, but I would very much like to track down the cause as soon as possible. To avoid destroying any evidence, I'm going to leave things intact and await further instructions.

Given my experience, the reliability of unique indexes is becoming somewhat suspect. Please help. ;-)

thanks in advance,
Mason
Re: [BUGS] Duplicate values found when reindexing unique index
On Dec 30, 2007 12:09 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> "Mason Hale" <[EMAIL PROTECTED]> writes:
> > Given my experience, the reliability of unique indexes is becoming somewhat
> > suspect. Please help. ;-)
>
> Well, as in the previous report, there is not enough information here to
> offer much chance of understanding what's going wrong.

Please just tell me what information you need and I will provide what I can.

> Have you tried reindexing that other index with the same columns in the
> other order?  My guess is that there really are duplicate entries in the
> table; if so the other one should fail too.  If so, please try to
> identify the duplicated values, along the lines of
>
>    select guid, feed_id from entry group by 1,2 having count(*) > 1
>
> and show us the system columns (ctid,xmin,xmax,cmin,cmax) from the
> tuples having duplicate value(s).  Note that you should probably disable
> indexscan and bitmapscan while doing this probing, so as not to have the
> queries use the suspect indexes.

I found a single pair of rows that were duplicated. Interestingly, it was not just the guid and feed_id that were duplicated: all columns were identical, including the primary key, except an updated_at column which is automatically populated via a trigger (BEFORE UPDATE on entry FOR EACH ROW).

The duplicate data included a created_at column which defaults to now() -- that the two duplicate rows have exactly the same values strongly hints to me that the duplicates were created during the same transaction.

Here's the system column data you requested.

    id     |     ctid     | xmin | xmax | cmin | cmax
-----------+--------------+------+------+------+------
 151341072 | (1508573,11) |    2 |    0 |   19 |    0
 151341072 | (1818219,11) |    2 |    0 |   19 |    0
(2 rows)

> This is 8.2.5 right?  Was the DB loaded fresh into 8.2.5, or was it
> inherited from previous 8.2.x release(s)?

It is 8.2.5. It was loaded from a pg_dump from an 8.2.3 database into a fresh 8.2.5 database on new hardware.

> BTW, what are the datatypes of the index columns?

id integer not null (serial)
guid character varying not null ( no size limit defined )
feed_id integer not null

>            regards, tom lane

thanks for the help Tom.

I do want to clear out one of the offending duplicates and reindex. But I'll wait to do that until I get the okay from you.

Mason
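[For reference: when every user-visible column of the two copies is identical, the only handle left for removing exactly one of them is the physical row address. The eventual clean-up step therefore tends to look like the sketch below; the ctid value is taken from the probe above and is only valid until that row is next updated or the table is vacuumed.

  BEGIN;
  -- Delete exactly one physical copy of the duplicated row, by its ctid.
  DELETE FROM entry WHERE ctid = '(1818219,11)';
  -- Sanity-check before committing: exactly one row should remain.
  SELECT count(*) FROM entry WHERE id = 151341072;
  COMMIT;
]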
Re: [BUGS] Duplicate values found when reindexing unique index
Trolling through my server log I found this error: 2007-12-30 20:02:08 CST (10.11.199.136) PANIC: right sibling's left-link doesn't match 2007-12-30 20:02:08 CST (10.11.199.136) STATEMENT: update bdu.entry set title=$1, author=$2, description_type=$3, description_length=$4, description=$5, published_at=$6, republished_at=$7, link=$8, link_page_id=$9, link_count=$10, enclosure=$11, enclosure_page_id=$12, enclosure_count=$13 where id=$14 2007-12-30 20:02:08 CST () LOG: server process (PID 30004) was terminated by signal 6 2007-12-30 20:02:08 CST () LOG: terminating any other active server processes This seems related to the entry table -- so I wonder if it is related to this problem? On Dec 30, 2007 8:23 PM, Mason Hale <[EMAIL PROTECTED]> wrote: > > > On Dec 30, 2007 12:09 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > > > "Mason Hale" <[EMAIL PROTECTED]> writes: > > > Given my experience, the reliability of unique indexes is becoming > > somewhat > > > suspect. Please help. ;-) > > > > Well, as in the previous report, there is not enough information here to > > offer much chance of understanding what's going wrong. > > > > Please just tell me what information you need and I will provide what I > can. > > > > Have you tried reindexing that other index with the same columns in the > > other order? My guess is that there really are duplicate entries in the > > > > table; if so the other one should fail too. If so, please try to > > identify the duplicated values, along the lines of > > > >select guid, feed_id from entry group by 1,2 having count(*) > 1 > > > > and show us the system columns (ctid,xmin,xmax,cmin,cmax) from the > > tuples having duplicate value(s). Note that you should probably disable > > indexscan and bitmapscan while doing this probing, so as not to have the > > queries use the suspect indexes. > > > I found a single pair of rows that were duplicated. Interestingly it was > not just the guid and feed_id that were duplicated but all columns were > indentical, including the primary key, except an update_at column which is > automatically populated via a trigger (BEFORE UPDATE on entry FOR EACH ROW). > > > The duplicate data included a created_at column which defaults to now() -- > that the two duplicate rows have exactly the same values strongly hints to > me that the duplicates were created during the same transaction. > > Here's the system column data you requested. > > id | ctid | xmin | xmax | cmin | cmax > ---+--+--+--+--+-- > 151341072 | (1508573,11) |2 |0 | 19 |0 > 151341072 | (1818219,11) |2 |0 | 19 |0 > (2 rows) > > > > > This is 8.2.5 right? Was the DB loaded fresh into 8.2.5, or was it > > inherited from previous 8.2.x release(s)? > > > > It is 8.2.5. It was loaded from a pg_dump from an 8.2.3 database into a > fresh 8.2.5 database on new hardware. > > > > BTW, what are the datatypes of the index columns? > > > > id integer not null (serial) > guid character varying not null ( no size limit defined ) > feed_id integer not null > > > > > >regards, tom lane > > > > thanks for the help Tom. > > I do want to clear out one of the offending duplicates and reindex. But > I'll wait to do that until I get the okay from you. > > > Mason >
Re: [BUGS] Duplicate values found when reindexing unique index
> Can you show us all the triggers on this table?

Here they are:

Triggers:
    entry_correct_published_at_trigger BEFORE INSERT OR UPDATE ON entry FOR EACH ROW EXECUTE PROCEDURE correct_published_at()
    entry_feed_page_trigger BEFORE INSERT OR UPDATE ON entry FOR EACH ROW EXECUTE PROCEDURE entry_feed_page_trigger()
    entry_updated_at_trigger BEFORE UPDATE ON entry FOR EACH ROW EXECUTE PROCEDURE update_updated_at_timestamp()
    feed_entry_count_trigger AFTER INSERT ON entry FOR EACH ROW EXECUTE PROCEDURE update_feed_entry_count()

> Also, it would be real interesting to see "pg_filedump -i -f" output
> for the two blocks in question (1508573 and 1818219) --- see
> http://sources.redhat.com/rhdb/ to get a copy of pg_filedump.

I have downloaded, compiled and installed pg_filedump -- but I am not sure how to determine which file I should have it dump. I am not very familiar with the postgres file structure. Can you please provide some guidance? How do I determine the correct file?

I've determined the relation "entry" has an oid = 16838 -- but the /data/base/16830 directory contains 92 1GB files named 16838.[1-92]

I've tried:

pg_filedump -i -f -R 1508573 16838

and got the following error:

[EMAIL PROTECTED] 16830]$ pg_filedump -i -f -R 1508573 16838
***
* PostgreSQL File/Block Formatted Dump Utility - Version 8.2.0
*
* File: 16838
* Options used: -i -f -R 1508573
*
* Dump created on: Sun Dec 30 23:18:01 2007
***
Error: Seek error encountered before requested start block <1508573>.

I tried a few other files in the 1-92 range with the same results.

Mason
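[The seek error happens because the ctid block numbers count from the start of the whole relation, while each 1 GB segment file only holds part of it. Assuming the default 8 kB block size (so 131072 blocks per 1 GB segment, which is not confirmed anywhere in this thread), the segment file and the block offset to pass to pg_filedump's -R option can be worked out in SQL:

  -- 1 GB / 8 kB = 131072 blocks per segment file (default build options assumed)
  SELECT blkno,
         blkno / 131072 AS segment_file,   -- suffix on the filenode, e.g. 16838.11
         blkno % 131072 AS block_in_file   -- block number to pass to pg_filedump -R
    FROM (VALUES (1508573), (1818219)) AS b(blkno);

Under that assumption the two ctids above work out to 16838.11 block 66781 and 16838.13 block 114283.]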
Re: [BUGS] Duplicate values found when reindexing unique index
> I think you misread Mason's post of 20:23 GMT-6 where he says the
> created_at values are the *same*, not different.  Mason's previous bug
> report 3724 also had duplicate rows with matching created_at values.

Yes, to confirm, the created_at values are the same. The *only* values that are different are updated_at, and that value is set via trigger.

Another data point that may or may not be relevant: I've been working to set up a warm standby server and have run into some weird behavior there as well. The symptom is that the server encounters an error during the file restore, then prematurely exits recovery mode. I'm using the pg_standby program for the restore_command in my recovery.conf on the standby server. The error message from the standby server is in the log extract below.

Last week I had turned my attention away from the database restore, and didn't notice that our standby server had exited recovery mode on 12/20. Here's the last few lines from the log file (full log attached):

2007-12-20 04:11:43 CST () LOG: restored log file "000104220057" from archive
2007-12-20 04:13:09 CST () LOG: restored log file "000104220058" from archive
2007-12-20 04:14:40 CST () LOG: restored log file "000104220059" from archive
2007-12-20 04:14:40 CST () LOG: invalid info bits 0001 in log file 1058, segment 89, offset 0
2007-12-20 04:14:40 CST () LOG: redo done at 422/58FFEE38
2007-12-20 04:14:40 CST () LOG: restored log file "000104220058" from archive
2007-12-20 04:14:40 CST () LOG: archive recovery complete
2007-12-20 04:24:57 CST () LOG: database system is ready

thanks,
Mason
Re: [BUGS] Duplicate values found when reindexing unique index
On Dec 31, 2007 9:48 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > "Mason Hale" <[EMAIL PROTECTED]> writes: > >> I think you misread Mason's post of 20:23 GMT-6 where he says the > >> created_at values are the *same*, not different. Mason's previous bug > >> report 3724 also had duplicate rows with matching created_at values. > > > Yes, to confirm, the created_at values are the same. The *only* values > that > > are different are updated_at and that value is set via trigger. > > Which still demonstrates that they aren't exact images. I'd still like > to see the pg_filedump output though ... I sent in previous email. You saw it, yes? > > 2007-12-20 04:11:43 CST () LOG: restored log file > > "000104220057" from archive > > 2007-12-20 04:13:09 CST () LOG: restored log file > > "000104220058" from archive > > 2007-12-20 04:14:40 CST () LOG: restored log file > > "000104220059" from archive > > 2007-12-20 04:14:40 CST () LOG: invalid info bits 0001 in log file > 1058, > > segment 89, offset 0 > > Do you by any chance still have 000104220058 and > 000104220059 archived? If so it would be useful to > look at the first dozen lines of "od -x" dump of each of them. > Yes, I do. Here's the output: [EMAIL PROTECTED] wal_archive]$ od -x 000104220058 | head -n15 000 d05e 0002 0001 0423 c100 020 f7df 472e e701 4728 0100 2000 040 a1db 81e6 0423 0068 c000 060 0048 002c 0423 100 0020 c100 0001 120 780b 2ede 9f68 00f5 7834 140 2f4d 0001 1f35 4774 160 * 002 d05e 0001 0422 2000 5800 0020020 53c2 48bc 0422 1fbc 5800 ce6f 2edd 0020040 003a 001e 0b00 067f 0020060 41be 1ff8 0015 0186 0007 0020100 4337 000a 000c 008f 0122 db84 d429 0020120 0422 2010 5800 ce6f 2edd 003a [EMAIL PROTECTED] wal_archive]$ od -x 000104220059 | head -n15 000 d05e 0001 0001 006b 6000 69dc 020 12ae 6380 0024 0010 375a 21cd 1174 040 4001 0001 637c 0058 0010 375a 21cd 1174 060 4001 0001 6355 0010 0010 375a 21cd 1174 100 4001 0001 631d 005a 0010 375a 21cd 1174 120 4001 0001 62e8 001e 0010 375a 21cd 1174 140 4001 0001 629a 0019 0010 375a 21cd 1174 160 4001 0001 604a 0013 0010 375a 21cd 1174 200 4001 0001 6045 0025 0010 375a 21cd 1174 220 4001 0001 6044 0029 0010 375a 21cd 1174 240 4001 0001 603a 0005 0010 375a 21cd 1174 260 4001 0001 6020 0035 0010 375a 21cd 1174 300 4001 0001 600e 0050 0010 375a 21cd 1174 320 4001 0001 600d 0018 0010 375a 21cd 1174 340 4001 0001 600b 0033 0010 375a 21cd 1174 [EMAIL PROTECTED] wal_archive]$ Mason
Re: [BUGS] Duplicate values found when reindexing unique index
Tom, I'll send these to you privately. Mason On Dec 31, 2007 10:22 AM, Tom Lane <[EMAIL PROTECTED]> wrote: > "Mason Hale" <[EMAIL PROTECTED]> writes: > >> Can you show us all the triggers on this table? > > > Here they are: > > > Triggers: > > entry_correct_published_at_trigger BEFORE INSERT OR UPDATE ON entry > FOR > > EACH ROW EXECUTE PROCEDURE correct_published_at() > > entry_feed_page_trigger BEFORE INSERT OR UPDATE ON entry FOR EACH > ROW > > EXECUTE PROCEDURE entry_feed_page_trigger() > > entry_updated_at_trigger BEFORE UPDATE ON entry FOR EACH ROW EXECUTE > > PROCEDURE update_updated_at_timestamp() > > feed_entry_count_trigger AFTER INSERT ON entry FOR EACH ROW EXECUTE > > PROCEDURE update_feed_entry_count() > > Actually I wanted to see the function bodies ... > >regards, tom lane >
Re: [BUGS] Duplicate values found when reindexing unique index
> Something else interesting about that: the apparent interloper page
> contains just a single WAL record, which appears to be a shutdown
> checkpoint bearing the timestamp "Thu Dec 27 2007, 16:55:01 EST".
> Not sure if Mason can correlate that with any recent activity...

Nothing jumps to mind, but I'll crawl through the logs looking for anything interesting.

> I'm starting to think that Occam's razor says you've got hardware
> problems.  Or maybe a kernel-level bug that is causing writes to get
> discarded.

For what it's worth, we do closely monitor the server for RAID and kernel errors and haven't seen either variety reported in quite some time.

If it is some mysterious hardware or kernel error -- any suggestions for how I should go about narrowing that down?

Finally, if it would help to see the full wal segment files, let me know and I will email them individually. (I assume the list doesn't want 32MB of attachments.)

Mason
Re: [BUGS] Duplicate values found when reindexing unique index
On Dec 31, 2007 12:51 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > "Mason Hale" <[EMAIL PROTECTED]> writes: > > If it is some mysterious hardware or kernel error -- any suggestions for > how > > I should go about narrowing that down? > > The first thing to ask is whether you're running an up-to-date kernel > (and whose is it). > The kernel is not up to date. What we have is: Kernel: Linux 2.6.18-8.1.15.el5PAE Distribution: RedHat EL 5.0-32 We can/will update the kernel at our next opportunity. > > > Finally, if it would help to see the full wal segment files, let me know > and > > I will email individually. (I assume the list doesn't want 32MB of > > attachments). > > If you'd send them to me privately, I'd be interested. > I will send these to you right away. Mason
Re: [BUGS] Duplicate values found when reindexing unique index
Hello Tom -- a thousand thanks for taking the time to walk through those files. This could be the kernel's fault, but I'm wondering whether the > RAID controller is going south. Offhand it seems like the controller > would be the only place that would be likely to explain both the > bogus-data and good-data-in-wrong-place behaviors. > To clarify a bit further -- on the production server, the data is written to a 10-disk RAID 1+0, but the pg_xlog directory is symlinked to a separate, dedicated SATA II disk. There is a similar setup on the standby server, except that in addition to the RAID for the data, and a separate SATA II disk for the pg_xlog, there is another disk (also SATA II) dedicated for the archive of wal files copied over from the production server. I mention all this because if the problem is the wal files in the pg_xlog directory becoming corrupted, and those exist on a dedicated disk (not in a RAID), then doesn't that indicate that the RAID controller is not involved? Mason
Re: [BUGS] BUG #3848: function pg_catalog.substring(date, integer, integer) does not exist
On Tue, Jan 01, 2008 at 08:11:01PM +, Ion wrote:
> Email address: [EMAIL PROTECTED]

let's hope Ion is subscribed! :)

> I tested PostgreSQL 8.3 beta4 with tinyerp and I have this problem:
> pg_catalog.substring(date, integer, integer) does not exist

This is most likely a bug in tinyerp. substring has only ever been defined for string data types and isn't valid for date types at all.

> I have not this problem with postgresql 8.2.5.

Prior to PG 8.3, all types would be silently coerced into TEXT types in certain cases, leading to all sorts of strangeness. I've just downloaded the source of tinyerp and had a look through, and this practice seems pretty endemic. They should really be using

  to_char(date_exp, 'YYYY-MM')

instead of

  substring(date_exp FOR 7)

in all their queries.

Sam

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
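[A quick illustration of the difference, not taken from the original report; runnable on any 8.3 server:

  -- Fails on 8.3: there is no substring(date, ...) and no implicit cast to text any more.
  -- SELECT substring(current_date FOR 7);

  -- Works only because of the explicit cast, and the result depends on the datestyle setting:
  SELECT substring(current_date::text FROM 1 FOR 7);

  -- Preferred: explicit formatting, independent of datestyle:
  SELECT to_char(current_date, 'YYYY-MM');
]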
Re: [BUGS] BUG #3848: function pg_catalog.substring(date, integer, integer) does not exist
On Tue, Jan 01, 2008 at 04:29:47PM -0500, Tom Lane wrote:
> Sam Mason <[EMAIL PROTECTED]> writes:
> > I've just downloaded the source of tinyerp and had a look through, and
> > this practice seems pretty endemic.  They should really be using
> >   to_char(date_exp, 'YYYY-MM')
> > instead of
> >   substring(date_exp FOR 7)
> > in all their queries.
>
> Or at least explicitly casting the date to text.  But your way is
> better, since it won't break if the datestyle is something other
> than ISO.

It also improves code readability, which is always a good thing.

I've just gone through the code and sent a patch to the tinyerp-devel list with these fixes in. Let's see if it gets incorporated.

Sam

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
Re: [BUGS] Duplicate values found when reindexing unique index
Hi everyone -- Sorry to revisit a dead horse, but I wanted to clear up some misinformation -- On Dec 31, 2007 5:35 PM, Tom Lane <[EMAIL PROTECTED]> wrote: > "Mason Hale" <[EMAIL PROTECTED]> writes: > >> This could be the kernel's fault, but I'm wondering whether the > >> RAID controller is going south. > > > To clarify a bit further -- on the production server, the data is > written to > > a 10-disk RAID 1+0, but the pg_xlog directory is symlinked to a > separate, > > dedicated SATA II disk. > > > There is a similar setup on the standby server, except that in addition > to > > the RAID for the data, and a separate SATA II disk for the pg_xlog, > there is > > another disk (also SATA II) dedicated for the archive of wal files > copied > > over from the production server. > It turns out that the separate SATA II disk was configured as a single-disk JBOD on the same controller as the 10-disk RAID 1+0. Since we've seen corruption in the data directory (on the RAID) and in the pg_xlog directory (on the SATA II disk) the RAID controller is one of the few common elements between those two partitions and hence is highly suspect, and may dispel some of the mystery with our situation. We will be replacing the RAID controller in short order. For what it is worth it is an Adaptec 31605 with a battery backup module. > > Oh. Maybe it's one of those disks' fault then. Although WAL corruption > would not lead to corruption of the primary DB as long as there were no > crash/replay events. Maybe there is more than one issue here, or maybe > it's the kernel's fault after all. > > Given the new information about the RAID controller is managing all the disks in the question (after all) -- if the RAID controller is going south, then there would be no need for a crash/replay event for that corruption to make it into the primary DB. Seems to be pretty damning evidence against the RAID controller, agreed? Mason
Re: [BUGS] BUG #4035: sql table aliases do not work
On Fri, Mar 14, 2008 at 08:53:08PM +, RGF wrote:
> PostgreSQL version: latest

do you mean 8.2.6, 8.3.0 or the latest version of some other series? please try a little harder next time!!

> Description: sql table aliases do not work

they do whenever I use them! by the looks of your SQL you're not even using them where you should be:

> Context: SQL statement "update tonodes set tonodes.cost = case when
> tonodes.cost is NULL then fromnodes.cost + paths.cost when fromnodes.cost +
> paths.cost < tonodes.cost then fromnodes.cost + paths.cost else tonodes.cost
> end, tonodes.pathid = paths.pathid from nodes as fromnodes inner join paths
> on paths.fromnodeid = fromnodes.nodeid inner join tonodes on tonodes.nodeid
> = paths.tonodeid where fromnodes.nodeid = $1 and (tonodes.cost is NULL or
> fromnodes.cost + paths.cost < tonodes.cost) and tonodes.calculated = 0"
> PL/pgSQL function "dijkstra_resolve" line 53 at SQL statement

If we rewrite this to be somewhat readable:

  update tonodes
     set tonodes.cost = case when tonodes.cost is NULL
                               then fromnodes.cost + paths.cost
                             when fromnodes.cost + paths.cost < tonodes.cost
                               then fromnodes.cost + paths.cost
                             else tonodes.cost
                        end,
         tonodes.pathid = paths.pathid
    from nodes as fromnodes
         inner join paths on paths.fromnodeid = fromnodes.nodeid
         inner join tonodes on tonodes.nodeid = paths.tonodeid
   where fromnodes.nodeid = $1
     and (tonodes.cost is NULL or fromnodes.cost + paths.cost < tonodes.cost)
     and tonodes.calculated = 0;

You refer to "tonodes" but never actually say that it's an alias for nodes (I assume, you've not actually said this anywhere).

> The tables referenced (nodes and paths) exist and have data. The SQL works
> in MS SQL Server 2000

The PG manual[1] has this to say:

  Some other database systems offer a FROM option in which the target table
  is supposed to be listed again within FROM. That is not how PostgreSQL
  interprets FROM. Be careful when porting applications that use this
  extension.

I'd guess this is what MS SQL does. That said, it's easy to rewrite your query to use PG syntax. I've also noticed that your CASE statement is somewhat redundant so I've removed it (its cases are exactly the same as the WHERE clause). Note that the target of the UPDATE is the "to" node, named only once, with everything else moved into the FROM list:

  UPDATE nodes t
     SET cost   = f.cost + p.cost,
         pathid = p.pathid
    FROM nodes f, paths p
   WHERE (p.fromnodeid, p.tonodeid) = (f.nodeid, t.nodeid)
     AND (t.cost IS NULL OR f.cost + p.cost < t.cost)
     AND t.calculated = 0
     AND f.nodeid = $1;

Which, to me, is even more readable.

For future reference, the pgsql-general mailing list[2] is more appropriate for questions like this.

As a side note, do you have exactly one path from each node to another node, or do you run this code several times until it converges on the minimum? In the latter case you'd probably be better off using an aggregation to find the shortest path in a single pass.

Sam

[1] http://www.postgresql.org/docs/8.3/static/sql-update.html#AEN61013
[2] http://archives.postgresql.org/pgsql-general/

-- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
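[To see the PostgreSQL behaviour in isolation, here is a tiny self-contained sketch -- toy tables, not the poster's schema -- showing the update target named exactly once, with every other relation (including a second reference to the same table) in the FROM list:

  -- Toy schema, purely illustrative.
  CREATE TEMP TABLE n (nodeid int PRIMARY KEY, cost int);
  CREATE TEMP TABLE p (fromnodeid int, tonodeid int, cost int);
  INSERT INTO n VALUES (1, 0), (2, NULL);
  INSERT INTO p VALUES (1, 2, 5);

  -- Relax the edge from node 1 to node 2.
  UPDATE n AS t
     SET cost = f.cost + p.cost
    FROM n AS f
         JOIN p ON p.fromnodeid = f.nodeid
   WHERE t.nodeid = p.tonodeid
     AND (t.cost IS NULL OR f.cost + p.cost < t.cost);

  SELECT * FROM n;   -- node 2 now has cost 5
]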
Re: [BUGS] Redundant explicit field name/types description while select from function with return type of record
On Fri, Mar 28, 2008 at 01:43:25PM -0300, Euler Taveira de Oliveira wrote: > [EMAIL PROTECTED] wrote: > >a column definition list is required for functions returning "record" > >It seems a BUG > > > I don't think so. We can say it is a missing feature. As stated in [1], > record types don't have a predefined structure -- they're placeholders. I was having a similar discussion with Gregory Stark about this and hadn't realised that such small amounts of state was recorded with each row. > How do you know the row structure before hand? Its structure can be > changed on-the-fly. Sorry, I don't understand this comment. Could you elaborate? Sam -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4085: No implicit cast after coalesce
On Thu, Apr 03, 2008 at 12:24:17AM +0200, Peter Eisentraut wrote:
> Jeff Dwyer wrote:
> > This works fine:
> > select 1 where current_date between '1900-3-3' and '1900-2-2';
> > This doesn't:
> > select 1 where current_date between coalesce(null,current_date) and
> > coalesce(null, '1900-1-2');
> >
> > This fix works:
> > select 1 where current_date between coalesce(null,current_date) and
> > coalesce(null, date('1900-1-2'));
> >
> > This seems like a bug to me. Why should an explicit cast be necessary after
> > a coalesce?
>
> Because coalesce(null, '1900-1-2') has no other type information attached, so
> it would have picked text by default as result type, and that then clashes
> with the result type of coalesce(null,current_date), which can be derived to
> be date.  This is a robustness improvement: 8.2 and earlier would silently
> accept coalesce(null, 'abc') and apply text-semantics comparison.

The types look as though they could be interpreted unambiguously and correctly very easily. Parametric polymorphism and some basic type inference would easily be able to resolve this.

BETWEEN would have the following type (very informally presented; lower case characters stand for type variables, Titlecase for type names, UPPERCASE for identifiers!):

  Boolean (t BETWEEN t AND t)

i.e. when BETWEEN is called, all the types must be the same. COALESCE is also parametrised over a single type:

  t COALESCE(t, t)

NULLs could be encoded in the type system in many ways, as long as they had a polymorphic type. The type system should realise that "current_date" is of type Date, and because NULL is polymorphic the COALESCEs would unify, both returning values of type Date, which would in turn unify with the BETWEEN operator, resulting in a value of Boolean type -- which is exactly what the WHERE clause expects.

This sort of type inference has been known (and extensively studied) for about 50 years now; it always surprises me how little it's known outside the functional programming community (ML and Haskell being the old guard). Apparently, according to the fountain of wisdom that is Wikipedia, it's finally starting to break into very mainstream languages like VB9 and C#3. Moving an existing implementation over to a new type system is an entirely non-trivial matter though!

Sam

-- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
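[In the meantime, the practical workarounds all amount to pinning down the literal's type somewhere. Two equivalent spellings, illustrative only:

  -- Give the literal an explicit type:
  SELECT 1 WHERE current_date BETWEEN coalesce(NULL, current_date)
                                  AND coalesce(NULL, DATE '1900-1-2');

  -- Or type the NULL instead; the unknown literal then resolves against it:
  SELECT 1 WHERE current_date BETWEEN coalesce(NULL, current_date)
                                  AND coalesce(NULL::date, '1900-1-2');
]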
Re: [BUGS] Inconsistent shift operator
On Sat, Apr 19, 2008 at 11:26:57AM -0400, Tom Lane wrote:
> Roman Kononov <[EMAIL PROTECTED]> writes:
> > The below test cases show the obvious inconsistency between different
> > integer types.
>
> [ shrug... ]  The << and >> operators just expose the behavior of the
> local C compiler's shift operators, and it's clearly stated in the C
> spec that shifting by more than the word width produces unspecified
> results.

I thought that getting the correct answer was more important than getting the "wrong" answer quickly.

The current code also introduces its own set of strangeness in the 16 bit case on my x86_64 box. It does something closer to:

  int16 shl(int val, int n)
  {
      n %= 32;
      return n < 16 ? val << n : 0;
  }

This is exactly what the code says, it's just that the resulting semantics aren't very nice for the user. Wouldn't it be easy to put some code like this in:

  if (arg2 < 16)
      return PG_RETURN_INT16(arg1 << arg2);
  return PG_RETURN_INT16(0);

People would have one less platform specific weirdo to worry about.

As an aside, I thought it would be interesting to see what MySQL did and it seems to check for and handle this case -- albeit only the 64bit case, but as far as I can tell it only really knows about "long long" ints.

Sam

-- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
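[A query-level sketch of the problem and of a portable workaround; the exact values returned by the first statement are platform-specific, which is the point:

  -- Shift counts wider than the type: the results depend on the platform's C
  -- shift behaviour, and the three integer widths need not agree with each other.
  SELECT (1::int2) << 17, (1::int4) << 33, (1::int8) << 65;

  -- A guard spelled out in SQL gives consistent zero-fill behaviour everywhere.
  SELECT CASE WHEN n BETWEEN 0 AND 15 THEN x << n ELSE 0::int2 END AS shifted
    FROM (VALUES (1::int2, 17)) AS v(x, n);
]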
Re: [BUGS] Inconsistent shift operator
On Sun, Apr 20, 2008 at 12:27:38PM -0400, Tom Lane wrote: > Sam Mason <[EMAIL PROTECTED]> writes: > > Wouldn't it be easy to put some code like this in: > > if (arg2 < 16) > > return PG_RETURN_INT16(arg1 << arg2); > > return PG_RETURN_INT16(0); > > This is a straw man. It covers *one* of the behaviors left undefined > by the C standard. I quote from C99: [...] wow, I never realised how little semantics C actually defines. I'm a fan of formally defined semantics and the above just seems like a big cop-out. The case of E1 being negative seems completely implementation defined! > We are shifting signed types so we are exposed to every one of these > unspecified behaviors. In particular, since we don't know whether the > behavior of >> will be zero-fill or sign-fill, it's not going to be > exactly trivial to make a consistent extension of it to shift values > greater than the word width. About the only reasonable thing I can think of that would remain within C's spec (at least with respect of shifting) would be to always treat E1 as an unsigned value. This would allow it to be used consistently with the other bit-wise operators, but would cause any non-bitwise interpretation of the result to become implementation defined. Not very nice. > By the time you get done replicating all this for the int2, int4, > and int8 shift operators, it's not looking like such a small patch > anymore. And I still find the premise entirely unconvincing. After an afternoon of getting utterly bogged down looking into what other languages do and getting very distracted with ring theory I'm tempted to reluctantly agree. Maybe the warning about floats could be extended to cover the shift operators as well. Maybe: The bit shift operators only return consistent results when both the RHS is within the bit-size of the arguments' data type and the LHS is positive, in all other cases the behaviour is platform specific. I think that's technically correct, but seems to come across as very pessimistic. > Maybe the user *wants* to see the local behavior of shift, whatever > it might be. It's certainly not impossible that we'd break applications > that worked fine before (at least on the hardware they were being > used on). Yes. It always surprises me how hard getting formal semantics into an existing language seems to be. Sam -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4114: Inconsistent shift operator
On Sun, Apr 20, 2008 at 08:17:50PM +0200, Zdenek Kotala wrote: > Roman Kononov napsal(a): > >The below test cases show the obvious inconsistency between different > >integer types. > > It seems to be OK regarding how C shift operator works. Try Yes, but I interpret this behaviour as not being very useful for people writing SQL code that uses the shift operators. C is a very low level language and you almost always end up writing platform specific code, SQL is supposed to be much higher level and should abstract away system specific differences. I've not been able to think of a nice way of doing this though. Sam -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] PB with postgresql 7.4
On Mon, Apr 28, 2008 at 02:14:17PM +0200, Nicolas ZABOURI wrote: > My server is down and all my dump aren't make. > How can make for re-install the data on a new install of the server ? If you can get another copy of 7.4 installed that's built for the same processor as the database was originally running from and then point it at your old data directory it should be able to use it. I.e. something like: postmaster-7.4 -D /path/to/old/data It is probably worth working with a copy of the data just in case something strange happens. If you want to stay with 7.4 then you're done, if you want to upgrade to a newer release (i.e. 8.2 or 8.3) of PG then you should do a pg_dump from the newer version's dump program. > I don't find a forum who can speak about this ! The [EMAIL PROTECTED] mailing list is probably the best place to ask these sorts of questions. They are described here: http://www.postgresql.org/community/lists/ Sam -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4207: EXTRACT function
On Thu, May 29, 2008 at 07:01:46AM +, Jeferson Kasper wrote: > the EXTRACT(field FROM source) function allows to use some kind of 'fields', > and i need to know if a time (like '07:00') is after the midday(12:00) or > not. > I know i can resolve it implementing some function like "select > ('07:00'<'12:00')".. i just want to know if is difficult to implement a new > field called 'am', to use like. > > select extract(AM from '07:00'); What's wrong with just doing: SELECT extract(hour FROM date) < 12; Sam -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
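[If what's wanted is an actual AM/PM label rather than a boolean, the same comparison works inside a CASE, and to_char already has a meridiem pattern -- illustrative only, and note that to_char wants a timestamp rather than a bare time:

  -- extract() accepts plain time values, so a label is just a CASE away:
  SELECT CASE WHEN extract(hour FROM TIME '07:00') < 12 THEN 'AM' ELSE 'PM' END AS meridiem;

  -- to_char's AM pattern, shown here on a timestamp:
  SELECT to_char(TIMESTAMP '2008-05-29 07:00', 'AM');
]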