Re: [GENERAL] Workaround for bug #13148 (deferred EXCLUDE constraint violation)
On Mon, Apr 27, 2015 at 7:45 AM, Evan Martin wrote:

> I submitted the following bug report through the web form a few days ago.
> It's causing problems in my application and I've been unable to find a way
> to get around it. If someone here, familiar with PostgreSQL internals,
> could suggest a workaround I'd really appreciate it!
>
> I have a deferred EXCLUDE constraint on a derived table. Inside a
> transaction I insert a new row that conflicts with an existing one (so the
> constraint would fail if it was immediate), delete the old row and run an
> unrelated UPDATE on the new row, then try to commit. I would expect the
> commit to succeed, since there is now no conflict, but it fails with
>
>     ERROR: conflicting key value violates exclusion constraint
>     "uq_derived_timeslice_dup_time_ex"
>     SQL state: 23P01
>     Detail: Key (feature_id, valid_time_begin, interpretation,
>     (COALESCE(sequence_number, (-1))))=(1, 2015-01-01 00:00:00, X, -1)
>     conflicts with existing key (feature_id, valid_time_begin,
>     interpretation, (COALESCE(sequence_number, (-1))))=(1,
>     2015-01-01 00:00:00, X, -1).
>
> If I run the delete statement first it works. If I remove the (seemingly
> unrelated) update statement it also works. Reproducible under PostgreSQL
> 9.3.6 and 9.4.1 64-bit on Windows 7 and PostgreSQL 9.2.10 32-bit on Ubuntu
> using the attached script.

I don't know if it is acceptable to you, but I did manage a workaround. I ran
your script as is and got the same problem. I was able to run the script to
successful completion by adding in one statement just _before_ the BEGIN
command:

    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

http://www.postgresql.org/docs/9.4/static/sql-set-transaction.html

    SERIALIZABLE

    All statements of the current transaction can only see rows committed
    before the first query or data-modification statement was executed in
    this transaction. If a pattern of reads and writes among concurrent
    serializable transactions would create a situation which could not have
    occurred for any serial (one-at-a-time) execution of those transactions,
    one of them will be rolled back with a serialization_failure error.

I do not know the internals, but I have a "gut feel" that the problem somehow
relates to the MVCC implementation in PostgreSQL.

Sorry about the delay, but: (1) I was on Jury duty yesterday & (2) I was
hoping a more experienced person would speak up.

--
If you sent twitter messages while exploring, are you on a textpedition?

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown
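A minimal sketch of the reported sequence. The original script is an
attachment that is not reproduced here, so the table layout, column names and
the simplified two-column constraint below are assumptions chosen only to
illustrate the ordering of statements; it may or may not reproduce the exact
failure.

    CREATE EXTENSION IF NOT EXISTS btree_gist;

    CREATE TABLE derived_timeslice (
        id               serial PRIMARY KEY,
        feature_id       int NOT NULL,
        valid_time_begin timestamp NOT NULL,
        notes            text,
        CONSTRAINT uq_derived_timeslice_dup_time_ex
            EXCLUDE USING gist (feature_id WITH =, valid_time_begin WITH =)
            DEFERRABLE INITIALLY DEFERRED
    );

    INSERT INTO derived_timeslice (feature_id, valid_time_begin)
    VALUES (1, '2015-01-01 00:00:00');               -- pre-existing row (id 1)

    BEGIN;
    -- New row that conflicts with the existing one; the deferred constraint
    -- is not checked yet.
    INSERT INTO derived_timeslice (feature_id, valid_time_begin)
    VALUES (1, '2015-01-01 00:00:00');               -- new row (id 2)
    -- Remove the old row, so no conflict should remain by commit time.
    DELETE FROM derived_timeslice WHERE id = 1;
    -- Seemingly unrelated UPDATE of the new row.
    UPDATE derived_timeslice SET notes = 'x' WHERE id = 2;
    COMMIT;    -- reported to fail with SQLSTATE 23P01 on 9.2-9.4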
Re: [GENERAL] Workaround for bug #13148 (deferred EXCLUDE constraint violation)
Thanks for looking into this! I tried your workaround on both 9.3.6 and 9.4.1
on Windows (64-bit), but it made no difference for me. If I put the SET
TRANSACTION statement before BEGIN on 9.4.1 I get "WARNING: SET TRANSACTION
can only be used in transaction blocks" - but putting it inside the
transaction block doesn't prevent the constraint violation, either.

On 28/04/2015 2:16 PM, John McKown wrote:
> I don't know if it is acceptable to you, but I did manage a workaround. I
> was able to run the script to successful completion by adding in one
> statement just _before_ the BEGIN command:
>
>     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
> [...]
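For context on that warning: SET TRANSACTION issued outside a transaction
block is accepted but only emits the warning and has no effect, so the
isolation level has to be set inside the block (before its first query) or via
BEGIN itself. A short sketch of the placements that do take effect; whether
serializable isolation actually avoids this particular bug is a separate
question, as the follow-up above shows.

    -- Inside the transaction block, before the first query:
    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    -- ... rest of the script ...
    COMMIT;

    -- Equivalent shorthand:
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    -- ... rest of the script ...
    COMMIT;

    -- Or, for every transaction in the session:
    SET default_transaction_isolation TO 'serializable';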
[GENERAL] Documentation Inaccuracy – Transaction Isolation
Hi,

I noticed an inaccuracy in the transaction isolation docs. Under the
Repeatable Read Isolation Level section it states:

    “The Repeatable Read isolation level only sees data committed before the
    transaction began; it never sees either uncommitted data or changes
    committed during transaction execution by concurrent transactions.”

That is not entirely accurate. The snapshot starts with the first SQL
statement in the transaction, not at the start of the transaction. Any change
that is committed in another transaction after the start of the transaction
but before the first SQL statement will be seen.

Brad.
Re: [GENERAL] Documentation Inaccuracy – Transaction Isolation
On Tue, Apr 28, 2015 at 08:00:24PM +, Nicholson, Brad (Toronto, ON, CA) wrote:
> Hi,
>
> I noticed an inaccuracy in the transaction isolation docs. Under the
> Repeatable Read Isolation Level section it states:
>
> “The Repeatable Read isolation level only sees data committed before the
> transaction began; it never sees either uncommitted data or changes
> committed during transaction execution by concurrent transactions.”
>
> That is not entirely accurate. The snapshot starts with the first SQL
> statement in the transaction, not at the start of the transaction. Any
> change that is committed in another transaction after the start of the
> transaction but before the first SQL statement will be seen.

Yes, we have fixed that for PG 9.5:

    http://www.postgresql.org/docs/devel/static/transaction-iso.html

        This level is different from Read Committed in that a query in a
    --> repeatable read transaction sees a snapshot as of the start of the
    --> first non-transaction-control statement in the transaction, not as of
        the start of the current statement within the transaction. Thus,
        successive SELECT commands within a single transaction see the same
        data, i.e., they do not see changes made by other transactions that
        committed after their own transaction started.

--
  Bruce Momjian                          http://momjian.us
  EnterpriseDB                           http://enterprisedb.com

  + Everyone has their own god. +
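A small two-session illustration of the behaviour being corrected (the table
name t is an assumption; session B's statements run on a separate connection
and are shown as comments so the script stays runnable from session A's side):

    -- Assumed setup, done beforehand:  CREATE TABLE t (x int);

    -- Session A:
    BEGIN ISOLATION LEVEL REPEATABLE READ;

    -- Session B, meanwhile:  INSERT INTO t VALUES (1); COMMIT;

    -- Session A: this is the first non-transaction-control statement, so the
    -- snapshot is taken here and the row committed by B after BEGIN *is* seen.
    SELECT count(*) FROM t;

    -- Session B again:  INSERT INTO t VALUES (2); COMMIT;

    -- Session A: same snapshot as above, so the second row is NOT seen.
    SELECT count(*) FROM t;
    COMMIT;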
[GENERAL] newsfeed type query
I'm trying to upgrade some code that powers a newsfeed type stream, and
hoping someone can offer some insight on better ways to structure some parts
of the query.

The part that has me stumped right now...

There are several criteria for why something could appear in a stream. For
example, here are 2 that handle a posting:

* a posting by a friend
* a posting in a group

The general way I've handled this so far has been simple:

    select * from posting
     where author_id in (select friend_id from friends where user_id = ?)
        or group_id in (select group_id from memberships where user_id = ?);

Now I need to pull in the context of the match (friend, group, both), but I
can't figure out how to do this cleanly.

1. if i just add 'case' statements to the select to note the origin, those
   subselects run again. (ie, the same subquery is executed twice)
2. if i structure this as a union (and note the origin with a string), it
   takes a lot more work to integrate and sort the 2 separate selects (eg
   "select id, timestamp, 'by-friend'" unioned with "in-group")

does anyone have ideas on other approaches to structuring this?
Re: [GENERAL] newsfeed type query
Since you very nicely DID NOT provide the pg version, O/S or table
structure(s), which is what you should do REGARDLESS of the type of question
(it's just the smart and polite thing to do when asking for help),
the best I can suggest is:

    SELECT CASE WHEN context = 'friend' THEN p.junka
                WHEN context = 'group'  THEN p.junkb
                WHEN context = 'both'   THEN p.junka || ' ' || p.junkb
           END
      FROM posting p
     WHERE p.author_id IN (SELECT f.friend_id
                             FROM friends f
                            WHERE f.user_id = ?)
        OR p.group_id IN (SELECT m.group_id
                            FROM memberships m
                           WHERE m.user_id = ?);

On Tue, Apr 28, 2015 at 6:26 PM, Jonathan Vanasco wrote:
> I'm trying to upgrade some code that powers a newsfeed type stream, and
> hoping someone can offer some insight on better ways to structure some
> parts of the query
> [...]
> does anyone have ideas on other approaches to structuring this?

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you wish to share my
fantasy is entirely up to you.
Re: [GENERAL] newsfeed type query
Sorry, I was trying to ask something very abstract as I have similar
situations on multiple groups of queries/tables (and they're all much more
complex).

I'm on pg 9.3. The relevant structure is:

    posting:      id, timestamp_publish, group_id__in, user_id__author
    friends:      user_id__a, user_id__b
    memberships:  user_id, group_id, role_id

-- working sql

    CREATE TABLE groups (
        id SERIAL NOT NULL PRIMARY KEY
    );
    CREATE TABLE users (
        id SERIAL NOT NULL PRIMARY KEY
    );
    CREATE TABLE friends (
        user_id__a INT NOT NULL REFERENCES users(id),
        user_id__b INT NOT NULL REFERENCES users(id)
    );
    CREATE TABLE memberships (
        user_id  INT NOT NULL REFERENCES users(id),
        group_id INT NOT NULL REFERENCES groups(id),
        role_id  INT NOT NULL
    );
    CREATE TABLE posting (
        id SERIAL NOT NULL,
        timestamp_publish TIMESTAMP NOT NULL,
        group_id__in INT NOT NULL REFERENCES groups(id),
        user_id__author INT NOT NULL REFERENCES users(id),
        is_published BOOL
    );

The output that I'm trying to get is:

    posting.id
    {the context of the select}
    posting.timestamp_publish   (this may need to get correlated into other queries)

These approaches had bad performance:

-- huge selects / memory -- it needs to load everything from 2 tables before it limits

    EXPLAIN ANALYZE
    SELECT id, feed_context
      FROM (
            SELECT id, timestamp_publish, 'in-group' AS feed_context
              FROM posting
             WHERE (group_id__in IN (SELECT group_id FROM memberships
                                      WHERE user_id = 57 AND role_id IN (1,2,3))
                    AND (is_published = True
                         AND timestamp_publish <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC'))
            UNION
            SELECT id, timestamp_publish, 'by-user' AS feed_context
              FROM posting
             WHERE (user_id__author IN (SELECT user_id__b FROM friends
                                         WHERE user_id__a = 57)
                    AND (is_published = True
                         AND timestamp_publish <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC'))
           ) AS feed
     ORDER BY timestamp_publish DESC
     LIMIT 10;

-- selects minimized, but repetitive subqueries

    SELECT id,
           CASE WHEN group_id__in IN (SELECT group_id FROM memberships
                                       WHERE user_id = 57 AND role_id IN (1,2,3))
                THEN True ELSE NULL END AS feed_context_group,
           CASE WHEN user_id__author IN (SELECT user_id__b FROM friends
                                          WHERE user_id__a = 57)
                THEN True ELSE NULL END AS feed_context_user
      FROM posting
     WHERE (group_id__in IN (SELECT group_id FROM memberships
                              WHERE user_id = 57 AND role_id IN (1,2,3))
            OR user_id__author IN (SELECT user_id__b FROM friends
                                    WHERE user_id__a = 57))
       AND (is_published = True
            AND timestamp_publish <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC')
     ORDER BY timestamp_publish DESC
     LIMIT 10;

On Apr 28, 2015, at 6:56 PM, Melvin Davidson wrote:
> Since you very nicely DID NOT provide the pg version, O/S or table
> structure(s), which is what you should do REGARDLESS of the type of
> question (it's just the smart and polite thing to do when asking for help),
> the best I can suggest is:
> [...]
Re: [GENERAL] BDR Selective Replication
On 4/27/15 7:54 PM, Craig Ringer wrote:
> If 'default replication set' is the idea of "here's what tables *should* be
> getting replicated regardless of whether that's happening or not", it'd be
> great if that was done so it could be split out on its own at some point.
> It's a problem that affects all replication systems.
>
> It wasn't, but that's an interesting idea.
>
> You need a way to identify peer nodes in an abstract way before you can
> really define sets of which nodes should get which tables. So I think
> replication identifiers ( https://commitfest.postgresql.org/4/161/ ) are a
> pre-requisite for that though, and one that's proving difficult to get in.

Perhaps... different replication systems probably use different methods to
identify, so presumably there'd need to be some way to map a generic
identifier into an appropriate identifier for whatever replication system
you're using.

> I think any sort of replication sets is likely to have similar problems,
> especially the "no in-core user" problem. There's nothing fundamentally
> impossible about filtering WAL sent to physical downstreams over streaming
> replication to include only replicated tables and the catalogs, though, so
> perhaps there could be an in-core user for it.

Oh, I wasn't thinking this needed to be in-core. I think it'd be a lot easier
to develop it as an extension to start with... certainly a lot less headache
;) If it becomes popular then it'll be a lot easier to get it added.

> In BDR we're currently (ab)using security labels to tag tables with their
> replication sets, but I'd love to have a proper way to do that. As I recall
> the prior approach, of allowing custom relation options, was rejected on
> -hackers. How would you want to go about storing and tracking the
> information? A new catalog? The other issue for in-core replication sets
> would probably be making it foreign-key aware, so replication of a table
> transitively requires replication of its references.

As you said, we'd need a way to identify replication nodes. We might also
need/want a way to specify topology. I don't think topology would be too hard
(presumably it's either a single 'parent' node, or a list of peers). What
might be more interesting is dealing with different systems' methods of
identifying nodes.

You'd want a way to define different sets and associate them with nodes. A
node could be a provider, subscriber, or both. I think some replication
systems support 'pass through' as well, where the node passes data downstream
but doesn't apply it itself. Or it could be multi-master and possibly a
provider to read-only subscribers.

Finally you'd need to associate tables and sequences with a set. I agree
you'd want to look at FKs. I'd also like to be able to define rules for a
set, like "include everything in this schema, unless the first character is
_".

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
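For reference, the mechanism Craig describes piggybacks on core's generic
SECURITY LABEL command; a label provider has to be registered by an extension
before it can be used. The provider name and the label payload below are
illustrative assumptions only, not necessarily BDR's actual format.

    -- Generic form:  SECURITY LABEL FOR <provider> ON TABLE <name> IS '<label>';
    -- 'bdr' as the provider name and the JSON-style payload are made up here
    -- purely for illustration.
    SECURITY LABEL FOR bdr ON TABLE public.my_table IS '{"sets": ["default_set"]}';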
Re: [GENERAL] newsfeed type query
On 4/28/15 6:57 PM, Jonathan Vanasco wrote:
> The relevant structure is:
>
>     posting:      id, timestamp_publish, group_id__in, user_id__author
>     friends:      user_id__a, user_id__b
>     memberships:  user_id, group_id, role_id

Try this...

    SELECT ...
           , f.user_id__b IS NOT NULL AS in_friends
           , m.user_id IS NOT NULL AS in_group
      FROM posting p
      LEFT JOIN friends f ON ( f.user_id__b = p.user_id__author )
      LEFT JOIN memberships m ON ( m.group_id = p.group_id__in )
     WHERE is_published
       AND timestamp_publish ...
       AND ( f.user_id__a = 57
             OR ( m.user_id = 57 AND m.role_id IN (1,2,3) ) )

I'm not sure how fast it'll be though. I suspect your best bet is to put the
UNION approach inside a set returning function; that way the ugly is
contained in one place.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
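A sketch of that last suggestion: wrap the poster's UNION query in a SQL
set-returning function so callers only see one call. The function name
user_feed and the parameter names are made up; the body just reuses the schema
and predicates posted earlier in the thread, and mainly contains the ugliness
rather than changing the plan, per the caveat above.

    CREATE OR REPLACE FUNCTION user_feed(p_user_id int, p_limit int DEFAULT 10)
    RETURNS TABLE (id int, timestamp_publish timestamp, feed_context text)
    LANGUAGE sql STABLE AS
    $$
        SELECT id, timestamp_publish, 'in-group' AS feed_context
          FROM posting
         WHERE group_id__in IN (SELECT group_id FROM memberships
                                 WHERE user_id = p_user_id AND role_id IN (1,2,3))
           AND is_published = True
           AND timestamp_publish <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC'
        UNION
        SELECT id, timestamp_publish, 'by-user'
          FROM posting
         WHERE user_id__author IN (SELECT user_id__b FROM friends
                                    WHERE user_id__a = p_user_id)
           AND is_published = True
           AND timestamp_publish <= CURRENT_TIMESTAMP AT TIME ZONE 'UTC'
        ORDER BY timestamp_publish DESC
        LIMIT p_limit;
    $$;

    -- Callers then just do:
    --   SELECT * FROM user_feed(57);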
[GENERAL] Upgrading hot standbys
Greetings,

I'm in the process of upgrading PG 9.1.3 to 9.4.1 in the near future. I have
several machines which each house unique databases. Each of those is
replicated to a standby server with matching configurations. A total of 10
servers, 5 masters, 5 slaves. Everything runs on Ubuntu.

My question, as I can't seem to find any documentation on this part: once I
successfully upgrade the master I will need to upgrade the standby as well.
Will I have to rebuild the standby from scratch, or will the standby pick up
where it was before the upgrade if I do things correctly?

Thanking you in advance,

Aaron
[GENERAL] clearing of the transactions shown in pg_locks
Hi,

I am new to PostgreSQL and facing an issue with the transactions which are
being shown in pg_locks. We are using a script which deletes entries from a
table called audittraillogentry. This script first does the indexing, then it
deletes the entries from the table.

Now the issue is that this script is taking a lot of time and has acquired
some locks also. Not sure if it is happening due to the indexing or what. We
did not stop the script, and after that, when we tried to manually run the
delete operation, that query also acquired some locks and is not working.

1. We can stop the script which is running in the background, but is stopping
   the script sufficient to release the locks which are shown in pg_locks?
2. We want to kill the script and manually run the delete operation. What
   steps should I follow, besides killing the script, to clear all the
   transactions held by the previously running script?

fm_db_Server3=# select t.relname, l.locktype, page, virtualtransaction, pid, mode, granted
                from pg_locks l, pg_stat_all_tables t
                where l.relation = t.relid
                order by relation asc;

        relname       | locktype |  page  | virtualtransaction |  pid  |       mode       | granted
 ---------------------+----------+--------+--------------------+-------+------------------+---------
  pg_class            | relation |        | 3/31423            | 10675 | AccessShareLock  | t
  pg_index             | relation |        | 3/31423            | 10675 | AccessShareLock  | t
  pg_namespace         | relation |        | 3/31423            | 10675 | AccessShareLock  | t
  audittraillogentry   | relation |        | 2/33089            | 28223 | RowExclusiveLock | t
  audittraillogentry   | relation |        | 6/94               |  3722 | RowExclusiveLock | t
  audittraillogentry   | tuple    | 872812 | 2/33089            | 28223 | ExclusiveLock    | t
  audittraillogentry   | relation |        | 6/94               |  3722 | RowShareLock     | t
  cdrdetails           | relation |        | 6/94               |  3722 | RowExclusiveLock | t
  cdrlogentry          | relation |        | 6/94               |  3722 | RowShareLock     | t
 (9 rows)

fm_db_Server3=# SELECT * FROM pg_locks;

     locktype   | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |       mode       | granted
 ---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+------------------+---------
  virtualxid    |          |          |      |       | 2/33089    |               |         |       |          | 2/33089            | 28223 | ExclusiveLock    | t
  relation      |    16384 |    16409 |      |       |            |               |         |       |          | 6/94               |  3722 | AccessShareLock  | t
  relation      |    16384 |    16409 |      |       |            |               |         |       |          | 6/94               |  3722 | RowExclusiveLock | t
  virtualxid    |          |          |      |       | 3/31424    |               |         |       |          | 3/31424            | 10675 | ExclusiveLock    | t
  relation      |    16384 |    16406 |      |       |            |               |         |       |          | 6/94               |  3722 | RowShareLock     | t
  relation      |    16384 |    16406 |      |       |            |               |         |       |          | 6/94               |  3722 | RowExclusiveLock | t
  transactionid |          |          |      |       |            |        166393 |         |       |          | 6/94               |  3722 | ExclusiveLock    | t
  relation      |    16384 |    50024 |      |       |            |               |         |       |          | 6/94               |  3722 | AccessShareLock  | t
  relation      |    16384 |    50024 |      |       |            |               |         |       |          | 6/94               |  3722 | RowExclusiveLock | t
  virtualxid    |          |          |      |       | 6/94       |               |         |       |          | 6/94               |  3722 | ExclusiveLock    | t
  relation      |    16384 |    11000 |      |       |            |               |         |       |          | 3/31424            | 10675 | AccessShareLock  | t
  relation      |    16384 |    16479 |      |       |            |               |         |       |          | 6/94               |  3722 | AccessShareLock  | t
  relation      |    16384 |    50026 |      |       |            |               |         |       |          | 2/33089            | 28223 | RowExclusiveLock | t
  relation      |    16384 |    50025 |      |       |            |               |         |       |          | 2/33089            | 28223 | RowExclusiveLock | t
  relation      |    16384 |    50027 |      |       |            |               |         |       |          | 2/33089            | 28223 | R
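If it helps: stopping the client script on its own does not necessarily end
the server-side transaction right away. The usual approach is to cancel or
terminate the script's backend by pid (the pids seen in pg_stat_activity and
pg_locks); rolling back that backend's transaction is what releases its locks.
A sketch, using the pids from the output above as placeholders; confirm from
pg_stat_activity which one actually belongs to the script before killing
anything. (On 9.1 and older, pg_stat_activity has procpid/current_query
instead of pid/query.)

    -- See what the two sessions holding locks are doing:
    SELECT pid, state, query
      FROM pg_stat_activity
     WHERE pid IN (3722, 28223);

    -- Politely cancel the currently running statement of the script's backend:
    SELECT pg_cancel_backend(3722);

    -- If the transaction is still open (e.g. idle in transaction), terminate
    -- the backend; its transaction is rolled back and its locks are released:
    SELECT pg_terminate_backend(3722);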
Re: [GENERAL] BDR Selective Replication
On 29 April 2015 at 09:14, Jim Nasby wrote:

> On 4/27/15 7:54 PM, Craig Ringer wrote:
>
>> If 'default replication set' is the idea of "here's what tables *should*
>> be getting replicated regardless of whether that's happening or not", it'd
>> be great if that was done so it could be split out on its own at some
>> point. It's a problem that affects all replication systems.
>>
>> It wasn't, but that's an interesting idea.
>>
>> You need a way to identify peer nodes in an abstract way before you can
>> really define sets of which nodes should get which tables. So I think
>> replication identifiers ( https://commitfest.postgresql.org/4/161/ ) are
>> a pre-requisite for that though, and one that's proving difficult to get
>> in.
>
> Perhaps... different replication systems probably use different methods to
> identify, so presumably there'd need to be some way to map a generic
> identifier into an appropriate identifier for whatever replication system
> you're using.

Replication identifiers do just that: provide a way to map identifiers from
some external system into a local unique identifier for a peer node, along
with tracking of the replay position from the peer so replay can be restarted
at a consistent point.

The replay position is an LSN, so they're not going to work for any arbitrary
system, though.

>> How would you want to go about storing and tracking the information? A
>> new catalog? The other issue for in-core replication sets would probably
>> be making it foreign-key aware, so replication of a table transitively
>> requires replication of its references.
>
> As you said, we'd need a way to identify replication nodes. We might also
> need/want a way to specify topology.

Topology? Why?

All a node needs to know is "send data from to ". It's just a set. If a
replication system is doing something fancy it'd be able to manage the
replication sets on the nodes.

> I don't think topology would be too hard (presumably it's either a single
> 'parent' node, or a list of peers). What might be more interesting is
> dealing with different systems' methods of identifying nodes.

Yeah, topology is hard. Rings, mesh with dangling follower nodes, etc. I
don't think it's really the same thing as replication sets.

> You'd want a way to define different sets and associate them with nodes. A
> node could be a provider, subscriber, or both. I think some replication
> systems support 'pass through' as well, where the node passes data
> downstream but doesn't apply it itself. Or it could be multi-master and
> possibly a provider to read-only subscribers.

Yeah, you're talking about some kind of abstract modelling of a replication
topology. I'm not sure that's at all necessary to keep track of which tables
should be replicated to which nodes.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services