Re: [GENERAL] Comparing txid and xmin (under BDR)

2015-05-12 Thread Peter Mogensen



On 2015-05-12 06:06, Craig Ringer wrote:

On 11 May 2015 at 21:10, Peter Mogensen  wrote:



So ... I can easily get the current txid of the SELECT transaction by
calling txid_current().


Note that by doing so, you force txid allocation for a read-only query that
might otherwise not need one, which increases your txid burn rate and
decreases time until you need to do wraparound-protection vacuuming.




The same is not true for txid_snapshot_xmin(), is it?

I mean ... I really don't need the actual txid of a SELECT statement.
Only to ensure that it's newer than any invalidation event.
So it's enough to just use txid_snapshot_xmin() ... at the cost of
possibly not caching new values in a small window after invalidation.
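
For the record, a snapshot-based check along those lines could look like the
sketch below (the cache-validity comparison is pseudocode for whatever the
application stores):

```sql
-- Read the snapshot's xmin without allocating a txid for this
-- read-only transaction (unlike txid_current(), this does not
-- increase the txid burn rate):
SELECT txid_snapshot_xmin(txid_current_snapshot()) AS snap_xmin;

-- The application can then treat a cache entry as usable when
--   snap_xmin > last_invalidation_txid
-- accepting the small window after an invalidation where caching
-- is skipped unnecessarily.
```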


/Peter



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [BDR] Node Join Question

2015-05-12 Thread Craig Ringer
On 12 May 2015 at 14:33, Wayne E. Seguin  wrote:

>
>
7. on nodes 1-4 I am doing (adjusted for the nodes IP):
> SELECT bdr.bdr_group_join(
> local_node_name := 'pgbdr1',
> node_external_dsn := 'host=10.244.2.6 port=5432 user=postgres
> dbname=pgbdr',
> join_using_dsn := 'host=10.244.2.2 port=5432 user=postgres
> dbname=pgbdr'
> );
>

At a guess you're probably not waiting between joins to ensure that each
new node has finished joining before starting another node join.

BDR really needs to be enhanced to either support parallel join of multiple
nodes or identify and reject it.
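A strictly sequential join, sketched; bdr.bdr_node_join_wait_for_ready() is
the helper I'd expect to use here, assuming your BDR version provides it (DSNs
copied from the example above):

```sql
-- On each joining node in turn: start the join, then block until this
-- node has fully finished joining before touching the next node.
SELECT bdr.bdr_group_join(
    local_node_name   := 'pgbdr2',
    node_external_dsn := 'host=10.244.2.10 port=5432 user=postgres dbname=pgbdr',
    join_using_dsn    := 'host=10.244.2.2 port=5432 user=postgres dbname=pgbdr'
);

-- Returns once the local node's bdr.bdr_nodes status reaches 'r' (ready):
SELECT bdr.bdr_node_join_wait_for_ready();
```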


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [GENERAL] [BDR] Node Join Question

2015-05-12 Thread Craig Ringer
On 12 May 2015 at 14:36, Wayne E. Seguin  wrote:

> Also,
>
> Is there a way to remove these things from the init target node easier?
>
> d= p=504 a=ERROR:  55000: previous init failed, manual cleanup is required
> d= p=504 a=DETAIL:  Found bdr.bdr_nodes entry for bdr
> (6147869128174526660,1,16908,) with state=i in remote bdr.bdr_nodes
> d= p=504 a=HINT:  Remove all replication identifiers and slots
> corresponding to this node from the init target node then drop and recreate
> this database and try again
>

Now that we have SQL-level join it'd probably make sense to provide a
cleanup function for failed node joins. At this point there's no such
function.


Take note of the node identity given in the error as it corresponds to the
replication identifier name and slot name.

You need to, on the join target node:

 SELECT pg_drop_replication_slot(slot_name)
 FROM pg_replication_slots
 WHERE slot_name =
bdr.bdr_format_slot_name('6147869128174526660',1,16908)

where the sysid, timeline ID and database OID are those given in the error.
You must run this from the target node's database, as it'll only consider
slots for the current database.

Then

SELECT pg_replication_identifier_drop(...)

the replication identifier used, after looking up the replication
identifier from pg_catalog.pg_replication_identifier. There isn't an
equivalent of  bdr.bdr_format_slot_name for replication identifiers; I'll
look at adding one. Look it up visually or write a simple function to
format the string in the mean time.
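In the interim, a lookup along these lines may work (a sketch only: the
identifier naming convention is assumed here, and the exact
pg_replication_identifier_drop() signature should be checked against your
build):

```sql
-- List candidate replication identifiers for the failed node; BDR
-- identifier names embed the sysid/timeline/dboid from the error:
SELECT riname
FROM pg_catalog.pg_replication_identifier
WHERE riname LIKE '%6147869128174526660%';

-- Then drop each matching identifier by name:
SELECT pg_replication_identifier_drop(riname)
FROM pg_catalog.pg_replication_identifier
WHERE riname LIKE '%6147869128174526660%';
```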

Then delete the bdr.bdr_nodes entry for the failed-to-join node and any
bdr.bdr_connections entries for it.

You *must* drop and re-create the database on the failed-to-join node,
making a new blank db (preferably from template0).


Re: [GENERAL] Why does this SQL work?

2015-05-12 Thread hubert depesz lubaczewski
On Tue, May 12, 2015 at 04:07:52PM +0800, Anil Menon wrote:
> Thank you very much - looks like I will have to prefix all cols.

You should anyway.
Queries with unaliased columns make it impossible to analyze without
in-depth knowledge of the database.

Consider:

select c1, c2, c3, c4, c5
from t1 join t2 using (c6)
where c7 = 'a' and c8 < now() and c9;

which fields belong to which tables? what indexes make sense? it's
impossible to tell. if the column references were prefixed with table
name/alias - it would become possible, and easy, even, to figure out
what's going on.
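The same query with table aliases and prefixed columns, for comparison (which
column lives in which table is an arbitrary assignment here, purely for
illustration):

```sql
SELECT a.c1, a.c2, b.c3, b.c4, a.c5
FROM t1 AS a
JOIN t2 AS b USING (c6)
WHERE a.c7 = 'a'
  AND b.c8 < now()
  AND a.c9;
-- Now each predicate's table is explicit, so candidate indexes
-- (e.g. on t1 (c7) or t2 (c8)) can be read straight off the query.
```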

depesz




Re: [GENERAL] [BDR] Node Join Question

2015-05-12 Thread Wayne E. Seguin
Craig, thank you so much for the quick response!

Adding these cleanup functions sounds wonderful, thank you for looking into
that.

One question: why template0 vs template1? (My guess is because you want it
to be devoid of pretty much everything?)

On Tue, May 12, 2015 at 1:31 AM, Craig Ringer  wrote:

>
>
> On 12 May 2015 at 14:36, Wayne E. Seguin  wrote:
>
>> Also,
>>
>> Is there a way to remove these things from the init target node easier?
>>
>> d= p=504 a=ERROR:  55000: previous init failed, manual cleanup is required
>> d= p=504 a=DETAIL:  Found bdr.bdr_nodes entry for bdr
>> (6147869128174526660,1,16908,) with state=i in remote bdr.bdr_nodes
>> d= p=504 a=HINT:  Remove all replication identifiers and slots
>> corresponding to this node from the init target node then drop and recreate
>> this database and try again
>>
>
> Now that we have SQL-level join it'd probably make sense to provide a
> cleanup function for failed node joins. At this point there's no such
> function.
>
>
> Take note of the node identity given in the error as it corresponds to the
> replication identifier name and slot name.
>
> You need to, on the join target node:
>
>  SELECT pg_drop_replication_slot(slot_name)
>  FROM pg_replication_slots
>  WHERE slot_name =
> bdr.bdr_format_slot_name('6147869128174526660',1,16908)
>
> where the sysid, timeline ID and database OID are those given in the
> error. You must run this from the target node's database, as it'll only
> consider slots for the current database.
>
> Then
>
> SELECT pg_replication_identifier_drop(...)
>
> the replication identifier used, after looking up the replication
> identifier from pg_catalog.pg_replication_identifier. There isn't an
> equivalent of  bdr.bdr_format_slot_name for replication identifiers; I'll
> look at adding one. Look it up visually or write a simple function to
> format the string in the mean time.
>
> Then delete the bdr.bdr_nodes entry for the failed-to-join node and any
> bdr.bdr_connections entries for it.
>
> You *must* drop and re-create the database on the failed-to-join node,
> making a new blank db (preferably from template0).
>
>
>
>
>


-- 
  ~Wayne

Wayne E. Seguin
wayneeseg...@gmail.com
wayneeseguin on irc.freenode.net
http://twitter.com/wayneeseguin/
https://github.com/wayneeseguin/


Re: [GENERAL] [BDR] Node Join Question

2015-05-12 Thread Wayne E. Seguin
Craig,

I was starting the first node then letting all of the other nodes join as
quickly as they could, which clearly won't work. It also explains why it
worked when I did it manually: I can only do it sequentially myself ;) I
had suspected a race condition and it seems I was in the right area :)

Thank you for this, I will alter what I am doing to start sequentially.

On Tue, May 12, 2015 at 1:19 AM, Craig Ringer  wrote:

>
>
> On 12 May 2015 at 14:33, Wayne E. Seguin  wrote:
>
>>
>>
> 7. on nodes 1-4 I am doing (adjusted for the nodes IP):
>> SELECT bdr.bdr_group_join(
>> local_node_name := 'pgbdr1',
>> node_external_dsn := 'host=10.244.2.6 port=5432 user=postgres
>> dbname=pgbdr',
>> join_using_dsn := 'host=10.244.2.2 port=5432 user=postgres
>> dbname=pgbdr'
>> );
>>
>
> At a guess you're probably not waiting between joins to ensure that each
> new node has finished joining before starting another node join.
>
> BDR really needs to be enhanced to either support parallel join of
> multiple nodes or identify and reject it.
>
>
> --
>  Craig Ringer   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>



-- 
  ~Wayne

Wayne E. Seguin
wayneeseg...@gmail.com
wayneeseguin on irc.freenode.net
http://twitter.com/wayneeseguin/
https://github.com/wayneeseguin/


Re: [GENERAL] [BDR] Node Join Question

2015-05-12 Thread Wayne E. Seguin
Craig,

It's alive!!!

One more question on this thread, where can I find the meanings of
node_status in the documentation?

pgbdr=# SELECT * FROM bdr.bdr_nodes;
     node_sysid      | node_timeline | node_dboid | node_status | node_name |                    node_local_dsn                     |                  node_init_from_dsn
---------------------+---------------+------------+-------------+-----------+-------------------------------------------------------+-------------------------------------------------------
 6147988955199598785 |             1 |      16386 | r           | rdpg0     | host=10.244.2.2 port=5432 user=postgres dbname=pgbdr  |
 6147988955568627909 |             1 |      16386 | i           | rdpg1     | host=10.244.2.6 port=5432 user=postgres dbname=pgbdr  | host=10.244.2.2 port=5432 user=postgres dbname=pgbdr
 6147988955597271233 |             1 |      16386 | r           | rdpg3     | host=10.244.2.14 port=5432 user=postgres dbname=pgbdr | host=10.244.2.2 port=5432 user=postgres dbname=pgbdr
 6147988963356274882 |             1 |      16386 | r           | rdpg4     | host=10.244.2.18 port=5432 user=postgres dbname=pgbdr | host=10.244.2.2 port=5432 user=postgres dbname=pgbdr
 6147988963438956739 |             1 |      16386 | r           | rdpg2     | host=10.244.2.10 port=5432 user=postgres dbname=pgbdr | host=10.244.2.2 port=5432 user=postgres dbname=pgbdr
(5 rows)


Re: [GENERAL] Why does this SQL work?

2015-05-12 Thread Anil Menon
Thank you very much - looks like I will have to prefix all cols.

Regards
AK

On Tue, May 12, 2015 at 3:05 AM, Victor Yegorov  wrote:

> 2015-05-11 19:26 GMT+03:00 Anil Menon :
>
>> manualscan=> select count(*) From public.msgtxt where msgid in (select
>> msgid From ver736.courier where org_id=3);
>>  count
>> ---
>>  10225
>> (1 row)
>>
>> Please note, there is no msgid col in courier table. Which brings the
>> question why does this SQL work? An "select msgid From courier where
>> org_id=3" by itself gives error column "msgid" does not exist.
>>
>
> Because you can reference both inner and outer columns from the inner
> query.
> Here you're most likely referring to the outer `msgid` in the subquery.
>
> That's why it is always a good idea to prefix all your columns with tables
> aliases.
>
>
> --
> Victor Y. Yegorov
>
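A minimal illustration of the pitfall Victor describes, using the tables from
the thread:

```sql
-- courier has no msgid column, yet this runs: the unqualified msgid in
-- the subquery silently resolves to the outer table public.msgtxt.msgid,
-- turning the IN into a (likely unintended) correlated condition.
SELECT count(*) FROM public.msgtxt
WHERE msgid IN (SELECT msgid FROM ver736.courier WHERE org_id = 3);

-- Qualifying every column turns the silent mistake into an error:
SELECT count(*) FROM public.msgtxt m
WHERE m.msgid IN (SELECT c.msgid FROM ver736.courier c WHERE c.org_id = 3);
-- ERROR:  column c.msgid does not exist
```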


Re: [GENERAL] finding tables about to be vacuum freezed

2015-05-12 Thread Steve Kehlet
On Wed, May 6, 2015 at 7:24 PM Jeff Janes  wrote:

> I've booked-marked these but haven't really looked into them to any
> extent.  It would be awesome if you put the SQL one somewhere on
> http://wiki.postgresql.org.  That way it is easier to find, and anyone
> who finds it can contribute explanations, corrections, and update it to
> keep up with changes to the database.
>

I'll see if I can do this!


> I don't have any experience with 3.5TB databases, but I certainly think
> that that is something to worry about.
>

We did hit the autovacuum_freeze_max_age threshold on Saturday and the
autovacuumer has been running for days now, slowly cranking through each
table. Fortunately, I had autovacuum_vacuum_cost_delay set so the IO impact
isn't bad: no complaints from the customer, and our ops group says the IO
load is okay. So Postgres is just quietly doing its thing. This has clearly
happened numerous times before, and explains a few mysterious incidents in
the past where a nightly analyze script has hung for several days. It's
really great to understand this better now.


>  There are two main problems you are likely to encounter (from what I
> know):
>
> One is that the autovacuum scheduler deals poorly with a database
> exceeding autovacuum_freeze_max_age.  It forces all available autovacuum
> resources to be directed to that database, starving any other database of
> attention.  If you have multiple active databases, by the time one database
> has been frozen enough to no longer exceed autovacuum_freeze_max_age, the
> other one(s) might be horribly bloated.  If your cluster only has one
> active database in it, this won't be a problem.  The one that gets all the
> attention is the one that needs all the attention.  But if you have
> multiple active databases in your cluster, this could be a problem.
>

Fortunately in this case it's just one database, but good to know.


> The other problem is that autovac takes a fairly strong lock out on the
> table while it is vacuuming it.  Normally it relinquishes the lock once it
> realizes someone else is waiting on it.  But in the case of a forced
> full-table scan (either autovacuum_freeze_max_age or
> vacuum_freeze_table_age is exceeded), it refuses to relinquish the lock.
> This means that any process which needs a strong-ish table lock (add
> column, drop column, create index, drop index, cluster, truncate, reindex,
> etc.) is going to block for potentially a very very long time.  This is
> only a problem if you actually have such processes.  If all you do is
> select, insert, update, delete, none of those things will be blocked.
>

You're right, that was the exact problem that got me going down this path.
Some of our guys were doing a software update and some CREATE INDEX
operations it wanted to do were blocked by the autovacuumer. Fortunately,
we don't do software updates all the time, but it's falling onto me to
figure out how to make the autovacuumer not do its thing in the middle of
future software updates :-). I might do ALTER TABLES on all tables to
temporarily increase their autovacuum_freeze_max_age before, and undo it
after. Kind of hacky, but it should work.
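That hack might look like the following sketch ("big_table" is a placeholder,
and note one caveat: per-table autovacuum_freeze_max_age values larger than
the system-wide setting are ignored, so this only buys time if the global
setting is higher):

```sql
-- Before the software update: raise the per-table threshold so the
-- forced anti-wraparound autovacuum is less likely to start mid-update.
ALTER TABLE big_table SET (autovacuum_freeze_max_age = 500000000);

-- After the update: return to the server default.
ALTER TABLE big_table RESET (autovacuum_freeze_max_age);
```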



> So if you have lull time at night, it would be a good idea to preemptively
> vacuum tables approaching autovacuum_freeze_max_age (and also exceeding
> vacuum_freeze_table_age).  I wouldn't even do VACUUM FREEZE, just VACUUM.
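
A query in that spirit for picking the night's vacuum candidates (a sketch;
thresholds and LIMIT are arbitrary):

```sql
-- Tables whose oldest unfrozen xid is closest to the forced full-table
-- scan thresholds; vacuum the worst offenders during the lull.
SELECT c.oid::regclass               AS table_name,
       age(c.relfrozenxid)           AS xid_age,
       pg_size_pretty(pg_table_size(c.oid)) AS table_size
FROM pg_class c
WHERE c.relkind = 'r'
ORDER BY age(c.relfrozenxid) DESC
LIMIT 20;
```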
>

I'll investigate this. I found Josh Berkus' excellent articles, [Freezing
your tuples off](
http://www.databasesoup.com/2012/09/freezing-your-tuples-off-part-1.html)
and the nightly [Flexible Freeze](
https://github.com/pgexperts/flexible-freeze) program that came out of that.
I'll investigate using this so we can better control when this happens.
Although, given that the IO load even during production hours isn't making
anyone scream, I might just leave it alone.

> But be aware that, by default settings, autovac is highly throttled for IO,
> while regular vacuum is entirely unthrottled for IO. So if nighttime is not
> completely idle but only relatively less busy with user activity, you might
> want to evaluate what level of throttling is appropriate.
>

Thanks Jeff for all your help and insight.


Re: [GENERAL] finding tables about to be vacuum freezed

2015-05-12 Thread William Dunn
Hello Steve,

Great monitoring query (https://gist.github.com/skehlet/36aad599171b25826e82).
I suggest modifying the value "autovacuum_freeze_table_age" to
"LEAST(autovacuum_freeze_table_age,(0.95*autovacuum_freeze_max_age))
AS autovacuum_freeze_table_age" since PostgreSQL implicitly
limits vacuum_freeze_table_age to 95% of autovacuum_freeze_max_age (as
documented at:
http://www.postgresql.org/docs/current/static/runtime-config-client.html#GUC-VACUUM-FREEZE-TABLE-AGE
).

It might also be cool to add something like:
  (((2^32)/2)-1-1000000) AS wraparound_dbfreeze_age,
  ROUND(100.0 * age(relfrozenxid) / (((2^32)/2)-1-1000000)::numeric, 1) ||
'%' AS "% til wraparound db freeze"
to monitor how close it is getting to the point at which it gets close to
going into safety shutdown mode (as documented in
http://www.postgresql.org/docs/current/static/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND)
in case there is a problem with autovacuum (rare, but if you are already
running that query adding it is cheap and nice for completeness).

You have the count of rows there already, but it might also be nice (albeit
more computationally expensive) to add the size of the table so you can
evaluate if it's really worth doing a manual off-hours vacuum of the table
to avoid the scan happening during peak hours:
(pg_relation_size(pg_class.oid)) AS table_bytes


For your situation it might be good to set vacuum_freeze_min_age to a very
low value, which makes it more likely that your ordinary vacuums will
freeze rows and advance relfrozenxid, and to keep autovacuum tuned
aggressively (for bloat/space-reclamation purposes). This may favor many
short, cheap operations and help avoid the long, lock-heavy vacuum freeze
operations. Increasing autovacuum_max_workers will also help avoid having
the freeze operation of one table monopolize the autovacuum workers.
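
Settings in that direction might be sketched as follows (the values are
illustrative, not recommendations; ALTER SYSTEM requires 9.4+):

```sql
ALTER SYSTEM SET vacuum_freeze_min_age = '1000000';
ALTER SYSTEM SET autovacuum_max_workers = '6';  -- takes effect only after a server restart
SELECT pg_reload_conf();                        -- picks up vacuum_freeze_min_age
```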

*Will J. Dunn*
*willjdunn.com *

On Tue, May 12, 2015 at 12:51 PM, Steve Kehlet 
wrote:

> On Wed, May 6, 2015 at 7:24 PM Jeff Janes  wrote:
>
>> I've booked-marked these but haven't really looked into them to any
>> extent.  It would be awesome if you put the SQL one somewhere on
>> http://wiki.postgresql.org.  That way it is easier to find, and anyone
>> who finds it can contribute explanations, corrections, and update it to
>> keep up with changes to the database.
>>
>
> I'll see if I can do this!
>
>
>> I don't have any experience with 3.5TB databases, but I certainly think
>> that that is something to worry about.
>>
>
> We did hit the autovacuum_freeze_max_age threshold on Saturday and the
> autovacuumer has been running for days now, slowly cranking through each
> table. Fortunately, I had autovacuum_vacuum_cost_delay set so the IO impact
> isn't bad: no complaints from the customer, and our ops group says the IO
> load is okay. So Postgres is just quietly doing its thing. This has clearly
> happened numerous times before, and explains a few mysterious incidents in
> the past where a nightly analyze script has hung for several days. It's
> really great to understand this better now.
>
>
>>  There are two main problems you are likely to encounter (from what I
>> know):
>>
>> One is that the autovacuum scheduler deals poorly with a database
>> exceeding autovacuum_freeze_max_age.  It forces all available autovacuum
>> resources to be directed to that database, starving any other database of
>> attention.  If you have multiple active databases, by the time one database
>> has been frozen enough to no longer exceed autovacuum_freeze_max_age, the
>> other one(s) might be horribly bloated.  If your cluster only has one
>> active database in it, this won't be a problem.  The one that gets all the
>> attention is the one that needs all the attention.  But if you have
>> multiple active databases in your cluster, this could be a problem.
>>
>
> Fortunately in this case it's just one database, but good to know.
>
>
>> The other problem is that autovac takes a fairly strong lock out on the
>> table while it is vacuuming it.  Normally it relinquishes the lock once it
>> realizes someone else is waiting on it.  But in the case of a forced
>> full-table scan (either autovacuum_freeze_max_age or
>> vacuum_freeze_table_age is exceeded), it refuses to relinquish the lock.
>> This means that any process which needs a strong-ish table lock (add
>> column, drop column, create index, drop index, cluster, truncate, reindex,
>> etc.) is going to block for potentially a very very long time.  This is
>> only a problem if you actually have such processes.  If all you do is
>> select, insert, update, delete, none of those things will be blocked.
>>
>
> You're right, that was the exact problem that got me going down this path.
> Some of our guys were doing a software update and some CREATE INDEX
> operations it wanted to do were blocked by the autovacuumer. Fortunately,
> we don't do software updates all the time, but it's falling onto me to
> figure out how t

[GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread Melvin Davidson
Can anyone tell me why there is no "relcreated" column in pg_class to track
the creation date of an object?

It seems to me it would make sense to have one as it would facilitate
auditing of when objects are created. In addition, it would also facilitate
the dropping of objects that have exceeded a certain age.

EG: SELECT 'DROP TABLE ' || relname || ';'
  FROM pg_class
 WHERE relkind = 'r'
   AND relcreated < current_timestamp - INTERVAL '1 year';

Adding that column should be relatively easy and would not break backwards
compatibility with previous versions.
-- 
*Melvin Davidson*


Re: [GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread Alvaro Herrera
Melvin Davidson wrote:
> Can anyone tell me why there is no "relcreated" column in pg_class to track
> the creation date of an object?
> 
> It seems to me it would make sense to have one as it would facilitate
> auditing of when objects are created. In addition, it would also facilitate
> the dropping of objects that have exceeded a certain age.

But why -- you can implement that using event triggers.  See the
pg_event_trigger_ddl_commands() function in the docs.
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b488c580aef4e05f39be5daaab6464da5b22a494
http://www.postgresql.org/docs/devel/static/event-trigger-definition.html

... oh, the facility is only two days old, I forgot.
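
A sketch of what that could look like on a build with the new facility (the
table and function names here are invented for illustration):

```sql
-- Audit table plus an event trigger that stamps every DDL command's
-- created objects with a timestamp.
CREATE TABLE ddl_audit (
    command_tag     text,
    object_identity text,
    created_at      timestamptz DEFAULT now()
);

CREATE OR REPLACE FUNCTION log_ddl_create() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    -- pg_event_trigger_ddl_commands() reports each object affected by
    -- the command that fired the trigger.
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        INSERT INTO ddl_audit (command_tag, object_identity)
        VALUES (r.command_tag, r.object_identity);
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER track_creates ON ddl_command_end
    EXECUTE PROCEDURE log_ddl_create();
```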

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread Adrian Klaver

On 05/12/2015 12:51 PM, Melvin Davidson wrote:


Can anyone tell me why there is no "relcreated" column in pg_class to
track the creation date of an object?


So what date would it track?:

1) The date in the original database?
2) The date the table was restored to another database cluster?
3) The date it was replicated to a standby?
4) The date it went through a DROP TABLE IF EXISTS some_table, CREATE 
TABLE some_table cycle?


I could go on. I imagine that most people that want to track that sort 
of thing keep their schema definitions under version control and keep 
track of the dates there.




It seems to me it would make sense to have one as it would facilitate
auditing of when objects are created. In addition, it would also
facilitate the dropping of objects that have exceeded a certain age.


Now, that just scares me:)

That is often handled through partitioning:
www.postgresql.org/docs/9.4/static/ddl-partitioning.html

Otherwise I am not sure how an object being past a certain date equates 
to dropping it?





EG: SELECT 'DROP TABLE ' || relname || ';'
   FROM pg_class
  WHERE relkind = 'r'
AND relcreated < current_timestamp - INTERVAL '1 year';

Adding that column should be relatively easy and would not break
backwards compatibility with previous versions.
--
*Melvin Davidson*



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread Adrian Klaver

On 05/12/2015 12:51 PM, Melvin Davidson wrote:


Can anyone tell me why there is no "relcreated" column in pg_class to
track the creation date of an object?


Meant to add to my previous post: back before I 'discovered' version 
control, I used to put the creation date in the table COMMENT:


http://www.postgresql.org/docs/9.4/interactive/sql-comment.html
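
For example (table name invented):

```sql
-- Record the creation date by hand at CREATE time:
COMMENT ON TABLE invoices_2015 IS 'created 2015-05-12';

-- And read it back later:
SELECT obj_description('invoices_2015'::regclass, 'pg_class');
```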



It seems to me it would make sense to have one as it would facilitate
auditing of when objects are created. In addition, it would also
facilitate the dropping of objects that have exceeded a certain age.

EG: SELECT 'DROP TABLE ' || relname || ';'
   FROM pg_class
  WHERE relkind = 'r'
AND relcreated < current_timestamp - INTERVAL '1 year';

Adding that column should be relatively easy and would not break
backwards compatibility with previous versions.
--
*Melvin Davidson*



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread Melvin Davidson
Adrian,

You are overthinking this. An object is only "created" once! That is what
I meant by relcreatedate. If it is dropped, then it is deleted from the
catalogs. If it is modified, then it does NOT affect the creation date.
Everything else is superfluous.

It is also not unusual for tables to have an end of cycle in certain
application, hence the need to be dropped after a certain time. EG. Tables
that track data only for a specific year.

Since PostgreSQL already tracks when tables are vacuumed, auto-vacuumed,
analyzed and auto-analyzed ( pg_stat_all_tables ), I don't see why it is
such a big deal ( or so hard ) to track when an object is created. It
should be a very simple patch to the catalogs.

On Tue, May 12, 2015 at 6:00 PM, Adrian Klaver 
wrote:

> On 05/12/2015 12:51 PM, Melvin Davidson wrote:
>
>>
>> Can anyone tell me why there is no "relcreated" column in pg_class to
>> track the creation date of an object?
>>
>
> Meant to add to my previous post, back before I 'discovered' version
> control I use to put the creation date in the table COMMENT:
>
> http://www.postgresql.org/docs/9.4/interactive/sql-comment.html
>
>
>> It seems to me it would make sense to have one as it would facilitate
>> auditing of when objects are created. In addition, it would also
>> facilitate the dropping of objects that have exceeded a certain age.
>>
>> EG: SELECT 'DROP TABLE ' || relname || ';'
>>FROM pg_class
>>   WHERE relkind = 'r'
>> AND relcreated < current_timestamp - INTERVAL '1 year';
>>
>> Adding that column should be relatively easy and would not break
>> backwards compatibility with previous versions.
>> --
>> *Melvin Davidson*
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>



-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread Tom Lane
Melvin Davidson  writes:
> You are over thinking this. An object is only "created" once!

Yeah?  Would you expect that pg_dump followed by pg_restore would preserve
the original creation date?  What about pg_upgrade?

This has come up many times before, and we've always decided that it was
not as simple as it seems at first glance, and that it would be difficult
to satisfy all use-cases.  Try searching the archives for previous threads.

regards, tom lane




Re: [GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread Adrian Klaver

On 05/12/2015 03:44 PM, Melvin Davidson wrote:

Adrian,

You are over thinking this. An object is only "created" once! That is
what I meant by relcreatedate. If it is dropped, then it is deleted from
the catalogs. If it is modified, then it does NOT affect the creation
date. Everything else is superfluous.


See my original post and Tom Lane's response.



It is also not unusual for tables to have an end of cycle in certain
application, hence the need to be dropped after a certain time. EG.
Tables that track data only for a specific year.


Hence my link to the partitioning part of the manual.



Since PostgreSQL already tracks when tables are vacuum, auto vacuumed,
analyzed and auto analyzed ( pg_stat_all_tables ), I don't see why it is
such a big deal ( or so hard ) to track when an object is created. It
should be a very simple patch to the catalogs.


It is probably not a big deal to create a timestamp field and populate 
it. The issues arise when you start asking what it really means. The 
Postgres catalogs are not part of a dump file, so the data in them will 
not transfer when you restore to another database. So on restore the 
create date will be the date the table is restored, not the date the 
table was originally created. For some people that is okay, for others 
not okay.




On Tue, May 12, 2015 at 6:00 PM, Adrian Klaver
mailto:adrian.kla...@aklaver.com>> wrote:

On 05/12/2015 12:51 PM, Melvin Davidson wrote:


Can anyone tell me why there is no "relcreated" column in
pg_class to
track the creation date of an object?


Meant to add to my previous post, back before I 'discovered' version
control I use to put the creation date in the table COMMENT:

http://www.postgresql.org/docs/9.4/interactive/sql-comment.html


It seems to me it would make sense to have one as it would
facilitate
auditing of when objects are created. In addition, it would also
facilitate the dropping of objects that have exceeded a certain age.

EG: SELECT 'DROP TABLE ' || relname || ';'
FROM pg_class
   WHERE relkind = 'r'
 AND relcreated < current_timestamp - INTERVAL '1 year';

Adding that column should be relatively easy and would not break
backwards compatibility with previous versions.
--
*Melvin Davidson*



--
Adrian Klaver
adrian.kla...@aklaver.com 




--
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.



--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] [BDR] Node Join Question

2015-05-12 Thread Craig Ringer
On 12 May 2015 at 22:21, Wayne E. Seguin  wrote:

> Craig,
>
> It's alive!!!
>
> One more question on this thread, where can I find the meanings of
> node_status in the documentation?
>

node_status is really internal, but it's covered briefly in the docs:

http://bdr-project.org/docs/stable/catalog-bdr-nodes.html


Re: [GENERAL] [BDR] Node Join Question

2015-05-12 Thread Wayne E. Seguin
*awesome*, the question was for my own curiosity so thanks for this!
On Tue, May 12, 2015 at 18:27 Craig Ringer  wrote:

> On 12 May 2015 at 22:21, Wayne E. Seguin  wrote:
>
>> Craig,
>>
>> It's alive!!!
>>
>> One more question on this thread, where can I find the meanings of
>> node_status in the documentation?
>>
>
> node_status is really internal, but it's covered briefly in the docs:
>
> http://bdr-project.org/docs/stable/catalog-bdr-nodes.html
>


Re: [GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread Melvin Davidson
I thank everyone for their feedback regarding the omission of object
creation date from the catalog.

I do respect the various reasons for not including it, but I feel it is my
duty to draw out this issue a bit longer.

I would like to counter the argument that a restore from a dump will
override the create date.
That cannot happen.
If an object already exists, it cannot be created again. The worst case
scenario is that an object must be dropped due to some terrible corruption
or other disaster, in which case the creation date is not a major
consideration.

Further to the point, why is it that both Oracle and SQL Server _do_ have
the object create date in the catalogs?

http://stackoverflow.com/questions/4442323/how-to-find-out-when-a-particular-table-was-created-in-oracle

https://msdn.microsoft.com/en-us/library/ms190324.aspx

All I have heard so far is that the ONLY reason there is no object create
date in pg_class is because there is no general agreement as to what create
date means. Well I am giving it right now. When you execute the SQL
statement of the form

CREATE TABLE ...
CREATE INDEX ...
CREATE SEQUENCE ...
CREATE MATERIALIZED VIEW ...
CREATE TYPE ...
CREATE FOREIGN TABLE ...

then that is when clock_timestamp() should be recorded as relcreatedate or
relcreatetime.
Provided, of course, that the column is added to pg_class. :)

Is there some other overwhelming _technical_ reason that I am overlooking
that prevents this from being done?

On Tue, May 12, 2015 at 8:08 PM, Adrian Klaver 
wrote:

> On 05/12/2015 03:44 PM, Melvin Davidson wrote:
>
>> Adrian,
>>
>> You are over thinking this. An object is only "created" once! That is
>> what I meant by relcreatedate. If it is dropped, then it is deleted from
>> the catalogs. If it is modified, then it does NOT affect the creation
>> date. Everything else is superfluous.
>>
>
> See my original post and Tom Lane's response.
>
>
>> It is also not unusual for tables to have an end of cycle in certain
>> application, hence the need to be dropped after a certain time. EG.
>> Tables that track data only for a specific year.
>>
>
> Hence my link to the partitioning part of the manual.
>
>
>> Since PostgreSQL already tracks when tables are vacuum, auto vacuumed,
>> analyzed and auto analyzed ( pg_stat_all_tables ), I don't see why it is
>> such a big deal ( or so hard ) to track when an object is created. It
>> should be a very simple patch to the catalogs.
>>
>
> It is probably not a big deal to create a timestamp field and populate it.
> The issues arise when you start asking what it really means. The Postgres
> catalogs are not part of dump file, so the data in them will not transfer
> when you restore to another database. So on restore the create date will be
> the date the table is restored, not the date the table was originally
> created. For some people that is okay, for others not okay.
>
>
>> On Tue, May 12, 2015 at 6:00 PM, Adrian Klaver
>> mailto:adrian.kla...@aklaver.com>> wrote:
>>
>> On 05/12/2015 12:51 PM, Melvin Davidson wrote:
>>
>>
>> Can anyone tell me why there is no "relcreated" column in
>> pg_class to
>> track the creation date of an object?
>>
>>
>> Meant to add to my previous post: back before I 'discovered' version
>> control I used to put the creation date in the table COMMENT:
>>
>> http://www.postgresql.org/docs/9.4/interactive/sql-comment.html
>>
>>
>> It seems to me it would make sense to have one as it would
>> facilitate
>> auditing of when objects are created. In addition, it would also
>> facilitate the dropping of objects that have exceeded a certain
>> age.
>>
>> EG: SELECT 'DROP TABLE ' || relname || ';'
>> FROM pg_class
>>WHERE relkind = 'r'
>>  AND relcreated < current_timestamp - INTERVAL '1 year';
>>
>> Adding that column should be relatively easy and would not break
>> backwards compatiblity with previous versions.
>> --
>> *Melvin Davidson*
>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.kla...@aklaver.com 
>>
>>
>>
>>
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>



-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] [BDR] Node Join Question

2015-05-12 Thread Craig Ringer
On 13 May 2015 at 09:29, Wayne E. Seguin  wrote:

> *awesome*, the question was for my own curiosity so thanks for this!
>


No worries.

I know it's trite, but for internal-ish detail like that the best reference
remains the source code. I'd like to think the sources are fairly sane.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread Adrian Klaver

On 05/12/2015 06:33 PM, Melvin Davidson wrote:

I thank everyone for their feedback regarding the omission of object
creation date from the catalog.

I do respect the various reasons for not including it, but I feel it is
my duty to draw out this issue a bit longer.

I would like to counter the argument that a restore from a dump will
override the create date.
That cannot happen.


Yes it can. You are asking for a date field in a system catalog. The 
system catalog data is not included in the dump/restore cycle. So when 
you restore the schema objects to the 'new' database what do you propose 
to do with the create_date field? I see two options, use the restore 
time as the create_date or set it to NULL. In either case you have 
changed the date. The same holds for pg_upgrade, as Tom stated.



If an object already exists, it cannot be created again.


When you dump/restore, from the viewpoint of the new database instance, 
it is created again.



The worst case

scenario is that an object must be dropped due to some terrible
corruption or other disaster, in which case the creation date is not a
major consideration.

Further to the point, why is it that both Oracle and SQL Server _do_
have the object create date in the catalogs?

http://stackoverflow.com/questions/4442323/how-to-find-out-when-a-particular-table-was-created-in-oracle

https://msdn.microsoft.com/en-us/library/ms190324.aspx

All I have heard so far is that the ONLY reason there is no object
create date in pg_class is because there is no general agreement as to
what create date means. Well I am giving it right now. When you execute
the SQL statement of the form

CREATE TABLE ...
CREATE INDEX ...
CREATE SEQUENCE ...
CREATE MATERIALIZED VIEW ...
CREATE TYPE ...
CREATE FOREIGN TABLE ...

then that is when clock_timestamp() should be recorded as relcreatedate
or relcreatetime.
Providing, of course, that the column is added to pg_class. :)

Is there some other overwhelming _technical_ reason that I am
overlooking that prevents this from being done?


There is no technical reason. There is the 'camel nose under the tent' 
problem. The create_date gets added, then the petitions start for an 
update_date column and before you know it the move is on for an entire 
schema versioning system in the system catalogs. This is something that 
is already handled by other programs. What it comes down to is the old 
problem of time and money and where to spend either/or in the project. 
As Alvaro said there is some functionality on the horizon that will make 
this easier and I could see someone in the future rolling an extension 
that does this by creating an audit trail in a non-system table.
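The extension Adrian describes can already be sketched with event triggers. The following is a minimal, untested sketch: it assumes PostgreSQL 9.5+ (where pg_event_trigger_ddl_commands() is available), and the table and function names are made up for illustration.

```sql
-- Ordinary (non-system) table holding the audit trail.
CREATE TABLE object_created (
    objid       oid,
    object_type text,
    identity    text,
    created_at  timestamptz NOT NULL DEFAULT clock_timestamp()
);

-- Event trigger function: record every CREATE-ish DDL command.
CREATE OR REPLACE FUNCTION record_object_creation()
RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO object_created (objid, object_type, identity)
    SELECT objid, object_type, object_identity
    FROM pg_event_trigger_ddl_commands()
    WHERE command_tag LIKE 'CREATE %';
END;
$$;

CREATE EVENT TRIGGER track_creation
    ON ddl_command_end
    EXECUTE PROCEDURE record_object_creation();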




On Tue, May 12, 2015 at 8:08 PM, Adrian Klaver
mailto:adrian.kla...@aklaver.com>> wrote:

On 05/12/2015 03:44 PM, Melvin Davidson wrote:

Adrian,

You are over thinking this. An object is only "created" once!
That is
what I meant by relcreatedate. If it is dropped, then it is
deleted from
the catalogs. If it is modified, then it does NOT affect the
creation
date. Everything else is superfluous.


See my original post and Tom Lane's response.


It is also not unusual for tables to have an end of cycle in certain
application, hence the need to be dropped after a certain time. EG.
Tables that track data only for a specific year.


Hence my link to the partitioning part of the manual.


Since PostgreSQL already tracks when tables are vacuum, auto
vacuumed,
analyzed and auto analyzed ( pg_stat_all_tables ), I don't see
why it is
such a big deal ( or so hard ) to track when an object is
created. It
should be a very simple patch to the catalogs.


It is probably not a big deal to create a timestamp field and
populate it. The issues arise when you start asking what it really
means. The Postgres catalogs are not part of dump file, so the data
in them will not transfer when you restore to another database. So
on restore the create date will be the date the table is restored,
not the date the table was originally created. For some people that
is okay, for others not okay.


On Tue, May 12, 2015 at 6:00 PM, Adrian Klaver
mailto:adrian.kla...@aklaver.com>
>> wrote:

 On 05/12/2015 12:51 PM, Melvin Davidson wrote:


 Can anyone tell me why there is no "relcreated" column in
 pg_class to
 track the creation date of an object?


 Meant to add to my previous post, back before I
'discovered' version
 control I use to put the creation date in the table COMMENT:

http://www.postgresql.org/docs/9.4/interactive/sql-comment.html


 It seems to me it would make sense to have one as it would
  

Re: [GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread Brent Wood
Yep.

Still "created" once - instantiated repeatedly, but "created" once. Take 
federated metadata records: there is only one "original creation date", which is 
an explicit attribute of a record. Last copied, updated and edited are different.

Creation date can be when first entered into a spreadsheet, or written down... 
insert date pertains to "creation of the record as a database tuple", etc...

A replica can be copied - but that is a date this instance was created, not the 
original record.

One question - does an edit explicitly destroy the original object and create a 
new (child? linked?) object, or a modified version of the original? Answer 
"yeah/nah" - whichever you decide is correct for your use case - there no 
universal yes or no answer.

The real issue is confusion about what "created" means - for data audit 
tracking/provenance, etc - very important in best practice data mgmt in many 
domains - all these are dates representing different actions which can be 
defined & maintained - but by the user rather than the system (albeit often by 
triggers representing local business rules). Postgres has all the tools you 
need to implement whatever audit trails you need for create (when first written 
on a piece of paper), inserts, updates/edits, etc... but doing this in a 
standard way to meet all users' needs is a long-standing, unsolved & probably 
unsolvable issue.
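For the per-record dates described above, an ordinary column default plus a trigger is enough. A hedged sketch - table, function and trigger names are purely illustrative:

```sql
-- Per-row audit columns maintained by a local business rule.
CREATE TABLE observation (
    id          serial PRIMARY KEY,
    payload     text,
    created_at  timestamptz NOT NULL DEFAULT clock_timestamp(),
    updated_at  timestamptz
);

-- Stamp updated_at on every UPDATE; created_at is set once by the default.
CREATE OR REPLACE FUNCTION touch_updated_at()
RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    NEW.updated_at := clock_timestamp();
    RETURN NEW;
END;
$$;

CREATE TRIGGER observation_touch
    BEFORE UPDATE ON observation
    FOR EACH ROW EXECUTE PROCEDURE touch_updated_at();
```

Because these are ordinary user columns, they survive dump/restore as data, which is exactly what a system-catalog field could not guarantee.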


Brent Wood

Programme leader: Environmental Information Delivery
NIWA
DDI:  +64 (4) 3860529


From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] 
on behalf of Tom Lane [t...@sss.pgh.pa.us]
Sent: Wednesday, May 13, 2015 11:26 AM
To: Melvin Davidson
Cc: Adrian Klaver; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Why is there no object create date is the catalogs?

Melvin Davidson  writes:
> You are over thinking this. An object is only "created" once!

Yeah?  Would you expect that pg_dump followed by pg_restore would preserve
the original creation date?  What about pg_upgrade?

This has come up many times before, and we've always decided that it was
not as simple as it seems at first glance, and that it would be difficult
to satisfy all use-cases.  Try searching the archives for previous threads.

   regards, tom lane


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general





Re: [GENERAL] Why is there no object create date is the catalogs?

2015-05-12 Thread David G. Johnston
On Tue, May 12, 2015 at 6:33 PM, Melvin Davidson 
wrote:

> I thank everyone for their feedback regarding the omission of object
> creation date from the catalog.
>
> I do respect the various reasons for not including it, but I feel it is my
> duty to draw out this issue a bit longer.
>

> I would like to counter the argument that a restore from a dump will
> override the create date.
> That cannot happen.
>

You can restore a database into an empty (and newer) cluster.  This is
possible because instead of archiving and restoring the catalog the
restoration script recreates everything by issuing CREATE and INSERT/COPY
statements.  Perform an SQL dump and look at it if you need affirmation.

Now, saying that this field is marginalized for people who use
pg_dump/pg_restore instead of pg_upgrade is a viable decision but the use
cases put forth so far don't scream for something like this to exist.

Specifically, if an application has a requirement for something like this
then the application should take pains to manage it.  It can be placed into
a system catalog but if the system is not going to use the information then
it shouldn't be responsible for it.  I guess "comments" would be an
exception to this rule - but there you are dealing with constants that are
dumped and restored.


> Further to the point, why is it that both Oracle and SQL Server _do_ have
> the object create date in the catalogs?
>
>
> http://stackoverflow.com/questions/4442323/how-to-find-out-when-a-particular-table-was-created-in-oracle
>
> https://msdn.microsoft.com/en-us/library/ms190324.aspx
>

If a good "why" can be provided, maybe we'd be convinced to add the field...

Pondering the above I rather think to implement this as a "comment" table
but where there are two "comment" fields - one text and one timestamptz.
Like with comments pg_dump would treat this as user data to be included in
the dump and restored as-is.  During the restoration new "created" entries
would be added and so then there would be two.  Though at this point you
are basically saying the project should provide official storage and a set of
event triggers for the "CREATE" events.

Going that far it must be argued why -core should be responsible for such a
feature instead of interested parties maintaining it on PGXN.

In the end there is no technical reason to exclude the field but such a
field is arguably application data and should not be present on a system
catalog table.  Personally, I could see an argument for such information
being valuable during schema exploration - just like comments are.

The dump/restore problem should be solvable - just export "UPDATE pg_class
SET creationdate = '2015-05-12T00:00:00UTC'::timestamptz WHERE oid = xxx"
as part of the dump - just after the COMMENT ON statements.  New objects
will be created during restoration but then the old timestamp will replace
the newly assigned one.  Not that I've thought this through in great detail
- the event-based setup, with history maintained across dumps - definitely
is more appealing if quite a bit more work.

I don't see this field being an end of itself but something that would be
added if some other feature required it - thus basically making it a system
field instead of an application one...

David J.


Re: [GENERAL] Dry run through input function for a given built-in data type

2015-05-12 Thread David G. Johnston
On Tue, May 12, 2015 at 11:23 PM, Fabio Ugo Venchiarutti 
wrote:

> Is there any cleaner way to, say, only run the validation part of a type
> input function
> [...]
>

This presupposes that said type input function has a distinct validation
phase as opposed to simply performing its parse and failing when it
encounters something it cannot handle.

I would suggest you attempt to separate the non-context-aware stuff and
the context-aware stuff into two separate phases; only data that passes
the non-context-aware tests would then be bundled up into the full CTE
where you then check constraints and the like.  The "how" is beyond me, but
this is an open source project so you have access to all of the relevant
functions.  How you would go about using them without having to invoke the
engine I do not know but that seems like the only performant option since
the PostgreSQL executor isn't designed to handle your usage pattern.

Without considerably more understanding of the how suggestions are
difficult to make - even were I to know the innards of PostgreSQL and using
C.  Others more knowledgeable will likely chime in but I suspect that I'm
pretty close to the mark on this one.

Yes, you could make a regexp-based validation library...but that screams
"maintenance nightmare" to me.  I'm doubtful such a thing already exists.
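One commonly used middle ground, short of a regexp library: exercise the real input function inside a PL/pgSQL exception block, so a parse failure is trapped in a subtransaction instead of aborting the caller's work. This runs the full input function rather than only a "validation phase", and the helper below is purely illustrative (it handles simple, unqualified type names only, since %I identifier-quotes its argument):

```sql
-- Illustrative helper: does val parse as type_name?
CREATE OR REPLACE FUNCTION is_valid_input(val text, type_name text)
RETURNS boolean
LANGUAGE plpgsql AS $$
BEGIN
    -- %L literal-quotes the value, %I quotes the type name as an identifier.
    EXECUTE format('SELECT %L::%I', val, type_name);
    RETURN true;
EXCEPTION WHEN OTHERS THEN
    -- Any parse/range error from the input function lands here.
    RETURN false;
END;
$$;

-- SELECT is_valid_input('2015-02-30', 'date');  -- false
-- SELECT is_valid_input('42', 'integer');       -- true
```

Note that each failed call costs a subtransaction rollback, so this is not free at high volume, but it stays exactly in sync with the server's own parsing rules, which a hand-rolled regexp library never will.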

David J.