Re: [HACKERS] New Event Trigger: table_rewrite
On 16 November 2014 06:59, Michael Paquier wrote:
> Note that this patch has been submitted but there has been no real
> discussion around it. This seems a bit too fast to commit it, no?

Committing uncontentious patches at the end of the commitfest seems
normal, no?

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New Event Trigger: table_rewrite
On 16 November 2014 06:59, Michael Paquier wrote:

> 1) This patch is authorizing VACUUM and CLUSTER to use the event
> triggers ddl_command_start and ddl_command_end, but aren't those
> commands actually not DDLs but control commands?

I could go either way on that. I'm happy to remove those from this commit.

> 2) The documentation of src/sgml/event-trigger.sgml can be improved,
> particularly I think that the example function should use a maximum of
> upper-case letters for reserved keywords, and also this bit:
>     you're not allowed to rewrite the table foo
> should be rewritten to something like:
>     Rewrite of table foo not allowed
> 3) A typo, missing a plural:
>     provides two built-in event trigger helper functionS

I thought the documentation was very good, in comparison to most other
feature submissions. Given that this is one of the areas I moan about a
lot, that says something.

> 4) pg_event_trigger_table_rewrite_oid is able to return only one OID,
> which is the one of the table being rewritten, and it is limited to
> one OID because VACUUM, CLUSTER and ALTER TABLE can only run on one
> object at the same time in a single transaction. What about thinking
> that we may have in the future multiple objects rewritten in a single
> transaction, hence multiple OIDs could be fetched?

Why would this API support something which the normal trigger API
doesn't, just in case we support a feature that has never been proposed
or discussed? Why can't such a change wait until that feature arrives?

> 5) parsetree is passed to cluster_rel only for
> EventTriggerTableRewrite. I am not sure if there are any extensions
> using cluster_rel as is, but wouldn't it make more sense to call
> EventTriggerTableRewrite before the calls to cluster_rel instead? ISTM
> that this patch is breaking cluster_rel's way of doing things.

I will remove the calls for CLUSTER and VACUUM as proposed above.

> 6) in_table_rewrite seems unnecessary.
> typedef struct EventTriggerQueryState
> {
>     slist_head  SQLDropList;
>     bool        in_sql_drop;
> +   bool        in_table_rewrite;
> +   Oid         tableOid;
> We could simplify that by renaming tableOid to rewriteTableOid or
> rewriteObjOid and checking if its value is InvalidOid to determine if
> the event table_rewrite is in use or not. Each code path setting those
> variables sets them all the time similarly:
> +   state->in_table_rewrite = false;
> +   state->tableOid = InvalidOid;
> And if tableOid is InvalidOid, in_table_rewrite is false. If it is a
> valid Oid, in_table_rewrite is set to true.

Well, that seems a minor change. I'm happy to accept the original
coding, but also happy to receive suggested changes.

> 7) table_rewrite is kicked in ALTER TABLE only when ATRewriteTables is
> used. The list of commands that actually go through this code path
> should be clarified in the documentation IMO to help the user
> understand this function.

That is somewhat orthogonal to the patch. The rules for rewriting are
quite complex, which is why this is needed and why documentation isn't
really the answer. A separate doc patch on that would be welcome.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review of Refactoring code for sync node detection
On 31 October 2014 00:27, Michael Paquier wrote:
> On Fri, Oct 31, 2014 at 6:59 AM, Jim Nasby wrote:
>>
>> If we stick with this version I'd argue it makes more sense to just stick
>> the sync_node = and priority = statements into the if block and ditch the
>> continue.
>
> Let's go with the cleaner version then, I'd prefer code that can be read
> easily for this code path. Switching back is not much complicated either.

I like where you are headed with this feature set. I will do my best to
comment and review so we get something in 9.5.

Some review comments

* The new function calls them a Node rather than a Standby. That is a
good change, but it needs to be applied consistently, so we no longer
use the word Standby anywhere near the sync rep code. I'd prefer if we
continue to use the abbreviation sync for synchronous, since it's less
typing and avoids spelling mistakes in code and comments.

* Rationale...

Robert Haas wrote
> Interestingly, the syncrep code has in some of its code paths the idea
> that a synchronous node is unique, while other code paths assume that
> there can be multiple synchronous nodes. If that's fine I think that
> it would be better to just make the code multiple-sync-node aware, by
> having a single function call in walsender.c and syncrep.c that
> returns an integer array of WAL sender positions (WalSndCtl), as that
> seems more extensible long-term.

I thought this was the rationale for the patch, but it doesn't do this.
If you disagree with Robert, it would be best to say so now, since I
would guess he's expecting the patch to be doing as he asked. Or
perhaps I misunderstand.

* Multiple sync standbys were originally avoided because of the code
cost and complexity at ReleaseWaiters(). I'm very keen to keep the code
at that point very robust, so simplicity is essential. It would also be
useful to have some kind of micro-benchmark that allows us to
understand the overhead of various configs. Jim mentions he isn't sure
of the performance there.
I don't see too much of a problem with this patch, but the later
patches will require this, so we may as well do this soon.

* We probably don't need a comment like /* Cleanup */ inserted above a
pfree. ;-)

I see this should get committed in the next few days, even outside the CF.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Review of Refactoring code for sync node detection
Thanks for the comments!

On Sun, Nov 16, 2014 at 8:32 PM, Simon Riggs wrote:
> On 31 October 2014 00:27, Michael Paquier wrote:
>> On Fri, Oct 31, 2014 at 6:59 AM, Jim Nasby wrote:
>>>
>>> If we stick with this version I'd argue it makes more sense to just stick
>>> the sync_node = and priority = statements into the if block and ditch the
>>> continue.
>>
>> Let's go with the cleaner version then, I'd prefer code that can be read
>> easily for this code path. Switching back is not much complicated either.
>
> I like where you are headed with this feature set. I will do my best
> to comment and review so we get something in 9.5.
>
> Some review comments
>
> * The new function calls them a Node rather than a Standby. That is a
> good change, but it needs to be applied consistently, so we no longer
> use the word Standby anywhere near the sync rep code. I'd prefer if we
> continue to use the abbreviation sync for synchronous, since it's less
> typing and avoids spelling mistakes in code and comments.

Yes, I guess this makes sense.

> * Rationale...
> Robert Haas wrote
>> Interestingly, the syncrep code has in some of its code paths the idea
>> that a synchronous node is unique, while other code paths assume that
>> there can be multiple synchronous nodes. If that's fine I think that
>> it would be better to just make the code multiple-sync-node aware, by
>> having a single function call in walsender.c and syncrep.c that
>> returns an integer array of WAL sender positions (WalSndCtl), as that
>> seems more extensible long-term.
>
> I thought this was the rationale for the patch, but it doesn't do
> this. If you disagree with Robert, it would be best to say so now,
> since I would guess he's expecting the patch to be doing as he asked.
> Or perhaps I misunderstand.
This comment is originally from me :)
http://www.postgresql.org/message-id/CAB7nPqTtmSo0YX1_3_PykpKbdGUi3UeFd0_=2-6vrqo5n_t...@mail.gmail.com

Changing from my older opinion, I wrote the patch on this thread with
the idea in mind of keeping the code as simple as possible, so for now
it is better to still assume that we have a single sync node. Let's
work on the multiple-node thing once we have a better spec of how to do
it, visibly using a dedicated micro-language within s_s_names.

> * Multiple sync standbys were originally avoided because of the code
> cost and complexity at ReleaseWaiters(). I'm very keen to keep the
> code at that point very robust, so simplicity is essential. It would
> also be useful to have some kind of micro-benchmark that allows us to
> understand the overhead of various configs. Jim mentions he isn't sure
> of the performance there. I don't see too much of a problem with this
> patch, but the later patches will require this, so we may as well do
> this soon.

Yes, I have been playing with that with the patch series of the
previous commitfest, with something not that much kludgy.

> * We probably don't need a comment like /* Cleanup */ inserted above a
> pfree. ;-)

Sure. I'll send an updated patch tomorrow.

Regards,
--
Michael
Re: [HACKERS] Failback to old master
Hi,

On Sat, Nov 15, 2014 at 5:31 PM, Maeldron T. wrote:
>> A safely shut down master (-m fast is safe) can be safely restarted as
>> a slave to the newly promoted master. Fast shutdown shuts down all
>> normal connections, does a shutdown checkpoint and then waits for this
>> checkpoint to be replicated to all active streaming clients. Promoting a
>> slave to master creates a timeline switch, which prior to version 9.3
>> was only possible to replicate using the archive mechanism. As of
>> version 9.3 you don't need to configure archiving to follow timeline
>> switches; just add a recovery.conf to the old master to start it up as
>> a slave and it will fetch everything it needs from the new master.
>
> I took your advice and I understood that removing the recovery.conf followed
> by a restart is wrong. I will not do that on my production servers.
>
> However, I can't make it work with promotion. What did I do wrong? It was
> 9.4beta3.
>
> mkdir 1
> mkdir 2
> initdb -D 1/
> max_wal_senders=7, wal_keep_segments=100, uncomment replication in hba.conf>
> pg_ctl -D 1/ start
> createdb -p 5433
> psql -p 5433
> pg_basebackup -p 5433 -R -D 2/
> mcedit 2/postgresql.conf
> chmod -R 700 1
> chmod -R 700 2
> pg_ctl -D 2/ start
> psql -p 5433
> psql -p 5434
>
> pg_ctl -D 1/ stop
> pg_ctl -D 2/ promote
> psql -p 5434
> cp 2/recovery.done 1/recovery.conf
> mcedit 1/recovery.conf
> pg_ctl -D 1/ start
>
> LOG:  replication terminated by primary server
> DETAIL:  End of WAL reached on timeline 1 at 0/3000AE0.
> LOG:  restarted WAL streaming at 0/300 on timeline 1
> LOG:  replication terminated by primary server
> DETAIL:  End of WAL reached on timeline 1 at 0/3000AE0.
>
> This is what I experienced in the past when I tried with promote. The old
> master disconnects from the new. What am I missing?

I think you have to add recovery_target_timeline = '2' in
recovery.conf, with '2' being the new primary timeline.
cf http://www.postgresql.org/docs/9.4/static/recovery-target-settings.html

Didier
Re: [HACKERS] alternative model for handling locking in parallel groups
On Fri, Nov 14, 2014 at 12:03 PM, Andres Freund wrote:
> Note that you'd definitely not want to do this naively - currently
> there's baked-in assumptions in the vacuum code that only one backend
> is doing parts of it.
>
> I think there's

Did something you intended get left out here?

>> 4. Parallel query on a locked relation. Parallel query should work on
>> a table created in the current transaction, or one explicitly locked
>> by user action. It's not acceptable for that to just randomly
>> deadlock, and skipping parallelism altogether, while it'd probably be
>> acceptable for a first version, is not going to be a good long-term
>> solution.
>
> FWIW, I think it's perfectly acceptable to refuse to work in parallel in
> that scenario. And not just for now.

I don't agree with that, but my point is that I think that fixing it so
it works is probably no more work than detecting that it isn't going to
work, whether the specific proposal in this email pans out or not.

> The biggest argument I can see to that is parallel index creation.
>
>> It also sounds buggy and fragile for the query planner to
>> try to guess whether the lock requests in the parallel workers will
>> succeed or fail when issued.
>
> I don't know. Checking whether we hold a self-exclusive lock on that
> relation doesn't sound very problematic to me.

That seems like a gross oversimplification of what we need to check.
For example, suppose we want to do a parallel scan. We grab
AccessShareLock. Now, another backend waits for AccessExclusiveLock. We
start workers, who all try to get AccessShareLock. They wait behind the
AccessExclusiveLock, while, outside the view of the current lock
manager, we wait for them. No process in our group acquired more than
AccessShareLock, yet we've got problems. We need a simple and elegant
way to avoid this kind of situation.

>> After thinking about these cases for a bit, I came up with a new
>> possible approach to this problem.
>> Suppose that, at the beginning of
>> parallelism, when we decide to start up workers, we grant all of the
>> locks already held by the master to each worker (ignoring the normal
>> rules for lock conflicts). Thereafter, we do everything the same as
>> now, with no changes to the deadlock detector. That allows the lock
>> conflicts to happen normally in the first two cases above, while
>> preventing the unwanted lock conflicts in the second two cases.
>
> I don't think that's safe enough. There's e.g. a reason why ANALYZE
> requires SUE lock. It'd definitely not be safe to simply grant the
> worker a SUE lock, just because the master backend already analyzed it
> or something like that. You could end up with the master and worker
> backends ANALYZEing the same relation.

Well, in the first version of this, I expect to prohibit parallel
workers from doing any DML or DDL whatsoever - they will be strictly
read-only. So you won't have that problem. Now, eventually, we might
relax that, but I would expect that a prohibition on the workers
starting new utility commands while in parallel mode wouldn't be very
high on anyone's list of restrictions to relax.

> That said, I can definitely see use for an infrastructure where we
> explicitly can grant another backend a lock that'd conflict with one
> we're already holding. But I think it'll need to be more explicit than
> just granting all currently held locks at the "highest" current
> level. And I think it's not necessarily going to be granting them all
> the locks at their current levels.
>
> What I'm thinking of is basically adding a step to execMain.c:InitPlan()
> that goes through the locks and grants the child process all the locks
> that are required for the statement to run. But not possibly preexisting
> higher locks.

This doesn't actually solve the problem, because we can be incidentally
holding locks on other relations - system catalogs, in particular -
that will cause the child processes to fail.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Failback to old master
On 16/11/14 13:13, didier wrote:
> I think you have to add recovery_target_timeline = '2' in
> recovery.conf, with '2' being the new primary timeline.
> cf http://www.postgresql.org/docs/9.4/static/recovery-target-settings.html

Thank you. Based on the link I have added:

recovery_target_timeline = 'latest'

And now it works.
Re: [HACKERS] Index scan optimization
On 30 October 2014 08:43, Haribabu Kommi wrote:
> I marked the patch as ready for committer.

This looks very interesting. The value of the patch seems to come from
skipping costly checks. Your performance test has a leading VARCHAR
column, so in that specific case skipping the leading column is a big
win. I don't see that it would be in all cases.

Can we see a few tests on similar things to check for other
opportunities and regressions?

* Single-column bigint index
* Multi-column bigint index
* 5-column index with mixed text and integers

The explanatory comments need some work to more clearly explain what
this patch does.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] Order of views in stats docs
On Sun, Nov 9, 2014 at 3:46 PM, Magnus Hagander wrote:
> On Thu, Nov 6, 2014 at 3:01 PM, Peter Eisentraut wrote:
>> On 11/6/14 6:16 AM, Magnus Hagander wrote:
>>> Another thought I had in that case is maybe we need to break out the
>>> pg_stat_activity and pg_stat_replication views into their own table.
>>> They are really the only two views that are different in a lot of
>>> ways. Maybe call the splits "session statistics views" and "object
>>> statistics views", instead of just "standard statistics views"? The
>>> difference being that they show snapshots of *right now*, whereas all
>>> the other views show accumulated statistics over time.
>>
>> Yeah, splitting this up a bit would help, too.
>
> Here's an initial run of this. It still feels wrong that we have the
> dynamic views under "the statistics collector". Perhaps long-term we
> should look at actually splitting them out to a completely different
> sect1?
>
> I only fixed the obvious parts in this - the split out, and the moving
> of pg_stat_database_conflicts. But it's a first step at least.
>
> Objections?

Hearing no objections, I've pushed this. There's more to do, but it's
a start.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Re: [HACKERS] [GSoC2014] Patch ALTER TABLE ... SET LOGGED
* Alvaro Herrera wrote:
> Michael Paquier wrote:
>> Btw, perhaps this diff should be pushed as a different patch as this
>> is a rather different thing:
>> -   if (heapRelation->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
>> +   if (indexRelation->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
>>         !smgrexists(indexRelation->rd_smgr, INIT_FORKNUM)
>
> As this is a correctness fix, it does not seem necessary to back-patch
> it: the parent relation always has the same persistence as its indexes.
> There was an argument for doing it this way that only applies if this
> patch went in, but I can't remember now what it was.
>
> Anyway I pushed the patch after some slight additional changes. Thanks!

The buildfarm says -DCLOBBER_CACHE_ALWAYS does not like this patch.

--
Christian
Re: [HACKERS] logical decoding - reading a user catalog table
On 11/13/2014 02:44 PM, Andres Freund wrote:
> Hi Steve,
>
> If it still happens, could you send me instructions of how to reproduce
> the problem after cloning the necessary source repositories? It's quite
> hard to validate a possible fix otherwise.

1. Install PG 9.4

2. Perform an initdb

max_connections = 200
wal_level = logical
max_wal_senders = 50
wal_keep_segments = 100
wal_sender_timeout = 600s
max_replication_slots = 120

Create a user slon with superuser.
Create a user test.
(Set passwords for them; you'll use them later.)
Configure pg_hba.conf to allow slon to connect as a replication user.

3. Download slony from github (https://github.com/ssinger/slony1-engine.git)
   checkout the branch logical_remote_helper
   ./configure --with-pgconfigdir=/path_to_your_pgcbindir
   make
   make install

4. Download & compile clustertest from
   (https://github.com/clustertest/clustertest-framework).

5. In the slony source tree, cd to the clustertest directory.

6. cp conf/disorder.properties.sample conf/disorder.properties
   Edit the file to have the correct paths, ports, users, passwords.

7. cp conf/java.properties.sample conf/java.properties
   Edit the file to point at the JAR you downloaded earlier.

I run with all 5 databases on the same PG cluster. Performance will be
much better with 5 different clusters, but I recommend trying to
emulate my configuration as much as possible to replicate this.

To run the tests, then do
./run_all_disorder_tests.sh

At the moment (commit df5acfd1a3) it is configured to just run the
Failover node test cases where I am seeing this, not the entire suite.
It typically takes between 30 minutes and an hour to run, though.

I installed things following the above steps on a different system than
my usual development laptop and I have been unable to reproduce the
error so far (on that system). But I am still able to reproduce it on
occasion on my normal development laptop.
> Greetings,
>
> Andres Freund
Re: [HACKERS] printing table in asciidoc with psql
Hi

2014-11-07 22:37 GMT+01:00 Alvaro Herrera:
> I did \o /tmp/tst, then
> \dS
> create table "eh | oh" ();
> \dS
>
> and then filtered the output file to HTML. The CREATE TABLE tag ended
> up in the same line as the title of the following table. I think
> there's a newline missing somewhere.
>
> The good news is that the | in the table name was processed correctly;
> but I noticed that the table title is not using the escaped print, so I
> would imagine that if I put a | in the title, things would go wrong.
> (I don't know how to put titles on tables other than editing the
> hardcoded titles for \ commands in psql).
>
> Another thing is that spaces around the | seem gratuitous and produce
> bad results. I tried "select * from pg_class" which results in a very
> wide table, and then the HTML output contains some cells with the values
> in the second line; this makes all rows taller than they must be,
> because some cells use the first line and other cells in the same row
> use the second line for the text. I hand-edited the asciidoc and
> removed the spaces around | which makes the result nicer. (Maybe
> removing the trailing space is enough.)

I see trailing spaces, but I don't see the described effect. Please,
can you send some more specific test case?

I fixed the status view and removed trailing spaces.

Regards

Pavel

> --
> Álvaro Herrera                http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services

commit 76e033f3a287115f15f249c657b544c148e2efd2
Author: Pavel Stehule
Date:   Sun Nov 16 23:38:46 2014 +0100

    every table is in own paragraph

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index e7fcc73..cd64b88 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2092,8 +2092,8 @@ lo_import 152801
           aligned, wrapped,
           html, latex (uses tabular),
-          latex-longtable, or
-          troff-ms.
+          latex-longtable,
+          troff-ms, or asciidoc.
           Unique abbreviations are allowed.
           (That would mean one letter is enough.)
@@ -2120,7 +2120,8 @@ lo_import 152801
           The html,
           latex,
-          latex-longtable, and troff-ms
+          latex-longtable, troff-ms,
+          and asciidoc
           formats put out tables that are intended to be included in documents
           using the respective mark-up language. They are not complete
           documents! This might not be
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 26089352..e00e47b 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -2248,6 +2248,9 @@ _align2string(enum printFormat in)
 		case PRINT_TROFF_MS:
 			return "troff-ms";
 			break;
+		case PRINT_ASCIIDOC:
+			return "asciidoc";
+			break;
 	}
 	return "unknown";
 }
@@ -2321,9 +2324,11 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 			popt->topt.format = PRINT_LATEX_LONGTABLE;
 		else if (pg_strncasecmp("troff-ms", value, vallen) == 0)
 			popt->topt.format = PRINT_TROFF_MS;
+		else if (pg_strncasecmp("asciidoc", value, vallen) == 0)
+			popt->topt.format = PRINT_ASCIIDOC;
 		else
 		{
-			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, latex, troff-ms\n");
+			psql_error("\\pset: allowed formats are unaligned, aligned, wrapped, html, latex, troff-ms, asciidoc\n");
 			return false;
 		}
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 66d80b5..ea8b9c1 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -858,6 +858,10 @@ PrintQueryStatus(PGresult *results)
 		html_escaped_print(PQcmdStatus(results), pset.queryFout);
 		fputs("\n", pset.queryFout);
 	}
+	else if (pset.popt.topt.format == PRINT_ASCIIDOC)
+	{
+		fprintf(pset.queryFout, "\n%s\n", PQcmdStatus(results));
+	}
 	else
 		fprintf(pset.queryFout, "%s\n", PQcmdStatus(results));
 }
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index ae5fe88..b14b313 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -351,7 +351,7 @@ helpVariables(unsigned short int pager)
 	fprintf(output, _("  expanded (or x)    toggle expanded output\n"));
 	fprintf(output, _("  fieldsep           field separator for unaligned output (default '|')\n"));
 	fprintf(output, _("  fieldsep_zero      set field separator in unaligned mode to zero\n"));
-	fprintf(output, _("  format             set output format [unaligned, aligned, wrapped, html, latex, ..]\n"));
+	fprintf(output, _("  format             set output format [unaligned, aligned, wrapped, html, latex, asciidoc ..]\n"));
 	fprintf(output, _("  footer             enable or disable display of the table footer [on, off]\n"));
 	fprintf(output, _("  linestyle          set the border line drawing style [ascii, old-ascii, unicode]\n"));
 	fprintf(output, _("  null               set the string to be printed in place of a null value\n"));
diff
Re: [HACKERS] PostgreSQL doesn't stop properly when --slot option is specified with pg_receivexlog.
On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier wrote:
> On Sat, Nov 15, 2014 at 3:42 AM, Andres Freund wrote:
>> On 2014-11-15 03:25:16 +0900, Fujii Masao wrote:
>>> On Fri, Nov 14, 2014 at 7:22 PM, wrote:
>>>> "pg_ctl stop" doesn't work properly, if --slot option is specified,
>>>> when WAL is flushed only when it has switched.
>>>> These processes still continue even after the postmaster
>>>> failed: pg_receivexlog, walsender and logger.
>>>
>>> I could reproduce this problem. At normal shutdown, walsender keeps
>>> waiting for the last WAL record to be replicated and flushed in
>>> pg_receivexlog. But pg_receivexlog issues the sync command only when
>>> the WAL file is switched. Thus, since pg_receivexlog may never flush
>>> the last WAL record, walsender may have to keep waiting infinitely.
>>
>> Right.
>
> It is surprising that nobody complained about that before,
> pg_receivexlog has been released two years ago.

It's not so surprising because the problem can happen only when a
replication slot is specified, i.e., the version is 9.4 or later.

>>> pg_recvlogical handles this problem by calling fsync() when it
>>> receives the request of immediate reply from the server. That is, at
>>> shutdown, walsender sends the request, pg_receivexlog receives it,
>>> flushes the last WAL record, and sends the flush location back to the
>>> server. Since walsender can see that the last WAL record is
>>> successfully flushed in pg_receivexlog, it can exit cleanly.
>>>
>>> One idea for the problem is to introduce the same logic as
>>> pg_recvlogical has, to pg_receivexlog. Thoughts?
>>
>> Sounds sane to me. Are you looking into doing that?
>
> Yep, sounds like a good thing to do if the master requested an answer
> from the client in the keepalive message. Something like the patch
> attached would make the deal.

Isn't it better to do this only when a replication slot is used?
Regards,

--
Fujii Masao
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 13 November 2014 15:35, Amit Kapila wrote:

> As mentioned by you offlist that you are not able to reproduce this
> issue, I have tried again and what I observe is that I am able to
> reproduce it only on *release* build and some cases work without
> this issue as well, for example:
>
> ./vacuumdb --analyze-in-stages -t t1 -t t2 -t t3 -t t4 -t t5 -t t6 -t t7 -t t8 -j 8 -d postgres
> Generating minimal optimizer statistics (1 target)
> Generating medium optimizer statistics (10 targets)
> Generating default (full) optimizer statistics
>
> So to me, it looks like this is a timing issue, and please notice
> why in the error the statement looks like
> "ANALYZE ng minimal optimizer statistics (1 target)". I think this
> is not a valid statement.
>
> Let me know if you still could not reproduce it.

Thank you for looking into it once again. I have tried with the release
mode, but could not reproduce the same.

Looking at the server LOG sent by you, "ANALYZE ng minimal optimizer
statistics (1 target)" seems like some corruption. So actually it looks
like there are two issues here:

1. The query string sent to the server is getting corrupted.
2. The client is crashing.

I will review the code and try to find the cause; meanwhile, if you can
find some time to debug this, it will be really helpful.

Regards,
Dilip
Re: [HACKERS] PostgreSQL doesn't stop properly when --slot option is specified with pg_receivexlog.
On Mon, Nov 17, 2014 at 10:02 AM, Fujii Masao wrote: > On Sat, Nov 15, 2014 at 9:10 PM, Michael Paquier > wrote: >> Yep, sounds a good thing to do if master requested answer from the >> client in the keepalive message. Something like the patch attached >> would make the deal. > > Isn't it better to do this only when replication slot is used? Makes sense. What about a check using reportFlushPosition then? -- Michael diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c index c6c90fb..b426cfe 100644 --- a/src/bin/pg_basebackup/receivelog.c +++ b/src/bin/pg_basebackup/receivelog.c @@ -149,6 +149,34 @@ open_walfile(XLogRecPtr startpoint, uint32 timeline, char *basedir, } /* + * Flush the current WAL file to disk and update the last WAL flush + * positions as well as the last fsync timestamp. On failure, print + * a message to stderr and return false, otherwise return true. + */ +static bool +fsync_walfile(XLogRecPtr blockpos, int64 timestamp) +{ + /* nothing to do if no WAL file open */ + if (walfile == -1) + return true; + + /* nothing to do if data has already been flushed */ + if (blockpos <= lastFlushPosition) + return true; + + if (fsync(walfile) != 0) + { + fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"), +progname, current_walfile_name, strerror(errno)); + return false; + } + + lastFlushPosition = blockpos; + last_fsync = timestamp; + return true; +} + +/* * Close the current WAL file (if open), and rename it to the correct * filename if it's complete. On failure, prints an error message to stderr * and returns false, otherwise returns true. @@ -787,21 +815,12 @@ HandleCopyStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, * If fsync_interval has elapsed since last WAL flush and we've written * some WAL data, flush them to disk. 
 	 */
-	if (lastFlushPosition < blockpos &&
-		walfile != -1 &&
-		((fsync_interval > 0 &&
-		  feTimestampDifferenceExceeds(last_fsync, now, fsync_interval)) ||
-		 fsync_interval < 0))
+	if ((fsync_interval > 0 &&
+		 feTimestampDifferenceExceeds(last_fsync, now, fsync_interval)) ||
+		fsync_interval < 0)
 	{
-		if (fsync(walfile) != 0)
-		{
-			fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
-					progname, current_walfile_name, strerror(errno));
+		if (!fsync_walfile(blockpos, now))
 			goto error;
-		}
-
-		lastFlushPosition = blockpos;
-		last_fsync = now;
 	}
 
 /*
@@ -999,7 +1018,6 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
 {
 	int			pos;
 	bool		replyRequested;
-	int64		now;
 
 	/*
 	 * Parse the keepalive message, enclosed in the CopyData message.
@@ -1021,7 +1039,15 @@ ProcessKeepaliveMsg(PGconn *conn, char *copybuf, int len,
 	/* If the server requested an immediate reply, send one. */
 	if (replyRequested && still_sending)
 	{
-		now = feGetCurrentTimestamp();
+		int64		now = feGetCurrentTimestamp();
+
+		/*
+		 * Flush data so that a fresh flush position is sent, but only if
+		 * a flush position needs to be reported.
+		 */
+		if (reportFlushPosition && !fsync_walfile(blockpos, now))
+			return false;
+
 		if (!sendFeedback(conn, blockpos, now, false))
 			return false;
 		*last_status = now;
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BRIN indexes - TRAP: BadArgument
On Sat, Nov 8, 2014 at 1:26 AM, Alvaro Herrera wrote:
>
> I just pushed this, after some more minor tweaks. Thanks, and please do
> continue testing!

A few typos and a few questions:

1.
 * range. Need to an extra flag in mmtuples for that.
 */
Datum
brinbulkdelete(PG_FUNCTION_ARGS)

Doesn't the part of the comment referring to *mmtuples* require some change? I think mmtuples was used in the initial version of the patch.

2.
/* ---
 * mt_info is laid out in the following fashion:
 *
 * 7th (high) bit: has nulls
 * 6th bit: is placeholder tuple
 * 5th bit: unused
 * 4-0 bit: offset of data
 * ---
 */
uint8 bt_info;
} BrinTuple;

Here in the comment, bt_info is referred to as mt_info.

3.
/*
 * t_info manipulation macros
 */
#define BRIN_OFFSET_MASK 0x1F

I think in the above comment it should be bt_info rather than t_info.

4.
static void
revmap_physical_extend(BrinRevmap *revmap)
{
..
..
	START_CRIT_SECTION();
	/* the rm_tids array is initialized to all invalid by PageInit */
	brin_page_init(page, BRIN_PAGETYPE_REVMAP);
	MarkBufferDirty(buf);
	metadata->lastRevmapPage = mapBlk;
	MarkBufferDirty(revmap->rm_metaBuf);
..
}

Can't we update revmap->rm_lastRevmapPage along with metadata->lastRevmapPage?

5.
typedef struct BrinMemTuple
{
	bool bt_placeholder; /* this is a placeholder tuple */
	BlockNumber bt_blkno; /* heap blkno that the tuple is for */
	MemoryContext bt_context; /* memcxt holding the dt_column values */
..
}

How is this memory context getting used? I can see that it is used in brin_deform_tuple(), which gets called from three places in core code (bringetbitmap(), brininsert() and union_tuples()), and in all three places there is already another temporary memory context used to avoid any form of memory leak.

6. Is there any way to force BRIN index use off, like SET enable_indexscan = off? If not, do we need one, as it is present for other types of scans?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] Review of Refactoring code for sync node detection
On Sun, Nov 16, 2014 at 9:07 PM, Michael Paquier wrote:
> I'll send an updated patch tomorrow.

Here are updated versions. I divided the patch into two for clarity. The first one is the actual refactoring patch, renaming SyncRepGetSynchronousNode to SyncRepGetSynchronousStandby (plus related cleanup, like updating "synchronous" to "sync" in the comments as you mentioned), so that the namings have no conflicts. The second one updates the syncrep code, including WAL sender and WAL receiver, and its surroundings to use the term "node" instead of "standby". This brings into the code the idea that a node using the replication APIs is not necessarily a standby, making it more generic. It can be applied on top of the refactoring patch.

If any other folks (Heikki or Fujii-san?) have comments about this idea, feel free to chime in.

Regards,
--
Michael

From 51e9988ff44b7c2b716e3a0da3f1d1c9359a1d79 Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Thu, 30 Oct 2014 22:01:10 +0900
Subject: [PATCH 1/2] Refactor code to detect synchronous node in WAL sender
 array

This patch is made to remove code duplication in walsender.c and
syncrep.c in order to detect what is the node with the lowest
strictly-positive priority, facilitating maintenance of this code.
---
 src/backend/replication/syncrep.c   | 97 +++--
 src/backend/replication/walsender.c | 35 +++--
 src/include/replication/syncrep.h   |  1 +
 3 files changed, 78 insertions(+), 55 deletions(-)

diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index aa54bfb..f838ad0 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -357,6 +357,70 @@ SyncRepInitConfig(void)
 	}
 }
 
+
+/*
+ * Obtain position of sync standby in the array referencing all the WAL
+ * senders, or -1 if no sync node can be found. The caller of this function
+ * should take a lock on SyncRepLock. If there are multiple nodes with
+ * the same lowest priority value, the first node found is selected.
+ * node_priority is a preallocated array of size max_wal_senders that can
+ * be used to retrieve the priority of each WAL sender. Its inclusion in
+ * this function has the advantage of limiting the scan of the WAL sender
+ * array to one pass, limiting the time SyncRepLock is held.
+ */
+int
+SyncRepGetSynchronousStandby(int *node_priority)
+{
+	int		sync_node = -1;
+	int		priority = 0;
+	int		i;
+
+	/* Scan WAL senders and find the sync node if any */
+	for (i = 0; i < max_wal_senders; i++)
+	{
+		/* Use volatile pointer to prevent code rearrangement */
+		volatile WalSnd *walsnd = &WalSndCtl->walsnds[i];
+
+		/* First get the priority of this WAL sender */
+		if (node_priority)
+			node_priority[i] = XLogRecPtrIsInvalid(walsnd->flush) ?
+				0 : walsnd->sync_standby_priority;
+
+		/* Proceed to next if not active */
+		if (walsnd->pid == 0)
+			continue;
+
+		/* Proceed to next if not streaming */
+		if (walsnd->state != WALSNDSTATE_STREAMING)
+			continue;
+
+		/* Proceed to next one if asynchronous */
+		if (walsnd->sync_standby_priority == 0)
+			continue;
+
+		/* Proceed to next one if priority conditions not satisfied */
+		if (priority != 0 &&
+			priority <= walsnd->sync_standby_priority)
+			continue;
+
+		/* Proceed to next one if flush position is invalid */
+		if (XLogRecPtrIsInvalid(walsnd->flush))
+			continue;
+
+		/*
+		 * We have a potential sync candidate, choose it as the sync
+		 * node for the time being before going through the other nodes
+		 * listed in the WAL sender array.
+		 */
+		sync_node = i;
+
+		/* Update priority to current value of WAL sender */
+		priority = walsnd->sync_standby_priority;
+	}
+
+	return sync_node;
+}
+
 /*
  * Update the LSNs on each queue based upon our latest state. This
  * implements a simple policy of first-valid-standby-releases-waiter.
@@ -369,10 +433,9 @@ SyncRepReleaseWaiters(void)
 {
 	volatile WalSndCtlData *walsndctl = WalSndCtl;
 	volatile WalSnd *syncWalSnd = NULL;
+	int		sync_node;
 	int		numwrite = 0;
 	int		numflush = 0;
-	int		priority = 0;
-	int		i;
 
 	/*
 	 * If this WALSender is serving a standby that is not on the list of
@@ -388,32 +451,14 @@ SyncRepReleaseWaiters(void)
 	/*
 	 * We're a potential sync standby. Release waiters if we are the highest
 	 * priority standby. If there are multiple standbys with same priorities
-	 * then we use the first mentioned standby. If you change this, also
-	 * change pg_stat_get_wal_senders().
+	 * then we use the first mentioned standby.
 	 */
 	LWLockAcquire(SyncRepLock, LW_EXCLUSIVE);
+	sync_node = SyncRepGetSynchronousStandby(NULL);
 
-	for (i = 0; i < max_wal_senders; i++)
-	{
-		/* use volatile pointer to prevent code rearrangement */
-		volatile WalSnd *walsnd = &walsndctl->walsnds[i];
-
-		if (walsnd->pid != 0 &&
-			walsnd->state == WALSNDSTATE_STREAMING &&
-			walsnd->sync_standby_priority > 0 &&
-			(priority == 0 ||
-			 priority > walsnd->sync_standby_priority) &&
-			!XLogRecPtrIsInvalid
Re: [HACKERS] Index scan optimization
On 16 November 2014 19:30, Simon Riggs wrote:

> I marked the patch as ready for committer.
>
> This looks very interesting.
>
> The value of the patch seems to come from skipping costly checks. Your
> performance test has a leading VARCHAR column, so in that specific case
> skipping the leading column is a big win. I don't see it would be in
> all cases.

Yes, you are right. The best improvement will be for cases where the leading column comparison is very costly (e.g. VARCHAR).

> Can we see a few tests on similar things to check for other
> opportunities and regressions.
> * Single column bigint index

It gives around 5% improvement.

> * Multi-column bigint index

It gives around 5% improvement.

> * 5 column index with mixed text and integers

It gives around 15% improvement.

As we can see, in the case of bigint the improvement is not as high as for VARCHAR, because comparison of the bigint data type is not costly. So bigint is close to a worst-case scenario for this performance test, and I think even a 5% improvement in the worst case, with such small and safe code changes, should be OK, especially since for other scenarios (like a varchar index) the improvement is high (15-30%). Also, even for bigint (or any similar data type), the improvement can increase as the number of selected records increases.

The test case used for testing is attached. Please provide your opinion.

> The explanatory comments need some work to more clearly explain what
> this patch does.

Please help me understand this point: do you want me to add more explanation of the patch in this mail chain, or in the code?

Thanks and Regards,
Kumar Rajeev Rastogi

As per Simon Riggs' suggestion:

-- With single-column bigint index:

--Schema
create table tbl2(id1 int, id2 bigint, id3 int);
create index idx2 on tbl2(id2);

--Procedure to insert 1M data:
insert into tbl2 values(1, generate_series(1, 100), 2);

--Procedure to select data 1000 times (1000 times selected to make data more appropriate.)
create or replace function select_data(count1 int) returns void AS $$
declare
	x int;
Begin
	for i IN 1..count1 loop
		select count(*) into x from tbl2 where id2>99;
	end loop;
End;
$$ language plpgsql;

select select_data(1000);

-- With multi-column bigint index:
create table tbl3(id1 int, id2 bigint, id3 bigint, id4 bigint);
create index idx3 on tbl3(id2, id3, id4);
insert into tbl3 values(1, generate_series(1, 100), generate_series(1, 100), generate_series(1, 100));

--Procedure to select data 1000 times (1000 times selected to make data more appropriate.)
create or replace function select_data_multi_bigint(count1 int) returns void AS $$
declare
	x int;
Begin
	for i IN 1..count1 loop
		select count(*) into x from tbl3 where id2>99 and id3>99 and id4>99;
	end loop;
End;
$$ language plpgsql;

select select_data_multi_bigint(1000);

-- 5-column index with mixed text and integers

--Schema
create table tbl4(id1 int, id2 varchar(10), id3 int, id4 varchar(10), id5 int, id6 int);
create index idx4 on tbl4(id2, id3, id4, id5, id6);

--Procedure to insert 1M data:
create or replace function insert_data_mix_text_int(count1 int, count2 int) returns void AS $$
Begin
	for i IN 1..count1 loop
		insert into tbl4 values(i, 'a', i, 'a', i, i);
	end loop;
	for i IN count1+1..count2 loop
		insert into tbl4 values(i, 'b', i, 'b', i, i);
	end loop;
End;
$$ language plpgsql;

select insert_data_mix_text_int(99, 100);

--Procedure to select data 1000 times (1000 times selected to make data more appropriate.)
create or replace function select_data_mix_text_int(count1 int) returns void AS $$
declare
	x int;
Begin
	for i IN 1..count1 loop
		select count(*) into x from tbl4 where id2>'a' and id3>99 and id4>'a' and id5>99 and id6>99;
	end loop;
End;
$$ language plpgsql;

select select_data_mix_text_int(1000);

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg
On Sat, 2014-11-15 at 21:36, Simon Riggs wrote:
> Do I understand correctly that we are trying to account for exact
> memory usage at palloc/pfree time? Why??

Not palloc chunks, only tracking at the level of allocated blocks (that we allocate with malloc). It was a surprise to me that accounting at that level would have any measurable impact, but Robert found a reasonable case on a POWER machine that degraded a couple percent. I wasn't able to reproduce it consistently on x86.

> Or alternatively, can't we just sample the allocations to reduce the overhead?

Not sure quite what you mean by "sample", but it sounds like something along those lines would work.

Attached is a patch that does something very simple: it only tracks blocks held in the current context, with no inheritance or anything like it. This reduces the overhead to a couple of arithmetic instructions added to the alloc/dealloc path, so hopefully that removes the general performance concern raised by Robert [1].

To calculate the total memory used, I included a function MemoryContextMemAllocated() that walks the memory context and its children recursively.

Of course, I was originally trying to avoid that, because it moves the problem to HashAgg. For each group, it will need to execute MemoryContextMemAllocated() to see if work_mem has been exceeded. It will have to visit a couple of contexts, or perhaps many (in the case of array_agg, which creates one per group), which could be a measurable regression for HashAgg.

But if that does turn out to be a problem, I think it's solvable. First, I could micro-optimize the function by making it iterative rather than recursive, to save on function call overhead. Second, I could offer a way to prevent the HTAB from creating its own context, which would be one less context to visit. And if those don't work, perhaps I could resort to a sampling method of some kind, as you allude to above.
Regards, Jeff Davis [1] I'm fairly sure I tested something very similar on Robert's POWER machine a while ago, and it was fine. But I think it's offline or moved now, so I can't verify the results. If this patch still somehow turns out to be a 1%+ regression on any reasonable test, I don't know what to say. *** a/src/backend/utils/mmgr/aset.c --- b/src/backend/utils/mmgr/aset.c *** *** 438,451 AllocSetContextCreate(MemoryContext parent, Size initBlockSize, Size maxBlockSize) { ! AllocSet context; /* Do the type-independent part of context creation */ ! context = (AllocSet) MemoryContextCreate(T_AllocSetContext, ! sizeof(AllocSetContext), ! &AllocSetMethods, ! parent, ! name); /* * Make sure alloc parameters are reasonable, and save them. --- 438,454 Size initBlockSize, Size maxBlockSize) { ! AllocSet set; ! MemoryContext context; /* Do the type-independent part of context creation */ ! context = MemoryContextCreate(T_AllocSetContext, ! sizeof(AllocSetContext), ! &AllocSetMethods, ! parent, ! name); ! ! set = (AllocSet) context; /* * Make sure alloc parameters are reasonable, and save them. *** *** 459,467 AllocSetContextCreate(MemoryContext parent, if (maxBlockSize < initBlockSize) maxBlockSize = initBlockSize; Assert(AllocHugeSizeIsValid(maxBlockSize)); /* must be safe to double */ ! context->initBlockSize = initBlockSize; ! context->maxBlockSize = maxBlockSize; ! context->nextBlockSize = initBlockSize; /* * Compute the allocation chunk size limit for this context. It can't be --- 462,470 if (maxBlockSize < initBlockSize) maxBlockSize = initBlockSize; Assert(AllocHugeSizeIsValid(maxBlockSize)); /* must be safe to double */ ! set->initBlockSize = initBlockSize; ! set->maxBlockSize = maxBlockSize; ! set->nextBlockSize = initBlockSize; /* * Compute the allocation chunk size limit for this context. 
It can't be *** *** 477,486 AllocSetContextCreate(MemoryContext parent, * and actually-allocated sizes of any chunk must be on the same side of * the limit, else we get confused about whether the chunk is "big". */ ! context->allocChunkLimit = ALLOC_CHUNK_LIMIT; ! while ((Size) (context->allocChunkLimit + ALLOC_CHUNKHDRSZ) > (Size) ((maxBlockSize - ALLOC_BLOCKHDRSZ) / ALLOC_CHUNK_FRACTION)) ! context->allocChunkLimit >>= 1; /* * Grab always-allocated space, if requested --- 480,489 * and actually-allocated sizes of any chunk must be on the same side of * the limit, else we get confused about whether the chunk is "big". */ ! set->allocChunkLimit = ALLOC_CHUNK_LIMIT; ! while ((Size) (set->allocChunkLimit + ALLOC_CHUNKHDRSZ) > (Size) ((maxBlockSize - ALLOC_BLOCKHDRSZ) / ALLOC_CHUNK_FRACTION)) ! set->allocChunkLimit >>= 1; /* * Grab always-allocated space, if requested *** ***
Re: [HACKERS] inherit support for foreign tables
(2014/11/12 20:04), Ashutosh Bapat wrote:

I reviewed the fdw-chk-3 patch. Here are my comments.

Thanks for the review!

Tests
---

1. The tests added in the file_fdw module look good. We should add tests for CREATE TABLE with CHECK CONSTRAINT also.

Agreed. I added the tests, and also changed the proposed tests a bit.

2. For postgres_fdw we need tests to check the behaviour in case the constraints mismatch between the remote table and its local foreign table declaration, in case of INSERT, UPDATE and SELECT.

Done.

3. In the testcases for postgres_fdw it seems that you have forgotten to add a statement after SET constraint_exclusion to 'partition'.

I added the statement.

4. In test foreign_data there are changes to fix the diffs caused by these changes, like below:

ALTER FOREIGN TABLE ft1 DROP CONSTRAINT no_const; -- ERROR
-ERROR: "ft1" is not a table
+ERROR: constraint "no_const" of relation "ft1" does not exist
ALTER FOREIGN TABLE ft1 DROP CONSTRAINT IF EXISTS no_const;
-ERROR: "ft1" is not a table
+NOTICE: constraint "no_const" of relation "ft1" does not exist, skipping
ALTER FOREIGN TABLE ft1 DROP CONSTRAINT ft1_c1_check;
-ERROR: "ft1" is not a table
+ERROR: constraint "ft1_c1_check" of relation "ft1" does not exist

Earlier, when constraints were not supported for FOREIGN TABLE, these tests verified that behaviour (in fact it didn't allow the creation of the constraints either). Now that constraints are allowed, I think the tests for "no_const" (without IF EXISTS) and "ft1_c1_check" are duplicating the same testcase. Maybe we should review this set of statements in the light of the new functionality.

Agreed. I removed the "DROP CONSTRAINT ft1_c1_check" test, and added a new test for ALTER CONSTRAINT.

Code and implementation
--

The usage of NO INHERIT and NOT VALID with CONSTRAINT on a foreign table is blocked, but the corresponding documentation entry doesn't mention this.
Since foreign tables can not be inherited NO INHERIT option isn't applicable to foreign tables and the constraints on the foreign tables are declarative, hence NOT VALID option is also not applicable. So, I agree with what the code is doing, but that should be reflected in documentation with this explanation. Rest of the code modifies the condition checks for CHECK CONSTRAINTs on foreign tables, and it looks good to me. Done. Other changes: * Modified one error message that I added in AddRelationNewConstraints, to match the other message there. * Revised other docs a little bit. Attached is an updated version of the patch. Thanks, Best regards, Etsuro Fujita *** a/contrib/file_fdw/input/file_fdw.source --- b/contrib/file_fdw/input/file_fdw.source *** *** 62,68 CREATE FOREIGN TABLE tbl () SERVER file_server OPTIONS (format 'csv', null ' CREATE FOREIGN TABLE tbl () SERVER file_server; -- ERROR CREATE FOREIGN TABLE agg_text ( ! a int2, b float4 ) SERVER file_server OPTIONS (format 'text', filename '@abs_srcdir@/data/agg.data', delimiter ' ', null '\N'); --- 62,68 CREATE FOREIGN TABLE tbl () SERVER file_server; -- ERROR CREATE FOREIGN TABLE agg_text ( ! 
a int2 CHECK (a >= 0), b float4 ) SERVER file_server OPTIONS (format 'text', filename '@abs_srcdir@/data/agg.data', delimiter ' ', null '\N'); *** *** 72,82 CREATE FOREIGN TABLE agg_csv ( --- 72,84 b float4 ) SERVER file_server OPTIONS (format 'csv', filename '@abs_srcdir@/data/agg.csv', header 'true', delimiter ';', quote '@', escape '"', null ''); + ALTER FOREIGN TABLE agg_csv ADD CHECK (a >= 0); CREATE FOREIGN TABLE agg_bad ( a int2, b float4 ) SERVER file_server OPTIONS (format 'csv', filename '@abs_srcdir@/data/agg.bad', header 'true', delimiter ';', quote '@', escape '"', null ''); + ALTER FOREIGN TABLE agg_bad ADD CHECK (a >= 0); -- per-column options tests CREATE FOREIGN TABLE text_csv ( *** *** 134,139 DELETE FROM agg_csv WHERE a = 100; --- 136,153 -- but this should be ignored SELECT * FROM agg_csv FOR UPDATE; + -- constraint exclusion tests + \t on + EXPLAIN (VERBOSE, COSTS FALSE) SELECT * FROM agg_csv WHERE a < 0; + \t off + SELECT * FROM agg_csv WHERE a < 0; + SET constraint_exclusion = 'on'; + \t on + EXPLAIN (VERBOSE, COSTS FALSE) SELECT * FROM agg_csv WHERE a < 0; + \t off + SELECT * FROM agg_csv WHERE a < 0; + SET constraint_exclusion = 'partition'; + -- privilege tests SET ROLE file_fdw_superuser; SELECT * FROM agg_text ORDER BY a; *** a/contrib/file_fdw/output/file_fdw.source --- b/contrib/file_fdw/output/file_fdw.source *** *** 78,84 ERROR: COPY null representation cannot use newline or carriage return CREATE FOREIGN TABLE tbl () SERVER file_server; -- ERROR ERROR: filename is required for file_fdw foreign tables CREATE FOREIGN TABLE agg_text ( ! a int2, b float4 )