Re: [HACKERS] [PATCH] Patch to fix a crash of psql

2012-11-30 Thread Albe Laurenz
Tatsuo Ishii wrote:
> I think we should detect such cases as much as possible and warn users,
> rather than silently ignoring the fact that client encoding != file
> encoding. I don't think we can detect it in a reliable way, but at
> least we could check the cases above (sum of PQmblen is not equal to
> buffer length). So my proposal is: if prepare_buffer() detects a
> possible inconsistency between buffer encoding and file encoding, warn
> the user.
> 
> [t-ishii@localhost psql]$ PGCLIENTENCODING=SJIS psql postgres
> Pager usage is off.
> psql (9.3devel)
> Type "help" for help.
> 
> postgres=# \i ~/sql
> CREATE DATABASE
> You are now connected to database "mydb" as user "t-ishii".
> CREATE SCHEMA
> psql:/home/t-ishii/sql:7: warning: possible conflict between client encoding SJIS and input file encoding
> CREATE TABLE
> 
> Comments?

I wonder about the "possible".

Could there be false positives with the test?
If yes, I don't like the idea.
If there is no possibility of false positives, I'd say
that the "possible" should go.  Maybe it should even be
an error rather than a warning.

Yours,
Laurenz Albe
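
For reference, the check Tatsuo describes (comparing the sum of PQmblen()
over the buffer with the buffer length) boils down to something like the
following minimal sketch; the function name is illustrative, not the
actual patch code:

#include "libpq-fe.h"

/*
 * Step through the buffer with PQmblen() using the client encoding.  If
 * the last multibyte character runs past the end of the buffer, the
 * accumulated length overshoots the buffer length, hinting that the
 * file is in a different encoding than the client encoding.
 */
static bool
encoding_looks_consistent(const char *buf, int len, int encoding)
{
    int     i = 0;

    while (i < len)
        i += PQmblen(buf + i, encoding);

    return i == len;        /* overshoot => possible conflict */
}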




Re: [HACKERS] DEALLOCATE IF EXISTS

2012-11-30 Thread Vik Reykja
On Tue, Nov 27, 2012 at 3:15 PM, Heikki Linnakangas  wrote:

> I fail to see the point of DEALLOCATE IF EXISTS. Do you have real use case
> for this, or was this just a case of adding IF EXISTS to all commands for
> the sake of completeness?
>
> Usually the client knows what statements have been prepared, but perhaps
> you want to make sure everything is deallocated in some error handling case
> or similar. But in that case, you might as well just issue a regular
> DEALLOCATE and ignore errors. Or even more likely, you'll want to use
> DEALLOCATE ALL.
>

Hmm.  The test case I had for it, which was very annoying in an "I want
to be lazy" sort of way, I am unable to reproduce now.  So I guess this
becomes a "make it like the others" proposal, and the community can
decide whether that's desirable.

In my personal case, which again I can't reproduce because it's been a
while since I did it, DEALLOCATE ALL would have worked.  I was
basically preparing a query so I could work on it under the same
conditions in which it would be executed in a function, and I was only
working on one of these at a time, so ALL would have been fine.


Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Dimitri Fontaine
Markus Wanner  writes:
>> However, as you say, maybe we need more coding examples.
>
> Maybe a minimally usable extra daemon example? Showing how to avoid
> common pitfalls? Use cases, anybody? :-)

What about the PGQ ticker, pgqd?

  https://github.com/markokr/skytools/tree/master/sql/ticker
  https://github.com/markokr/skytools/blob/master/sql/ticker/pgqd.c

Or maybe pgAgent, which seems to live in the pgAdmin tree, but is in C++
so might need a rewrite to the specs:

  http://git.postgresql.org/gitweb/?p=pgadmin3.git;a=tree;f=pgadmin/agent;h=ebbcf71bd918efdc82466785ffac6f2ac3443847;hb=HEAD

Maybe it would be easier to have a version of GNU mcron as an extension,
with the ability to fire PostgreSQL stored procedures directly?  (That
way the cron-specific parts of the logic are already implemented.)

  http://www.gnu.org/software/mcron/

Another idea would be to have a pgbouncer extension. We would still need
of course to have pgbouncer as a separate component so that client
connections can outlive a postmaster crash, but that would still be very
useful as a first step into admission control. Let's not talk about the
feedback loop and per-cluster resource usage monitoring yet, but I guess
that you can see the drift.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Pavel Stehule
2012/11/30 Dimitri Fontaine :
> Markus Wanner  writes:
>>> However, as you say, maybe we need more coding examples.
>>
>> Maybe a minimally usable extra daemon example? Showing how to avoid
>> common pitfalls? Use cases, anybody? :-)
>
> What about the PGQ ticker, pgqd?
>
>   https://github.com/markokr/skytools/tree/master/sql/ticker
>   https://github.com/markokr/skytools/blob/master/sql/ticker/pgqd.c
>
> Or maybe pgAgent, which seems to live in the pgAdmin tree, but is in C++
> so might need a rewrite to the specs:
>
>   http://git.postgresql.org/gitweb/?p=pgadmin3.git;a=tree;f=pgadmin/agent;h=ebbcf71bd918efdc82466785ffac6f2ac3443847;hb=HEAD
>
> Maybe it would be easier to have a version of GNU mcron as an extension,
> with the ability to fire PostgreSQL stored procedures directly?  (That
> way the cron-specific parts of the logic are already implemented.)
>
>   http://www.gnu.org/software/mcron/
>
> Another idea would be to have a pgbouncer extension. We would still need
> of course to have pgbouncer as a separate component so that client
> connections can outlive a postmaster crash, but that would still be very
> useful as a first step into admission control. Let's not talk about the
> feedback loop and per-cluster resource usage monitoring yet, but I guess
> that you can see the drift.

both will be nice

Pavel





Re: [HACKERS] proposal: fix corner use case of variadic fuctions usage

2012-11-30 Thread Vik Reykja
On Sun, Nov 4, 2012 at 12:49 PM, Pavel Stehule wrote:

> Hello
>
> here is a patch that enables using the VARIADIC parameter modifier for
> variadic "any" functions.
>
> The motivation for this patch is consistent behavior of the "format"
> function, but it fixes the behavior of this whole class of variadic
> functions.
>
> postgres=> -- check pass variadic argument
> postgres=> select format('%s, %s', variadic array['Hello','World']);
> format
> ──
>  Hello, World
> (1 row)
>
> postgres=> -- multidimensional array is supported
> postgres=> select format('%s, %s', variadic
> array[['Nazdar','Svete'],['Hello','World']]);
> format
> ───
>  {Nazdar,Svete}, {Hello,World}
> (1 row)
>
> It respects Tom's comments - it is based on short-lived FmgrInfo
> structures that are created immediately before function invocation.
>
> Note: there is an unsolved issue - reusing transformed arguments - so
> it is a little bit suboptimal for variadic set-returning (MultiFuncCall)
> functions, where we shouldn't need to repeat the unpacking of the array
> value.
>
> Regards
>
> Pavel
>

Hi Pavel.

I am trying to review this patch and on my work computer everything
compiles and tests perfectly. However, on my laptop, the regression tests
don't pass with "cache lookup failed for type XYZ" where XYZ is some number
that does not appear to be any type oid.

I don't really know where to go from here. I am asking that other people
try this patch to see if they get errors as well.

Vik

PS: I won't be able to answer this thread until Tuesday.


[HACKERS] Review: create extension default_full_version

2012-11-30 Thread Ibrar Ahmed
Hi,

I looked at the discussion for this patch and the patch itself. Here
are my comments and observations about the patch.
What I got from the discussion is that the patch tries to implement a
mechanism to install an extension from a series of SQL scripts starting
at a base/full version, e.g. if a user wants to create version "1.1" of
an extension, the system should run the v1.0 script followed by the
1.0--1.1 upgrade script. In that case we need to know about the base or
full version, which in the above case is v1.0. So the patch adds a
default_full_version option to the extension control file.

Here are my comments about the patch

* Note: the patch does not apply cleanly on the latest code base. You
probably need to rebase it.

ibrar@ibrar-laptop:~/work/postgresql$ patch -p1 [...]
arg) || pcontrol->default_full_version)


* In case "default_full_version" and the VERSION in SQL are the same, we
get a misleading error message, i.e.

comment = 'data type for storing sets of (key, value) pairs'
default_version = '1.1'
default_full_version = '1.0'
module_pathname = '$libdir/hstore'
relocatable = true

postgres=# create EXTENSION hstore version '1.0';
ERROR:  FROM version must be different from installation target version "1.0"

The error message complains about a FROM clause, which is not used in
the query. I think extension creation should not be blocked when
"default_full_version" and VERSION are the same; but if we do want to
block that, the error message should be meaningful.

* I noticed another issue with the patch.

If we do not specify default_full_version in the control file, this is
the sequence of SQL scripts:

postgres=# CREATE EXTENSION hstore version '1.3' from '1.0';
WARNING:  /usr/local/pgsql/share/extension/hstore--1.0--1.1.sql
WARNING:  /usr/local/pgsql/share/extension/hstore--1.1--1.2.sql
WARNING:  /usr/local/pgsql/share/extension/hstore--1.2--1.3.sql
CREATE EXTENSION


But with default_full_version = 1.0, we get an "ERROR: could not stat
file ..." message:

postgres=# CREATE EXTENSION hstore version '1.3' from '1.0';
WARNING:  SCRIPT = /usr/local/pgsql/share/extension/hstore--1.0.sql
WARNING:  SCRIPT = /usr/local/pgsql/share/extension/hstore--1.0--1.2.sql    <<---  Why not 1.0--1.1?
ERROR:  could not stat file
"/usr/local/pgsql/share/extension/hstore--1.0--1.2.sql": No such file
or directory

This is because of a missing version number: first we're executing 1.0
followed by 1.0--1.2, not 1.0--1.1. I think the following should be the
right sequence:


postgres=# CREATE EXTENSION hstore version '1.3' from '1.0';
WARNING:  /usr/local/pgsql/share/extension/hstore--1.0.sql
WARNING:  /usr/local/pgsql/share/extension/hstore--1.0--1.1.sql
WARNING:  /usr/local/pgsql/share/extension/hstore--1.1--1.2.sql
WARNING:  /usr/local/pgsql/share/extension/hstore--1.2--1.3.sql
CREATE EXTENSION

PS: I modified the code to get this result.



- IMHO there should be an SQL option along with default_full_version, like:

postgres=# create EXTENSION hstore VERSION '1.1' FULL_VERSION '1.0';


- The hstore regression test is also failing.


---

Ibrar Ahmed


Re: [HACKERS] Refactoring standby mode logic

2012-11-30 Thread Dimitri Fontaine
Hi,

Heikki Linnakangas  writes:
> Attached is a patch to refactor that logic into a more straightforward state
> machine. It's always been a kind of a state machine, but it's been hard to
> see, as the code wasn't explicitly written that way. Any objections?

On a quick glance, this looks good to me. Making that code simpler to
read and reason about seems a good goal.
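
For context, the WAL sources that standby mode cycles between look
roughly like this (a sketch based on the enum that ended up in xlog.c;
the patch may name or order things differently):

typedef enum
{
    XLOG_FROM_ANY = 0,      /* request to read WAL from any source */
    XLOG_FROM_ARCHIVE,      /* restored using restore_command */
    XLOG_FROM_PG_XLOG,      /* existing file in pg_xlog */
    XLOG_FROM_STREAM        /* streamed from master */
} XLogSource;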

> This change should have no effect in normal restore scenarios. It'd only
> make a difference if some files in the middle of the sequence of WAL files
> are missing from the archive, but have been copied to pg_xlog manually, and
> only if that file contains a timeline switch. Even then, I think I like the
> new order better; it's easier to explain if nothing else.

I'm not understanding the sequence difference well enough to comment
here, but I think some people are currently playing tricks in their
failover scripts by moving files directly into the pg_xlog of the server
to be promoted.

Is it possible for your refactoring to keep the old sequence?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support





Re: [HACKERS] WIP: index support for regexp search

2012-11-30 Thread Alexander Korotkov
On Thu, Nov 29, 2012 at 5:25 PM, Heikki Linnakangas  wrote:

> One thing that bothers me with this algoritm is that the overflow
> mechanism is all-or-nothing. In many cases, even when there is a huge
> number of states in the diagram, you could still extract at least a few
> trigrams that must be present in any matching string, with little effort.
> At least, it seems like that to a human :-).
>
> For example, consider this:
>
> explain analyze select count(*) from azjunk4 where txt ~ ('^
> aabaacaadaaeaafaagaahaaiaajaakaalaamaanaaoaapaaqaaraasaataau
> aavaawaaxaayaazabaabbabcabdabeabfabgabhabiabjabkablabmabnabo
> abpabqabrabsabtabuabvabwabxabyabzacaacbaccacdaceacfacgachaci
> acjackaclacmacnacoacpacqacracsactacuacvacwacxacyaczadaadbadc
> addadeadfadgadhadiadjadkadladmadnadoadpadqadradsadtaduadvadw
> adxadyadzaeaaebaecaedaeeaefaegaehaeiaejaekaelaemaenaeoaepaeq
> aeraesaetaeuaevaewaexaeyaezafaafbafcafdafeaffafgafhafiafjafk
> aflafmafnafoafpafqafrafsaftafuafvafwafxafyafzagaagbagcagdage
> agfaggaghagiagjagkaglagmagnagoagpagqagragsagtaguagvagwagxagy
> agzahaahbahcahdaheahfahgahhahiahjahkahlahmahnahoahpahqahrahs$');
>
> you get a query plan like this (the long regexp string edited out):
>
>  Aggregate  (cost=228148.02..228148.03 rows=1 width=0) (actual time=131.100..131.101 rows=1 loops=1)
>    ->  Bitmap Heap Scan on azjunk4  (cost=228144.01..228148.02 rows=1 width=0) (actual time=131.096..131.096 rows=0 loops=1)
>          Recheck Cond: (txt ~ )
>          Rows Removed by Index Recheck: 1
>          ->  Bitmap Index Scan on azjunk4_trgmrgx_txt_01_idx  (cost=0.00..228144.01 rows=1 width=0) (actual time=82.914..82.914 rows=1 loops=1)
>                Index Cond: (txt ~ )
>  Total runtime: 131.230 ms
> (7 rows)
>
> That ridiculously long string exceeds the number of states (I think, could
> be number of paths or arcs too), and the algorithm gives up, resorting to
> scanning the whole index as can be seen by the "Rows Removed by Index
> Recheck" line. However, it's easy to see that any matching string must
> contain *any* of the possible trigrams the algorithm extracts. If it could
> safely return just a few of them, say "aab" and "abz", and discard the
> rest, that would already be much better than a full index scan.
>
> Would it be safe to simply stop short the depth-first search on overflow,
> and proceed with the graph that was constructed up to that point?


For depth-first search it's not. But your proposal naturally makes
sense. I've changed it to breadth-first search, and then it's safe to
mark all unprocessed states as final on overflow. It means that we
assume every unprocessed branch immediately finishes with a match (this
can give us more false positives but no false negatives).
For overflow of the matrix collection, it's safe to just OR all the
trigrams together.
A new version of the patch is attached.

--
With best regards,
Alexander Korotkov.


trgm-regexp-0.7.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] WIP: index support for regexp search

2012-11-30 Thread Alexander Korotkov
Hi!

On Thu, Nov 29, 2012 at 12:58 PM, er  wrote:

> On Mon, November 26, 2012 20:49, Alexander Korotkov wrote:
>
> > trgm-regexp-0.6.patch.gz
>
> I ran the simple-minded tests against generated data (similar to the ones
> I did in January 2012). The problems of that older version seem pretty
> much all removed (although I didn't do much work on it -- just reran
> these tests).
>

Thanks a lot for testing! Could you repeat this for the 0.7 version of
the patch, which has the new overflow handling?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] WIP: index support for regexp search

2012-11-30 Thread Alexander Korotkov
On Fri, Nov 30, 2012 at 3:20 PM, Alexander Korotkov wrote:

> For depth-first search it's not.
>

Oh, I didn't explain it.
In order to stop graph processing, we need to be sure that we have added
all outgoing arcs from a state, or else assume that state to be final.
In DFS we can be in the final part of producing the graph but still not
have added some arc (with a new trigram) from the initial state directly
to the final state. That obviously leads to false negatives.
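
To make the contrast concrete, here is a hedged sketch of the
overflow-safe breadth-first expansion; TrgmState, MAX_STATES and the
queue layout are invented for illustration, not the patch's actual
identifiers:

#include <stdbool.h>

#define MAX_STATES 128              /* illustrative state budget */

typedef struct TrgmState
{
    bool    final;                  /* reaching this state implies a match */
    int     narcs;                  /* number of outgoing arcs */
    struct TrgmState *arcs[4];      /* successor states (toy bound) */
} TrgmState;

/*
 * Expand states breadth-first.  Once the budget is exhausted, every
 * state still in the queue is marked final ("assume this branch matches
 * immediately"), which can only add false positives, never false
 * negatives.  Cutting a depth-first search short the same way would not
 * be safe, because arcs out of already-visited states could still be
 * missing.
 */
static void
expand_bfs(TrgmState *start)
{
    TrgmState  *queue[MAX_STATES * 4 + 1];
    int         head = 0, tail = 0, nstates = 0, i;

    queue[tail++] = start;
    while (head < tail)
    {
        TrgmState *s = queue[head++];

        if (nstates++ >= MAX_STATES)
        {
            s->final = true;        /* overflow: stop expanding here */
            continue;
        }
        for (i = 0; i < s->narcs; i++)
            queue[tail++] = s->arcs[i];     /* enqueue successors */
    }
}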

--
With best regards,
Alexander Korotkov.


[HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Andres Freund
On 2012-11-30 08:52:18 +0530, Pavan Deolasee wrote:
> On Fri, Nov 30, 2012 at 3:19 AM, Andres Freund wrote:
>
> > Hi,
> >
> > while looking at trigger.c from the POV of foreign key locks I noticed
> > that GetTupleForTrigger assumes, without comment, that it can just look
> > at a page's contents without a
> > LockBuffer(buffer, BUFFER_LOCK_SHARE);
> >
> > That code path is only reached for AFTER ... FOR EACH ROW ... triggers,
> > so it's fine that it's not locking the tuple itself. That already needs
> > to have happened before.
> >
> > The code in question:
> >
> > if (newslot_which_is_passed_by_before_triggers)
> > ...
> > else
> > {
> > Page        page;
> > ItemId  lp;
> >
> > buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(tid));
> >
> > page = BufferGetPage(buffer);
> > lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));
> >
> > Assert(ItemIdIsNormal(lp));
> >
> > tuple.t_data = (HeapTupleHeader) PageGetItem(page, lp);
> > tuple.t_len = ItemIdGetLength(lp);
> > tuple.t_self = *tid;
> > tuple.t_tableOid = RelationGetRelid(relation);
> > }
> >
> > result = heap_copytuple(&tuple);
> > ReleaseBuffer(buffer);
> >
> > As can be seen in the excerpt above, this is basically a very
> > stripped-down heap_fetch(), but without any locking on the buffer we
> > just read.
> >
> > I can't believe this is safe? Just about anything but eviction could
> > happen to that buffer at that moment?
> >
> > Am I missing something?
> >
> >
> That seems to be safe to me. Anything that's been read above can't really
> change. The tuple is already locked, so a concurrent update/delete has to
> wait on us. We have a pin on the buffer, so VACUUM or HOT-prune can't
> happen either. I can't see any other operation that can really change those
> fields.

We only get the pin right there, I don't see any preexisting pin. Which
means we might have just opened a page that's in the process of being
pruned/vacuumed by another backend.
I think a concurrent heap_(insert|update)/PageAddItem might actually be
already dangerous because it might move line pointers around.

> heap_fetch(), though it looks very similar, has an important difference. It
> might be reading the tuple without a lock on it, and we need the buffer lock
> to protect against concurrent update/delete to the tuple. AFAIK once the
> tuple has passed through qualification checks etc., even callers of
> heap_fetch() can read the tuple data without any lock on the buffer itself.

Sure, but the page header isn't accessed there, just the tuple data
itself which always stays at the same place on the page.
(A normal heap_fetch shouldn't be worried about updates/deletions to its
tuple, MVCC to the rescue.)

Even if it turns out to be safe, I think this deserves at least a
comment...


Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Markus Wanner
On 11/30/2012 11:09 AM, Dimitri Fontaine wrote:
> Markus Wanner  writes:
>>> However, as you say, maybe we need more coding examples.
>>
>> Maybe a minimally usable extra daemon example? Showing how to avoid
>> common pitfalls? Use cases, anybody? :-)
> 
> What about the PGQ ticker, pgqd?
> 
>   https://github.com/markokr/skytools/tree/master/sql/ticker
>   https://github.com/markokr/skytools/blob/master/sql/ticker/pgqd.c

AFAICS pgqd currently uses libpq, so I think it would rather turn into
what I call a background worker, with a connection to Postgres shared
memory. I can perfectly well see use cases (plural!) for those.

What I'm questioning is the use for what I rather call "extra daemons",
that is, additional processes started by the postmaster without a
connection to Postgres shared memory (and thus without a database
connection).

I was asking for a minimal example of such an extra daemon, similar to
worker_spi, showing how to properly write such a thing. Ideally showing
how to avoid common pitfalls.
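
For concreteness, a minimal sketch of registering such a process; this
follows the bgworker API roughly as committed for 9.3 (field names and
signatures were still in flux when this thread was written, so treat the
details as assumptions):

#include "postgres.h"
#include "miscadmin.h"
#include "postmaster/bgworker.h"

PG_MODULE_MAGIC;

void _PG_init(void);

/* A do-nothing extra daemon: bgw_flags = 0 requests no shared-memory
 * access, the case being debated here. */
static void
extra_daemon_main(void *main_arg)
{
    /* Unblock signals so the postmaster can stop us. */
    BackgroundWorkerUnblockSignals();

    for (;;)
    {
        /* a real daemon would do its work (and handle SIGTERM) here */
        pg_usleep(1000000L);    /* 1 second */
    }
}

void
_PG_init(void)
{
    BackgroundWorker worker;

    MemSet(&worker, 0, sizeof(worker));
    worker.bgw_name = "minimal extra daemon";
    worker.bgw_flags = 0;
    worker.bgw_start_time = BgWorkerStart_PostmasterStart;
    worker.bgw_restart_time = BGW_NEVER_RESTART;
    worker.bgw_main = extra_daemon_main;
    RegisterBackgroundWorker(&worker);
}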

> Maybe it would be easier to have a version of GNU mcron as an extension,
> with the ability to fire PostgreSQL stored procedures directly?  (That
> way the cron-specific parts of the logic are already implemented.)
> 
>   http://www.gnu.org/software/mcron/

Again, that's something that would eventually require a database
connection. Or at least access to Postgres shared memory to eventually
start the required background jobs. (Something Alvaro's patch doesn't
implement, yet. But which exactly matches what the coordinator and
bgworkers in Postgres-R do.)

For ordinary extra daemons, I'm worried about things like an OOM killer
deciding to kill the postmaster because it is their parent. Or
(io)niceness settings. Or Linux cgroups. IMO all of these things just
get harder to administer for extra daemons when they move under the hat
of the postmaster.

Without access to shared memory, the only thing an extra daemon would
gain by moving under postmaster is the "knowledge" of the start, stop
and restart (crash) events of the database. And that in a very
inflexible way: the extra process is forced to start, stop and restart
together with the database to "learn" about these events.

Using a normal client connection arguably is a better way to learn about
crash events - it doesn't have the above limitation. And the start and
stop events, well, the DBA or sysadmin is in control of these
already. We can possibly improve on that, yes. But extra daemons are not
the way to go, IMO.

> Another idea would be to have a pgbouncer extension. We would still need
> of course to have pgbouncer as a separate component so that client
> connections can outlive a postmaster crash, but that would still be very
> useful as a first step into admission control. Let's not talk about the
> feedback loop and per-cluster resource usage monitoring yet, but I guess
> that you can see the drift.

Sorry, I don't. Especially not with something like pgbouncer, because I
definitely *want* to control and administer that separately.

So I guess this is a vote to disallow `worker.bgw_flags = 0`, on the
basis that a process which doesn't need to access Postgres shared
memory had better do whatever it does outside of Postgres, for the
benefit of the stability of Postgres and for ease of administration of
the two.

Or, maybe, rather drop the BGWORKER_SHMEM_ACCESS flag and imply that
every registered process wants to have access to Postgres shared memory.
Then document the gotchas and requirements of living under the
postmaster, as I've requested before. (If you want a foot gun, you can
still write an extension that doesn't need to access Postgres shared
memory, but hey, we can't help those who desperately try to shoot
themselves in the foot.)

Regards

Markus Wanner




Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Alvaro Herrera
Andres Freund wrote:
> On 2012-11-30 08:52:18 +0530, Pavan Deolasee wrote:
> > On Fri, Nov 30, 2012 at 3:19 AM, Andres Freund 
> > wrote:

> > > I can't believe this is safe? Just about anything but eviction could
> > > happen to that buffer at that moment?

Yeah, it does seem fishy to me as well.

> Even if it turns out to be safe, I think this deserves at least a
> comment...

No doubt that trigger.c is hugely underdocumented in some places.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Pavan Deolasee
On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund wrote:

> > That seems to be safe to me. Anything that's been read above can't really
> > change. The tuple is already locked, so a concurrent update/delete has to
> > wait on us. We have a pin on the buffer, so VACUUM or HOT-prune can't
> > happen either. I can't see any other operation that can really change those
> > fields.
>
> We only get the pin right there, I don't see any preexisting pin. Which
> means we might have just opened a page that's in the process of being
> pruned/vacuumed by another backend.
>

Hmm. Yeah, you're right. That is a possible risky scenario. Even though
the cleanup lock waits for all pins to go away, it will work only if every
reader takes at least a SHARE lock, unless it was continuously holding a
pin on the buffer (in which case it's OK to drop the lock and read a tuple
body without reacquiring it). Otherwise, as you rightly pointed out, we
could actually be reading a page which is being actively cleaned up, with
tuples being moved around.


> I think a concurrent heap_(insert|update)/PageAddItem might actually be
> already dangerous because it might move line pointers around
>
>
I don't think we ever move the line pointers around, because the indexes
will be pointing to them. But vacuum/prune is dangerous enough to require
a SHARE lock here in any case.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Dimitri Fontaine
Markus Wanner  writes:
> AFAICS pgqd currently uses libpq, so I think it would rather turn into
> what I call a background worker, with a connection to Postgres shared
> memory. I can perfectly well see use cases (plural!) for those.
>
> What I'm questioning is the use for what I rather call "extra daemons",
> that is, additional processes started by the postmaster without a
> connection to Postgres shared memory (and thus without a database
> connection).

I totally missed the need to connect to shared memory to be able to
connect to a database and query it. Can't we just link the bgworker
code to the client libpq library, just as plproxy does, I believe?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Simon Riggs
On 30 November 2012 11:58, Andres Freund  wrote:

> We only get the pin right there, I don't see any preexisting pin.

Seems easy enough to test with an Assert patch.

If the Assert doesn't fail, we apply it as "documentation" of the
requirement for a pin.

If it fails, we fix the bug.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Andres Freund
On 2012-11-30 18:09:49 +0530, Pavan Deolasee wrote:
> On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund wrote:
>
> > > That seems to be safe to me. Anything that's been read above can't really
> > > change. The tuple is already locked, so a concurrent update/delete has to
> > > wait on us. We have a pin on the buffer, so VACUUM or HOT-prune can't
> > > happen either. I can't see any other operation that can really change those
> > > fields.
> >
> > We only get the pin right there, I don't see any preexisting pin. Which
> > means we might have just opened a page that's in the process of being
> > pruned/vacuumed by another backend.
>
> Hmm. Yeah, you're right. That is a possible risky scenario. Even though
> the cleanup lock waits for all pins to go away, it will work only if every
> reader takes at least a SHARE lock, unless it was continuously holding a
> pin on the buffer (in which case it's OK to drop the lock and read a tuple
> body without reacquiring it). Otherwise, as you rightly pointed out, we
> could actually be reading a page which is being actively cleaned up, with
> tuples being moved around.

Well, live (or recently dead) tuples can't be just moved around unless
I miss something. That would cause problems with HeapTuples
directly pointing into the page as you've noticed.

> > I think a concurrent heap_(insert|update)/PageAddItem might actually be
> > already dangerous because it might move line pointers around
> >
> >
> I don't think we ever move the line pointers around, because the indexes
> will be pointing to them.

The indexes point to a tid, not to a line pointer. So reshuffling the
line pointer array - while keeping the tids valid - is entirely
fine. Right?
Also, PageAddItem does that all the time, so I think we would have
noticed if not ;)

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Alvaro Herrera
Dimitri Fontaine wrote:
> Markus Wanner  writes:
> > AFAICS pgqd currently uses libpq, so I think it would rather turn into
> > what I call a background worker, with a connection to Postgres shared
> > memory. I can perfectly well see use cases (plural!) for those.
> >
> > What I'm questioning is the use for what I rather call "extra daemons",
> > that is, additional processes started by the postmaster without a
> > connection to Postgres shared memory (and thus without a database
> > connection).
> 
> I totally missed the need to connect to shared memory to be able to
> connect to a database and query it. Can't we just link the bgworker
> code to the client libpq library, just as plproxy does, I believe?

One of the uses for bgworkers that don't have shmem connection is to
have them use libpq connections instead.  I don't really see the point
of forcing everyone to use backend connections when libpq connections
are enough.  In particular, they are easier to port from existing code;
and they make it easier to share code with systems that still have to
support older PG versions.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Andres Freund
On 2012-11-30 12:50:06 +, Simon Riggs wrote:
> On 30 November 2012 11:58, Andres Freund  wrote:
>
> > We only get the pin right there, I don't see any preexisting pin.
>
> Seems easy enough to test with an Assert patch.
>
> If the Assert doesn't fail, we apply it as "documentation" of the
> requirement for a pin.
>
> If it fails, we fix the bug.

I think it's wrong even if we were holding a pin all the time, due to the
aforementioned PageAddItem reshuffling of line pointers. So that Assert
wouldn't be proof enough.

I can try to prove corruption there, but I would rather see somebody
come along and tell me why it's safe and that I am dumb for not
realizing it.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Andres Freund
On 2012-11-30 09:57:20 -0300, Alvaro Herrera wrote:
> Dimitri Fontaine wrote:
> > Markus Wanner  writes:
> > > AFAICS pgqd currently uses libpq, so I think it would rather turn into
> > > what I call a background worker, with a connection to Postgres shared
> > > memory. I can perfectly well see use cases (plural!) for those.
> > >
> > > What I'm questioning is the use for what I rather call "extra daemons",
> > > that is, additional processes started by the postmaster without a
> > > connection to Postgres shared memory (and thus without a database
> > > connection).
> >
> > I totally missed the need to connect to shared memory to be able to
> > connect to a database and query it. Can't we just link the bgworker
> > code to the client libpq library, just as plproxy does, I believe?
>
> One of the uses for bgworkers that don't have shmem connection is to
> have them use libpq connections instead.  I don't really see the point
> of forcing everyone to use backend connections when libpq connections
> are enough.  In particular, they are easier to port from existing code;
> and they make it easier to share code with systems that still have to
> support older PG versions.

They also can get away with a lot more crazy stuff without corrupting
the database. You had better know what you're doing before doing
anything with direct shared memory access.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Dimitri Fontaine
Andres Freund  writes:
>> One of the uses for bgworkers that don't have shmem connection is to
>> have them use libpq connections instead.  I don't really see the point
>> of forcing everyone to use backend connections when libpq connections
>> are enough.  In particular, they are easier to port from existing code;
>> and they make it easier to share code with systems that still have to
>> support older PG versions.

Exactly. I think most bgworkers would just use libpq if that's available;
using a backend's infrastructure is not that good a fit here. I mean,
connect from your worker to a database using libpq and call a backend
function (provided by the same extension, I guess) from there.

That's how I think pgqd would get integrated into the worker
infrastructure, right?

> They also can get away with a lot more crazy stuff without corrupting
> the database. You better know something about what youre doing before
> doing something with direct shared memory access.

And there's a whole lot you can already do with just a C-coded stored
procedure.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Pavan Deolasee
On Fri, Nov 30, 2012 at 6:21 PM, Andres Freund wrote:

> On 2012-11-30 18:09:49 +0530, Pavan Deolasee wrote:
> > On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund wrote:
> >
> > > > That seems to be safe to me. Anything that's been read above can't really
> > > > change. The tuple is already locked, so a concurrent update/delete has to
> > > > wait on us. We have a pin on the buffer, so VACUUM or HOT-prune can't
> > > > happen either. I can't see any other operation that can really change those
> > > > fields.
> > >
> > > We only get the pin right there, I don't see any preexisting pin. Which
> > > means we might have just opened a page that's in the process of being
> > > pruned/vacuumed by another backend.
> >
> > Hmm. Yeah, you're right. That is a possible risky scenario. Even though
> > the cleanup lock waits for all pins to go away, it will work only if every
> > reader takes at least a SHARE lock, unless it was continuously holding a
> > pin on the buffer (in which case it's OK to drop the lock and read a tuple
> > body without reacquiring it). Otherwise, as you rightly pointed out, we
> > could actually be reading a page which is being actively cleaned up, with
> > tuples being moved around.
>
> Well, live (or recently dead) tuples can't be just moved around unless
> I miss something. That would cause problems with HeapTuples
> directly pointing into the page as you've noticed.
>
>
I think we are confusing terminology. The tuple headers and bodies
(live or dead) can be moved around freely as long as you have a cleanup
lock on the page. That's how HOT-prune frees up fragmented space.


> > > I think a concurrent heap_(insert|update)/PageAddItem might actually be
> > > already dangerous because it might move line pointers around
> > >
> > >
> > I don't think we ever move the line pointers around, because the indexes
> > will be pointing to them.
>
> The indexes point to a tid, not to a line pointer. So reshuffling the
> line pointer array - while keeping the tids valid - is entirely
> fine. Right?
>

I think it's terminology confusion again :-) I thought TIDs and line
pointers were the same thing and used the terms interchangeably, or at
least I've done so for a long time. But I think you are reading "line
pointer" as the thing stored in the ItemId structure, and not the ItemId
itself.

Also, PageAddItem() doesn't normally move things around. It does that
only during recovery, at least for heap pages. Elsewhere it will either
reuse an UNUSED pointer or add at the end. Isn't that how it is?

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] [PATCH] Patch to fix a crash of psql

2012-11-30 Thread Peter Eisentraut
On 11/30/12 3:26 AM, Albe Laurenz wrote:
> Tatsuo Ishii wrote:
>> postgres=# \i ~/sql
>> CREATE DATABASE
>> You are now connected to database "mydb" as user "t-ishii".
>> CREATE SCHEMA
>> psql:/home/t-ishii/sql:7: warning: possible conflict between client encoding SJIS and input file encoding
>> CREATE TABLE
>>
>> Comments?
> 
> I wonder about the "possible".
> 
> Could there be false positives with the test?
> If yes, I don't like the idea.
> If there is no possibility of false positives, I'd say
> that the "possible" should go.  Maybe it should even be
> an error rather than a warning.

Yes, encoding mismatches are generally an error.

I think the message should be more precise.  Nobody will know what an
"encoding conflict" is.  The error condition is "last multibyte
character ran over end of file" or something like that, which should be
clearer.




Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Pavan Deolasee
On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund wrote:

> > heap_fetch(), though it looks very similar, has an important difference. It
> > might be reading the tuple without a lock on it, and we need the buffer lock
> > to protect against concurrent update/delete to the tuple. AFAIK once the
> > tuple has passed through qualification checks etc., even callers of
> > heap_fetch() can read the tuple data without any lock on the buffer itself.
>
> Sure, but the page header isn't accessed there, just the tuple data
> itself which always stays at the same place on the page.
> (A normal heap_fetch shouldn't be worried about updates/deletions to its
> tuple, MVCC to the rescue.)
>
>
heap_fetch() does read the tuple header, though, to apply snapshot
visibility rules. So a SHARE lock is a must for that purpose, because a
concurrent update/delete may set XMAX or other visibility-related
fields. On the other hand, you can read the tuple body without a page
lock if you have been holding a pin on the buffer continuously since you
last dropped the lock.

In any case, a lock is needed in GetTupleForTrigger unless someone can
argue that a pin has been continuously held on the buffer since the lock
was last dropped, something I don't see happening in this case.
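
In code form, the fix being argued for amounts to something like this (a
sketch against the GetTupleForTrigger excerpt quoted earlier in the
thread, not a committed patch):

    buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(tid));

    /* Take a share lock before examining the page, and hold it until
     * the tuple has been copied out. */
    LockBuffer(buffer, BUFFER_LOCK_SHARE);

    page = BufferGetPage(buffer);
    lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));
    Assert(ItemIdIsNormal(lp));

    tuple.t_data = (HeapTupleHeader) PageGetItem(page, lp);
    tuple.t_len = ItemIdGetLength(lp);
    tuple.t_self = *tid;
    tuple.t_tableOid = RelationGetRelid(relation);

    result = heap_copytuple(&tuple);

    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
    ReleaseBuffer(buffer);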

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Kohei KaiGai
2012/11/30 Dimitri Fontaine :
> Andres Freund  writes:
>>> One of the uses for bgworkers that don't have shmem connection is to
>>> have them use libpq connections instead.  I don't really see the point
>>> of forcing everyone to use backend connections when libpq connections
>>> are enough.  In particular, they are easier to port from existing code;
>>> and they make it easier to share code with systems that still have to
>>> support older PG versions.
>
> Exactly. I think most bgworkers would just use libpq if that's available;
> using a backend's infrastructure is not that good a fit here. I mean,
> connect from your worker to a database using libpq and call a backend
> function (provided by the same extension, I guess) from there.
>
> That's how I think pgqd would get integrated into the worker
> infrastructure, right?
>
One thing we have to pay attention to is that the backend code cannot
distinguish a libpq connection from a bgworker from other regular
connections, from an access-control perspective.
Even if we implement the major portion as C functions, do we have a
reliable way to prohibit those C functions from being invoked by a
user's query?

I also plan to use bgworkers to load data chunks from shared buffers to
a GPU device in parallel; that shall be performed under the regular
access-control machinery.

I think using libpq is a good "option" for 95% of development; however,
it should also be possible to use the SPI interface for corner-case
usage.

Thanks,
-- 
KaiGai Kohei 




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Markus Wanner
On 11/30/2012 01:40 PM, Dimitri Fontaine wrote:
> I totally missed the need to connect to shared memory to be able to
> connect to a database and query it. Can't we just link the bgworker
> code to the client libpq library, just as plproxy does, I believe?

Well, sure you can use libpq. But so can external processes. So that's
no benefit of running under the postmaster.

Regards

Markus Wanner




[HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Andres Freund
On 2012-11-30 18:35:05 +0530, Pavan Deolasee wrote:
> On Fri, Nov 30, 2012 at 6:21 PM, Andres Freund wrote:
>
> > On 2012-11-30 18:09:49 +0530, Pavan Deolasee wrote:
> > > On Fri, Nov 30, 2012 at 5:28 PM, Andres Freund wrote:
> > >
> > > > > That seems to be safe to me. Anything that's been read above can't really
> > > > > change. The tuple is already locked, so a concurrent update/delete has to
> > > > > wait on us. We have a pin on the buffer, so VACUUM or HOT-prune can't
> > > > > happen either. I can't see any other operation that can really change those
> > > > > fields.
> > > >
> > > > We only get the pin right there, I don't see any preexisting pin. Which
> > > > means we might have just opened a page that's in the process of being
> > > > pruned/vacuumed by another backend.
> > >
> > > Hmm. Yeah, you're right. That is a possible risky scenario. Even though
> > > the cleanup lock waits for all pins to go away, it will work only if every
> > > reader takes at least a SHARE lock, unless it was continuously holding a
> > > pin on the buffer (in which case it's OK to drop the lock and read a tuple
> > > body without reacquiring it). Otherwise, as you rightly pointed out, we
> > > could actually be reading a page which is being actively cleaned up, with
> > > tuples being moved around.
> >
> > Well, live (or recently dead) tuples can't be just moved around unless
> > I miss something. That would cause problems with HeapTuples
> > directly pointing into the page as you've noticed.

> I think we are confusing terminology. The tuple headers and bodies
> (live or dead) can be moved around freely as long as you have a cleanup
> lock on the page. That's how HOT-prune frees up fragmented space.

I need to read more code here; this is nothing I have really looked into
before. You're probably right.

Up to now I thought HOT pruning only removed intermediate and surely-dead
tuples but didn't move stuff around, and that page fragmentation was
repaired separately. But it looks like I am wrong.

> > > > I think a concurrent heap_(insert|update)/PageAddItem might actually be
> > > > already dangerous because it might move line pointers around
> > > >
> > > >
> > > I don't think we ever move the line pointers around, because the indexes
> > > will be pointing to them.
> >
> > The indexes point to a tid, not to a line pointer. So reshuffling the
> > line pointer array - while keeping the tids valid - is entirely
> > fine. Right?

> I think it's terminology confusion again :-) I thought TIDs and line
> pointers were the same thing and used the terms interchangeably, or at
> least I've done so for a long time. But I think you are reading "line
> pointer" as the thing stored in the ItemId structure, and not the ItemId
> itself.

I always understood ItemIds to be line pointers and ItemPointers to be
tids. Line pointers are only meaningful within a single page, store the
actual offset within the page, and so on. ItemPointers (tids) are
longer-lived and store (page, offset_number)…
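
For reference, the two structures being distinguished, condensed from
PostgreSQL's itemid.h and itemptr.h:

/* Line pointer (ItemId): lives in a page's item array and holds the
 * tuple's current byte offset within that same page. */
typedef struct ItemIdData
{
    unsigned    lp_off:15,      /* offset to tuple (from start of page) */
                lp_flags:2,     /* unused/normal/redirect/dead */
                lp_len:15;      /* byte length of tuple */
} ItemIdData;

/* Tid (ItemPointer): names an item as (block, offset number).  This is
 * what indexes store; it stays valid as long as the item keeps its
 * offset number, wherever the tuple body moves within the page. */
typedef struct ItemPointerData
{
    BlockIdData     ip_blkid;
    OffsetNumber    ip_posid;
} ItemPointerData;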

> Also, PageAddItem() doesn't normally move things around. It does that only
> during recovery, at least for heap pages. Elsewhere it will either reuse an
> UNUSED pointer or add at the end. Isn't that how it is?

Hm? It doesn't move the page contents around but it moves the ItemId
array during completely normal operation (c.f. needshuffle in
PageAddItem). Or am I missing something?

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Markus Wanner
On 11/30/2012 01:59 PM, Andres Freund wrote:
> On 2012-11-30 09:57:20 -0300, Alvaro Herrera wrote:
>> One of the uses for bgworkers that don't have shmem connection is to
>> have them use libpq connections instead.  I don't really see the point
>> of forcing everyone to use backend connections when libpq connections
>> are enough.

Requiring a libpq connection is a good indication for *not* wanting the
process to run under the postmaster, IMO.

>> In particular, they are easier to port from existing code;
>> and they make it easier to share code with systems that still have to
>> support older PG versions.
> 
> They also can get away with a lot more crazy stuff without corrupting
> the database.

Exactly. That's a good reason to *not* tie that to the postmaster, then.
Please keep as much of the potentially dangerous stuff separate (and
advise developers to do so as well, instead of offering them a foot
gun), so that our postmaster can do its job, and do it reliably, without
trying to be a general-purpose start/stop daemon. There are better and
well-established tools for that.

Regards

Markus Wanner




Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Pavan Deolasee
On Fri, Nov 30, 2012 at 6:55 PM, Andres Freund wrote:

>
>
> Hm? It doesn't move the page contents around but it moves the ItemId
> array during completely normal operation (c.f. needshuffle in
> PageAddItem). Or am I missing something?
>
>
I think that is probably only used for non-heap pages. For heap pages,
it just doesn't make sense to shuffle the ItemId array; that would
defeat the entire purpose of having it in the first place.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Dimitri Fontaine
Kohei KaiGai  writes:
> One thing we have to pay attention to is that the backend code cannot
> distinguish a libpq connection from a bgworker from other regular
> connections, from an access-control perspective.
> Even if we implement the major portion as C functions, do we have a
> reliable way to prohibit those C functions from being invoked by a
> user's query?

Why would you want to do that? And maybe the way to enforce that would
be by having your extension do its connecting using SPI to be able to
place things in known pieces of memory for the function to check?

> I also plan to use bgworkers to load data chunks from shared buffers to
> a GPU device in parallel; that shall be performed under the regular
> access-control machinery.

That sounds like a job where you need shared memory access but maybe not
a database connection?

> I think using libpq is a good "option" for 95% of development; however,
> it should also be possible to use the SPI interface for corner-case
> usage.

+1, totally agreed.

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Alvaro Herrera
Markus Wanner wrote:
> On 11/30/2012 01:40 PM, Dimitri Fontaine wrote:
> > I totally missed the need to connect to shared memory to be able to
> > connect to a database and query it. Can't we just link the bgworker
> > code to the client libpq library, just as plproxy does, I believe?
> 
> Well, sure you can use libpq. But so can external processes. So that's
> no benefit of running under the postmaster.

No, it's not a benefit of that; but such a process would get started up
when the postmaster is started, and shut down when the postmaster stops.
So it makes it easier to have processes that need to run alongside the
postmaster.

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Kohei KaiGai
2012/11/30 Dimitri Fontaine :
> Kohei KaiGai  writes:
>> One thing we have to pay attention to is that the backend code cannot
>> distinguish a libpq connection from a bgworker from other regular
>> connections, from an access-control perspective.
>> Even if we implement the major portion as C functions, do we have a
>> reliable way to prohibit those C functions from being invoked by a
>> user's query?
>
> Why would you want to do that? And maybe the way to enforce that would
> be by having your extension do its connecting using SPI to be able to
> place things in known pieces of memory for the function to check?
>
As long as SPI is an option for bgworkers, I have nothing to argue about
here. If the framework forced an extension running as a background worker
to use libpq for its database connection, certain kinds of work tied to
internal machinery would have to be implemented as C functions. That is
what I was worried about.

>> I also plan to use bgworkers to load data chunks from shared buffers
>> to a GPU device in parallel; that shall be performed under the regular
>> access-control machinery.
>
> That sounds like a job where you need shared memory access but maybe not
> a database connection?
>
Both of them are needed in this case. This "I/O worker" will act on
requests from regular backend processes, fetching tuples from the heap
into a DMA buffer residing in shared memory.

>> I think using libpq is a good "option" for 95% of development; however,
>> it should also be possible to use the SPI interface for corner-case
>> usage.
>
> +1, totally agreed.
>
Thanks,
-- 
KaiGai Kohei 




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Markus Wanner
Alvaro,

On 11/30/2012 02:44 PM, Alvaro Herrera wrote:
> So it makes it easier to have processes that need to run alongside the
> postmaster.

That's where we are in respectful disagreement, then. I don't think
that's easier overall; to my eye, this looks like a foot gun.

As long as things like pgbouncer, pgqd, etc.. keep to be available as
separate daemons, I'm happy, though.

Regards

Markus Wanner




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Kohei KaiGai
2012/11/30 Markus Wanner :
> Alvaro,
>
> On 11/30/2012 02:44 PM, Alvaro Herrera wrote:
>> So it makes it easier to have processes that need to run alongside the
>> postmaster.
>
> That's where we are in respectful disagreement, then. As I don't think
> that's easier, overall, but in my eye, this looks like a foot gun.
>
> As long as things like pgbouncer, pgqd, etc.. keep to be available as
> separate daemons, I'm happy, though.
>
This feature does not force them to be implemented with this new
framework. If they can run as separate daemons, that is fine.

But those are not all the cases; there are cases where we do want
background workers tied to the postmaster's lifetime.
-- 
KaiGai Kohei 




Re: [HACKERS] WIP: index support for regexp search

2012-11-30 Thread Heikki Linnakangas

On 30.11.2012 13:20, Alexander Korotkov wrote:

> On Thu, Nov 29, 2012 at 5:25 PM, Heikki Linnakangas wrote:
>
>> Would it be safe to simply stop short the depth-first search on
>> overflow, and proceed with the graph that was constructed up to that
>> point?
>
> For depth-first search it's not. But your proposal naturally makes
> sense. I've changed it to breadth-first search, and then it's safe to
> mark all unprocessed states as final on overflow. It means that we
> assume every unprocessed branch immediately finishes with a match (this
> can give us more false positives but no false negatives).
> For overflow of the matrix collection, it's safe to just OR all the
> trigrams together.
> A new version of the patch is attached.


Thanks, sounds good.

I've spent quite a long time trying to understand the transformation the
getState/addKeys/addArcs functions do to the original CNFA graph. I
think that still needs more comments to explain the steps involved.


One thing that occurs to me is that it might be better to delay 
expanding colors to characters. You could build and transform the graph, 
and even create the paths, all while operating on colors. You would end 
up with lists of "color trigrams", consisting of sequences of three 
colors that must appear in the source string. Only at the last step you 
would expand the color trigrams to real character trigrams. I think that 
would save a lot of processing while building the graph, if you have 
colors that contain many characters. It would allow us to do better than 
the fixed small MAX_COLOR_CHARS limit. For example with MAX_COLOR_CHARS 
= 4 in the current patch, it's a shame that we can't do anything with a 
fairly simple regexp like "^a[b-g]h$".
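
(To spell that example out: the color [b-g] covers six characters, more
than MAX_COLOR_CHARS = 4, so trigram extraction currently gives up;
operating on colors instead, the single color trigram a·[b-g]·h would
only be expanded at the last step, into the six character trigrams abh,
ach, adh, aeh, afh and agh.)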


- Heikki




Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Markus Wanner
On 11/30/2012 03:16 PM, Kohei KaiGai wrote:
> This feature does not force them to be implemented with this new framework.
> If they can run as separate daemons, that is fine.

I'm not clear on what exactly you envision, but if a process needs
access to shared buffers, it sounds like it should be a bgworker. I
don't quite understand why that process also wants a libpq connection,
but that's certainly doable.

> But not all use cases call for background workers tied to the
> postmaster's lifetime.

Not wanting a process to be tied to the postmaster's lifetime is a strong
indication that it had better not be a bgworker, I think.

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Andres Freund
On 2012-11-30 13:57:46 +0100, Andres Freund wrote:
> On 2012-11-30 12:50:06 +, Simon Riggs wrote:
> > On 30 November 2012 11:58, Andres Freund  wrote:
> >
> > > We only get the pin right there, I don't see any preexisting pin.
> >
> > Seems easy enough to test with an Assert patch.
> >
> > If the Assert doesn't fail, we apply it as "documentation" of the
> > requirement for a pin.
> >
> > If it fails, we fix the bug.
> 
I think it's wrong even if we were holding a pin all the time, due to the
aforementioned PageAddItem reshuffling of line pointers. So that Assert
wouldn't be proof enough.

But a failing Assert obviously proves something. Stupid me. So here we
go:

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 98b8207..3b61d06 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -2662,6 +2662,16 @@ ltrmark:;
 
buffer = ReadBuffer(relation, ItemPointerGetBlockNumber(tid));
 
+#ifdef USE_ASSERT_CHECKING
+   if (!BufferIsLocal(buffer))
+   {
+   /* Only pinned by the above ReadBuffer */
+   if (PrivateRefCount[buffer - 1] <= 1)
+   elog(ERROR, "too low local pin count: %d",
+PrivateRefCount[buffer - 1]);
+   }
+#endif
+
page = BufferGetPage(buffer);
lp = PageGetItemId(page, ItemPointerGetOffsetNumber(tid));


CREATE OR REPLACE FUNCTION raise_notice_id()
RETURNS trigger
LANGUAGE plpgsql AS $body$
BEGIN
RAISE NOTICE 'id: %', OLD.id;
RETURN NULL;
END
$body$;

postgres=#
CREATE TABLE crashdummy(id serial primary key, data int);

postgres=#
CREATE TRIGGER crashdummy_after_delete
AFTER DELETE ON crashdummy
FOR EACH ROW EXECUTE PROCEDURE raise_notice_id();

postgres=#
INSERT INTO crashdummy(data) SELECT * FROM generate_series(1, 1000);

postgres=# 
DELETE FROM crashdummy WHERE ctid IN (SELECT ctid FROM crashdummy WHERE data < 
1000);
ERROR:  too low local pin count: 1
Time: 4.515 ms

A plain DELETE without the subselect doesn't trigger the problem though,
that's probably the reason we haven't seen any problems so far.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json accessors

2012-11-30 Thread Andrew Dunstan


On 11/29/2012 06:34 PM, Merlin Moncure wrote:

On Thu, Nov 29, 2012 at 4:14 PM, Andrew Dunstan  wrote:

There are many things wrong with this. First, converting to hstore so you
can call populate_record is quite horrible and ugly and inefficient. And
it's dependent on having hstore loaded - you can't have an hstore_to_json in
core because hstore itself isn't in core. If you want a populate_record that
takes data from json we should have one coded direct. I'm happy to add it to
the list as long as everyone understands the limitations. Given a function
to unnest the json array, which I already suggested upthread, you could do
what you suggested above much more elegantly and directly.

I wasn't suggesting you added the hstore stuff and I understand
perfectly well the awkwardness of the hstore route.  That said, this
is how people are going to use your api so it doesn't hurt to go
through the motions; I'm just feeling out how code in the wild would
shape up.

Anyways, my example was busted since you'd need an extra step to move
the set returning output from the json array unnest() into a
'populate_record' type function call.

So, AIUI I think you're proposing (I'm assuming optional quotes)
following my example above:

INSERT INTO foo(a,b)
SELECT
   json_get_as_text(v, 'a')::int,
   json_get_as_text(v, 'b')::int
FROM
   json_each() v;  /* gives you array of json (a,b) records  */

a hypothetical 'json_to_record (cribbing usage from populate_record)'
variant might look like (please note, I'm not saying 'write this now',
just feeling it out):

INSERT INTO foo(a,b)
SELECT  r.*
FROM
   json_each() v,
LATERAL
   json_to_record(null::foo, v) r;

you're right: that's pretty clean.

A json_object_each(json) => key, value couldn't hurt either -- this
would handle those oddball cases of really wide objects that you
occasionally see in json.  Plus as_text variants of both each and
object_each.  If you're buying json_object_each, I think you can scrap
json_object_keys().




OK, so based on this discussion, I'm thinking of the following:

 * keep the original functions and operators. json_keys is still
   required for the case where the json is not flat.
 * json_each(json) => setof (text, text)
   errors if the json is not a flat object
 * json_unnest(json) => setof json
   errors if the json is not an array
 * json_unnest_each => setof (int, text, text)
   errors if the array is not an array of flat objects
 * populate_record(record, json) => record
   errors if the json isn't a flat object
 * populate_recordset(record, json) => setof record
   errors if the json is not an array of flat objects

Note that I've added a couple of things to deal with json that 
represents a recordset (i.e. an array of objects). This is a very common 
pattern and one well worth optimizing for.


I think that would let you do a lot of what you want pretty cleanly.
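
For illustration, loading an array of flat objects straight into a table
might then look like this (a sketch only -- these functions are proposed
above, not implemented yet, and names/signatures could still change):

INSERT INTO foo(a,b)
SELECT r.a, r.b
FROM populate_recordset(null::foo,
                        '[{"a":1,"b":2},{"a":3,"b":4}]'::json) r;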

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json accessors

2012-11-30 Thread Merlin Moncure
On Fri, Nov 30, 2012 at 8:38 AM, Andrew Dunstan  wrote:
> OK, so based on this discussion, I'm thinking of the following:

ok, this is looking awesome -- couple naming suggestions (see inline):

>  * keep the original functions and operators. json_keys is still
>required for the case where the json is not flat.
>  * json_each(json) => setof (text, text)
>errors if the json is not a flat object
>  * json_unnest(json) => setof json
>errors if the json is not an array

I wonder if usage of 'unnest' is appropriate: SQL unnest()
*completely* unwraps the array to a list of scalars, whereas json
unnest() only peels off one level.  If you agree with that (it's
debatable), how about json_array_each()?
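
To illustrate with a sketch (the array behavior below is real SQL; the
json behavior is only the proposed one):

-- SQL unnest() flattens a multidimensional array completely:
SELECT unnest(ARRAY[ARRAY[1,2],ARRAY[3,4]]);   -- four rows: 1, 2, 3, 4
-- the proposed json unnest would peel off only one level:
-- json_unnest('[[1,2],[3,4]]')  => '[1,2]', '[3,4]'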

>  * json_unnest_each => setof (int, text, text)
>errors if the array is not an array of flat objects

I like this.  Maybe json_object_each() if you agree with my analysis above.

>  * populate_record(record, json) => record
>errors if the json isn't a flat object
>  * populate_recordset(record, json) => setof record
>errors if the json is not an array of flat objects

Two questions:
1) is it possible for these to work without a polymorphic object
passed through as hstore does (null::foo)?
select  populate_record(anyelement, record, json)

2) in keeping with the naming style of the json API, how about
json_to_record, json_to_recordset?
Though maybe keeping similarity with the hstore convention is more important.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Kohei KaiGai
2012/11/30 Markus Wanner :
> On 11/30/2012 03:16 PM, Kohei KaiGai wrote:
>> This feature does not force them to be implemented with this new framework.
>> If they can run as separate daemons, that is fine.
>
> I'm not clear on what exactly you envision, but if a process needs
> access to shared buffers, it sounds like it should be a bgworker. I
> don't quite understand why that process also wants a libpq connection,
> but that's certainly doable.
>
It seemed to me you were advocating that any use case for a background
worker can be implemented with the existing separate-daemon approach.
What I wanted to say is that there are cases where the background-worker
framework allows such extensions to be implemented with a more
reasonable design.

Yes, one reason I want to use a background worker is access to the
shared-memory segment. Also, it may want to connect to databases outside
the usual access controls, as autovacuum does. That is a reason why I'm
saying the SPI interface should be an option, not libpq alone.
(If an extension chooses libpq, it does not need anything special, does it?)

>> But not all use cases call for background workers tied to the
>> postmaster's lifetime.
>
> Not wanting a process to be tied to the postmaster's lifetime is a strong
> indication that it had better not be a bgworker, I think.
>
It also probably means that when a process's lifetime should be tied to
the postmaster's, its author can consider implementing it as a
background worker.

Thanks,
-- 
KaiGai Kohei 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json accessors

2012-11-30 Thread Andrew Dunstan


On 11/30/2012 09:51 AM, Merlin Moncure wrote:


Two questions:
1) is it possible for these to work without a polymorphic object
passed through as hstore does (null::foo)?
select  populate_record(anyelement, record, json)


I don't understand the question. The API I'm suggesting is exactly in 
line with hstore's, which uses a polymorphic parameter. I don't see how 
it can not, and I don't understand why you would have 3 parameters.
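
For reference, the hstore-style polymorphic call would be used like this
(a sketch of the proposed function, with "foo" as a stand-in table):

SELECT * FROM populate_record(null::foo, '{"a":1,"b":2}'::json);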


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json accessors

2012-11-30 Thread Merlin Moncure
On Fri, Nov 30, 2012 at 9:02 AM, Andrew Dunstan  wrote:
>
> On 11/30/2012 09:51 AM, Merlin Moncure wrote:
>>
>>
>> Two questions:
>> 1) is it possible for these to work without a polymorphic object
>> passed through as hstore does (null::foo)?
>> select  populate_record(anyelement, record, json)
>
>
> I don't understand the question. The API I'm suggesting is exactly in line
> with hstore's, which uses a polymorphic parameter. I don't see how it can
> not, and I don't understand why you would have 3 parameters.

my mistake: I misread the function as you write it.  it's good as is.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json accessors

2012-11-30 Thread Hannu Krosing

On 11/30/2012 03:38 PM, Andrew Dunstan wrote:


On 11/29/2012 06:34 PM, Merlin Moncure wrote:
On Thu, Nov 29, 2012 at 4:14 PM, Andrew Dunstan  
wrote:
There are many things wrong with this. First, converting to hstore 
so you
can call populate_record is quite horrible and ugly and inefficient. 
And
it's dependent on having hstore loaded - you can't have an 
hstore_to_json in
core because hstore itself isn't in core. If you want a 
populate_record that
takes data from json we should have one coded direct. I'm happy to 
add it to
the list as long as everyone understands the limitations. Given a 
function
to unnest the json array, which I already suggested upthread, you 
could do

what you suggested above much more elegantly and directly.

I wasn't suggesting you added the hstore stuff and I understand
perfectly well the awkwardness of the hstore route.  That said, this
is how people are going to use your api so it doesn't hurt to go
through the motions; I'm just feeling out how code in the wild would
shape up.

Anyways, my example was busted since you'd need an extra step to move
the set returning output from the json array unnest() into a
'populate_record' type function call.

So, AIUI I think you're proposing (I'm assuming optional quotes)
following my example above:

INSERT INTO foo(a,b)
SELECT
   json_get_as_text(v, 'a')::int,
   json_get_as_text(v, 'b')::int
FROM
   json_each() v;  /* gives you array of json (a,b) 
records  */


a hypothetical 'json_to_record (cribbing usage from populate_record)'
variant might look like (please note, I'm not saying 'write this now',
just feeling it out):

INSERT INTO foo(a,b)
SELECT  r.*
FROM
   json_each() v,
LATERAL
   json_to_record(null::foo, v) r;

you're right: that's pretty clean.

A json_object_each(json) => key, value couldn't hurt either -- this
would handle those oddball cases of really wide objects that you
occasionally see in json.  Plus as_text variants of both each and
object_each.  If you're buying json_object_each, I think you can scrap
json_object_keys().




OK, so based on this discussion, I'm thinking of the following:

 * keep the original functions and operators. json_keys is still
   required for the case where the json is not flat.
 * json_each(json) => setof (text, text)
   errors if the json is not a flat object

Why not json_each(json) => setof (text, json), with no erroring out?

If the json value represents text, it is easy to convert it to text on the
query side.
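
Something like this, say (hypothetical behavior of a (text, json) variant;
non-flat values simply come back as json):

SELECT * FROM json_each('{"a":"x","b":[1,2]}'::json);
 key | value
-----+-------
 a   | "x"
 b   | [1,2]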



 * json_unnest(json) => setof json
   errors if the json is not an array
 * json_unnest_each => setof (int, text, text)
   errors if the array is not an array of flat objects

json_unnest_each => setof (int, text, json)

 * populate_record(record, json) => record
   errors if the json isn't a flat object

errors if the values are not castable to the record's field types

nb! some non-flatness is castable, especially to json or hstore or record
types.
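
for example (hypothetical, assuming a table t with a json column j; the
nested object would land in the json field instead of raising an error):

-- CREATE TABLE t (id int, j json);
SELECT * FROM populate_record(null::t, '{"id":1,"j":{"nested":true}}'::json);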



 * populate_recordset(record, json) => setof record
   errors if the json is not an array of flat objects

ditto
Note that I've added a couple of things to deal with json that 
represents a recordset (i.e. an array of objects). This is a very 
common pattern and one well worth optimizing for.


I think that would let you do a lot of what you want pretty cleanly.

cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Extra Daemons / bgworker

2012-11-30 Thread Markus Wanner
On 11/30/2012 03:58 PM, Kohei KaiGai wrote:
> It seemed to me you were advocating that any use case for a background
> worker can be implemented with the existing separate-daemon approach.

That sounds like a misunderstanding. All I'm advocating is that only
3rd-party processes with a real need (like accessing shared memory)
should run under the postmaster.

> What I wanted to say is that there are cases where the background-worker
> framework allows such extensions to be implemented with a more
> reasonable design.

I absolutely agree with that. And I think I can safely claim to be the
first person to publish a patch that provides some kind of background
worker infrastructure for Postgres.

> Yes, one reason I want to use a background worker is access to the
> shared-memory segment. Also, it may want to connect to databases outside
> the usual access controls, as autovacuum does.

Yeah, that's the entire reason for background workers. For clarity and
differentiation, I'd add: .. without having a client connection. That's
what makes them *background* workers. (Not to be confused with the
frontend vs backend differentiation. They are background backends, if
you want).

> That is a reason why I'm saying the SPI interface should be an option,
> not libpq alone.

I'm extending that to say extensions had better *not* use libpq.
After all, they already have more direct access.

> It also probably means that when a process's lifetime should be tied to
> the postmaster's, its author can consider implementing it as a
> background worker.

I personally don't count that as a real need. There are better tools for
this job, while there clearly are dangers in (ab)using the postmaster to
do it.

Regards

Markus Wanner


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json accessors

2012-11-30 Thread Andrew Dunstan


On 11/30/2012 10:04 AM, Hannu Krosing wrote:



OK, so based on this discussion, I'm thinking of the following:

 * keep the original functions and operators. json_keys is still
   required for the case where the json is not flat.
 * json_each(json) => setof (text, text)
   errors if the json is not a flat object


Why not json_each(json) => setof (text, json), with no erroring out?

If the json value represents text, it is easy to convert it to text on the
query side.



Well, it would be possible, sure. I'm not sure how useful. Or we could 
do both fairly easily. It's not as simple or efficient as you might 
think to dequote / de-escape json string values, which is why the 
original API had variants for returning both types of values. Maybe we 
need a function for doing just that.





 * json_unnest(json) => setof json
   errors if the json is not an array
 * json_unnest_each => setof (int, text, text)
   errors if the array is not an array of flat objects

json_unnest_each => setof (int, text, json)


ditto.


 * populate_record(record, json) => record
   errors if the json isn't a flat object

errors if the values are not castable to the record's field types

nb! some non-flatness is castable, especially to json or hstore or
record types.



If the record has a json field, certainly. If it has a record field, 
fairly likely.  hstore could probably be a problem given it's not a core 
type. Similarly to the generation functions discussed in another thread, 
I could possibly look up a cast from json to the non-core type and use 
it. That might work for hstore.


I'll try to keep this as permissive as possible.




 * populate_recordset(record, json) => setof record
   errors if the json is not an array of flat objects

ditto


ditto ;-)


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Patch to fix a crash of psql

2012-11-30 Thread Tom Lane
Peter Eisentraut  writes:
> On 11/30/12 3:26 AM, Albe Laurenz wrote:
>> If there is no possibility for false positives, I'd say
>> that the "possible" should go.  Maybe it should even be
>> an error and no warning then.

> Yes, encoding mismatches are generally an error.

> I think the message should be more precise.  Nobody will know what an
> "encoding conflict" is.  The error condition is "last multibyte
> character ran over end of file" or something like that, which should be
> clearer.

TBH I think that a message here is unnecessary; it's sufficient to
ensure psql doesn't crash.  The backend will produce a better message
than this anyway once the data gets there, and that way we don't have to
invent a new error recovery path inside psql.  As-is, the patch just
creates the question of what to do after issuing the error.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Tom Lane
Andres Freund  writes:
> But a failing Assert obviously proves something.

It doesn't prove that the code is unsafe though.

I am suspicious that it is safe, because it's pretty darn hard to
believe we'd not have seen field reports of problems with triggers
otherwise.  It's not like this is a seldom-executed code path.

But I concede that I don't see *why* it's safe --- it looks like
it ought to be possible for a VACUUM to move the tuple while the
trigger code is fetching it.  You'd need the timing to be just
so...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Andres Freund
On 2012-11-30 10:42:21 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > But a failing Assert obviously proves something.
>
> It doesn't prove that the code is unsafe though.

Hehe.

> I am suspicious that it is safe, because it's pretty darn hard to
> believe we'd not have seen field reports of problems with triggers
> otherwise.  It's not like this is a seldom-executed code path.

That's true, but I think due to the fact that the plain DELETEs do *not*
trigger the Assert we might just not have noticed it. A make check with
the error checking in place doesn't error out, in fact.
Also, I think the wrong data caused by it isn't that likely to be
noticed. It's just the OLD tuple in triggers, so it's not going to be
looked at all that closely, and we might return data that looks
somewhat valid.

I think most executor paths actually hold a separate pin "by accident"
while this is executing. I think that my example hits that case
due to the ReleaseBuffer() after ExecStoreTuple() in TidNext(). Seqscans
hold another pin via scan->rs_cbuf, indexscans via ->xs_cbuf. Many
of the other nodes that are likely below ExecUpdate/Delete probably work
similarly.

I think most cases with high throughput (which you probably need to
actually hit the relatively small window) won't use plans that are
complex enough to trigger the bug.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json accessors

2012-11-30 Thread Hannu Krosing

On 11/30/2012 04:29 PM, Andrew Dunstan wrote:


On 11/30/2012 10:04 AM, Hannu Krosing wrote:



OK, so based on this discussion, I'm thinking of the following:

 * keep the original functions and operators. json_keys is still
   required for the case where the json is not flat.
 * json_each(json) => setof (text, text)
   errors if the json is not a flat object


Why not json_each(json) => setof (text, json), with no erroring out?

If the json value represents text, it is easy to convert it to text on the
query side.



Well, it would be possible, sure. I'm not sure how useful. Or we could 
do both fairly easily. It's not as simple or efficient as you might 
think to dequote / de-escape json string values, which is why the 
original API had variants for returning both types of values. Maybe we 
need a function for doing just that.


Btw, how does the current json type handle encodings - is json always
UTF-8 even when the server encoding is not?


If so, then we could at least have a shortcut conversion of json to
UTF-8 text which can skip the encoding-conversion step.


--
Hannu



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Patch to fix a crash of psql

2012-11-30 Thread Tatsuo Ishii
> Peter Eisentraut  writes:
>> On 11/30/12 3:26 AM, Albe Laurenz wrote:
>>> If there is no possibility for false positives, I'd say
>>> that the "possible" should go.  Maybe it should even be
>>> an error and no warning then.
> 
>> Yes, encoding mismatches are generally an error.
> 
>> I think the message should be more precise.  Nobody will know what an
>> "encoding conflict" is.  The error condition is "last multibyte
>> character ran over end of file" or something like that, which should be
>> clearer.
> 
> TBH I think that a message here is unnecessary; it's sufficient to
> ensure psql doesn't crash.  The backend will produce a better message
> than this anyway once the data gets there, and that way we don't have to
> invent a new error recovery path inside psql.  As-is, the patch just
> creates the question of what to do after issuing the error.

Problem is, the backend does not always detect errors.

For my example the backend catches an error:
postgres=# \i ~/a
psql:/home/t-ishii/a:1: warning: possible conflict between client encoding SJIS 
and input file encoding
ERROR:  invalid byte sequence for encoding "SJIS": 0x88
psql:/home/t-ishii/a:1: ERROR:  invalid byte sequence for encoding "SJIS": 0x88

However, for Jiang Guiqing's example, the backend silently ignores the error:
postgres=# \i ~/sql
CREATE DATABASE
You are now connected to database "mydb" as user "t-ishii".
CREATE SCHEMA
psql:/home/t-ishii/sql:7: warning: possible conflict between client encoding 
SJIS and input file encoding
CREATE TABLE

IMO it is a benefit for users to detect such errors earlier.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json accessors

2012-11-30 Thread Andrew Dunstan


On 11/30/2012 10:59 AM, Hannu Krosing wrote:


Btw, how does the current json type handle encodings - is json always
UTF-8 even when the server encoding is not?


If so, then we could at least have a shortcut conversion of json to
UTF-8 text which can skip the encoding-conversion step.





IIRC json is stored and processed in the server encoding. Normally it 
would make sense to have that be utf8. It is delivered to the client in 
the client encoding.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Patch to fix a crash of psql

2012-11-30 Thread Tatsuo Ishii
> I wonder about the "possible".
> 
> Could there be false positives with the test?
> If yes, I don't like the idea.

Yes, there could be false positives, for example when the input
file is simply broken. In the real world, I think an encoding mismatch
is the most likely cause, but this is not 100%.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Patch to fix a crash of psql

2012-11-30 Thread Tatsuo Ishii
> I think the message should be more precise.  Nobody will know what an
> "encoding conflict" is.  The error condition is "last multibyte
> character ran over end of file" or something like that, which should be
> clearer.

"last multibyte character ran over" is not precise at all because the
error was caused by each byte in the file confused PQmblen, not just
last multibyte character. However to explain those in the messgae is
too technical for users, I'm afraid. Maybe just "encoding mismatch
detected. please make sure that input file encoding is SJIS" or
something like that?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP json generation enhancements

2012-11-30 Thread Andrew Dunstan


On 11/27/2012 02:38 PM, Andrew Dunstan wrote:


On 11/26/2012 12:31 PM, Robert Haas wrote:
On Mon, Nov 26, 2012 at 11:43 AM, Andrew Dunstan 
 wrote:
I don't understand why you would want to create such a cast. If the 
cast

doesn't exist it will do exactly what it does now, i.e. use the type's
output function and then json quote and escape it, which in the case of
citext is the Right Thing (tm):

andrew=# select to_json('foo"bar'::citext);
   to_json
------------
 "foo\"bar"

I'm not sure either, but setting up a system where seemingly innocuous
actions can in fact have surprising and not-easily-fixable
consequences in other parts of the system doesn't seem like good
design to me.  Of course, maybe I'm misunderstanding what will happen;
I haven't actually tested it myself.




I'm all for caution, but the argument here seems a bit nebulous. We 
could create a sort of auxiliary type, as has been previously 
suggested, that would be in all respects the same as the json type 
except that it would be the target of the casts that would be used in 
to_json() and friends. But, that's a darned ugly thing to have to do, 
so I'd want a good concrete reason for doing it. Right now I'm having 
a hard time envisioning a problem that could be caused by just using 
the straightforward solution that's in my latest patch.






So, are there any other opinions on this besides mine and Robert's? I'd 
like to move forward but I want to get this resolved first.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Tom Lane
Andres Freund  writes:
> On 2012-11-30 10:42:21 -0500, Tom Lane wrote:
>> I am suspicious that it is safe, because it's pretty darn hard to
>> believe we'd not have seen field reports of problems with triggers
>> otherwise.  It's not like this is a seldom-executed code path.

> That's true, but I think due to the fact that the plain DELETEs do *not*
> trigger the Assert we might just not have noticed it.

OK, I've managed to reproduce an actual failure, ie VACUUM moving the
tuple underneath the fetch.  It's not easy to do though, and the timing
has to be *really* tight.

The reason that simple update/delete queries don't show the issue is
that the tuple being fetched is the "old" one that was just
updated/deleted, and in a simple query that one was just returned by a
relation-scan plan node that's underneath the ModifyTable node, and
*that scan node still has pin* on the table page; or more accurately the
TupleTableSlot it's returned the scan tuple in has the pin.  This pin's
been held since we did a LockBuffer to examine the tuple's liveness, so
it's sufficient to prevent anyone from getting a cleanup lock.  However,
in a more complex plan such as a join, the ModifyTable node could be
receiving tuples that aren't the current tuple of a scan anymore.
(Your example case passes the tuples through a Sort, so none of them
are pinned by the time the ModifyTable node gets them.)

Another thing that reduces the odds of seeing this, in recent versions,
is that when we first scanned the page containing the "old" tuple,
we'll have (tried to) do a page prune.  So even if a VACUUM comes along
now, the odds are that it will not find any newly-reclaimable space.
To reproduce the problem, I had to arrange for another tuple on the
same page to become reclaimable between the relation scan and the
GetTupleForTrigger fetch (which I did by having another, serializable
transaction scan the table, then deleting the other tuple, then allowing
the serializable transaction to commit after the page prune step).

Lastly, the only way you see a problem is if VACUUM acquires cleanup
lock before GetTupleForTrigger pins the page, finds some reclaimable
space, and moves the target tuple on the page in order to compact the
free space, and that data movement happens in the narrow time window
between where GetTupleForTrigger does PageGetItem() and where it
finishes the heap_copytuple() call just below that.  Even then, you
might escape seeing a problem, unless the data movement also shifts
some *other* tuple into the space formerly occupied by the target.

So that's why nobody's reported the problem --- it's not happening
often enough for anyone to see it as a repeatable issue.

I'll go insert a LockBuffer call.  Good catch!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Andres Freund
On 2012-11-30 12:53:27 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2012-11-30 10:42:21 -0500, Tom Lane wrote:
> >> I am suspicious that it is safe, because it's pretty darn hard to
> >> believe we'd not have seen field reports of problems with triggers
> >> otherwise.  It's not like this is a seldom-executed code path.
>
> > That's true, but I think due to the fact that the plain DELETEs do *not*
> > trigger the Assert we might just not have noticed it.
>
> OK, I've managed to reproduce an actual failure, ie VACUUM moving the
> tuple underneath the fetch.  It's not easy to do though, and the timing
> has to be *really* tight.
>
> So that's why nobody's reported the problem --- it's not happening
> often enough for anyone to see it as a repeatable issue.

Yea. I have been playing with trying to reproduce the issue as well and
I needed 3 sessions and two gdb's to cause any problem... And even then
it took me several tries to get all conditions right.

We call heap_prune* surprisingly often...

> I'll go insert a LockBuffer call.  Good catch!

Thanks.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Andres Freund
Hi,

The subject says it all.

There are workloads where it's detrimental, but in general having it
default to on improves the experience tremendously, because getting
conflicts due to vacuum is rather confusing.

In the workloads where it might not be a good idea (very long queries on
the standby, many dead tuples on the primary) you need to think very
carefully about the strategy for avoiding conflicts anyway, and explicit
configuration is required as well.

Does anybody have an argument against changing the default value?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Tom Lane
BTW, I went looking for other places that might have a similar mistake.
I found that contrib/pageinspect/btreefuncs.c pokes around in btree
pages without any buffer lock, which seems pretty broken --- don't we
want it to take share lock?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Andres Freund
On 2012-11-30 14:08:05 -0500, Tom Lane wrote:
> BTW, I went looking for other places that might have a similar mistake.
> I found that contrib/pageinspect/btreefuncs.c pokes around in btree
> pages without any buffer lock, which seems pretty broken --- don't we
> want it to take share lock?

I seem to remember comments somewhere indicating that pageinspect (?)
doesn't take locks intentionally, to make debugging of locking problems
easier. Not sure whether that's really realistic, but ...

Andres

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Tom Lane
Andres Freund  writes:
> On 2012-11-30 14:08:05 -0500, Tom Lane wrote:
>> BTW, I went looking for other places that might have a similar mistake.
>> I found that contrib/pageinspect/btreefuncs.c pokes around in btree
>> pages without any buffer lock, which seems pretty broken --- don't we
>> want it to take share lock?

> I seem to remember comments somewhere indicating that pageinspect (?)
> doesn't take locks intentionally, to make debugging of locking problems
> easier. Not sure whether that's really realistic, but ...

Dunno, that seems like a pretty lame argument when compared to the very
real possibility of crashing due to processing corrupt data.  Besides,
the heap-page examination functions in the same module take buffer lock,
so why shouldn't these?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Simon Riggs
On 30 November 2012 19:02, Andres Freund  wrote:

> The subject says it all.
>
> There are workloads where it's detrimental, but in general having it
> default to on improves the experience tremendously, because getting
> conflicts due to vacuum is rather confusing.
>
> In the workloads where it might not be a good idea (very long queries on
> the standby, many dead tuples on the primary) you need to think very
> carefully about the strategy for avoiding conflicts anyway, and explicit
> configuration is required as well.
>
> Does anybody have an argument against changing the default value?

I don't see a technical objection, perhaps others do.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup is taking backup of extra files inside a tablespace directory

2012-11-30 Thread Robert Haas
On Wed, Nov 28, 2012 at 1:55 AM, Hari Babu  wrote:
> I think backup should be done only files and folders present inside
> '/opt/tblspc/PG_*' directory (TABLESPACE_VERSION_DIRECTORY).
> Not all the files and folders in side '/opt/tblspc.' directory.

I think I agree.  The current behavior would have made sense in older
releases of PG where we plopped down our stuff right inside the
tablespace directory, but now that everything is being shoved in a
subdirectory I don't see a reason to copy anything other than that
subdirectory.  Of course it's probably bad style to have anything
that's not related to PG in there, but given that the whole point of
this change was to allow different PG versions to create tablespaces
in the same directory at the same time, we can hardly say that this is
a case that should never arise in real life.
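
For concreteness, the layout in question looks roughly like this (the
version and catversion strings are illustrative):

/opt/tblspc/
    PG_9.2_201204301/   <-- copy this version-specific subdirectory
    PG_9.3_201211281/   <-- and this one
    stray_file          <-- but not this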

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Robert Haas
On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund  wrote:
> Does anybody have an argument against changing the default value?

Well, the disadvantage of it is that the standby can bloat the master,
which might be surprising to some people, too.  But I don't really
have a lot of skin in this game.

While we're talking about changing defaults, how about changing the
default value of the recovery.conf parameter 'standby_mode' to on?
Not sure about anybody else, but I never want it any other way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-30 Thread Robert Haas
On Wed, Nov 28, 2012 at 9:47 AM, Amit kapila  wrote:
> 5. PERSISTENT keyword is added to the reserved keyword list, as it was giving
> the errors below while parsing gram.y:
>   15 shift/reduce conflicts.

Allow me to be the first to say that any syntax for this feature that
involves reserving new keywords is a bad syntax.

The cost of an unreserved keyword is that the parser gets a little
bigger and slows down, but previous experimentation shows that the
effect is pretty small.  However, adding a new reserved keyword breaks
user applications.  It is hardly difficult to imagine that there are a
decent number of users who have columns or PL/pgsql variables called
"persistent".  Let's not break them.  Instead, since there were
multiple proposed syntaxes for this feature, let's just pick one of
the other ones that don't have this problem.
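
For instance, plain SQL like this works today and would break if
PERSISTENT became a reserved word:

CREATE TABLE sessions (persistent boolean);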

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Patch to fix a crash of psql

2012-11-30 Thread Tom Lane
Tatsuo Ishii  writes:
>> TBH I think that a message here is unnecessary; it's sufficient to
>> ensure psql doesn't crash.  The backend will produce a better message
>> than this anyway once the data gets there, and that way we don't have to
>> invent a new error recovery path inside psql.  As-is, the patch just
>> creates the question of what to do after issuing the error.

> Problem is, backend does not always detect errors.

The reason it doesn't produce an error in Jiang Guiqing's example is
that the encoding error is in a comment and thus is never transmitted
to the backend at all.  I don't see a big problem with that.  If we
did have a problem with it, I think the better fix would be to stop
having psql strip comments from what it sends.  (I've considered
proposing such a change anyway, in order that logged statements look
more like what the user typed.)

> IMO it is a benefit for users to detect such errors earlier.

It is not a benefit for users to get randomly different (and less
informative) messages and different error behaviors for the same
problem.

I think we should just do this and call it good:

diff --git a/src/bin/psql/psqlscan.l b/src/bin/psql/psqlscan.l
index d32a12c63c856625aa42c708b24d4b58e3cdd677..6c1429815f2eec597f735d18ea86d9c8c9f1f3a2 100644
*** a/src/bin/psql/psqlscan.l
--- b/src/bin/psql/psqlscan.l
*** prepare_buffer(const char *txt, int len,
*** 1807,1813 
/* first byte should always be okay... */
newtxt[i] = txt[i];
i++;
!   while (--thislen > 0)
newtxt[i++] = (char) 0xFF;
}
}
--- 1807,1813 
/* first byte should always be okay... */
newtxt[i] = txt[i];
i++;
!   while (--thislen > 0 && i < len)
newtxt[i++] = (char) 0xFF;
}
}


regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] make -jN check fails / unset MAKEFLAGS in pg_regress.c

2012-11-30 Thread Robert Haas
On Thu, Nov 29, 2012 at 8:50 AM, Andres Freund  wrote:
> Hi,
>
> Currently "make -jN check" fails during "creating temporary installation"
> with:
> make[1]: *** read jobs pipe: Invalid argument.  Stop.
> make[1]: *** Waiting for unfinished jobs
> make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make 
> rule.
> in install.log.
>
> This is due to pg_regress invoking make while being invoked by make
> itself. gnu make internally sets the MAKEFLAGS environment variable to
> remember arguments. The problem in this case is that it contains
> "--jobserver-fds=4,5" which makes the pg_regress invoked make think its
> running as a make child process.
>
> Now the problem obviously can be worked around by using "make -jN &&
> make check" instead of "make -j16 check" but I several times now have
> spent time trying to figure out what I broke so it sees sensible to
> "fix" this.
>
> Any arguments against doing so?
>
> The attached patch also resets the MAKELEVEL environment variable for
> good measure. I haven't seen any indication that it's needed, but it
> feels safer ;)

Seems reasonable to me.

But shouldn't the comment in the patch say "to be certain the child
DOESN'T notice the make above us"?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] initdb.c::main() too large

2012-11-30 Thread Bruce Momjian
On Thu, Nov 29, 2012 at 11:23:59PM -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > In looking to add an fsync-only option to initdb, I found its main()
> > function to be 743 lines long, and very hard to understand.
> 
> > The attached patch moves much of that code into separate functions,
> > which will make initdb.c easier to understand, and easier to add an
> > fsync-only option.  The original initdb.c author, Andrew Dunstan, has
> > accepted the restructuring, in principle.
> 
> No objection to breaking it into multiple functions --- but I do say
> it's a lousy idea to put the long_options[] constant at the front of
> the file, thousands of lines away from the switch construct that it
> has to be in sync with.  We don't do that in any other program AFAIR.
> Keep that in the main() function, please.

Good point.  I had not noticed that.  I fixed my initdb patch, and
adjusted a few other C files to be consistent.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL

2012-11-30 Thread Tom Lane
Robert Haas  writes:
> On Wed, Nov 28, 2012 at 9:47 AM, Amit kapila  wrote:
>> 5. PERSISTENT keyword is added to the reserved keyword list, as it was
>> giving the errors below while parsing gram.y:
>> 15 shift/reduce conflicts.

> Allow me to be the first to say that any syntax for this feature that
> involves reserving new keywords is a bad syntax.

Let me put that a little more strongly: syntax that requires reserving
words that aren't reserved in the SQL standard is unacceptable.

Even if the new word is reserved according to SQL, we'll frequently
try pretty hard to avoid making it reserved in Postgres, so as not to
break existing applications.  But if it's not in the standard then
you're breaking applications that can reasonably expect not to get
broken.

But having said that, it's not apparent to me why inventing SET
PERSISTENT should require reserving PERSISTENT.  In the existing
syntaxes SET LOCAL and SET SESSION, there's not been a need to
reserve LOCAL or SESSION.  Maybe you're just trying to be a bit
too cute in the grammar productions?  Frequently there's more than
one way to do it and not all require the same level of keyword
reservedness.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Enabling frontend-only xlog "desc" routines

2012-11-30 Thread Robert Haas
On Thu, Nov 29, 2012 at 3:03 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> The other interesting question remaining is what to do about the rm_desc
>> function in rmgr.c.  We're split between these two ideas:
>
> Why try to link rmgr.c into frontend versions at all?  Just make
> a new table file that only includes the desc function pointers.
> Yeah, then there would be two table files to maintain, but it's
> not clear to me that it's uglier than these proposals ...

+1.  It's not likely to get updated very often, we can add comments to
remind people to keep them all in sync, and if you manage to screw it
up without noticing then you are adding recovery code that you have
not tested in any way whatsoever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Josh Berkus

> In the workloads where it might not be a good idea (very long queries on
> the standby, many dead tuples on the primary) you need to think very
> carefully about the strategy for avoiding conflicts anyway, and explicit
> configuration is required as well.
> 
> Does anybody have an argument against changing the default value?

On balance, I think it's a good idea.  It's easier for new users,
conceptually, to deal with table bloat than query cancel.

Have we done testing on how much query cancel it actually eliminates?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Heikki Linnakangas

On 30.11.2012 21:02, Andres Freund wrote:

Hi,

The subject says it all.

There are workloads where it's detrimental, but in general having it
default to on improves the experience tremendously, because getting
conflicts due to vacuum is rather confusing.

In the workloads where it might not be a good idea (very long queries on
the standby, many dead tuples on the primary) you need to think very
carefully about the strategy for avoiding conflicts anyway, and explicit
configuration is required as well.

Does anybody have an argument against changing the default value?


-1. By default, I would expect a standby server to not have any 
meaningful impact on the performance of the master. With hot standby 
feedback, you can bloat the master very badly if you're not careful.


Think of someone setting up a test server, by setting it up as a standby 
from the master. Now, when someone holds a transaction open in the test 
server, you get bloat in the master. Or if you set up a standby for 
reporting purposes - a very common use case - you would not expect a 
long running ad-hoc query in the standby to bloat the master. That's 
precisely why you set up such a standby in the first place.


You could of course still turn it off, but you would have to know about 
it in the first place. I think it's a reasonable assumption that a 
standby does *not* affect the master (aside from the bandwidth and disk 
space required to retain/ship the WAL). If you have to remember to 
explicitly set a GUC to get that behavior, that's a pretty big gotcha.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: missing LockBuffer(buffer, BUFFER_LOCK_SHARE) in trigger.c GetTupleForTrigger?

2012-11-30 Thread Tom Lane
BTW, on further inspection, there's yet another reason why we've not
heard about this from the field: even if all the wrong things happen and
GetTupleForTrigger manages to copy bogus data for the old tuple, that
data *is not passed to the trigger function*.  The only thing we do with
it is decide whether to queue an event for the trigger.  So if you've
got a WHEN condition for the trigger, that expression would see the bad
data, or if it's an RI trigger the bad data is passed to
RI_FKey_pk_upd_check_required or RI_FKey_fk_upd_check_required.  But
unless the data is corrupt enough to cause a crash, the worst outcome
would be a wrong decision about whether the trigger needs to be run.

It's possible this explains some of the reports we've seen about RI
constraints being violated.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Claudio Freire
On Fri, Nov 30, 2012 at 5:46 PM, Heikki Linnakangas
 wrote:
>
> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.

Without hot standby feedback, reporting queries are impossible. I've
experienced it. Cancellations make it impossible to finish any
decently complex reporting query.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Tom Lane
Robert Haas  writes:
> While we're talking about changing defaults, how about changing the
> default value of the recovery.conf parameter 'standby_mode' to on?
> Not sure about anybody else, but I never want it any other way.

Dunno, it's been only a couple of days since there was a thread about
somebody who had turned it on and not gotten the results he wanted
(because he was only trying to do a point-in-time recovery not create
a standby).  There's enough other configuration needed to set up a
standby node that I'm not sure flipping this default helps the case
much.

But having said that, would it be practical to get rid of the explicit
standby_mode parameter altogether?  I'm thinking we could assume standby
mode is wanted if primary_conninfo has a nonempty value.

There remains the case of a standby being fed solely from WAL archive
without primary_conninfo, but that's a pretty darn corner-y corner case,
and I doubt it has to be easy to set up.  One possibility for it is to
allow primary_conninfo to be set to "none", which would still trigger
standby mode but could be coded to not enable connection attempts.

Mind you, I'm not sure that such a design is easier to understand or
document.  But it would be one less parameter.
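
A recovery.conf under that sketch could be as small as this (hypothetical;
the 'none' value does not exist today):

primary_conninfo = 'host=master port=5432'  # nonempty value implies standby mode
# primary_conninfo = 'none'                 # archive-only standby, no connections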

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Heikki Linnakangas

On 30.11.2012 22:49, Claudio Freire wrote:

On Fri, Nov 30, 2012 at 5:46 PM, Heikki Linnakangas
  wrote:


Think of someone setting up a test server, by setting it up as a standby
from the master. Now, when someone holds a transaction open in the test
server, you get bloat in the master. Or if you set up a standby for
reporting purposes - a very common use case - you would not expect a long
running ad-hoc query in the standby to bloat the master. That's precisely
why you set up such a standby in the first place.


Without hot standby feedback, reporting queries are impossible. I've
experienced it. Cancellations make it impossible to finish any
decently complex reporting query.


Maybe so, but I'd rather get cancellations in the standby, and then read 
up on feedback and the other options and figure out how to make it work, 
than get severe bloat in the master and scratch my head wondering what's 
causing it.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Kevin Grittner
Claudio Freire wrote:

> Without hot standby feedback, reporting queries are impossible.
> I've experienced it. Cancellations make it impossible to finish
> any decently complex reporting query.

With what setting of max_standby_streaming_delay? I would rather
default that to -1 than default hot_standby_feedback on. That way
what you do on the standby only affects the standby.

A default that allows anyone with a read-only login to a standby
to bloat the master, which may require hours of down time to
correct, seems dangerous to me.
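
Spelled out, the two competing defaults are roughly (current GUC names,
pick one or the other):

    hot_standby_feedback = on           # proposed default: standby reports
                                        # its xmin, master can bloat
    max_standby_streaming_delay = -1    # alternative default: never cancel
                                        # standby queries, replay may stall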

-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Claudio Freire
On Fri, Nov 30, 2012 at 6:06 PM, Kevin Grittner  wrote:
>
>> Without hot standby feedback, reporting queries are impossible.
>> I've experienced it. Cancellations make it impossible to finish
>> any decently complex reporting query.
>
> With what setting of max_standby_streaming_delay? I would rather
> default that to -1 than default hot_standby_feedback on. That way
> what you do on the standby only affects the standby.

1d


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Tom Lane
Claudio Freire  writes:
> Without hot standby feedback, reporting queries are impossible. I've
> experienced it. Cancellations make it impossible to finish any
> decently complex reporting query.

The original expectation was that slave-side cancels would be
infrequent.  Maybe there's some fixing/tuning to be done there.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing PD_ALL_VISIBLE

2012-11-30 Thread Jeff Davis
On Mon, 2012-11-26 at 16:55 -0600, Merlin Moncure wrote:
> > Based on previous measurements, I think there *is* contention pinning
> > the root of an index.  Currently, I believe it's largely overwhelmed
> > by contention from other sources, such as the buffer manager lwlocks
> > and the very-evil ProcArrayLock.  However, I believe that as we fix
> > those problems, this will start to percolate up towards the top of the
> > heap.
> 
> Yup -- it (buffer pin contention on high traffic buffers) has been
> caught in the wild -- just maintaining the pin count was enough to do
> it in at least one documented case.  Pathological workloads demonstrate
> contention today and there's no good reason to assume it's limited to
> index root nodes -- I'm strongly suspicious that buffer spinlock issues
> are behind some other malfeasance we've seen recently.

I tried for quite a while to show any kind of performance difference
between checking the VM and checking PD_ALL_VISIBLE on a 12-core box (24
if you count HT).

Three patches in question:
  1. Current unpatched master
  2. patch that naively always checks the VM page, pinning and unpinning
each time
  3. Same as #2, but tries to keep buffers pinned (I had to fix a bug in
this patch though -- new version forthcoming)

I tested from 1 to 64 concurrent connections.

One test was just concurrent scans of a ~400MB table.

The other test was a DELETE FROM foo WHERE ctid BETWEEN '(N,0)' AND
'(N,256)' where N is the worker number in the test program. The table
here is only about 2 MB. The idea is that the scan will happen quickly,
leading to many quick deletes, but the deletes won't actually touch the
same pages (aside from the VM page). So, this was designed to be
uncontended except for pinning the VM page.
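
For reference, a minimal sketch of such a delete worker in libpq (a
reconstruction from the description above, not the actual test program;
the connection string is an assumption):

#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"

int
main(int argc, char **argv)
{
	int			worker = (argc > 1) ? atoi(argv[1]) : 0;
	char		sql[128];
	PGconn	   *conn;
	PGresult   *res;

	conn = PQconnectdb("dbname=postgres");
	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		return 1;
	}

	/*
	 * Each worker deletes a distinct ctid range, so heap pages stay
	 * uncontended; only the visibility map page pin is shared.
	 */
	snprintf(sql, sizeof(sql),
			 "DELETE FROM foo WHERE ctid BETWEEN '(%d,0)' AND '(%d,256)'",
			 worker, worker);
	res = PQexec(conn, sql);
	if (PQresultStatus(res) != PGRES_COMMAND_OK)
		fprintf(stderr, "delete failed: %s", PQerrorMessage(conn));

	PQclear(res);
	PQfinish(conn);
	return 0;
}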

On the scan test, it was really hard to see any difference in the test
noise, but I may have seen about a 3-4% degradation between patch #1 and
patch #2 at higher concurrencies. It was difficult for me to reproduce
this result -- it usually wouldn't show up. I didn't see any difference
between patch #1 and patch #3.

On the delete test I detected no difference between #1 and #2 at all.

I think someone with access to a larger box may need to test this. Or,
if someone has a more specific suggestion about how I can create a worst
case, then let me know.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Kevin Grittner
Claudio Freire wrote:

>> With what setting of max_standby_streaming_delay? I would rather
>> default that to -1 than default hot_standby_feedback on. That
>> way what you do on the standby only affects the standby.
> 
> 1d

Was there actually a transaction hanging open for an entire day on
the standby? Was it a query which actually ran that long, or an
ill-behaved user or piece of software?

I have most certainly managed databases where holding up vacuuming
on the source would cripple performance to the point that users
would have demanded that any other process causing it must be
immediately canceled. And canceling it wouldn't be enough at that
point -- the bloat would still need to be fixed before they could
work efficiently.

-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Further pg_upgrade analysis for many tables

2012-11-30 Thread Bruce Momjian
On Thu, Nov 29, 2012 at 12:59:19PM -0500, Bruce Momjian wrote:
> I have polished up the patch (attached) and it is ready for application
> to 9.3.

Applied.

---

> Since there is no pg_dump/pg_restore pipe parallelism, I had the old
> cluster create per-database dump files, so I don't need to have the old
> and new clusters running at the same time, which would have required two
> port numbers and made shared memory exhaustion more likely.
> 
> We now create a dump file per database, so thousands of database dump
> files might cause a performance problem.
> 
> This also adds status output so you can see the database names as their
> schemas are dumped and restored.  This was requested by users.
> 
> I retained custom mode for pg_dump because it is measurably faster than
> text mode (not sure why, psql overhead?):
> 
>   tables      git      -Fc      -Fp
>        1    11.04    11.08    11.02
>     1000    22.37    19.68    21.64
>     2000    32.39    28.62    31.40
>     4000    56.18    48.53    51.15
>     8000   105.15    81.23    91.84
>    16000   227.64   156.72   177.79
>    32000   542.80   323.19   371.81
>    64000  1711.77   789.17   865.03
> 
> Text dump files are slightly easier to debug, but probably not by much.
> 
> Single-transaction restores were recommended to me over a year ago (by
> Magnus?), but I wanted to get pg_upgrade rock-solid before doing
> optimization, and now is the right time to optimize.
> 
> One risk of single-transaction restores is max_locks_per_transaction
> exhaustion, but you will need to increase that on the old cluster for
> pg_dump anyway because that is done in a single transaction, so the
> only new thing is that the new cluster might also need to adjust
> max_locks_per_transaction (more on that below).
> 
> I was able to remove split_old_dump() because pg_dumpall now produces a
> full global restore file and we do database dumps separately.
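
(On the max_locks_per_transaction point above: the usual remedy is a
postgresql.conf bump on whichever cluster runs out, e.g.

    max_locks_per_transaction = 128     # default is 64

since the shared lock table is sized from this value times
max_connections plus max_prepared_transactions.)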

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Claudio Freire
On Fri, Nov 30, 2012 at 6:20 PM, Kevin Grittner  wrote:
> Claudio Freire wrote:
>
>>> With what setting of max_standby_streaming_delay? I would rather
>>> default that to -1 than default hot_standby_feedback on. That
>>> way what you do on the standby only affects the standby.
>>
>> 1d
>
> Was there actually a transaction hanging open for an entire day on
> the standby? Was it a query which actually ran that long, or an
> ill-behaved user or piece of software?

No, and if there was, I wouldn't care for it to be cancelled.

Queries were being cancelled way before that timeout was reached,
probably something to do with wal_keep_segments on the master side
being unable to keep up for that long.

> I have most certainly managed databases where holding up vacuuming
> on the source would cripple performance to the point that users
> would have demanded that any other process causing it must be
> immediately canceled. And canceling it wouldn't be enough at that
> point -- the bloat would still need to be fixed before they could
> work efficiently.

I wouldn't mind occasional cancels, but these were recurring. When a
query ran long enough, there was no way for it to finish, no matter
how many times you tried. The master never stops being busy, that's
probably a factor.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] initdb.c::main() too large

2012-11-30 Thread Bruce Momjian
On Thu, Nov 29, 2012 at 11:12:23PM -0500, Bruce Momjian wrote:
> In looking to add an fsync-only option to initdb, I found its main()
> function to be 743 lines long, and very hard to understand.
> 
> The attached patch moves much of that code into separate functions,
> which will make initdb.c easier to understand, and easier to add an
> fsync-only option.  The original initdb.c author, Andrew Dunstan, has
> accepted the restructuring, in principle.

Applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Heikki Linnakangas

On 30.11.2012 23:40, Claudio Freire wrote:

On Fri, Nov 30, 2012 at 6:20 PM, Kevin Grittner  wrote:

Claudio Freire wrote:


With what setting of max_standby_streaming_delay? I would rather
default that to -1 than default hot_standby_feedback on. That
way what you do on the standby only affects the standby.


1d


Was there actually a transaction hanging open for an entire day on
the standby? Was it a query which actually ran that long, or an
ill-behaved user or piece of software?


No, and if there was, I wouldn't care for it to be cancelled.

Queries were being cancelled way before that timeout was reached,
probably something to do with wal_keep_segments on the master side
being unable to keep up for that long.


Running out of wal_keep_segments would produce a different error,
requiring a new base backup.



I have most certainly managed databases where holding up vacuuming
on the source would cripple performance to the point that users
would have demanded that any other process causing it must be
immediately canceled. And canceling it wouldn't be enough at that
point -- the bloat would still need to be fixed before they could
work efficiently.


I wouldn't mind occasional cancels, but these were recurring. When a
query ran long enough, there was no way for it to finish, no matter
how many times you tried. The master never stops being busy, that's
probably a factor.


Hmm, it sounds like max_standby_streaming_delay=1d didn't work as 
intended for some reason. It should've given the query one day to run 
before canceling it. Unless the standby was running one day behind the 
master already, but that seems unlikely. Any chance you could reproduce 
that?


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Claudio Freire
On Fri, Nov 30, 2012 at 6:49 PM, Heikki Linnakangas
 wrote:
>>> I have most certainly managed databases where holding up vacuuming
>>> on the source would cripple performance to the point that users
>>> would have demanded that any other process causing it must be
>>> immediately canceled. And canceling it wouldn't be enough at that
>>> point -- the bloat would still need to be fixed before they could
>>> work efficiently.
>>
>>
>> I wouldn't mind occasional cancels, but these were recurring. When a
>> query ran long enough, there was no way for it to finish, no matter
>> how many times you tried. The master never stops being busy, that's
>> probably a factor.
>
>
> Hmm, it sounds like max_standby_streaming_delay=1d didn't work as intended
> for some reason. It should've given the query one day to run before
> canceling it. Unless the standby was running one day behind the master
> already, but that seems unlikely. Any chance you could reproduce that?

I have a pre-production server with replication for these tests. I
could create a fake stream of writes on it, disable feedback, and see
what happens.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] initdb.c::main() too large

2012-11-30 Thread Andrew Dunstan


On 11/30/2012 04:45 PM, Bruce Momjian wrote:

On Thu, Nov 29, 2012 at 11:12:23PM -0500, Bruce Momjian wrote:

In looking to add an fsync-only option to initdb, I found its main()
function to be 743 lines long, and very hard to understand.

The attached patch moves much of that code into separate functions,
which will make initdb.c easier to understand, and easier to add an
fsync-only option.  The original initdb.c author, Andrew Dunstan, has
accepted the restructuring, in principle.

Applied.



Sorry I didn't have time to review this before it was applied.

A few minor nitpicks:

 * process() is a fairly uninformative function name, not sure what I'd
   call it, but something more descriptive.
 * the setup_signals_and_umask() call and possibly the final message
   section of process() would be better placed back in main() IMNSHO.
 * the large statements for setting up the datadir and the xlogdir
   should be factored out of process() into their own functions, I
   think. That would make it much more readable.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] initdb.c::main() too large

2012-11-30 Thread Bruce Momjian
On Fri, Nov 30, 2012 at 06:06:39PM -0500, Andrew Dunstan wrote:
> 
> On 11/30/2012 04:45 PM, Bruce Momjian wrote:
> >On Thu, Nov 29, 2012 at 11:12:23PM -0500, Bruce Momjian wrote:
> >>In looking to add an fsync-only option to initdb, I found its main()
> >>function to be 743 lines long, and very hard to understand.
> >>
> >>The attached patch moves much of that code into separate functions,
> >>which will make initdb.c easier to understand, and easier to add an
> >>fsync-only option.  The original initdb.c author, Andrew Dunstan, has
> >>accepted the restructuring, in principle.
> >Applied.
> >
> 
> Sorry I didn't have time to review this before it was applied.
> 
> A few minor nitpicks:
> 
>  * process() is a fairly uninformative function name, not sure what I'd
>    call it, but something more descriptive.
>  * the setup_signals_and_umask() call and possibly the final message
>    section of process() would be better placed back in main() IMNSHO.
>  * the large statements for setting up the datadir and the xlogdir
>    should be factored out of process() into their own functions, I
>    think. That would make it much more readable.

OK, I will make those changes.  Thanks.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Andres Freund
On 2012-11-30 14:35:37 -0500, Robert Haas wrote:
> On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund  wrote:
> > Does anybody have an argument against changing the default value?
>
> Well, the disadvantage of it is that the standby can bloat the master,
> which might be surprising to some people, too.  But I don't really
> have a lot of skin in this game.

Sure, that's a problem. But ISTM that it's a problem everyone running
postgres has to know about anyway from running the master itself.

> While we're talking about changing defaults, how about changing the
> default value of the recovery.conf parameter 'standby_mode' to on?
> Not sure about anybody else, but I never want it any other way.

Hm. But only if there is a recovery.conf I guess?

Andres

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Daniel Farina
On Fri, Nov 30, 2012 at 11:35 AM, Robert Haas  wrote:
> On Fri, Nov 30, 2012 at 2:02 PM, Andres Freund  wrote:
>> Does anybody have an argument against changing the default value?
>
> Well, the disadvantage of it is that the standby can bloat the master,
> which might be surprising to some people, too.  But I don't really
> have a lot of skin in this game.

Under this precept, we used to not enable hot standby feedback and
instead allowed more or less unbounded staleness of the standby
through very long cancellation times. Although not immediate,
eventually we decided that enough people were getting confused by
sufficiently long standby delay caused by bad queries and idle in xact
backends, so now we have enabled feedback for new database replicants,
along with some fairly un-aggressive cancellation timeouts. It's all
rather messy and not very satisfying.  We have yet to know if feedback
causes or solves problems, on average.

In very early versions we tried the default cancellation settings, and
query cancellation confused everyone a *lot*.  That went away in a
hurry as a result, so I suppose it's not entirely unreasonable to say
in retrospect that the defaults can be considered kind of bad.

Longer term, I think I'd be keen to switch all our user-controlled
replication to logical except for use cases where the workload of the
standby is under our (and not the user's) control, such as for
failover.

Unfortunately, our experience with the feature and its use suggests
that the contract granted by the mechanisms seen in hot standby is
too complex for full-stack developers to keep in careful consideration
along with all the other things they want to do with their application
and/or have to remember about Postgres to get by.

--
fdr


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] make -jN check fails / unset MAKEFLAGS in pg_regress.c

2012-11-30 Thread Andres Freund
On 2012-11-30 14:41:14 -0500, Robert Haas wrote:
> On Thu, Nov 29, 2012 at 8:50 AM, Andres Freund  wrote:
> > Hi,
> >
> > Currently "make -jN check" fails during "creating temporary installation"
> > with:
> > make[1]: *** read jobs pipe: Invalid argument.  Stop.
> > make[1]: *** Waiting for unfinished jobs
> > make[2]: warning: jobserver unavailable: using -j1.  Add `+' to parent make 
> > rule.
> > in install.log.
> >
> > This is due to pg_regress invoking make while being invoked by make
> > itself. gnu make internally sets the MAKEFLAGS environment variable to
> > remember arguments. The problem in this case is that it contains
> > "--jobserver-fds=4,5" which makes the pg_regress-invoked make think it's
> > running as a make child process.
> >
> > Now the problem obviously can be worked around by using "make -jN &&
> > make check" instead of "make -j16 check" but I several times now have
> > spent time trying to figure out what I broke so it sees sensible to
> > "fix" this.
> >
> > Any arguments against doing so?
> >
> > The attached patch also resets the MAKELEVEL environment variable for
> > good measure. I haven't seen any indication that it's needed, but it
> > feels safer ;)
>
> Seems reasonable to me.
>
> But shouldn't the comment in the patch say "to be certain the child
> DOESN'T notice the make above us"?

Yes. obviously. Stupid, errr, fingers.

Thanks for spotting.
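
For anyone wanting to see the mechanism: inside a parallel child make
the environment looks roughly like this (illustrative values, the fd
numbers vary):

    MAKEFLAGS=' --jobserver-fds=3,4 -j'
    MAKELEVEL=1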

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] make -jN check fails / unset MAKEFLAGS in pg_regress.c

2012-11-30 Thread Andres Freund
> Yes. obviously. Stupid, errr, fingers.

This time round it really was fatfingering the wrong key after having
been climbing for 4h. Really.

Trivially updated patch attached.

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>From 52fad5c7e645b514cdf23feddf08a3cd690363be Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Thu, 29 Nov 2012 14:49:42 +0100
Subject: [PATCH] Unset MAKEFLAGS in pg_regress.c to hide the knowledge that
 it's invoked by make from submakes

Make stores some flags in the MAKEFLAGS variable to pass arguments to its own
children. If we are invoked by make, the make invoked by us thinks it's
part of the parallel make that invoked us and tries to communicate with
the toplevel make, which fails.
---
 src/test/regress/pg_regress.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 5a656f2..842eba1 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -783,6 +783,18 @@ initialize_environment(void)
 		}
 
 		/*
+		 * make stores some flags in the MAKEFLAGS variable to pass arguments
+		 * to its own children. If we are invoked by make, the make invoked
+		 * by us thinks it's part of the parallel make that invoked us and
+		 * tries to communicate with the toplevel make, which fails.
+		 *
+		 * Unset the variable to protect against such problems. We also reset
+		 * MAKELEVEL to be certain the child doesn't notice the make above us.
+		 */
+		unsetenv("MAKEFLAGS");
+		unsetenv("MAKELEVEL");
+
+		/*
 		 * Adjust path variables to point into the temp-install tree
 		 */
 		tmp = malloc(strlen(temp_install) + 32 + strlen(bindir));
-- 
1.7.12.289.g0ce9864.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Magnus Hagander
On Fri, Nov 30, 2012 at 9:46 PM, Heikki Linnakangas
 wrote:
> On 30.11.2012 21:02, Andres Freund wrote:
>>
>> Hi,
>>
>> The subject says it all.
>>
>> There are workloads where it's detrimental, but in general having it
>> default to on improves experience tremendously because conflicts
>> caused by vacuum are rather confusing.
>>
>> In the workloads where it might not be a good idea (very long queries on
>> the standby, many dead tuples on the primary) you need to think very
>> carefully about the strategy of avoiding conflicts anyway, and explicit
>> configuration is required as well.
>>
>> Does anybody have an argument against changing the default value?
>
>
> -1. By default, I would expect a standby server to not have any meaningful
> impact on the performance of the master. With hot standby feedback, you can
> bloat the master very badly if you're not careful.

I'm with Heikki on the -1 on this. It's certainly unexpected to have
the slave affect the master by default - people will expect the master
to be independent.

Also, it doesn't IMHO actually *help*. The big thing that makes it
harder for people to set up replication that way is wal_level=minimal
by default, and in a smaller sense max_wal_senders (but
wal_level=minimal also has the interesting property that it's not
enough to change it to wal_level=hot_standby if you figure it out too
late - you have to turn off hot standby on the slave, start it, have
it catch up, shut it down, and reenable hot standby). And they
requires a *restart* of the master, which is a lot worse than a small
change to the config of the *slave*. So unless you're suggesting to
change the default of those two values as well, I'm not sure it really
helps that much...
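
For reference, the master-side settings in question (9.2 defaults in
the comments, both restart-only):

    wal_level = hot_standby     # default 'minimal'
    max_wal_senders = 3         # default 0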


> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.
>
> You could of course still turn it off, but you would have to know about it
> in the first place. I think it's a reasonable assumption that a standby does
> *not* affect the master (aside from the bandwidth and disk space required to
> retain/ship the WAL). If you have to remember to explicitly set a GUC to get
> that behavior, that's a pretty big gotcha.

+1. Having your reporting query time out *shows you* the problem.
Having the master bloat for you won't show the problem until later -
when it's much bigger, and it's much more pain to recover from.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Magnus Hagander
On Fri, Nov 30, 2012 at 10:09 PM, Tom Lane  wrote:
> Claudio Freire  writes:
>> Without hot standby feedback, reporting queries are impossible. I've
>> experienced it. Cancellations make it impossible to finish any
>> decently complex reporting query.
>
> The original expectation was that slave-side cancels would be
> infrequent.  Maybe there's some fixing/tuning to be done there.

It depends completely on the query pattern on the master. Saying that
cancellations make it "impossible to finish any decently complex
reporting query" is completely incorrect - it depends on the queries
on the *master*, not on the complexity of the query on the slave. I
know a lot of scenarios where query cancels pretty much never happen
at all.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Andres Freund
On 2012-11-30 22:46:06 +0200, Heikki Linnakangas wrote:
> On 30.11.2012 21:02, Andres Freund wrote:
> >Hi,
> >
> >The subject says it all.
> >
> >There are workloads where it's detrimental, but in general having it
> >default to on improves experience tremendously because conflicts
> >caused by vacuum are rather confusing.
> >
> >In the workloads where it might not be a good idea (very long queries on
> >the standby, many dead tuples on the primary) you need to think very
> >carefully about the strategy of avoiding conflicts anyway, and explicit
> >configuration is required as well.
> >
> >Does anybody have an argument against changing the default value?
>
> -1. By default, I would expect a standby server to not have any meaningful
> impact on the performance of the master. With hot standby feedback, you can
> bloat the master very badly if you're not careful.

True. But everyone running postgres hopefully knows the problem
already. So that effect is relatively easy to explain.

The other control possibilities we have are rather hard to understand
and to set up, in my experience.

> Think of someone setting up a test server, by setting it up as a standby
> from the master. Now, when someone holds a transaction open in the test
> server, you get bloat in the master. Or if you set up a standby for
> reporting purposes - a very common use case - you would not expect a long
> running ad-hoc query in the standby to bloat the master. That's precisely
> why you set up such a standby in the first place.

But you can't do any meaningful reporting without changing the current
variables around this anyway. If you have any writes on the master,
barely any significant query ever completes.
The two basic choices we give people suck more imo:
* you set up a large delay: it possibly takes a very long time to catch
up if the primary dies, and you don't see any up-to-date data in later queries
* you abort queries: you can't do any reporting queries

Both are unusable for most scenarios and getting the former just right
is hard.

Imo a default of on works in far more scenarios than the contrary.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Josh Berkus
All:

Well, the problem is that we have three configurations, each of which
only works for one very common scenario:

- reporting slave: feedback off, very long replication_delay
- load-balancing slave: feedback on, short replication_delay
- backup/failover slave: feedback off, short replication_delay
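
In GUC terms that is roughly (assuming "replication_delay" stands for
the max_standby_*_delay pair):

    # reporting slave
    hot_standby_feedback = off
    max_standby_streaming_delay = -1    # or very large

    # load-balancing slave
    hot_standby_feedback = on
    max_standby_streaming_delay = 30s

    # backup/failover slave
    hot_standby_feedback = off
    max_standby_streaming_delay = 30s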

I don't think anyone without a serious market survey can say that any of
the above scenarios is more common than the others; I run into all three
pretty darned frequently.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby Feedback should default to on in 9.3+

2012-11-30 Thread Andres Freund
On 2012-11-30 16:09:15 -0500, Tom Lane wrote:
> Claudio Freire  writes:
> > Without hot standby feedback, reporting queries are impossible. I've
> > experienced it. Cancellations make it impossible to finish any
> > decently complex reporting query.
>
> The original expectation was that slave-side cancels would be
> infrequent.  Maybe there's some fixing/tuning to be done there.

I've mostly seen snapshot conflicts. Its hard to say anything more
precise because we don't log any additional information (its admittedly
not easy).

I think it would already help a lot if
ResolveRecoveryConflictWithSnapshot would log:
* the relfilenode (already passed)
* the removed xid
* the pid of the backend holding the oldest snapshot
* the oldest xid of that backend

Most of that should be easy to get.
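
For instance, something along these lines in
src/backend/storage/ipc/standby.c (a sketch only; the xid and
relfilenode are in scope there already, the backend pid and its oldest
xid would still have to be dug out of the procarray):

	ereport(LOG,
			(errmsg("snapshot conflict: removing xid %u in relation %u/%u/%u",
					latestRemovedXid,
					node.spcNode, node.dbNode, node.relNode)));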

But I don't think we really can expect a very low rate of conflicts if
the primary has few longrunning transactions but the standby does.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Use of fsync; was Re: [HACKERS] Pg_upgrade speed for many tables

2012-11-30 Thread Bruce Momjian
On Mon, Nov 26, 2012 at 02:43:19PM -0500, Bruce Momjian wrote:
> > >> In any event, I think the documentation should caution that the
> > >> upgrade should not be deemed to be a success until after a system-wide
> > >> sync has been done.  Even if we use the link rather than copy method,
> > >> are we sure that that is safe if the directories recording those links
> > >> have not been fsynced?
> > >
> > > OK, the above is something I have been thinking about, and obviously you
> > > have too.  If you change fsync from off to on in a cluster, and restart
> > > it, there is no guarantee that the dirty pages you read from the kernel
> > > are actually on disk, because Postgres doesn't know they are dirty.
> > > They probably will be pushed to disk by the kernel in less than one
> > > minute, but still, it doesn't seem reliable. Should this be documented
> > > in the fsync section?
> > >
> > > Again, another reason not to use fsync=off, though your example of the
> > > file copy is a good one.  As you stated, this is a problem with the file
> > > copy/link, independent of how Postgres handles the files.  We can tell
> > > people to use 'sync' as root on Unix, but what about Windows?
> > 
> > I'm pretty sure someone mentioned the way to do that on Windows in
> > this list in the last few months, but I can't seem to find it.  I
> > thought it was the initdb fsync thread.
> 
> Yep, the code is already in initdb to fsync a directory --- we just need
> a way for pg_upgrade to access it.

I have developed the attached patch that does this.  It basically adds
a --sync-only option to initdb, then turns off all durability in
pg_upgrade and has pg_upgrade run initdb --sync-only; this gives us
another nice speedup!

             ------ SSD ------    ---- magnetic ----
   tables       git     patch        git      patch
        1     11.11     11.11      11.10      11.13
     1000     20.57     19.89      20.72      19.30
     2000     28.02     25.81      28.50      27.53
     4000     42.00     43.59      46.71      46.84
     8000     89.66     74.16      89.10      73.67
    16000    157.66    135.98     159.97     153.48
    32000    316.24    296.90     334.74     308.59
    64000    814.97    715.53     797.34     727.94

(I am very happy with these times.  Thanks to Jeff Janes for his
suggestions.)

I have also added documentation to the 'fsync' configuration variable,
warning about dirty kernel buffers and recommending that they be
flushed to disk before the cluster can be considered crash-safe.

I consider this patch ready for 9.3 application (meaning it is not a
prototype).
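
The same sync can presumably also be run by hand, e.g. after flipping
fsync back on (flag spelling per the patch below):

    initdb --sync-only "$PGDATA"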

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
new file mode 100644
index c12f15b..63df529
*** a/contrib/pg_upgrade/pg_upgrade.c
--- b/contrib/pg_upgrade/pg_upgrade.c
*** main(int argc, char **argv)
*** 150,155 
--- 150,161 
  			  new_cluster.pgdata);
  	check_ok();
  
+ 	prep_status("Sync data directory to disk");
+ 	exec_prog(UTILITY_LOG_FILE, NULL, true,
+ 			  "\"%s/initdb\" --sync-only \"%s\"", new_cluster.bindir,
+ 			  new_cluster.pgdata);
+ 	check_ok();
+ 
  	create_script_for_cluster_analyze(&analyze_script_file_name);
  	create_script_for_old_cluster_deletion(&deletion_script_file_name);
  
diff --git a/contrib/pg_upgrade/server.c b/contrib/pg_upgrade/server.c
new file mode 100644
index 49d4c8f..05d8cc0
*** a/contrib/pg_upgrade/server.c
--- b/contrib/pg_upgrade/server.c
*** start_postmaster(ClusterInfo *cluster)
*** 209,217 
  	 * a gap of 2000000000 from the current xid counter, so autovacuum will
  	 * not touch them.
  	 *
! 	 *	synchronous_commit=off improves object creation speed, and we only
! 	 *	modify the new cluster, so only use it there.  If there is a crash,
! 	 *	the new cluster has to be recreated anyway.
  	 */
  	snprintf(cmd, sizeof(cmd),
  			 "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s%s%s\" start",
--- 209,217 
  	 * a gap of 2000000000 from the current xid counter, so autovacuum will
  	 * not touch them.
  	 *
! 	 * Turn off durability requirements to improve object creation speed, and
! 	 * we only modify the new cluster, so only use it there.  If there is a
! 	 * crash, the new cluster has to be recreated anyway.
  	 */
  	snprintf(cmd, sizeof(cmd),
  			 "\"%s/pg_ctl\" -w -l \"%s\" -D \"%s\" -o \"-p %d%s%s%s%s\" start",
*** start_postmaster(ClusterInfo *cluster)
*** 219,225 
  			 (cluster->controldata.cat_ver >=
  			  BINARY_UPGRADE_SERVER_FLAG_CAT_VER) ? " -b" :
  			 " -c autovacuum=off -c autovacuum_freeze_max_age=2000000000",
! 			 (cluster == &new_cluster) ? " -c synchronous_commit=off" : "",
  			 cluster->pgopts ? cluster->pgopts : "", socket_string);
  
  	/*
--- 219,226 
  			 (cluster->controldata.cat_ver >=
  			  BINARY_UPGRADE_SERVER_FLAG_CAT_VER) ? " -b" :
  			 " -c autovacuum=o

[HACKERS] --single-transaction hack to pg_upgrade does not work

2012-11-30 Thread Tom Lane
Some of the buildfarm members are failing the pg_upgrade regression test
since commit 12ee6ec71f8754ff3573711032b9b4d5a764ba84.  I can duplicate
it here, and the symptom is:

pg_restore: creating TYPE float8range
pg_restore: creating TYPE insenum
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 978; 1247 16584 TYPE insenum tgl
pg_restore: [archiver (db)] could not execute query: ERROR:  ALTER TYPE ... ADD cannot run inside a transaction block
Command was: 
-- For binary upgrade, must preserve pg_type oid
SELECT binary_upgrade.set_next_pg_type_oid('16584'::pg_catalog.oid);

I have not investigated why it apparently passes some places; this looks
to me like a guaranteed failure.
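
For what it's worth, the failing pattern is trivial to reproduce by
hand on any enum type (the value name here is made up):

    BEGIN;
    ALTER TYPE insenum ADD VALUE 'zz';
    -- ERROR:  ALTER TYPE ... ADD cannot run inside a transaction block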

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers