Re: [HACKERS] pgbench more operators & functions

2016-10-08 Thread Fabien COELHO


Hello Tom,

I comment here on the first part of your remarks. I've answered the second 
part in another mail.



(1) The required schema is slightly different: currently the type used
for holding balances is not wide enough per the TPC-B standard. This means
maybe having an option to do "pgbench -i --standard-tpcb" which would
generate the right schema; probably it should just change a few INTEGER
columns to INT8, or maybe use NUMERIC(10). I have not done such a patch yet.
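
To illustrate, a hypothetical "--standard-tpcb" initialization might simply
widen the balance columns. The statements below are only a sketch of what
such an option could emit; they are not an actual pgbench feature:

```sql
-- Hypothetical widening of the default pgbench schema so balances can
-- hold values wider than INTEGER, per TPC-B.
ALTER TABLE pgbench_accounts ALTER COLUMN abalance TYPE int8;
ALTER TABLE pgbench_tellers  ALTER COLUMN tbalance TYPE int8;
ALTER TABLE pgbench_branches ALTER COLUMN bbalance TYPE int8;
-- or, following the spec wording more literally:
-- ALTER TABLE pgbench_accounts ALTER COLUMN abalance TYPE numeric(10);
```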


The whole question of the database setup is an interesting one.
If we were to do anything at all here, I'd want to see not only the 
table schemas and initial population, but also the hard-wired "vacuum" 
logic, somehow made not so hard-wired.  I have no good ideas about that. 
The SQL commands could possibly be taken from scripts, but what of all 
the work that's gone into friendly progress reporting for table loading?


I'm unconvinced by the current status, especially the default behaviors. I 
think it should do a good, sensible, representative job, and not be a 
minimal installation.


For instance, the default setup does not use foreign keys. It should be the 
reverse: foreign keys should be included by default, and an option should 
be used to lessen the schema quality.


Also, given the heavy UPDATE nature of the pgbench test, a non-100% 
default fill factor on some tables would make sense.
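
Both points can already be exercised from the existing command line, since
pgbench's initialization step has `--foreign-keys` and `--fillfactor`
options; the complaint here is only about the defaults. For example:

```shell
# Initialize with foreign keys and a non-100% fill factor on the
# pgbench tables (both are existing "pgbench -i" options; the scale
# and database name are illustrative).
pgbench -i --scale=100 --foreign-keys --fillfactor=90 bench
```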


The "friendly progress reporting" only applies to the initial insert: the 
vacuum, primary key, and possibly foreign key alterations also take a 
significant time but are not included in the progress report. On the one 
hand that makes sense, because pgbench has no clue about the progression of 
these tasks; on the other hand it means that the "friendliness" stops 
halfway through the setup. The default progress reporting is much too verbose 
on any modern hardware; the quiet mode should be the default, or even the 
only option.


Note that I'm not really planning to change any of this because it would 
probably be rejected as a significant behavioral change, but I find 
it annoying anyway.


--
Fabien


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench more operators & functions

2016-10-08 Thread Amit Kapila
On Sat, Oct 8, 2016 at 12:58 PM, Fabien COELHO  wrote:
>
> Hello Tom,
>
> I comment here on the first part of your remarks. I've answered the second
> part in another mail.
>
>>> (1) The required schema is slightly different : currently the type used
>>> for holding balances is not wide enough per the TCP-B standard, this mean
>>> maybe having an option to do "pgbench -i --standard-tpcb" which would
>>> generate the right schema, probably it should just change a few INTEGER
>>> to
>>> INT8, or maybe use NUMERIC(10). I have not done such a patch yet.
>>
>>
>> The whole question of the database setup is an interesting one.
>> If we were to do anything at all here, I'd want to see not only the table
>> schemas and initial population, but also the hard-wired "vacuum" logic,
>> somehow made not so hard-wired.  I have no good ideas about that. The SQL
>> commands could possibly be taken from scripts, but what of all the work
>> that's gone into friendly progress reporting for table loading?
>
>
> I'm unconvince by the current status, especially the default behaviors. I
> think it should do a good sensible representative job, and not be a minimum
> installation.
>
> For instance, the default setup does not use foreign keys. It should be the
> reverse, foreign keys should be included by default and an option should be
> used to lessen the schema quality.
>
> Also, given the heavy UPDATE nature of the pgbench test, a non 100% default
> fill factor on some tables would make sense.
>

FWIW, some time back I saw that with fill factor 80, at somewhat
moderate client counts (32) on a 192 Hyper-Threaded machine, the
performance is 20~30% better, but at higher client counts it was the same
as with fill factor 100.  I think if we go by your theory, one could also
argue for non-default values of the autovacuum threshold parameters.
pgbench already has a parameter to specify a non-default fillfactor, and
I think that is sufficient for anyone to do performance testing.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] pg_dump getBlobs query broken for 7.3 servers

2016-10-08 Thread Michael Paquier
On Sat, Oct 8, 2016 at 2:59 AM, Stephen Frost  wrote:
> Another approach to this would be to figure out a way for the newer
> testing framework in HEAD to be run against older versions, though we'd
> need to have a field which indicates which version of PG a given test
> should be run against as there are certainly tests of newer capabilities
> than older versions supported.

pg_upgrade would benefit from something like that as well. But isn't
that something the buildfarm client would be better at managing? I
recall that it runs older branches first, so it would be doable to
point to the compiled builds of the branches already run and perform
tests on them. Surely we are going to need a code path on branch X
that is able to handle test cases depending on the version of the
backend involved, which makes maintenance more difficult in the long
run. Still, I cannot think of something that should do on-the-fly
branch checkouts; users should be able to run such tests easily with
just a tarball. Perhaps an idea would be to allow past versions of
Postgres to be installed in a path of the install folder, say
PGINSTALL/bin/old/, and then have the tests detect them? installcheck
would of course be the only thing supported for such cross-version
checks.
-- 
Michael




[HACKERS] vacuumdb -f and -j options (was Question / requests.)

2016-10-08 Thread Amit Kapila
On Fri, Oct 7, 2016 at 10:16 PM, Alvaro Herrera
 wrote:
> Robert Haas wrote:
>> On Wed, Oct 5, 2016 at 10:58 AM, Francisco Olarte
>> I don't know, but it seems like the documentation for vacuumdb
>> currently says, more or less, "Hey, if you use -j with -f, it may not
>> work!", which seems unacceptable to me.  It should be the job of the
>> person writing the feature to make it work in all cases, not the job
>> of the person using the feature to work around the problem when it
>> doesn't.
>
> The most interesting use case of vacuumdb is lazy vacuuming, I think, so
> committing that patch as it was submitted previously was a good step
> forward even if it didn't handle VACUUM FULL 100%.
>
> I agree that it's better to have both modes Just Work in parallel, which
> is the point of this subsequent patch.  So let's move forward.  I
> support Francisco's effort to make -f work with -j.  I don't have a
> strong opinion on which of the various proposals presented so far is the
> best way to implement it, but let's figure that out and get it done.
>

After reading Francisco's proposal [1], I don't think it is directly
trying to make -f and -j work together.  He is proposing to make it
work by providing some new options.  As you are wondering upthread, I
think it is reasonable to disallow -f with parallel vacuuming if no
tables are specified.


[1] - 
https://www.postgresql.org/message-id/CA%2BbJJbx8%2BSKBU%3DXUE%2BHxZHysh9226iMfTnA69AznwRTOEGtR7Q%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] vacuumdb -f and -j options (was Question / requests.)

2016-10-08 Thread Michael Paquier
On Sat, Oct 8, 2016 at 9:12 PM, Amit Kapila  wrote:
> On Fri, Oct 7, 2016 at 10:16 PM, Alvaro Herrera
>  wrote:
>> Robert Haas wrote:
>>> On Wed, Oct 5, 2016 at 10:58 AM, Francisco Olarte
>>> I don't know, but it seems like the documentation for vacuumdb
>>> currently says, more or less, "Hey, if you use -j with -f, it may not
>>> work!", which seems unacceptable to me.  It should be the job of the
>>> person writing the feature to make it work in all cases, not the job
>>> of the person using the feature to work around the problem when it
>>> doesn't.
>>
>> The most interesting use case of vacuumdb is lazy vacuuming, I think, so
>> committing that patch as it was submitted previously was a good step
>> forward even if it didn't handle VACUUM FULL 100%.
>>
>> I agree that it's better to have both modes Just Work in parallel, which
>> is the point of this subsequent patch.  So let's move forward.  I
>> support Francisco's effort to make -f work with -j.  I don't have a
>> strong opinion on which of the various proposals presented so far is the
>> best way to implement it, but let's figure that out and get it done.
>>
>
> After reading Francisco's proposal [1], I don't think it is directly
> trying to make -f and -j work together.  He is proposing to make it
> work by providing some new options.  As you are wondering upthread, I
> think it seems reasonable to disallow -f with parallel vacuuming if no
> tables are specified.

Instead of restricting things completely, I'd like to think that making
both of them work together is the right move in the end.
From what I recall of the vacuumdb code, I agree with Alvaro's
position: it would not be a very complicated challenge to vacuum all
the catalogs in one worker and spread the rest of the tables across the
rest of them. We need to be careful about the case where a list of tables
is given by the user via -t, though, in case the user is passing
both catalog and normal relations.
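
To make the proposed split concrete, the catalog relations that would go to
the dedicated worker could be enumerated with a query along these lines
(illustrative only; vacuumdb builds its table list in C, not via this query):

```sql
-- Relations in pg_catalog: candidates for the single dedicated worker,
-- keeping parallel VACUUM FULL jobs away from each other's catalog locks.
SELECT c.oid::regclass AS catalog_table
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pg_catalog' AND c.relkind = 'r'
ORDER BY 1;
```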
-- 
Michael




Re: [HACKERS] pgbench more operators & functions

2016-10-08 Thread Fabien COELHO


Hello Amit.


Also, given the heavy UPDATE nature of the pgbench test, a non 100% default
fill factor on some tables would make sense.


FWIW, sometime back I have seen that with fill factor 80, at somewhat
moderate client counts (32) on 192 - Hyper Threaded m/c, the
performance is 20~30% better, but at higher client counts, it was same
as 100 fill factor.


The 20-30% figure is consistent with figures I collected 2 years ago about 
fill factor on HDD, see the beginning run of:


http://blog.coelho.net/database/2014/08/23/postgresql-fillfactor-and-update.html

Although I found that the advantage is reduced after some time, because 
once a page has got an update it has some free space which can be taken 
advantage of later on, if the space was not reclaimed by vacuum.


I cannot understand why there would be no advantage with more clients, 
though...


Alas, performance testing is quite sensitive to many details:-(

--
Fabien.




[HACKERS] pg_upgrade 9.5 -> 9.6 fails when pg_largeobject is in separate tablespace

2016-10-08 Thread Andreas Joseph Krogh
Hi.
 
(if this is not the right forum, please point me to it)
 
I have an issue with pg_upgrade upgrading 9.5 to 9.6. (my system is 
Ubuntu-16.04 and packages from http://apt.postgresql.org/)
 
In short pg_upgrade fails with:
 
Linking user relation files
 No match found in new cluster for old relation with OID 16388 in database 
"andreak": "pg_toast.pg_toast_2613" which is the TOAST table for 
"pg_catalog.pg_largeobject"
 No match found in new cluster for old relation with OID 16390 in database 
"andreak": "pg_toast.pg_toast_2613_index" which is an index on 
"pg_toast.pg_toast_2613" which is the TOAST table for 
"pg_catalog.pg_largeobject"
Failed to match up old and new tables in database "andreak"
 Failure, exiting
 
I issued the following command:
PG_NEW_VERSION=9.6
PG_OLD_VERSION=9.5
/usr/lib/postgresql/$PG_NEW_VERSION/bin/pg_upgrade \
  --old-bindir=/usr/lib/postgresql/$PG_OLD_VERSION/bin/ \
  --new-bindir=/usr/lib/postgresql/$PG_NEW_VERSION/bin/ \
  --old-datadir=/var/lib/postgresql/$PG_OLD_VERSION/main \
  --new-datadir=/var/lib/postgresql/$PG_NEW_VERSION/main \
  -o " -c config_file=/etc/postgresql/$PG_OLD_VERSION/main/postgresql.conf" \
  -O " -c config_file=/etc/postgresql/$PG_NEW_VERSION/main/postgresql.conf" \
  --link

 
I have pg_largeobject in a separate tablespace, moved by the following command:
psql -U postgres -c "alter table pg_largeobject set tablespace andreak_lo" -d andreak
 
 
(I've set allow_system_table_mods=on in postgresql.conf)
 
Else I have tablespaces configured outside PGDATA:
mkdir /var/lib/postgresql/9.5/tablespaces/andreak
mkdir /var/lib/postgresql/9.5/tablespaces_lo/andreak
 
These are symlinked:
ln -s /storage/wal/9.5/pg_xlog /var/lib/postgresql/9.5/main/pg_xlog
ln -s /storage/fast_ssd/9.5/tablespaces /var/lib/postgresql/9.5/tablespaces
ln -s /storage/fast_ssd/9.5/tablespaces_lo /var/lib/postgresql/9.5/tablespaces_lo
 
 
psql -c "create tablespace andreak OWNER andreak location '/var/lib/postgresql/9.5/tablespaces/andreak'" postgres
psql -c "create tablespace andreak_lo OWNER andreak location '/var/lib/postgresql/9.5/tablespaces_lo/andreak'" postgres
createdb --tablespace=andreak -O andreak andreak
psql -U postgres -c "alter table pg_largeobject set tablespace andreak_lo" -d andreak

Is this a bug, or is this a configuration not supported by pg_upgrade?
Any hints on how to proceed?
 
Thanks.
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andr...@visena.com 
www.visena.com 
 




[HACKERS] regular 10devel pdf build

2016-10-08 Thread Erik Rijkers

Hello,

Would it be possible to regularly build and provide a .pdf of the 
development version?


Ideally, it would show up on

 https://www.postgresql.org/docs/manuals/

(I know there is already a html devel version available.)


thanks,


Erik Rijkers






Re: [HACKERS] pgbench vs. wait events

2016-10-08 Thread Jeff Janes
On Fri, Oct 7, 2016 at 1:28 PM, Robert Haas  wrote:

> On Fri, Oct 7, 2016 at 11:51 AM, Jeff Janes  wrote:
> > What happens if you turn fsync off?  Once a xlog file is fully written,
> it
> > is immediately fsynced, even if the backend is holding WALWriteLock or
> > wal_insert (or both) at the time, and even if synchrounous_commit is off.
> > Assuming this machine has a BBU so that it doesn't have to wait for disk
> > rotation, still fsyncs are expensive because the kernel has to find all
> the
> > data and get it sent over to the BBU, while holding locks.
>
> Scale factor 300, 32 clients, fsync=off:
>
>   5  Lock| tuple
>  18  LWLockTranche   | lock_manager
>  24  LWLockNamed | WALWriteLock
>  88  LWLockTranche   | buffer_content
> 265  LWLockTranche   | wal_insert
> 373  LWLockNamed | CLogControlLock
> 496  LWLockNamed | ProcArrayLock
> 532  Lock| extend
> 540  LWLockNamed | XidGenLock
> 545  Lock| transactionid
>   27067  Client  | ClientRead
>   85364  |
>


Did the TPS go up appreciably?


>
> But I'm not sure you're right about the way the fsync=off code works.
> I think pg_fsync()/pg_fsync_writethrough()/pg_fsync_no_writethrough()
> look at enableFsync and just do nothing if it's false.
>

I think we are in agreement.  I don't know which part you think is wrong.
When I said immediately, I didn't mean unconditionally.

Anyway, based on the reduced wait events, I think this shows that if we
need to do something in the xlog area, probably what it would be is to add
a queue of fully written but un-synced xlog files, so that the syncing can
be delegated to the background wal writer process.  And of course anyone
needing to actually flush their xlog would have to start by flushing the
queue.

(Or perhaps just make the xlog files bigger, and call it a day)
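
For what it's worth, the parenthetical option was a compile-time knob in this
era: the WAL segment size could only be changed by rebuilding the server. A
sketch, under that assumption:

```shell
# WAL segment size was a build-time configure option in 9.x: 64MB
# segments instead of the default 16MB means 4x fewer file switches
# and end-of-segment fsyncs.
./configure --with-wal-segsize=64
make && make install
```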

Cheers,

Jeff


Re: [HACKERS] pgbench vs. wait events

2016-10-08 Thread Jeff Janes
On Fri, Oct 7, 2016 at 11:14 PM, Amit Kapila 
wrote:

>
> > Another strategy that may work is actually intentionally
> waiting/buffering
> > some few ms between flushes/fsync,
>
> We do that before attempting to write if user has set "commit_delay"
> and "commit_siblings" guc parameters.
>

If you have a fast, high resolution timer, then one thing you can do is
keep track of when the previous xlog sync finished. Then instead of having
commit_delay be an absolute amount of time to sleep, it would mean "wait
until that amount of time has passed since the previous sync finished."  So
you would set it based on the RPM of your drive, so that the time it is
sleeping to allow more work to happen from other processes is time it would
have to spend waiting on rotational delay anyway.

But I dropped this, because it would be hard to tune, hard to implement in
a cross-platform way, and because anyone with such high performance needs
is just going to buy a nonvolatile write-cache and be done with it.


>
> Now here, we can't buffer the fsync requests as current we are doing
> both writes and fsync under one lock.  However, if we can split the
> work such that writes are done under one lock and fsync under separate
> lock, then probably we can try to buffer fsync requests and after
> fsyncing the current pending requests, we can recheck if there are
> more pending requests and try to flush them.
>

What I implemented at one point was:

(Already have the lock before getting here)
Write to the extent it is ready to be written.
Update the shared structure to reflect the written-up-to point.
Drop the lock.
fsync.
Take the lock again.
Update the shared structure to reflect the flushed-up-to point.
Drop the lock again.

This way, multiple processes could all be waiting on the kernel's fsync
response, rather than on each other's locks.  What I was hoping would happen
is that if one process wrote everything that was ready and called fsync,
while it was waiting for the platter to come around to the writing head,
more processes could make more data ready, write that more data, and call
an fsync of their own. And the kernel would be smart enough to amalgamate
them together. But the kernel evidently was not that smart, and performance
did not improve.

Cheers,

Jeff


Re: [HACKERS] pgbench vs. wait events

2016-10-08 Thread Jeff Janes
On Fri, Oct 7, 2016 at 8:51 AM, Jeff Janes  wrote:

>
> I think we need to come up with some benchmarking queries which get more
> work done per round-trip to the database. And build them into the binary,
> because otherwise people won't use them as much as they should if they have
> to pass "-f" files around mailing lists and blog postings.   For example,
> we could enclose 5 statements of the TPC-B-like into a single function
> which takes aid, bid, tid, and delta as arguments.  And presumably we could
> drop the other two statements (BEGIN and COMMIT) as well, and rely on
> autocommit to get that job done.  So we could go from 7 statements to 1.
>


Here is an implementation of that.  I've included the calling code as a
patch to pgbench, because if I make it a separate -f file then it is a pain
to get the correct scale and settings of naccounts, etc., into it.

The create script could be integrated into pgbench -i if this is something
we might want to commit.

This gives me an almost 3 fold increase in performance on a system with
fsync turned off:


pgbench -b tpcb-func  -T30 -c8 -j8
tps = 24193.197877 (excluding connections establishing)

pgbench -b tpcb-like  -T30 -c8 -j8
tps = 8434.746300 (excluding connections establishing)


Cheers,

Jeff
create or replace function pgbench_transaction(arg_aid int, arg_bid int, arg_tid int, arg_delta int) returns int as $$
DECLARE
  abal int;
BEGIN
UPDATE pgbench_accounts SET abalance = abalance + arg_delta WHERE aid = arg_aid;
SELECT abalance INTO abal FROM pgbench_accounts WHERE aid = arg_aid;
UPDATE pgbench_tellers SET tbalance = tbalance + arg_delta WHERE tid = arg_tid;
UPDATE pgbench_branches SET bbalance = bbalance + arg_delta WHERE bid = arg_bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (arg_tid, arg_bid, arg_aid, arg_delta, CURRENT_TIMESTAMP);
RETURN abal;
END;
$$ language plpgsql;
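
For reference, a custom pgbench script invoking the function would look
something like the sketch below; the variable setup mirrors what the
built-in tpcb-like script computes (the 100000/10 multipliers are the
standard naccounts/ntellers per scale factor):

```
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
SELECT pgbench_transaction(:aid, :bid, :tid, :delta);
```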


pgbench_function.patch
Description: Binary data



Re: [HACKERS] Switch to unnamed POSIX semaphores as our preferred sema code?

2016-10-08 Thread Tom Lane
Robert Haas  writes:
> On Thu, Oct 6, 2016 at 9:46 AM, Tom Lane  wrote:
>> Can anyone think of a test case that would stress semaphore operations
>> more heavily, without being unrealistic?

> I think it's going to be pretty hard to come up with a non-artificial
> test case that has exhibits meaningful lwlock contention on an 8-core
> system.  If you go back to 9.1, before we had fast-path locking, you
> can do it, because the relation locks and vxid locks do cause
> noticeable contention on the lock manager locks in that version.
> ...
> Alternatively, get a bigger box.  :-)

Well, I did both of the above.  I tried 9.1 on "hydra", that 60-processor
POWER7 box, and cranked the parallelism up to ridiculous levels:

pgbench -S -j 250 -c 250 -M prepared -T 60 bench

Median of 3 runs with sysv semaphores:

number of transactions actually processed: 1554570
tps = 25875.432836 (including connections establishing)
tps = 25894.938187 (excluding connections establishing)

Ditto, for unnamed POSIX semaphores:

number of transactions actually processed: 1726696
tps = 28742.486104 (including connections establishing)
tps = 28765.963071 (excluding connections establishing)

That's about a 10% win for POSIX semaphores.  Now, at saner loads,
I couldn't see much of any difference between the two semaphore APIs.
So I'm still of the opinion that there's not likely to be any meaningful
performance difference in practice, at least not on reasonably recent
Linux machines.  But this does indicate that if there is any difference,
it will probably favor switching.

regards, tom lane




Re: [HACKERS] Showing parallel status in \df+

2016-10-08 Thread Jim Nasby

On 10/3/16 3:18 PM, Pavel Stehule wrote:

I am feeling consensus on removing the source of PL functions from \df+.
There is partial consensus on keeping this field (renamed) for the C and
internal languages. I am not sure about consensus on the \sf enhancement.


FWIW, I'm completely in favor of ditching PL source code. I'm neutral on 
C and internal.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] Logical tape pause/resume

2016-10-08 Thread Peter Geoghegan
Apologies for the delayed response to this.

On Tue, Oct 4, 2016 at 3:47 AM, Heikki Linnakangas  wrote:
> One of the patches in Peter Geoghegan's Parallel tuplesort patch set [1] is
> to put a cap on the number of tapes to use for the sort.

> That amounts to about 8% of the available memory. That's quite wasteful.
> Peter's approach of putting an arbitrary cap on the max number of tapes
> works, but I find it a bit hackish. And you still waste that 8% with smaller
> work_mem settings.

I don't think it's hackish -- there was never a theoretical
justification for using as many tapes as possible from Knuth, or
anyone else. I think that Simon's 2006 work on allowing for a number
of tapes much greater than Knuth's "sweet spot" of 7 was useful only
because it sometimes enabled final on-the-fly merging where we are not
merging a huge number of runs (i.e. when merging only somewhat more
than 7 runs). Your recent work on tape preloading has probably greatly
diminished the value of not doing all merging on-the-fly in a single,
final merge, though. Knuth's sweet spot of 7 had little or nothing to
do with the economics of buying many tape drives.

You shouldn't really waste 8% of the budget with low work_mem settings
with my cap patch applied, because the new cap never limits the number
of tapes. IIRC, the cap of 500 tapes doesn't start to matter until you
have about 1GB of work_mem. So, if there is still any waste at the low
end, that can only be solved by tweaking the main calculation within
tuplesort_merge_order(). (Also, to be clear to others following along:
that memory is never actually allocated, so it's only "wasted" from
the perspective of one sort operation alone).

The cost of multiple passes is paid in sequential I/O of tape temp
files, which is now clearly a relatively low cost. OTOH, the cost of a
larger merge heap is paid in more CPU cache misses, which is a cost we
can feel quite severely. While it's really hard to come up with a
generic break-even point, I do think that there is value in capping
the number of tapes somewhere in the hundreds. It's not like a cap on
the number of tapes along the lines I've proposed was not thought
about from day one, by both Tom and Simon. Noah's relatively recent
MaxAllocSize work has given the issue new importance, though (the same
might have been said during the 9.6 replacement selection vs.
quicksort discussions, actually).

> When we finish writing an initial run to a tape, we keep the tape buffers
> around. That's the reason we need to reserve that memory. But there's no
> fundamental reason we have to do that. As I suggested in [2], we could flush
> the buffers to disk, and only keep in memory the location of the last block
> we wrote. If we need to write another run to the tape, we can reload the
> last incomplete block from the disk, and continue writing.

Okay. But you haven't actually addressed the problem with non-trivial
amounts of memory being logically allocated ahead of time for no good
reason -- you've only addressed the constant factor (the overhead
per-tape). Can't we also fix the general problem, by applying a cap?
Better to have a cap that is approximately right (in the hundreds or
so) than one that is exactly wrong (infinity -- no cap).

> Reloading the last block, requires an extra I/O. That's OK. It's quite rare
> to have a multi-pass sort these days, so it's a good bet that you don't need
> to do it. And if you have a very small work_mem, so that you need to do a
> multi-pass sort, having some more memory available for building the initial
> runs is probably worth it, and there's also a good chance that the block is
> still in the OS cache.

That analysis does seem sound to me.

> In addition to saving a little bit of memory, I'd like to do this
> refactoring because it simplifies the code. It's code that has stayed
> largely unchanged for the past 15 years, so I'm not too eager to touch it,
> but looking at the changes coming with Peter's parallel tuplesort patch set,
> I think this makes sense.

I can definitely see value in refactoring, to make that code less
complicated -- I just don't think it's justified by the amount of
memory that is wasted on tapes.

That said, I'm not especially worried about the complexity within the
logtape.c changes of the parallel CREATE INDEX patch alone. I'm much
more interested in having a logtape.c that could be more easily made
to support binary searching, etc, to find *partition* boundaries,
which my CREATE INDEX patch doesn't use or care about. This is needed
for tuplesort.c partition-based sorting. When parallel sort isn't just
used by CREATE INDEX, partitioning becomes important. And, with
partitioning, dynamic sampling is essential (I think that this will
end up living in logtape.c).

To recap on what I went into in the partitioning-to-parallel-tuplesort
thread [1], I think that partitioning will come in a future release,
and will be of more value to parallel queries, where much more can be
pushed down wi

Re: [HACKERS] vacuumdb -f and -j options (was Question / requests.)

2016-10-08 Thread Amit Kapila
On Sat, Oct 8, 2016 at 5:52 PM, Michael Paquier
 wrote:
> On Sat, Oct 8, 2016 at 9:12 PM, Amit Kapila  wrote:
>> On Fri, Oct 7, 2016 at 10:16 PM, Alvaro Herrera
>>  wrote:
>>> Robert Haas wrote:
 On Wed, Oct 5, 2016 at 10:58 AM, Francisco Olarte
 I don't know, but it seems like the documentation for vacuumdb
 currently says, more or less, "Hey, if you use -j with -f, it may not
 work!", which seems unacceptable to me.  It should be the job of the
 person writing the feature to make it work in all cases, not the job
 of the person using the feature to work around the problem when it
 doesn't.
>>>
>>> The most interesting use case of vacuumdb is lazy vacuuming, I think, so
>>> committing that patch as it was submitted previously was a good step
>>> forward even if it didn't handle VACUUM FULL 100%.
>>>
>>> I agree that it's better to have both modes Just Work in parallel, which
>>> is the point of this subsequent patch.  So let's move forward.  I
>>> support Francisco's effort to make -f work with -j.  I don't have a
>>> strong opinion on which of the various proposals presented so far is the
>>> best way to implement it, but let's figure that out and get it done.
>>>
>>
>> After reading Francisco's proposal [1], I don't think it is directly
>> trying to make -f and -j work together.  He is proposing to make it
>> work by providing some new options.  As you are wondering upthread, I
>> think it seems reasonable to disallow -f with parallel vacuuming if no
>> tables are specified.
>
> Instead of restricting completely things, I'd like to think that being
> able to make both of them work together is the right move at the end.
>

Sure, if somebody can come up with a patch which can safely avoid the
deadlock when both -f and -j options are used, then we should go that
way. Otherwise we can block those options from being used together rather
than just having a note in the docs.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] vacuumdb -f and -j options (was Question / requests.)

2016-10-08 Thread Pavel Stehule
2016-10-09 7:54 GMT+02:00 Amit Kapila :

> On Sat, Oct 8, 2016 at 5:52 PM, Michael Paquier
>  wrote:
> > On Sat, Oct 8, 2016 at 9:12 PM, Amit Kapila 
> wrote:
> >> On Fri, Oct 7, 2016 at 10:16 PM, Alvaro Herrera
> >>  wrote:
> >>> Robert Haas wrote:
>  On Wed, Oct 5, 2016 at 10:58 AM, Francisco Olarte
>  I don't know, but it seems like the documentation for vacuumdb
>  currently says, more or less, "Hey, if you use -j with -f, it may not
>  work!", which seems unacceptable to me.  It should be the job of the
>  person writing the feature to make it work in all cases, not the job
>  of the person using the feature to work around the problem when it
>  doesn't.
> >>>
> >>> The most interesting use case of vacuumdb is lazy vacuuming, I think,
> so
> >>> committing that patch as it was submitted previously was a good step
> >>> forward even if it didn't handle VACUUM FULL 100%.
> >>>
> >>> I agree that it's better to have both modes Just Work in parallel,
> which
> >>> is the point of this subsequent patch.  So let's move forward.  I
> >>> support Francisco's effort to make -f work with -j.  I don't have a
> >>> strong opinion on which of the various proposals presented so far is
> the
> >>> best way to implement it, but let's figure that out and get it done.
> >>>
> >>
> >> After reading Francisco's proposal [1], I don't think it is directly
> >> trying to make -f and -j work together.  He is proposing to make it
> >> work by providing some new options.  As you are wondering upthread, I
> >> think it seems reasonable to disallow -f with parallel vacuuming if no
> >> tables are specified.
> >
> > Instead of restricting completely things, I'd like to think that being
> > able to make both of them work together is the right move at the end.
> >
>
> Sure, if somebody can come up with a patch which can safely avoid the
> deadlock when both -f and -j options are used, then we should go that
> way. Otherwise we can block those options to be used together rather
> than just have a note in docs.
>

+1

Pavel


>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>
>
>


Re: [HACKERS] Showing parallel status in \df+

2016-10-08 Thread Pavel Stehule
Hi

2016-10-08 23:46 GMT+02:00 Jim Nasby :

> On 10/3/16 3:18 PM, Pavel Stehule wrote:
>
>> I am feeling consensus on removing source of PL from \dt+. There is
>> partial consensus on saving this field (renamed) for C and internal
>> language. I am not sure about consensus about \sf enhancing.
>>
>
> FWIW, I'm completely in favor of ditching PL source code. I'm neutral on C
> and internal.
>

here is a patch

Regards

Pavel


> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
>
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 6275a68..f03f547 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -455,10 +455,12 @@ describeFunctions(const char *functypes, const char *pattern, bool verbose, bool
 		printACLColumn(&buf, "p.proacl");
 		appendPQExpBuffer(&buf,
 		  ",\n l.lanname as \"%s\""
-		  ",\n p.prosrc as \"%s\""
+		  ",\n CASE\n"
+		  "  WHEN l.lanname IN ('c','internal') THEN p.prosrc\n"
+		  "END as \"%s\""
 ",\n pg_catalog.obj_description(p.oid, 'pg_proc') as \"%s\"",
 		  gettext_noop("Language"),
-		  gettext_noop("Source code"),
+		  gettext_noop("Reference name"),
 		  gettext_noop("Description"));
 	}
 
