Re: [GENERAL] maintenance_work_mem and CREATE INDEX time

2013-07-23 Thread Amit Langote
On Tue, Jul 23, 2013 at 1:11 PM, Amit Langote  wrote:
> Hello,
>
> While understanding the effect of maintenance_work_mem on time taken
> by CREATE INDEX, I observed that for the values of
> maintenance_work_mem less than the value for which an internal sort is
> performed, the time taken by CREATE INDEX increases as
> maintenance_work_increases.
>
> My guess is that for all those values an external sort is chosen at
> some point and larger the value of maintenance_work_mem, later the
> switch to external sort would be made causing CREATE INDEX to take
> longer. That is a smaller value of maintenance_work_mem would be
> preferred for when external sort is performed anyway. Does that make
> sense?
>

Upon further investigation, I found that the delay in switching to
external sort caused by a larger value of maintenance_work_mem is
small compared to the total time of CREATE INDEX. So, plotting over a
number of maintenance_work_mem values shows that this effect is
negligible. Are there any other parts of the external sort logic that
might make it slower with increasing values of maintenance_work_mem?
It seems the merge order and the number of tapes are related to
state->allowedMem.

Does that mean the external sort is affected by the value of
maintenance_work_mem in a way roughly similar to the above?
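
For anyone wanting to reproduce this, the switch to external sort can be
watched with something like the following (assuming the server is built with
TRACE_SORT, which is the default; table and column names are just an example):

  SET trace_sort = on;
  SET client_min_messages = log;
  SET maintenance_work_mem = '16MB';
  CREATE INDEX test_idx ON test_table (some_col);

trace_sort then logs when the sort leaves memory and when each merge pass
starts, so the timings for different maintenance_work_mem values can be
compared.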

--
Amit Langote




Re: [GENERAL] Different transaction log for database/schema

2013-07-23 Thread Luca Ferrari
On Mon, Jul 22, 2013 at 10:09 AM, Ondrej Chaloupka  wrote:
> Hello,
>
> I would like to kindly ask for advice on whether and how the transaction log
> behavior can be configured.
>
> I would like to have the possibility to differentiate transaction logs for two
> databases or schemas. I would need such a configuration for my testing
> environment, where I need to run different tests simultaneously using
> transactions. And I need to separate those tests to be sure that they do
> not interfere with each other. I'm using XA transactions in Java.


It is not clear to me which logs you are referring to: the commit logs
or the WALs? Anyway, transactions are numbered in a unique way across
the cluster, and therefore it does not matter in which database you
are going to run the transaction; it will not interfere with other
databases within the same cluster.
I don't see what advantage having different logs for each database/schema
would give you, considering that it is not clear what kind of
maintenance you are going to do on such logs. Can you please explain
your aim and use case in more detail?

Luca




[GENERAL] Viewing another role's search path?

2013-07-23 Thread Ian Lawrence Barwick
Is there some simple way of viewing the search path (or other
role-specific setting) for a role different to the current role? Apart from
querying 'pg_db_role_setting' directly?
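
(For reference, the sort of direct query I mean, for a hypothetical role foo:

  SELECT r.rolname, s.setconfig
  FROM pg_db_role_setting s
  JOIN pg_roles r ON r.oid = s.setrole
  WHERE r.rolname = 'foo';
)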

Just wondering if I'm missing something obvious.

Regards

Ian Barwick




Re: [GENERAL] Different transaction log for database/schema

2013-07-23 Thread Ondrej Chaloupka
Hi,

I'm talking about the transaction log for XA/2PC transactions. My tests crash
the application in some phase of the transaction. After a restart I would like
to check how the recovery proceeded, and I would like to verify that all the
transaction records (I mean the info about prepared transactions) are gone on
the DB side.

Nevertheless, what I took from the discussion so far is that for my use case I
need multiple PostgreSQL installations, or to synchronize the test runs so that
they are not run in parallel.
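
For the record, the check on the DB side can presumably be done against the
pg_prepared_xacts view, which lists any transactions still waiting in the
prepared state:

  SELECT gid, prepared, owner, database FROM pg_prepared_xacts;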

Thank you for all reactions
Ondra



-- Original message --
From: Luca Ferrari 
Date: 23. 7. 2013
Subject: Re: [GENERAL] Different transaction log for database/schema

"On Mon, Jul 22, 2013 at 10:09 AM, Ondrej Chaloupka  
wrote:
> Hello,
>
> I would like kindly ask for an advice whether and how the transaction log
> behavior could be configured.
>
> I would like to have possibility to differentiate transaction logs for two
> databases or schema. I would need such configuration for my testing
> environment where I need to run simultaneously different tests using
> transactions. And I need to separate those tests for being sure that they 
do
> not interfere with each other. I'm using XA transactions in Java.


It is not clear to me to which logs you are referring: the commit logs
or the WALs? Anyway, transaction are numbered in an unique way around
the cluster, and therefore it does not matter on which database you
are going to run the transaction, it will not interfere with other
databases within the same cluster.
I don't get what advantage will give you having logs different for
each database/schema, considering that it is not clear what kind of
maintenance you are going to do on such logs. Can you please better
explain what is your aim and use case?

Luca



Re: [GENERAL] Viewing another role's search path?

2013-07-23 Thread Michael Paquier
On Tue, Jul 23, 2013 at 9:15 PM, Ian Lawrence Barwick wrote:

> Is there some simple way of viewing the search path (or other
> role-specific setting) for a role different to the current role? Apart from
> querying 'pg_db_role_setting' directly?
>
This one perhaps?
select rolname, rolconfig from pg_roles;
rolconfig contains all the user-specific settings.
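
To pull individual settings out of the array, something like this should
also work:

  SELECT rolname, unnest(rolconfig) AS setting
  FROM pg_roles
  WHERE rolconfig IS NOT NULL;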
-- 
Michael


Re: [GENERAL] Viewing another role's search path?

2013-07-23 Thread Ian Lawrence Barwick
2013/7/23 Michael Paquier :
>
> On Tue, Jul 23, 2013 at 9:15 PM, Ian Lawrence Barwick 
> wrote:
>>
>> Is there some simple way of viewing the search path (or other
>> role-specific setting) for a role different to the current role? Apart
>> from
>> querying 'pg_db_role_setting' directly?
>
> This one perhaps?
> select rolname, rolconfig from pg_roles;
> rolconfig contains all the user-specific settings.

Thanks, but what I'm looking for is a more elegant (built-in?) way of extracting
the contents of the configuration array which is usable by people who don't
want to poke around in the system catalogues. I've created a view which does
this, but I want to confirm whether there's anything along the lines of
"SHOW search_path FOR ROLE foo" which I've missed and which would save the
trouble of maintaining the view.
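
(Roughly the sort of thing I mean by the view -- a sketch only, not
necessarily the exact definition I'm using:

  CREATE VIEW role_settings AS
  SELECT r.rolname,
         d.datname,
         unnest(s.setconfig) AS setting
  FROM pg_db_role_setting s
  LEFT JOIN pg_roles r ON r.oid = s.setrole
  LEFT JOIN pg_database d ON d.oid = s.setdatabase;
)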

Regards

Ian Barwick




Re: [GENERAL] Viewing another role's search path?

2013-07-23 Thread Adrian Klaver

On 07/23/2013 05:15 AM, Ian Lawrence Barwick wrote:

Is there some simple way of viewing the search path (or other
role-specific setting) for a role different to the current role? Apart from
querying 'pg_db_role_setting' directly?

Just wondering if I'm missing something obvious.


http://www.postgresql.org/docs/9.2/interactive/app-psql.html

\drds [ role-pattern [ database-pattern ] ]
Lists defined configuration settings. These settings can be 
role-specific, database-specific, or both. role-pattern and 
database-pattern are used to select specific roles and databases to 
list, respectively. If omitted, or if * is specified, all settings are 
listed, including those not role-specific or database-specific, 
respectively.


The ALTER ROLE and ALTER DATABASE commands are used to define per-role 
and per-database configuration settings.
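
For example, with a hypothetical role foo and schema myschema:

  ALTER ROLE foo SET search_path = myschema, public;

  -- afterwards, in psql:
  \drds foo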




Regards

Ian Barwick





--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] [HACKERS] maintenance_work_mem and CREATE INDEX time

2013-07-23 Thread Jeff Janes
On Mon, Jul 22, 2013 at 9:11 PM, Amit Langote  wrote:
> Hello,
>
> While understanding the effect of maintenance_work_mem on the time taken
> by CREATE INDEX, I observed that for values of maintenance_work_mem
> less than the value at which an internal sort is performed, the time
> taken by CREATE INDEX increases as maintenance_work_mem increases.
>
> My guess is that for all those values an external sort is chosen at
> some point, and the larger the value of maintenance_work_mem, the later
> the switch to external sort is made, causing CREATE INDEX to take
> longer. That is, a smaller value of maintenance_work_mem would be
> preferred when an external sort is going to be performed anyway. Does
> that make sense?

The heap structure used in external sorts is cache-unfriendly.  The
bigger the heap used, the more this unfriendliness becomes apparent.
And the bigger maintenance_work_mem, the bigger the heap used.

The bigger heap also means you have fewer "runs" to merge in the
external sort.  However, as long as the number of runs still fits in
the same number of merge passes, this is generally not a meaningful
difference.

Ideally the planner (or something) would figure out how much memory
would be needed to complete an external sort in just one external
pass, and then lower the effective maintenance_work_mem to that
amount.  But that would be hard to do with complete precision, and the
consequences of getting it wrong are asymmetric.

(More thoroughly, it would figure out the number of passes needed for
the given maintenance_work_mem, and then lower the effective
maintenance_work_mem to the smallest value that gives the same number
of passes. But for nearly all practical situations, I think the number
of passes for an index build is going to be 0 or 1.)
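
In the meantime, the knob can at least be dialled down just for the index
build, e.g. (the value and names here are only illustrative):

  BEGIN;
  SET LOCAL maintenance_work_mem = '64MB';
  CREATE INDEX test_idx ON test_table (some_col);
  COMMIT;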

Cheers,

Jeff




Fwd: [GENERAL] odd locking behaviour

2013-07-23 Thread pg noob
Thank you Jeff and others for the responses.

One concern that I have is that even in cases where there is no deadlock it is
still acquiring stronger locks than necessary.  I only discovered it
because of the deadlock issue, but I presume that there are many cases where
it acquires a lock on the foreign table and really doesn't need to.
That would seem to lead to higher lock contention in general, which, although
it may not cause a deadlock in every case, does affect the overall
performance.

In my opinion this should be a case where it ought not to acquire any locks
on the foreign table at all.
It's not as though the columns relevant to the foreign key reference have
changed.

Anyway, at least it's good to understand it more.  I'll have to find some
way to work around this in my application.

Thank you!




On Mon, Jul 22, 2013 at 12:48 PM, Jeff Janes  wrote:

> On Sun, Jul 21, 2013 at 9:15 PM, Pavel Stehule 
> wrote:
> > hello
> >
> > It can be an artefact of the RI implementation.
> >
> > see
> http://michael.otacoo.com/postgresql-2/postgres-9-3-feature-highlight-for-key-share-and-for-no-key-update/
> >
> > Try 9.3, please, where RI uses gentler locks
>
> It still behaves this way in 9.4dev.
>
> >>
> >> On Mon, Jul 8, 2013 at 9:54 AM, pg noob  wrote:
> >>>
> >>>
> >>> Thank you for the responses.  Is it a bug?
>
> I don't think so.  While PostgreSQL of course strives for maximum
> concurrency, it makes no guarantee that it uses the weakest
> theoretically possible locking in all possible cases.  But it is kind
> of unfortunate that updating the same row twice causes a lock
> escalation when it is not obvious it should do so, because as you
> found that makes avoiding deadlocks quite difficult.
>
> I'm rather surprised it doesn't block at the first update of the 2nd
> session, rather than waiting for the 2nd update of that session.
>
> Anyway, when the 2nd session re-updates the same row in the same
> transaction, it uses a 'multixact' to record this.   Doing that
> apparently defeats some locking optimization that takes place under
> simpler cases.
>
> Sorry, that probably isn't the definitive answer you were hoping for.
>
> Cheers,
>
> Jeff
>


[GENERAL] process deadlocking on its own transactionid?

2013-07-23 Thread Kevin Goess
We're seeing a problem with some of our processes hanging on locks.  The
select below makes it look like it's *waiting* for a ShareLock on
transactionid, but it *has* an ExclusiveLock on the same value in
virtualxid.

That makes it look like the process has deadlocked on its own
transactionid.  Or are we reading the results of this query wrong, and this
is expected behavior, and our problem lies elsewhere? (Yes, the process is
doing a "select for update" on this context_objects table according to
pg_stat_activity)

production=> select locktype, database, relname, relation, virtualxid,
virtualtransaction, pid, mode, granted
from pg_locks left outer join pg_class on pg_class.oid = pg_locks.relation
where pid = 2288;
    locktype    | database |       relname        | relation |  virtualxid  | virtualtransaction | pid  |      mode       | granted
 ---------------+----------+----------------------+----------+--------------+--------------------+------+-----------------+---------
  relation      |    41194 | context_objects_pkey | 95318843 |              | 123/45694692       | 2288 | AccessShareLock | t
  relation      |    41194 | context_objects      |    41553 |              | 123/45694692       | 2288 | RowShareLock    | t
  virtualxid    |          |                      |          | 123/45694692 | 123/45694692       | 2288 | ExclusiveLock   | t
  transactionid |          |                      |          |              | 123/45694692       | 2288 | ShareLock       | f
  tuple         |    41194 | context_objects      |    41553 |              | 123/45694692       | 2288 | ExclusiveLock   | t
 (5 rows)

-- 
Kevin M. Goess
Software Engineer
Berkeley Electronic Press
kgo...@bepress.com

510-665-1200 x179
www.bepress.com

bepress: sustainable scholarly publishing


Re: [GENERAL] odd locking behaviour

2013-07-23 Thread Alvaro Herrera
Moshe Jacobson escribió:
> I wish one of the PG developers would respond to this...

Have you seen the thread in pgsql-bugs?
http://www.postgresql.org/message-id/e1uwamw-vh...@wrigleys.postgresql.org

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [GENERAL] [HACKERS] maintenance_work_mem and CREATE INDEX time

2013-07-23 Thread Jeff Janes
On Tue, Jul 23, 2013 at 1:23 AM, Amit Langote  wrote:
> On Tue, Jul 23, 2013 at 1:11 PM, Amit Langote  wrote:
>> Hello,
>>
>> While understanding the effect of maintenance_work_mem on the time taken
>> by CREATE INDEX, I observed that for values of maintenance_work_mem
>> less than the value at which an internal sort is performed, the time
>> taken by CREATE INDEX increases as maintenance_work_mem increases.
>>
>> My guess is that for all those values an external sort is chosen at
>> some point, and the larger the value of maintenance_work_mem, the later
>> the switch to external sort is made, causing CREATE INDEX to take
>> longer. That is, a smaller value of maintenance_work_mem would be
>> preferred when an external sort is going to be performed anyway. Does
>> that make sense?
>>
>
> Upon further investigation, I found that the delay in switching to
> external sort caused by a larger value of maintenance_work_mem is
> small compared to the total time of CREATE INDEX.

If you are using trace_sort to report that, it reports the switch as
happening as soon as it runs out of memory.

At that point, all we have been doing is reading tuples into memory.  The
time it takes to do that will depend on maintenance_work_mem, because
that affects how many tuples fit in memory.  But all the rest of the
tuples need to be read sooner or later anyway, so pushing more of them
to later doesn't improve things overall; it just shifts timing around.

After it reports the switch, it still needs to heapify the existing
in-memory tuples before the tapesort proper can begin.  This is where
the true lost opportunities start to arise, as the large heap starts
driving cache misses which would not happen at all under different
settings.

Once the existing tuples are heapified, it then continues to use the
heap to pop tuples from it to write out to "tape", and to push newly
read tuples onto it.  This also suffers lost opportunities.

Once all the tuples are written out and it starts merging, then the
large maintenance_work_mem is no longer a penalty as the new heap is
limited by the number of tapes, which is almost always much smaller.
In fact this stage will actually be faster, but not by enough to make
up for the earlier slow down.

So it is not surprising that the time before the switch is reported is
a small part of the overall time difference.


Cheers,

Jeff




Re: [GENERAL] process deadlocking on its own transactionid?

2013-07-23 Thread Jeff Janes
On Tue, Jul 23, 2013 at 12:54 PM, Kevin Goess  wrote:
> We're seeing a problem with some of our processes hanging on locks.  The
> select below makes it look like it's *waiting* for a ShareLock on
> transactionid, but it *has* an ExclusiveLock on the same value in
> virtualxid.

It has an ExclusiveLock on itself, but that is independent of the
ShareLock it is waiting for.

The transaction it is waiting for is in the transactionid column,
which is not in your select list.  The virtualxid column seems pretty
useless to me, I don't really know why it is there.  Also, since you
are filtering for only the blocked pid, you will not see the blocking
pid in your results, which is probably what you really want to see.

> That makes it look like the process has deadlocked on its own transactionid.
> Or are we reading the results of this query wrong, and this is expected
> behavior, and our problem lies elsewhere?

You are reading the results wrong, which is very easy to do.

For this type of lock, you need to join the table to itself on the
transactionid column.

http://wiki.postgresql.org/wiki/Lock_Monitoring
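
The core of it is a self-join along these lines (the wiki page has the full
query; this is only a sketch):

  SELECT blocked.pid  AS blocked_pid,
         blocking.pid AS blocking_pid
  FROM pg_locks blocked
  JOIN pg_locks blocking
    ON blocking.transactionid = blocked.transactionid
   AND blocking.granted
  WHERE NOT blocked.granted
    AND blocked.locktype = 'transactionid';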

Cheers,

Jeff




Re: [GENERAL] process deadlocking on its own transactionid?

2013-07-23 Thread Alvaro Herrera
Jeff Janes escribió:

> The transaction it is waiting for is in the transactionid column,
> which is not in your select list.  The virtualxid column seems pretty
> useless to me; I don't really know why it is there.

If you do CREATE INDEX CONCURRENTLY and it has to wait for other
processes to finish their current transactions (in order for them to
release their snapshots, which is what it needs to ensure the index can
be enabled), it will use the virtualxid.
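
Such waits show up in pg_locks as an ungranted request of locktype
'virtualxid', e.g. visible with something like:

  SELECT locktype, virtualxid, pid, mode, granted
  FROM pg_locks
  WHERE locktype = 'virtualxid' AND NOT granted;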

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[GENERAL] Why are stored procedures looked on so negatively?

2013-07-23 Thread Some Developer
I've done quite a bit of reading on stored procedures recently and the 
consensus seems to be that you shouldn't use them unless you really must.


I don't understand this argument. If you implement all of your logic in 
the application then you need to make a network request to the database 
server, return the required data from the database to the app server, do 
the processing and then return the results. A stored procedure is going 
to be a lot faster than that even if you just take away network latency 
/ transfer time.


I'm in the middle of building a database and was going to make extensive 
use of stored procedures and trigger functions because it makes more 
sense for the actions to happen at the database layer rather than in the 
app layer.


Should I use them or not?




Re: [GENERAL] Why are stored procedures looked on so negatively?

2013-07-23 Thread hidayat365
I presume you're referring to triggers. Triggers often do something
automagically :) and that can sometimes make it hard for developers to debug
when something goes wrong, since they are not aware that triggers exist in
the database.

Stored procedures are OK.

CMIIW (correct me if I'm wrong)
Sent from my BlackBerry®
powered by Sinyal Kuat INDOSAT

-----Original Message-----
From: Some Developer 
Sender: pgsql-general-owner@postgresql.org
Date: Wed, 24 Jul 2013 01:29:14
To: 
Subject: [GENERAL] Why are stored procedures looked on so negatively?

I've done quite a bit of reading on stored procedures recently and the 
consensus seems to be that you shouldn't use them unless you really must.

I don't understand this argument. If you implement all of your logic in 
the application then you need to make a network request to the database 
server, return the required data from the database to the app server, do 
the processing and then return the results. A stored procedure is going 
to be a lot faster than that even if you just take away network latency 
/ transfer time.

I'm in the middle of building a database and was going to make extensive 
use of stored procedures and trigger functions because it makes more 
sense for the actions to happen at the database layer rather than in the 
app layer.

Should I use them or not?




Re: [GENERAL] Why are stored procedures looked on so negatively?

2013-07-23 Thread Adrian Klaver

On 07/23/2013 05:29 PM, Some Developer wrote:

I've done quite a bit of reading on stored procedures recently and the
consensus seems to be that you shouldn't use them unless you really must.

I don't understand this argument. If you implement all of your logic in
the application then you need to make a network request to the database
server, return the required data from the database to the app server, do
the processing and then return the results. A stored procedure is going
to be a lot faster than that even if you just take away network latency
/ transfer time.

I'm in the middle of building a database and was going to make extensive
use of stored procedures and trigger functions because it makes more
sense for the actions to happen at the database layer rather than in the
app layer.

Should I use them or not?


Personally I figure the arguments for and against are closely correlated 
with where on the development chain you are, and are tied in with job 
security. If you are an app developer then it is in your interest to 
have the code in the app; if you are a database developer, in the database. 
Me, I tend to go with your argument about keeping procedures, where 
appropriate, in the database for the reasons you state. In other words, 
an API in the database.








--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] Why are stored procedures looked on so negatively?

2013-07-23 Thread Kevin Goess
On Tue, Jul 23, 2013 at 5:40 PM, Adrian Klaver wrote:

> On 07/23/2013 05:29 PM, Some Developer wrote:
>
>> I'm in the middle of building a database and was going to make extensive
>> use of stored procedures and trigger functions because it makes more
>> sense for the actions to happen at the database layer rather than in the
>> app layer.
>>
>> Should I use them or not?
>>
>
> Personally I figure the arguments for and against are closely correlated
> with where on the development chain you are, and are tied in with job
> security. If you are an app developer then it is in your interest to have
> the code in the app; if you are a database developer, in the database.


What he says is very true. But make sure to think about things that may
already be set up to manage the application code: versioning, testing, unit
testing, packaging, release process, and documentation--how much of that is
in place for your stored procedures and triggers?  If a developer makes a
change to application code, it gets checked in to source control, unit
tested, code reviewed, goes through some QA, and is staged for the next
roll to production--will that all happen for your stored procedures?  And
consider: there is already logic in the application; now some of the logic
will be in the application and some of it will be in the database--does it
make sense to have it in two places?

I think those are the kind of concerns that make people shy about putting
too much logic in the database.  None of them are insurmountable, but you
should at least think about them.


Re: [GENERAL] Why are stored procedures looked on so negatively?

2013-07-23 Thread John Meyer
Taking an absolutist position either way is pretty blind.   What is the 
purpose of the procedure?  Is it enforcing business rules?  Are these 
rules that must be enforced against already existing data, or are they 
more akin to validation of a credit card?  How many people are accessing 
your database at one time?  And most importantly, what are you best at?


Adrian Klaver wrote:

On 07/23/2013 05:29 PM, Some Developer wrote:

I've done quite a bit of reading on stored procedures recently and the
consensus seems to be that you shouldn't use them unless you really 
must.


I don't understand this argument. If you implement all of your logic in
the application then you need to make a network request to the database
server, return the required data from the database to the app server, do
the processing and then return the results. A stored procedure is going
to be a lot faster than that even if you just take away network latency
/ transfer time.

I'm in the middle of building a database and was going to make extensive
use of stored procedures and trigger functions because it makes more
sense for the actions to happen at the database layer rather than in the
app layer.

Should I use them or not?


Personally I figure the arguments for and against are closely 
correlated with where on the development chain you are, and are tied 
in with job security. If you are an app developer then it is in your 
interest to have the code in the app; if you are a database developer, in 
the database. Me, I tend to go with your argument about keeping 
procedures, where appropriate, in the database for the reasons you 
state. In other words, an API in the database.














Re: [GENERAL] Why are stored procedures looked on so negatively?

2013-07-23 Thread Some Developer

On 24/07/13 01:55, John Meyer wrote:

Taking an absolutist position either way is pretty blind.   What is the
purpose of the procedure?  Is it enforcing business rules?  Are these
rules that must be enforced against already existing data or are they
more akin to validation of a credit card?  How many people are accessing
your database at one time?  And most importantly, what are you best at?


Basically what happens is an object is created in the application and 
saved to the database. When the insert has completed I need to start a 
process immediately based on the information in the object on another 
server (Amazon Simple Queue Service, to be precise).


So basically I'll have a trigger function that fires on INSERTs and does 
this work. That way the action will only be performed on INSERTs that 
have successfully completed and I can be sure that the trigger will 
always fire.


On top of that there are a few common data structures that need to be 
written to the database that would be perfect for stored procedures 
since there is a little logic involved in saving them which shouldn't 
really be exposed to the application developers.
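
The rough shape I have in mind is something like this (all names are
placeholders, and the call that actually talks to the queue is just stubbed
out with a NOTIFY here):

  CREATE OR REPLACE FUNCTION queue_new_object() RETURNS trigger AS $$
  BEGIN
      -- stub: hand the new row's id to whatever pushes it to the external queue
      PERFORM pg_notify('object_created', NEW.id::text);
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER object_created_trg
      AFTER INSERT ON my_objects
      FOR EACH ROW EXECUTE PROCEDURE queue_new_object();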






Re: [GENERAL] Why are stored procedures looked on so negatively?

2013-07-23 Thread Andrew Sullivan
On Tue, Jul 23, 2013 at 06:55:56PM -0600, John Meyer wrote:
> are accessing your database at one time?  And most importantly, what
> are you best at?

That is one of the most important questions, for sure, but there's a
close second that I'd suggest: what are the scaling properties?

For practical purposes, if you're going to do complicated data
validation and business logic in the application, you have any
significant degree of contention, and you need to write some data, the
use pattern is going to look something like this (A is application, D
is database):

A: get some data 
D: here you go, optimistic lock value L
A: do some work
A: given this value, get some more data
D: here you go, optimistic lock value L2
A: INS/UPD/DEL data, optimistic lock value L, RETURNING data
D: ok, here you go, optimistic lock value L3
A: do some work
A: INS/UPD/DEL data, optimistic lock value L3
D: ok

And that's if none of the optimistic locks fails.  That's a lot of
round trips.  If you have 20 transactions a minute, this is just fine.
If you have 2000 transactions per second, it totally sucks: you're
buried in round trips.

In my experience, if you want your application to scale to large
numbers of users, you need to avoid application<->database round
trips.
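
For illustration, the same unit of work packaged as a single server-side
function, so the application makes one round trip (everything here is made
up, just to show the shape):

  CREATE FUNCTION do_unit_of_work(p_id int, p_input text) RETURNS text AS $$
  DECLARE
      result text;
  BEGIN
      -- all the reads, checks and writes happen inside this one transaction
      SELECT payload INTO result FROM some_table WHERE id = p_id FOR UPDATE;
      UPDATE some_table SET payload = p_input WHERE id = p_id;
      RETURN result;
  END;
  $$ LANGUAGE plpgsql;

  -- the application then makes a single call:
  -- SELECT do_unit_of_work(42, 'new value');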

Best,

A

-- 
Andrew Sullivan
a...@crankycanuck.ca




Re: [GENERAL] [HACKERS] maintenance_work_mem and CREATE INDEX time

2013-07-23 Thread Amit Langote
On Wed, Jul 24, 2013 at 6:02 AM, Jeff Janes  wrote:
> On Tue, Jul 23, 2013 at 1:23 AM, Amit Langote  wrote:
>> On Tue, Jul 23, 2013 at 1:11 PM, Amit Langote  
>> wrote:
>>> Hello,
>>>
>>> While understanding the effect of maintenance_work_mem on the time taken
>>> by CREATE INDEX, I observed that for values of maintenance_work_mem
>>> less than the value at which an internal sort is performed, the time
>>> taken by CREATE INDEX increases as maintenance_work_mem increases.
>>>
>>> My guess is that for all those values an external sort is chosen at
>>> some point, and the larger the value of maintenance_work_mem, the later
>>> the switch to external sort is made, causing CREATE INDEX to take
>>> longer. That is, a smaller value of maintenance_work_mem would be
>>> preferred when an external sort is going to be performed anyway. Does
>>> that make sense?
>>>
>>
>> Upon further investigation, I found that the delay in switching to
>> external sort caused by a larger value of maintenance_work_mem is
>> small compared to the total time of CREATE INDEX.
>
> If you are using trace_sort to report that, it reports the switch as
> happening as soon as it runs out of memory.
>
> At that point, all we have been doing is reading tuples into memory.  The
> time it takes to do that will depend on maintenance_work_mem, because
> that affects how many tuples fit in memory.  But all the rest of the
> tuples need to be read sooner or later anyway, so pushing more of them
> to later doesn't improve things overall; it just shifts timing around.
>
> After it reports the switch, it still needs to heapify the existing
> in-memory tuples before the tapesort proper can begin.  This is where
> the true lost opportunities start to arise, as the large heap starts
> driving cache misses which would not happen at all under different
> settings.
>
> Once the existing tuples are heapified, it then continues to use the
> heap to pop tuples from it to write out to "tape", and to push newly
> read tuples onto it.  This also suffers lost opportunities.
>
> Once all the tuples are written out and it starts merging, then the
> large maintenance_work_mem is no longer a penalty as the new heap is
> limited by the number of tapes, which is almost always much smaller.
> In fact this stage will actually be faster, but not by enough to make
> up for the earlier slow down.
>
> So it is not surprising that the time before the switch is reported is
> a small part of the overall time difference.
>

So, is it the actual sorting (before merging) that suffers with a larger
maintenance_work_mem? I am sorry, but I cannot grasp the complexity of the
external sort code at this point, so all I can say is that during an
external sort a smaller value of maintenance_work_mem is beneficial
(based on my observations in tests). But how that follows from what is
going on in the implementation of external sort is still something I
am working on understanding.


-- 
Amit Langote




Re: [GENERAL] [HACKERS] maintenance_work_mem and CREATE INDEX time

2013-07-23 Thread Amit Langote
On Wed, Jul 24, 2013 at 11:30 AM, Amit Langote  wrote:
> On Wed, Jul 24, 2013 at 6:02 AM, Jeff Janes  wrote:

>> If you are using trace_sort to report that, it reports the switch as
>> happening as soon as it runs out of memory.
>>
>> At that point, all we have been doing is reading tuples into memory.  The
>> time it takes to do that will depend on maintenance_work_mem, because
>> that affects how many tuples fit in memory.  But all the rest of the
>> tuples need to be read sooner or later anyway, so pushing more of them
>> to later doesn't improve things overall; it just shifts timing around.
>>
>> After it reports the switch, it still needs to heapify the existing
>> in-memory tuples before the tapesort proper can begin.  This is where
>> the true lost opportunities start to arise, as the large heap starts
>> driving cache misses which would not happen at all under different
>> settings.
>>
>> Once the existing tuples are heapified, it then continues to use the
>> heap to pop tuples from it to write out to "tape", and to push newly
>> read tuples onto it.  This also suffers lost opportunities.
>>
>> Once all the tuples are written out and it starts merging, then the
>> large maintenance_work_mem is no longer a penalty as the new heap is
>> limited by the number of tapes, which is almost always much smaller.
>> In fact this stage will actually be faster, but not by enough to make
>> up for the earlier slow down.
>>
>> So it is not surprising that the time before the switch is reported is
>> a small part of the overall time difference.
>>
>
> So, is it the actual sorting (before merging) that suffers with a larger
> maintenance_work_mem? I am sorry, but I cannot grasp the complexity of the
> external sort code at this point, so all I can say is that during an
> external sort a smaller value of maintenance_work_mem is beneficial
> (based on my observations in tests). But how that follows from what is
> going on in the implementation of external sort is still something I
> am working on understanding.
>

Or whether the increased CREATE INDEX time follows from something else
altogether (not any part of the external sort) may be still another
question. Since we have to relate it to maintenance_work_mem, the
first thing I could think of was to look at the sorting part of it.



-- 
Amit Langote




Re: [GENERAL] Why are stored procedures looked on so negatively?

2013-07-23 Thread Some Developer

On 24/07/13 02:56, Andrew Sullivan wrote:

On Tue, Jul 23, 2013 at 06:55:56PM -0600, John Meyer wrote:

are accessing your database at one time?  And most importantly, what
are you best at?


That is one of the most important questions, for sure, but there's a
close second that I'd suggest: what are the scaling properties?

For practical purposes, if you're going to do complicated data
validation and business logic in the application, you have any
significant degree of contention, and you need to write some data, the
use pattern is going to look something like this (A is application, D
is database):

 A: get some data
 D: here you go, optimistic lock value L
 A: do some work
 A: given this value, get some more data
 D: here you go, optimistic lock value L2
 A: INS/UPD/DEL data, optimistic lock value L, RETURNING data
 D: ok, here you go, optimistic lock value L3
 A: do some work
 A: INS/UPD/DEL data, optimistic lock value L3
 D: ok

And that's if none of the optimistic locks fails.  That's a lot of
round trips.  If you have 20 transactions a minute, this is just fine.
If you have 2000 transactions per second, it totally sucks: you're
buried in round trips.

In my experience, if you want your application to scale to large
numbers of users, you need to avoid application<->database round
trips.

Best,

A



Thanks for the response. Obviously, since I am still in the development 
stage, I have no idea of the number of transactions I will need to handle, 
but the business has the potential to be quite popular, so I'd rather be 
safe than sorry and be able to handle large amounts of traffic from day one.

I think ultimately it'll be simpler this way, because the system I am 
developing is a quasi-distributed system with lots of independent parts 
that need to be able to communicate and to share data with each other.





Re: [GENERAL] Why are stored procedures looked on so negatively?

2013-07-23 Thread Albe Laurenz
Some Developer wrote:
> On 24/07/13 01:55, John Meyer wrote:
> > Taking an absolutist position either way is pretty blind.   What is the
> > purpose of the procedure?  Is it enforcing business rules?  Are these
> > rules that must be enforced against already existing data or are they
> more akin to validation of a credit card?  How many people are accessing
> > your database at one time?  And most importantly, what are you best at?
> 
> Basically what happens is an object is created in the application and
> saved to the database. When the insert has completed I need to start a
> process immediately based on the information in the object on another
> server (Amazon Simple Queue Service, to be precise).
> 
> So basically I'll have a trigger function that fires on INSERTs and does
> this work. That way the action will only be performed on INSERTs that
> have successfully completed and I can be sure that the trigger will
> always fire.

If you want to write a (trigger) function that starts a process on
a remote machine, there are a few points to think about:

- Should the INSERT fail if the remote process cannot be started?
  If yes, then a trigger is a good idea.
- If you code it as a trigger, be aware that the transaction
  is not complete until the remote process has been started.
  That might be a noticeable delay and might affect concurrency
  negatively.

Yours,
Laurenz Albe



Re: [GENERAL] [HACKERS] maintenance_work_mem and CREATE INDEX time

2013-07-23 Thread Amit Langote
On Wed, Jul 24, 2013 at 3:20 AM, Jeff Janes  wrote:
> On Mon, Jul 22, 2013 at 9:11 PM, Amit Langote  wrote:
>> Hello,
>>
>> While understanding the effect of maintenance_work_mem on the time taken
>> by CREATE INDEX, I observed that for values of maintenance_work_mem
>> less than the value at which an internal sort is performed, the time
>> taken by CREATE INDEX increases as maintenance_work_mem increases.
>>
>> My guess is that for all those values an external sort is chosen at
>> some point, and the larger the value of maintenance_work_mem, the later
>> the switch to external sort is made, causing CREATE INDEX to take
>> longer. That is, a smaller value of maintenance_work_mem would be
>> preferred when an external sort is going to be performed anyway. Does
>> that make sense?
>
> The heap structure used in external sorts is cache-unfriendly.  The
> bigger the heap used, the more this unfriendliness becomes apparent.
> And the bigger maintenance_work_mem, the bigger the heap used.
>
> The bigger heap also means you have fewer "runs" to merge in the
> external sort.  However, as long as the number of runs still fits in
> the same number of merge passes, this is generally not a meaningful
> difference.

Do fewer runs mean more time (in whichever phase of the external sort)?


-- 
Amit Langote

