Robert,

On 09/17/2010 05:52 PM, Robert Haas wrote:
> Technically, you could start an autonomous transaction from within an
> autonomous transaction, so I don't think there's a hard maximum of one
> per normal backend.  However, I agree that the expected case is to not
> have very many.

Thanks for pointing that out. I somehow knew that was wrong...

> I guess it depends on what your goals are.

Agreed.

> If you're optimizing for
> ability to respond quickly to a sudden load, keeping idle backends
> will probably win even when the number of them you're keeping around
> is fairly high.  If you're optimizing for minimal overall resource
> consumption, though, you'll not be as happy about that.

What resources are we talking about here? Are idle backends really that resource hungry? My feeling so far has been that idle processes are relatively cheap (e.g. some 100 idle processes shouldn't hurt on a modern server).
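One way to get a rough number for that feeling is to sum the resident set sizes of the idle backends, e.g. via `ps -C postgres -o rss=`. The parsing step below is a small illustrative Python sketch (the sample values are made up, and note that RSS double-counts shared memory such as shared buffers across backends, so it overstates the true per-process cost):

```python
def total_rss_kb(ps_rss_output: str) -> int:
    """Sum RSS values (in kB) from `ps -o rss=` output, one value per line.

    Caveat: RSS includes touched shared-memory pages, so summing it
    across backends overcounts memory that is actually shared.
    """
    return sum(int(line) for line in ps_rss_output.split() if line.strip())

# Hypothetical RSS of three idle backends, in kB:
sample = "5240\n5188\n5312\n"
print(total_rss_kb(sample))  # -> 15740
```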

> What I'm
> struggling to understand is this: if there aren't any preforked
> workers around when the load hits, how much does it slow things down?

As the startup code is pretty much the same as for the current avlauncher, the coordinator can only request one bgworker at a time.

This means the signal needs to reach the postmaster, which then forks a bgworker process. That new process starts up, connects to the requested database and then sends an imessage to the coordinator to register. Only after receiving that registration can the coordinator request another bgworker (note that this is an overall limitation, not one per database).
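The serialized request cycle described above can be sketched as follows (illustrative Python only, not actual PostgreSQL code; all timing constants are assumptions). The point is that, because the coordinator cannot issue the next fork request until the previous worker has registered, ramp-up time grows linearly with the number of workers needed:

```python
# Assumed per-step costs in milliseconds (made-up numbers):
FORK_MS = 2.0      # postmaster receives signal, forks bgworker
CONNECT_MS = 15.0  # new process connects to database, loads caches
REGISTER_MS = 0.5  # imessage registration reaches the coordinator

def serialized_rampup(num_workers: int) -> float:
    """Total milliseconds until num_workers bgworkers are registered,
    given that the coordinator requests them strictly one at a time."""
    per_worker = FORK_MS + CONNECT_MS + REGISTER_MS
    return num_workers * per_worker

if __name__ == "__main__":
    for n in (1, 10, 50):
        print(f"{n:3d} workers: {serialized_rampup(n):7.1f} ms")
```

Under these assumed costs, 50 workers take 875 ms to come online, which illustrates why preforked spares help a connection-pool-like workload.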

I haven't measured the actual time it takes, but given the use case of a connection pool, I have so far assumed it's obvious that this process takes too long.

(It's exactly what Apache's prefork model does, no? Is anybody concerned about the idle processes there? Or do they consume far fewer resources?)

> I would have thought that a few seconds to ramp up to speed after an
> extended idle period (5 minutes, say) would be acceptable for most of
> the applications you mention.

A few seconds? That might be sufficient for autovacuum, but most queries are completed in less than one second. So for parallel querying, autonomous transactions and Postgres-R, I certainly don't think that a few seconds are reasonable. Especially considering the cost of idle backends.

> Is the ramp-up time longer than that,
> or is even that much delay unacceptable for Postgres-R, or is there
> some other aspect to the problem I'm failing to grasp?  I can tell you
> have some experience tuning this so I'd like to try to understand
> where you're coming from.

I didn't ever compare to a max_spare_background_workers = 0 configuration, so I don't have any hard numbers, sorry.

> I think this is an interesting example, and worth some further
> thought.  I guess I don't really understand how Postgres-R uses these
> bgworkers.

The given example doesn't only apply to Postgres-R. But with fewer bgworkers in total, you are more likely to want to use them all for one database, yes.

> Are you replicating one transaction at a time, or how does
> the data get sliced up?

Yes, one transaction at a time: one transaction per backend (bgworker). On a cluster of n nodes that performs only writing transactions, at an average rate of m concurrent transactions per node, you ideally end up with m normal backends and (n-1) * m bgworkers per node, concurrently applying the remote transactions.
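To put concrete (made-up) numbers on that sizing rule, a small Python sketch:

```python
def backends_per_node(n_nodes: int, m_concurrent: int) -> tuple[int, int]:
    """Per-node backend counts for the ideal case described above:
    m normal backends for local transactions, plus (n - 1) * m
    bgworkers applying the remote transactions of the other nodes."""
    normal = m_concurrent
    bgworkers = (n_nodes - 1) * m_concurrent
    return normal, bgworkers

# Hypothetical example: a 4-node cluster averaging 10 concurrent
# writing transactions per node.
normal, bg = backends_per_node(4, 10)
print(normal, bg)  # -> 10 30  (10 normal backends, 30 bgworkers per node)
```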

> I remember you mentioning
> sync/async/eager/other replication strategies previously - do you have
> a pointer to some good reading on that topic?

Postgres-R is mainly eager multi-master replication. www.postgres-r.org has some links; the most up-to-date is my concept paper:
http://www.postgres-r.org/downloads/concept.pdf

> That seems like it would be useful, too.

Okay, will try to come up with something soon(ish).

Thank you for your feedback and constructive criticism.

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
