from:"Andrew Sullivan"

[HACKERS] Time zone database

2011-03-03 Thread Andrew Sullivan

Hackers,

Today on the ietf-applications list, I saw this:

http://www.ietf.org/mail-archive/web/apps-discuss/current/msg02301.html

If there are particular issues with respect to the time zone database
stuff that you all have struggled with and want highlighted, feel free
to send them to me and I'll try to figure out whether they're relevant
to this Internet Draft and the new plans.  I will be at the IETF
meeting in Prague, though I won't plan to attend the session in
question unless someone tells me I ought.

If none of this is relevant to Postgres, sorry for the noise.  I just
saw it in passing and remember some of the annoyances that happened in
the past.

Also, if you want me to see what you have to say, send your mail
directly to me or cc: me.  I can't really keep up with the volume on
this list, and I'm likely to miss it if it's only here.

Best,

A

-- 
Andrew Sullivan
a...@crankycanuck.ca

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Time zone database

2011-03-03 Thread Andrew Sullivan

On Thu, Mar 03, 2011 at 09:27:58PM +0200, Heikki Linnakangas wrote:
> Yes, we use David Olson's tz database. According to this proposal, David  
> Olson is retiring, and they propose that IETF takes over maintainership  
> of the tz database.

Yeah, I guess I ought to have summarized.  That is indeed the plan.

> The idea seems to be that the work to keep the database and tz code  
> current would continue like before, just in the hands of different  
> people, so I don't think this affects us in any way.

There is the possibility that the IETF will be somehow less quick to
cope with changes.  (The IETF is not a speedy way to get anything
done.)  I think that's the biggest reservation I've heard expressed.

Anyway, as long as nobody's worried, I can stand mute :)

Thanks for the reply.

A

-- 
Andrew Sullivan
a...@crankycanuck.ca

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Managing the community information stream

2007-05-08 Thread Andrew Sullivan

On Mon, May 07, 2007 at 07:36:55AM -0500, Jim Nasby wrote:

> Instead, if all feature requests are tracked then users can vote on  
> what's most important to them.

I am sympathetic to the issues you and Andrew are describing (I
understand Bruce's stream analogy, but I think Andrew is right that
from the user's point of view, it's not usable).  But I am not
convinced that users voting on desired features will get us the
users' desired features.  The features we get are mostly the features
that have attracted developers.  The method by which that attraction
happens is interesting, but I don't think it's democratic.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Information security isn't a technological problem.  It's an economics
problem.
--Bruce Schneier

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Not ready for 8.3

2007-05-17 Thread Andrew Sullivan

On Tue, May 15, 2007 at 04:52:16PM -0400, Alvaro Herrera wrote:

> This is what happens with the Linux kernel.  They have hundreds of
> developers getting their hands dirty during a previous period.  Then
> 2.6.20 is released; the 2.6.21 "merge window" opens, and all sort of
> patches are flooded in.  

I hasten to point out that the Linux kernel has also had several
"stable" releases with huge bugs -- things like massive filesystem
corruption, bizarre failure cases, and nasty compatibility problems
with modules across versions.  I am not entirely sure that the Linux
model is the one to ape.  PostgreSQL has a history with remarkably
few of those blunders, and I'd hate to give that up.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
If they don't do anything, we don't need their acronym.
--Josh Hamilton, on the US FEMA

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Signing off of patches (was Re: Not ready for 8.3)

2007-05-18 Thread Andrew Sullivan

On Fri, May 18, 2007 at 12:05:55PM -0400, Alvaro Herrera wrote:

> there are no obvious, glaring mistakes could go a long way.  (I have
> this weird idea that I should not apply a patch unless someone else says
> "hey, looks OK to me".  Somehow, the mere lack of objections does not
> increase my confidence.)

I have nothing to contribute on the suggestion, since I can neither
offer review nor patches.  But I can offer an analogy that will maybe
strengthen your point, and might offer a hint of how to make the
developer community bigger.

In the IETF working groups I follow, most of the chairs have decided
to impose some baseline level of group review for protocol documents. 
In dnsop, for instance, we have a rule that if at least five people
do not review an Internet Draft and agree to its publication, it just
won't get advanced as a working group document.  The idea is that, if
we can't get that small number of reviews, then either the working
group either isn't interested in the feature or topic, or the draft
is a bad idea as it stands.  

As a result, if you want to have the suasion to get people to review
your own submissions, you also have to do the work of reviewing
others'.  But it also means that if you're new to an area, you can
become better in that area by doing document review.  Probably, your
own reviews won't uncover big flaws that those more experienced with
the protocol will find; but you'll be able to make some small
contributions that will allow you help in getting the documents
finished.  Also, while you're at it, you'll be forced to read all the
referenced documents, which help you learn about the protocol and
therefore make you more valuable to the WG.

Perhaps, then, new contributors to Postgres could also take on the
task of reviewing some of the patches, not as a matter of being the
_only_ reviewer -- the new code still needs review by those more
experienced with the rest of the code -- but as a first-pass review
that will help in a "more eyeballs" sort of way.  This would also
have the happy paedogogical effect that those newer reviewers would
learn more of the code in each cycle.  I think this is similar to a
previous suggestion someone made about "mentored review", but it
doesn't require formal mentoring for it to get started.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Users never remark, "Wow, this software may be buggy and hard 
to use, but at least there is a lot of code underneath."
--Damien Katz

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-08 Thread Andrew Sullivan

On Fri, Jun 08, 2007 at 09:50:49AM +0100, Heikki Linnakangas wrote:

> dynamics change. But we must also keep in mind that average DBA doesn't 
> change any settings, and might not even be able or allowed to. That 
> means the defaults should work reasonably well without tweaking the OS 
> settings.

Do you mean "change the OS settings" or something else?  (I'm not
sure it's true in any case, because shared memory kernel settings
have to be fiddled with in many instances, but I thought I'd ask for
clarification.)

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Users never remark, "Wow, this software may be buggy and hard 
to use, but at least there is a lot of code underneath."
--Damien Katz

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] COPYable logs status

2007-06-08 Thread Andrew Sullivan

On Fri, Jun 08, 2007 at 10:29:03AM -0400, Tom Lane wrote:
> 
> The only solution that I can see is to invent some sort of simple
> protocol for the syslogger pipe. 

Perhaps having a look at the current IETF syslog discussion will be
helpful in that case?  (I know it's not directly relevant, but maybe
others have thought about some of these things.  I haven't read the
draft, note.)

http://tools.ietf.org/html/draft-ietf-syslog-protocol-20

There's also the discussion of reliability in RFC 3195:

ftp://ftp.rfc-editor.org/in-notes/rfc3195.txt

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The whole tendency of modern prose is away from concreteness.
--George Orwell

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-08 Thread Andrew Sullivan

On Fri, Jun 08, 2007 at 10:33:50AM -0400, Greg Smith wrote:
> they'd take care of that as part of routine server setup.  What wouldn't 
> be reasonable is to expect them to tune obscure parts of the kernel just 
> for your application.

Well, I suppose it'd depend on what kind of hosting environment
you're in (if I'm paying for dedicated hosting, you better believe
I'm going to insist they tune the kernel the way I want), but you're
right that in shared hosting for $25/mo, it's not going to happen.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
"The year's penultimate month" is not in truth a good way of saying
November.
--H.W. Fowler

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Avoiding legal email signatures

2007-06-10 Thread Andrew Sullivan

On Sat, Jun 09, 2007 at 06:14:00PM -0400, Bruce Momjian wrote:
> I know we have talked about how to avoid legal email signatures on this
> list.  One idea would be for a small percentage of our users to ignore
> emails with a legal signature.  I know I am less likely to reply to such
> an email.

The problem with that is that you ding people inside large
corporations that are _trying_ to adopt PostgreSQL in the face of
bad corporate policies (like "we standardise on product O" or "all
outbound email gets garbage L appended").  Moreover, people who are
in such environments are often prevented from visiting gmail,
hotmail, or the other likely suspects in order to send their messages
in circumvention of corporate policy.  And remember, such people may
not actually be able to prevent the signature going on by just
ignoring policy -- often, it's added at the gateway on the way out
of the server.

I know they're irritating and stupid, but in the context of a mailing
list they also have zero effect, because the mailing list address is
explicitly public.  I also know that they use extra space in the list
archive, but if we attempted to purge the list archives of every
worthless bit of nonsense in there, surely this wouldn't be the
number one thing on the list (the semi-annual eruption of knee-jerk
"threads are better" discussions probably take more room, for
example).

What we _could_ do, I suppose, is start mail-writing campaigns to
legal departments in companies that insist on such disclaimers,
pointing out the folly of their ways and asking that the policy be
changed to distinguish between list-posting and non-list-posting
accounts.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The plural of anecdote is not data.
--Roger Brinner

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Avoiding legal email signatures

2007-06-10 Thread Andrew Sullivan

On Sun, Jun 10, 2007 at 12:50:11PM -0400, Greg Smith wrote:

> This is all true, but the reality here is that people in such a situation 
> are usually flat-out violating their corporate policy by posting to the 
> list at all from inside this kind of company.  

We don't know that in advance, and we can't know it, either.  If
someone wants to do something wrong, that's not our responsibility or
fault; and by posting to a mailing list, the purported cover offered
by the disclaimer is lost (on this we seem to have ample agreement).

> I've watched more than one attempt to sneak open-source source into a 
> large enterprise get completely blown away because unapproved mailing list 
> involvement to resolve issues became associated with making corporate 
> information public.

It might not be "sneaking"; it might in fact be two camps within a
company disagreeing about this.  Imagine, for instance, the case
where the operations department of a company (to which the current
DBAs report) are opposed to any changes, because their Oracle DBAs
feel threatened and their boss thinks his career is enhanced by a
large budget under management.  At the same time, the new product
development department is under pressure to lower costs and deliver
new services without adding more licenses.  Since operations controls
the mail servers and the firewalls (and are adding the disclaimer),
the developers will get no help from the operations people in making
things work better.  But if the developers really do deliver a new
service that costs substantially less than, say, what it would have
with Oracle, Postgres gradually finds a place in the company.  And
these developers have a mandate to change things.  That's how change
happens in large companies, and we have to remember that we won't be
talking to the people who don't want the change to happen.

> Ever watched someone get fired for responding to "can you post your config 
> file?" in an environment where that's a clear violation of corporate 
> policy?  I have.

You bet.  But what we seem to be asking in this thread is that people
find some way to violate what is clearly a corporate policy, or we
won't help them.  We don't _know_ whether some other policy is being
violated, and it's not our responsibility to know it either.  Since
we claim that we have such great community support, though, we do
have a responsibility at least to try to support people.

Perhaps we make a policy that corporate-style ("disclaimered") mail
is encouraged to seek support via corporate-style channels (e.g. is
pointed at the commercial support companies).  I'm uncomfortable with
such a policy, but it'd be better than "ignore these nasty corporate
victims", which is what the proposal so far sounds like to me.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
I remember when computers were frustrating because they *did* exactly what 
you told them to.  That actually seems sort of quaint now.
--J.D. Baldwin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: What does Page Layout version mean? (Was: Re: [HACKERS] Reducing NUMERIC size for 8.3)

2007-06-20 Thread Andrew Sullivan

On Wed, Jun 20, 2007 at 12:34:21PM -0400, Robert Treat wrote:
> FWIW pg_migrator is a pretty good swing at an in-place upgrade tool for 
> 8.1->8.2.   Unfortunately until the PGDG decides that in-place upgrade is a 
> constraint their willing to place on development, I see them a good 
> chicken/egg away from making it a continually usefull tool. 

Or maybe cart/horse.  It seems to me that the rule more likely needs
to be that the migrator follow the development of the database than
that the database engine be strongly constrained by the needs of an
upgrade tool.  I agree that some commitment is needed, though.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The whole tendency of modern prose is away from concreteness.
--George Orwell

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-21 Thread Andrew Sullivan

On Thu, Jun 21, 2007 at 03:24:51PM +0200, Michael Paesold wrote:
> There are valid reasons against 5m as mega-bytes, because here m does 
> not refer to a unit, it refers to a quantifier (if that is a reasonable 
> English word) of a unit. So it should really be 5mb.
> 
> log_rotation_age = 5m
> log_rotation_size = 5mb

Except, of course, that "5mb" would be understood by those of us who
work in metric and use both bits and bytes as 5 millibits.  Which
would be an absurd value, but since Postgres had support for time
travel once, who knows what other wonders the developers have come up
with ;-)  (I will note, though, that this B vs b problem really gets
up my nose, especially when I hear people who are ostensibly
designing networks talking about "gigabyte ethernet" cards.  I would
_like_ such a card, I confess, but to my knowledge the standard
hasn't gotten that far yet.)

Nevertheless, I think that Tom's original suggestion was at least a
HINT, which seems perfectly reasonable to me.  

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well. 
--Dennis Ritchie

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] GUC time unit spelling a bit inconsistent

2007-06-21 Thread Andrew Sullivan

On Thu, Jun 21, 2007 at 11:55:56AM -0400, Tom Lane wrote:
> where the HINT gets appended if there's something after the integer but
> it doesn't look like any of the allowed units.  Objections?

Sounds like a good idea to me.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The very definition of "news" is "something that hardly ever happens."  
--Bruce Schneier

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL

2007-06-25 Thread Andrew Sullivan

On Sat, Jun 23, 2007 at 06:14:23PM +0200, Magnus Hagander wrote:
> The benefit would be that PostgreSQL would be "secure by default". Which
> we are *not* today.

To achieve the "secure by default" feature that you want (and I like
the scare-quotes -- I agree with those that think this adds no real
security, but I think you're right to worry about the perception
angle in this case), why not have a ./configure option that sets the
default trust level for the build?  The option could default to
something "secure", but experienced users' build scripts would only
have to be altered to include --default-authentication="trust" or
something like that.  Using this approach, packagers can also
continue to do what they want.

A
-- 
Andrew Sullivan  | [EMAIL PROTECTED]
However important originality may be in some fields, restraint and 
adherence to procedure emerge as the more significant virtues in a 
great many others.   --Alain de Botton

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL

2007-06-25 Thread Andrew Sullivan

On Mon, Jun 25, 2007 at 01:31:52PM -0400, Tom Lane wrote:
> Why is that better than the initdb-time option we already have?
> Locking down options earlier rather than later is usually not a win.

Like I said, I don't actually think it _is_ better.  But it would
solve the problem that some people think it's a bad thing that you
run superuser-type commands without reading the manual, and then get
a badly-secured system.  (The idea here, incidentally, is not to
replace the initdb-time option, but to set the default of the initdb
command.)

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
When my information changes, I alter my conclusions.  What do you do sir?
--attr. John Maynard Keynes

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Bugtraq: Having Fun With PostgreSQL

2007-06-26 Thread Andrew Sullivan

On Tue, Jun 26, 2007 at 03:08:59PM -0400, Tom Lane wrote:
> It still wouldn't make us "secure by default".  Not unless you propose
> to actually change the default.

Perhaps I wasn't clear enough, but the suggestion I made (for
something I actually don't care about, I hasten to add again) was
that the configure switch _would_ be set to something different by
default.  So it would require that those who need trust to work would
have to enable it on purpose; but (I suggested) at a not-too-invasive
point.  It'd just be another configure option, and you usually have
to reconsider those at major version upgrade anyway.  Nevertheless,

> In any case, what is "secure by default"?  

. . .I agree that the checkbox is a dumb one.  I think the entire
topic isn't worth the amount of electrons already spilled over it.  I
don't see anyone clamouring for this and I see at least one person
strongly opposed.  I think it should be regarded as Dead, Jim.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
I remember when computers were frustrating because they *did* exactly what 
you told them to.  That actually seems sort of quaint now.
--J.D. Baldwin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] 2PC-induced lockup

2007-07-11 Thread Andrew Sullivan

On Wed, Jul 11, 2007 at 10:43:23AM -0400, Chris Browne wrote:
> The right resolution to this is not, a priori, evident yet.

_A posteriori_, though, it seems to me the right resolution is "don't
do that" ;-)

A
-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Users never remark, "Wow, this software may be buggy and hard 
to use, but at least there is a lot of code underneath."
--Damien Katz

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] 2PC-induced lockup

2007-07-11 Thread Andrew Sullivan

On Wed, Jul 11, 2007 at 04:44:12PM +0100, Heikki Linnakangas wrote:
> FWIW, deleting the files from pg_twophase is safe when the system is 
> shut down.

Is it safe for the PREPAREd TRANSACTIONs?  I assume not.  That is, in
Peter's presumably experimental case, it might be ok to delete the
files, but on a production system, you'd violate the semantics of 2PC
by doing this?

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Users never remark, "Wow, this software may be buggy and hard 
to use, but at least there is a lot of code underneath."
--Damien Katz

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] 2PC-induced lockup

2007-07-11 Thread Andrew Sullivan

On Wed, Jul 11, 2007 at 06:15:12PM +0100, Heikki Linnakangas wrote:
> It's effectively the same as manually issuing a ROLLBACK PREPARED. It 
> will brake the atomicity of the global transaction, if some branches of 
> that global transaction in other resource managers have already been 
> committed.

But how do you know which file to delete?  Is it keyed to the
transaction identifier or something?

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
However important originality may be in some fields, restraint and 
adherence to procedure emerge as the more significant virtues in a 
great many others.   --Alain de Botton

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] 2PC-induced lockup

2007-07-11 Thread Andrew Sullivan

On Wed, Jul 11, 2007 at 09:26:34PM +0100, Heikki Linnakangas wrote:
> 
> The xid is encoded in the filename. If you can't start up the database 
> and look at pg_locks, you can't do much other than guess.

So then in this sort of case, it isn't _really_ safe to delete those
files, because the commitment you made before crash when you accepted
a PREPARE TRANSACTION is going to be gone, which violates the 2PC
rules.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
When my information changes, I alter my conclusions.  What do you do sir?
--attr. John Maynard Keynes

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] 2PC-induced lockup

2007-07-12 Thread Andrew Sullivan

On Wed, Jul 11, 2007 at 10:47:25PM +0100, Simon Riggs wrote:
> expertise to isolate this as the error. I would prefer to explicitly
> avoid this kind of error, so that we can return to the idea that
> removing pg_twophase is never a requirement.

This was pretty much my point.  It's one thing to say, "If you are
completely hosed, you will lose some data."  But 2PC is making some
pretty strong promises, and I sort of hate it that it's not real hard
to break things in such a way that those promises have to be broken.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
When my information changes, I alter my conclusions.  What do you do sir?
--attr. John Maynard Keynes

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] 2PC-induced lockup

2007-07-12 Thread Andrew Sullivan

On Wed, Jul 11, 2007 at 06:09:55PM -0400, Tom Lane wrote:
> This is really pretty silly to be getting worked up about.  The command
> in question wouldn't have been allowed at all except to a superuser,
> and there are plenty of ways to catastrophically destroy your database
> when you are superuser; most of which we will never consider blocking

I think the problem is it is nowise obvious that LOCK [sometable];
PREPARE TRANSACTION is, as superuser, dangerous in the way that DROP
CASCADE is.  Or rm -rf, for that matter.  It seems to me that at the
very least, some pretty extensive documentation of the perils of
running 2PC as a superuser ought to be available, and if it's there,
it didn't leap out at me.  

But the other problem I see here is that the solution hits more than
just the problematic state.  If we have bad pages on disk, for
instance, we zero pages; we don't drop the table.  Similarly, it
seems that all that's necessary here is an external tool to grovel
through the prepared transaction files and somehow figure out what
locks, if any, they entail.  Wouldn't that be enough to then allow
you to shoot the relevant file, thereby losing only that prepared
transaction instead of all of them?  (I'm not suggesting that no data
must ever be lost in this case; just that we should lose the minimum
necessary to make the system work.)

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
--Alexander Hamilton

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Straightforward changes for increased SMP scalability

2007-07-16 Thread Andrew Sullivan

On Mon, Jul 16, 2007 at 01:23:46PM +0100, Simon Riggs wrote:
> Both of these changes are simple enough to consider for 8.3

I'm in favour of scalability, of course, but are they really simple
enough to put in for 8.3?  I was under the impression that there was
a push on to get the thing shipped, and adding incremental changes
near the end of the cycle strikes me as a possible source of
significant additional surprises (and therefore delays).  I am no
code expert, though; I just wanted to be sure there's consensus on
the simplicity of the changes.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well. 
--Dennis Ritchie

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] bit string functions

2007-07-16 Thread Andrew Sullivan

On Mon, Jul 16, 2007 at 09:40:18AM -0700, TJ O'Donnell wrote:
> I would like to make these a part of postgresql for others to use.
> Is it more appropriate for these to be in contrib code
> or part of the postgresql proper?
> How can I contribute these?

I would say just set up a project on pgfoundry. 

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The very definition of "news" is "something that hardly ever happens."  
--Bruce Schneier

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Andrew Sullivan

On Thu, Aug 09, 2007 at 04:42:19PM +0200, Hans-Juergen Schoenig wrote:

> the problem here is that vendors of appliances don't want people to  
> spider their codes. this is a fact - it is not the idea of open  
> source to do so but bloody reality. in addition to that people are  
> not willing to code everything in C just to hide.

Well, then, they're out of luck.

> so, there has to be a concept to achieve this for stored procedures  
> somehow.

Just because you want something does not mean that there must be a
way to achieve it.

> i am afraid the source level encryption is the easiest thing and most  
> understandable thing to do.

It's easy and understandable, but it won't do anything for you,
because the code has to be decrypted automatically in order to run. 
If it can be decrypted by the system, then it can obviously be
decrypted by someone who has superuser access to the system too.  I
have seen systems that attempt this sort of "protection" anyway. 
(Indeed, one might argue that all of the hopeless and failed copy
protection systems CDs and DVDs have come with are exactly this sort
of trick.)  The fundamental fact is that, if you want users to be able
to use code that is interpreted, then you're never going to be able
to guarantee that the users can't also see that code.

Other equally useless, in my experience, techniques include code
obfuscation.  You could probably bind things up in enough levels of
nested calls to various functions that mapping it out would be tricky
and tedious.  Of course, this also means that maintaining the code is
a fairly significant pain.

I'll also note that I am not actually convinced these contortions do
anything to protect the system from people who want to figure out how
it works. 

None of that, of course, does anything to relieve the pressure on you
to deliver the requested feature.  But perhaps you can reason with
them.  And they can always write it in C and deliver only compiled
object code; although whether anyone would be foolish enough to run
such object code without inspecting its safety is an excercise in
speculation I don't care to pursue.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The plural of anecdote is not data.
--Roger Brinner

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] crypting prosrc in pg_proc

2007-08-09 Thread Andrew Sullivan

On Thu, Aug 09, 2007 at 05:16:43PM +0200, Hans-Juergen Schoenig wrote:
> we are shipping appliances - users can only use SQL; no compilers and  
> no SSH logins allowed ...

If you have shipped a physical thing to someone else, then they can
do what they want with it irrespective of what "rules" you think you
have imposed.  If you believe otherwise, then you are deluding
yourself about the security of your approach.  It was this sort of
magical thinking that caused SIM cards, WEP keys, DVD copy
protection, and other such "guaranteed safe" technologies to be
revealed as encumbered with needless weight that may foil casual
users, but that are trivially broken by anyone actually interested in
doing the breaking.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism. 
--Brad Holland

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] SQL feature requests

2007-08-24 Thread Andrew Sullivan

On Thu, Aug 23, 2007 at 02:06:16PM -0400, Chuck McDevitt wrote:
> In general, we wouldn't want to support any de facto standard that:
> 
>   1.  Is supported only by one vendor
>   2.  Causes any standard SQL statement to fail, or return a different
> answer from the standard.
> 
> The proposed change doesn't fail either of these.

>From what I can see upthread, it fails 1 and possibly 2.  Given that
we don't seem to know _why_ it is forbidden, there could well be a
case under 2 is a problem, and we haven't thought of it.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Everything that happens in the world happens at some place.
--Jane Jacobs 

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Obfuscated definitions of database objects

2007-08-24 Thread Andrew Sullivan

On Fri, Aug 24, 2007 at 04:59:13PM +0200, Charles N. Charotti wrote:

> I want to know if there is any plan in future versions
> of PG to obfuscate the source code of some objects of
> the database, specially the functions of the backend ?

There have been requests for this, but AFAIK nobody has committed to
doing it.  

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism. 
--Brad Holland

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] Password requirement in windows installer

2007-08-31 Thread Andrew Sullivan

On Fri, Aug 31, 2007 at 12:30:02PM -0500, Decibel! wrote:
> 
> Is it easy to spoof where an incoming connection request is coming from?
> Is there something else that makes ident on 127.0.0.1/32 insecure?

It shouldn't be easy.  Ident uses TCP, which is rather harder to
spoof.  If someone can originate spoofed TCP packets from 127.0.0.1,
you gots bigger problems than them being able to lie about the
identity of a user.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
However important originality may be in some fields, restraint and 
adherence to procedure emerge as the more significant virtues in a 
great many others.   --Alain de Botton

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Password requirement in windows installer

2007-08-31 Thread Andrew Sullivan

On Fri, Aug 31, 2007 at 07:07:40PM +0100, Gregory Stark wrote:
> >
> > It shouldn't be easy.  Ident uses TCP, which is rather harder to
> > spoof.  
> 
> Say what? It's actually quite easy to spoof TCP. There are even command-line
> tools to do it available in most Unix distributions.

Sorry, I should have been more precise.  It's hard to spoof TCP
easily and usefully.  It's trivial to spoof a TCP packet and send it,
but because TCP depends on the handshake, there's a bunch of
additional overhead that make the whole thing fragile: intermediate
firewalls and such like tend to detect these things and prevent their
continued use.  (This is all in comparison with UDP, which is
completely trivial to spoof.  In the absence of BCP38, it's also
almost impossible to detect, which is why the DNS is so vulnerable
these days.)

> > If someone can originate spoofed TCP packets from 127.0.0.1, you
> > gots bigger problems than them being able to lie about the
> > identity of a user.
> 
> Well yes, there are other insecure services which look at the
> originating ip address. 

I was thinking that, if your operating system is accepting packets
from 127.0.0.1 on an external interface, you're already in a world of
hurt.  And if someone is able to fiddle with the packets on your own
box, then they have root anyway; they can do anything they want.  If
you haven't set up your systems so that this kind of attack is
impossible from localhost, well, I don't think that anything at the
application's level of security is going to do you any good at all.

Indeed, I would argue that, for industrial-class data centre use, if
you can't use ident between machines, your network security is in
very bad shape.  (That isn't to say I think it's a good idea; but
rather, that I hope the network is well enough run that, even if you
did run it, it would not represent a real risk.)

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well. 
--Dennis Ritchie

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Password requirement in windows installer

2007-08-31 Thread Andrew Sullivan

On Fri, Aug 31, 2007 at 02:12:03PM -0500, Decibel! wrote:
> ISTM that if someone breaches your network to the point where they can
> spoof identd, you're pretty much hosed anyway; so what's the point of
> hard-coding passwords in a config file somewhere then?

True.  I personally prefer cryptographic authentication for this
reason (but it's not perfect either -- if someone has root, they have
root.  You're hosed).  None of my employers ever seem willing to pay
the additional overhead, however.

A
-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The plural of anecdote is not data.
--Roger Brinner

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Password requirement in windows installer

2007-08-31 Thread Andrew Sullivan

On Fri, Aug 31, 2007 at 08:20:20PM +0100, Gregory Stark wrote:
> Except note that ident is, like X, precisely the kind of protocol where the
> handshake matters least. Since you all the relevant data comes early in the
> message you can fire the SYN and the ACK (with the predicted sequence number)
> with the first data packet off right away.

Well, I certainly wasn't arguing the virtues of ident, in case that's
the impression you've formed.

> They may tend to but would you trust a system that depended entirely on a
> firewall for security? 

Of course not.  But the point is that, in real data centre use, you
have pretty serious problems if you have been compromised to the
point of people successfully spoofing TCP packets: _someone_ is on
your network.  I suppose there are people hanging database servers
directly off the internet, without a stateful firewall in between,
but I don't think those people can really be helped anyway.

> Uhm, no DNS is vulnerable because it isn't authenticated. 

Well, sorta.  But the spoofing-based DDoS attacks that are currently
so trivial against DNS wouldn't be possible if it used TCP all the
time, or if everyone implemented BCP38. 

> Well then lots of people are in a world of hurt. OSes only started adding
> filters like this about 8-9 years ago and there are plenty of people running
> Linux distributions older than this and other operating systems that are
> slower to take up new ideas. In any case there are common misconfigurations
> that defeat these kinds of filters too.

Sure.  As I said above, I don't think anyone was suggesting that this
should be a general strategy.  It's just one way to do things, if you
know what you're doing.

> happy to run ident on 127.0.0.1. But I would be a lot happier
> running on Unix domain sockets where it doesn't depend on an
> external daemon and ip filters -- just regular kernel credentials.

Right, but you can't use UNIX domain sockets with, say, JDBC.

> Just as an example, say you're running vmware or something like it
> and you're bridging it on to your network. Will your ip filters
> will still kick in for bridged packets? Are you 100.0% sure?

I dunno, but I do know that I'd test it before I started doing it :)

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism. 
--Brad Holland

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] [FEATURE REQUEST] Streaming Onlinebackup (Maybe OFFTOPIC)

2007-09-06 Thread Andrew Sullivan

On Thu, Sep 06, 2007 at 04:08:10PM +0200, apoc9009 wrote:
> >archive_timeout setting. It will produce a lot of log files with very
> >little content in them, but they will compress well.
> >  
> Yes, it is possible but not recommended . My Backup Servers Filesystem 
> will explode :D

. . .

> Correct, but  this is not good enought and i think there are a lot of 
> Peoples having the same Problem.
> It was wishfull, having a Online Streaming Backupsubsystem, thadt can 
> produce portable Backupdatabases,
> to prevent users for Millions of Archive Files on the Backup FTP-Server.

It seems that what you want is near-real-time online backups with _no
cost_, which is not a feature that I think anyone will ever work on.  

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
Users never remark, "Wow, this software may be buggy and hard 
to use, but at least there is a lot of code underneath."
--Damien Katz

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] PANIC caused by open_sync on Linux

2007-10-26 Thread Andrew Sullivan

On Fri, Oct 26, 2007 at 08:34:49AM -0400, Tom Lane wrote:
> we only check for the case once per checkpoint and we don't create a
> segment unless there's very little space left.

Sort of a filthy hack, but what about always having an _extra_
segment around?  The bgwriter could do that, no?

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] PANIC caused by open_sync on Linux

2007-10-29 Thread Andrew Sullivan

On Fri, Oct 26, 2007 at 10:39:12PM -0400, Greg Smith wrote:
> There's a couple of potential to-do list ideas that build on the changes 
> in this area in 8.3:

I think that's the right way to go.  It's too bad that this may still
happen in 8.3, but we're way past the point that this is a bug fix,
IMO.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The plural of anecdote is not data.
--Roger Brinner

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Simplifying Text Search

2007-11-12 Thread Andrew Sullivan

On Mon, Nov 12, 2007 at 08:09:48PM +, Simon Riggs wrote:
> 
> @@ would still exist, so no problems. These additions are for new users,
> not old ones.

Given that this is all sugar on top of tsearch anyway, why not put it in
pgfoundry as the tsearch_sugar project?  Then packagers could include a
standard set of such sugar if they wanted.

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] How to keep a table in memory?

2007-11-13 Thread Andrew Sullivan

On Mon, Nov 12, 2007 at 06:55:09PM -0800, Joshua D. Drake wrote:
> Cost is always an issue, even if implicit. If the person is so hung up 
> on the idea of pushing things into ram there is a pretty good 
> possibility they have priced out the 50 and 100 spindle devices needed 
> to get the same type of performance.

I'm not sure I agree with that.  The OP was claiming that this approach was
what worked for him with MS SQL Server, which makes me think that this is
the usual human habit of generalizing widely from a particular.  That is,
"X was a solution that worked once with another product, so I want to know
how to do X with your product."  We get these questions all the time, partly
because one has to re-learn all sorts of things when moving to PostgreSQL. 
For instance, most of the traditional real database systems don't
collaborate with the OS in memory and cache management.

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] How to keep a table in memory?

2007-11-13 Thread Andrew Sullivan

On Mon, Nov 12, 2007 at 10:54:34PM -0500, Tom Lane wrote:

> class.  But if that's your problem, "pin these tables in memory" is
> still an awfully crude solution to the problem.  I'd be inclined to
> think instead about a scheme that lets references made by
> higher-priority queries bump buffers' use-counts by more than 1,
> or some other way of making the priority considerations visible to an
> automatic cache management algorithm.

While this is true, nobody seems to have those other ways available today. 
If there was a quick and easy way to pin certain tables in memory, I think
that administrators might be well-advised to use that mechanism until such
time as the weighted-priority cacheing or whatever shows up.  (Of course,
AFAICT, there's no easy way to do the pinning, either, so this all seems a
little academic.)

I have to agree with what Tom says, however, about people thinking they're
smarter than the system.  Much of the time, this sort of thumb on the scale
optimisation just moves the cost to some other place, and the admin's
analysis isn't comprehensive enough to turn that up until it's all turned on
in production.

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] How to keep a table in memory?

2007-11-13 Thread Andrew Sullivan

On Tue, Nov 13, 2007 at 02:36:14PM -0500, Greg Smith wrote:
> Sure, but in this case the reasoning seems sound enough.  

Yes.  But. . .

> I see this as similar to the old optimizer hint argument, where there 
> certainly exist some edge cases where people know something the optimizer 
> doesn't which changes the optimal behavior.

. . .the abuse of such hints in applications I have seen is so rampant as to
make me doubt the utility of adding them anyway.  It's true that by adding
hints, you give a facility to a good, competent designer who has a really
peculiar case that no general purpose system is likely to solve well.  In
practice, however, it also seems to mean that every slack-jawed fool with
access to the manual thinks that he or she is going to "fix" the "broken"
query plan by forcing index scans where they're useless (has a week yet gone
by where someone doesn't post to -performance with that problem?).  So I'm
divided on whether actually providing the facility is a good idea, even
though I can think of a handful of cases where I doubt even the smartest
planner will get it right.  (By analogy, pinning in memory, and I'm
similarly divided.)

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Quality and Performance

2007-11-27 Thread Andrew Sullivan

On Tue, Nov 27, 2007 at 05:32:49PM +, Simon Riggs wrote:
> What I would really like to persuade everybody is that performance needs
> specific attention. 

[. . .]

> Your thoughts are welcome,

Well, one thing that might help is something of the specifics you mention.

I remember mentioning to Jan not long after he started at Afilias that we
occasionally saw strange behaviour that looked like "lock up".  He was
slightly incredulous, and I didn't have time to build a repeatable test
case.  So it was in the context of testing Slony that he discovered the dual
pains of buffer shuffling and checkpoint storms; this is part of what led
him to work on those problems in 8.0.

The key was to state, at the outset, "Here is the problem I want to fix."
By stating precisely and specifically what is to be fixed, the issue moves
from "performance needs" to a feature that can be implemented.

Perhaps now is the time to list some specific performance areas you want to
fix up?

A

-- 
Andrew Sullivan
Old sigs will return after re-constitution of blue smoke

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] WORM and Read Only Tables (v0.1)

2007-12-11 Thread Andrew Sullivan

On Tue, Dec 11, 2007 at 11:12:46AM +, Simon Riggs wrote:
> 
> Read-Only Tables
> 

In the past when this topic came up, there was some discussion of doing this
at a level somewhere below the table horizon.  There are a number of nasty
limitations for partitions currently (not the least of which is that real
uniqueness guarantees are impractical), so allowing users to specify some
segment of the table to be "read only" without imposing it on the whole
table would be awful nice.  I seem to recall Jan had an idea on how to do
it, but I could be wrong.

Also, doing this at the tuple, rather than table-wide, level might lead to
additional capabilities in this area:

> Attach
> --
> 
> Writing tables on one system and then moving that data to other systems
> is fairly common. If we supported read-only tables then you might
> consider how you would publish new versions to people.

Some time ago I was speculating on pie-in-the-sky features I might like in
Postgres, and it was something like this attach.  But the idea was somehow
related to the read-only tuples.

In my specific case, I have piles and piles of mostly useless data.
Sometimes, however, some of that data is possibly useful in retrospect.  So
the suggestion was to have tables that could be mostly offline -- archived
somewhere -- but for which we had enough metadata online to say, "You have
some data that might match in catalog C.  Go mount it, and I'll check."  I
think this is subtly different from the attach case you're outlining?

A

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] WORM and Read Only Tables (v0.1)

2007-12-12 Thread Andrew Sullivan

On Wed, Dec 12, 2007 at 12:14:43PM +0100, Zeugswetter Andreas ADI SD wrote:
> Uniqueness is currently perfectly practical, when the unique index
> contains
> the column[s] that is/are used in a non overlapping partitioning scheme.

Well, yes, assuming you have no bugs.  Part of the reason I want the
database to handle this for me is because, where I've come from, the only
thing I can be sure of is that there will be bugs.  There'll even be bugs
before there is running code.  One bug I can easily imagine is that the
non-overlapping partitioning scheme has a bug in it, such that it turns out
there _is_ an overlap some time.

All of that said, I agree with you, particularly about the alternative ways
things can suck instead :-/

A

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] WORM and Read Only Tables (v0.1)

2007-12-12 Thread Andrew Sullivan

On Wed, Dec 12, 2007 at 12:58:11PM +0100, Zeugswetter Andreas ADI SD wrote:
> Wouldn't one very substantial requirement of such storage be to
> have it independent of db version, or even db product? Keeping
> old hardware and software around can be quite expensive.

This was one of the explicit requirements I had when I wrote my pie in the
sky outline.  Hrm.  I wonder if I can get permission to post it.  Let me
find out.

The requirement was, anyway, that we be able to read old versions of
"archived" rows.  IIRC there was an implementation choice, whether we would
_never_ allow such rows to be SET READ WRITE or whether they'd be
immediately upgraded to the present format on SET READ WRITE.

A

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] WORM and Read Only Tables (v0.1)

2007-12-12 Thread Andrew Sullivan

On Wed, Dec 12, 2007 at 07:07:57PM +, Simon Riggs wrote:
> 
> Enforcing uniqueness with a global index has a number of disadvantages.

This is why I was trying to talk about "constraints" rather than global
indexes.  Just because we happen to implement them that way today does not
mean that such constraints need be implemented that way in every case.

I think especially for the sort of "detached rows" scenario I was dreaming
about, a global index is never going to be good.

A

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Negative LIMIT and OFFSET?

2007-12-13 Thread Andrew Sullivan

On Fri, Dec 14, 2007 at 01:47:23AM +, Gregory Stark wrote:
> Huh, I was all set to post an example of a useful application of it but then
> apparently I'm wrong and it doesn't work:

I dimly remember some discussion of this issue once before, maybe a year
ago.  My memory isn't what it was, and I can't find it by trolling archives,
but I recall Tom saying that it was dumb, yes, but don't do that, because
there's some reason not to change it.  I know, helpful search terms R me.

A

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Negative LIMIT and OFFSET?

2007-12-13 Thread Andrew Sullivan

On Thu, Dec 13, 2007 at 10:06:35PM -0500, Tom Lane wrote:
> of a negative limit, it's meeting its spec.  If you want to throw an
> error for negative limit, shouldn't you logically also throw an error

Should it be a WARNING?

A


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Negative LIMIT and OFFSET?

2007-12-13 Thread Andrew Sullivan

On Thu, Dec 13, 2007 at 10:01:43PM -0500, Jonah H. Harris wrote:
> Man, maybe my mad Google skillz are not as mad as I thought :(

Hey, I worked in a library some years ago, when Google was just a googlet,
and I couldn't find it either.  It's a dim memory, note.  Which could mean
"artifact".  I'm old!  I'm probably delusional too.

A


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Negative LIMIT and OFFSET?

2007-12-14 Thread Andrew Sullivan

On Fri, Dec 14, 2007 at 09:02:04AM +, Gregory Stark wrote:
> 
> Oh, and incidentally the problem with WARNING is that this is DML which could
> potentially be executing hundreds or thousands of times per minute. A WARNING
> is effectively an ERROR.

Good point.  Also, the sort of case where you're likely to be automatically
generating these negative values is also the sort of case where you have
various nice programmatic interfaces, many of which store up all the
warnings.  The warnings then have to be freed explicitly, which of course
means that by adding a warning, clients would suddenly start to chew through
piles of memory.

A

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Negative LIMIT and OFFSET?

2007-12-14 Thread Andrew Sullivan

On Thu, Dec 13, 2007 at 11:31:17PM -0500, Merlin Moncure wrote:
> 
> for historical record, this comment (subject not directly related to
> the OP) was probably this:
> http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg62562.html

Bingo.  Thanks!

A


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Negative LIMIT and OFFSET?

2007-12-16 Thread Andrew Sullivan

On Fri, Dec 14, 2007 at 06:42:24PM -0500, Tom Lane wrote:
> 
> How do people feel about applying this to 8.3, rather than holding it?

To me, this is a feature change, and therefore should be held.

A


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Negative LIMIT and OFFSET?

2007-12-17 Thread Andrew Sullivan

On Sun, Dec 16, 2007 at 12:31:11PM -0500, Tom Lane wrote:
> 
> Well, I wouldn't advocate making it in a minor release, but it's not
> clear how that translates into saying it can't go into 8.3.

Just because we're well past feature freeze, in beta.  I realise this seems
like a corner case, but the whole point of having betas where functionality
is more or less frozen is to reduce the liklihood that someone's testing on
(say) beta 2 is not all completely invalidated on beta 4.

A

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Testing mail list

2007-12-19 Thread Andrew Sullivan

On Wed, Dec 19, 2007 at 11:15:37AM -0500, Tom Lane wrote:
> hoping to draw responses from careless people?  I've heard of web
> comment-spammers who try to get other people to decode captchas
> for them this way.

Yes.  This is the latest spammer trick.  They get people all over the globe
to decode the captchas.  It's way easier than programming to decode the
captchas (which itself isn't that hard -- there are plenty of toolkits out
there that will decode such things for you).

A

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] function body actors (was: [PERFORM] viewing source code)

2007-12-21 Thread Andrew Sullivan

On Fri, Dec 21, 2007 at 12:09:28AM -0500, Merlin Moncure wrote:
> Maybe a key management solution isn't required.  If, instead of
> strictly wrapping a language with an encryption layer, we provide
> hooks (actors) that have the ability to operate on the function body
> when it arrives and leaves pg_proc, we may sidestep the key problem
> (leaving it to the user) and open up the doors to new functionality at
> the same time.

I like this idea much better, because the same basic mechanism can be used
for more than one thing, and it doesn't build in a system that is
fundamentally weak.  Of course, you _can_ build a weak system this way, but
there's an important difference between building a fundamentally weak system
and making weak systems possible.

A

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] function body actors (was: [PERFORM] viewing source code)

2007-12-21 Thread Andrew Sullivan

On Fri, Dec 21, 2007 at 12:40:05AM -0500, Tom Lane wrote:

> whether there is a useful policy for it to implement.  Andrew Sullivan
> argued upthread that we cannot get anywhere with both keys and encrypted
> function bodies stored in the same database (I hope that's an adequate
> summary of his point).  

It is.  I'm not a security expert, but I've been spending some time
listening to some of them lately.  The fundamental problem with a system
that stores the keys online in the same repository is not just its potential
for compromise, but its brittle failure mode: once the key is recovered,
you're hosed.  And there's no outside check of key validity, which means
attackers have a nicely-contained target to hit.

> I'm not convinced that he's right, but that has to be the first issue we
> think about.  The whole thing is a dead end if there's no way to do
> meaningful encryption --- punting an insoluble problem to the user doesn't
> make it better.

Well, one thing you could do with the proposal is build a PKCS#11 actor,
that could talk to an HSM.  Not everyone needs HSMs, of course, but they do
make online key storage much less risky (because correctly designed ones
make key recovery practically impossible).  So the mechanism can be made
effectively secure even for very strong cryptographic uses.

Weaker cases might use a two-level key approach, with a "data-signing key"
online all the time to do the basic encryption and validation, but a
key-signing key that is always offline or otherwise unavailable from within
the system.  The key signing key only authenticates (and doesn't encrypt)
the data signing key.  You could use a different actor for this, to provide
an interface to one-way functions or something.  This gives you a way to
revoke a data-signing key.  You couldn't protect already compromised data
this way, but at least you could prevent new disclosures.  

Yes, I'm being hand-wavy now, but I can at least see how these different
approaches are possible under the suggestion, so it seems like a possibly
fruitful avenue to explore.  The more I think about it, actually, the more I
like it.

A

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] function body actors (was: [PERFORM] viewing source code)

2007-12-21 Thread Andrew Sullivan

On Fri, Dec 21, 2007 at 01:57:44PM -0500, Tom Lane wrote:
> "Merlin Moncure" <[EMAIL PROTECTED]> writes:
> > ISTM the main issue is how exactly the authenticated user interacts
> > with the actor to give it the information it needs to get the real
> > key.  This is significant because we don't want to be boxed into an
> > actor implementation that doesn't allow that interaction.
> 
> We don't?  What purpose would such a setup serve?  I would think
> that for the applications we have in mind, the *last* thing you
> want is for the end user to hold the key.  The whole point of this
> is to keep him from seeing the function source code, remember?

Hmm; this may be exactly part of the problem, though.  It seems there are
two possible cases in play:

1.  Protect the content in the database (in this case, function bodies)
from _all_ users on a given server.  This is a case where you want to
protect (say) your function body from your users, because you have a
closed-source application.  

2.  Protect the content of a field from _some_ users on a given system,
based on the permissions they hold.  This is roughly analagous to others not
being able to look in the table I created, because I haven't GRANTed them
permission.

(2) is really a case for column-level access controls, I guess.  But if
we're trying to solve this problem too, then user passwords or something
make sense.

A

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] function body actors (was: [PERFORM] viewing source code)

2007-12-21 Thread Andrew Sullivan

On Fri, Dec 21, 2007 at 04:19:51PM -0500, Tom Lane wrote:
> > 2.  Protect the content of a field from _some_ users on a given system,
> 
> I would argue that (2) is reasonably well served today by setting up
> separate databases for separate users. 

I thought actually this was one of the use-cases we were hearing.  Different
people using the same database (because the same data), with rules about the
different staff being able to see this or that function body.  I can easily
imagine such a case, for instance, in a large organization with different
departments and different responsibilities.  It seems a shame that the only
answer we have there is, "Give them different databases."  

I actually think organizations that think keeping function bodies secret
like this to be a good idea are organizations that will eventually make
really stupid mistakes.  But that doesn't mean they're not under the legal
requirement to do this.  For instance, my current employer has
(externally-mandated) organizational conflict of interest rules that require
all disclosure to be done exclusively as "need to know".  Under the right
(!) legal guidance, such a requirement could easily lead to rules about
function-body disclosure.  From my point of view, such a use case is way
more compelling than function-body encryption (although I understand that
one too).

A

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Spoofing as the postmaster

2007-12-27 Thread Andrew Sullivan

On Mon, Dec 24, 2007 at 12:04:16AM +0100, Tomasz Ostrowski wrote:
> 
> Not at all, as it won't run as root, it'll just start as root and
> then give up all root privileges. The only thing it would have after
> being root is just an open socket.

If you think that is complete protection against privilege escalation, I
encourage you to read some more bugtraq archives.

The answer to MITM attacks is not superuser-reserved ports anyway.  The
privileged port idea was a bad one in retrospect.  The answer is strong
authentication. 

A

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Spoofing as the postmaster

2007-12-27 Thread Andrew Sullivan

On Sun, Dec 23, 2007 at 09:52:14PM +0100, Magnus Hagander wrote:
> My point is that all these other server products have the exact same
> issue. And that they deal with it the exact same we do - pretty much
> leave it up to the guy who configure the server to realize that's just
> how things work.

The problem with that approach is that, in the computer security world,
taking that approach is increasingly regarded as negligent.  And pointing
out that others are similarly negligent is not a response.

Note that I am explicitly not subscribing to or disagreeing with that view.

A

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Spoofing as the postmaster

2007-12-27 Thread Andrew Sullivan

On Sun, Dec 23, 2007 at 01:45:14AM -0500, Tom Lane wrote:
> 
> The primary reason things work like that is that there are boatloads of
> machines that are marginally misconfigured.  For instance, userland
> thinks there is IPv6 support when the kernel thinks not (or vice versa).

Not only "marginally misconfigured", but "broken as shipped", in the case of
some OSes.  And in those cases, you can't even fix it.

A


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Spoofing as the postmaster

2007-12-28 Thread Andrew Sullivan

On Sat, Dec 29, 2007 at 02:09:23AM +1100, Naz Gassiep wrote:
> In the web world, it is the client's responsibility to ensure that they 
> check the SSL cert and don't do their banking at 
> www.bankofamerica.hax0r.ru and there is nothing that the real banking 
> site can do to stop them using their malware infested PC to connect to 
> the phishing site. 

The above security model is exactly how we got into the mess we're in:
relying entirely on the good sense of a wide community of users is how
compromises happen.  Strong authentication authenticates both ways.

For instance, the web world you describe is not the only one.  Banks who
take security seriously have multiple levels of authentication, have trained
their users how to do this, and regularly provide scan tools to clients in
an attempt (IMO possibly doomed) to reduce the chances of input-device
sniffing. 

A

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Spoofing as the postmaster

2007-12-28 Thread Andrew Sullivan

On Fri, Dec 28, 2007 at 07:48:22AM -0800, Trevor Talbot wrote:
> I don't follow. What are banks doing on the web now to force clients
> to authenticate them, and how is it any different from the model of
> training users to check the SSL certificate?

Some banks (mostly Swiss and German, from what I've seen) are requiring
two-token authentication, and that second "token" is really the way that the
client authenticates the server: when you "install" your banking
application, you're really installing the keys you need to authenticate the
server and for the server to authenticate you.

> There's a fundamental problem that you can't make someone else do
> authentication if they don't want to, and that's exactly the situation
> clients are in. 

Right, but you can train users to expect authentication of the server.  One
way to do that is to require them to use an intrusive enough system that
they end up learning what to look for in a phish attack.  That said, I tend
to agree with you: if we had dnssec everywhere today, it's totally unclear
to me what client applications would do in the event they got a "bogus"
resolution.

A

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Slow count(*)

2008-01-02 Thread Andrew Sullivan

On Wed, Jan 02, 2008 at 09:29:24AM -0600, Abraham, Danny wrote:
> We are looking for a patch that will help us  count using the indexes.

Is this for 

SELECT count(*) FROM table;

or 

SELECT count(1) FROM table WHERE. . .

The latter _will_ use an index, if the index is correct, the statistics are
right, and the index selectivity is worth the cost of reading the index. 
The former will not use an index at all, because the answer depends on
visibility, and you can't know that without reading the table.  If you're
counting how many rows are in the table (for, for instance, display
purposes), you probably need to do something else.

> Our product is about 20 times slower on Postgres compared to MS SQL
> Server.
> 
> Any ideas?

Not without the queries, the EXPLAIN ANALYZE plans, and some information
about the database.

A

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Index performance

2008-01-02 Thread Andrew Sullivan

On Wed, Jan 02, 2008 at 05:53:35PM +0200, Brian Modra wrote:
> This table is added to in real time, at least 10 rows per second.

[. . .]

> If I do a select which uses the pkey index, where equal to the ID
> column, and greater than one of the values, which should return about
> 1500 rows, it sometimes takes 1/2 minute to return, and other times
> takes only seconds.
> 
> Is it the number of rows being added in real time, that is maybe
> causing the index to be locked?

No, it's probably a bad plan.  A minimum 10 rows/second is probably just
making the statistics for the table look bad.  You likely want to SET
STATISTICS wider on the 1st (~150 distinct values) column, and then run
ANALYSE on the table very frequently.  Are you updating or deleting at all? 
If so, that will also affect things: you need to perform very frequent
VACUUM on that table in that case.

Aside from that generic advice, it's impossible to say more without EXPLAIN
ANALYSE output for the slow and fast examples.

A

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps

2008-01-04 Thread Andrew Sullivan

On Fri, Jan 04, 2008 at 01:29:55PM +0100, Markus Schiltknecht wrote:
> 
> Agreed. Just a minor note: I find "marked read-only" too strong, as it 
> implies an impossibility to write. I propose speaking about mostly-read 
> segments, or optimized for reading or similar.

I do want some segments to be _marked_ read-only: I want attempted writes to
them to _fail_.

A


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Index performance

2008-01-04 Thread Andrew Sullivan

On Thu, Jan 03, 2008 at 07:11:07AM +0200, Brian Modra wrote:
> Thanks, I think you have me on the right track. I'm testing a vacuum
> analyse now to see how long it takes, and then I'll set it up to
> automatically run every night (so that it has a chance to complete
> before about 6am.)

Note that "VACUUM ANALYSE" and "ANALYSE" are not identical: the former also
performs vacuum.  On a table that is not updating that often but that is
expanding rapidly, you may not need that extra I/O.  Analyse on its own can
perform just the statistical sampling.  If you're not creating dead tuples
with UPDATE, DELETE, or ROLLBACK, that might be enough most of the time.

A

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] SSL over Unix-domain sockets

2008-01-04 Thread Andrew Sullivan

On Fri, Jan 04, 2008 at 02:37:03PM -0500, Bruce Momjian wrote:
> The problem with adding SSL to local sockets is this slippery slope
> where we only do part of the job, but it isn't clear where to draw the
> line.

I don't think "part of the job" for a patch is a slippery slope.  It's what
you do with patches for issues discovered too late to make full release:
pick the least invasive thing you can do, release that patch, and implement
the full feature later.

A

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps

2008-01-04 Thread Andrew Sullivan

On Fri, Jan 04, 2008 at 10:26:54PM +0100, Markus Schiltknecht wrote:
> 
> I'm still puzzled about how a DBA is expected to figure out which 
> segments to mark. 

I think that part might be hand-wavy still.  But once this facility is
there, what's to prevent the current active segment (and the rest) from also
getting this mark, which would mean "the table is read only"?  

A

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps

2008-01-07 Thread Andrew Sullivan

On Sat, Jan 05, 2008 at 08:02:41PM +0100, Markus Schiltknecht wrote:
> Well, management of relations is easy enough, known to the DBA and most 
> importantly: it already exists. Having to set up something which is 
> *not* tied to a relation complicates things just because it's an 
> additional concept.

But we're already dealing with some complicated concepts.

There isn't anything that will prevent current-style partitioning strategies
from continuing to work in the face of Simon's proposal.  But let me see if
I can outline the sort of cases where I see real value in what he's
outlined.

There is a tendency in data systems to gather all manner of data that, in
retrospect, _might_ turn out to be be valuable; but which, at the time, is
not really valuable at all.  Moreover, the value later on might be
relatively low: if you can learn something much later from that data, and do
so easily, then it will be worth doing.  But if the work involved passes
some threshold (say 1/2 a day), it's suddenly not worth it any more.  It's
simple economics: below a certain cost, the data is valuable.  Above a
certain cost, you simply shouldn't keep the data in the first place, because
the cost of using it is higher than any value you'll likely be able to
extract.

Simon's proposal changes the calculations you have to do.  If keeping some
data online longer does not impose administrative or operational overhead
(you have it marked read only, so there's no I/O for vacuum; you don't need
to do anything to get the data marked read only; &c.), then all it costs is
a little more disk, which is relatively cheap these days.  More importantly,
if the longer-term effect of this strategy is to make it possible to move
such data offline _without imposing a big cost_ when moving it back online,
then the value is potentially very high.

Without even trying, I can think of a dozen examples in the past 5 years
where I could have used that sort of functionality.  Because the cost of
data retrieval was high enough, we had to decide that the question wasn't
worth answering.  Some of those answers might have been quite valuable
indeed to the Internet community, to be frank; but because I had to pay the
cost without getting much direct benefit, it just wasn't worth the effort. 

The thing about Simon's proposal that is beguiling is that it is aimed at
a very common use pattern.  The potential for automatic management under
such a use pattern makes it seem to me to be worth exploring in some detail.

> Agreed. I'd say that's why the DBA needs to be able to define the split 
> point between partitions: only he knows the meaning of the data.

I think this is only partly true.  A casual glance at the -general list will
reveal all manner of false assumptions on the parts of administrators about
how their data is structured.  My experience is that, given that the
computer has way more information about the data than I do, it is more
likely to make the right choice.  To the extent it doesn't do so, that's a
problem in the planning (or whatever) algorithms, and it ought to be fixed
there.

A

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] VACUUM FULL out of memory

2008-01-07 Thread Andrew Sullivan

On Mon, Jan 07, 2008 at 10:40:23AM +0100, Michael Akinde wrote:
> As suggested, I tested a VACUUM FULL ANALYZE with 128MB shared_buffers 
> and 512 MB reserved for maintenance_work_mem (on a 32 bit machine with 4 
> GB RAM). That ought to leave more than enough space for other processes 
> in the system. Again, the system fails on the VACUUM with the following 
> error (identical to the error we had when maintenance_work_mem was very 
> low.
> 
> INFO:  vacuuming "pg_catalog.pg_largeobject"
> ERROR:  out of memory
> DETAIL:  Failed on request of size 536870912

Something is using up the memory on the machine, or (I'll bet this is more
likely) your user (postgres?  Whatever's running the postmaster) has a
ulimit on its ability to allocate memory on the machine.  

> It strikes me as somewhat worrying that VACUUM FULL ANALYZE has so much 
> trouble with a large table. Granted - 730 million rows is a good deal - 

No, it's not really that big.  I've never seen a problem like this.  If it
were the 8.3 beta, I'd be worried; but I'm inclined to suggest you look at
the OS settings first given your set up.

Note that you should almost never use VACUUM FULL unless you've really
messed things up.  I understand from the thread that you're just testing
things out right now.  But VACUUM FULL is not something you should _ever_
need in production, if you've set things up correctly.

A

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps

2008-01-07 Thread Andrew Sullivan

On Mon, Jan 07, 2008 at 07:16:35PM +0100, Markus Schiltknecht wrote:
> 
> Does anything speak against letting the DBA handle partitions as relations?

Yes: it doesn't solve the problem I have, which is that I don't want to have
to manage a whole bunch of tables.  I want one table, and I want to be able
to say, "That section is closed". 

> Sure, there's value in Simon's proposal. But it has pretty strict 
> requirements. IMO, it's pretty hard to say, if it would have helped at 
> all for your cases. Any of them still available to check?

No, but one of your worries doesn't bother me:

> Remember the requirements: no single tuple in the segment may be 
> significantly out of the average bounds. Otherwise, the min/max gets 
> pretty useless and the segment can never be excluded.

The segment can never be excluded in a search on that key, yes.  But
consider the likely cases we're looking at: 

WHERE some_date >= '1999-01-01' AND some_date < '2001-01-01';
WHERE sequence_field BETWEEN 3000 AND 30;

&c.  These are the two obvious cases: you're searching for data in a given
date range or for primary (sometimes artificial) identifiers in a range,
and the source data increases (almost) monotonically.  You have to do this
now anyway, because there's _some_ basis on which you're partitioning your
data; but today, you do this with a lot of fooling around with views and
nasty triggers that push data into the "right" table, assuming someone
doesn't screw it up.  

> need to maintain CLUSTERed ordering, aren't there better ways? For 
> example, you could use binary searching on the relation directly, much 
> like with indices, instead of sequentially scanning on the CLUSTERed 
> relation. That would even give us some sort of "indices with visibility".

I think this is a nice idea too :)

> Well, Postgres doesn't automatically create indices, for a counter example.

Yes, and it has no data-use analyser tools that automatically suggest
indexes, either.  That's the sort of thing people coming from other (err,
"Other" ;-) products complain about, in fact.

> definitely has more information available, than the computer. A DBA 
> (hopefully) knows future plans and emergency strategies for the storage 
> system, for example. 

Perhaps my jaundice comes from too much time spent in operational trenches,
but while good DBAs have some ideas about that, large numbers of them are
harried and overwhelmed just by the piles of work they already have. 
Nevertheless, while what you say is true, I'm not sure what it has to do
with the present case.  I don't think the current proposal is to address
partitioning across table spaces.  It's to do with the way certain segments
of a table are interpreted by the system.  It's undoubtedly true that this
strategy is of questionable utility for many kinds of use of PostgreSQL. 
But it seems to offer very significant advantages for one use-pattern that
is very common.

That said, I am not trying to argue it should be adopted without poking at
its weaknesses.  I just think it unfair to ask the proposal to address
problems it's not really aimed at.

A

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] VACUUM FULL out of memory

2008-01-08 Thread Andrew Sullivan

On Tue, Jan 08, 2008 at 09:50:07AM +0100, Michael Akinde wrote:
> stack size  (kbytes, -s) 8192

Perhaps this is the issue?  (I don't know.)  Also, this _is_ for the
postgres user, right?  That's the relevant one: the one that's actually
running the back end process.  

Also, are you sure there's nothing else in the way?  I don't remember what
OS you're using.  On AIX, for instance, there's some _other_ dopey setting
that allows you to control user resource consumption as well, and it means
that ulimit's answers are not the full story.  (I learned this through
painful experience, and confess it's one of the many reasons I think AIX
should be prounounced as one word, rather than three letters.)

> Andrew Sullivan wrote:
> > Something is using up the memory on the machine, or (I'll bet this is 
> more
> > likely) your user (postgres? Whatever's running the postmaster) has a
> > ulimit on its ability to allocate memory on the machine.
> 
> If one looks at the system resources while the VACUUM FULL is going up, 
> its pretty obvious that its a postgres process going on a memory 
> allocation rampage that eats up all the resources.

Of course VACUUM FULL is eating up as much memory as it can: it's moving a
lot of data around.  But is it in fact exhausting memory on the machine? 
There are only two possibilities: either there's something else that is
preventing that allocation, or else you've run into a case so unusual that
nobody else has ever seen it.  The data you're talking about isn't that big:
I've run similar-sized databases on my laptop without pain.  

> Or in this case: if VACUUM FULL is never required (except in very 
> special circumstances), it might be a good idea not to have VACUUM 
> recommend running it (cf. the VACUUM I ran before New Year on a similar 
> size table).

The suggestion you see there, though, is in fact one of the cases where you
might in fact want to run it.  That is,

> WARNING: relation "pg_catalog.pg_largeobject" contains more than 
> "max_fsm_pages" pages with useful free space HINT: Consider using VACUUM 
> FULL on this relation or increasing the configuration parameter 
> "max_fsm_pages".

what it is saying is that a regular vacuum can no longer recover all the
dead pages in the table, and if you want that space back and marked usable
on your disk, you have to run VACUUM FULL (or, in fact, CLUSTER, or else
dump and reload the table.  But one of these).  Note that I said that, if
you have things configured _correctly_, you shouldn't have to run VACUUM
FULL except in unusual circumstances.  That doesn't mean "never".  The
problem here is an historical one: you have a "hangover" from previous
missed maintenance or sub-optimal vacuum scheduling.  In those cases, you
may want to perform VACUUM FULL, provided you understand the potential side
effects (like possibly slower inserts initially, and some possible index
bloat).

A

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] VACUUM FULL out of memory

2008-01-08 Thread Andrew Sullivan

On Tue, Jan 08, 2008 at 05:27:16PM +0100, Michael Akinde wrote:
> >  
> Those are the ulimits of the db_admin account (i.e., the user that set 
> up and runs the DB processes). Is Postgres limited by other settings?

Are you sure?

On one system I used many years ago, /bin/sh wasn't what I thought it was,
and so the ulimit that I got when logged in was not what the postmaster was
starting under.  Took me many days to figure out what was up.

A

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps

2008-01-08 Thread Andrew Sullivan

On Tue, Jan 08, 2008 at 01:08:52AM +0100, Markus Schiltknecht wrote:
> 
> Uh, which key are you talking about? AFAIU Simon's proposal, he suggests 
> maintaining min/max values for all columns of the table.

Right, but I think that's just because that approach is automatable.  Only
some use cases are going to be approproate to this.

> Yeah, and if only *one* tuple in the 1G segment has:
> 
>   some_date <= '1998-12-31' OR some_date >= '2001-01-01'
> 
> Segment Exclusion can't exclude that segment. That's all I'm saying.

Correct.

> Huh? I'm certainly not the one asking for it. Quite the opposite, I'm 
> warning from over-estimating the use of SE.

Right; I think one should be clear that there are many -- maybe most --
uses of PostgreSQL where the proposal will be of no use.  I just think we
need to be clear that for the areas where it _can_ be useful, it could be
very useful indeed.

A

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Dynamic Partitioning using Segment Visibility Maps

2008-01-08 Thread Andrew Sullivan

On Tue, Jan 08, 2008 at 02:12:28AM +, Gregory Stark wrote:

> > Yes: it doesn't solve the problem I have, which is that I don't want to
> > have to manage a whole bunch of tables.  I want one table, and I want to
> > be able to say, "That section is closed".
> 
> That's not your problem, that's the solution you're looking for. You're
> assuming a particular solution in your problem statement.

Probably in that one, yes.  I'm still waiting for permission to post my
original problem statement; I suspect it's not going to be forthcoming by
next Monday, so it's not going to happen.

But I did outline something like what I'm talking about elsewhere in this
thread.  For my case, I'm thinking of the sort of data that builds up over
time, and most of which happens probably not to be useful at any moment, but
all of which _might_ be useful over the long haul.  So what I wanted,
originally, was to be able to set arbitrary ranges of tuples to be
read-only, and to be able to set them offline if I wanted.  Pseudo-DDL:

ALTER TABLE foo
SET read_only='t'
WHERE created_on < '2007-01-01';

ALTER TABLE foo
SET tuple_offline='t'
WHERE created_on < '2006-01-01';

Now, the second segment is marked "offline".  If I query the table for
things in that range, I get an ERROR telling me there might be data in the
range, but it's not mounted at the moment.  If I try to update records in
the read-only (first) range, I get an error telling me the tuple is marked
read only.  The idea then is that these older tuples can be put off into
long-term storage (wave hands here about the management of that stuff), and
this keeps my online system compact but yet allows me, for just the cost
of mounting a backup tape and reading the segments back, to go back and
query those old ranges.

The case I was particularly aiming at originally was for a case of data that
cannot cost more than fractions of pennies to store, but that might
represent a hugely expensive liability if the answer is not always right. 
Driving down that storage cost was mostly what I was aiming at, but people
gradually convinced me that slightly more generic implementations might be
useful.  Simon's proposal came along, and it seems to me to be something
like the generic implementation that others already convinced me was needed.

> I think Simon's proposal loses the very feature that makes partitioning
> useful. The DBA doesn't have a "thing" to describe, he has to define what
> parts of the table he's describing for every operation. And if you define a
> whole new object to name these "things" I think you'll end up with something
> that looks a lot like tables.

I don't see how that's the case at all.  In fact, I have the feeling it's
the opposite, so perhaps I've misunderstood something.

A

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: [HACKERS] VACUUM FULL out of memory

2008-01-08 Thread Andrew Sullivan

On Tue, Jan 08, 2008 at 05:53:28PM +, Sam Mason wrote:
> What about a stored procedure in a language that allows you to do
> system(3) calls?

PL/bash?  (I think there is something like this).  But surely the ulimit
before start is much easier!

A

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Load spikes on 8.1.11

2008-07-17 Thread Andrew Sullivan

On Fri, Jul 18, 2008 at 10:05:33AM +0530, Gurjeet Singh wrote:

> I just ran DROP SCHEMA _ CASCADE; and it spiked again, on a
> very low loaded box!!

Ah, well, if slony is involved, then you have possible locking
problems in the database _also_ to contend with, along with the
spinlock problems.  This will for sure cause spikes.

You need to tell us more about what you're doing.  And I bet some of
it belongs on the slony lists.

A
-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Load spikes on 8.1.11

2008-07-17 Thread Andrew Sullivan

On Fri, Jul 18, 2008 at 10:41:36AM +0530, Gurjeet Singh wrote:
> 
> Just started INIT cluster Slonik command and that spiked too.. for more than
> 10 minutes now!!

Are you attempting to do Slony changes (such as install Slony) on an
active database?  I strongly encourage you to read the Slony manual.
Slony, frankly, sucks for this use case.  The manual says as much,
although in more orotund phrases than that.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Andrew Sullivan

On Sat, Jul 19, 2008 at 07:09:46AM +0530, Gurjeet Singh wrote:

> Will try this option, at least in the next schema upgrade or when setting up
> Slony.

As I've already suggested, however, if you try to set up slony on a
loaded database, you're going to see all manner of problems.  Slony
takes some heavy-duty locks when it does its setup work.  It's
designed that you should have an application outage for this sort of
work.  Please see previous discussion on the Slony mailing list.

A
-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Do we really want to migrate plproxy and citext into PG core distribution?

2008-07-21 Thread Andrew Sullivan

On Mon, Jul 21, 2008 at 01:17:39PM -0700, David E. Wheeler wrote:
>  pgFoundry ain't the CPAN, alas.

Maybe that's the problem that really needs solving?

One of the big Postgres features is its extensibility.  I agree that
the extensions can sometimes be hard to find, but surely the answer to
that is not an infinitely large source tarball?

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Load spikes on 8.1.11

2008-07-21 Thread Andrew Sullivan

On Tue, Jul 22, 2008 at 02:41:55AM +0530, Gurjeet Singh wrote:

> I am aware of the heavy locking involved with Slony, which should mean that
> it blocks the application connections; that's be completely acceptable,
> given all the warnings in the Slony docs. But what I am concerned about and
> trying to hunt down is why  backend processes are all consuming up all
> of CPU (!!!) so much so that I am unable to fire up any new process!

Ah, well, then, yes, the spinlock improvements probably will help
you.  But you should disabuse yourself of the idea that 
processes have no cost.  You still have to talk to all those
connections when doing schema changes.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [patch] plproxy v2

2008-07-22 Thread Andrew Sullivan

On Mon, Jul 21, 2008 at 09:32:57PM -0400, Tom Lane wrote:
> "Marko Kreen" <[EMAIL PROTECTED]> writes:

> > 2.  If cluster connection strings do not have 'user=' key,
> > ' user=' || current_username() is appended to it.
> 
> Cool, I missed that.  At minimum the documentation has to explain this
> point and emphasize the security implications.  Is it a good idea
> to allow user= in the cluster strings at all?

I wondered about this myself.  Is there anything at all preventing me
from doing 'user=' for some other user?  If not. . .

> > Also, plroxy does
> > _nothing_ with passwords.  That means the password for remote
> > connection must be in postgres user's .pgpass,
> 
> That seems *exactly* backwards, because putting the password in postgres
> user's .pgpass is as good as disabling password auth altogether.

. . .this means that any user on system1 for which there is at least
one user on system2 with plproxy access automatically also has that
access on system2.  (Plus what Tom noted).

> We regularly get beat up about any aspect of our security apparatus
> that isn't "secure by default".  This definitely isn't, and from
> a PR point of view (if nothing else) that doesn't seem a good idea.

I'm less worried about the PR, and more worried about the truck-sized
hole this opens in any authentication controls.  It seems to me that
it's a fairly serious problem.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Should creating a new base type require superuser status?

2008-07-31 Thread Andrew Sullivan

On Wed, Jul 30, 2008 at 06:07:53PM -0400, Alvaro Herrera wrote:

> I do agree that creating base types should require a superuser though.
> It too seems dangerous just on principle, even if today there's no
> actual hole (that we already know of).

I agree.

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: plan invalidation vs stored procedures

2008-08-19 Thread Andrew Sullivan

On Tue, Aug 19, 2008 at 02:47:13PM -0400, Tom Lane wrote:

> Whether (and how far) to backpatch has always been a best-judgment call
> in the past, and we've gotten along fine with that.  I think having a
> formal policy is just likely to lead to even more complaints:

I completely agree with this.  If you formalise the back-patch policy,
then it will be necessary to invent classifications for bug severity
to determine whether to back patch.  This will inevitably lead to some
sort of false objectivity measure, where bugs get a "severity number"
that actually just means "we have already decided to back-patch".

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: plan invalidation vs stored procedures

2008-08-19 Thread Andrew Sullivan

On Tue, Aug 19, 2008 at 12:42:29PM -0700, Joshua Drake wrote:
> Generally speaking we adhere to the following guideline for patches.
>* Security fixes are applied to all applicable branches.
>* Bugfixes are applied to all applicable branches
>   * Note: A patch that addresses a known limitation is generally
> not backpatched
>* New features are always applied to -HEAD only.
> 
> This is not a policy as much as a legend for developers to consider
> before they submit their patch.

But it's meaningless.  "Bugfixes are applied to all applicable
branches," is either false or trivially true.  It's trivially true if
you interpret "applicable branches" to mean "the ones that get the
patch".  It's false if you mean "bugfix" to mean "every patch that
fixes a bug".  I can think of bugs that we have lived with in older
releases because fixing them was too risky or because the bug was so
tiny or unusual as to make the risk greater than the reward.

A formal policy that's any more detailed than what's in the FAQ today
is a solution in search of a problem.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: plan invalidation vs stored procedures

2008-08-20 Thread Andrew Sullivan

On Wed, Aug 20, 2008 at 03:12:43PM +0300, Asko Oja wrote:

> - If there is nothing that can be done in 8.3 at least warning should be
> added into the documentation.  It will be just one more don't in our long
> list don'ts for our developers.

I am in favour of that change in the 8.3 branch.

> 
> ERROR:  cache lookup failed for function.
> - Could the plan be marked as invalid so it would fail only once so the next
> call to the function would get replanned and work again. At least it would
> be better than losing parts of application for indeterminate time.

That seems to me to be a behaviour change, not a bug fix.  I agree
that the current behaviour is pretty annoying.  That is not the same
thing as "a bug" except in the loosest sense.  The system works as
specified, and therefore it's not a bug.  If the specification is
wrong, you need a new specification; that's a "bug fix" that is
usually pronounced "major release".

> - Could some less dangerous looking mechanism be added to 8.3 that wouldn't
> make users not used to PostgreSQL limitations gasp for air when they see the
> workarounds :)

I think it a very bad idea even to suggest that we start undertaking
things like adding mechanisms to minor releases, even with smileys at
the end of the sentence.  I appreciate (possibly more than many
hackers) the limitations that are imposed on users by some of the
decisions historically taken by developers in some of the previous
major releases.  But I very strongly agree with Dimitri: the
super-conservative approach to maintenance releases that this project
takes is a really big benefit to users, and is ultra important in
"mission critical" environments.  Otherwise, it becomes practically
impossible to get minor releases into production.  If you have to
worry about the possibility of major changes between minor versions,
you will have to treat every release as a major release.

I don't think we have sufficient commercial integration support yet
that we can follow the lead of the Linux kernel, where the system
vendor has the effective obligation to make sure your kernel actually
works.  

In addition, if someone wants to develop back-patches for 8.3 that
give it new functionality otherwise planned for 8.4, I see nothing
wrong with them doing so.  That's the advantage offered by having the
source.  But the idea that the new functionality should be patched
back by the project because one is impatient is not on.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: plan invalidation vs stored procedures

2008-08-20 Thread Andrew Sullivan

On Wed, Aug 20, 2008 at 05:03:19PM +0300, Asko Oja wrote:
> 
> Lets get on with 8.4

Oh, I shoulda mentioned that, too -- I completely support doing this
work for 8.4.  (I can think of more than one case where this feature
alone would be worth the upgrade.)

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Cleanup of GUC units code

2008-09-03 Thread Andrew Sullivan

On Wed, Sep 03, 2008 at 06:37:29PM +0300, Hannu Krosing wrote:
> On Wed, 2008-09-03 at 08:20 -0700, Joshua D. Drake wrote:
> > There is no arguing that MB != Mb; 
> 
> The whole point of this discussion is, that mostly people expect 
> MB == Mb = mb == mB, especially if they see weird constructs like kB
> used (k for Kilo, or actually Kibi).

Note that in the networked computer world, MB and Mb are
importantly different.  The latter is relevant for the speed of your
network interface, for instance.  People often get this wrong when
speaking carelessly, but a mistake of this sort is a serious one,
given the orders of magnitude difference.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Cleanup of GUC units code

2008-09-03 Thread Andrew Sullivan

On Wed, Sep 03, 2008 at 01:48:18PM -0400, Alvaro Herrera wrote:

> I think the energy wasted in this discussion would be better spent in
> working a the check-the-config-file feature.  That would equally solve
> this problem, as well as many others.

This seems like a good idea to me.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Cleanup of GUC units code

2008-09-04 Thread Andrew Sullivan

On Thu, Sep 04, 2008 at 01:26:44AM +0300, Hannu Krosing wrote:

> So Andrews opinion was that Mb (meaning Mbit) is different from MB (for
> megabyte) and that if someone thinks that we define shared buffers in
> megabits can get confused and order wrong kind of network card ?

I know it's fun to point and laugh instead of giving an argument, but
the above is not what I said.  What I said is that there is a
technical difference between at least some of these units, and one
that is relevant in some contexts where we have good reason to believe
Postgres is used.  So it seems to me that there is at least a _prima
facie_ reason in favour of making case-based decisions.  Your argument
against that appears to be, "Well, people can be sloppy."

Alvaro's suggestion seems to me to be a better one.  It is customary,
in servers with large complicated configuration systems, for the
server to come with a tool that validates the configuration file
before you try to load it.  Postfix does this; apache does it; so does
BIND.  Heck, even NSD (which is way less configurable than BIND) does
this.  Offering such a tool provides considerable more benefit than
the questionable one of allowing people to type whatever they want
into the configuration file and suppose that the server will by magic
know what they meant.

> I can understand Alvaros stance more readily - if we have irrational
> constraints on what can go into conf file, and people wont listen to
> reason

Extending your current reasoning, it's irrational that all the names
of the parameters have to be spelled correctly.  Why can't we just
accept log_statement_duration_min?  It's _obvious_ that it's the same
thing as log_min_duration_statement!  It's silly to expect that
harried administrators have to spell these options correctly.  Why
can't we parse all the file, separating each label by "_".  Then if
any arrangements of those labels matches a "real" configuration
parameter, select that one as the thing to match and proceed from
there?

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Cleanup of GUC units code

2008-09-04 Thread Andrew Sullivan

On Thu, Sep 04, 2008 at 07:01:18AM -0700, Steve Atkins wrote:
> Settings in postgresql.conf are currently case-insensitive. Except
> for the units.

And, of course, filenames when you are using a case-sensitive
filesystem.  Because these are things that are defined by some
convention other than the ones the PGDG made up.  Since units fall
into that category, it seems to me that we're stuck with using
external conventions.

> one right now. If the answer to that is something along the lines
> of we don't support megaabits for shared_buffers, and never will because
> nobody in their right mind would ever intend to use megabits
> to set their shared buffer size... that's a useful datapoint when
> it comes to designing for usability.

And you are going to establish this worldwide convention on what
someone in right mind would do how, exactly?  For instance, I think
nobody in right mind would use "KB" to mean "kilobytes".  I suppose
you could get a random sample of all current Postgres users to decide
what makes sense, but then you'd have the problem of knowing whether
you had a random sample, since the population isn't obviously
identifiable.  Or, we could just stick with the convention that we
already have, and write a tool that captures this an other issues.
Maybe even one that could later form the basis for an automatic tuning
advisor, as well.

The problem with appeals to common sense always turns out to be that
different people's common sense leads them to different conclusions.
(We had a devastating government in Ontario some years ago that claimed
to be doing things that were just common sense; the Province is still
cleaning up the mess.)  

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Do we really need a 7.4.22 release now?

2008-09-18 Thread Andrew Sullivan

On Thu, Sep 18, 2008 at 03:25:10PM -, Greg Sabino Mullane wrote:
> Frankly, the whole pg_dump mess is what keeps many people on older versions,
> somtimes including 7.4.

This isn't my experience.  The reasons people stay on older releases
are manifold.

A
-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches

2008-09-26 Thread Andrew Sullivan

On Thu, Sep 25, 2008 at 08:57:46PM -0400, Tom Lane wrote:
> Another point is that the proposed behavior leaks quite a lot of
> information, since it will fail operations on the basis of tuples that
> supposedly aren't visible to the invoking user.  While I admit that it's
> hard to see an alternative if we're to preserve FK integrity, I have to
> worry that this definition isn't going to satisfy the tin-foil-hat
> brigade that are supposed to be the main users of SEPostgres.  If the
> goal is "you don't know the row is there", this doesn't seem to meet it.

The above point, and other similar ones in every discussion of the
proposed functionality, makes me think once again either that the
requirements for this feature aren't understood by everyone, or else
that they're not actually explicit enough.  I have a feeling it's the
latter.  Certainly, I've not yet read a complete security analysis of
a data system security plan that outlines why the proposed model is
correct.

What I think is really happening with this development is that the
SE-Linux understanding of "security enhancement" has been taken as the
correct analysis for how one secures an information system.  That deep
assumption appears to me to be informing much of the development of
SE-PostgreSQL.  In particular, that deep assumption includes an
assumption that consistency of access control trumps all.  The
Postgres developers who are questioning the SE approach are (I think)
coming at this from the point of view of data systems developers,
where consistency of the data set trumps all.

I suspect that the tension between these approaches will not be
reconciled without a fairly complete outline of possible security
models for data systems, their relationship to what the OS security
people have decided is the right thing to do, and the trade-offs
necessary to make different priorities work.  Some of the trade offs
may include things like "violate traditional understanding of data set
consistency" and "possible disclosure of existence of datum".  I think
this will be a lot of work, and I'm not volunteering to do it.  I
nevertheless think that without it, the SE-PostgreSQL features will
continue to be a very awkward fit.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches

2008-09-26 Thread Andrew Sullivan

On Thu, Sep 25, 2008 at 10:32:24PM -0400, Tom Lane wrote:
> I can't escape the lurking suspicion that some bright folk inside the
> NSA have spent years thinking about this and have come up with some
> reasonably self-consistent definition of row hiding in a SQL database.
> But have they published it where we can find it?

I have a couple contacts in the security world who might be able to
help with references.  I'm asking them now.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches

2008-09-26 Thread Andrew Sullivan

Dear colleagues,

I said earlier I'd ask around about some of the literature on security
controls vs. databse accessibility and side channels.  I did, and I
heard back.

One person told me that this conference often has things on this
topic:

http://www.ieee-security.org/TC/SP-Index.html

>From my brief glimpse of the TOCs from the proceedings, as well as
some spelunking in the ACM guide, it seems to me that some people have
already worked out what ought to happen in many of these cases, and
all we need to do is write down what we think ought to happen for the
various use cases.  I note in particular that an awful lot of work
seems to be coming out of the health care sector in this area.  That
strikes me as at least as good a guide as national security concerns,
and anything that one might want to do probably ought to be able to
cope with at least those two caricatures of use cases.

I also found a 2007 doctoral thesis by Azhar Rauf, Colorado Technical
University, _A tradeoff analysis between data accessibility and
inference control for row, column, and cell level security in
relational databases_.  The title and abstract make me think it might
be worth looking at.

Hope this is helpful,

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches

2008-09-26 Thread Andrew Sullivan

On Fri, Sep 26, 2008 at 06:15:46PM -0400, Bruce Momjian wrote:

> I am confused how knowing that a sequence number used for a primary key
> exists or doesn't exist is leaking _meaningful_ information. 

This sort of side-channel intelligence is _exactly_ how certain kinds
of security exploits work: I'm not supposed to know that _x_ exists;
but by knowing key-of-_x_, I learn that _x_ exists.  From existence, I
can infer something, and from that inference I construct an attack
that was supposed to be forestalled by the access controls.

I am by no means a security expert, but I know enough about the area
to know that it is very hard to get right, and that seemingly
insignificant flaws in design turn out to be major vulnerabilities.
To speak about something I do know about, when DNS was designed,
nobody could have imagined that the widespread availability of
recursion would turn out to be a flaw.  Today, it turns out that open
recursion can be used in an attack that magnifies the attacker's
outbound traffic by many orders of magnitude.  This sort of surprise
side effect is why I am so anxious that something advertised as a
security system fit really well with the proposed use cases.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches

2008-10-09 Thread Andrew Sullivan

Hi,

On Sat, Sep 27, 2008 at 12:18:45PM +0900, KaiGai Kohei wrote:
> 
> As I repeated several times, SE-PostgreSQL applies the seuciry policy
> of SELinux to achieve consistency in access controls. 

I get this.  The problem as I see it is that the SELinux security
policy is designed around a very different set of assumptions about
concurrency and consistency than any database system has to provide.
My admittedly hurried glance at some abstracts in the literature
suggests to me that others have looked at the paradoxes that come out
of this kind of security policy consistency when you apply them to
database systems.  I think that clearly stating which of the
trade-offs are the ones to be accepted is all that's needed.

> In my vision, Apache assigns its contents handler an individual
> security context based on HTTP authentication, source IP address
> and so on just before web application invoked.
> Because web applications works with individual least privilege set,
> its accesses on filesystem are restricted by the security policy.
> In a similar way, its accesses on databases are also restricted
> via SE-PostgreSQL by same policy, by same privilege set.

I want to focus on this description, because you appear to be limiting
the problem scope tremendously here.  We've moved from "general
security policy for database system" to "security policy for database
system as part of a web-application stack".  I suggest that the range
of practically available behaviours of the DBMS working as part of a
web-application stack is a subset of the practically available
behaviours of the DBMS overall.  This could be the source of some of
the difficulty.  For instance, it seems that some scenarios people are
worried about are really the sort of scenario relevant to online users
of Postgres, rather than to people seeing results filtered through a
web application.  If we just don't care about the online, interactive
users, then maybe some of those concerns go away.  (I'll be honest,
though, that I have a hard time getting excited about a security
system that doesn't really work as advertised outside of a narrow
context.)

All that said, I think there are definite practical uses for the work
you're undertaking, and I want to emphasise that I think the general
goal is probably a good one.  I am suggesting that some additional
work clarifying the specific goals of the work is all that's really
needed. 

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches

2008-10-09 Thread Andrew Sullivan

On Wed, Oct 08, 2008 at 11:36:19AM +0900, KaiGai Kohei wrote:
> Yes, unfortunatelly.
> No one replied to my proposed design:
>   http://marc.info/?l=pgsql-hackers&m=12470930544&w=2

FWIW, I didn't know what to say about that proposal because I still
don't know what problems any of this is trying to solve.  Perhaps I'm
just obtuse (and people should feel free to ignore me, since I'm not
volunteering any code anyway); but this entire discussion seems to me
to lack clear statements of what the goals of the effort are.  I know,
"consistent access control"; that still doesn't tell me enough, I
think.

I haven't given it a great deal of thought, but it seems to me that
there are at least a few different goals people have in mind.  These
are the ones I've thought of, but I don't pretend this is an
exhaustive list.  It certainly isn't based on a serious review of the
literature, which seems to be plentiful:

1.  Single-policy access control for all objects available in a
system.  I'm guessing that this is the point of SE-PostgreSQL.  It
also appears to me that the SE-Linux approach may not have completely
defined how these things work in a database context, but that there
has been academic work in this area.  I think a clear description of
what the costs and benefits of this approach are would go a long way
to helping evaluation.  For instance, what sort of side channels does
this expose?  Are there database design techniques that help?  &c.

2.  Granular row-level access control based on database-level ACLs.  I
have formed the impression that there is some support of this sort
needed anyway to implement (1), but maybe not.  I guess this is the
proposal you mention.  What is the goal here?  Simply
administrator-defined controls on data access inside the database?  Is
there a model we're following, or are we making one up?  If the
latter, are we sure we're solving all the use cases?  What are they?
What are the problems here: for instance, this has exactly the same
sorts of foreign-key issues that swamped the discussion of the
SE-PostgreSQL patches, and I don't see that the new proposal addresses
any of that.

3.  Column-level access controls.  This is ruled out of the current
discussion in the proposal mentioned above.  Surely you need
column-level access controls to make (1) work, though?  If not, then
I'm even more confused than I thought.

4.  Metadata-level access controls.  None of the proposals so far seem
to provide a complete set of access controls for the system details --
schemas, databases, &c.  Such controls are often requested, so I
wonder about that.

Please understand that I'm not trying to be obstructive, but at the
moment I don't understand what the proposals aim to do.  I suggest
that, without some clear statements of what things are trying to do,
and what the intended limitations are, it will always be impossible
for anyone to review the implementation of such a big feature and say
whether it does what it intends to do.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches

2008-10-10 Thread Andrew Sullivan

On Fri, Oct 10, 2008 at 01:09:48PM +0900, KaiGai Kohei wrote:

>> 4.  Metadata-level access controls.  None of the proposals so far seem
>> to provide a complete set of access controls for the system details --
>> schemas, databases, &c.  Such controls are often requested, so I
>> wonder about that.
>
> We are already have GRANT/REVOKE on databases, schemaes and so on
> as a core facility. This optional facility does not need to provide
> it again.

I think I wasn't clear enough.  One of the requests we hear all the
time -- indeed, somone just posted an RFQ looking for coders for it --
is a request to prevent users who haven't any permission on a database
to learn anything about it at all.  In a shared hosting environment,
for instance, the idea is that two customers can have databases in the
same back end, and not be able to learn anything about one another
_including that they are there_.  I am pretty sure I first heard
someone wishing for something like that when was using PostgreSQL
6.something, so it's a long-standing irritant.

Anyway, I'm not trying to suggest, "You should do this."  I'm just
trying to point out that what are the obvious areas of access control
from one point of view are not even interesting from another.  This is
why I think a fairly complete analysis is needed (and why I think it
hasn't been done yet).

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches

2008-10-10 Thread Andrew Sullivan

On Fri, Oct 10, 2008 at 01:44:49PM +0900, KaiGai Kohei wrote:
> Andrew Sullivan wrote:
>> I want to focus on this description, because you appear to be limiting
>> the problem scope tremendously here.  We've moved from "general
>> security policy for database system" to "security policy for database
>> system as part of a web-application stack".
>
> The "general security policy for database system" is an incorrect term.
> SELinux does not cover database system only. It covers operating sytem
> and application managing objects (like database object, X window, ...).
> Thus, it should be talked as "general security policy for operating
> system, database system and so on".

Ok, then let's use the broader case, which is "general security policy
for entire computing system including a RDBM subsystem" (call this
"GSPECS+DB", say).  This shows up even more the issue that considering
primarily the application stack does not actually cover all the cases.

I'm not suggesting, even a little bit, that securing an application
stack as you propose is a waste of time.  It could be, actually, that
this more modest goal is the more appropriate one, and that
SE-PostgreSQL would be a killer feature in this space (because it
would, if it worked, solve a lot of problems that other systems have,
as you have pointed out).  But it is not GSPECS+DB, because of all the
corner case problems whose behaviour still needs working out.  Since I
don't have to do any of the work to maintain the system in future in
the face of the proposed new code, I can be indifferent as to whether
the achievement of the goal is worth the cost.  But plainly, others
who need to look after the code will want to know what the exact goal
is before committing themselves to future maintenance.

A

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSL cleanups/hostname verification

2008-10-21 Thread Andrew Sullivan

On Tue, Oct 21, 2008 at 08:47:35AM -0400, Tom Lane wrote:

> Um, IIRC what it's checking there is the server's key signature, which
> has nada to do with certificates.

That depends on whether you used an X.509 certificate to authenticate
the original signature.  Just about nobody does, but AIUI, there's a
way to do so.  Anyway, in the strict sense you're right, but the
comparison is wrong anyway.  SSH doesn't pretend to be authenticating
over SSL.  It's authenticating using the SSH protocol, which has its
own RFCs describing it.

If I understand the description of the current behaviour, I have to
agree with those who say the current behaviour is almost worse than
nothing.  In the presence of DNS forgery (and I'll bet a pretty good
lunch most people aren't using DNSSEC), it's not hard to send a client
to the wrong server.  If the ssl-using client will blithely proceed if
it can't authenticate the server, it's pretty hard to see in what
sense this is a conforming use of anything I know as SSL.  SSL is
supposed to provide both encryption and authentication (the
self-signed certificate nonsense is actually breakage that everyone in
the protocol community wails about whenever given the opportunity,
because of the results in user behaviour.  It was a compromise that
people made back in the period when Verisign had a lock on the market
and would charge you an arm and a leg for a cert). 

A

[Actually, to be pedantic, it might be better to call the
authentication method TLS, so as not to conflate it with the
Netscape-defined SSL.  But this is maybe straying into a different
topic.]

-- 
Andrew Sullivan
[EMAIL PROTECTED]
+1 503 667 4564 x104
http://www.commandprompt.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

1 2 3 4 >

1 - 100 of 394 matches

Mail list logo