Thanks for adding this Bruce!

Is anyone going to be working on this immediately? If so, I'd be glad to work with someone. Unfortunately, I don't have the time to devote to taking something this big on, but I think it would be a really great thing to have. Just let me know [EMAIL PROTECTED] OR [EMAIL PROTECTED] Thanks!

Bruce Momjian wrote:

Added to TODO:

* Add the features of packages
       o  Make private objects accessable only to objects in the same schema
       o  Allow current_schema.objname to access current schema objects
       o  Add session variables
       o  Allow nested schemas


---------------------------------------------------------------------------

Bruce Momjian wrote:
OK, so it seems we need:

        o  make private objects accessable only to objects
           in the same schema
o Allow current_schema.objname to access current schema objects
        o  session variables
        o  nested schemas?

---------------------------------------------------------------------------

Dave Held wrote:
-----Original Message-----
From: Bruce Momjian [mailto:[EMAIL PROTECTED]
Sent: Tuesday, May 10, 2005 8:43 AM
To: Thomas Hallgren
Cc: Tom Lane; [EMAIL PROTECTED]; pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Oracle Style packages on postgres

[...]
I suppose.  I think we should focus on the use cases for Oracle
packages, rather than the specific functionality it provides. What things do people need PostgreSQL to do that it already
doesn't do?
Is that really the best way to go about things?  Already RDBMSes
are patchwork quilts of functionality.  Is merely adding another
patch the most elegant way to evolve the database?  The problem is
that Oracle et al are trying to be ORDBMSes and aren't exactly sure
what the best way to go is. Instead of trying to formulate a rational plan for what an ORDBMS should even look like, they simply
look at what would work with their existing infrastructure and tack
on features.  Then Postgres plays the copycat game.  Instead of
trying to play catch-up with Oracle, why not beat them at their own
game?

What packages provide is encapsulation.  Hiding the data from the
user and forcing him/her to use the public interface (methods).
That is an important and admirable OO feature.  Some people think
that using the DB's security model can achieve the same thing.  It
can't, exactly, but there's an important lesson to be learned from
the suggestion.  The problem is that OOP is a *programming* paradigm,
and a database is not a *programming language*.  In a programming
language, there really is no such thing as "security". There is only "visibility" and "accessibility". Private methods in an OOP
language do not provide *security*; they only limit *accessibility*.
Like so many other differences between the relational model and the
OOP model, there is an impedance mismatch here.  However, there is
also opportunity.

In an OOPL, you can say: "Users can call this method from here, but
not from there."  What you *can't* say is: "User X can call this
method, but User Y cannot."  As you can see, these are orthogonal
concepts.  You could call the first "accessibility by location" and
the second "accessibility by authentication".  An ORDBMS should
support both.  "Private" does not respect your identity, only your
calling location.  An ACL does not respect your calling scope, only
your identity.  A system that has both is clearly more flexible than
one that only has one or the other.

Now what you need to keep in mind is that each visibility model serves a different purpose. The purpose of a security model is to limit *who* can see/touch certain data because the data has intrinsic value. The purpose of an accessibility model is to limit *where* and *how* data can be seen/touched in order to preserve *program invariants*. So if you have an object (or tuple!) that records the start and stop time of some process, it is probably a logical invariant that the stop time is greater than or equal to the start time. For this reason, in a PL, you would encapsulate these fields (attributes) and only provide controlled access to update them that checks and preserves the invariant, *no matter who you are*. You don't want a superuser violating this invariant any more than Sue User.

Now you might object that constraints allow you to preserve invariants as well, and indeed they do. But constraints do not
respect calling scope.  Suppose there is a process that needs to
update the timestamps in a way that temporarily breaks the invariant
but restores it afterwards.  The only way to effect this in a
constraint environment is to drop the constraint, perform the
operation, and restore it.  However, dropping a constraint is not an
ideal solution because there may be other unprivileged processes operating on the relation that still need the constraint to be enforced. There is no way to say: "There is a priviledged class of methods that is allowed to violate this constraint because they are trusted to restore it upon completion." Note that this is different
from saying "There is a priviledged class of users that is allowed
to violate this constraint."  If you try to do something like give
read-only access to everybody and only write access to one user and
define that user to be the owner of the methods that update the data,
you have to follow the convention that that user only operates through the defined interface, and doesn't hack the data directly.
That's because user-level accessibility is not the same as scope-
level accessibility.  Whereas, if you define something like a
package, and say: "Package X is allowed full and complete access
to relation Y", and stick the interface methods in X, you still have
all the user-level security you want while preserving the invariants
in the most elegant way.

So you can think of a package as a scope in a programming language.
It's like a user, but it is not a user.  A user has privileges that
cut across scopes.  Now, whether packages should be different from
schemas is a whole different ballgame.  The purpose of a schema in
Postgres is not entirely clear to me.  There's lots of different ways
to use schemas, and there is no obvious best way to use them. In order to implement the accessibility features of packages, schemas
would have to be changed considerably.  Probably a lot of users would
be unhappy if schemas were changed in that way.  My guess is that
this would not be a good idea.

I think we can get some guidance from PLs.  C++ is what you call a
"multi-paradigm language".  You can do everything from assembly to
metaprogramming in C++.  As such, it is very loose and open in some
respects.  C++ has two kinds of scopes: it has classes and namespaces.
Members of a class are encapsulated and support data hiding.  Members
of a namespace are only loosely grouped and do not support data hiding
explicitly.  Namespaces exist primarily to avoid name collisions.

Java, on the other hand, decided that for OOP purity, everything must
be a class.  That would be like making schemas into packages and
imposing accessibility rules on them.  At the end of the day, I think
many PL design experts agree that making everything a class is not
necessarily the best way to go.

So schemas can be like C++ namespaces - they provide a means to loosely group related objects and help avoid name collisions. So the package could be like a class - they provide OOP-like encapsulation via accessibility rules. However, that doesn't mean
that nested schemas wouldn't also be a good thing.  In C++, nested
namespaces are extremely useful when one layer of scoping does not
sufficiently partition the namespace to avoid frequent name collisions. I think the same is true of Postgres. I certainly would like to be able to use nested schema names in several contexts. Instead, I have to make a choice between making different schemas, or making different name prefixes. I wouldn't even mind if nested
schemas were only allowed to contain schemas except at the leaves of
the tree.  Another feature that is very useful is the "using clause".
Combined with nested namespaces, this is a very powerful way to give
programmers/dbas control over names.  You can give everything the
most natural name, and just put it in the appropriate namespace,
and use the namespace that is relevant to the given task at hand.

So consider this example:

Tables:
 etl.import.record
 etl.export.record

As you can imagine, I don't really want to make an 'import' and
'export' schema at the top level.  There's several tables in
each schema, but that should illustrate the point.  Then, when
constructing queries, it would be nice to be able to do this:

USING etl.import
      ;
SELECT *
 FROM record
 JOIN header ON ...
 JOIN file ON ...
      ;

The effect of a USING clause would be to import the schema names
into the public namespace for the duration of the transaction.  If
that leads to ambiguous names, then the parser/planner should emit an error.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

--
 Bruce Momjian                        |  http://candle.pha.pa.us
 pgman@candle.pha.pa.us               |  (610) 359-1001
 +  If your life is a hard drive,     |  13 Roberts Road
 +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])




---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Reply via email to