NOTE - after writing all this, I did think of a possible solution, but I'mUmmmm... yeah, that would be a great way to do that *if* the "inheritance" emulation in sql worked that way ...
not sure if PG can handle it. If I made a table called "object" with one
column, the object_id, and then had EVERY table inherit from this table.
Then, I could have my constraints set up against this master table.
Unfortunately, it does not :-(
That's exactly the reason I called it 'half-baked' in one of the earlier messages... When you inherit table B from table A, B will have all the
columns A has, but setting up an FK on table C against A won't work, because the FK will check for the key to be present in A *itself*, not in A or any of its inherited children, as one would expect...
What you *could* do, though, if you really wanted is to kinda emulate that inheritance on your own, with something like:
create table A ( id serial primary key, type text not null ); create unique index a_idx on A (id,type);
create table B1
(
id int primary key,
type text not null default 'b';
stuff text
foreign key (id,type) references A(id,type) on delete cascade on update cascade deferrable initially deferred
);
create table B2
(
id int primary key,
type text not null default 'b1',
stuff text,
foreign key (id,type) references A(id,type) on delete cascade on update cascade deferrable initially deferred
);
create table C
(
id int not null references A on delete cascade on update cascade initially deferred,
note text
);
... now you can insert notes for either B1 or B2 into C
This kinda works, but just seems like too much trouble to go through - it would be nice if the 'inheritance' support could do something like that for you automatically, but, since it doesn't, I'd rather stick with the old good 'plain sql' solution - get rid of A, merge B1 and B2 together (just B), and make C reference B.
Then you need to normalize your schema first - extract those "very similar" parts, and put them into the single table, and make your notes table reference that one, then create specialized table(s) on the side, that will contain those columns that are different between the objects, and make them reference your 'master' table too.
If you can combine your queries with a union, your table layouts must be
very similar if not identical.
Why not put everything into the same table then, and just have an FK
between that table and the notes?
No, there are _parts_ that are very similar. I don't know where this whole "table-combining" kick came from, but I've usually found that it ends in a disaster.
Well... That's the 'GUI delete'...
What do you mean by soft-delete?
Leaving orphaned notes behind? Bad idea...
I have a boolean flag that says "active". I don't ever actually purge
data. There are times when it is useful to come back in and look at
what's been "deleted". From the user-interrface standpoint it has been
deleted, but we can still go back in and retrieve records after they are
gone.
Sooner or later you will want to do the 'real' delete - either to remove an object that just was created by mistake, and should not be there at all, or simply to clean up your database, and remove the stuff that has been sitting there for years, being 'inactive'
I am sure, they will... *after* they screw up the database, and begin wonderring what's wrong with it :-)
How can you be sure? What if you get hit by a bus, and another admin,
who is not as knowledgeable as you are goes to the db, and runs an
insert with plain sql?
They should read the documentation.
Not really... If your database schema is thoughtfully designed, one has to be *really* smart to be able to screw something up.If they want to run an insert with SQL, they should at least be smart about it :) If someone's too stupid to read documentation, they are going to screw up the database no matter what.
No, it is not going to save *you*, but it *is* going to save your *data*Also, even if it was indeed only possible because of a broken code, you
are not saying that your code is bug-free, are you?
If it's not bug-free, having a good database schema isn't going to save me.
No, they won't, unless you actually use them :-)
Even, if it was, postgres is not, and your filesystem and OS are not
either. If the database crashes in the middle of your insert, you'll end
up having inconsistent data.
Transactions will handle that one.
Easy - merge your tables together the way I described above. Your ease of programming will not be affected - your obj_html() function will still work, and won't even require any changes. *Moreover* - if you ever need to add more object types (or modify the existing ones), you will be able to do that with much less effort, then you would need to invest now, with your current schema (in most of the cases, you'll be able to do those kinds of enhancements without even touching your schema *at all*).
If I understand your statement ('what would the harm be') correctly, and
you just don't care about your data consistency, then, I guess, you are
rigfht - you don't need any constraints... but, in that case, I don't
You are missing the point. There are many things that must be balanced:
* Ease of programming / speed of development
* Correctness of code
* Consistency of data
I can get much further with my system on all three points than I can with yours. With mine, I have a single, tested function that I can use anywhere. This hits off the first two. While this prevents me from having a database-checked #3, it still gives me consistent data because we don't do hard-deletes and we have tested and verified #2. If you can show me how to get te ease-of-programming and correctness with your approach, I'd be happy to use it.
No, it does not show that they are incomplete, it shows that you are not using them the right way :-)
I do care about data consistency and concurrent access. And my data is consistent. I use many of the RDBMS features of postgres such as views / subselects, transactions, triggers, etc. The fact that I have instances where they fall short and I have to do it in the application just shows that our tools are incomplete, not that I don't care about data
Not *one* table. I never advocated that. It is perfectly normal to split your data into different tables *vertically* (i.e. things that do not have any intersection between their data, should go into different tables), but it very rarely (if at all) makes any sense to split it *horizontally* (so that identical columns sit in different tables, just because your application interprets them differently) - the 'early' indication of the problems caused by the latter approach is the temptation to create 'multitable references' - this can always easily be avoided by eliminating those 'horizontal divisions', and doing so will necessarily (and immediately) benefit *all three* of your "balancing points", that you mentioned earlier.consistency. In fact, I have trouble thinking how anyone could have sensical data with everything joined together into one uber-table, which you seem to be advocating.
We ahve a standard creation process. If someone is an idiot, that can'tNot 'helped' entirely, but the amount of damage *can* and *should* be limited.
be helped.
It is an unfortunate reality of life - not all people are smart (even, some of the ones with access to the database are not).
It is definitely a bad programming practice to assume otherwise.
What if someone pulls out several drives from a running RAID?The first oen and the last one are hardware related. There are ways to deal with those too (a good start will be putting a lock on the server room for example), but they are beyond the current topic.
What if someone inserts bad records (even if they are inconsistent)? What
if someone steals the server?
As for inserting bad records - that's *exactly* what properly designed constraints are supposed to prevent.
If you wind up with problems, you have to fix them. The fact thatSure. There should not be copies. Code duplication is a software-world analog of poorly designed schema problem in the database world.
problems are possible does not make them likely. There is much more
likelihood of a programmer screwing up writing their 14th copy of
note_html and the associated table structure than there is that someone
pulls a number out of their butt.
Both are bad, and should be avoided. I never suggested that you eliminate the latter at the expense of the former.
Your note_html() looks great to me, and I agree, that it should be kept that way... it's the database side that look s problematic.
Your programmers must be really smart :-)Where would they get the number if not from the sequence? "Oooh, I know, I'll insert 10 as the primary key, jsut for kicks! I'll even hardcode that into the app!" Not a very likely scenario.
Are you saying that you have never seen a person writing a piece of sql like:
insert into mytable (id, data) select max(id) + 1 from mytable, 'mydata'
???
If so, you must be really lucky :-)
Exactly. But 'copy from ' does *not* - so, after you have loaded, your sequnce next_val () will return 1.
What if you load your database from a backup and forget to reinit the
sequence?
If it's being loaded from backup, the schema reinits the sequence on
schema-load.
And that's what you should do - similar (common) parts go to the same table, to which your notes are linked, the differences go to other tables, and get linked to your master table the same way.
If that was the case, you would not be able to combine them with a
union, as you said you do...
No, I said that there were certain instances that this happens. Not that it is the norm. In the cases where I do combine them with a union, I am only combining the similar parts, not the whole shebang.
You can then create a bunch of views to make it *look* exactly the same way as it is now to the programmer, so that your application code will not be affected at all, but your schema will be nicely normalized, and you'll be able to set up constraints on it properly.
Maybe :-)
Splitting the notes table would be pointless. Why do it?Because that would make it possible to use the constraints.
Maybe the constraint system should be extended to allow it to check
across multiple tables.
That was my last suggestion in the original message - you can always write a trigger to do that...
It would be the worst of the three other possibilities we've considered, but still better, then what you have now.
The OID type/column is ideal for this.No, it isn't. Not all tables have oids. The ones that do, do not guarantee, that they will be unique.
Even if you do that by hand (create unique index on table(oid)), there is still no way to guarantee their uniqueness across entire database.
Sadly, itI don't know what you are talking about :-)
is only 32 bits, and they are not storing an OID/table lookup like I think
they should. That would enable really powerful database applications that
are way too much work to make today.
'too much work' ... come on.
If you think of a solution, that is too much work, it does not mean, that it is the *only* possibility - just think again :-)
What do you mean by "record merges"?With that kind of thing, you could even do record merges with automatic database support.
Any meaning of that phrase I can imagine can be easily done with the currently supported database features... so, you must mean something different by that, I assume...
Sure... and all of your payments, sponsors, and whatever else you have are 'objects', arent' they?
Also, if one adopts your earlier point, it can also be argued, that it
is equally 'nuts' to have notes about Payments stored together with
notes about Sponsors.
Those notes have just as much to do with each other as the objects they
annotate. :-)
But they are all "notes" on "objects".
I did not say that your notes have nothing to do with each other - I only said that they have only as much to do with each other as your different kinds of objects do. If putting all the objects (or at least, some parts of them) into the same table doesn't make sense to you, I don't see why putting the notes together should.
And the other way around - if treating notes in a generic way does make sense to you, I don't see any reason why doing the same for the 'objects' should not.
I did - just put the damn objects into the same table :-)
If you insist that Payments must be separate from Sponsors, the same
exact argument should be applied to their respective notes
Why? All of the note's attributes function exactly the same, while none of Payments and Sponsors function the same. Again, you have yet to show how you can make a mechanism that is as easily extensible as mine is.
It is not just "as easily" extensible as yours, but *much easier* extensible rather
Your "fault" is that they *are* available, but you just refuse to use them :-)The fact that database constraints aren't available to support it is a fault of the database, not mine.
I don't know what you are talking about... What table names? What copy if schema????Hmmm... changing table names, having to go in and recreate an exact copy of the schema every time that is added.
You lost me completely.
I *never* suggested anything remotely like that.
Yep. It does. No argument here.
You want that same one line of code, working against the properly
designed and normalized sql schema, that lets you rely on the database
top ensure your data consistency and access efficiency.
Yes, but to set it up requires a full database change.
Frankly, I am not trying to convience you that you should do that database change now (although, if I were you, I would *certainly* do it right away, before I get into any more trouble - changing - what 5 - 10 tables? - doesn't look like *that* much of a deal to me, compared to the benefit you'd by from it).
The discussion was about whether or not your schema was *initially* created in a good way.
Whether or not to change it now is another question - as I said, I certainly would, but this is your decision entirely, I am not going to try to talk you into it :-)
Not one table - see above.Not at all. If you give up your idea about splitting your payments from
your sponsors for example, you won't need to modify your schema *at all*
if you need to add another object type, or another kind of note, or
whatever - all you'd need to do would be to implement the new
application logic in your application, where it belongs, and be done
with it. No need to even touch your schema at all.
I don't see why you want to make my entire database into one giant table. Why bother with tables at all in that case?
And I don't understand your second question - "why bother with tables"... what's the alternative?
That doesn't matter (it indeed won't work 'out of the box' - see the beginning of this email - but for totally different reason)
First of all, inheritance wouldn't work in case of the notes, as it is a one-to-many relationship.
No. They would have a base class of "Object" (or whatever), and the 'notes' would be linked to the Object.
No, it's not. At least with PostgreSQL, you can only have ONE line of
inheritance (at least as far as I am aware - please correct me if I'm
wrong). In that case, all of these would have to have a base class of
"notes".
This would in fact, be a *beatiful* solution... it's a shame really that it doesn't work.
I am wonderring if what postgres does with those inherited FK constraints is specified by the standard, or if it is just an implementation feature, that can be improved...
No, you would be perfectly able to add "features" to base class as much as you want.Of course, if I wanted to add other similar features, I would be unable to. For example, if I wanted to add a "changelog" table, and be able to attach last_modified, creator, etc to certain tables, I could not do so without adding them to all of them.
That's the beauty of the inheritance.
The "one-to-one" features can just be added as columns to the base table (you can actually do that with even the current half-baked inheritance implementation, you can even do that without any inheritance at all - if all of your common features sit in the same table, like I suggest, you can just keep adding columns to that table as you like). The one-to-many ones can go to additional tables, and get linked to the object id the same way your notes do.
... sure. Except if you wanted to add a new feature to *all* of the objects...
The approach I have let's me pick and choose any additional feature I want to add to any record.
Then you would either have to use your "notes hack", and create a separate table with broken links, or you would have to go and modify each and every of your "object" tables separately.
That's another reason for you to extract the common features into a "supertable" - a change like this would then become a breeze.
No, they are not :-)
Sure, if you implement in such way. But not if you give it some thoughtAgain, this would require modifying and testing that trigger every time I want to add a new thing to take notes on.
in advance, and come up with an implementation that would be generic
enough not to care about your application-specific differences between
sponsors and payments :-)
Actually, they are database-specific.
The database doesn't have any notion of payments, sponsors, notes etc. All this stuff is application specific.
Database specific are tables, columns and constraints. Everything else is application logic, that belongs to the application.
Dima
---------------------------(end of broadcast)--------------------------- TIP 7: don't forget to increase your free space map settings