> However any time you have a primary key with meaning is probably > a mistake in database design. You are best trying to avoid this > at all costs unless you are constrained by a legacy system in > some way.
This is common advice, but the topic is controversial. Even in non-legacy databases, having natural PKs can have advantages: - A synthetic key often means an additional join since an internal ID number is incomprehensible to a human so you need to get the database record that contains the printable text for the ID. This affects a large percentage of queries that return displayable data, and almost all ad-hoc queries that your team leader will use to check database sanity (this means that your team leader will usually shoot down the idea of using synthetic PKs exclusively, with good reasons, and that's that). - Even with synthetic keys, in a hierarchy of parent-child relationships, it is useful to construct the PK of a child from the ID of the parent plus a unique synthetic detail id. That way, if you have a grandchild record, you can immediately access its grandparent and don't need to load the parent record. This isn't a very common scenario and mostly restricted to ad-hoc queries, but it does occasionally help. - A synthetic key is redundant if the data does contain a natural key, so we're violating a normalization property and (more importantly) spreading out the data over more disk sectors, which has a performance impact (this starts to get noticeable at six-digit record count and can become paramount at millions of records; below that, performance is usually a non-issue anyway and you need to question yourself what you actually need a database for when a text editor search will work just as well - okay, I'm exaggerating a bit :-) ). There is one restriction: You never ever change a primary key. It's technically possible, but in practice, you need to find each and every place where that key value was ever stored: not just associated tables but also any database dumps on backup media, or if that PK value ever went to somebody else's computer (say, via a webservice), any copy of the value on their disks, which is far more effort that it's worth even if you manage to control all the places where the value has gone. So, if you use natural keys, if there is *any* chance that *anybody* will *ever* want to change a PK candidate field, don't put that field in the PK and use a synthetic PK instead. The distinction between natural and synthetic keys is a bit more blurry than most people realize, too. Most "natural" keys that I see in practice have started life as synthetic key on somebody else's system. The EAN, ISSN, and ISBN, for example, come as data to most, but these are a world-wide unique values specifically designed to never change, so each of them is perfectly fine as a PK if you are dealing with data that always comes with such a number. Just be 150% sure that all the articles you'll ever deal with are guaranteed to have such a number (not true if you're doing business with entities that do not know or care about such numbers, but can be true in B2B scenarios). The symbols of chemical elements, despite being standardized, have changed in the past, and it is conceivable that we may live to see one or two changes in the future. Ordinal numbers of elements, by definition, never change, so they are an acceptable PK in a table of element properties. If you are talking to another system and that system provides you with a unique key for something, you can use that UK as a PK (or as part of a PK), too. Just talk to the guys maintaining the system and make sure that they run a strict never-change-a-PK policy. (This is essentially the same scenario as the one with EAN/ISSN/ISBN, except that the guys maintaining these PKs have already publicly committed to never changing these numbers.) Reporting straight from the trenches, Jo