On Thu, Jun 16, 2022 at 8:28 PM Bryn Llewellyn <b...@yugabyte.com> wrote:
> > *Back to NULLs...* > > Your code examples ran without error and produced the results that you > described. I do understand the fact that, on its face, the NULLs in the > two cases arise for different reasons. But this (still) seems to me to be a > distinction without a difference. It rather reminds me of my earlier > discussion with you (all) about the distinction (in the world of JSON using > "jsonb") between the presence of an object key "k" with the value "JSON > null" and the absence of key "k". > Hadn't thought of that but indeed I suspect SQL, whether conscious or not, has influenced the design of dealing with JSON in an SQL database to this extent. > The semantic proposition behind the "outer join", as it seems to me, is > inextricably bound up with the notion that, in the resulting rows, one > table might not have a partner row with the other. (It doesn't matter here > which table lacks the partner or if you decide to spell your query so that > "right" is the appropriate choice or "left" is—as long as you spell the > whole thing correctly to express your intention.) And the "outer join" > semantics bring the notion that you simply have no information about the > facts that, were it present, the missing row might have given you. Whoever > it was on the Committee back in the day, decided in concert to represent > this "no information" outcome, in the relation that results for an "outer > join", as an "emergent" SQL NULL. > > I've talked endlessly about NULL, over the years and face-to-face, with > colleagues whose reasoning ability and lucidity I hugely respect. They are > unwavering in how they explain NULL. It says simply: "I have absolutely no > information about the value that I sought to interrogate." And these > experts argue that there are no flavors of the bare fact of having no > information. They argue, too, that to say "the value of this variable (or > row-column intersection) is NULL" is an oxymoron because the absence of > information is not a value—in the purist sense. > At a high-level I would agree. But those NULLs are introduce by two different processes and sometimes that fact helps to explain reality. It's like: In what city does John reside in, and in what country is that location. If I don't know the first I won't know the second, even though I do know what country every city in my database is located within (country is not null, it's just null in reference to this question about John). IOW, it is quite possible for a design to have had different token for the two cases, which means they differ in some regard. That it doesn't work that way is because for nearly all cases the difference is immaterial and so separate tokens would be more annoying than helpful. The model is intentionally papering over reality (i.e., is wrong is some sense) in the interest of being more useful. > (A table without a primary key violoates the rule of proper practice. This > is why I found David's terse example too unrealistic to illustrate the > issue at hand here.) > vala is the PK, valb is the FK. Table B's PK wasn't relevant. > *Can anybody show me an implementation of a realistic use case that > follows proper practice — like "every table must a primary key", "a foreign > key must refer to a primary key", and "joins may be made only "on" columns > one of which has a PK constraint and the other of which has a FK > constraint" — where using a not nullable data type brings a problem that > wouldn't occur if the column were defined with a nullable data type and an > explicit "not null" constraint?* > Nothing obvious comes to mind. But frankly, proper practice includes trying to write idiomatic code for the language you are using so others familiar with the language can learn your code more easily. You are violating this to an extreme degree. I do not think it to be a good trade-off. SQL writers are practical people and the idioms largely avoid any downsides that the arise from SQL not being some paragon of language design. > > *-- "\d genres" shows "gk" with a "not null" constraint, whether I write > it* > *-- or not. And convention seems to say "don't clutter you code by writing > it".* > *create table genres(* > * gk int primary key,* > * gv text not null* > * );* > "Primary Key" is defined to be the application of both UNIQUE and NOT NULL constraints. It's not unlike saying "serial" to mean "integer with an associated sequence and default". But let's not go there, please? David J.