On 15.03.21 03:47, Thomas Munro wrote:
On Thu, Mar 11, 2021 at 2:06 AM David Steele <da...@pgmasters.net> wrote:
On 12/1/20 3:38 AM, Jürgen Purtz wrote:
OK. Patch attached.
+ Queries which access multiple tables (including repeats) at once are called
I'd write "Queries that" here (that's a transatlantic difference in
usage; I try to proofread these things in American mode for
consistency with the rest of the language in this project, which I
probably don't entirely succeed at but this one I've learned...).
Maybe instead of "(including repeats)" it could say "(or multiple
instances of the same table)"?
+ For example, to return all the weather records together with the
location of the
+ associated city, the database compares the <structfield>city</structfield>
column of each row of the <structname>weather</structname> table with the
<structfield>name</structfield> column of all rows in the
<structname>cities</structname>
table, and select the pairs of rows where these values match.
Here "select" should agree with "the database" and take an -s, no?
+ This syntax pre-dates the <literal>JOIN</literal> and <literal>ON</literal>
+ keywords. The tables are simply listed in the <literal>FROM</literal>,
+ comma-separated, and the comparison expression added to the
+ <literal>WHERE</literal> clause.
Could we mention SQL92 somewhere? Like maybe "This syntax pre-dates
the JOIN and ON keywords, which were introduced by SQL-92". (That's a
"non-restrictive which", I think the clue is the comma?)
+1. All proposed changes integrated.
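
For anyone reviewing the patch, the equivalence it describes can be tried out with a short script. This is only a sketch: the table definitions are modeled on the tutorial's weather and cities tables, and the sample rows (including a Hayward record with no matching city) are assumptions chosen for illustration, not part of the patch.

```sql
-- Assumed scratch tables modeled on the tutorial; TEMP keeps them session-local.
CREATE TEMP TABLE cities (
    name     varchar(80),
    location point
);
CREATE TEMP TABLE weather (
    city    varchar(80),
    temp_lo int,
    temp_hi int,
    prcp    real,
    date    date
);

-- Illustrative rows; Hayward deliberately has no entry in cities.
INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');
INSERT INTO weather VALUES
    ('San Francisco', 46, 50, 0.25, '1994-11-27'),
    ('San Francisco', 43, 57, 0.0,  '1994-11-29'),
    ('Hayward',       37, 54, NULL, '1994-11-29');

-- Explicit syntax: the join condition follows ON.
SELECT * FROM weather JOIN cities ON city = name;

-- Older implicit syntax: tables listed in FROM, condition in WHERE.
SELECT * FROM weather, cities WHERE city = name;

-- Outer join: unmatched weather rows are kept, with NULLs for the cities columns.
SELECT * FROM weather LEFT OUTER JOIN cities ON city = name;
```

With these rows, the first two statements should return the same two San Francisco records, and only the outer join should also keep the Hayward record.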
--
Kind regards, Jürgen Purtz
diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml
index e793398bb2..44aa8ef32b 100644
--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@@ -438,16 +438,15 @@ SELECT DISTINCT city
<para>
Thus far, our queries have only accessed one table at a time.
- Queries can access multiple tables at once, or access the same
- table in such a way that multiple rows of the table are being
- processed at the same time. A query that accesses multiple rows
- of the same or different tables at one time is called a
- <firstterm>join</firstterm> query. As an example, say you wish to
- list all the weather records together with the location of the
- associated city. To do that, we need to compare the <structfield>city</structfield>
+ Queries that access multiple tables (or multiple instances of the same table) at once are called
+ <firstterm>join</firstterm> queries. Conceptually, they combine
+ each row of one table with each row of a second table, and an expression
+ specifies which pairs of rows are returned.
+ For example, to return all the weather records together with the location of the
+ associated city, the database compares the <structfield>city</structfield>
column of each row of the <structname>weather</structname> table with the
<structfield>name</structfield> column of all rows in the <structname>cities</structname>
- table, and select the pairs of rows where these values match.
+ table, and selects the pairs of rows where these values match.
<note>
<para>
This is only a conceptual model. The join is usually performed
@@ -459,10 +458,16 @@ SELECT DISTINCT city
<programlisting>
SELECT *
- FROM weather, cities
- WHERE city = name;
+ FROM weather
+ JOIN cities ON city = name;
</programlisting>
+ The keyword <command>ON</command> is followed by the
+ expression that compares the rows of the two tables. In this example, the
+ column <varname>city</varname> of table <varname>weather</varname>
+ must be equal to the column <varname>name</varname>
+ of table <varname>cities</varname>.
+
<screen>
city | temp_lo | temp_hi | prcp | date | name | location
---------------+---------+---------+------+------------+---------------+-----------
@@ -497,23 +502,14 @@ SELECT *
<literal>*</literal>:
<programlisting>
SELECT city, temp_lo, temp_hi, prcp, date, location
- FROM weather, cities
- WHERE city = name;
+ FROM weather
+ JOIN cities ON city = name;
</programlisting>
</para>
</listitem>
</itemizedlist>
</para>
- <formalpara>
- <title>Exercise:</title>
-
- <para>
- Attempt to determine the semantics of this query when the
- <literal>WHERE</literal> clause is omitted.
- </para>
- </formalpara>
-
<para>
Since the columns all had different names, the parser
automatically found which table they belong to. If there
@@ -524,8 +520,8 @@ SELECT city, temp_lo, temp_hi, prcp, date, location
<programlisting>
SELECT weather.city, weather.temp_lo, weather.temp_hi,
weather.prcp, weather.date, cities.location
- FROM weather, cities
- WHERE cities.name = weather.city;
+ FROM weather
+ JOIN cities ON cities.name = weather.city;
</programlisting>
It is widely considered good style to qualify all column names
@@ -535,15 +531,29 @@ SELECT weather.city, weather.temp_lo, weather.temp_hi,
<para>
Join queries of the kind seen thus far can also be written in this
- alternative form:
+ form:
<programlisting>
SELECT *
- FROM weather INNER JOIN cities ON (weather.city = cities.name);
+ FROM weather, cities
+ WHERE city = name;
</programlisting>
- This syntax is not as commonly used as the one above, but we show
- it here to help you understand the following topics.
+ This syntax pre-dates the <literal>JOIN</literal> and <literal>ON</literal>
+ keywords, which were introduced by SQL-92. The tables are simply listed
+ in the <literal>FROM</literal> clause, separated by commas, and the comparison
+ expression is added to the <literal>WHERE</literal> clause.
+ </para>
+
+ <para>
+ As join expressions serve a specific
+ purpose in a multi-table query, it is preferable to make them stand out
+ by using join clauses to introduce additional tables into the query.
+ The results from the older implicit syntax and the newer explicit
+ <literal>JOIN</literal>/<literal>ON</literal> syntax are identical. But for a reader of the statement,
+ its meaning is now easier to understand: the join condition is
+ introduced by its own keyword, whereas previously the condition was
+ merged into the <literal>WHERE</literal> clause together with other conditions.
</para>
<indexterm><primary>join</primary><secondary>outer</secondary></indexterm>
@@ -556,12 +566,13 @@ SELECT *
found we want some <quote>empty values</quote> to be substituted
for the <structname>cities</structname> table's columns. This kind
of query is called an <firstterm>outer join</firstterm>. (The
- joins we have seen so far are inner joins.) The command looks
- like this:
+ joins we have seen so far are <firstterm>inner joins</firstterm>.)
+ The command looks like this:
<programlisting>
SELECT *
- FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name);
+ FROM weather
+ LEFT OUTER JOIN cities ON city = name;
</programlisting>
<screen>
@@ -591,10 +602,9 @@ SELECT *
</para>
</formalpara>
+ <indexterm><primary>join</primary><secondary>self</secondary></indexterm>
+ <indexterm><primary>alias</primary><secondary>for table name in query</secondary></indexterm>
<para>
- <indexterm><primary>join</primary><secondary>self</secondary></indexterm>
- <indexterm><primary>alias</primary><secondary>for table name in query</secondary></indexterm>
-
We can also join a table against itself. This is called a
<firstterm>self join</firstterm>. As an example, suppose we wish
to find all the weather records that are in the temperature range
@@ -608,10 +618,9 @@ SELECT *
<programlisting>
SELECT w1.city, w1.temp_lo AS low, w1.temp_hi AS high,
- w2.city, w2.temp_lo AS low, w2.temp_hi AS high
- FROM weather w1, weather w2
- WHERE w1.temp_lo < w2.temp_lo
- AND w1.temp_hi > w2.temp_hi;
+ w2.city, w2.temp_lo AS low, w2.temp_hi AS high
+ FROM weather w1
+ JOIN weather w2 ON w1.temp_lo < w2.temp_lo AND w1.temp_hi > w2.temp_hi;
</programlisting>
<screen>
@@ -628,8 +637,8 @@ SELECT w1.city, w1.temp_lo AS low, w1.temp_hi AS high,
queries to save some typing, e.g.:
<programlisting>
SELECT *
- FROM weather w, cities c
- WHERE w.city = c.name;
+ FROM weather w
+ JOIN cities c ON w.city = c.name;
</programlisting>
You will encounter this style of abbreviating quite frequently.
</para>