On May 15, 2007, at 12:02 PM, Bill Moseley wrote:

On Tue, May 15, 2007 at 01:42:18PM -0400, Chris Browne wrote:
[EMAIL PROTECTED] (PFC) writes:
    SELECT o.id
    FROM order o
    JOIN customer c on o.customer = c.id

Does that bring into memory all columns from both order and customer?
Maybe that's not a good example due to indexes.

        No, it just pulls the columns you ask from the table, nothing
less,  nothing more.

That's not quite 100% accurate.

In order to construct the join, the entire pages of the relevant
tuples in tables "order" and "customer" will need to be drawn into
memory.

Thus, if there are a whole bunch of columns on each table, the data in
those extra columns (e.g. - all columns aside from "id", the one that
was asked for in the result set) will indeed be drawn into memory.

Is that specific to Postgresql?  From an outside perspective it just
seems odd that potentially a large amount of data would be pulled off
disk into memory that is never used.  Perhaps there's an overriding
reason for this.

The columns are next to each other on the disk. You need to read
the entire block off disk into system cache, so you'll be reading all
the columns of all the rows in that block into memory.

That's just the way that most (all?) modern filesystems work, and so the
way that most filesystem based databases are going to work. I've seen
some databases that don't store all the main columns of a table together
on disk, but they're fairly rare.

Pushing data into lookaside tables either manually or automatically
via toast changes the tradeoffs.

Cheers,
  Steve




---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Reply via email to