Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-16 Thread Brian Wipf
On 16-May-07, at 4:05 PM, PFC wrote: This makes queries hard to optimize. Consider the table (user_id, item_id) meaning user selected this item as favourite. If you want to know which users did select both items 1 and 2, you have to do a self-join, something like : SELECT... FROM favourite

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-16 Thread PFC
We use bitfields on our large user table. It is becoming unworkable to scan for matches, since overall most people have very few selections made. We are moving it to a model like your favorite_colors table which just links the option and the user. We find that doing joins on large tables

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-16 Thread Vivek Khera
On May 15, 2007, at 10:35 AM, Bill Moseley wrote: For some value of "large", is there a time when one might consider using a single column in the user or user_prefs table to represent their color choices instead of a link table? We use bitfields on our large user table. It is becoming unwork

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-16 Thread Vivek Khera
On May 14, 2007, at 4:37 PM, Bill Moseley wrote: Say that there's also about 10 columns of settings or preferences for each user. Are there any cases or reasons to have a separate "user_preferences" table vs. just placing all the columns together in one table? when you have, say 65 million u

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread Bill Moseley
On Tue, May 15, 2007 at 09:20:53PM +0200, PFC wrote: > >From an outside perspective it just > >seems odd that potentially a large amount of data would be pulled off > >disk into memory that is never used. Perhaps there's an overriding > >reason for this. > > Yeah, where would you put this d

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread PFC
Thus, if there are a whole bunch of columns on each table, the data in those extra columns (e.g. - all columns aside from "id", the one that was asked for in the result set) will indeed be drawn into memory. Yeah, I wanted to mean that ;) All the columns are loaded (except the TOASTed

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread Steve Atkins
On May 15, 2007, at 12:02 PM, Bill Moseley wrote: On Tue, May 15, 2007 at 01:42:18PM -0400, Chris Browne wrote: [EMAIL PROTECTED] (PFC) writes: SELECT o.id FROM order o JOIN customer c on o.customer = c.id Does that bring into memory all columns from both order and customer? M

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread John D. Burger
Thus, if there are a whole bunch of columns on each table, the data in those extra columns (e.g. - all columns aside from "id", the one that was asked for in the result set) will indeed be drawn into memory. Is that specific to Postgresql? From an outside perspective it just seems odd that po

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread Bill Moseley
On Tue, May 15, 2007 at 01:42:18PM -0400, Chris Browne wrote: > [EMAIL PROTECTED] (PFC) writes: > >> SELECT o.id > >> FROM order o > >> JOIN customer c on o.customer = c.id > >> > >> Does that bring into memory all columns from both order and customer? > >> Maybe that's not a good examp

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread SCassidy
One other possible reason for splitting the table up in two chunks is to grant different rights on the 2 sets of columns. Susan Cassidy Bill Moseley <[EMAIL PROTECTED]> Sent by: [EMAIL PROTECTED] 05/15/2007 09:44 AM To Postgres General cc Subject Re: [GENERAL] Performance issues

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread Chris Browne
[EMAIL PROTECTED] (PFC) writes: >> SELECT o.id >> FROM order o >> JOIN customer c on o.customer = c.id >> >> Does that bring into memory all columns from both order and customer? >> Maybe that's not a good example due to indexes. > > No, it just pulls the columns you ask from the

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread PFC
SELECT o.id FROM order o JOIN customer c on o.customer = c.id Does that bring into memory all columns from both order and customer? Maybe that's not a good example due to indexes. No, it just pulls the columns you ask from the table, nothing less, nothing more. Splitting tab

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread Bill Moseley
On Tue, May 15, 2007 at 07:51:44AM +0200, Dawid Kuroczko wrote: > On 5/15/07, Bill Moseley <[EMAIL PROTECTED]> wrote: > >On Tue, May 15, 2007 at 06:33:26AM +0200, Dawid Kuroczko wrote: > >> Well, views are not going to help with memory consumption here. > >> It is the table contents that gets cache

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread Ben
I'm sure there's a point where you'd be saving a "substantial" amount of disk space using a non-normalized scheme, but, like you say, you'd be missing out on other things. In general, disks are cheap while the man hours used to try to fix data corruption is not. On May 15, 2007, at 7:35 AM,

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-15 Thread Bill Moseley
Can anyone provide input on this question? I'm curious how to look at this from a disk and memory usage perspective. Would using a bit column type help much? I'm not thrilled by the loss of referential integrity. On Mon, May 14, 2007 at 01:37:18PM -0700, Bill Moseley wrote: > > Say I have a ta

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-14 Thread PFC
Say, if you first SELECT fname, lname FROM user_table; and then you issue SELECT * FROM user_table; -- the second select will be returned from buffer cache -- since all rows are already in the cache. ...Unless your table contains some large TEXT columns that have been stored out of line (T

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-14 Thread Dawid Kuroczko
On 5/15/07, Bill Moseley <[EMAIL PROTECTED]> wrote: On Tue, May 15, 2007 at 06:33:26AM +0200, Dawid Kuroczko wrote: > Well, views are not going to help with memory consumption here. > It is the table contents that gets cached in buffer cache, not the > views contents. So if you have a view which

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-14 Thread Bill Moseley
On Tue, May 15, 2007 at 06:33:26AM +0200, Dawid Kuroczko wrote: > Well, views are not going to help with memory consumption here. > It is the table contents that gets cached in buffer cache, not the > views contents. So if you have a view which returns only one > column from 15-column table, you w

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-14 Thread Rich Shepard
On Mon, 14 May 2007, PFC wrote: I did something like that on MySQL some time ago. In the Users table there was stuff that other users need to see (like his login name, etc), and stuff that only this user needs to see (like his preferences). So, when displaying posts in the forum, for

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-14 Thread Bill Moseley
On Mon, May 14, 2007 at 10:52:13PM +0200, PFC wrote: > > >Say I have a "user" table that has first, last, email, password, and > >last_accessed columns. This user table will be accessed often. (It's > >not really "user", but that's not important in this discussion) > > > >Say that there's also a

Re: [GENERAL] Performance issues of one vs. two split tables.

2007-05-14 Thread PFC
Say I have a "user" table that has first, last, email, password, and last_accessed columns. This user table will be accessed often. (It's not really "user", but that's not important in this discussion) Say that there's also about 10 columns of settings or preferences for each user. Are there