I have a table which has some "raw" data in it. By "raw", I mean it is minimally processed from a log file. Every week, I update this table by processing the weekly log using awk to create a "psql script" file which looks similar to:
COPY rawdata FROM STDIN; .... lines created by awk script \. The table schema is: Table "jobrun.rawdata" Column | Type | ----------+---------------+ lpar | character(4) | yyddd | character(5) | timedata | character(11) | jobid | character(8) | msgid | character(7) | jobname | character(8) | Now, this data is not really very useful in its raw form. So I "process" it via a view: View "jobrun.rundata" Column | Type | ---------+--------------------------+ lpar | character(4) | msgid | character(7) | jobname | character(8) | jobid | character(8) | msgtime | timestamp with time zone | View definition: SELECT rawdata.lpar, rawdata.msgid, rawdata.jobname, rawdata.jobid, to_timestamp((rawdata.yyddd::text || ' '::text) || rawdata.timedata::text, 'YYDDD HH24:MI:SS.MS'::text) AS msgtime FROM rawdata; My question is this: If I do a number of SELECTs on the "rundata" table. So, would it be worth while to make this a table in itself? The plus of a view is that I don't need to worry about updates. And I still have the "raw" data around. In reality, this is just the first VIEW. I create three other views. Two views are to "subset" the data based on the contents of the "msgid" value (there are only 2 possibilities at present: 'START' and 'END'). The final view, which is my actual information is a FULL OUTER JOIN of the START and END subset, based on lpar,jobname, and jobid: View "jobrun.runinfo" Column | Type | ----------+--------------------------+ lpar | character(4) | jobname | character(8) | jobid | character(8) | runstart | timestamp with time zone | runend | timestamp with time zone | View definition: SELECT COALESCE(a.lpar, b.lpar) AS lpar, COALESCE(a.jobname, b.jobname) AS jobname, COALESCE(a.jobid, b.jobid) AS jobid, a.msgtime AS runstart, b.msgtime AS runend FROM runstart a FULL JOIN runend b ON a.lpar = b.lpar AND a.jobname = b.jobname AND a.jobid = b.jobid; So the overhead may be quite high, because to SELECT from RUNINFO, PostgreSQL must realize all four views. I appreciate your thoughts on if this is OK, given that performance is currently acceptable. Mainly because this work is basically only done one a week, on Sundays. And I don't do it myself, it is done via a scheduler (not cron, but similar) which runs some scripts. -- There is nothing more pleasant than traveling and meeting new people! Genghis Khan Maranatha! <>< John McKown