Hey, I guess you know all about PL/R, the R language extension for postgres . It is very convenient, though be carefull as sometime it crashed my server.
Cheers, Rémi-C 2014-07-16 3:42 GMT+02:00 John McKown <john.archie.mck...@gmail.com>: > On Tue, Jul 15, 2014 at 8:46 AM, David G Johnston < > david.g.johns...@gmail.com> wrote: > >> John McKown wrote >> > I have a table which has some "raw" data in it. By "raw", I mean it is >> > minimally processed from a log file. Every week, I update this table by >> > processing the weekly log using awk to create a "psql script" file which >> > looks similar to: >> > >> <snip> >> > So the overhead may be quite high, because to SELECT from RUNINFO, >> > PostgreSQL must realize all four views. >> > >> > I appreciate your thoughts on if this is OK, given that performance is >> > currently acceptable. Mainly because this work is basically only done >> one >> > a >> > week, on Sundays. And I don't do it myself, it is done via a scheduler >> > (not >> > cron, but similar) which runs some scripts. >> >> I would likely make "jobrun.runinfo" into a table while leaving >> "jobrun.rawdata" as-is. I would have a function that populates "runinfo" >> from "rawdata" that I would call after performing the copy to "rawdata". >> There would be no views - unless you desire a view interface over >> "runinfo" >> for API or permission reasons. >> >> In 9.4 you can (probably) make "runinfo" an explicit MATERIALIZED VIEW and >> perform REFRESH command to accomplish the same thing - though I am not >> particularly familiar with the mechanics of that feature. >> >> David J. >> >> > Being the indecisive nut that I am, I am going to do both <grin/>. I will > keep the current view. But when I update the rawdata, what I will then do > is: > > drop table runinfo_table; > create table runinfo_table as select distinct * from runinfo; > > I am fairly confident that there cannot be any duplicates in runinfo. But, > being paranoid as well, I will do the DISTINCT just to be sure. I may > change the VIEW to do that in the future, and remove it from the > preceeding. Since the process which updates the rawdata table is automated > and runs on a Sunday, the time needed to recreate runinfo_table is not > relevant to me. So I get what I want, unless I update rawdata off schedule. > I cannot imagine why I would do that since the logs from which I create it > are generally only available after 17:00 local time on Sunday. Getting the > iogs-to-date information for the time since the last dump is basically a > PITA and my current use is not critical. Actually, it is more a > "skunkworks" project of my own to produce a set of nice graphs, using R, > which _might_ turn out to be interesting to management, but the production > of which _will_ help me learn PostgreSQL and R better (hopefully). > > Many thanks. > > -- > There is nothing more pleasant than traveling and meeting new people! > Genghis Khan > > Maranatha! <>< > John McKown >