[GENERAL] Multithreaded query onto 4 postgresql instances
Here is my probably uncommon situation. I have installed 4 different instances of postgresql-9.0.2 on the same machine, on ports 5433, 5434, 5435 and 5436. Across these instances I have split a huge database, dividing it by date (1995 to 1998 on 5433, 1999 to 2002 on 5434, and so on...). I have then developed a C function using libpq which creates 4 threads, each of which queries one quarter of the database. After that I merge the results into one single response. My function works fine, but I need to include it inside a PostgreSQL instance in order to launch it as a normal SQL query (SELECT myfunc(...);). Why I have to do something tricky like this would take too long to explain... I have read the documentation at http://www.postgresql.org/docs/9.0/static/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET, but I'm pretty confused and I don't know if this is the right way to accomplish my goal. Have you got any ideas or suggestions? Thanks a lot! -- Alessandro Candini MEEO S.r.l. Via Saragat 9 I-44122 Ferrara, Italy Tel: +39 0532 1861501 Fax: +39 0532 1861637 http://www.meeo.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
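The documentation page linked above describes how such a function is declared at the SQL level. As a hedged sketch only (the library name, symbol and result columns below are assumptions, not the actual code), a C function compiled into a shared library can be exposed as a set-returning function like this:

    -- Hypothetical wrapper: the argument list, OUT columns and library name are illustrative.
    CREATE OR REPLACE FUNCTION myfunc(from_year integer, to_year integer)
    RETURNS TABLE (obs_time timestamptz, value double precision)
    AS 'parallel_query', 'myfunc'   -- shared object parallel_query.so, C symbol myfunc
    LANGUAGE C STRICT;

    -- Callable like any other set-returning function:
    SELECT * FROM myfunc(1995, 2002);

On the C side the function would have to follow the set-returning-function protocol (the SRF_* macros from funcapi.h), returning the merged rows one at a time rather than as a single buffer.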
[GENERAL] partitioning and dynamic query creation
Hello list, I'm currently thinking about a possibility to create a dynamic insert statement inside a trigger function to put the data in the correct partition. What am I talking about? I want to put data, depending on a timestamp column ("datetime"), into separate partitions. Therefore I created the partitions, checks and a trigger function with the following code: "" ... date_part=''; date_part = to_char(NEW.datetime,'YYYY') || to_char(NEW.datetime,'MM'); tablename = 'table_' || date_part; RAISE NOTICE 'target table: %', tablename; EXECUTE 'insert into ' || tablename::regclass || ' values (NEW.*);'; --IF ( DATE (NEW.datetime) >= DATE '2010-11-01' AND -- DATE (NEW.datetime) < DATE '2010-12-01' ) THEN --INSERT INTO tab_tour201011 VALUES (NEW.*); --ELSIF ( DATE (NEW.datetime) >= DATE '2010-12-01' AND --DATE (NEW.datetime) < DATE '2011-01-01' ) THEN --INSERT INTO tab_tour201012 VALUES (NEW.*); --ELSIF ( DATE (NEW.datetime) >= DATE '2011-01-01' AND --DATE (NEW.datetime) < DATE '2011-02-01' ) THEN --INSERT INTO tab_tour201101 VALUES (NEW.*); --ELSIF ( DATE (NEW.datetime) >= DATE '2011-02-01' AND --DATE (NEW.datetime) < DATE '2011-03-01' ) THEN --INSERT INTO tab_tour201102 VALUES (NEW.*); ... "" The above code throws the following error while trying to insert data: "" NOTICE: target table: table_201102 ERROR: missing FROM-clause entry for table "new" LINE 1: insert into table_201102 values (NEW.*); ^ QUERY: insert into table_201102 values (NEW.*); CONTEXT: PL/pgSQL function "insert_trigger" line 10 at EXECUTE statement "" O.K., most probably this is caused by the fact that the statement string includes the characters NEW, but not the values... or what?! The commented lines work as expected and I think this is the common way of handling partitions. Now my question: is it possible at all to create the insert statement on the fly, to avoid modifying the trigger function each time a new partition has been added? Any help appreciated ::GERD:: -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
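One commonly used way around this error (shown here only as a sketch, reusing the tablename variable the trigger already computes) is to pass the row to EXECUTE as a parameter with USING, available since PostgreSQL 8.4, instead of embedding the identifier NEW in the query string:

    -- Sketch: NEW is passed as $1, so the dynamic statement never has to name it.
    EXECUTE 'INSERT INTO ' || quote_ident(tablename) || ' SELECT ($1).*'
    USING NEW;
    RETURN NULL;  -- assuming the usual BEFORE INSERT partitioning trigger that suppresses the row on the parent table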
[GENERAL] GUC configuration
Hi, I am an MSc student in the department of Informatics and Telecommunications of the University of Athens and as part of my thesis I am examining some new cost models for DB optimizers. I have successfully changed the optimizer of PostgreSQL in order to implement these models, but I have stumbled upon one small detail: until now I have used some hardcoded values in my code which I would like to make accessible through GUC. After much googling, the only relevant pages I have found are about configuring existing PostgreSQL variables. Can anybody please provide some help? How can I add some new configuration variables using GUC? Thanks in advance, Thanos Papapetrou Θάνος Παπαπέτρου You insist that there is something that a machine can't do. If you will tell me precisely what it is that a machine cannot do, then I can always make a machine which will do just that. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] 9.0.X FOR UPDATE|SHARE on Sub-Query Causes "cannot extract system attribute from virtual tuple" if Sub-Query Returns Records (BUG)
Tom, From your commit notes: "This wasn't a problem before 9.0 because we didn't support FOR UPDATE below the top query level..." FWIW I had been using a sub-query FOR UPDATE in one of my key queries (one that was called multiple times per second) and relied upon the FOR UPDATE to avoid having the same record "dispatched" multiple times. It worked just fine in 8.2.X and 8.4.X - supported or not. David J -Original Message- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Tom Lane Sent: Wednesday, February 09, 2011 11:37 PM To: David Johnston Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] 9.0.X FOR UPDATE|SHARE on Sub-Query Causes "cannot extract system attribute from virtual tuple" if Sub-Query Returns Records (BUG) "David Johnston" writes: > More simply if you run any query of the form: > SELECT subquerycolumn > FROM ( > SELECT subquerycolumn FROM table WHERE [condition] FOR UPDATE -- > WHERE is optional but obviously useful; FOR SHARE also causes this > behavior > ) intermediate > The error "cannot extract system attribute from virtual tuple" is > thrown iff the sub-query returns one or more records. Fixed, thanks for the report! http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=d5478c3391f8f1a243abbc3d9253aac3d6d3538e regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Backup/Restore Needed for Upgrade from 9.0beta4?
On Tuesday, February 08, 2011 10:41:15 am Lee Hughes wrote: > From section 15.4 of the manual: > > "If you are upgrading from PostgreSQL "9.0.x", the new version can use > your current data files so you should skip the backup and restore steps" > > Is 9.0beta4 considered a 9.0.x version, or do I need to backup/restore > when upgrading from that version? > > Thanks much- > > Lee The announcement for Postgres 9.0rc1 indicated that a dump and restore is needed when migrating from the betas, so I take this to mean yes, you do need to dump/restore for what you are doing. http://www.postgresql.org/about/news.1230 -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Trigger problem, record "new" is not assigned yet
Hello. I'm probably making some very basic error here, but I get ERROR: record "new" is not assigned yet The tuple structure of a not-yet-assigned record is indeterminate. when I try this small example create table foo(x int); create or replace function trigger_foo() returns trigger language plpgsql as $$ declare v_i integer; begin select count(1) into v_i from foo; if new.x >18 then raise exception 'error'; else return null; end if; end; $$; CREATE TRIGGER trigger_foo AFTER INSERT ON foo for each row EXECUTE PROCEDURE trigger_foo(); insert into foo (x) values (15); And the table foo only has 14 rows right now. Can someone help me spot the error? :-) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigger problem, record "new" is not assigned yet
On Thu, Feb 10, 2011 at 9:29 AM, A B wrote: > Can someone help me spot the error? :-) > use a BEFORE INSERT trigger? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigger problem, record "new" is not assigned yet
Thanks for the suggestion, but CREATE TRIGGER trigger_foo BEFORE INSERT ON foo for each row EXECUTE PROCEDURE trigger_foo(); gives me the same error. 2011/2/10 Vick Khera : > On Thu, Feb 10, 2011 at 9:29 AM, A B wrote: >> Can someone help me spot the error? :-) >> > > use a BEFORE INSERT trigger? > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigger problem, record "new" is not assigned yet
On Thursday, February 10, 2011 6:29:58 am A B wrote: > Hello. > > I'm probably doing some very basic error here, but I get > > ERROR: record "new" is not assigned yet > The tuple structure of a not-yet-assigned record is indeterminate. > > when I try this small example > > create table foo(x int); > > create or replace function trigger_foo() returns trigger language plpgsql > as $$ declare v_i integer; > begin > select count(1) into v_i from foo; > if new.x >18 then > raise exception 'error'; > else > return null; > end if; > end; $$; > > CREATE TRIGGER trigger_foo AFTER INSERT ON foo for each row EXECUTE > PROCEDURE trigger_foo(); > > insert into foo (x) values (15); > > > And the table foo only have 14 lines right now. > > Can someone help me spot the error? :-) The above works here, Postgres 9.0.3. Was that the complete error message? Is there more than one foo across the schemas? -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigger problem, record "new" is not assigned yet
On Thu, Feb 10, 2011 at 9:38 AM, A B wrote: > CREATE TRIGGER trigger_foo BEFORE INSERT ON foo for each row EXECUTE > PROCEDURE trigger_foo(); > > gives me the same error. > Maybe "NEW" needs to be all caps? Also, with the BEFORE trigger, you'll need to RETURN NEW. This trigger of mine works just fine as a before trigger, FWIW: CREATE OR REPLACE FUNCTION sessions_update_lastuse() RETURNS TRIGGER AS $$ BEGIN NEW.lastuse = CURRENT_TIMESTAMP; RETURN NEW; END; $$ LANGUAGE plpgsql; -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigger problem, record "new" is not assigned yet
I'm very embarrassed now. There was another trigger that was causing the problem. Now it works. Thank you all for helping! :-) 2011/2/10 Adrian Klaver : > On Thursday, February 10, 2011 6:29:58 am A B wrote: >> Hello. >> >> I'm probably doing some very basic error here, but I get >> >> ERROR: record "new" is not assigned yet >> The tuple structure of a not-yet-assigned record is indeterminate. >> >> when I try this small example >> >> create table foo(x int); >> >> create or replace function trigger_foo() returns trigger language plpgsql >> as $$ declare v_i integer; >> begin >> select count(1) into v_i from foo; >> if new.x >18 then >> raise exception 'error'; >> else >> return null; >> end if; >> end; $$; >> >> CREATE TRIGGER trigger_foo AFTER INSERT ON foo for each row EXECUTE >> PROCEDURE trigger_foo(); >> >> insert into foo (x) values (15); >> >> >> And the table foo only have 14 lines right now. >> >> Can someone help me spot the error? :-) > > The above works here, Postgres 9.0.3. Was that the complete error message? Is > there more than one foo across the schemas? > > > -- > Adrian Klaver > adrian.kla...@gmail.com > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] fulltext search and hunspell
Thanks for this tip, the german compound directory from http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ works fine. I think the problem was the rudimentary support of hunspell dictionaries. Thanks for your help and your great software! Am 08.02.2011 11:34, schrieb Oleg Bartunov: > Jens, > > have you tried german compound dictionary from > http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ > > Oleg > On Tue, 8 Feb 2011, Jens Sauer wrote: > >> Hey, >> >> thanks for your answer. >> >> First I checked the links in the tsearch_data directory >> de_de.affix, and de_de.dict are symlinks to the corresponding files in >> /var/cache/postgresql/dicts/ >> Then I recreated them by using pg_updatedicts. >> >> This is an extract of the de_de.affix file: >> >> # this is the affix file of the de_DE Hunspell dictionary >> # derived from the igerman98 dictionary >> # >> # Version: 20091006 (build 20100127) >> # >> # Copyright (C) 1998-2009 Bjoern Jacke >> # >> # License: GPLv2, GPLv3 or OASIS distribution license agreement >> # There should be a copy of both of this licenses included >> # with every distribution of this dictionary. Modified >> # versions using the GPL may only include the GPL >> >> SET ISO8859-1 >> TRY esijanrtolcdugmphbyfvkwqxz??ESIJANRTOLCDUGMPHBYFVKWQXZ-. >> >> PFX U Y 1 >> PFX U 0 un . >> >> PFX V Y 1 >> PFX V 0 ver . >> >> SFX F Y 35 >> [...] >> >> I cannot find "compoundwords controlled z" there, so I manually added >> it. >> >> [...] >> # versions using the GPL may only include the GPL >> >> compoundwords controlled z >> >> SET ISO8859-1 >> TRY esijanrtolcdugmphbyfvkwqxz??ESIJANRTOLCDUGMPHBYFVKWQXZ-. >> [...] >> >> Then I restarted PostgreSQL. >> >> Now I get an error: >> SELECT * FROM ts_debug('Schokoladenfabrik'); >> FEHLER: falsches Affixdateiformat für Flag >> CONTEXT: Zeile 18 in Konfigurationsdatei >> »/usr/share/postgresql/8.4/tsearch_data/de_de.affix«: »PFX U Y 1 >> « >> SQL-Funktion »ts_debug« Anweisung 1 >> SQL-Funktion »ts_debug« Anweisung 1 >> >> Which means: >> ERROR: wrong Affixfileformat for flag >> CONTEXT: Line 18 in Configuration ... >> >> If I add >> COMPOUNDFLAG Z >> ONLYINCOMPOUND L >> >> instead of "compoundwords controlled z" >> >> I didn't get an error: >> >> SELECT * FROM ts_debug('Schokoladenfabrik'); >> alias | description | token | >> dictionaries | dictionary | lexemes >> ---+-+---+---+-+--- >> >> asciiword | Word, all ASCII | Schokoladenfabrik | >> {german_hunspell,german_stem} | german_stem | {schokoladenfabr} >> (1 row) >> >> But it seems that the hunspell dictionary is not working for compound >> words. >> >> Maybe pg_updatedicts has a bug and generates affix files in the wrong >> format? >> >> Jens >> >> 2011/2/7 Oleg Bartunov : >>> Jens, >>> >>> could you check affix file for >>> compoundwords controlled z >>> >>> also, can you provide link to dictionary files, so we can check if they >>> supported, since we have only rudiment support of hunspell. >>> btw, it'd be nice to have output from ts_debug() to make sure >>> dictionaries >>> actually used. >>> >>> Oleg >> > > Regards, > Oleg > _ > Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), > Sternberg Astronomical Institute, Moscow University, Russia > Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ > phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
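For reference, wiring an ispell/hunspell-format dictionary into a text search configuration is done with SQL along these lines. This is only a hedged sketch with assumed file and configuration names, not the setup discussed above; the dictionary files must already be present in the server's tsearch_data directory:

    -- All names here are illustrative.
    CREATE TEXT SEARCH DICTIONARY german_hunspell (
        TEMPLATE  = ispell,
        DictFile  = german_compound,   -- german_compound.dict
        AffFile   = german_compound,   -- german_compound.affix
        StopWords = german
    );
    CREATE TEXT SEARCH CONFIGURATION german_hun ( COPY = pg_catalog.german );
    ALTER TEXT SEARCH CONFIGURATION german_hun
        ALTER MAPPING FOR asciiword, word, hword, hword_part
        WITH german_hunspell, german_stem;
    -- ts_debug() then shows which dictionary produced each lexeme:
    SELECT * FROM ts_debug('german_hun', 'Schokoladenfabrik');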
Re: [GENERAL] 9.0.X FOR UPDATE|SHARE on Sub-Query Causes "cannot extract system attribute from virtual tuple" if Sub-Query Returns Records (BUG)
"David Johnston" writes: >> From your commit notes: > "This wasn't a problem before 9.0 because we didn't support FOR UPDATE > below the top query level..." > FWIW I had been using a sub-query FOR UPDATE in one of my key queries (one > that was called multiple times per second) and relied upon the FOR UPDATE to > avoid having the same record "dispatched" multiple times. It worked just > fine in 8.2.X and 8.4.X - supported or not. Yeah, what that actually meant was that we didn't support FOR UPDATE below the top level of the query *as executed*. The optimizer used to flatten subqueries containing FOR UPDATE if it could (and fail if it couldn't). 9.0 changes that behavior because it led to FOR UPDATE locking getting applied in unexpected/unpredictable ways in more complex queries, eg joins. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] COPY statement REAL vs VARCHAR precision issue
Hello all, I'm using a COPY statement to load data into a PostGIS. The issue I am facing is that if I define fields with the REAL type, the COPY will only preserve 4 decimals which is not sufficient for my application. Here is the commands I'm running and a sample of the results : CREATE TABLE sites ( id VARCHAR(9) PRIMARY KEY, name VARCHAR(128), type VARCHAR(1), agency VARCHAR(128), status INTEGER, napsDesignated BOOLEAN, address VARCHAR(128), city VARCHAR(128), network VARCHAR(128), timeZone REAL, lat REAL, lon REAL, elevation REAL ); COPY sites ( id, name, type, agency, status, napsDesignated, address, city, network, timeZone, lat, lon, elevation ) FROM '/data/sites.csv' WITH DELIMITER '|' CSV HEADER; -- Then I use some PostGIS functions to convert the lat/lon into a geometry -- The issue is independent of the GIS processing. SELECT lat, lon FROM sites LIMIT 4; --lat | lon -- -+-- -- 47.5681 | -52.7022 --47.56 | -52.7114 -- 49.3208 | -57.3972 -- 48.9495 | -57.9454 -- (4 rows) -- If I define the fields as being VARCHAR instead of REAL, I get all -- the decimals : CREATE TABLE sites ( id VARCHAR(9) PRIMARY KEY, name VARCHAR(128), type VARCHAR(1), agency VARCHAR(128), status INTEGER, napsDesignated BOOLEAN, address VARCHAR(128), city VARCHAR(128), network VARCHAR(128), timeZone REAL, lat VARCHAR(32), lon VARCHAR(32), elevation REAL ); COPY sites ( id, name, type, agency, status, napsDesignated, address, city, network, timeZone, lat, lon, elevation ) FROM '/data/sites.csv' WITH DELIMITER '|' CSV HEADER; SELECT lat, lon FROM sites LIMIT 4; lat|lon --+--- 47.56806 | -52.70222 47.56| -52.71139 49.32083 | -57.39722 48.949479| -57.945387 (4 rows) -- When I convert the lat/lon to GEOMETRY, the function takes a string as input. Therefore, the precision depends on how the REAL are converted to strings. : UPDATE sites SET position = GeomFromEWKT('SRID=4326;POINT(' || lon || ' ' || lat ||' ' || elevation || ')'); -- Are the values stored with all the possible precision of a REAL or are they stored as displayed? -- Is it because of the behaviour of the COPY statement? Thank you for your help! Samuel -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] GUC configuration
> -Original Message- > From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- > ow...@postgresql.org] On Behalf Of Thanos Papapetrou > Sent: Thursday, February 10, 2011 2:18 PM > To: pgsql-general@postgresql.org > Subject: [GENERAL] GUC configuration > > Hi, > > I am an MSc student in the department of Informatics and > Telecommunications of the University of Athens and as part of my > thesis I am examining some new cost models for DB optimizers. I have > successfully changed the optimizer of PostgreSQL in order to implement > these models, but I have stumbled upon a very little detail: until now > I use some hardcoded values in my code which I would like to make > accessible through GUC. After much googling the only relative pages I > have found is about configuring existing PostgreSQL variables. Can > anybody please provide some help? How can I add some new configuration > variables using GUC? > > Thanks in advance, Are you looking for custom variable classes? http://www.postgresql.org/docs/8.4/interactive/runtime-config-custom.html -- Robert... -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
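As a hedged illustration of that mechanism (the class and parameter names below are invented for the example), the user-visible side of a custom variable class looks like ordinary GUC access; server-side registration of such parameters, with defaults and validation, is normally done in C via the DefineCustom*Variable() functions called from a module's _PG_init():

    -- postgresql.conf would need: custom_variable_classes = 'thesis_optimizer'
    SET thesis_optimizer.cpu_cost_factor = '1.25';
    SHOW thesis_optimizer.cpu_cost_factor;
    SELECT current_setting('thesis_optimizer.cpu_cost_factor');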
[GENERAL] Cross Pollination is a wonderful thing
-General How did we get the duckbill platypus? Cross Pollination of course and today is the last day to get your content in for PostgreSQL Conference East. A conference that is cross pollinating with MongoDB, one of those "NoSQL" technologies. Now, we have experience, lots of experience on our side but they are the beautiful and energetic technology. Let's show them what our old school ways can do! Today, CFP for East closes. Don't be left alone after beer-thirty. https://www.postgresqlconference.org/talk_types JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] finding bogus UTF-8
I know that I have at least one instance of a varchar that is not valid UTF-8, imported from a source with errors (AMA CPT files, actually) before PG's checking was as stringent as it is today. Can anybody suggest a query to find such values? -- Scott Ribe scott_r...@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] COPY statement REAL vs VARCHAR precision issue
On 10/02/2011 17:13, Samuel Gilbert wrote: Hello all, I'm using a COPY statement to load data into a PostGIS. The issue I am facing is that if I define fields with the REAL type, the COPY will only preserve 4 decimals which is not sufficient for my application. If you use NUMERIC, you can define the precision you need, and be sure of accurate calculations: http://www.postgresql.org/docs/9.0/static/datatype-numeric.html#DATATYPE-NUMERIC-DECIMAL Ray. -- Raymond O'Donnell :: Galway :: Ireland r...@iol.ie -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
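As a hedged sketch of that suggestion (precision and scale picked arbitrarily for the example), the coordinate columns could be declared as NUMERIC; DOUBLE PRECISION is another option, keeping roughly 15 significant digits where REAL keeps only about 6:

    -- Only the coordinate columns change; the table would need to be reloaded
    -- afterwards, since values already stored as REAL have lost the extra digits.
    ALTER TABLE sites
        ALTER COLUMN lat TYPE numeric(9,6),
        ALTER COLUMN lon TYPE numeric(9,6),
        ALTER COLUMN elevation TYPE double precision;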
[GENERAL] Schema version control
Hi all, My company is having trouble managing how we upgrade schema changes across many versions of our software. I imagine this is a common problem and there're probably some neat solutions that we don't know about. For the last 10 years we have been writing bash shell scripts essentially numbered in order db0001, db0002, db0003 The number represents the schema version which is recorded in the database and updated by the shell scripts. We have a template that provides all the functionality we need, we just copy the script and fill in the blanks. The schema upgrade scripts are committed to svn along with the software changes, and we have a process when installing the software at a site that runs the scripts on the DB in order before starting up the new version of the software. This has worked really well so far. But we've effectively only had one version of the software in development at any time. We're now in the habit of branching the software to form releases to promote stability when making large changes. The idea is that only really important changes are merged in to the releases. This introduces a bit of a problem when some change needs to be merged from one release to another. The typical problem is that we might have two versions of the software 10.0 at schema version 10057 and 11.0 at 11023 and we need to merge an important bug fix from schema 11023 in to 10.0. The issue is that 11023 might depend upon changes introduced in the schema versions before it. Or 11023 might introduce changes that cause later scripts to break (11000 - 11023) when upgrading from 10.0 to 11.0. One potential solution is to require that schema changes are never merged in to a release, but of course sometimes business requires we do =( I'm really interested to hear how you guys manage schema upgrades in the face of branches and upgrading from many different versions of the database. I've been reading http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ but I have a feeling that this blog post won't address branches. Cheers! --Royce -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On 2/10/2011 3:38 PM, Royce Ausburn wrote: Hi all, My company is having trouble managing how we upgrade schema changes across many versions of our software. I imagine this is a common problem and there're probably some neat solutions that we don't know about. For the last 10 years we have been writing bash shell scripts essentially numbered in order db0001, db0002, db0003 The number represents the schema version which is recorded in the database and updated by the shell scripts. We have a template that provides all the functionality we need, we just copy the script and fill in the blanks. The schema upgrade scripts are committed to svn along with the software changes, and we have a process when installing the software at a site that runs the scripts on the DB in order before starting up the new version of the software. This has worked really well so far. But we've effectively only had one version of the software in development at any time. We're now in the habit of branching the software to form releases to promote stability when making large changes. The idea is that only really important changes are merged in to the releases. This introduces a bit of a problem when some change needs to be merged from one release to another. The typical problem is that we might have two versions of the software 10.0 at schema version 10057 and 11.0 at 11023 and we need to merge an important bug fix from schema 11023 in to 10.0. The issue is that 11023 might depend upon changes introduced in the schema versions before it. Or 11023 might introduce changes that cause later scripts to break (11000 - 11023) when upgrading from 10.0 to 11.0. One potential solution is to require that schema changes are never merged in to a release, but of course sometimes business requires we do =( I'm really interested to hear how you guys manage schema upgrades in the face of branches and upgrading from many different versions of the database. I've been reading http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ but I have a feeling that this blog post won't address branches. Cheers! --Royce So, 10.0 at 10057. 11.0 at 11023. then 10.1 needs some fixes so db is bumped to 10058. Then, later, you can upgrade 10057 to 11023, but you can't get 10058 to 11023. Humm... maybe you need smarter upgrade scripts? Would having logic in the script help? Something like: if not fieldExists('xyz') then alter table ... add xyz ... Or, maybe your schema numbering system is too broad? Maybe each table could have a version number? Or some kinda flags like: create table dbver(key text); then an update would be named: "add xyz to bob". then the update code: q = select key from dbver where key = 'add xyz to bob'; if q.eof then alter table bob add xyz -Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] finding bogus UTF-8
I'm working on a project to convert a large database from SQL_ASCII to UTF-8. I am using this procedure: 1) pg_dump the SQL_ASCII database to an SQL text file. 2) Run through a small (efficient) C program that logs each line that contains ANY "unclean" ASCII text. 3) Parse that log with a small perl program (hashes are easier in perl than C) to produce a report, and emit some SQL. 4) Construct SQL update statements to "repair" the original data. 5) Repeat at step #1 until the database is clean. 6) pg_dump (SQL_ASCII) -> pg_restore -EUTF8 new database. 7) Profit! If you are interested, I can email you the C and Perl source. It runs like this: # time pg_restore /db-dumps/some_ascii_pgdump.bin | ./ascii-tester | ./bad-ascii-report.pl > unclean-ascii.rpt real 11m11.804s user 18m2.579s sys 2m25.803s # grep "^--" unclean-ascii.rpt -- some_table 4051021 -- other_table 16 ^^^ Numbers are counts of rows that need cleaning. The entire "rpt" file contains SQL comments "--" and SQL select statements of the form: select * from table where primary_key in (1, 2, 3, 4, ...); The perl script contains a hash that maps table names to primary key column IDs (to pick up when parsing the raw SQL restore "COPY" script). I will need to purge my secret schema stuff from it before sharing it with anyone. My solution is probably not perfect, and probably not optimal, but it is working great so far. I'm almost done cleaning up my database and hope to attempt a real UTF8 restore in the near future. On Thu, Feb 10, 2011 at 1:02 PM, Scott Ribe wrote: > I know that I have at least one instance of a varchar that is not valid > UTF-8, imported from a source with errors (AMA CPT files, actually) before > PG's checking was as stringent as it is today. Can anybody suggest a query to > find such values? > > > -- > Scott Ribe > scott_r...@elevated-dev.com > http://www.elevated-dev.com/ > (303) 722-0567 voice > > > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] finding bogus UTF-8
On Thu, Feb 10, 2011 at 1:02 PM, Scott Ribe wrote: > I know that I have at least one instance of a varchar that is not valid > UTF-8, imported from a source with errors (AMA CPT files, actually) before > PG's checking was as stringent as it is today. Can anybody suggest a query to > find such values? If you know which table and column the data is in, you can also do something like this (I typed this up without checking it; the basic idea is to cast the column to bytea, encode it with the 'escape' method, then grep for backslash escape sequences): select * from bad_table where encode(bad_column::bytea, 'escape') ~ E'\\\\'; -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On 2/10/2011 4:14 PM, Andy Colson wrote: On 2/10/2011 3:38 PM, Royce Ausburn wrote: Hi all, My company is having trouble managing how we upgrade schema changes across many versions of our software. I imagine this is a common problem and there're probably some neat solutions that we don't know about. For the last 10 years we have been writing bash shell scripts essentially numbered in order db0001, db0002, db0003 The number represents the schema version which is recorded in the database and updated by the shell scripts. We have a template that provides all the functionality we need, we just copy the script and fill in the blanks. The schema upgrade scripts are committed to svn along with the software changes, and we have a process when installing the software at a site that runs the scripts on the DB in order before starting up the new version of the software. This has worked really well so far. But we've effectively only had one version of the software in development at any time. We're now in the habit of branching the software to form releases to promote stability when making large changes. The idea is that only really important changes are merged in to the releases. This introduces a bit of a problem when some change needs to be merged from one release to another. The typical problem is that we might have two versions of the software 10.0 at schema version 10057 and 11.0 at 11023 and we need to merge an important bug fix from schema 11023 in to 10.0. The issue is that 11023 might depend upon changes introduced in the schema versions before it. Or 11023 might introduce changes that cause later scripts to break (11000 - 11023) when upgrading from 10.0 to 11.0. One potential solution is to require that schema changes are never merged in to a release, but of course sometimes business requires we do =( I'm really interested to hear how you guys manage schema upgrades in the face of branches and upgrading from many different versions of the database. I've been reading http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ but I have a feeling that this blog post won't address branches. Cheers! --Royce So, 10.0 at 10057. 11.0 at 11023. then 10.1 needs some fixes so db is bumped to 10058. Then, later, you can upgrade 10057 to 11023, but you cant get 10058 to 11023. Humm... maybe you need smarter upgrade scripts? Would having logic in the script help? Something like: if not fieldExists('xyz) then alter table ... add xyz ... Or, maybe your schema numbering system is to broad? Maybe each table could have a version number? Or some kinda flags like: create table dbver(key text); then an update would be named: "add xyz to bob". then the update code: q = select key from dbver where key = 'add xyz to bob'; if q.eof then alter table bob add xyz -Andy D'oh! a bug in my update script: q = select key from dbver where key = 'add xyz to bob'; if q.eof then alter table bob add xyz insert into dbver('add xyz to bob'); How embarrassing :-) -Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
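Written out in PL/pgSQL (a 9.0 DO block), that guarded, re-runnable change might look like the sketch below; the table, column and key names are the placeholders used above, and the column type is an assumption:

    DO $$
    BEGIN
        IF NOT EXISTS (SELECT 1 FROM dbver WHERE key = 'add xyz to bob') THEN
            ALTER TABLE bob ADD COLUMN xyz integer;            -- type is illustrative
            INSERT INTO dbver (key) VALUES ('add xyz to bob');
        END IF;
    END
    $$;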
Re: [GENERAL] Schema version control
We have this kickass solution we built at work called dbsteward that just takes care of all of this for us, automatically. You just give it the new version and the old version and it generates update statements to feed into PG. The reason I'm bringing this up is that we're working to release dbsteward as open source for PGCon. So, if you can wait a bit, you can jump on that train. In response to Andy Colson : > On 2/10/2011 3:38 PM, Royce Ausburn wrote: > > Hi all, > > > > My company is having trouble managing how we upgrade schema changes across > > many versions of our software. I imagine this is a common problem and > > there're probably some neat solutions that we don't know about. > > > > For the last 10 years we have been writing bash shell scripts essentially > > numbered in order db0001, db0002, db0003 The number represents the > > schema version which is recorded in the database and updated by the shell > > scripts. We have a template that provides all the functionality we need, > > we just copy the script and fill in the blanks. The schema upgrade scripts > > are committed to svn along with the software changes, and we have a process > > when installing the software at a site that runs the scripts on the DB in > > order before starting up the new version of the software. > > > > This has worked really well so far. But we've effectively only had one > > version of the software in development at any time. We're now in the habit > > of branching the software to form releases to promote stability when making > > large changes. The idea is that only really important changes are merged > > in to the releases. This introduces a bit of a problem when some change > > needs to be merged from one release to another. The typical problem is > > that we might have two versions of the software 10.0 at schema version > > 10057 and 11.0 at 11023 and we need to merge an important bug fix from > > schema 11023 in to 10.0. The issue is that 11023 might depend upon changes > > introduced in the schema versions before it. Or 11023 might introduce > > changes that cause later scripts to break (11000 - 11023) when upgrading > > from 10.0 to 11.0. > > > > One potential solution is to require that schema changes are never merged > > in to a release, but of course sometimes business requires we do =( > > > > I'm really interested to hear how you guys manage schema upgrades in the > > face of branches and upgrading from many different versions of the database. > > > > I've been reading > > http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ > > but I have a feeling that this blog post won't address branches. > > > > Cheers! > > > > --Royce > > > > > > So, 10.0 at 10057. > 11.0 at 11023. > > then 10.1 needs some fixes so db is bumped to 10058. > > Then, later, you can upgrade 10057 to 11023, but you cant get 10058 to > 11023. > > Humm... maybe you need smarter upgrade scripts? Would having logic in > the script help? Something like: > > if not fieldExists('xyz) then alter table ... add xyz ... > > > > Or, maybe your schema numbering system is to broad? Maybe each table > could have a version number? > > > Or some kinda flags like: > create table dbver(key text); > > then an update would be named: "add xyz to bob". 
> > then the update code: > > q = select key from dbver where key = 'add xyz to bob'; > if q.eof then > alter table bob add xyz > > > -Andy > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On 2/10/2011 4:18 PM, Bill Moran wrote: We have this kickass solution we built at work called dbsteward that just takes care of all of this for us, automatically. You just give it the new version and the old version and it generates update statements to feed into PG. The reason I'm bringing this up is that we're working to release dbsteward as open source for PGCon. So, if you can wait a bit, you can jump on that train. In response to Andy Colson: But... that assumes all updates are DDL only? What if I have a lookup table, and want to add some new entries into it? -Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
Top-posting is frowned upon by some (not me), but since Bill started it... I for one will be waiting to see your dbsteward. How does it compare functionally or stylistically with Ruby's migration tools (which I found to be pretty cool and frustrating all in one go). On 02/10/2011 03:18 PM, Bill Moran wrote: > > We have this kickass solution we built at work called dbsteward that > just takes care of all of this for us, automatically. You just give > it the new version and the old version and it generates update statements > to feed into PG. > > The reason I'm bringing this up is that we're working to release > dbsteward as open source for PGCon. So, if you can wait a bit, you > can jump on that train. > > In response to Andy Colson : > >> On 2/10/2011 3:38 PM, Royce Ausburn wrote: >>> Hi all, >>> >>> My company is having trouble managing how we upgrade schema changes across >>> many versions of our software. I imagine this is a common problem and >>> there're probably some neat solutions that we don't know about. >>> >>> For the last 10 years we have been writing bash shell scripts essentially >>> numbered in order db0001, db0002, db0003 The number represents the >>> schema version which is recorded in the database and updated by the shell >>> scripts. We have a template that provides all the functionality we need, >>> we just copy the script and fill in the blanks. The schema upgrade scripts >>> are committed to svn along with the software changes, and we have a process >>> when installing the software at a site that runs the scripts on the DB in >>> order before starting up the new version of the software. >>> >>> This has worked really well so far. But we've effectively only had one >>> version of the software in development at any time. We're now in the habit >>> of branching the software to form releases to promote stability when making >>> large changes. The idea is that only really important changes are merged >>> in to the releases. This introduces a bit of a problem when some change >>> needs to be merged from one release to another. The typical problem is >>> that we might have two versions of the software 10.0 at schema version >>> 10057 and 11.0 at 11023 and we need to merge an important bug fix from >>> schema 11023 in to 10.0. The issue is that 11023 might depend upon changes >>> introduced in the schema versions before it. Or 11023 might introduce >>> changes that cause later scripts to break (11000 - 11023) when upgrading >>> from 10.0 to 11.0. >>> >>> One potential solution is to require that schema changes are never merged >>> in to a release, but of course sometimes business requires we do =( >>> >>> I'm really interested to hear how you guys manage schema upgrades in the >>> face of branches and upgrading from many different versions of the database. >>> >>> I've been reading >>> http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ >>> but I have a feeling that this blog post won't address branches. >>> >>> Cheers! >>> >>> --Royce >>> >>> >> >> So, 10.0 at 10057. >> 11.0 at 11023. >> >> then 10.1 needs some fixes so db is bumped to 10058. >> >> Then, later, you can upgrade 10057 to 11023, but you cant get 10058 to >> 11023. >> >> Humm... maybe you need smarter upgrade scripts? Would having logic in >> the script help? Something like: >> >> if not fieldExists('xyz) then alter table ... add xyz ... >> >> >> >> Or, maybe your schema numbering system is to broad? Maybe each table >> could have a version number? 
>> >> >> Or some kinda flags like: >> create table dbver(key text); >> >> then an update would be named: "add xyz to bob". >> >> then the update code: >> >> q = select key from dbver where key = 'add xyz to bob'; >> if q.eof then >> alter table bob add xyz >> >> >> -Andy >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] finding bogus UTF-8
> > If you are interested, I can email to you the C and Perl source. > > It runs like this: > > # time pg_restore /db-dumps/some_ascii_pgdump.bin | ./ascii-tester | > ./bad-ascii-report.pl > unclean-ascii.rpt http://www.ecoligames.com/~djenkins/pgsql/ Disclaimer: I offer NO warranty. Use at your own risk. Code does minimal error checking (its a hack / tool for manual use, not reliable production use). C code compiles cleanly with gcc. Perl code uses no libraries (just a STDIN -> STDOUT processor). This code should run damn near anywhere. The code will stay on my web server until I forget about it and re-org stuff in a few weeks, so grab it while you can. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
In response to Andy Colson : > On 2/10/2011 4:18 PM, Bill Moran wrote: > > > > We have this kickass solution we built at work called dbsteward that > > just takes care of all of this for us, automatically. You just give > > it the new version and the old version and it generates update statements > > to feed into PG. > > > > The reason I'm bringing this up is that we're working to release > > dbsteward as open source for PGCon. So, if you can wait a bit, you > > can jump on that train. > > > > In response to Andy Colson: > > > > But... that assumes all updates are DDL only? What if I have a lookup > table, and want to add some new entries into it? It has provisions for maintaining static data as well. We have a bunch of lookup tables (too many, in my opinion) and it does an excellent job of maintaining them. They just need to have a primary key, but that's not usually a problem with lookup tables. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
Royce Ausburn wrote on 10.02.2011 22:38: I'm really interested to hear how you guys manage schema upgrades in the face of branches and upgrading from many different versions of the database. We are quite happy with Liquibase. You can simply run it against a database and tell it to migrate it to "Version x.y". As it keeps track of all changes applied, it automatically knows what to do. It can handle static data as well as stored procedures and any custom SQL. Regards Thomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
In response to Rob Sargent : > Top-posting is frowned upon by some (not me), but since Bill started it... Oops ... the weird thing is that I'm usually really anal about not top-posting ... > I for one will be waiting to see your dbsteward. How does it compare > functionally or stylistically with Ruby's migration tools (which I found > to be pretty cool and frustrating all in one go). I'm not familiar with Ruby's migration tools, so I can't say much. The overview: You store your schema and data as XML (this is easy to migrate to, because it includes a tool that makes the XML from a live database) Keep your XML schema files in some RCS. When it's time for a new deployment, you run the dbsteward tool against the schema XML and it turns it into DDL and DML. When it's time for an upgrade, you run the dbsteward tool against two schema XML files, and it calculates what has changed and generates the appropriate DDL and DML to upgrade. So ... you know, however that compares with the Ruby stuff is how it does. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
Bill Moran wrote on 10.02.2011 23:59: The overview: You store your schema and data as XML (this is easy to migrate to, because it includes a tool that makes the XML from a live database) Keep your XML schema files in some RCS. When it's time for a new deployment, you run the dbsteward tool against the schema XML and it turns it into DDL and DML. When it's time for an upgrade, you run the dbsteward tool against two schema XML files, and it calculates what has changed and generates the appropriate DDL and DML to upgrade. This very much sounds like Liquibase. Do you happen to know any differences? Regards Thomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On 02/10/2011 03:59 PM, Bill Moran wrote: > In response to Rob Sargent : >> I for one will be waiting to see your dbsteward. How does it compare >> functionally or stylistically with Ruby's migration tools (which I found >> to be pretty cool and frustrating all in one go). > > I'm not familiar with Ruby's migration tools, so I can't say much. > > The overview: > You store your schema and data as XML (this is easy to migrate to, because > it includes a tool that makes the XML from a live database) > Keep your XML schema files in some RCS. > When it's time for a new deployment, you run the dbsteward tool against > the schema XML and it turns it into DDL and DML. > When it's time for an upgrade, you run the dbsteward tool against two > schema XML files, and it calculates what has changed and generates the > appropriate DDL and DML to upgrade. > > So ... you know, however that compares with the Ruby stuff is how it > does. > Now at the bottom :) It's been a couple years since I played with Ruby ActiveRecord but it's (of course) radically different from what you describe. The DDL is in the Ruby code and naturally the code is in RCS. So a revision is a new instance of ActiveRecord (iirc) which does the change(s) (create table ttt, alter table vvv etc). Maybe skip a rev. Rollback to a rev is definitely there because one writes the undo for each new revision. This includes manipulating the data of course, so there are limitations. I personally am leery of the 'make the prod match the dev db' approach. Who knows what extras lurk in the depths. I think one should be able to make the dev db from scratch and write the necessary scripts to change to (and from if possible) each revision. Apply to prod when tested. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On Thu, 10 Feb 2011 17:59:30 -0500, Bill Moran wrote: The overview: You store your schema and data as XML (this is easy to migrate to, because it includes a tool that makes the XML from a live database) We're doing a similar thing here except we're a Lisp shop so our schema is defined as a set of "defentities" and we can migrate from one version to another using a corresponding set of "defmaps". Keep your XML schema files in some RCS. When it's time for a new deployment, you run the dbsteward tool against the schema XML and it turns it into DDL and DML. When it's time for an upgrade, you run the dbsteward tool against two schema XML files, and it calculates what has changed and generates the appropriate DDL and DML to upgrade. This sounds pretty cool. Ours doesn't do that yet but that's next on my TODO list. -- Andy Chambers -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
In response to Thomas Kellerer : > Bill Moran wrote on 10.02.2011 23:59: > > The overview: > > You store your schema and data as XML (this is easy to migrate to, because > > it includes a tool that makes the XML from a live database) > > Keep your XML schema files in some RCS. > > When it's time for a new deployment, you run the dbsteward tool against > > the schema XML and it turns it into DDL and DML. > > When it's time for an upgrade, you run the dbsteward tool against two > > schema XML files, and it calculates what has changed and generates the > > appropriate DDL and DML to upgrade. > > This very much sounds like Liquibase. Do you happen to know any differences? Hrm ... before we started writing dbsteward, we looked around to see if something already existed and didn't find Liquibase. This is the first I've heard of it. I'm thinking it was some time in 2008, and according to their changelog, Liquibase was around at that time. I wonder how we missed it ... Anyway ... based on nothing more than a quick scan of their quickstart page, here are the differences I see: * Liquibase is dependent on you creating "changesets". I'm sure this works, but we took a different approach with dbsteward. dbsteward expects you to maintain XML files that represent the entire database, then dbsteward does the work of figuring out what changed. Our opinion was that svn already does the work of tracking changes, why reinvent the wheel. * Looks like liquibase requires you to talk to the database to push the changes? dbsteward outputs a DDL/DML file that you can push in whatever way is best. This is important to us because we use Slony, and DDL changes have to be submitted through EXECUTE SCRIPT() * dbsteward has built-in Slony support (i.e. it will make slony configs as well as slony upgrade scripts in addition to DDL/DML) * Does liquibase support UDFs? dbsteward does. * liquibase has a lot more supported platforms at this time. dbsteward only supports PostgreSQL and MSSQL (because that's all that we needed) but I expect that other support will come quickly once we release it. * Does liquibase support things like multi-column indexes and multi- column primary keys? dbsteward does. I don't think I should go on and on, as I could ask a lot of questions about what liquibase does, and I simply don't have the time right now to research it, or ask all those question ;) Anyway ... sorry for the teaser on this, but we're trying to get through all the hoops the company is requiring us to do to release it, and we think we're on track to be ready by PGCon, so there'll be a website up as soon as we can get it. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
In response to Rob Sargent : > > On 02/10/2011 03:59 PM, Bill Moran wrote: > > In response to Rob Sargent : > >> I for one will be waiting to see your dbsteward. How does it compare > >> functionally or stylistically with Ruby's migration tools (which I found > >> to be pretty cool and frustrating all in one go). > > > > I'm not familiar with Ruby's migration tools, so I can't say much. > > > > The overview: > > You store your schema and data as XML (this is easy to migrate to, because > > it includes a tool that makes the XML from a live database) > > Keep your XML schema files in some RCS. > > When it's time for a new deployment, you run the dbsteward tool against > > the schema XML and it turns it into DDL and DML. > > When it's time for an upgrade, you run the dbsteward tool against two > > schema XML files, and it calculates what has changed and generates the > > appropriate DDL and DML to upgrade. > > > > So ... you know, however that compares with the Ruby stuff is how it > > does. > > > Now at the bottom :) > > It's been a couple years since I played with Ruby ActiveRecord but it's > (of course) radically than what you describe. The ddl is in the ruby > code and naturally the code is in RCS. So a revision is a new instance > of ActiveRecord (iirc) which does the change(s) (create table ttt, alter > table vvv etc). Maybe skip a rev. Rollback to a rev is definitely > there because one writes the undo for each new revision. This include > manipulating the data of course, so there are limitations. dbsteward can do downgrades ... you just feed it the old schema and the new schema in reverse of how you'd do an upgrade ;) Oh, also, it allows us to do installation-specific overrides. We use this ONLY for DML for lookup lists where some clients have slightly different names for things than others. In theory, it could do DDL overrides as well, but we decided on a policy of not utilizing that because we wanted the schemas to be consistent on all our installs. > I personally am leary of the 'make the prod match the dev db' approach. > Who knows what extras lurk in the depths. I think one should be able to > make the dev db from scratch and write the necessary scripts to change > to (and from if possible) each revision. Apply to prod when tested. dbsteward allows us to do all this. A developer can make a change, rebuild a test database from their change to make sure it works, then test the upgrade process as well, all before even checking the code in. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
Bill Moran wrote on 11.02.2011 00:37: Anyway ... based on nothing more than a quick scan of their quickstart page, here are the differences I see: * Liquibase is dependent on you creating "changesets". I'm sure this works, but we took a different approach with dbsteward. dbsteward expects you to maintain XML files that represent the entire database, then dbsteward does the work of figuring out what changed. Our opinion was that svn already does the work of tracking changes, why reinvent the wheel. That sounds like a very nice feature. * Looks like liquibase requires you to talk to the database to push the changes? dbsteward outputs a DDL/DML file that you can push in whatever way is best. This is important to us because we use Slony, and DDL changes have to be submitted through EXECUTE SCRIPT() No, Liquibase can also emit the SQL that it would execute. * dbsteward has built-in Slony support (i.e. it will make slony configs as well as slony upgrade scripts in addition to DDL/DML) * liquibase has a lot more supported platforms at this time. dbsteward only supports PostgreSQL and MSSQL (because that's all that we needed) but I expect that other support will come quickly once we release it. * Does liquibase support things like multi-column indexes and multi-column primary keys? dbsteward does. Yes, without problems (including of course the necessary foreign keys) Anyway ... sorry for the teaser on this, but we're trying to get through all the hoops the company is requiring us to do to release it, and we think we're on track to be ready by PGCon, so there'll be a website up as soon as we can get it. Thanks for the feedback, I would really like to see it. The approach of not recording the changes but simply letting the software find them seems like a very nifty feature. I wonder how you detect renaming a table or a column? Which programming language is dbsteward based on? Regards Thomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On 02/10/2011 02:38 PM, Royce Ausburn wrote: > Hi all, > > My company is having trouble managing how we upgrade schema changes across > many versions of our software. I imagine this is a common problem and > there're probably some neat solutions that we don't know about. > > For the last 10 years we have been writing bash shell scripts essentially > numbered in order db0001, db0002, db0003. The number represents the schema > version which is recorded in the database and updated by the shell scripts. > We have a template that provides all the functionality we need, we just copy > the script and fill in the blanks. The schema upgrade scripts are committed > to svn along with the software changes, and we have a process when installing > the software at a site that runs the scripts on the DB in order before > starting up the new version of the software. > > This has worked really well so far. But we've effectively only had one > version of the software in development at any time. We're now in the habit > of branching the software to form releases to promote stability when making > large changes. The idea is that only really important changes are merged into > the releases. This introduces a bit of a problem when some change needs > to be merged from one release to another. The typical problem is that we > might have two versions of the software: 10.0 at schema version 10057 and 11.0 > at 11023, and we need to merge an important bug fix from schema 11023 into > 10.0. The issue is that 11023 might depend upon changes introduced in the > schema versions before it. Or 11023 might introduce changes that cause later > scripts to break (11000 - 11023) when upgrading from 10.0 to 11.0. > > One potential solution is to require that schema changes are never merged into > a release, but of course sometimes business requires we do =( > > I'm really interested to hear how you guys manage schema upgrades in the face > of branches and upgrading from many different versions of the database. > > I've been reading > http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ > but I have a feeling that this blog post won't address branches. > > Cheers! > > --Royce > > Don't the bash scripts get checked in to .../perforce/cvs/svn/git/...? Aren't they part of the resources of the project(s)? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
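(The "schema version recorded in the database" scheme described above usually boils down to SQL along these lines; this is only an illustrative sketch, and the table name "customer", the new column, and the version numbers are made up here, not taken from Royce's actual scripts.)

-- One-time setup: a single-row table holding the current schema version.
CREATE TABLE schema_version (version integer NOT NULL);
INSERT INTO schema_version VALUES (10056);

-- Each numbered upgrade script then applies its changes and bumps the
-- version in the same transaction; a hypothetical db10057 might run:
BEGIN;
ALTER TABLE customer ADD COLUMN archived boolean NOT NULL DEFAULT false;
UPDATE schema_version SET version = 10057 WHERE version = 10056;
COMMIT;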
Re: [GENERAL] Schema version control
On 02/10/2011 04:44 PM, Bill Moran wrote: > In response to Rob Sargent : >> >> On 02/10/2011 03:59 PM, Bill Moran wrote: >>> In response to Rob Sargent : I for one will be waiting to see your dbsteward. How does it compare functionally or stylistically with Ruby's migration tools (which I found to be pretty cool and frustrating all in one go). >>> >>> I'm not familiar with Ruby's migration tools, so I can't say much. >>> >>> The overview: >>> You store your schema and data as XML (this is easy to migrate to, because >>> it includes a tool that makes the XML from a live database) >>> Keep your XML schema files in some RCS. >>> When it's time for a new deployment, you run the dbsteward tool against >>> the schema XML and it turns it into DDL and DML. >>> When it's time for an upgrade, you run the dbsteward tool against two >>> schema XML files, and it calculates what has changed and generates the >>> appropriate DDL and DML to upgrade. >>> >>> So ... you know, however that compares with the Ruby stuff is how it >>> does. >>> >> Now at the bottom :) >> >> It's been a couple years since I played with Ruby ActiveRecord but it's >> (of course) radically than what you describe. The ddl is in the ruby >> code and naturally the code is in RCS. So a revision is a new instance >> of ActiveRecord (iirc) which does the change(s) (create table ttt, alter >> table vvv etc). Maybe skip a rev. Rollback to a rev is definitely >> there because one writes the undo for each new revision. This include >> manipulating the data of course, so there are limitations. > > dbsteward can do downgrades ... you just feed it the old schema and > the new schema in reverse of how you'd do an upgrade ;) > > Oh, also, it allows us to do installation-specific overrides. We use > this ONLY for DML for lookup lists where some clients have slightly > different names for things than others. In theory, it could do DDL > overrides as well, but we decided on a policy of not utilizing that > because we wanted the schemas to be consistent on all our installs. > >> I personally am leary of the 'make the prod match the dev db' approach. >> Who knows what extras lurk in the depths. I think one should be able to >> make the dev db from scratch and write the necessary scripts to change >> to (and from if possible) each revision. Apply to prod when tested. > > dbsteward allows us to do all this. A developer can make a change, > rebuild a test database from their change to make sure it works, then > test the upgrade process as well, all before even checking the code in. > Good work. Will look forward to it. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] finding bogus UTF-8
On Thu, Feb 10, 2011 at 2:02 PM, Scott Ribe wrote: > I know that I have at least one instance of a varchar that is not valid > UTF-8, imported from a source with errors (AMA CPT files, actually) before > PG's checking was as stringent as it is today. Can anybody suggest a query > to find such values? > I hit this problem too, if I remember correctly when trying to upgrade a database from 8.3 to 8.4. I ended up aborting the upgrade, since the upgrade documentation made no mention of this and I didn't have time to dig into it at the time. A tool to find all instances of this would be very helpful. -- Glenn Maynard
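One way to hunt for such rows from SQL is a small PL/pgSQL helper that attempts a UTF-8 decode and flags failures. This is only a sketch: the table ("mytable"), key ("id") and column ("txt") names are placeholders, and it assumes convert_to(txt, 'SQL_ASCII') hands back the stored bytes unmodified so that convert_from() can re-validate them.

-- Returns false if the byte string is not valid UTF-8.
CREATE OR REPLACE FUNCTION is_valid_utf8(b bytea) RETURNS boolean AS $$
BEGIN
    PERFORM convert_from(b, 'UTF8');   -- raises an error on invalid UTF-8
    RETURN true;
EXCEPTION WHEN others THEN
    RETURN false;
END;
$$ LANGUAGE plpgsql;

-- List the rows whose column does not decode as UTF-8.
SELECT id
FROM   mytable
WHERE  NOT is_valid_utf8(convert_to(txt, 'SQL_ASCII'));

Running it first against a row already known to be bad would confirm whether the assumption about convert_to() holds before scanning the whole table.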
[GENERAL] plsql question
I am trying to write a plsql routine that will delete a range of characters based on their octal or hexadecimal values. Something like the 'tr' shell command will do: cat file| tr -d ['\177'-'\377'] Can't seem to figure this one out. Pointers would be appreciated. -- Until later, Geoffrey "I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them." - Thomas Jefferson -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
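In SQL the closest equivalent is probably regexp_replace() with a character-class range. A sketch only, with "mytable" and "txt" as placeholder names; note that in a multibyte database the range matches code points U+007F through U+00FF rather than raw bytes, so it mirrors the tr behaviour only for single-byte data.

-- Remove every character in the octal range \177-\377 (hex 7F-FF).
UPDATE mytable
SET    txt = regexp_replace(txt, E'[\\x7f-\\xff]', '', 'g');

-- Or preview the cleaned value without changing anything:
SELECT regexp_replace(txt, E'[\\x7f-\\xff]', '', 'g') FROM mytable;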
Re: [GENERAL] Schema version control
On 11/02/2011, at 9:59 AM, Thomas Kellerer wrote: > Royce Ausburn wrote on 10.02.2011 22:38: >> I'm really interested to hear how you guys manage schema upgrades in >> the face of branches and upgrading from many different versions of >> the database. > > We are quite happy with Liquibase. You can simply run it against a database > and tell it to migrate it to "Version x.y". > > As it keeps track of all changes applied, it automatically knows what to do. > > It can handle static data as well as stored procedures and any custom SQL. Thanks! Liquibase looks pretty neat. We'll have to check it out. --Royce -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
> > So, 10.0 at 10057. > 11.0 at 11023. > > then 10.1 needs some fixes so db is bumped to 10058. > > Then, later, you can upgrade 10057 to 11023, but you can't get 10058 to 11023. > > Humm... maybe you need smarter upgrade scripts? Would having logic in the > script help? Something like: > > if not fieldExists('xyz') then alter table ... add xyz ... > > > > Or, maybe your schema numbering system is too broad? Maybe each table could > have a version number? > > > Or some kinda flags like: > create table dbver(key text); > > then an update would be named: "add xyz to bob". > > then the update code: > > q = select key from dbver where key = 'add xyz to bob'; > if q.eof then > alter table bob add xyz > This is effectively the approach we've been working with so far, but it isn't great. The issue is that you need to be really aware of what changes might or might not have been in the previous databases... This can be hard. There's also the problem of updating data, etc. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
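For what it's worth, the fieldExists() idea sketched above can be written directly in PL/pgSQL against information_schema; a rough sketch using the example names "bob" and "xyz" from the message (the 'public' schema and the integer column type are assumptions here).

DO $$
BEGIN
    IF NOT EXISTS (
        SELECT 1
        FROM   information_schema.columns
        WHERE  table_schema = 'public'
          AND  table_name   = 'bob'
          AND  column_name  = 'xyz'
    ) THEN
        ALTER TABLE bob ADD COLUMN xyz integer;  -- column type is a guess
    END IF;
END;
$$;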
Re: [GENERAL] Schema version control
> > Don't the bash scripts get checked in to .../perforce/cvs/svn/git/...? > Aren't they part of the resources of the project(s)? Yep - they absolutely are. The issue is that there're multiple branches *potentially* having new scripts committed. Fortunately it's rare as the release branches seldom require schema changes, but it's an important enough problem to need a better solution. --Royce -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On Thu, Feb 10, 2011 at 6:44 PM, Bill Moran wrote: > dbsteward can do downgrades ... you just feed it the old schema and > the new schema in reverse of how you'd do an upgrade ;) > > Oh, also, it allows us to do installation-specific overrides. We use > this ONLY for DML for lookup lists where some clients have slightly > different names for things than others. In theory, it could do DDL > overrides as well, but we decided on a policy of not utilizing that > because we wanted the schemas to be consistent on all our installs. > What about upgrades that can't be derived directly from an inspection of the schema? Some examples: - Adding a NOT NULL constraint (without adding a DEFAULT). You often want to precede this with filling in any existing NULL values, so the new constraint doesn't fail. - Updating triggers, functions and their effects. For example, when I have an FTS index with a trigger to update an index column, and I change the underlying trigger, I often do something like "UPDATE table SET column = column", to cause all of the update triggers to fire and recalculate the index columns. - Creating a new column based on an old one, and removing the old one; eg. add a column "n", run "UPDATE ... SET n = i*j * 2", and then drop the old columns "i" and "j". - Updating data from an external source, such as ORM model code; for example, if you have a table representing external files, an update may want to calculate and update the SHA-1 of each file. - For efficiency, dropping a specific index while making a large update, and then recreating the index. In my experience, while generating schema updates automatically is handy, it tends to make nontrivial database updates more complicated. These sorts of things happen often and are an integral part of a database update, so I'm just curious how/if you deal with them. I've used Ruby's migrations, and for my Django databases I use my own migration system which is based in principle off of it: create scripts to migrate the database from version X to X+1 and X-1, and upgrade or downgrade by running the appropriate scripts in sequence. It's not ideal, since it can't generate a database at a specific version directly; it always has to run through the entire sequence of migrations to the version you want, and the migrations accumulate. However, it can handle whatever arbitrary steps are needed to update a database, and I don't need to test updates from every version to every other version. -- Glenn Maynard
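The first bullet above (NOT NULL without a DEFAULT) illustrates why diffing the two schema states alone is not enough: the backfill has to run in between them. A minimal sketch, with "mytable", "c" and the fill-in value 0 as placeholders.

BEGIN;
UPDATE mytable SET c = 0 WHERE c IS NULL;         -- backfill existing NULLs first
ALTER TABLE mytable ALTER COLUMN c SET NOT NULL;  -- now the constraint can be added
COMMIT;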
Re: [GENERAL] Schema version control
In response to Glenn Maynard : > On Thu, Feb 10, 2011 at 6:44 PM, Bill Moran wrote: > > > dbsteward can do downgrades ... you just feed it the old schema and > > the new schema in reverse of how you'd do an upgrade ;) > > > > Oh, also, it allows us to do installation-specific overrides. We use > > this ONLY for DML for lookup lists where some clients have slightly > > different names for things than others. In theory, it could do DDL > > overrides as well, but we decided on a policy of not utilizing that > > because we wanted the schemas to be consistent on all our installs. > > > > What about upgrades that can't be derived directly from an inspection of the > schema? Some examples: > > - Adding a NOT NULL constraint (without adding a DEFAULT). You often want > to precede this with filling in any existing NULL values, so the new > constraint doesn't fail. > - Updating triggers, functions and their effects. For example, when I have > an FTS index with a trigger to update an index column, and I change the > underlying trigger, I often do something like "UPDATE table SET column = > column", to cause all of the update triggers to fire and recalculate the > index columns. > - Creating a new column based on an old one, and removing the old one; eg. > add a column "n", run "UPDATE ... SET n = i*j * 2", and then drop the old > columns "i" and "j". > - Updating data from an external source, such as ORM model code; for > example, if you have a table representing external files, an update may want > to calculate and update the SHA-1 of each file. > - For efficiency, dropping a specific index while making a large update, and > then recreating the index. > > In my experience, while generating schema updates automatically is handy, it > tends to make nontrivial database updates more complicated. These sorts of > things happen often and are an integral part of a database update, so I'm > just curious how/if you deal with them. > > I've used Ruby's migrations, and for my Django databases I use my own > migration system which is based in principle off of it: create scripts to > migrate the database from version X to X+1 and X-1, and upgrade or downgrade > by running the appropriate scripts in sequence. > > It's not ideal, since it can't generate a database at a specific version > directly; it always has to run through the entire sequence of migrations to > the version you want, and the migrations accumulate. However, it can handle > whatever arbitrary steps are needed to update a database, and I don't need > to test updates from every version to every other version. You're correct (based on our experience over the past few years). The big caveat is that 99.9% of the database changes don't fall into those "nontrivial" categories, and dbsteward makes those 99.9% of the changes easy to do, reliable to reproduce, and easy to track. We've added some stuff to handle the other .1% as well, like hooks where you can put arbitrary SQL strings to be run before or after the remainder of the automatic stuff is done. We probably haven't seen every circumstance that needs special handling, but we've already struggled through a bunch. All this is part of the reason we're pushing to get this stuff open-sourced. We feel like we've got something that's pretty far along, and we feel that community involvement will help enhance things.
-- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On Fri, Feb 11, 2011 at 12:16 AM, Bill Moran wrote: > The big caveat is that 99.9% of the database changes don't fall into those > "nontrivial" categories, and dbsteward makes those 99.9% of the changes > easy to do, reliable to reproduce, and easy to track. > My experience is maybe more like 95% than 99.9%, for what it's worth; they're the exception, but not rare. We've added some stuff to handle the other .1% as well, like > > hooks where you can put arbitrary SQL strings to be run > before or after the remainder of the automatic stuff is done. We probably > haven't seen every circumstance that needs special handling, but we've > already struggled through a bunch. > Here's a fairly common example, in the abstract: version 1 has two columns, i and j; version 2 has one column, k, where k = i + j; and version 3 has one column, x, where x = k * 2. Not only is updating from 1 to 2 tricky ("k = i + j" comes after the adding of "k" but before the removal of i and j; it's neither a "before" nor an "after"), but updating directly from 1 to 3 without first migrating to 2 is extremely hard. I suspect you'd need to snapshot the schema at each version where these are needed to update incrementally, rather than always trying to convert directly to the current version--maybe you already do that. Anyhow, just some thoughts based on my own experience with database updates--good luck. -- Glenn Maynard
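Spelled out, the 1 -> 2 and 2 -> 3 steps from that example each interleave DDL and DML, which is what makes a pure "before/after" hook awkward. A sketch only, assuming a table "t" and integer columns; the names come from the example above, not from any real schema.

-- Version 1 -> 2: k replaces i and j.
BEGIN;
ALTER TABLE t ADD COLUMN k integer;
UPDATE t SET k = i + j;   -- must run after adding k and before dropping i and j
ALTER TABLE t DROP COLUMN i;
ALTER TABLE t DROP COLUMN j;
COMMIT;

-- Version 2 -> 3: x replaces k.
BEGIN;
ALTER TABLE t ADD COLUMN x integer;
UPDATE t SET x = k * 2;
ALTER TABLE t DROP COLUMN k;
COMMIT;

Jumping straight from 1 to 3 would instead need UPDATE t SET x = (i + j) * 2, which is exactly the piece of knowledge a schema differ cannot recover from the two end states.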
Re: [GENERAL] Schema version control
On 10 Feb 2011, at 23:59, Bill Moran wrote: > The overview: > You store your schema and data as XML (this is easy to migrate to, because > it includes a tool that makes the XML from a live database) > Keep your XML schema files in some RCS. That reminds me of something I've been wondering about - How well do modern RCSs deal with structured data formats (like XML)? It would appear that most of them still use diff, which is line-based with limited context tracking, to determine change-sets. Is that combination guaranteed to result in valid XML if you merge revisions that are far enough apart? Or are there RCSs around that know about XML format (and possibly other structured formats) and handle it differently? I've heavily used RCSs (mostly Subversion) with, for example, HTML, and merge conflicts haven't been unusual. It doesn't help that Subversion's diff is a bit simplistic about white-space; I'm sure some of those conflicts were quite unnecessary. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general