[GENERAL] Multithreaded query onto 4 postgresql instances
Here is my probably uncommon situation. I have installed 4 different instances of postgresql-9.0.2 on the same machine, on ports 5433, 5434, 5435 and 5436. Across these instances I have split a huge database, dividing it by date (1995 to 1998 on 5433, 1999 to 2002 on 5434, and so on...). I have then developed a C function using libpq which creates 4 threads, each of which queries one quarter of the database. After that I merge the results into one single response. My function works fine, but I need to include it inside a PostgreSQL instance in order to launch it as a normal SQL query (SELECT myfunc(...);). Why I have to do something tricky like this would take too long to explain... I have read the documentation at http://www.postgresql.org/docs/9.0/static/xfunc-sql.html#XFUNC-SQL-FUNCTIONS-RETURNING-SET, but I'm pretty confused and I don't know if this is the right way to accomplish my goal. Have you got any ideas or suggestions? Thanks a lot! -- Alessandro Candini MEEO S.r.l. Via Saragat 9 I-44122 Ferrara, Italy Tel: +39 0532 1861501 Fax: +39 0532 1861637 http://www.meeo.it -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
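The documentation page linked above describes how such a function is declared at the SQL level. As a hedged sketch only (the library name, symbol and result columns below are assumptions, not the actual code), a C function compiled into a shared library can be exposed as a set-returning function like this:

    -- Hypothetical wrapper: the argument list, OUT columns and library name are illustrative.
    CREATE OR REPLACE FUNCTION myfunc(from_year integer, to_year integer)
    RETURNS TABLE (obs_time timestamptz, value double precision)
    AS 'parallel_query', 'myfunc'   -- shared object parallel_query.so, C symbol myfunc
    LANGUAGE C STRICT;

    -- Callable like any other set-returning function:
    SELECT * FROM myfunc(1995, 2002);

On the C side the function would have to follow the set-returning-function protocol (the SRF_* macros from funcapi.h), returning the merged rows one at a time rather than as a single buffer.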
[GENERAL] partitioning and dynamic query creation
Hello list, I'm currently thinking about a possibility to create a dynamic insert statement inside a trigger function to put the data in the correct partition. What am I talking about? I want to put data, depending on a timestamp column ("datetime"), into separate partitions. Therefore I created the partitions, checks and a trigger function with the following code: "" ... date_part=''; date_part = to_char(NEW.datetime,'YYYY') || to_char(NEW.datetime,'MM'); tablename = 'table_' || date_part; RAISE NOTICE 'target table: %', tablename; EXECUTE 'insert into ' || tablename::regclass || ' values (NEW.*);'; --IF ( DATE (NEW.datetime) >= DATE '2010-11-01' AND -- DATE (NEW.datetime) < DATE '2010-12-01' ) THEN --INSERT INTO tab_tour201011 VALUES (NEW.*); --ELSIF ( DATE (NEW.datetime) >= DATE '2010-12-01' AND --DATE (NEW.datetime) < DATE '2011-01-01' ) THEN --INSERT INTO tab_tour201012 VALUES (NEW.*); --ELSIF ( DATE (NEW.datetime) >= DATE '2011-01-01' AND --DATE (NEW.datetime) < DATE '2011-02-01' ) THEN --INSERT INTO tab_tour201101 VALUES (NEW.*); --ELSIF ( DATE (NEW.datetime) >= DATE '2011-02-01' AND --DATE (NEW.datetime) < DATE '2011-03-01' ) THEN --INSERT INTO tab_tour201102 VALUES (NEW.*); ... "" The above code throws the following error while trying to insert data: "" NOTICE: target table: table_201102 ERROR: missing FROM-clause entry for table "new" LINE 1: insert into table_201102 values (NEW.*); ^ QUERY: insert into table_201102 values (NEW.*); CONTEXT: PL/pgSQL function "insert_trigger" line 10 at EXECUTE statement "" O.K., most probably this is caused by the fact that the statement string includes the characters NEW, but not the values... or what?! The commented lines work as expected and I think this is the common way of handling partitions. Now my question: is it possible at all to create the insert statement on the fly, to avoid modifying the trigger function each time a new partition has been added? Any help appreciated ::GERD:: -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
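One commonly used way around this error (shown here only as a sketch, reusing the tablename variable the trigger already computes) is to pass the row to EXECUTE as a parameter with USING, available since PostgreSQL 8.4, instead of embedding the identifier NEW in the query string:

    -- Sketch: NEW is passed as $1, so the dynamic statement never has to name it.
    EXECUTE 'INSERT INTO ' || quote_ident(tablename) || ' SELECT ($1).*'
    USING NEW;
    RETURN NULL;  -- assuming the usual BEFORE INSERT partitioning trigger that suppresses the row on the parent table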
[GENERAL] GUC configuration
Hi, I am an MSc student in the department of Informatics and Telecommunications of the University of Athens and as part of my thesis I am examining some new cost models for DB optimizers. I have successfully changed the optimizer of PostgreSQL in order to implement these models, but I have stumbled upon one small detail: until now I have used some hardcoded values in my code which I would like to make accessible through GUC. After much googling, the only relevant pages I have found are about configuring existing PostgreSQL variables. Can anybody please provide some help? How can I add some new configuration variables using GUC? Thanks in advance, Thanos Papapetrou Θάνος Παπαπέτρου You insist that there is something that a machine can't do. If you will tell me precisely what it is that a machine cannot do, then I can always make a machine which will do just that. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] 9.0.X FOR UPDATE|SHARE on Sub-Query Causes "cannot extract system attribute from virtual tuple" if Sub-Query Returns Records (BUG)
Tom, From your commit notes: "This wasn't a problem before 9.0 because we didn't support FOR UPDATE below the top query level..." FWIW I had been using a sub-query FOR UPDATE in one of my key queries (one that was called multiple times per second) and relied upon the FOR UPDATE to avoid having the same record "dispatched" multiple times. It worked just fine in 8.2.X and 8.4.X - supported or not. David J -Original Message- From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Tom Lane Sent: Wednesday, February 09, 2011 11:37 PM To: David Johnston Cc: pgsql-general@postgresql.org Subject: Re: [GENERAL] 9.0.X FOR UPDATE|SHARE on Sub-Query Causes "cannot extract system attribute from virtual tuple" if Sub-Query Returns Records (BUG) "David Johnston" writes: > More simply if you run any query of the form: > SELECT subquerycolumn > FROM ( > SELECT subquerycolumn FROM table WHERE [condition] FOR UPDATE -- > WHERE is optional but obviously useful; FOR SHARE also causes this > behavior > ) intermediate > The error "cannot extract system attribute from virtual tuple" is > thrown iff the sub-query returns one or more records. Fixed, thanks for the report! http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=d5478c3391f8f1a243abbc3d9253aac3d6d3538e regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Backup/Restore Needed for Upgrade from 9.0beta4?
On Tuesday, February 08, 2011 10:41:15 am Lee Hughes wrote: > From section 15.4 of the manual: > > "If you are upgrading from PostgreSQL "9.0.x", the new version can use > your current data files so you should skip the backup and restore steps" > > Is 9.0beta4 considered a 9.0.x version, or do I need to backup/restore > when upgrading from that version? > > Thanks much- > > Lee The announcement for Postgres 9.0rc1 indicated that a dump and restore is needed when migrating from the betas, so I take this to mean yes, you do need to dump/restore for what you are doing. http://www.postgresql.org/about/news.1230 -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Trigger problem, record "new" is not assigned yet
Hello. I'm probably making some very basic error here, but I get ERROR: record "new" is not assigned yet The tuple structure of a not-yet-assigned record is indeterminate. when I try this small example create table foo(x int); create or replace function trigger_foo() returns trigger language plpgsql as $$ declare v_i integer; begin select count(1) into v_i from foo; if new.x >18 then raise exception 'error'; else return null; end if; end; $$; CREATE TRIGGER trigger_foo AFTER INSERT ON foo for each row EXECUTE PROCEDURE trigger_foo(); insert into foo (x) values (15); And the table foo only has 14 rows right now. Can someone help me spot the error? :-) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigger problem, record "new" is not assigned yet
On Thu, Feb 10, 2011 at 9:29 AM, A B wrote: > Can someone help me spot the error? :-) > use a BEFORE INSERT trigger? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigger problem, record "new" is not assigned yet
Thanks for the suggestion, but CREATE TRIGGER trigger_foo BEFORE INSERT ON foo for each row EXECUTE PROCEDURE trigger_foo(); gives me the same error. 2011/2/10 Vick Khera : > On Thu, Feb 10, 2011 at 9:29 AM, A B wrote: >> Can someone help me spot the error? :-) >> > > use a BEFORE INSERT trigger? > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigger problem, record "new" is not assigned yet
On Thursday, February 10, 2011 6:29:58 am A B wrote: > Hello. > > I'm probably doing some very basic error here, but I get > > ERROR: record "new" is not assigned yet > The tuple structure of a not-yet-assigned record is indeterminate. > > when I try this small example > > create table foo(x int); > > create or replace function trigger_foo() returns trigger language plpgsql > as $$ declare v_i integer; > begin > select count(1) into v_i from foo; > if new.x >18 then > raise exception 'error'; > else > return null; > end if; > end; $$; > > CREATE TRIGGER trigger_foo AFTER INSERT ON foo for each row EXECUTE > PROCEDURE trigger_foo(); > > insert into foo (x) values (15); > > > And the table foo only have 14 lines right now. > > Can someone help me spot the error? :-) The above works here, Postgres 9.0.3. Was that the complete error message? Is there more than one foo across the schemas? -- Adrian Klaver adrian.kla...@gmail.com -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigger problem, record "new" is not assigned yet
On Thu, Feb 10, 2011 at 9:38 AM, A B wrote: > CREATE TRIGGER trigger_foo BEFORE INSERT ON foo for each row EXECUTE > PROCEDURE trigger_foo(); > > gives me the same error. > Maybe "NEW" needs to be all caps? Also, with the BEFORE trigger, you'll need to RETURN NEW. This trigger of mine works just fine as a before trigger, FWIW: CREATE OR REPLACE FUNCTION sessions_update_lastuse() RETURNS TRIGGER AS $$ BEGIN NEW.lastuse = CURRENT_TIMESTAMP; RETURN NEW; END; $$ LANGUAGE plpgsql; -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Trigger problem, record "new" is not assigned yet
I'm very embarrassed now. There was another trigger that was causing the problem. Now it works. Thank you all for helping! :-) 2011/2/10 Adrian Klaver : > On Thursday, February 10, 2011 6:29:58 am A B wrote: >> Hello. >> >> I'm probably doing some very basic error here, but I get >> >> ERROR: record "new" is not assigned yet >> The tuple structure of a not-yet-assigned record is indeterminate. >> >> when I try this small example >> >> create table foo(x int); >> >> create or replace function trigger_foo() returns trigger language plpgsql >> as $$ declare v_i integer; >> begin >> select count(1) into v_i from foo; >> if new.x >18 then >> raise exception 'error'; >> else >> return null; >> end if; >> end; $$; >> >> CREATE TRIGGER trigger_foo AFTER INSERT ON foo for each row EXECUTE >> PROCEDURE trigger_foo(); >> >> insert into foo (x) values (15); >> >> >> And the table foo only have 14 lines right now. >> >> Can someone help me spot the error? :-) > > The above works here, Postgres 9.0.3. Was that the complete error message? Is > there more than one foo across the schemas? > > > -- > Adrian Klaver > adrian.kla...@gmail.com > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] fulltext search and hunspell
Thanks for this tip, the german compound directory from http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ works fine. I think the problem was the rudimentary support of hunspell dictionaries. Thanks for your help and your great software! Am 08.02.2011 11:34, schrieb Oleg Bartunov: > Jens, > > have you tried german compound dictionary from > http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ > > Oleg > On Tue, 8 Feb 2011, Jens Sauer wrote: > >> Hey, >> >> thanks for your answer. >> >> First I checked the links in the tsearch_data directory >> de_de.affix, and de_de.dict are symlinks to the corresponding files in >> /var/cache/postgresql/dicts/ >> Then I recreated them by using pg_updatedicts. >> >> This is an extract of the de_de.affix file: >> >> # this is the affix file of the de_DE Hunspell dictionary >> # derived from the igerman98 dictionary >> # >> # Version: 20091006 (build 20100127) >> # >> # Copyright (C) 1998-2009 Bjoern Jacke >> # >> # License: GPLv2, GPLv3 or OASIS distribution license agreement >> # There should be a copy of both of this licenses included >> # with every distribution of this dictionary. Modified >> # versions using the GPL may only include the GPL >> >> SET ISO8859-1 >> TRY esijanrtolcdugmphbyfvkwqxz??ESIJANRTOLCDUGMPHBYFVKWQXZ-. >> >> PFX U Y 1 >> PFX U 0 un . >> >> PFX V Y 1 >> PFX V 0 ver . >> >> SFX F Y 35 >> [...] >> >> I cannot find "compoundwords controlled z" there, so I manually added >> it. >> >> [...] >> # versions using the GPL may only include the GPL >> >> compoundwords controlled z >> >> SET ISO8859-1 >> TRY esijanrtolcdugmphbyfvkwqxz??ESIJANRTOLCDUGMPHBYFVKWQXZ-. >> [...] >> >> Then I restarted PostgreSQL. >> >> Now I get an error: >> SELECT * FROM ts_debug('Schokoladenfabrik'); >> FEHLER: falsches Affixdateiformat für Flag >> CONTEXT: Zeile 18 in Konfigurationsdatei >> »/usr/share/postgresql/8.4/tsearch_data/de_de.affix«: »PFX U Y 1 >> « >> SQL-Funktion »ts_debug« Anweisung 1 >> SQL-Funktion »ts_debug« Anweisung 1 >> >> Which means: >> ERROR: wrong Affixfileformat for flag >> CONTEXT: Line 18 in Configuration ... >> >> If I add >> COMPOUNDFLAG Z >> ONLYINCOMPOUND L >> >> instead of "compoundwords controlled z" >> >> I didn't get an error: >> >> SELECT * FROM ts_debug('Schokoladenfabrik'); >> alias | description | token | >> dictionaries | dictionary | lexemes >> ---+-+---+---+-+--- >> >> asciiword | Word, all ASCII | Schokoladenfabrik | >> {german_hunspell,german_stem} | german_stem | {schokoladenfabr} >> (1 row) >> >> But it seems that the hunspell dictionary is not working for compound >> words. >> >> Maybe pg_updatedicts has a bug and generates affix files in the wrong >> format? >> >> Jens >> >> 2011/2/7 Oleg Bartunov : >>> Jens, >>> >>> could you check affix file for >>> compoundwords controlled z >>> >>> also, can you provide link to dictionary files, so we can check if they >>> supported, since we have only rudiment support of hunspell. >>> btw, it'd be nice to have output from ts_debug() to make sure >>> dictionaries >>> actually used. >>> >>> Oleg >> > > Regards, > Oleg > _ > Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), > Sternberg Astronomical Institute, Moscow University, Russia > Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/ > phone: +007(495)939-16-83, +007(495)939-23-83 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
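For reference, wiring an ispell/hunspell-format dictionary into a text search configuration is done with SQL along these lines. This is only a hedged sketch with assumed file and configuration names, not the setup discussed above; the dictionary files must already be present in the server's tsearch_data directory:

    -- All names here are illustrative.
    CREATE TEXT SEARCH DICTIONARY german_hunspell (
        TEMPLATE  = ispell,
        DictFile  = german_compound,   -- german_compound.dict
        AffFile   = german_compound,   -- german_compound.affix
        StopWords = german
    );
    CREATE TEXT SEARCH CONFIGURATION german_hun ( COPY = pg_catalog.german );
    ALTER TEXT SEARCH CONFIGURATION german_hun
        ALTER MAPPING FOR asciiword, word, hword, hword_part
        WITH german_hunspell, german_stem;
    -- ts_debug() then shows which dictionary produced each lexeme:
    SELECT * FROM ts_debug('german_hun', 'Schokoladenfabrik');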
Re: [GENERAL] 9.0.X FOR UPDATE|SHARE on Sub-Query Causes "cannot extract system attribute from virtual tuple" if Sub-Query Returns Records (BUG)
"David Johnston" writes: >> From your commit notes: > "This wasn't a problem before 9.0 because we didn't support FOR UPDATE > below the top query level..." > FWIW I had been using a sub-query FOR UPDATE in one of my key queries (one > that was called multiple times per second) and relied upon the FOR UPDATE to > avoid having the same record "dispatched" multiple times. It worked just > fine in 8.2.X and 8.4.X - supported or not. Yeah, what that actually meant was that we didn't support FOR UPDATE below the top level of the query *as executed*. The optimizer used to flatten subqueries containing FOR UPDATE if it could (and fail if it couldn't). 9.0 changes that behavior because it led to FOR UPDATE locking getting applied in unexpected/unpredictable ways in more complex queries, eg joins. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] COPY statement REAL vs VARCHAR precision issue
Hello all, I'm using a COPY statement to load data into a PostGIS. The issue I am facing is that if I define fields with the REAL type, the COPY will only preserve 4 decimals which is not sufficient for my application. Here is the commands I'm running and a sample of the results : CREATE TABLE sites ( id VARCHAR(9) PRIMARY KEY, name VARCHAR(128), type VARCHAR(1), agency VARCHAR(128), status INTEGER, napsDesignated BOOLEAN, address VARCHAR(128), city VARCHAR(128), network VARCHAR(128), timeZone REAL, lat REAL, lon REAL, elevation REAL ); COPY sites ( id, name, type, agency, status, napsDesignated, address, city, network, timeZone, lat, lon, elevation ) FROM '/data/sites.csv' WITH DELIMITER '|' CSV HEADER; -- Then I use some PostGIS functions to convert the lat/lon into a geometry -- The issue is independent of the GIS processing. SELECT lat, lon FROM sites LIMIT 4; --lat | lon -- -+-- -- 47.5681 | -52.7022 --47.56 | -52.7114 -- 49.3208 | -57.3972 -- 48.9495 | -57.9454 -- (4 rows) -- If I define the fields as being VARCHAR instead of REAL, I get all -- the decimals : CREATE TABLE sites ( id VARCHAR(9) PRIMARY KEY, name VARCHAR(128), type VARCHAR(1), agency VARCHAR(128), status INTEGER, napsDesignated BOOLEAN, address VARCHAR(128), city VARCHAR(128), network VARCHAR(128), timeZone REAL, lat VARCHAR(32), lon VARCHAR(32), elevation REAL ); COPY sites ( id, name, type, agency, status, napsDesignated, address, city, network, timeZone, lat, lon, elevation ) FROM '/data/sites.csv' WITH DELIMITER '|' CSV HEADER; SELECT lat, lon FROM sites LIMIT 4; lat|lon --+--- 47.56806 | -52.70222 47.56| -52.71139 49.32083 | -57.39722 48.949479| -57.945387 (4 rows) -- When I convert the lat/lon to GEOMETRY, the function takes a string as input. Therefore, the precision depends on how the REAL are converted to strings. : UPDATE sites SET position = GeomFromEWKT('SRID=4326;POINT(' || lon || ' ' || lat ||' ' || elevation || ')'); -- Are the values stored with all the possible precision of a REAL or are they stored as displayed? -- Is it because of the behaviour of the COPY statement? Thank you for your help! Samuel -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] GUC configuration
> -Original Message- > From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general- > ow...@postgresql.org] On Behalf Of Thanos Papapetrou > Sent: Thursday, February 10, 2011 2:18 PM > To: pgsql-general@postgresql.org > Subject: [GENERAL] GUC configuration > > Hi, > > I am an MSc student in the department of Informatics and > Telecommunications of the University of Athens and as part of my > thesis I am examining some new cost models for DB optimizers. I have > successfully changed the optimizer of PostgreSQL in order to implement > these models, but I have stumbled upon a very little detail: until now > I use some hardcoded values in my code which I would like to make > accessible through GUC. After much googling the only relative pages I > have found is about configuring existing PostgreSQL variables. Can > anybody please provide some help? How can I add some new configuration > variables using GUC? > > Thanks in advance, Are you looking for custom variable classes? http://www.postgresql.org/docs/8.4/interactive/runtime-config-custom.html -- Robert... -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
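As a hedged illustration of that mechanism (the class and parameter names below are invented for the example), the user-visible side of a custom variable class looks like ordinary GUC access; server-side registration of such parameters, with defaults and validation, is normally done in C via the DefineCustom*Variable() functions called from a module's _PG_init():

    -- postgresql.conf would need: custom_variable_classes = 'thesis_optimizer'
    SET thesis_optimizer.cpu_cost_factor = '1.25';
    SHOW thesis_optimizer.cpu_cost_factor;
    SELECT current_setting('thesis_optimizer.cpu_cost_factor');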
[GENERAL] Cross Pollination is a wonderful thing
-General How did we get the duckbill platypus? Cross Pollination of course and today is the last day to get your content in for PostgreSQL Conference East. A conference that is cross pollinating with MongoDB, one of those "NoSQL" technologies. Now, we have experience, lots of experience on our side but they are the beautiful and energetic technology. Let's show them what our old school ways can do! Today, CFP for East closes. Don't be left alone after beer-thirty. https://www.postgresqlconference.org/talk_types JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] finding bogus UTF-8
I know that I have at least one instance of a varchar that is not valid UTF-8, imported from a source with errors (AMA CPT files, actually) before PG's checking was as stringent as it is today. Can anybody suggest a query to find such values? -- Scott Ribe scott_r...@elevated-dev.com http://www.elevated-dev.com/ (303) 722-0567 voice -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] COPY statement REAL vs VARCHAR precision issue
On 10/02/2011 17:13, Samuel Gilbert wrote: Hello all, I'm using a COPY statement to load data into a PostGIS. The issue I am facing is that if I define fields with the REAL type, the COPY will only preserve 4 decimals which is not sufficient for my application. If you use NUMERIC, you can define the precision you need, and be sure of accurate calculations: http://www.postgresql.org/docs/9.0/static/datatype-numeric.html#DATATYPE-NUMERIC-DECIMAL Ray. -- Raymond O'Donnell :: Galway :: Ireland r...@iol.ie -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
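As a hedged sketch of that suggestion (precision and scale picked arbitrarily for the example), the coordinate columns could be declared as NUMERIC; DOUBLE PRECISION is another option, keeping roughly 15 significant digits where REAL keeps only about 6:

    -- Only the coordinate columns change; the table would need to be reloaded
    -- afterwards, since values already stored as REAL have lost the extra digits.
    ALTER TABLE sites
        ALTER COLUMN lat TYPE numeric(9,6),
        ALTER COLUMN lon TYPE numeric(9,6),
        ALTER COLUMN elevation TYPE double precision;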
[GENERAL] Schema version control
Hi all, My company is having trouble managing how we upgrade schema changes across many versions of our software. I imagine this is a common problem and there're probably some neat solutions that we don't know about. For the last 10 years we have been writing bash shell scripts essentially numbered in order db0001, db0002, db0003 The number represents the schema version which is recorded in the database and updated by the shell scripts. We have a template that provides all the functionality we need, we just copy the script and fill in the blanks. The schema upgrade scripts are committed to svn along with the software changes, and we have a process when installing the software at a site that runs the scripts on the DB in order before starting up the new version of the software. This has worked really well so far. But we've effectively only had one version of the software in development at any time. We're now in the habit of branching the software to form releases to promote stability when making large changes. The idea is that only really important changes are merged in to the releases. This introduces a bit of a problem when some change needs to be merged from one release to another. The typical problem is that we might have two versions of the software 10.0 at schema version 10057 and 11.0 at 11023 and we need to merge an important bug fix from schema 11023 in to 10.0. The issue is that 11023 might depend upon changes introduced in the schema versions before it. Or 11023 might introduce changes that cause later scripts to break (11000 - 11023) when upgrading from 10.0 to 11.0. One potential solution is to require that schema changes are never merged in to a release, but of course sometimes business requires we do =( I'm really interested to hear how you guys manage schema upgrades in the face of branches and upgrading from many different versions of the database. I've been reading http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ but I have a feeling that this blog post won't address branches. Cheers! --Royce -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On 2/10/2011 3:38 PM, Royce Ausburn wrote: Hi all, My company is having trouble managing how we upgrade schema changes across many versions of our software. I imagine this is a common problem and there're probably some neat solutions that we don't know about. For the last 10 years we have been writing bash shell scripts essentially numbered in order db0001, db0002, db0003 The number represents the schema version which is recorded in the database and updated by the shell scripts. We have a template that provides all the functionality we need, we just copy the script and fill in the blanks. The schema upgrade scripts are committed to svn along with the software changes, and we have a process when installing the software at a site that runs the scripts on the DB in order before starting up the new version of the software. This has worked really well so far. But we've effectively only had one version of the software in development at any time. We're now in the habit of branching the software to form releases to promote stability when making large changes. The idea is that only really important changes are merged in to the releases. This introduces a bit of a problem when some change needs to be merged from one release to another. The typical problem is that we might have two versions of the software 10.0 at schema version 10057 and 11.0 at 11023 and we need to merge an important bug fix from schema 11023 in to 10.0. The issue is that 11023 might depend upon changes introduced in the schema versions before it. Or 11023 might introduce changes that cause later scripts to break (11000 - 11023) when upgrading from 10.0 to 11.0. One potential solution is to require that schema changes are never merged in to a release, but of course sometimes business requires we do =( I'm really interested to hear how you guys manage schema upgrades in the face of branches and upgrading from many different versions of the database. I've been reading http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ but I have a feeling that this blog post won't address branches. Cheers! --Royce So, 10.0 at 10057. 11.0 at 11023. then 10.1 needs some fixes so db is bumped to 10058. Then, later, you can upgrade 10057 to 11023, but you can't get 10058 to 11023. Humm... maybe you need smarter upgrade scripts? Would having logic in the script help? Something like: if not fieldExists('xyz') then alter table ... add xyz ... Or, maybe your schema numbering system is too broad? Maybe each table could have a version number? Or some kinda flags like: create table dbver(key text); then an update would be named: "add xyz to bob". then the update code: q = select key from dbver where key = 'add xyz to bob'; if q.eof then alter table bob add xyz -Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] finding bogus UTF-8
I'm working on a project to convert a large database from SQL_ASCII to UTF-8. I am using this procedure: 1) pg_dump the SQL_ASCII database to an SQL text file. 2) Run through a small (efficient) C program that logs each line that contains ANY "unclean" ASCII text. 3) Parse that log with a small perl program (hashes are easier in perl than C) to produce a report, and emit some SQL. 4) Construct SQL update statements to "repair" the original data. 5) Repeat at step #1 until the database is clean. 6) pg_dump (SQL_ASCII) -> pg_restore -EUTF8 new database. 7) Profit! If you are interested, I can email you the C and Perl source. It runs like this: # time pg_restore /db-dumps/some_ascii_pgdump.bin | ./ascii-tester | ./bad-ascii-report.pl > unclean-ascii.rpt real 11m11.804s user 18m2.579s sys 2m25.803s # grep "^--" unclean-ascii.rpt -- some_table 4051021 -- other_table 16 ^^^ Numbers are counts of rows that need cleaning. The entire "rpt" file contains SQL comments "--" and SQL select statements of the form: select * from table where primary_key in (1, 2, 3, 4, ...); The perl script contains a hash that maps table names to primary key column IDs (to pick up when parsing the raw SQL restore "COPY" script). I will need to purge my secret schema stuff from it before sharing it with anyone. My solution is probably not perfect, and probably not optimal, but it is working great so far. I'm almost done cleaning up my database and hope to attempt a real UTF8 restore in the near future. On Thu, Feb 10, 2011 at 1:02 PM, Scott Ribe wrote: > I know that I have at least one instance of a varchar that is not valid > UTF-8, imported from a source with errors (AMA CPT files, actually) before > PG's checking was as stringent as it is today. Can anybody suggest a query to > find such values? > > > -- > Scott Ribe > scott_r...@elevated-dev.com > http://www.elevated-dev.com/ > (303) 722-0567 voice > > > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] finding bogus UTF-8
On Thu, Feb 10, 2011 at 1:02 PM, Scott Ribe wrote: > I know that I have at least one instance of a varchar that is not valid > UTF-8, imported from a source with errors (AMA CPT files, actually) before > PG's checking was as stringent as it is today. Can anybody suggest a query to > find such values? If you know which table and column the data is in, you can also do something like this (I typed this up without checking it; the basic idea is to cast the column to bytea, encode it with the 'escape' method, then grep for backslash escape sequences): select * from bad_table where encode(bad_column::bytea, 'escape') ~ E'\\\\'; -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On 2/10/2011 4:14 PM, Andy Colson wrote: On 2/10/2011 3:38 PM, Royce Ausburn wrote: Hi all, My company is having trouble managing how we upgrade schema changes across many versions of our software. I imagine this is a common problem and there're probably some neat solutions that we don't know about. For the last 10 years we have been writing bash shell scripts essentially numbered in order db0001, db0002, db0003 The number represents the schema version which is recorded in the database and updated by the shell scripts. We have a template that provides all the functionality we need, we just copy the script and fill in the blanks. The schema upgrade scripts are committed to svn along with the software changes, and we have a process when installing the software at a site that runs the scripts on the DB in order before starting up the new version of the software. This has worked really well so far. But we've effectively only had one version of the software in development at any time. We're now in the habit of branching the software to form releases to promote stability when making large changes. The idea is that only really important changes are merged in to the releases. This introduces a bit of a problem when some change needs to be merged from one release to another. The typical problem is that we might have two versions of the software 10.0 at schema version 10057 and 11.0 at 11023 and we need to merge an important bug fix from schema 11023 in to 10.0. The issue is that 11023 might depend upon changes introduced in the schema versions before it. Or 11023 might introduce changes that cause later scripts to break (11000 - 11023) when upgrading from 10.0 to 11.0. One potential solution is to require that schema changes are never merged in to a release, but of course sometimes business requires we do =( I'm really interested to hear how you guys manage schema upgrades in the face of branches and upgrading from many different versions of the database. I've been reading http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ but I have a feeling that this blog post won't address branches. Cheers! --Royce So, 10.0 at 10057. 11.0 at 11023. then 10.1 needs some fixes so db is bumped to 10058. Then, later, you can upgrade 10057 to 11023, but you cant get 10058 to 11023. Humm... maybe you need smarter upgrade scripts? Would having logic in the script help? Something like: if not fieldExists('xyz) then alter table ... add xyz ... Or, maybe your schema numbering system is to broad? Maybe each table could have a version number? Or some kinda flags like: create table dbver(key text); then an update would be named: "add xyz to bob". then the update code: q = select key from dbver where key = 'add xyz to bob'; if q.eof then alter table bob add xyz -Andy D'oh! a bug in my update script: q = select key from dbver where key = 'add xyz to bob'; if q.eof then alter table bob add xyz insert into dbver('add xyz to bob'); How embarrassing :-) -Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
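Written out in PL/pgSQL (a 9.0 DO block), that guarded, re-runnable change might look like the sketch below; the table, column and key names are the placeholders used above, and the column type is an assumption:

    DO $$
    BEGIN
        IF NOT EXISTS (SELECT 1 FROM dbver WHERE key = 'add xyz to bob') THEN
            ALTER TABLE bob ADD COLUMN xyz integer;            -- type is illustrative
            INSERT INTO dbver (key) VALUES ('add xyz to bob');
        END IF;
    END
    $$;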
Re: [GENERAL] Schema version control
We have this kickass solution we built at work called dbsteward that just takes care of all of this for us, automatically. You just give it the new version and the old version and it generates update statements to feed into PG. The reason I'm bringing this up is that we're working to release dbsteward as open source for PGCon. So, if you can wait a bit, you can jump on that train. In response to Andy Colson : > On 2/10/2011 3:38 PM, Royce Ausburn wrote: > > Hi all, > > > > My company is having trouble managing how we upgrade schema changes across > > many versions of our software. I imagine this is a common problem and > > there're probably some neat solutions that we don't know about. > > > > For the last 10 years we have been writing bash shell scripts essentially > > numbered in order db0001, db0002, db0003 The number represents the > > schema version which is recorded in the database and updated by the shell > > scripts. We have a template that provides all the functionality we need, > > we just copy the script and fill in the blanks. The schema upgrade scripts > > are committed to svn along with the software changes, and we have a process > > when installing the software at a site that runs the scripts on the DB in > > order before starting up the new version of the software. > > > > This has worked really well so far. But we've effectively only had one > > version of the software in development at any time. We're now in the habit > > of branching the software to form releases to promote stability when making > > large changes. The idea is that only really important changes are merged > > in to the releases. This introduces a bit of a problem when some change > > needs to be merged from one release to another. The typical problem is > > that we might have two versions of the software 10.0 at schema version > > 10057 and 11.0 at 11023 and we need to merge an important bug fix from > > schema 11023 in to 10.0. The issue is that 11023 might depend upon changes > > introduced in the schema versions before it. Or 11023 might introduce > > changes that cause later scripts to break (11000 - 11023) when upgrading > > from 10.0 to 11.0. > > > > One potential solution is to require that schema changes are never merged > > in to a release, but of course sometimes business requires we do =( > > > > I'm really interested to hear how you guys manage schema upgrades in the > > face of branches and upgrading from many different versions of the database. > > > > I've been reading > > http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ > > but I have a feeling that this blog post won't address branches. > > > > Cheers! > > > > --Royce > > > > > > So, 10.0 at 10057. > 11.0 at 11023. > > then 10.1 needs some fixes so db is bumped to 10058. > > Then, later, you can upgrade 10057 to 11023, but you cant get 10058 to > 11023. > > Humm... maybe you need smarter upgrade scripts? Would having logic in > the script help? Something like: > > if not fieldExists('xyz) then alter table ... add xyz ... > > > > Or, maybe your schema numbering system is to broad? Maybe each table > could have a version number? > > > Or some kinda flags like: > create table dbver(key text); > > then an update would be named: "add xyz to bob". 
> > then the update code: > > q = select key from dbver where key = 'add xyz to bob'; > if q.eof then > alter table bob add xyz > > > -Andy > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On 2/10/2011 4:18 PM, Bill Moran wrote: We have this kickass solution we built at work called dbsteward that just takes care of all of this for us, automatically. You just give it the new version and the old version and it generates update statements to feed into PG. The reason I'm bringing this up is that we're working to release dbsteward as open source for PGCon. So, if you can wait a bit, you can jump on that train. In response to Andy Colson: But... that assumes all updates are DDL only? What if I have a lookup table, and want to add some new entries into it? -Andy -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
Top-posting is frowned upon by some (not me), but since Bill started it... I for one will be waiting to see your dbsteward. How does it compare functionally or stylistically with Ruby's migration tools (which I found to be pretty cool and frustrating all in one go). On 02/10/2011 03:18 PM, Bill Moran wrote: > > We have this kickass solution we built at work called dbsteward that > just takes care of all of this for us, automatically. You just give > it the new version and the old version and it generates update statements > to feed into PG. > > The reason I'm bringing this up is that we're working to release > dbsteward as open source for PGCon. So, if you can wait a bit, you > can jump on that train. > > In response to Andy Colson : > >> On 2/10/2011 3:38 PM, Royce Ausburn wrote: >>> Hi all, >>> >>> My company is having trouble managing how we upgrade schema changes across >>> many versions of our software. I imagine this is a common problem and >>> there're probably some neat solutions that we don't know about. >>> >>> For the last 10 years we have been writing bash shell scripts essentially >>> numbered in order db0001, db0002, db0003 The number represents the >>> schema version which is recorded in the database and updated by the shell >>> scripts. We have a template that provides all the functionality we need, >>> we just copy the script and fill in the blanks. The schema upgrade scripts >>> are committed to svn along with the software changes, and we have a process >>> when installing the software at a site that runs the scripts on the DB in >>> order before starting up the new version of the software. >>> >>> This has worked really well so far. But we've effectively only had one >>> version of the software in development at any time. We're now in the habit >>> of branching the software to form releases to promote stability when making >>> large changes. The idea is that only really important changes are merged >>> in to the releases. This introduces a bit of a problem when some change >>> needs to be merged from one release to another. The typical problem is >>> that we might have two versions of the software 10.0 at schema version >>> 10057 and 11.0 at 11023 and we need to merge an important bug fix from >>> schema 11023 in to 10.0. The issue is that 11023 might depend upon changes >>> introduced in the schema versions before it. Or 11023 might introduce >>> changes that cause later scripts to break (11000 - 11023) when upgrading >>> from 10.0 to 11.0. >>> >>> One potential solution is to require that schema changes are never merged >>> in to a release, but of course sometimes business requires we do =( >>> >>> I'm really interested to hear how you guys manage schema upgrades in the >>> face of branches and upgrading from many different versions of the database. >>> >>> I've been reading >>> http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ >>> but I have a feeling that this blog post won't address branches. >>> >>> Cheers! >>> >>> --Royce >>> >>> >> >> So, 10.0 at 10057. >> 11.0 at 11023. >> >> then 10.1 needs some fixes so db is bumped to 10058. >> >> Then, later, you can upgrade 10057 to 11023, but you cant get 10058 to >> 11023. >> >> Humm... maybe you need smarter upgrade scripts? Would having logic in >> the script help? Something like: >> >> if not fieldExists('xyz) then alter table ... add xyz ... >> >> >> >> Or, maybe your schema numbering system is to broad? Maybe each table >> could have a version number? 
>> >> >> Or some kinda flags like: >> create table dbver(key text); >> >> then an update would be named: "add xyz to bob". >> >> then the update code: >> >> q = select key from dbver where key = 'add xyz to bob'; >> if q.eof then >> alter table bob add xyz >> >> >> -Andy >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] finding bogus UTF-8
> > If you are interested, I can email to you the C and Perl source. > > It runs like this: > > # time pg_restore /db-dumps/some_ascii_pgdump.bin | ./ascii-tester | > ./bad-ascii-report.pl > unclean-ascii.rpt http://www.ecoligames.com/~djenkins/pgsql/ Disclaimer: I offer NO warranty. Use at your own risk. Code does minimal error checking (its a hack / tool for manual use, not reliable production use). C code compiles cleanly with gcc. Perl code uses no libraries (just a STDIN -> STDOUT processor). This code should run damn near anywhere. The code will stay on my web server until I forget about it and re-org stuff in a few weeks, so grab it while you can. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
In response to Andy Colson : > On 2/10/2011 4:18 PM, Bill Moran wrote: > > > > We have this kickass solution we built at work called dbsteward that > > just takes care of all of this for us, automatically. You just give > > it the new version and the old version and it generates update statements > > to feed into PG. > > > > The reason I'm bringing this up is that we're working to release > > dbsteward as open source for PGCon. So, if you can wait a bit, you > > can jump on that train. > > > > In response to Andy Colson: > > > > But... that assumes all updates are DDL only? What if I have a lookup > table, and want to add some new entries into it? It has provisions for maintaining static data as well. We have a bunch of lookup tables (too many, in my opinion) and it does an excellent job of maintaining them. They just need to have a primary key, but that's not usually a problem with lookup tables. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
Royce Ausburn wrote on 10.02.2011 22:38: I'm really interested to hear how you guys manage schema upgrades in the face of branches and upgrading from many different versions of the database. We are quite happy with Liquibase. You can simply run it against a database and tell it to migrate it to "Version x.y". As it keeps track of all changes applied, it automatically knows what to do. It can handle static data as well as stored procedures and any custom SQL. Regards Thomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
In response to Rob Sargent : > Top-posting is frowned upon by some (not me), but since Bill started it... Oops ... the weird thing is that I'm usually really anal about not top-posting ... > I for one will be waiting to see your dbsteward. How does it compare > functionally or stylistically with Ruby's migration tools (which I found > to be pretty cool and frustrating all in one go). I'm not familiar with Ruby's migration tools, so I can't say much. The overview: You store your schema and data as XML (this is easy to migrate to, because it includes a tool that makes the XML from a live database) Keep your XML schema files in some RCS. When it's time for a new deployment, you run the dbsteward tool against the schema XML and it turns it into DDL and DML. When it's time for an upgrade, you run the dbsteward tool against two schema XML files, and it calculates what has changed and generates the appropriate DDL and DML to upgrade. So ... you know, however that compares with the Ruby stuff is how it does. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
Bill Moran wrote on 10.02.2011 23:59: The overview: You store your schema and data as XML (this is easy to migrate to, because it includes a tool that makes the XML from a live database) Keep your XML schema files in some RCS. When it's time for a new deployment, you run the dbsteward tool against the schema XML and it turns it into DDL and DML. When it's time for an upgrade, you run the dbsteward tool against two schema XML files, and it calculates what has changed and generates the appropriate DDL and DML to upgrade. This very much sounds like Liquibase. Do you happen to know any differences? Regards Thomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On 02/10/2011 03:59 PM, Bill Moran wrote: > In response to Rob Sargent : >> I for one will be waiting to see your dbsteward. How does it compare >> functionally or stylistically with Ruby's migration tools (which I found >> to be pretty cool and frustrating all in one go). > > I'm not familiar with Ruby's migration tools, so I can't say much. > > The overview: > You store your schema and data as XML (this is easy to migrate to, because > it includes a tool that makes the XML from a live database) > Keep your XML schema files in some RCS. > When it's time for a new deployment, you run the dbsteward tool against > the schema XML and it turns it into DDL and DML. > When it's time for an upgrade, you run the dbsteward tool against two > schema XML files, and it calculates what has changed and generates the > appropriate DDL and DML to upgrade. > > So ... you know, however that compares with the Ruby stuff is how it > does. > Now at the bottom :) It's been a couple years since I played with Ruby ActiveRecord but it's (of course) radically different from what you describe. The DDL is in the Ruby code and naturally the code is in RCS. So a revision is a new instance of ActiveRecord (iirc) which does the change(s) (create table ttt, alter table vvv etc). Maybe skip a rev. Rollback to a rev is definitely there because one writes the undo for each new revision. This includes manipulating the data of course, so there are limitations. I personally am leery of the 'make the prod match the dev db' approach. Who knows what extras lurk in the depths. I think one should be able to make the dev db from scratch and write the necessary scripts to change to (and from if possible) each revision. Apply to prod when tested. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On Thu, 10 Feb 2011 17:59:30 -0500, Bill Moran wrote: The overview: You store your schema and data as XML (this is easy to migrate to, because it includes a tool that makes the XML from a live database) We're doing a similar thing here except we're a Lisp shop so our schema is defined as a set of "defentities" and we can migrate from one version to another using a corresponding set of "defmaps". Keep your XML schema files in some RCS. When it's time for a new deployment, you run the dbsteward tool against the schema XML and it turns it into DDL and DML. When it's time for an upgrade, you run the dbsteward tool against two schema XML files, and it calculates what has changed and generates the appropriate DDL and DML to upgrade. This sounds pretty cool. Ours doesn't do that yet but that's next on my TODO list. -- Andy Chambers -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
In response to Thomas Kellerer : > Bill Moran wrote on 10.02.2011 23:59: > > The overview: > > You store your schema and data as XML (this is easy to migrate to, because > > it includes a tool that makes the XML from a live database) > > Keep your XML schema files in some RCS. > > When it's time for a new deployment, you run the dbsteward tool against > > the schema XML and it turns it into DDL and DML. > > When it's time for an upgrade, you run the dbsteward tool against two > > schema XML files, and it calculates what has changed and generates the > > appropriate DDL and DML to upgrade. > > This very much sounds like Liquibase. Do you happen to know any differences? Hrm ... before we started writing dbsteward, we looked around to see if something already existed and didn't find Liquibase. This is the first I've heard of it. I'm thinking it was some time in 2008, and according to their changelog, Liquibase was around at that time. I wonder how we missed it ... Anyway ... based on nothing more than a quick scan of their quickstart page, here are the differences I see: * Liquibase is dependent on you creating "changesets". I'm sure this works, but we took a different approach with dbsteward. dbsteward expects you to maintain XML files that represent the entire database, then dbsteward does the work of figuring out what changed. Our opinion was that svn already does the work of tracking changes, why reinvent the wheel. * Looks like liquibase requires you to talk to the database to push the changes? dbsteward outputs a DDL/DML file that you can push in whatever way is best. This is important to us because we use Slony, and DDL changes have to be submitted through EXECUTE SCRIPT() * dbsteward has built-in Slony support (i.e. it will make slony configs as well as slony upgrade scripts in addition to DDL/DML) * Does liquibase support UDFs? dbsteward does. * liquibase has a lot more supported platforms at this time. dbsteward only supports PostgreSQL and MSSQL (because that's all that we needed) but I expect that other support will come quickly once we release it. * Does liquibase support things like multi-column indexes and multi- column primary keys? dbsteward does. I don't think I should go on and on, as I could ask a lot of questions about what liquibase does, and I simply don't have the time right now to research it, or ask all those question ;) Anyway ... sorry for the teaser on this, but we're trying to get through all the hoops the company is requiring us to do to release it, and we think we're on track to be ready by PGCon, so there'll be a website up as soon as we can get it. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
In response to Rob Sargent : > > On 02/10/2011 03:59 PM, Bill Moran wrote: > > In response to Rob Sargent : > >> I for one will be waiting to see your dbsteward. How does it compare > >> functionally or stylistically with Ruby's migration tools (which I found > >> to be pretty cool and frustrating all in one go). > > > > I'm not familiar with Ruby's migration tools, so I can't say much. > > > > The overview: > > You store your schema and data as XML (this is easy to migrate to, because > > it includes a tool that makes the XML from a live database) > > Keep your XML schema files in some RCS. > > When it's time for a new deployment, you run the dbsteward tool against > > the schema XML and it turns it into DDL and DML. > > When it's time for an upgrade, you run the dbsteward tool against two > > schema XML files, and it calculates what has changed and generates the > > appropriate DDL and DML to upgrade. > > > > So ... you know, however that compares with the Ruby stuff is how it > > does. > > > Now at the bottom :) > > It's been a couple years since I played with Ruby ActiveRecord but it's > (of course) radically than what you describe. The ddl is in the ruby > code and naturally the code is in RCS. So a revision is a new instance > of ActiveRecord (iirc) which does the change(s) (create table ttt, alter > table vvv etc). Maybe skip a rev. Rollback to a rev is definitely > there because one writes the undo for each new revision. This include > manipulating the data of course, so there are limitations. dbsteward can do downgrades ... you just feed it the old schema and the new schema in reverse of how you'd do an upgrade ;) Oh, also, it allows us to do installation-specific overrides. We use this ONLY for DML for lookup lists where some clients have slightly different names for things than others. In theory, it could do DDL overrides as well, but we decided on a policy of not utilizing that because we wanted the schemas to be consistent on all our installs. > I personally am leary of the 'make the prod match the dev db' approach. > Who knows what extras lurk in the depths. I think one should be able to > make the dev db from scratch and write the necessary scripts to change > to (and from if possible) each revision. Apply to prod when tested. dbsteward allows us to do all this. A developer can make a change, rebuild a test database from their change to make sure it works, then test the upgrade process as well, all before even checking the code in. -- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
Bill Moran wrote on 11.02.2011 00:37: Anyway ... based on nothing more than a quick scan of their quickstart page, here are the differences I see: * Liquibase is dependent on you creating "changesets". I'm sure this works, but we took a different approach with dbsteward. dbsteward expects you to maintain XML files that represent the entire database, then dbsteward does the work of figuring out what changed. Our opinion was that svn already does the work of tracking changes, why reinvent the wheel. That sounds like a very nice feature. * Looks like liquibase requires you to talk to the database to push the changes? dbsteward outputs a DDL/DML file that you can push in whatever way is best. This is important to us because we use Slony, and DDL changes have to be submitted through EXECUTE SCRIPT() No, Liquibase can also emit the SQL that it would execute. * dbsteward has built-in Slony support (i.e. it will make slony configs as well as slony upgrade scripts in addition to DDL/DML) * liquibase has a lot more supported platforms at this time. dbsteward only supports PostgreSQL and MSSQL (because that's all that we needed) but I expect that other support will come quickly once we release it. * Does liquibase support things like multi-column indexes and multi-column primary keys? dbsteward does. Yes, without problems (including of course the necessary foreign keys) Anyway ... sorry for the teaser on this, but we're trying to get through all the hoops the company is requiring us to do to release it, and we think we're on track to be ready by PGCon, so there'll be a website up as soon as we can get it. Thanks for the feedback, I would really like to see it. The approach of not recording the changes but simply letting the software find them seems like a very nifty feature. I wonder how you detect renaming a table or a column? Which programming language is dbsteward based on? Regards Thomas -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On 02/10/2011 02:38 PM, Royce Ausburn wrote: > Hi all, > > My company is having trouble managing how we upgrade schema changes across > many versions of our software. I imagine this is a common problem and > there're probably some neat solutions that we don't know about. > > For the last 10 years we have been writing bash shell scripts essentially > numbered in order db0001, db0002, db0003. The number represents the schema > version which is recorded in the database and updated by the shell scripts. > We have a template that provides all the functionality we need, we just copy > the script and fill in the blanks. The schema upgrade scripts are committed > to svn along with the software changes, and we have a process when installing > the software at a site that runs the scripts on the DB in order before > starting up the new version of the software. > > This has worked really well so far. But we've effectively only had one > version of the software in development at any time. We're now in the habit > of branching the software to form releases to promote stability when making > large changes. The idea is that only really important changes are merged into > the releases. This introduces a bit of a problem when some change needs > to be merged from one release to another. The typical problem is that we > might have two versions of the software: 10.0 at schema version 10057 and 11.0 > at 11023, and we need to merge an important bug fix from schema 11023 into > 10.0. The issue is that 11023 might depend upon changes introduced in the > schema versions before it. Or 11023 might introduce changes that cause later > scripts to break (11000 - 11023) when upgrading from 10.0 to 11.0. > > One potential solution is to require that schema changes are never merged into > a release, but of course sometimes business requires we do =( > > I'm really interested to hear how you guys manage schema upgrades in the face > of branches and upgrading from many different versions of the database. > > I've been reading > http://pyrseas.wordpress.com/2011/02/07/version-control-part-2-sql-databases/ > but I have a feeling that this blog post won't address branches. > > Cheers! > > --Royce > > Don't the bash scripts get checked in to .../perforce/cvs/svn/git/...? Aren't they part of the resources of the project(s)? -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
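(The "schema version recorded in the database" scheme described above usually boils down to SQL along these lines; this is only an illustrative sketch, and the table name "customer", the new column, and the version numbers are made up here, not taken from Royce's actual scripts.)

-- One-time setup: a single-row table holding the current schema version.
CREATE TABLE schema_version (version integer NOT NULL);
INSERT INTO schema_version VALUES (10056);

-- Each numbered upgrade script then applies its changes and bumps the
-- version in the same transaction; a hypothetical db10057 might run:
BEGIN;
ALTER TABLE customer ADD COLUMN archived boolean NOT NULL DEFAULT false;
UPDATE schema_version SET version = 10057 WHERE version = 10056;
COMMIT;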
Re: [GENERAL] Schema version control
On 02/10/2011 04:44 PM, Bill Moran wrote: > In response to Rob Sargent : >> >> On 02/10/2011 03:59 PM, Bill Moran wrote: >>> In response to Rob Sargent : I for one will be waiting to see your dbsteward. How does it compare functionally or stylistically with Ruby's migration tools (which I found to be pretty cool and frustrating all in one go). >>> >>> I'm not familiar with Ruby's migration tools, so I can't say much. >>> >>> The overview: >>> You store your schema and data as XML (this is easy to migrate to, because >>> it includes a tool that makes the XML from a live database) >>> Keep your XML schema files in some RCS. >>> When it's time for a new deployment, you run the dbsteward tool against >>> the schema XML and it turns it into DDL and DML. >>> When it's time for an upgrade, you run the dbsteward tool against two >>> schema XML files, and it calculates what has changed and generates the >>> appropriate DDL and DML to upgrade. >>> >>> So ... you know, however that compares with the Ruby stuff is how it >>> does. >>> >> Now at the bottom :) >> >> It's been a couple years since I played with Ruby ActiveRecord but it's >> (of course) radically than what you describe. The ddl is in the ruby >> code and naturally the code is in RCS. So a revision is a new instance >> of ActiveRecord (iirc) which does the change(s) (create table ttt, alter >> table vvv etc). Maybe skip a rev. Rollback to a rev is definitely >> there because one writes the undo for each new revision. This include >> manipulating the data of course, so there are limitations. > > dbsteward can do downgrades ... you just feed it the old schema and > the new schema in reverse of how you'd do an upgrade ;) > > Oh, also, it allows us to do installation-specific overrides. We use > this ONLY for DML for lookup lists where some clients have slightly > different names for things than others. In theory, it could do DDL > overrides as well, but we decided on a policy of not utilizing that > because we wanted the schemas to be consistent on all our installs. > >> I personally am leary of the 'make the prod match the dev db' approach. >> Who knows what extras lurk in the depths. I think one should be able to >> make the dev db from scratch and write the necessary scripts to change >> to (and from if possible) each revision. Apply to prod when tested. > > dbsteward allows us to do all this. A developer can make a change, > rebuild a test database from their change to make sure it works, then > test the upgrade process as well, all before even checking the code in. > Good work. Will look forward to it. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] finding bogus UTF-8
On Thu, Feb 10, 2011 at 2:02 PM, Scott Ribe wrote: > I know that I have at least one instance of a varchar that is not valid > UTF-8, imported from a source with errors (AMA CPT files, actually) before > PG's checking was as stringent as it is today. Can anybody suggest a query > to find such values? > I hit this problem too, if I remember correctly when trying to upgrade a database from 8.3 to 8.4. I ended up aborting the upgrade, since the upgrade documentation made no mention of this and I didn't have time to dig into it at the time. A tool to find all instances of this would be very helpful. -- Glenn Maynard
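One way to hunt for such rows from SQL is a small PL/pgSQL helper that attempts a UTF-8 decode and flags failures. This is only a sketch: the table ("mytable"), key ("id") and column ("txt") names are placeholders, and it assumes convert_to(txt, 'SQL_ASCII') hands back the stored bytes unmodified so that convert_from() can re-validate them.

-- Returns false if the byte string is not valid UTF-8.
CREATE OR REPLACE FUNCTION is_valid_utf8(b bytea) RETURNS boolean AS $$
BEGIN
    PERFORM convert_from(b, 'UTF8');   -- raises an error on invalid UTF-8
    RETURN true;
EXCEPTION WHEN others THEN
    RETURN false;
END;
$$ LANGUAGE plpgsql;

-- List the rows whose column does not decode as UTF-8.
SELECT id
FROM   mytable
WHERE  NOT is_valid_utf8(convert_to(txt, 'SQL_ASCII'));

Running it first against a row already known to be bad would confirm whether the assumption about convert_to() holds before scanning the whole table.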
[GENERAL] plsql question
I am trying to write a plsql routine that will delete a range of characters based on their octal or hexadecimal values. Something like the 'tr' shell command will do: cat file| tr -d ['\177'-'\377'] Can't seem to figure this one out. Pointers would be appreciated. -- Until later, Geoffrey "I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them." - Thomas Jefferson -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
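In SQL the closest equivalent is probably regexp_replace() with a character-class range. A sketch only, with "mytable" and "txt" as placeholder names; note that in a multibyte database the range matches code points U+007F through U+00FF rather than raw bytes, so it mirrors the tr behaviour only for single-byte data.

-- Remove every character in the octal range \177-\377 (hex 7F-FF).
UPDATE mytable
SET    txt = regexp_replace(txt, E'[\\x7f-\\xff]', '', 'g');

-- Or preview the cleaned value without changing anything:
SELECT regexp_replace(txt, E'[\\x7f-\\xff]', '', 'g') FROM mytable;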
Re: [GENERAL] Schema version control
On 11/02/2011, at 9:59 AM, Thomas Kellerer wrote: > Royce Ausburn wrote on 10.02.2011 22:38: >> I'm really interested to hear how you guys manage schema upgrades in >> the face of branches and upgrading from many different versions of >> the database. > > We are quite happy with Liquibase. You can simply run it against a database > and tell it to migrate it to "Version x.y". > > As it keeps track of all changes applied, it automatically knows what to do. > > It can handle static data as well as stored procedures and any custom SQL. Thanks! Liquibase looks pretty neat. We'll have to check it out. --Royce -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
> > So, 10.0 at 10057. > 11.0 at 11023. > > then 10.1 needs some fixes so db is bumped to 10058. > > Then, later, you can upgrade 10057 to 11023, but you can't get 10058 to 11023. > > Humm... maybe you need smarter upgrade scripts? Would having logic in the > script help? Something like: > > if not fieldExists('xyz') then alter table ... add xyz ... > > > > Or, maybe your schema numbering system is too broad? Maybe each table could > have a version number? > > > Or some kinda flags like: > create table dbver(key text); > > then an update would be named: "add xyz to bob". > > then the update code: > > q = select key from dbver where key = 'add xyz to bob'; > if q.eof then > alter table bob add xyz > This is effectively the approach we've been working with so far, but it isn't great. The issue is that you need to be really aware of what changes might or might not have been in the previous databases... This can be hard. There's also the problem of updating data, etc. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
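For what it's worth, the fieldExists() idea sketched above can be written directly in PL/pgSQL against information_schema; a rough sketch using the example names "bob" and "xyz" from the message (the 'public' schema and the integer column type are assumptions here).

DO $$
BEGIN
    IF NOT EXISTS (
        SELECT 1
        FROM   information_schema.columns
        WHERE  table_schema = 'public'
          AND  table_name   = 'bob'
          AND  column_name  = 'xyz'
    ) THEN
        ALTER TABLE bob ADD COLUMN xyz integer;  -- column type is a guess
    END IF;
END;
$$;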
Re: [GENERAL] Schema version control
> > Don't the bash scripts get checked in to .../perforce/cvs/svn/git/...? > Aren't they part of the resources of the project(s)? Yep - they absolutely are. The issue is that there're multiple branches *potentially* having new scripts committed. Fortunately it's rare as the release branches seldom require schema changes, but it's an important enough problem to need a better solution. --Royce -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On Thu, Feb 10, 2011 at 6:44 PM, Bill Moran wrote: > dbsteward can do downgrades ... you just feed it the old schema and > the new schema in reverse of how you'd do an upgrade ;) > > Oh, also, it allows us to do installation-specific overrides. We use > this ONLY for DML for lookup lists where some clients have slightly > different names for things than others. In theory, it could do DDL > overrides as well, but we decided on a policy of not utilizing that > because we wanted the schemas to be consistent on all our installs. > What about upgrades that can't be derived directly from an inspection of the schema? Some examples: - Adding a NOT NULL constraint (without adding a DEFAULT). You often want to precede this with filling in any existing NULL values, so the new constraint doesn't fail. - Updating triggers, functions and their effects. For example, when I have an FTS index with a trigger to update an index column, and I change the underlying trigger, I often do something like "UPDATE table SET column = column", to cause all of the update triggers to fire and recalculate the index columns. - Creating a new column based on an old one, and removing the old one; eg. add a column "n", run "UPDATE ... SET n = i*j * 2", and then drop the old columns "i" and "j". - Updating data from an external source, such as ORM model code; for example, if you have a table representing external files, an update may want to calculate and update the SHA-1 of each file. - For efficiency, dropping a specific index while making a large update, and then recreating the index. In my experience, while generating schema updates automatically is handy, it tends to make nontrivial database updates more complicated. These sorts of things happen often and are an integral part of a database update, so I'm just curious how/if you deal with them. I've used Ruby's migrations, and for my Django databases I use my own migration system which is based in principle off of it: create scripts to migrate the database from version X to X+1 and X-1, and upgrade or downgrade by running the appropriate scripts in sequence. It's not ideal, since it can't generate a database at a specific version directly; it always has to run through the entire sequence of migrations to the version you want, and the migrations accumulate. However, it can handle whatever arbitrary steps are needed to update a database, and I don't need to test updates from every version to every other version. -- Glenn Maynard
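The first bullet above (NOT NULL without a DEFAULT) illustrates why diffing the two schema states alone is not enough: the backfill has to run in between them. A minimal sketch, with "mytable", "c" and the fill-in value 0 as placeholders.

BEGIN;
UPDATE mytable SET c = 0 WHERE c IS NULL;         -- backfill existing NULLs first
ALTER TABLE mytable ALTER COLUMN c SET NOT NULL;  -- now the constraint can be added
COMMIT;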
Re: [GENERAL] Schema version control
In response to Glenn Maynard : > On Thu, Feb 10, 2011 at 6:44 PM, Bill Moran wrote: > > > dbsteward can do downgrades ... you just feed it the old schema and > > the new schema in reverse of how you'd do an upgrade ;) > > > > Oh, also, it allows us to do installation-specific overrides. We use > > this ONLY for DML for lookup lists where some clients have slightly > > different names for things than others. In theory, it could do DDL > > overrides as well, but we decided on a policy of not utilizing that > > because we wanted the schemas to be consistent on all our installs. > > > > What about upgrades that can't be derived directly from an inspection of the > schema? Some examples: > > - Adding a NOT NULL constraint (without adding a DEFAULT). You often want > to precede this with filling in any existing NULL values, so the new > constraint doesn't fail. > - Updating triggers, functions and their effects. For example, when I have > an FTS index with a trigger to update an index column, and I change the > underlying trigger, I often do something like "UPDATE table SET column = > column", to cause all of the update triggers to fire and recalculate the > index columns. > - Creating a new column based on an old one, and removing the old one; eg. > add a column "n", run "UPDATE ... SET n = i*j * 2", and then drop the old > columns "i" and "j". > - Updating data from an external source, such as ORM model code; for > example, if you have a table representing external files, an update may want > to calculate and update the SHA-1 of each file. > - For efficiency, dropping a specific index while making a large update, and > then recreating the index. > > In my experience, while generating schema updates automatically is handy, it > tends to make nontrivial database updates more complicated. These sorts of > things happen often and are an integral part of a database update, so I'm > just curious how/if you deal with them. > > I've used Ruby's migrations, and for my Django databases I use my own > migration system which is based in principle off of it: create scripts to > migrate the database from version X to X+1 and X-1, and upgrade or downgrade > by running the appropriate scripts in sequence. > > It's not ideal, since it can't generate a database at a specific version > directly; it always has to run through the entire sequence of migrations to > the version you want, and the migrations accumulate. However, it can handle > whatever arbitrary steps are needed to update a database, and I don't need > to test updates from every version to every other version. You're correct (based on our experience over the past few years). The big caveat is that 99.9% of the database changes don't fall into those "nontrivial" categories, and dbsteward makes those 99.9% of the changes easy to do, reliable to reproduce, and easy to track. We've added some stuff to handle the other .1% as well, like hooks where you can put arbitrary SQL strings to be run before or after the remainder of the automatic stuff is done. We probably haven't seen every circumstance that needs special handling, but we've already struggled through a bunch. All this is part of the reason we're pushing to get this stuff open-sourced. We feel like we've got something that's pretty far along, and we feel that community involvement will help enhance things.
-- Bill Moran http://www.potentialtech.com http://people.collaborativefusion.com/~wmoran/ -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Schema version control
On Fri, Feb 11, 2011 at 12:16 AM, Bill Moran wrote: > The big caveat is that 99.9% of the database changes don't fall into those > "nontrivial" categories, and dbsteward makes those 99.9% of the changes > easy to do, reliable to reproduce, and easy to track. > My experience is maybe more like 95% than 99.9%, for what it's worth; they're the exception, but not rare. We've added some stuff to handle the other .1% as well, like > > hooks where you can put arbitrary SQL strings to be run > before or after the remainder of the automatic stuff is done. We probably > haven't seen every circumstance that needs special handling, but we've > already struggled through a bunch. > Here's a fairly common example, in the abstract: version 1 has two columns, i and j; version 2 has one column, k, where k = i + j; and version 3 has one column, x, where x = k * 2. Not only is updating from 1 to 2 tricky ("k = i + j" comes after the adding of "k" but before the removal of i and j; it's neither a "before" nor an "after"), but updating directly from 1 to 3 without first migrating to 2 is extremely hard. I suspect you'd need to snapshot the schema at each version where these are needed to update incrementally, rather than always trying to convert directly to the current version--maybe you already do that. Anyhow, just some thoughts based on my own experience with database updates--good luck. -- Glenn Maynard
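Spelled out, the 1 -> 2 and 2 -> 3 steps from that example each interleave DDL and DML, which is what makes a pure "before/after" hook awkward. A sketch only, assuming a table "t" and integer columns; the names come from the example above, not from any real schema.

-- Version 1 -> 2: k replaces i and j.
BEGIN;
ALTER TABLE t ADD COLUMN k integer;
UPDATE t SET k = i + j;   -- must run after adding k and before dropping i and j
ALTER TABLE t DROP COLUMN i;
ALTER TABLE t DROP COLUMN j;
COMMIT;

-- Version 2 -> 3: x replaces k.
BEGIN;
ALTER TABLE t ADD COLUMN x integer;
UPDATE t SET x = k * 2;
ALTER TABLE t DROP COLUMN k;
COMMIT;

Jumping straight from 1 to 3 would instead need UPDATE t SET x = (i + j) * 2, which is exactly the piece of knowledge a schema differ cannot recover from the two end states.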
Re: [GENERAL] Schema version control
On 10 Feb 2011, at 23:59, Bill Moran wrote: > The overview: > You store your schema and data as XML (this is easy to migrate to, because > it includes a tool that makes the XML from a live database) > Keep your XML schema files in some RCS. That reminds me of something I've been wondering about - How well do modern RCSs deal with structured data formats (like XML)? It would appear that most of them still use diff, which is line-based with limited context tracking, to determine change-sets. Is that combination guaranteed to result in valid XML if you merge revisions that are far enough apart? Or are there RCSs around that know about XML format (and possibly other structured formats) and handle it differently? I've heavily used RCSs (mostly Subversion) with, for example, HTML, and merge conflicts haven't been unusual. It doesn't help that Subversion's diff is a bit simplistic about white-space; I'm sure some of those conflicts were quite unnecessary. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general