[GENERAL] Upgrading postgresql-8.4

2013-03-11 Thread Steve Erickson
I have a database that's about 600 GB.  I did a pg_dump (pg_dump --format plain 
--create --encoding UTF8 --oids --verbose --file "/var/tmp/145_backup.sql" 
"digi") and, after 442 GB, I got an error, "pg_dump: Error message from server: 
ERROR:  missing chunk number 0 for toast value 49209130 in pg_toast_17031".  
This was using postgresql-8.4.3 on Ubuntu 10.04.  I upgraded our postgres to 
8.4.16 using 'apt-get install postgresql-8.4'.

This went well and postgres restarted just fine.  However, now when I execute a 
pg_dump I get a missing chunk 0 for pg_toast_2619 while querying pg_attribute.  
I did a reindex on pg_toast_2619, then tried to VACUUM ANALYZE pg_attribute but 
again got the missing chunk 0 error.

Did I miss a step doing the upgrade or recovery attempt, or is the data 
corrupted?  I did the postgres upgrade and pg_dump on another server and it 
went very smoothly, so I'm flustered at how the problem seems to have gotten worse.


Re: [GENERAL] Upgrading postgresql-8.4

2013-03-11 Thread Steve Erickson
Thanks for the reply.  I deleted all rows in pg_statistic and the VACUUM 
ANALYZE on pg_attribute still fails.  I tried to reindex pg_toast_2619 and got 
an error: could not access status of transaction 1493786085: could not open 
file "pg_subtrans/5909": No such file or directory.  Sure enough, there is no 
such file - only 5905.


From: Tom Lane [t...@sss.pgh.pa.us]
Sent: Monday, March 11, 2013 12:10 PM
To: Steve Erickson
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Upgrading postgresql-8.4

Steve Erickson  writes:
> This went well and postgres restarted just fine.  However, now when I execute 
> a pg_dump I get a missing chunk 0 for pg_toast_2619 while querying 
> pg_attribute.  I did a reindex on pg_toast_2619, then tried to VACUUM ANALYZE 
> pg_attribute but again got the missing chunk 0 error.

> Did I miss a step doing the upgrade or recovery attempt, or is the
> data corrupted?

It's corrupt, but fortunately for you, 2619 is pg_statistic which is
eminently discardable data.  Just truncate pg_statistic and you should
be good.  If you aren't immediately abandoning the old database, you
might want to re-ANALYZE everything to reconstruct the stats.

We've seen one or two reports like this before, which makes me think
there might be a reproducible bug lurking somewhere around here; but
I don't suppose you have a recipe for getting a database into this
state ...

regards, tom lane
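
In SQL terms, the suggested recovery might look like the following (a sketch of
my own, not from the message above; run it as superuser, and note that plain
TRUNCATE is refused on system catalogs, so DELETE is used instead -
pg_statistic holds only planner statistics and is rebuilt by ANALYZE):

-- Discard the corrupt statistics rows:
DELETE FROM pg_catalog.pg_statistic;

-- If you keep using this database, rebuild the stats:
ANALYZE;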




Re: [GENERAL] indexing elements of a csv ?

2013-03-12 Thread Steve Erickson
An option would be to create a column of type tsvector.  That way you could do 
full-text searches on whole or partial words and get results that include rows 
containing other forms of the word.
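
As a sketch of that suggestion (my own illustration against the foo table from
the quoted question below, not tested against the poster's real schema; the
'simple' configuration avoids stemming surprises for tag-like data):

-- Maintain a tsvector alongside the csv column and index it with GIN.
ALTER TABLE foo ADD COLUMN col1_tsv tsvector;
UPDATE foo SET col1_tsv = to_tsvector('simple', replace(col1, ',', ' '));
CREATE INDEX foo_col1_tsv ON foo USING gin (col1_tsv);

-- Find rows whose csv contains the element 'bbb':
SELECT col0 FROM foo WHERE col1_tsv @@ to_tsquery('simple', 'bbb');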

From: pgsql-general-ow...@postgresql.org [pgsql-general-ow...@postgresql.org] 
on behalf of Gauthier, Dave [dave.gauth...@intel.com]
Sent: Tuesday, March 12, 2013 8:50 AM
To: pgsql-general@postgresql.org
Subject: [GENERAL] indexing elements of a csv ?

Hi:

v9.0.1 on linux.

I have a table with a column that is a csv.  Users will select records based 
upon the existence of an element of the csv.  There is an index on that column 
but I'm thinking that it won't be of much use in this situation.  Is there a 
way to facilitate these queries?

Example:

create table foo (col0 text, col1 text);
create index foo_col1 on foo (col1);
insert into foo (col0,col1) values 
('moe','aa,bbb,c'),('larry','x,bbb,yyy'),('curly','m,,oo');

now...
select col0 from foo where 

Some attempts, which get the right answers, but which probably won't be very 
efficient...

select col0 from foo where string_to_array('bbb',',') <@ string_to_array(col1,',');
select col0 from foo where ','||col1||','  like  '%,bbb,%';
select col0 from foo where ((col1 = 'bbb') or (col1 like 'bbb,%') or 
(col1 like '%,bbb,%') or (col1 like '%,bbb'));

Long shot, but I thought I'd ask anyway.
Thanks in Advance !
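
One indexable formulation (a sketch of my own, not from the thread, relying on
PostgreSQL's built-in GIN support for arrays, available in 9.0): index the
parsed array as an expression, then query with the array-containment operator,
which GIN can use:

create index foo_col1_elems on foo using gin (string_to_array(col1, ','));

select col0 from foo where string_to_array(col1, ',') @> ARRAY['bbb'];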


[GENERAL] Concurrent updates

2013-03-18 Thread Steve Erickson
I have a table that I want to use as a queue, with all functionality (insert,
update, delete) embodied in a stored procedure.  Inserts and deletes are no
problem.  An external program would call the stored procedure to get one or
more emails to work on, selecting on "state='N'", then updating the row so
"state='P'".  My problem is having multiple threads calling the stored
procedure simultaneously and getting the same row(s).  Selecting FOR UPDATE
won't work: if thread #1 gets 3 rows and thread #2 starts before thread #1
completes (commits), thread #2 will block on the same 3 rows; once thread #1
commits its update (changing the state to 'P'), those rows no longer meet
thread #2's criteria, and thread #2 will receive zero rows.  The table looks
like:

 

CREATE TABLE dss.stage_email
(
  emailid bigserial NOT NULL,
  email_path text,
  state character(1) DEFAULT 'N'::bpchar, -- N = New Email, P = Processing, D = Deleting
  fetch_date timestamp without time zone DEFAULT now(),
  ingest_date timestamp without time zone
)
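
The replies to this thread aren't archived here, but one 8.4-era pattern for
this race (a sketch of my own, not from the thread) is to guard each candidate
row with an advisory lock, so concurrent callers skip rows another worker has
already claimed:

-- Hypothetical claim query: each worker takes only rows it can lock.
-- Release with pg_advisory_unlock(emailid) once the state update commits.
SELECT emailid, email_path
  FROM dss.stage_email
 WHERE state = 'N'
   AND pg_try_advisory_lock(emailid)
 LIMIT 3;

-- Then, for each claimed id ($1 is a parameter placeholder):
UPDATE dss.stage_email SET state = 'P' WHERE emailid = $1;

One caveat with this sketch: pg_try_advisory_lock can fire for rows that LIMIT
later discards, so unlock any ids you don't actually use.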

 

Steve Erickson
Senior Developer
266 East 7th Street, Floor 4
Saint Paul, MN 55101
651.925.3237 office
612.242.1343 cell

 




[GENERAL] Problem with plpython

2009-10-30 Thread Steve Erickson
I'm running PostgreSQL 8.3 with pl/python 8.3.  I am getting a different
date/time format when executing the below examples.  The results are the
same whether I use os.popen or os.system.  In plpython, I run:
  import os
  cmd = 'ls -al /var/log/messages > /var/tmp/log'
  x = os.popen(cmd)
  for aline in x.readlines():
      plpy.notice('aline = %s', aline)

and the contents of /var/tmp/log are "-rw-r----- 1 syslog adm 495523 Oct 30
11:52 /var/log/messages".

When, within Python, I run:
>>> cmd = 'ls -al /var/log/messages > /var/tmp/log'
>>> x = os.popen(cmd)
>>> for aline in x.readlines():
...   print aline

the contents of /var/tmp/log are "-rw-r----- 1 syslog adm 23591 2009-10-30
13:03 /var/log/messages".

How, using plpython, can I get the output date/time in the same format as
when executing from within python itself?
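
No reply is archived for this thread, but a likely cause (my assumption, not
stated in the message) is that ls formats timestamps according to the calling
process's locale/TIME_STYLE environment, which differs between the postgres
server process and an interactive shell.  A locale-independent sketch avoids
parsing ls output entirely by reading the mtime and formatting it explicitly
(the temporary file below stands in for /var/log/messages so the example is
self-contained):

```python
import datetime
import os
import tempfile

# Stand-in file for /var/log/messages, created so the sketch runs anywhere.
fd, path = tempfile.mkstemp()
os.close(fd)

# Read the modification time directly instead of parsing `ls -al` output,
# then format it explicitly so the result never depends on locale settings.
st = os.stat(path)
mtime = datetime.datetime.fromtimestamp(st.st_mtime)
stamp = mtime.strftime("%Y-%m-%d %H:%M")
print(stamp)

os.remove(path)
```

The same os.stat call works inside a plpython function.  Alternatively, GNU ls
accepts --time-style=long-iso to force the YYYY-MM-DD HH:MM form regardless of
locale.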