[GENERAL] mssql migration and boolean to integer problems

2007-12-14 Thread robert
Hi all, I've spent the last few days hacking an MSSQL INSERT script to
work with 8.1.9 - I could build the latest postgres source if need be.
My latest problem is:

ERROR:  column "includeScenario" is of type boolean but expression is
of type integer
HINT:  You will need to rewrite or cast the expression.

So MSSQL uses tinyint for booleans, and I have about 50 of
those ;-). I googled a lot on this, and tried 4 or 5 different ideas
with functions and ALTER TABLE - but I can't find anything that
works with 8.1.9. Can someone please help me?

Thanks,
Robert
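
Two fixes are worth sketching here, assuming the generated INSERTs (or the
catalogs) can be edited. Boolean input accepts quoted '1'/'0', and 8.1's
built-in int4-to-bool cast, which is explicit-only, can be promoted so it
applies in assignment contexts:

-- Option 1: quote the values so they parse as boolean literals
-- (table and column names hypothetical):
INSERT INTO some_table (some_flag) VALUES ('1');

-- Option 2: a catalog hack - mark the int4->bool cast as usable for
-- assignment, so plain integer literals are accepted (test carefully):
UPDATE pg_cast SET castcontext = 'a'
WHERE castsource = 'int4'::regtype AND casttarget = 'bool'::regtype;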




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [GENERAL] mssql migration and boolean to integer problems

2007-12-14 Thread robert
On Dec 12, 11:09 pm, robert <[EMAIL PROTECTED]> wrote:
> Hi all, I've spent the last few days hacking an MSSQL INSERT script to
> work with 8.1.9 - I could build the latest postgres source if need be.
> My latest problem is:
>
> ERROR:  column "includeScenario" is of type boolean but expression is
> of type integer
> HINT:  You will need to rewrite or cast the expression.
>
> So MSSQL uses tinyint for booleans, and I have about 50 of
> those ;-). I googled a lot on this, and tried 4 or 5 different ideas
> with functions and ALTER TABLE - but I can't find anything that
> works with 8.1.9. Can someone please help me?
>
> Thanks,
> Robert

Really stuck, please help. I have this table:

create table "ASSETSCENARIO" ("assetScenarioID" int8 not null,
OBJ_VERSION int8 not null, "includeScenario" bool, "scenarioName"
varchar(2000), "probability" int8, "occurenceDate" timestamp, "notes"
varchar(2000), "priceTarget" float8, "assetID" int8 not null,
"created" timestamp not null, "modified" timestamp not null,
"createdUserID" int8 not null, "modifiedUserID" int8 not null,
"deleted" bool, primary key ("assetScenarioID"));

So it has two 'bool' columns - "includeScenario" and "deleted". I have
an insert like...

INSERT INTO "ASSETSCENARIO"
("assetScenarioID", OBJ_VERSION, "includeScenario", "scenarioName",
 "probability", "occurenceDate", "notes", "priceTarget", "assetID",
 "created", "modified", "createdUserID", "modifiedUserID", "deleted")
VALUES (197, 0, 1, '2007-12-13 11:31:00.000', '2007-12-13 11:31:00.000',
        2, 2, NULL);

I've tried:

CREATE FUNCTION boolean_integer_compare(boolean,integer) RETURNS
boolean AS $$
SELECT ($2 = 1 AND $1) OR ($2 = 0 AND NOT $1);
$$ LANGUAGE SQL;

CREATE OPERATOR = (
leftarg = boolean,
rightarg = integer,
procedure = boolean_integer_compare,
commutator = =
);

And alternatively:

CREATE FUNCTION notinttobool(integer, boolean) RETURNS boolean
AS '
begin
return not inttobool($1,$2);
end;
'
LANGUAGE plpgsql;

CREATE OPERATOR = (
PROCEDURE = inttobool,
LEFTARG = boolean,
RIGHTARG = integer,
COMMUTATOR = =,
NEGATOR = <>
);

CREATE OPERATOR <> (
PROCEDURE = notinttobool,
LEFTARG = integer,
RIGHTARG = boolean,
COMMUTATOR = <>,
NEGATOR = =
);

CREATE OPERATOR = (
PROCEDURE = inttobool,
LEFTARG = integer,
RIGHTARG = boolean,
COMMUTATOR = =,
NEGATOR = <>
);

CREATE OPERATOR <> (
PROCEDURE = notinttobool,
LEFTARG = boolean,
RIGHTARG = integer,
COMMUTATOR = <>,
NEGATOR = =
);

Lastly, I tried:

ALTER TABLE table
ALTER COLUMN field1 TYPE boolean
USING CASE WHEN field1 = 0 THEN FALSE
WHEN field1 = 1 THEN TRUE
ELSE NULL
END;

Each time I get:

ERROR:  column "includeScenario" is of type boolean but expression is
of type integer
HINT:  You will need to rewrite or cast the expression.

Right now I'm trying to "cast the expression." - how do I do that in
this case?

Thanks,
Robert
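
For the record, the CREATE OPERATOR attempts above cannot help: the error
comes from assignment coercion in the INSERT's target list, not from a
comparison. "Casting the expression" means changing the offending literals
themselves, e.g. (a sketch against the ASSETSCENARIO definition above, with
placeholder values for the other NOT NULL columns):

INSERT INTO "ASSETSCENARIO"
    ("assetScenarioID", OBJ_VERSION, "includeScenario", "assetID",
     "created", "modified", "createdUserID", "modifiedUserID", "deleted")
VALUES
    (197, 0, 1::boolean, 2, now(), now(), 2, 2, 0::boolean);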

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] Finding bad byte in "invalid byte sequence" error

2007-12-14 Thread robert
Hi all,

I'm trying to hack my insert script from MSSQL to work with postgres
8.1.9 - I can upgrade if need be. I'm getting this error:

psql -h localhost atdev < fuk2.sql
ERROR:  invalid byte sequence for encoding "UTF8": 0xe1204f
HINT:  This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
"client_encoding".

I tried:

/var/lib/pgsql> recode ascii..utf8 fuk2.sql
recode: fuk2.sql failed: Invalid input in step `ANSI_X3.4-1968..UTF-8'

And also dos2unix, but nothing is working. 0xe1204f looks like a hex
address, and I'm trying hexdump to find what it's complaining about,
but that's not helping either. Any ideas?

Robert
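
For what it's worth, 0xe1204f is the offending byte sequence itself (0xE1,
space, 'O'), not an address - and 0xE1 is 'á' in Latin-1, which suggests the
dump is Latin-1/WIN1252 data rather than UTF-8. A sketch of letting the
server convert it, assuming that guess is right:

-- inside psql, before feeding the script (or put the SET at the top of
-- fuk2.sql):
SET client_encoding TO 'LATIN1';   -- or 'WIN1252' for data from MSSQL on Windows
\i fuk2.sql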

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[GENERAL] initdb of pg 9.0.13 fails on pg_authid

2013-04-04 Thread Robert
I'm having some trouble installing `9.0.13`.

Compiling worked just fine. (The only flags used were `--prefix=/opt/pg9013
--with-perl`).

However after running bin/initdb, it fails:

The files belonging to this database system will be owned by user "mobit".
This user must also own the server process.

The database cluster will be initialized with locale en_US.UTF-8.
The default database encoding has accordingly been set to UTF8.
The default text search configuration will be set to "english".

fixing permissions on existing directory /opt/pg9013/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 24MB
creating configuration files ... ok
creating template1 database in /opt/pg9013/data/base/1 ... ok
initializing pg_authid ... FATAL:  wrong number of index expressions
STATEMENT:  REVOKE ALL on pg_authid FROM public;
child process exited with exit code 1
initdb: removing contents of data directory "/opt/pg9013/data"
The md5sum of my tarball checks out. This was built w/ GCC-4.8.0 on an
x86_64 Linux kernel.

Any ideas why this might fail? Thanks!
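
One plausible culprit, offered as a guess: GCC 4.8's aggressive loop
optimizations are known to miscompile older PostgreSQL releases, producing
exactly this "wrong number of index expressions" failure while initdb
initializes pg_authid. A workaround sketch, assuming that is what is
happening here:

./configure --prefix=/opt/pg9013 --with-perl \
    CFLAGS='-O2 -fno-aggressive-loop-optimizations'
make && make install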


[GENERAL] mssql to postgres problems with bytea help needed

2008-03-06 Thread robert
E00024C000C7761736852756C654461746571007E00077870707070707070707070707070707070707070707070707070707070707070707372000E6A6176612E6C616E672E4C6F6E673B8BE490CC8F23DF0200014A000576616C7565787200106A6176612E6C616E672E4E756D62657286AC951D0B94E08B0278717070707070707070707070707070707070707070707070707070707070707070707070707070707070707070707070707070707070707070707070707070);

If that line is not showing up on your newsreader, please see this
link:

http://braziloutsource.com/random/postgres_bytea_problem.txt

When inserting, I get: ERROR:  syntax error at or near "xACED0005


I've been googling / reading the docs and I tried several things that
have not worked, but thought I'd post here while I'm trying solutions.
Any ideas? I'm open to any cast or modifying the insert to use psql
functions, etc.

Thanks!
Robert
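
A sketch of one approach, assuming the MSSQL dump emits binary columns as
0x-prefixed hex strings: drop the "0x" prefix and feed the hex digits
through decode(), which returns bytea.

-- table and column names hypothetical:
INSERT INTO java_objects (obj_data)
VALUES (decode('ACED0005', 'hex'));   -- hex digits only, no leading 0x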

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[GENERAL] installing tsearch2

2011-10-19 Thread robert

Hi there,
this is my first post to this list..
I have been a casual user of postgres for the past 2-3 years but was never 
involved in installation and such.


Now I must help with the migration of an 8.3 system to 9.1.
The problem we face is that the current system is using tsearch2, which was
already ported from an 8.1 installation.


My questions now are:
- how do I install tsearch2?
  I compiled the 9.1 source and executed make all / install in the contrib
  directory; now I find there a tsearch2--1.0.sql and a
  tsearch2--unpackaged--1.0.sql file.
  Do I have to execute them?
  Both? (see the sketch below)

- in the dump of the old database there are references to gtsvector_in and
  gtsvector_out and similar, of which I can find nothing in the 9.1 files.

  are these _in/_out objects needed anymore?
  can I overlook the errors when importing the old dump?

thanks for your time
robert
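
A sketch of the 9.1 mechanics, assuming make install succeeded under
contrib/tsearch2: the tsearch2--*.sql files are not meant to be executed by
hand; they are consumed by CREATE EXTENSION.

-- fresh installation of the compatibility module:
CREATE EXTENSION tsearch2;

-- or, to adopt old-style tsearch2 objects restored from a pre-9.1 dump:
CREATE EXTENSION tsearch2 FROM unpackaged;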

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] upgrading tsearch2: how to call function in trigger

2011-12-06 Thread robert

Hi there,

I am in the process of porting an application from V8.1 to 9.x
Now I have the following situation:

there is a function tsearch2:
CREATE FUNCTION tsearch2() RETURNS "trigger"
AS '$libdir/tsearch2', 'tsearch2'
LANGUAGE c;

a second function:
CREATE FUNCTION dropatsymbols(text) RETURNS text
AS $_$select replace(replace($1,'-', ' '),':', ' ');$_$
LANGUAGE sql;

and an update trigger that is fired when a record is inserted:
CREATE TRIGGER dok_tsvectorupdate
BEFORE INSERT OR UPDATE ON dokument
FOR EACH ROW
EXECUTE PROCEDURE tsearch2('idxfti_dok', 'dropatsymbols', 'docnum', 
'titel', 'deskriptoren', 'ablage', 'ort');



Now my problem:
In above trigger tsearch2 is called with:
tsearch2('idxfti_dok', 'dropatsymbols', 'docnum', 'titel', 'deskriptoren', 
'ablage', 'ort');

which generates the error:
ERROR:  column "dropatsymbols" does not exist

which is correct as dropatsymbols is a function and not a column.

what should I do to avoid said error?

thanks for your time

robert
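
One way out, sketched under the assumption that the goal is simply to keep
feeding dropatsymbols()'s output into the index column: the 9.x trigger
evidently treats every argument after the configuration as a column name,
so replace it with a plain PL/pgSQL trigger that builds the tsvector
itself (drop the old trigger first; 'simple' is a placeholder for whatever
configuration the old installation used, and the columns are assumed to be
text - cast them otherwise):

CREATE FUNCTION dok_tsvectorupdate_fn() RETURNS trigger AS $$
BEGIN
    NEW.idxfti_dok := to_tsvector('pg_catalog.simple',
        dropatsymbols(coalesce(NEW.docnum, '')       || ' ' ||
                      coalesce(NEW.titel, '')        || ' ' ||
                      coalesce(NEW.deskriptoren, '') || ' ' ||
                      coalesce(NEW.ablage, '')       || ' ' ||
                      coalesce(NEW.ort, '')));
    RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER dok_tsvectorupdate
    BEFORE INSERT OR UPDATE ON dokument
    FOR EACH ROW EXECUTE PROCEDURE dok_tsvectorupdate_fn();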

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] simple md5 authentication problems

2006-05-05 Thread robert
Hi all, hope this is the right list.

I have postgres 8.1 running on linux. We have tests that mostly run on
Windows. I want to run these tests on linux.

On these Windows boxes, pg_hba.conf has just one line:

host    all    all    127.0.0.1/32    md5

They use 'postgres' as the user and password to connect to a db.

I couldn't start postgres on linux with just that line, so on linux I
have:

# "local" is for Unix domain socket connections only
local   all all   ident sameuser
# IPv4 local connections:
hostall all 127.0.0.1/32  md5
# IPv6 local connections:
hostall all ::1/128   ident sameuser

I created my db as:
postgres=# CREATE DATABASE maragato_test OWNER postgres;

I seem to have a user 'postgres' - I'm using the default.

postgres=# SELECT * FROM "pg_user";
 usename  | usesysid | usecreatedb | usesuper | usecatupd | passwd | valuntil | useconfig
----------+----------+-------------+----------+-----------+--------+----------+-----------
 postgres |       10 | t           | t        | t         |        |          |

However, I get this error:

/home/postgres> psql -h localhost maragato_test postgres
Password for user postgres:
psql: FATAL:  autenticação do tipo password falhou para usuário
"postgres"

Sorry - couldn't get locale en_US working. That translates to:
Authentication of type password failed for user postgres. I think that
means 'ident password'. I tried to connect with java and I get the
same error.

I just need to connect to db 'maragato_test' on localhost using
'postgres' as the user and password, using md5.

Any ideas?
Robert
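
One detail worth checking: md5 authentication can only succeed once the
role actually has a password stored (pg_shadow.passwd is NULL for a role
without one; the pg_user view always masks it). A sketch, password value
hypothetical:

ALTER USER postgres WITH PASSWORD 'postgres';
-- verify:
SELECT usename, passwd IS NOT NULL AS has_password FROM pg_shadow;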


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] simple md5 authentication problems

2006-05-08 Thread robert
Thanks for the response, but changing to 127.0.0.1 didn't help.

Changing this line sets the db wide open:

host    all    all    127.0.0.1/32    trust

From there, another non-root login can access it with any user /
password.

What I really need is this command to work with a non-root
account, with only the right username and password - in this case the
pre-configured postgres account:

psql -U postgres -h 127.0.0.1

robert

"chris smith" escreveu:

> On 5 May 2006 02:22:32 -0700, robert <[EMAIL PROTECTED]> wrote:
> > Hi all, hope this is the right list.
> >
> > I have postgres 8.1 running on linux. We have tests that mostly run on
> > windows. I want to run these tests on linux.
> >
> > On these windows boxes, pg_hba.conf has just one line:
> >
> > host    all    all    127.0.0.1/32    md5
> >
> > They use 'postgres' as the user and password to connect to a db.
> >
> > I couldn't start postgres on linux with just that line, so on linux I
> > have:
> >
> > # "local" is for Unix domain socket connections only
> > local   all all   ident sameuser
> > # IPv4 local connections:
> > hostall all 127.0.0.1/32  md5
> > # IPv6 local connections:
> > hostall all ::1/128   ident sameuser
> >
> > I created my db as:
> > postgres=# CREATE DATABASE maragato_test OWNER postgres;
> >
> > I seem to have a user 'postgres' - I'm using the default.
> >
> > postgres=# SELECT * FROM "pg_user";
> >  usename  | usesysid | usecreatedb | usesuper | usecatupd | passwd | valuntil | useconfig
> > ----------+----------+-------------+----------+-----------+--------+----------+-----------
> >  postgres |       10 | t           | t        | t         |        |          |
> >
> > However, I get this error:
> >
> > /home/postgres> psql -h localhost maragato_test postgres
> > Password for user postgres:
> > psql: FATAL:  autenticação do tipo password falhou para usuário
> > "postgres"
> >
> > Sorry - couldn't get locale en_US working. That translates to:
> > Authentication of type password failed for user postgres. I think that
> > means 'ident password'. I tried to connect with java and I get the
> > same error.
> >
> > I just need to connect to db 'maragato_test' on localhost using
> > 'postgres' as the user and password, using md5.
>
> Try '-h 127.0.0.1' rather than 'localhost' - it's still seeing the
> connection as coming through the socket, not through tcpip, so it's
> matching the "ident" rule.
>
> --
> Postgresql & php tutorials
> http://www.designmagick.com/
>
> ---(end of broadcast)---
> TIP 3: Have you checked our extensive FAQ?
> 
>http://www.postgresql.org/docs/faq


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] simple md5 authentication problems

2006-05-09 Thread robert

Bruno Wolff III escreveu:

> On Mon, May 08, 2006 at 23:10:31 +0900,
>   kmh496 <[EMAIL PROTECTED]> wrote:
> > doesn't that user have to exist since you are using ident method?  that
> > means unix username == postgres username.
> > do you have a user named maragato_test on the system?
> > did you create that user in postgres and on the system?
>
> Note that you can create custom mappings for which the unix user is not
> the same as the postgres user.
>

I don't want to use ident and the unix user name. Let me try and
simplify my question:

1) Isn't the user 'postgres' pre-configured? Running this seems to
imply so: 'select datname from pg_database;'
 datname
----------
 postgres

2) Is there a way to use this user 'postgres' with a non-root unix
account _not_ named postgres? I just want _any_ method - md5, ident,
whatever, that allows access to my db with user 'postgres' from an
account called myuser1, myuser2, and myuser3. Tomorrow it might be
myuser4.

3) I'm willing to try and use custom mappings if that's the easiest way
to solve my problem. 

Thanks for the help,
Robert
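
On (1): 'select datname from pg_database' lists databases rather than
users; roles are visible in pg_user, and the postgres superuser role is
indeed created by initdb. On (2): once the connection comes in over TCP
with md5, the unix account name is irrelevant, so no mapping is needed. A
sketch (password value hypothetical):

ALTER USER postgres WITH PASSWORD 'secret';

-- with the pg_hba.conf line:  host  all  all  127.0.0.1/32  md5
-- any of myuser1, myuser2, ... can then run:
--     psql -U postgres -h 127.0.0.1 maragato_test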


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] Q: regcomp failed with error invalid character range

2000-06-15 Thread Robert

I need to select records with description containing chars with the highest
bit set, but

select * from table where descr ~ '.*ATU[\0200-\0377].*';

fails with error

ERROR:  regcomp failed with error invalid character range

Any idea how to work around it? Thanks.

- Robert

P.S. This is 7.0.2 from RPM on RH 6.1, locale is CZ, database is with
encoding 'latin2' (8).



[GENERAL] [Fwd: Q: regcomp failed with error invalid character range]

2000-06-15 Thread Robert

With [\200-\377] instead of [\0200-\0377] it works. Sorry.

- R.


I need to select records with description containing chars with the highest
bit set, but

select * from table where descr ~ '.*ATU[\0200-\0377].*';

fails with error

ERROR:  regcomp failed with error invalid character range

Any idea how to work around it? Thanks.

- Robert

P.S. This is 7.0.2 from RPM on RH 6.1, locale is CZ, database is with
encoding 'latin2' (8).




Re: [GENERAL] Invoices

2001-05-04 Thread Robert

Vince Vielhaber wrote:
> 
> On Fri, 4 May 2001, Roderick A. Anderson wrote:
> 
> > On Fri, 4 May 2001, Robert wrote:
> >
> > >   I write the same application for the same reasons - we're three
> > > partners and we all want to be able to create invoices and also see what
> > > others create.
> >
> > Have you looked at SQL-Ledger?  Though I disagree with some of the
> > table designs it is a pretty good accounting package.  (Given I'm not an
> > accountant.)  Possible down side is it is web-server/browser based.  Has
> > support designed in for international use.
> 
> Got a URL?

  It used to be at http://www.simtax.ca/acc/ but it's not responding
right now. I didn't like a few things about it, but I don't remember what
they were anymore; I'm going to take a look at it again though.
  Anyway, there's a similar service from Oracle (and a few others), but it's
not free (just a free trial) and when I last checked it wasn't as nice as
it could be. And it wasn't built on PostgreSQL ;-)

  - Robert

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



[GENERAL] 6.5 connection time: too slow (?)

1999-11-03 Thread Robert

Hi,

 I'm finally about to upgrade our prehistoric 6.1 to 6.5.2 and took a few
hours to do some highly unscientific (Perl/DBI) benchmarking just to see
the difference... I was quite shocked when I found out that PG6.1 (on
Linux 2.0.34, loaded K6/166 server) was consistently three times faster
when connecting to the database than the latest PG6.5.2 (on Linux 2.2.12,
unloaded P2/233, faster disk). The difference was 0.102-0.108 sec vs.
0.293-0.296 sec. For web usage, this is quite slow, even if the actual
execute is usually fast enough (0.05-0.15 sec). Does anybody have any
idea what's going on? Specifically, is it possible the RedHat 6.1 RPM I'm
using was compiled with some stupid switch that slows things down? Any
advice on how to speed things up?

P.S. Apache::DBI is not of much use here, because I'm using 6+ different
databases, so it would keep six backends per http child... Thanks for
your comments.

- Robert






Re: [GENERAL] 6.5 connection time: too slow (?)

1999-01-02 Thread Robert

Bruce Momjian wrote:

> Very strange.  Same postmaster flags?  I can't imagine what it would be?
> We really have been improving performance, including startup
> performance.

The only difference I see is the -i flag on the 6.5 postmaster, but removing it
doesn't change a thing; connecting is consistently 0.2934-0.2939 sec (the only
other flag both servers use is -S). To explain, I'm running a simple test script
starting with:

use DBI;
use Data::Dumper;
use Time::HiRes qw/gettimeofday tv_interval/;

$start = [gettimeofday];
$db = DBI->connect("DBI:Pg:dbname=rfp") or die $DBI::errstr;
print "connect: ", tv_interval($start, [gettimeofday]), "\n";

Maybe both servers were compiled with different options, but I don't
know how to check with which ones. Any simple way to find out?
BTW, the exact results were:

P2/233, PG6.5.2
connect                                   0.295
prepare                                   0.001
execute (select * from table)
  60 rows/10 cols                         0.153
  15000 rows/4 cols                       1.345
  15000 rows/4 cols/LIMIT 30              0.044
  15000 rows/4 cols/LIMIT 30 OFFSET 2000  0.050
fetch 30                                  0.002
finish                                    0.000
disconnect                                0.002

K6/166, PG6.4
connect                                   0.105
prepare                                   0.001
execute (select * from table)
  60 rows/10 cols                         0.162
  15000 rows/4 cols                       3.508 (finish 0.091)
fetch 30                                  0.008
finish                                    0.002
disconnect                                0.001

It shows that LIMIT/OFFSET brings a 30-fold speedup as expected (the main
reason for upgrading, after all) and also that the same (bigger) select
takes 1.345 vs. 3.508 sec, which corresponds reasonably well to the slower
machine and older postgres. Only the connection time is a mystery.

- Robert






Re: [GENERAL] Czech Win1250 sorting q

1999-12-11 Thread Robert

netkof=> show client encoding;
ERROR:  parser: parse error at or near "encoding"
netkof=> set client encoding to 'WIN1250';
ERROR:  parser: parse error at or near "encoding"
netkof=> set client-encoding to 'WIN1250';
ERROR:  parser: parse error at or near "-"

Hmm, what did you say I should write? Well, this is PG6.5.2 installed
from RPM, should it be compiled with some special option? Thanks.

- Robert


David Sauer wrote:

> >>>> "Robert" == Robert  <[EMAIL PROTECTED]> writes:
>
> Robert> Hi, ISO-8851-2 Czech support (esp. sorting) works fine, but my
> Robert> data use Win1250. PG 6.5.1+ is supposed to support it, can some
> Robert> kind soul tell me how can I use it? Thanks.
>
> Robert> - Robert
>
> Just do:
>
> set client encoding to 'WIN1250';
>
>  Server will store its data in iso-8859-2, but will recode input and
> output for windoze clients.
>
>   The reason is, that postgres database server uses locales to sort tables
> and there is no 'cs_CZ.WIN-1250' locale - and I hope, that never will be.
>
> --
> * David Sauer, student of Czech Technical University
> * electronic mail: [EMAIL PROTECTED] (mime compatible)






[GENERAL] With what options RPM was compiled? ( Was: Czech Win1250 sorting q)

1999-12-13 Thread Robert

I've upgraded to 6.5.3 from RPM but 'set client_encoding' still says
'unknow variable'. How can I know whether the RPM was compiled
with --enable-locale and --with-mb=LATIN2? And if it wasn't,
could I humbly suggest it for the next version?

If I knew it wasn't compiled with the right options, I could just
take a deep breath and try to recompile it myself. However, there's
no info about the compile options used to prepare the RPM and I'm not
sure I'm not making some trivial mistake... I'd like to avoid recompiling
the whole thing if I can.

- Robert

P.S. David, thanks for your help so far. I forgot to send my last mail
to the list and sent it to your address only instead. Sorry.

David Sauer wrote:

> >>>> "Robert" == Robert  <[EMAIL PROTECTED]> writes:
>
> Robert> netkof=> show client encoding;
> Robert> ERROR: parser: parse error at or
>
> Sorry, my mistake. The right name is 'client_encoding' (with underscore).
>
> Example:
>
> david=> set client_encoding to 'latin2';
> SET VARIABLE
>
> Robert> Hmm, what did you say I should write? Well, this is PG6.5.2
> Robert> installed from RPM, should it be compiled with some special
> Robert> option? Thanks.
>
> Yes, postgres must be compiled with --enable-locale and --with-mb=LATIN2.
> And, I'am not sure, but may want upgrade to 6.5.3.
>
> --
> * David Sauer, student of Czech Technical University
> * electronic mail: [EMAIL PROTECTED] (mime compatible)
>
> 






[GENERAL] Czech2ASCII with --mb=Latin2

1999-12-15 Thread Robert

Hi,

  I have a database in Latin2 encoding (Czech stuff) and Latin2/Win1250
on-the-fly recoding with 'set client_encoding' works smoothly. Now, when
I set client_encoding to SQL_ASCII, accented characters are converted to
(hex) codes. Is there any (simple) way to make this recoding convert
accented characters to just the chars themselves but without the accents?
Thanks in advance.

- Robert

P.S. Moreover, non-Czech speakers tend to search the database with
words without accents, so it would be useful to make this conversion work
in the other direction: name LIKE 'ceske%' would also return names
starting with the accented version.

P.S.2 I could do this quite easily in Perl on the application level, but
don't want to start programming before I'm sure there's no standard
postgres solution.
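
For the record, a server-side option exists in releases newer than 6.5:
to_ascii() strips accents for LATIN1, LATIN2 and WIN1250 text. A sketch
with hypothetical table and column names:

SELECT name
FROM customers
WHERE lower(to_ascii(name, 'LATIN2')) LIKE 'ceske%';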







PostgreSQL Portable Runtime (was Re: [GENERAL] Future of PostgreSQL)

1999-12-30 Thread Robert

Hi,

 one of the important factors that contributed to the popularity and success of
Apache, Perl, Tcl/Tk etc. was their platform independence. I'm a big fan of Unix (and
an even bigger one of Postgres ;-), but BeOS, MacOS X, even Win2000 all look quite
interesting too and I don't want to tie myself to just one platform. More platforms
will bring in more users, more testers and more hackers and thus a much better
Postgres (hopefully).

Bruce M. says Postgres depends so much on Unix that to port it would be about as
hard as porting the whole Unix kernel. So here's the idea for the next major release:
how about some kind of 'PostgreSQL Portable Runtime' that would isolate system-
dependent stuff and make PostgreSQL reasonably portable? Apache has its 'Apache
Portable Runtime', so has Netscape/Mozilla, and while they're clearly very different
applications, I believe it's not impossible.

I understand this would be a LOT of work and most Postgres developers might not be
immediately attracted, but look at it this way: Postgres is currently unique among
db servers with its features, robustness, performance and nice licence, but what if
mSQL/MySQL finally add transactions and other features and/or free their licence? Or
one of the big guys, say IBM, gets enlightened/desperate enough to release source?
Suddenly there would be a strong competitor to Postgres and being crossplatform
would give them a great advantage.

I'm a web developer and with Apache and Perl (and mod_perl), I'm quite happy. Now
that Mozilla M12 is quite usable I can develop on almost any platform I want... but I
want Postgres and it brings me back to Unix with its beautiful UI, great multimedia
support and Age of Empires running under Wine.  *sigh*

- Robert

P.S. Cygwin is definitely one of the options, but RedHat/Cygnus's plans are not very
clear at this point and a few months ago there were even some rumors about plans for
a 'more restrictive licence' for cygwin - and anyway, cygwin wouldn't be of any help
to Mac/BeOS/VAX/mainframe people.







Re: [GENERAL] Announce: PostgreSQL-6.5.3 binaries available for Windows NT

2000-01-03 Thread Robert

Kevin Lo wrote:

> Some people asked me to build PostgreSQL binaries for Windows NT.
> The binaries(PostgreSQL-6.5.3) are now available at:
>
> ftp://203.79.167.135/pub/postgres-nt-binaries.tar.gz

Hi,

I'm trying this on Win98 (cygwin b20.1): initdb creates template1 just
fine (I had to run it as sh initdb), but whenever I try to run
postgres.exe,
it complains

FATAL 1:  Database system does not exist.  PGDATA
directory '/usr/local/pgsql/data' not found

even if two minutes ago it worked and the directory is there, of course.
Some funny problem with mount? Any help will be greatly appreciated.

- Robert








[GENERAL] Numeric type in PL/pgSQL trigger: overflow problem

2000-02-20 Thread Robert

Hi,

 the following function doesn't work with the numeric type; when I change the
type of 'total_no_vat' to anything else, incl. float, it works OK. What am I
doing wrong? Thanks for your help.

- Robert

CREATE TABLE deb_invoice_line_table (
qty NUMERIC(13,2),
unit_price  NUMERIC(13,2),
total_no_vatNUMERIC(13,2),
-- ...
)

CREATE FUNCTION deb_invoice_line_complete() RETURNS OPAQUE AS 'BEGIN
NEW.total_no_vat := 5.2;
RETURN NEW;
END;' LANGUAGE 'plpgsql';

CREATE TRIGGER deb_invoice_line_trigger
BEFORE INSERT OR UPDATE
ON deb_invoice_line_table
FOR EACH ROW EXECUTE PROCEDURE deb_invoice_line_complete();

INSERT INTO deb_invoice_line_table (qty,unit_price) VALUES (3,33);

ERROR:  overflow on numeric ABS(value) >= 10^0 for field with precision
0 scale 2
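
A workaround sketch, assuming the overflow comes from PL/pgSQL losing the
column's precision/scale when coercing the bare literal: cast the value
explicitly in the assignment.

CREATE FUNCTION deb_invoice_line_complete() RETURNS OPAQUE AS 'BEGIN
    NEW.total_no_vat := 5.2::numeric(13,2);  -- carry precision/scale explicitly
    RETURN NEW;
END;' LANGUAGE 'plpgsql';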





[GENERAL] Rounding/create C function question

2000-04-23 Thread Robert

Hi,

 I need some custom rounding function (say one or two decimal places);
what would be the simplest way to make one?

- Robert



P.S. I tried to create one in C like

#include 
#include "postgres.h"

double tst5(double x) {
return x; // ... whatever needed
}

and compiled it with (on RedHat 6.1, PG 6.5.3)

gcc -I/usr/include/pgsql -fPIC -shared 
-o libeucto.so.0.0.2 -Wl,-soname,libeucto.so.0.0.2 eucto_round.c

CREATE FUNCTION tst5(float8) RETURNS float8 
AS '/mnt/lotrando_c/projekty/eucto/bin/libeucto.so.0.0.2' LANGUAGE
'c';

This works as long as the function is int4/int4, but for float8 the
backend 'terminated abnormally'. Thanks for any help.
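
(With the old calling convention, float8 arguments are passed by reference,
i.e. as double *, not by value, which would explain the crash.) For simple
cases, a sketch that avoids C entirely, assuming this release can cast
float8 to numeric and has round(numeric, integer):

CREATE FUNCTION round2(float8) RETURNS float8
    AS 'SELECT round($1::numeric, 2)::float8'
    LANGUAGE 'sql';

SELECT round2(3.14159);  -- 3.14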



[GENERAL] Basic Question on Point In Time Recovery

2015-03-11 Thread Robert Inder
We are developing a new software system which is now used by a number
of independent clients for gathering and storing live data as part of
their day to day work.

We have a number of clients sharing a single server.  It is running
one Postgres service, and each client is a separate user with access
to their own database.  Each client's database will contain "hundreds
of thousands" of records, and will be supporting occasional queries by
a small number of users.   So the system is currently running on
"modest" hardware.

To guard against the server failing, we have a standby server being
updated by WAL files, so if the worst comes to the worst we'll only
lose "a few minutes" work.  No problems there.

But, at least while the system is under rapid development, we also
want to have a way to roll a particular client's database back to a
(recent) "known good" state, but without affecting any other client.

My understanding is that the WAL files mechanism is installation-wide
-- it will affect all clients alike.

So to allow us to restore data for an individual client, we're running
"pg_dump" once an hour on each database in turn.  In the event of a
problem with one client's system, we can restore just that one
database, without affecting any other client.
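
For concreteness, a sketch of one such hourly job (database name and path
hypothetical); the custom format keeps dumps compact and restorable
per-database with pg_restore:

pg_dump --format=custom --file=/backups/client1.hourly.dump client1_db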

The problem is that we're finding that as the number of clients grows,
and with it the amount of data, pg_dump is becoming more intrusive.
Our perception is that when pg_dump is running for any database,
performance on all databases is reduced.  I'm guessing this is because
the dump is making heavy use of the disk.

There is obviously scope for improving performance by getting using
more, or more powerful, hardware.  That's obviously going to be
necessary at some point, but it is obviously an expense that our
client would like to defer as long as possible.

So before we go down that route, I'd like to check that we're not
doing something dopey.

Is our current "frequent pg_dump" approach a sensible way to go about
things?  Or are we missing something?  Is there some other way to
restore one database without affecting the others?

Thanks in advance.

Robert.

-- 
Robert Inder,
Interactive Information Ltd,   3, Lauriston Gardens, Edinburgh EH3 9HH
Registered in Scotland, Company no. SC 150689
   Interactions speak louder than words


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Basic Question on Point In Time Recovery

2015-03-12 Thread Robert Inder
Thanks for your comments (so far:-)

I guess I'm pleased that nobody has said that I'm doing something stupid!

I'll certainly look at Slony and Barman.

And Stephane's suggestion of doing regular basebackups and keeping the
WAL files seems neat.  If I understand it, we'd use the/a standby
server to "replay" the entire installation up to the point when the
problem occurs, and then use pg_dump to dump just the database we
need.

I'm wondering just how the size of a day's worth of WAL files would
compare to a whole slew of hourly dumps.

The other issue would be how long the "replay" would take.  But, I
realise, that's not a major concern: the delay would only be seen by a
client that
had had a major problem.  Everyone else would see service as normal.

I think I'll be doing some experiments to find out:-)

Robert.

-- 
Robert Inder,
Interactive Information Ltd,   3, Lauriston Gardens, Edinburgh EH3 9HH
Registered in Scotland, Company no. SC 150689
   Interactions speak louder than words


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Basic Question on Point In Time Recovery

2015-03-12 Thread Robert Inder
Hi, Francisco,

On 11 March 2015 at 17:32, Francisco Olarte  wrote:

> This is, build an streaming replication slave, pg_dump from the slave. If
> needed, restore in the master.

I really like the idea of running pg_dump on the slave, but I don't understand
how I could do it.

Postgres on our live machine is regularly pushing WAL files to the standby.

Postgres on the standby machine is continually reading those files.
But that is all it will do. "pg_dump" just says "The database is starting up".

Could/should I have something configured differently?

Or Is this something that has changed with Postgres 9?
We're currently running Postgres 8.4.
Is this my specific reason to embark on an upgrade?

Robert.

-- 
Robert Inder,
Interactive Information Ltd,   3, Lauriston Gardens, Edinburgh EH3 9HH
Registered in Scotland, Company no. SC 150689
   Interactions speak louder than words


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Basic Question on Point In Time Recovery

2015-03-13 Thread Robert Inder
On 12 March 2015 at 12:31, Thomas Kellerer  wrote:

> 8.4 cannot run queries on the standby, you need to upgrade to a 
> supported/maintained version for this
> (this feature was introduced in 9.0)
>
> In 9.x you can start the slave as a "hot standby" to allow read only queries 
> which is what pg_dump needs.
>
> You should really upgrade to a current version 9.4 or 9.3
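
For reference, a sketch of the settings the quoted advice refers to (9.x
postgresql.conf):

wal_level = hot_standby    # on the primary
hot_standby = on           # on the standby; read-only queries (and pg_dump) then work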

Thought you were going to say that:-(

Well, I guess we have to do it some time, and now there is a reason
for it to happen sooner rather than later...

But even if (OK, "though")  I'm doing that, Steven's suggestion of
making the dump to a ram file system, then filing it as a separate
step, looks simple enough to be worth trying as a stop-gap...

Robert.

-- 
Robert Inder,
Interactive Information Ltd,   3, Lauriston Gardens, Edinburgh EH3 9HH
Registered in Scotland, Company no. SC 150689
   Interactions speak louder than words


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] How do I calculate the sum of a field filtered by multiple windows defined by another field?

2015-03-15 Thread Robert James
How do I calculate the sum of a field filtered by multiple windows
defined by another field?

I have table event with fields event_date, num_events, site_id. I can
easily use aggregate SQL to do SELECT SUM(num_events) GROUP BY
site_id.

But I also have another table site with fields site_id, target_date.
I'd like to do a JOIN, showing the SUM of num_events within 60 days of
the target_date, 90 days, 120 days, etc. I thought this could easily
be done using a WHERE clause in the aggregate SQL. However, this is
complicated by two challenges:

1. The target_date is not fixed, but varies for each site_id

2. I'd like multiple date ranges to be outputed in the same table; so
I can't do a simple WHERE to exclude records falling outside the range
from the event table

One workaround I've thought of is to simply make several queries, one
for each date range, using a different WHERE clause for each, and then
use a view to paste them together. Is there a simpler, better, or more
elegant way to achieve my goals?
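
For comparison, a single-pass sketch that avoids the UNION-of-queries
workaround, assuming event(site_id, event_date, num_events) and
site(site_id, target_date) with date columns, and reading "within N days"
as at most N days before target_date (adjust the predicate if the window
is two-sided):

SELECT s.site_id,
       SUM(CASE WHEN e.event_date >= s.target_date - 60  THEN e.num_events END) AS within_60,
       SUM(CASE WHEN e.event_date >= s.target_date - 90  THEN e.num_events END) AS within_90,
       SUM(CASE WHEN e.event_date >= s.target_date - 120 THEN e.num_events END) AS within_120
FROM site s
JOIN event e USING (site_id)
GROUP BY s.site_id;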


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Preserving formatting and comments in a VIEW definition

2015-03-15 Thread Robert James
When I save a VIEW, Postgres seems to convert it to a different
format, functionally equivalent but unrecognizable (whitespace,
comments, adds lots of casts, etc.)

Is there any simple way to preserve my original code?
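
Not within the database itself: PostgreSQL stores the parsed rewrite tree
and regenerates the text from it, so the original source is gone once the
view is created. A sketch of the usual workaround - keep the source under
version control and annotate the object (view name and path hypothetical):

COMMENT ON VIEW my_view IS 'canonical source: sql/views/my_view.sql';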


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] How do I calculate the sum of a field filtered by multiple windows defined by another field?

2015-03-15 Thread Robert James
Version 9.2.4

On 3/15/15, David G. Johnston  wrote:
> On Sunday, March 15, 2015, Robert James  wrote:
>
>> How do I calculate the sum of a field filtered by multiple windows
>> defined by another field?
>>
>> I have table event with fields event_date, num_events, site_id. I can
>> easily use aggregate SQL to do SELECT SUM(num_events) GROUP BY
>> site_id.
>>
>> But I also have another table site with fields site_id, target_date.
>> I'd like to do a JOIN, showing the SUM of num_events within 60 days of
>> the target_date, 90 days, 120 days, etc. I thought this could easily
>> be done using a WHERE clause in the aggregate SQL. However, this is
>> complicated by two challenges:
>>
>> 1. The target_date is not fixed, but varies for each site_id
>>
>> 2. I'd like multiple date ranges to be outputed in the same table; so
>> I can't do a simple WHERE to exclude records falling outside the range
>> from the event table
>>
>> One workaround I've thought of is to simply make several queries, one
>> for each date range, using a different WHERE clause for each, and then
>> use a view to paste them together. Is there a simpler, better, or more
>> elegant way to achieve my goals?
>>
>>
> I suggest you create and post a sample query, ideally using WITH & VALUES
> for sample data, that gets you the answer using UNION.  You will then have
> something to compare against and others can know exactly what you want
> instead of trying to figure it out from your limited description.
>
> What version does the solution need to work for?
>
> David J.
>


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Setting up replication

2015-03-17 Thread Robert Fitzpatrick
I have a question about setting up replication between my 
postgresql-9.3.6 servers. If I'm using pg_basebackup on my FreeBSD 10.1 
slave server, the postgresql.conf file is in the data directory, which 
pg_basebackup insists must be empty. I can't find any info about how to 
relocate the postgresql.conf file and tell the init script its new 
location. If I set up PITR prior to replication setup and share the same
postgresql.conf file transferred by pg_basebackup, how can both servers 
see the archive directory? It is local for the slave, nfs mount for the 
master.
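
On the relocation point, a sketch (paths hypothetical): the server accepts
an explicit config_file setting at startup, so postgresql.conf need not
live in the data directory that pg_basebackup wants empty:

pg_ctl start -D /usr/local/pgsql/data \
    -o '-c config_file=/usr/local/etc/postgresql/postgresql.conf'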


Obviously there is something I'm missing or not understanding, can 
anyone help? Thanks!


--
Robert



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-27 Thread Robert Haas
On Wed, May 27, 2015 at 6:21 PM, Alvaro Herrera
 wrote:
> Steve Kehlet wrote:
>> I have a database that was upgraded from 9.4.1 to 9.4.2 (no pg_upgrade, we
>> just dropped new binaries in place) but it wouldn't start up. I found this
>> in the logs:
>>
>> waiting for server to start2015-05-27 13:13:00 PDT [27341]: [1-1] LOG:
>>  database system was shut down at 2015-05-27 13:12:55 PDT
>> 2015-05-27 13:13:00 PDT [27342]: [1-1] FATAL:  the database system is
>> starting up
>> .2015-05-27 13:13:00 PDT [27341]: [2-1] FATAL:  could not access status of
>> transaction 1
>
> I am debugging today a problem currently that looks very similar to
> this.  AFAICT the problem is that WAL replay of an online checkpoint in
> which multixact files are removed fails because replay tries to read a
> file that has already been removed.

Hmm, so what exactly is the sequence of events here?  It's possible
that I'm not thinking clearly just now, but it seems to me that if
we're replaying the same checkpoint we replayed previously, the offset
of the oldest multixact will be the first file that we didn't remove.
However, I can see that there could be a problem if we try to replay
an older checkpoint after having already replayed a new one - for
example, if a standby replays checkpoint A truncating the members
multixact and performs a restart point, and then replays checkpoint B
truncating the members multixact again but without performing a
restartpoint, and then is shut down, it will resume replay from
checkpoint A, and trouble will ensue.  Is that the scenario, or is
there something else?

> I think the fix to this is to verify whether the file exists on disk
> before reading it; if it doesn't, assume the truncation has already
> happened and that it's not necessary to remove it.

That might be an OK fix, but this implementation doesn't seem very
clean.  If we're going to remove the invariant that
MultiXactState->oldestOffset will always be valid after replaying a
checkpoint, then we should be explicit about that and add a flag
indicating whether or not it's currently valid.  Shoving nextOffset in
there and hoping that's good enough seems like a bad idea to me.

I think we should modify the API for find_multixact_start.  Let's have
it return a Boolean and return oldestOffset via an out parameter.  If
!InRecovery, it will always return true and set the out parameter; but
if in recovery, it is allowed to return false without setting the out
parameter.  Both values can get stored in MultiXactState, and we can
adjust the logic elsewhere to disregard oldestOffset when the
accompanying flag is false.

This still leaves open an ugly possibility: can we reach normal
running without a valid oldestOffset?  If so, until the next
checkpoint happens, autovacuum has no clue whether it needs to worry.
There's got to be a fix for that, but it escapes me at the moment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-27 Thread Robert Haas
On Wed, May 27, 2015 at 10:14 PM, Alvaro Herrera
 wrote:
> Well I'm not very clear on what's the problematic case.  The scenario I
> actually saw this first reported was a pg_basebackup taken on a very
> large database, so the master could have truncated multixact and the
> standby receives a truncated directory but actually tries to apply a
> checkpoint that is much older than what the master currently has
> transmitted as pg_multixact contents.

OK, that makes sense.

>> That might be an OK fix, but this implementation doesn't seem very
>> clean.  If we're going to remove the invariant that
>> MultiXactState->oldestOffset will always be valid after replaying a
>> checkpoint, then we should be explicit about that and add a flag
>> indicating whether or not it's currently valid.  Shoving nextOffset in
>> there and hoping that's good enough seems like a bad idea to me.
>>
>> I think we should modify the API for find_multixact_start.  Let's have
>> it return a Boolean and return oldestOffset via an out parameter.  If
>> !InRecovery, it will always return true and set the out parameter; but
>> if in recovery, it is allowed to return false without setting the out
>> parameter.  Both values can get stored in MultiXactState, and we can
>> adjust the logic elsewhere to disregard oldestOffset when the
>> accompanying flag is false.
>
> Sounds good.  I think I prefer that multixact creation is rejected
> altogether if the new flag is false.  Is that what you mean when you say
> "adjust the logic"?

No.  I'm not sure quite what you mean here.  We can't reject multixact
creation during normal running, and during recovery, we won't create
any really new mulitxacts, but we must replay the creation of
multixacts.  What I meant was stuff like this:

if (!MultiXactIdPrecedes(result, MultiXactState->multiVacLimit) ||
(MultiXactState->nextOffset - MultiXactState->oldestOffset
> MULTIXACT_MEMBER_SAFE_THRESHOLD))

I meant that we'd change the second prong of the test to check
multiXactState->nextOffsetValid && MultiXactState->nextOffset -
MultiXactState->oldestOffset > MULTIXACT_MEMBER_SAFE_THRESHOLD.  And
likewise change anything else that relies on oldestOffset.  Or else we
guarantee that we can't reach those points until the oldestOffset is
valid, and then check that it is with an Assert() or elog().

>> This still leaves open an ugly possibility: can we reach normal
>> running without a valid oldestOffset?  If so, until the next
>> checkpoint happens, autovacuum has no clue whether it needs to worry.
>> There's got to be a fix for that, but it escapes me at the moment.
>
> I think the fix to that issue is to set the oldest offset on
> TrimMultiXact.  That way, once WAL replay finished we're certain that we
> have a valid oldest offset to create new multixacts with.
>
> I'm also wondering whether the call to DetermineSafeOldestOffset on
> StartupMultiXact is good.  At that point, we haven't replayed any WAL
> yet, so the oldest multi might be pointing at a file that has already
> been removed -- again considering the pg_basebackup scenario where the
> multixact files are copied much later than pg_control, so the checkpoint
> to replay is old but the pg_multixact contents have already been
> truncated in the master and are copied truncated.

Moving the call from StartupMultiXact() to TrimMultiXact() seems like
a good idea.  I'm not sure why we didn't do that before.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-28 Thread Robert Haas
On Wed, May 27, 2015 at 6:21 PM, Alvaro Herrera
 wrote:
> Steve Kehlet wrote:
>> I have a database that was upgraded from 9.4.1 to 9.4.2 (no pg_upgrade, we
>> just dropped new binaries in place) but it wouldn't start up. I found this
>> in the logs:
>>
>> waiting for server to start2015-05-27 13:13:00 PDT [27341]: [1-1] LOG:
>>  database system was shut down at 2015-05-27 13:12:55 PDT
>> 2015-05-27 13:13:00 PDT [27342]: [1-1] FATAL:  the database system is
>> starting up
>> .2015-05-27 13:13:00 PDT [27341]: [2-1] FATAL:  could not access status of
>> transaction 1
>
> I am debugging today a problem currently that looks very similar to
> this.  AFAICT the problem is that WAL replay of an online checkpoint in
> which multixact files are removed fails because replay tries to read a
> file that has already been removed.

Steve: Can you tell us more about how you shut down the old cluster?
Did you by any chance perform an immediate shutdown?  Do you have the
actual log messages that were written when the system was shut down
for the upgrade?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-28 Thread Robert Haas
On Wed, May 27, 2015 at 6:21 PM, Alvaro Herrera
 wrote:
> Steve Kehlet wrote:
>> I have a database that was upgraded from 9.4.1 to 9.4.2 (no pg_upgrade, we
>> just dropped new binaries in place) but it wouldn't start up. I found this
>> in the logs:
>>
>> waiting for server to start2015-05-27 13:13:00 PDT [27341]: [1-1] LOG:
>>  database system was shut down at 2015-05-27 13:12:55 PDT
>> 2015-05-27 13:13:00 PDT [27342]: [1-1] FATAL:  the database system is
>> starting up
>> .2015-05-27 13:13:00 PDT [27341]: [2-1] FATAL:  could not access status of
>> transaction 1
>
> I am debugging today a problem currently that looks very similar to
> this.  AFAICT the problem is that WAL replay of an online checkpoint in
> which multixact files are removed fails because replay tries to read a
> file that has already been removed.

Wait a minute, wait a minute.  There's a serious problem with this
theory, at least in Steve's scenario.  This message:

2015-05-27 13:13:00 PDT [27341]: [1-1] LOG: database system was shut
down at 2015-05-27

That message implies a *clean shutdown*.  If he had performed an
immediate shutdown or just pulled the plug, it would have said
"database system was interrupted" or some such.

There may be bugs in redo, also, but they don't explain what happened to Steve.

Steve, is there any chance we can get your pg_controldata output and a
list of all the files in pg_clog?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-28 Thread Robert Haas
On Thu, May 28, 2015 at 8:01 AM, Robert Haas  wrote:
> On Wed, May 27, 2015 at 6:21 PM, Alvaro Herrera
>  wrote:
>> Steve Kehlet wrote:
>>> I have a database that was upgraded from 9.4.1 to 9.4.2 (no pg_upgrade, we
>>> just dropped new binaries in place) but it wouldn't start up. I found this
>>> in the logs:
>>>
>>> waiting for server to start2015-05-27 13:13:00 PDT [27341]: [1-1] LOG:
>>>  database system was shut down at 2015-05-27 13:12:55 PDT
>>> 2015-05-27 13:13:00 PDT [27342]: [1-1] FATAL:  the database system is
>>> starting up
>>> .2015-05-27 13:13:00 PDT [27341]: [2-1] FATAL:  could not access status of
>>> transaction 1
>>
>> I am debugging today a problem currently that looks very similar to
>> this.  AFAICT the problem is that WAL replay of an online checkpoint in
>> which multixact files are removed fails because replay tries to read a
>> file that has already been removed.
>
> Wait a minute, wait a minute.  There's a serious problem with this
> theory, at least in Steve's scenario.  This message:
>
> 2015-05-27 13:13:00 PDT [27341]: [1-1] LOG: database system was shut
> down at 2015-05-27
>
> That message implies a *clean shutdown*.  If he had performed an
> immediate shutdown or just pulled the plug, it would have said
> "database system was interrupted" or some such.
>
> There may be bugs in redo, also, but they don't explain what happened to 
> Steve.
>
> Steve, is there any chance we can get your pg_controldata output and a
> list of all the files in pg_clog?

Err, make that pg_multixact/members, which I assume is at issue here.
You didn't show us the DETAIL line from this message, which would
presumably clarify:

FATAL:  could not access status of transaction 1

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-28 Thread Robert Haas
On Thu, May 28, 2015 at 8:03 AM, Robert Haas  wrote:
>> Steve, is there any chance we can get your pg_controldata output and a
>> list of all the files in pg_clog?
>
> Err, make that pg_multixact/members, which I assume is at issue here.
> You didn't show us the DETAIL line from this message, which would
> presumably clarify:
>
> FATAL:  could not access status of transaction 1

And I'm still wrong, probably.  The new code in 9.4.2 cares about
being able to look at an *offsets* file to find the corresponding
member offset.  So most likely it is an offsets file that is missing
here.  The question is, how are we ending up with an offsets file that
is referenced by the control file but not actually present on disk?

It seems like it would be good to compare the pg_controldata output to
what is actually present in pg_multixact/offsets (hopefully that's the
right directory, now that I'm on my third try) and try to understand
what is going on here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-28 Thread Robert Haas
On Thu, May 28, 2015 at 8:51 AM, Robert Haas  wrote:
> [ speculation ]

OK, I finally managed to reproduce this, after some off-list help from
Steve Kehlet (the reporter), Alvaro, and Thomas Munro.  Here's how to
do it:

1. Install any pre-9.3 version of the server and generate enough
multixacts to create at least TWO new segments.  When you shut down
the server, all segments except for the most current one will be
removed.  At this point, the only thing in
$PGDATA/pg_multixact/offsets should be a single file, and the name of
that file should not be 0000 or 0001.

2. Use pg_upgrade to upgrade to 9.3.4.  It is possible that versions <
9.3.4 will also work here, but you must not use 9.3.5 or higher,
because 9.3.5 includes Bruce's commit 3d2e18510, which arranged to
preserve relminmxid and datminmxid values.   At this point,
pg_controldata on the new cluster should show an oldestMultiXid value
greater than 1 (copied from the old cluster), but all the datminmxid
values are 1.  Also, initdb will have left behind a bogus  file in
pg_multixact/offsets.

3. Move to 9.3.5 (or 9.3.6), not via pg_upgrade, but just by dropping
in the new binaries.  Follow the instructions in the 9.3.5 release
notes; since you created at least TWO new segments in step one, there
will be no 0001 file, and the query there will say that you should
remove the bogus  file.  So do that, leaving just the good file in
pg_multixact/offsets.  At this point, pg_multixact/offsets is OK, and
pg_controldata still says that oldestMultiXid > 1, so that is also OK.
The only problem is that we've got some bogus datminmxid values
floating around.  Our next step will be to convince vacuum to
propagate the bogus datminmxid values back into pg_controldata.

4. Consume at least one transaction ID (e.g. SELECT txid_current())
and then do this:

postgres=# set vacuum_freeze_min_age = 0;
SET
postgres=# set vacuum_freeze_table_age = 0;
SET
postgres=# vacuum;
VACUUM

Setting the GUCs forces full table scans, so that we advance
relfrozenxid.  But notice that we were careful not to just run VACUUM
FREEZE, which would have also advanced relminmxid, which, for purposes
of reproducing this bug, is not what we want to happen.  So relminmxid
is still (incorrectly) set to 1 for every database.  However, since
the vacuum did advance relfrozenxid, it will call vac_truncate_clog,
which will call SetMultiXactIdLimit, which will propagate the bogus
datminmxid = 1 setting into shared memory.  (A query to watch this
happen is sketched just after these steps.)

(In my testing, this step doesn't work if performed on 9.3.4; you have
to do it on 9.3.5.  I think that's because of Tom's commit 78db307bb,
but I believe in a more complex test scenario you might be able to get
this to happen on 9.3.4 also.)

I believe it's the case that an autovacuum of even a single table can
substitute for this step if it happens to advance relfrozenxid but not
relminmxid.

5. The next checkpoint, or the shutdown checkpoint in any event, will
propagate the bogus value of 1 from shared memory back into the
control file.

6. Now try to start 9.3.7.  It will see the bogus oldestMultiXid = 1
value in the control file, attempt to read the corresponding offsets
file, and die.
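
As promised in step 4, here is a quick way to watch the mismatch develop
(just an illustrative catalog query, nothing version-specific beyond the
9.3 catalogs): after the plain VACUUM above, relfrozenxid has advanced
but relminmxid is still stuck at 1.

SELECT relname, relfrozenxid, relminmxid
FROM pg_class
WHERE relkind = 'r'
ORDER BY relname;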

In the process of investigating this, we found a few other things that
seem like they may also be bugs:

- As noted upthread, replaying an older checkpoint after a newer
checkpoint has already happened may lead to similar problems.  This
may be possible when starting from an online base backup; or when
restarting a standby that did not perform a restartpoint when
replaying the last checkpoint before the shutdown.

- pg_upgrade sets datminmxid =
old_cluster.controldata.chkpnt_nxtmulti, which is correct only if
there are ZERO multixacts in use at the time of the upgrade.  It would
be best, I think, to set this to the same value it had in the old
cluster, but if we're going to use a blanket value, I think it needs
to be chkpnt_oldstMulti.  (A catalog query for spotting the damage
follows this list.)

- There's a third possible problem related to boundary cases in
SlruScanDirCbRemoveMembers, but I don't understand that one well
enough to explain it.  Maybe Thomas can jump in here and explain the
concern.
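
As mentioned above, spotting the datminmxid damage takes only an ordinary
catalog query (my suggestion, not something from the original reports): a
datminmxid of 1 in a pg_upgraded cluster whose pg_controldata reports
oldestMultiXid > 1 is the fingerprint of the problem.

SELECT datname, datfrozenxid, datminmxid
FROM pg_database
ORDER BY datname;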

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-28 Thread Robert Haas
On Thu, May 28, 2015 at 4:06 PM, Joshua D. Drake  wrote:
> FTR: Robert, you have been a Samurai on this issue. Our many thanks.

Thanks!  I really appreciate the kind words.

So, in thinking this through further, it seems to me that the
situation is pretty dire:

1. If you pg_upgrade to 9.3 before 9.3.5, then you may have relminmxid
or datminmxid values which are 1 instead of the correct value.
Setting the value to 1 puts it too far in the past if your MXID
counter is < 2B, and too far in the future if it is > 2B.

2. If you pg_upgrade to 9.3.7 or 9.4.2, then you may have datminmxid
values which are equal to the next-mxid counter instead of the correct
value; in other words, they are too new.

3. If you pg_upgrade to 9.3.5, 9.3.6, 9.4.0, or 9.4.1, then you will
have the first problem for tables in template databases, and the
second one for the rest. (See 866f3017a.)

4. Wrong relminmxid or datminmxid values can eventually propagate into
the control file, as demonstrated in my previous post.  Therefore, we
can't count on relminmxid to be correct, we can't count on datminmxid
to be correct, and we can't count on the control file to be correct.
That's a sack of sad.

5. If the values are too far in the past, then nothing really terrible
will happen unless you upgrade to 9.3.7 or 9.4.2, at which point the
system will refuse to start.  Forcing a VACUUM FREEZE on every
database, including the unconnectable ones, should fix this and allow
you to upgrade safely - which you want to do, because 9.3.7 and 9.4.2
fix a different set of multixact data loss bugs.  (A sketch of that
remediation follows this list.)

6. If the values are too far in the future, the system may fail to
prevent wraparound, leading to data loss.  I am not totally clear on
whether a VACUUM FREEZE will fix this problem.  It seems like the
chances are better if you are running at least 9.3.5+ or 9.4.X,
because of 78db307bb.  But I'm not sure how complete a fix that is.
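
Here is the remediation sketched in point 5, using the usual trick for
unconnectable databases (run as superuser; flipping datallowconn by hand
is a hack, so restore it immediately afterwards):

-- temporarily allow connections to template0
UPDATE pg_database SET datallowconn = true WHERE datname = 'template0';
-- then, connected to template0 (and likewise in every other database):
VACUUM FREEZE;
-- finally, put the flag back
UPDATE pg_database SET datallowconn = false WHERE datname = 'template0';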

So what do we do about this?  I have a few ideas:

A. Most obviously, we should fix pg_upgrade so that it installs
chkpnt_oldstMulti instead of chkpnt_nxtmulti into datminmxid, so
that we stop creating new instances of this problem.  That won't get
us out of the hole we've dug for ourselves, but we can at least try to
stop digging.  (This is assuming I'm right that chkpnt_nxtmulti is the
wrong thing - anyone want to double-check me on that one?)

B. We need to change find_multixact_start() to fail softly.  This is
important because it's legitimate for it to fail in recovery, as
discussed upthread, and also because we probably want to eliminate the
fail-to-start hazard introduced in 9.4.2 and 9.3.7.
find_multixact_start() is used in three places, and they each require
separate handling:

- In SetMultiXactIdLimit, find_multixact_start() is used to set
MultiXactState->oldestOffset, which is used to determine how
aggressively to vacuum.  If find_multixact_start() fails, we don't
know how aggressively we need to vacuum to prevent members wraparound;
it's probably best to decide to vacuum as aggressively as possible.
Of course, if we're in recovery, we won't vacuum either way; the fact
that it fails softly is good enough.

- In DetermineSafeOldestOffset, find_multixact_start() is used to set
MultiXactState->offsetStopLimit.  If it fails here, we don't know when
to refuse multixact creation to prevent wraparound.  Again, in
recovery, that's fine.  If it happens in normal running, it's not
clear what to do.  Refusing multixact creation is an awfully blunt
instrument.  Maybe we can scan pg_multixact/offsets to determine a
workable stop limit: the first file greater than the current file that
exists, minus two segments, is a good stop point.  Perhaps we ought to
use this mechanism here categorically, not just when
find_multixact_start() fails.  It might be more robust than what we
have now.

- In TruncateMultiXact, find_multixact_start() is used to set the
truncation point for the members SLRU.  If it fails here, I'm guessing
the right solution is not to truncate anything - instead, rely on
intense vacuuming to eventually advance oldestMXact to a value whose
member data still exists; truncate then.

C. I think we should also change TruncateMultiXact() to truncate
offsets first, and then members.  As things stand, if we truncate
members first, we increase the risk of seeing an offset that will fail
when passed to find_multixact_start(), because TruncateMultiXact()
might get interrupted before it finishes.  That seems like an
unnecessary risk.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-28 Thread Robert Haas
what we're doing right now.
The current logic purports to leave a one-file gap in the member
space, but there's no guarantee that the gap really exists on disk the
way we think it does.  With this approach, we can be certain that
there is a gap.  And that is a darned good thing to be certain about.

>> C. I think we should also change TruncateMultiXact() to truncate
>> offsets first, and then members.  As things stand, if we truncate
>> members first, we increase the risk of seeing an offset that will fail
>> when passed to find_multixact_start(), because TruncateMultiXact()
>> might get interrupted before it finishes.  That seems like an
>> unnecessary risk.
>
> Not sure about this point.  We did it the way you propose previously,
> and found it to be a problem because sometimes we tried to read an
> offset file that was no longer there.  Do we really read member files
> anywhere?  I thought we only tried to read offset files.  If we remove
> member files, what is it that we try to read and find not to be present?

Do you have a link to the previous discussion?

I mean, the problem we're having right now is that sometimes we have
an offset, but the corresponding member isn't there.  So clearly
offsets reference members.  Do members also reference offsets?  I
didn't think so, but life is full of surprises.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 10:17 AM, Tom Lane  wrote:
> Thomas Munro  writes:
>> On Fri, May 29, 2015 at 11:24 AM, Robert Haas  wrote:
>>> B. We need to change find_multixact_start() to fail softly.
>
>> Here is an experimental WIP patch that changes StartupMultiXact and
>> SetMultiXactIdLimit to find the oldest multixact that exists on disk
>> (by scanning the directory), and uses that if it is more recent than
>> the oldestMultiXactId from shmem,
>
> Not sure about the details of this patch, but I was planning to propose
> what I think is the same thing: the way to make find_multixact_start()
> fail softly is to have it find the oldest actually existing file if the
> one that should be there isn't.

Working on that now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 12:43 PM, Robert Haas  wrote:
> Working on that now.

OK, here's a patch.  Actually two patches, differing only in
whitespace, for 9.3 and for master (ha!).  I now think that the root
of the problem here is that DetermineSafeOldestOffset() and
SetMultiXactIdLimit() were largely ignorant of the possibility that
they might be called at points in time when the cluster was
inconsistent.  SetMultiXactIdLimit() bracketed certain parts of its
logic with if (!InRecovery), but those guards were ineffective because
it gets called before InRecovery is set in the first place.

It seems pretty clear that we can't effectively determine anything
about member wraparound until the cluster is consistent.  Before then,
there might be files missing from the offsets or members SLRUs which
get put back during replay.  There could even be gaps in the sequence
of files, with some things having made it to disk before the crash (or
having made it into the backup) and others not.  So all the work of
determining what the safe stop points and vacuum thresholds for
members are needs to be postponed until TrimMultiXact() time.  And
that's fine, because we don't need this information in recovery anyway
- it only affects behavior in normal running.

So this patch does the following:

1. Moves the call to DetermineSafeOldestOffset() that appears in
StartupMultiXact() to TrimMultiXact(), so that we don't try to do this
until we're consistent.  Also, instead of passing
MultiXactState->oldestMultiXactId, pass the newer of that value and
the earliest offset that exists on disk.  That way, it won't try to
read data that's not there.  Note that the second call to
DetermineSafeOldestOffset() in TruncateMultiXact() doesn't need a
similar guard, because we already bail out of that function early if
the multixacts we're going to truncate away don't exist.

2. Adds a new flag MultiXactState->didTrimMultiXact to indicate whether
we've finished TrimMultiXact(), and arranges for SetMultiXactIdLimit()
to use that rather than InRecovery to test whether it's safe to do
complicated things that might require that the cluster is consistent.
This is a slight behavior change, since formerly we would have tried
to do that stuff very early in the startup process, and now it won't
happen until somebody completes a vacuum operation.  If that's a
problem, we could consider doing it in TrimMultiXact(), but I don't
think it's safe the way it was.  The new flag also prevents
oldestOffset from being set while in recovery; I think it would be
safe to do that in recovery once we've reached consistency, but I
don't believe it's necessary.

3. Arranges for TrimMultiXact() to set oldestOffset.  This is
necessary because, previously, we relied on SetMultiXactIdLimit doing
that during early startup or during recovery, and that's no longer
true.  Here too we set oldestOffset keeping in mind that our notion of
the oldest multixact may point to something that doesn't exist; if so,
we use the oldest MXID that does.

4. Modifies TruncateMultiXact() so that it doesn't re-scan the SLRU
directory on every call to find the oldest file that exists.  Instead,
it arranges to remember the value from the first scan and then updates
it thereafter to reflect its own truncation activity.  This isn't
absolutely necessary, but because this oldest-file logic is used in
multiple places (TrimMultiXact, SetMultiXactIdLimit, and
TruncateMultiXact all need it directly or indirectly) caching the
value seems like a better idea than recomputing it frequently.

I have tested that this patch fixes Steve Kehlet's problem, or at
least what I believe to be Steve Kehlet's problem based on the
reproduction scenario I described upthread.  I believe it will also
fix the problems with starting up from a base backup that Alvaro
mentioned upthread.  It won't fix the fact that pg_upgrade is putting
a wrong value into everybody's datminmxid field, which should really
be addressed too, but I've been working on this for about three days
virtually non-stop and I don't have the energy to tackle it right now.
If anyone feels the urge to step into that breach, I think what it
needs to do is: when upgrading from a 9.3-or-later instance, copy over
each database's datminmxid into the corresponding database in the new
cluster.

Aside from that, it's very possible that despite my best efforts this
has serious bugs.  Review and testing would be very much appreciated.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 699497c..8d28a5c 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -197,8 +197,9 @@ typedef struct MultiXactStateData
 	Mu

Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 3:08 PM, Robert Haas  wrote:
> It won't fix the fact that pg_upgrade is putting
> a wrong value into everybody's datminmxid field, which should really
> be addressed too, but I've been working on this for about three days
> virtually non-stop and I don't have the energy to tackle it right now.
> If anyone feels the urge to step into that breech, I think what it
> needs to do is: when upgrading from a 9.3-or-later instance, copy over
> each database's datminmxid into the corresponding database in the new
> cluster.

Bruce was kind enough to spend some time on IM with me this afternoon,
and I think this may actually be OK.  What pg_upgrade does is:

1. First, put next-mxid into the relminmxid for all tables, including
catalog tables.  This is the correct behavior for upgrades from a
pre-9.3 release, and is correct for catalog tables in general.

2. Next, restoring the schema dump will set the relminmxid values for
all non-catalog tables to the value dumped from the old cluster.  At
this point, everything is fine provided that we are coming from a
release 9.3 or newer.  But if the old cluster is pre-9.3, it will have
dumped *zero* for every relminmxid; so all of the user tables go from
the correct value they had after step 1 to an incorrect one.  (A
sketch of what the dump emits follows this list.)

3. Finally, if the old cluster is pre-9.3, repeat step 1, undoing the
damage done in step 2.
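
To make step 2 concrete, a --binary-upgrade dump emits statements roughly
like the following (paraphrased from memory, with an invented OID, so
treat the exact form as approximate); a pre-9.3 source has no relminmxid
to dump, hence the zero:

UPDATE pg_class
SET relfrozenxid = '706', relminmxid = '0'
WHERE oid = '16384'::pg_catalog.oid;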

This is a bit convoluted, but I don't know of a reason why it
shouldn't work.  Sorry for the false alarm.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-29 Thread Robert Haas
On Fri, May 29, 2015 at 9:46 PM, Andres Freund  wrote:
> On 2015-05-29 15:08:11 -0400, Robert Haas wrote:
>> It seems pretty clear that we can't effectively determine anything
>> about member wraparound until the cluster is consistent.
>
> I wonder if this doesn't actually hint at a bigger problem.  Currently,
> to determine where we need to truncate SlruScanDirectory() is
> used. That, afaics, could actually be a problem during recovery when
> we're not consistent.

I agree.  I actually meant to mention this in my previous email, but,
owing to exhaustion and burnout, didn't.

> I think at least for 9.5+ we should a) invent proper truncation records
> for pg_multixact b) start storing oldestValidMultiOffset in pg_control.
> The current hack of scanning the directories to get knowledge we should
> have is a pretty bad hack, and we should not continue using it forever.
> I think we might end up needing to do a) even in the backbranches.

That may be the right thing to do.  I'm concerned that changing the
behavior of master too much will make every subsequent fix twice as
hard, because we'll have to do one fix in master and another fix in
the back-branches.  I'm also concerned that it will create even more
convoluted failure scenarios. The failure-to-start problem discussed
on this thread requires a chain of four (maybe three) different
PostgreSQL versions in order to create it, and the more things we go
change, the harder it's going to be to reason about this stuff.

The diseased and rotting elephant in the room here is that clusters
with bogus relminmxid, datminmxid, and/or oldestMultiXid values may
exist in the wild and we really have no plan to get rid of them.
78db307bb may have helped somewhat - although I haven't grokked what
it's about well enough to be sure - but it's certainly not a complete
solution, as this bug report itself illustrates rather well.  Unless
we figure out some clever solution that is not now apparent to me, or
impose a hard pg_upgrade compatibility break at some point, we
basically can't count on pg_control's "oldest multixact" information
to be correct ever again.  We may be running into clusters 15 years
from now that have problems that are just holdovers from what was
fixed in 9.3.5.

One thing I think we should definitely do is add one or two additional
fields to pg_controldata that get filled in by pg_upgrade.  One of
them should be "the oldest known catversion in the lineage of this
cluster" and the other should be "the most recent catverson in the
lineage of this cluster before this one".   Or maybe we should store
PG_VERSION_NUM values.  Or store both things.  I think that would make
troubleshooting this kind of problem a lot easier - just from the
pg_controldata output, you'd be able to tell whether the cluster had
been pg_upgraded, whether it had been pg_upgraded once or multiple
times, and at least some of the versions involved, without relying on
the user's memory of what they did and when.  Fortunately, Steve
Kehlet had a pretty clear idea of what his history was, but not all
users know that kind of thing, and I've wanted it more than once while
troubleshooting.

Another thing I think we should do is add a field to pg_class that is
propagated by pg_upgrade and stores the most recent PG_VERSION_NUM
that is known to have performed a scan_all vacuum of the table.  This
would allow us to do things in the future like (a) force a full-table
vacuum of any table that hasn't been vacuumed since $BUGGYRELEASE or
(b) advise users to manually inspect the values and manually perform
said vacuum or (c) only believe that certain information about a table
is accurate if it's been full-scanned by a vacuum newer than
$BUGGYRELEASE.  It could also be used as part of a strategy for
reclaiming HEAP_MOVED_IN/HEAP_MOVED_OFF; e.g. you can't upgrade to
10.5, which repurposes those bits, unless you've done a scan_all
vacuum on every table with a release new enough to guarantee that
they're not used for their historical purpose.
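
To make option (b) concrete, here is a purely hypothetical query against
the proposed column (calling it relscanallversion for illustration; no
such field exists today):

-- hypothetical: list tables not full-scanned since the 9.4.2-era fixes
SELECT relname
FROM pg_class
WHERE relkind = 'r' AND relscanallversion < 90402;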

> This problem isn't conflicting with most of the fixes you describe, so
> I'll continue with reviewing those.

Thank you.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-05-31 Thread Robert Haas
On Sat, May 30, 2015 at 8:55 PM, Andres Freund  wrote:
> Is oldestMulti, nextMulti - 1 really suitable for this? Are both
> actually guaranteed to exist in the offsets slru and be valid?  Hm. I
> guess you intend to simply truncate everything else, but just in
> offsets?

oldestMulti in theory is the right thing, I think, but in actuality we
know that some people have 1 here instead of the correct value.

>> One argument against this idea is that we may not want to keep a full
>> set of member files on standbys (due to disk space usage), but that's
>> what will happen unless we truncate during replay.
>
> I think that argument is pretty much the death-knell.

Yes.  Truncating on the standby is really not optional.

>> > I think at least for 9.5+ we should a) invent proper truncation records
>> > for pg_multixact b) start storing oldestValidMultiOffset in pg_control.
>> > The current hack of scanning the directories to get knowledge we should
>> > have is a pretty bad hack, and we should not continue using it forever.
>> > I think we might end up needing to do a) even in the backbranches.
>>
>> Definitely agree with WAL-logging truncations; also +1 on backpatching
>> that to 9.3.  We already have experience with adding extra WAL records
>> on minor releases, and it didn't seem to have bitten too hard.
>
> I'm inclined to agree. My only problem is that I'm not sure whether we
> can find a way of doing all this without adding a pg_control field. Let
> me try to sketch this out:
>
> 1) We continue determining the oldest segment via
>    SlruScanDirectory(SlruScanDirCbFindEarliest) on the master to find
>    the oldest offsets segment to truncate. Alternatively, if we
>    determine it to be safe, we could use oldestMulti to find that.
> 2) SlruScanDirCbRemoveMembers is changed to return the range of members
>    to remove, instead of doing it itself
> 3) We WAL-log [oldest offset segment guaranteed to not be alive,
>    nextmulti) for offsets, and [oldest members segment guaranteed to
>    not be alive, nextmultioff), and redo truncations for the entire
>    range during recovery.
>
> I'm pretty tired right now, but this sounds doable.

I'm probably biased here, but I think we should finish reviewing,
testing, and committing my patch before we embark on designing this.
So far we have no reports of trouble attributable to the lack of the
WAL-logging support discussed here, as opposed to several reports of
trouble from the status quo within days of release.

I'm having trouble reconstructing the series of events where what
you're worried about here really becomes a problem, and I think we
ought to have a very clear idea about that before back-patching
changes of this type.  It's true that if the state of the SLRU
directory is in the future, because recovery was restarted from an
earlier checkpoint, we might replay a checkpoint and remove some of
those files from the future.  But so what?  By the time we've reached
the minimum recovery point, they will have been recreated by the same
WAL records that created them in the first place.  If, in the previous
replay, we had wrapped all the way around, some of the stuff we keep
may actually already have been overwritten by future WAL records, but
they'll be overwritten again.  Now, that could mess up our
determination of which members to remove, I guess, but I'm not clear
that actually matters either: if the offsets space has wrapped around,
the members space will certainly have wrapped around as well, so we
can remove anything we like at this stage and we're still OK.  I agree
this is ugly the way it is, but where is the actual bug?

As far as your actual outline goes, I think if we do this, we need to
be very careful about step #2.  Right now, we decide what we need to
keep and then remove everything else, but that's kind of wonky because
new stuff may be getting created at the same time, so we keep
adjusting our idea of exactly what needs to be removed.  It would be
far better to invert that logic: decide what needs to be removed -
presumably, everything from the oldest member that now exists up until
some later point - and then remove precisely that stuff and nothing
else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-01 Thread Robert Haas
On Mon, Jun 1, 2015 at 12:46 AM, Noah Misch  wrote:
> Incomplete review, done in a relative rush:

Thanks.

> On Fri, May 29, 2015 at 03:08:11PM -0400, Robert Haas wrote:
>> OK, here's a patch.  Actually two patches, differing only in
>> whitespace, for 9.3 and for master (ha!).  I now think that the root
>> of the problem here is that DetermineSafeOldestOffset() and
>> SetMultiXactIdLimit() were largely ignorant of the possibility that
>> they might be called at points in time when the cluster was
>> inconsistent.
>
> A cause perhaps closer to the root is commit f741300 moving truncation from
> VACUUM to checkpoints.  CLOG has given us deep experience with VACUUM-time
> truncation.  Commit f6a6c46d and this patch are about bringing CHECKPOINT-time
> truncation up to the same level.
>
> Speaking of commit f6a6c46d, it seems logical that updating allocation stop
> limits should happen proximate to truncation.  That's currently the case for
> CLOG (vac_truncate_clog() does both) and pg_multixact/members (checkpoint's
> TruncateMultiXact() call does both).  However, pg_multixact/offsets is
> truncated from TruncateMultiXact(), but vac_truncate_clog() updates its limit.
> I did not distill an errant test case, but this is fishy.

Good point.  Because we confine ourselves to using half the offset
space, it seems much harder for us to get into trouble here than it is
with members. The first scenario that occurred to me is that the SLRU
might actually wrap.  That seems tough, though: between one checkpoint
and the next, vacuum would need to advance oldest_datminmxid by 2^31
MXIDs while generating 2^31 new ones, or something like that.  That
doesn't seem real plausible. But then it occurred to me that it's
probably sufficient to advance the head of the SLRU far enough that
TruncateMultiXact thinks that the tail is in the future instead of in
the past.  I see no reason why that couldn't happen.  Then we'd end up
leaving some files behind that we should have removed.  I'm not sure
exactly what problem that would cause; would they just get overwritten
on the next pass through the space, or would they cause errors?  I
have not had time to check.

>> SetMultiXactIdLimit() bracketed certain parts of its
>> logic with if (!InRecovery), but those guards were ineffective because
>> it gets called before InRecovery is set in the first place.
>
> SetTransactionIdLimit() checks InRecovery for the same things, and it is
> called at nearly the same moments as SetMultiXactIdLimit().  Do you have a
> sense of whether it is subject to similar problems as a result?

Well, I think it's pretty weird that those things will get done before
beginning recovery, even on an inconsistent cluster, but not during
recovery.  I don't see that it can actually
do any worse than emit a few log messages at the start of recovery
that won't show up again until the end of recovery, though.

>> 1. Moves the call to DetermineSafeOldestOffset() that appears in
>> StartupMultiXact() to TrimMultiXact(), so that we don't try to do this
>> until we're consistent.  Also, instead of passing
>> MultiXactState->oldestMultiXactId, pass the newer of that value and
>> the earliest offset that exists on disk.  That way, it won't try to
>> read data that's not there.
>
> Perhaps emit a LOG message when we do that, since it's our last opportunity to
> point to the potential data loss?

If the problem is just that somebody's minmxid got set to 1 instead of
the real value, I think that there is no data loss, because none of
those older values are actually present there.  But we could add a LOG
message anyway.  How do you suggest that we phrase that?

>> +  * PostgreSQL 9.3.0 through 9.3.6 and PostgreSQL 9.4.0 through 9.4.1
>> +  * had bugs that could allow users who reached those release through
>
> s/release/releases/

Fixed.

>> @@ -2859,6 +2947,14 @@ TruncateMultiXact(void)
>>   SimpleLruTruncate(MultiXactOffsetCtl,
>> 
>> MultiXactIdToOffsetPage(oldestMXact));
>>
>> + /* Update oldest-on-disk value in shared memory. */
>> + earliest = range.rangeStart * MULTIXACT_OFFSETS_PER_PAGE;
>> + if (earliest < FirstMultiXactId)
>> + earliest = FirstMultiXactId;
>> + LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
>> + Assert(MultiXactState->oldestMultiXactOnDiskValid);
>> + MultiXactState->oldestMultiXactOnDiskValid = earliest;
>
> That last line needs s/Valid//, I presume.  Is it okay that
> oldestMultiXactOnDisk becomes too-old during TruncateMultiXact(), despite its
> Valid indicator remaining true?

Ay yai 

Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-01 Thread Robert Haas
On Mon, Jun 1, 2015 at 4:58 AM, Andres Freund  wrote:
>> I'm probably biased here, but I think we should finish reviewing,
>> testing, and committing my patch before we embark on designing this.
>
> Probably, yes. I am wondering whether doing this immediately won't end
> up making some things simpler and more robust though.

I'm open to being convinced of that, but as of this moment I'm not
seeing any clear-cut evidence that we need to go so far.

>> So far we have no reports of trouble attributable to the lack of the
>> WAL-logging support discussed here, as opposed to several reports of
>> trouble from the status quo within days of release.
>
> The lack of WAL logging actually has caused problems in the 9.3.3 (?)
> era, where we didn't do any truncation during recovery...

Right, but now we're piggybacking on the checkpoint records, and I
don't have any evidence that this approach can't be made robust.  It's
possible that it can't be made robust, but that's not currently clear.

>> By the time we've reached the minimum recovery point, they will have
>> been recreated by the same WAL records that created them in the first
>> place.
>
> I'm not sure that's true. I think we could end up erroneously removing
> files that were included in the base backup. Anyway, let's focus on your
> patch for now.

OK, but I am interested in discussing the other thing too.  I just
can't piece together the scenario myself - there may well be one.  The
base backup will begin replay from the checkpoint caused by
pg_start_backup() and remove anything that wasn't there at the start
of the backup.  But all of that stuff should get recreated by the time
we reach the minimum recovery point (end of backup).

>> If, in the previous
>> replay, we had wrapped all the way around, some of the stuff we keep
>> may actually already have been overwritten by future WAL records, but
>> they'll be overwritten again.  Now, that could mess up our
>> determination of which members to remove, I guess, but I'm not clear
>> that actually matters either: if the offsets space has wrapped around,
>> the members space will certainly have wrapped around as well, so we
>> can remove anything we like at this stage and we're still OK.  I agree
>> this is ugly the way it is, but where is the actual bug?
>
> I'm more worried about the cases where we didn't ever actually "badly
> wrap around" (i.e. overwrite needed data); but where that's not clear on
> the standby because the base backup isn't in a consistent state.

I agree. The current patch tries to make it so that we never call
find_multixact_start() while in recovery, but it doesn't quite
succeed: the call in TruncateMultiXact still happens during recovery,
but only once we're sure that the mxact we plan to call it on actually
exists on disk.  That won't be called until we replay the first
checkpoint, but that might still be prior to consistency.

Since I forgot to attach the revised patch with fixes for the points
Noah mentioned to that email, here it is attached to this one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
commit d33b4eb0167f465edb00bd6c0e1bcaa67dd69fe9
Author: Robert Haas 
Date:   Fri May 29 14:35:53 2015 -0400

foo

diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9568ff1..aca829d 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -199,8 +199,9 @@ typedef struct MultiXactStateData
 	MultiXactOffset nextOffset;
 
 	/*
-	 * Oldest multixact that is still on disk.  Anything older than this
-	 * should not be consulted.  These values are updated by vacuum.
+	 * Oldest multixact that may still be referenced from a relation.
+	 * Anything older than this should not be consulted.  These values are
+	 * updated by vacuum.
 	 */
 	MultiXactId oldestMultiXactId;
 	Oid			oldestMultiXactDB;
@@ -213,6 +214,18 @@ typedef struct MultiXactStateData
 	 */
 	MultiXactId lastCheckpointedOldest;
 
+	/*
+	 * This is the oldest file that actually exists on disk.  This value
+	 * is initialized by scanning pg_multixact/offsets, and subsequently
+	 * updated each time we complete a truncation.  We need a flag to
+	 * indicate whether this has been initialized yet.
+	 */
+	MultiXactId oldestMultiXactOnDisk;
+	bool		oldestMultiXactOnDiskValid;
+
+	/* Has TrimMultiXact been called yet? */
+	bool		didTrimMultiXact;
+
 	/* support for anti-wraparound measures */
 	MultiXactId multiVacLimit;
 	MultiXactId multiWarnLimit;
@@ -344,6 +357,8 @@ static char *mxstatus_to_string(MultiXactStatus status);
 /* management of SLRU infrastructure */
 static int	ZeroMultiXactOffsetPage(int page

Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-02 Thread Robert Haas
On Tue, Jun 2, 2015 at 1:21 AM, Noah Misch  wrote:
> On Mon, Jun 01, 2015 at 02:06:05PM -0400, Robert Haas wrote:
>> On Mon, Jun 1, 2015 at 12:46 AM, Noah Misch  wrote:
>> > On Fri, May 29, 2015 at 03:08:11PM -0400, Robert Haas wrote:
>> >> SetMultiXactIdLimit() bracketed certain parts of its
>> >> logic with if (!InRecovery), but those guards were ineffective because
>> >> it gets called before InRecovery is set in the first place.
>> >
>> > SetTransactionIdLimit() checks InRecovery for the same things, and it is
>> > called at nearly the same moments as SetMultiXactIdLimit().  Do you have a
>> > sense of whether it is subject to similar problems as a result?
>>
>> Well, I think it's pretty weird that those things will get done before
>> beginning recovery, even on an inconsistent cluster, but not during
>> recovery.  I don't see that it can actually
>> do any worse than emit a few log messages at the start of recovery
>> that won't show up again until the end of recovery, though.
>
> Granted.  Would it be better to update both functions at the same time, and
> perhaps to make that a master-only change?  Does the status quo cause more
> practical harm via SetMultiXactIdLimit() than via SetTransactionIdLimit()?

It does in the case of the call to find_multixact_start().  If that
fails, we take the server down for no good reason, as demonstrated by
the original report. I'll revert the changes to the other two things
in this function that I changed to be protected by did_trim.  There's
no special reason to think that's a necessary change.

>> >> 1. Moves the call to DetermineSafeOldestOffset() that appears in
>> >> StartupMultiXact() to TrimMultiXact(), so that we don't try to do this
>> >> until we're consistent.  Also, instead of passing
>> >> MultiXactState->oldestMultiXactId, pass the newer of that value and
>> >> the earliest offset that exists on disk.  That way, it won't try to
>> >> read data that's not there.
>> >
>> > Perhaps emit a LOG message when we do that, since it's our last
>> > opportunity to point to the potential data loss?
>>
>> If the problem is just that somebody's minmxid got set to 1 instead of
>> the real value, I think that there is no data loss, because none of
>> those older values are actually present there.  But we could add a LOG
>> message anyway.  How do you suggest that we phrase that?
>
> There's no data loss if 1 <= true_minmxid <= 2^31 at the time minmxid got set
> to 1.  Otherwise, data loss is possible.

Yes, but in that scenario, the log message you propose wouldn't be
triggered.  If true_minmxid > 2^31, then the stored minmxid will not
precede the files on disk; it will follow them (assuming the older stuff
hasn't been truncated, as is likely).  So the message would be
essentially:

LOG: you didn't lose data.  but if exactly the opposite of what this
message is telling you about had happened, then you would have.
DETAIL: Have a nice day.

> I don't hope for an actionable
> message, but we might want a reporter to grep logs for it when we diagnose
> future reports.  Perhaps this:
>
>   "missing pg_multixact/members files; begins at MultiXactId %u, expected %u"

This seems misleading.  In the known failure case, it's not that the
pg_multixact files are unexpectedly missing; it's that we incorrectly
think that they should still be there.  Maybe:

oldest MultiXactId on disk %u follows expected oldest MultiXact %u

> For good measure, perhaps emit this when lastCheckpointedOldest > earliest by
> more than one segment:
>
>   "excess pg_multixact/members files; begins at MultiXactId %u, expected %u"

So, this scenario will happen whenever the system was interrupted in
the middle of a truncation, or when the system is started from a base
backup and a truncation happened after files were copied.  I'm wary of
giving users the idea that this is an atypical event.  Perhaps a
message at DEBUG1?

>> I'm not sure what you mean about it becoming too old.  At least with
>> that fix, it should get updated to exactly the first file that we
>> didn't remove.  Isn't that right?
>
> Consider a function raw_GOMXOD() that differs from GetOldestMultiXactOnDisk()
> only in that it never reads or writes the cache.  I might expect
> oldestMultiXactOnDisk==raw_GOMXOD() if oldestMultiXactOnDiskValid, and that
> does hold most of the time.  It does not always hold between the start of the
> quoted code's SimpleLruTruncate() and its oldestMultiXactOnDisk assignment.
> That&

Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-02 Thread Robert Haas
On Tue, Jun 2, 2015 at 8:56 AM, Andres Freund  wrote:
> But what *definitely* looks wrong to me is that a TruncateMultiXact() in
> this scenario now (since a couple weeks ago) does a
> SimpleLruReadPage_ReadOnly() in the members slru via
> find_multixact_start(). That just won't work acceptably when we're not
> yet consistent. There very well could not be a valid members segment at
> that point?  Am I missing something?

Yes: that code isn't new.

TruncateMultiXact() called SimpleLruReadPage_ReadOnly() directly in
9.3.0 and every subsequent release until 9.3.7/9.4.2.  The only thing
that's changed is that we've moved that logic into a function called
find_multixact_start() instead of having it directly inside that
function.  We did that because we needed to use the same logic in some
other places.  The reason why 9.3.7/9.4.2 are causing problems for
people that they didn't have previously is because those new,
additional call sites were poorly chosen and didn't include adequate
protection against calling that function with an invalid input value.
What this patch is about is getting back to the situation that we were
in from 9.3.0 - 9.3.6 and 9.4.0 - 9.4.1, where TruncateMultiXact() did
the thing that you're complaining about here but no one else did.

From my point of view, I think that you are absolutely right to
question what's going on in TruncateMultiXact().  It's kooky, and
there may well be bugs buried there.  But I don't think fixing that
should be the priority right now, because we have zero reports of
problems attributable to that logic.  I think the priority should be
on undoing the damage that we did in 9.3.7/9.4.2, when we made other
places do the same thing.  We started getting trouble reports
attributable to those changes *almost immediately*, which means that
whether or not TruncateMultiXact() is broken, these new call sites
definitely are.  I think we really need to fix those new places ASAP.

> I think at the very least we'll have to skip this step while not yet
> consistent. That really sucks, because we'll possibly end up with
> multixacts that are completely filled by the time we've reached
> consistency.

That would be a departure from the behavior of every existing release
that includes this code, based on, to my knowledge, zero trouble
reports.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-02 Thread Robert Haas
On Tue, Jun 2, 2015 at 11:27 AM, Andres Freund  wrote:
> On 2015-06-02 11:16:22 -0400, Robert Haas wrote:
>> I'm having trouble figuring out what to do about this.  I mean, the
>> essential principle of this patch is that if we can't count on
>> relminmxid, datminmxid, or the control file to be accurate, we can at
>> least look at what is present on the disk.  If we also cannot count on
>> that to be accurate, we are left without any reliable source of
>> information.  Consider a hypothetical cluster where all our stored
>> minmxids of whatever form are corrupted (say, all change to 1) and in
>> addition there are stray files in pg_multixact.  I don't think there's
>> really any way to get ourselves out of trouble in that scenario.
>
> If we were to truncate after vacuum, and only on the primary (via WAL
> logging), we could, afaics, just rely on all the values to be
> recomputed. I mean relminmxid will be recomputed after a vacuum, and
> thus, after some time, will datminmxid and the control file value.  We
> could just force a value of 1 to always trigger anti-wraparound vacuums
> (or wait for that to happen implicitly, to delay the impact?). That
> should then fix the problem in a relatively short amount of time?

The exact circumstances under which we're willing to replace a
relminmxid with a newly-computed one that differs are not altogether
clear to me, but there's an "if" statement protecting that logic, so
there are some circumstances in which we'll leave the existing value
intact.  If we force non-stop vacuuming in that scenario, autovacuum
will just run like crazy without accomplishing anything, which
wouldn't be good.  It would similarly do so when the oldest MXID
reference in the relation is in fact 1, but that value can't be
vacuumed away yet.

Also, the database might be really big.  Even if it were true that a
full scan of every table would get us out of this state, describing
the time that it would take to do that as "relatively short" seems to
me to be considerably understating the impact of a full-cluster
VACUUM.

With regard to the more general question of WAL-logging this, are you
going to work on that?  Are you hoping Alvaro or I will work on that?
Should we draw straws?  It seems like somebody needs to do it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-02 Thread Robert Haas
On Tue, Jun 2, 2015 at 11:36 AM, Andres Freund  wrote:
>> That would be a departure from the behavior of every existing release
>> that includes this code based on, to my knowledge, zero trouble
>> reports.
>
> On the other hand we're now at about bug #5 attributable to the odd way
> truncation works for multixacts. It's obviously complex and hard to get
> right. It makes it harder to cope with the wrong values left in
> datminmxid etc. So I'm still wondering whether fixing this for good isn't
> the better approach.

It may well be.  But I think we should do something more surgical
first.  Perhaps we can justify the pain and risk of making changes to
the WAL format in the back-branches, but let's not do it in a rush.
If we can get this patch to a state where it undoes the damage
inflicted in 9.3.7/9.4.2, then we will be in a state where we have as
much reliability as we had in 9.3.6 plus the protections against
member-space wraparound added in 9.3.7 - which, like the patch I'm
proposing now, were directly motivated by multiple, independent bug
reports.  That seems like a good place to get to.  If nothing else, it
will buy us some time to figure out what else we want to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-02 Thread Robert Haas
On Tue, Jun 2, 2015 at 11:44 AM, Andres Freund  wrote:
> On 2015-06-02 11:37:02 -0400, Robert Haas wrote:
>> The exact circumstances under which we're willing to replace a
>> relminmxid with a newly-computed one that differs are not altogether
>> clear to me, but there's an "if" statement protecting that logic, so
>> there are some circumstances in which we'll leave the existing value
>> intact.
>
> I guess we'd have to change that then.

Yeah, but first we'd need to assess why it's like that.  Tom was the
one who installed the current logic, but I haven't been able to fully
understand it.

>> It would similarly do so when the oldest MXID reference in the
>> relation is in fact 1, but that value can't be vacuumed away yet.
>
> I'd thought of just forcing consumption of one multixact in that
> case. Not pretty, but imo acceptable.

What if multixact 1 still has living members?

>> Also, the database might be really big.  Even if it were true that a
>> full scan of every table would get us out of this state, describing
>> the time that it would take to do that as "relatively short" seems to
>> me to be considerably understating the impact of a full-cluster
>> VACUUM.
>
> Well. We're dealing with a corrupted cluster. Having a way out that's
> done automatically, even if it takes a long while, is pretty good
> already. In many cases the corruption (i.e. pg_upgrade) happened long
> ago, so the table's relminmxid will already have been recomputed.  I
> think that's acceptable.

I'm a long way from being convinced the automated recovery is
possible.  There are so many different scenarios here that it's very
difficult to reason generally about what the "right" thing to do is.
I agree it would be nice if we had it, though.

>> With regard to the more general question of WAL-logging this, are you
>> going to work on that?  Are you hoping Alvaro or I will work on that?
>> Should we draw straws?  It seems like somebody needs to do it.
>
> I'm willing to invest the time to develop an initial version, but I'll
> need help evaluating it. I don't have many testing resources available
> atm, and I'm not going to trust stuff I developed while travelling by
> just looking at the code.

I'm willing to help with that.  Hopefully I'm not the only one, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-02 Thread Robert Haas
On Tue, Jun 2, 2015 at 4:19 PM, Andres Freund  wrote:
> I'm not really convinced tying things closer to having done trimming is
> easier to understand than tying things to recovery having finished.
>
> E.g.
> if (did_trim)
> oldestOffset = GetOldestReferencedOffset(oldest_datminmxid);
> imo is harder to understand than if (!InRecovery).
>
> Maybe we could just name it finishedStartup and rename the functions 
> accordingly?

Basing that particular call site on InRecovery doesn't work, because
InRecovery isn't set early enough.  But I'm fine to rename it to
whatever.

> Maybe it's worthwhile to add a 'NB: At this stage the data directory is
> not yet necessarily consistent' to StartupMultiXact's comments, to avoid
> reintroducing problems like this?

Sure.

>>   /*
>> +  * We can read this without a lock, because it only changes when 
>> nothing
>> +  * else is running.
>> +  */
>> + did_trim = MultiXactState->didTrimMultiXact;
>
> Err, Hot Standby? It might be ok to not lock, but the comment is
> definitely wrong.  I'm inclined to simply use locking; this doesn't look
> sufficiently critical performance-wise.

/me nods.  Good point.

> Hm. If GetOldestMultiXactOnDisk() gets the starting point by scanning
> the disk it'll always get one at a segment boundary, right? I'm not sure
> that's actually ok; because the value at the beginning of the segment
> can very well end up being a 0, as MaybeExtendOffsetSlru() will have
> filled the page with zeros.
>
> I think that should be harmless, the worst that can happen is that
> oldestOffset erroneously is 0, which should be correct, even though
> possibly overly conservative, in these cases.

Uh oh.  That seems like a real bad problem for this approach.  What
keeps that from being the opposite of too conservative?  There's no
"safe" value in a circular numbering space.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-03 Thread Robert Haas
On Wed, Jun 3, 2015 at 4:48 AM, Thomas Munro
 wrote:
> On Wed, Jun 3, 2015 at 3:42 PM, Alvaro Herrera  
> wrote:
>> Thomas Munro wrote:
>>> On Tue, Jun 2, 2015 at 9:30 AM, Alvaro Herrera  
>>> wrote:
>>> > My guess is that the file existed, and perhaps had one or more pages,
>>> > but the wanted page doesn't exist, so we tried to read but got 0 bytes
>>> > back.  read() returns 0 in this case but doesn't set errno.
>>> >
>>> > I didn't find a way to set things so that the file exists but is of
>>> > shorter contents than oldestMulti by the time the checkpoint record is
>>> > replayed.
>>>
>>> I'm just starting to learn about the recovery machinery, so forgive me
>>> if I'm missing something basic here, but I just don't get this.  As I
>>> understand it, offsets/0046 should either have been copied with that
>>> page present in it if it existed before the backup started (apparently
>>> not in this case), or extended to contain it by WAL records that come
>>> after the backup label but before the checkpoint record that
>>> references it (also apparently not in this case).
>>
>> Exactly --- that's the spot at which I am, also.  I have had this
>> spinning in my head for three days now, and tried every single variation
>> that I could think of, but like you I was unable to reproduce the issue.
>> However, our customer took a second base backup and it failed in exactly
>> the same way, modulo some changes to the counters (the file that
>> didn't exist was 004B rather than 0046).  I'm still at a loss at what
>> the failure mode is.  We must be missing some crucial detail ...
>
> I have finally reproduced that error!  See attached repro shell script.
>
> The conditions are:
>
> 1.  next multixact == oldest multixact (no active multixacts, pointing
> past the end)
> 2.  next multixact would be the first item on a new page
>     (multixact % 2048 == 0)
> 3.  the page must not be the first in a segment (or we'd get the
> read-zeroes case)
>
> That gives you odds of 1/2048 * 31/32 * (probability of a wraparound
> vacuum followed by no multixact creations right before your backup
> checkpoint).  That seems like reasonably low odds... if it happened
> twice in a row, maybe I'm missing something here and there is some
> other way to get this...
>
> I realise now that this is actually a symptom of a problem spotted by
> Noah recently:
>
> http://www.postgresql.org/message-id/20150601045534.gb23...@tornado.leadboat.com
>
> He noticed the problem for segment boundaries, when not in recovery.
> In recovery, segment boundaries don't raise an error (the read-zeroes
> case applies), but page boundaries do.  The fix is probably to do
> nothing if they are the same, as we do elsewhere, like in the attached
> patch.

Actually, we still need to call DetermineSafeOldestOffset in that
case.  Otherwise, if someone goes from lots of multixacts to none, the
stop point won't advance.
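
As an aside, the per-checkpoint odds quoted above multiply out like this
(just the arithmetic, leaving the wraparound-vacuum factor aside):

SELECT round((1.0 / 2048) * (31.0 / 32), 6) AS odds;
-- 0.000473, or roughly 1 in 2100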

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-03 Thread Robert Haas
On Tue, Jun 2, 2015 at 5:22 PM, Andres Freund  wrote:
>> > Hm. If GetOldestMultiXactOnDisk() gets the starting point by scanning
>> > the disk it'll always get one at a segment boundary, right? I'm not sure
>> > that's actually ok; because the value at the beginning of the segment
>> > can very well end up being a 0, as MaybeExtendOffsetSlru() will have
>> > filled the page with zeros.
>> >
>> > I think that should be harmless, the worst that can happen is that
>> oldestOffset erroneously is 0, which should be correct, even though
>> > possibly overly conservative, in these cases.
>>
>> Uh oh.  That seems like a real bad problem for this approach.  What
>> keeps that from being the opposite of too conservative?  There's no
>> "safe" value in a circular numbering space.
>
> I think it *might* (I'm really jetlagged) be fine because that'll only
> happen after an upgrade from < 9.3. And in that case we initialize
> nextOffset to 0. That ought to save us?

That's pretty much betting the farm on the bugs we know about today
being the only ones there are.  That seems imprudent.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-03 Thread Robert Haas
On Wed, Jun 3, 2015 at 8:24 AM, Robert Haas  wrote:
> On Tue, Jun 2, 2015 at 5:22 PM, Andres Freund  wrote:
>>> > Hm. If GetOldestMultiXactOnDisk() gets the starting point by scanning
>>> > the disk it'll always get one at a segment boundary, right? I'm not sure
>>> > that's actually ok; because the value at the beginning of the segment
>>> > can very well end up being a 0, as MaybeExtendOffsetSlru() will have
>>> > filled the page with zeros.
>>> >
>>> > I think that should be harmless, the worst that can happen is that
>>> > oldestOffset erroneously is 0, which should be correct, even though
>>> > possibly overly conservative, in these cases.
>>>
>>> Uh oh.  That seems like a real bad problem for this approach.  What
>>> keeps that from being the opposite of too conservative?  There's no
>>> "safe" value in a circular numbering space.
>>
>> I think it *might* (I'm really jetlagged) be fine because that'll only
>> happen after an upgrade from < 9.3. And in that case we initialize
>> nextOffset to 0. That ought to save us?
>
> That's pretty much betting the farm on the bugs we know about today
> being the only ones there are.  That seems imprudent.

So here's a patch taking a different approach.  In this approach, if
the multixact whose members we want to look up doesn't exist, we don't
use a later one (that might or might not be valid).  Instead, we
attempt to cope with the unknown.  That means:

1. In TruncateMultiXact(), we don't truncate.

2. If setting the offset stop limit (the point where we refuse to
create new multixact space), we don't arm the stop point.  This means
that if you're in this situation, you run without member wraparound
protection until it's corrected.  A message gets logged once per
checkpoint telling you that you have this problem, and another message
gets logged when things get straightened out and the guards are
enabled.

3. If setting the vacuum force point, we assume that it's appropriate
to immediately force vacuum.

I've only tested this very lightly - this is just to see what you and
Noah and others think of the approach.  As compared with the previous
approach, it has the advantage of making minimal assumptions about the
sanity of what's on disk.  It has the disadvantage that, for some
people, the member-wraparound guard won't be enabled at startup -- but
note that those people can't start 9.3.7/9.4.2 *at all*, so currently
they are either running without member wraparound protection anyway
(if they haven't upgraded to those releases) or they're down entirely.
Another disadvantage is that we'll be triggering what may be quite a
bit of autovacuum activity for some people, which could be painful.
On the plus side, they'll hopefully end up with sane relminmxid and
datminmxid guards afterwards.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9568ff1..4400fc5 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -198,13 +198,24 @@ typedef struct MultiXactStateData
 	/* next-to-be-assigned offset */
 	MultiXactOffset nextOffset;
 
+	/* Have we completed multixact startup? */
+	bool		finishedStartup;
+
 	/*
-	 * Oldest multixact that is still on disk.  Anything older than this
-	 * should not be consulted.  These values are updated by vacuum.
+	 * Oldest multixact that is still potentially referenced by a relation.
+	 * Anything older than this should not be consulted.  These values are
+	 * updated by vacuum.
 	 */
 	MultiXactId oldestMultiXactId;
 	Oid			oldestMultiXactDB;
+
+	/*
+	 * Oldest multixact offset that is potentially referenced by a
+	 * multixact referenced by a relation.  We don't always know this value,
+	 * so there's a flag here to indicate whether or not we currently do.
+	 */
 	MultiXactOffset oldestOffset;
+	bool		oldestOffsetKnown;
 
 	/*
 	 * This is what the previous checkpoint stored as the truncate position.
@@ -221,6 +232,7 @@ typedef struct MultiXactStateData
 
 	/* support for members anti-wraparound measures */
 	MultiXactOffset offsetStopLimit;
+	bool offsetStopLimitKnown;
 
 	/*
 	 * Per-backend data starts here.  We have two arrays stored in the area
@@ -350,10 +362,11 @@ static bool MultiXactOffsetPrecedes(MultiXactOffset offset1,
 		MultiXactOffset offset2);
 static void ExtendMultiXactOffset(MultiXactId multi);
 static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
-static void DetermineSafeOldestOffset(MultiXactId oldestMXact);
+static void DetermineSafeOldestOffset(MultiXactOffset oldestMXact);
 stat

Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-04 Thread Robert Haas
On Thu, Jun 4, 2015 at 2:42 AM, Noah Misch  wrote:
> I like that change a lot.  It's much easier to seek forgiveness for wasting <=
> 28 GiB of disk than for deleting visibility information wrongly.

I'm glad you like it.  I concur.

>> 2. If setting the offset stop limit (the point where we refuse to
>> create new multixact space), we don't arm the stop point.  This means
>> that if you're in this situation, you run without member wraparound
>> protection until it's corrected.  A message gets logged once per
>> checkpoint telling you that you have this problem, and another message
>> gets logged when things get straightened out and the guards are
>> enabled.
>>
>> 3. If setting the vacuum force point, we assume that it's appropriate
>> to immediately force vacuum.
>
> Those seem reasonable, too.

Cool.

>> I've only tested this very lightly - this is just to see what you and
>> Noah and others think of the approach.  As compared with the previous
>> approach, it has the advantage of making minimal assumptions about the
>> sanity of what's on disk.  It has the disadvantage that, for some
>> people, the member-wraparound guard won't be enabled at startup -- but
>> note that those people can't start 9.3.7/9.4.2 *at all*, so currently
>> they are either running without member wraparound protection anyway
>> (if they haven't upgraded to those releases) or they're down entirely.
>
> That disadvantage is negligible, considering.

All right.

>> Another disadvantage is that we'll be triggering what may be quite a
>> bit of autovacuum activity for some people, which could be painful.
>> On the plus side, they'll hopefully end up with sane relminmxid and
>> datminmxid guards afterwards.
>
> That sounds good so long as each table requires just one successful emergency
> autovacuum.  I'm not seeing code to ensure that the launched autovacuum will
> indeed perform a full-table scan and update relminmxid; is it there?

No.  Oops.

> For sites that can't tolerate an autovacuum storm, what alternative can we
> provide?  Is "SET vacuum_multixact_freeze_table_age = 0; VACUUM " of
> every table, done before applying the minor update, sufficient?

I don't know.  In practical terms, they probably need to ensure that
if pg_multixact/offsets/ does not exist, no relations have
relminmxid = 1 and no remaining databases have datminmxid = 1.
Exactly what it will take to get there is possibly dependent on which
minor release you are running; on current minor releases, I am hopeful
that what you propose is sufficient.
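
For anyone who wants to check their own cluster, something along these
lines should show whether the bogus value is still present (a rough
sketch, untested; run the second query in each database):

  -- databases still carrying the bogus datminmxid of 1
  SELECT datname, datminmxid FROM pg_database WHERE datminmxid = '1';

  -- relations still carrying relminmxid = 1
  SELECT relname, relminmxid FROM pg_class WHERE relminmxid = '1';

  -- Noah's per-table alternative to an autovacuum storm ("mytable" is a
  -- stand-in)
  SET vacuum_multixact_freeze_table_age = 0;
  VACUUM mytable;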

>>  static void
>> -DetermineSafeOldestOffset(MultiXactId oldestMXact)
>> +DetermineSafeOldestOffset(MultiXactOffset oldestMXact)
>
> Leftover change from an earlier iteration?  The values passed continue to be
> MultiXactId values.

Oopsie.

>>   /* move back to start of the corresponding segment */
>> - oldestOffset -= oldestOffset %
>> - (MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT);
>> + offsetStopLimit = oldestOffset - (oldestOffset %
>> + (MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT));
>> + /* always leave one segment before the wraparound point */
>> + offsetStopLimit -= (MULTIXACT_MEMBERS_PER_PAGE * 
>> SLRU_PAGES_PER_SEGMENT);
>> +
>> + /* if nothing has changed, we're done */
>> + if (prevOffsetStopLimitKnown && offsetStopLimit == prevOffsetStopLimit)
>> + return;
>>
>>   LWLockAcquire(MultiXactGenLock, LW_EXCLUSIVE);
>> - /* always leave one segment before the wraparound point */
>> - MultiXactState->offsetStopLimit = oldestOffset -
>> - (MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT);
>> + MultiXactState->offsetStopLimit = oldestOffset;
>
> That last line needs s/oldestOffset/offsetStopLimit/, I presume.

Another oops.

Thanks for the review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-04 Thread Robert Haas
On Thu, Jun 4, 2015 at 9:42 AM, Robert Haas  wrote:
> Thanks for the review.

Here's a new version.  I've fixed the things Alvaro and Noah noted,
and some compiler warnings about set but unused variables.

I also tested it, and it doesn't quite work as hoped.  If started on a
cluster where oldestMultiXid is incorrectly set to 1, it starts up and
indicates that the member wraparound guards are disabled.  But even
after everything is fixed, they don't get enabled until after the next
full restart.  I think that's because TruncateMultiXact() bails out
too early, without calling DetermineSafeOldestOffset.

My attempt at a quick fix for that problem didn't work out, so I'm
posting this version for now to facilitate further review and testing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
commit eb39cf10e4ff853ed4b9d3fab599cf42911e6f70
Author: Robert Haas 
Date:   Thu Jun 4 11:58:49 2015 -0400

bar

diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 699497c..209d3e6 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -196,13 +196,24 @@ typedef struct MultiXactStateData
 	/* next-to-be-assigned offset */
 	MultiXactOffset nextOffset;
 
+	/* Have we completed multixact startup? */
+	bool		finishedStartup;
+
 	/*
-	 * Oldest multixact that is still on disk.  Anything older than this
-	 * should not be consulted.  These values are updated by vacuum.
+	 * Oldest multixact that is still potentially referenced by a relation.
+	 * Anything older than this should not be consulted.  These values are
+	 * updated by vacuum.
 	 */
 	MultiXactId oldestMultiXactId;
 	Oid			oldestMultiXactDB;
+
+	/*
+	 * Oldest multixact offset that is potentially referenced by a
+	 * multixact referenced by a relation.  We don't always know this value,
+	 * so there's a flag here to indicate whether or not we currently do.
+	 */
 	MultiXactOffset oldestOffset;
+	bool		oldestOffsetKnown;
 
 	/*
 	 * This is what the previous checkpoint stored as the truncate position.
@@ -219,6 +230,7 @@ typedef struct MultiXactStateData
 
 	/* support for members anti-wraparound measures */
 	MultiXactOffset offsetStopLimit;
+	bool offsetStopLimitKnown;
 
 	/*
 	 * Per-backend data starts here.  We have two arrays stored in the area
@@ -348,10 +360,11 @@ static bool MultiXactOffsetPrecedes(MultiXactOffset offset1,
 		MultiXactOffset offset2);
 static void ExtendMultiXactOffset(MultiXactId multi);
 static void ExtendMultiXactMember(MultiXactOffset offset, int nmembers);
-static void DetermineSafeOldestOffset(MultiXactId oldestMXact);
+static void DetermineSafeOldestOffset(MultiXactOffset oldestMXact);
 static bool MultiXactOffsetWouldWrap(MultiXactOffset boundary,
 		 MultiXactOffset start, uint32 distance);
-static MultiXactOffset find_multixact_start(MultiXactId multi);
+static bool SetOffsetVacuumLimit(bool finish_setup);
+static bool find_multixact_start(MultiXactId multi, MultiXactOffset *result);
 static void WriteMZeroPageXlogRec(int pageno, uint8 info);
 
 
@@ -960,7 +973,8 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
 	 * against catastrophic data loss due to multixact wraparound.  The basic
 	 * rules are:
 	 *
-	 * If we're past multiVacLimit or the safe threshold for member storage space,
+	 * If we're past multiVacLimit or the safe threshold for member storage
+	 * space, or we don't know what the safe threshold for member storage is,
 	 * start trying to force autovacuum cycles.
 	 * If we're past multiWarnLimit, start issuing warnings.
 	 * If we're past multiStopLimit, refuse to create new MultiXactIds.
@@ -969,6 +983,7 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
 	 *--
 	 */
 	if (!MultiXactIdPrecedes(result, MultiXactState->multiVacLimit) ||
+		!MultiXactState->oldestOffsetKnown ||
 		(MultiXactState->nextOffset - MultiXactState->oldestOffset
 			> MULTIXACT_MEMBER_SAFE_THRESHOLD))
 	{
@@ -1083,7 +1098,8 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
 	 *--
 	 */
 #define OFFSET_WARN_SEGMENTS	20
-	if (MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit, nextOffset,
+	if (MultiXactState->offsetStopLimitKnown &&
+		MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit, nextOffset,
  nmembers))
 		ereport(ERROR,
 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
@@ -1095,7 +1111,8 @@ GetNewMultiXactId(int nmembers, MultiXactOffset *offset)
 		   MultiXactState->offsetStopLimit - nextOffset - 1),
  errhint("Execute a database-wide VACUUM in database with OID %u with reduced vacuum_multixact_freeze_min_age and vacuum_multixact_freeze_table_age settings.",
 		 MultiXactState->oldestMultiXactDB)));
-	else if (MultiXactOffsetWouldWrap(MultiXactState->offsetStopLimit,
+	else if (M

Re: [HACKERS] Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-04 Thread Robert Haas
On Thu, Jun 4, 2015 at 1:27 PM, Andres Freund  wrote:
> On 2015-06-04 12:57:42 -0400, Robert Haas wrote:
>> + /*
>> +  * Do we need an emergency autovacuum?  If we're not sure, assume yes.
>> +  */
>> + return !oldestOffsetKnown ||
>> + (nextOffset - oldestOffset > MULTIXACT_MEMBER_SAFE_THRESHOLD);
>
> I think without teaching autovac about those rules, this might just lead
> to lots of autovac processes starting without knowing they should do
> something? They know about autovacuum_multixact_freeze_age, but they
> know neither about !oldestOffsetKnown nor, afaics, about pending offset
> wraparounds?

You're right, but that's why the latest patch has changes in
MultiXactMemberFreezeThreshold.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-04 Thread Robert Haas
On Thu, Jun 4, 2015 at 12:57 PM, Robert Haas  wrote:
> On Thu, Jun 4, 2015 at 9:42 AM, Robert Haas  wrote:
>> Thanks for the review.
>
> Here's a new version.  I've fixed the things Alvaro and Noah noted,
> and some compiler warnings about set but unused variables.
>
> I also tested it, and it doesn't quite work as hoped.  If started on a
> cluster where oldestMultiXid is incorrectly set to 1, it starts up and
> indicates that the member wraparound guards are disabled.  But even
> after everything is fixed, they don't get enabled until after the next
> full restart.  I think that's because TruncateMultiXact() bails out
> too early, without calling DetermineSafeOldestOffset.
>
> My attempt at a quick fix for that problem didn't work out, so I'm
> posting this version for now to facilitate further review and testing.

Here's a new version with some more fixes and improvements:

- SetOffsetVacuumLimit was failing to set MultiXactState->oldestOffset
when the oldest offset became known if the now-known value happened to
be zero.  Fixed.

- SetOffsetVacuumLimit now logs useful information at the DEBUG1
level, so that you can see that it's doing what it's supposed to.

- TruncateMultiXact now calls DetermineSafeOldestOffset to adjust the
offsetStopLimit even if it can't truncate anything.  This seems
useless, but it's not, because it may be that the last checkpoint
advanced lastCheckpointedOldest from a bogus value (i.e. 1) to a real
value, and now we can actually set offsetStopLimit properly.

- TruncateMultiXact no longer calls find_multixact_start when there
are no remaining multixacts.  This is actually a completely separate
bug that goes all the way back to 9.3.0 and can potentially cause
TruncateMultiXact to remove every file in pg_multixact/offsets.
Restarting the cluster becomes impossible because TrimMultiXact barfs.

- TruncateMultiXact now logs a message if the oldest multixact does
not precede the earliest one on disk and is not equal to the next
multixact and yet does not exist.  The value of the log message is
that it discovered the bug mentioned in the previous line, so I think
it's earning its keep.

With this version, I'm able to see that when you start up a
9.3.latest+this patch with a cluster that has a bogus value of 1 in
relminmxid, datminmxid, and the control file, autovacuum vacuums
everything in sight, all the values get set back to the right thing,
and the next checkpoint enables the member-wraparound guards.  This
works with both autovacuum=on and autovacuum=off; the emergency
mechanism kicks in as intended.  We'll want to warn people with big
databases who upgrade to 9.3.0 - 9.3.4 via pg_upgrade that they may
want to pre-vacuum those tables before upgrading to avoid a vacuum
storm.  But generally I'm pretty happy with this: forcing those values
to get fixed so that we can guard against member-space wraparound
seems like the right thing to do.

So, to summarize, this patch does the following:

- Fixes the failure-to-start problems introduced in 9.4.2 in
complicated pg_upgrade scenarios.
- Prevents the new calls to find_multixact_start we added in 9.4.2
from happening during recovery, where they can only create failure
scenarios.  The call in TruncateMultiXact that has been there all
along is not eliminated, but now handles failure more gracefully.
- Fixes possible incorrect removal of every single
pg_multixact/offsets file when no multixacts exist; one file should be
kept.
- Forces aggressive autovacuuming when the control file's
oldestMultiXid doesn't point to a valid MultiXact and enables member
wraparound at the next checkpoint following the correction of that
problem.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
commit 87aa15fe5257060e0c971e135dd9f460fdc00bd0
Author: Robert Haas 
Date:   Thu Jun 4 11:58:49 2015 -0400

bar

diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index 9568ff1..7c457a6 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -198,13 +198,24 @@ typedef struct MultiXactStateData
 	/* next-to-be-assigned offset */
 	MultiXactOffset nextOffset;
 
+	/* Have we completed multixact startup? */
+	bool		finishedStartup;
+
 	/*
-	 * Oldest multixact that is still on disk.  Anything older than this
-	 * should not be consulted.  These values are updated by vacuum.
+	 * Oldest multixact that is still potentially referenced by a relation.
+	 * Anything older than this should not be consulted.  These values are
+	 * updated by vacuum.
 	 */
 	MultiXactId oldestMultiXactId;
 	Oid			oldestMultiXactDB;
+
+	/*
+	 * Oldest multixact offset that is potentially referenced by a
+	 * multixact referenced by a relation.  We don't always know this value,
+	 * so there's a flag here to ind

Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-04 Thread Robert Haas
On Thu, Jun 4, 2015 at 5:29 PM, Robert Haas  wrote:
> - Forces aggressive autovacuuming when the control file's
> oldestMultiXid doesn't point to a valid MultiXact and enables member
> wraparound at the next checkpoint following the correction of that
> problem.

Err, enables member wraparound *protection* at the next checkpoint,
not the wraparound itself.

It's worth noting that every startup will now include one of the
following two messages:

LOG:  MultiXact member wraparound protections are now enabled

Or:

LOG:  MultiXact member wraparound protections are disabled because
oldest checkpointed MultiXact %u does not exist on disk
...where %u is probably 1

If you get the second one, you'll get the first one later after vacuum
has done its thing and a checkpoint has happened.

This is, obviously, some log chatter for people who don't have a
problem and never have, but I think it's worth emitting the first
message at startup even when there's no problem, so that people don't
have to make inferences from the absence of a message.  We can tell
people very simply that (1) if they see the first message, everything
is fine; (2) if they see the second message, autovacuum is going to
clean things up and they will eventually see the first message; and
(3) if they see neither message, they haven't upgraded to a fixed
version yet.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-05 Thread Robert Haas
On Fri, Jun 5, 2015 at 2:20 AM, Noah Misch  wrote:
> On Thu, Jun 04, 2015 at 05:29:51PM -0400, Robert Haas wrote:
>> Here's a new version with some more fixes and improvements:
>
> I read through this version and found nothing to change.  I encourage other
> hackers to study the patch, though.  The surrounding code is challenging.

Andres tested this and discovered that my changes to
find_multixact_start() were far more creative than intended.
Committed and back-patched with a trivial fix for that stupidity and a
novel-length explanation of the changes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-05 Thread Robert Haas
On Fri, Jun 5, 2015 at 12:00 PM, Andres Freund  wrote:
> On 2015-06-05 11:43:45 -0400, Tom Lane wrote:
>> Robert Haas  writes:
>> > On Fri, Jun 5, 2015 at 2:20 AM, Noah Misch  wrote:
>> >> I read through this version and found nothing to change.  I encourage 
>> >> other
>> >> hackers to study the patch, though.  The surrounding code is challenging.
>>
>> > Andres tested this and discovered that my changes to
>> > find_multixact_start() were far more creative than intended.
>> > Committed and back-patched with a trivial fix for that stupidity and a
>> > novel-length explanation of the changes.
>>
>> So where are we on this?  Are we ready to schedule a new set of
>> back-branch releases?  If not, what issues remain to be looked at?
>
> We're currently still doing bad things while the database is in an
> inconsistent state (i.e. read from SLRUs and truncate based on the
> results of that). It's quite easy to reproduce base backup startup
> failures.
>
> On the other hand, that's not new. And the fix requires, afaics, a new
> type of WAL record (issued very infrequently). I'll post a first version
> of the patch, rebased ontop of Robert's commit, tonight or tomorrow. I
> guess we can then decide what we'd like to do.

There are at least two other known issues that seem like they should
be fixed before we release:

1. The problem that we might truncate an SLRU members page away when
it's in the buffers, but not drop it from the buffers, leading to a
failure when we try to write it later.

2. Thomas's bug fix for another longstanding bug that occurs when you
run his checkpoint-segment-boundary.sh script.

I think we might want to try to fix one or both of those before
cutting a new release.  I'm less sold on the idea of installing
WAL-logging in this minor release.  That probably needs to be done,
but right now we've got stuff that worked in early 9.3.X releases and
is now broken, and I'm in favor of fixing that first.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-05 Thread Robert Haas
On Fri, Jun 5, 2015 at 2:47 PM, Andres Freund  wrote:
> On 2015-06-05 14:33:12 -0400, Tom Lane wrote:
>> Robert Haas  writes:
>> > 1. The problem that we might truncate an SLRU members page away when
>> > it's in the buffers, but not drop it from the buffers, leading to a
>> > failure when we try to write it later.
>
> I've got a fix for this, and about three other issues I found during
> development of the new truncation codepath.
>
> I'll commit the fix tomorrow.

OK.  Then I think we should release next week, so we get the fixes we
have out before PGCon.  The current situation is not good.

>> > I think we might want to try to fix one or both of those before
>> > cutting a new release.  I'm less sold on the idea of installing
>> > WAL-logging in this minor release.  That probably needs to be done,
>> > but right now we've got stuff that worked in early 9.3.X releases and
>> > is now broken, and I'm in favor of fixing that first.
>
> I've implemented this, and so far it removes more code than it
> adds. It's imo also a pretty clear win in how understandable the code
> is.  The remaining work, besides testing, is primarily going over lots
> of comment and updating them. Some of them are outdated by the patch,
> and some already were.
>
> Will post tonight, together with the other fixes, after I get back from
> climbing.
>
> My gut feeling right now is that it's a significant improvement, and
> that it'll be reasonable to include it. But I'd definitely like some
> independent testing for it, and I'm not sure if that's doable in time
> for the wrap.

I think we would be foolish to rush that part into the tree.  We
probably got here in the first place by rushing the last round of
fixes too much; let's try not to double down on that mistake.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-05 Thread Robert Haas
On Fri, Jun 5, 2015 at 2:36 PM, Alvaro Herrera  wrote:
> Tom Lane wrote:
>> Robert Haas  writes:
>
>> > There are at least two other known issues that seem like they should
>> > be fixed before we release:
>>
>> > 1. The problem that we might truncate an SLRU members page away when
>> > it's in the buffers, but not drop it from the buffers, leading to a
>> > failure when we try to write it later.
>>
>> > 2. Thomas's bug fix for another longstanding bug that occurs when you
>> > run his checkpoint-segment-boundary.sh script.
>>
>> > I think we might want to try to fix one or both of those before
>> > cutting a new release.  I'm less sold on the idea of installing
>> > WAL-logging in this minor release.  That probably needs to be done,
>> > but right now we've got stuff that worked in early 9.3.X releases and
>> > is now broken, and I'm in favor of fixing that first.
>>
>> Okay, but if we're not committing today to a release wrap on Monday,
>> I don't see it happening till after PGCon.
>
> In that case, I think we should get a release out next week.  The
> current situation is rather badly broken and dangerous, and the above
> two bugs are nowhere as problematic.  If we can get fixes for these over
> the weekend, that would be additional bonus.

Yeah, I think I agree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-05 Thread Robert Haas
On Fri, Jun 5, 2015 at 4:40 PM, Andres Freund  wrote:
>>I think we would be foolish to rush that part into the tree.  We
>>probably got here in the first place by rushing the last round of
>>fixes too much; let's try not to double down on that mistake.
>
> My problem with that approach is that I think the code has gotten 
> significantly more complex in the last few weeks. I have very little trust 
> that the interactions between vacuum, the deferred truncations in the 
> checkpointer, the state management in shared memory and recovery are correct. 
> There's just too many non-local subtleties here.
>
> I don't know what the right thing to do here is.

That may be true, but we don't need to get to perfect to be better
than 9.4.2 and 9.4.3, where some people can't start the database.

I will grant you that, if the patch I committed today introduces some
regression that is even worse, life will suck.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [HACKERS] [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-08 Thread Robert Haas
On Mon, Jun 8, 2015 at 1:23 PM, Alvaro Herrera  wrote:
> Andres Freund wrote:
>> On June 8, 2015 7:06:31 PM GMT+02:00, Alvaro Herrera 
>>  wrote:
>> >I might be misreading the code, but PMSIGNAL_START_AUTOVAC_LAUNCHER
>> >only causes things to happen (i.e. a new worker to be started) when
>> >autovacuum is disabled.  If autovacuum is enabled, postmaster
>> >receives the signal and doesn't do anything about it, because the
>> >launcher is already running.  Of course, regularly scheduled autovac
>> >workers will eventually start running, but perhaps this is not good
>> >enough.
>>
>> Well that's just the same for the plain xid precedent? I'd not mind
>> improving further, but that seems like a separate thing.
>
> Sure.  My concern is just that we might be putting excessive trust in
> emergency workers being launched at a high pace.  With normally
> configured systems (naptime=1min) it shouldn't be a problem, but we have
> seen systems with naptime set to one hour or so, and those might feel
> some pain; and it would get worse the more databases you have, because
> people might feel the need to space the autovac runs even more.
>
> (My personal alarm bells go off when I see autovac_naptime=15min or
> more, but apparently not everybody sees things that way.)

Uh, I'd echo that sentiment if you did s/15min/1min/

I think Andres's patch is just improving the existing mechanism so
that it's reliable, and you're proposing something notably different
which might be better, but which is really a different proposal
altogether.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] SET LOCAL synchronous_commit TO OFF

2015-06-11 Thread Robert DiFalco
I want to make sure I understand the repercussions of this before making it
a global setting.

As far as I can tell this will not put data/referential integrity at risk. It
only means that there is a period of time (maybe 600 msecs) between when a
commit occurs and when that data is safe in the case of a server crash.

Is there anything else I'm missing?
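
For concreteness, the pattern I have in mind for low-value writes (the
table name is just an example):

  BEGIN;
  SET LOCAL synchronous_commit TO OFF;  -- applies to this transaction only
  INSERT INTO activity_log (msg) VALUES ('low-value write');
  COMMIT;  -- may return before the WAL is flushed; a crash can lose this
           -- commit, but it cannot corrupt the database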


Re: [HACKERS] [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

2015-06-15 Thread Robert Haas
On Fri, Jun 12, 2015 at 7:27 PM, Steve Kehlet  wrote:
> Just wanted to report that I rolled back my VM to where it was with 9.4.2
> installed and it wouldn't start. I installed 9.4.4 and now it starts up just
> fine:
>
>> 2015-06-12 16:05:58 PDT [6453]: [1-1] LOG:  database system was shut down
>> at 2015-05-27 13:12:55 PDT
>> 2015-06-12 16:05:58 PDT [6453]: [2-1] LOG:  MultiXact member wraparound
>> protections are disabled because oldest checkpointed MultiXact 1 does not
>> exist on disk
>> 2015-06-12 16:05:58 PDT [6457]: [1-1] LOG:  autovacuum launcher started
>> 2015-06-12 16:05:58 PDT [6452]: [1-1] LOG:  database system is ready to
>> accept connections
>>  done
>> server started
>
> And this is showing up in my serverlog periodically as the emergency
> autovacuums are running:
>
>> 2015-06-12 16:13:44 PDT [6454]: [1-1] LOG:  MultiXact member wraparound
>> protections are disabled because oldest checkpointed MultiXact 1 does not
>> exist on disk
>
> **Thank you Robert and all involved for the resolution to this.**
>
>> With the fixes introduced in this release, such a situation will result in
>> immediate emergency autovacuuming until a correct oldestMultiXid value can
>> be determined
>
> Okay, I notice these vacuums are of the "to prevent wraparound" type (like
> VACUUM FREEZE), which do hold locks preventing ALTER TABLEs and such. Good to
> know; we'll plan our software updates accordingly.
>
> Is there any risk until these autovacuums finish?

As long as you see only a modest number of files in
pg_multixact/members, you're OK.  But in theory, until that emergency
autovacuuming finishes, there's nothing keeping that directory from
wrapping around.
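
If you'd rather watch that from SQL than from the filesystem, something
like this should work on a superuser connection (each file is one
32-page SLRU segment):

  SELECT count(*) AS member_segments
  FROM pg_ls_dir('pg_multixact/members');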

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] use null or 0 in foreign key column, to mean "no value"?

2015-06-26 Thread Robert Nikander
Hi,

(Maybe my subject line should be: `is not distinct from` and indexes.)

In Postgres 9.4, I’ve got a table of ‘items’ that references a table ‘colors’.  
Not all items have colors, so I created a nullable column in items like:

  color_id bigint references colors  

There is also an index on color_id:

  create index on items (color_id);

I thought this was the right way to do it, but now I’m not so sure... In 
application code, prepared statements want to say: `select * from items where 
color_id = ?` and that `?` might be an int or null, so that doesn’t work.  I
used `is not distinct from` instead of =, which has the right meaning, but now
I notice it doesn’t use the index for queries that replace `=` with `is not
distinct from`, and queries run much slower.  Using `explain` confirms: it’s
doing sequential scans where `=` was using the index.

So… is this bad DB design to use null to mean that an item has no color? Should 
I instead put a special row in `colors`, maybe with id = 0, to represent the 
“no color” value?  Or is there some way to make an index work with nulls and 
`is not distinct from`?  
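
One rewrite I’ve seen suggested, though I haven’t verified the plans here
($1 being the possibly-null parameter):

  SELECT * FROM items
  WHERE color_id = $1
     OR ($1 IS NULL AND color_id IS NULL);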

thank you,
Rob





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Average New Users Per DOW

2015-07-06 Thread Robert DiFalco
I'm not sure how to create a result where I get the average number of new
users per day of the week. My issue is that days with no new users are not
factored into the average, giving an overinflated result.

This is what I started with:

WITH userdays AS
  (SELECT u.created::DATE AS created,
  to_char(u.created,'Dy') AS d,
  COUNT(*) AS total
   FROM users u
   GROUP BY 1,2),
userdays_avg AS
  (SELECT extract('dow'
  FROM created) AS nDay,
  d AS "Day",
  AVG(total) AS "New Users"
   FROM userdays
   GROUP BY 1,2
   ORDER BY 1)
SELECT "Day", "New Users"
FROM userdays_avg
ORDER BY nDay;


But you can see it won't give correct results since (for example) Mondays
with no new users will not be counted in the average as 0.

TIA

R.


Re: [GENERAL] Average New Users Per DOW

2015-07-06 Thread Robert DiFalco
Thanks Paul, I guess I'm not sure how a generate_series between 0 and 6
would solve this problem. Wouldn't I have to generate a series based on the
date range (by day) and then group by DOW _after_ that? Can you give me an
example of how I'd do it with a series based on 0 to 6?

On Mon, Jul 6, 2015 at 10:58 AM, Paul Jungwirth  wrote:

> > I'm not sure how to create a result where I get the average number of
> > new users per day of the week. My issues are that days that did not
> > have any new users will not be factored into the average
>
> This is a pretty common problem with time-series queries when there is
> sparse data. My go-to solution is to use generate_series---in your case
> from 0 to 6---then do a left join from there to your actual data.
>
> Paul
>
>
>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


Re: [GENERAL] Average New Users Per DOW

2015-07-06 Thread Robert DiFalco
Paul, I'm sure I'm missing something, but it seems like your approach will
not work, because the LEFT OUTER JOIN is on the numeric day of the
week. So if you had this query going over weeks or months of data, wouldn't
you have the same issue with the days that had no new users not being
factored into the AVG?  I ended up doing something like this, which seems
to work pretty well.

WITH usersByDay AS (
SELECT cDate, COUNT(*) AS total
FROM (
SELECT generate_series(
{CALENDAR_INTERVAL.START}::DATE,
{CALENDAR_INTERVAL.END}::DATE,
interval '1 day')::DATE AS cDate
) AS c
LEFT OUTER JOIN users u ON u.created::DATE = c.cDate
GROUP BY cDate),
avgUsersByDOW AS (
SELECT extract('dow' FROM cDate) AS nDay,
 to_char(cDate,'Dy') AS "Day",
ROUND(AVG(total), 2) AS "New Users"
FROM usersByDay
GROUP BY 1, 2
ORDER BY 1)
SELECT "Day", "New Users" FROM avgUsersByDOW ORDER BY nDay




On Mon, Jul 6, 2015 at 11:30 AM, Paul Jungwirth  wrote:

>> Thanks Paul, I guess I'm not sure how a generate_series between 0 and 6
>> would solve this problem. Wouldn't I have to generate a series based on
>> the date range (by day) and then group by DOW _after_ that? Can you give
>> me an example of how I'd do it with a series based on 0 to 6?
>>
>
> Looks like David Johnston beat me to it! :-) But this is what I had in
> mind:
>
> SELECT  s.d AS dow,
> COUNT(u.id) c
> FROMgenerate_series(0, 6) s(d)
> LEFT OUTER JOIN users u
> ON  EXTRACT(dow FROM created) = s.d
> GROUP BY dow
> ORDER BY dow
> ;
>
> You can also get human-readable DOW names by creating a 7-row CTE table
> and joining to it based on the numeric dow.
>
> Paul
>
>


Re: [GENERAL] Average New Users Per DOW

2015-07-06 Thread Robert DiFalco
>
> I am fairly certain this does not give you the correct results.
> Specifically, the minimum value for each cDate is going to be 1 since
> count(*) counts NULLs.  count(u) should probably work.
>

Yes, you are right. I forgot to change COUNT(*) to COUNT(id); as you
mention, COUNT(u.*) will also work. I just couldn't get the idea of
generating a sequence from 0 to 6 to work correctly. The approach I'm using
seems to give the correct results (with COUNT(u.id)).
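
For anyone following along, a tiny demo of the difference:

  -- count(*) counts rows, even the all-NULL row a left join produces;
  -- count(expr) skips NULLs
  SELECT count(*) AS n_rows, count(x) AS n_non_null
  FROM (VALUES (NULL::int)) AS t(x);
  -- n_rows = 1, n_non_null = 0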


[GENERAL] Bounded Zone Offset Query

2015-07-10 Thread Robert DiFalco
I have a table something like this:

CREATE TABLE devices (
  uid             BIGINT NOT NULL,  -- implied by the primary key; type assumed
  platform        TEXT   NOT NULL,  -- implied by the primary key; type assumed
  owner_id        BIGINT NOT NULL,
  utc_offset_secs INT,
  PRIMARY KEY (uid, platform),
  FOREIGN KEY (owner_id) REFERENCES users(id) ON DELETE CASCADE
);


I want to do a query from an application that returns all devices whose
local time is between 10am and 10pm for a given instant in time.

For example:

SELECT *
FROM devices
WHERE :utcSecondsOfDay + utc_offset_secs BETWEEN 10am AND 10pm



In the above query assume the correct "seconds of day" values for 10am and
10pm. The problem is that I have to do addition on each record to do the
above query, and I can't imagine that would be efficient. Also, I think
this example query will only work in some cases. For example, what if the
utcSecondsOfDay is 3600 (i.e. 1am) and the utc_offset_secs is -5 hours?

Thanks


Re: [GENERAL] Bounded Zone Offset Query

2015-07-10 Thread Robert DiFalco
On Fri, Jul 10, 2015 at 9:40 AM, John McKown 
wrote:

> On Fri, Jul 10, 2015 at 11:15 AM, Robert DiFalco  wrote:
>
>> I have a table something like this:
>>
>> CREATE TABLE devices (
>>   owner_id        BIGINT NOT NULL,
>>   utc_offset_secs INT,
>>   PRIMARY KEY (uid, platform),
>>   FOREIGN KEY (owner_id) REFERENCES users(id) ON DELETE CASCADE
>> );
>>
>>
>> I want to do a query from an application that returns all devices whose
>> local time is between 10am and 10pm for a given instant in time.
>>
>> For example:
>>
>> SELECT *
>> FROM devices
>> WHERE :utcSecondsOfDay + utc_offset_secs BETWEEN 10am AND 10pm
>>
>>
>>
>> In the above query assume the correct "seconds of day" values for 10am
>> and 10pm. The problem is that I have to do addition on each record to do
>> the above query, and I can't imagine that would be efficient. Also, I think
>> this example query will only work in some cases. For example, what if the
>> utcSecondsOfDay is 3600 (i.e. 1am) and the utc_offset_secs is -5 hours?
>>
>> Thanks
>>
>
> I'm not sure exactly what :utcSecondsOfDay really is. I guess it is an
> integer which is a "time" value, such as "seconds after midnight", and thus
> would range from 0 to 24*60*60=86400 (actually 86399, I guess). In this
> notation, 10am would be 10*60*60 or 36000 and 10pm would be 22*60*60 or
> 79200. How about calculating, in your application code, two different
> values: utcSecondsLower and utcSecondsHigher. utcSecondsLower would be
> 36000-utcSecondsOfDay. utcSecondsHigher would be 79200-utcSecondsOfDay.
> Change the SELECT to be:
>
> SELECT *
> FROM devices
> WHERE utc_offset_secs BETWEEN :utcSecondsLower AND :utcSecondsHigher;
>
> I am not sure, but I think that is legal. Or maybe it gives you another
> approach.
>
>
> --
>
> Schrodinger's backup: The condition of any backup is unknown until a
> restore is attempted.
>
> Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.
>
> He's about as useful as a wax frying pan.
>
> 10 to the 12th power microphones = 1 Megaphone
>
> Maranatha! <><
> John McKown
>


Thanks John. Let me revise my original query to give a CORRECT and working
example; maybe this will help. I've created a query that actually works,
it's just ugly and I'd like to figure out how to make it like the example
you gave (i.e. no math on the utc_offset field, just comparisons).

 select *
 from devices d
 where (now() at time zone 'UTC' + make_interval(hours := d.utc_offset))::time
       BETWEEN time '10:00' AND time '22:00';
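
I also suspect John's comparison-only form could handle the wraparound
case by checking shifted copies of the window. An untested sketch, with
:sod being the UTC seconds-of-day computed in the application:

  SELECT *
  FROM devices
  WHERE utc_offset_secs BETWEEN 36000 - :sod         AND 79200 - :sod
     OR utc_offset_secs BETWEEN 36000 - :sod - 86400 AND 79200 - :sod - 86400
     OR utc_offset_secs BETWEEN 36000 - :sod + 86400 AND 79200 - :sod + 86400;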


[GENERAL] ts_rank and ts_rank_cd with multiple search terms

2015-07-25 Thread Robert Nikander
Hi,

I’m reading about the ranking functions [1], and I have a couple questions… 

1. Is ts_rank taking proximity of terms into account? It seems like it is, but 
the docs suggest that only ts_rank_cd does that.
2. Is there a way to search multiple terms like ‘a | b | c …’ but score higher 
when multiple match, AND take into account distance between words? It doesn’t 
seem like basic use of ts_rank or ts_rank_cd is doing this.  Do you recommend a 
custom ranking function here?

For example, I want to search for “black bear” and get better results ordered 
so that documents with both words close together score highest, and the 
document with only “bear" is the last.

create table search_test ( title text, body text, vec tsvector );
-- These 3 have "black" and "bear" at different distances from each other
insert into search_test values ('close', 'The black bear sat on a brown
rock and ate a barrel of red berries.');
insert into search_test values ('medium', 'The brown bear sat on a black
rock and ate a barrel of red berries.');
insert into search_test values ('far', 'The brown bear sat on a red rock
and ate a barrel of black berries.');
-- This one has the word "bear", but not "black"
insert into search_test values ('only bear', 'The brown bear sat on a red
rock and ate a barrel of orange berries.');
update search_test set vec = to_tsvector(body);

Now a query:

select title, ts_rank(vec, q) as rank
from search_test, to_tsquery('black & bear') q
where vec @@ q order by rank desc;

That surprises me by scoring close > medium > far. Hence, my question #1.  
Substituting ts_rank_cd also works, as expected.

If I change the query to `black | bear`, to try to match “only bear” as well, 
then both ts_rank and ts_rank_cd return equal rankings for “close”, “medium” 
and “far”.

Any recommendations? 
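
One idea I’m toying with: use the OR query for matching, but add a
weighted proximity score from the AND query (the factor of 2 is an
arbitrary guess):

  SELECT title,
         ts_rank(vec, to_tsquery('black | bear'))
           + 2 * ts_rank_cd(vec, to_tsquery('black & bear')) AS score
  FROM search_test
  WHERE vec @@ to_tsquery('black | bear')
  ORDER BY score DESC;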

thanks,
Rob 



[1] 
http://www.postgresql.org/docs/9.4/static/textsearch-controls.html#TEXTSEARCH-RANKING

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [HACKERS] Change in order of criteria - reg

2016-06-03 Thread Robert Haas
On Wed, Jun 1, 2016 at 5:22 AM, Amit Langote
 wrote:
> On 2016/06/01 13:07, sri harsha wrote:
>> Hi,
>>
>> In PostgreSQL , does the order in which the criteria is given matter ??
>> For example
>>
>> Query 1 : Select * from TABLE where a > 5 and b < 10;
>>
>> Query 2 : Select * from TABLE where b <10 and a > 5;
>>
>> Are query 1 and query 2 the same in PostgreSQL or different ?? If its
>> different , WHY ??
>
> tl;dr they are the same.  As in they obviously produce the same result and
> result in invoking the same plan.
>
> Internally, optimizer will order application of those quals in resulting
> plan based on per-tuple cost of individual quals.  So a cheaper, more
> selective qual might result in short-circuiting of relatively expensive
> quals for a large number of rows in the table saving some cost in
> run-time.  Also, if index scan is chosen and quals pushed down, the
> underlying index method might know to order quals smartly.
>
> However, the cost-markings of operators/functions involved in quals better
> match reality.  By default, most operators/functions in a database are
> marked with cost of 1 unit.  Stable sorting used in ordering of quals
> would mean the order of applying quals in resulting plan matches the
> original order (ie, the order in which they appear in the query).  So, if
> the first specified qual really happens to be an expensive qual but marked
> as having the same cost as other less expensive quals, one would have to
> pay the price of evaluating it for all the rows.  Whereas, correctly
> marking the costs could have avoided that (as explained above).  Note that
> I am not suggesting that ordering quals in query by their perceived cost
> is the solution.  Keep optimizer informed by setting costs appropriately
> and it will do the right thing more often than not. :)

I think that if the costs are actually identical, the system will keep
the quals in the same order they were written - so then the order does
matter, a little bit.
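
If you want the planner to reorder them for you, the knob is the cost
marking, e.g. (slow_check being a made-up user-defined function):

  -- Mark an expensive qual function so cheaper quals are applied first.
  -- The default COST is 100 for PL functions, 1 for most built-ins.
  ALTER FUNCTION slow_check(integer) COST 10000;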

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: R: [GENERAL] Vacuum full: alternatives?

2016-06-20 Thread Robert Wysocki
On Mon, 2016-06-20 at 11:43 +0200, Job wrote:
> Hi Andreas,
> 
> >I would suggest running only autovacuum; with time you will see the
> >table stop growing. There is no need for vacuum full.
> 
> So will new records, when loaded with pg_bulkload, reuse the
> "marked-free" space?

Yes, but you may have to make the autovacuum process more aggressive. Even
then it might not cope with the frequency of your bulk I/U/D.

There are many tools to use instead of VACUUM FULL though; have a look
at pg_reorg and pgcompact, for example. Do not be afraid to use an
awesome tool called Google as well ;-) (This is a very basic problem
that everyone asks about, so you'll find many in-depth answers and
articles; the phrase you want to google for is "postgresql bloat".)
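
As a first sanity check before reaching for those tools, the statistics
views show whether autovacuum is keeping up, roughly:

  -- tables with the most dead tuples, and when autovacuum last ran
  SELECT relname, n_dead_tup, n_live_tup, last_autovacuum
  FROM pg_stat_user_tables
  ORDER BY n_dead_tup DESC
  LIMIT 10;

  -- making autovacuum more aggressive for one hot table
  ALTER TABLE mytable SET (autovacuum_vacuum_scale_factor = 0.01);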

Cheers,
R.




Re: [HACKERS] [GENERAL] PgQ and pg_dump

2016-06-21 Thread Robert Haas
On Thu, Jun 16, 2016 at 1:46 PM, Martín Marqués  wrote:
> The comment is accurate on what is going to be dumpable and what's not
> from the code. In our case, as the pgq schema is not dumpable becaause
> it comes from an extension, other objects it contain will not be
> dumpable as well.
>
> That's the reason why the PgQ event tables created by
> pgq.create_queue() are not dumped.

That sucks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [HACKERS] pg_dumping extensions having sequences with 9.6beta3

2016-07-29 Thread Robert Haas
On Wed, Jul 27, 2016 at 2:24 AM, Michael Paquier
 wrote:
> On Wed, Jul 27, 2016 at 8:07 AM, Stephen Frost  wrote:
>> That'd be great.  It's definitely on my list of things to look into, but
>> I'm extremely busy this week.  I hope to look into it on Friday, would
>> be great to see what you find.
>
> Sequences that are directly defined in extensions do not get dumped,
> and sequences that are part of a serial column in an extension are
> getting dumped. Looking into this problem, getOwnedSeqs() is visibly
> doing an incorrect assumption: sequences owned by table columns are
> dumped unconditionally, but this is not true for sequences that are
> part of extensions. More precisely, dobj->dump is being enforced to
> DUMP_COMPONENT_ALL, which makes the sequence definition to be dumped.
> Oops.
>
> The patch attached fixes the problem for me. I have added as well
> tests in test_pg_dump in the shape of sequences defined in an
> extension, and sequences that are part of a serial column. This patch
> is also able to work in the case where a sequence is created as part
> of a serial column, and gets removed after, say with ALTER EXTENSION
> DROP SEQUENCE. The behavior for sequences and serial columns that are
> not part of extensions is unchanged.
>
> Stephen, it would be good if you could check the correctness of this
> patch as you did all this refactoring of pg_dump to support catalog
> ACLs. I am sure by the way that checking for (owning_tab->dobj.dump &&
> DUMP_COMPONENT_DEFINITION) != 0 is not good because of for example the
> case of a serial column created in an extension where the sequence is
> dropped from the extension afterwards.

Stephen, is this still on your list of things for today?  Please
provide a status update.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Graphical entity relation model

2016-10-04 Thread Robert Stone
Hello,

If you have JDK 1.8 or above installed, go to www.executequery.org and
download the latest jar file. Download the PostgreSQL JDBC driver and
set it up. It's open source.
It has an ERD generator but obviously depends on having all your foreign
keys declared in order to link tables, etc. After generating the ERD you
then have to re-organise it by dragging and dropping so that when you
print, the diagram is readable.
I use it all the time for testing, etc.

HTH,
Rob

On 1 October 2016 at 04:45,  wrote:

> > Does anybody know of any software for generating graphical
> > entity-relationship models from existing PostgreSQL databases?
> >
> > Best regards Johannes
>
> I use dbWrench (dbwrench.com). It's not free, but they do have a free
> trial version so you can see if you like it before you buy it. It's also
> not expensive compared to many of these sorts of tools. It also runs on all
> 3 major platforms (it's written in Java) and the developer is responsive if
> you find a problem.
>
> If money is no object, you can look at Power Designer (by Sybase). I used
> to use it years ago and liked it even if it was MS-Windows only, but the
> price has gone up so much only companies can really afford it now, IMO.
>
> HTH,
> Kevin
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>


[GENERAL] Invoice Table Design

2016-11-24 Thread Robert Heinen
I was wondering if anyone might be able to help me out with a table design
question.

A quick intro -- I'm helping a company switch from a mongo database over to
postgresql (yay!). The company is a marketplace app for musicians and
hosts. The basic idea is that a host can book a musician for an event, like
a wedding or a birthday. Also, an artist and a host can be either basic or
"pro" accounts -- if they're "pro" then they pay a little bit more and get
some extra features.

The design I'm struggling with is how to handle invoices and transactions
in postgres. In mongo, everything is stuffed into a single 'invoices' table
that includes sender and receiver addresses, the amount of the invoice,
taxes, etc. It also contains a reference to the booked event, the artist
and the host, as well as some state information through nullable columns --
created date, sent date, paid date.

At the same time the table also tracks the above mentioned "pro"
subscriptions by utilizing a type field (so 'concertfee' vs
'subscription'). So both types of invoices are stuffed into the table and
it's up to the application to understand the difference in the types.

To translate this to postgres, I'm leaning towards breaking out the
different types of invoices into their own tables but keeping the basics of
an invoice (sender, receiver, amount) and then referencing from specific
tables like -- subscription_invoices and event_invoices.

so tables would be:
invoices (invoice_uuid primary key)
event_invoices (invoice_uuid FK, event_uuid FK)
artist_subscription_invoices (invoice_uuid FK, artist_uuid FK)
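
In DDL that would be something like this (the non-key columns are just
my guesses at this point):

  CREATE TABLE invoices (
      invoice_uuid     uuid PRIMARY KEY,
      sender_address   text NOT NULL,
      receiver_address text NOT NULL,
      amount_cents     bigint NOT NULL,
      created          timestamptz NOT NULL DEFAULT now(),
      sent             timestamptz,  -- null until sent
      paid             timestamptz   -- null until paid
  );

  CREATE TABLE event_invoices (
      invoice_uuid uuid PRIMARY KEY REFERENCES invoices,
      event_uuid   uuid NOT NULL
  );

  CREATE TABLE artist_subscription_invoices (
      invoice_uuid uuid PRIMARY KEY REFERENCES invoices,
      artist_uuid  uuid NOT NULL
  );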

There is one last interesting part. When an event is booked, two invoices
are generated -- one from the artist to the host for the payment of the
concert, and then a second one from my company to the artist for the
booking fee. Again, these seem like two separate tables, with, I suppose,
 a kind of a parent-child relationship (we can't have a booking fee unless
we have the original invoice for the booking).

Thanks for reading --any insight, comments, or questions are appreciated!

Rob


[GENERAL] Moving pg_xlog

2016-12-01 Thread Robert Inder
I'm running Postgres 9.4 in master/hot-standby mode on a few pairs of servers.

While recovering from A Bit Of Bother last week, I came across a
posting saying that pg_xlog should be on a separate partition.

I tried to find out more about this, by consulting the PostgresQL
documentation (i.e.
https://www.postgresql.org/docs/9.4/static/index.html )
But all I could find was a mention that "It is advantageous if the log
is located on a different disk from the main database files".

The questions:
1. WHY is this good?  Is it (just) to stop pg_xlog filling the
database disk/partition?  Or are there performance implications?
SPECIFICALLY: my database is currently in "/", which is on SSD.  Is it
better to move pg_xlog to another partition on the same SSD?  Or to a
physical disk or SAN?

2. What are the implications for doing a base backup?  I believe I
read that putting pg_xlog on a different partition meant it would be
omitted from a file-system bulk copy (e.g. rsync), and this was a GOOD
thing, because the copy operation would be faster -- not copying
pg_xlog would not prevent the standby server from starting, because
the information it needed would be in the WAL files that would be
shipped separately.  Have I got that right?

Finally, the suggestion.

I'd really like to read an explicit discussion of this in the official
documentation, rather than just glean what I can from answers to
questions.
The possibility of moving pg_xlog to another disk is mentioned in the
documentation, but I almost missed it because it is in "the wrong
place".  It is in Section 29.5 -- "Reliability and the Write Ahead
Log" / "WAL Internals".  But I wasn't interested in anything INTERNAL:
I wanted to know where I should try to locate it/them.  So I'd looked
in "the obvious places" -- Section 18 (Server configuration), and in
particular 18.2 "File Locations".  Could I suggest that the motivation
for doing this, and the consequences for backups, should be discussed
in "the right place" -- in or near the section that talks about file
locations in the context of server configuration?

Robert.

-- 
Robert Inder, 0131 229 1052 / 07808 492 213
Interactive Information Ltd,   3, Lauriston Gardens, Edinburgh EH3 9HH
Registered in Scotland, Company no. SC 150689
   Interactions speak louder than words




Re: [GENERAL] Moving pg_xlog

2016-12-03 Thread Robert Inder
Thanks, everyone, for your comments.

I think I've got a clearer idea of what's going on now...

Robert.





-- 
Robert Inder, 0131 229 1052 / 07808 492 213
Interactive Information Ltd,   3, Lauriston Gardens, Edinburgh EH3 9HH
Registered in Scotland, Company no. SC 150689
   Interactions speak louder than words




[GENERAL] Cannot create role, no default superuser role exists

2016-03-10 Thread Robert McAlpine
Postgresql 9.5, Ubuntu 14.04.

I broke my ability to access postgres after attempting to install
postgres-xc (ironic, since I installed that to use pg_ctl to reload my
pg_hba.conf without restarting the entirety of postgres).
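
For the archives: stock Postgres can also reload its configuration,
including pg_hba.conf, straight from SQL, with no extra packages. A
minimal example, run as a superuser:

SELECT pg_reload_conf();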

After doing so I can no longer access my databases via psql or any other
utilities, getting the error:

FATAL:  role "postgres" does not exist

Unfortunately the same error gets thrown if I attempt to use 'createuser'.

The postgres user and role were the only ones able to access postgresql
server, so altering the permissions in my pg_hba.conf also got me nowhere.

I made a complete copy of the data directory, took a snapshot (it is on a
vm), then purged postgresql-9.5 and reinstalled, restoring from a dump that
is a few days old. I thought about copying all the files from the data
directory (except for the roles table file) into a fresh install with a
correct default postgres role, but my gut tells me that screwing around
with those files is doomed to fail.
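
For reference, if the cluster files are intact, one recovery avenue is to
start a standalone backend against the data directory (postgres --single)
and recreate the missing role in plain SQL. A hedged sketch of the
statement to run at the standalone prompt:

CREATE ROLE postgres WITH SUPERUSER LOGIN;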

I would appreciate any help or thoughts on how to recover access to the
data.
-- 
Robert McAlpine
r...@pfcta.com


Re: [GENERAL] Cannot create role, no default superuser role exists

2016-03-10 Thread Robert McAlpine
Thank you for the quick reply.

I very recently had some success by doing a complete purge of
postgresql-9.5, reinstalling and then copying in the original data
directory (or I guess I could have just pointed to it, but w/e). I did not
expect this to work, as I thought it would just point to the table holding
the roles.

But to answer your questions anyways:


Did you install postgres-xc over your existing instance?

I think that is probably what happened. The reason I installed it is that I
tried to run the command 'pg_ctl' on my box and couldn't find it. I had
not used pg_ctl before, so I assumed it was a utility package, and
installed it via postgres-xc.


Which data directory, the postgres-xc one or the original Postgres one?

The original.


Installed from a package or source?

From a package, specifically apt-get install postgresql-9.5


The dump was from a pre-xc version of Postgres?

No, my postgres version has not changed recently.


before or after the reinstall?

After. I wanted to see if apt-get remove and then apt-get install would be
enough of a jolt to fix the issue, as I didn't yet want to try apt-get
purge, which would remove all my data.





On Thu, Mar 10, 2016 at 7:44 PM, Adrian Klaver 
wrote:

> On 03/10/2016 04:11 PM, Robert McAlpine wrote:
>
>>
>> Postgresql 9.5, Ubuntu 14.04.
>>
>> I broke my ability to access postgres after attempting to install
>> postgres-xc (ironic, since I installed that to use pg_ctl to reload my
>> pg_hba.conf without restarting the entirety of postgres).
>>
>
> That is available with the stock Postgres, unless I am missing something:
>
> http://www.postgresql.org/docs/9.5/interactive/auth-pg-hba-conf.html
> "The pg_hba.conf file is read on start-up and when the main server process
> receives a SIGHUP signal. If you edit the file on an active system, you
> will need to signal the postmaster (using pg_ctl reload or kill -HUP) to
> make it re-read the file."
>
>
> Did you install postgres-xc over your existing instance?
>
>
>> After doing so I can no longer access my databases via psql or any other
>> utilities, getting the error:
>>
>> FATAL:  role "postgres" does not exist
>>
>> Unfortunately the same error gets thrown if I attempt to use 'createuser'.
>>
>> The postgres user and role were the only ones able to access postgresql
>> server, so altering the permissions in my pg_hba.conf also got me nowhere.
>>
>> I made a complete copy of the data directory, took a snapshot (it is on
>>
>
> Which data directory, the postgres-xc one or the original Postgres one?
>
> a vm), then purged postgresql-9.5 and reinstalled, restoring from a dump
>> that is a few days old. I thought about copying all the files from the
>>
>
> Installed from a package or source?
>
> The dump was from a pre-xc version of Postgres?
>
> It would seem to me if you reinstalled in default manner you would have a
> postgres user available. So where did you get:
>
> FATAL:  role "postgres" does not exist
>
> before or after the reinstall?
>
> data directory (except for the roles table file) into a fresh install
>> with a correct default postgres role, but my gut tells me that screwing
>> around with those files is doomed to fail.
>>
>
> Yeah, I would hold off doing that until it is clearer what is going on.
>
>
>> I would appreciate any help or thoughts on how to recover access to the
>> data.
>> --
>> Robert McAlpine
>> r...@pfcta.com <mailto:r...@pfcta.com>
>>
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>



-- 
Robert McAlpine
DevOps Engineer
Perfecta Federal <http://www.perfectafederal.com/>
6506 Loisdale Road
Springfield, VA 22150
O: 202.888.4949 ext 1005
C: 757.620.3503
r...@pfcta.com


Re: [GENERAL] vacuum - reclaiming disk space.

2016-03-19 Thread Robert McAlpine
Just to throw some extreme ideas out there: you could stand up a Postgres
instance on some other server, pg_dump your current database, and use that
dump to build up the second instance. Switch to the new instance when your
system goes live again after the downtime. Restoring from a dump means the
database would not take up as much space, since I assume your issue is that
all that space was allocated to Postgres by your large number of table
updates.

On Thu, Mar 17, 2016 at 11:34 AM, Melvin Davidson 
wrote:

>
>
> On Thu, Mar 17, 2016 at 10:57 AM, bricklen  wrote:
>
>> On Thu, Mar 17, 2016 at 7:27 AM, Mike Blackwell 
>> wrote:
>>
>>> I have a large table with numerous indexes which has approximately
>>> doubled in size after adding a column - every row was rewritten and 50% of
>>> the tuples are dead.  I'd like to reclaim this space, but VACUUM FULL
>>> cannot seem to finish within the scheduled downtime.
>>>
>>> Any suggestions for reclaiming the space without excessive downtime?
>>>
>>
>> pg_repack is a good tool for removing bloat.
>> https://github.com/reorg/pg_repack
>>
>>
> "I have a large table with numerous indexes :
> My first thought is, "DEFINE NUMEROUS". How many indexes do you actually
> have? How many of those indexes are actually used? In addition to VACUUMing
> the table, it also needs to go through every index you have.
> So find out if you have any unneeded indexes with:
>
> SELECT n.nspname as schema,
>i.relname as table,
>i.indexrelname as index,
>i.idx_scan,
>i.idx_tup_read,
>i.idx_tup_fetch,
>pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
> quote_ident(i.relname))) AS table_size,
>pg_size_pretty(pg_relation_size(quote_ident(n.nspname) || '.' ||
> quote_ident(i.indexrelname))) AS index_size,
>pg_get_indexdef(idx.indexrelid) as idx_definition
>   FROM pg_stat_all_indexes i
>   JOIN pg_class c ON (c.oid = i.relid)
>   JOIN pg_namespace n ON (n.oid = c.relnamespace)
>   JOIN pg_index idx ON (idx.indexrelid =  i.indexrelid )
>  WHERE i.idx_scan = 0
>AND NOT idx.indisprimary
>    AND NOT idx.indisunique
>  ORDER BY 1, 2, 3;
>
> Then drop any index that shows up!
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.
>



-- 
Robert McAlpine
DevOps Engineer
Perfecta Federal <http://www.perfectafederal.com/>
6506 Loisdale Road
Springfield, VA 22150
O: 202.888.4949 ext 1005
C: 757.620.3503
r...@pfcta.com


Re: [HACKERS] [GENERAL] Request - repeat value of \pset title during \watch interations

2016-03-19 Thread Robert Haas
On Thu, Mar 10, 2016 at 1:40 AM, David G. Johnston
 wrote:
> Adding -hackers for consideration in the Commitfest.

I don't much like how this patch uses the arbitrary constant 50 in no
fewer than 5 locations.

Also, it seems like we could arrange for head_title to be "" rather
than NULL when myopt.title is NULL.  Then instead of this:

+if (head_title)
+snprintf(title, strlen(myopt.title) + 50,
+ _("Watch every %lds\t%s\n%s"),
+ sleep, asctime(localtime(&timer)), head_title);
+else
+snprintf(title, 50, _("Watch every %lds\t%s"),
+ sleep, asctime(localtime(&timer)));

...we could just use the first branch of that if all the time.

 if (res == -1)
+{
+pg_free(title);
+pg_free(head_title);
 return false;
+}

Instead of repeating the cleanup code, how about making this a break;
then, change the return statement at the bottom of the function to
return (res != -1).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [GENERAL] Request - repeat value of \pset title during \watch interations

2016-03-21 Thread Robert Haas
On Sun, Mar 20, 2016 at 9:31 AM, Michael Paquier
 wrote:
> And the patch attached gives the following output:
> With title:
> =# \watch 1
> Watch every 1s    Sun Mar 20 22:28:38 2016
> popo
>  a
> ---
>  1
> (1 row)
>
> And without title:
> Watch every 1s    Sun Mar 20 22:29:31 2016
>
>  a
> ---
>  1
> (1 row)

And does everybody agree that this is a desirable change?

As for the patch itself, you could replace all this:

+   /*
+* Take into account any title present in the user setup as a part of
+* what is printed for each iteration by using it as a header.
+*/
+   if (myopt.title)
+   {
+   title_len = strlen(myopt.title);
+   title = pg_malloc(title_len + 50);
+   head_title = pg_strdup(myopt.title);
+   }
+   else
+   {
+   title_len = 0;
+   title = pg_malloc(50);
+   head_title = pg_strdup("");
+   }

...with:

head_title = pg_strdup(myopt.title != NULL ? myopt.title : "");
title_len = strlen(head_title);
title = pg_malloc(title_len + 50);

Better yet, include the + 50 in title_len, and then you don't need to
reference the number 50 again further down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [GENERAL] Request - repeat value of \pset title during \watch interations

2016-03-21 Thread Robert Haas
On Mon, Mar 21, 2016 at 11:17 AM, David G. Johnston
 wrote:
>> And does everybody agree that this is a desirable change?
>
> Adding the title is desirable.  While I'm inclined to bike-shed this
> anything that gets it in I can live with and so I'm content letting the
> author/committer decide where exactly things (including whitespace) appear.
>
> It is a bit odd that the "Watch every %s" gets centered if the result is
> wide but that the title remains left-aligned.

Well, the title isn't normally centered, but yeah, that is odd.  Come to
think of it, I think I might have expected the title to appear *above*
"Watch every %s", not below it.  That might decrease the oddness.

As for letting the committer decide, I don't care about this
personally at all, so I'm only looking at it to be nice to the people
who do.  Whatever is the consensus is OK with me.  I just don't want
to get yelled at later for committing something here, so it would be
nice to see a few votes for whatever we're gonna do here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [GENERAL] Request - repeat value of \pset title during \watch interations

2016-03-21 Thread Robert Haas
On Mon, Mar 21, 2016 at 2:09 PM, David G. Johnston
 wrote:
> On Monday, March 21, 2016, Tom Lane  wrote:
>> "David G. Johnston"  writes:
>> > I'd rather not omit sleep but removing "Watch every" is fine (preferred
>> > actually), so:
>> > Title Is Here Mon Mar 21 15:05:06 2016 (5s)
>>
>> Meh ... seems a bit awkward to me.  Couldn't you include " (5s)" in the
>> title, if you want that info?  If it's variable, you could still
>> accommodate that:
>
> Actually, only if it's a variable that you setup and repeat and you show.  A
> bit cumbersome and mixes the parts that are title and those that are present
> only because you are watching.

Ah, come on.  This doesn't really seem like an issue we should spend
more time quibbling about.  I think Tom's version is fine.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[GENERAL] Create index concurrently hanging with big table on pgsql 9.3.12

2016-05-09 Thread Robert Anderson
Hi,

We are trying to create an index concurrently but, at least apparently, it
hangs in an infinite loop and never ends.

Our version:

flip=# select version();
                                                     version
-----------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.3.12 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7
 20120313 (Red Hat 4.4.7-16), 64-bit
(1 row)


Index creation:

CREATE INDEX CONCURRENTLY index_texto
  ON flip_pagina_edicao
  USING hash
  (texto COLLATE pg_catalog."default");  -- texto is a text data type.

Size of the table:

flip=# select pg_size_pretty(pg_total_relation_size('flip_pagina_edicao'));
 pg_size_pretty
----------------
 956 GB
(1 row)


Process strace:


semop(622611, {{6, 1, 0}}, 1)   = 0
semop(655380, {{5, -1, 0}}, 1)  = 0
select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
semop(655380, {{5, -1, 0}}, 1)  = 0
semop(622611, {{6, 1, 0}}, 1)   = 0
semop(655380, {{5, -1, 0}}, 1)  = 0
semop(622611, {{6, 1, 0}}, 1)   = 0
semop(622611, {{6, 1, 0}}, 1)   = 0
semop(622611, {{6, 1, 0}}, 1)   = 0
semop(655380, {{5, -1, 0}}, 1)  = 0
semop(622611, {{6, 1, 0}}, 1)   = 0
semop(655380, {{5, -1, 0}}, 1)  = 0
semop(622611, {{6, 1, 0}}, 1)   = 0
semop(557073, {{2, 1, 0}}, 1)   = 0
semop(655380, {{5, -1, 0}}, 1)  = 0
semop(655380, {{5, -1, 0}}, 1)  = 0
semop(655380, {{5, -1, 0}}, 1)  = 0
semop(655380, {{5, -1, 0}}, 1)  = 0
...


Thanks in advance.

Robert


Re: [GENERAL] Create index concurrently hanging with big table on pgsql 9.3.12

2016-05-09 Thread Robert Anderson
Hi,

There aren't transactions blocking:

postgres=# SELECT
postgres-#w.query as waiting_query,
postgres-#w.pid as waiting_pid,
postgres-#w.usename as w_user,
postgres-#l.pid as blocking_pid,
postgres-#l.usename as blocking_user,
postgres-#t.schemaname || '.' || t.relname as tablename
postgres-#FROM pg_stat_activity w
postgres-#JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
postgres-#JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
postgres-#JOIN pg_stat_activity l ON (l2.pid = l.pid)
postgres-#JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
postgres-#WHERE w.waiting;
 waiting_query | waiting_pid | w_user | blocking_pid | blocking_user | tablename
---------------+-------------+--------+--------------+---------------+-----------
(0 rows)

How long I've been waiting:

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_activity where query like 'CREATE%';
-[ RECORD 1 ]+
datid| 16434
datname  | flip
pid  | 3990
usesysid | 10
usename  | postgres
application_name | psql
client_addr  |
client_hostname  |
client_port  | -1
backend_start| 2016-05-07 11:48:39.218398-03
xact_start   | 2016-05-07 11:48:43.417734-03
query_start  | 2016-05-07 11:48:43.417734-03
state_change | 2016-05-07 11:48:43.417742-03
waiting  | f
state| active
query| CREATE INDEX CONCURRENTLY index_texto
 |   ON flip_pagina_edicao
 |   USING hash
 |   (texto COLLATE pg_catalog."default");

postgres=#


Attached is a strace sample of the running process.

2016-05-09 13:25 GMT-03:00 Melvin Davidson :

> Try the following query. See if it shows you if another transaction is
> blocking the needed locks to create the index.
>
> SELECT
>w.query as waiting_query,
>w.pid as waiting_pid,
>w.usename as w_user,
>l.pid as blocking_pid,
>l.usename as blocking_user,
>t.schemaname || '.' || t.relname as tablename
>FROM pg_stat_activity w
>JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
>JOIN pg_locks l2 on (l1.relation = l2.relation and l2.granted)
>JOIN pg_stat_activity l ON (l2.pid = l.pid)
>JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
>WHERE w.waiting;
>
>
> On Mon, May 9, 2016 at 11:50 AM, Joshua D. Drake 
> wrote:
>
>> On 05/09/2016 05:04 AM, Robert Anderson wrote:
>>
>>> Hi,
>>>
>>> We are trying to create an index concurrently but, at least apparently,
>>> it hangs in an infinite loop and never ends.
>>>
>>
>> Apparently how?
>>
>> How long did you wait?
>>
>> JD
>>
>>
>> --
>> Command Prompt, Inc.  http://the.postgres.company/
>> +1-503-667-4564
>> PostgreSQL Centered full stack support, consulting and development.
>> Everyone appreciates your honesty, until you are honest with them.
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>
>
> --
> *Melvin Davidson*
> I reserve the right to fantasize.  Whether or not you
> wish to share my fantasy is entirely up to you.
>


strace.txt.tar.gz
Description: GNU Zip compressed data



Re: [GENERAL] Create index concurrently hanging with big table on pgsql 9.3.12

2016-05-09 Thread Robert Anderson
Only one line returned:

postgres=# select * from pg_stat_activity where pid=3990;
-[ RECORD 1 ]+
datid| 16434
datname  | flip
pid  | 3990
usesysid | 10
usename  | postgres
application_name | psql
client_addr  |
client_hostname  |
client_port  | -1
backend_start| 2016-05-07 11:48:39.218398-03
xact_start   | 2016-05-07 11:48:43.417734-03
query_start  | 2016-05-07 11:48:43.417734-03
state_change | 2016-05-07 11:48:43.417742-03
waiting  | f
state| active
query| CREATE INDEX CONCURRENTLY index_texto
 |   ON flip_pagina_edicao
 |   USING hash
 |   (texto COLLATE pg_catalog."default");

postgres=#

2016-05-09 14:20 GMT-03:00 Tom Lane :

> Robert Anderson  writes:
> > There aren't transactions blocking:
>
> > postgres=# SELECT
> > postgres-#w.query as waiting_query,
> > postgres-#w.pid as waiting_pid,
> > postgres-#w.usename as w_user,
> > postgres-#l.pid as blocking_pid,
> > postgres-#l.usename as blocking_user,
> > postgres-#t.schemaname || '.' || t.relname as tablename
> > postgres-#FROM pg_stat_activity w
> > postgres-#JOIN pg_locks l1 ON (w.pid = l1.pid and not l1.granted)
> > postgres-#JOIN pg_locks l2 on (l1.relation = l2.relation and
> l2.granted)
> > postgres-#JOIN pg_stat_activity l ON (l2.pid = l.pid)
> > postgres-#JOIN pg_stat_user_tables t ON (l1.relation = t.relid)
> > postgres-#WHERE w.waiting;
> >  waiting_query | waiting_pid | w_user | blocking_pid | blocking_user | tablename
> > ---------------+-------------+--------+--------------+---------------+-----------
> > (0 rows)
>
> This test proves little, because that last JOIN will discard locks on
> non-table objects, and what CREATE INDEX CONCURRENTLY would be most
> likely to be blocked on is transaction VXIDs.  However, since
> pg_stat_activity claims that "waiting" is false, probably there isn't
> anything in pg_locks.  Still, it'd be better to do
> "select * from pg_stat_activity where pid = 3990" and be sure.
>
> regards, tom lane
>


Re: [GENERAL] Create index concurrently hanging with big table on pgsql 9.3.12

2016-05-09 Thread Robert Anderson
> You still haven't stated why you think it is blocked?

Ouput from iotop:

3990 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.00 % postgres:
postgres flip [local] CREATE INDEX

The process hasn't been reading or writing anything for many hours, but it's
using almost 90% of the CPU.

> How long has it been taking?

backend_start| 2016-05-07 11:48:39.218398-03

More than 50 hours.

> What is your maintenance_work_mem set to?

maintenance_work_mem = 352MB
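
One caveat worth noting here: if this backend is eventually cancelled,
CREATE INDEX CONCURRENTLY will leave behind an INVALID index that has to
be dropped by hand. A sketch of a catalog query to spot such leftovers:

SELECT n.nspname AS schema, c.relname AS index_name
  FROM pg_index i
  JOIN pg_class c ON c.oid = i.indexrelid
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE NOT i.indisvalid;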



2016-05-09 14:34 GMT-03:00 Joshua D. Drake :

> On 05/09/2016 10:32 AM, Robert Anderson wrote:
>
>> Only one line returned:
>>
>> postgres=# select * from pg_stat_activity where pid=3990;
>> -[ RECORD 1 ]+
>> datid| 16434
>> datname  | flip
>> pid  | 3990
>> usesysid | 10
>> usename  | postgres
>> application_name | psql
>> client_addr  |
>> client_hostname  |
>> client_port  | -1
>> backend_start| 2016-05-07 11:48:39.218398-03
>> xact_start   | 2016-05-07 11:48:43.417734-03
>> query_start  | 2016-05-07 11:48:43.417734-03
>> state_change | 2016-05-07 11:48:43.417742-03
>> waiting  | f
>> state| active
>> query| CREATE INDEX CONCURRENTLY index_texto
>>   |   ON flip_pagina_edicao
>>   |   USING hash
>>   |   (texto COLLATE pg_catalog."default");
>>
>
> So it isn't blocked by a lock. You still haven't stated why you think it
> is blocked? How long has it been taking? What is your maintenance_work_mem
> set to?
>
>
> JD
>
>
>
> --
> Command Prompt, Inc.  http://the.postgres.company/
> +1-503-667-4564
> PostgreSQL Centered full stack support, consulting and development.
> Everyone appreciates your honesty, until you are honest with them.
>


[GENERAL] Changing the location of the default data directory on PG 9.6.6 (CentOS 7)?

2017-11-17 Thread Robert Gordon
I'm trying to identify which postgresql.conf file I should be editing in order
to change the default database file location for Postgres 9.6.6 when
installed on CentOS 7.x.

Is the best method for changing the default data directory at the time of
database init to include the $PGDATA variable at initialization, such as:

su - postgres -c '/usr/pgsql-9.6/bin/initdb --pgdata=$PGDATA', where $PGDATA is
the directory path where I want the database files to reside?
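
Not an authoritative answer, but one quick sanity check: a running server
will report exactly which config file and data directory it is using.
From psql:

SHOW config_file;
SHOW data_directory;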




Re: [GENERAL] Vacuum DB in Postgres Vs similar concept in other RDBMS

2007-05-26 Thread Robert Treat
On Wednesday 23 May 2007 20:33, Ron Johnson wrote:
> On 05/23/07 19:17, Chris Browne wrote:
> > [EMAIL PROTECTED] ("Harpreet Dhaliwal") writes:
> >> I was just wondering if Vacuum Db in postgresql is somehow superior
> >> to the ones that we have in other RDBMS.
> >
> > The thing that is more akin to VACUUM, in Oracle's case, is the
> > rollback segment.  In Oracle, Rollback segments are areas in your
> > database which are used to temporarily save the previous values when
> > some updates are going on.
> >
> > In the case of Oracle, if a transaction rolls back, it has to go and
> > do some work to clean up after the dead transaction.
> >
> > This is not *exactly* like PostgreSQL's notion of vacuuming, but
> > that's the nearest equivalent that Oracle has.
>
> That's the only other way to do it, no?
>

You can also take care of the maintenance part both inline (as opposed to a
separate segment) and at commit time (rather than delaying it for a vacuum).  See
the current HOT patch for a similar implementation of this idea.

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL



[GENERAL] Will a DELETE violate an FK?

2007-05-28 Thread Robert James

Is there any way to know if a DELETE will violate an FK without actually
trying it?
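
One pragmatic pre-check is to probe the referencing table first. A hedged
sketch, assuming hypothetical tables parent and child where child.parent_id
references parent.id (note this is racy without locking -- a concurrent
insert can still make the DELETE fail):

SELECT EXISTS (SELECT 1 FROM child WHERE parent_id = 42) AS would_violate;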


[GENERAL] psql Tab Completion in Windows

2007-05-29 Thread Robert James

Any way to get psql Tab Completion in Windows?

