Re: [GENERAL] Which CMS/Ecommerce/Shopping cart ?

2010-07-29 Thread Bill
Samantha Atkins wrote:

> Why is MySQL so much more popular right now, especially
> in the OpenSource community?  As a database I find its
> architecture with multiple underlying engines and other
> quirks to be rather dubious.  Then there is the issue of
> commercial licenses and exactly when you must have those
> and what it will really cost.  Yet it is pretty

MySQL was available on Windows long before PostgreSQL.
MySQL has always been free for all uses, including
commercial use, for ISPs so it quickly became the database
that all ISPs/domain hosts provide and, therefore, a
popular choice for Web apps.

> ubiquitous.  How come?  Why isn't postgresql more on
> developer's minds when they think of OS databases?
> Amazon cloud has great scalable MySQL support but
> apparently not postgreql.  Why?   Is there something
> about postgresql that is bugging all these people or what?

My guess is that there are a lot of people who know of
MySQL who have never heard of PostgreSQL. Also, PostgreSQL
does not scale as well on Windows as it does on Linux/Unix.
I have talked to people who support 2,000 concurrent users
using PostgreSQL on Linux. I have been told that the limit
on Windows is about 300 concurrent users. I have no idea
how accurate that statement is. I share your surprise
because PostgreSQL has a much more extensive feature set
than MySQL.

-- 
.Bill.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] ADO/ODBC returns TEXT fields as Varchar, not LongVarChar

2008-08-20 Thread Bill
I am new to PostgreSQL and running 8.3  on Windows. I am connecting
using ADO and the ODBC driver. My connection string contains
TextAsLongVarchar=1 and MaxLongVarcharSize=65536, however, my TEXT
column is truncated at 255 bytes.

What must I do to get ADO to recognize the TEXT field as a LongVarchar?

-- 
.Bill.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ADO/ODBC returns TEXT fields as Varchar, not LongVarChar

2008-08-20 Thread Bill
Further testing has revealed that adding UnknownAsLongVarchar=1 to the
connection string causes ADO to see TEXT fields as LongVarchar but it
also causes all VARCHAR fields of any size to be seen as LongVarChar.

Why is ADO or the ODBC driver seeing all Varchar and Text fields as
type unknown?

-- 
.Bill.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Text field truncated using ADO/ODBC?

2008-08-21 Thread Bill
I am new to PostgreSQL and running 8.3  on Windows. I am connecting 
using ADO and the ODBC driver. I have a table the contains a Varchar(80) 
column and a Text column as well as a couple of integer and datetime 
columns. My connection string contains TextAsLongVarchar=1 and 
MaxLongVarcharSize=65536, however, my Text column is being truncated at 
254 bytes (the value of MaxVarcharSize).


Adding UnknownAsLongVarchar=1 to the connection string causes ADO to see 
Text column as LongVarchar and return its complete value but it also 
causes the Varchar column to be seen as LongVarChar. This suggests that 
the driver sees both Text and Varchar as unknown types.


What must I do to get the ODBC driver to see the Text column as 
LongVarchar and the Varchar column as Varchar? The complete connection 
string follows by signature.


Thanks.

Bill

Provider=MSDASQL.1;Extended Properties="DRIVER={PostgreSQL 
ANSI};DATABASE=app;SERVER=localhost;PORT=5432;UID=postgres;PWD=*;SSLmode=disable;ReadOnly=0;Protocol=7.4-1;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;ShowSystemTables=0;ConnSettings=;Fetch=100;Socket=4096;UnknownSizes=0;MaxVarcharSize=2048;MaxLongVarcharSize=65536;Debug=0;CommLog=0;Optimizer=1;Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=0;ExtraSysTablePrefixes=dd_;;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseServerSidePrepare=0;LowerCaseIdentifier=0;XaOpt=1"





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Trigger function is not called

2008-08-25 Thread Bill
PostgreSQL 8.3 on Windows. I have the table below which has a before 
insert trigger. The CREATE TRIGGER statement and the trigger function 
are also shown below. When I insert a row into this table using pgAdmin 
III and the INSERT statement


insert into note.category (category_id, category)
values(689, 'Ztest');

the before insert trigger function is not called. The notice is not 
displayed and no value is assigned to the version or uc_category columns 
and the insert fails with a violation of the not null constraint on the 
version field? I have created a simple two column test table with a 
before insert trigger and it works perfectly. I am new to PostgreSQL so 
I suspect I am missing something simple but I cannot figure out what. 
Why is the trigger function never called?


Thanks,

Bill

CREATE TABLE note.category
(
category_id note.d_id NOT NULL,
category note.d_category NOT NULL,
uc_category note.d_category,
parent_category_id note.d_id_fk,
"version" note.d_id,
category_checked boolean NOT NULL DEFAULT false,
CONSTRAINT category_primary_key PRIMARY KEY (category_id)
)
WITH (OIDS=FALSE);
ALTER TABLE note.category OWNER TO postgres;

CREATE TRIGGER category_bi_trigger
BEFORE INSERT
ON note.category
FOR EACH ROW
EXECUTE PROCEDURE note.category_bi();

CREATE OR REPLACE FUNCTION note.category_bi()
RETURNS trigger AS
$BODY$
begin
RAISE NOTICE '*CATEGORY BEFORE INSERT*';
NEW.VERSION := nextval('note.version_seq');
NEW.UC_CATEGORY := UPPER(NEW.CATEGORY);
RETURN NEW;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Trigger function is not called

2008-08-25 Thread Bill

Tom Lane wrote:

Bill <[EMAIL PROTECTED]> writes:
  
PostgreSQL 8.3 on Windows. I have the table below which has a before 
insert trigger. The CREATE TRIGGER statement and the trigger function 
are also shown below.



The script you show attempts to create the trigger before creating the
function, which of course isn't going to work.  Did you check whether
the trigger actually got created?

regards, tom lane


  
The trigger was definitely created. The code I posted was not a script 
that I used to create the trigger and trigger function. I just copied  
the SQL from pgAdmin and pasted the commands into my message not paying 
any attention to the order. Sorry for the confusion.


In a newsgroup posting someone suggested that constraint checks on 
domains occur before the before insert trigger. That seems difficult to 
believe based on my experience with other databases. Do constraint 
checks on domains occur before the before insert trigger?


Bill


Re: [GENERAL] Trigger function is not called

2008-08-25 Thread Bill

Tom Lane wrote:

Bill <[EMAIL PROTECTED]> writes:
  
In a newsgroup posting someone suggested that constraint checks on 
domains occur before the before insert trigger.



Yeah, that is the case, but if a domain check was failing then the
row wouldn't get inserted, so I'm not clear on how this matches up
with your report.

regards, tom lane


  
The row is not getting inserted. I just created a test table and trigger 
and confirmed that the trigger fires if the column is defined as bigint 
not null and fails after I change the type to the domain. I will alter 
all of the tables and get rid of the domain.


Is it possible to create a type and use that instead of the domain or 
will I have the same problem with a type?


Bill


Re: [GENERAL] Trigger function is not called

2008-08-25 Thread Bill

You'd have the same problem. By the time the trigger sees it, the row
has already been converted to the table's column datatype(s), so any
exception associated with a datatype or domain would be thrown already.

A lot of people seem to have trouble with this concept; I dunno what
data representation they think the trigger is working on...

If you want to enforce constraints for a table in the trigger, you can
do that, but it's not going to work to try to mix and match
trigger-based and datatype-based restrictions.

regards, tom lane


  
I have no problem with the concept now that I understand it. It is just 
different than InterBase and Firebird which I have done a lot of work 
with lately. Thanks very much for your help.


Bill

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Trigger function is not called

2008-08-25 Thread Bill

Tom Lane wrote:

Bill <[EMAIL PROTECTED]> writes:
  
Is it possible to create a type and use that instead of the domain or 
will I have the same problem with a type?



You'd have the same problem.  By the time the trigger sees it, the row
has already been converted to the table's column datatype(s), so any
exception associated with a datatype or domain would be thrown already.

A lot of people seem to have trouble with this concept; I dunno what
data representation they think the trigger is working on...

If you want to enforce constraints for a table in the trigger, you can
do that, but it's not going to work to try to mix and match
trigger-based and datatype-based restrictions.

regards, tom lane


  
I removed the domain from the category_id and version columns leaving 
the following table, trigger function and trigger. The trigger function 
is still not called when I insert a new row. Any other ideas?


Bill

CREATE TABLE note.category
(
 category_id bigint NOT NULL,
 category character varying(40) NOT NULL,
 uc_category note.d_category,
 parent_category_id bigint,
 "version" bigint NOT NULL,
 category_checked boolean NOT NULL DEFAULT false,
 CONSTRAINT category_primary_key PRIMARY KEY (category_id)
);

CREATE OR REPLACE FUNCTION note.category_bi()
 RETURNS trigger AS
$BODY$
BEGIN
 RAISE NOTICE '**CATEGORY BI**';
 IF (NEW.CATEGORY IS NULL OR NEW.CATEGORY = '') THEN
   RAISE EXCEPTION 'Category cannot be blank.';
 END IF;

 IF (NEW.CATEGORY_ID IS NULL) THEN
   NEW.CATEGORY_ID := nextval('note.id_seq');
 END IF;

 NEW.VERSION := nextval('note.version_seq');
 NEW.UC_CATEGORY := UPPER(NEW.CATEGORY);
 RETURN NEW;
END;
$BODY$
 LANGUAGE 'plpgsql' VOLATILE;


CREATE TRIGGER category_bi_trigger
 BEFORE UPDATE
 ON note.category
 FOR EACH ROW
 EXECUTE PROCEDURE note.category_bi();



Re: [GENERAL] Trigger function is not called

2008-08-25 Thread Bill

Tom Lane wrote:

Bill <[EMAIL PROTECTED]> writes:
  
I removed the domain from the category_id and version columns leaving 
the following table, trigger function and trigger. The trigger function 
is still not called when I insert a new row. Any other ideas?



You're still expecting the trigger to get invoked before any constraints
are enforced (the NOT NULLs being the problem here, I think).  Again,
you can enforce things through a trigger or through a table constraint,
but mixing and matching won't work too well.

regards, tom lane


  
The thing that has me confused is that the following table, trigger and 
trigger function work perfectly and the primary key for this table is 
also bigint not null. I added a bigint not null domain to this schema 
and changed the data type of the key to the domain and then I get the 
constraint violation. I changed the type of the key column back to 
bigint not null and the trigger fires and no error occurs.


Bill

CREATE TABLE test.trigger_test
(
 "key" bigint NOT NULL,
 data character varying(16),
 CONSTRAINT trigger_test_key PRIMARY KEY (key)
);

CREATE OR REPLACE FUNCTION test.trigger_test_before_insert()
 RETURNS trigger AS
$BODY$
begin
 raise notice '*Test before insert*';
 new."key" := nextval('test.id_seq');
 return new;
end;
$BODY$
 LANGUAGE 'plpgsql' VOLATILE;


CREATE TRIGGER trigger_test_insert
 BEFORE INSERT
 ON test.trigger_test
 FOR EACH ROW
 EXECUTE PROCEDURE test.trigger_test_before_insert();




Re: [GENERAL] Trigger function is not called

2008-08-25 Thread Bill

Tom Lane wrote:

Bill <[EMAIL PROTECTED]> writes:
  
The thing that has me confused is that the following table, trigger and 
trigger function work perfectly and the primary key for this table is 
also bigint not null.



Actually, after looking closer, I think the problem with your previous
example is that you created an ON UPDATE trigger not an ON INSERT
trigger.  Table constraints are indeed enforced after before-triggers
fire, as a quick look at the code proves.  Sorry for the misinformation.

regards, tom lane


  
I knew I was missing something really simple. I changed the trigger to 
before insert and everything works perfectly. Thanks again for your 
help. I learned a lot.
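
For anyone who finds this thread later, the fix was just recreating the 
trigger as BEFORE INSERT (a sketch of what I ran):

DROP TRIGGER category_bi_trigger ON note.category;

CREATE TRIGGER category_bi_trigger
  BEFORE INSERT
  ON note.category
  FOR EACH ROW
  EXECUTE PROCEDURE note.category_bi();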


Bill


[GENERAL] SQL optimization - WHERE SomeField STARTING WITH ...

2008-08-28 Thread Bill
The SQL database servers I have worked with cannot use an index for a 
SELECT of the form


SELECT * FROM ATABLE
WHERE AFIELD LIKE ?

because there is no way to know the location of the wild card until the 
parameter value is known. InterBase and Firebird allow


SELECT * FROM ATABLE
WHERE AFIELD STARTING WITH ?

which is equivalent to LIKE  'ABC%' and will use an index on AFIELD. Is 
there a similar syntax in PostgreSQL?


Bill

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MySQL LAST_INSERT_ID() to Postgres

2008-08-28 Thread Bill

Masis, Alexander (US SSA) wrote:

I was mapping C++ application code that works with mySQL to work with
Postgres.
There were a number of articles on line regarding the conversion from
mySQL to Postgres like:
http://en.wikibooks.org/wiki/Programming:Converting_MySQL_to_PostgreSQL

http://groups.drupal.org/node/4680

http://jmz.iki.fi/blog/programming/converting_mysql_database_to_postgres
ql

http://www.raditha.com/blog/archives/000488.html

However, I found the most difficult issue was related to a MySQL's
"SELECT LAST_INSERT_ID()" sql call.
If your code did not use LAST_INSERT_ID(), then you don't have to read
this post.
In MySQL "LAST_INSERT_ID()" is a MySQL's syntax that returns the
last auto_increment type ID of the row(record) inserted in a table. 


In other words, if your MySQL table had a auto_increment
datatype for a field, that field will automatically advance whenever a
new record(row) is inserted into that table.

It is sometimes handy to know what is the value of that ID, that
has just been added to the table, so that that record(row) can be
addressed/updated later.

Well, in MySQL it's easy you just do:
"SELECT LAST_INSERT_ID();"
In Postgres, however it is not that simple. You have to know the
name of so called "insert sequence". Postgres has a system function for
that( SQL line below ).
In Postgres you will have to provide the table and column name(
"auto_increment" type in MySQL or "serial or bigserial" in Postgres).

Here is that SQL query that returns the last inserted ID:

   "SELECT CURRVAL(
pg_get_serial_sequence('my_tbl_name','id_col_name'));"


Alexander Masis.



  
That will work reliably in a multi-user environment if and only if 
currval() returns the last value for the current connection. I assume 
this is the case but the description of currval() in the PostgreSQL 
documentation says "Return value most recently obtained with |nextval| 
for specified sequence". There is no mention that currval() returns the 
last value obtained by calling nextval() for the current connection. Can 
someone confirm that currval() returns the value for the connection 
from which it is called?
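
To illustrate what I'm assuming (the sequence name here is hypothetical):

-- What I hope is true: each connection sees only the value from its *own*
-- most recent nextval(), regardless of what other connections are doing.
SELECT nextval('my_tbl_name_id_col_name_seq');
SELECT currval('my_tbl_name_id_col_name_seq');  -- same value in this connection,
                                                -- even if another connection has
                                                -- since called nextval()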


Bill


Re: [GENERAL] SQL optimization - WHERE SomeField STARTING WITH ...

2008-08-28 Thread Bill

Tino Wildenhain wrote:

Hi Bill,

Bill wrote:
The SQL database servers I have worked with cannot use an index for 
a SELECT of the form


SELECT * FROM ATABLE
WHERE AFIELD LIKE ?

because there is no way to know the location of the wild card until 
the parameter value is known. InterBase and Firebird allow


SELECT * FROM ATABLE
WHERE AFIELD STARTING WITH ?

which is equivalent to LIKE  'ABC%' and will use an index on AFIELD. 
Is there a similar syntax in PostgreSQL?


Yes, it's actually: LIKE 'ABC%' and it will use an index.

Regards
Tino

Are you saying that a parameterized query whose WHERE clause is AFIELD 
LIKE ? will use an index on AFIELD if the parameter value is 'ABC%'? I 
do not understand how that is possible, since the optimizer does not know 
the value of the parameter at the time the SQL is parsed and optimized. 
When the parameter value is supplied it could just as easily be '%ABC', 
in which case an index cannot be used.


This is based on the assumption that PostgreSQL, like other database 
servers, parses and optimizes a parameterized query once then stores it 
in memory so it can be executed multiple times with different parameter 
values. The optimizer could only determine if an index could be used or 
not if it optimized the query each time it was executed after the 
parameter value was supplied.
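
For what it's worth, here is a minimal sketch of the literal-prefix case 
(the table, column and index names are hypothetical):

-- In a non-C locale, the text_pattern_ops operator class is what lets a
-- btree index serve LIKE prefix searches:
CREATE INDEX atable_afield_like_idx ON atable (afield text_pattern_ops);

-- With a literal prefix the planner can rewrite the LIKE into a range scan
-- (afield >= 'ABC' AND afield < 'ABD') and use the index:
EXPLAIN SELECT * FROM atable WHERE afield LIKE 'ABC%';

-- With a bound parameter the prefix is unknown at plan time, which is
-- exactly the concern described above.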


Bill

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MySQL LAST_INSERT_ID() to Postgres

2008-08-28 Thread Bill

Steve Atkins wrote:


On Aug 28, 2008, at 12:09 PM, Scott Marlowe wrote:


On Thu, Aug 28, 2008 at 10:14 AM, Masis, Alexander (US SSA)
<[EMAIL PROTECTED]> wrote:

I was mapping C++ application code that works with mySQL to work with
Postgres.
There were a number of articles on line regarding the conversion from
mySQL to Postgres like:

SNIP

Well, in MySQL it's easy you just do:
"SELECT LAST_INSERT_ID();"
  In Postgres, however it is not that simple. You have to know the
name of so called "insert sequence". Postgres has a system function for
that( SQL line below ).
In Postgres you will have to provide the table and column name(
"auto_increment" type in MySQL or "serial or bigserial" in Postgres).

Here is that SQL query that returns the last inserted ID:

 "SELECT CURRVAL(
pg_get_serial_sequence('my_tbl_name','id_col_name'));"


That's the hard way.  Starting with pgsql 8.2 you can do it much more 
easily:


create table tester (id serial primary key, info text);
insert into tester (info) values ('this is a text string') returning id;

tada!  All done, that insert will return the id for you.


Or lastval() if you want something bug-compatible with MySQL.

Cheers,
  Steve


I am new to PostgreSQL but it seems to me that lastval() will only work 
if the insert does not produce side effects that call nextval(). 
Consider the case where a row is inserted into a table that has an after 
insert trigger and the after insert trigger inserts a row into another 
table which has a serial primary key. In that case I assume that 
lastval() will  return the value from the serial column in the second table.
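
A sketch of the case I'm describing (the tables and names are hypothetical):

CREATE TABLE parent (id serial PRIMARY KEY, info text);
CREATE TABLE audit  (id serial PRIMARY KEY, parent_id bigint);

CREATE OR REPLACE FUNCTION parent_audit() RETURNS trigger AS $$
BEGIN
  -- this INSERT calls nextval() on audit's sequence as a side effect
  INSERT INTO audit (parent_id) VALUES (NEW.id);
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER parent_ai AFTER INSERT ON parent
  FOR EACH ROW EXECUTE PROCEDURE parent_audit();

INSERT INTO parent (info) VALUES ('x');
SELECT lastval();                  -- now reflects audit.id, not parent.id
SELECT currval('parent_id_seq');   -- still the id assigned in parent
INSERT INTO parent (info) VALUES ('y') RETURNING id;  -- unambiguous either way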


Bill

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] MySQL LAST_INSERT_ID() to Postgres

2008-08-28 Thread Bill

Scott Marlowe wrote:

On Thu, Aug 28, 2008 at 3:38 PM, Bill <[EMAIL PROTECTED]> wrote:
  

I am new to PostgreSQL but it seems to me that lastval() will only work if
the insert does not produce side effects that call nextval(). Consider the
case where a row is inserted into a table that has an after insert trigger
and the after insert trigger inserts a row into another table which has a
serial primary key. In that case I assume that lastval() will  return the
value from the serial column in the second table.



No, setval, currval, and lastval all require as an argument a sequence
name.  So the real issue is you have to know the sequence name to use
them.

The problem with lastval is that it reports the last value that the
sequence gave out whether it was to us or someone else.  this makes it
NOT SAFE for concurrent transactions, but more for maintenance work.

I use returning almost exclusively now.

  
The PostgreSQL 8.3 help file clearly shows that lastval() does not take 
a sequence as a parameter and the description is "Return the value 
most recently returned by |nextval| in the current session. This 
function is identical to |currval|, except that instead of taking the 
sequence name as an argument it fetches the value of the last sequence 
that |nextval| was used on in the current session. It is an error to 
call |lastval| if |nextval| has not yet been called in the current 
session." Is the help incorrect?


Bill


[GENERAL] Clay Shirky observation regarding MySQL

2005-02-28 Thread bill

Hello!

Clay Shirky made a comment about MySQL that I thought the PostgreSQL
community should be aware of:

   http://www.shirky.com/writings/situated_software.html

It's the section (mostly toward the bottom) entitled, "The Nature of
Programming, and the Curious Case of MySQL". The whole article is, as
normal, interesting and thought-provoking.

[Please, Shirky wrote this, not me; however, if he's thinking it, we should
know.]

B-)

---(end of broadcast)---
TIP 8: explain analyze is your friend


[GENERAL] Scalability

2006-10-28 Thread Bill
I am new to PostgreSQL and just beginning to learn the product. I will
probably be using it exclusively on Windows.

I was surprised to learn that PostgreSQL creates a new process for each
connection. Doesn't this severely limit its scalability by consuming
resources rapidly on the server as the number of users increases?

-- 
.Bill.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org/


Re: [GENERAL] Scalability

2006-10-28 Thread Bill
"Joshua D. Drake" wrote:

> Bill wrote:
> > I am new to PostgreSQL and just beginning to learn the product. I
> > will probably be using it exclusively on Windows.
> > 
> > I was surprised to learn that PostgreSQL creates a new process for
> > each connection. Doesn't this severely limit its scalability by
> > consuming resources rapidly on the server as the number of users
> > increases?
> 
> The Windows version is not anywhere near as scalable as the unix
> versions. Depending on your hardware you will top out a Windows
> installation at about 350-400 connections. You can get more out of
> Windows by modifying the registry but I am unsure of how far it will
> go.
> 
> I have Linux installations that happily hum along with 2000-5000
> connections.
> 
> So in answer to your question, in general -- no the process
> methodology we use does not limit scalability and it makes our code
> base much simpler than the equivalent threading model.
> 
> Sincerely,
> 
> Joshua D. Drake

Thanks. How much memory does PostgreSQL consume with 2000 connections?

Which Linux distribution do you prefer?

-- 
.Bill.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[GENERAL] How much memory is required?

2006-11-08 Thread Bill
Is there any published information on the minimum or recommended amount
of memory for PostgreSQL on Windows and/or Linux. I am looking at
PostgreSQL for an embedded app and hardware cost is a consideration. I
cannot find anything in the PostgreSQL 8.1.0 Documentation but I may
not be looking in the right place. Thanks.

-- 
.Bill.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[GENERAL] Does PostgreSQL support multi-instancing?

2006-11-08 Thread Bill
Is it possible to have two different versions of PostgreSQL running on
the same computer at the same time?

-- 
.Bill.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[GENERAL] Performance monitoring

2006-11-08 Thread Bill
Can someone point me to information about performance monitoring in the
PostgreSQL documentation? I want to see what tools are available to
diagnose performance problems. Thanks.

-- 
.Bill.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] How much memory is required?

2006-11-08 Thread Bill
"Raymond O'Donnell" wrote:

> On 8 Nov 2006 at 22:48, Bill wrote:
> 
> > Is there any published information on the minimum or recommended
> > amount of memory for PostgreSQL on Windows and/or Linux. I am
> > looking
> 
> There's some useful information here:
> 
> http://www.powerpostgresql.com/PerfList
> 
> HTH,
> 
> --Ray.
> 
> --
> 
> Raymond O'Donnell
> Director of Music, Galway Cathedral, Galway, Ireland
> [EMAIL PROTECTED]
> --
> 
> 
> 
> ---(end of
> broadcast)--- TIP 1: if posting/reading
> through Usenet, please send an appropriate subscribe-nomail
> command to [EMAIL PROTECTED] so that your message can
> get through to the mailing list cleanly

Thanks. Very useful.

-- 
.Bill.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] Performance monitoring

2006-11-08 Thread Bill
Jeff Davis wrote:

> On Wed, 2006-11-08 at 23:37 +0000, Bill wrote:
> > Can someone point me to information about performance monitoring in
> > the PostgreSQL documentation? I want to see what tools are
> > available to diagnose performance problems. Thanks.
> > 
> 
> http://www.postgresql.org/docs/8.1/static/monitoring.html
> 
> Does that help?
> 
> Regards,
>   Jeff Davis
> 
> 
> ---(end of
> broadcast)--- TIP 4: Have you searched our
> list archives?
> 
>http://archives.postgresql.org/

Thanks. I must have been blind to have missed that.

-- 
.Bill.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[GENERAL] Can PostgreSQL notify a client that a trigger has fired?

2006-11-08 Thread Bill
Does PostgreSQL have a built-in mechanism I can use to conditionally
notify a client application that a trigger has fired? What I want is
something along the line of the following pseudo code in a trigger.

if <some condition> then
  raise client event

-- 
.Bill.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [GENERAL] Can PostgreSQL notify a client that a trigger has fired?

2006-11-08 Thread Bill
Jorge Godoy wrote:

> I have the impression that you're missing a lot of sections in the
> manual...  How about some time to re-read it?

I don't know about you but for me a 1500 page manual is at least two
weeks of full time reading. I have read several sections of it but I
am trying to decide if PostgreSQL should be considered for a project
and I don't have 80 hours to make the evaluation. As well indexed as
the manual is, there are some topics I cannot find without knowing the
PostgreSQL specific terminology. That's one of the great things about
newsgroups and mailing lists; you can get help with specific questions
from experienced users. I appreciate all the answers I have received
here. They have made it possible for me to do a much better job in the
time available. My thanks to everyone.

-- 
.Bill.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] shared_buffers formula

2015-03-04 Thread Bill Moran
On Wed, 4 Mar 2015 14:05:09 +0400
Alexander Shutyaev  wrote:

> Thanks for the answer. Now, given this info I've calculated that our
> postgresql should occupy approx. 30,53 GB while the server has 125 GB of
> RAM. However we often see in top that there is very little free memory and
> even swap is used. What could be the reason of postgres using so much
> memory?

Memory usage is much more dependent on the OS than Postgres than you
might realize. I don't see where you state the OS, but I'll assume
it's Linux for now.

Linux default NUMA policy seems to be tuned toward applications
that don't use a lot of RAM. If your 128G server has 8 CPUs, then
Linux will allow a single process to use 16G of RAM before deciding
that it has to use swap for that process. This is one of the 
advantages I find with FreeBSD.

Read up on how NUMA works a bit, and do some research into how to
tune the NUMA policies ... assuming, of course, that you _are_ using
Linux. Or switch to FreeBSD where the default NUMA policy is more
friendly to programs that use a lot of RAM.
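
As a rough check of the formula Andres clarifies in the quoted message
below, using the 30GB / 8kB values quoted (just arithmetic, as a sketch):

-- 30GB of shared_buffers at an 8kB block size is 3,932,160 buffers, so the
-- old formula gives (8192 + 208) * 3,932,160 bytes -- roughly 31GB, i.e.
-- shared_buffers plus ~2.5% bookkeeping overhead, not 252,000GB:
SELECT pg_size_pretty((8192 + 208) * (30::bigint * 1024 * 1024 * 1024 / 8192));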

> 2015-03-03 14:26 GMT+03:00 Andres Freund :
> 
> > On 2015-03-03 15:06:54 +0400, Alexander Shutyaev wrote:
> > > Recently we've been having problems with swap on our postgresql server.
> > It
> > > has 125GB of RAM. We've decided to calculate it's memory consumption. To
> > do
> > > this we've used the formulas from the official docs [1].
> >
> > Note that I think those formulas have been removed from the docs for a
> > while now (9.2?).
> >
> > > However there is
> > > one parameter that seems strange - Shared disk buffers. According to the
> > > formula it occupies the following space:
> > >
> > > (block_size + 208) * shared_buffers
> >
> > > Our values are
> > >
> > > block_size=8192
> > > shared_buffers=30GB
> >
> >
> > > The block_size has the default value and shared_buffers was calculated by
> > > pgtune. According to the formula the product will be around 252 000 GB
> > > which doesn't make any sense.
> >
> > The problem with your calculation is that the shared_buffers referenced
> > in the formula is the number of buffers - whereas when you specify it
> > using a size unit (like MB, GB,...) that amount of memory is divided by
> > the size of a page. So you're off by a factor of 8192.
> >
> > Greetings,
> >
> > Andres Freund
> >
> > --
> >  Andres Freund http://www.2ndQuadrant.com/
> >  PostgreSQL Development, 24x7 Support, Training & Services
> >


-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Weight BLOB objects in postgreSQL? How?

2015-03-04 Thread Bill Moran
On Wed, 4 Mar 2015 12:36:36 -0300
María Griensu  wrote:

> Thanks, I mean, how heavy it is in kB.

If you mean how much overhead is involved in storing the data, that's
a bit complicated.

First off, how the data is stored depends on the size of it, and what
other fields there are in the table and their size as well. This is
because the TOAST code makes decisions on how to store things on the
fly based on various rules.

Read up: http://www.postgresql.org/docs/9.4/static/storage-toast.html

There are a number of factors. If the data is large enough for the
TOAST code to kick in, the server may decide to compress the data,
whether that actually helps depends on the nature of the data ...
a zip file or png isn't going to get any smaller, for example.

From there, if the data is large enough to trigger out-of-line
storage, the data will be broken down into chunks and stored in a
toast table, this increases the overhead because each row in the
toast table will have its own overhead, and the number of rows
required in the toast table depends on the size of the data, which
in turn depends on how well the data compressed ...

So, your answer is:
Take _your_ data and store a bunch of it, then check the resultant
size of the tables on disk vs. the actual size of the data. That's
really the only way to know since the actual efficiency of data
storage depends a lot on the data itself.
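
A minimal sketch of that comparison (the table and column names here are
hypothetical):

-- Compare the on-disk size (heap + TOAST + indexes) against the raw size
-- of the stored values:
SELECT pg_size_pretty(pg_total_relation_size('my_table'))  AS on_disk,
       pg_size_pretty(sum(octet_length(blob_col))::bigint) AS raw_data
  FROM my_table;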

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] #PERSONAL# Reg: date going as 01/01/0001

2015-03-06 Thread Bill Moran
On Fri, 6 Mar 2015 10:04:38 +0530
Medhavi Mahansaria  wrote:
> 
> I need to enter the date in null column based on the results obtained.
> 
> but my date gets inserted as 01/01/0001 in postgresql 9.3 when there is no 
> value.
> I need to enter NULL in the column. 

This isn't how PostgreSQL behaves. Something else is causing this
behavior ... either the client library or triggers or something
else configured on the table.

More detail would help us help you. Provide the CREATE TABLE
statement that created the table, as well as details of what
software you're using to execute the query, and the query
itself.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] #PERSONAL# Reg: date going as 01/01/0001

2015-03-06 Thread Bill Moran
On Fri, 6 Mar 2015 15:24:28 +0530
Medhavi Mahansaria  wrote:

> Hi Bill,
> 
> Here are the details of the table and the query i want to insert.
> 
> 
> aml_db=> \d+ check_date
>  Table "public.check_date"
>   Column   |            Type             | Modifiers | Storage | Stats target | Description
> -----------+-----------------------------+-----------+---------+--------------+-------------
>  some_date | date                        |           | plain   |              |
>  sno       | integer                     |           | plain   |              |
>  new_date  | timestamp without time zone |           | plain   |              |
> Has OIDs: no
> 
> 
> aml_db=> select * from check_date;
>  some_date | sno | new_date
> -----------+-----+----------
> (0 rows)
> 
> 
> aml_db=> insert into check_date values 
> (to_date('','mmddhh24miss'),1,to_date('','mmddhh24miss'));
> INSERT 0 1
> aml_db=> select * from check_date;
>    some_date   | sno |        new_date
> ---------------+-----+------------------------
>  01/01/0001 BC |   1 | 01/01/0001 00:00:00 BC
> (1 row)
> 
> 
> I need to enter NULL as my date, but since I am getting these values 
> into the bind variables as empty strings, this issue is happening.

You're not inserting null, so it's hard to expect null to be the
result. Try:

INSERT INTO check_date VALUES (null, 1, null);

or even:

INSERT INTO check_date
 VALUES (to_date(null, 'mmddhh24miss'), 1, to_date(null, 
'mmddhh24miss'));

both of which result in what you desire.

null and the empty string are not the same thing.
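
If the value really does arrive in the bind variable as an empty string,
one option (a sketch; the bind-variable syntax depends on your client, and
the format mask is copied from your example) is to map it to NULL first:

-- NULLIF returns NULL when its two arguments are equal, and to_date() of a
-- NULL input is NULL, so an empty-string bind value becomes a NULL date:
INSERT INTO check_date
VALUES (to_date(NULLIF(:some_date, ''), 'mmddhh24miss'),
        1,
        to_date(NULLIF(:new_date, ''), 'mmddhh24miss'));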

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres and data warehouses

2015-03-08 Thread Bill Moran
On Sun, 8 Mar 2015 11:20:16 -0400
Stephen Frost  wrote:

> * Nigel Gardiner (nigelgardi...@gmail.com) wrote:
> > I've had a quick search and haven't seen this approach used yet, but I was
> > thinking, the asynchronous replication of Postgres databases could be used
> > as a streaming journal of changes to be processed by a data warehouse. The
> > other approach that suggests itself is WAL file shipping. I've not dug into
> > the async rep protocol yet, before I do so I just wanted to get some brief
> > feedback on whether I'm on the wrong track or not, and if there's some
> > better approach I should be looking at first instead.
> 
> Using a streaming replica for data warehouse queries is quite common..
> The issue there is if you want to change the data structure or store
> data on the replica because a streaming replica based on WAL shipping is
> read-only.  You can create FDW tables which you can write to (even on
> the replica) or you can create an independent database which has FDW
> tables to the replica.  They have their own pros and cons, of course.

Another option is to replicate using Slony instead of streaming, which allows
you to create additional tables on the replica that are read/write in
addition to triggers that only fire on the replica. It's complicated, but
pretty damn powerful.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] VACUUM FULL doesn't reduce table size

2015-03-09 Thread Bill Moran
On Mon, 9 Mar 2015 09:05:07 -0700 (MST)
pinker  wrote:

> 
> > So there are no longer any dead rows being left behind, right?
> > 
> > Why are we still discussing this?  Do you have some other question?
> 
> There are no dead rows, but postgres still cannot reuse the space because of
> 3043947 nonremovable row versions ..
> 
> INFO:  vacuuming "my_table"
> INFO:  "my_table": found 0 removable, 3043947 nonremovable row versions in
> 37580 pages
> DETAIL:  0 dead row versions cannot be removed yet.
> CPU 2.67s/1.59u sec elapsed 7.71 sec.
> Query returned successfully with no result in 8319 ms.

Given your weird description of the "snapshot" I wouldn't be surprised if
that instance of PostgreSQL had subtle corruption.

pg_dump that database, re-init it and reload the dump. Then recreate the
situation and see if the rows are sill nonremovable. I bet you $5.34 that
everything works fine after that, which would indicate that the folks
who made the snapshot didn't do it correctly.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] PostgreSQL-related legal question

2015-03-11 Thread Bill Moran

I've been asked to sign a legal document related to a PostgreSQL-
related job opening. I have concerns about the document and that
signing it could have a negative impact on the PostgreSQL project
(in addition to personal concerns).

I'm guessing I'm not the first person to go through this. I'm
hoping someone on this list can refer me to a lawyer who is
familiar with the challenges of NDAs and open source projects.

I'm not asking for pro-bono, I'm willing to pay for services,
but I just figured that I might get better results getting a
referral than by contacting $random_legal_service.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL-related legal question

2015-03-11 Thread Bill Moran
On Wed, 11 Mar 2015 13:50:02 +0100
Dorian Hoxha  wrote:

> I don't see how it could have negative impact on the postgresql project?
> It's not like your job will be to find vulnerabilities and not disclose
> them ?

I don't think I should discuss the particulars of the situation on
the list. That's why I'm just looking for a lawyer who understands
the situation and can advise me.

> 
> On Wed, Mar 11, 2015 at 1:28 PM, Bill Moran 
> wrote:
> 
> >
> > I've been asked to sign a legal document related to a PostgreSQL-
> > related job opening. I have concerns about the document and that
> > signing it could have a negative impact on the PostgreSQL project
> > (in addition to personal concerns).
> >
> > I'm guessing I'm not the first person to go through this. I'm
> > hoping someone on this list can refer me to a lawyer who is
> > familiar with the challenges of NDAs and open source projects.
> >
> > I'm not asking for pro-bono, I'm willing to pay for services,
> > but I just figured that I might get better results getting a
> > referral than by contacting $random_legal_service.
> >
> > --
> > Bill Moran
> >
> >
> > --
> > Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-general
> >


-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Postgresql BDR(Bi-Directional Replication) Latency Monitoring

2015-03-16 Thread Bill Brown
Hi All,

I'm looking at  BDR monitoring documentation at this location
https://wiki.postgresql.org/wiki/BDR_Monitoring

I understand the query in documentation:

select slot_name, plugin, database, active, xmin,
pg_get_transaction_committime(xmin)
FROM pg_stat_logical_decoding ;


Should be:

select slot_name, plugin, database, active, xmin,
pg_get_transaction_committime(xmin)
FROM pg_replication_slots ;



I am running PostgreSQL 9.4.0 bits on Centos:

postgresql-bdr94-contrib-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-libs-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-bdr-0.8.0beta1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-2ndquadrant-redhat-1.0-2.noarch
postgresql-bdr94-devel-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-server-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64


We have 3 nodes set-up.  We would like to understand the latency between
nodes.

When using http://oltpbenchmark.com/ to create a load on one node, I would
expect the 'xmin' value not to be null, but it is.

What is the best way to monitor replication latency when using Postgresql
BDR?

Regards,

Bill


Re: [GENERAL] Autovacuum query

2015-03-26 Thread Bill Moran
On Thu, 26 Mar 2015 03:58:59 +
Mitu Verma  wrote:
> 
> We have a customer complaining about the time taken by one of the application 
> scripts while deleting older data from the log tables.
> During the deletion, customer reported that he often sees the below error and 
> because of which table size doesn't reduce.
> 
> ERROR: canceling autovacuum task
> Date: 2015-03-14 04:29:19
> Context: automatic analyze of table "fm_db_Server3.mmsuper.audittraillogentry"
> 
> We have the following queries in this regard:
> 
> -  How often is the autovacuum task invoked by postgres

As needed. Read:
http://www.postgresql.org/docs/9.4/static/routine-vacuuming.html

> -  If the task gets cancelled (while we were deleting data from the 
> database) would this task be re-invoked at a later time?

Yes. Read the above

> -  If insertion of data into a table also impact this task?

No. Vacuum operations are not triggered by INSERTs.

> -  If we can manually schedule this task to a particular time (like 
> off peak hours)?

Yes, but given the questions you're asking, you probably do not
have a good enough understanding of the situation to schedule it
correctly and will make the problem worse. You can run it manually
any time you want, but I don't recommend that you disable
autovacuum unless you have a good understanding of what you're
doing.

Let me take a guess at the problem: The table gets LOTs of inserts,
constantly, and somewhere there's a job that runs out of cron or
some similar scheduler that DELETEs a lot of those rows in a big
chunk. The DELETE process probably runs infrequently, like once
a day or even once a week because the designers thought it would
be best to get everything taken care of all at once during some
real or perceived slow period on the database.

One solution to this is to run the DELETE process more frequently,
such as every 15 minutes. In such a case, the process will run
much faster, make less changes, and require less work on the part
of autovacuum to clean up after. People frequently complain that
"this will impact performance if run during normal use hours,"
but in every case I've seen, nobody had actually tested to see
if that statement was true, and running smaller purges more
frequently actually solved the problem.

Another option would be to manually run vacuum after the big
DELETE runs.
See http://www.postgresql.org/docs/9.4/static/sql-vacuum.html
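
A sketch of that combination (the timestamp column name is hypothetical;
adjust the interval and the schedule to your retention policy):

-- Purge a small window frequently (e.g. from cron every 15 minutes)
-- instead of one huge nightly DELETE, then do a plain VACUUM:
DELETE FROM mmsuper.audittraillogentry
 WHERE logged_at < now() - interval '90 days';

VACUUM ANALYZE mmsuper.audittraillogentry;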

Don't fall into the trap of running VACUUM FULL. This is usually
a bad idea. If the client is complaining about reclaiming disk
space, start asking some hard questions: How much space is too
much? Why are you convinced that the space is wasted?
Usually the correct answer is to add more disk space, since
Postgres tends to fall into a groove with a particular table
whereby the "unused" space is actually being used and reclaimed
by data tuples as the data in the table changes. It's not
unusal for the table to be 2x the size of the actual data on
a heavily updated table.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Column does not exists?

2015-03-26 Thread Bill Moran
On Thu, 26 Mar 2015 14:12:36 -0300
"Leonardo M. Ramé"  wrote:

> Ok, I have this table:
> 
> CREATE TABLE sessions
> (
>"SESSIONID" integer NOT NULL,
>"SESSIONTIMESTAMP" character varying(45) NOT NULL,
>"SESSIONDATA" character varying(200) DEFAULT NULL::character varying,
>CONSTRAINT sessions_pkey PRIMARY KEY ("SESSIONID")
> )
> 
> Now, when I do:
> 
> DELETE From sessions WHERE SESSIONTIMESTAMP < '2010-01-01 10:02:02'
> 
> I get:
> 
> ERROR:  column "sessiontimestamp" does not exist
> LINE 1: DELETE From sessions WHERE SESSIONTIMESTAMP < '2010-01-01 10...
> ^
> ** Error **
> 
> ERROR: column "sessiontimestamp" does not exist
> SQL state: 42703
> Character: 28
> 
> But if I do:
> 
> DELETE From sessions WHERE "SESSIONTIMESTAMP" < '2010-01-01 10:02:02'
> 
> It DOES work.
> 
> Why doesn't the db recognize the name of the column without quotes?

See:
http://www.postgresql.org/docs/9.4/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
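
The short version, as a sketch: unquoted identifiers are folded to lower
case, while quoted identifiers keep their exact case, so a column created
with a quoted upper-case name must always be written quoted.

SELECT sessiontimestamp   FROM sessions;  -- ERROR: column "sessiontimestamp" does not exist
SELECT "SESSIONTIMESTAMP" FROM sessions;  -- works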

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Would like to know how analyze works technically

2015-04-01 Thread Bill Moran
On Wed, 1 Apr 2015 04:33:07 -0700 (MST)
TonyS  wrote:

> On Wed, April 1, 2015 12:18 am, Tom Lane-2 [via PostgreSQL] wrote:
> >
> > TonyS  writes:
> >
> >> Running "analyze verbose;" and watching top, the system starts out
> >> using no swap data and about 4GB of cached memory and about 1GB of used
> >> memory. As it runs, the amount of used RAM climbs, and eventually the
> >> used swap memory increases to 100% and after being at that level for a
> >> couple of minutes, the analyze function crashes and indicates "server
> >> closed the connection unexpectedly."
> >
> > ANALYZE is supposed to work in a constrained amount of memory, though
> > that amount depends quite a bit on what the data is and what you've got the
> > statistics target set to.
> >
> > We've heard reports that there are memory leaks in (some versions of?)
> > PostGIS's analyze support.  Whether that would apply to you would depend
> > on whether you're using PostGIS.
> >
> > Hard to say more without a lot more concrete info about what your
> > data is, what PG version you're using, etc.
> >
> > regards, tom lane
> >
> 
> Thanks for the response Tom.
> 
> I am not using PostGIS. The data in my system is mostly along the lines of
> what you would see in an accounts payable, accounts receivable, and
> billing type situation. Names and addresses of individuals, information
> about billing, payments received, payments sent etc.
> 
> All of my indexes are b-tree indexes.
> 
> Currently, the largest individual table is 1.8GB.
> 
> select version() returns:
> PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu
> 4.8.2-19ubuntu1) 4.8.2, 64-bit
> 
> OS: Ubuntu 14.04.1 LTS
> 
> Physical memory: 8GB
> 
> The postgresql log has these entries at the crash point:
> 2015-04-01 06:24:37 EDT LOG:  server process (PID 1384) was terminated by
> signal 9: Killed
> 2015-04-01 06:24:38 EDT DETAIL:  Failed process was running: analyze verbose;
> 2015-04-01 06:24:38 EDT LOG:  terminating any other active server processes
> 
> I started this process at 11PM, so it ran for about 7.5 hours before
> crashing.
> 
> Is there anything else that would be helpful?

Don't know if I'm on the right track with this, but what is
maintenance_work_mem set to on this system?

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Would like to know how analyze works technically

2015-04-01 Thread Bill Moran
On Wed, 1 Apr 2015 06:26:36 -0700 (MST)
TonyS  wrote:

> On Wed, April 1, 2015 8:48 am, Bill Moran [via PostgreSQL] wrote:
> >
> 
> >>>> Running "analyze verbose;" and watching top, the system starts out
> >>>> using no swap data and about 4GB of cached memory and about 1GB of
> >>>> used memory. As it runs, the amount of used RAM climbs, and
> >>>> eventually the used swap memory increases to 100% and after being at
> >>>> that level for a couple of minutes, the analyze function crashes and
> >>>> indicates "server closed the connection unexpectedly."
> >>>
> >>> ANALYZE is supposed to work in a constrained amount of memory, though
> >>>  that amount depends quite a bit on what the data is and what you've
> >>> got the statistics target set to.
> >>>
> >>> We've heard reports that there are memory leaks in (some versions
> >>> of?) PostGIS's analyze support.  Whether that would apply to you would
> >>> depend on whether you're using PostGIS.
> >>>
> >>> Hard to say more without a lot more concrete info about what your
> >>> data is, what PG version you're using, etc.
> >
> > Don't know if I'm on the right track with this, but what is
> > maintenance_work_mem set to on this system?
> >
> 
> Hello Bill,
> 
> maintenance_work_mem is set to 480MB. I haven't changed that from what
> pgtune suggested.

Doesn't seem unreasonable, so my guess isn't right.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Would like to know how analyze works technically

2015-04-01 Thread Bill Moran
On Wed, 1 Apr 2015 10:47:32 -0700 (MST)
TonyS  wrote:

> On Wed, April 1, 2015 12:30 pm, Igor Neyman [via PostgreSQL] wrote:
> >>
> >> TonyS <[hidden email]>
> >> wrote:
> >>
> >>
> >
> > name,current_setting,source
> > autovacuum,off,configuration file
> > synchronous_commit,off,configuration file
> > TimeZone,localtime,configuration file
> > unix_socket_directories,/var/run/postgresql,configuration file
> > wal_buffers,8MB,configuration file
> > work_mem,1536MB,configuration file
> >
> >
> > ---
> >
> >
> > work_mem,1536MB,configuration file
> >
> >
> >
> > IIRC, your RAM is 8GB.  Your work_mem is too high.  Actual memory used
> > for sorting, etc... could be multiples of work_mem setting.
> >
> > That could be the reason for your memory problems.  I'd suggest to set it
> > to 16MB, and see if you can avoid "on disk" sorting.  If not - gradually
> > increase work_mem.
> >
> > Regards,
> >
> >
> > Igor Neyman
> >
> 
> 
> Thanks Igor,
> 
> I will try changing that. I pretty much just let pgtune set all of those
> values for me.

If pgtune set 1.5G of work_mem, then someone should file a bug report.
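
For reference, a sketch of the adjustment Igor suggests in the quoted text
(try it per-session first, then persist it in postgresql.conf if it helps):

-- Per-session, to test whether 16MB still avoids on-disk sorts:
SET work_mem = '16MB';

-- To make it permanent, set work_mem = 16MB in postgresql.conf and reload
-- (no restart is needed for this setting):
SELECT pg_reload_conf();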

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Serializable transaction restart/re-execute

2015-04-03 Thread Bill Moran
On Fri, 3 Apr 2015 15:35:14 +0100
Filipe Pina  wrote:

> Hello,
> 
> I come from a GTM background and one of the transactional features there is 
> the "Transaction Restarts".
> 
> Transaction restart is when we have two concurrent processes reading/writing 
> to the same region/table of the database, the last process to commit will 
> "see" that the database is not the same as it was when the transaction 
> started and goes back to the beginning of the transactional code and 
> re-executes it.
> 
> The closest I found to this in PGSQL is the Serializable transaction 
> isolation mode and it does seem to work well except it simply throws an error 
> (serialization_failure) instead of restarting.
> 
> I'm trying to make use of this exception to implement restartable functions 
> and I have all the examples and conditions mentioned here in a question in SO 
> (without any answer so far...):
> 
> http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure
>  
> <http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure>
> 
> So basically I have two questions:
> - the restartable "wrapper" function never gets its "DB view" refreshed once 
> it restarts, I assume it's because of the outer transaction (at function 
> level) so it never re-reads the new values and keeps failing with 
> serialization_failure.. Any way to solve this?
> - the ideal would be to be able to define this at database level so I 
> wouldn't have to implement wrappers for all functions.. Implementing a 
> "serialization_failure" generic handler that would simply re-call the 
> function that threw that exception (up to a number of tries). Is this 
> possible without going into pgsql source code?

I suspect that savepoints will accomplish what you want:
http://www.postgresql.org/docs/9.4/static/sql-savepoint.html

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Serializable transaction restart/re-execute

2015-04-06 Thread Bill Moran
On Mon, 6 Apr 2015 10:41:25 +0100
Filipe Pina  wrote:

> Hi Bill, thanks for the quick reply.
> 
> I had read about SAVEPOINTs but I also read I cannot use them in PLPGSQL and
> should use BEGIN/END blocks and EXCEPTIONs.
> 
> Did you check the URL I mentioned?

Yes, I did:
http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure

...

> But it doesn't work.. Every iteration fails with serialization_failure
> probably because the outer transaction is not rolled back and I'm not sure
> how to write this in a way I can roll it back and still have control of the
> LOOP..

Probably one of your issues is that there is no such thing as an
"outer" transaction. There's just a transaction. There is no nesting
of transactions, so the belief that there is an outer transaction
that can somehow be manipulated indepently of some other transaction
is leading you to try things that will never work.

I wasn't aware that SAVEPOINTs didn't work in pl/pgsql, thanks for
educating me on that point.

> I find it hard to believe that PGSQL has this amazing "serializable"
> isolation method but not a standard way to take advantage of it to
> automatically "restart" the failed transactions...

I've been over this ground before. You're thinking in such a
micro case that you haven't realized the inherent difficulty of
restarting large transactions with lots of data modification.
An RDBMS may have many tables updated within a transaction, and
transactions may do data processing completely outside of the
database, which means the only way to ensure consistency is to
notify the controlling process of the problem so it can decide
how best to respond.

So ... I dug into your problem a little more, and I think the
problem is that you're trying too hard to replicate GTM design
paradigms instead of learning the way that PostgreSQL is designed
to work.

If I were creating the functions you describe, I would ditch the
second one and simply have this:

CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
    UPDATE account SET balance = balance + 10 WHERE id = 1 RETURNING balance;
$$
LANGUAGE SQL;

of course, it's unlikely that you'll ever want to wrap such a
simple query in a function, so I'm supposing that you'd want
to do something else with the old value of balance before
updating it, in which case:

CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
DECLARE
cc integer;
BEGIN
SELECT INTO cc balance FROM account WHERE id = 1 FOR UPDATE;

RAISE NOTICE 'Balance: %', cc;
perform pg_sleep(3);

update account set balance = cc+10 where id=1 RETURNING balance INTO cc;

return cc;
END
$$
LANGUAGE plpgsql;

The FOR UPDATE ensures that no other process can modify the
row while this one is sleeping.

Now, I understand that you want to don't want to do row locking,
but this is (again) an insistance on your part of trying to
force PostgreSQL to do things the way GTM did instead of
understanding the RDBMS way of doing things. 

Unlearn.

Keep in mind that mytest() might be called as part of a much
larger transaction that does many other things, and you can't
simply roll that back and restart it within mytest() since
mytest() doesn't know everything else that happened.

In your case, you're trying to look at mytest() as something
that will always be used in a specific way where the
aforementioned problem won't be encountered, but you can not
guarantee that, and it doesn't hold true for all functions.

In general, it's inappropriate for a function to be able to manipulate
a transaction beyond aborting it. And the abort has to bubble up so
that other statements involved in the transaction are also notified.
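
To make that concrete, here's a rough sketch of what the controlling
application would issue, with the retry decision made in the client
rather than in any function (mytest() here is the function above):

    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT mytest();   -- plus whatever else belongs to this unit of work
    COMMIT;
    -- if any statement (including the COMMIT) fails with SQLSTATE 40001,
    -- the client issues ROLLBACK and re-runs the whole block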

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Regarding bytea column in Posgresql

2015-04-09 Thread Bill Moran
On Thu, 9 Apr 2015 11:03:30 +
"Deole, Pushkar (Pushkar)"  wrote:
> 
> I have been assigned to a product that uses Postgresql 9.3 as backend 
> database. I am new to postgresql.
> The product provides chat functionality between the users, and the completed 
> chats are stored in the database table in a 'bytea' column in the form of 
> xml. When I query the data from this column I see xml file with text data. I 
> have couple of queries:
> 
> 1.   Is 'bytea' column intended for storing text data?

No, it's intended for storing binary data.

> 2.   Typically a chat can have text data with several special characters 
> (which can be represented in multi bytes), how these characters can be stored 
> in a bytea column and retrieved back properly?

bytea won't help you here. You'll have to manage the special
characters entirely in your code. bytea gives you back the
exact same bytes you put in, with no changes or interpretation.

A better choice would be to use a text field with a proper
text encoding (such as utf-8).

Probably an even better choice would be to use the XML datatype
in PostgreSQL, since you say that you're storing XML anyway.

The place where people tend to get tripped up with TEXT and
XML datatypes is that they're strict. If you try to store
text in a TEXT data type that isn't valid (i.e., multi-byte
characters that aren't correct) you'll get an error and the
data won't be accepted. The same thing happens if you try to
store invalid XML in an XML field (such as XML without proper
closing tags, etc). This strictness seems to scare a lot of people
away from those data types; apparently many would rather have
garbage data in their database than actually go to the work of
fixing their application.
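
A quick illustration of that strictness (the table and the data here
are made up for the example):

    CREATE TABLE chat_log (id bigserial PRIMARY KEY, chat xml NOT NULL);
    INSERT INTO chat_log (chat) VALUES ('<chat><msg>olá, tudo bem?</msg></chat>');  -- accepted
    INSERT INTO chat_log (chat) VALUES ('<chat><msg>no closing tags');              -- rejected: invalid XML content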

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpgsql functions organisation

2015-05-02 Thread Bill Moran
On Sat, 2 May 2015 14:07:31 -0700
Jeff Janes  wrote:

> On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver 
> wrote:
> 
> > On 05/02/2015 10:12 AM, Melvin Davidson wrote:
> >
> >> AFAIK, you cannot "package" functions in  PostgreSQL, but it is possible
> >> to
> >> call a function from within a function.
> >>
> >> That being said, I would seriously look at how and why you are writing
> >> your functions
> >> as functions that call other functions are not very efficient.
> >>
> >
> > I am not following. That is what packaging is about, separating out 'units
> > of work' so they can be combined as needed. Part of that is using existing
> > functions in new functions/classes. In fact in the Postgres source I see
> > this in many places. Now it is entirely possible I missed a memo, so I am
> > open to a more detailed explanation of the inefficiencies involved.
> >
> 
> The Postgres source is written in C, not in plpgsql.  C has a good
> optimizing compiler and plpgsql doesn't.

Maybe that's a roundabout way of saying that if your functions are
complex enough to require calling "sub-functions" they might justify
being written in C?

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] plpgsql functions organisation

2015-05-02 Thread Bill Moran
On Sat, 02 May 2015 15:06:24 -0700
Adrian Klaver  wrote:

> On 05/02/2015 02:07 PM, Jeff Janes wrote:
> > On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver wrote:
> >
> > On 05/02/2015 10:12 AM, Melvin Davidson wrote:
> >
> > AFAIK, you cannot "package" functions in  PostgreSQL, but it is
> > possible to
> > call a function from within a function.
> >
> > That being said, I would seriously look at how and why you are
> > writing
> > your functions
> > as functions that call other functions are not very efficient.
> >
> >
> > I am not following. That is what packaging is about, separating out
> > 'units of work' so they can be combined as needed. Part of that is
> > using existing functions in new functions/classes. In fact in the
> > Postgres source I see this in many places. Now it is entirely
> > possible I missed a memo, so I am open to a more detailed
> > explanation of the inefficiencies involved.
> >
> >
> > The Postgres source is written in C, not in plpgsql.  C has a good
> > optimizing compiler and plpgsql doesn't.
> 
> Does this actually matter?  I am a biologist that backed into computing, 
> so I realize I am weak on the fundamentals. Still the scientist in me 
> wants data backing assertions. As I understand it plpgsql works close to 
> the server and is optimized to do so.  I know writing in C would be a 
> better solution. Still is calling plpgsql functions inside plpgsql 
> really a bad thing when just considering plpgsql?

The answer to that is the same answer to so many other things: it depends.

plpgsql functions are slower than C. They also lack a lot of language
features that C has. That being said, if they're meeting your needs, then
don't worry about it. plpgsql is around because for most people, it works
well enough. There are certainly cases when you want to create very complex
logic in the database and plpgsql is liable to make that difficult. But
there are a lot of cases where having to manage pointers and a build
environment and all the things that go with C aren't justified, because
plpgsql has none of that complexity. There are advantages both ways.

The beauty of PostgreSQL is that you have both available and you
can choose whichever is best for your situation.
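
For what it's worth, composing plpgsql functions works fine, and the
call overhead is rarely the thing that matters. A trivial sketch (all
of the names here, including the users table, are made up):

    CREATE OR REPLACE FUNCTION normalize_email(addr text) RETURNS text AS $$
    BEGIN
        RETURN lower(trim(addr));
    END
    $$ LANGUAGE plpgsql;

    CREATE OR REPLACE FUNCTION register_user(addr text) RETURNS void AS $$
    BEGIN
        -- "packaging": reuse the helper above instead of repeating its logic
        INSERT INTO users (email) VALUES (normalize_email(addr));
    END
    $$ LANGUAGE plpgsql;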

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql and GlassFish - cannot commit when autoCommit is enabled

2015-05-09 Thread Bill Moran
On Thu, 07 May 2015 10:07:44 +0300
?   wrote:

> 
> I have postgresql 9.4 and glassfish 4.1. Besides I use MyBatis inside EJB. 
> Now I try to make  select from table and this is what I get:
> javax.resource.spi.LocalTransactionException:Cannot commit when autoCommit is 
> enabled.
> at 
> com.sun.gjc.spi.LocalTransactionImpl.commit(LocalTransactionImpl.java:112)
> at 
> com.sun.enterprise.resource.ConnectorXAResource.commit(ConnectorXAResource.java:124)
> at 
> com.sun.enterprise.transaction.JavaEETransactionImpl.commit(JavaEETransactionImpl.java:518)
> at 
> com.sun.enterprise.transaction.JavaEETransactionManagerSimplified.commit(JavaEETransactionManagerSimplified.java:854)
> at 
> com.sun.ejb.containers.EJBContainerTransactionManager.completeNewTx(EJBContainerTransactionManager.java:719)
> at 
> com.sun.ejb.containers.EJBContainerTransactionManager.postInvokeTx(EJBContainerTransactionManager.java:503)
> at 
> com.sun.ejb.containers.BaseContainer.postInvokeTx(BaseContainer.java:4566)
> at 
> com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2074)
> at 
> com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2044)
> at 
> com.sun.ejb.containers.EJBObjectInvocationHandler.invoke(EJBObjectInvocationHandler.java:212)
> at 
> com.sun.ejb.containers.EJBObjectInvocationHandlerDelegate.invoke(EJBObjectInvocationHandlerDelegate.java:79)
> at com.sun.proxy.$Proxy312.getLsist(UnknownSource)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethod)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at 
> com.sun.corba.ee.impl.presentation.rmi.ReflectiveTie.dispatchToMethod(ReflectiveTie.java:143)
> at 
> com.sun.corba.ee.impl.presentation.rmi.ReflectiveTie._invoke(ReflectiveTie.java:173)
> at 
> com.sun.corba.ee.impl.protocol.ServerRequestDispatcherImpl.dispatchToServant(ServerRequestDispatcherImpl.java:528)
> at 
> com.sun.corba.ee.impl.protocol.ServerRequestDispatcherImpl.dispatch(ServerRequestDispatcherImpl.java:199)
> at 
> com.sun.corba.ee.impl.protocol.MessageMediatorImpl.handleRequestRequest(MessageMediatorImpl.java:1549)
> at 
> com.sun.corba.ee.impl.protocol.MessageMediatorImpl.handleRequest(MessageMediatorImpl.java:1425)
> at 
> com.sun.corba.ee.impl.protocol.MessageMediatorImpl.handleInput(MessageMediatorImpl.java:930)
> at 
> com.sun.corba.ee.impl.protocol.giopmsgheaders.RequestMessage_1_2.callback(RequestMessage_1_2.java:213)
> at 
> com.sun.corba.ee.impl.protocol.MessageMediatorImpl.handleRequest(MessageMediatorImpl.java:694)
> at 
> com.sun.corba.ee.impl.protocol.MessageMediatorImpl.dispatch(MessageMediatorImpl.java:496)
> at 
> com.sun.corba.ee.impl.transport.ConnectionImpl$1.dispatch(ConnectionImpl.java:195)
> at 
> com.sun.corba.ee.impl.transport.ConnectionImpl.read(ConnectionImpl.java:328)
> at 
> com.sun.corba.ee.impl.transport.ReaderThreadImpl.doWork(ReaderThreadImpl.java:112)
> at 
> com.sun.corba.ee.impl.threadpool.ThreadPoolImpl$WorkerThread.performWork(ThreadPoolImpl.java:497)
> at 
> com.sun.corba.ee.impl.threadpool.ThreadPoolImpl$WorkerThread.run(ThreadPoolImpl.java:540)Causedby:
>  org.postgresql.util.PSQLException:Cannot commit when autoCommit is enabled.
> at 
> org.postgresql.jdbc2.AbstractJdbc2Connection.commit(AbstractJdbc2Connection.java:811)
> at 
> com.sun.gjc.spi.LocalTransactionImpl.commit(LocalTransactionImpl.java:106)...30
>  more
> I even tried the following code:
> SqlSession session = ConnectionFactory.getSession().openSession(false); // AUTOCOMMIT
> List list = null;
> try {
>     session.getConnection().setAutoCommit(false);
>     TempMapper mapper = (TempMapper) session.getMapper(TempMapper.class);
>     list = mapper.readAll();
> } catch (SQLException ex) {
>     Logger.getLogger(TempBean.class.getName()).log(Level.SEVERE, null, ex);
> } finally {
>     session.close();
> }
> However the result is the same. When I used gf+mysql+mybatis I changed 
> relaxautocommit but for postgresql driver there is no such attribute. How to 
> solve it?

Best guess, based on experience, is that somehow you're disabling autocommit
on a different connection from the one where you're doing the commit. I've
seen this sort of thing happen with connection poolers where the developers
don't fully understand how the connection pool works.

You could confirm this by turning on full query logging in Postgres and see
which connection does what. If it turns out to be the case, then you'll have
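
For reference, full statement logging is just a couple of standard
postgresql.conf settings (adjust to taste and reload):

    log_statement = 'all'
    log_line_prefix = '%m [%p] %u@%d %a '   # timestamp, backend pid, user@database, application_name

With the backend pid in the prefix you can check whether the autocommit
change and the COMMIT really happen on the same connection.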

[GENERAL] RPM building tools and info missing?

2015-05-09 Thread Bill Moran

I might need to roll a custom PostgreSQL server RPM for my current
job.

Looking here:
https://wiki.postgresql.org/wiki/RPM_Packaging

The link to the specfiles and other data at
http://svn.pgrpms.org/repo/ gives a 404. I found a few other
pieces on the internet suggesting the same URL, and can't find
any information on why that site is down or where it might have
gone to.

Is this a temporary outage? Or has the RPM data moved somewhere
else and isn't documented yet? Any help is appreciated.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Allowing postgresql to accept 0xff syntax for data types that it makes sense for?

2015-05-21 Thread Bill Moran

I'm working on a project converting a bunch of code from another database
system to PostgreSQL.

One of the issues is that the previous system accepted integers and binary
data in the same hex format as C ... i.e. 0xff

I understand that the proper way to handle this in postgres is x'ff', but
the problem is that a large number of #define values use the 0xff syntax
and these macros are used both in C code, as well as in SQL. The simple
fact is that a LOT of code does this, and correcting it all and ensuring
that the modified code is correct is a BIG job. Just to clarify some of
the complexity: there is about 80,000 lines of PL/PGSQL code that contains
these macros, then is run through the C preprocessor to substitute actual
values for them before being loaded into Postgres.
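
For anyone following along, the forms PostgreSQL does accept today look
like this (just an illustration, not part of the code base being
converted):

    SELECT x'ff'::integer;   -- 255: bit-string literal cast to integer
    SELECT '\xff'::bytea;    -- a one-byte bytea, hex input format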

Obviously, there are many options for fixing this. One of those options is
modifying PostgreSQL to accept the 0xff syntax ... and evaluating that
option is the reason for my post.

So, one of my questions is: does anyone have an existing simple answer on
how to fix this?

My other question: is there a specific reason why PostgreSQL doesn't support
this syntax, aside from "nobody has bothered to add such support"? Because
I'm considering writing a patch to Postgres and submitting it, but I'm not
going to go down that path if there's a specific reason why supporting this
syntax would be _bad_. Personally, I feel like it would be a good thing, as
it seems like a lot of other database systems support it, and even though
it's not ANSI, it's pretty much the de-facto standard.

-- 
Bill 



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Allowing postgresql to accept 0xff syntax for data types that it makes sense for?

2015-05-21 Thread Bill Moran
On Thu, 21 May 2015 13:57:24 -0400
Tom Lane  wrote:

> Bill Moran  writes:
> > My other question: is there a specific reason why PostgreSQL doesn't support
> > this syntax, aside from "nobody has bothered to add such support"? Because
> > I'm considering writing a patch to Postgres and submitting it, but I'm not
> > going to go down that path if there's a specific reason why supporting this
> > syntax would be _bad_. Personally, I feel like it would be a good thing, as
> > it seems like a lot of other database systems support it, and even though
> > it's not ANSI, it's pretty much the de-facto standard.
> 
> How many is "a lot", and do any of the responsible vendors sit on the SQL
> standards committee?

Well, I've personally worked with (in addition to PostgreSQL) Microsoft
SQL Server, MySQL, and Sybase -- PostgreSQL is the only one of those 4 that
doesn't support the 0xff syntax.

I did a little research and it appears that neither Oracle nor DB2 supports
the 0xff syntax ... so not _quite_ as common as it seemed to me.

> One large concern about doing anything like this is whether future
> versions of the SQL standard might blindside us with some
> not-terribly-compatible interpretation of that syntax.  If we do something
> that is also in Oracle or DB2 or one of the other big boys, then we can
> probably rely on the assumption that they'll block anything really
> incompatible from becoming standardized ;-).

I assume that Microsoft is big enough to prevent anything that would
hurt SQL Server's compatibility from becoming a standard?

> OTOH, if the actual meaning of "a lot" is "MySQL", I'd be pretty worried
> about this scenario.

Well, MySQL _does_ support that syntax ... but I couldn't care less. MySQL
also throws away your data instead of giving you errors and I would never
ask PostgreSQL to start behaving like that.

With all that being said, if I were to build a patch, would it be likely
to be accepted into core?

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Allowing postgresql to accept 0xff syntax for data types that it makes sense for?

2015-05-22 Thread Bill Moran
On Fri, 22 May 2015 11:02:47 -0400
Tom Lane  wrote:

> Alban Hertroys  writes:
> > On 22 May 2015 at 04:46, Bill Moran  wrote:
> >> With all that being said, if I were to build a patch, would it be likely
> >> to be accepted into core?
> 
> > Wouldn't you also need to support similar syntax for octal numbers for
> > the patch to be complete? Or are those already supported (ISTR that's
> > '077' for decimal 63)?
> 
> A patch that made 077 be interpreted as octal would certainly get rejected
> out of hand, because that's valid syntax right now and it doesn't mean 63.

You'll get no objection from me on that point.

> A similar objection might be raised to 0x..., as that is also valid
> syntax; it's read as 0 followed by an identifier:
> 
> regression=# select 0xff;
>  xff 
> -
>0
> (1 row)
> 
> (Yet another place where the fact that AS is optional yields surprising
> results...)  So there would be a backwards compatibility break here,
> and no you can't fix it with a GUC.  Still, it might be a small enough
> break that we could get away with it.

I hadn't even considered that issue. I really hate the fact that AS is
optional, and I'm irritated by code that omits it ... but nobody's
asking me ...

> I'm not personally very excited
> but other people might be.
> 
> Other questions you'd have to think about: what is the data type of
> 0x; what do you do with 0x (too big
> even for int8).  And it'd likely behoove you to check how Microsoft
> answers those questions, if you want to point to SQL Server as what's
> going to keep you out of standards-compatibility problems.  (IOW,
> if 0x ever did get standardized, the text might well match what
> SQL Server does.)

MSSQL seems to use it specifically for the equivalent of BYTEA types,
and it seems to me that should be how it works in PostgreSQL. Does
anyone watching this thread have access to a MSSQL server to verify?
If an implicit cast from a 4-byte BYTEA to int works now, then it
should work ... otherwise an explicit cast would be needed, with the
same behavior if you tried to specify a number that overflows an int
in any other way.

MySQL is more liberal in that you can use it to specify ints as
well, but I don't think MySQL is a good example of proper behavior.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Allowing postgresql to accept 0xff syntax for data types that it makes sense for?

2015-05-22 Thread Bill Moran
On Fri, 22 May 2015 11:27:49 -0500
Dennis Jenkins  wrote:

> On Fri, May 22, 2015 at 10:02 AM, Tom Lane  wrote:
> 
> > Alban Hertroys  writes:
> > > On 22 May 2015 at 04:46, Bill Moran  wrote:
> > >> With all that being said, if I were to build a patch, would it be likely
> > >> to be accepted into core?
>
> How feasible would it be to write a network proxy, like pg_bouncer, to
> handle converting the values on the fly, so that you need to change neither
> your original code base (with respect to this issue), nor change PostgreSQL
> itself?

Certainly feasible, but absolutely undesirable. The system I'm working on
is needlessly complex as it is ... I'd rather convince my bosses to let
me rewrite 80,000 lines of code than add another compatibility shim to
the mess.

I brought it up because I saw an opportunity to benefit my employer and
the PostgreSQL community at the same time. I have about 4 fallback plans
if there's a reason not to do this one. Quite frankly, adding a compatibility
shim isn't even on that list.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Allowing postgresql to accept 0xff syntax for data types that it makes sense for?

2015-05-22 Thread Bill Moran
On Fri, 22 May 2015 12:44:40 -0400
Tom Lane  wrote:

> Bill Moran  writes:
> > Tom Lane  wrote:
> >> Other questions you'd have to think about: what is the data type of
> >> 0x; what do you do with 0x (too big
> >> even for int8).  And it'd likely behoove you to check how Microsoft
> >> answers those questions, if you want to point to SQL Server as what's
> >> going to keep you out of standards-compatibility problems.  (IOW,
> >> if 0x ever did get standardized, the text might well match what
> >> SQL Server does.)
> 
> > MSSQL seems to use it specifically for the equivalent of BYTEA types,
> > and it seems to me that should be how it works in PostgreSQL.
> 
> Oh really?  Wow, I'd just assumed you wanted this as a way to write
> integers.  That's certainly the use-case I would have personally.
> I'm not even sure I like the idea of being able to write byteas without
> quotes --- they seem like strings to me, not numbers.

Arrgh ... it's good that you're bringing this up, but you're making me
realize that there's more to figure out than I originally thought ...
My focus had been on it being used for BYTEA columns, but there _are_
plenty of places in the code that do things like:

WHERE int_col & 0x04 = 0x04

Which means that Sybase will implicitly cast that to an int, which
probably means that MSSQL will as well.

Once I take that into consideration, I start thinking that int_col
should actually be a bit string, which means that:

WHERE bit_varying_col & 0x04 = 0x04

should probably work without explicit casts as well.

> > If an implicit cast from a 4-byte BYTEA to int works now, then it
> > should work ... otherwise an explicit cast would be needed, with the
> > same behavior if you tried to specify a number that overflows an int
> > in any other way.
> 
> There's no cast at all from bytea to int.  For one thing, it's quite
> unclear what endianness should be assumed for such a cast.  (To get
> unsurprising behavior from what you're describing, I think we'd have
> to use a big-endian interpretation of the bytea; but that would be
> a pain for a lot of other scenarios, or even for this case if you'd
> written a bytea of length other than 4 or 8 bytes.)

As an implicit cast, obviously anything too large to fit in the
target data type should be an error. But the subject of endianness
becomes damn important.

It's more complex than I originally thought, but it still seems like it
can be done without creating idiocy in the way things are cast. I'll
think about it some more and try to come up with some more specific
use scenarios to see what behavior seems the most POLA to me.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] FW: Constraint exclusion in partitions

2015-05-23 Thread Bill Moran
> "->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
> "  ->  Index Scan using oldtable_groupid_idx on oldtable  
> (cost=0.58..107364.99 rows=51340 width=66)"
> "Index Cond: (group_id = subset.id)"
> Total query runtime: 3986 ms. 5978 rows retrieved.
> 
> 
> select * from newtable where group_id IN (select * from subset)
> "Hash Join  (cost=41.25..138092255.85 rows=1935067087 width=66)"
> "  Hash Cond: (newtable.group_id = subset.id)"
> "  ->  Append  (cost=0.00..84877869.72 rows=3870134173 width=66)"
> "->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
> "->  Seq Scan on newtable_01  (cost=0.00..946235.96 rows=46526896 
> width=66)"
> ...
> "->  Seq Scan on newtable_86  (cost=0.00..986527.64 rows=44269664 
> width=66)"
> "  ->  Hash  (cost=38.75..38.75 rows=200 width=8)"
> "->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
> "  ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
> Execution Cancelled after 766702 ms !
> 
> I tried the same with "SET enable_seqscan = OFF" and got an index scan of all 
> tables;
> 
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] FW: Constraint exclusion in partitions

2015-05-23 Thread Bill Moran
On Sat, 23 May 2015 18:16:43 -0400
Daniel Begin  wrote:

> Hello Bill, 
> You wrote that my testing methodology is flawed - I hope you are right!  
> 
> However, I am a bit confused about your comments. Yes, I did edit the names
> of the tables for clarity but if I missed the point, I will do it again as I
> am writing without modifying anything. Here is the procedure I follow and
> results...
> 
> I use pgadmin_III sql window. I write the following query (I have changed
> the id to make sure it does not use previous results still in memory)...

I didn't realize you were using PGAdmin ... that explains some of it ...
see below:

> Select * from nodes where id=345678912; -- nodes is the real partitioned
> table name
> 
> Now I select "explain query" from the menu and I get the following result...
> "Append  (cost=0.00..384.08 rows=99 width=66)"
> "  ->  Seq Scan on nodes  (cost=0.00..0.00 rows=1 width=66)"
> "Filter: (id = 345678912)"
> "  ->  Index Scan using nodes19_idversion_pk on nodes_19  (cost=0.56..384.08
> rows=98 width=66)"
> "Index Cond: (id = 345678912)"
> 
> Now, I select "run" and I get one record as a result and the following
> message in history tab...
> -- Executing query:
> Select * from nodes where id=345678912; 
> Total query runtime: 62 ms.
> 1 row retrieved.
> 
> Now, if I use the same query on the original table using the same procedure,
> here is what I get...
> Select * from old_nodes where id=345678912; -- old_nodes is the real
> original table name
> 
> Explain gives me the following
> "Index Scan using nodes_idversion_pk on old_nodes  (cost=0.70..4437.15
> rows=1682 width=66)"
> "  Index Cond: (id = 345678912)"
> 
> Running the query gives me the same record with the following message in
> history tab...
> -- Executing query:
> select * from old_nodes where id=345678912; 
> Total query runtime: 62 ms.
> 1 row retrieved.
> 
> This time, the history tab shows that both took the same time to run (an
> improvement!?)

If your environment is providing such wildly variable results, then
you need to start running multiple tests instead of assuming that a single
run of a query is indicative of a pattern.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Efficient sorting the results of a join, without denormalization

2015-05-31 Thread Bill Moran
On Sun, 31 May 2015 04:50:00 -0500
"Glen M. Witherington"  wrote:
> 
> On Sun, May 31, 2015, at 12:53 AM, Tom Lane wrote:
> > "Glen M. Witherington"  writes:
> > > And here's the query I want to do, efficiently:
> > 
> > > SELECT * FROM c
> > >   JOIN b ON b.id = c.b_id
> > >   JOIN a ON a.id = b.a_id
> > > WHERE a.id = 3
> > > ORDER BY b.created_at DESC
> > > LIMIT 10
> > 
> > At least for that dummy data, this seems sufficient:
> > 
> > regression=# create index on b (a_id, created_at);
> > CREATE INDEX
> > regression=# explain analyze SELECT * FROM c
> >   JOIN b ON b.id = c.b_id
> >   JOIN a ON a.id = b.a_id
> > WHERE a.id = 3
> > ORDER BY b.created_at DESC
> > LIMIT 10;
> >   QUERY
> >   PLAN 
> > --
> >  Limit  (cost=0.14..21.95 rows=10 width=64) (actual time=0.064..1.176
> >  rows=10 loops=1)
> >->  Nested Loop  (cost=0.14..436079.81 rows=20 width=64) (actual
> >time=0.063..1.173 rows=10 loops=1)
> >  Join Filter: (b.id = c.b_id)
> >  Rows Removed by Join Filter: 1218
> >  ->  Nested Loop  (cost=0.14..9.81 rows=20 width=40) (actual
> >  time=0.035..0.035 rows=1 loops=1)
> >->  Index Scan Backward using b_a_id_created_at_idx on b 
> >(cost=0.14..8.49 rows=20 width=24) (actual
> >time=0.019..0.019 rows=1 loops=1)
> >  Index Cond: (a_id = 3)
> >->  Materialize  (cost=0.00..1.07 rows=1 width=16) (actual
> >time=0.013..0.013 rows=1 loops=1)
> >  ->  Seq Scan on a  (cost=0.00..1.06 rows=1 width=16)
> >  (actual time=0.009..0.009 rows=1 loops=1)
> >Filter: (id = 3)
> >Rows Removed by Filter: 2
> >  ->  Materialize  (cost=0.00..27230.00 rows=100 width=24)
> >  (actual time=0.008..0.811 rows=1228 loops=1)
> >->  Seq Scan on c  (cost=0.00..16370.00 rows=100
> >width=24) (actual time=0.007..0.310 rows=1228 loops=1)
> >  Planning time: 0.796 ms
> >  Execution time: 1.390 ms
> > (15 rows)
> > 
> > regards, tom lane
> 
> Wow, sorry I screwed up the query. It should be:
> 
> ORDER BY c.created_at DESC
> 
> Not b, or as you noted its trivial to index. Sorry!

Creating an index on c.created_at sped things up by a factor of over
1000, which caused the case you defined to run in ~0.5ms for me.
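
For the archives, the index in question was nothing fancier than the
obvious single-column one, roughly:

    CREATE INDEX ON c (created_at);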

-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Planner cost adjustments

2015-06-02 Thread Bill Moran
>  Hash Semi Join  (cost=21.50..819511.42 rows=729133 width=24) (actual
> time=1538.100..7307.743 rows=338568 loops=1)
>Hash Cond: (changesets.user_id = users.id)
>->  Seq Scan on changesets  (cost=0.00..745390.84 rows=25138084 width=24)
> (actual time=0.027..4620.691 rows=25133929 loops=1)
>->  Hash  (cost=14.00..14.00 rows=600 width=8) (actual time=0.300..0.300
> rows=600 loops=1)
>  Buckets: 1024  Batches: 1  Memory Usage: 24kB
>  ->  Seq Scan on users  (cost=0.00..14.00 rows=600 width=8) (actual
> time=0.022..0.187 rows=600 loops=1)
>  Total runtime: 7519.254 ms
> (7 rows)
> 
> osmdump=# SET enable_seqscan = OFF;
> osmdump=# EXPLAIN ANALYSE SELECT user_id, id AS changeset_id,closed FROM
> changesets WHERE changesets.user_id IN(SELECT id FROM sample.users);
> 
> 
>  Nested Loop  (cost=115.94..10001090810.49 rows=729133 width=24)
> (actual time=0.268..1466.248 rows=338568 loops=1)
>->  HashAggregate  (cost=115.50..121.50 rows=600 width=8)
> (actual time=0.205..0.530 rows=600 loops=1)
>  ->  Seq Scan on users  (cost=100.00..114.00
> rows=600 width=8) (actual time=0.003..0.035 rows=600 loops=1)
>->  Index Scan using changesets_useridndx on changesets
> (cost=0.44..1805.83 rows=1215 width=24) (actual time=0.036..2.314 rows=564
> loops=600)
>  Index Cond: (user_id = users.id)
>  Total runtime: 1677.447 ms
> (6 rows)
> 
> #3 - Run EXPLAIN ANALYZE and look for discrepancies between the estimated
> and actual times
> 
> --
> Looking at above results, there are obvious discrepancies between
> expected/actual rows and time!
> I dug a bit by exploring/trying to understand the different concepts
> explained in...
> 
> http://www.postgresql.org/docs/9.4/static/planner-stats.html
> http://www.postgresql.org/docs/8.1/static/planner-stats-details.html
> http://www.postgresql.org/docs/9.2/static/view-pg-stats.html
> 
> Concerning discrepancies between the actual number of rows and predicted
> value, I looked at what pg_stats was saying about user_id in table
> changesets.
> Here are the values provided to the planner...
> Average_width=8
> histogram_bounds: the size of the bins varies between 50 and 15, which
> make sense because if I had divided the column's values into groups of
> approximately equal population, I would have produced bins between 1 and
> 10 (if sorted by frequency)
> n_distinct= 20686 (there is actually 464858 distinct values for user_id in
> the table)
> most_common_vals: values make sense (I checked the frequency count of a
> couple most common users_id)
> correlation=0.617782 (?)
> most_common_elems, most_common_elem_freqs and elem_count_histogram were
> empty
> 
> At this point, I wonder if the assumptions behind the planner's statistics
> may produce such problems since the distribution of my data is not uniform
> but follows a power law (some user_id would return millions of records while
> others only one).  
> This is the farthest I can go at this point. Maybe someone can provide me
> with more explanations regarding planner's behavior and ways to go further
> to make it work properly?

You may also benefit from increasing the statistics targets and
running ANALYZE again. It certainly looks like some of those stats are
pretty far off. Raising the statistics target will cause ANALYZE to
investigate more rows (which takes longer but might produce more
accurate results).
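
To be concrete, raising the target for one column and re-sampling looks
roughly like this (the column is the one from your quoted plans; 500 is
just an example value):

    ALTER TABLE changesets ALTER COLUMN user_id SET STATISTICS 500;
    ANALYZE changesets;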

I suggest experimenting with the cost settings first, though.

-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Automatic Client routing

2015-06-04 Thread Bill Moran
On Thu, 4 Jun 2015 08:53:15 -0400
Ravi Krishna  wrote:

> Are there any plans to introduce the concept of automatic client
> routing to the principal server in a cluster of N machines. For
> example, if there is a four node replication cluster N1 .. N4, at any
> time only one can be principal (the one which does the writing). In
> Oracle and DB2, client side libraries provide a way for the clients to
> connect to the principal writer regardless of where it is running on
> N1 .. N4. This way client need to be aware of only one connection
> string.
> 
> EnterpriseDb is a failover manager which relies on virtual IP
> management, not the one I described above.

pgpool has this capacity.

-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Momentary Delay

2015-06-08 Thread Bill Moran
dvi_sub, gis_via_viagem
> via_sub, gis_iti_itinerario iti_sub
>  where dvi_sub.via_cod_viagem = via_sub.via_cod_viagem
>and via_sub.via_status_viagem = 'A'
>and via_sub.via_dt_hora_ini > now() - interval '9 hours'
>and iti_sub.lin_cod_linha = 389
>and iti_sub.iti_sentido_itinerario = 'I'
>and iti_sub.iti_cod_itinerario_linha =
> via_sub.iti_cod_itinerario_linha
>and dvi_sub.vlo_cod_localizacao is not null
>  group by dvi_sub.via_cod_viagem,
> iti_sub.iti_cod_itinerario_linha, via_sub.equ_cod_eqpto
> ;
> Cancel request sent
> ERROR:  canceling statement due to user request
> Time: 10164.015 ms
> citgis=# select now();
>   now
> ---
>  2015-06-05 09:27:22.006072-03
> (1 row)
> 
> Time: 0.152 ms
> citgis=# select max(dvi_sub.dvi_cod_dados_via_iti) as dvi_cod_dados_via_iti,
> dvi_sub.via_cod_viagem
>  from gis_dvi_dados_viagem_itinerario dvi_sub, gis_via_viagem
> via_sub, gis_iti_itinerario iti_sub
>  where dvi_sub.via_cod_viagem = via_sub.via_cod_viagem
>and via_sub.via_status_viagem = 'A'
>and via_sub.via_dt_hora_ini > now() - interval '9 hours'
>and iti_sub.lin_cod_linha = 389
>and iti_sub.iti_sentido_itinerario = 'I'
>and iti_sub.iti_cod_itinerario_linha =
> via_sub.iti_cod_itinerario_linha
>and dvi_sub.vlo_cod_localizacao is not null
>  group by dvi_sub.via_cod_viagem,
> iti_sub.iti_cod_itinerario_linha, via_sub.equ_cod_eqpto
> ;
>  dvi_cod_dados_via_iti | via_cod_viagem
> ---+
> 1059964443 |7989813
> 1060072723 |7990876
> (2 rows)
> 
> Time: 5565.175 ms
> citgis=# select max(dvi_sub.dvi_cod_dados_via_iti) as dvi_cod_dados_via_iti,
> dvi_sub.via_cod_viagem
>  from gis_dvi_dados_viagem_itinerario dvi_sub, gis_via_viagem
> via_sub, gis_iti_itinerario iti_sub
>  where dvi_sub.via_cod_viagem = via_sub.via_cod_viagem
>and via_sub.via_status_viagem = 'A'
>and via_sub.via_dt_hora_ini > now() - interval '9 hours'
>and iti_sub.lin_cod_linha = 389
>and iti_sub.iti_sentido_itinerario = 'I'
>and iti_sub.iti_cod_itinerario_linha =
> via_sub.iti_cod_itinerario_linha
>and dvi_sub.vlo_cod_localizacao is not null
>  group by dvi_sub.via_cod_viagem,
> iti_sub.iti_cod_itinerario_linha, via_sub.equ_cod_eqpto
> ;
>  dvi_cod_dados_via_iti | via_cod_viagem
> ---+
> 1059964445 |7989813
> 1060072725 |7990876
> (2 rows)
> 
> Time: 27.944 ms
> citgis=# select max(dvi_sub.dvi_cod_dados_via_iti) as dvi_cod_dados_via_iti,
> dvi_sub.via_cod_viagem
>  from gis_dvi_dados_viagem_itinerario dvi_sub, gis_via_viagem
> via_sub, gis_iti_itinerario iti_sub
>  where dvi_sub.via_cod_viagem = via_sub.via_cod_viagem
>and via_sub.via_status_viagem = 'A'
>and via_sub.via_dt_hora_ini > now() - interval '9 hours'
>and iti_sub.lin_cod_linha = 389
>and iti_sub.iti_sentido_itinerario = 'I'
>and iti_sub.iti_cod_itinerario_linha =
> via_sub.iti_cod_itinerario_linha
>and dvi_sub.vlo_cod_localizacao is not null
>  group by dvi_sub.via_cod_viagem,
> iti_sub.iti_cod_itinerario_linha, via_sub.equ_cod_eqpto
> ;
>  dvi_cod_dados_via_iti | via_cod_viagem
> ---+
> 1059964445 |7989813
> 1060072727 |7990876
> (2 rows)
> 
> Time: 24.428 ms
> 
> Greetings
> 
>   Anderson


-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Planner cost adjustments

2015-06-11 Thread Bill Moran
On Wed, 10 Jun 2015 17:20:00 -0400
Daniel Begin  wrote:

> Here is a follow-up on adjusting the planner costs calculation
> 
> -Statistics target of problematic columns were increased from 100 to 200.
> -Analyse was ran on all concerned tables (actually ran on the whole DB)
> -Random_page_cost was decreased from 4 to 2.

Since you changed two things, there's no way to be sure which change led
to the improvement. You seem to be assuming that changing the statistics
target was what helped. While that _may_ be right, it might also have
been the change to random_page_cost.
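
One cheap way to untangle the two: random_page_cost can be flipped back
per session, so you can re-run the same EXPLAIN ANALYZE with only one of
the changes in effect. A sketch, using the query you quote below:

    SET random_page_cost = 4;   -- temporarily restore the old value, this session only
    EXPLAIN ANALYZE
    SELECT cs.user_id, cs.changeset_id, nd.id, nd.version
    FROM changesets_selection cs, a_test_table nd
    WHERE nd.changeset_id = cs.changeset_id;
    RESET random_page_cost;

If the good plan survives with random_page_cost back at 4, the statistics
change is what helped; if not, it was the cost change.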

> As a result, about 80% of queries are now using what I would consider an
> optimal plan. However, the planner keeps using full table scans for large
> tables... 
> 
> For instance, I ran a query that should have used an index scan on two
> similar test tables. The planner had selected an index scan for the smaller
> one and a Seq Scan for larger one. Except for their sizes and for one field
> not used in the exercise, the test tables were identical and indexed on the
> same field. The smaller test table had 3.26E+10 records and the larger one
> 3.78E+11 records.  
> 
> The query looked like...
> SELECT cs.user_id, cs.changeset_id, nd.id, nd.version  
> FROM changesets_selection cs, a_test_table nd
> WHERE nd.changeset_id=cs.changeset_id;
> 
> In order to understand why the planner selected the Seq Scan instead of an
> Index Scan on the large table (nodes), I ran an EXPLAIN ANALYSE (on warm
> cache) using enable_seqscan set to OFF/ON.
> 
> -- QUERY PLAN on larger table (nodes) with SET enable_seqscan=OFF
> -;
>  Nested Loop  (cost=100.58..10210480648.03 rows=194204592 width=40)
> (actual time=74.088..278008.579 rows=140112 loops=1)
>->  Seq Scan on changesets_selection cs
> (cost=100.00..1000110.44 rows=6644 width=24) (actual
> time=0.015..4.904 rows=6903 loops=1)
>->  Index Scan using nodes_changesetidndx on nodes nd
> (cost=0.58..31387.49 rows=29230 width=24) (actual time=12.655..40.259
> rows=20 loops=6903)
>  Index Cond: (changeset_id = cs.changeset_id)
>  Total runtime: 278026.196 ms
> (5 rows) 
> 
> -Completed after less than 5 minutes processing 
> -I assume that cost=100.00..1000110.44 for the Seq Scan of
> changesets_selection is an artefact of setting enable_seqscan=OFF.
> -From what I see, the evaluation of rows number is still way off (1400X) for
> the large table, even if the statistics target was doubled to 200.
> 
> -- QUERY PLAN on larger table (nodes) with SET enable_seqscan=ON
> --;
>  Hash Join  (cost=156171782.28..185673195.13 rows=194204592 width=40)
>Hash Cond: (cs.changeset_id = n.changeset_id)
>->  Seq Scan on changesets_selection cs  (cost=0.00..110.44 rows=6644
> width=24)
>->  Hash  (cost=84959952.68..84959952.68 rows=3878771968 width=24)
>  ->  Seq Scan on nodes nd  (cost=0.00..84959952.68 rows=3878771968
> width=24)
> (5 rows) 
> 
> -Still running after 2:30 hours processing! That is why I did not provided
> the actual time and rows (however, actual rows are provided on first query
> plan)
> -Not surprisingly, the evaluation of rows number is way off again for the
> large table - same stats, same results...
> 
> It seems there is a problem with my large table statistics, even after
> increase them to 200. Should I increase the statistic target to 500, or even
> to 1000?
> Is there something else I can trigger to get the appropriate plan?
> 
> Comments/explanations would be appreciated
> Daniel
> 
> 
> -Original Message-
> From: pgsql-general-ow...@postgresql.org
> [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Daniel Begin
> Sent: June-03-15 06:32
> To: 'Bill Moran'
> Cc: pgsql-general@postgresql.org; 'Tomas Vondra'; 'Melvin Davidson'
> Subject: Re: [GENERAL] Planner cost adjustments
> 
> Thank Bill,
> 
> About disks performance, all drives are identical and connected using USB3
> connections and yes, I can tweak values and restart Postgres without any
> hardship!-)
> About seq_page_cost and random_page_cost, I am about to test different lower
> values as you and Thomas propose.
> Raising the statistics target is a good idea. Since most of the data have a
> power law distribution it might affect the statistics.
> 
> I will do as suggested and get back to the community for further comments. I
> wished to follow the procedure proposed by PT, just in case I eventually had
> to on step 4 (contact PostgreSQL developers so they can improve the
> planner).
> 
> And I am still open to other proposal
> Daniel
> 
> -Original Messag

Re: [GENERAL] Momentary Delay

2015-06-12 Thread Bill Moran

Please do not remove the mailing list from replies. See below.

On Fri, 12 Jun 2015 09:21:19 -0300
Anderson Valadares  wrote:

> 2015-06-08 20:33 GMT-03:00 Bill Moran :
> 
> > On Mon, 8 Jun 2015 11:59:31 -0300
> > Anderson Valadares  wrote:
> >
> > > Hi
> > >  We are experiencing an intermittent problem in a GIS database from a
> > > client. Some times of the day is as if the PostgreSQL executed the
> > slowest
> > > operations see below an example. The query was performed three times,
> > twice
> > > I canceled and the third time I left it to end. The consultation took 10
> > > seconds to finish, but the measured execution time is 20 ms. As we have
> > > several queries that run every 2 seconds when this momentary delay occurs
> > > queries accumulate and burst the limit of 203 connections allowed. The
> > > interval of "momentary stop" are approximately 2 seconds occurring at
> > > random times and during these stoppages occur no locks and no increased
> > > load on the server is identified. There is a job 2/2 seconds locks
> > > collection of information, running queries, etc., nmon also collects
> > > information every 2 seconds. My client asks what causes these momentary
> > > stops? because it affects all operations of the database? etc. How do I
> > > identify what is causing these delays in executions of operations in the
> > > database?
> >
> > Number of points to consider:
> > * Check the contents of pg_locks and see if something is waiting on a
> >   lock when the query is slow.
> 
>  There is nothing waiting when the query is slow, see:
> 
>  snap_datetime | waiting | count
> ---+-+---
>  2015-06-05 09:25:00.954731-03 | f   |74
>  2015-06-05 09:26:00.249187-03 | f   |   205
>  2015-06-05 09:27:00.826874-03 | f   |   207

I don't know what that means. Since you don't show the query that
generated that output, I have no idea if your statement is valid, or
if you're running a query that will inherently produce incorrect
results.

> * Also, ensure log_lock_waits is turned on for the server and check
> >   PostgreSQL's logs for messages about long lock waits.
> 
> There is also nothing

This is even worse than the previous statement. What did you _do_ to
determine this?

Since you're unable to diagnose and fix the problem on your own, the
possibility exists that the problem is with your diagnostic steps. If
you don't share the details of those steps, I have no way to help you
with them.

> > * Based on the connection behavior you describe, I'm guessing it's a
> >   Tomcat app using some sort of conection pool. Whatever it is, tune
> >   your connection pool settings so that the max size of the pool doesn't
> >   exceed the available PostgreSQL connections. At least that will prevent
> >   errors from happening when the problem occurs.
> 
> Yes it is a tomcat application using connection pooling and will be set to
> not exceed the limit of PostgreSQL connections, but the central point is
> that the number of connections has increased due to what I am calling
> "momentary stoppages"

I understand that. I made the suggestion as a temporary fix to reduce
the impact until you are able to fix the actual cause.

> > * The query you you keep cancelling below, run just EXPLAIN on it (not
> >   EXPLAIN ANALYZE) which will certainly finish and give you a plan
> >   that can be reviewed to help determine what the problem is.
> 
> Explain plan in http://explain.depesz.com/s/bWw

I'm confused. The query at that URL only took 10ms to complete, which
doesn't seem at all unreasonable.

> * On the OS, monitor iostat or something else to see if you're
> >   saturating disk capacity when the problem happens.
> 
> The OS is also being monitored and there occurs no saturation.

Again, without any details, it's possible that there is a problem
in your monitoring.

> 
> > The Server is a IBM P720 128G RAM PostgreSQL 9.2.9 on
> > > powerpc-ibm-aix7.1.0.0, compiled by gcc (GCC) 4.6.4, 64-bit
> > >
> > > Evolution of the number of connections for a period
> > >
> > > snap_datetime|  #connections
> > > --+-
> > > 2015-06-05 09:25:00.954731-03 | 74
> > > 2015-06-05 09:26:00.249187-03 | 205
> > > 2015-06-05 09:27:00.826874-03 | 207
> > > 2015-06-05 09:28:00.374666-03 | 73
> > > 2015-06-05 09:29:00.690696-03 | 75
> > >
> > >
> > 

Re: [GENERAL] Momentary Delay

2015-06-15 Thread Bill Moran
On Mon, 15 Jun 2015 17:48:54 -0300
Anderson Valadares  wrote:

> 2015-06-12 19:56 GMT-03:00 Bill Moran :
> 
> > Please do not remove the mailing list from replies. See below.
> >
> > On Fri, 12 Jun 2015 09:21:19 -0300
> > Anderson Valadares  wrote:
> >
> > > 2015-06-08 20:33 GMT-03:00 Bill Moran :
> > >
> > > > On Mon, 8 Jun 2015 11:59:31 -0300
> > > > Anderson Valadares  wrote:
> > > >
> > > > > Hi
> > > > >  We are experiencing an intermittent problem in a GIS database from a
> > > > > client. Some times of the day is as if the PostgreSQL executed the
> > > > slowest
> > > > > operations see below an example. The query was performed three times,
> > > > twice
> > > > > I canceled and the third time I left it to end. The consultation
> > took 10
> > > > > seconds to finish, but the measured execution time is 20 ms. As we
> > have
> > > > > several queries that run every 2 seconds when this momentary delay
> > occurs
> > > > > queries accumulate and burst the limit of 203 connections allowed.
> > The
> > > > > interval of "momentary stop" are approximately 2 seconds occurring at
> > > > > random times and during these stoppages occur no locks and no
> > increased
> > > > > load on the server is identified. There is a job 2/2 seconds locks
> > > > > collection of information, running queries, etc., nmon also collects
> > > > > information every 2 seconds. My client asks what causes these
> > momentary
> > > > > stops? because it affects all operations of the database? etc. How
> > do I
> > > > > identify what is causing these delays in executions of operations in
> > the
> > > > > database?
> > > >
> > > > Number of points to consider:
> > > > * Check the contents of pg_locks and see if something is waiting on a
> > > >   lock when the query is slow.
> > >
> > >  There is nothing waiting when the query is slow, see:
> > >
> > >  snap_datetime | waiting | count
> > > ---+-+---
> > >  2015-06-05 09:25:00.954731-03 | f   |74
> > >  2015-06-05 09:26:00.249187-03 | f   |   205
> > >  2015-06-05 09:27:00.826874-03 | f   |   207
> >
> > I don't know what that means. Since you don't show the query that
> > generated that output, I have no idea if your statement is valid, or
> > if you're running a query that will inherently produce incorrect
> > results.
> 
> Sorry the information was incomplete. Below is the query that was
> performed to extract information from tables that used to monitor the
> database.
> The snap_datetime column indicates the time that occurred monitoring,
> the waiting column tells you whether any connection was on hold and the
> column
> count tells how many connections existed at the time.
> 
> select
>  s.snap_datetime
>  ,a.waiting
>  , count(*)
> from stat_snapshot s, stat_activity a
> where s.snap_id = a.snap_id
>   and s.snap_datetime >= '2015-06-05 09:25:00'
>   and s.snap_datetime <= '2015-06-05 09:28:00'
> group by
>   s.snap_datetime
>   ,a.waiting
> order by s.snap_datetime
> ;
>  snap_datetime | waiting | count
> ---+-+---
>  2015-06-05 09:25:00.954731-03 | f   |74
>  2015-06-05 09:26:00.249187-03 | f   |   205
>  2015-06-05 09:27:00.826874-03 | f   |   207
> (3 rows)

Given what I'm seeing above, there's still a lot of information
missing. Those tables are not standard PostgreSQL tables, and I'm
not aware of any tool that creates them, thus I have to assume
it's a cron job or something similar created in-house. Given the
query you ran and the resultant data, my first guess is that the
data in stat_snapshot and stat_activity is worthless: i.e. it's
captured once per minute, and therefore there are 59+ seconds
worth of detail that aren't captured, thus the actual likelihood
that those tables will contain any indication of the problem
is very low.
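
If you want data that has a chance of catching a stall that only lasts a
couple of seconds, sample the live view at the moment the slowdown is
happening instead of the once-a-minute snapshots; on 9.2, something like:

    SELECT now(), waiting, count(*)
    FROM pg_stat_activity
    GROUP BY waiting;

and the equivalent against pg_locks, ideally run in a tight loop while
the problem is occurring.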

> > > * Also, ensure log_lock_waits is turned on for the server and check
> > > >   PostgreSQL's logs for messages about long lock waits.
> > >
> > > There is also nothing
> >
> > This is even worse than the previous statement. What did you _do_ to
> > determine this?
> >
> > Since you're

Re: [GENERAL] Compression function

2015-06-16 Thread Bill Moran
On Tue, 16 Jun 2015 04:45:52 -0300
"Leonardo M. Ramé"  wrote:

> Hi, does anyone know if there's a compression function to let me store 
> in gzipped/deflate format TEXT or Bytea fields.
> 
> Please correct me if I'm wrong, but I also wonder if this function is 
> really needed since I've read large objects are stored with TOAST, hence 
> compression is already there.

The TOAST system does do compression, but depending on your expectation,
you may be disappointed.

The big thing that might let you down is that the TOAST code doesn't run
at all unless the tuple is larger than 2K. As a result, you could have
fairly large rows of almost 2000 bytes long, that _could_ compress to
significantly less than that, but PostgreSQL never tries to compress.
Additionally, PostgreSQL stops trying to compress fields once the row size
is smaller than 2K, so if you have multiple fields that could benefit from
compression, they might not all be compressed.

As a result, if you understand your data well, you need to take this into
account, as you might see better results if you do your own compression.
Unfortunately, I don't know of any in-database function that can be used
to compress data; you'd have to write your own or do it at the application
level.
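
If you want to see what TOAST actually did to a given value, comparing
the logical length with the stored size is a quick check (the table and
column names here are hypothetical):

    SELECT octet_length(body)   AS logical_bytes,
           pg_column_size(body) AS stored_bytes
    FROM documents;

stored_bytes well below logical_bytes means the value was compressed;
roughly equal means it was stored verbatim.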

-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] serialization failure why?

2015-06-16 Thread Bill Moran
On Tue, 16 Jun 2015 13:33:12 +0001
Filipe Pina  wrote:

> I have these 2 tables:
> 
> CREATE TABLE "stuff" ("number" integer NOT NULL PRIMARY KEY, "title" 
> varchar(40) NOT NULL);
> CREATE TABLE "stuff_ext" ("stuff_ptr_id" integer NOT NULL PRIMARY KEY, 
> "extra" integer NOT NULL);
> ALTER TABLE "stuff_ext" ADD CONSTRAINT 
> "stuff_ext_stuff_ptr_id_5a4ee8edae53404b" FOREIGN KEY ("stuff_ptr_id") 
> REFERENCES "stuff" ("number") DEFERRABLE INITIALLY DEFERRED;
> CREATE SEQUENCE stuff_seq;
> 
> And then the function:
> 
> CREATE OR REPLACE FUNCTION create_stuff(number integer, title text)
> RETURNS integer AS $$
> DECLARE
> a1 stuff;
> a2 stuff_ext;
> BEGIN
> IF number IS NULL THEN
> number := nextval('stuff_seq');
> END IF;
> 
> a1.number := number;
> a1.title := title;
> 
> a2.stuff_ptr_id := a1.number;
> 
> INSERT INTO stuff VALUES (a1.*);
> INSERT INTO stuff_ext VALUES (a2.*);
> 
> RETURN number;
> END
> $$
> LANGUAGE plpgsql;
> 
> 
> The DB is configured for SERIALIZABLE transaction mode.
> 
> Now, if I call the function without passing number, such as:
> 
> select create_stuff(NULL,'title');
> 
> in 10 forked processes in a loop with a few iterations in each, I get 
> quite a few SERIALIZATION FAILURES (sqlstate 40001).
> 
> If I comment out the "INSERT INTO stuff_ext" line, I don't get any.
> 
> How is the second insert causing serialize dependencies...?

I'm not sure this is correct, but I have a theory.

Essentially, PostgreSQL can't be sure that the foreign key will be
valid if the other transaction rolls back. i.e., what if the foreign
> key is valid because the other transaction created the matching row,
and that other transaction then rolls back?

In other isolation modes, it can just wait for the appropriate lock to
free up, then see what happens. But in serializable mode it hits a
condition where it can't ensure serializability.

> The specific error messages vary between
> 
> ERROR: could not serialize access due to read/write dependencies among 
> transactions
> DETAIL: Reason code: Canceled on identification as a pivot, during 
> commit attempt.
> HINT: The transaction might succeed if retried.
> 
> and
> 
> ERROR: could not serialize access due to read/write dependencies among 
> transactions
> DETAIL: Reason code: Canceled on commit attempt with conflict in from 
> prepared pivot.
> HINT: The transaction might succeed if retried.
> 
> Thanks!


-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Momentary Delay

2015-06-17 Thread Bill Moran
On Wed, 17 Jun 2015 10:33:37 -0300
Anderson Valadares  wrote:

> 2015-06-15 18:19 GMT-03:00 Bill Moran :
> 
> > On Mon, 15 Jun 2015 17:48:54 -0300
> > Anderson Valadares  wrote:
> >
> > > 2015-06-12 19:56 GMT-03:00 Bill Moran :
> > >
> > > > Please do not remove the mailing list from replies. See below.
> > > >
> > > > On Fri, 12 Jun 2015 09:21:19 -0300
> > > > Anderson Valadares  wrote:
> > > >
> > > > > 2015-06-08 20:33 GMT-03:00 Bill Moran :
> > > > >
> > > > > > On Mon, 8 Jun 2015 11:59:31 -0300
> > > > > > Anderson Valadares  wrote:
> > > > > >
> > > > > > > Hi
> > > > > > >  We are experiencing an intermittent problem in a GIS database
> > from a
> > > > > > > client. Some times of the day is as if the PostgreSQL executed
> > the
> > > > > > slowest
> > > > > > > operations see below an example. The query was performed three
> > times,
> > > > > > twice
> > > > > > > I canceled and the third time I left it to end. The consultation
> > > > took 10
> > > > > > > seconds to finish, but the measured execution time is 20 ms. As
> > we
> > > > have
> > > > > > > several queries that run every 2 seconds when this momentary
> > delay
> > > > occurs
> > > > > > > queries accumulate and burst the limit of 203 connections
> > allowed.
> > > > The
> > > > > > > interval of "momentary stop" are approximately 2 seconds
> > occurring at
> > > > > > > random times and during these stoppages occur no locks and no
> > > > increased
> > > > > > > load on the server is identified. There is a job 2/2 seconds
> > locks
> > > > > > > collection of information, running queries, etc., nmon also
> > collects
> > > > > > > information every 2 seconds. My client asks what causes these
> > > > momentary
> > > > > > > stops? because it affects all operations of the database? etc.
> > How
> > > > do I
> > > > > > > identify what is causing these delays in executions of
> > operations in
> > > > the
> > > > > > > database?
> > > > > >
> > > > > > Number of points to consider:
> > > > > > * Check the contents of pg_locks and see if something is waiting
> > on a
> > > > > >   lock when the query is slow.
> > > > >
> > > > >  There is nothing waiting when the query is slow, see:
> > > > >
> > > > >  snap_datetime | waiting | count
> > > > > ---+-+---
> > > > >  2015-06-05 09:25:00.954731-03 | f   |74
> > > > >  2015-06-05 09:26:00.249187-03 | f   |   205
> > > > >  2015-06-05 09:27:00.826874-03 | f   |   207
> > > >
> > > > I don't know what that means. Since you don't show the query that
> > > > generated that output, I have no idea if your statement is valid, or
> > > > if you're running a query that will inherently produce incorrect
> > > > results.
> > >
> > > Sorry the information was incomplete. Below is the query that was
> > > performed to extract information from tables that used to monitor the
> > > database.
> > > The snap_datetime column indicates the time that occurred monitoring,
> > > the waiting column tells you whether any connection was on hold and the
> > > column
> > > count tells how many connections existed at the time.
> > >
> > > select
> > >  s.snap_datetime
> > >  ,a.waiting
> > >  , count(*)
> > > from stat_snapshot s, stat_activity a
> > > where s.snap_id = a.snap_id
> > >   and s.snap_datetime >= '2015-06-05 09:25:00'
> > >   and s.snap_datetime <= '2015-06-05 09:28:00'
> > > group by
> > >   s.snap_datetime
> > >   ,a.waiting
> > > order by s.snap_datetime
> > > ;
> > >  snap_datetime | waiting | count
> > > ---+-+---
> > >  2015-06-05 09:25:00.954731-03 | f   |74
> > >  2015-06-05 09:26:00.249187-03 | f

Re: [GENERAL] Select query regarding info

2015-06-18 Thread Bill Moran
On Thu, 18 Jun 2015 07:29:37 +
"Yogesh. Sharma"  wrote:

> HI Everyone,
> 
> Below DB query is showing below error on postgresql9.3.
> SELECT '\'' || t2.name || '\'', '\'' || t1.phone_number || '\'', '\'' || 
> t1.details || '\'', '\'' || t1.description || '\'', '\'' || (CASE WHEN 
> t1.s_id IS NULL THEN 'N/A' ELSE t3.s_type END) || '\'', '\'' || t1.s_id || 
> '\'' FROM abc_tble AS t1 LEFT JOIN pqrtable AS t2 ON t1.s_id = nid LEFT JOIN 
> te AS t3 ON t1.s_id = t3.s_id;
> Invalid command \''. Try \? for help.
> But Above query is working fine in postgresql8.3.
> Solution is provided by someone:-
> The SQL standard defines two single quotes to escape one inside a literal: 
> ''''
> Postgres 8.3 defaulted to a non-standard behavior where it was allowed to 
> escape a single quote using a backslash: '\''
> This deviation from the SQL standard was always discouraged and can be 
> controlled through the configuration parameter 
> standard_conforming_strings<http://www.postgresql.org/docs/current/static/runtime-config-compatible.html#GUC-STANDARD-CONFORMING-STRINGS>
> With version 9.1 the default for this parameter was changed from off to on. 
> Version 8.1 and later would emit a warning when you used the non-standard way 
> of escaping single quotes (unless you explicitly turned that off)
> 
> 
> Could you please provide below information.
>  How to change standard_conforming_strings value of postgresql.conf? I have 
> checked but this option is not found in postgresql.conf.

Add it to the file.

Also, don't reply to unrelated threads with new questions, a lot of
people won't see your question if you do that, and if nobody sees
your question you won't get an answer.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] INSERT a number in a column based on other columns OLD INSERTs

2015-06-20 Thread Bill Moran
On Sat, 20 Jun 2015 10:44:21 -0700 (MST)
litu16  wrote:

> In PostgreSQL I have this table... (there is a primary key in the most left
> side "timestamp02" which is not shown in this image)
> 
> in the table above, all columns are entered via querrys, except the
> "time_index" which I would like to be filled automatically via a trigger
> each time each row is filled.
> 
> This is the code to create the same table (without any value) so everyone
> could create it using the Postgre SQL query panel.
> 
> *CREATE TABLE table_ebscb_spa_log02
> (
>   pcnum smallint,
>   timestamp02 timestamp with time zone NOT NULL DEFAULT now(),
>   fn_name character varying,
>   "time" time without time zone,
>   time_elapse character varying,
>   time_type character varying,
>   time_index real,
>   CONSTRAINT table_ebscb_spa_log02_pkey PRIMARY KEY (timestamp02)
> )
> WITH (
>   OIDS=FALSE
> );
> ALTER TABLE table_ebscb_spa_log02
>   OWNER TO postgres;*
> 
> What I would like the trigger to do is:
> 
> INSERT a number in the "time_index" column based on the INSERTed values of
> the "fn_name" and "time_type" columns in each row.
> 
> If both ("fn_name" and "time_type") do a combination (eg. Check Mails -
> Start) that doesn't exist in any row before (above), then INSERT 1 in the
> "time_index" column,
> 
> Elif both ("fn_name" and "time_type") do a combination that does exist in
> some row before (above), then INSERT the number following the one
> before(above) in the "time_index" column.
> 
> (pls look at the example table image, this trigger will produce every red
> highlighted square on it)
> 
> 
> I have tried so far this to create the function:
> 
> CREATE OR REPLACE FUNCTION on_ai_myTable() RETURNS TRIGGER AS $$
> DECLARE
> t_ix real;
> n int;
> 
> BEGIN
> IF NEW.time_type = 'Start' THEN
> SELECT t.time_index FROM table_ebscb_spa_log02 t WHERE t.fn_name =
> NEW.fn_name AND t.time_type = 'Start' ORDER BY t.timestamp02 DESC LIMIT 1
> INTO t_ix;
>   GET DIAGNOSTICS n = ROW_COUNT;
> IF (n = 0) THEN 
> t_ix = 1;
> ELSE 
> t_ix = t_ix + 1;
> END IF;
> END IF;
> NEW.time_index = t_ix;
> return NEW;
> END
> $$
> LANGUAGE plpgsql;
> 
> 
> But when I manually insert the values in the table, nothing change (no error
> message) time_index column just remain empty, what am I doing wrong???
> 
> Please some good PostgreSQL fellow programmer could give me a hand, I really
> have come to a death point in this task, I have any more ideas.

Couple things.

First off, you don't show your statement for creating the trigger. This is 
important.
The trigger has to be a BEFORE trigger FOR EACH ROW. Otherwise, the returned 
value
won't do anything. It should read like this:

CREATE TRIGGER trigger_name
 BEFORE INSERT ON table_ebscb_spa_log02
 FOR EACH ROW EXECUTE PROCEDURE on_ai_myTable();

If you created it with AFTER INSERT or FOR EACH STATEMENT, then the trigger 
won't
work as desired.

The other thing about assignment being := was already mentioned.
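
For completeness, a rough sketch of what the assignments inside the function
body would look like with the plpgsql assignment operator (everything else
left as you posted it):

    IF (n = 0) THEN
        t_ix := 1;
    ELSE
        t_ix := t_ix + 1;
    END IF;
    ...
    NEW.time_index := t_ix;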

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] very slow queries and ineffective vacuum

2015-07-02 Thread Bill Moran
On Thu, 2 Jul 2015 12:58:18 +0200
Lukasz Wrobel  wrote:

> Hello again.
> 
> Thank you for all your responses. I will try to clarify more and attempt to
> answer the questions you raised.
> 
> I'm attaching the postgresql.conf this time. I cannot supply you guys with
> a proper database schema, so I will try to supply you with some obfuscated
> logs and queries. Sorry for the complication.
> 
> First of all I seem to have misdirected you guys about the pg_stat* tables.
> I have a virtual machine with the database from our test team, which was
> running for a month. When I deploy it, our java application is not running,
> so no queries are being executed. The pg_stat* tables contain no data
> (which is surprising). When I launch the application and queries start
> going, the stats are collected normally and autovacuums are being performed.
> 
> I attached the output of vacuum verbose command.
> 
> As for the pg_stat_activity, I have no "idle in transaction" records there,
> but I do have some in "idle" state, that don't disappear. Perhaps this
> means some sessions are not closed? I attached the query result as
> activity.txt.
> 
> I also have a few "sending cancel to blocking autovacuum" and "canceling
> autovacuum task" messages in syslog.
> 
> Sample query explain analyze. This was ran after vacuum analyze of the
> entire database.

The analyze doesn't seem to be working terribly well. Looking at the
explain, it expects 337963 rows in table57, but there are only 6789.
There are similar discrepancies with table19 and table84.
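
A quick way to double-check that on your side (just a sketch; substitute your
real table names) is to analyze the tables explicitly and look at the
collected statistics afterwards:

ANALYZE VERBOSE table57;
SELECT relname, n_live_tup, n_dead_tup, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('table57', 'table19', 'table84');

If n_live_tup still looks nothing like the real row counts, the statistics
collection itself is the thing to chase down.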

I don't know if indexes are your problem. Those three tables are pretty
small, so the sequential scans should be pretty quick (probably faster
than index scans, since it looks like most of the rows are returned from
all the tables).

I'm somewhat confused by your description of the situation. Is the performance
problem happening on the virtual machine? Because VMs are notorious for
being on oversubscribed hosts and exhibiting performance far below what
is expected. It would be worthwhile to do some disk speed and CPU speed tests
on the VM to see what kind of performance it's actually capable of ... if
the VM is performing poorly, there's not much you can do with PostgreSQL
to improve things.

> explain analyze SELECT col1, col2, col3, col4, col5 FROM ( table84 table84
> LEFT JOIN table57 table57 ON table84.col7 = table57.col7 ) LEFT JOIN
> table19 table19 ON table84.col7 = table19.col7;
>  QUERY
> PLAN
> -
>  Hash Right Join  (cost=46435.43..108382.29 rows=189496 width=79) (actual
> time=4461.686..13457.233 rows=5749 loops=1)
>Hash Cond: (table57.col7 = table84.col7)
>->  Seq Scan on table57 table57  (cost=0.00..49196.63 rows=337963
> width=57) (actual time=0.040..8981.438 rows=6789 loops=1)
>->  Hash  (cost=42585.73..42585.73 rows=189496 width=38) (actual
> time=4447.731..4447.731 rows=5749 loops=1)
>  Buckets: 16384  Batches: 2  Memory Usage: 203kB
>  ->  Hash Right Join  (cost=18080.66..42585.73 rows=189496
> width=38) (actual time=1675.223..4442.046 rows=5749 loops=1)
>Hash Cond: (table19.col7 = table84.col7)
>->  Seq Scan on table19 table19  (cost=0.00..17788.17
> rows=187317 width=26) (actual time=0.007..2756.501 rows=5003 loops=1)
>->  Hash  (cost=14600.96..14600.96 rows=189496 width=20)
> (actual time=1674.940..1674.940 rows=5749 loops=1)
>  Buckets: 32768  Batches: 2  Memory Usage: 159kB
>  ->  Seq Scan on table84 table84  (cost=0.00..14600.96
> rows=189496 width=20) (actual time=0.059..1661.482 rows=5749 loops=1)
>  Total runtime: 13458.301 ms
> (12 rows)
> 
> Thank you again for your advice and I hope that with your help I'll be able
> to solve this issue.
> 
> Best regards.
> Lukasz


-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Fwd: PostgreSQL & VMWare

2015-07-03 Thread Bill Moran
On Fri, 3 Jul 2015 12:35:07 +0200
Jean-Gérard Pailloncy  wrote:
> 
> I work on a project that collects geolocalized data.
> All data will be in PostgreSQL / PostGIS.
> 
> The small PostgreSQL databases will be on Linux guests on VMWare hosts.
> 
> The size of the main database will grow by 50 TB / year, 500 M row / day.
> For the largest one, we plan to test different options.
> One of them is to stay with Linux on WMWare.
> Outside the questions about schema, sharding, I would appreciate if some of 
> you have informations, benchmarks, stories about big PostgreSQL databases on 
> Linux guests on VMWare hosts.

The place I'm working now did a feasibility study about installing
their primary app on vmware instead of directly onto the hardware.
Their conclusion was that the app would be about 25% slower running
on VMWare. The app is very database-centric. However, I wasn't
involved in the tests, can't vouch for the quality of the testing,
and there _are_ other pieces involved than the database.

That being said, I've used PostgreSQL on VMs quite a bit. It does
seem slower, but I've never actually benchmarked it. And it's never
seemed slower enough for me to complain much.

The concern I have about running a large database on a VM (especially
since you're asking about performance) is not the VM itself, but all
the baggage that inevitably comes with it ... oversubscribed hosts,
terrible, cheap SANs, poor administration leading to bad configuration,
and yet another layer of obscurity preventing you from figuring out
why things are slow. In my experience, you _will_ get all of these,
because once you're on a VM, the admins will be pressured to host
more and more VMs on the existing hardware and/or add capacity at
minimal cost.

There's nothing like a VM where you never know what the performance
will be because you never know when some other VMs (completely unrelated
to you and/or your work) will saturate the IO with some ridiculous
grep recursive command or something.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Backup Method

2015-07-03 Thread Bill Moran
On Fri, 03 Jul 2015 13:16:02 +0200
Jan Lentfer  wrote:

> Am 2015-07-03 13:00, schrieb howardn...@selestial.com:
> > On 03/07/2015 11:39, Guillaume Lelarge wrote:
> >>
> >> > In that case is there any recommendation for how often to make 
> >> base backups in relation to the size of the cluster and the size of 
> >> the WAL?
> >> >
> >>
> >> Nope, not really. That depends on a lot of things. Our customers 
> >> usually do one per day.
> >>
> >>
> > Excuse my ignorance... Is the base backup, in general, faster than 
> > pg_dump?
> 
> It is a different approach. With the base backup you are actually 
> backing up files from the filesystem ($PGDATA directory), whereas with 
> pg_dump your saving the SQL commands to reload and rebuild the database.
> "Usually" a file based backup will be faster, both on backup and 
> restore, but it is - as mentioned - a different approach and it might 
> also not serve all your purposes.

One of the things that makes a lot of difference is the amount of
redundant data in the database. For example, indexes are completely
redundant. They sure do speed things up, but they're storing the same
data 2x for each index you have. When you do a base backup, you have
to copy all that redundancy, but when you do a pg_dump, all that
redundant data is reduced to a single CREATE INDEX command. The
result being that if your database has a lot of indexes, the pg_dump
might actually be faster.

But the only way to know is to try it out on your particular system.
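
As a rough way to see how much of your cluster is redundant index data (a
sketch; run it in the database you plan to back up):

SELECT pg_size_pretty(pg_database_size(current_database())) AS database_size,
       pg_size_pretty(sum(pg_relation_size(indexrelid))) AS index_size
FROM pg_index;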

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgresql jsonb

2015-08-14 Thread Bill Moran
On Fri, 14 Aug 2015 17:39:49 +0530
Deepak Balasubramanyam  wrote:
> 
> I have a table (20 million rows) in Postgresql 9.4 that contains a bigint
> id as the primary key and another column that contains jsonb data. Queries
> run on this table look like so...
> 
> 
> ## Query
> 
> select ... from table
> WHERE table.column ->'item'->> 'name' = 'value'
> 
> 
> I'd like to make an effort to get Postgresql to keep all data available in
> this table and any index on this table in memory. This would ensure that
> sequence or index scans made on the data are fairly fast.
> 
> Research into this problem indicates that there is no reliable way to get
> Postgresql to run off of RAM memory completely (
> http://stackoverflow.com/a/24235439/830964). Assuming the table and its
> indexes amount to 15 gb of data  on the disk and the machine contains 64GB
> of RAM with shared buffers placed at anywhere from 16-24 GB, here are my
> questions...
> 
> 1. When postgresql returns data from this query, how can I tell how much of
> the data was cached in memory?

I'm not aware of any way to do that on a per-query basis.

> 2. I'm aware that I can tweak the shared buffer so that more data is
> cached. Is there a way to monitor this value for its effectiveness?

Install the pg_buffercache extension and read up on what it provides. It
gives a pretty good view into what PostgreSQL is keeping in memory.
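
For example, something along these lines (a sketch, along the lines of the
example query in the pg_buffercache docs) shows which relations currently
occupy the most shared buffers:

CREATE EXTENSION pg_buffercache;

SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
 AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                           WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;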

> 3. Is there a reliable way / calculation (or close to it), to determine a
> point after which Postgresql will ask the disk for data Vs the caches?

It will ask the disk for data if the data is not in memory. As long as the
data it needs is in memory, it will never talk to the disk unless it needs
to write data back.

The cache is a cache. So there are only 2 reasons your data wouldn't all be
in memory all the time:

1) It doesn't all fit
2) Some of that memory is needed by other tables/indexes/etc

As far as when things get evicted from memory, you'll have to look at the
source code, but it's your typical "keep the most commonly needed data in
memory" algorithms.

What problem are you seeing? What is your performance requirement, and what
is the observed performance? I ask because it's unlikely that you really
need to dig into these details like you are, and most people who ask
questions like this are misguided in some way.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?

2015-08-25 Thread Bill Moran
On Tue, 25 Aug 2015 10:08:48 -0700
David Kerr  wrote:

> Howdy All,
> 
> For a very long time I've held the belief that splitting PGDATA and xlog on 
> linux systems fairly universally gives a decent performance benefit for many 
> common workloads.
> (i've seen up to 20% personally).
> 
> I was under the impression that this had to do with regular fsync()'s from 
> the WAL 
> interfearing with and over-reaching writing out the filesystem buffers. 
> 
> Basically, I think i was conflating fsync() with sync(). 
> 
> So if it's not that, then that just leaves bandwith (ignoring all of the 
> other best practice reasons for reliablity, etc.). So, in theory if you're 
> not swamping your disk I/O then you won't really benefit from relocating your 
> XLOGs.

Disk performance can be a bit more complicated than just "swamping." Even if
you're not maxing out the IO bandwidth, you could be getting enough that some
writes are waiting on other writes before they can be processed. Consider the
fact that old-style ethernet was only able to hit ~80% of its theoretical
capacity in the real world, because the chance of collisions increased with
the amount of data, and each collision slowed down the overall transfer speed.
Contrasted with modern ethernet that doesn't do collisions, you can get much
closer to 100% of the rated bandwidth because the communications are effectively
partitioned from each other.

In the worst-case scenario, if two processes (due to horrible luck) _always_
try to write at the same time, the overall responsiveness will be lousy, even
if the bandwidth usage is only a small percent of the available. Of course,
that worst case doesn't happen in actual practice, but as the usage goes up,
the chance of hitting that interference increases, and the effective response
goes down, even when there's bandwidth still available.

Separate the competing processes, and the chance of conflict is 0. So your
responsiveness is pretty much at best-case all the time.

> However, I know from experience that's not entirely true, (although it's not 
> always easy to measure all aspects of your I/O bandwith).
> 
> Am I missing something?

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Anyone interested in a Pittsburgh-area Postgres users'

2015-09-08 Thread Bill Moran

I will be in Pittsburgh full-time starting the middle of Oct, and I
would be interested in helping coordinate a UG. I have a feeling that
I could convince a number of people I've worked with to get involved
as well.

On Tue, 8 Sep 2015 11:10:34 -0400
James Keener  wrote:

> Is there a user group in Pittsburgh?  This email was the first that
> showed up in a Google Search.
> 
> Jim
> 
> On 2004-05-02 05:43:26, Tom Lane wrote:
> 
> > I've gotten a couple of inquiries lately about a Postgres users' group
> > in my home town of Pittsburgh PA.  There is not one (unless it's very
> > well camouflaged) but perhaps there is critical mass to create one.
> > If you think you might come to meetings of such a group, let me know
> > off-list.  If I get enough responses I'll set up an initial meeting
> > and we'll see where it goes ...
> > 
> > regards, tom lane
> 
> 
> 


-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Can somebody explain what is the meaning for HashAggregate?

2015-09-20 Thread Bill Moran
On Sun, 20 Sep 2015 21:56:39 +0800 (CST)
lin  wrote:

> Can somebody explain what does the postgres done for the explain of sql shows 
> HashAggregate( what is the meaning for  HashAggregate  )?
> for example: 
> 
> 
> postgres=# explain verbose select oid,relname  from pg_class group by 
> oid,relname;
>  QUERY PLAN  
> -
>  HashAggregate  (cost=12.42..15.38 rows=295 width=68)
>Output: oid, relname
>Group Key: pg_class.oid, pg_class.relname
>->  Seq Scan on pg_catalog.pg_class  (cost=0.00..10.95 rows=295 width=68)
>  Output: oid, relname
> (5 rows)
> 
> 
> ::  first, seq scan pg_class and return (oid,relname);
>second,  make group (oid,relname),  is it fisrt sort by oid then in the 
> oid group sort by relname?
> Can somebody explain what does the database done for hashAggregate?

It combines the values of oid and relname for each returned row, generates a
hash key for them, then uses that hash key to aggregate the rows (essentially
computing the GROUP BY, in this case).
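
If you want to compare it against the sort-based alternative, you can disable
hash aggregation for the session, e.g.:

SET enable_hashagg = off;
EXPLAIN VERBOSE SELECT oid, relname FROM pg_class GROUP BY oid, relname;
RESET enable_hashagg;

That should show a sort-based grouping plan instead, which is usually slower
for cases like this.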

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Best way to sync table DML between databases

2015-10-05 Thread Bill Moran
On Mon, 5 Oct 2015 06:20:28 -0700 (MST)
jimbosworth  wrote:

> Hi All,
> 
> I have two servers each running pg9.4.4 database instances.  
> I need to determine the best way to keep a large 20gb table on server A
> synchronised onto server B...
> 
> At the moment, I use pg_dump to periodically dump the table on server A,
> then psql to reload into server B.  This is fine, but means I have to pull
> 100% of the table each time rather than just the changes.  This option does
> not offer real time accuracy on server B.
> 
> I have considered using a table trigger on row (update, insert or delete)
> and then using db_link or postgres_fdw to sync the changes, but am concerned
> that a table trigger is synchronous... so a db_link or fdw could incur a
> lengthy delay.
> 
> I have also considered using table OIDs to track changes, then just
> periodically sync the difference.
> 
> I have considered using postgre_fdw and then 'refresh concurrently
> materialized view' on server B.
> 
> I have considered using logical decoding to read the wal files, then extract
> the changes.
> 
> Can anyone explain the best way to synchronise JUST the changes on a table
> between servers please?

Sounds like a problem custom-made to be solved by Slony:
http://slony.info/

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] using postgresql for session

2015-10-07 Thread Bill Moran
On Wed, 7 Oct 2015 09:58:04 -0600
"john.tiger"  wrote:

> has anyone used postgres jsonb for holding session ?  Since server side 
> session is really just a piece of data, why bother with special 
> "session" plugins and just use postgres to hold the data and retrieve it 
> with psycopg2 ?  Maybe use some trigger if session changes?We are 
> using python Bottle with psycopg2 (super simple, powerful combo) - are 
> we missing something magical about session plugins ?

Nothing that I'm aware of. I've worked on large projects that keep the
session data in a Postgres table with great success.
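
A minimal sketch of what such a table tends to look like (the names here are
just illustrative):

CREATE TABLE sessions (
    session_id text PRIMARY KEY,
    data       jsonb NOT NULL DEFAULT '{}',
    expires_at timestamptz NOT NULL
);
CREATE INDEX ON sessions (expires_at);

-- periodic cleanup, e.g. from cron:
DELETE FROM sessions WHERE expires_at < now();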

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Drop or alter column under load give ERROR #42804 structure of query does not match function result type:

2015-10-09 Thread Bill Moran
On Fri, 9 Oct 2015 14:32:44 +0800
Victor Blomqvist  wrote:

> I have a heavily used PostgreSQL 9.3.5 database on CentOS 6. Sometimes I
> need to add/remove columns, preferably without any service interruptions,
> but I get temporary errors.
> 
> I follow the safe operations list from
> https://www.braintreepayments.com/blog/safe-operations-for-high-volume-postgresql
> but many operations cause troubles anyway when the more busy tables are
> updated.

I seriously doubt if Paul did enough research to be sure that "safe" is an
absolute term for that list.

> Typically I have user defined functions for all operations, and my table
> and functions follow this pattern:
> 
> CREATE TABLE users (
>   id integer PRIMARY KEY,
>   name varchar NOT NULL,
>   to_be_removed integer NOT NULL
> );
> 
> CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS
> $$
> BEGIN
>   RETURN QUERY SELECT * FROM users WHERE id = id_;
> END;
> $$ LANGUAGE plpgsql;
> 
> Then the actual queries are run by our application as
> 
> SELECT id, name FROM select_users(18);
> 
> As you can see the column to_be_removed is not selected.

Sure it is ... the function does SELECT *, which absolutely includes the
to_be_removed column. The fact that you ignore that column in a
subsequent superselect doesn't mean that the query in the function knows
to do so.

> Then to remove the
> column I use:
> 
> ALTER TABLE users DROP COLUMN to_be_removed;
> 
> However, while the system is under load sometimes (more frequently and
> persistent the more load the system is experiencing) I get errors like
> these:
> 
> ERROR #42804 structure of query does not match function result type:
> Number of returned columns (2) does not match expected column count (3).

This is a bit surprising to me. I would expect Postgres to have one or the
other definition of that row within a single transaction, but what seems to
be happening is that the ALTER causes the row definition to be changed in
the middle of the transaction, thus the function may return 3 columns,
but when the outer query checks the type, it sees that it should only
have 2.
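
One way to reduce the exposure (a sketch only; I haven't tested it against the
exact race you're hitting) is to give the function an explicit result column
list instead of SETOF users, so its result type no longer tracks the table's
row type:

CREATE OR REPLACE FUNCTION select_users(id_ integer)
RETURNS TABLE (id integer, name varchar) AS
$$
BEGIN
  RETURN QUERY SELECT u.id, u.name FROM users u WHERE u.id = id_;
END;
$$ LANGUAGE plpgsql;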

> The same error can happen when columns are added. Can this be avoided
> somehow, or do I need to take the system offline during these kind of
> changes?
> 
> For reference, there was a similar but not same issue posted to psql-bugs a
> long time ago:
> http://www.postgresql.org/message-id/8254631e-61a5-4c03-899f-22fdcf369...@e23g2000vbe.googlegroups.com
> 
> I posted this same question at dba.stackexchange and got the advice to
> repost here:
> http://dba.stackexchange.com/questions/117511/postgresql-drop-column-under-load-give-wrong-number-of-columns-errors

This has a lot to do with internals. You should wait a bit to see if you
get a good answer, but if not you might need to post to the hackers list
so the developers can chime in. My opinion is that this is a bug, but it's
an obscure enough bug that it's not surprising that it's gone unfixed for
a while.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] I'm starting a PostgreSQL user's group in Pittsburgh

2015-10-25 Thread Bill Moran

I'm going to see if I can drum up enough interest for a PostgreSQL
user's group in the Pittsburgh area.

After talking to the organizers of the Philadelphia PUG, I decided
to try using Meetup to coordinate things:
http://www.meetup.com/Pittsburgh-PostgreSQL-Users-Group/

If you're in the Pittsburgh area and would like to get involved,
please show your interest by joining the meetup. I'll get a first
event scheduled as soon as we have enough people signed up to make
it interesting.

If you haven't used meetup before: it's a service specifically for
coordinating things like user's groups, and it does a pretty good
job of letting us coordinate activities. Basic membership on the
site is free and includes participating in as many groups as you
desire. (it only costs something if you want to host your own group).

Hope to see you soon.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Service not starting on Ubuntu 15.04

2015-10-26 Thread Bill Moran
On Mon, 26 Oct 2015 11:21:23 +
Lasse Westh-Nielsen  wrote:
> 
> I posted to stackoverflow about my problem upgrading from Ubuntu 14.04 to
> Ubuntu 15.04:
> http://stackoverflow.com/questions/33306475/ubuntu-15-04-postgresql-doesnt-start
> 
> Tl;dr: postgresql service does not start properly when installed as a
> package using cloud-init.
> 
> And I can't figure out if I am doing something wrong, if the AMI is no
> good, if the package has problems, ... I reckon I cannot be the first
> person to use Postgres on Ubuntu Vivid, but I have been hammering my head
> against the wall with this for hours.
> 
> Any help greatly appreciated!

I'm taking a shot in the dark here, but ...

The symptoms you describe seem to suggest that the script is starting
PostgreSQL asynchronously (i.e. in the background) which means that
the CREATE command runs before the server has finished starting.

A quick way to _test_ this theory would be to put a sleep between the
install and the CREATE commands and see if the problem goes away.

If that does seem to be the problem, then a good _fix_ would be to
find a way to foreground the startup of the server, or have some command
that tests to ensure the server is started and blocks until it is
before running the create command.

The only point I'm unclear on is whether you've confirmed that
Postgres actually _is_ started once the server is up (albeit without
the CREATE statement having succeeded).

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] ??: postgres cpu 100% need help

2015-10-27 Thread Bill Moran
On Tue, 27 Oct 2015 11:30:45 +0800
"657985...@qq.com" <657985...@qq.com> wrote:

> Dear sir:
>  Recently a wired question about postgresql database really bothered 
> me a lot, so i really need your help. Here is the problem, in the most 
> situations the postgre database work very well,  Average 3500tps/s per day, 
> the cpu usage of its process is 3%~10% and every query can be responsed in 
> less than 20ms, but sometimes the cpu usages of its process can suddenly grow 
> up to 90%+ , at that time a simple query can cost  2000+ms. ps: My postgresql 
> version is 9.3.5 and the database is oltp  server.

9.3.5 is pretty old, you should probably schedule an upgrade.

>  shared_buffers | 25GB  

Try setting this to 16GB. It's been a while since I tested on
large-memory/high-load systems, but I seem to remember that
shared_buffers above 16G could cause these sorts of intermittent
stalls.

If that doesn't improve the situation, you'll probably need to
provide more details, specifically the layout of the table in
question, as well as the queries that are active when the
problem occurs, and the contents of the pg_locks table when
the problem is occurring.
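
Something along these lines (just a sketch), captured while the CPU is pegged,
would be a useful starting point:

SELECT l.locktype, l.mode, l.granted, a.pid, a.state, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE NOT l.granted;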

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Auto-analyse on insert operations

2015-11-04 Thread Bill Moran
On Wed, 4 Nov 2015 14:32:37 +0100
Bertrand Roos  wrote:
> 
> I try to configure auto-analyse task with postgresql 9.4.
> I have the following configuration (default configuration):
> track_counts = on
> autovacuum = on
> log_autovacuum_min_duration = -1
> autovacuum_max_workers = 3
> autovacuum_naptime = 300s
> autovacuum_vacuum_threshold = 50
> autovacuum_analyze_threshold = 50
> autovacuum_vacuum_scale_factor = 0.2
> autovacuum_analyze_scale_factor = 0.2
> autovacuum_freeze_max_age = 2
> autovacuum_multixact_freeze_max_age = 4
> autovacuum_vacuum_cost_delay = 20ms
> autovacuum_vacuum_cost_limit = -1
> 
> With this configuration, I can observe that some tables are 
> auto-analysed, but some others are not. Even if there are millions of 
> insert operations on an empty table (all tables are in cluster mode).
> In fact it seems that tables with update operations are the only ones 
> that are auto-analysed.
> I'm quite suprised because the documentation says that daemon check the 
> count of insert, update and delete operations.
> What could it be the reason ? Why tables which have only update 
> operation, aren't analysed ?
> Are update operations really taken into account ?

Given that autoanalyze is pretty critical to the way the system functions,
it's unlikely that it just doesn't work (someone else would have noticed).

A more likely scenario is that you've found some extremely obscure edge
case. If that's the case, you're going to have to give very specific
details as to how you're testing it before anyone is liable to be able
to help you.

I get the impression that you're somewhat new to Postgres, in which case
it's very likely that the problem is that you're not testing the situation
correctly. In that case, we're going to need specific details on how you're
observing that tables are or are not being analysed.

As a wild-guess theory: the process that does the analyze only wakes up
to check tables every 5 minutes (based on the config you show) ... so are
you doing the inserts then checking the table without leaving enough time
in between for the system to wake up and notice the change?

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Auto-analyse on insert operations

2015-11-04 Thread Bill Moran
On Wed, 4 Nov 2015 16:43:57 +0100
Bertrand Roos  wrote:
> 
> Le 04/11/2015 14:55, Bill Moran a écrit :
> > On Wed, 4 Nov 2015 14:32:37 +0100
> > Bertrand Roos  wrote:
> >> I try to configure auto-analyse task with postgresql 9.4.
> >> I have the following configuration (default configuration):
> >> track_counts = on
> >> autovacuum = on
> >> log_autovacuum_min_duration = -1
> >> autovacuum_max_workers = 3
> >> autovacuum_naptime = 300s
> >> autovacuum_vacuum_threshold = 50
> >> autovacuum_analyze_threshold = 50
> >> autovacuum_vacuum_scale_factor = 0.2
> >> autovacuum_analyze_scale_factor = 0.2
> >> autovacuum_freeze_max_age = 2
> >> autovacuum_multixact_freeze_max_age = 4
> >> autovacuum_vacuum_cost_delay = 20ms
> >> autovacuum_vacuum_cost_limit = -1
> >>
> >> With this configuration, I can observe that some tables are
> >> auto-analysed, but some others are not. Even if there are millions of
> >> insert operations on an empty table (all tables are in cluster mode).
> >> In fact it seems that tables with update operations are the only ones
> >> that are auto-analysed.
> >> I'm quite suprised because the documentation says that daemon check the
> >> count of insert, update and delete operations.
> >> What could it be the reason ? Why tables which have only update
> >> operation, aren't analysed ?
> >> Are update operations really taken into account ?
> > Given that autoanalyze is pretty critical to the way the system functions,
> > it's unlikely that it just doesn't work (someone else would have noticed).
> >
> > A more likely scenario is that you've found some extremely obscure edge
> > case. If that's the case, you're going to have to give very specific
> > details as to how you're testing it before anyone is liable to be able
> > to help you.
> >
> > I get the impression that you're somewhat new to Postgres, in which case
> > it's very likely that the problem is that you're not testing the situation
> > correctly. In that case, we're going to need specific details on how you're
> > observing that tables are or are not being analysed.
> >
> > As a wild-guess theory: the process that does the analyze only wakes up
> > to check tables every 5 minutes (based on the config you show) ... so are
> > you doing the inserts then checking the table without leaving enough time
> > in between for the system to wake up and notice the change?
> >
> Thanks for your answer Bill.
> Indeed, I'm pretty new to Postgres and I don't exclude that I'm doing 
> something wrong. But I did my test on a more than 1 day duration, so 
> it's not an issue of autovacuum_naptime (I insert 760 lignes each 30 
> seconds during 36 hours).
> I can't give all the details of this test because it is to complicated 
> with triggers and partman (and your objective is not to solve 
> configuration issues of others).

Others have answered some of your other questions, so I'll just throw
out another possibility: have the per-table analyze settings been altered
on the table(s) that are behaving badly? See
http://www.postgresql.org/docs/9.4/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS
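
A quick way to check for such overrides (a sketch; substitute your table name):

SELECT relname, reloptions
FROM pg_class
WHERE relname = 'your_table_name';

-- and, if something unexpected shows up there, e.g.:
-- ALTER TABLE your_table_name RESET (autovacuum_enabled, autovacuum_analyze_scale_factor);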

Attaching the output of
pg_dump -s -t $table_name -U postgres $database_name
will probably go a long way toward getting more targeted assistance.
(substitute the actual database name, and the name of a table that is
giving you trouble)

In addition, the output of
SELECT * FROM pg_stat_user_tables WHERE relname = '$table_name';
(Again, substitute an actual table name that's giving you trouble,
preferrably the same table as from the pg_dump)

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Protect a table against concurrent data changes while allowing to vacuum it

2016-06-22 Thread Bill Moran
On Wed, 22 Jun 2016 10:20:38 +
Sameer Kumar  wrote:

> On Wed, Jun 22, 2016 at 6:08 PM Vlad Arkhipov  wrote:
> 
> > I am running PostgreSQL 9.5.
> >
> > CREATE TABLE t (id BIGINT NOT NULL PRIMARY KEY, name TEXT);
> >
> > The constraint that the data must satisfy is `there is no more than 3
> > records with the same name`.
> >
> > I am not in control of queries that modify the table, so advisory locks
> > can hardly be of help to me.
> >
> 
> Define a function which does a count of the rows and if count is 3 it
> return false if count is less it returns true.

An exclusion constraint might be a better solution.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] The consequenses of interrupted vacuum

2016-10-27 Thread Bill Moran

Very specific question:
Does interrupting a VACUUM (not FULL) rollback all the work it has done
so far, or is the work done on a page by page basis such that at least
some of the pages in the table have been vacuumed?

I'm asking for cases of large tables where autovacuum frequently gets
interrupted. I'm trying to understand if the partial runs are at least
making _some_ progress so the next vacuum has less to do, or if this is
a serious problem that I need to fiddle with tuning to fix.

-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] The consequenses of interrupted vacuum

2016-10-27 Thread Bill Moran
On Thu, 27 Oct 2016 10:44:03 -0400
Tom Lane  wrote:

> > I'm asking for cases of large tables where autovacuum frequently gets
> > interrupted. I'm trying to understand if the partial runs are at least
> > making _some_ progress so the next vacuum has less to do, or if this is
> > a serious problem that I need to fiddle with tuning to fix.
> 
> It's probably making some progress but not much.  You need to fix that.

Thanks for the feedback. The good news is that grepping through recent logs,
I'm not seeing the problem any more. So I must have just noticed it on a
particularly problematic day last time I looked.

-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Unexplained statistics reset? Help tracking it down.

2016-12-09 Thread Bill Moran

I've been seeing some unexplained behavior where the statistics in a Postgres
database reset with no explanation as to why. This is concerning because it
results in terrible query plans until someone manually runs analyze, and that
it negatively impacts autovacuum.

This is Postgres 9.5.4 on Ubuntu 14 (yes, I know there's a patch I need to
apply -- it's on my TODO list, but I don't see any mention of fixing unexplained
stats resets in that version, so I'm not assuming that will fix it)

My first thought is that some random user was calling pg_reset_stats() without
realizing what they were doing. However, I have full query logging enabled on
this system, and the logs don't show this happening. (Yes, I've also checked
for someone disabling query logging for their connection before doing 
mysterious
things).

Before I start theorizing that this might be a bug in Postgres, does anyone have
any suggestions on what other ways the stats could be reset that I need to check
on? Has anyone else experienced this to lend credence to the possibility that 
it's
a bug? I have no clue how to reproduce it, as the occurrence is rare and still
seems random.
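
For anyone wanting to watch for the same thing, the per-database reset
timestamp is visible with:

SELECT datname, stats_reset FROM pg_stat_database;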

-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] pg_repack and Postgres versions > 9.4

2016-12-16 Thread Bill Moran

Does anyone have experience using pg_repack on Postgres versions > 9.4?
Specifically 9.5, but probably 9.6 at some point.

The documentation claims it supports up to 9.4. I haven't looked at it
closely enough to guess whether there might be changes in 9.5/9.6 to
cause it not to work any more.

Anyone know? Or, alternatively, anyone have another option to get the
same job done?

-- 
Bill Moran 


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] controlling memory management with regard to a specific query (or groups of connections)

2015-11-19 Thread Bill Moran
On Wed, 18 Nov 2015 20:10:00 -0500
Jonathan Vanasco  wrote:

> As a temporary fix I need to write some uploaded image files to PostgreSQL 
> until a task server can read/process/delete them.  
> 
> The problem I've run into (via server load tests that model our production 
> environment), is that these read/writes end up pushing the indexes used by 
> other queries out of memory -- causing them to be re-read from disk.   These 
> files can be anywhere from 200k to 5MB.
> 
> has anyone dealt with situations like this before and has any suggestions?  I 
> could use a dedicated db connection if that would introduce any options. 

PostgreSQL doesn't have any provision for preferring one relation over
another when deciding what to keep in memory.

The easiest thing I can think would be to add memory to the machine
(or configure Postgres to use more) such that those files aren't
pushing enough other pages out of memory to have a problematic
impact.

Another idea would be to put the image database on a different
physical server, or run 2 instances of Postgres on a single
server with the files in one database configured with a low
shared_buffers value, and the rest of the data on the other
database server configured with higher shared_buffers.

I know these probably aren't the kind of answers you're looking
for, but I don't have anything better to suggest; and the rest
of the mailing list seems to be devoid of ideas as well.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] JSONB performance enhancement for 9.6

2015-11-29 Thread Bill Moran
On Sat, 28 Nov 2015 21:27:51 -0500
Tom Smith  wrote:
> 
> Is there a plan for 9.6 to resolve the issue of very slow query/retrieval
> of jsonb fields
> when there are large number (maybe several thousands) of top level keys.
> Currently, if I save a large json document with top level keys of thousands
> and query/retrieve
> field values,  the whole document has to be first decompressed and load to
> memory
> before searching for the specific field key/value.

I could be off-base here, but have you tried:

ALTER TABLE $table ALTER COLUMN $json_column SET STORAGE EXTERNAL;

?

The default storage for a JSONB field is EXTENDED. Switching it to
EXTERNAL will disable compression. You'll have to insert your data over
again, since this change doesn't alter any existing data, but see
if that change improves performance.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] JSONB performance enhancement for 9.6

2015-11-29 Thread Bill Moran
On Sun, 29 Nov 2015 08:24:12 -0500
Tom Smith  wrote:

> Hi, Thanks for everyone's response.
> 
> The issue is not just compression, but lack of "indexing" or "segmentation"
> when a
> single doc has, say 2000 top level keys (with multiple levels of subkeys).
>  right now, if I query for one key,  the whole doc
> has to be first uncompressed and loaded and then search for the single key.
> 
> Compared to traditional way of storing each top level key with a separate
> column, this is huge overhead when table scan is required.  Some kind of
> "keyed/slotted" storage for the doc could
> help, (for illustration, all keys starting with 'A' would have its own
> storage unit, so on,
> so when I search for key  "A1" only that unit would be unpacked and
> traversed to get :"A1" value". it is like postgresql predfine 26
> columns/slots for the whole doc. an internal indexing
> within each doc for fast retrieval of individual field values.

Sounds like you're pushing the limits of what JSONB is designed to do
(at this stage, at least). I'm not aware of any improvements in recent
versions (or head) that would do much to improve the situation, but I
don't track every commit either. If you really need this improvement and
you're willing to wait for 9.6, then I suggest you check out the latest
git version and test on that to see if anything has been done.

I doubt you'll see much, though. As a thought experiment, the only way
I can think to improve this use case is to ditch the current TOAST
system and replace it with something that stores large JSON values in
a form optimized for indexed access. That's a pretty massive change
to some fairly core stuff just to optimize a single use-case of a
single data type. Not saying it won't happen ... in fact, all things
considered, it's pretty likely to happen at some point.

As far as a current solution: my solution would be to decompose the
JSON into an optimized table. I.e.:

CREATE TABLE store1 (
 id SERIAL PRIMARY KEY,
 data JSONB
);

CREATE TABLE store2 (
 id INT NOT NULL REFERENCES store1(id),
 top_level_key VARCHAR(1024),
 data JSONB,
 PRIMARY KEY(top_level_key, id)
);

You can then use a trigger to ensure that store2 is always in sync with
store1. Lookups can then use store2 and will be quite fast because of
the index. A lot of the design is conjectural: do you even still need
the data column on store1? Are there other useful indexes? etc. But,
hopefully the general idea is made clear.
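
As a rough sketch of the trigger side (assuming only the top-level keys
matter, as in your description):

CREATE OR REPLACE FUNCTION store1_sync() RETURNS trigger AS $$
BEGIN
  DELETE FROM store2 WHERE id = NEW.id;
  INSERT INTO store2 (id, top_level_key, data)
  SELECT NEW.id, e.key, e.value
  FROM jsonb_each(NEW.data) AS e;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER store1_sync_trg
  AFTER INSERT OR UPDATE ON store1
  FOR EACH ROW EXECUTE PROCEDURE store1_sync();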

These probably aren't the answers you want, but (to the best of my
knowledge) they're the best answers available at this time. I'd really
like to build the alternate TOAST storage, but I'm not in a position to
start on a project that ambitious right ... I'm not even really keeping
up with the project I'm currently supposed to be doing.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] AccessExclusiveLock on tuple?

2015-12-02 Thread Bill Moran
On Wed, 2 Dec 2015 09:01:37 -0800
Christophe Pettus  wrote:

> On 9.4, I've encountered a locking message I've not seen before:
> 
>   process 5293 still waiting for AccessExclusiveLock on tuple (88636,15) 
> of relation 18238 of database 16415 after 5000.045 ms
> 
> What conditions produce an "AccessExclusiveLock on tuple"?  Attempting to 
> lock a tuple when another process has done an explicit LOCK ACCESS EXCLUSIVE?

No. See the section on row level locks here:
http://www.postgresql.org/docs/9.4/static/explicit-locking.html

Essentially, any data modification could take an exclusive lock on the row(s)
that it's going to modify. Generally, this will be an UPDATE statement,
although the same thing happens when you do SELECT ... FOR UPDATE.

The message you're seeing simply means that one process has been waiting for
a long time for the lock to release (5 seconds in this case). Deadlocks are
automatically handled, so this is not a deadlock. Although if the process
holding the lock does not commit the transaction, the waiting process will
wait indefinitely.

If this is happening infrequently, it's probably of no concern. If it's
happening frequently, you'll want to investigate what process is holding
the locks for so long and see what can be done about it.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] AccessExclusiveLock on tuple?

2015-12-02 Thread Bill Moran
On Wed, 2 Dec 2015 09:31:44 -0800
Christophe Pettus  wrote:
> 
> On Dec 2, 2015, at 9:25 AM, Bill Moran  wrote:
> 
> > No. See the section on row level locks here:
> > http://www.postgresql.org/docs/9.4/static/explicit-locking.html
> 
> That wasn't quite my question.  I'm familiar with the row-level locking and 
> the locking messages in general, but this message implies there is such a 
> thing as an AccessExclusiveLock on a tuple, which is new to me.  I wasn't 
> able to produce this message experimentally doing various combinations of 
> UPDATE statements and SELECT FOR UPDATEs, or even with explicit LOCK ACCESS 
> EXCLUSIVE MODE, thus the question.

First off, that documentation page _does_ answer your question.

Secondly, there is a config setting: log_lock_waits, which is
disabled by default. The message won't appear if that is off, so
if you're testing on a different install than where the incident
happened, that could be part of the problem.

Finally, the following sequence triggers the message:

create table test1 (data int);
insert into test1 values (1);

Connection 1:
begin;
select * from test1 where data = 1 for update;

Connection 2:
select * from test1 where data = 1 for update;

Then wait for a little while and the message will be logged.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] JDBC and inet type

2015-12-04 Thread Bill Moran
On Fri, 4 Dec 2015 09:41:24 +
Tim Smith  wrote:

> When I use "preparedStatement.setString(5,ip);" to send values to a
> stored function, it obviously gets sent to postgres as "character
> varying".
> 
> Postgres obviously complains loudly and says " Hint: No function
> matches the given name and argument types. You might need to add
> explicit type casts.".
> 
> What is the appropriate workaround ?

You can define param 5 as varchar in your query, as Rob suggests:

CREATE FUNCTION some_function(int, int, int, int, varchar) ...

Then cast the 5th parameter to INET within your function.

You can also cast the value in your SQL.

sql = "SELECT some_function($, $, $, $, $::INET)"; ...

You could also create an Inet class in Java and implement the
SQLData interface, then use setObject() instead of setString().
It doesn't appear as if anyone has done this yet, but it would
be nice if it were included in the JDBC driver.

The first answer is probably best for stored procedures, as it
simplifies things down the road. The second solution is more
universal, as it works for non-function-calling SQL as well.
The third solution is probably _really_ the correct one, from
a pedantic standpoint, but it's a bit more work to implement.
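
To illustrate the first approach, a minimal sketch (the function and table
names here are made up):

CREATE FUNCTION log_client(label varchar, ip_text varchar) RETURNS void AS $$
  INSERT INTO client_log (label, addr) VALUES ($1, $2::inet);
$$ LANGUAGE sql;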

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD

2015-12-13 Thread Bill Moran
77 
> > postgres: postgres postgres [local] SELECT
> > 26851 postgres  20   0 2365732 408464 406788 R 100.0 10.1 0:17.81 
> > postgres: postgres postgres [local] SELECT
> > 26851 postgres  20   0 2365732 864472 862576 R 100.0 21.4 0:38.90 
> > postgres: postgres postgres [local] SELECT
> > -- Function execmultiplei and transaction terminated, but memory still 
> > allocated!!!
> > 26851 postgres  20   0 2365732 920668 918748 S   0.0 22.7 0:41.40 
> > postgres: postgres postgres [local] idle
> > -- Calling it again
> > 26851 postgres  20   0 2365732 920668 918748 R  99.0 22.7 0:46.51 
> > postgres: postgres postgres [local] SELECT
> > -- idle again, memory still allocated
> > 26851 postgres  20   0 2365732 920668 918748 S   0.0 22.7 1:22.54 
> > postgres: postgres postgres [local] idle
> >
> > Memory will only be released if psql is exited. According to the 
> > PostgreSQL design memory should be freed when the transaction completed.
> >
> > top commands on FreeBSD: top -SaPz -o res -s 1
> > top commands on Linux: top -o RES d1
> >
> > Config: VMs with 4GB of RAM, 2 vCPUs
> > shared_buffers = 2048MB # min 128kB
> > effective_cache_size = 2GB
> > work_mem = 892MB
> > wal_buffers = 8MB
> > checkpoint_segments = 16

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD

2015-12-13 Thread Bill Moran
On Sun, 13 Dec 2015 16:35:08 +0100
Gerhard Wiesinger  wrote:

> Hello Bill,
> 
> Thank you for your response, comments inline:
> 
> On 13.12.2015 16:05, Bill Moran wrote:
> > On Sun, 13 Dec 2015 09:57:21 +0100
> > Gerhard Wiesinger  wrote:
> >> some further details from the original FreeBSD 10.1 machine:
> >>
> >> Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free
> >> Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse
> >>
> >> PID USERNAMETHR PRI NICE   SIZERES STATE   C   TIME WCPU 
> >> COMMAND
> >> 77941 pgsql 5  200  7925M  7296M usem2 352:34 6.98%
> >> postgres: username dbnamee 127.0.0.1(43367)  (postgres)
> > 
> >
> > I see no evidence of an actual leak here. Each process is basically using
> > the 7G of shared_buffers you have allocated in the config (which is only
7G _total_ for all processes, since it's shared memory).
> 
> OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed?

You haven't provided enough information to isolate that cause yet. What's
in the Postgres log? Surely it will have logged something when its request
for RAM was denied, and it should be more informational than the OS'
generic message.

> >> Out of memory:
> >> kernel: swap_pager_getswapspace(4): failed
> >> kernel: swap_pager_getswapspace(8): failed
> >> kernel: swap_pager_getswapspace(3): failed
> >>
> >> Main issue is IHMO (as far as I understood the FreeBSD Memory system)
> >> that 20G are INACTIVE. When I subtract the shared memory, even ~13GB
> >> should be available, but they are still allocated but inactive
> >> (INACTIVE). INACTIVE memory might be clean or dirty. As we get into out
> >> of memory situations it is likely that the memory is dirty (otherwise it
> >> would have been reused).
> > Not quite correct. Inactive memory is _always_ available for re-use.
> 
> Are you sure that's true?

Yes. Read The Design and Implementation of FreeBSD for the details.

> Monitoring inactive memory:
> cat vm_stat.sh
> #!/usr/bin/env bash
> 
> while [ 1 ]; do
>date +%Y.%m.%d.%H.%M.%S
>sysctl -a | grep vm.stats.vm.
>sleep 1
> done
> 
> And even we get out of memory with swap_pager_getswapspace Inactive 
> Memory (from the log file) is around 20GB (doesn't go down or up)
> vm.stats.vm.v_inactive_count: 5193530 (*4k pages is around 20GB)
> 
> Then we have 20GB inactive memory, but we still get out of memory with 
> kernel: swap_pager_getswapspace(4): failed. Any ideas why?

Theory: If the planner decides it needs to do 30 sort operations for a
query, it will try to allocate 27G of RAM, which exceeds what's available,
and therefore never gets allocated. So you get the "out of space" message,
but the actual memory usage doesn't change.

> >> maintenance_work_mem = 512MB
> >> effective_cache_size = 10GB
> >> work_mem = 892MB
> > I expect that this value is the cause of the problem. The scenario you
> > describe below is sorting a large table on an unindexed column, meaning
> > it will have to use all that work_mem. I'd be interested to see the
> > output of:
> >
> > EXPLAIN ANALYZE SELECT * FROM t_random ORDER BY md5 LIMIT 10;
> 
> That was only a test query; it has nothing to do with the production 
> queries. They are mostly SELECTs/INSERTs/UPDATEs on primary keys.

Then provide the _actual_ queries and the EXPLAIN ANALYZE and the table
layouts and basic data distribution of the actual cause. If your test
case is completely non-representative of what's happening, then you're
not going to get useful answers.

> > But even without that information, I'd recommend you reduce work_mem
> > to about 16M or so.
> 
> Why so low? E.g. sorting for reporting or other long-running queries would 
> then be done on disk and not in memory.

Even a simple query could involve multiple sorts, and you're allowing
each sort to use up to 892MB of RAM (which is _not_ shared). As noted
earlier, even a moderately complex query could exceed the available
RAM on the system. But since you don't provide the actual queries and
tables causing the problems, I can only guess. And since you appear to
have already decided what the cause of the problem is, and then crafted
completely irrelevant queries that you think prove your point, I'm
not sure there's anything I can do to help you.
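
If you want to see the effect directly, the t_random test table from your
earlier mail is good enough for that (a sketch; compare the "Sort Method"
lines in the two plans):

SET work_mem = '16MB';
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM t_random ORDER BY md5;
-- expect: Sort Method: external merge  Disk: ...

SET work_mem = '892MB';
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM t_random ORDER BY md5;
-- expect: Sort Method: quicksort  Memory: ...  (if the sort fits)

RESET work_mem;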

> >> wal_buffers = 8MB
> >> checkpoint_segments = 16
> >> shared_buffers = 7080MB
> >> max_connections = 80
> >> autovacuum_max_workers = 3
> > [snip]
> >

Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD

2015-12-13 Thread Bill Moran
On Sun, 13 Dec 2015 20:09:04 +0100
Gerhard Wiesinger  wrote:

> On 13.12.2015 18:17, Tom Lane wrote:
> > Gerhard Wiesinger  writes:
> >>> Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free
> >>> Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse
> >> OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed?
> > Just judging from the name of the function, I would bet this is a direct
> > result of having only 512M of swap configured.  As Bill already pointed
> > out, that's a pretty useless choice on a system with 32G of RAM.  As soon
> > as the kernel tries to push out any significant amount of idle processes,
> > it's gonna be out of swap space.  The numbers you show above prove that
> > it is almost out of free swap already.
> 
> The system wasn't designed by me; I wouldn't do it that way either. Does 
> swapoff help?

FreeBSD and Linux (and most modern OSes) are designed to have swap,
and usually more swap than RAM. I have never heard a good reason for
not using swap, and the reasons I _have_ heard have always come from
people misinformed about how the OS works.

If someone has a _good_ explanation for why you wouldn't want any
swap on a DB server, I'd love to hear it; but everything I've heard
up till now has been speculation based on misinformation.

IOW: no, you should not turn swap off, you should instead allocate
the appropriate amount of swap space.

> > Also, while that 20G of "inactive" pages may be candidates for reuse,
> > they probably can't actually be reused without swapping them out ...
> > and there's noplace for that data to go.
> 
> There is no entry in syslog (where postgres logs) when 
> swap_pager_getswapspace is logged.
> 
> But why do we have 20G of Inactive pages? They are still allocated by 
> kernel or user space. As you can see below (top output) NON Postgres 
> processes are around 9G in virtual size, resident even lower. The system 
> is nearly idle, and the queries typically aren't active after one second 
> again. Therefore where does the rest of the 11G of Inactive pages come 
> from (if it isn't a Postgres/FreeBSD memory leak)?
> I read that Postgres has its own memory allocator:
> https://www.reddit.com/r/programming/comments/18zija/github_got_30_better_performance_using_tcmalloc/
> Might that be an issue with double allocation/freeing and the "cheese 
> hole" topic with memory fragmentation?

If there were a memory leak in either FreeBSD or Postgres of the
seriousness you're describing, and one as easy to trigger as you
claim, I would expect the mailing lists and other support forums
to be exploding in panic. Notice that they are not. Also, I still
don't see _ANY_ evidence of a leak. I see evidence that something
is happening that is trying to allocate a LOT of RAM, that isn't
available on your system; but that's not the same as a leak.

> https://www.opennet.ru/base/dev/fbsdvm.txt.html
>  inactive   pages not actively used by programs which are
>  dirty and (at some point) need to be written
>  to their backing store (typically disk).
>  These pages are still associated with objects and
>  can be reclaimed if a program references them.
>  Pages can be moved from the active to the inactive
>  queue at any time with little adverse effect.
>  Moving pages to the cache queue has bigger
>  consequences (note 1)

Correct, but, when under pressure, the system _will_ recycle those
pages to be available.

Tom might be correct in that the system thinks they are inactive
because it could easily push them out to swap but then can't
_actually_ do so because you haven't allocated enough swap; that
doesn't match my understanding of how inactive is used, though. A
question of that detail would be better asked on a FreeBSD forum,
as the differences between different VM implementations can be
pretty specific and technical.

[snip]

> Mem: 8020M Active, 19G Inact, 3537M Wired, 299M Cache, 1679M Buf, 38M Free
> Swap: 512M Total, 501M Used, 12M Free, 97% Inuse
> 
>   PID USERNAME    THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
> 77941 pgsql 5  200  7921M  7295M usem7 404:32  10.25% 
> postgres
> 79570 pgsql 1  200  7367M  6968M sbwait  6 4:24   0.59% postgres

[snip about 30 identical PG processes]

> 32387 myusername9  200   980M   375M uwait   5 69:03   1.27% node

[snip similar processes]

>622 myusername1  200   261M  3388K kqread  3 41:01   0.00% nginx

[snip similar processes]

Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD

2015-12-13 Thread Bill Moran
On Sun, 13 Dec 2015 22:23:19 +0100
Gerhard Wiesinger  wrote:

> On 13.12.2015 21:14, Bill Moran wrote:
> > Wait ... this is a combined HTTP/Postgres server? You didn't mention that
> > earlier, and it's kind of important.
> >
> > What evidence do you have that Postgres is actually the part of
> > this system running out of memory?
> 
> For me the complete picture doesn't look consistent.

That's because you haven't gathered enough of the right type of information.

> > I don't see any such evidence in any of
> > your emails, and (based on experience) I find it pretty likely that whatever
> > is running under node is doing something in a horrifically 
> > memory-inefficient
> > manner. Since you mention that you see nothing in the PG logs, that makes it
> > even more likely (to me) that you're looking entirely in the wrong place.
> >
> > I'd be willing to bet a steak dinner that if you put the web server on a
> > different server than the DB, that the memory problems would follow the
> > web server and not the DB server.
> 
> Changes in config:
> track_activity_query_size = 102400
> work_mem = 100MB
> 
> Ok, we restarted PostgreSQL and had it stopped for seconds, and logged 
> top every second:
> 
> When PostgreSQL was down nearly all memory was freed, looks good to me. 
> So it is likely that node and other processes are not the cause.
> Mem: 742M Active, 358M Inact, 1420M Wired, 21G Cache, 871M Buf, 8110M Free
> Swap: 512M Total, 477M Used, 35M Free, 93% Inuse
> 
> When PostgreSQL restarted, Inactive was growing fast (~1min):
> Mem: 7998M Active, 18G Inact, 2763M Wired, 1766M Cache, 1889M Buf, 1041M 
> Free
> Swap: 512M Total, 472M Used, 41M Free, 92% Inuse
> 
> After some few minutes we are back again at the same situation:
> Mem: 8073M Active, 20G Inact, 2527M Wired, 817M Cache, 1677M Buf, 268M Free
> Swap: 512M Total, 472M Used, 41M Free, 92% Inuse
> 
> The steak dinner is mine :-) Donating to the PostgreSQL community :-)

Based on the fact that the inactive memory increased? Your understanding
of inactive memory in FreeBSD is incorrect. Those pages are probably DB
pages that the OS is keeping in inactive memory because Postgres requests
them over and over, which is what the OS is supposed to do to ensure the
best performance. Are you seeing any out of swap space errors? Even if
you are, you still haven't determined if the problem is the result of
Postgres or the node.js stuff you have running. I don't know what node.js
might be caching on the client side ... do you?

No. Until you can actually report back something other than wild
speculation, I'll keep that steak dinner for myself. Besides, that bet
was based on you putting the PG server on separate hardware from the
web server, which you didn't do.
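
If you want to narrow it down from the Postgres side while the inactive
number climbs, watching the backends is cheap (a sketch; pg_stat_activity
shows what each backend is doing, not how much memory it holds):

-- How many backends exist, and what state they are in.
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state
ORDER BY count(*) DESC;

-- The longest-running non-idle statements, if any.
SELECT pid, now() - query_start AS runtime, state, left(query, 60) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC
LIMIT 10;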

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Test disk reliability (or HGST HTS721010A9E630 surprisingly reliable)

2015-12-21 Thread Bill Moran
On Mon, 21 Dec 2015 14:54:14 +0100
Félix GERZAGUET  wrote:

> On Mon, Dec 21, 2015 at 12:31 AM, Jim Nasby 
> wrote:
> 
> > On 12/20/15 1:09 PM, Félix GERZAGUET wrote:
> >
> >> After reading
> >> http://www.postgresql.org/docs/current/static/wal-reliability.html, I
> >> tried the recommended diskchecker.pl
> >> <http://brad.livejournal.com/2116715.html> but I am not satisfied:
> >>
> >> I always get:
> >> Total errors: 0
> >>
> >> even if I tested with with a HGST HTS721010A9E630 that the vendor's
> >> datasheet
> >> (http://www.hgst.com/sites/default/files/resources/TS7K1000_ds.pdf)
> >> advertise as "
> >> Designed for low duty cycle, non mission-critical applications in
> >> PC,nearline and consumer electronics environments, which vary
> >> application to application
> >> "
> >>
> >> Since it is not, a high end disk, I expect some errors.
> >>
> >
> > Why? Just because a disk isn't enterprise-grade doesn't mean it has to lie
> > about fsync, which is the only thing diskchecker.pl tests for.
> >
> 
> I was thinking that since the disk has a 32M write cache (with no
> battery) it would lie to the OS (and postgres) about when data are really
> on disk (not in the disk write cache). But maybe that thinking was wrong.

It varies by vendor and product, which is why diskchecker.pl exists.
It's even possible that the behavior is configurable ... check to see
if the vendor provides a utility for configuring it.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] uuid-ossp: Performance considerations for different UUID approaches?

2015-12-22 Thread Bill Moran
On Tue, 22 Dec 2015 11:07:30 -0600
Brendan McCollam  wrote:

> (previously posted to the pgsql-performance list)
> 
> Hello,
> 
> We're in the process of designing the database for a new service, and 
> some of our tables are going to be using UUID primary key columns.
> 
> We're trying to decide between:
> 
> * UUIDv1 (timestamp/MAC uuid) and
> 
> * UUIDv4 (random uuid)
> 
> And the separate but related choice between:
> 
> * Generating the UUIDs client-side with the Python uuid library 
> (https://docs.python.org/2/library/uuid.html) or
> 
> * Letting PostgreSQL handle uuid creation with the uuid-ossp extension 
> (http://www.postgresql.org/docs/9.4/static/uuid-ossp.html)
> 
> In terms of insert and indexing/retrieval performance, is there one 
> clearly superior approach? If not, could somebody speak to the 
> performance tradeoffs of different approaches?
> 
> There seem to be sources online (e.g. 
> https://blog.starkandwayne.com/2015/05/23/uuid-primary-keys-in-postgresql/ 
> http://rob.conery.io/2014/05/29/a-better-id-generator-for-postgresql/) 
> that claim that UUIDv4 (random) will lead to damaging keyspace 
> fragmentation and using UUIDv1 will avoid this.

There's no substance to these claims. Chasing the links around, we finally
find this article:
http://www.sqlskills.com/blogs/kimberly/guids-as-primary-keys-andor-the-clustering-key/
which makes the reasonable argument that random primary keys can cause
performance-robbing fragmentation on clustered indexes.
But Postgres doesn't _have_ clustered indexes, so that article doesn't
apply at all. The other authors appear to have missed this important
point.

One could make the argument that the index itself becoming fragmented
could cause some performance degradation, but I've yet to see any
convincing evidence that index fragmentation produces any measurable
performance issues (my own experiments have been inconclusive).

Looking at it another way, a quick experiment shows that PG can fit
about 180 UUID primary keys per database page, which means a million-row
table will use about 5600 pages, to the tune of about 46 MB. On
modern hardware, that index is likely to be wholly in memory all the
time.

If your performance requirements are really so dire, then you should
probably consider ditching UUIDs as keys. Taking the same million-row
table I postulated in the previous paragraph, but using ints instead
of UUIDs for the primary key, the primary key index would be about
3200 pages (~26 MB) ... or almost half the size -- making it more likely
to all be in memory at any point in time.
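
If you want to reproduce that back-of-the-envelope comparison, something
along these lines will do it (a sketch with throwaway table names; exact
sizes vary a little by version and platform):

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Two 1M-row tables, identical except for the primary key type.
CREATE TABLE t_uuid (id uuid PRIMARY KEY DEFAULT uuid_generate_v4(), payload int);
CREATE TABLE t_int  (id int  PRIMARY KEY, payload int);

INSERT INTO t_uuid (payload) SELECT g FROM generate_series(1, 1000000) g;
INSERT INTO t_int  (id, payload) SELECT g, g FROM generate_series(1, 1000000) g;

-- Compare the on-disk size of the two primary key indexes.
SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS index_size
FROM pg_class
WHERE relname IN ('t_uuid_pkey', 't_int_pkey');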

I seriously doubt that trying to make your UUIDs generate in a
predictable fashion will produce any measurable improvement, and I
see no evidence in the articles you cited that the claims otherwise
have any real basis or were made by anyone knowledgeable enough
to know.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Enforcing referential integrity against a HSTORE column

2016-01-02 Thread Bill Moran
On Sat, 2 Jan 2016 07:30:38 -0800
Adrian Klaver  wrote:

> > So given:
> >
> > CREATE TABLE xtra_fields(
> >xfk SERIAL PRIMARY KEY,
> >xtk INTEGER NOT NULL REFERENCES xtra_types,
> >...
> > );
> >
> > CREATE OR REPLACE FUNCTION foo_xtra_fk(HSTORE) RETURNS BOOLEAN AS $$
> > WITH keyz AS (SELECT skeys($1)::INT AS xfk)
> > SELECT
> >(SELECT COUNT(*) FROM keyz JOIN xtra_fields USING (xfk))
> >=
> >(SELECT COUNT(*) FROM keyz)
> > $$LANGUAGE SQL STABLE STRICT LEAKPROOF;
> >
> > CREATE TABLE foo(
> >id INTEGER NOT NULL CHECK (id > 0),
> >...
> > -- Extra fields where the keys are the xtra_fields.xfk values and the
> > values are the
> > -- data values for the specific xfk.
> >xtra hstore CHECK (foo_xtra_fk(xtra))
> > );
> >
> > is there a more efficient way of maintaining logical referential integrity?

I second Adrian's comment on making sure that the benefit of HSTORE is
outweighing the drawback of having to write your own checks ... however,
if you decide that HSTORE is the right way to go, you may want to try
something more along the lines of this for your check:

SELECT NOT EXISTS (SELECT 1 FROM keyz WHERE xfk NOT IN (SELECT xfk FROM xtra_fields));

Not tested, so it's possible that I have some typo or something; but overall
I've found that the NOT EXISTS construct can be very efficient in cases
like these.
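
Wired into a function like the one you posted, it would look roughly like
this (still untested, and it assumes the same xtra_fields table):

CREATE OR REPLACE FUNCTION foo_xtra_fk(HSTORE) RETURNS BOOLEAN AS $$
    -- True only when every hstore key (cast to int) exists in xtra_fields.
    WITH keyz AS (SELECT skeys($1)::INT AS xfk)
    SELECT NOT EXISTS (
        SELECT 1 FROM keyz
        WHERE xfk NOT IN (SELECT xfk FROM xtra_fields)
    );
$$ LANGUAGE SQL STABLE STRICT;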

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Code of Conduct: Is it time?

2016-01-10 Thread Bill Moran
On Sun, 10 Jan 2016 07:36:23 -0800
"Joshua D. Drake"  wrote:

> Hey,
> 
> For the record, my thoughts on a CoC are something like:
> 
> 1. Be excellent to each other
> 2. If you don't know what that means, leave
> 3. If someone isn't being excellent please contact: XYZ
> 
> With XYZ being a committee that determines the ABCs.

In general, I agree; but there are problems with 1 and 2.

The definition of "being excellent" varies from individual
to individual and, more importantly, from culture to culture.
As a result, pretty much everyone would have to leave under #2,
because very few people know what "being excellent" means to
everyone involved.

I would feel REALLY bad for XYZ, then: they would be put in the
unenviable position of trying to mediate disputes with no guidance
whatsoever.

So, the purpose of a CoC is twofold:

A) Define what "being excellent" means to this particular
   community.
B) Provide a process for how to resolve things when "being
   excellent" doesn't happen.

Without (A), nobody will want to do (B), as it's basically a
job that can never be done correctly.

But defining (A) is the really difficult part, because no matter
how you define it, there will be some people who disagree with
said definition.

The fact that Postgres has not needed a CoC up till now is a
testament to the quality of the people in the community. However,
as Postgres continues to grow in popularity, the number of people
involved is going to increase. Simply as a matter of statistics,
the project will be forced to deal with some unsavory people at
some point. Having a CoC is laying the foundation to ensure that
dealing with those people involves the least pain possible. It
will always involve _some_ pain, but less is better.

I've done the job of #3 with other groups, and 99% of the time
there was nothing to do. The one incident I had to handle was
terrible, but at least I had some guidance on how to deal with
it.

-- 
Bill Moran


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

