Re: [GENERAL] Which CMS/Ecommerce/Shopping cart ?
Samantha Atkins wrote:
> Why is MySQL so much more popular right now, especially in the
> OpenSource community? As a database I find its architecture with
> multiple underlying engines and other quirks to be rather dubious.
> Then there is the issue of commercial licenses and exactly when you
> must have those and what it will really cost. Yet it is pretty

MySQL was available on Windows long before PostgreSQL. MySQL has always been free for all uses, including commercial use, for ISPs, so it quickly became the database that all ISPs/domain hosts provide and, therefore, a popular choice for Web apps.

> ubiquitous. How come? Why isn't postgresql more on developer's minds
> when they think of OS databases? Amazon cloud has great scalable
> MySQL support but apparently not postgreql. Why? Is there something
> about postgresql that is bugging all these people or what?

My guess is that there are a lot of people who know of MySQL who have never heard of PostgreSQL. Also, PostgreSQL does not scale as well on Windows as it does on Linux/Unix. I have talked to people who support 2,000 concurrent users using PostgreSQL on Linux. I have been told that the limit on Windows is about 300 concurrent users. I have no idea how accurate that statement is.

I share your surprise because PostgreSQL has a much more extensive feature set than MySQL.

--
.Bill.
[GENERAL] ADO/ODBC returns TEXT fields as Varchar, not LongVarChar
I am new to PostgreSQL and running 8.3 on Windows. I am connecting using ADO and the ODBC driver. My connection string contains TextAsLongVarchar=1 and MaxLongVarcharSize=65536, however, my TEXT column is truncated at 255 bytes. What must I do to get ADO to recognize the TEXT field as a LongVarchar?

--
.Bill.
Re: [GENERAL] ADO/ODBC returns TEXT fields as Varchar, not LongVarChar
Further testing has revealed that adding UnknownAsLongVarchar=1 to the connection string causes ADO to see TEXT fields as LongVarchar, but it also causes all VARCHAR fields of any size to be seen as LongVarChar. Why is ADO or the ODBC driver seeing all Varchar and Text fields as type unknown?

--
.Bill.
[GENERAL] Text field truncated using ADO/ODBC?
I am new to PostgreSQL and running 8.3 on Windows. I am connecting using ADO and the ODBC driver. I have a table that contains a Varchar(80) column and a Text column as well as a couple of integer and datetime columns. My connection string contains TextAsLongVarchar=1 and MaxLongVarcharSize=65536; however, my Text column is being truncated at 254 bytes (the value of MaxVarcharSize).

Adding UnknownAsLongVarchar=1 to the connection string causes ADO to see the Text column as LongVarchar and return its complete value, but it also causes the Varchar column to be seen as LongVarChar. This suggests that the driver sees both Text and Varchar as unknown types. What must I do to get the ODBC driver to see the Text column as LongVarchar and the Varchar column as Varchar? The complete connection string follows my signature.

Thanks.

Bill

Provider=MSDASQL.1;Extended Properties="DRIVER={PostgreSQL ANSI};DATABASE=app;SERVER=localhost;PORT=5432;UID=postgres;PWD=*;SSLmode=disable;ReadOnly=0;Protocol=7.4-1;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;ShowSystemTables=0;ConnSettings=;Fetch=100;Socket=4096;UnknownSizes=0;MaxVarcharSize=2048;MaxLongVarcharSize=65536;Debug=0;CommLog=0;Optimizer=1;Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=0;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=0;ExtraSysTablePrefixes=dd_;;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0;BI=0;ByteaAsLongVarBinary=0;UseServerSidePrepare=0;LowerCaseIdentifier=0;XaOpt=1"
[GENERAL] Trigger function is not called
PostgreSQL 8.3 on Windows. I have the table below which has a before insert trigger. The CREATE TRIGGER statement and the trigger function are also shown below. When I insert a row into this table using pgAdmin III and the INSERT statement

insert into note.category (category_id, category) values(689, 'Ztest');

the before insert trigger function is not called. The notice is not displayed, no value is assigned to the version or uc_category columns, and the insert fails with a violation of the not null constraint on the version field. I have created a simple two-column test table with a before insert trigger and it works perfectly. I am new to PostgreSQL so I suspect I am missing something simple, but I cannot figure out what. Why is the trigger function never called?

Thanks,
Bill

CREATE TABLE note.category
(
  category_id note.d_id NOT NULL,
  category note.d_category NOT NULL,
  uc_category note.d_category,
  parent_category_id note.d_id_fk,
  "version" note.d_id,
  category_checked boolean NOT NULL DEFAULT false,
  CONSTRAINT category_primary_key PRIMARY KEY (category_id)
)
WITH (OIDS=FALSE);
ALTER TABLE note.category OWNER TO postgres;

CREATE TRIGGER category_bi_trigger
  BEFORE INSERT ON note.category
  FOR EACH ROW
  EXECUTE PROCEDURE note.category_bi();

CREATE OR REPLACE FUNCTION note.category_bi() RETURNS trigger AS
$BODY$
begin
  RAISE NOTICE '*CATEGORY BEFORE INSERT*';
  NEW.VERSION := nextval('note.version_seq');
  NEW.UC_CATEGORY := UPPER(NEW.CATEGORY);
  RETURN NEW;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
Re: [GENERAL] Trigger function is not called
Tom Lane wrote:
> Bill writes:
>> PostgreSQL 8.3 on Windows. I have the table below which has a before
>> insert trigger. The CREATE TRIGGER statement and the trigger function
>> are also shown below.
>
> The script you show attempts to create the trigger before creating the
> function, which of course isn't going to work. Did you check whether
> the trigger actually got created?
>
> regards, tom lane

The trigger was definitely created. The code I posted was not a script that I used to create the trigger and trigger function. I just copied the SQL from pgAdmin and pasted the commands into my message not paying any attention to the order. Sorry for the confusion.

In a newsgroup posting someone suggested that constraint checks on domains occur before the before insert trigger. That seems difficult to believe based on my experience with other databases. Do constraint checks on domains occur before the before insert trigger?

Bill
Re: [GENERAL] Trigger function is not called
Tom Lane wrote:
> Bill writes:
>> In a newsgroup posting someone suggested that constraint checks on
>> domains occur before the before insert trigger.
>
> Yeah, that is the case, but if a domain check was failing then the row
> wouldn't get inserted, so I'm not clear on how this matches up with
> your report.
>
> regards, tom lane

The row is not getting inserted. I just created a test table and trigger and confirmed that the trigger fires if the column is defined as bigint not null and fails after I change the type to the domain. I will alter all of the tables and get rid of the domain. Is it possible to create a type and use that instead of the domain or will I have the same problem with a type?

Bill
Re: [GENERAL] Trigger function is not called
Tom Lane wrote:
> You'd have the same problem. By the time the trigger sees it, the row
> has already been converted to the table's column datatype(s), so any
> exception associated with a datatype or domain would be thrown already.
> A lot of people seem to have trouble with this concept; I dunno what
> data representation they think the trigger is working on... If you
> want to enforce constraints for a table in the trigger, you can do
> that, but it's not going to work to try to mix and match trigger-based
> and datatype-based restrictions.
>
> regards, tom lane

I have no problem with the concept now that I understand it. It is just different than InterBase and Firebird, which I have done a lot of work with lately. Thanks very much for your help.

Bill
Re: [GENERAL] Trigger function is not called
Tom Lane wrote:
> Bill writes:
>> Is it possible to create a type and use that instead of the domain or
>> will I have the same problem with a type?
>
> You'd have the same problem. By the time the trigger sees it, the row
> has already been converted to the table's column datatype(s), so any
> exception associated with a datatype or domain would be thrown already.
> A lot of people seem to have trouble with this concept; I dunno what
> data representation they think the trigger is working on... If you
> want to enforce constraints for a table in the trigger, you can do
> that, but it's not going to work to try to mix and match trigger-based
> and datatype-based restrictions.
>
> regards, tom lane

I removed the domain from the category_id and version columns leaving the following table, trigger function and trigger. The trigger function is still not called when I insert a new row. Any other ideas?

Bill

CREATE TABLE note.category
(
  category_id bigint NOT NULL,
  category character varying(40) NOT NULL,
  uc_category note.d_category,
  parent_category_id bigint,
  "version" bigint NOT NULL,
  category_checked boolean NOT NULL DEFAULT false,
  CONSTRAINT category_primary_key PRIMARY KEY (category_id)
)

CREATE OR REPLACE FUNCTION note.category_bi() RETURNS trigger AS
$BODY$
BEGIN
  RAISE NOTICE '**CATEGORY BI**';
  IF (NEW.CATEGORY IS NULL OR NEW.CATEGORY = '') THEN
    RAISE EXCEPTION 'Category cannot be blank.';
  END IF;
  IF (NEW.CATEGORY_ID IS NULL) THEN
    NEW.CATEGORY_ID := nextval('note.id_seq');
  END IF;
  NEW.VERSION := nextval('note.version_seq');
  NEW.UC_CATEGORY := UPPER(NEW.CATEGORY);
  RETURN NEW;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE

CREATE TRIGGER category_bi_trigger
  BEFORE UPDATE ON note.category
  FOR EACH ROW
  EXECUTE PROCEDURE note.category_bi();
Re: [GENERAL] Trigger function is not called
Tom Lane wrote:
> Bill writes:
>> I removed the domain from the category_id and version columns leaving
>> the following table, trigger function and trigger. The trigger
>> function is still not called when I insert a new row. Any other ideas?
>
> You're still expecting the trigger to get invoked before any
> constraints are enforced (the NOT NULLs being the problem here, I
> think). Again, you can enforce things through a trigger or through a
> table constraint, but mixing and matching won't work too well.
>
> regards, tom lane

The thing that has me confused is that the following table, trigger and trigger function work perfectly and the primary key for this table is also bigint not null. I added a bigint not null domain to this schema and changed the data type of the key to the domain and then I get the constraint violation. I changed the type of the key column back to bigint not null and the trigger fires and no error occurs.

Bill

CREATE TABLE test.trigger_test
(
  "key" bigint NOT NULL,
  data character varying(16),
  CONSTRAINT trigger_test_key PRIMARY KEY (key)
)

CREATE OR REPLACE FUNCTION test.trigger_test_before_insert() RETURNS trigger AS
$BODY$
begin
  raise notice '*Test before insert*';
  new."key" := nextval('test.id_seq');
  return new;
end;
$BODY$
LANGUAGE 'plpgsql' VOLATILE

CREATE TRIGGER trigger_test_insert
  BEFORE INSERT ON test.trigger_test
  FOR EACH ROW
  EXECUTE PROCEDURE test.trigger_test_before_insert();
Re: [GENERAL] Trigger function is not called
Tom Lane wrote:
> Bill writes:
>> The thing that has me confused is that the following table, trigger
>> and trigger function work perfectly and the primary key for this
>> table is also bigint not null.
>
> Actually, after looking closer, I think the problem with your previous
> example is that you created an ON UPDATE trigger not an ON INSERT
> trigger. Table constraints are indeed enforced after before-triggers
> fire, as a quick look at the code proves. Sorry for the misinformation.
>
> regards, tom lane

I knew I was missing something really simple. I changed the trigger to before insert and everything works perfectly. Thanks again for your help. I learned a lot.

Bill
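For reference, a minimal sketch of the corrected trigger definition; the only change from the earlier example is BEFORE INSERT in place of BEFORE UPDATE, assuming the note.category_bi() function shown above:

CREATE TRIGGER category_bi_trigger
  BEFORE INSERT ON note.category    -- was BEFORE UPDATE, which is why it never fired on INSERT
  FOR EACH ROW
  EXECUTE PROCEDURE note.category_bi();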
[GENERAL] SQL optimization - WHERE SomeField STARTING WITH ...
The SQL database servers I have worked with cannot use an index for a SELECT of the form

SELECT * FROM ATABLE WHERE AFIELD LIKE ?

because there is no way to know the location of the wild card until the parameter value is known. InterBase and Firebird allow

SELECT * FROM ATABLE WHERE AFIELD STARTING WITH ?

which is equivalent to LIKE 'ABC%' and will use an index on AFIELD. Is there a similar syntax in PostgreSQL?

Bill
Re: [GENERAL] MySQL LAST_INSERT_ID() to Postgres
Masis, Alexander (US SSA) wrote:
> I was mapping C++ application code that works with mySQL to work with
> Postgres. There were a number of articles on line regarding the
> conversion from mySQL to Postgres like:
> http://en.wikibooks.org/wiki/Programming:Converting_MySQL_to_PostgreSQL
> http://groups.drupal.org/node/4680
> http://jmz.iki.fi/blog/programming/converting_mysql_database_to_postgresql
> http://www.raditha.com/blog/archives/000488.html
> However, I found the most difficult issue was related to MySQL's
> "SELECT LAST_INSERT_ID()" sql call. If your code did not use
> LAST_INSERT_ID(), then you don't have to read this post. In MySQL
> "LAST_INSERT_ID()" is MySQL syntax that returns the last auto_increment
> type ID of the row (record) inserted in a table. In other words, if
> your MySQL table had an auto_increment datatype for a field, that field
> will automatically advance whenever a new record (row) is inserted into
> that table. It is sometimes handy to know the value of the ID that has
> just been added to the table, so that that record (row) can be
> addressed/updated later. Well, in MySQL it's easy, you just do:
> "SELECT LAST_INSERT_ID();"
> In Postgres, however, it is not that simple. You have to know the name
> of the so-called "insert sequence". Postgres has a system function for
> that (SQL line below). In Postgres you will have to provide the table
> and column name ("auto_increment" type in MySQL or "serial or
> bigserial" in Postgres). Here is that SQL query that returns the last
> inserted ID:
> "SELECT CURRVAL( pg_get_serial_sequence('my_tbl_name','id_col_name'));"
> Alexander Masis.

That will work reliably in a multi-user environment if and only if currval() returns the last value for the current connection. I assume this is the case, but the description of currval() in the PostgreSQL documentation says "Return value most recently obtained with nextval for specified sequence". There is no mention that currval() returns the last value obtained by calling nextval() for the current connection. Can someone confirm that currval() returns the value for the connection from which it is called?

Bill
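currval() is in fact tracked per session: it reports the value most recently obtained by nextval() for that sequence in the current session, and it is an error to call it before nextval() has been called in that session. A minimal sketch, using the table and column names from the quoted example:

CREATE TABLE my_tbl_name (id_col_name serial PRIMARY KEY, val text);

INSERT INTO my_tbl_name (val) VALUES ('first row');

-- returns the id generated by the INSERT above in this session;
-- concurrent inserts in other sessions do not change this result
SELECT currval(pg_get_serial_sequence('my_tbl_name', 'id_col_name'));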
Re: [GENERAL] SQL optimization - WHERE SomeField STARTING WITH ...
Tino Wildenhain wrote:
> Hi Bill,
>
> Bill wrote:
>> The SQL database servers I have worked with cannot use an index for a
>> SELECT of the form SELECT * FROM ATABLE WHERE AFIELD LIKE ? because
>> there is no way to know the location of the wild card until the
>> parameter value is known. InterBase and Firebird allow SELECT * FROM
>> ATABLE WHERE AFIELD STARTING WITH ? which is equivalent to LIKE 'ABC%'
>> and will use an index on AFIELD. Is there a similar syntax in
>> PostgreSQL?
>
> Yes, its actually: LIKE 'ABC%' and it will use an index.
>
> Regards
> Tino

Are you saying that a parameterized query whose WHERE clause is AFIELD LIKE ? will use an index on AFIELD if the parameter value is 'ABC%'? I do not understand how that is possible since the optimizer does not know the value of the parameter at the time the SQL is parsed and optimized. When the parameter value is supplied it could just as easily be '%ABC', in which case an index cannot be used. This is based on the assumption that PostgreSQL, like other database servers, parses and optimizes a parameterized query once and then stores it in memory so it can be executed multiple times with different parameter values. The optimizer could only determine whether an index could be used if it optimized the query each time it was executed, after the parameter value was supplied.

Bill
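A sketch of the literal-prefix case, assuming the table and column names used in the thread (atable/afield). In a non-C locale a plain btree index is not usable for LIKE, so the index has to be created with the text_pattern_ops operator class; with a literal prefix the planner can then turn the LIKE into an index range scan. Whether a parameterized LIKE ? benefits depends on whether the plan is built with the parameter value known, so this is only illustrative:

CREATE INDEX atable_afield_pattern_idx ON atable (afield text_pattern_ops);

-- with a literal prefix the planner can use the index:
EXPLAIN SELECT * FROM atable WHERE afield LIKE 'ABC%';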
Re: [GENERAL] MySQL LAST_INSERT_ID() to Postgres
Steve Atkins wrote:
> On Aug 28, 2008, at 12:09 PM, Scott Marlowe wrote:
>> On Thu, Aug 28, 2008 at 10:14 AM, Masis, Alexander (US SSA) wrote:
>>> I was mapping C++ application code that works with mySQL to work with
>>> Postgres. There were a number of articles on line regarding the
>>> conversion from mySQL to Postgres like:
>>> SNIP
>>> Well, in MySQL it's easy you just do: "SELECT LAST_INSERT_ID();" In
>>> Postgres, however it is not that simple. You have to know the name of
>>> so called "insert sequence". Postgres has a system function for that
>>> (SQL line below). In Postgres you will have to provide the table and
>>> column name ("auto_increment" type in MySQL or "serial or bigserial"
>>> in Postgres). Here is that SQL query that returns the last inserted ID:
>>> "SELECT CURRVAL( pg_get_serial_sequence('my_tbl_name','id_col_name'));"
>>
>> That's the hard way. Starting with pgsql 8.2 you can do it much more
>> easily:
>>
>> create table tester (id serial primary key, info text);
>> insert into tester (info) values ('this is a text string') returning id;
>>
>> tada! All done, that insert will return the id for you.
>
> Or lastval() if you want something bug-compatible with MySQL.
>
> Cheers,
> Steve

I am new to PostgreSQL but it seems to me that lastval() will only work if the insert does not produce side effects that call nextval(). Consider the case where a row is inserted into a table that has an after insert trigger and the after insert trigger inserts a row into another table which has a serial primary key. In that case I assume that lastval() will return the value from the serial column in the second table.

Bill
Re: [GENERAL] MySQL LAST_INSERT_ID() to Postgres
Scott Marlowe wrote:
> On Thu, Aug 28, 2008 at 3:38 PM, Bill wrote:
>> I am new to PostgreSQL but it seems to me that lastval() will only
>> work if the insert does not produce side effects that call nextval().
>> Consider the case where a row is inserted into a table that has an
>> after insert trigger and the after insert trigger inserts a row into
>> another table which has a serial primary key. In that case I assume
>> that lastval() will return the value from the serial column in the
>> second table.
>
> No, setval, currval, and lastval all require as an argument a sequence
> name. So the real issue is you have to know the sequence name to use
> them. The problem with lastval is that it reports the last value that
> the sequence gave out whether it was to us or someone else. This makes
> it NOT SAFE for concurrent transactions, but more for maintenance
> work. I use returning almost exclusively now.

The PostgreSQL 8.3 help file clearly shows that lastval() does not take a sequence as a parameter, and its description is "Return the value most recently returned by nextval in the current session. This function is identical to currval, except that instead of taking the sequence name as an argument it fetches the value of the last sequence that nextval was used on in the current session. It is an error to call lastval if nextval has not yet been called in the current session." Is the help incorrect?

Bill
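The quoted documentation is right: lastval() takes no argument and is session-local, but it tracks whichever sequence nextval() touched last in the session, so a trigger that calls nextval() on another sequence changes what it returns. INSERT ... RETURNING sidesteps the question entirely because the value comes from the insert itself. A sketch using the tester table from Scott's example:

CREATE TABLE tester (id serial PRIMARY KEY, info text);

-- the returned id is the one generated for this row, regardless of any
-- other sequences that triggers on tester might advance
INSERT INTO tester (info) VALUES ('this is a text string') RETURNING id;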
[GENERAL] Clay Shirky observation regarding MySQL
Hello! Clay Shirky made a comment about MySQL that I thought the PostgreSQL community should be aware of: http://www.shirky.com/writings/situated_software.html It's the section (mostly toward the bottom) entitled, "The Nature of Programming, and the Curious Case of MySQL". The whole article is, as normal, interesting and thought-provoking. [Please, Shirky wrote this, not me; however, if he's thinking it, we should know.] B-)
[GENERAL] Scalability
I am new to PostgreSQL and just beginning to learn the product. I will probably be using it exclusively on Windows. I was surprised to learn that PostgreSQL creates a new process for each connection. Doesn't this severely limit its scalability by consuming resources rapidly on the server as the number of users increases?

--
.Bill.
Re: [GENERAL] Scalability
"Joshua D. Drake" wrote: > Bill wrote: > > I am new to PostgreSQL and just beginning to learn the product. I > > will probrobably be using it exclusively on Windows. > > > > I was surprised to learn that PostgreSQL creates a new process for > > each connection. Doesn't this severely limit its scalability by > > consuming resources rapidly on the server as the number of user > > increases? > > The Windows version is not anywhere near as scalable as the unix > versions. Depending on your hardware you will top out a Windows > installation about about 350-400 connections. You can get more out of > Windows by modifying the registry but I am unsure of how far it will > go. > > I have Linux installations that happily hum along with 2000-5000 > connections. > > So in answer to your question, in general -- no the process > methodology we use does not limit scalability and it makes our code > base much simpler that the equivalent threading model. > > Sincerely, > > Joshua D. Drake Thanks. How much memory does PostgreSQL consume with 2000 connections? Which Linux distribution do you prefer? -- .Bill. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[GENERAL] How much memory is required?
Is there any published information on the minimum or recommended amount of memory for PostgreSQL on Windows and/or Linux? I am looking at PostgreSQL for an embedded app and hardware cost is a consideration. I cannot find anything in the PostgreSQL 8.1.0 Documentation but I may not be looking in the right place. Thanks.

--
.Bill.
[GENERAL] Does PostgreSQL support multi-instancing?
Is it possible to have two different versions of PostgreSQL running on the same computer at the same time?

--
.Bill.
[GENERAL] Performance monitoring
Can someone point me to information about performance monitoring in the PostgreSQL documentation? I want to see what tools are available to diagnose performance problems. Thanks.

--
.Bill.
Re: [GENERAL] How much memory is required?
"Raymond O'Donnell" wrote: > On 8 Nov 2006 at 22:48, Bill wrote: > > > Is there any published information on the minimum or recommended > > amount of memory for PostgreSQL on Windows and/or Linux. I am > > looking > > There's some useful information here: > > http://www.powerpostgresql.com/PerfList > > HTH, > > --Ray. > > -- > > Raymond O'Donnell > Director of Music, Galway Cathedral, Galway, Ireland > [EMAIL PROTECTED] > -- > > > > ---(end of > broadcast)--- TIP 1: if posting/reading > through Usenet, please send an appropriatesubscribe-nomail > command to [EMAIL PROTECTED] so that yourmessage can > get through to the mailing list cleanly Thanks. Very useful. -- .Bill. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] Performance monitoring
Jeff Davis wrote:
> On Wed, 2006-11-08 at 23:37 +0000, Bill wrote:
> > Can someone point me to information about performance monitoring in
> > the PostgreSQL documentation? I want to see what tools are
> > available to diagnose performance problems. Thanks.
>
> http://www.postgresql.org/docs/8.1/static/monitoring.html
>
> Does that help?
>
> Regards,
> Jeff Davis

Thanks. I must have been blind to have missed that.

--
.Bill.
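For anyone else landing here, the monitoring chapter is mostly about the cumulative statistics views. A few illustrative queries; the views and columns below are standard, but the particular selection is just an example:

-- per-database activity: connections, commits, rollbacks
SELECT datname, numbackends, xact_commit, xact_rollback FROM pg_stat_database;

-- per-table activity: sequential vs. index scans, rows inserted/updated
SELECT relname, seq_scan, idx_scan, n_tup_ins, n_tup_upd FROM pg_stat_user_tables;

-- what each backend is doing right now
SELECT * FROM pg_stat_activity;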
[GENERAL] Can PostgreSQL notify a client that a trigger has fired?
Does PostgreSQL have a built-in mechanism I can use to conditionally notify a client application that a trigger has fired? What I want is something along the line of the following pseudo code in a trigger.

if <some condition> then raise client event

--
.Bill.
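The built-in mechanism for this is LISTEN/NOTIFY: a trigger function can issue NOTIFY, and any client that has executed LISTEN on that name is told about it when the transaction commits. A minimal sketch; the table, column, condition, and channel name are made up for illustration:

-- trigger function that signals listeners when a condition is met
CREATE OR REPLACE FUNCTION notify_on_urgent() RETURNS trigger AS $BODY$
BEGIN
    IF NEW.status = 'urgent' THEN   -- hypothetical column and condition
        NOTIFY urgent_rows;         -- delivered to listeners at commit
    END IF;
    RETURN NEW;
END;
$BODY$ LANGUAGE plpgsql;

CREATE TRIGGER notify_on_urgent_trigger
    AFTER INSERT OR UPDATE ON some_table   -- hypothetical table
    FOR EACH ROW EXECUTE PROCEDURE notify_on_urgent();

-- in the client's session:
LISTEN urgent_rows;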
Re: [GENERAL] Can PostgreSQL notify a client that a trigger has fired?
Jorge Godoy wrote:
> I have the impression that you're missing a lot of sections in the
> manual... How about some time to re-read it?

I don't know about you, but for me a 1500 page manual is at least two weeks of full time reading. I have read several sections of it, but I am trying to decide if PostgreSQL should be considered for a project and I don't have 80 hours to make the evaluation. As well indexed as the manual is, there are some topics I cannot find without knowing the PostgreSQL-specific terminology. That's one of the great things about newsgroups and mailing lists; you can get help with specific questions from experienced users. I appreciate all the answers I have received here. They have made it possible for me to do a much better job in the time available. My thanks to everyone.

--
.Bill.
Re: [GENERAL] shared_buffers formula
On Wed, 4 Mar 2015 14:05:09 +0400 Alexander Shutyaev wrote:

> Thanks for the answer. Now, given this info I've calculated that our
> postgresql should occupy approx. 30,53 GB while the server has 125 GB of
> RAM. However we often see in top that there is very little free memory and
> even swap is used. What could be the reason of postgres using so much
> memory?

Memory usage is much more dependent on the OS than on Postgres, more than you might realize. I don't see where you state the OS, but I'll assume it's Linux for now.

Linux's default NUMA policy seems to be tuned toward applications that don't use a lot of RAM. If your 128G server has 8 CPUs, then Linux will allow a single process to use 16G of RAM before deciding that it has to use swap for that process. This is one of the advantages I find with FreeBSD.

Read up on how NUMA works a bit, and do some research into how to tune the NUMA policies ... assuming, of course, that you _are_ using Linux. Or switch to FreeBSD where the default NUMA policy is more friendly to programs that use a lot of RAM.

> 2015-03-03 14:26 GMT+03:00 Andres Freund :
>
> > On 2015-03-03 15:06:54 +0400, Alexander Shutyaev wrote:
> > > Recently we've been having problems with swap on our postgresql server.
> > > It has 125GB of RAM. We've decided to calculate it's memory consumption.
> > > To do this we've used the formulas from the official docs [1].
> >
> > Note that I think those formulas have been removed from the docs for a
> > while now (9.2?).
> >
> > > However there is one parameter that seems strange - Shared disk
> > > buffers. According to the formula it occupies the following space:
> > >
> > > (block_size + 208) * shared_buffers
> > >
> > > Our values are
> > >
> > > block_size=8192
> > > shared_buffers=30GB
> > >
> > > The block_size has the default value and shared_buffers was calculated
> > > by pgtune. According to the formula the product will be around 252 000
> > > GB which doesn't make any sense.
> >
> > The problem with your calculation is that the shared_buffers referenced
> > in the formula is the number of buffers - whereas when you specify it
> > using a size unit (like MB, GB, ...) that amount of memory is divided by
> > the size of a page. So you're off by a factor of 8192.
> >
> > Greetings,
> >
> > Andres Freund
> >
> > --
> > Andres Freund http://www.2ndQuadrant.com/
> > PostgreSQL Development, 24x7 Support, Training & Services

--
Bill Moran
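To make Andres' "factor of 8192" point concrete, here is the corrected arithmetic for the values quoted above (shared_buffers = 30GB, block_size = 8192):

  number of buffers = 30GB / 8kB per buffer = 3,932,160
  (8192 + 208) bytes * 3,932,160 buffers = 33,030,144,000 bytes, roughly 31 GiB

so the shared buffer pool accounts for about 31 GiB of memory, not 252 000 GB.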
Re: [GENERAL] Weight BLOB objects in postgreSQL? How?
On Wed, 4 Mar 2015 12:36:36 -0300 María Griensu wrote:

> Thanks, I mean, how heavy it is in kB.

If you mean how much overhead is involved in storing the data, that's a bit complicated.

First off, how the data is stored depends on the size of it, and what other fields there are in the table and their size as well. This is because the TOAST code makes decisions on how to store things on the fly based on various rules. Read up: http://www.postgresql.org/docs/9.4/static/storage-toast.html

There are a number of factors. If the data is large enough for the TOAST code to kick in, the server may decide to compress the data; whether that actually helps depends on the nature of the data ... a zip file or png isn't going to get any smaller, for example. From there, if the data is large enough to trigger out-of-line storage, the data will be broken down into chunks and stored in a toast table. This increases the overhead because each row in the toast table will have its own overhead, and the number of rows required in the toast table depends on the size of the data, which in turn depends on how well the data compressed ...

So, your answer is: take _your_ data and store a bunch of it, then check the resultant size of the tables on disk vs. the actual size of the data. That's really the only way to know, since the actual efficiency of data storage depends a lot on the data itself.

--
Bill Moran
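A sketch of the "store a bunch and measure" suggestion; the table and column names (blob_test, payload) are made up, and pg_total_relation_size() includes the TOAST table and any indexes:

CREATE TABLE blob_test (id serial PRIMARY KEY, payload bytea);

-- load some representative data first, then compare the on-disk size
-- against the raw size of the stored values
SELECT pg_size_pretty(pg_total_relation_size('blob_test'))  AS on_disk_size,
       pg_size_pretty(sum(octet_length(payload))::bigint)   AS raw_data_size
FROM blob_test;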
Re: [GENERAL] #PERSONAL# Reg: date going as 01/01/0001
On Fri, 6 Mar 2015 10:04:38 +0530 Medhavi Mahansaria wrote:

> I need to enter the date in null column based on the results obtained.
>
> but my date gets inserted as 01/01/0001 in postgresql 9.3 when there is no
> value. I need to enter NULL in the column.

This isn't how PostgreSQL behaves. Something else is causing this behavior ... either the client library or triggers or something else configured on the table.

More detail would help us help you. Provide the CREATE TABLE statement that created the table, as well as details of what software you're using to execute the query, and the query itself.

--
Bill Moran
Re: [GENERAL] #PERSONAL# Reg: date going as 01/01/0001
On Fri, 6 Mar 2015 15:24:28 +0530 Medhavi Mahansaria wrote:

> Hi Bill,
>
> Here are the details of the table and the query i want to insert.
>
> aml_db=> \d+ check_date
>                         Table "public.check_date"
>   Column   |            Type             | Modifiers | Storage | Stats target | Description
> -----------+-----------------------------+-----------+---------+--------------+-------------
>  some_date | date                        |           | plain   |              |
>  sno       | integer                     |           | plain   |              |
>  new_date  | timestamp without time zone |           | plain   |              |
> Has OIDs: no
>
> aml_db=> select * from check_date;
>  some_date | sno | new_date
> -----------+-----+----------
> (0 rows)
>
> aml_db=> insert into check_date values
> (to_date('','mmddhh24miss'),1,to_date('','mmddhh24miss'));
> INSERT 0 1
> aml_db=> select * from check_date;
>    some_date   | sno |        new_date
> ---------------+-----+------------------------
>  01/01/0001 BC |   1 | 01/01/0001 00:00:00 BC
> (1 row)
>
> I need to enter NULL as my date. but since I am getting these variables
> into the bind variables as empty string that is this issue is happening.

You're not inserting null, so it's hard to expect null to be the result. Try:

INSERT INTO check_date VALUES (null, 1, null);

or even:

INSERT INTO check_date VALUES (to_date(null, 'mmddhh24miss'), 1, to_date(null, 'mmddhh24miss'));

both of which result in what you desire. null and the empty string are not the same thing.

--
Bill Moran
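If the application can only hand over bind variables that are sometimes empty strings, one way to get NULL without changing the application logic is to wrap each bind variable in NULLIF() so an empty string becomes NULL before to_date() sees it. A sketch; :some_date and :new_date are placeholder bind-variable names, and the format string is the one used above:

INSERT INTO check_date VALUES
    (to_date(NULLIF(:some_date, ''), 'mmddhh24miss'),
     1,
     to_date(NULLIF(:new_date, ''), 'mmddhh24miss'));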
Re: [GENERAL] Postgres and data warehouses
On Sun, 8 Mar 2015 11:20:16 -0400 Stephen Frost wrote: > * Nigel Gardiner (nigelgardi...@gmail.com) wrote: > > I've had a quick search and haven't seen this approach used yet, but I was > > thinking, the asynchronous replication of Postgres databases could be used > > as a streaming journal of changes to be processed by a data warehouse. The > > other approach that suggests itself is WAL file shipping. I've not dug into > > the async rep protocol yet, before I do so I just wanted to get some brief > > feedback on whether I'm on the wrong track or not, and if there's some > > better approach I should be looking at first instead. > > Using a streaming replica for data warehouse queries is quite common.. > The issue there is if you want to change the data structure or store > data on the replica because a streaming replica based on WAL shipping is > read-only. You can create FDW tables which you can write to (even on > the replica) or you can create an independent database which has FDW > tables to the replica. They have their own pros and cons, of course. Another option is to replicate using Slony instead of streaming, which allows you to create additional tables on the replica that are read/write in addition to triggers that only fire on the replica. It's complicated, but pretty damn powerful. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] VACUUM FULL doesn't reduce table size
On Mon, 9 Mar 2015 09:05:07 -0700 (MST) pinker wrote:

> > So there are no longer any dead rows being left behind, right?
> >
> > Why are we still discussing this? Do you have some other question?
>
> There are no dead rows, but postgres still cannot reuse the space because of
> 3043947 nonremovable row versions ..
>
> INFO: vacuuming "my_table"
> INFO: "my_table": found 0 removable, 3043947 nonremovable row versions in
> 37580 pages
> DETAIL: 0 dead row versions cannot be removed yet.
> CPU 2.67s/1.59u sec elapsed 7.71 sec.
> Query returned successfully with no result in 8319 ms.

Given your weird description of the "snapshot" I wouldn't be surprised if that instance of PostgreSQL had subtle corruption. pg_dump that database, re-init it and reload the dump. Then recreate the situation and see if the rows are still nonremovable.

I bet you $5.34 that everything works fine after that, which would indicate that the folks who made the snapshot didn't do it correctly.

--
Bill Moran
[GENERAL] PostgreSQL-related legal question
I've been asked to sign a legal document related to a PostgreSQL-related job opening. I have concerns about the document and that signing it could have a negative impact on the PostgreSQL project (in addition to personal concerns).

I'm guessing I'm not the first person to go through this. I'm hoping someone on this list can refer me to a lawyer who is familiar with the challenges of NDAs and open source projects.

I'm not asking for pro-bono, I'm willing to pay for services, but I just figured that I might get better results getting a referral than by contacting $random_legal_service.

--
Bill Moran
Re: [GENERAL] PostgreSQL-related legal question
On Wed, 11 Mar 2015 13:50:02 +0100 Dorian Hoxha wrote: > I don't see how it could have negative impact on the postgresql project? > It's not like your job will be to find vulnerabilities and not disclose > them ? I don't think I should discuss the particulars of the situation on the list. That's why I'm just looking for a lawyer who understands the situation and can advise me. > > On Wed, Mar 11, 2015 at 1:28 PM, Bill Moran > wrote: > > > > > I've been asked to sign a legal document related to a PostgreSQL- > > related job opening. I have concerns about the document and that > > signing it could have a negative impact on the PostgreSQL project > > (in addition to personal concerns). > > > > I'm guessing I'm not the first person to go through this. I'm > > hoping someone on this list can refer me to a lawyer who is > > familiar with the challenges of NDAs and open source projects. > > > > I'm not asking for pro-bono, I'm willing to pay for services, > > but I just figured that I might get better results getting a > > referral than by contacting $random_legal_service. > > > > -- > > Bill Moran > > > > > > -- > > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > > To make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-general > > -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Postgresql BDR(Bi-Directional Replication) Latency Monitoring
Hi All,

I'm looking at the BDR monitoring documentation at this location: https://wiki.postgresql.org/wiki/BDR_Monitoring

I understand the query in the documentation:

select slot_name, plugin, database, active, xmin, pg_get_transaction_committime(xmin) FROM pg_stat_logical_decoding;

should be:

select slot_name, plugin, database, active, xmin, pg_get_transaction_committime(xmin) FROM pg_replication_slots;

I am running PostgreSQL 9.4.0 bits on Centos:

postgresql-bdr94-contrib-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-libs-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-bdr-0.8.0beta1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-2ndquadrant-redhat-1.0-2.noarch
postgresql-bdr94-devel-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-server-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64
postgresql-bdr94-9.4.0_bdr1-1_2ndQuadrant.el6.x86_64

We have a 3-node setup. We would like to understand the latency between nodes. When using http://oltpbenchmark.com/ to create a load on one node, I would expect the 'xmin' value not to be null, but it is. What is the best way to monitor replication latency when using Postgresql BDR?

Regards,
Bill
Re: [GENERAL] Autovacuum query
On Thu, 26 Mar 2015 03:58:59 + Mitu Verma wrote:

> We have a customer complaining about the time taken by one of the application
> scripts while deleting older data from the log tables.
> During the deletion, customer reported that he often sees the below error and
> because of which table size doesn't reduce.
>
> ERROR: canceling autovacuum task
> Date: 2015-03-14 04:29:19
> Context: automatic analyze of table "fm_db_Server3.mmsuper.audittraillogentry"
>
> We have the following queries in this regard:
>
> - How often is the autovacuum task invoked by postgres

As needed. Read: http://www.postgresql.org/docs/9.4/static/routine-vacuuming.html

> - If the task gets cancelled (while we were deleting data from the
>   database) would this task be re-invoked at a later time?

Yes. Read the above.

> - If insertion of data into a table also impact this task?

No. Vacuum operations are not triggered by INSERTs.

> - If we can manually schedule this task to a particular time (like
>   off peak hours)?

Yes, but given the questions you're asking, you probably do not have a good enough understanding of the situation to schedule it correctly and will make the problem worse. You can run it manually any time you want, but I don't recommend that you disable autovacuum unless you have a good understanding of what you're doing.

Let me take a guess at the problem: the table gets LOTS of inserts, constantly, and somewhere there's a job that runs out of cron or some similar scheduler that DELETEs a lot of those rows in a big chunk. The DELETE process probably runs infrequently, like once a day or even once a week, because the designers thought it would be best to get everything taken care of all at once during some real or perceived slow period on the database.

One solution to this is to run the DELETE process more frequently, such as every 15 minutes. In such a case, the process will run much faster, make fewer changes, and require less work on the part of autovacuum to clean up after. People frequently complain that "this will impact performance if run during normal use hours," but in every case I've seen, nobody had actually tested to see if that statement was true, and running smaller purges more frequently actually solved the problem.

Another option would be to manually run vacuum after the big DELETE runs. See http://www.postgresql.org/docs/9.4/static/sql-vacuum.html

Don't fall into the trap of running VACUUM FULL. This is usually a bad idea. If the client is complaining about reclaiming disk space, start asking some hard questions: How much space is too much? Why are you convinced that the space is wasted? Usually the correct answer is to add more disk space, since Postgres tends to fall into a groove with a particular table whereby the "unused" space is actually being used and reclaimed by data tuples as the data in the table changes. It's not unusual for the table to be 2x the size of the actual data on a heavily updated table.

--
Bill Moran
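A sketch of the "smaller, more frequent purge plus manual vacuum" idea. The table name comes from the error message above; the timestamp column and retention period are made up for illustration:

-- run every 15 minutes (e.g. from cron) instead of one huge weekly purge;
-- each run then only removes a small slice of rows
DELETE FROM mmsuper.audittraillogentry
WHERE logged_at < now() - interval '90 days';  -- hypothetical column and retention period

-- optionally vacuum afterwards so the freed space can be reused
VACUUM mmsuper.audittraillogentry;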
Re: [GENERAL] Column does not exists?
On Thu, 26 Mar 2015 14:12:36 -0300 "Leonardo M. Ramé" wrote: > Ok, I have this table: > > CREATE TABLE sessions > ( >"SESSIONID" integer NOT NULL, >"SESSIONTIMESTAMP" character varying(45) NOT NULL, >"SESSIONDATA" character varying(200) DEFAULT NULL::character varying, >CONSTRAINT sessions_pkey PRIMARY KEY ("SESSIONID") > ) > > Now, when I do: > > DELETE From sessions WHERE SESSIONTIMESTAMP < '2010-01-01 10:02:02' > > I get: > > ERROR: column "sessiontimestamp" does not exist > LINE 1: DELETE From sessions WHERE SESSIONTIMESTAMP < '2010-01-01 10... > ^ > ** Error ** > > ERROR: column "sessiontimestamp" does not exist > SQL state: 42703 > Character: 28 > > But if I do: > > DELETE From sessions WHERE "SESSIONTIMESTAMP" < '2010-01-01 10:02:02' > > It DOES work. > > Why the db doesn't recognize the name of the table without quotes?. See: http://www.postgresql.org/docs/9.4/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Would like to know how analyze works technically
On Wed, 1 Apr 2015 04:33:07 -0700 (MST) TonyS wrote: > On Wed, April 1, 2015 12:18 am, Tom Lane-2 [via PostgreSQL] wrote: > > > > TonyS writes: > > > >> Running "analyze verbose;" and watching top, the system starts out > >> using no swap data and about 4GB of cached memory and about 1GB of used > >> memory. As it runs, the amount of used RAM climbs, and eventually the > >> used swap memory increases to 100% and after being at that level for a > >> couple of minutes, the analyze function crashes and indicates "server > >> closed the connection unexpectedly." > > > > ANALYZE is supposed to work in a constrained amount of memory, though > > that amount depends quite a bit on what the data is and what you've got the > > statistics target set to. > > > > We've heard reports that there are memory leaks in (some versions of?) > > PostGIS's analyze support. Whether that would apply to you would depend > > on whether you're using PostGIS. > > > > Hard to say more without a lot more concrete info about what your > > data is, what PG version you're using, etc. > > > > regards, tom lane > > > > Thanks for the response Tom. > > I am not using PostGIS. The data in my system is mostly along the lines of > what you would see in an accounts payable, accounts receivable, and > billing type situation. Names and addresses of individuals, information > about billing, payments received, payments sent etc. > > All of my indexes are b-tree indexes. > > Currently, the largest individual table is 1.8GB. > > select version() returns: > PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu > 4.8.2-19ubuntu1) 4.8.2, 64-bit > > OS: Ubuntu 14.04.1 LTS > > Physical memory: 8GB > > The postgresql log has these entries at the crash point: > 2015-04-01 06:24:37 EDT LOG: server process (PID 1384) was terminated by > signal 9: Killed > 2015-04-01 06:24:38 EDT DETAIL: Failed process was running: analyze verbose; > 2015-04-01 06:24:38 EDT LOG: terminating any other active server processes > > I started this process at 11PM, so it ran for about 7.5 hours before > crashing. > > Is there anything else that would be helpful? Don't know if I'm on the right track with this, but what is maintenance_work_mem set to on this system? -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Would like to know how analyze works technically
On Wed, 1 Apr 2015 06:26:36 -0700 (MST) TonyS wrote: > On Wed, April 1, 2015 8:48 am, Bill Moran [via PostgreSQL] wrote: > > > > >>>> Running "analyze verbose;" and watching top, the system starts out > >>>> using no swap data and about 4GB of cached memory and about 1GB of > >>>> used memory. As it runs, the amount of used RAM climbs, and > >>>> eventually the used swap memory increases to 100% and after being at > >>>> that level for a couple of minutes, the analyze function crashes and > >>>> indicates "server closed the connection unexpectedly." > >>> > >>> ANALYZE is supposed to work in a constrained amount of memory, though > >>> that amount depends quite a bit on what the data is and what you've > >>> got the statistics target set to. > >>> > >>> We've heard reports that there are memory leaks in (some versions > >>> of?) PostGIS's analyze support. Whether that would apply to you would > >>> depend on whether you're using PostGIS. > >>> > >>> Hard to say more without a lot more concrete info about what your > >>> data is, what PG version you're using, etc. > > > > Don't know if I'm on the right track with this, but what is > > maintenance_work_mem set to on this system? > > > > Hello Bill, > > maintenance_work_mem is set to 480MB. I haven't changed that from what > pgtune suggested. Doesn't seem unreasonable, so my guess isn't right. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Would like to know how analyze works technically
On Wed, 1 Apr 2015 10:47:32 -0700 (MST) TonyS wrote: > On Wed, April 1, 2015 12:30 pm, Igor Neyman [via PostgreSQL] wrote: > >> > >> TonyS <[hidden email]> > >> wrote: > >> > >> > > > > name,current_setting,source autovacuum,off,configuration file > > synchronous_commit,off,configuration file TimeZone,localtime,configuration > > file unix_socket_directories,/var/run/postgresql,configuration file > > wal_buffers,8MB,configuration file work_mem,1536MB,configuration file > > > > > > --- > > > > > > ?work_mem,1536MB,configuration file > > > > > > > > IIRC, your RAM is 8GB. Your work_mem is too high. Actual memory used > > for sorting, etc... could be multiples of work_mem setting. > > > > That could be the reason for your memory problems. I'd suggest to set it > > to 16MB, and see if you can avoid "on disk" sorting. If not - gradually > > increase work_mem. > > > > Regards, > > > > > > Igor Neyman > > > > > Thanks Igor, > > I will try changing that. I pretty much just let pgtune set all of those > values for me. If pgtune set 1.5G of work_mem, then someone should file a bug report. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Serializable transaction restart/re-execute
On Fri, 3 Apr 2015 15:35:14 +0100 Filipe Pina wrote:

> Hello,
>
> I come from a GTM background and one of the transactional features there are
> the "Transaction Restarts".
>
> Transaction restart is when we have two concurrent processes reading/writing
> to the same region/table of the database, the last process to commit will
> "see" that the database is not the same as it was when the transaction
> started and goes back to the beginning of the transactional code and
> re-executes it.
>
> The closest I found to this in PGSQL is the Serializable transaction
> isolation mode and it does seem to work well except it simply throws an error
> (serialization_failure) instead of restarting.
>
> I'm trying to make use of this exception to implement restartable functions
> and I have all the examples and conditions mentioned here in a question in SO
> (without any answer so far...):
>
> http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure
>
> So basically I have two questions:
> - the restartable "wrapper" function never gets its "DB view" refreshed once
>   it restarts, I assume it's because of the outer transaction (at function
>   level) so it never re-reads the new values and keeps failing with
>   serialization_failure.. Any way to solve this?
> - the ideal would be to be able to define this at database level so I
>   wouldn't have to implement wrappers for all functions.. Implementing a
>   "serialization_failure" generic handler that would simply re-call the
>   function that threw that exception (up to a number of tries). Is this
>   possible without going into pgsql source code?

I suspect that savepoints will accomplish what you want:
http://www.postgresql.org/docs/9.4/static/sql-savepoint.html

--
Bill Moran
Re: [GENERAL] Serializable transaction restart/re-execute
On Mon, 6 Apr 2015 10:41:25 +0100 Filipe Pina wrote:

> Hi Bill, thanks for the quick reply.
>
> I had read about SAVEPOINTs but I also read I cannot use them in PLPGSQL and
> should use BEGIN/END blocks and EXCEPTIONs.
>
> Did you check the URL I mentioned?

Yes, I did:
http://stackoverflow.com/questions/29372202/postgresql-generic-handler-for-serialization-failure

...

> But it doesn't work.. Every iteration fails with serialization_failure
> probably because the outer transaction is not rolled back and I'm not sure
> how to write this in a way I can roll it back and still have control of the
> LOOP..

Probably one of your issues is that there is no such thing as an "outer" transaction. There's just a transaction. There is no nesting of transactions, so the belief that there is an outer transaction that can somehow be manipulated independently of some other transaction is leading you to try things that will never work. I wasn't aware that SAVEPOINTs didn't work in pl/pgsql, thanks for educating me on that point.

> I find it hard to believe that PGSQL has this amazing "serializable"
> isolation method but not a standard way to take advantage of it to
> automatically "restart" the failed transactions...

I've been over this ground before. You're thinking in such a micro case that you haven't realized the inherent difficulty of restarting large transactions with lots of data modification. An RDBMS may have many tables updated within a transaction, and transactions may do data processing completely outside of the database, which means the only way to ensure consistency is to notify the controlling process of the problem so it can decide how best to respond.

So ... I dug into your problem a little more, and I think the problem is that you're trying too hard to replicate GTM design paradigms instead of learning the way that PostgreSQL is designed to work. If I were creating the functions you describe, I would ditch the second one and simply have this:

CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
  UPDATE account SET balance = balance + 10 WHERE id = 1 RETURNING balance;
$$ LANGUAGE SQL;

Of course, it's unlikely that you'll ever want to wrap such a simple query in a function, so I'm supposing that you'd want to do something else with the old value of balance before updating it, in which case:

CREATE OR REPLACE FUNCTION mytest() RETURNS integer AS $$
DECLARE
  cc integer;
BEGIN
  SELECT INTO cc balance FROM account WHERE id = 1 FOR UPDATE;

  RAISE NOTICE 'Balance: %', cc;
  PERFORM pg_sleep(3);

  UPDATE account SET balance = cc + 10 WHERE id = 1 RETURNING balance INTO cc;

  RETURN cc;
END
$$ LANGUAGE plpgsql;

The FOR UPDATE ensures that no other process can modify the row while this one is sleeping. Now, I understand that you don't want to do row locking, but this is (again) an insistence on your part on trying to force PostgreSQL to do things the way GTM did instead of understanding the RDBMS way of doing things. Unlearn.

Keep in mind that mytest() might be called as part of a much larger transaction that does many other things, and you can't simply roll that back and restart it within mytest() since mytest() doesn't know everything else that happened. In your case, you're trying to look at mytest() as something that will always be used in a specific way where the aforementioned problem won't be encountered, but you can not guarantee that, and it doesn't hold true for all functions. In general, it's inappropriate for a function to be able to manipulate a transaction beyond aborting it.
And the abort has to bubble up so that other statements involved in the transaction are also notified.

--
Bill Moran
Re: [GENERAL] Regarding bytea column in Posgresql
On Thu, 9 Apr 2015 11:03:30 + "Deole, Pushkar (Pushkar)" wrote:

> I have been assigned to a product that uses Postgresql 9.3 as backend
> database. I am new to postgresql.
> The product provides chat functionality between the users and the completed
> chats are stored in the database table in a 'bytea' column in the form of
> xml. When I query the data from this column I see an xml file with text data.
> I have a couple of queries:
>
> 1. Is a 'bytea' column intended for storing text data?

No, it's intended for storing binary data.

> 2. Typically a chat can have text data with several special characters
>    (which can be represented in multi bytes), how can these characters be
>    stored in a bytea column and retrieved back properly?

bytea won't help you here. You'll have to manage the special characters entirely in your code. bytea gives you back the exact same bytes you put in, with no changes or interpretation.

A better choice would be to use a text field with a proper text encoding (such as utf-8). Probably an even better choice would be to use the XML datatype in PostgreSQL, since you say that you're storing XML anyway.

The place where people tend to get tripped up with TEXT and XML datatypes is that they're strict. If you try to store text in a TEXT data type that isn't valid (i.e., multi-byte characters that aren't correct) you'll get an error and the data won't be accepted. The same thing happens if you try to store invalid XML in an XML field (such as XML without proper closing tags, etc). It seems that this strictness causes a lot of people to avoid those data types, as there seem to be a lot of people who would rather have garbage data in their database than actually go to the work of fixing their application.

--
Bill Moran
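A sketch of the xml-column suggestion, assuming a UTF-8 database and a server built with XML support (the xml type requires that); the table and column names are made up. Casting the value to xml makes the server check that it is well-formed, and multi-byte characters are handled by the database encoding rather than by application code:

CREATE TABLE chat_transcript (
    id   serial PRIMARY KEY,
    chat xml NOT NULL           -- stored as XML, validated for well-formedness
);

INSERT INTO chat_transcript (chat)
VALUES ('<chat><msg from="alice">¡hola!</msg></chat>'::xml);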
Re: [GENERAL] plpgsql functions organisation
On Sat, 2 May 2015 14:07:31 -0700 Jeff Janes wrote: > On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver > wrote: > > > On 05/02/2015 10:12 AM, Melvin Davidson wrote: > > > >> AFAIK, you cannot "package" functions in PostgreSQL, but it is possible > >> to > >> call a function from within a function. > >> > >> That being said, I would seriously look at how and why you are writing > >> your functions > >> as functions that call other functions are not very efficient. > >> > > > > I am not following. That is what packaging is about, separating out 'units > > of work' so they can be combined as needed. Part of that is using existing > > functions in new functions/classes. In fact in the Postgres source I see > > this in many places. Now it is entirely possible I missed a memo, so I am > > open to a more detailed explanation of the inefficiencies involved. > > > > The Postgres source is written in C, not in plpgsql. C has a good > optimizing compiler and plpgsql doesn't. Maybe that's a roundabout way of saying that if your functions are complex enough to require calling "sub-functions", they might justify being written in C? -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] plpgsql functions organisation
On Sat, 02 May 2015 15:06:24 -0700 Adrian Klaver wrote: > On 05/02/2015 02:07 PM, Jeff Janes wrote: > > On Sat, May 2, 2015 at 1:05 PM, Adrian Klaver > <mailto:adrian.kla...@aklaver.com>> wrote: > > > > On 05/02/2015 10:12 AM, Melvin Davidson wrote: > > > > AFAIK, you cannot "package" functions in PostgreSQL, but it is > > possible to > > call a function from within a function. > > > > That being said, I would seriously look at how and why you are > > writing > > your functions > > as functions that call other functions are not very efficient. > > > > > > I am not following. That is what packaging is about, separating out > > 'units of work' so they can be combined as needed. Part of that is > > using existing functions in new functions/classes. In fact in the > > Postgres source I see this in many places. Now it is entirely > > possible I missed a memo, so I am open to a more detailed > > explanation of the inefficiencies involved. > > > > > > The Postgres source is written in C, not in plpgsql. C has a good > > optimizing compiler and plpgsql doesn't. > > Does this actually matter? I am a biologist that backed into computing, > so I realize I am weak on the fundamentals. Still the scientist in me > wants data backing assertions. As I understand it plpgsql works close to > the server and is optimized to do so. I know writing in C would be a > better solution. Still is calling plpgsql functions inside plpgsql > really a bad thing when just considering plpgsql? The answer to that is the same answer to so many other things: it depends. plpgsql functions are slower than C. They also lack a lot of language features that C has. That being said, if they're meeting your needs, then don't worry about it. plpgsql is around because for most people, it works well enough. There are certainly cases when you want to create very complex logic in the database and plpgsql is liable to make that difficult. But there are a lot of cases where having to manage pointers and a build environment and all the things that go with C aren't justified, because plpgsql has none of that complexity. There are advantages both ways. The beauty of PostgreSQL is that you have both available and you can choose whichever is best for your situation. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
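For concreteness, a minimal sketch of the kind of plpgsql-calling-plpgsql composition being debated here (function names and logic are hypothetical):

CREATE OR REPLACE FUNCTION discount(gross numeric) RETURNS numeric AS $$
BEGIN
    RETURN gross * 0.10;
END
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION net_price(gross numeric) RETURNS numeric AS $$
BEGIN
    -- one plpgsql function calling another: the pattern whose efficiency
    -- is being discussed above
    RETURN gross - discount(gross);
END
$$ LANGUAGE plpgsql;

SELECT net_price(100);   -- returns 90.00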
Re: [GENERAL] Postgresql and GlassFish - cannot commit when autoCommit is enabled
On Thu, 07 May 2015 10:07:44 +0300 ? wrote: > > I have postgresql 9.4 and glassfish 4.1. Besides I use MyBatis inside EJB. > Now I try to make select from table and this is what I get: > javax.resource.spi.LocalTransactionException:Cannot commit when autoCommit is > enabled. > at > com.sun.gjc.spi.LocalTransactionImpl.commit(LocalTransactionImpl.java:112) > at > com.sun.enterprise.resource.ConnectorXAResource.commit(ConnectorXAResource.java:124) > at > com.sun.enterprise.transaction.JavaEETransactionImpl.commit(JavaEETransactionImpl.java:518) > at > com.sun.enterprise.transaction.JavaEETransactionManagerSimplified.commit(JavaEETransactionManagerSimplified.java:854) > at > com.sun.ejb.containers.EJBContainerTransactionManager.completeNewTx(EJBContainerTransactionManager.java:719) > at > com.sun.ejb.containers.EJBContainerTransactionManager.postInvokeTx(EJBContainerTransactionManager.java:503) > at > com.sun.ejb.containers.BaseContainer.postInvokeTx(BaseContainer.java:4566) > at > com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2074) > at > com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2044) > at > com.sun.ejb.containers.EJBObjectInvocationHandler.invoke(EJBObjectInvocationHandler.java:212) > at > com.sun.ejb.containers.EJBObjectInvocationHandlerDelegate.invoke(EJBObjectInvocationHandlerDelegate.java:79) > at com.sun.proxy.$Proxy312.getLsist(UnknownSource) > at sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethod) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:483) > at > com.sun.corba.ee.impl.presentation.rmi.ReflectiveTie.dispatchToMethod(ReflectiveTie.java:143) > at > com.sun.corba.ee.impl.presentation.rmi.ReflectiveTie._invoke(ReflectiveTie.java:173) > at > com.sun.corba.ee.impl.protocol.ServerRequestDispatcherImpl.dispatchToServant(ServerRequestDispatcherImpl.java:528) > at > com.sun.corba.ee.impl.protocol.ServerRequestDispatcherImpl.dispatch(ServerRequestDispatcherImpl.java:199) > at > com.sun.corba.ee.impl.protocol.MessageMediatorImpl.handleRequestRequest(MessageMediatorImpl.java:1549) > at > com.sun.corba.ee.impl.protocol.MessageMediatorImpl.handleRequest(MessageMediatorImpl.java:1425) > at > com.sun.corba.ee.impl.protocol.MessageMediatorImpl.handleInput(MessageMediatorImpl.java:930) > at > com.sun.corba.ee.impl.protocol.giopmsgheaders.RequestMessage_1_2.callback(RequestMessage_1_2.java:213) > at > com.sun.corba.ee.impl.protocol.MessageMediatorImpl.handleRequest(MessageMediatorImpl.java:694) > at > com.sun.corba.ee.impl.protocol.MessageMediatorImpl.dispatch(MessageMediatorImpl.java:496) > at > com.sun.corba.ee.impl.transport.ConnectionImpl$1.dispatch(ConnectionImpl.java:195) > at > com.sun.corba.ee.impl.transport.ConnectionImpl.read(ConnectionImpl.java:328) > at > com.sun.corba.ee.impl.transport.ReaderThreadImpl.doWork(ReaderThreadImpl.java:112) > at > com.sun.corba.ee.impl.threadpool.ThreadPoolImpl$WorkerThread.performWork(ThreadPoolImpl.java:497) > at > com.sun.corba.ee.impl.threadpool.ThreadPoolImpl$WorkerThread.run(ThreadPoolImpl.java:540)Causedby: > org.postgresql.util.PSQLException:Cannot commit when autoCommit is enabled. 
> at > org.postgresql.jdbc2.AbstractJdbc2Connection.commit(AbstractJdbc2Connection.java:811) > at > com.sun.gjc.spi.LocalTransactionImpl.commit(LocalTransactionImpl.java:106)...30 > more > I even tried the following code: > SqlSession session > =ConnectionFactory.getSession().openSession(false);//AUTOCOMMITList list=null; > try{ > session.getConnection().setAutoCommit(false); > TempMapper mapper =(TempMapper)session.getMapper(TempMapper.class); > list=mapper.readAll(); > }catch(SQLException > ex){Logger.getLogger(TempBean.class.getName()).log(Level.SEVERE,null, ex);} > finally{ > session.close();} > However the result is the same. When I used gf+mysql+mybatis I changed > relaxautocommit but for postgresql driver there is no such attribute. How to > solve it? Best guess, based on experience, is that somehow you're disabling autocommit on a different connection from the one where you're doing the commit. I've seen this sort of thing happen with connection poolers where the developers don't fully understand how the connection pool works. You could confirm this by turning on full query logging in Postgres and see which connection does what. If it turns out to be the case, then you'll have
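As a rough sketch of the logging that makes this visible, the following postgresql.conf settings (followed by a reload) tag every statement with its session id, so the autocommit change and the later commit can be matched to the same, or different, connections:

log_statement = 'all'
log_line_prefix = '%m [%p] %c %u@%d '   # %p = backend pid, %c = session id
log_connections = on
log_disconnections = on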
[GENERAL] RPM building tools and info missing?
I might need to roll a custom PostgreSQL server RPM for my current job. Looking here: https://wiki.postgresql.org/wiki/RPM_Packaging The link to the specfiles and other data at http://svn.pgrpms.org/repo/ gives a 404. I found a few other pieces on the internet suggesting the same URL, and can't find any information on why that site is down or where it might have gone. Is this a temporary outage? Or has the RPM data moved somewhere else and isn't documented yet? Any help is appreciated. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Allowing postgresql to accept 0xff syntax for data types that it makes sense for?
I'm working on a project converting a bunch of code from another database system to PostgreSQL. One of the issues is that the previous system accepted integers and binary data in the same hex format as C ... i.e. 0xff. I understand that the proper way to handle this in postgres is x'ff', but the problem is that a large number of #define values use the 0xff syntax and these macros are used both in C code and in SQL. The simple fact is that a LOT of code does this, and correcting it all and ensuring that the modified code is correct is a BIG job. Just to clarify some of the complexity: there are about 80,000 lines of PL/PGSQL code that contain these macros and are run through the C preprocessor to substitute actual values for them before being loaded into Postgres. Obviously, there are many options for fixing this. One of those options is modifying PostgreSQL to accept the 0xff syntax ... and evaluating that option is the reason for my post. So, one of my questions is: does anyone have an existing simple answer on how to fix this? My other question: is there a specific reason why PostgreSQL doesn't support this syntax, aside from "nobody has bothered to add such support"? Because I'm considering writing a patch to Postgres and submitting it, but I'm not going to go down that path if there's a specific reason why supporting this syntax would be _bad_. Personally, I feel like it would be a good thing, as it seems like a lot of other database systems support it, and even though it's not ANSI, it's pretty much the de-facto standard. -- Bill -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
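For reference, a quick sketch of what stock PostgreSQL already accepts in place of the 0xff form (the bytea hex input format assumes 9.0 or later):

SELECT x'ff'::int;      -- bit-string literal cast to integer; yields 255
SELECT '\xff'::bytea;   -- hex input format for a bytea value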
Re: [GENERAL] Allowing postgresql to accept 0xff syntax for data types that it makes sense for?
On Thu, 21 May 2015 13:57:24 -0400 Tom Lane wrote: > Bill Moran writes: > > My other question: is there a specific reason why PostgreSQL doesn't support > > this syntax, aside from "nobody has bothered to add such support"? Because > > I'm considering writing a patch to Postgres and submitting it, but I'm not > > going to go down that path if there's a specific reason why supporting this > > syntax would be _bad_. Personally, I feel like it would be a good thing, as > > it seems like a lot of other database systems support it, and even though > > it's not ANSI, it's pretty much the de-facto standard. > > How many is "a lot", and do any of the responsible vendors sit on the SQL > standards committee? Well, I've personally worked with (in addition to PostgreSQL) Microsoft SQL Server, MySQL, and Sybase -- PostgreSQL is the only one of those 4 that doesn't support the 0xff syntax. I did a little research and it appears that neither Oracle nor db2 supports the 0xff syntax ... so not _quite_ as common as it seemed to me. > One large concern about doing anything like this is whether future > versions of the SQL standard might blindside us with some > not-terribly-compatible interpretation of that syntax. If we do something > that is also in Oracle or DB2 or one of the other big boys, then we can > probably rely on the assumption that they'll block anything really > incompatible from becoming standardized ;-). I assume that Microsoft is big enough to prevent anything that would hurt SQL Server's compatibility from becoming a standard? > OTOH, if the actual meaning of "a lot" is "MySQL", I'd be pretty worried > about this scenario. Well, MySQL _does_ support that syntax ... but I couldn't care less. MySQL also throws away your data instead of giving you errors and I would never ask PostgreSQL to start behaving like that. With all that being said, if I were to build a patch, would it be likely to be accepted into core? -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Allowing postgresql to accept 0xff syntax for data types that it makes sense for?
On Fri, 22 May 2015 11:02:47 -0400 Tom Lane wrote: > Alban Hertroys writes: > > On 22 May 2015 at 04:46, Bill Moran wrote: > >> With all that being said, if I were to build a patch, would it be likely > >> to be accepted into core? > > > Wouldn't you also need to support similar syntax for octal numbers for > > the patch to be complete? Or are those already supported (ISTR that's > > '077' for decimal 63)? > > A patch that made 077 be interpreted as octal would certainly get rejected > out of hand, because that's valid syntax right now and it doesn't mean 63. You'll get no objection from me on that point. > A similar objection might be raised to 0x..., as that is also valid > syntax; it's read as 0 followed by an identifier: > > regression=# select 0xff; > xff > - >0 > (1 row) > > (Yet another place where the fact that AS is optional yields surprising > results...) So there would be a backwards compatibility break here, > and no you can't fix it with a GUC. Still, it might be a small enough > break that we could get away with it. I hadn't even considered that issue. I really hate the fact that AS is optional, and I'm irritated by code that omits it ... but nobody's asking me ... > I'm not personally very excited > but other people might be. > > Other questions you'd have to think about: what is the data type of > 0x; what do you do with 0x (too big > even for int8). And it'd likely behoove you to check how Microsoft > answers those questions, if you want to point to SQL Server as what's > going to keep you out of standards-compatibility problems. (IOW, > if 0x ever did get standardized, the text might well match what > SQL Server does.) MSSQL seems to use it specifically for the equivalent of BYTEA types, and it seems to me that should be how it works in PostgreSQL. Does anyone watching this thread have access to a MSSQL server to verify? If an implicit cast from a 4-byte BYTEA to int works now, then it should work ... otherwise an explicit cast would be needed, with the same behavior if you tried to specify a number that overflows an int in any other way. MySQL is more liberal in that you can use it to specify ints as well, but I don't think MySQL is a good example of proper behavior. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
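As a sketch of how the masking idiom reads against stock PostgreSQL today, using the int_col example above with literal values:

SELECT (12 & x'04'::int) = x'04'::int AS bit_is_set;   -- true: bit 0x04 is set in 12
SELECT (B'1100' & B'0100') = B'0100' AS bit_is_set;    -- the same test on bit strings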
Re: [GENERAL] Allowing postgresql to accept 0xff syntax for data types that it makes sense for?
On Fri, 22 May 2015 11:27:49 -0500 Dennis Jenkins wrote: > On Fri, May 22, 2015 at 10:02 AM, Tom Lane wrote: > > > Alban Hertroys writes: > > > On 22 May 2015 at 04:46, Bill Moran wrote: > > >> With all that being said, if I were to build a patch, would it be likely > > >> to be accepted into core? > > How feasible would it be to write a network proxy, like pg_bouncer, to > handle converting the values on the fly, so that you need to change neither > your original code base (with respect to this issue), nor change PostgreSQL > itself? Certainly feasible, but absolutely undesirable. The system I'm working on is needlessly complex as it is ... I'd rather convince my bosses to let me rewrite 80,000 lines of code than add another compatibility shim to the mess. I brought it up because I saw an opportunity to benefit my employer and the PostgreSQL community at the same time. I have about 4 fallback plans if there's a reason not to do this one. Quite frankly, adding a compatibility shim isn't even on that list. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Allowing postgresql to accept 0xff syntax for data types that it makes sense for?
On Fri, 22 May 2015 12:44:40 -0400 Tom Lane wrote: > Bill Moran writes: > > Tom Lane wrote: > >> Other questions you'd have to think about: what is the data type of > >> 0x; what do you do with 0x (too big > >> even for int8). And it'd likely behoove you to check how Microsoft > >> answers those questions, if you want to point to SQL Server as what's > >> going to keep you out of standards-compatibility problems. (IOW, > >> if 0x ever did get standardized, the text might well match what > >> SQL Server does.) > > > MSSQL seems to use it specifically for the equivalent of BYTEA types, > > and it seems to me that should be how it works in PostgreSQL. > > Oh really? Wow, I'd just assumed you wanted this as a way to write > integers. That's certainly the use-case I would have personally. > I'm not even sure I like the idea of being able to write byteas without > quotes --- they seem like strings to me, not numbers. Arrgh ... it's good that you're bringing this up, but you're making me realize that there's more to figure out than I originally thought ... My focus had been on it being used for BYTEA columns, but there _are_ plenty of places in the code that do things like: WHERE int_col & 0x04 = 0x04 Which means that Sybase will implicitly cast that to an int, which probably means that MSSQL will as well. Once I take that into consideration, I start thinking that int_col should actually be a bit string, which means that: WHERE bit_varying_col & 0x04 = 0x04 should probably work without explicit casts as well. > > If an implicit cast from a 4-byte BYTEA to int works now, then it > > should work ... otherwise an explicit cast would be needed, with the > > same behavior if you tried to specify a number that overflows an int > > in any other way. > > There's no cast at all from bytea to int. For one thing, it's quite > unclear what endianness should be assumed for such a cast. (To get > unsurprising behavior from what you're describing, I think we'd have > to use a big-endian interpretation of the bytea; but that would be > a pain for a lot of other scenarios, or even for this case if you'd > written a bytea of length other than 4 or 8 bytes.) As an implicit cast, obviously anything too large to fit in the target data type should be an error. But the subject of endianness becomes damn important. It's more complex than I originally thought, but it still seems like it can be done without creating idiocy in the way things are cast. I'll think about it some more and try to come up with some more specific use scenarios to see what behavior seems the most POLA to me. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
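A small sketch of the endianness point: a bytea has no inherent byte order, so any integer interpretation has to be assembled explicitly, for example with get_byte():

SELECT (get_byte(b, 0) << 8) | get_byte(b, 1) AS big_endian,     -- 260
       (get_byte(b, 1) << 8) | get_byte(b, 0) AS little_endian   -- 1025
FROM (SELECT '\x0104'::bytea AS b) AS s;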
Re: [GENERAL] FW: Constraint exclusion in partitions
ot; > "-> Seq Scan on subset (cost=0.00..31.40 rows=2140 width=8)" > " -> Index Scan using oldtable_groupid_idx on oldtable > (cost=0.58..107364.99 rows=51340 width=66)" > "Index Cond: (group_id = subset.id)" > Total query runtime: 3986 ms. 5978 rows retrieved. > > > select * from newtable where group_id IN (select * from subset) > "Hash Join (cost=41.25..138092255.85 rows=1935067087 width=66)" > " Hash Cond: (newtable.group_id = subset.id)" > " -> Append (cost=0.00..84877869.72 rows=3870134173 width=66)" > "-> Seq Scan on newtable (cost=0.00..0.00 rows=1 width=66)" > "-> Seq Scan on newtable_01 (cost=0.00..946235.96 rows=46526896 > width=66)" > ... > "-> Seq Scan on newtable_86 (cost=0.00..986527.64 rows=44269664 > width=66)" > " -> Hash (cost=38.75..38.75 rows=200 width=8)" > "-> HashAggregate (cost=36.75..38.75 rows=200 width=8)" > " -> Seq Scan on subset (cost=0.00..31.40 rows=2140 width=8)" > Execution Cancelled after 766702 ms ! > > I tried the same with "SET enable_seqscan = OFF" and got an index scan of all > tables; > > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] FW: Constraint exclusion in partitions
On Sat, 23 May 2015 18:16:43 -0400 Daniel Begin wrote: > Hello Bill, > You wrote that my testing methodology is flawed - I hope you are right! > > However, I am a bit confused about your comments. Yes, I did edited the name > of the tables for clarity but if I miss the point I, I will do it again as I > am writing without modifying anything. Here is the procedure I follow and > results... > > I use pgadmin_III sql window. I write the following query (I have changed > the id to make sure it does not use previous results still in memory)... I didn't realize you were using PGAdmin ... that explains some of it ... see below: > Select * from nodes where id=345678912; -- nodes is the real partitioned > table name > > Now I select "explain query" from the menu and I get the following result... > "Append (cost=0.00..384.08 rows=99 width=66)" > " -> Seq Scan on nodes (cost=0.00..0.00 rows=1 width=66)" > "Filter: (id = 345678912)" > " -> Index Scan using nodes19_idversion_pk on nodes_19 (cost=0.56..384.08 > rows=98 width=66)" > "Index Cond: (id = 345678912)" > > Now, I select "run" and I get one record as a result and the following > message in history tab... > -- Executing query: > Select * from nodes where id=345678912; > Total query runtime: 62 ms. > 1 row retrieved. > > Now, if I use the same query on the original table using the same procedure, > here is what I get... > Select * from old_nodes where id=345678912; -- old_nodes is the real > original table name > > Explain gives me the following > "Index Scan using nodes_idversion_pk on old_nodes (cost=0.70..4437.15 > rows=1682 width=66)" > " Index Cond: (id = 345678912)" > > Running the query gives me the same record with the following message in > history tab... > -- Executing query: > select * from old_nodes where id=345678912; > Total query runtime: 62 ms. > 1 row retrieved. > > This time, the history tab shows that both took the same time to run (an > improvement!?) If your environment is providing such wildly variant results, then you need to start running multiple tests instead of assuming that a single run of a query is indicative of a pattern. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
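A sketch of what "running multiple tests" can look like from psql, using the table names from the thread; repeating each statement a few times separates cold-cache from warm-cache runs and keeps PGAdmin's client-side overhead out of the numbers:

\timing on
EXPLAIN ANALYZE SELECT * FROM nodes WHERE id = 345678912;
EXPLAIN ANALYZE SELECT * FROM old_nodes WHERE id = 345678912;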
Re: [GENERAL] Efficient sorting the results of a join, without denormalization
On Sun, 31 May 2015 04:50:00 -0500 "Glen M. Witherington" wrote: > > On Sun, May 31, 2015, at 12:53 AM, Tom Lane wrote: > > "Glen M. Witherington" writes: > > > And here's the query I want to do, efficiently: > > > > > SELECT * FROM c > > > JOIN b ON b.id = c.b_id > > > JOIN a ON a.id = b.a_id > > > WHERE a.id = 3 > > > ORDER BY b.created_at DESC > > > LIMIT 10 > > > > At least for that dummy data, this seems sufficient: > > > > regression=# create index on b (a_id, created_at); > > CREATE INDEX > > regression=# explain analyze SELECT * FROM c > > JOIN b ON b.id = c.b_id > > JOIN a ON a.id = b.a_id > > WHERE a.id = 3 > > ORDER BY b.created_at DESC > > LIMIT 10; > > QUERY > > PLAN > > -- > > Limit (cost=0.14..21.95 rows=10 width=64) (actual time=0.064..1.176 > > rows=10 loops=1) > >-> Nested Loop (cost=0.14..436079.81 rows=20 width=64) (actual > >time=0.063..1.173 rows=10 loops=1) > > Join Filter: (b.id = c.b_id) > > Rows Removed by Join Filter: 1218 > > -> Nested Loop (cost=0.14..9.81 rows=20 width=40) (actual > > time=0.035..0.035 rows=1 loops=1) > >-> Index Scan Backward using b_a_id_created_at_idx on b > >(cost=0.14..8.49 rows=20 width=24) (actual > >time=0.019..0.019 rows=1 loops=1) > > Index Cond: (a_id = 3) > >-> Materialize (cost=0.00..1.07 rows=1 width=16) (actual > >time=0.013..0.013 rows=1 loops=1) > > -> Seq Scan on a (cost=0.00..1.06 rows=1 width=16) > > (actual time=0.009..0.009 rows=1 loops=1) > >Filter: (id = 3) > >Rows Removed by Filter: 2 > > -> Materialize (cost=0.00..27230.00 rows=100 width=24) > > (actual time=0.008..0.811 rows=1228 loops=1) > >-> Seq Scan on c (cost=0.00..16370.00 rows=100 > >width=24) (actual time=0.007..0.310 rows=1228 loops=1) > > Planning time: 0.796 ms > > Execution time: 1.390 ms > > (15 rows) > > > > regards, tom lane > > Wow, sorry I screwed up the query. It should be: > > ORDER BY c.created_at DESC > > Not b, or as you noted its trivial to index. Sorry! Creating an index on c.created_at sped things up by a factor of over 1000, which caused the case you defined to run in ~0.5ms for me. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
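The index in question, plus the corrected query, for anyone wanting to reproduce the result (a sketch against the dummy schema from the thread):

CREATE INDEX ON c (created_at);

EXPLAIN ANALYZE
SELECT * FROM c
JOIN b ON b.id = c.b_id
JOIN a ON a.id = b.a_id
WHERE a.id = 3
ORDER BY c.created_at DESC
LIMIT 10;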
Re: [GENERAL] Planner cost adjustments
-- > - > Hash Semi Join (cost=21.50..819511.42 rows=729133 width=24) (actual > time=1538.100..7307.743 rows=338568 loops=1) >Hash Cond: (changesets.user_id = users.id) >-> Seq Scan on changesets (cost=0.00..745390.84 rows=25138084 width=24) > (actual time=0.027..4620.691 rows=25133929 loops=1) >-> Hash (cost=14.00..14.00 rows=600 width=8) (actual time=0.300..0.300 > rows=600 loops=1) > Buckets: 1024 Batches: 1 Memory Usage: 24kB > -> Seq Scan on users (cost=0.00..14.00 rows=600 width=8) (actual > time=0.022..0.187 rows=600 loops=1) > Total runtime: 7519.254 ms > (7 rows) > > osmdump=# SET enable_seqscan = OFF; > osmdump=# EXPLAIN ANALYSE SELECT user_id, id AS changeset_id,closed FROM > changesets WHERE changesets.user_id IN(SELECT id FROM sample.users); > > > Nested Loop (cost=115.94..10001090810.49 rows=729133 width=24) > (actual time=0.268..1466.248 rows=338568 loops=1) >-> HashAggregate (cost=115.50..121.50 rows=600 width=8) > (actual time=0.205..0.530 rows=600 loops=1) > -> Seq Scan on users (cost=100.00..114.00 > rows=600 width=8) (actual time=0.003..0.035 rows=600 loops=1) >-> Index Scan using changesets_useridndx on changesets > (cost=0.44..1805.83 rows=1215 width=24) (actual time=0.036..2.314 rows=564 > loops=600) > Index Cond: (user_id = users.id) > Total runtime: 1677.447 ms > (6 rows) > > #3 - Run EXPLAIN ANALYZE and look for discrepancies between the estimated > and actual times > > -- > Looking at above results, there are obvious discrepancies between > expected/actual rows and time! > I dug a bit by exploring/trying to understand the different concepts > explained in... > > http://www.postgresql.org/docs/9.4/static/planner-stats.html > http://www.postgresql.org/docs/8.1/static/planner-stats-details.html > http://www.postgresql.org/docs/9.2/static/view-pg-stats.html > > Concerning discrepancies between the actual number of rows and predicted > value, I looked at what pg_stats was saying about user_id in table > changesets. > Here are the values provided to the planner... > Average_width=8 > histogram_bounds: the size of the bins varies between 50 and 15, which > make sense because if I had divided the column's values into groups of > approximately equal population, I would have produced bins between 1 and > 10 (if sorted by frequency) > n_distinct= 20686 (there is actually 464858 distinct values for user_id in > the table) > most_common_vals: values make sense (I checked the frequency count of a > couple most common users_id) > correlation=0.617782 (?) > most_common_elems, most_common_elem_freqs and elem_count_histogram were > empty > > At this point, I wonder if the assumptions behind the planner's statistics > may produce such problems since the distribution of my data is not uniform > but follows a power law (some user_id would return millions of records while > others only one). > This is the farthest I can go at this point. Maybe someone can provide me > with more explanations regarding planner's behavior and ways to go further > to make it work properly? You may also benefit from increasing the statistics targets and running ANALYZE again. It certainly looks like some of those stats are pretty far off. Raising the statistics target will cause ANALYZE to investigate more rows (which takes longer but might produce more accurate results) I suggest experimenting with the cost settings first, though. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
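A minimal sketch of the statistics-target suggestion, using the column discussed above (500 is only an example value; pick something and re-test):

ALTER TABLE changesets ALTER COLUMN user_id SET STATISTICS 500;
ANALYZE changesets;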
Re: [GENERAL] Automatic Client routing
On Thu, 4 Jun 2015 08:53:15 -0400 Ravi Krishna wrote: > Are there any plans to introduce the concept of automatic client > routing to the principal server in a cluster of N machines. For > example, if there is a four node replication cluster N1 .. N4, at any > time only one can be principal (the one which does the writing). In > Oracle and DB2, client side libraries provide a way for the clients to > connect to the principal writer regardless of where it is running on > N1 .. N4. This way client need to be aware of only one connection > string. > > EnterpriseDb is a failover manager which relies on virtual IP > management, not the one I described above. pgpool has this capacity. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Momentary Delay
dvi_sub, gis_via_viagem > via_sub, gis_iti_itinerario iti_sub > where dvi_sub.via_cod_viagem = via_sub.via_cod_viagem >and via_sub.via_status_viagem = 'A' >and via_sub.via_dt_hora_ini > now() - interval '9 hours' >and iti_sub.lin_cod_linha = 389 >and iti_sub.iti_sentido_itinerario = 'I' >and iti_sub.iti_cod_itinerario_linha = > via_sub.iti_cod_itinerario_linha >and dvi_sub.vlo_cod_localizacao is not null > group by dvi_sub.via_cod_viagem, > iti_sub.iti_cod_itinerario_linha, via_sub.equ_cod_eqpto > ; > Cancel request sent > ERROR: canceling statement due to user request > Time: 10164.015 ms > citgis=# select now(); > now > --- > 2015-06-05 09:27:22.006072-03 > (1 row) > > Time: 0.152 ms > citgis=# select max(dvi_sub.dvi_cod_dados_via_iti) as dvi_cod_dados_via_iti, > dvi_sub.via_cod_viagem > from gis_dvi_dados_viagem_itinerario dvi_sub, gis_via_viagem > via_sub, gis_iti_itinerario iti_sub > where dvi_sub.via_cod_viagem = via_sub.via_cod_viagem >and via_sub.via_status_viagem = 'A' >and via_sub.via_dt_hora_ini > now() - interval '9 hours' >and iti_sub.lin_cod_linha = 389 >and iti_sub.iti_sentido_itinerario = 'I' >and iti_sub.iti_cod_itinerario_linha = > via_sub.iti_cod_itinerario_linha >and dvi_sub.vlo_cod_localizacao is not null > group by dvi_sub.via_cod_viagem, > iti_sub.iti_cod_itinerario_linha, via_sub.equ_cod_eqpto > ; > dvi_cod_dados_via_iti | via_cod_viagem > ---+ > 1059964443 |7989813 > 1060072723 |7990876 > (2 rows) > > Time: 5565.175 ms > citgis=# select max(dvi_sub.dvi_cod_dados_via_iti) as dvi_cod_dados_via_iti, > dvi_sub.via_cod_viagem > from gis_dvi_dados_viagem_itinerario dvi_sub, gis_via_viagem > via_sub, gis_iti_itinerario iti_sub > where dvi_sub.via_cod_viagem = via_sub.via_cod_viagem >and via_sub.via_status_viagem = 'A' >and via_sub.via_dt_hora_ini > now() - interval '9 hours' >and iti_sub.lin_cod_linha = 389 >and iti_sub.iti_sentido_itinerario = 'I' >and iti_sub.iti_cod_itinerario_linha = > via_sub.iti_cod_itinerario_linha >and dvi_sub.vlo_cod_localizacao is not null > group by dvi_sub.via_cod_viagem, > iti_sub.iti_cod_itinerario_linha, via_sub.equ_cod_eqpto > ; > dvi_cod_dados_via_iti | via_cod_viagem > ---+ > 1059964445 |7989813 > 1060072725 |7990876 > (2 rows) > > Time: 27.944 ms > citgis=# select max(dvi_sub.dvi_cod_dados_via_iti) as dvi_cod_dados_via_iti, > dvi_sub.via_cod_viagem > from gis_dvi_dados_viagem_itinerario dvi_sub, gis_via_viagem > via_sub, gis_iti_itinerario iti_sub > where dvi_sub.via_cod_viagem = via_sub.via_cod_viagem >and via_sub.via_status_viagem = 'A' >and via_sub.via_dt_hora_ini > now() - interval '9 hours' >and iti_sub.lin_cod_linha = 389 >and iti_sub.iti_sentido_itinerario = 'I' >and iti_sub.iti_cod_itinerario_linha = > via_sub.iti_cod_itinerario_linha >and dvi_sub.vlo_cod_localizacao is not null > group by dvi_sub.via_cod_viagem, > iti_sub.iti_cod_itinerario_linha, via_sub.equ_cod_eqpto > ; > dvi_cod_dados_via_iti | via_cod_viagem > ---+ > 1059964445 |7989813 > 1060072727 |7990876 > (2 rows) > > Time: 24.428 ms > > Greetings > > Anderson -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Planner cost adjustments
On Wed, 10 Jun 2015 17:20:00 -0400 Daniel Begin wrote: > Here is a follow-up on adjusting the planner costs calculation > > -Statistics target of problematic columns were increased from 100 to 200. > -Analyse was ran on all concerned tables (actually ran on the whole DB) > -Random_page_cost was decreased from 4 to 2. Since you changed two things, there's no way to be sure which change led to the improvement. You seem to be assuming that changing the stastics target was what helped. While that _may_ be right, it might also have been the change to random_page_cost. > As a result, about 80% of queries are now using what I would consider an > optimal plan. However, the planner keeps using full table scans for large > tables... > > For instance, I ran a query that should have used an index scan on two > similar test tables. The planner had selected an index scan for the smaller > one and a Seq Scan for larger one. Except for their sizes and for one field > not used in the exercise, the test tables were identical and indexed on the > same field. The smaller test table had 3.26E+10 records and the larger one > 3.78E+11 records. > > The query looked like... > SELECT cs.user_id, cs.changeset_id, nd.id, nd.version > FROM changesets_selection cs, a_test_table nd > WHERE nd.changeset_id=cs.changeset_id; > > In order to understand why the planner selected the Seq Scan instead of an > Index Scan on the large table (nodes), I ran an EXPLAIN ANALYSE (on warm > cache) using enable_seqscan set to OFF/ON. > > -- QUERY PLAN on larger table (nodes) with SET enable_seqscan=OFF > -; > Nested Loop (cost=100.58..10210480648.03 rows=194204592 width=40) > (actual time=74.088..278008.579 rows=140112 loops=1) >-> Seq Scan on changesets_selection cs > (cost=100.00..1000110.44 rows=6644 width=24) (actual > time=0.015..4.904 rows=6903 loops=1) >-> Index Scan using nodes_changesetidndx on nodes nd > (cost=0.58..31387.49 rows=29230 width=24) (actual time=12.655..40.259 > rows=20 loops=6903) > Index Cond: (changeset_id = cs.changeset_id) > Total runtime: 278026.196 ms > (5 rows) > > -Completed after less than 5 minutes processing > -I assume that cost=100.00..1000110.44 for the Seq Scan of > changesets_selection is an artefact of setting enable_seqscan=OFF. > -From what I see, the evaluation of rows number is still way off (1400X) for > the large table, even if the statistics target was doubled to 200. > > -- QUERY PLAN on larger table (nodes) with SET enable_seqscan=ON > --; > Hash Join (cost=156171782.28..185673195.13 rows=194204592 width=40) >Hash Cond: (cs.changeset_id = n.changeset_id) >-> Seq Scan on changesets_selection cs (cost=0.00..110.44 rows=6644 > width=24) >-> Hash (cost=84959952.68..84959952.68 rows=3878771968 width=24) > -> Seq Scan on nodes nd (cost=0.00..84959952.68 rows=3878771968 > width=24) > (5 rows) > > -Still running after 2:30 hours processing! That is why I did not provided > the actual time and rows (however, actual rows are provided on first query > plan) > -Not surprisingly, the evaluation of rows number is way off again for the > large table - same stats, same results... > > It seems there is a problem with my large table statistics, even after > increase them to 200. Should I increase the statistic target to 500, or even > to 1000? > Is there something else I can trigger to get the appropriate plan? 
> > Comments/explanations would be appreciated > Daniel > > > -Original Message- > From: pgsql-general-ow...@postgresql.org > [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Daniel Begin > Sent: June-03-15 06:32 > To: 'Bill Moran' > Cc: pgsql-general@postgresql.org; 'Tomas Vondra'; 'Melvin Davidson' > Subject: Re: [GENERAL] Planner cost adjustments > > Thank Bill, > > About disks performance, all drives are identical and connected using USB3 > connections and yes, I can tweak values and restart Postgres without any > hardship!-) > About seq_page_cost and random_page_cost, I am about to test different lower > values as you and Thomas propose. > Raising the statistics target is a good idea. Since most of the data have a > power law distribution it might affect the statistics. > > I will do as suggested and get back to the community for further comments. I > wished to follow the procedure proposed by PT, just in case I eventually had > to on step 4 (contact PostgreSQL developers so they can improve the > planner). > > And I am still open to other proposal > Daniel > > -Original Messag
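One way to untangle the two changes is to put the cost setting back to its default for a single session and re-run the query, so the effect of the statistics change can be judged on its own (a sketch using the query posted above):

SET random_page_cost = 4;   -- back to the default, for this session only
EXPLAIN ANALYZE
SELECT cs.user_id, cs.changeset_id, nd.id, nd.version
FROM changesets_selection cs, a_test_table nd
WHERE nd.changeset_id = cs.changeset_id;
RESET random_page_cost;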
Re: [GENERAL] Momentary Delay
Please do not remove the mailing list from replies. See below. On Fri, 12 Jun 2015 09:21:19 -0300 Anderson Valadares wrote: > 2015-06-08 20:33 GMT-03:00 Bill Moran : > > > On Mon, 8 Jun 2015 11:59:31 -0300 > > Anderson Valadares wrote: > > > > > Hi > > > We are experiencing an intermittent problem in a GIS database from a > > > client. Some times of the day is as if the PostgreSQL executed the > > slowest > > > operations see below an example. The query was performed three times, > > twice > > > I canceled and the third time I left it to end. The consultation took 10 > > > seconds to finish, but the measured execution time is 20 ms. As we have > > > several queries that run every 2 seconds when this momentary delay occurs > > > queries accumulate and burst the limit of 203 connections allowed. The > > > interval of "momentary stop" are approximately 2 seconds occurring at > > > random times and during these stoppages occur no locks and no increased > > > load on the server is identified. There is a job 2/2 seconds locks > > > collection of information, running queries, etc., nmon also collects > > > information every 2 seconds. My client asks what causes these momentary > > > stops? because it affects all operations of the database? etc. How do I > > > identify what is causing these delays in executions of operations in the > > > database? > > > > Number of points to consider: > > * Check the contents of pg_locks and see if something is waiting on a > > lock when the query is slow. > > There is nothing waiting when the query is slow, see: > > snap_datetime | waiting | count > ---+-+--- > 2015-06-05 09:25:00.954731-03 | f |74 > 2015-06-05 09:26:00.249187-03 | f | 205 > 2015-06-05 09:27:00.826874-03 | f | 207 I don't know what that means. Since you don't show the query that generated that output, I have no idea if your statement is valid, or if you're running a query that will inherently produce incorrect results. > * Also, ensure log_lock_waits is turned on for the server and check > > PostgreSQL's logs for messages about long lock waits. > > There is also nothing This is even worse than the previous statement. What did you _do_ to determine this? Since you're unable to diagnose and fix the problem on your own, the possibility exists that the problem is with your diagnostic steps. If you don't share the details of those steps, I have no way to help you with them. > > * Based on the connection behavior you describe, I'm guessing it's a > > Tomcat app using some sort of conection pool. Whatever it is, tune > > your connection pool settings so that the max size of the pool doesn't > > exceed the available PostgreSQL connections. At least that will prevent > > errors from happening when the problem occurs. > > Yes it is a tomcat application using connection pooling and will be set to > not exceed the limit of PostgreSQL connections, but the central point is > that the number of connections has increased due to what I am calling > "momentary stoppages" I understand that. I made the suggestion as a temporary fix to reduce the impact until you are able to fix the actual cause. > > * The query you you keep cancelling below, run just EXPLAIN on it (not > > EXPLAIN ANALYZE) which will certainly finish and give you a plan > > that can be reviewed to help determine what the problem is. > > Explain plan in http://explain.depesz.com/s/bWw I'm confused. The query at that URL only took 10ms to complete, which doesn't seem at all unreasonable. 
> * On the OS, monitor iostat or something else to see if you're > > saturating disk capacity when the problem happens. > > The OS is also being monitored and there occurs no saturation. Again, without any details, it's possible that there is a problem in your monitoring. > > > The Server is a IBM P720 128G RAM PostgreSQL 9.2.9 on > > > powerpc-ibm-aix7.1.0.0, compiled by gcc (GCC) 4.6.4, 64-bit > > > > > > Evolution of the number of connections for a period > > > > > > snap_datetime| #connections > > > --+- > > > 2015-06-05 09:25:00.954731-03 | 74 > > > 2015-06-05 09:26:00.249187-03 | 205 > > > 2015-06-05 09:27:00.826874-03 | 207 > > > 2015-06-05 09:28:00.374666-03 | 73 > > > 2015-06-05 09:29:00.690696-03 | 75 > > > > > > > >
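For what it's worth, a live check for "is anything waiting on a lock" looks like this on 9.2 (a sketch; it has to be run at the moment of a stall, not reconstructed from periodic snapshots):

SELECT l.pid, l.locktype, l.mode, l.relation::regclass AS relation, a.query
FROM pg_locks l
JOIN pg_stat_activity a ON a.pid = l.pid
WHERE NOT l.granted;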
Re: [GENERAL] Momentary Delay
On Mon, 15 Jun 2015 17:48:54 -0300 Anderson Valadares wrote: > 2015-06-12 19:56 GMT-03:00 Bill Moran : > > > Please do not remove the mailing list from replies. See below. > > > > On Fri, 12 Jun 2015 09:21:19 -0300 > > Anderson Valadares wrote: > > > > > 2015-06-08 20:33 GMT-03:00 Bill Moran : > > > > > > > On Mon, 8 Jun 2015 11:59:31 -0300 > > > > Anderson Valadares wrote: > > > > > > > > > Hi > > > > > We are experiencing an intermittent problem in a GIS database from a > > > > > client. Some times of the day is as if the PostgreSQL executed the > > > > slowest > > > > > operations see below an example. The query was performed three times, > > > > twice > > > > > I canceled and the third time I left it to end. The consultation > > took 10 > > > > > seconds to finish, but the measured execution time is 20 ms. As we > > have > > > > > several queries that run every 2 seconds when this momentary delay > > occurs > > > > > queries accumulate and burst the limit of 203 connections allowed. > > The > > > > > interval of "momentary stop" are approximately 2 seconds occurring at > > > > > random times and during these stoppages occur no locks and no > > increased > > > > > load on the server is identified. There is a job 2/2 seconds locks > > > > > collection of information, running queries, etc., nmon also collects > > > > > information every 2 seconds. My client asks what causes these > > momentary > > > > > stops? because it affects all operations of the database? etc. How > > do I > > > > > identify what is causing these delays in executions of operations in > > the > > > > > database? > > > > > > > > Number of points to consider: > > > > * Check the contents of pg_locks and see if something is waiting on a > > > > lock when the query is slow. > > > > > > There is nothing waiting when the query is slow, see: > > > > > > snap_datetime | waiting | count > > > ---+-+--- > > > 2015-06-05 09:25:00.954731-03 | f |74 > > > 2015-06-05 09:26:00.249187-03 | f | 205 > > > 2015-06-05 09:27:00.826874-03 | f | 207 > > > > I don't know what that means. Since you don't show the query that > > generated that output, I have no idea if your statement is valid, or > > if you're running a query that will inherently produce incorrect > > results. > > Sorry the information was incomplete. Below is the query that was > performed to extract information from tables that used to monitor the > database. > The snap_datetime column indicates the time that occurred monitoring, > the waiting column tells you whether any connection was on hold and the > column > count tells how many connections existed at the time. > > select > s.snap_datetime > ,a.waiting > , count(*) > from stat_snapshot s, stat_activity a > where s.snap_id = a.snap_id > and s.snap_datetime >= '2015-06-05 09:25:00' > and s.snap_datetime <= '2015-06-05 09:28:00' > group by > s.snap_datetime > ,a.waiting > order by s.snap_datetime > ; > snap_datetime | waiting | count > ---+-+--- > 2015-06-05 09:25:00.954731-03 | f |74 > 2015-06-05 09:26:00.249187-03 | f | 205 > 2015-06-05 09:27:00.826874-03 | f | 207 > (3 rows) Given what I'm seeing above, there's still a lot of information missing. Those tables are not standard PostgreSQL tables, and I'm not aware of any tool that creates them, thus I have to assume it's a cron job or something similar created in-house. Given the query you ran and the resultant data, my first guess is that the data in stat_snapshot and stat_activity is worthless: i.e. 
it's captured once per minute, and therefore there are 59+ seconds worth of detail that aren't captured, thus the actual likelihood that those tables will contain any indication of the problem is very low. > > > * Also, ensure log_lock_waits is turned on for the server and check > > > > PostgreSQL's logs for messages about long lock waits. > > > > > > There is also nothing > > > > This is even worse than the previous statement. What did you _do_ to > > determine this? > > > > Since you
Re: [GENERAL] Compression function
On Tue, 16 Jun 2015 04:45:52 -0300 "Leonardo M. Ramé" wrote: > Hi, does anyone know if there's a compression function to let me store > in gzipped/deflate format TEXT or Bytea fields. > > Please correct me if I'm wrong, but I also wonder if this function is > really needed since I've read large objects are stored with TOAST, hence > compression is already there. The TOAST system does do compression, but depending on your expectation, you may be disappointed. The big thing that might let you down is that the TOAST code doesn't run at all unless the tuple is larger than 2K. As a result, you could have fairly large rows of almost 2000 bytes long, that _could_ compress to significantly less than that, but PostgreSQL never tries to compress. Additionally, PostgreSQL stops trying to compress fields once the row size is smaller than 2K, so if you have multiple fields that could benefit from compression, they might not all be compressed. As a result, if you understand your data well, you need to take this into account, as you might see better results if you do your own compression. Unfortunately, I don't know of any in-database function that can be used to compress data; you'd have to write your own or do it at the application level. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
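A quick sketch that shows both behaviours described above, using pg_column_size() to see the stored (possibly compressed) size:

CREATE TEMP TABLE toast_demo (v text);
INSERT INTO toast_demo VALUES (repeat('a', 1000));    -- tuple stays under the ~2K threshold
INSERT INTO toast_demo VALUES (repeat('a', 10000));   -- tuple exceeds the threshold

SELECT octet_length(v) AS raw_bytes,
       pg_column_size(v) AS stored_bytes
FROM toast_demo;
-- the 1000-byte value is stored uncompressed; the 10000-byte value comes
-- back far smaller than its raw length because TOAST compressed it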
Re: [GENERAL] serialization failure why?
On Tue, 16 Jun 2015 13:33:12 +0001 Filipe Pina wrote: > I have these 2 tables: > > CREATE TABLE "stuff" ("number" integer NOT NULL PRIMARY KEY, "title" > varchar(40) NOT NULL); > CREATE TABLE "stuff_ext" ("stuff_ptr_id" integer NOT NULL PRIMARY KEY, > "extra" integer NOT NULL); > ALTER TABLE "stuff_ext" ADD CONSTRAINT > "stuff_ext_stuff_ptr_id_5a4ee8edae53404b" FOREIGN KEY ("stuff_ptr_id") > REFERENCES "stuff" ("number") DEFERRABLE INITIALLY DEFERRED; > CREATE SEQUENCE stuff_seq; > > And then the function: > > CREATE OR REPLACE FUNCTION create_stuff(number integer, title text) > RETURNS integer AS $$ > DECLARE > a1 stuff; > a2 stuff_ext; > BEGIN > IF number IS NULL THEN > number := nextval('stuff_seq'); > END IF; > > a1.number := number; > a1.title := title; > > a2.stuff_ptr_id := a1.number; > > INSERT INTO stuff VALUES (a1.*); > INSERT INTO stuff_ext VALUES (a2.*); > > RETURN number; > END > $$ > LANGUAGE plpgsql; > > > The DB is configured for SERIALIZABLE transaction mode. > > Now, if I can the function without passing number, such as: > > select create_stuff(NULL,'title'); > > in 10 forked processes in a loop with a few iterations in each, I get > quite a few SERIALIZATON FAILURE (sqlstate 40001). > > If I comment out the "INSERT INTO stuff_ext" line, I don't get any. > > How is the second insert causing serialize dependencies...? I'm not sure this is correct, but I have a theory. Essentially, PostgreSQL can't be sure that the foreign key will be valid if the other transaction rolls back. i.e., what if the foreign key is valid becuase the other transaction created the matching row, and that other transaction then rolls back? In other isolation modes, it can just wait for the appropriate lock to free up, then see what happens. But in serializable mode it hits a condition where it can't ensure serializability. > The specific error messages vary between > > ERROR: could not serialize access due to read/write dependencies among > transactions > DETAIL: Reason code: Canceled on identification as a pivot, during > commit attempt. > HINT: The transaction might succeed if retried. > > and > > ERROR: could not serialize access due to read/write dependencies among > transactions > DETAIL: Reason code: Canceled on commit attempt with conflict in from > prepared pivot. > HINT: The transaction might succeed if retried. > > Thanks! -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Momentary Delay
On Wed, 17 Jun 2015 10:33:37 -0300 Anderson Valadares wrote: > 2015-06-15 18:19 GMT-03:00 Bill Moran : > > > On Mon, 15 Jun 2015 17:48:54 -0300 > > Anderson Valadares wrote: > > > > > 2015-06-12 19:56 GMT-03:00 Bill Moran : > > > > > > > Please do not remove the mailing list from replies. See below. > > > > > > > > On Fri, 12 Jun 2015 09:21:19 -0300 > > > > Anderson Valadares wrote: > > > > > > > > > 2015-06-08 20:33 GMT-03:00 Bill Moran : > > > > > > > > > > > On Mon, 8 Jun 2015 11:59:31 -0300 > > > > > > Anderson Valadares wrote: > > > > > > > > > > > > > Hi > > > > > > > We are experiencing an intermittent problem in a GIS database > > from a > > > > > > > client. Some times of the day is as if the PostgreSQL executed > > the > > > > > > slowest > > > > > > > operations see below an example. The query was performed three > > times, > > > > > > twice > > > > > > > I canceled and the third time I left it to end. The consultation > > > > took 10 > > > > > > > seconds to finish, but the measured execution time is 20 ms. As > > we > > > > have > > > > > > > several queries that run every 2 seconds when this momentary > > delay > > > > occurs > > > > > > > queries accumulate and burst the limit of 203 connections > > allowed. > > > > The > > > > > > > interval of "momentary stop" are approximately 2 seconds > > occurring at > > > > > > > random times and during these stoppages occur no locks and no > > > > increased > > > > > > > load on the server is identified. There is a job 2/2 seconds > > locks > > > > > > > collection of information, running queries, etc., nmon also > > collects > > > > > > > information every 2 seconds. My client asks what causes these > > > > momentary > > > > > > > stops? because it affects all operations of the database? etc. > > How > > > > do I > > > > > > > identify what is causing these delays in executions of > > operations in > > > > the > > > > > > > database? > > > > > > > > > > > > Number of points to consider: > > > > > > * Check the contents of pg_locks and see if something is waiting > > on a > > > > > > lock when the query is slow. > > > > > > > > > > There is nothing waiting when the query is slow, see: > > > > > > > > > > snap_datetime | waiting | count > > > > > ---+-+--- > > > > > 2015-06-05 09:25:00.954731-03 | f |74 > > > > > 2015-06-05 09:26:00.249187-03 | f | 205 > > > > > 2015-06-05 09:27:00.826874-03 | f | 207 > > > > > > > > I don't know what that means. Since you don't show the query that > > > > generated that output, I have no idea if your statement is valid, or > > > > if you're running a query that will inherently produce incorrect > > > > results. > > > > > > Sorry the information was incomplete. Below is the query that was > > > performed to extract information from tables that used to monitor the > > > database. > > > The snap_datetime column indicates the time that occurred monitoring, > > > the waiting column tells you whether any connection was on hold and the > > > column > > > count tells how many connections existed at the time. > > > > > > select > > > s.snap_datetime > > > ,a.waiting > > > , count(*) > > > from stat_snapshot s, stat_activity a > > > where s.snap_id = a.snap_id > > > and s.snap_datetime >= '2015-06-05 09:25:00' > > > and s.snap_datetime <= '2015-06-05 09:28:00' > > > group by > > > s.snap_datetime > > > ,a.waiting > > > order by s.snap_datetime > > > ; > > > snap_datetime | waiting | count > > > ---+-+--- > > > 2015-06-05 09:25:00.954731-03 | f |74 > > > 2015-06-05 09:26:00.249187-03 | f
Re: [GENERAL] Select query regarding info
On Thu, 18 Jun 2015 07:29:37 + "Yogesh. Sharma" wrote: > HI Everyone, > > Below DB query is showing below error on postgresql9.3. > SELECT '\'' || t2.name || '\'', '\'' || t1.phone_number || '\'', '\'' || > t1.details || '\'', '\'' || t1.description || '\'', '\'' || (CASE WHEN > t1.s_id IS NULL THEN 'N/A' ELSE t3.s_type END) || '\'', '\'' || t1.s_id || > '\'' FROM abc_tble AS t1 LEFT JOIN pqrtable AS t2 ON t1.s_id = nid LEFT JOIN > te AS t3 ON t1.s_id = t3.s_id; > Invalid command \''. Try \? for help. > But Above query is working fine in postgresql8.3. > Solution is provided by someone:- > The SQL standard defines two single quotes to escape one inside a literal: > '''' > Postgres 8.3 defaulted to a non-standard behavior where it was allowed to > escape a single quote using a backslash: '\'' > This deviation from the SQL standard was always discouraged and can be > controlled through the configuration parameter > standard_conforming_strings<http://www.postgresql.org/docs/current/static/runtime-config-compatible.html#GUC-STANDARD-CONFORMING-STRINGS> > With version 9.1 the default for this parameter was changed from off to on. > Version 8.1 and later would emit a warning when you used the non-standard way > of escaping single quotes (unless you explicitly turned that off) > > > Could you please provide below information. > How to change standard_conforming_strings value of postgresql.conf? I have > checked but this option is not found in postgresql.conf. Add it to the file. Also, don't reply to unrelated threads with new questions, a lot of people won't see your question if you do that, and if nobody sees your question you won't get an answer. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
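A minimal sketch of the change: add this line to postgresql.conf and reload the server:

standard_conforming_strings = off

The longer-term fix is to rewrite the literals so they work regardless of the setting:

SELECT 'It''s fine' AS doubled_quote;     -- standard SQL escaping
SELECT E'It\'s fine' AS escape_string;    -- explicit escape-string syntax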
Re: [GENERAL] INSERT a number in a column based on other columns OLD INSERTs
On Sat, 20 Jun 2015 10:44:21 -0700 (MST) litu16 wrote: > In PostgreSQL I have this table... (there is a primary key in the most left > side "timestamp02" which is not shown in this image) > > in the table above, all columns are entered via querrys, except the > "time_index" which I would like to be filled automatically via a trigger > each time each row is filled. > > This is the code to create the same table (without any value) so everyone > could create it using the Postgre SQL query panel. > > *CREATE TABLE table_ebscb_spa_log02 > ( > pcnum smallint, > timestamp02 timestamp with time zone NOT NULL DEFAULT now(), > fn_name character varying, > "time" time without time zone, > time_elapse character varying, > time_type character varying, > time_index real, > CONSTRAINT table_ebscb_spa_log02_pkey PRIMARY KEY (timestamp02) > ) > WITH ( > OIDS=FALSE > ); > ALTER TABLE table_ebscb_spa_log02 > OWNER TO postgres;* > > What I would like the trigger to do is: > > INSERT a number in the "time_index" column based on the INSERTed values of > the "fn_name" and "time_type" columns in each row. > > If both ("fn_name" and "time_type") do a combination (eg. Check Mails - > Start) that doesn't exist in any row before (above), then INSERT 1 in the > "time_index" column, > > Elif both ("fn_name" and "time_type") do a combination that does exist in > some row before (above), then INSERT the number following the one > before(above) in the "time_index" column. > > (pls look at the example table image, this trigger will produce every red > highlighted square on it) > > > I have tried so far this to create the function: > > CREATE OR REPLACE FUNCTION on_ai_myTable() RETURNS TRIGGER AS $$ > DECLARE > t_ix real; > n int; > > BEGIN > IF NEW.time_type = 'Start' THEN > SELECT t.time_index FROM table_ebscb_spa_log02 t WHERE t.fn_name = > NEW.fn_name AND t.time_type = 'Start' ORDER BY t.timestamp02 DESC LIMIT 1 > INTO t_ix; > GET DIAGNOSTICS n = ROW_COUNT; > IF (n = 0) THEN > t_ix = 1; > ELSE > t_ix = t_ix + 1; > END IF; > END IF; > NEW.time_index = t_ix; > return NEW; > END > $$ > LANGUAGE plpgsql; > > > But when I manually insert the values in the table, nothing change (no error > message) time_index column just remain empty, what am I doing wrong??? > > Please some good PostgreSQL fellow programmer could give me a hand, I really > have come to a death point in this task, I have any more ideas. Couple things. First off, you don't show your statement for creating the trigger. This is important. The trigger has to be a BEFORE trigger FOR EACH ROW. Otherwise, the returned value won't do anything. It should read like this: CREATE TRIGGER trigger_name BEFORE INSERT ON table_ebscb_spa_log02 FOR EACH ROW EXECUTE PROCEDURE on_ai_myTable(); If you created it with AFTER INSERT or FOR EACH STATEMENT, then the trigger won't work as desired. The other thing about assignment being := was already mentioned. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] very slow queries and ineffective vacuum
On Thu, 2 Jul 2015 12:58:18 +0200 Lukasz Wrobel wrote: > Hello again. > > Thank you for all your responses. I will try to clarify more and attempt to > answer the questions you raised. > > I'm attaching the postgresql.conf this time. I cannot supply you guys with > a proper database schema, so I will try to supply you with some obfuscated > logs and queries. Sorry for the complication. > > First of all I seem to have misdirected you guys about the pg_stat* tables. > I have a virtual machine with the database from our test team, which was > running for a month. When I deploy it, our java application is not running, > so no queries are being executed. The pg_stat* tables contain no data > (which is surprising). When I launch the application and queries start > going, the stats are collected normally and autovacuums are being performed. > > I attached the output of vacuum verbose command. > > As for the pg_stat_activity, I have no "idle in transaction" records there, > but I do have some in "idle" state, that don't disappear. Perhaps this > means some sessions are not closed? I attached the query result as > activity.txt. > > I also have a few "sending cancel to blocking autovacuum" and "canceling > autovacuum task" messages in syslog. > > Sample query explain analyze. This was ran after vacuum analyze of the > entire database. The analyze doesn't seem to be working terribly well. Looking at the explain, it expects 337963 rows in table57, but there are only 6789. There are similar discrepancies with table19 and table84. I don't know if indexes are your problem. Those three tables are pretty small, so the sequential scans should be pretty quick (probably faster than index scans, since it looks like most of the rows are returned from all the tables). I'm somewhat confused by your description of the situation. Is the performance problem happening on the virtual machine? Because VMs are notorious for being on oversubscribed hosts and exhibiting performance far below what is expected. It would be worthwhile to do some disk speed and CPU speed tests on the VM to see what kind of performance it's actually capable of ... if the VM is performing poorly, there's not much you can do with PostgreSQL to improve things. 
> explain analyze SELECT col1, col2, col3, col4, col5 FROM ( table84 table84 > LEFT JOIN table57 table57 ON table84.col7 = table57.col7 ) LEFT JOIN > table19 table19 ON table84.col7 = table19.col7; > QUERY > PLAN > - > Hash Right Join (cost=46435.43..108382.29 rows=189496 width=79) (actual > time=4461.686..13457.233 rows=5749 loops=1) >Hash Cond: (table57.col7 = table84.col7) >-> Seq Scan on table57 table57 (cost=0.00..49196.63 rows=337963 > width=57) (actual time=0.040..8981.438 rows=6789 loops=1) >-> Hash (cost=42585.73..42585.73 rows=189496 width=38) (actual > time=4447.731..4447.731 rows=5749 loops=1) > Buckets: 16384 Batches: 2 Memory Usage: 203kB > -> Hash Right Join (cost=18080.66..42585.73 rows=189496 > width=38) (actual time=1675.223..4442.046 rows=5749 loops=1) >Hash Cond: (table19.col7 = table84.col7) >-> Seq Scan on table19 table19 (cost=0.00..17788.17 > rows=187317 width=26) (actual time=0.007..2756.501 rows=5003 loops=1) >-> Hash (cost=14600.96..14600.96 rows=189496 width=20) > (actual time=1674.940..1674.940 rows=5749 loops=1) > Buckets: 32768 Batches: 2 Memory Usage: 159kB > -> Seq Scan on table84 table84 (cost=0.00..14600.96 > rows=189496 width=20) (actual time=0.059..1661.482 rows=5749 loops=1) > Total runtime: 13458.301 ms > (12 rows) > > Thank you again for your advice and I hope that with your help I'll be able > to solve this issue. > > Best regards. > Lukasz -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
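A quick way to compare the planner's estimates with reality for the tables in that plan (a sketch; the obfuscated table names are reused as-is):

    SELECT c.relname,
           c.reltuples::bigint AS planner_estimate,
           s.n_live_tup        AS live_rows,
           s.last_analyze,
           s.last_autoanalyze
      FROM pg_class c
      JOIN pg_stat_user_tables s ON s.relid = c.oid
     WHERE c.relname IN ('table57', 'table19', 'table84');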
Re: [GENERAL] Fwd: PostgreSQL & VMWare
On Fri, 3 Jul 2015 12:35:07 +0200 Jean-Gérard Pailloncy wrote: > > I work on a project that collects geolocalized data. > All data will be in PostgreSQL / PostGIS. > > The small PostgreSQL databases will be on Linux guests on VMWare hosts. > > The size of the main database will grow by 50 TB / year, 500 M row / day. > For the largest one, we plan to test different options. > One of them is to stay with Linux on WMWare. > Outside the questions about schema, sharding, I would appreciate if some of > you have informations, benchmarks, stories about big PostgreSQL databases on > Linux guests on VMWare hosts. The place I'm working now did a feasibility study about installing their primary app on vmware instead of directly onto the hardware. Their conclusion was that the app would be about 25% slower running on VMWare. The app is very database-centric. However, I wasn't involved in the tests, can't vouch for the quality of the testing, and there _are_ other pieces involved than the database. That being said, I've used PostgreSQL on VMs quite a bit. It does seem slower, but I've never actually benchmarked it. And it's never seemed slower enough for me to complain much. The concern I have about running a large database on a VM (especially since you're asking about performance) is not the VM itself, but all the baggage that inevitably comes with it ... oversubscribed hosts, terrible, cheap SANs, poor administration leading to bad configuration, and yet another layer of obscurity preventing you from figuring out why things are slow. In my experience, you _will_ get all of these, because once you're on a VM, the admins will be pressured to host more and more VMs on the existing hardware and/or add capacity at minimal cost. There's nothing like a VM where you never know what the performance will be because you never know when some other VMs (completely unrelated to you and/or your work) will saturate the IO with some ridiculous recursive grep command or something. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Backup Method
On Fri, 03 Jul 2015 13:16:02 +0200 Jan Lentfer wrote: > On 2015-07-03 13:00, howardn...@selestial.com wrote: > > On 03/07/2015 11:39, Guillaume Lelarge wrote: > >> > >> > In that case is there any recommendation for how often to make > >> base backups in relation to the size of the cluster and the size of > >> the WAL? > >> > > >> > >> Nope, not really. That depends on a lot of things. Our customers > >> usually do one per day. > >> > >> > > Excuse my ignorance... Is the base backup, in general, faster than > > pg_dump? > > It is a different approach. With the base backup you are actually > backing up files from the filesystem ($PGDATA directory), whereas with > pg_dump your saving the SQL commands to reload and rebuild the database. > "Usually" a file based backup will be faster, both on backup and > restore, but it is - as mentioned - a different approach and it might > also not serve all your purposes. One of the things that makes a lot of difference is the amount of redundant data in the database. For example, indexes are completely redundant. They sure do speed things up, but they're storing the same data 2x for each index you have. When you do a base backup, you have to copy all that redundancy, but when you do a pg_dump, all that redundant data is reduced to a single CREATE INDEX command. The result being that if your database has a lot of indexes, the pg_dump might actually be faster. But the only way to know is to try it out on your particular system. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
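For anyone who wants to try that comparison, a rough sketch of timing both approaches on a test box (database name and target paths are placeholders):

    time pg_dump -Fc app_db > /backups/app_db.dump
    time pg_basebackup -D /backups/base -Ft -z -P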
Re: [GENERAL] Postgresql jsonb
On Fri, 14 Aug 2015 17:39:49 +0530 Deepak Balasubramanyam wrote: > > I have a table (20 million rows) in Postgresql 9.4 that contains a bigint > id as the primary key and another column that contains jsonb data. Queries > run on this table look like so... > > > ## Query > > select ... from table > WHERE table.column ->'item'->> 'name' = 'value' > > > I'd like to make an effort to get Postgresql to keep all data available in > this table and any index on this table in memory. This would ensure that > sequence or index scans made on the data are fairly fast. > > Research into this problem indicates that there is no reliable way to get > Postgresql to run off of RAM memory completely ( > http://stackoverflow.com/a/24235439/830964). Assuming the table and its > indexes amount to 15 gb of data on the disk and the machine contains 64GB > of RAM with shared buffers placed at anywhere from 16-24 GB, here are my > questions... > > 1. When postgresql returns data from this query, how can I tell how much of > the data was cached in memory? I'm not aware of any way to do that on a per-query basis. > 2. I'm aware that I can tweak the shared buffer so that more data is > cached. Is there a way to monitor this value for its effectiveness? Install the pg_buffercache extension and read up on what it provides. It gives a pretty good view into what PostgreSQL is keeping in memory. > 3. Is there a reliable way / calculation (or close to it), to determine a > point after which Postgresql will ask the disk for data Vs the caches? It will ask the disk for data if the data is not in memory. As long as the data it needs is in memory, it will never talk to the disk unless it needs to write data back. The cache is a cache. So there are only 2 reasons your data wouldn't all be in memory all the time: 1) It doesn't all fit 2) Some of that memory is needed by other tables/indexes/etc As far as when things get evicted from memory, you'll have to look at the source code, but it's your typical "keep the most commonly needed data in memory" algorithms. What problem are you seeing? What is your performance requirement, and what is the observed performance? I ask because it's unlikely that you really need to dig into these details like you are, and most people who ask questions like this are misguided in some way. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
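For reference, the usual pg_buffercache query for seeing which relations currently occupy shared_buffers looks roughly like this (run it in the database of interest):

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;

    SELECT c.relname, count(*) AS buffers
      FROM pg_buffercache b
      JOIN pg_class c
        ON b.relfilenode = pg_relation_filenode(c.oid)
       AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                                  WHERE datname = current_database()))
     GROUP BY c.relname
     ORDER BY buffers DESC
     LIMIT 10;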
Re: [GENERAL] Why does splitting $PGDATA and xlog yield a performance benefit?
On Tue, 25 Aug 2015 10:08:48 -0700 David Kerr wrote: > Howdy All, > > For a very long time I've held the belief that splitting PGDATA and xlog on > linux systems fairly universally gives a decent performance benefit for many > common workloads. > (i've seen up to 20% personally). > > I was under the impression that this had to do with regular fsync()'s from > the WAL > interfearing with and over-reaching writing out the filesystem buffers. > > Basically, I think i was conflating fsync() with sync(). > > So if it's not that, then that just leaves bandwith (ignoring all of the > other best practice reasons for reliablity, etc.). So, in theory if you're > not swamping your disk I/O then you won't really benefit from relocating your > XLOGs. Disk performance can be a bit more complicated than just "swamping." Even if you're not maxing out the IO bandwidth, you could be getting enough that some writes are waiting on other writes before they can be processed. Consider the fact that old-style ethernet was only able to hit ~80% of its theoretical capacity in the real world, because the chance of collisions increased with the amount of data, and each collision slowed down the overall transfer speed. Contrasted with modern ethernet that doesn't do collisions, you can get much closer to 100% of the rated bandwidth because the communications are effectively partitioned from each other. In the worst-case scenario, if two processes (due to horrible luck) _always_ try to write at the same time, the overall responsiveness will be lousy, even if the bandwidth usage is only a small percent of the available. Of course, that worst case doesn't happen in actual practice, but as the usage goes up, the chance of hitting that interference increases, and the effective response goes down, even when there's bandwidth still available. Separate the competing processes, and the chance of conflict is 0. So your responsiveness is pretty much at best-case all the time. > However, I know from experience that's not entirely true, (although it's not > always easy to measure all aspects of your I/O bandwith). > > Am I missing something? -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Anyone interested in a Pittsburgh-area Postgres users'
I will be in Pittsburgh full-time starting the middle of Oct, and I would be interested in helping coordinate a UG. I have a feeling that I could convince a number of people I've worked with to get involved as well. On Tue, 8 Sep 2015 11:10:34 -0400 James Keener wrote: > Is there a user group in Pittsburgh? This email was the first that > showed up in a Google Search. > > Jim > > On 2004-05-02 05:43:26, Tom Lane wrote: > > > I've gotten a couple of inquiries lately about a Postgres users' group > > in my home town of Pittsburgh PA. There is not one (unless it's very > > well camouflaged) but perhaps there is critical mass to create one. > > If you think you might come to meetings of such a group, let me know > > off-list. If I get enough responses I'll set up an initial meeting > > and we'll see where it goes ... > > > > regards, tom lane > > > -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Can somebody explain what is the meaning for HashAggregate?
On Sun, 20 Sep 2015 21:56:39 +0800 (CST) lin wrote: > Can somebody explain what does the postgres done for the explain of sql shows > HashAggregate( what is the meaning for HashAggregate )? > for example: > > > postgres=# explain verbose select oid,relname from pg_class group by > oid,relname; > QUERY PLAN > - > HashAggregate (cost=12.42..15.38 rows=295 width=68) >Output: oid, relname >Group Key: pg_class.oid, pg_class.relname >-> Seq Scan on pg_catalog.pg_class (cost=0.00..10.95 rows=295 width=68) > Output: oid, relname > (5 rows) > > > :: first, seq scan pg_class and return (oid,relname); >second, make group (oid,relname), is it fisrt sort by oid then in the > oid group sort by relname? > Can somebody explain what does the database done for hashAggregate? It builds an in-memory hash table: for each returned row it combines the values of oid and relname, generates a hash key from them, and then uses that hash key to aggregate the rows (computing the GROUP BY, essentially, in this case). No sorting is involved, which is why there is no Sort node in the plan. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Best way to sync table DML between databases
On Mon, 5 Oct 2015 06:20:28 -0700 (MST) jimbosworth wrote: > Hi All, > > I have two servers each running pg9.4.4 database instances. > I need to determine the best way to keep a large 20gb table on server A > synchronised onto server B... > > At the moment, I use pg_dump to periodically dump the table on server A, > then psql to reload into server B. This is fine, but means I have to pull > 100% of the table each time rather than just the changes. This option does > not offer real time accuracy on server B. > > I have considered using a table trigger on row (update, insert or delete) > and then using db_link or postgres_fdw to sync the changes, but am concerned > that a table trigger is synchronous... so a db_link or fdw could incur a > lengthy delay. > > I have also considered using table OIDs to track changes, then just > periodically sync the difference. > > I have considered using postgre_fdw and then 'refresh concurrently > materialized view' on server B. > > I have considered using logical decoding to read the wal files, then extract > the changes. > > Can anyone explain the best way to synchronise JUST the changes on a table > between servers please? Sounds like a problem custom-made to be solved by Slony: http://slony.info/ -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] using postgresql for session
On Wed, 7 Oct 2015 09:58:04 -0600 "john.tiger" wrote: > has anyone used postgres jsonb for holding session ? Since server side > session is really just a piece of data, why bother with special > "session" plugins and just use postgres to hold the data and retrieve it > with psycopg2 ? Maybe use some trigger if session changes?We are > using python Bottle with psycopg2 (super simple, powerful combo) - are > we missing something magical about session plugins ? Nothing that I'm aware of. I've worked on large projects that keep the session data in a Postgres table with great success. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
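For the archives, one possible shape for such a session table (names and column choices are only illustrative):

    CREATE TABLE http_session (
        session_id  text        PRIMARY KEY,
        data        jsonb       NOT NULL DEFAULT '{}',
        expires_at  timestamptz NOT NULL
    );

    CREATE INDEX http_session_expires_idx ON http_session (expires_at);

    -- expired sessions can then be swept periodically:
    DELETE FROM http_session WHERE expires_at < now();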
Re: [GENERAL] Drop or alter column under load give ERROR #42804 structure of query does not match function result type:
On Fri, 9 Oct 2015 14:32:44 +0800 Victor Blomqvist wrote: > I have a heavily used PostgreSQL 9.3.5 database on CentOS 6. Sometimes I > need to add/remove columns, preferably without any service interruptions, > but I get temporary errors. > > I follow the safe operations list from > https://www.braintreepayments.com/blog/safe-operations-for-high-volume-postgresql > but many operations cause troubles anyway when the more busy tables are > updated. I seriously doubt if Paul did enough research to be sure that "safe" is an absolute term for that list. > Typically I have user defined functions for all operations, and my table > and functions follow this pattern: > > CREATE TABLE users ( > id integer PRIMARY KEY, > name varchar NOT NULL, > to_be_removed integer NOT NULL > ); > > CREATE FUNCTION select_users(id_ integer) RETURNS SETOF users AS > $$ > BEGIN > RETURN QUERY SELECT * FROM users WHERE id = id_; > END; > $$ LANGUAGE plpgsql; > > Then the actual queries are run by our application as > > SELECT id, name FROM select_users(18); > > As you can see the column to_be_removed is not selected. Sure it is ... the function does SELECT *, which absolutely includes the to_be_removed column. The fact that you ignore that column in the subsequent outer SELECT doesn't mean that the query in the function knows to do so. > Then to remove the > column I use: > > ALTER TABLE users DROP COLUMN to_be_removed; > > However, while the system is under load sometimes (more frequently and > persistent the more load the system is experiencing) I get errors like > these: > > ERROR #42804 structure of query does not match function result type: > Number of returned columns (2) does not match expected column count (3). This is a bit surprising to me. I would expect Postgres to have one or the other definition of that row within a single transaction, but what seems to be happening is that the ALTER causes the row definition to be changed in the middle of the transaction, thus the function may return 3 columns, but when the outer query checks the type, it sees that it should only have 2. > The same error can happen when columns are added. Can this be avoided > somehow, or do I need to take the system offline during these kind of > changes? > > For reference, there was a similar but not same issue posted to psql-bugs a > long time ago: > http://www.postgresql.org/message-id/8254631e-61a5-4c03-899f-22fdcf369...@e23g2000vbe.googlegroups.com > > I posted this same question at dba.stackexchange and got the advice to > repost here: > http://dba.stackexchange.com/questions/117511/postgresql-drop-column-under-load-give-wrong-number-of-columns-errors This has a lot to do with internals. You should wait a bit to see if you get a good answer, but if not you might need to post to the hackers list so the developers can chime in. My opinion is that this is a bug, but it's an obscure enough bug that it's not surprising that it's gone unfixed for a while. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] I'm starting a PostgreSQL user's group in Pittsburgh
I'm going to see if I can drum up enough interest for a PostgreSQL user's group in the Pittsburgh area. After talking to the organizers of the Philadelphia PUG, I decided to try using Meetup to coordinate things: http://www.meetup.com/Pittsburgh-PostgreSQL-Users-Group/ If you're in the Pittsburgh area and would like to get involved, please show your interest by joining the meetup. I'll get a first event scheduled as soon as we have enough people signed up to make it interesting. If you haven't used meetup before: it's a service specifically for coordinating things like user's groups, and it does a pretty good job of letting us coordinate activities. Basic membership on the site is free and includes participating in as many groups as you desire. (it only costs something if you want to host your own group). Hope to see you soon. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Service not starting on Ubuntu 15.04
On Mon, 26 Oct 2015 11:21:23 + Lasse Westh-Nielsen wrote: > > I posted to stackoverflow about my problem upgrading from Ubuntu 14.04 to > Ubuntu 15.04: > http://stackoverflow.com/questions/33306475/ubuntu-15-04-postgresql-doesnt-start > > Tl;dr: postgresql service does not start properly when installed as a > package using cloud-init. > > And I can't figure out if I am doing something wrong, if the AMI is no > good, if the package has problems, ... I reckon I cannot be the first > person to use Postgres on Ubuntu Vivid, but I have been hammering my head > against the wall with this for hours. > > Any help greatly appreciated! I'm taking a shot in the dark here, but ... The symptoms you describe seem to suggest that the script is starting PostgreSQL asynchronously (i.e. in the background) which means that the CREATE command runs too quickly and the server isn't started yet. A quick way to _test_ this theory would be to put a sleep between the install and the CREATE commands and see if the problem goes away. If that does seem to be the problem, then a good _fix_ would be to find a way to foreground the startup of the server, or have some command that tests to ensure the server is started and blocks until it is before running the create command. The only point I'm unclear on is whether you've confirmed that Postgres actually _is_ started once the server is up (albeit without the CREATE statement having succeeded). -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
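A sketch of the "block until the server is ready" approach, assuming pg_isready (shipped with 9.3 and later) is on the path; the actual CREATE statements would come from the original cloud-init config:

    until pg_isready -q; do
        sleep 1
    done
    # now safe to run the CREATE USER / CREATE DATABASE statements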
Re: [GENERAL] postgres cpu 100% need help
On Tue, 27 Oct 2015 11:30:45 +0800 "657985...@qq.com" <657985...@qq.com> wrote: > Dear sir: > Recently a wired question about postgresql database really bothered > me a lot, so i really need your help. Here is the problem, in the most > situations the postgre database work very well, Average 3500tps/s per day, > the cpu usage of its process is 3%~10% and every query can be responsed in > less than 20ms, but sometimes the cpu usages of its process can suddenly grow > up to 90%+ , at that time a simple query can cost 2000+ms. ps: My postgresql > version is 9.3.5 and the database is oltp server. 9.3.5 is pretty old, you should probably schedule an upgrade. > shared_buffers | 25GB Try setting this to 16GB. It's been a while since I tested on large-memory/high-load systems, but I seem to remember that shared_buffers above 16G could cause these sorts of intermittent stalls. If that doesn't improve the situation, you'll probably need to provide more details, specifically the layout of the table in question, as well as the queries that are active when the problem occurs, and the contents of the pg_locks table when the problem is occurring. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
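When the stall is happening, something along these lines will capture the lock information asked for above (pid, state and query are all available in 9.3's pg_stat_activity):

    SELECT l.pid, l.locktype, l.relation::regclass, l.mode, l.granted,
           a.state, a.query
      FROM pg_locks l
      JOIN pg_stat_activity a ON a.pid = l.pid
     WHERE NOT l.granted;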
Re: [GENERAL] Auto-analyse on insert operations
On Wed, 4 Nov 2015 14:32:37 +0100 Bertrand Roos wrote: > > I try to configure auto-analyse task with postgresql 9.4. > I have the following configuration (default configuration): > track_counts = on > autovacuum = on > log_autovacuum_min_duration = -1 > autovacuum_max_workers = 3 > autovacuum_naptime = 300s > autovacuum_vacuum_threshold = 50 > autovacuum_analyze_threshold = 50 > autovacuum_vacuum_scale_factor = 0.2 > autovacuum_analyze_scale_factor = 0.2 > autovacuum_freeze_max_age = 2 > autovacuum_multixact_freeze_max_age = 4 > autovacuum_vacuum_cost_delay = 20ms > autovacuum_vacuum_cost_limit = -1 > > With this configuration, I can observe that some tables are > auto-analysed, but some others are not. Even if there are millions of > insert operations on an empty table (all tables are in cluster mode). > In fact it seems that tables with update operations are the only ones > that are auto-analysed. > I'm quite suprised because the documentation says that daemon check the > count of insert, update and delete operations. > What could it be the reason ? Why tables which have only update > operation, aren't analysed ? > Are update operations really taken into account ? Given that autoanalyze is pretty critical to the way the system functions, it's unlikely that it just doesn't work (someone else would have noticed). A more likely scenario is that you've found some extremely obscure edge case. If that's the case, you're going to have to give very specific details as to how you're testing it before anyone is liable to be able to help you. I get the impression that you're somewhat new to Postgres, in which case it's very likely that the problem is that you're not testing the situation correctly. In that case, we're going to need specific details on how you're observing that tables are or are not being analysed. As a wild-guess theory: the process that does the analyze only wakes up to check tables every 5 minutes (based on the config you show) ... so are you doing the inserts then checking the table without leaving enough time in between for the system to wake up and notice the change? -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
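As a concrete way to watch whether autoanalyze is touching a given table (all of these columns exist in 9.4):

    SELECT relname, n_live_tup, n_mod_since_analyze,
           last_analyze, last_autoanalyze, autoanalyze_count
      FROM pg_stat_user_tables
     ORDER BY last_autoanalyze NULLS FIRST;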
Re: [GENERAL] Auto-analyse on insert operations
On Wed, 4 Nov 2015 16:43:57 +0100 Bertrand Roos wrote: > > Le 04/11/2015 14:55, Bill Moran a écrit : > > On Wed, 4 Nov 2015 14:32:37 +0100 > > Bertrand Roos wrote: > >> I try to configure auto-analyse task with postgresql 9.4. > >> I have the following configuration (default configuration): > >> track_counts = on > >> autovacuum = on > >> log_autovacuum_min_duration = -1 > >> autovacuum_max_workers = 3 > >> autovacuum_naptime = 300s > >> autovacuum_vacuum_threshold = 50 > >> autovacuum_analyze_threshold = 50 > >> autovacuum_vacuum_scale_factor = 0.2 > >> autovacuum_analyze_scale_factor = 0.2 > >> autovacuum_freeze_max_age = 2 > >> autovacuum_multixact_freeze_max_age = 4 > >> autovacuum_vacuum_cost_delay = 20ms > >> autovacuum_vacuum_cost_limit = -1 > >> > >> With this configuration, I can observe that some tables are > >> auto-analysed, but some others are not. Even if there are millions of > >> insert operations on an empty table (all tables are in cluster mode). > >> In fact it seems that tables with update operations are the only ones > >> that are auto-analysed. > >> I'm quite suprised because the documentation says that daemon check the > >> count of insert, update and delete operations. > >> What could it be the reason ? Why tables which have only update > >> operation, aren't analysed ? > >> Are update operations really taken into account ? > > Given that autoanalyze is pretty critical to the way the system functions, > > it's unlikely that it just doesn't work (someone else would have noticed). > > > > A more likely scenario is that you've found some extremely obscure edge > > case. If that's the case, you're going to have to give very specific > > details as to how you're testing it before anyone is liable to be able > > to help you. > > > > I get the impression that you're somewhat new to Postgres, in which case > > it's very likely that the problem is that you're not testing the situation > > correctly. In that case, we're going to need specific details on how you're > > observing that tables are or are not being analysed. > > > > As a wild-guess theory: the process that does the analyze only wakes up > > to check tables every 5 minutes (based on the config you show) ... so are > > you doing the inserts then checking the table without leaving enough time > > in between for the system to wake up and notice the change? > > > Thanks for your answer Bill. > Indeed, I'm pretty new to Postgres and I don't exclude that I'm doing > something wrong. But I did my test on a more than 1 day duration, so > it's not an issue of autovacuum_naptime (I insert 760 lignes each 30 > seconds during 36 hours). > I can't give all the details of this test because it is to complicated > with triggers and partman (and your objective is not to solve > configuration issues of others). Others have answered some of your other questions, so I'll just throw out another possibility: have the per-table analyze settings been altered on the table(s) that are behaving badly? See http://www.postgresql.org/docs/9.4/static/sql-createtable.html#SQL-CREATETABLE-STORAGE-PARAMETERS Attaching the output of pg_dump -s -t $table_name -U postgres $database_name will probably go a long way toward getting more targeted assistance. 
(substitute the actual database name, and the name of a table that is giving you trouble) In addition, the output of SELECT * FROM pg_stat_user_tables WHERE relname = '$table_name'; (Again, substitute an actual table name that's giving you trouble, preferably the same table as from the pg_dump) -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Protect a table against concurrent data changes while allowing to vacuum it
On Wed, 22 Jun 2016 10:20:38 + Sameer Kumar wrote: > On Wed, Jun 22, 2016 at 6:08 PM Vlad Arkhipov wrote: > > > I am running PostgreSQL 9.5. > > > > CREATE TABLE t (id BIGINT NOT NULL PRIMARY KEY, name TEXT); > > > > The constraint that the data must satisfy is `there is no more than 3 > > records with the same name`. > > > > I am not in control of queries that modify the table, so advisory locks > > can hardly be of help to me. > > > > Define a function which does a count of the rows and if count is 3 it > return false if count is less it returns true. An exclusion constraint might be a better solution. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] The consequences of interrupted vacuum
Very specific question: Does interrupting a VACUUM (not FULL) roll back all the work it has done so far, or is the work done on a page-by-page basis such that at least some of the pages in the table have been vacuumed? I'm asking for cases of large tables where autovacuum frequently gets interrupted. I'm trying to understand if the partial runs are at least making _some_ progress so the next vacuum has less to do, or if this is a serious problem that I need to fiddle with tuning to fix. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] The consequences of interrupted vacuum
On Thu, 27 Oct 2016 10:44:03 -0400 Tom Lane wrote: > > I'm asking for cases of large tables where autovacuum frequently gets > > interrupted. I'm trying to understand if the partial runs are at least > > making _some_ progress so the next vacuum has less to do, or if this is > > a serious problem that I need to fiddle with tuning to fix. > > It's probably making some progress but not much. You need to fix that. Thanks for the feedback. The good news is that grepping through recent logs, I'm not seeing the problem any more. So I must have just noticed it on a particularly problematic day last time I looked. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Unexplained statistics reset? Help tracking it down.
I've been seeing some unexplained behavior where the statistics in a Postgres database reset with no explanation as to why. This is concerning because it results in terrible query plans until someone manually runs analyze, and because it negatively impacts autovacuum. This is Postgres 9.5.4 on Ubuntu 14 (yes, I know there's a patch I need to apply -- it's on my TODO list, but I don't see any mention of fixing unexplained stats resets in that version, so I'm not assuming that will fix it) My first thought is that some random user was calling pg_stat_reset() without realizing what they were doing. However, I have full query logging enabled on this system, and the logs don't show this happening. (Yes, I've also checked for someone disabling query logging for their connection before doing mysterious things). Before I start theorizing that this might be a bug in Postgres, does anyone have any suggestions on what other ways the stats could be reset that I need to check on? Has anyone else experienced this to lend credence to the possibility that it's a bug? I have no clue how to reproduce it, as the occurrence is rare and still seems random. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
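For anyone chasing something similar: the statistics collector records when the per-database counters were last reset, which at least narrows down the time window:

    SELECT datname, stats_reset
      FROM pg_stat_database
     ORDER BY stats_reset DESC NULLS LAST;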
[GENERAL] pg_repack and Postgres versions > 9.4
Does anyone have experience using pg_repack on Postgres versions > 9.4? Specifically 9.5, but probably 9.6 at some point. The documentation claims it supports up to 9.4. I haven't looked at it closely enough to guess whether there might be changes in 9.5/9.6 to cause it not to work any more. Anyone know? Or, alternatively, anyone have another option to get the same job done? -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] controlling memory management with regard to a specific query (or groups of connections)
On Wed, 18 Nov 2015 20:10:00 -0500 Jonathan Vanasco wrote: > As a temporary fix I need to write some uploaded image files to PostgreSQL > until a task server can read/process/delete them. > > The problem I've run into (via server load tests that model our production > environment), is that these read/writes end up pushing the indexes used by > other queries out of memory -- causing them to be re-read from disk. These > files can be anywhere from 200k to 5MB. > > has anyone dealt with situations like this before and has any suggestions? I > could use a dedicated db connection if that would introduce any options. PostgreSQL doesn't have any provisions for preferring one thing or another for storing in memory. The easiest thing I can think of would be to add memory to the machine (or configure Postgres to use more) such that those files aren't pushing enough other pages out of memory to have a problematic impact. Another idea would be to put the image database on a different physical server, or run 2 instances of Postgres on a single server with the files in one database configured with a low shared_buffers value, and the rest of the data on the other database server configured with higher shared_buffers. I know these probably aren't the kind of answers you're looking for, but I don't have anything better to suggest; and the rest of the mailing list seems to be devoid of ideas as well. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] JSONB performance enhancement for 9.6
On Sat, 28 Nov 2015 21:27:51 -0500 Tom Smith wrote: > > Is there a plan for 9.6 to resolve the issue of very slow query/retrieval > of jsonb fields > when there are large number (maybe several thousands) of top level keys. > Currently, if I save a large json document with top level keys of thousands > and query/retrieve > field values, the whole document has to be first decompressed and load to > memory > before searching for the specific field key/value. I could be off-base here, but have you tried: ALTER TABLE $table ALTER COLUMN $json_column SET STORAGE EXTERNAL; ? The default storage for a JSONB field is EXTENDED. Switching it to EXTERNAL will disable compression. You'll have to insert your data over again, since this change doesn't alter any existing data, but see if that change improves performance. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] JSONB performance enhancement for 9.6
On Sun, 29 Nov 2015 08:24:12 -0500 Tom Smith wrote: > Hi, Thanks for everyone's response. > > The issue is not just compression, but lack of "indexing" or "segmentation" > when a > single doc has, say 2000 top level keys (with multiple levels of subkeys). > right now, if I query for one key, the whole doc > has to be first uncompressed and loaded and then search for the single key. > > Compared to traditional way of storing each top level key with a separate > column, this is huge overhead when table scan is required. Some kind of > "keyed/slotted" storage for the doc could > help, (for illustration, all keys starting with 'A' would have its own > storage unit, so on, > so when I search for key "A1" only that unit would be unpacked and > traversed to get :"A1" value". it is like postgresql predfine 26 > columns/slots for the whole doc. an internal indexing > within each doc for fast retrieval of individual field values. Sounds like you're pushing the limits of what JSONB is designed to do (at this stage, at least). I'm not aware of any improvements in recent versions (or head) that would do much to improve the situation, but I don't track every commit either. If you really need this improvement and you're willing to wait for 9.6, then I suggest you check out the latest git version and test on that to see if anything has been done. I doubt you'll see much, though. As a thought experiment, the only way I can think of to improve this use case is to ditch the current TOAST system and replace it with something that stores large JSON values in a form optimized for indexed access. That's a pretty massive change to some fairly core stuff just to optimize a single use-case of a single data type. Not saying it won't happen ... in fact, all things considered, it's pretty likely to happen at some point. As far as a current solution: my solution would be to decompose the JSON into an optimized table. I.e.: CREATE TABLE store1 ( id SERIAL PRIMARY KEY, data JSONB ); CREATE TABLE store2 ( id INT NOT NULL REFERENCES store1(id), top_level_key VARCHAR(1024), data JSONB, PRIMARY KEY(top_level_key, id) ); You can then use a trigger to ensure that store2 is always in sync with store1. Lookups can then use store2 and will be quite fast because of the index. A lot of the design is conjectural: do you even still need the data column on store1? Are there other useful indexes? etc. But, hopefully the general idea is made clear. These probably aren't the answers you want, but (to the best of my knowledge) they're the best answers available at this time. I'd really like to build the alternate TOAST storage, but I'm not in a position to start on a project that ambitious right now ... I'm not even really keeping up with the project I'm currently supposed to be doing. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
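A rough sketch of the sync trigger mentioned above, against the store1/store2 layout just shown (not tuned for heavy write loads; the function and trigger names are arbitrary):

    CREATE OR REPLACE FUNCTION store1_sync() RETURNS trigger AS $$
    BEGIN
        -- replace any previously decomposed keys for this row
        DELETE FROM store2 WHERE id = NEW.id;
        INSERT INTO store2 (id, top_level_key, data)
        SELECT NEW.id, e.key, e.value
          FROM jsonb_each(NEW.data) AS e(key, value);
        RETURN NEW;
    END
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER store1_sync_trg
        AFTER INSERT OR UPDATE ON store1
        FOR EACH ROW EXECUTE PROCEDURE store1_sync();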
Re: [GENERAL] AccessExclusiveLock on tuple?
On Wed, 2 Dec 2015 09:01:37 -0800 Christophe Pettus wrote: > On 9.4, I've encountered a locking message I've not seen before: > > process 5293 still waiting for AccessExclusiveLock on tuple (88636,15) > of relation 18238 of database 16415 after 5000.045 ms > > What conditions produce an "AccessExclusiveLock on tuple"? Attempting to > lock a tuple when another process has done an explicit LOCK ACCESS EXCLUSIVE? No. See the section on row level locks here: http://www.postgresql.org/docs/9.4/static/explicit-locking.html Essentially, any data modification could take an exclusive lock on the row(s) that it's going to modify. Generally, this will be an UPDATE statement, although the same thing happens when you do SELECT ... FOR UPDATE. The message you're seeing simply means that one process has been waiting for a long time for the lock to release (5 seconds in this case). Deadlocks are automatically handled, so this is not a deadlock. Although if the process holding the lock does not commit the transaction, the waiting process will wait indefinitely. If this is happening infrequently, it's probably of no concern. If it's happening frequently, you'll want to investigate what process is holding the locks for so long and see what can be done about it. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] AccessExclusiveLock on tuple?
On Wed, 2 Dec 2015 09:31:44 -0800 Christophe Pettus wrote: > > On Dec 2, 2015, at 9:25 AM, Bill Moran wrote: > > > No. See the section on row level locks here: > > http://www.postgresql.org/docs/9.4/static/explicit-locking.html > > That wasn't quite my question. I'm familiar with the row-level locking and > the locking messages in general, but this message implies there is such a > thing as an AccessExclusiveLock on a tuple, which is new to me. I wasn't > able to produce this message experimentally doing various combinations of > UPDATE statements and SELECT FOR UPDATEs, or even with explicit LOCK ACCESS > EXCLUSIVE MODE, thus the question. First off, that documentation page _does_ answer your question. Secondly, there is a config setting: log_lock_waits, which is disabled by default. The message won't appear if that is off, so if you're testing on a different install than where the incident happened, that could be part of the problem. Finally, the following sequence triggers the message: create table test1 (data int); insert into test1 values (1); Connection 1: begin; select * from test1 where data = 1 for update; Connection 2: select * from test1 where data = 1 for update; Then wait for a little while and the message will be logged. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] JDBC and inet type
On Fri, 4 Dec 2015 09:41:24 + Tim Smith wrote: > When I use "preparedStatement.setString(5,ip);" to send values to a > stored function, it obviously gets sent to postgres as "character > varying". > > Postgres obviously complains loudly and says " Hint: No function > matches the given name and argument types. You might need to add > explicit type casts.". > > What is the appropriate workaround ? You can define param 5 as varchar in your query, as Rob suggests: CREATE FUNCTION some_function(int, int, int, int, varchar) ... Then cast the 5th parameter to INET within your function. You can also cast the value in your SQL. sql = "SELECT some_function(?, ?, ?, ?, ?::INET)"; ... You could also create an Inet class in Java and implement the SQLData interface, then use setObject() instead of setString(). It doesn't appear as if anyone has done this yet, but it would be nice if it were included in the JDBC driver. The first answer is probably best for stored procedures, as it simplifies things down the road. The second solution is more universal, as it works for non-function-calling SQL as well. The third solution is probably _really_ the correct one, from a pedantic standpoint, but it's a bit more work to implement. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
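As an illustration of the first approach, a hypothetical function that accepts varchar and does the cast server-side (the parameter names and the return value are made up for the example):

    CREATE OR REPLACE FUNCTION some_function(a_ int, b_ int, c_ int, d_ int, ip_ varchar)
    RETURNS inet AS $$
    BEGIN
        RETURN ip_::inet;  -- the cast to inet happens inside the function
    END
    $$ LANGUAGE plpgsql;

    -- the Java side can then keep using setString(5, ip) against
    -- "SELECT some_function(?, ?, ?, ?, ?)"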
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
77 > > postgres: postgres postgres [local] SELECT > > 26851 postgres 20 0 2365732 408464 406788 R 100.0 10.1 0:17.81 > > postgres: postgres postgres [local] SELECT > > 26851 postgres 20 0 2365732 864472 862576 R 100.0 21.4 0:38.90 > > postgres: postgres postgres [local] SELECT > > -- Function execmultiplei and transaction terminated, but memory still > > allocated!!! > > 26851 postgres 20 0 2365732 920668 918748 S 0.0 22.7 0:41.40 > > postgres: postgres postgres [local] idle > > -- Calling it again > > 26851 postgres 20 0 2365732 920668 918748 R 99.0 22.7 0:46.51 > > postgres: postgres postgres [local] SELECT > > -- idle again, memory still allocated > > 26851 postgres 20 0 2365732 920668 918748 S 0.0 22.7 1:22.54 > > postgres: postgres postgres [local] idle > > > > Memory will only be released if psql is exited. According to the > > PostgreSQL design memory should be freed when the transaction completed. > > > > top commands on FreeBSD: top -SaPz -o res -s 1 > > top commands on Linux: top -o RES d1 > > > > Config: VMs with 4GB of RAM, 2 vCPUs > > shared_buffers = 2048MB # min 128kB > > effective_cache_size = 2GB > > work_mem = 892MB > > wal_buffers = 8MB > > checkpoint_segments = 16 -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
On Sun, 13 Dec 2015 16:35:08 +0100 Gerhard Wiesinger wrote: > Hello Bill, > > Thank you for your response, comments inline: > > On 13.12.2015 16:05, Bill Moran wrote: > > On Sun, 13 Dec 2015 09:57:21 +0100 > > Gerhard Wiesinger wrote: > >> some further details from the original FreeBSD 10.1 machine: > >> > >> Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free > >> Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse > >> > >> PID USERNAMETHR PRI NICE SIZERES STATE C TIME WCPU > >> COMMAND > >> 77941 pgsql 5 200 7925M 7296M usem2 352:34 6.98% > >> postgres: username dbnamee 127.0.0.1(43367) (postgres) > > > > > > I see no evidence of an actual leak here. Each process is basically using > > the 7G of shared_buffers you have allocated in the config (which is only > > 7G _total_ for all processes, since it's shared memory) > > OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed? You haven't provided enough information to isolate that cause yet. What's in the Postgres log? Surely it will have logged something when its request for RAM was denied, and it should be more informational than the OS' generic message. > >> Out of memory: > >> kernel: swap_pager_getswapspace(4): failed > >> kernel: swap_pager_getswapspace(8): failed > >> kernel: swap_pager_getswapspace(3): failed > >> > >> Main issue is IHMO (as far as I understood the FreeBSD Memory system) > >> that 20G are INACTIVE. When I subtract the shared memory, even ~13GB > >> should be available, but they are still allocated but inactive > >> (INACTIVE). INACTIVE memory might be clean or dirty. As we get into out > >> of memory situations it is likely that the memory is dirty (otherwise it > >> would have been reused). > > Not quite correct. Inactive memory is _always_ available for re-use. > > Are you sure that's true? Yes. Read The Design and Implementation of FreeBSD for the details. > Monitoring inactive memory: > cat vm_stat.sh > #!/usr/bin/env bash > > while [ 1 ]; do >date +%Y.%m.%d.%H.%M.%S >sysctl -a | grep vm.stats.vm. >sleep 1 > done > > And even we get out of memory with swap_pager_getswapspace Inactive > Memory (from the log file) is around 20GB (doesn't go down or up) > vm.stats.vm.v_inactive_count: 5193530 (*4k pages is around 20GB) > > Then we have 20GB inactive memory, but we still get out of memory with > kernel: swap_pager_getswapspace(4): failed. Any ideas why? Theory: If the planner decides it needs to do 30 sort operations for a query, it will try to allocate 27G of RAM, which exceeds what's available, and therefore never gets allocated. So you get the "out of space" message, but the actual memory usage doesn't change. > >> maintenance_work_mem = 512MB > >> effective_cache_size = 10GB > >> work_mem = 892MB > > I expect that this value is the cause of the problem. The scenario you > > describe below is sorting a large table on an unindexed column, meaning > > it will have to use all that work_mem. I'd be interested to see the > > output of: > > > > EXPLAIN ANALYZE SELECT * FROM t_random ORDER BY md5 LIMIT 10; > > That was only a test query, has nothing to do with production based > query. They are mostly SELECT/INSERTS/UPDATES on primary keys. Then provide the _actual_ queries and the EXPLAIN ANALYZE and the table layouts and basic data distribution of the actual cause. If your test case is completely non-representative of what's happening, then you're not going to get useful answers. > > But even without that information, I'd recommend you reduce work_mem > > to about 16M or so. 
> > Why so low? E.g. sorting on reporting or some long running queries are > then done on disk and not in memory. Even a simple query could involve multiple sorts, and you're allowing each sort to use up to 890M of RAM (which is _not_ shared). As noted earlier, even a moderately complex query could exceed the available RAM on the system. But since you don't provide the actual queries and tables causing problems, I can only guess. And since you appear to have already decided what the cause of the problem is, then crafted completely irrelevant queries that you think prove your point, I'm not sure there's anything I can do to help you. > >> wal_buffers = 8MB > >> checkpoint_segments = 16 > >> shared_buffers = 7080MB > >> max_connections = 80 > >> autovacuum_max_workers = 3 > > [snip] > > > >>> W
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
On Sun, 13 Dec 2015 20:09:04 +0100 Gerhard Wiesinger wrote: > On 13.12.2015 18:17, Tom Lane wrote: > > Gerhard Wiesinger writes: > >>> Mem: 7814M Active, 20G Inact, 2982M Wired, 232M Cache, 1661M Buf, 30M Free > >>> Swap: 512M Total, 506M Used, 6620K Free, 98% Inuse > >> OK, but why do we then get: kernel: swap_pager_getswapspace(4): failed? > > Just judging from the name of the function, I would bet this is a direct > > result of having only 512M of swap configured. As Bill already pointed > > out, that's a pretty useless choice on a system with 32G of RAM. As soon > > as the kernel tries to push out any significant amount of idle processes, > > it's gonna be out of swap space. The numbers you show above prove that > > it is almost out of free swap already. > > The system wasn't designed by me, I wouldn't do it either that way. Does > swapoff help? FreeBSD and Linux (and most modern OSes) are designed to have swap, and usually more swap than RAM. I have never heard a good reason for not using swap, and the reasons I _have_ heard have always come from people misinformed about how the OS works. If someone has a _good_ explanation for why you wouldn't want any swap on a DB server, I'd love to hear it; but everything I've heard up till now has been speculation based on misinformation. IOW: no, you should not turn swap off, you should instead allocate the appropriate amount of swap space. > > Also, while that 20G of "inactive" pages may be candidates for reuse, > > they probably can't actually be reused without swapping them out ... > > and there's noplace for that data to go. > > There is no log in syslog (where postgres log) when > swap_pager_getswapspace is logged. > > But why do we have 20G of Inactive pages? They are still allocated by > kernel or user space. As you can see below (top output) NON Postgres > processes are around 9G in virtual size, resident even lower. The system > is nearly idle, and the queries typically aren't active after one second > agin. Therefore where does the rest of the 11G of Inactive pages come > from (if it isn't a Postgres/FreeBSD memory leak)? > I read that Postgres has it's own memory allocator: > https://www.reddit.com/r/programming/comments/18zija/github_got_30_better_performance_using_tcmalloc/ > Might that be an issue with double allocation/freeing and the "cheese > hole" topic with memory fragmentation? If there were a memory leak in either FreeBSD or Postgres of the seriousness you're describing that were as easy to trigger as you claim, I would expect the mailing lists and other support forums to be exploding in panic. Notice that they are not. Also, I still don't see _ANY_ evidence of a leak. I see evidence that something is happening that is trying to allocate a LOT of RAM, that isn't available on your system; but that's not the same as a leak. > https://www.opennet.ru/base/dev/fbsdvm.txt.html > inactivepages not actively used by programs which are > dirty and (at some point) need to be written > to their backing store (typically disk). > These pages are still associated with objects and > can be reclaimed if a program references them. > Pages can be moved from the active to the inactive > queue at any time with little adverse effect. > Moving pages to the cache queue has bigger > consequences (note 1) Correct, but, when under pressure, the system _will_ recycle those pages to be available. 
Tom might be correct in that the system thinks they are inactive because it could easily push them out to swap, but then it can't _actually_ do that because you haven't allocated enough swap, but that doesn't match my understanding of how inactive is used. A question of that detail would be better asked on a FreeBSD forum, as the differences between different VM implementations can be pretty specific and technical. [snip] > Mem: 8020M Active, 19G Inact, 3537M Wired, 299M Cache, 1679M Buf, 38M Free > Swap: 512M Total, 501M Used, 12M Free, 97% Inuse > >PID USERNAMETHR PRI NICE SIZERES STATE C TIMEWCPU COMMAND > 77941 pgsql 5 200 7921M 7295M usem7 404:32 10.25% > postgres > 79570 pgsql 1 200 7367M 6968M sbwait 6 4:24 0.59% postgres [snip about 30 identical PG processes] > 32387 myusername9 200 980M 375M uwait 5 69:03 1.27% node [snip similar processes] >622 myusername1 200 261M 3388K kqread 3 41:01 0.00% nginx [snip similar proces
Re: [GENERAL] Memory Leak executing small queries without closing the connection - FreeBSD
On Sun, 13 Dec 2015 22:23:19 +0100 Gerhard Wiesinger wrote: > On 13.12.2015 21:14, Bill Moran wrote: > > Wait ... this is a combined HTTP/Postgres server? You didn't mention that > > earlier, and it's kind of important. > > > > What evidence do you have that Postgres is actually the part of > > this system running out of memory? > > For me the complete picture doesn't look consistent. That's because you haven't gathered enough of the right type of information. > > I don't see any such evidence in any of > > your emails, and (based on experience) I find it pretty likely that whatever > > is running under node is doing something in a horrifically > > memory-inefficient > > manner. Since you mention that you see nothing in the PG logs, that makes it > > even more likely (to me) that you're looking entirely in the wrong place. > > > > I'd be willing to bet a steak dinner that if you put the web server on a > > different server than the DB, that the memory problems would follow the > > web server and not the DB server. > > Changes in config: > track_activity_query_size = 102400 > work_mem = 100MB > > Ok, we restarted PostgreSQL and had it stopped for seconds, and logged > top every second: > > When PostgreSQL was down nearly all memory was freed, looks good to me. > So it is likely that node and other processes are not the cause. > Mem: 742M Active, 358M Inact, 1420M Wired, 21G Cache, 871M Buf, 8110M Free > Swap: 512M Total, 477M Used, 35M Free, 93% Inuse > > When PostgreSQL restarted, Inactive was growing fast (~1min): > Mem: 7998M Active, 18G Inact, 2763M Wired, 1766M Cache, 1889M Buf, 1041M > Free > Swap: 512M Total, 472M Used, 41M Free, 92% Inuse > > After some few minutes we are back again at the same situation: > Mem: 8073M Active, 20G Inact, 2527M Wired, 817M Cache, 1677M Buf, 268M Free > Swap: 512M Total, 472M Used, 41M Free, 92% Inuse > > The steak dinner is mine :-) Donating to the PostgreSQL community :-) Based on the fact that the inactive memory increased? Your understanding of inactive memory in FreeBSD is incorrect. Those pages are probably DB pages that the OS is keeping in inactive memory because Postgres requests them over and over, which is what the OS is supposed to do to ensure the best performance. Are you seeing any out of swap space errors? Even if you are, you still haven't determined if the problem is the result of Postgres or the node.js stuff you have running. I don't know what node.js might be caching on the client side ... do you? No. Until you can actually report back something other than wild speculation, I'll keep that steak dinner for myself. Besides, that bet was based on you putting the PG server on separate hardware from the web server, which you didn't do. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Test disk reliability (or HGST HTS721010A9E630 surprisingly reliable)
On Mon, 21 Dec 2015 14:54:14 +0100 Félix GERZAGUET wrote: > On Mon, Dec 21, 2015 at 12:31 AM, Jim Nasby > wrote: > > > On 12/20/15 1:09 PM, Félix GERZAGUET wrote: > > > >> After reading > >> http://www.postgresql.org/docs/current/static/wal-reliability.html, I > >> tried the recommended diskchecker.pl > >> <http://brad.livejournal.com/2116715.html> but I am not satisfied: > >> > >> I always get: > >> Total errors: 0 > >> > >> even though I tested with a HGST HTS721010A9E630 that the vendor's > >> datasheet > >> (http://www.hgst.com/sites/default/files/resources/TS7K1000_ds.pdf) > >> advertises as " > >> Designed for low duty cycle, non mission-critical applications in > >> PC, nearline and consumer electronics environments, which vary > >> application to application > >> " > >> > >> Since it is not a high-end disk, I expected some errors. > >> > > > > Why? Just because a disk isn't enterprise-grade doesn't mean it has to lie > > about fsync, which is the only thing diskchecker.pl tests for. > > > > I was thinking that since the disk has a 32MB write cache (with no > battery) it would lie to the OS (and Postgres) about when data is really > on disk (not just in the disk's write cache). But maybe that thinking was wrong. It varies by vendor and product, which is why diskchecker.pl exists. It's even possible that the behavior is configurable ... check to see if the vendor provides a utility for configuring it. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] uuid-ossp: Performance considerations for different UUID approaches?
On Tue, 22 Dec 2015 11:07:30 -0600 Brendan McCollam wrote: > (previously posted to the pgsql-performance list) > > Hello, > > We're in the process of designing the database for a new service, and > some of our tables are going to be using UUID primary key columns. > > We're trying to decide between: > > * UUIDv1 (timestamp/MAC uuid) and > > * UUIDv4 (random uuid) > > And the separate but related choice between: > > * Generating the UUIDs client-side with the Python uuid library > (https://docs.python.org/2/library/uuid.html) or > > * Letting PostgreSQL handle uuid creation with the uuid-ossp extension > (http://www.postgresql.org/docs/9.4/static/uuid-ossp.html) > > In terms of insert and indexing/retrieval performance, is there one > clearly superior approach? If not, could somebody speak to the > performance tradeoffs of different approaches? > > There seem to be sources online (e.g. > https://blog.starkandwayne.com/2015/05/23/uuid-primary-keys-in-postgresql/ > http://rob.conery.io/2014/05/29/a-better-id-generator-for-postgresql/) > that claim that UUIDv4 (random) will lead to damaging keyspace > fragmentation and using UUIDv1 will avoid this. There's no substance to these claims. Chasing the links around we finally find this article: http://www.sqlskills.com/blogs/kimberly/guids-as-primary-keys-andor-the-clustering-key/ which makes the reasonable argument that random primary keys can cause performance-robbing fragmentation on clustered indexes. But Postgres doesn't _have_ clustered indexes, so that article doesn't apply at all. The other authors appear to have missed this important point. One could make the argument that the index itself becoming fragmented could cause some performance degradation, but I've yet to see any convincing evidence that index fragmentation produces any measurable performance issues (my own experiments have been inconclusive). Looking at it another way, a quick experiment shows that PG can fit about 180 UUID primary keys per database page, which means a million-row table will use about 5600 pages, to the tune of about 46 MB. On modern hardware, that index is likely to be wholly in memory all the time. If your performance requirements are really so dire, then you should probably consider ditching UUIDs as keys. Taking the same million-row table I postulated in the previous paragraph, but using ints instead of UUIDs for the primary key, the primary key index would be about 3200 pages (~26 MB) ... or almost 1/2 the size -- making it more likely to all be in memory at any point in time. I seriously doubt that trying to make your UUIDs generate in a predictable fashion will produce any measurable improvement, and I see no evidence in the articles you cited that the claims otherwise have any real basis or were made by anyone knowledgeable enough to know. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
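If you would rather measure than rely on the page arithmetic above, a throwaway comparison is easy to build; the table names below are made up for illustration, and the exact numbers will vary with PostgreSQL version and fillfactor:

-- uuid-ossp provides both generators discussed in the question
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- One table keyed by random UUIDs, one by plain bigints
CREATE TABLE t_uuid (id uuid PRIMARY KEY);
CREATE TABLE t_int (id bigint PRIMARY KEY);

INSERT INTO t_uuid SELECT uuid_generate_v4() FROM generate_series(1, 1000000);
INSERT INTO t_int SELECT generate_series(1, 1000000);

-- Compare the on-disk size of the two primary key indexes
SELECT pg_size_pretty(pg_relation_size('t_uuid_pkey')) AS uuid_pkey,
       pg_size_pretty(pg_relation_size('t_int_pkey')) AS int_pkey;

Swapping uuid_generate_v4() for uuid_generate_v1() in the first insert makes it just as easy to check whether time-ordered UUIDs change anything measurable for your own workload.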
Re: [GENERAL] Enforcing referential integrity against a HSTORE column
On Sat, 2 Jan 2016 07:30:38 -0800 Adrian Klaver wrote: > > So given: > > > > CREATE TABLE xtra_fields( > >xfk SERIAL PRIMARY KEY, > >xtk INTEGER NOT NULL REFERENCES xtra_types, > >... > > ); > > > > CREATE OR REPLACE FUNCTION foo_xtra_fk(HSTORE) RETURNS BOOLEAN AS $$ > > WITH keyz AS (SELECT skeys($1)::INT AS xfk) > > SELECT > >(SELECT COUNT(*) FROM keyz JOIN xtra_fields USING (xfk)) > >= > >(SELECT COUNT(*) FROM keyz) > > $$LANGUAGE SQL STABLE STRICT LEAKPROOF; > > > > CREATE TABLE foo( > >id INTEGER NOT NULL CHECK (id > 0), > >... > > -- Extra fields where the keys are the xtra_fields.xfk values and the > > values are the > > -- data values for the specific xfk. > >xtra hstore CHECK (foo_xtra_fk(xtra)) > > ); > > > > is there a more efficient way of maintaining logical referential integrity? I second Adrian's comment on making sure that the benefit of HSTORE outweighs the drawback of having to write your own checks ... however, if you decide that HSTORE is the right way to go, you may want to try something more along the lines of this for your check: SELECT true WHERE NOT EXISTS (SELECT 1 FROM keyz WHERE xfk NOT IN (SELECT xfk FROM xtra_fields)); Not tested, so it's possible that I have some typo or something; but overall I've found that the NOT EXISTS construct can be very efficient in cases like these. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
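For completeness, a self-contained sketch of that idea (also untested, and assuming the xtra_fields table from the quoted schema): unnesting akeys() turns the hstore keys into a proper row set, which avoids comparing a scalar against a text[] and lets each key be probed against the xtra_fields primary key:

-- Sketch only: returns true when every hstore key resolves to an xtra_fields.xfk
CREATE OR REPLACE FUNCTION foo_xtra_fk(HSTORE) RETURNS BOOLEAN AS $$
    SELECT NOT EXISTS (
        SELECT 1
        FROM unnest(akeys($1)) AS k(key)
        WHERE NOT EXISTS (
            SELECT 1 FROM xtra_fields xf WHERE xf.xfk = k.key::int
        )
    );
$$ LANGUAGE SQL STABLE STRICT;

As with any CHECK-based substitute for a real foreign key, this only validates rows of foo as they are written; deleting a row from xtra_fields afterwards will not be caught, which is part of the trade-off Adrian was pointing at.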
Re: [GENERAL] Code of Conduct: Is it time?
On Sun, 10 Jan 2016 07:36:23 -0800 "Joshua D. Drake" wrote: > Hey, > > For the record, my thoughts on a CoC are something like: > > 1. Be excellent to each other > 2. If you don't know what that means, leave > 3. If someone isn't being excellent please contact: XYZ > > With XYZ being a committee that determines the ABCs. In general, I agree; but there are problems with 1 and 2. The definition of "being excellent" varies from individual to individual; but more importantly, from culture to culture. Consequently, pretty much everyone would have to leave under #2, because very few people know what "being excellent" means to everyone involved. As a result, I would feel REALLY bad for XYZ, who would be put in the unenviable place of trying to mediate disputes with no guidance whatsoever. So, the purpose of a CoC is twofold: A) Define what "being excellent" means to this particular community. B) Provide a process for how to resolve things when "being excellent" doesn't happen. Without (A), nobody will want to do (B), as it's basically a job that can never be done correctly. But defining (A) is the really difficult part, because no matter how you define it, there will be some people who disagree with said definition. The fact that Postgres has not needed a CoC up till now is a testament to the quality of the people in the community. However, if Postgres continues to grow more popular, the number of people involved is going to increase. Simply as a matter of statistics, the project will be forced to deal with some unsavory people at some point. Having a CoC lays the foundation to ensure that dealing with those people involves the least pain possible. It will always involve _some_ pain, but less is better. I've done the job of #3 with other groups, and 99% of the time there was nothing to do. The one incident I had to handle was terrible, but at least I had some guidance on how to deal with it. -- Bill Moran -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general