Re: [GENERAL] PostgreSQL data loads - turn off WAL

2012-09-26 Thread Albe Laurenz
hartrc wrote:
> Basically my question is:
> Is there currently any way to avoid wal generation during data load for
> given tables and then have point in time recovery after that?
>
> Background and blurb
> The situation i'm referring to here is for a development environment. I
> require point in time recovery because if there is a crash etc. I don't want to
> lose up to a day's work for 12 developers. I'm fairly new to PostgreSQL so
> please forgive any gaps in my knowledge.
>
> A developer did a data load yesterday of approximately 5GB of data into a
> new schema. This generated approximately 7GB of wal.  The situation arises
> where if something is incorrect in the data load the data load may need to
> be repeated 2 or 3 times (thus generating 20GB+ of WAL). For a data load I
> don't want wal to be generated. I accept the fact that before there was nothing
> and from the point of the next pg_basebackup there was everything. It is
> from the point I say "ok, that is everything" (the next backup) that I want
> point in time recovery to apply to that table.
> It doesn't seem practical, and appears very risky, to turn off wal_archive
> during the data load.

But that's exactly what I'd recommend.

Set archive_mode=off, restart the server, load your data,
set archive_mode=on, restart the server, perform a backup.

If something goes wrong during data load, restore the
previous backup and PITR to the end.

Turning off WAL archiving poses no risk of data corruption.
The server will still be able to recover from crashes.

Yours,
Laurenz Albe




Re: [GENERAL] Multiple Schema and extensions

2012-09-26 Thread Albe Laurenz
Alan Nilsson wrote:
> Is it the case that extensions can be added to only one schema?  If so, what is the recommended
> practice for accessing a function from an extension in multiple schemas?

Yes, a (relocatable) extension can belong to only one schema.
There are two ways to access the functions:
1) Use qualified names.
2) Set search_path to contain the schema.
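
As a minimal sketch of both approaches, assuming the relocatable hstore
extension installed into a schema named "extensions" (the schema name is
only illustrative):

CREATE SCHEMA extensions;
CREATE EXTENSION hstore SCHEMA extensions;

-- 1) qualified names
SELECT extensions.hstore('a', '1');

-- 2) or put the schema on the search_path
SET search_path = public, extensions;
SELECT hstore('a', '1');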

> Is it *ok* to load the extension in the pg_catalog schema so functions can be accessed by unqualified
> names?  Is it *better* to have a separate schema for functions and use qualified names to access?
> What is the general accepted practice?

No, that is not ok.
pg_catalog is only for system objects.
You should use schema "public" - that is in search_path by
default, and everybody can use it.

> As a side note:
> ALTER EXTENSION foo ADD SCHEMA bar
> What the heck does this do?  AFAICS, it has no effect on the visibility
> of an extension function in the *added* schema.

This makes "bar" belong to extension "foo", not the other way around.
The effect is that
a) "bar" will not be dumped by pg_dump
b) "DROP EXTENSION foo" will drop schema "bar"

Yours,
Laurenz Albe




Re: [GENERAL] idle in transaction query makes server unresponsive

2012-09-26 Thread Albe Laurenz
Scot Kreienkamp wrote:
> I have a problem that I've been struggling with for quite some time.  Every once in a while I will get
> a connection that goes to idle in transaction on an in-house programmed application that connects with
> JDBC.  That happens fairly regularly and the programmers are trying to clean that up, but sometimes
> the idle in transaction connection makes the PG server entirely unresponsive.  I'm not getting
> connection refused, nothing.  All connections existing or new, JDBC or psql, just hang.  I've already
> got full query logging on to try to catch the problem query or connection so I can give the developers
> somewhere to look to resolve their issue with the application, but since queries are logged with
> runtimes I'm assuming they are only logged after they are complete.  And since it's idle in
> transaction it never completes so it never gets logged.  Our application is connecting as an
> unprivileged user named rmstomcat, and the database is limited to 400 connections out of 512.  I'm not
> running out of connections as I've got reserved connections set, and even connecting as user postgres
> with psql the connection just hangs.  The server doesn't appear to be running out of memory when this
> happens and nothing is printed in the log.  The only thing that resolves it is doing a kill on the PID
> of any idle in transaction connections existing at the time causing them to roll back.  Then
> everything else picks up right where it left off and works again.
> 
> Can anyone give me any hints about why PG becomes unresponsive?  Or how to fix it so it doesn't?
> 
> My server is 9.1.2 right now.  I will be upgrading to the latest 9.1 series soon, but until 9.2 can be
> run through our development/testing cycle I can't upgrade to 9.2.  That will take about 6-10 months.

Yes, see if upgrading to the latest 9.1 minor release makes the problem disappear.

It is surprising that you cannot even start new connections.

You could try to "strace" the postmaster during a connection attempt
and see what happens.  Maybe that helps to spot the place where
things go wrong.
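
As a side note, the manual cleanup described in the original post (a kill
on the PID of the idle in transaction connections) can also be done from
SQL; a sketch, assuming the 9.1 names of the pg_stat_activity columns:

-- list sessions idle in a transaction and the age of that transaction
SELECT procpid, usename, xact_start, now() - xact_start AS xact_age
FROM pg_stat_activity
WHERE current_query = '<IDLE> in transaction'
ORDER BY xact_start;

-- terminate one of them (equivalent to a kill on the backend PID)
SELECT pg_terminate_backend(12345);  -- 12345 is a placeholder PID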

Yours,
Laurenz Albe




[GENERAL] shared memory settings

2012-09-26 Thread Alexander Shutyaev
Hi all!

We have at present the following parameters related to shared memory:

*postgres*

shared_buffers = 7GB
max_connections = 1500
max_locks_per_transaction = 1024
max_prepared_transactions = 0 (not set)

*system*

SHMALL = 2097152
SHMMAX = 17 670 512 640
SHMMNI = 4096

The amount of RAM is 24 693 176k

We need to increase max_connections to 3000. When we tried to do this we
got an error

[1-1] FATAL:  could not create shared memory segment: No space left on device
[2-1] DETAIL:  Failed system call was shmget(key=5432001, size=8964661248, 03600).
[3-1] HINT:  This error does *not* mean that you have run out of disk
space.  It occurs either if all available shared memory IDs have been
taken, in which case you need to raise the SHMMNI parameter in your kernel,
or because the system's overall limit for shared memory has been reached.
 If you cannot increase the shared memory limit, reduce PostgreSQL's shared
memory request (currently 8964661248 bytes), perhaps by reducing
shared_buffers or max_connections.
 The PostgreSQL documentation contains more information about shared
memory configuration.

The hint proposes to increase SHMMNI kernel parameter, but I'm not sure
about how much to add :) Also, do we need to change any other parameters
accordingly?

Thanks in advance,
Alexander


Re: [GENERAL] shared memory settings

2012-09-26 Thread Devrim GÜNDÜZ

Hi,

On Wed, 2012-09-26 at 13:39 +0400, Alexander Shutyaev wrote:

> We need to increase max_connections to 3000. When we tried to do this
> we got an error 



Not an answer to your question but: Are you talking about 3k
*concurrent* connections? What are the server specs? I am not sure you
can handle that many connections, unless you are using HP DL 980 or  HP
Superdome or so. 

You may also want to consider using a connection pooler, namely pgbouncer.

Regards,
-- 
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz




[GENERAL] PostgreSQL, OLAP, and Large Clusters

2012-09-26 Thread Ryan Kelly
Hi:

The size of our database is growing rather rapidly. We're concerned
about how well Postgres will scale for OLAP-style queries over terabytes
of data. Googling around doesn't yield great results for vanilla
Postgres in this application, but generally links to other software like
Greenplum, Netezza, and Aster Data (some of which are based off of
Postgres). Too, there are solutions like Stado. But I'm concerned about
the amount of effort to use such solutions and what we would have to
give up feature-wise.

We love a lot of the features that we currently have that (I think)
would only be available with Postgres: arrays, hstore, gin/gist indexes,
extensions. Not to mention a lot of the other great SQL standard
features not available in other open-source databases, especially window
functions and CTEs.

Of course, migration to another solution is feasible (with enough
effort, of course), but given my experiences with Postgres and the
support provided by the community that is second to none, I'd very much
like to stay with PostgreSQL.

Thoughts?

-Ryan Kelly




Re: [GENERAL] Running CREATE only on certain Postgres versions

2012-09-26 Thread Adrian Klaver

On 09/25/2012 08:25 AM, Daniele Varrazzo wrote:

On Tue, Sep 25, 2012 at 3:47 PM, Adrian Klaver  wrote:


To elaborate:
test=> SELECT current_setting('server_version_num');
 current_setting
-----------------
 90009


Yes, but knowing that, how does he run a statement only if version
e.g. >= 80400? Is there a better way than the proposed
create/call/drop function before PG 9.0? (since 9.0 there is the DO
statement).


From the OP:
"I have some code which creates a function in Postgres.."

It is unclear what that code is in its entirety. If it is straight SQL 
then I see no other choice than the above create/call/drop. If it is 
some other language over SQL then I could see an IF statement or its 
equivalent.
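
For the 9.0-and-later case, a minimal sketch of the DO approach (the
function being created is hypothetical):

DO $do$
BEGIN
    -- only run the CREATE when the server is 8.4 or newer
    IF current_setting('server_version_num')::int >= 80400 THEN
        EXECUTE $cmd$
            CREATE OR REPLACE FUNCTION my_func() RETURNS integer
            LANGUAGE sql AS 'SELECT 1'
        $cmd$;
    END IF;
END
$do$;

On servers older than 9.0 the create/call/drop workaround remains the only
option, since DO itself is not available there.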




-- Daniele




--
Adrian Klaver
adrian.kla...@gmail.com




[GENERAL] estimated number of rows and optimiser efficiency

2012-09-26 Thread salah jubeh
Hello,

I know that the optimizer uses the number of rows to calculate the total 
query execution time.  In complex queries, it is very difficult to know the 
number of rows in the expected result. This certainly affects the optimizer to 
a very great extent.  I have a view that should return around 5.5 million rows; 
at the beginning PostgreSQL used to execute this query in around 12 minutes. 
After running VACUUM ANALYZE, the query execution time dropped to 1.5 minutes. 
Still, I think this query could be executed in around 40-50 seconds.  Before 
and after running VACUUM ANALYZE, the number of expected rows was 1600 and 6500 
respectively.  By comparing the 5.5 million rows (real result) with the 6500 
and 1600 rows (expected results), one can observe how much this could affect 
the optimizer's plans. 


I am wondering why the historical data (the real result of the query) does not 
affect the execution plan. For example, if I run the query 10 times, I always 
get around 6500 instead of 5.5 million in the top-most execution plan node.

Regards


[GENERAL] Re: Need help in reclaiming disk space by deleting the selected records

2012-09-26 Thread Yelai, Ramkumar IN BLR STS
Thanks Steve and Andres,

I read these articles
http://www.linuxinsight.com/optimize_postgresql_database_size.html
http://yuval.bar-or.org/blog/2010/08/sluggish-postgresql-databases-and-reindexing-indexes/
http://www.if-not-true-then-false.com/2009/partitioning-large-postgresql-tables-and-handle-millions-of-rows-efficiently-and-quickly/

and I have some more questions on the Steve comments.

1.  Do I need to run REINDEX to reduce space, or will autovacuum handle 
reindexing?
2.  CLUSTER, REINDEX and VACUUM FULL lock the table. Hence, do we need to 
avoid database operations (select, delete, insert) while doing disk cleanup? 
Just curious: what happens if I keep inserting while running these commands?
3.  All three commands need some additional space to do this operation? Am I 
correct?
4.  Would all database servers (Oracle, SQL Server and MySQL) need downtime 
while doing disk cleanup?
5.  I am very happy to use TRUNCATE and table partitioning; it satisfies my 
requirements. But in order to achieve this, for 10 years (I currently have 6 
unique archiving tables) I have to create 1440 monthly tables. Will this create 
any issues, and is there anything I need to consider carefully while doing 
this? (See the sketch after this list.)

Thanks & Regards,
Ramkumar
_
From: Yelai, Ramkumar IN BLR STS
Sent: Thursday, September 13, 2012 7:03 PM
To: 'pgsql-general@postgresql.org'
Subject: Need help in reclaiming disk space by deleting the selected records


Hi All,

I am a beginner with PostgreSQL and databases. I have a requirement to reclaim 
disk space by deleting the rows in a selected time span.  I went through the 
documents and articles to learn how to get the table size 
(http://wiki.postgresql.org/wiki/Disk_Usage).

But before letting the user delete, I have to show the size of the records in 
the selected time span.  The problem is that I don't know how to calculate the 
size of the selected records.
In addition to this, I assume that after deleting the records I have to run the 
VACUUM FULL command to reclaim the space (please correct me if I am wrong, or 
let me know the best approach).

The table looks like this

CREATE TABLE IF NOT EXISTS "SN_SamplTable"
(
  "ID" integer NOT NULL,
  "Data" integer,
  "CLIENT_COUNT_TIMESTAMP" timestamp without time zone
);
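
A rough way to estimate the size of the rows in a selected time span, as a
sketch only (it sums the column data via pg_column_size and ignores per-row
header and index overhead; the time range is illustrative):

SELECT pg_size_pretty(sum(pg_column_size(t.*))::bigint) AS approx_size
FROM "SN_SamplTable" t
WHERE "CLIENT_COUNT_TIMESTAMP" >= '2012-01-01'
  AND "CLIENT_COUNT_TIMESTAMP" <  '2012-07-01';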

Please help me with how to proceed on this.

Thanks & Regards,
Ramkumar.




Re: [GENERAL] unique constraint with significant nulls?

2012-09-26 Thread Mike Blackwell
Ah.  A pair of constraints.  I see.

Thanks!

__
*Mike Blackwell | Technical Analyst, Distribution Services/Rollout
Management | RR Donnelley*
1750 Wallace Ave | St Charles, IL 60174-3401
Office: 630.313.7818
mike.blackw...@rrd.com
http://www.rrdonnelley.com





On Tue, Sep 25, 2012 at 12:37 PM, Andreas Joseph Krogh  wrote:

> On 09/25/2012 05:05 PM, Mike Blackwell wrote:
>
>> How would one go about building a multi-column unique constraint where
>> null is a significant value, eg. (1, NULL) <> (2, NULL)?
>>
>> I see a number of references to not being able to use an index for this,
>> but no mention of an alternative.  Any pointers would be appreciated
>>
>
> create table my_table(
> some_column varchar not null,
> other_column varchar);
>
> create unique index my_idx on my_table(some_column, other_column) where
> other_column is not null;
> create unique index my_fish_idx on my_table(some_column) where
> other_column is null;
>
> insert into my_table (some_column, other_column) values('a', 'a');
> insert into my_table (some_column, other_column) values('a', 'b');
> insert into my_table (some_column) values('a');
> insert into my_table (some_column) values('b');
>
> -- fails
> insert into my_table (some_column, other_column) values('a', 'a');
> -- also fails
> insert into my_table (some_column) values('a');
>
> result:
>
> andreak=# insert into my_table (some_column, other_column) values('a',
> 'a');
> ERROR:  duplicate key value violates unique constraint "my_idx"
> DETAIL:  Key (some_column, other_column)=(a, a) already exists.
>
> andreak=# insert into my_table (some_column) values('a');
> ERROR:  duplicate key value violates unique constraint "my_fish_idx"
> DETAIL:  Key (some_column)=(a) already exists.
>
>
> --
> Andreas Joseph Krogh  - mob: +47 909 56 963
> Senior Software Developer / CEO - OfficeNet AS - http://www.officenet.no
> Public key: 
> http://home.officenet.no/~andreak/public_key.asc
>
>
>


[GENERAL] Server doesn't accept connection

2012-09-26 Thread fra2012
Hi,

I'm a new user and I have this problem:

Up to yesterday pgAdmin III connected to my database; today it displays this
message:
Could not connect to server: connection refused (0x274D/10061) Is the
server running on host "localhost" (127.0.0.1) and accepting TCP/IP
connections on port 5432?

I scanned the PC with an antivirus; it's OK.
I did not delete any users or make any changes.
My version of Postgres is 9.1.3-1.

I changed the date, renamed the log directory, and reinstalled Postgres...

Help me please.

Fra2012







Re: [GENERAL] estimated number of rows and optimiser efficiency

2012-09-26 Thread David Johnston
>>  I am wondering, why the historical data (real result of the query) does
not affect the execution plan. For example, If I ran the query 10 times I
always get around 6500 instead of 5.5 million in the top most execution plan
node.

The likely accurate, if cheeky, answer is simply that no one has taken the
time to implement what at face value would seem to be an extremely complex
behavior.


As to the rest you should probably gather up some additional details and
post this over on performance.  There is likely a middle ground between your
idea and the status quo but that ground cannot be evaluated unless examples
of possibly misbehaving queries are provided.

David J.




-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] estimated number of rows and optimiser efficiency

2012-09-26 Thread salah jubeh
Hello David, 


Thanks for the quick response, I will follow up with the performance  group 
after preparing some case scenarios.

Regards




 From: David Johnston 
To: 'salah jubeh' ; 'pgsql'  
Sent: Wednesday, September 26, 2012 4:41 PM
Subject: RE: [GENERAL] estimated number of rows and optimiser efficiency
 
>>  I am wondering, why the historical data (real result of the query) does
not affect the execution plan. For example, If I ran the query 10 times I
always get around 6500 instead of 5.5 million in the top most execution plan
node.

The likely accurate, if cheeky, answer is simply that no one has taken the
time to implement what at face value would seem to be an extremely complex
behavior.


As to the rest you should probably gather up some additional details and
post this over on performance.  There is likely a middle ground between your
idea and the status quo but that ground cannot be evaluated unless examples
of possibly misbehaving queries are provided.

David J.

Re: [GENERAL] Server doesn't accept connection

2012-09-26 Thread Adrian Klaver

On 09/26/2012 07:39 AM, fra2012 wrote:

Hi,

i'm new user i'm this problem:

My database Pgadmin III up to yesterday connect, tomorrow display this
message:
Could not connect to server: connection refused (0x274D/10061) Is the
server running on host "localhost" (127.0.0.1) and accepting TCP/IP
connections on port 5432?

I'm scanning pc with antivirus, it's ok.
I'm not delete users or made changes.
My version of Postgres è 9.1.3-1

I change the date, i'm rename directory log, i'm reinstall version of
postgres...


So, to be clear, did you do the reinstall of Postgres just before this error 
occurred, or after?





Help my please.

Fra2012








--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] Server doesn't accept connection

2012-09-26 Thread Adrian Klaver

On 09/26/2012 08:57 AM, Empire Software srl wrote:

Dear Mr Adrian,

I reinstalled Postgres 9.1 but the problem does not change.

I opened the config file pg_hba.conf; the address is 127.0.0.1/32.


I am CCing the list so more eyes can see.

pg_hba.conf controls who gets in once a connection is made. Your problem 
is the connection is not being made.


Some questions:

Look in postgresql.conf:
  What is listen_addresses set to?

Is the server on the same machine as the pgAdminIII client?
  Could there be a firewall issue?

What are the OSes involved?
  You mentioned AV software, so I assume Windows is at least one of them.

Is there a chance more than one instance of Postgres is running and none 
are on port 5432?

  Look in postgresql.conf for the port setting.



Thanks for the reply.

Fra






--
Adrian Klaver
adrian.kla...@gmail.com




Re: [GENERAL] Memory issues

2012-09-26 Thread Scott Marlowe
On Tue, Sep 25, 2012 at 7:00 PM, Shiran Kleiderman  wrote:
>
> Hi
> Thanks for your answer.
> I understood that the server is ok memory wise.
> What can I check on the client side or the DB queries?

Well, you're connecting to localhost, so I'd expect a memory issue to show
up in free, and I'm not seeing one.  Are you really connecting to localhost
or not?




Re: [GENERAL] Memory issues

2012-09-26 Thread Shiran Kleiderman
Hi
Thanks again.
Right now, this is the output of *free -m* and *ps aux*, and none of the crons
can run - can't allocate memory.

cif@domU-12-31-39-08-06-20:~$ free -m
             total       used       free     shared    buffers     cached
Mem:         17079      12051       5028          0        270       9578
-/+ buffers/cache:        2202      14877
Swap:          511          0        511


cif@domU-12-31-39-08-06-20:~$ ps aux
USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
root 1  0.0  0.0  24316  2280 ?Ss   Sep24   0:00 /sbin/init
root 2  0.0  0.0  0 0 ?SSep24   0:00 [kthreadd]
root 3  0.0  0.0  0 0 ?SSep24   0:00
[ksoftirqd/0]
root 4  0.0  0.0  0 0 ?SSep24   0:00
[kworker/0:0]
root 5  0.0  0.0  0 0 ?SSep24   0:00
[kworker/u:0]
root 6  0.0  0.0  0 0 ?SSep24   0:00
[migration/0]
root 7  0.0  0.0  0 0 ?SSep24   0:00
[watchdog/0]
root 8  0.0  0.0  0 0 ?SSep24   0:00
[migration/1]
root 9  0.0  0.0  0 0 ?SSep24   0:00
[kworker/1:0]
root10  0.0  0.0  0 0 ?SSep24   0:01
[ksoftirqd/1]
root11  0.0  0.0  0 0 ?SSep24   0:00
[watchdog/1]
root12  0.0  0.0  0 0 ?S<   Sep24   0:00 [cpuset]
root13  0.0  0.0  0 0 ?S<   Sep24   0:00 [khelper]
root14  0.0  0.0  0 0 ?SSep24   0:00 [kdevtmpfs]
root15  0.0  0.0  0 0 ?S<   Sep24   0:00 [netns]
root16  0.0  0.0  0 0 ?SSep24   0:00
[kworker/u:1]
root17  0.0  0.0  0 0 ?SSep24   0:00 [xenwatch]
root18  0.0  0.0  0 0 ?SSep24   0:00 [xenbus]
root19  0.0  0.0  0 0 ?SSep24   0:00
[sync_supers]
root20  0.0  0.0  0 0 ?SSep24   0:00
[bdi-default]
root21  0.0  0.0  0 0 ?S<   Sep24   0:00
[kintegrityd]
root22  0.0  0.0  0 0 ?S<   Sep24   0:00 [kblockd]
root23  0.0  0.0  0 0 ?S<   Sep24   0:00 [ata_sff]
root24  0.0  0.0  0 0 ?SSep24   0:00 [khubd]
root25  0.0  0.0  0 0 ?S<   Sep24   0:00 [md]
root26  0.0  0.0  0 0 ?SSep24   0:02
[kworker/0:1]
root28  0.0  0.0  0 0 ?SSep24   0:00
[khungtaskd]
root29  0.0  0.0  0 0 ?SSep24   0:00 [kswapd0]
root30  0.0  0.0  0 0 ?SN   Sep24   0:00 [ksmd]
root31  0.0  0.0  0 0 ?SSep24   0:00
[fsnotify_mark]
root32  0.0  0.0  0 0 ?SSep24   0:00
[ecryptfs-kthrea]
root33  0.0  0.0  0 0 ?S<   Sep24   0:00 [crypto]
root41  0.0  0.0  0 0 ?S<   Sep24   0:00 [kthrotld]
root42  0.0  0.0  0 0 ?SSep24   0:00 [khvcd]
root43  0.0  0.0  0 0 ?SSep24   0:01
[kworker/1:1]
root62  0.0  0.0  0 0 ?S<   Sep24   0:00
[devfreq_wq]
root   176  0.0  0.0  0 0 ?S<   Sep24   0:00 [kdmflush]
root   187  0.0  0.0  0 0 ?SSep24   0:01
[jbd2/xvda1-8]
root   188  0.0  0.0  0 0 ?S<   Sep24   0:00
[ext4-dio-unwrit]
root   258  0.0  0.0  17224   640 ?SSep24   0:00
upstart-udev-bridge --daemon
root   265  0.0  0.0  21460  1196 ?Ss   Sep24   0:00
/sbin/udevd --daemon
root   328  0.0  0.0  21456   712 ?SSep24   0:00
/sbin/udevd --daemon
root   329  0.0  0.0  21456   716 ?SSep24   0:00
/sbin/udevd --daemon
root   389  0.0  0.0  15180   392 ?SSep24   0:00
upstart-socket-bridge --daemon
root   419  0.0  0.0   7256  1008 ?Ss   Sep24   0:00 dhclient3
-e IF_METRIC=100 -pf /var/run/dhclient.eth0.pid -lf
/var/lib/dhcp/dhclient.eth0.leases -1 eth
root   574  0.0  0.0  0 0 ?SSep24   0:03
[jbd2/dm-0-8]
root   575  0.0  0.0  0 0 ?S<   Sep24   0:00
[ext4-dio-unwrit]
root   610  0.0  0.0  49948  2880 ?Ss   Sep24   0:00
/usr/sbin/sshd -D
syslog 625  0.0  0.0 253708  1552 ?Sl   Sep24   0:11 rsyslogd
-c5
102630  0.0  0.0  23808   944 ?Ss   Sep24   0:00
dbus-daemon --system --fork --activation=upstart
root   687  0.0  0.0  14496   968 tty4 Ss+  Sep24   0:00
/sbin/getty -8 38400 tty4
root   696  0.0  0.0  14496   972 tty5 Ss+  Sep24   0:00
/sbin/getty -8 38400 tty5
root   708  0.0  0.0  14496   968 tty2 Ss+  Sep24   0:00
/sbin/getty -8 38400 tty2
root   710  0.0  0.0  14496   964 tty3 Ss+  Sep24   0:00
/sbin/getty -8 38400 tty3
root   715  0.0  0.0  14496   968 tty6 Ss+  Sep24   0:00
/sbin/getty -8 384

Re: [GENERAL] PostgreSQL data loads - turn off WAL

2012-09-26 Thread Jeff Janes
On Tue, Sep 25, 2012 at 9:09 AM, hartrc  wrote:
> My version: PostgreSQL v9.1.5
> Version string: "PostgreSQL 9.1.5 on x86_64-unknown-linux-gnu, compiled by
> gcc (SUSE Linux) 4.3.4 [gcc-4_3-branch revision 152973], 64-bit"
>
> Basically my question is:
> Is there currently any way to avoid wal generation during data load for
> given tables and then have point in time recovery after that?
>
> Background and blurb
> The situation i'm referring to here is for a development environment. I
> require point in time recovery because if there is crash etc I don't want to
> lose up to a days work for 12 developers.

What is it that is being developed?  If you are developing
applications that use postgres for storage, then with a crash of that
database you should not lose the developers' work, unless you are
using the same cluster as both the source-code repository and the
database to which the test applications being developed connect (which
seems like a bad idea)

Or are you not documenting your database schema and code anywhere
other than in the database itself?  That too seems like a bad idea, at
least for a busy multi-person development shop.


> A developer did a data load yesterday of approximately 5GB of data into a
> new schema. This generated approximately 7GB of wal.  The situation arises
> where if something is incorrect in the data load the data load may need to
> be repeated 2 or 3 times (thus generating 20GB +of WAL).

Is the problem the IO generated by this, or the short-term storage, or
the long term storage of it?  If it is the storage, you could do a
specially-scheduled backup as soon as the load is done, and then
delete the WAL soon.  Are you compressing your WAL?  If not, then
doing that might be enough to solve the problem.

It sounds like the main thing your group "produces" is code
development, and from what you describe the database is a "production"
database since losing it loses your product.  If that is the case, you
really need a "development" database as well.  This could run in
archive_mode=off, and then the 5GB would not go to "production" until
they are fairly confident it will work and not have to be repeated.
So that would cut out 2/3 of the WAL.

Or, they could create the table as "unlogged" until they know the bulk
load works, then recreate it as logged and repeat the load.  (I think
that currently there is no way to make an unlogged table then promote
it to logged, that would probably be ideal for you, but someday)
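
A minimal sketch of that staging approach (table and file names are
illustrative): the bulk load goes into an unlogged table, and only the
verified data is copied into the real, logged table, so only that final
copy is WAL-logged and archived.

CREATE UNLOGGED TABLE staging_load (LIKE target_table INCLUDING ALL);

COPY staging_load FROM '/tmp/load.csv' WITH (FORMAT csv);

-- after the data has been checked:
INSERT INTO target_table SELECT * FROM staging_load;
DROP TABLE staging_load;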

> For a data load i
> don't want wal to be generated. I accept the fact before there was nothing
> and from the point of the next pg_basebackup there was everything. It is
> from the point i say ok that is everything (the next backup) that i want
> point in time recovery to apply to that table.
> It is doesn't seem practical, and appears very risky to turn off wal_archive
> during the data load.

How long does the bulk load plus one backup plus two database restarts
take?  That would probably be far less than one day, so the risk of
turning off archive_mode temporarily seems like it should be minimal.

If your hard drives were to die horribly, how long would it take you
replace them and restore the database from the last backup and roll it
forward?  That could calibrate how much past work you are willing to
lose in the event it comes to that.

Cheers,

Jeff




[GENERAL] trigger and plpgsql help needed

2012-09-26 Thread SUNDAY A. OLUTAYO
Dear all, 

I have an issue with the highlighted (red) portion of the function below. 

The IF last_id IS NULL THEN is supposed to test whether last_id is NULL, that 
is, whether the SELECT query did not find it, and then execute the rest of the 
highlighted SQL, but it always fails to insert NEW.amount into amount; every 
other thing works fine. 

Kindly help out. 



CREATE OR REPLACE FUNCTION invoice_trigger_func() RETURNS trigger AS
$$

DECLARE
    last_id integer;
    current_balance_id integer;
    oldbalance numeric(10,2);
    newbalance numeric(10,2);

BEGIN

    SELECT lastbal_id INTO last_id FROM patient
    WHERE patient_id = NEW.patient_id;

    IF last_id IS NULL THEN
        INSERT INTO balance (invoice_id, patient_id, amount)
        VALUES (NEW.invoice_id, NEW.patient_id, NEW.amount);
        UPDATE patient SET lastbal_id = (SELECT currval('balance_balance_id_seq'))
        WHERE patient_id = NEW.patient_id;
    ELSE
        SELECT amount INTO oldbalance FROM balance
        WHERE balance_id = last_id;
        INSERT INTO balance (invoice_id, patient_id, amount)
        VALUES (NEW.invoice_id, NEW.patient_id, oldbalance + NEW.amount);
        UPDATE patient SET lastbal_id = (SELECT currval('balance_balance_id_seq'))
        WHERE patient_id = NEW.patient_id;
    END IF;

    RETURN NEW;

END;
$$ LANGUAGE plpgsql;



Thanks, 

Sunday Olutayo 




Re: [GENERAL] trigger and plpgsql help needed

2012-09-26 Thread Rob Richardson
Why not use IF FOUND?

RobR

From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of SUNDAY A. OLUTAYO
Sent: Wednesday, September 26, 2012 12:43 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] trigger and plpgsql help needed

Dear all,

I have issue with the red portion function below;

This IF last_id IS NULL THEN suppose to test if last_id is NULL, that is the 
select query did not found it
then execute the rest of the red sql but it always fail to insert the 
NEW.amount into amount, every other things fine.

Kindly help out.



Re: [GENERAL] trigger and plpgsql help needed

2012-09-26 Thread Chris Ernst
On 09/26/2012 10:42 AM, SUNDAY A. OLUTAYO wrote:
> This IF last_id IS NULL THEN suppose to test if last_id is NULL, that is
> the select query did not found it
> then execute the rest of the red sql but it always fail to insert the
> NEW.amount into amount, every other things fine.

I believe what you're looking for is "IF NOT FOUND THEN" instead of "IF
last_id IS NULL THEN"
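
A minimal sketch of that change, applied to the relevant part of the function
from the original post (the extra OR last_id IS NULL keeps covering the case
where a patient row exists but has no lastbal_id yet):

SELECT lastbal_id INTO last_id FROM patient
WHERE patient_id = NEW.patient_id;

-- FOUND is set by the SELECT ... INTO above
IF NOT FOUND OR last_id IS NULL THEN
    INSERT INTO balance (invoice_id, patient_id, amount)
    VALUES (NEW.invoice_id, NEW.patient_id, NEW.amount);
    UPDATE patient SET lastbal_id = currval('balance_balance_id_seq')
    WHERE patient_id = NEW.patient_id;
END IF;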

Cheers!

- Chris





[GENERAL] Inaccurate Explain Cost

2012-09-26 Thread Robert Sosinski
Hey Everyone, 

I seem to be getting an inaccurate cost from explain.  Here are two examples 
for one query with two different query plans:

exchange_prod=# set enable_nestloop = on;
SET
exchange_prod=#
exchange_prod=# explain analyze SELECT COUNT(DISTINCT "exchange_uploads"."id") 
FROM "exchange_uploads" INNER JOIN "upload_destinations" ON 
"upload_destinations"."id" = "exchange_uploads"."upload_destination_id" LEFT 
OUTER JOIN "uploads" ON "uploads"."id" = "exchange_uploads"."upload_id" LEFT 
OUTER JOIN "import_errors" ON "import_errors"."exchange_upload_id" = 
"exchange_uploads"."id" LEFT OUTER JOIN "exchanges" ON "exchanges"."id" = 
"upload_destinations"."exchange_id" WHERE (("exchange_uploads"."created_at" >= 
'2012-07-27 21:21:57.363944' AND "upload_destinations"."office_id" = 6));

QUERY PLAN  
   
---
 Aggregate  (cost=190169.54..190169.55 rows=1 width=4) (actual 
time=199806.806..199806.807 rows=1 loops=1)
   ->  Nested Loop Left Join  (cost=0.00..190162.49 rows=2817 width=4) (actual 
time=163.293..199753.548 rows=43904 loops=1)
 ->  Nested Loop  (cost=0.00..151986.53 rows=2817 width=4) (actual 
time=163.275..186869.844 rows=43904 loops=1)
   ->  Index Scan using upload_destinations_office_id_idx on 
upload_destinations  (cost=0.00..29.95 rows=4 width=8) (actual 
time=0.060..0.093 rows=6 loops=1)
 Index Cond: (office_id = 6)
   ->  Index Scan using 
index_exchange_uploads_on_upload_destination_id on exchange_uploads  
(cost=0.00..37978.21 rows=875 width=12) (actual time=27.197..31140.375 
rows=7317 loops=6)
 Index Cond: (upload_destination_id = 
upload_destinations.id)
 Filter: (created_at >= '2012-07-27 
21:21:57.363944'::timestamp without time zone)
 ->  Index Scan using index_import_errors_on_exchange_upload_id on 
import_errors  (cost=0.00..8.49 rows=405 width=4) (actual time=0.291..0.291 
rows=0 loops=43904)
   Index Cond: (exchange_upload_id = exchange_uploads.id)
 Total runtime: 199806.951 ms
(11 rows)

exchange_prod=# 
exchange_prod=# set enable_nestloop = off;
SET
exchange_prod=# 
exchange_prod=# explain analyze SELECT COUNT(DISTINCT "exchange_uploads"."id") 
FROM "exchange_uploads" INNER JOIN "upload_destinations" ON 
"upload_destinations"."id" = "exchange_uploads"."upload_destination_id" LEFT 
OUTER JOIN "uploads" ON "uploads"."id" = "exchange_uploads"."upload_id" LEFT 
OUTER JOIN "import_errors" ON "import_errors"."exchange_upload_id" = 
"exchange_uploads"."id" LEFT OUTER JOIN "exchanges" ON "exchanges"."id" = 
"upload_destinations"."exchange_id" WHERE (("exchange_uploads"."created_at" >= 
'2012-07-27 21:21:57.363944' AND "upload_destinations"."office_id" = 6));

QUERY PLAN  
   
---
 Aggregate  (cost=2535992.33..2535992.34 rows=1 width=4) (actual 
time=133447.507..133447.507 rows=1 loops=1)
   ->  Hash Right Join  (cost=1816553.69..2535985.56 rows=2708 width=4) (actual 
time=133405.326..133417.078 rows=43906 loops=1)
 Hash Cond: (import_errors.exchange_upload_id = exchange_uploads.id)
 ->  Seq Scan on import_errors  (cost=0.00..710802.71 rows=2300471 
width=4) (actual time=0.006..19199.569 rows=2321888 loops=1)
 ->  Hash  (cost=1816519.84..1816519.84 rows=2708 width=4) (actual 
time=112938.606..112938.606 rows=43906 loops=1)
   Buckets: 1024  Batches: 1  Memory Usage: 1544kB
   ->  Hash Join  (cost=28.25..1816519.84 rows=2708 width=4) 
(actual time=42.957..112892.689 rows=43906 loops=1)
 Hash Cond: (exchange_uploads.upload_destination_id = 
upload_destinations.id)
 ->  Index Scan using 
index_upload_destinations_on_created_at on exchange_uploads  
(cost=0.00..1804094.96 rows=3298545 width=12) (actual time=17.686..111649.272 
rows=3303488 loops=1)
   Index Cond: (created_at >= '2012-07-27 
21:21:57.363944'::timestamp without time zone)
 ->  Hash  (cost=28.20..28.20 rows=4 width=8) (actual 
time=0.043..0.043 rows=6 loops=1)
   Buckets: 1024  Batches: 1  Memory Usage: 1kB
   ->  Bitmap Heap Scan on upload_destinations  
(cost=6.28..28.20 rows=4 width=8) (a

Re: [GENERAL] idle in transaction query makes server unresponsive

2012-09-26 Thread Scot Kreienkamp


> -Original Message-
> From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-
> ow...@postgresql.org] On Behalf Of Thomas Kellerer
> Sent: Tuesday, September 25, 2012 5:25 PM
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] idle in transaction query makes server unresponsive
>
> Scot Kreienkamp wrote on 25.09.2012 22:35:
> > The application is using a pooler and generally runs around 100
> > connections, but I've seen it as high as 200 during the day for
> > normal use.  It's on a large server; 64 cores total and about 500
> > gigs of memory.  That's one of the reasons I left it at 512
> > connections.
>
> We had several web applications where performance was *improved*
> by configuring the connection pool to have a lot fewer connections.
>
> There is a threshold where too many connections
> will simply flood the server. Lowering the number of processes
> fighting for resource makes each process faster.
>
> You might want to give it a try.
>

[Scot Kreienkamp]
Can I get the same effect by limiting the number of logons for the user that my 
application is connecting as to 250?  Or do I need to lower the number in 
postgresql.conf?  I'd rather go with the first option because I can adjust it 
live without editing and reloading config files.





Re: [GENERAL] idle in transaction query makes server unresponsive

2012-09-26 Thread Scot Kreienkamp

> -Original Message-
> From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-
> ow...@postgresql.org] On Behalf Of Albe Laurenz
> Sent: Wednesday, September 26, 2012 5:15 AM
> To: Scot Kreienkamp; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] idle in transaction query makes server unresponsive
>
> Scot Kreienkamp wrote:
> > I have a problem that I've been struggling with for quite some time.  Every once in a while I will get
> > a connection that goes to idle in transaction on an in-house programmed application that connects with
> > JDBC.  That happens fairly regularly and the programmers are trying to clean that up, but sometimes
> > the idle in transaction connection makes the PG server entirely unresponsive.  I'm not getting
> > connection refused, nothing.  All connections existing or new, JDBC or psql, just hang.  I've already
> > got full query logging on to try to catch the problem query or connection so I can give the developers
> > somewhere to look to resolve their issue with the application, but since queries are logged with
> > runtimes I'm assuming they are only logged after they are complete.  And since it's idle in
> > transaction it never completes so it never gets logged.  Our application is connecting as an
> > unprivileged user named rmstomcat, and the database is limited to 400 connections out of 512.  I'm not
> > running out of connections as I've got reserved connections set, and even connecting as user postgres
> > with psql the connection just hangs.  The server doesn't appear to be running out of memory when this
> > happens and nothing is printed in the log.  The only thing that resolves it is doing a kill on the PID
> > of any idle in transaction connections existing at the time causing them to roll back.  Then
> > everything else picks up right where it left off and works again.
> >
> > Can anyone give me any hints about why PG becomes unresponsive?  Or how to fix it so it doesn't?
> >
> > My server is 9.1.2 right now.  I will be upgrading to the latest 9.1 series soon, but until 9.2 can be
> > run through our development/testing cycle I can't upgrade to 9.2.  That will take about 6-10 months.
>
> Yes, see if upgrading to 9.1 makes the problem disappear.
>
> It is surprising that you cannot even start new connections.
>
> You could try to "strace" the postmaster during a connection attempt
> and see what happens.  Maybe that helps to spot the place where
> things go wrong.
>
> Yours,
> Laurenz Albe

[Scot Kreienkamp]
I'm willing to give it a try, but I've never done it before.  What do I need to 
do?





Re: [GENERAL] trigger and plpgsql help needed

2012-09-26 Thread Adrian Klaver

On 09/26/2012 09:42 AM, SUNDAY A. OLUTAYO wrote:

Dear all,

I have issue with the red portion function below;

This IF last_id IS NULL THEN suppose to test if last_id is NULL, that is
the select query did not found it
then execute the rest of the red sql but it always fail to insert the
NEW.amount into amount, every other things fine.


Is there a NEW.amount?



Kindly help out.




Thanks,

Sunday Olutayo





--
Adrian Klaver
adrian.kla...@gmail.com




[GENERAL] Odd Invalid type name error in postgresql 9.1

2012-09-26 Thread Jim Wilson
Hi,

After upgrading from 8.3 I found an unusual error related to a plpgsql function.

The database includes a table named "detail".

The procedure/function in question includes a declaration of detail%rowtype.

Loading the server from a dump-all at the time of the upgrade went
fine and the function was loaded along with all the other database
objects. I noticed later on that at runtime (e.g. calling the
function) I was receiving an "invalid type name" error on the
declaration using "detail%rowtype".

This seemed odd. I couldn't find any record of "detail" being a
reserved word, and the usual sql queries using the table perform
without a hitch. But when I changed the declaration in the function to
have quotes around the table name (e.g. "detail"%rowtype) the error
went away.

Any ideas on this? My biggest concern is if there is a structural
problem in the database that might result in a crash later, I'd like
to get it fixed now.

Thanks in advance,

Jim Wilson




Re: [GENERAL] [PERFORM] Inaccurate Explain Cost

2012-09-26 Thread Shaun Thomas

On 09/26/2012 01:38 PM, Robert Sosinski wrote:


I seem to be getting an inaccurate cost from explain.  Here are two
examples for one query with two different query plans:


Well, there's this:

Nested Loop  (cost=0.00..151986.53 rows=2817 width=4) (actual 
time=163.275..186869.844 rows=43904 loops=1)


If anything's a smoking gun, that is. I could see why you'd want to turn 
off nested loops to get better execution time. But the question is: why 
did it think it would match so few rows in the first place? The planner 
probably would have thrown away this query plan had it known it would 
loop 20x more than it thought.


I think we need to know what your default_statistics_target is set at, 
and really... all of your relevant postgresql settings.


Please see this:

http://wiki.postgresql.org/wiki/Slow_Query_Questions

But you also may need to look a lot more into your query itself. The 
difference between a 2 or a 3 minute query isn't going to  help you 
much. Over here, we tend to spend more of our time turning 2 or 3 minute 
queries into 20 or 30ms queries. But judging by your date range, getting 
the last 2-months of data from a table that large generally won't be 
fast by any means.


That said, looking at your actual query:

SELECT COUNT(DISTINCT eu.id)
  FROM exchange_uploads eu
  JOIN upload_destinations ud ON ud.id = eu.upload_destination_id
  LEFT JOIN uploads u ON u.id = eu.upload_id
  LEFT JOIN import_errors ie ON ie.exchange_upload_id = eu.id
  LEFT JOIN exchanges e ON e.id = ud.exchange_id
 WHERE eu.created_at >= '2012-07-27 21:21:57.363944'
   AND ud.office_id = 6;

Doesn't need half of these joins. They're left joins, and never used in 
the query results or where criteria. You could just use this:


SELECT COUNT(DISTINCT eu.id)
  FROM exchange_uploads eu
  JOIN upload_destinations ud ON (ud.id = eu.upload_destination_id)
 WHERE eu.created_at >= '2012-07-27 21:21:57.363944'
   AND ud.office_id = 6;

Though I presume this is just a count precursor to a query that fetches 
the actul results and does need the left join. Either way, the index 
scan from your second example matches 3.3M rows by using the created_at 
index on exchange_uploads. That's not really very restrictive, and so 
you have two problems:


1. Your nested loop stats from office_id are somehow wrong. Try 
increasing your stats on that column, or just default_statistics_target 
in general, and re-analyze.
2. Your created_at criteria above match way too many rows, and will also 
take a long time to process.


Those are your two actual problems. We can probably get your query to 
run faster, but those are pretty significant hurdles.
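
For point 1, a minimal sketch using the column named in the plan above (the
statistics target of 1000 is only an example value):

ALTER TABLE upload_destinations ALTER COLUMN office_id SET STATISTICS 1000;
ANALYZE upload_destinations;

Raising default_statistics_target in postgresql.conf and re-analyzing has the
same effect for all columns at once.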



--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
stho...@optionshouse.com





Re: [GENERAL] [PERFORM] Inaccurate Explain Cost

2012-09-26 Thread Edson Richter

Em 26/09/2012 17:03, Shaun Thomas escreveu:

On 09/26/2012 01:38 PM, Robert Sosinski wrote:


I seem to be getting an inaccurate cost from explain.  Here are two
examples for one query with two different query plans:


Well, there's this:

Nested Loop  (cost=0.00..151986.53 rows=2817 width=4) (actual 
time=163.275..186869.844 rows=43904 loops=1)


If anything's a smoking gun, that is. I could see why you'd want to 
turn off nested loops to get better execution time. But the question 
is: why did it think it would match so few rows in the first place? 
The planner probably would have thrown away this query plan had it 
known it would loop 20x more than it thought.


I think we need to know what your default_statistics_target is set at, 
and really... all of your relevant postgresql settings.


Please see this:

http://wiki.postgresql.org/wiki/Slow_Query_Questions

But you also may need to look a lot more into your query itself. The 
difference between a 2 or a 3 minute query isn't going to help you 
much. Over here, we tend to spend more of our time turning 2 or 3 
minute queries into 20 or 30ms queries. But judging by your date 
range, getting the last 2-months of data from a table that large 
generally won't be fast by any means.


That said, looking at your actual query:

SELECT COUNT(DISTINCT eu.id)
  FROM exchange_uploads eu
  JOIN upload_destinations ud ON ud.id = eu.upload_destination_id
  LEFT JOIN uploads u ON u.id = eu.upload_id
  LEFT JOIN import_errors ie ON ie.exchange_upload_id = eu.id
  LEFT JOIN exchanges e ON e.id = ud.exchange_id
 WHERE eu.created_at >= '2012-07-27 21:21:57.363944'
   AND ud.office_id = 6;

Doesn't need half of these joins. They're left joins, and never used 
in the query results or where criteria. You could just use this:


Interesting. I have a similar situation, where the user can choose a set of 
filters, and then the query must have several left joins "just in case" (the 
user may need them in the filter).
I know another database that is able to remove unnecessary outer joins from 
queries when they are not relevant, and such queries consequently become faster.

Can't PostgreSQL do the same?

Regards,

Edson.



SELECT COUNT(DISTINCT eu.id)
  FROM exchange_uploads eu
  JOIN upload_destinations ud ON (ud.id = eu.upload_destination_id)
 WHERE eu.created_at >= '2012-07-27 21:21:57.363944'
   AND ud.office_id = 6;

Though I presume this is just a count precursor to a query that 
fetches the actual results and does need the left join. Either way, the 
index scan from your second example matches 3.3M rows by using the 
created_at index on exchange_uploads. That's not really very 
restrictive, and so you have two problems:


1. Your nested loop stats from office_id are somehow wrong. Try 
increasing your stats on that column, or just 
default_statistics_target in general, and re-analyze.
2. Your created_at criteria above match way too many rows, and will 
also take a long time to process.


Those are your two actual problems. We can probably get your query to 
run faster, but those are pretty significant hurdles.









Re: [GENERAL] [PERFORM] Inaccurate Explain Cost

2012-09-26 Thread hubert depesz lubaczewski
On Wed, Sep 26, 2012 at 02:38:09PM -0400, Robert Sosinski wrote:
> The first query shows a cost of 190,169.55 and runs in 199,806.951 ms.
> When I disable nested loop, I get a cost of 2,535,992.34 which runs in
> only 133,447.790 ms.  We have run queries on our database with a cost
> of 200K cost before and they ran less then a few seconds, which makes
> me wonder if the first query plan is inaccurate.  The other issue is
> understanding why a query plan with a much higher cost is taking less
> time to run.

Are you under the impression that cost should be somehow related to actual
time?
If yes - that's not true, and AFAIK never was.
The fact that you got similar time and cost is just a coincidence.

Best regards,

depesz

-- 
The best thing about modern society is how easy it is to avoid contact with it.
 http://depesz.com/




[GENERAL] SELECT …. WHERE id is in pool of ids of subquery……

2012-09-26 Thread Alexander Reichstadt
Hi,

I am reading the docs and am looking for an example or explanation of the 
difference between setof and arrays.

But maybe this is not even relevant. The reason I am looking for this is 
because I need to build a function that looks for all records whose ids are in 
a pool of other records.

Maybe this is not making any sense...

So, here is my function (name, argument, return type, language, definition):

defect_usermessageids("refid_userid" bigint) RETURNS setof record LANGUAGE sql
Definition:
SELECT DISTINCT messages.id FROM messages
JOIN message_recipients msgrec ON msgrec.refid_messages=messages.id AND 
(msgrec.refid_qmusers=$1 OR messages.refid_sender=$1)

It returns all the IDs of messages of which a given user, the input arg 
refid_userid, is either a recipient or a sender.

This works as expected.

Now I'd like to get the messages for these ids, but it doesn't work.

It should be something like

SELECT * FROM messages WHERE ID EXISTS (SELECT defect_usermessageids(2))

But this doesn't work.

Please, can someone enlighten me?

Thanks and regards
Alex


Re: [GENERAL] SELECT …. WHERE id is in pool of ids of subquery……

2012-09-26 Thread David Johnston
On Sep 26, 2012, at 16:28, Alexander Reichstadt  wrote:

> Hi,
> 
> I am reading the docu and am looking for an example our explanation as to the 
> difference of setof and arrays.
> 
> But maybe this is not even relevant. The reason I am looking for this is 
> because I need to build a function that looks for all records whose ids are 
> in a pool of other records.
> 
> Maybe this is not making any sense
> 
> So, here is my function:
> defect_usermessageids "refid_userid" bigint   setof recordsql
> Definition
> 1
> 2
> SELECT DISTINCT messages.id FROM messages
> JOIN message_recipients msgrec ON msgrec.refid_messages=messages.id AND 
> (msgrec.refid_qmusers=$1 OR messages.refid_sender=$1)
> 
> It returns all the IDs of messages of which a given user, the input arg 
> refid_userid, is either a recipient or a sender.
> 
> This works as expected.
> 
> Now I'd like to get the messages for these ids, but it doesn't work.
> 
> It should be something like
> 
> SELECT * FROM messages WHERE ID EXISTS (SELECT defect_usermessageids(2))
> 
> But this doesn't work.
> 
> Please, can someone enlighten me?
> 
> Thanks and regards
> Alex


"SETOF record" is likely not what you want since ID is a single value not a 
"record".

The data type of ID is what you want to return a SETOF of.

You also want to use IN, not EXISTS.  You generally only use EXISTS with a 
correlated sub-query but in this case the sub-query is constant.
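
A minimal sketch of both changes, reusing the names from the post and
assuming messages.id is bigint:

CREATE OR REPLACE FUNCTION defect_usermessageids(refid_userid bigint)
RETURNS SETOF bigint
LANGUAGE sql AS $$
    SELECT DISTINCT messages.id
    FROM messages
    JOIN message_recipients msgrec
      ON msgrec.refid_messages = messages.id
     AND (msgrec.refid_qmusers = $1 OR messages.refid_sender = $1)
$$;

SELECT * FROM messages WHERE id IN (SELECT defect_usermessageids(2));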

David J.

[GENERAL] Re: [GENERAL] SELECT …. WHERE id is in pool of ids of subquery……

2012-09-26 Thread Alexander Reichstadt
Heureka, it works.
Thanks!

Am 26.09.2012 um 22:37 schrieb David Johnston:

> On Sep 26, 2012, at 16:28, Alexander Reichstadt  wrote:
> 
>> Hi,
>> 
>> I am reading the docu and am looking for an example our explanation as to 
>> the difference of setof and arrays.
>> 
>> But maybe this is not even relevant. The reason I am looking for this is 
>> because I need to build a function that looks for all records whose ids are 
>> in a pool of other records.
>> 
>> Maybe this is not making any sense
>> 
>> So, here is my function:
>> defect_usermessageids"refid_userid" bigint   setof recordsql
>> Definition
>> 1
>> 2
>> SELECT DISTINCT messages.id FROM messages
>> JOIN message_recipients msgrec ON msgrec.refid_messages=messages.id AND 
>> (msgrec.refid_qmusers=$1 OR messages.refid_sender=$1)
>> 
>> It returns all the IDs of messages of which a given user, the input arg 
>> refid_userid, is either a recipient or a sender.
>> 
>> This works as expected.
>> 
>> Now I'd like to get the messages for these ids, but it doesn't work.
>> 
>> It should be something like
>> 
>> SELECT * FROM messages WHERE ID EXISTS (SELECT defect_usermessageids(2))
>> 
>> But this doesn't work.
>> 
>> Please, can someone enlighten me?
>> 
>> Thanks and regards
>> Alex
> 
> 
> "SETOF record" is likely not what you want since ID is a single value not a 
> "record".
> 
> The data type of ID is what you want to return a SETOF of.
> 
> You also want to use IN, not EXISTS.  You generally only use EXISTS with a 
> correlated sub-query but in this case the sub-query is constant.
> 
> David J.



Re: [GENERAL] PostgreSQL, OLAP, and Large Clusters

2012-09-26 Thread Scott Marlowe
On Wed, Sep 26, 2012 at 5:50 AM, Ryan Kelly  wrote:
> Hi:
>
> The size of our database is growing rather rapidly. We're concerned
> about how well Postgres will scale for OLAP-style queries over terabytes
> of data. Googling around doesn't yield great results for vanilla
> Postgres in this application, but generally links to other software like
> Greenplum, Netezza, and Aster Data (some of which are based off of
> Postgres). Too, there are solutions like Stado. But I'm concerned about
> the amount of effort to use such solutions and what we would have to
> give up feature-wise.

If you want fastish OLAP on postgres you need to do several things.

1: Throw very fast disk arrays at it.  Lots of spinners in a linux SW
RAID-10 or RAID-0 if your data is easily replaceable work wonders
here.
2: Throw lots of memory at it.  Memory is pretty cheap.  256G is not
unusual for OLAP machines
3: Throw fast CPUs at it.  Faster CPUs, especially fewer faster cores,
are often helpful.

Applied in that order you can get some pretty impressive results.

A lot of OLAP stuff needs to read hundreds of gigs at a time from the
drive array.  An array of 32 15kRPM drives, each reading at ~100MB/s
or better can flood your  PCI bus at 3200MB/s for reads.  Note that
most RAID controllers aren't as fast for sequential reads on large
drive arrays.  Although a battery backed cache can GREATLY improve
parallel write speed, it doesn't really make a big difference for big
sequential stuff and usually gets in the way here.

Memory to cache as much as possible and allow all your queries to do
hash joins etc in memory (crank up work_mem as needed, but be careful
not to use all your memory up.)

Lastly once you've thrown lots of IO and memory at it, a faster CPU
can make a bit of a difference too.  But honestly I'd rather have a
dual core 2GHz CPU on top of 32 spinners with 256G than a 3.6GHz 8
core CPU on top of 4 drives and 32G of RAM.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL, OLAP, and Large Clusters

2012-09-26 Thread Gavin Flower

On 26/09/12 23:50, Ryan Kelly wrote:

Hi:

The size of our database is growing rather rapidly. We're concerned
about how well Postgres will scale for OLAP-style queries over terabytes
of data. Googling around doesn't yield great results for vanilla
Postgres in this application, but generally links to other software like
Greenplum, Netezza, and Aster Data (some of which are based off of
Postgres). Too, there are solutions like Stado. But I'm concerned about
the amount of effort to use such solutions and what we would have to
give up feature-wise.

We love a lot of the features that we currently have that (I think)
would only be available with Postgres: arrays, hstore, gin/gist indexes,
extensions. Not to mention a lot of the other great SQL standard
features not available in other open-source databases, especially window
functions and CTEs.

Of course, migration to another solution is feasible (with enough
effort, of course), but given my experiences with Postgres and the
support provided by the community that is second to none, I'd very much
like to stay with PostgreSQL.

Thoughts?

-Ryan Kelly


You might find it worthwhile to consult a reputable vendor of Postgres, 
such as EnterpriseDB, who contribute changes back to the community, as 
you may have rather specialized needs beyond the capacity of these 
mailing lists.  Also, I suspect paying them to help improve Postgres's 
capability to handle such large datasets would be cheaper than an Oracle 
licence and the costs of migration!


I don't have any specific knowledge of the current ability of Postgres 
to support such high loads, so I can't comment meaningfully on that.  
And even if I could, and it was favourable, I would still recommend 
getting a competent company to provide consultancy for your particular 
situation.



Cheers,
Gavin


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL, OLAP, and Large Clusters

2012-09-26 Thread Ondrej Ivanič
Hi,

On 26 September 2012 21:50, Ryan Kelly  wrote:
> The size of our database is growing rather rapidly. We're concerned
> about how well Postgres will scale for OLAP-style queries over terabytes
> of data. Googling around doesn't yield great results for vanilla
> Postgres in this application, but generally links to other software like
> Greenplum, Netezza, and Aster Data (some of which are based off of
> Postgres). Too, there are solutions like Stado. But I'm concerned about
> the amount of effort to use such solutions and what we would have to
> give up feature-wise.

We were able to handle 2TB of data (16 cores, 96GB RAM, FC-connected
SAN). Performance was good:
- up to 5sec for simple select with multiple where conditions (2 - 25)
order by any column (of 400 columns) and return top 10k
- up to 15sec for executing 5 queries (simultaneously) which return
top 20 combination of any two columns
- up to 25sec for executing 56 queries (using materialised aggregate tables)
- aggregation job ran every 15 minutes and completed in under 2 minutes:
5mil rows -> aggregation -> 56 tables
- all queries can be executed over date range up to several months
(monthly partitioned tables, 6 months history)
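
A minimal sketch of the monthly partitioning mentioned above, using the
inheritance scheme available in 9.1 (table and column names are hypothetical):

CREATE TABLE events (event_time timestamptz NOT NULL, payload text);
CREATE TABLE events_2012_09 (
    CHECK (event_time >= '2012-09-01' AND event_time < '2012-10-01')
) INHERITS (events);
CREATE INDEX events_2012_09_time_idx ON events_2012_09 (event_time);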

but it was very hard to scale this solution. We have tried:
- FusionIO cards: 10 to 100 times better performance, but very hard to
expand storage capacity; cooling/power issues
- AsterData: nice SQL-MR feature and analytics (decision trees,
frequent items, clustering, ...); no libpq support, and you have to use
JDBC or selected ODBC driver managers
- Greenplum (winner): performance comparable to FusionIO (10 to 50
times); we were able to remove the aggregation job (because of the
columnar store model); easy to port from Postgres but could be
complicated if you are a heavy plpgsql user

At this time I would try:
- Postgres-XC
- Stado
- Cassandra + Solr + Hadoop (for example DataStax Enterprise)
- Birst (http://www.birst.com/)

PS. For any commercial solution it is better to buy their appliance (if
you can afford it...).

-- 
Ondrej Ivanic
(ondrej.iva...@gmail.com)


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [PERFORM] Inaccurate Explain Cost

2012-09-26 Thread Samuel Gendler
On Wed, Sep 26, 2012 at 1:21 PM, hubert depesz lubaczewski <
dep...@depesz.com> wrote:

> On Wed, Sep 26, 2012 at 02:38:09PM -0400, Robert Sosinski wrote:
> > The first query shows a cost of 190,169.55 and runs in 199,806.951 ms.
> > When I disable nested loop, I get a cost of 2,535,992.34 which runs in
> > only 133,447.790 ms.  We have run queries on our database with a cost
> > of 200K cost before and they ran less then a few seconds, which makes
> > me wonder if the first query plan is inaccurate.  The other issue is
> > understanding why a query plan with a much higher cost is taking less
> > time to run.
>
> Are you under impression that cost should be somehow related to actual
> time?
> If yes - that's not true, and afaik never was.
> the fact that you got similar time and cost is just a coincidence.
>

Well...only sort of.  In a well-tuned db with accurate statistics, relative
cost between 2 plans should be reflected in relative execution time between
those 2 queries (assuming the data in memory is similar for both runs,
anyway), and that's what he seems to be complaining about.  The plan with
higher cost had lower execution time, which resulted in the planner picking
the slower query.  But the reason for the execution time discrepancy would
appear to be, at least in part, inaccurate statistics resulting in an
incorrect estimate of number of rows in a loop iteration.  More info about
the db config would help to identify other things contributing to the
inaccurate cost estimate - as mentioned earlier, please refer to
http://wiki.postgresql.org/wiki/Slow_Query_Questions when asking
performance questions
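
A minimal sketch of the comparison being discussed; big_report_query is a
hypothetical view standing in for the original poster's query:

EXPLAIN ANALYZE SELECT * FROM big_report_query;  -- planner-preferred nested loop plan
SET enable_nestloop = off;
EXPLAIN ANALYZE SELECT * FROM big_report_query;  -- higher estimated cost, but faster here
RESET enable_nestloop;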

And yes, I know you know all of this, Hubert.  I wrote it for the benefit
of the original questioner.

--sam


Re: [GENERAL] Odd Invalid type name error in postgresql 9.1

2012-09-26 Thread Adrian Klaver

On 09/26/2012 12:55 PM, Jim Wilson wrote:

Hi,

After upgrading from 8.3 I found an unusual error related to a plpgsql function.

The database includes a table named "detail".

The procedure/function in question includes a declaration of detail%rowtype.

Loading the server from a dump-all at the time of the upgrade went
fine and the function was loaded along with all the other database
objects. I noticed later on that at runtime (e.g. calling the
function) I was receiving an "invalid type name" error on the
declaration using "detail%rowtype".

This seemed odd. I couldn't find any record of "detail" being a
reserved word, and the usual sql queries using the table perform
without a hitch. But when I changed the declaration in the function to
have quotes around the table name (e.g. "detail"%rowtype) the error
went away.

Any ideas on this? My biggest concern is if there is a structural
problem in the database that might result in a crash later, I'd like
to get it fixed now.


My hunch is it is related to the below, just not sure how. Greater minds 
will need to confirm or deny:)


http://www.postgresql.org/docs/9.1/interactive/release-9-1.html
"Type modifiers of PL/pgSQL variables are now visible to the SQL parser 
(Tom Lane)


A type modifier (such as a varchar length limit) attached to a PL/pgSQL 
variable was formerly enforced during assignments, but was ignored for 
all other purposes. Such variables will now behave more like table 
columns declared with the same modifier. This is not expected to make 
any visible difference in most cases, but it could result in subtle 
changes for some SQL commands issued by PL/pgSQL functions."




Thanks in advance,

Jim Wilson





--
Adrian Klaver
adrian.kla...@gmail.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [PERFORM] Inaccurate Explain Cost

2012-09-26 Thread Tom Lane
Edson Richter  writes:
>> That said, looking at your actual query:
>> 
>> SELECT COUNT(DISTINCT eu.id)
>> FROM exchange_uploads eu
>> JOIN upload_destinations ud ON ud.id = eu.upload_destination_id
>> LEFT JOIN uploads u ON u.id = eu.upload_id
>> LEFT JOIN import_errors ie ON ie.exchange_upload_id = eu.id
>> LEFT JOIN exchanges e ON e.id = ud.exchange_id
>> WHERE eu.created_at >= '2012-07-27 21:21:57.363944'
>> AND ud.office_id = 6;
>> 
>> Doesn't need half of these joins. They're left joins, and never used 
>> in the query results or where criteria. You could just use this:

> Interesting. I have a similar situation, where the user can choose a set of 
> filters, and then the query must have several left joins "just in case" 
> (the user might need them in the filter).
> I know of other databases that are able to remove unnecessary outer joins 
> from queries when they are not relevant, which for instance makes them faster.
> Can't PostgreSQL do the same?

It does, and did - note the query plan is only scanning 3 of the 5
tables mentioned in the query.  (The other left join appears to be
to a non-unique column, which makes it not redundant.)
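
A minimal sketch of the join-removal behaviour described above (table names are
hypothetical): the planner can drop a LEFT JOIN only when the joined column is
provably unique and nothing from the joined table is referenced.

CREATE TABLE parent (id int PRIMARY KEY, name text);
CREATE TABLE child  (id int PRIMARY KEY, parent_id int);
-- parent.id is unique and unreferenced in the output, so the join is removed:
EXPLAIN SELECT child.id FROM child LEFT JOIN parent ON parent.id = child.parent_id;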

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Linux PowerPC 64bits issue

2012-09-26 Thread Leonardo M . Ramé
Hi, I compiled PostgreSQL 9.1 from sources on an OpenSuse 10.1 PowerPC
machine. While trying to test one application, I got errors just
before connecting to the database, and found that my app is loading
linux-vdso64.so.1 while libpq.so uses linux-vdso32.so.1.

Does this mean the PostgreSQL libraries were compiled in 32 bits? This is
what ldd gives:

ldd /usr/local/pgsql/lib/libpq.so
linux-vdso32.so.1 => (0x0010)
libpthread.so.0 => /lib/power6x/libpthread.so.0 (0x6ff85000)
libc.so.6 => /lib/power6x/libc.so.6 (0x6fe11000)
/lib/ld.so.1 (0x0800)

How can I force PostgreSQL to compile in 64 bits?

Regards,
-- 
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 351 6629292



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] [PERFORM] Inaccurate Explain Cost

2012-09-26 Thread Jeff Janes
On Wed, Sep 26, 2012 at 1:21 PM, hubert depesz lubaczewski
 wrote:
> On Wed, Sep 26, 2012 at 02:38:09PM -0400, Robert Sosinski wrote:
>> The first query shows a cost of 190,169.55 and runs in 199,806.951 ms.
>> When I disable nested loop, I get a cost of 2,535,992.34 which runs in
>> only 133,447.790 ms.  We have run queries on our database with a cost
>> of 200K cost before and they ran less then a few seconds, which makes
>> me wonder if the first query plan is inaccurate.  The other issue is
>> understanding why a query plan with a much higher cost is taking less
>> time to run.
>
> Are you under impression that cost should be somehow related to actual
> time?

I am certainly under that impression.  If the estimated cost has
nothing to do with run time, then what is it that the cost-based
optimizer is trying to optimize?

The arbitrary numbers of the cost parameters do not formally have any
units, but they had better have some vaguely proportional relationship
with the dimension of time, or else there is no point in having an
optimizer.  For any given piece of hardware (including table-space, if
you have different table-spaces on different storage), configuration
and cachedness, there should be some constant factor to translate cost
into time.  To the extent that there fails to be such a constant
factor, it is either a misconfiguration, or a room for improvement in
the planner.

The only exceptions I can think of are 1) when there is only one
way to do something, the planner may not bother to cost it (i.e.
assign it a cost of zero) because it will not help make a decision.
However, the only instances of this that I know of are in DML, not in
pure selects, and 2) the costs of setting hint bits and such in
selects is not estimated, except to the extent they are folded into
something else, like the page visiting costs.

Cheers,

Jeff


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] unc paths, like and backslashes on 8.4

2012-09-26 Thread Chris Curvey
I just don't get how we are supposed to use LIKE with backslashes in
strings in 8.4.  This is particularly vexing, because I have a field
containing UNC paths that I need to search on (and eventually update).  I
have been looking at this page for guidance:
http://www.postgresql.org/docs/8.4/static/functions-matching.html

So I will ask my questions first, then show you what I tried:

1) Why do I get a warning when doubling a backslash?
2) What is the meaning of "E" syntax (E'fs1\\bar')?
3) If I have backslashes in my table, how can I get them back out?
4) I'd like to run an update to change the value '\\fs1\bar' to
'\\fs1\foo\bar'.  What incantation would do that?

So, trying to figure it out on my own...

CREATE TABLE FOOBAR
(  UNC_PATH VARCHAR(100)
);

/* first insert attempt */
INSERT INTO FOOBAR VALUES ('\\FS1\BAR');

returns a warning:

WARNING:  nonstandard use of \\ in a string literal
LINE 1: INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
   ^
HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
Query returned successfully: 1 row affected, 21 ms execution time.

but the row is inserted.  There is one leading backslash, and the "b" is
some unprintable character.  Let's try the "E" syntax, whatever that is:

INSERT INTO FOOBAR VALUES (E'\\FS1\BAR');

No warning, but exactly the same results again (one leading backslash, "b"
replaced by unprintable char).  Let's try E with doubled backslashes:

INSERT INTO FOOBAR VALUES (E'FS1\\BAR');

okay, that worked.  Yay.   Now let's see if I can get the record back out
with "LIKE":

SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'FS1%';

That gets back a record, but the value returned is "\FS1BAR". I'm missing
two backslashes.  I'm too confused to even attempt the update.

-Chris


Re: [GENERAL] Odd Invalid type name error in postgresql 9.1

2012-09-26 Thread Tom Lane
Jim Wilson  writes:
> After upgrading from 8.3 I found an unusual error related to a plpgsql 
> function.

> The database includes a table named "detail".

> The procedure/function in question includes a declaration of detail%rowtype.

Hmm.  The reason that doesn't work is that DETAIL is now a keyword in
plpgsql.  It's supposed to be an unreserved keyword, so ideally this
usage would work; but I see that plpgsql's read_datatype() function
is naive about such cases (in particular, it fails to check for the
%rowtype construct).  I'll see about fixing it, but in the meantime
your best answer is to quote the name.
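
A minimal sketch of the quoting workaround (the function body and the id column
are hypothetical):

CREATE OR REPLACE FUNCTION get_detail(p_id int) RETURNS void
LANGUAGE plpgsql AS $$
DECLARE
    r "detail"%ROWTYPE;  -- quoting the table name avoids the clash with the DETAIL keyword
BEGIN
    SELECT * INTO r FROM detail WHERE id = p_id;
END
$$;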

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] function return value inside a trigger function

2012-09-26 Thread joao viegas
Hello Good Evening all,

Does anybody know how I can have something like:
select function_that_returns_int(NEW.field1) into NEW.field2;

i.e., getting a function's output and assigning it to a variable, inside a
trigger function?

I'm always getting:
ERROR: SELECT query has no destination for result data


thanks in advance
jtv

Re: [GENERAL] function return value inside a trigger function

2012-09-26 Thread Bosco Rama
On 09/26/12 17:56, joao viegas wrote:
> 
> does anybody knows how can I have something like:
> select function_that_returns_int(NEW.field1) into NEW.field2;

I think you want:
   select into NEW.field2 function_that_returns_int(NEW.field1);

or even:
   NEW.field2 := function_that_returns_int(NEW.field1);


Bosco.


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL, OLAP, and Large Clusters

2012-09-26 Thread Stephen Frost
All,

* Scott Marlowe (scott.marl...@gmail.com) wrote:
> If you want fastish OLAP on postgres you need to do several things.
[...]

All good suggestions.

I'd recommend looking at ROLAP approaches and doing aggregations and
materialized views first.  It will depend on exactly what you need/are
looking to do with the data, of course.
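
A minimal sketch of a hand-rolled aggregate table standing in for a
materialized view (table and column names are hypothetical):

CREATE TABLE sales_daily_agg AS
    SELECT date_trunc('day', sold_at) AS day, product_id, sum(amount) AS total
    FROM sales GROUP BY 1, 2;

-- refreshed periodically, e.g. from a cron job:
BEGIN;
TRUNCATE sales_daily_agg;
INSERT INTO sales_daily_agg
    SELECT date_trunc('day', sold_at), product_id, sum(amount)
    FROM sales GROUP BY 1, 2;
COMMIT;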

Thanks,

Stephen




Re: [GENERAL] function return value inside a trigger function

2012-09-26 Thread Tom Lane
joao viegas  writes:
> does anybody knows how can I have something like:
> select function_that_returns_int(NEW.field1) into NEW.field2;

Works for me ...

create table fooey (q1 int, q2 int);

create function myt() returns trigger language plpgsql as $$
begin
  select abs(new.q1) into new.q2;
  return new;
end$$;

create trigger mytrig before insert on fooey for each row
  execute procedure myt();

insert into fooey values(-42, 77);

select * from fooey;
 q1  | q2 
-+
 -42 | 42
(1 row)

Perhaps you should show a full example of what you're doing.

regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] unc paths, like and backslashes on 8.4

2012-09-26 Thread David Johnston
On Sep 26, 2012, at 20:42, Chris Curvey  wrote:

> I just don't get how we are supposed to use LIKE with backslashes in strings 
> in 8.4.  This is particularly vexing, because I have a field containing UNC 
> paths that I need to search on (and eventually update).  I have been looking 
> at this page for guidance:  
> http://www.postgresql.org/docs/8.4/static/functions-matching.html
> 
> So I will ask my questions first, then show you what I tried:
> 
> 1) Why do I get a warning when doubling a backslash?  
> 2) What is the meaning of "E" syntax (E'fs1\\bar')?
> 3) If I have backslashes in my table, how can I get them back out?
> 4) I'd like to run an update to change the value '\\fs1\bar' to 
> \\fs1\foo\bar'.  What incantation would do that.
> 
> So, trying to figure it out on my own...
> 
> CREATE TABLE FOOBAR
> (  UNC_PATH VARCHAR(100)
> );
> 
> /* first insert attempt */
> INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
> 
> returns a warning:
> 
> WARNING:  nonstandard use of \\ in a string literal
> LINE 1: INSERT INTO FOOBAR VALUES ('\\FS1\BAR');
>^
> HINT:  Use the escape string syntax for backslashes, e.g., E'\\'.
> Query returned successfully: 1 row affected, 21 ms execution time.
> 
> but the row is inserted.  There is one leading backslash, and the "b" is some 
> unprintable character.  Let's try the "E" syntax, whatever that is:
> 
> INSERT INTO FOOBAR VALUES (E'\\FS1\BAR');
> 
> No warning, but exactly the same results again (one leading backslash, "b" 
> replaced by unprintable char).  Let's try E with doubled backslashes:
> 
> INSERT INTO FOOBAR VALUES (E'FS1\\BAR');
> 
> okay, that worked.  Yay.   Now let's see if I can get the record back out 
> with "LIKE":
> 
> SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'FS1%';
> 
> That gets back a record, but the value returned is "\FS1BAR". I'm missing two 
> backslashes.  I'm too confused to even attempt the update.
> 
> -Chris
> 

First, please read the following section of the docs, especially 4.1.2:

http://www.postgresql.org/docs/8.4/interactive/sql-syntax-lexical.html

Note the callout regarding standard conforming strings.

Since LIKE is an escapable pattern and you are using it in an escapable string 
literal, the backslashes behave as follows:

"\\\\" -> (string literal escape) -> "\\" -> (LIKE escape) -> "\"

So on the first pass the four become two since each pair represents a single 
backslash post-literal-escape.  Then the pair supplied to the LIKE becomes one 
post-like-escape.
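
A minimal sketch of both routes, using E'' literals (which always treat backslash
as an escape character, regardless of standard_conforming_strings):

INSERT INTO FOOBAR VALUES (E'\\\\FS1\\BAR');                     -- stores \\FS1\BAR
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\\\\\FS1%';        -- doubled again for LIKE's own escape
SELECT * FROM FOOBAR WHERE UNC_PATH LIKE E'\\\\FS1%' ESCAPE '';  -- or disable the LIKE escape entirely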

Post back here if the reason and behavior of E'' is still unclear after reading 
the documentation.

David J.






[GENERAL] problem with recreating database with export

2012-09-26 Thread Dennis Gearon
I have a database export done via copy. It's all the tables, constraints, blah 
blah.

I know that (a couple of years ago) I just used the defaults. It was REALLY
fast, btw, I remember.

All the table create commands work fine. 

I cannot get the whole file to import using (as postgres) psql -d database_name 
-f the_file.sql

I get errors on \N and various other problems. I've tried dividing the file up 
into:

table creation
one 'copy' of a table at a time. 

One, a 35 MB file, seems to have errors on semicolons.

 Dennis Gearon


Never, ever approach a computer saying or even thinking "I will just do this 
quickly."


Re: [GENERAL] problem with recreating database with export

2012-09-26 Thread Adrian Klaver

On 09/26/2012 07:41 PM, Dennis Gearon wrote:

I have a database export done via copy. It's all the tables, contraints,
blah blah.


More information may be in order.

What versions of Postgres are you using?
Are you going from one version to another?
Did you really use COPY for everything (schema included)?
Are you in fact talking about doing a plain text pg_dump?



I know (a couple of years ago) that I just used the defaults. REALLY
fast btw, I remember.

All the table create commands work fine.

I cannot get the whole file to import using (as postgres) psql -d
database_name -f the_file.sql

I get errors on \N and various other problems. I've tried dividing the
file up into:

table creation
one 'copy' of a table at a time.

One, a 35 mbyte seems to have errors on semi colons.
Dennis Gearon

Never, ever approach a computer saying or even thinking "I will just do
this quickly."




--
Adrian Klaver
adrian.kla...@gmail.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] problem with recreating database with export

2012-09-26 Thread Dennis Gearon
It turns out that I had made the export from psql, a text-based export.

So I read that it actually needs to be imported using '\i 
filename_in_local_directory'.

 Dennis Gearon


Never, ever approach a computer saying or even thinking "I will just do this 
quickly."


Re: [GENERAL] Linux PowerPC 64bits issue

2012-09-26 Thread a...@hsk.hk
Hi,

Could you provide the steps and commands you used to compile 9.1?

I want to reproduce your case on my machine.

Regards
Arthur

On 27 Sep 2012, at 7:46 AM, Leonardo M. Ramé wrote:

> Hi, I compiled PostgreSql 9.1 from sources in a OpenSuse 10.1 PowerPC
> machine. While trying to test one application, I've got errors just
> before connecting to the database, and found my app is loading
> linux-vdso64.so.1 while libpq.so uses linux-vdso32.so.1
> 
> This means the PostgreSql libraries where compiled in 32 bits?. This is
> what ldd gives:
> 
> ldd /usr/local/pgsql/lib/libpq.so
> linux-vdso32.so.1 => (0x0010)
> libpthread.so.0 => /lib/power6x/libpthread.so.0 (0x6ff85000)
> libc.so.6 => /lib/power6x/libc.so.6 (0x6fe11000)
> /lib/ld.so.1 (0x0800)
> 
> How can I force PostgreSql to compile in 64 bits?.
> 
> Regards,
> -- 
> Leonardo M. Ramé
> Medical IT - Griensu S.A.
> Av. Colón 636 - Piso 8 Of. A
> X5000EPT -- Córdoba
> Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
> Cel.: +54 9 351 6629292
> 
> 
> 
> -- 
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL, OLAP, and Large Clusters

2012-09-26 Thread Chris Travers
Obviously OLAP performance depends to some extent on what you want to do
with it.   My recommendation is to follow Scott's advice first (keep things
simple as long as you can), and then look at supplemental approaches when
you reach the limit there.

The big limit IMO is the lack of intraquery parallelism.  This has
implications when trying to run OLAP queries on terabytes worth of data and
so at some point you may reach the limits of what a single instance can do.
 At that point you can go with a proprietary vendor like Greenplum or you
can look at PostgresXC (which is open source).  Keep in mind that solving
this issue has significant complexity costs and so you are usually better
off with tuning your existing instance first.

Best Wishes,
Chris Travers