Re: [PERFORM] Query Performance question

2014-07-14 Thread Magers, James
scan may not be as efficient, since 4 times the data is returned in just a little more time using the index scan. Thank you, James -Original Message- From: Tomas Vondra [mailto:t...@fuzzy.cz] Sent: Monday, July 14, 2014 12:29 PM To: Magers, James Cc: Tomas Vondra; Thomas Kellerer; pgsql

Re: [PERFORM] Cursor + upsert (astronomical data)

2014-07-27 Thread Craig James
, but anyway. > > Could you first apply a kind of grid to your observations? What I mean is > to round your coords to, say, 1/2 arcsec on each axis and group the results. > I think you will have most observations grouped this way and then use your > regular algorithm to combine the results. > > Best regards, Vitalii Tymchyshyn > -- - Craig A. James Chief Technology Officer eMolecules, Inc. -

Re: [PERFORM] Cursor + upsert (astronomical data)

2014-07-29 Thread Craig James
Hi Jiri, I’m really interested in those [clustering] algorithms and want to study them. But > I would need somebody to point me directly at a specific algorithm to look > at. The main problem with choosing the right one (which even my university teacher couldn’t get past) is that you don’t know the numb

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-10 Thread Craig James
On Fri, Oct 10, 2014 at 5:10 AM, Tomas Vondra wrote: > > I've gone looking for papers on this topic but from what I read this > > isn't so. To get any noticeable improvement you need to read 10-50% of > > the table and that's effectively the same as reading the entire table > > -- and it still ha

Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-10-10 Thread Craig James
On Fri, Oct 10, 2014 at 9:53 AM, Tomas Vondra wrote: > > On 10.10.2014 16:21, Craig James wrote: > > Our index is for chemical structures. Chemicals are indexed on > > chemical fragments > > <http://emolecules.com/info/molecular-informatics>. A search > >

[PERFORM] New server: SSD/RAID recommendations?

2015-07-01 Thread Craig James
We're buying a new server in the near future to replace an aging system. I'd appreciate advice on the best SSD devices and RAID controller cards available today. The database is about 750 GB. This is a "warehouse" server. We load supplier catalogs throughout a typical work week, then on the weeken

Re: [PERFORM] New server: SSD/RAID recommendations?

2015-07-02 Thread Craig James
.@postgresql.org] *On Behalf Of *Andreas Joseph > Krogh > *Sent:* Wednesday, July 01, 2015 6:56 PM > *To:* pgsql-performance@postgresql.org > *Subject:* Re: [PERFORM] New server: SSD/RAID recommendations? > > > > På torsdag 02. juli 2015 kl. 01:06:57, skrev Craig James <

Re: [PERFORM] New server: SSD/RAID recommendations?

2015-07-02 Thread Craig James
On Wed, Jul 1, 2015 at 4:56 PM, Andreas Joseph Krogh wrote: > På torsdag 02. juli 2015 kl. 01:06:57, skrev Craig James < > cja...@emolecules.com>: > > We're buying a new server in the near future to replace an aging system. > I'd appreciate advice on the best

Re: [PERFORM] New server: SSD/RAID recommendations?

2015-07-02 Thread Craig James
On Wed, Jul 1, 2015 at 4:57 PM, Scott Marlowe wrote: > On Wed, Jul 1, 2015 at 5:06 PM, Craig James wrote: > > We're buying a new server in the near future to replace an aging system. > I'd > > appreciate advice on the best SSD devices and RAID controll

Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-07 Thread Craig James
ll 100% of all eight CPUs. I have no idea why, and never investigated further. The convenience of having the functions in SQL wasn't that important. Craig

Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Craig James
On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake wrote: > > On 07/07/2015 08:05 PM, Craig James wrote: > >> >> >> No ideas, but I ran into the same thing. I have a set of C/C++ functions >> that put some chemistry calculations into Postgres as extensions (things

Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Craig James
On Wed, Jul 8, 2015 at 10:52 AM, Joshua D. Drake wrote: > > On 07/08/2015 10:48 AM, Craig James wrote: > > I admit that I haven't read this whole thread but: >> >> Using Apache Fast-CGI, you are going to fork a process for each >> instance of the

Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-08 Thread Craig James
On Wed, Jul 8, 2015 at 1:27 PM, Andres Freund wrote: > On 2015-07-08 13:46:53 -0500, Merlin Moncure wrote: > > On Wed, Jul 8, 2015 at 12:48 PM, Craig James > wrote: > > > On Tue, Jul 7, 2015 at 10:31 PM, Joshua D. Drake > > > >> Using Apache Fast-CGI, you

Re: [PERFORM] Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

2015-07-11 Thread James Cloos
B> scaling will suffer badly on FP functions. That is half as many 256-bit float units; for scalar math and for 128-bit vector math each core gets half of the float unit. Only for the 256-bit vector math do the schedulers have to compete for float unit access. -JimC -- James Cloos

[PERFORM] Are many idle connections bad?

2015-07-25 Thread Craig James
The canonical advice here is to avoid more connections than you have CPUs, and to use something like pg_pooler to achieve that under heavy load. We are considering using the Apache mod_perl "fast-CGI" system and perl's Apache::DBI module, which caches persistent connections in order to improve per

Re: [PERFORM] Are many idle connections bad?

2015-07-25 Thread Craig James
On Sat, Jul 25, 2015 at 8:04 AM, Tom Lane wrote: > Craig James writes: > > ... This would result in a thousand > > or so Postgres connections on a machine with 32 CPUs. > > > So the question is: do idle connections impact performance? > > Yes. Those connec

Re: [PERFORM] Simple delete query is taking too long (never ends)

2015-11-12 Thread Craig James

[PERFORM] Cloud versus buying my own iron

2016-02-23 Thread Craig James
At some point in the next year we're going to reconsider our hosting environment, currently consisting of several medium-sized servers (2x4 CPUs, 48GB RAM, 12-disk RAID system with 8 in RAID 10 and 2 in RAID 1 for WAL). We use barman to keep a hot standby and an archive. The last time we dug into

[PERFORM] Merge joins on index scans

2016-02-26 Thread James Parks
eing compared to a non-negligible number of rows in (a). I can probably make this data available as a pg_dump file. Let me know if you think that's necessary, and where I should upload it. Regards, James [1] postgres=# explain (analyze,buffers) select b.* from b join a on b.

Re: [PERFORM] Merge joins on index scans

2016-03-01 Thread James Parks
On Sun, Feb 28, 2016 at 2:06 AM, David Rowley wrote: > On 27 February 2016 at 11:07, James Parks wrote: > > > > CREATE TABLE a (id bigint primary key, nonce bigint); > > CREATE TABLE b (id bigint primary key, a_id bigint not null); > > CREATE INDEX a_idx ON

[PERFORM] Can't get two index scans

2016-06-22 Thread Craig James
I'm working with a third-party plugin that does chemistry. It's very fast. However, I'm trying to do a sampling query, such as the first 1% of the database, and I just can't get the planner to create a good plan. Here is the full query (the |>| operator does a subgraph match of a molecular substru

Re: [PERFORM] Can't get two index scans

2016-06-22 Thread Craig James
On Wed, Jun 22, 2016 at 11:36 AM, Jeff Janes wrote: > On Wed, Jun 22, 2016 at 9:03 AM, Craig James > wrote: > > I'm working with a third-party plugin that does chemistry. > > > Out of personal/professional curiosity, which one are you using, if > that can be

Re: [PERFORM] Can't get two index scans

2016-06-23 Thread Craig James
On Thu, Jun 23, 2016 at 8:47 AM, Jeff Janes wrote: > On Wed, Jun 22, 2016 at 9:36 PM, Craig James > wrote: > > On Wed, Jun 22, 2016 at 11:36 AM, Jeff Janes > wrote: > ... > But, JChem's cartridge is apparently not using a GiST index, which is > what my first gues

Re: [PERFORM] Slow query with big tables

2016-08-26 Thread Craig James
version to see if it speeds up things. > > However is there a way to keep query time constant as the database size > grows. Should I use partitioning or partial indexes? > > Best Regards, > Tommi Kaksonen

Re: [PERFORM] Slow query with big tables

2016-08-27 Thread Craig James

Re: [PERFORM] Millions of tables

2016-09-27 Thread Craig James
lose to 6,000 records/second. The schema has been > simplified since and last test reach just over 20,000 records/second with > 300k tables. > > I'm not looking for alternatives yet but input to my test. Takers? > > I can't promise immediate feedback but will do my best to respond with > results. > > TIA, > -Greg

[PERFORM] limit 1 on view never finishes

2016-10-27 Thread Craig James
This is a weird problem. A "limit 5" query runs quicky as expected, but a "limit 1" query never finishes -- it just blasts along at 100% CPU until I give up. And this is a join between two small tables (262K rows and 109K rows). Both tables were recently analyzed. This is Postgres 9.3.5 (yes, we'l

[PERFORM] Optimizing around retained tuples

2017-03-21 Thread James Parks
iently query table A? Or, in other words, how do I avoid incurring the cost of transaction isolation for queries against B on a case-by-case basis? Anything is on the table for implementation: - moving tables to a different database / cluster / completely different DBMS system - designing an extension to tune either sets of queries - partitioning tables - etc ... although the simpler the better. If you were in this position, what would you do? Regards, James

[PERFORM] tsearch2/GIST performance factors?

2005-10-17 Thread Craig A. James
We are indexing about 5 million small documents using tsearch2/GIST. Each "document" contains 2 to 50 words. This is a "write once, read many" situation. Write performance is unimportant, and the database contents are static. (We build it offline.) We're having problems with inconsistent pe

Re: [PERFORM] tsearch2/GIST performance factors?

2005-10-17 Thread Craig A. James
Oleg wrote: Did you consider *decreasing* SIGLENINT ? Size of index will diminish and performance could be increased. I use in current project SIGLENINT=15 The default value for SIGLENINT actually didn't work at all. It was only by increasing it that I got any performance at all. An examinat

Re: [PERFORM] Need help in setting optimal configuration for a huge

2005-10-23 Thread Craig A. James
We are facing a critical situation because of the performance of the database. Even a basic query like select count(*) from bigger_table is taking about 4 minutes to return. Several other replies have mentioned that COUNT() requires a full table scan, but this point can't be emphasized
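The preview cuts off, but the COUNT() full-scan limitation is a recurring topic on this list. One commonly suggested workaround (a sketch of the general technique, not necessarily what this reply recommended) is a trigger-maintained counter table, shown here in SQLite for portability with illustrative table names:

```python
import sqlite3

# Keep an exact row count in a one-row side table, maintained by
# triggers, so getting the count never needs a full table scan.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bigger_table (id INTEGER PRIMARY KEY, payload TEXT);
CREATE TABLE row_count (n INTEGER NOT NULL);
INSERT INTO row_count VALUES (0);
CREATE TRIGGER bt_ins AFTER INSERT ON bigger_table
  BEGIN UPDATE row_count SET n = n + 1; END;
CREATE TRIGGER bt_del AFTER DELETE ON bigger_table
  BEGIN UPDATE row_count SET n = n - 1; END;
""")
conn.executemany("INSERT INTO bigger_table (payload) VALUES (?)",
                 [("x",)] * 1000)
conn.execute("DELETE FROM bigger_table WHERE id <= 10")
cheap_count = conn.execute("SELECT n FROM row_count").fetchone()[0]
true_count = conn.execute("SELECT count(*) FROM bigger_table").fetchone()[0]
```

In PostgreSQL the same idea works with PL/pgSQL triggers, with the caveat that a single counter row becomes a write hotspot under concurrent inserts.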

Re: [PERFORM] Used Memory

2005-10-24 Thread Craig A. James
Kevin Grittner wrote: In addition to what Mark pointed out, there is the possibility that a query is running which is scanning a large table or otherwise bringing in a large number of pages from disk. That would first use up all available unused cache space, and then may start replacing some of

Re: [PERFORM] Used Memory

2005-10-24 Thread Craig A. James
Scott Marlowe wrote: What's needed is a way for the application developer to explicitly say, "This object is frequently used, and I want it kept in memory." There's an interesting conversation happening on the linux kernel hackers mailing list right about now that applies: http://www.gossamer

Re: [PERFORM] Is There Any Way ....

2005-10-24 Thread Craig A. James
"Jim C. Nasby" wrote: > Stefan Weiss wrote: > ... IMO it would be useful to have a way to tell > PG that some tables were needed frequently, and should be cached if > possible. This would allow application developers to consider joins with > these tables as "cheap", even when querying on columns

[PERFORM] Expensive function and the optimizer

2005-11-07 Thread Craig A. James
I have a function, call it "myfunc()", that is REALLY expensive computationally. Think of it like, "If you call this function, it's going to telephone the Microsoft Help line and wait in their support queue to get the answer." Ok, it's not that bad, but it's so bad that the optimizer should A

[PERFORM] Perl DBD and an alarming problem

2005-11-16 Thread Craig A. James
I am mystified by the behavior of "alarm" in conjunction with Postgres/perl/DBD. Here is roughly what I'm doing: eval { local $SIG{ALRM} = sub {die("Timeout");}; $time = gettimeofday; alarm 20; $sth = $dbh->prepare("a query that may take a long time..."); $sth->execut

Re: [PERFORM] Perl DBD and an alarming problem

2005-11-17 Thread Craig A. James
Thanks for the info on alarm and timeouts, this was a big help. One further comment: Michael Fuhr wrote: eval { local $SIG{ALRM} = sub {die("Timeout");}; $time = gettimeofday; alarm 20; $sth = $dbh->prepare("a query that may take a long time..."); $sth->execute(); ala
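The Perl eval/alarm idiom being discussed transcribes directly to Python for a runnable sketch (the slow query is simulated with sleep; this is Unix-only, and the server-side statement_timeout the thread settles on is usually the more robust option):

```python
import signal
import time

class Timeout(Exception):
    pass

def handler(signum, frame):
    raise Timeout("query took too long")

# Client-side timeout around a slow call, mirroring the Perl
# eval { alarm 20; ... } idiom from the thread. The long-running
# query is simulated with time.sleep.
signal.signal(signal.SIGALRM, handler)
signal.alarm(1)            # fire SIGALRM after 1 second
try:
    time.sleep(5)          # stand-in for $sth->execute() on a slow query
    timed_out = False
except Timeout:
    timed_out = True
finally:
    signal.alarm(0)        # cancel any pending alarm
```

As with the Perl version, the signal only interrupts the client; the query may keep running on the server, which is why statement_timeout is preferable when available.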

[PERFORM] Storage/Performance and splitting a table

2005-11-19 Thread Craig A. James
In a recent thread, several people pointed out that UPDATE = DELETE+INSERT. This got me to wondering. I have a table that, roughly, looks like this: create table doc ( id integer primary key, document text, keywords tsvector ); where "keywords" has a GIST index. Th

Re: [PERFORM] Perl DBD and an alarming problem

2005-11-19 Thread Craig A. James
When I set statement_timeout in the config file, it just didn't do anything - it never timed out (PG 8.0.3). ... but did you reload the server after you [changed statement_timeout]? Mystery solved. I have two servers; I was reconfiguring one and restarting the other. Duh. Thanks, Craig

[PERFORM] Hyperthreading slows processes?

2005-11-20 Thread Craig A. James
This article on ZDNet claims that hyperthreading can *hurt* performance, due to contention in the L1/L2 cache by a second process: http://news.zdnet.co.uk/0,39020330,39237341,00.htm Has anyone tested this on Postgres yet? (And based on a recent somewhat caustic thread about performance on t

Re: [PERFORM] 15,000 tables

2005-12-01 Thread Craig A. James
So say I need 10,000 tables, but I can create tablespaces. Wouldn't that solve the performance problem caused by Linux's (or ext2/3's) problems with large directories? For example, if each user creates (say) 10 tables, and I have 1000 users, I could create 100 tablespaces, and assign groups o
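The arithmetic sketched above can be made concrete: spreading each user's tables across 100 tablespaces bounds the number of files in any one directory. A toy sketch (the round-robin assignment is an illustration, not from the message):

```python
# 1000 users x 10 tables each = 10,000 tables. Assigning each user's
# tables to one of 100 tablespaces caps any single directory at ~100
# files, sidestepping large-directory slowdowns on ext2/3.
NUM_TABLESPACES = 100
placement = {}
for user in range(1000):
    ts = user % NUM_TABLESPACES          # simple round-robin assignment
    for table in range(10):
        placement.setdefault(ts, []).append((user, table))

files_per_dir = {ts: len(tables) for ts, tables in placement.items()}
max_files = max(files_per_dir.values())
```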

[PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
I asked a while back if there were any plans to allow developers to override the optimizer's plan and force certain plans, and received a fairly resounding "No". The general feeling I get is that a lot of work has gone into the optimizer, and by God we're going to use it! I think this is just

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
Tom, I see this over and over. Tweak the parameters to "force" a certain plan, because there's no formal way for a developer to say, "I know the best plan." I think you've misunderstood those conversations entirely. The point is not to force the planner into a certain plan, it is to explore

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
Christopher Kings-Lynne wrote: select * from my_table where row_num >= 5 and row_num < 10 and myfunc(foo, bar); You just create an index on myfunc(foo, bar) Thanks, but myfunc() takes parameters (shown here as "foo, bar"), one of which is not a column, it's external and c

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
Christopher Kings-Lynne wrote: select * from my_table where row_num >= 5 and row_num < 10 and myfunc(foo, bar); You just create an index on myfunc(foo, bar) only if myfunc(foo, bar) is immutable... And if it's not then the best any database can do is to index scan ro
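The immutability requirement quoted here can be demonstrated concretely: SQLite enforces the same rule for expression indexes (it calls such functions "deterministic"), and rejects the index otherwise. A sketch with a stand-in myfunc (the function body is invented for illustration):

```python
import sqlite3

# Stand-in for the thread's hypothetical myfunc(foo, bar).
def myfunc(foo, bar):
    return (foo * 31 + bar) % 7 == 0

conn = sqlite3.connect(":memory:")
# deterministic=True is SQLite's equivalent of IMMUTABLE; without it,
# the CREATE INDEX below would be rejected.
conn.create_function("myfunc", 2, myfunc, deterministic=True)
conn.execute("CREATE TABLE my_table (row_num INTEGER, foo INTEGER, bar INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)",
                 [(i, i, i + 1) for i in range(100)])
conn.execute("CREATE INDEX my_idx ON my_table (myfunc(foo, bar))")

rows = conn.execute(
    "SELECT row_num FROM my_table "
    "WHERE row_num >= 5 AND row_num < 10 AND myfunc(foo, bar)").fetchall()
```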

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
Christopher Kings-Lynne wrote: I don't necessarily disagree with your assertion that we need planner hints, but unless you or someone else is willing to submit a patch with the feature it's unlikely to ever be implemented... Now that's an answer I understand and appreciate. Open-source develo

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
Mark Kirkwood wrote: I hear what you are saying, but to use this fine example - I don't know what the best plan is - these experiments part of an investigation to find *if* there is a better plan, and if so, why Postgres is not finding it. There isn't a database in the world that is as smart

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
Tom Lane wrote: This discussion has been had before (many times) ... see the -hackers archives for detailed arguments. The one that carries the most weight in my mind is that planner hints embedded in applications will not adapt to changing circumstances --- the plan that was best when you desig

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
Christopher Kings-Lynne wrote: Can you paste explain analyze and your effective_cache_size, etc. settings. ... This seems like a case where PostgreSQL's current optimiser should easily know what to do if your config settings are correct and you've been running ANALYZE, so I'd like to see your

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
Christopher Kings-Lynne wrote: So your main example bad query is possibly just a case of lack of analyze stats and wrong postgresql.conf config? And that's what causes you to shut down your database? Don't you want your problem FIXED? I'm trying to help by raising a question that I think is

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
Yeah it would - an implementation I have seen that I like is where the developer can supply the *entire* execution plan with a query. This is complex enough to make casual use unlikely :-), but provides the ability to try out other plans, and also fix that vital query that must run today.

Re: [PERFORM] Overriding the optimizer

2005-12-15 Thread Craig A. James
Kevin Brown wrote: Hints are dangerous, and I consider them a last resort. If you consider them a last resort, then why do you consider them to be a better alternative than a workaround such as turning off enable_seqscan, when all the other tradeoffs are considered? If I understand enable_seq

Re: [PERFORM] Overriding the optimizer

2005-12-16 Thread Craig A. James
Jaime Casanova wrote: The context is this - in a busy OLTP system, sometimes a query comes through that, for whatever reason (foolishness on my part as a developer, unexpected use by a user, imperfection of the optimizer, etc.), takes a really long time to run, usually because it table-scans one

Re: [PERFORM] When to vacuum a table?

2006-11-26 Thread Craig A. James
Rod Taylor wrote: Just a minor clarification here: INSERT does not create dead rows, only UPDATE and DELETE do. Thus, if you only insert rows, you do not need to vacuum (although you probably need to analyze). Is there no real-time garbage collection at all in Postgres? And if so, is this bec

Re: [PERFORM] Bad iostat numbers

2006-12-05 Thread Craig A. James
Alex Turner wrote: The problem I see with software raid is the issue of a battery backed unit: If the computer loses power, then the 'cache' which is held in system memory, goes away, and fubars your RAID. I'm not sure I see the difference. If data are cached, they're not written whether it

Re: [PERFORM] New to PostgreSQL, performance considerations

2006-12-11 Thread Craig A. James
Ron wrote: We are not going to get valuable contributions nor help people become more valuable to the community by "flaming them into submission". Let's support getting definitive evidence. No matter who brings it to the table ;-) Thanks, Ron, for a voice of respect and reason. Since I fir

[PERFORM] glibc double-free error

2007-01-01 Thread Craig A. James
I'm using pg_dump/pg_restore to quickly copy databases between servers. But my server keeps crashing when I run pg_restore: glibc detected *** double free or corruption (!prev): 0x0a00b1a0 Postgres: 8.1.4 Linux: 2.6.12-1.1381_FC3 glibc: 2.3.6-0.fc3.1 Server: Dell CPU: Xeon 2.80GHz Me

Re: [PERFORM] glibc double-free error

2007-01-01 Thread Craig A. James
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: I'm using pg_dump/pg_restore to quickly copy databases between servers. But my server keeps crashing when I run pg_restore: glibc detected *** double free or corruption (!prev): 0x0a00b1a0 What can I do

[PERFORM] Trivial function query optimized badly

2007-01-03 Thread Craig A. James
Well, once again I'm hosed because there's no way to tell the optimizer the cost for a user-defined function. I know this issue has already been raised (by me!) several times, but I have to remind everyone about this. I frequently must rewrite my SQL to work around this problem. Here is the

Re: [PERFORM] Trivial function query optimized badly

2007-01-03 Thread Craig A. James
Adam Rich wrote: Craig, What version of postgres are you using? I just tested this on PG 8.1.2 and was unable to reproduce these results. I wrote a simple function that returns the same text passed to it, after sleeping for 1 second. I use it in a where clause, like your example below, and rega

Re: [PERFORM] Trivial function query optimized badly

2007-01-03 Thread Craig A. James
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: CREATE OR REPLACE FUNCTION cansmiles(text) RETURNS text AS '/usr/local/pgsql/lib/libchem.so', 'cansmiles' LANGUAGE 'C' STRICT IMMUTABLE; Umm ... this is a single-argument func

Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS

2007-01-05 Thread Craig A. James
Guy Rouillier wrote: I've got back access to my test system. I ran another test run with the same input data set. This time I put pg_xlog on a different RAID volume (the unused one that I suspect is a software RAID), and I turned fsync=off in postgresql.conf. I left the rest of the configura

Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS

2007-01-06 Thread Craig A. James
Guy, The application is fairly straightforward, but as you say, what is working okay with BigDBMS isn't working as well under PG. I'm going to try other configuration suggestions made by others before I attempt logic changes. The core logic is unchangeable; millions of rows of data in a sin

Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS

2007-01-07 Thread Craig A. James
Craig A. James wrote: The "idiom" to replace count() was "select col from tbl order by col desc limit 1". It worked miracles for my app. Sorry, I meant to write, "the idiom to replace MAX()", not count()... MAX() was the function that was killing me, 
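For context, the idiom works because PostgreSQL releases before 8.1 (which added the MIN/MAX index optimization) executed MAX() as a full scan, while ORDER BY ... DESC LIMIT 1 could walk the index backward. The equivalence of the two queries (though not the speed difference) in a quick SQLite sketch:

```python
import sqlite3

# "select col from tbl order by col desc limit 1" returns the same
# value as MAX(col); on old PostgreSQL only the former used the index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?)", [(i,) for i in range(1000)])
conn.execute("CREATE INDEX tbl_col ON tbl (col)")

via_max = conn.execute("SELECT max(col) FROM tbl").fetchone()[0]
via_limit = conn.execute(
    "SELECT col FROM tbl ORDER BY col DESC LIMIT 1").fetchone()[0]
```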

Re: [PERFORM] How to debug performance problems

2007-02-19 Thread Craig A. James
Andreas Tille wrote: My web application was running fine for years without any problem and the performance was satisfying. Some months ago I added a table containing 450 data rows ... Since about two weeks the application became *drastically* slower and I urgently have to bring back the old

Re: [PERFORM] How to debug performance problems

2007-02-21 Thread Craig A. James
Ray, I'd like to have a toolbox prepared for when performance goes south. I'm clueless. Would someone mind providing some detail about how to measure these four items Craig listed: I hope I didn't give the impression that these were the only thing to look at ... those four items just popped

Re: [PERFORM] Identical Queries

2007-03-01 Thread Craig A. James
Stephan Szabo wrote: I tried posting to the bugs, and they said this is a better question for here. I have two queries. One runs in about 2 seconds. The other takes upwards of 2 minutes. I have a temp table that is created with 2 columns. This table is joined with the larger database of call detai

Re: [PERFORM] Estimate the size of the SQL file generated by pg_dump utility

2007-03-06 Thread Craig A. James
Bill Moran wrote: I'm curious, what problem does the disclaimer cause? I wrote the following TOS for my personal system: https://www.potentialtech.com/cms/node/9 Excerpt of the relevant part: "If you send me email, you are granting me the unrestricted right to use the contents of that email howe

Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-08 Thread Craig A. James
Rodrigo Madera wrote: I would just like to note here that this is an example of inefficient strategy. We could all agree (up to a certain economical point) that Alex saved the most expensive one thousand dollars of his life. I don't know the financial status nor the size of your organization

Re: [PERFORM] strange performance regression between 7.4 and 8.1

2007-03-08 Thread Craig A. James
Carlos, Now, yet another thing that you (Craig) seem to be missing: you're simply putting the expense of all this time under the expenses column in exchange for solving the particular problem... More like I was trying to keep my response short ;-). I think we're all in agreement on pretty mu

Re: [PERFORM] Determining server load from client

2007-03-20 Thread Craig A. James
Dan Harris wrote: I've found that it would be helpful to be able to tell how busy my dedicated PG server is ... I have seen some other nice back-end things exposed through PG functions ( e.g. database size on disk ) and wondered if there was anything applicable to this. I'd write a simple p

Re: [PERFORM] Performance of count(*)

2007-03-22 Thread Craig A. James
Michael Stone wrote: On Thu, Mar 22, 2007 at 01:30:35PM +0200, [EMAIL PROTECTED] wrote: approximated count? why? who would need it? where you can use it? Do a google query. Look at the top of the page, where it says "results N to M of about O". For user interfaces (which is where a lot o

Re: [PERFORM] Performance of count(*)

2007-03-22 Thread Craig A. James
Brian Hurt wrote: One of our biggest single problems is this very thing. It's not a Postgres problem specifically, but more embedded in the idea of a relational database: There are no "job status" or "rough estimate of results" or "give me part of the answer" features that are critical to man

Re: [PERFORM] Performance of count(*)

2007-03-22 Thread Craig A. James
Tino Wildenhain wrote: Craig A. James schrieb: ... In our case (for a variety of reasons, but this one is critical), we actually can't use Postgres indexing at all -- we wrote an entirely separate indexing system for our data... ...There is no need to store or maintain this information

Re: [PERFORM] Performance of count(*)

2007-03-22 Thread Craig A. James
Tino Wildenhain wrote: You guys can correct me if I'm wrong, but the key feature that's missing from Postgres's flexible indexing is the ability to maintain state across queries. Something like this: select a, b, my_index_state() from foo where ... offset 100 limit 10 using my_index(prev_

Re: [PERFORM] Performance of count(*)

2007-03-22 Thread Craig A. James
Steve Atkins wrote: As long as you're ordering by some row in the table then you can do that in straight SQL. select a, b, ts from foo where (stuff) and foo > X order by foo limit 10 Then, record the last value of foo you read, and plug it in as X the next time around. We've been over this be
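Steve's suggestion is what is now commonly called keyset pagination: remember the last key read and use it as the lower bound for the next page, instead of OFFSET. A runnable sketch (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (k INTEGER PRIMARY KEY, a TEXT)")
conn.executemany("INSERT INTO foo VALUES (?, ?)",
                 [(i, f"row{i}") for i in range(1, 26)])

def next_page(last_k, page_size=10):
    # "where k > X order by k limit N": each page resumes where the
    # previous one left off, so no rows are skipped by OFFSET.
    return conn.execute(
        "SELECT k, a FROM foo WHERE k > ? ORDER BY k LIMIT ?",
        (last_k, page_size)).fetchall()

page1 = next_page(0)
page2 = next_page(page1[-1][0])   # plug the last key back in as X
```

As the thread goes on to debate, this only works cleanly when the ordering column is unique (or the key is extended to break ties).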

Re: [PERFORM] Performance of count(*)

2007-03-22 Thread Craig A. James
Tom Lane wrote: "Craig A. James" <[EMAIL PROTECTED]> writes: Steve Atkins wrote: As long as you're ordering by some row in the table then you can do that in straight SQL. select a, b, ts from foo where (stuff) and foo > X order by foo limit 10 Then, record the last va

Re: [PERFORM] [HACKERS] EXISTS optimization

2007-04-03 Thread Craig A. James
Kevin Grittner wrote: Management has simply given a mandate that the software be independent of OS and database vendor, and to use Java to help with the OS independence. ... we write all of our queries in ANSI SQL in our own query tool, parse it, and generate Java classes to run it. A better so

[PERFORM] Can't drop tablespace or user after disk gone

2007-04-04 Thread Craig A. James
I had a 'scratch' database for testing, which I deleted, and then disk went out. No problem, no precious data. But now I can't drop the tablespace, or the user who had that as the default tablespace. I thought about removing the tablespace from pg_tablespaces, but it seems wrong to be monkey

Re: [PERFORM] DELETE with filter on ctid

2007-04-09 Thread Craig A. James
Spiegelberg, Greg wrote: We have a query which generates a small set of rows (~1,000) which are to be used in a DELETE on the same table. The problem we have is that we need to join on 5 different columns and it takes far too long. You may have encountered the same problem I did: You *must*
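One portable way to express that kind of multi-column DELETE (an illustration, not necessarily the fix this reply goes on to describe) is a row-value IN over the selecting query, shown here in SQLite with two join columns standing in for the five:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c1 INTEGER, c2 INTEGER, c3 TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(i, i % 5, "x") for i in range(100)])
# ~10 rows selected for deletion, identified by multiple columns.
conn.execute("CREATE TABLE doomed (c1 INTEGER, c2 INTEGER)")
conn.executemany("INSERT INTO doomed VALUES (?, ?)",
                 [(i, i % 5) for i in range(10)])

# Row-value IN matches on the column tuple, avoiding a long chain
# of ANDed equality conditions in the DELETE.
conn.execute(
    "DELETE FROM t WHERE (c1, c2) IN (SELECT c1, c2 FROM doomed)")
remaining = conn.execute("SELECT count(*) FROM t").fetchone()[0]
```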

Re: [PERFORM] Basic Q on superfluous primary keys

2007-04-16 Thread Craig A. James
Merlin Moncure wrote: Using surrogate keys is dangerous and can lead to very bad design habits that are unfortunately so prevalent in the software industry they are virtually taught in schools. ... While there is nothing wrong with them in principle (you are exchanging one key for another as a

Re: [PERFORM] Basic Q on superfluous primary keys

2007-04-17 Thread Craig A. James
Merlin Moncure wrote: In the context of this debate, I see this argument all the time, with the implied suffix: 'If only we used integer keys we would not have had this problem...'. Either the customer identifies parts with a part number or they don't...and if they do identify parts with a numbe

Re: [PERFORM] Basic Q on superfluous primary keys

2007-04-18 Thread Craig A. James
Merlin Moncure wrote: Since the part number is NOT the primary key, duplicate part numbers are not a problem. If you had instead used the part number as the primary key, you'd be dead in the water. You are redefining the primary key to be (part_number, obsoletion_date). Now, if you had not a

Re: [PERFORM] Basic Q on superfluous primary keys

2007-04-22 Thread Craig A. James
Merlin Moncure wrote: In the context of this debate, I see this argument all the time, with the implied suffix: 'If only we used integer keys we would not have had this problem...'. Either the customer identifies parts with a part number or they don't...and if they do identify parts with a numbe

Re: Filesystem fragmentation (Re: [PERFORM] Fragmentation of WAL files)

2007-04-26 Thread Craig A. James
Bill Moran wrote: In response to Heikki Linnakangas <[EMAIL PROTECTED]>: Can anyone else confirm this? I don't know if this is a windows-only issue, but I don't know of a way to check fragmentation in unix. I can confirm that it's only a Windows problem. No UNIX filesystem that I'm aware of s

Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning

2007-04-30 Thread Craig A. James
Greg Smith wrote: If you're going to the trouble of building a tool for offering configuration advice, it can be widly more effective if you look inside the database after it's got data in it, and preferably after it's been running under load for a while, and make your recommendations based on

[PERFORM] Replication

2007-06-14 Thread Craig A. James
Looking for replication solutions, I find: Slony-I Seems good, single master only, master is a single point of failure, no good failover system for electing a new master or having a failed master rejoin the cluster. Slave databases are mostly for safety or for parallelizing queries for perfo

Re: [PERFORM] Suspending SELECTs

2006-01-16 Thread Craig A. James
Alessandro Baretta <[EMAIL PROTECTED]> writes: I am aware that what I am dreaming of is already available through cursors, but in a web application, cursors are bad boys, and should be avoided. What I would like to be able to do is to plan a query and run the plan to retrieve a limited number of
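
A common stateless alternative to holding a cursor open across web requests is keyset pagination: each request re-runs a small query that resumes strictly after the last key seen. A minimal sketch over an in-memory sorted list (the PostgreSQL equivalent would be roughly `WHERE id > :last_id ORDER BY id LIMIT :page_size`; column names here are hypothetical):

```python
# Keyset pagination: no server-side state survives between requests.
def fetch_page(rows, last_id, page_size):
    # rows must be sorted by id; resume strictly after last_id
    return [r for r in rows if r["id"] > last_id][:page_size]

rows = [{"id": i, "val": i * i} for i in range(1, 11)]
page1 = fetch_page(rows, last_id=0, page_size=3)                # ids 1..3
page2 = fetch_page(rows, last_id=page1[-1]["id"], page_size=3)  # ids 4..6
print([r["id"] for r in page2])
```

Unlike OFFSET-based paging, each page costs the same regardless of how deep into the result set the client has scrolled, provided the key column is indexed.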

Re: [PERFORM] Suspending SELECTs

2006-01-17 Thread Craig A. James
Alessandro Baretta wrote: I think you're trying to do something at the wrong layer of your architecture. This task normally goes in your middleware layer, not your database layer. I am developing my applications in Objective Caml, and I have written the middleware layer myself. I could easil

Re: [PERFORM] [GENERAL] Creation of tsearch2 index is very slow

2006-01-20 Thread Craig A. James
Tom Lane wrote: Well, we're trying to split an index page that's gotten full into two index pages, preferably with approximately equal numbers of items in each new page (this isn't a hard requirement though). ... If that's correct, what you really want is to divide the values so that the unions
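
The split being described can be sketched as a Guttman-style quadratic split over set-valued keys (Python sets standing in for signature bitmaps). This is an illustration of the general idea, not PostgreSQL's actual tsearch2/GiST code: pick the two least-alike items as seeds, then assign each remaining item to the page whose union it grows the least.

```python
# Split a full "page" of signatures into two pages with small,
# largely disjoint unions.
def split_page(items):
    # seeds: the pair with the largest symmetric difference (least alike)
    seeds = max(((a, b) for a in items for b in items if a is not b),
                key=lambda p: len(p[0] ^ p[1]))
    left, right = set(seeds[0]), set(seeds[1])
    lpage, rpage = [seeds[0]], [seeds[1]]
    for it in items:
        if it is seeds[0] or it is seeds[1]:
            continue
        # growth of each page's union if this item were added there
        if len(it - left) <= len(it - right):
            left |= it
            lpage.append(it)
        else:
            right |= it
            rpage.append(it)
    return lpage, rpage

items = [{1, 2}, {1, 3}, {8, 9}, {9, 10}]
l, r = split_page(items)
print(sorted(len(p) for p in (l, r)))
```

Keeping the two unions nearly disjoint is what lets later searches skip one of the two pages.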

Re: [PERFORM] Query optimization with X Y JOIN

2006-01-26 Thread Craig A. James
First, this isn't really the right place to ask -- this forum is about performance, not SQL syntax. Second, this isn't a question anyone can answer in a reasonable length of time. What you're asking for usually is taught in a class on relational database theory, which is typically a semester

Re: [PERFORM] [GENERAL] Creation of tsearch2 index is very

2006-01-26 Thread Craig A. James
Ron <[EMAIL PROTECTED]> writes: We have two problems here. The first is that the page splitting code for these indexes currently has O(N^2) performance. The second is that whatever solution we do use for this functionality, we still need good performance during searches that use the index. No

Re: [PERFORM] Huge Data sets, simple queries

2006-01-29 Thread Craig A. James
Mike Biamonte wrote: Does anyone have any experience with extremely large data sets? I'm mean hundreds of millions of rows. The queries I need to run on my 200 million transactions are relatively simple: select month, count(distinct(cardnum)) count(*), sum(amount) from transactions group by
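
What that aggregate query computes can be sketched in plain Python over hypothetical (month, cardnum, amount) tuples. At hundreds of millions of rows the database performs the same aggregation, and `count(distinct ...)` is usually the expensive part because it must de-duplicate per group:

```python
# One pass over the transactions, accumulating per-month aggregates.
from collections import defaultdict

txns = [("2006-01", "c1", 10.0), ("2006-01", "c1", 5.0),
        ("2006-01", "c2", 7.5), ("2006-02", "c3", 2.5)]

cards = defaultdict(set)    # month -> distinct cardnums
count = defaultdict(int)    # month -> row count
total = defaultdict(float)  # month -> sum(amount)
for month, cardnum, amount in txns:
    cards[month].add(cardnum)
    count[month] += 1
    total[month] += amount

for month in sorted(cards):
    print(month, len(cards[month]), count[month], total[month])
```

The per-month sets are what make the distinct count memory-hungry; everything else is a cheap running tally.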

Re: [PERFORM] Storing Digital Video

2006-02-09 Thread Craig A. James
Nate Byrnes wrote: I must claim some ignorance, I come from the application world... but, from a data integrity perspective, it makes a whole lot of sense to store video, images, documents, whatever in the database rather than on the file system external to it. Personally, I would use LOB's, bu

Re: [PERFORM] Reliability recommendations

2006-02-15 Thread Craig A. James
Jeremy Haile wrote: We are a small company looking to put together the most cost effective solution for our production database environment. Currently in production Postgres 8.1 is running on this machine: Dell 2850 2 x 3.0 Ghz Xeon 800Mhz FSB 2MB Cache 4 GB DDR2 400 Mhz 2 x 73 GB 10K SCSI RAID

Re: qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Craig A. James
Markus Schaber wrote: Ron wrote: ...and of course if you know enough about the data to be sorted so as to constrain it appropriately, one should use a non comparison based O(N) sorting algorithm rather than any of the general comparison based O(NlgN) methods. Sounds interesting, could you give
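
One concrete example of the non-comparison sort being asked about is counting sort, which runs in O(N + K) when the keys are known to be small integers in [0, K). A minimal sketch:

```python
# Counting sort: tally each key, then emit keys in order of value.
def counting_sort(keys, k):
    counts = [0] * k
    for key in keys:
        counts[key] += 1
    out = []
    for value, n in enumerate(counts):
        out.extend([value] * n)
    return out

print(counting_sort([3, 1, 4, 1, 5, 2, 1], k=6))
```

Radix sort generalizes the same idea to wider keys by counting-sorting one digit at a time; both sidestep the O(N lg N) comparison-sort lower bound by exploiting knowledge of the key domain.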

Re: [PERFORM] Cost Issue - How do I force a Hash Join

2006-02-20 Thread Craig A. James
"Virag Saksena" <[EMAIL PROTECTED]> writes: The individual queries run in 50-300 ms. However the optimizer is choosing a nested loop to join them rather than a Hash join... I have what appears to be the identical problem. This is a straightforward query that should be fairly quick, but takes a
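
Why the planner's choice matters here: a hash join builds a lookup table over one input once and probes it per row of the other, roughly O(N + M), while a naive nested loop compares every pair, O(N × M). A sketch over hypothetical row tuples (an illustration of the join strategies, not the planner's code):

```python
# Hash join: build a dict on the right input, probe with the left.
def hash_join(left, right, key_l, key_r):
    table = {}
    for row in right:                      # build phase
        table.setdefault(row[key_r], []).append(row)
    out = []
    for row in left:                       # probe phase
        for match in table.get(row[key_l], []):
            out.append(row + match)
    return out

left = [(1, "a"), (2, "b"), (3, "c")]
right = [(2, "x"), (3, "y"), (3, "z")]
print(hash_join(left, right, key_l=0, key_r=0))
```

A nested loop only wins when one input is tiny or an index makes each inner probe nearly free; when the planner misestimates row counts it can pick the wrong strategy, which is what this thread is chasing.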

Re: [PERFORM] Reliability recommendations

2006-02-24 Thread Craig A. James
Bruce Momjian wrote: Dell often says part X is included, but part X is not the exact same as part X sold by the original manufacturer. To hit a specific price point, Dell is willing to strip thing out of commodity hardware, and often does so even when performance suffers. For many people, this

Re: [PERFORM] Reliability recommendations

2006-02-24 Thread Craig A. James
Joshua D. Drake wrote: I find this strains credibility, that this major manufacturer of PC's would do something deceptive that hurts performance, when it would be easily detected and widely reported. Can anyone cite a specific instances where this has happened? Such as, "I bought Dell model X
