The .NET Runtime will be part of the next MS SQL Server engine. You will be able to use C# as a procedural language in the database engine with the next version of MSSQL. That will certainly be something to think about.
Ah, well, if it's C# (or even VB.NET) then it's serious!
I thought postgres h
On Mon, 10 Jan 2005 12:46:01 -0500, Alex Turner <[EMAIL PROTECTED]> wrote:
You, sir, are correct! You can't use Perl in MS-SQL or Oracle ;).
Can you benefit from the luminous power of Visual Basic as a pl in MSSQL?
Decreasing sort_mem too much [8196] makes the performance much slower, so I use
sort_mem = 16384
and leave the effective cache at the same value; the result is quite a bit better, but I should wait for tomorrow morning [official hours] to see the end result.
You could also profile your queries to see w
I've looked at PREPARE, but apparently it only lasts per-session - that's worthless in our case (web-based service, one connection per data-requiring request).
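A minimal sketch of that per-session scoping (table and statement names are made up):

PREPARE get_user (integer) AS
    SELECT * FROM users WHERE id = $1;
EXECUTE get_user(42);
-- the prepared statement vanishes when this session closes,
-- which is exactly the problem with one connection per request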
You don't use persistent connections???
Your problem might simply be the connection time overhead (also including
a fe
On Mon, 13 Dec 2004 17:43:07 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
Sven Willenberger <[EMAIL PROTECTED]> writes:
explain analyze select storelocation,order_number from custacct where
referrer = 1365 and orderdate between '2004-12-07' and '2004-12-07
12:00:00' order by custacctid limit 10;
The fact that the estimator knows that the LIMIT is pointless because there are fewer rows in the subselect than the LIMIT will return is not something we want to count on; sometimes the estimator has inaccurate information. The UNIQUE index makes this more certain, except that I'm not sure
Your suffering comes from the "where ba.bankaccountID = u.bankaccountID" in the subselect. It means Postgres has to run the subselect once for each row in Users. You want the subselect to run only once, and return one (or more?) bankaccountIDs, then fetch the users from Users.
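A minimal sketch of the decorrelated form, reusing the column names from the query below (adjust as needed):

SELECT u.userID, u.fname, u.lname
FROM Users u
WHERE u.bankaccountID IN (
    -- runs once, not once per row of Users
    SELECT ba.bankaccountID
    FROM bankaccount ba
    WHERE ba.accountnumber = '12345678'
);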
Just one; a user can have only one bank account.
Ah well, in that case:
This is your query:
select userID, fname, lname, email, phone, dateEntered, dateCanceled,
dateSuspended, billAmount, billDate, dateBilled, datePaid, '?' as searches
from Users u
where 1=1 AND exists (select userID
Just wanted to know the selectivity of the accountnumber and routingNumber columns. I shoulda written:
How many rows do the following queries return:
One or a few at most, or a lot?
select userID
from bankaccount
WHERE accountnumber = '12345678'
select userID
from bankaccount
WHERE routingNumber = '...'
How many rows do the following queries return:
select userID
from bankaccount ba
where ba.bankaccountID = u.bankaccountID
and ba.accountnumber = '12345678'
select userID
from bankaccount ba
where ba.bankaccountID = u.bankaccountID
According to these lines you should set max_fsm_pages to at the very
least 5306160
You have a humongous amount of RAM, you could set it to 1000
INFO: free space map: 79 relations, 1948399 pages stored; 5306160 total
pages needed
DETAIL: Allocated FSM size: 500 relations + 200 page
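A minimal postgresql.conf sketch based on those numbers (6000000 is just the reported 5306160 rounded up for headroom):

# postgresql.conf
max_fsm_pages     = 6000000   # VACUUM reported 5306160 total pages needed
max_fsm_relations = 1000      # only 79 relations reported, so this has headroom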
All,
Well, you should still escape any strings you're getting from a web page, so you can ensure you're not subject to a SQL injection attack, even if you're expecting integers.
Thanks,
Peter Darley
Well, your framework should do this for you: "integer" specified in your database object class
It would be nice if PostgreSQL had some form of transparent surrogate
keying in the background which would automatically run around and
replace your real data with SERIAL integers. It could use a lookup table
There is still table inheritance, but it's not really the same.
Test platform:
Pentium 4 3.06 GHz/HT
10k SATA Raptor
1Gb memory
Windows XP Pro SP2/Redhat Fedora 3 (64 bit results coming soon)
Could you please add information about...
- filesystems?
- Windows configured as "network server" or as "desktop box"?
- virtual memory
In my experience you MUST d
What is the common approach? Should I use the product_code directly as my ID, or use a sequential number for speed? (I did the same for the company_id; this is a 'serial' and not the short name of the customer.) I just don't know what is usually done.
Use a serial:
- you can change product_code f
While an exception, this is a very real possibility in day to day
operations. The absence of any feedback or balancing mechanism between
the database and cache makes it impossible to know that they are in sync
and even a small error percentage multiplied over time will lead to an
ever incr
Instead of:
WHERE cd='ca' ORDER BY l_postcode;
Write:
WHERE cd='ca' ORDER BY cd, l_postcode;
You have a multicolumn index, so you should specify a multicolumn sort exactly the same as your index, and the planner will get it.
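For concreteness, a hedged example with invented names, assuming an index like this already exists:

CREATE INDEX addresses_cd_postcode_idx ON addresses (cd, l_postcode);

SELECT * FROM addresses
WHERE cd = 'ca'
ORDER BY cd, l_postcode;   -- sort clause matches the index column for column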
Let's say for a second that you manage to trick it into using an index scan, and then you actually call the function with one of the values that returns 1,000s of rows. Probably it will take 10-100 times longer than if it used a seq scan.
I don't know if it matters (I suspect that it does) but I am u
Check this marvelous piece of 5 minutes of work:
http://boutiquenumerique.com/test/iframe_feed.html
Yup. If you go the JS route then you can do even better by using JS to load data into JS objects in the background and manipulate the page content directly; no need for even an iframe. Ignore t
I'm guessing (2) - PG doesn't give the results of a query in a stream.
In 1- I was thinking about a cursor...
but I think his problem is more like 2-
In that case one can either code a special-purpose server or use the following hack:
In your webpage include an iframe with a Java
On Thu, 4 Nov 2004 18:20:18 -, Matt Clark <[EMAIL PROTECTED]> wrote:
Correct. 75% of all hits are on a script that can take anywhere from a few seconds to half an hour to complete. The script essentially auto-flushes to the browser so they get new information as it arrives, creating the
Myself, I like a small Apache with few modules serving static files (no
dynamic content, no db connections), and with a mod_proxy on a special
path directed to another Apache which generates the dynamic pages (few
processes, persistent connections...)
You get the best of both, static files
posix_fadvise(2) may be a candidate. Read/write barriers are another one, as well as syncing a bunch of data in different files with a single call (so that the OS can determine the best write order). I can also imagine some interaction with the FS journalling system (to avoid duplicate effort).
The
uh, you can always load a table in cache by doing a seq scan on it...
like select count(1) from table or something... this doesn't work for
indexes of course, but you can always look in the system catalogs, find
the filename for the index, then just open() it from an external program
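Something like this, if you want to try it (pg_class is real; the path layout is the standard one):

SELECT relfilenode FROM pg_class WHERE relname = 'my_index';
-- the index file is then $PGDATA/base/<database OID>/<relfilenode>
-- (relations over 1 GB are split into numbered segment files)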
Reiser4?
On Thu, 21 Oct 2004 08:58:01 +0100, Matt Clark <[EMAIL PROTECTED]> wrote:
I suppose I'm just idly wondering really. Clearly it's against PG
philosophy to build an FS or direct IO management into PG, but now it's
so
relatively easy to plug filesystems into the main open-source Oses,
How many lines do you have in your daily logfiles?
As you can see, this looks for clients who have visited the same site
within 20 min. If there is no match, a unique sessionid is assigned
from a sequence. If there is a visit, the session id assigned to them
is used. I'm only able to process
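A hedged sketch of that lookup, with hypothetical table and column names:

SELECT sessionid
FROM hits
WHERE client_ip = '10.0.0.1'
  AND site_id = 42
  AND hit_time > now() - interval '20 minutes'
ORDER BY hit_time DESC
LIMIT 1;
-- no row returned? then assign a fresh one:
SELECT nextval('session_seq');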
I just discovered this:
http://www.postgresql.org/docs/7.4/static/jdbc-query.html#AEN24298
On Tue, 12 Oct 2004 04:43:43 -0700 (PDT), my ho <[EMAIL PROTECTED]>
wrote:
Hi,
If anyone can help, I have a question about the execution of cursor create/fetch/move, in particular about disk cost.
Hashing is at least as fast, if not faster.
regards, tom lane
Probably quite a bit faster if the dataset is not huge...
UniqueSort would be useful for GROUP BY x ORDER BY x though
disclaimer: brainless proposition
(SELECT * FROM table WHERE icount(ids) <= 1 AND ids[1] = 33)
UNION ALL
(SELECT * FROM table WHERE icount(ids) > 1 AND ids && '{33}');
The really tricky part is that a DISTINCT ON needs to know about a first() aggregate. And to make optimal use of indexes, a last() aggregate as well. And ideally the planner/executor needs to know something is magic about first()/last() (and potentially min()/max() at some point) and that they
I don't really think it would be a useful plan anyway. What *would* be
useful is to support HashAggregate as an implementation alternative for
DISTINCT --- currently I believe we only consider that for GROUP BY.
The DISTINCT planning code is fairly old and crufty and hasn't been
redesigned lately
There are even three questions here:
- given that 'SELECT DISTINCT field FROM table' is exactly the same as 'SELECT field FROM table GROUP BY field', Postgres could transform the first into the second and avoid itself a (potentially killer) sort.
On my example the table was not too la
You could try:
explain analyze select "land" from "customer_dim" group by "land";
It will be a lot faster but I can't make it use the index on my machine...
Example:
create table dummy as (select id, id%255 as number from a large table
with 1M rows);
so we have a t
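If you're on 8.0, generate_series can stand in for "a large table" and makes the dummy table reproducible (a variant, not what the poster ran):

CREATE TABLE dummy AS
    SELECT id, id % 255 AS number
    FROM generate_series(1, 1000000) AS id;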
Maybe add an ORDER BY artist to force a GroupAggregate?
Hi all, a small question:
I've got this table "songs" and an index on column artist. Since there's about one distinct artist for every 10 rows, it would be nice if it could use this index when counting artists. It doesn't, however:
If it was in pgpool or something similar, I could devote a separate
machine just for caching results leaving the db server untouched.
BUT you would be limited to caching complete queries. There is a more
efficient strategy...
1) The materialized data is available in 3 different forms: a list, a detail view, and a spreadsheet. Each form has somewhat different columns and different rules about ordering, which would likely confuse an SQC planner. In this implementation, all 3 forms are able to share the same cache.
pgpool (which makes some rather questionable claims IMO); any decent web
application language/environment will support connection pooling.
That's why it should not be tied to something as specific as pgpool.
If you want performance, which is the case here, usually you have a
webserver serving
I have a table with ~8 million rows and I am executing a query which
should return about ~800,000 rows. The problem is that as soon as I
execute the query it absolutely kills my machine and begins swapping
for 5 or 6 minutes before it begins returning results. Is postgres
trying to load the whole
Performance hint:
For static data, do not normalize too much.
For instance if you have a row which can be linked to several other rows,
you can do this :
create table parents (
id serial primary key,
values... )
create table children (
id serial primary k
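The snippet cuts off here; a guess at the denormalized variant it was heading toward, with the child rows folded into an array (column names invented):

CREATE TABLE parents (
    id        serial PRIMARY KEY,
    child_ids integer[]   -- the linked rows, denormalized into one array
);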
Thanks for the thanks!
Generally, when grouping stuff together, it is a good idea to have two
sorted lists, and to scan them simultaneously. I have already used this
solution several times outside of Postgres, and it worked very well (it
was with Berkeley DB and there were 3 lists to sc
My simple Python program dumps 1653992 items in 1654000 categories in:
real    3m12.029s
user    1m36.720s
sys     0m2.220s
It was running on the same machine as PostgreSQL (AthlonXP 2500).
I Ctrl-C'd it before it dumped the whole database, but you get an idea.
If you don't know Python and generators
There's a very simple solution using cursors.
As an example:
create table categories ( id serial primary key, name text );
create table items ( id serial primary key, cat_id integer references
categories(id), name text );
create index items_cat_idx on items( cat_id );
insert st
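The example cuts off; a hedged sketch of how the two-cursor pass might continue (the client merges the two sorted streams):

BEGIN;
DECLARE cat_cur  CURSOR FOR SELECT id, name FROM categories ORDER BY id;
DECLARE item_cur CURSOR FOR SELECT cat_id, name FROM items ORDER BY cat_id;
FETCH 1000 FROM cat_cur;   -- fetch in batches from both cursors,
FETCH 1000 FROM item_cur;  -- advancing whichever stream is behind
COMMIT;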
Hello,
* I need information on the size of pg ARRAY[]'s:
I did not find any info in the Docs on this.
How many bytes does an array take on disk?
Is there a difference between an array of fixed-size elements like integers, and an array of variable-length elements like text? Is there a pointe
Yes, you're right as usual.
I had not thought about playing with ORDER BY on a field which has only
one value in the result set.
If you write it as
SELECT ... WHERE topic_id=2 ORDER BY topic_id DESC, id DESC
then an index on (topic_id, id) will work fine. The mixed ASC/DESC ordering is n
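Illustrated with a hypothetical "posts" table:

CREATE INDEX posts_topic_id_idx ON posts (topic_id, id);

SELECT * FROM posts
WHERE topic_id = 2
ORDER BY topic_id DESC, id DESC   -- both DESC: a backward scan of the index works
LIMIT 20;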
OK, thanks a lot for your explanations. Knowing how the planner "thinks" makes it pretty logical. Thank you.
Now another question...
I have a table of records representing forum posts with a primary key
(id), a topic_id, a timestamp, and other fields which I won't detail. I
want t
Update:
select * from apparts where departement=69 order by departement limit 10;
does use an index scan (because of the ORDER BY), even with OFFSET, and
it's a lot faster.
Hello,
I have this table:
CREATE TABLE apparts
(
    id       SERIAL NOT NULL PRIMARY KEY,
    price    FLOAT NOT NULL,
    surface  INTEGER NOT NULL,
    price_sq FLOAT NOT NULL,
    rooms    INTEGER NULL,
    vente
I trust ReiserFS 3.
I wouldn't trust the 4 before maybe 1-2 years.
On Sun, 05 Sep 2004 07:41:29 -0400, Geoffrey <[EMAIL PROTECTED]> wrote:
Christopher Browne wrote:
I'm not sure what all SuSE supports; they're about the only other Linux vendor that EMC would support, and I don't expect t
Were you upset by my message? I'll try to clarify.
I understood from your email that you are a Windows hater.
Well, no, not really. I use Windows every day and it has its strengths. I still don't think the average (non-geek) person can really use Linux as a desktop OS. The problem I have w
> There is also the fact that NTFS is a very slow filesystem, and Linux is
> a lot better than Windows for everything disk, caching and IO related. Try
> to copy some files in NTFS and in ReiserFS...
I'm not so sure I would agree with such a blanket generalization. I find NTFS to be very fa
create index t_idx on t((c+d));
select * from t where c+d > 0;
Why not:
select ((select * from t where c<>0::bigint) UNION (select * from t where d<>0::bigint))
group by whatever;
or something?
Another primary key trick:
If you insert records with a serial primary key, and rarely delete them
or update the timestamp, you can use the primary key to compute an
approximate number of rows.
a := SELECT pkey FROM table WHERE timestamp > threshold ORDER BY timestamp ASC LIMIT 1;
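Spelled out as one query (hypothetical names; assumes an index on the timestamp column and few deletes):

SELECT (SELECT max(id) FROM bigtable)
     - (SELECT id FROM bigtable
        WHERE ts > '2005-01-01'        -- the threshold
        ORDER BY ts ASC LIMIT 1)
     + 1 AS approx_rows_since_threshold;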
test where id = 5; A few times I added 100,000 records, applied
cast the 5 to int8 and it will use the index
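That is, assuming a bigint "id" column where the plain integer literal defeats the index:

SELECT * FROM test WHERE id = 5::int8;   -- literal now matches the column type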
What caught my attention initially was the 300+/sec insert performance.
On 8.0/NTFS/fsync=on, I can't break 100/sec on a 10k rpm ATA disk. My
hardware seems to be more or less in the same league as psql's, so I was
naturally curious if this was a NT/Unix issue, a 7.4/8.0 issue, or a
combination o
ReiserFS 4 is (will be) a filesystem that implements transactions.
Are there any plans in a future Postgresql version to support a special
fsync method for Reiser4 which will use the filesystem's transaction
engine, instead of an old kludge like fsync(), with a possibility of
vastly
On Wed, 11 Aug 2004 20:29:04 -0400, Jason Coene <[EMAIL PROTECTED]> wrote:
gf=# EXPLAIN ANALYZE SELECT id FROM comments WHERE userid = 51 ORDER BY
timestamp DESC LIMIT 5;
QUERY PLAN
---
Numeric won't store that:
(+33) 4 01 23 45 67
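A hedged sketch of the alternative, since phone numbers are labels rather than quantities (table and column names invented):

CREATE TABLE clients (
    id    serial PRIMARY KEY,
    phone varchar(32) NOT NULL   -- text keeps the '+', spaces and leading zeros
);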
On Wed, 11 Aug 2004 02:42:33 -0300, Er Galvão Abbott
<[EMAIL PROTECTED]> wrote:
Greetings.
I have a question regarding performance of certain datatypes:
I have a field where I will store my clients' phone numbers. I know that this field wil
We're currently running Postgres 7.4.1 on FreeBSD 5.2, a Dual Xeon 2.4,
2GB
ECC, 3Ware Serial ATA RAID 5 w/ 4 disks (SLOW!!).
Cheap solution while you look for another server:
Try to use something other than RAID 5.
You have 4 disks, so you could use a striping+mirroring RAID (RAID 10) which would
Apache processes running for 30 minutes?
My advice: use frames and JavaScript!
In your webpage, you have two frames: "content" and "refresh".
"content" starts empty (say, just a title on top of the page).
"refresh" is refreshed every five seconds from a script on your
You often make sums. Why not use separate tables to cache these sums by month, by poste, by whatever?
A rule on insert on the big table updates the cache tables, as sketched below.
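A hedged sketch with invented table names (a rule works when the cache row already exists; a trigger that upserts is safer otherwise):

CREATE RULE cache_month_sums AS
    ON INSERT TO expenses
    DO ALSO
    UPDATE expenses_by_month
       SET total = total + NEW.amount
     WHERE month = date_trunc('month', NEW.spent_on);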
Not so bad for Oracle. What about for PG? How data is stored
I agree with the datatype issue. Smallint, bigint, integer... add a
constraint...
Also, the order of the records in the database is very important. As you seem to have a very large static population in your table, you should
The queries themselves are simple, normally drawing information from one table with few conditions or, in the most complex cases, using joins on two tables or subqueries. These behave very well and always have; the problem is that these queries take place in rather large amounts due to the dumb na
> SELECT * from content where 42 = ANY (authors);
Postgres does have a way to do what you ask, though. It involves GiST
indexes and the operators from the contrib/intarray directory from the
Postgres source.
I have tried to use these indexes, and the performance was very good. It
can be faste
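A hedged sketch of the indexed form using contrib/intarray (index name invented; && is intarray's overlap operator):

CREATE INDEX content_authors_idx ON content USING gist (authors gist__int_ops);

SELECT * FROM content WHERE authors && '{42}';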