Re: [PERFORM] Fusion-io ioDrive
Hi,

Jonah H. Harris wrote:
> I'm not sure how those cards work, but my guess is that the CPU will go
> 100% busy (with a near-zero I/O wait) on any sizable workload. In this
> case, the current pgbench configuration being used is quite small and
> probably won't resemble this.

I'm not sure how they work either, but why should they require more CPU cycles than any other PCIe SAS controller?

I think they are doing a clever step by directly attaching the NAND chips to PCIe, instead of piping all the data through SAS or (S)ATA (and then through PCIe as well). And if the controller chip on the card isn't absolutely bogus, that certainly has the potential to reduce latency and improve throughput compared to other SSDs.

Or am I missing something?

Regards

Markus
[PERFORM] syslog performance when logging big statements
Hi,

I have experienced really bad performance with syslog, on both FreeBSD and Linux, when logging statements involving bytea values of roughly 10 MB. Consider this scenario:

[EMAIL PROTECTED] \d marinerpapers_atts
                          Table "public.marinerpapers_atts"
   Column    |           Type           |                   Modifiers
-------------+--------------------------+------------------------------------------------
 id          | integer                  | not null default nextval(('public.marinerpapers_atts_id_seq'::text)::regclass)
 marinerid   | integer                  | not null
 filename    | text                     | not null
 mimetype    | character varying(50)    | not null
 datecreated | timestamp with time zone | not null default now()
 docsrc      | bytea                    | not null
Indexes:
    "marinerpapers_atts_pkey" PRIMARY KEY, btree (id)
    "marinerpapers_atts_ukey" UNIQUE, btree (marinerid, filename)
    "marinerpapers_atts_marinerid" btree (marinerid)
Foreign-key constraints:
    "$1" FOREIGN KEY (marinerid) REFERENCES mariner(id) ON DELETE CASCADE

The insert is done like this:

INSERT INTO marinerpapers_atts(marinerid,filename,mimetype,docsrc) VALUES(1,'foo.pdf','application/pdf','%PDF-1.3\\0124 0 o%%EOF\\012');

When someone tries to insert a row into the above table and it results in an error (because, e.g., it violates the "marinerpapers_atts_ukey" constraint), the whole statement is logged to the logging system. File sizes of about 3 MB result in actual logging output of ~10 MB.

In this case, the INSERT *needs* 20 minutes to return. This is because logging through syslog seems to severely slow the system. If instead I use stderr, even with logging_collector=on, the same statement needs 15 seconds to return.

I have been using syslog since the stone age and I would like to stick with it; however, this morning I was caught by this bad performance and I am planning to move to stderr + logging_collector.

P.S. Is there a way to better tune pgsql/syslog so that they work more efficiently in cases like this? I know it is a corner case, but I thought I should post my experiences.

Thanks
--
Achilleas Mantzios
Re: [PERFORM] syslog performance when logging big statements
Achilleas Mantzios <[EMAIL PROTECTED]> writes:
> In this case, the INSERT *needs* 20 minutes to return. This is because
> logging through syslog seems to severely slow the system.
> If instead I use stderr, even with logging_collector=on, the same
> statement needs 15 seconds to return.

Hmm.  There's a function in elog.c that breaks log messages into chunks for syslog.  I don't think anyone's ever looked hard at its performance --- maybe there's an O(N^2) behavior?

			regards, tom lane
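For illustration, here is a minimal sketch of the kind of chunking loop being described (the function and constant names are invented; this is not the actual elog.c source). The point is that every iteration rescans the entire remaining tail looking for the next newline, so a message with no newlines at all costs O(N^2):

#include <string.h>
#include <syslog.h>

#define CHUNK_LIMIT 128    /* stands in for PG_SYSLOG_LIMIT's default */

/*
 * Sketch of a syslog chunking loop.  Each pass calls strchr() (or
 * strlen()) on the remaining text; if the message has no newlines,
 * every call scans to the end of the buffer while only CHUNK_LIMIT
 * bytes get emitted, so total work is O(N^2) in the message length.
 */
static void
write_syslog_sketch(int level, const char *line)
{
    while (*line != '\0')
    {
        const char *nl = strchr(line, '\n');   /* rescans the whole tail */
        size_t      len = nl ? (size_t) (nl - line) : strlen(line);

        if (len > CHUNK_LIMIT)
            len = CHUNK_LIMIT;

        syslog(level, "%.*s", (int) len, line);

        line += len;
        if (*line == '\n')
            line++;                            /* skip the newline itself */
    }
}

With a ~10 MB single-line statement that works out to tens of thousands of passes, each scanning megabytes of text, which is consistent with the minutes-long INSERT reported above.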
Re: [PERFORM] syslog performance when logging big statements
On Tuesday 08 July 2008 17:35:16, Tom Lane wrote:
> Achilleas Mantzios <[EMAIL PROTECTED]> writes:
>> In this case, the INSERT *needs* 20 minutes to return. This is because
>> logging through syslog seems to severely slow the system.
>> If instead I use stderr, even with logging_collector=on, the same
>> statement needs 15 seconds to return.
>
> Hmm.  There's a function in elog.c that breaks log messages into chunks
> for syslog.  I don't think anyone's ever looked hard at its performance
> --- maybe there's an O(N^2) behavior?

Thanks. I changed PG_SYSLOG_LIMIT in elog.c:1269 from 128 to 1048576:

#ifndef PG_SYSLOG_LIMIT
/* #define PG_SYSLOG_LIMIT 128 */
#define PG_SYSLOG_LIMIT 1048576
#endif

and I got super-fast, stderr-like performance. :)

However, I noticed that a certain amount of data in the log is lost. I didn't dig into the details, though.
--
Achilleas Mantzios
Re: [PERFORM] Fusion-io ioDrive
Well, what does a revolution like this require of Postgres? That is the question.

I have looked at the ioDrive, and it could increase our DB throughput significantly over a RAID array. Ideally, I would put a few key tables plus the WAL, etc. on it. I'd also want all the sort or hash overflow from work_mem to go to this device. Some of our tables / indexes are heavily written to for short periods of time and then more infrequently later -- these are partitioned by date. I would put the fresh ones on such a device and then move them to the hard drives later.

Ideally, we would then need a few changes in Postgres to take full advantage of this:

#1 Per-tablespace optimizer tuning parameters. Arguably, this is already needed. The tablespaces on such a solid-state device would have random and sequential access at equal (low) cost. Any one-size-fits-all set of optimizer variables is bound to cause performance issues when two tablespaces have dramatically different performance profiles.

#2 Optimally, work_mem could be shrunk, and the optimizer would have to stop preferring a sort + group-aggregate whenever it suspects that a hash_agg would not fit in work_mem. A disk-based hash_agg will pretty much win every time with such a device over a sort (in memory or not) once the number of rows to aggregate goes above a moderate threshold of a couple hundred thousand or so. In fact, I have several examples with 8.3.3 and a standard RAID array where a hash_agg that spilled to disk (poor or purposely distorted statistics cause this) was a lot faster than the sort that the optimizer wants to do instead. Whatever mechanism calculates the cost of doing sorts or hashes on disk will need to be tunable per tablespace. I suppose both of the above may be one task -- I don't know enough about the Postgres internals.

#3 Being able to move tables / indexes from one tablespace to another as efficiently as possible.

There are probably other enhancements that will help such a setup. These were the first that came to mind.

On Tue, Jul 8, 2008 at 2:49 AM, Markus Wanner <[EMAIL PROTECTED]> wrote:
> I'm not sure how they work either, but why should they require more CPU
> cycles than any other PCIe SAS controller?
> [...]
> Or am I missing something?
Re: [PERFORM] syslog performance when logging big statements
Achilleas Mantzios <[EMAIL PROTECTED]> writes:
> On Tuesday 08 July 2008 17:35:16, Tom Lane wrote:
>> Hmm.  There's a function in elog.c that breaks log messages into chunks
>> for syslog.  I don't think anyone's ever looked hard at its performance
>> --- maybe there's an O(N^2) behavior?

> Thanks. I changed PG_SYSLOG_LIMIT in elog.c:1269 from 128 to 1048576
> and I got super-fast, stderr-like performance. :)

Doesn't seem like a very good solution given its impact on the stack depth right there.

Looking at the code, the only bit that looks like repeated work is the repeated calls to strchr(), which would not be an issue in the "typical" case where a very long message contains reasonably frequent newlines. Am I right in guessing that your problematic statement contained megabytes worth of text with nary a newline?

If so, we can certainly fix it by arranging to remember the last strchr() result across loop iterations, but I'd like to confirm the theory before doing that work.

			regards, tom lane
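As a rough sketch of the kind of fix described here, i.e. doing the newline search once per line instead of rescanning on every chunk (again with invented names; this is not the actual patch that went into elog.c):

#include <string.h>
#include <syslog.h>

#define CHUNK_LIMIT 128

/*
 * Memoized variant of the earlier sketch: the newline lookup runs once
 * per *line*, and the inner loop then emits that line in CHUNK_LIMIT-sized
 * pieces without rescanning, so a huge single-line message costs O(N)
 * instead of O(N^2).
 */
static void
write_syslog_memoized(int level, const char *line)
{
    while (*line != '\0')
    {
        const char *nl = strchr(line, '\n');   /* one scan per line */
        size_t      line_len = nl ? (size_t) (nl - line) : strlen(line);

        while (line_len > 0)
        {
            size_t chunk = (line_len > CHUNK_LIMIT) ? CHUNK_LIMIT : line_len;

            syslog(level, "%.*s", (int) chunk, line);
            line += chunk;
            line_len -= chunk;
        }

        if (*line == '\n')
            line++;                            /* move past the newline */
    }
}

The chunk size stays the same; only the number of times the message text is walked changes, which is why this helps exactly in the "megabytes with nary a newline" case.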
[PERFORM] max fsm pages question
Hi,

When I issued the vacuum command, I received this message:

echo "VACUUM --full -d ARSys" | psql -d dbname

WARNING:  relation "public.tradetbl" contains more than "max_fsm_pages" pages with useful free space
HINT:  Consider compacting this relation or increasing the configuration parameter "max_fsm_pages".
NOTICE:  number of page slots needed (309616) exceeds max_fsm_pages (153600)
HINT:  Consider increasing the configuration parameter "max_fsm_pages" to a value over 309616.
VACUUM

What does the warning indicate? How will it adversely affect the system?

Thanks.
-rs
--
It is all a matter of perspective. You choose your view by choosing where to stand.
	--Larry Wall
Re: [PERFORM] Fusion-io ioDrive
Scott Carey wrote:
> Well, what does a revolution like this require of Postgres? That is the
> question.
> [...]
> #1 Per-tablespace optimizer tuning parameters.

... automatically measured?

Cheers,

Jeremy
Re: [PERFORM] max fsm pages question
In response to "Radhika S" <[EMAIL PROTECTED]>: > > when i issued the vaccuum cmd, I recieved this message: > > echo "VACUUM --full -d ARSys" | psql -d dbname > > WARNING: relation "public.tradetbl" contains more than > "max_fsm_pages" pages with useful free space > HINT: Consider compacting this relation or increasing the > configuration parameter "max_fsm_pages". > NOTICE: number of page slots needed (309616) exceeds max_fsm_pages (153600) > HINT: Consider increasing the configuration parameter "max_fsm_pages" > to a value over 309616. > VACUUM > > What does the warning indicate? How will it adversely affect the system. It means any combination of the following things: 1) You're not vacuuming often enough 2) Your FSM settings are too low 3) You recently had some unusually high update/delete activity on that table that's exceeded your normal settings for FSM and vacuum and will need special attention to get back on track. If you know it's #3, then just take steps to get things back on track and don't worry about 1 or 2. If you don't think it's #3, then you may want to increase the frequency of vacuum and/or increase the FSM settings in your conf file. You can do a CLUSTER to get that table back in shape, but be sure to read up on CLUSTER so you understand the implications before you do so. You can also temporarily raise the FSM settings to allow vacuum to work, then lower them back down when it's under control again. This is one of the few circumstances where you may want to VACUUM FULL. If you don't handle this, that table will continue to grow in size on the disk, taking up space unnecessarily and probably negatively impacting performance. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ [EMAIL PROTECTED] Phone: 412-422-3463x4023 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] syslog performance when logging big statements
On Jul 8, 2008, at 8:24 AM, Achilleas Mantzios wrote:
> File sizes of about 3 MB result in actual logging output of ~10 MB.
> In this case, the INSERT *needs* 20 minutes to return. This is because
> logging through syslog seems to severely slow the system. If instead I
> use stderr, even with logging_collector=on, the same statement needs
> 15 seconds to return.

In syslog.conf, is the destination for PG marked with a "-" (i.e. -/var/log/pg.log)? That tells syslog not to sync after each line logged. It could explain a large chunk of the difference in time.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/
Re: [PERFORM] syslog performance when logging big statements
Jeff <[EMAIL PROTECTED]> writes:
> On Jul 8, 2008, at 8:24 AM, Achilleas Mantzios wrote:
>> File sizes of about 3 MB result in actual logging output of ~10 MB.
>> In this case, the INSERT *needs* 20 minutes to return. This is because
>> logging through syslog seems to severely slow the system. If instead I
>> use stderr, even with logging_collector=on, the same statement needs
>> 15 seconds to return.

> In syslog.conf, is the destination for PG marked with a "-" (i.e.
> -/var/log/pg.log)? That tells syslog not to sync after each line logged.
> It could explain a large chunk of the difference in time.

I experimented with this a bit here.  There definitely is an O(N^2) penalty from the repeated strchr() calls, but it doesn't really start to hurt till 1MB or so statement length.  Even with that patched, syslog logging pretty much sucks performance-wise.  Here are the numbers I got on a Fedora 8 workstation, testing the time to log a statement of the form SELECT length('123456...lots of data, no newlines...7890');

    statement length                     1MB      10MB

    CVS HEAD                             2523ms   215588ms
    + patch to fix repeated strchr       529ms    36734ms
    after turning off syslogd's fsync    569ms    5881ms
    PG_SYSLOG_LIMIT 1024, fsync on       216ms    2532ms
    PG_SYSLOG_LIMIT 1024, no fsync       242ms    2692ms

    For comparison purposes:
    logging statements to stderr         155ms    2042ms
    execute statement without logging    42ms     520ms

This machine is running a cheap IDE drive that caches writes, so the lack of difference between fsync off and fsync on is not too surprising --- on a machine with server-grade drives there'd be a lot more difference.  (The fact that there is a difference in the 10MB case probably reflects filling the drive's write cache.)

On my old HPUX machine, where fsync really works (and the syslogd doesn't seem to allow turning it off), the 1MB case takes 195957ms with the strchr patch, 22922ms at PG_SYSLOG_LIMIT=1024.

So there's a fairly clear case to be made for fixing the repeated strchr, but I also think that there's a case for jacking up PG_SYSLOG_LIMIT.  As far as I can tell the current value of 128 was chosen *very* conservatively without thought for performance:
http://archives.postgresql.org/pgsql-hackers/2000-05/msg01242.php

At the time we were looking at evidence that the then-current Linux syslogd got tummyache with messages over about 1KB:
http://archives.postgresql.org/pgsql-hackers/2000-05/msg00880.php

Some experimentation with the machines I have handy now says that

    Fedora 8:              truncates messages at 2KB (including syslog's header)
    HPUX 10.20 (ancient):  ditto
    Mac OS X 10.5.3:       drops messages if longer than about 1900 bytes

So it appears to me that setting PG_SYSLOG_LIMIT = 1024 would be perfectly safe on current systems (and probably old ones too), and would give at least a factor of two speedup for logging long strings --- more like a factor of 8 if syslogd is fsync'ing.

Comments?  Anyone know of systems where this is too high?  Perhaps we should make that change only in HEAD, not in the back branches, or crank it up only to 512 in the back branches?

			regards, tom lane
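For anyone who wants to check what their own syslogd does with long messages, a throwaway probe along these lines can help (a hypothetical test program, not something from this thread; the ident string, facility, and sizes are arbitrary):

#include <stdio.h>
#include <string.h>
#include <syslog.h>

/*
 * Log messages of increasing length, bracketing the ~1900-byte to 2KB
 * limits reported above, then inspect the syslog output by hand to see
 * where the local syslogd starts truncating or dropping lines.
 */
int
main(void)
{
    int  sizes[] = {128, 512, 1024, 1900, 2048, 4096};
    char buf[4097];

    openlog("pg_syslog_probe", LOG_PID, LOG_LOCAL0);
    for (size_t i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
    {
        memset(buf, 'x', (size_t) sizes[i]);
        buf[sizes[i]] = '\0';
        syslog(LOG_INFO, "len=%d %s", sizes[i], buf);
    }
    closelog();
    printf("Now check the syslog destination for truncated or missing len= lines.\n");
    return 0;
}

Compile it, run it, and look for the len= markers in the configured syslog destination; the length at which lines disappear or come out cut short is the local ceiling to weigh against PG_SYSLOG_LIMIT.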
Re: [PERFORM] syslog performance when logging big statements
On Tue, 8 Jul 2008, Tom Lane wrote:
> I experimented with this a bit here.  There definitely is an O(N^2)
> penalty from the repeated strchr() calls, but it doesn't really start
> to hurt till 1MB or so statement length.  Even with that patched,
> syslog logging pretty much sucks performance-wise.
> [...]
> So it appears to me that setting PG_SYSLOG_LIMIT = 1024 would be
> perfectly safe on current systems (and probably old ones too),
> and would give at least a factor of two speedup for logging long
> strings --- more like a factor of 8 if syslogd is fsync'ing.

With Linux ext2/ext3 filesystems I have seen similar problems when the syslog file starts getting large. There are several factors here:

1. fsync after each write, unless you have "-" in syslog.conf (only available on Linux AFAIK).

2. ext2/ext3 tend to be very inefficient when doing appends to large files. Each write requires that the syslog daemon seek to the end of the file (because something else may have written to the file in the meantime), and with the small block sizes and chaining of indirect blocks this can start to be painful when logfiles get up into the MB range.

Note that you see this same problem when you start to get lots of files in one directory as well. Even if you delete a lot of files, the directory itself is still large, and this can cause serious performance problems. Other filesystems are much less sensitive to file (and directory) sizes.

My suggestion would be to first make sure you are doing async writes to syslog, and then try putting the logfiles on different filesystems to see how they differ. Personally I use XFS most of the time where I expect lots of files or large files.

David Lang
Re: [PERFORM] syslog performance when logging big statements
Tom Lane wrote:
> So it appears to me that setting PG_SYSLOG_LIMIT = 1024 would be
> perfectly safe on current systems (and probably old ones too),
> and would give at least a factor of two speedup for logging long
> strings --- more like a factor of 8 if syslogd is fsync'ing.
>
> Comments?  Anyone know of systems where this is too high?
> Perhaps we should make that change only in HEAD, not in the
> back branches, or crank it up only to 512 in the back branches?

I'm a little bit worried about cranking up PG_SYSLOG_LIMIT in the back branches. Cranking it up will definitely change the text layout of syslog messages and might confuse syslog-handling scripts (though I have no evidence that such scripts exist). So I suggest changing PG_SYSLOG_LIMIT only in CVS HEAD.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Re: [PERFORM] syslog performance when logging big statements
Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> I'm a little bit worried about cranking up PG_SYSLOG_LIMIT in the back
> branches. Cranking it up will definitely change the text layout of syslog
> messages and might confuse syslog-handling scripts (though I have no
> evidence that such scripts exist). So I suggest changing PG_SYSLOG_LIMIT
> only in CVS HEAD.

Hmm, good point.  It would be an externally visible behavior change, not just a speedup.

			regards, tom lane