Re: [PERFORM] SSD + RAID

2010-03-03 Thread Ron Mayer
Greg Smith wrote: > Bruce Momjian wrote: >> I always assumed SCSI disks had a write-through cache and therefore >> didn't need a drive cache flush comment. Some do. SCSI disks have write-back caches. Some have both(!) - a write-back cache but the user can explicitly send write-through requests.

Re: [PERFORM] SSD + RAID

2010-03-02 Thread Pierre C
I always assumed SCSI disks had a write-through cache and therefore didn't need a drive cache flush comment. Maximum performance can only be reached with a writeback cache so the drive can reorder and cluster writes, according to the realtime position of the heads and platter rotation. T

Re: [PERFORM] SSD + RAID

2010-03-01 Thread Greg Smith
Bruce Momjian wrote: I always assumed SCSI disks had a write-through cache and therefore didn't need a drive cache flush comment. There's more detail on all this mess at http://wiki.postgresql.org/wiki/SCSI_vs._IDE/SATA_Disks and it includes this perception, which I've recently come to bel

Re: [PERFORM] SSD + RAID

2010-03-01 Thread Bruce Momjian
Greg Smith wrote: > Ron Mayer wrote: > > Linux apparently sends FLUSH_CACHE commands to IDE drives in the > > exact same places it sends SYNCHRONIZE CACHE commands to SCSI > > drives[2]. > > [2] http://hardware.slashdot.org/comments.pl?sid=149349&cid=12519114 > > > > Well, that's old enough

Re: [PERFORM] SSD + RAID

2010-03-01 Thread Bruce Momjian
Ron Mayer wrote: > Bruce Momjian wrote: > > Greg Smith wrote: > >> Bruce Momjian wrote: > >>> I have added documentation about the ATAPI drive flush command, and the > >> > >> If one of us goes back into that section one day to edit again it might > >> be worth mentioning that FLUSH CACHE EXT i

Re: [PERFORM] SSD + RAID

2010-02-27 Thread Greg Smith
Ron Mayer wrote: Linux apparently sends FLUSH_CACHE commands to IDE drives in the exact same places it sends SYNCHRONIZE CACHE commands to SCSI drives[2]. [2] http://hardware.slashdot.org/comments.pl?sid=149349&cid=12519114 Well, that's old enough to not even be completely right anymore

Re: [PERFORM] SSD + RAID

2010-02-27 Thread Ron Mayer
Bruce Momjian wrote: > Greg Smith wrote: >> Bruce Momjian wrote: >>> I have added documentation about the ATAPI drive flush command, and the >> >> If one of us goes back into that section one day to edit again it might >> be worth mentioning that FLUSH CACHE EXT is the actual ATAPI-6 command >

Re: [PERFORM] SSD + RAID

2010-02-27 Thread Bruce Momjian
Greg Smith wrote: > Bruce Momjian wrote: > > I have added documentation about the ATAPI drive flush command, and the > > typical SSD behavior. > > > > If one of us goes back into that section one day to edit again it might > be worth mentioning that FLUSH CACHE EXT is the actual ATAPI-6 comman

Re: [PERFORM] SSD + RAID

2010-02-27 Thread Greg Smith
Bruce Momjian wrote: I have added documentation about the ATAPI drive flush command, and the typical SSD behavior. If one of us goes back into that section one day to edit again it might be worth mentioning that FLUSH CACHE EXT is the actual ATAPI-6 command that a drive needs to support pr

Re: [PERFORM] SSD + RAID

2010-02-26 Thread Bruce Momjian
I have added documentation about the ATAPI drive flush command, and the typical SSD behavior. --- Greg Smith wrote: > Ron Mayer wrote: > > Bruce Momjian wrote: > > > >> Agreed, though I thought the problem was that SSDs

Re: [PERFORM] SSD + RAID

2010-02-24 Thread Dave Crooke
It's always possible to rebuild into a consistent configuration by assigning a precedence order; for parity RAID, the data drives take precedence over parity drives, and for RAID-1 sets it assigns an arbitrary master. You *should* never lose a whole stripe ... for example, RAID-5 updates do "read
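
For reference, a small standalone illustration of the read-modify-write parity update mentioned here (a sketch only; toy 16-byte blocks stand in for real stripe chunks): the new parity is computed from the old data block, the old parity, and the new data block, with no need to read the rest of the stripe.

/* RAID-5 small-write parity update: new_parity = old_parity XOR
 * old_data XOR new_data.  Only the target data block and the parity
 * block are read and rewritten ("read-modify-write"). */
#include <stdio.h>
#include <string.h>

#define BLOCK 16   /* toy block size */

static void xor_block(unsigned char *dst, const unsigned char *src)
{
    for (int i = 0; i < BLOCK; i++)
        dst[i] ^= src[i];
}

int main(void)
{
    unsigned char d0[BLOCK], d1[BLOCK], d2[BLOCK], parity[BLOCK], new_d1[BLOCK], check[BLOCK];

    /* build a three-data-disk stripe and its parity */
    memset(d0, 0xA5, BLOCK); memset(d1, 0x3C, BLOCK); memset(d2, 0x0F, BLOCK);
    memset(parity, 0, BLOCK);
    xor_block(parity, d0); xor_block(parity, d1); xor_block(parity, d2);

    /* small write to d1: update parity from old data, old parity, new data */
    memset(new_d1, 0x77, BLOCK);
    xor_block(parity, d1);      /* remove the old data's contribution */
    xor_block(parity, new_d1);  /* add the new data's contribution    */
    memcpy(d1, new_d1, BLOCK);

    /* check: parity still equals d0 ^ d1 ^ d2 */
    memset(check, 0, BLOCK);
    xor_block(check, d0); xor_block(check, d1); xor_block(check, d2);
    printf("parity consistent: %s\n", memcmp(check, parity, BLOCK) == 0 ? "yes" : "no");
    return 0;
}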

Re: [PERFORM] SSD + RAID

2010-02-23 Thread Mark Mielke
On 02/23/2010 04:22 PM, da...@lang.hm wrote: On Tue, 23 Feb 2010, Aidan Van Dyk wrote: * da...@lang.hm [100223 15:05]: However, one thing that you do not get protection against with software raid is the potential for the writes to hit some drives but not others. If this happens the software

Re: [PERFORM] SSD + RAID

2010-02-23 Thread david
On Tue, 23 Feb 2010, Aidan Van Dyk wrote: * da...@lang.hm [100223 15:05]: However, one thing that you do not get protection against with software raid is the potential for the writes to hit some drives but not others. If this happens the software raid cannot know what the correct contents of

Re: [PERFORM] SSD + RAID

2010-02-23 Thread Aidan Van Dyk
* da...@lang.hm [100223 15:05]: > However, one thing that you do not get protection against with software > raid is the potential for the writes to hit some drives but not others. > If this happens the software raid cannot know what the correct contents > of the raid stripe are, and so you co

Re: [PERFORM] SSD + RAID

2010-02-23 Thread david
On Tue, 23 Feb 2010, da...@lang.hm wrote: On Mon, 22 Feb 2010, Ron Mayer wrote: Also worth noting - Linux's software raid stuff (MD and LVM) need to handle this right as well - and last I checked (sometime last year) the default setups didn't. I think I saw some stuff in the last few month

Re: [PERFORM] SSD + RAID

2010-02-23 Thread Scott Carey
On Feb 23, 2010, at 3:49 AM, Pierre C wrote: > Now I wonder about something. SSDs use wear-leveling which means the > information about which block was written where must be kept somewhere. > Which means this information must be updated. I wonder how crash-safe and > how atomic these updates

Re: [PERFORM] SSD + RAID

2010-02-23 Thread Nikolas Everett
On Tue, Feb 23, 2010 at 6:49 AM, Pierre C wrote: > Note that's power draw per bit. dram is usually much more densely >> packed (it can be with fewer transistors per cell) so the individual >> chips for each may have similar power draws while the dram will be 10 >> times as densely packed as the

Re: [PERFORM] SSD + RAID

2010-02-23 Thread Pierre C
Note that's power draw per bit. dram is usually much more densely packed (it can be with fewer transistors per cell) so the individual chips for each may have similar power draws while the dram will be 10 times as densely packed as the sram. Differences between SRAM and DRAM : - price per byte

Re: [PERFORM] SSD + RAID

2010-02-23 Thread david
On Mon, 22 Feb 2010, Ron Mayer wrote: Also worth noting - Linux's software raid stuff (MD and LVM) need to handle this right as well - and last I checked (sometime last year) the default setups didn't. I think I saw some stuff in the last few months on this issue on the kernel mailing list.

Re: [PERFORM] SSD + RAID

2010-02-22 Thread Scott Marlowe
On Mon, Feb 22, 2010 at 7:21 PM, Scott Marlowe wrote: > On Mon, Feb 22, 2010 at 6:39 PM, Greg Smith wrote: >> Mark Mielke wrote: >>> >>> I had read the above when posted, and then looked up SRAM. SRAM seems to >>> suggest it will hold the data even after power loss, but only for a period >>> of t

Re: [PERFORM] SSD + RAID

2010-02-22 Thread Scott Marlowe
On Mon, Feb 22, 2010 at 6:39 PM, Greg Smith wrote: > Mark Mielke wrote: >> >> I had read the above when posted, and then looked up SRAM. SRAM seems to >> suggest it will hold the data even after power loss, but only for a period >> of time. As long as power can restore within a few minutes, it see

Re: [PERFORM] SSD + RAID

2010-02-22 Thread Greg Smith
Mark Mielke wrote: I had read the above when posted, and then looked up SRAM. SRAM seems to suggest it will hold the data even after power loss, but only for a period of time. As long as power can restore within a few minutes, it seemed like this would be ok? The normal type of RAM everyone u

Re: [PERFORM] SSD + RAID

2010-02-22 Thread Greg Smith
Ron Mayer wrote: I know less about other file systems. Apparently the NTFS guys are aware of such stuff - but don't know what kinds of fsync equivalent you'd need to make it happen. It's actually pretty straightforward--better than ext3. Windows with NTFS has been perfectly aware how to d

Re: [PERFORM] SSD + RAID

2010-02-22 Thread Mark Mielke
On 02/22/2010 08:04 PM, Greg Smith wrote: Arjen van der Meijden wrote: That's weird. Intel's SSDs didn't have a write cache afaik: "I asked Intel about this and it turns out that the DRAM on the Intel drive isn't used for user data because of the risk of data loss, instead it is used as memor

Re: [PERFORM] SSD + RAID

2010-02-22 Thread Greg Smith
Arjen van der Meijden wrote: That's weird. Intel's SSDs didn't have a write cache afaik: "I asked Intel about this and it turns out that the DRAM on the Intel drive isn't used for user data because of the risk of data loss, instead it is used as memory by the Intel SATA/flash controller for d

Re: [PERFORM] SSD + RAID

2010-02-22 Thread Ron Mayer
Bruce Momjian wrote: > Greg Smith wrote: >> If you have a regular SATA drive, it almost certainly >> supports proper cache flushing > > OK, but I have a few questions. Is a write to the drive and a cache > flush command the same? I believe they're different as of ATAPI-6 from 2001. >

Re: [PERFORM] SSD + RAID

2010-02-22 Thread Bruce Momjian
Ron Mayer wrote: > Bruce Momjian wrote: > > Agreed, though I thought the problem was that SSDs lie about their > > cache flush like SATA drives do, or is there something I am missing? > > There's exactly one case I can find[1] where this century's IDE > drives lied more than any other drive with

Re: [PERFORM] SSD + RAID

2010-02-22 Thread Bruce Momjian
Greg Smith wrote: > Ron Mayer wrote: > > Bruce Momjian wrote: > > > >> Agreed, though I thought the problem was that SSDs lie about their > >> cache flush like SATA drives do, or is there something I am missing? > >> > > > > There's exactly one case I can find[1] where this century's IDE >

Re: [PERFORM] SSD + RAID

2010-02-21 Thread Arjen van der Meijden
On 22-2-2010 6:39 Greg Smith wrote: But the point of this whole testing exercise coming back into vogue again is that SSDs have returned this negligent behavior to the mainstream again. See http://opensolaris.org/jive/thread.jspa?threadID=121424 for a discussion of this in a ZFS context just last

Re: [PERFORM] SSD + RAID

2010-02-21 Thread Greg Smith
Ron Mayer wrote: Bruce Momjian wrote: Agreed, though I thought the problem was that SSDs lie about their cache flush like SATA drives do, or is there something I am missing? There's exactly one case I can find[1] where this century's IDE drives lied more than any other drive with a ca

Re: [PERFORM] SSD + RAID

2010-02-21 Thread Ron Mayer
Bruce Momjian wrote: > Agreed, though I thought the problem was that SSDs lie about their > cache flush like SATA drives do, or is there something I am missing? There's exactly one case I can find[1] where this century's IDE drives lied more than any other drive with a cache: Under 120GB Maxto

Re: [PERFORM] SSD + RAID

2010-02-21 Thread Bruce Momjian
Scott Carey wrote: > On Feb 20, 2010, at 3:19 PM, Bruce Momjian wrote: > > > Dan Langille wrote: > >> -BEGIN PGP SIGNED MESSAGE- > >> Hash: SHA1 > >> > >> Bruce Momjian wrote: > >>> Matthew Wakeling wrote: > On Fri, 13 Nov 2009, Greg Smith wrote: > > In order for a drive to work r

Re: [PERFORM] SSD + RAID

2010-02-21 Thread Scott Carey
On Feb 20, 2010, at 3:19 PM, Bruce Momjian wrote: > Dan Langille wrote: >> -BEGIN PGP SIGNED MESSAGE- >> Hash: SHA1 >> >> Bruce Momjian wrote: >>> Matthew Wakeling wrote: On Fri, 13 Nov 2009, Greg Smith wrote: > In order for a drive to work reliably for database use such as for

Re: [PERFORM] SSD + RAID

2010-02-20 Thread Bruce Momjian
Dan Langille wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > Bruce Momjian wrote: > > Matthew Wakeling wrote: > >> On Fri, 13 Nov 2009, Greg Smith wrote: > >>> In order for a drive to work reliably for database use such as for > >>> PostgreSQL, it cannot have a volatile write cache.

Re: [PERFORM] SSD + RAID

2010-02-20 Thread Dan Langille
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Bruce Momjian wrote: > Matthew Wakeling wrote: >> On Fri, 13 Nov 2009, Greg Smith wrote: >>> In order for a drive to work reliably for database use such as for >>> PostgreSQL, it cannot have a volatile write cache. You either need a write >>> cache

Re: [PERFORM] SSD + RAID

2010-02-20 Thread Bruce Momjian
Matthew Wakeling wrote: > On Fri, 13 Nov 2009, Greg Smith wrote: > > In order for a drive to work reliably for database use such as for > > PostgreSQL, it cannot have a volatile write cache. You either need a write > > cache with a battery backup (and a UPS doesn't count), or to turn the cache

Re: [PERFORM] SSD + RAID

2009-12-08 Thread Matthew Wakeling
On Fri, 13 Nov 2009, Greg Smith wrote: In order for a drive to work reliably for database use such as for PostgreSQL, it cannot have a volatile write cache. You either need a write cache with a battery backup (and a UPS doesn't count), or to turn the cache off. The SSD performance figures you

Re: [PERFORM] SSD + RAID

2009-12-03 Thread Scott Carey
On 11/19/09 1:04 PM, "Greg Smith" wrote: > That won't help. Once the checkpoint is done, the problem isn't just > that the WAL segments are recycled. The server isn't going to use them > even if they were there. The reason why you can erase/recycle them is > that you're doing so *after* writi

Re: [PERFORM] SSD + RAID

2009-11-30 Thread Bruce Momjian
Ron Mayer wrote: > Bruce Momjian wrote: > > Greg Smith wrote: > >> Bruce Momjian wrote: > >>> I thought our only problem was testing the I/O subsystem --- I never > >>> suspected the file system might lie too. That email indicates that a > >>> large percentage of our install base is running on unr

Re: [PERFORM] SSD + RAID

2009-11-30 Thread Ron Mayer
Bruce Momjian wrote: > Greg Smith wrote: >> Bruce Momjian wrote: >>> I thought our only problem was testing the I/O subsystem --- I never >>> suspected the file system might lie too. That email indicates that a >>> large percentage of our install base is running on unreliable file >>> systems ---

Re: [PERFORM] SSD + RAID

2009-11-30 Thread Ron Mayer
Bruce Momjian wrote: >> For example, ext3 fsync() will issue write barrier commands >> if the inode was modified; but not if the inode wasn't. >> >> See test program here: >> http://www.mail-archive.com/linux-ker...@vger.kernel.org/msg272253.html >> and read two paragraphs further to see how touchi
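
A rough sketch of the kind of test referred to above (not the program at the linked URL; assumptions: Linux, an ext3 filesystem, a drive with its write cache enabled). It times rewrite-plus-fsync() loops with and without dirtying the inode via fchmod(); a large gap between the two timings is the symptom being described.

/* Sketch: does fsync() on this filesystem issue a write barrier only
 * when the inode is dirty?  Rewriting the same block leaves the inode
 * untouched; toggling the file mode first dirties it.  File name and
 * iteration count are arbitrary. */
#define _XOPEN_SOURCE 500
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>

static double timed_loop(int fd, int touch_inode)
{
    char buf[8192];
    struct timeval t0, t1;

    memset(buf, 'x', sizeof buf);
    gettimeofday(&t0, NULL);
    for (int i = 0; i < 200; i++) {
        if (touch_inode)
            fchmod(fd, (i & 1) ? 0600 : 0644);   /* dirty the inode */
        pwrite(fd, buf, sizeof buf, 0);          /* rewrite the same block */
        fsync(fd);
    }
    gettimeofday(&t1, NULL);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
}

int main(void)
{
    int fd = open("barrier_test.dat", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }
    printf("fsync only:      %.2f s\n", timed_loop(fd, 0));
    printf("fchmod + fsync:  %.2f s\n", timed_loop(fd, 1));
    close(fd);
    unlink("barrier_test.dat");
    return 0;
}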

Re: [PERFORM] SSD + RAID

2009-11-30 Thread Bruce Momjian
Greg Smith wrote: > Bruce Momjian wrote: > > I thought our only problem was testing the I/O subsystem --- I never > > suspected the file system might lie too. That email indicates that a > > large percentage of our install base is running on unreliable file > > systems --- why have I not heard abo

Re: [PERFORM] SSD + RAID

2009-11-29 Thread Greg Smith
Bruce Momjian wrote: I thought our only problem was testing the I/O subsystem --- I never suspected the file system might lie too. That email indicates that a large percentage of our install base is running on unreliable file systems --- why have I not heard about this before? Do the write barr

Re: [PERFORM] SSD + RAID

2009-11-29 Thread Bruce Momjian
Ron Mayer wrote: > Bruce Momjian wrote: > > Greg Smith wrote: > >> A good test program that is a bit better at introducing and detecting > >> the write cache issue is described at > >> http://brad.livejournal.com/2116715.html > > > > Wow, I had not seen that tool before. I have added a link to

Re: [PERFORM] SSD + RAID

2009-11-29 Thread Ron Mayer
Bruce Momjian wrote: > Greg Smith wrote: >> A good test program that is a bit better at introducing and detecting >> the write cache issue is described at >> http://brad.livejournal.com/2116715.html > > Wow, I had not seen that tool before. I have added a link to it from > our documentation, an
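
The tool linked above is a client/server script; the following is only a single-machine sketch of the same pull-the-plug idea (file name, block size and output format are made up for the example). Run "write" and watch its output from a second machine, for instance over ssh, cut power mid-run, then after reboot run "verify" with the highest block number the second machine saw acknowledged; any missing block was one whose fsync() the drive acknowledged but did not honor.

/* plugtest.c: write numbered blocks, fsync each, report acknowledgements. */
#define _XOPEN_SOURCE 500
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK 8192

int main(int argc, char **argv)
{
    static char buf[BLOCK];
    int fd;

    if (argc >= 2 && strcmp(argv[1], "write") == 0) {
        fd = open("plugtest.dat", O_WRONLY | O_CREAT, 0600);
        if (fd < 0) { perror("open"); return 1; }
        for (long i = 0; ; i++) {                    /* run until the plug is pulled */
            memcpy(buf, &i, sizeof i);               /* each block carries its own number */
            pwrite(fd, buf, BLOCK, i * BLOCK);
            if (fsync(fd) == 0) {
                printf("acknowledged %ld\n", i);     /* watch this from the other machine */
                fflush(stdout);
            }
        }
    } else if (argc >= 3 && strcmp(argv[1], "verify") == 0) {
        long last = atol(argv[2]);                   /* highest number seen before power loss */
        fd = open("plugtest.dat", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }
        for (long i = 0; i <= last; i++) {
            long stored = -1;
            if (pread(fd, buf, BLOCK, i * BLOCK) == BLOCK)
                memcpy(&stored, buf, sizeof stored);
            if (stored != i)
                printf("LOST block %ld that the drive had acknowledged\n", i);
        }
        printf("checked blocks 0..%ld\n", last);
    } else {
        fprintf(stderr, "usage: %s write  |  %s verify <last-acknowledged>\n", argv[0], argv[0]);
        return 1;
    }
    return 0;
}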

Re: [PERFORM] SSD + RAID

2009-11-28 Thread Bruce Momjian
Greg Smith wrote: > Merlin Moncure wrote: > > I am right now talking to someone on postgresql irc who is measuring > > 15k iops from x25-e and no data loss following power plug test. > The funny thing about Murphy is that he doesn't visit when things are > quiet. It's quite possible the window fo

Re: [PERFORM] SSD + RAID

2009-11-21 Thread Merlin Moncure
On Fri, Nov 20, 2009 at 7:27 PM, Greg Smith wrote: > Richard Neill wrote: >> >> The key issue for short, fast transactions seems to be >> how fast an fdatasync() call can run, forcing the commit to disk, and >> allowing the transaction to return to userspace. >> Attached is a short C program which

Re: [PERFORM] SSD + RAID

2009-11-20 Thread Greg Smith
Richard Neill wrote: The key issue for short, fast transactions seems to be how fast an fdatasync() call can run, forcing the commit to disk, and allowing the transaction to return to userspace. Attached is a short C program which may be of use. Right. I call this the "commit rate" of the storage
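
A minimal sketch of the kind of fdatasync() timing loop being described (an independent illustration, not the program attached to the original mail; assumes Linux, with an arbitrary 8 kB block size and scratch file):

/* Time repeated 8 kB write + fdatasync() pairs and report the
 * resulting "commit rate" of the storage. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
    const int iterations = 1000;
    char buf[8192];
    struct timeval start, end;

    memset(buf, 'x', sizeof buf);
    int fd = open("fsync_test.dat", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); return 1; }

    gettimeofday(&start, NULL);
    for (int i = 0; i < iterations; i++) {
        if (write(fd, buf, sizeof buf) != (ssize_t) sizeof buf) { perror("write"); return 1; }
        if (fdatasync(fd) != 0) { perror("fdatasync"); return 1; }
    }
    gettimeofday(&end, NULL);

    double secs = (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6;
    printf("%d synced writes in %.2f s = %.0f commits/s\n",
           iterations, secs, iterations / secs);
    close(fd);
    unlink("fsync_test.dat");
    return 0;
}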

Re: [PERFORM] SSD + RAID

2009-11-20 Thread Richard Neill
Axel Rau wrote: On 13.11.2009 at 14:57, Laszlo Nagy wrote: I was thinking about ARECA 1320 with 2GB memory + BBU. Unfortunately, I cannot find information about using ARECA cards with SSD drives. They told me: currently not supported, but they have positive customer reports. No date yet for

Re: [PERFORM] SSD + RAID

2009-11-20 Thread Jeff Janes
On Wed, Nov 18, 2009 at 8:24 PM, Tom Lane wrote: > Scott Carey writes: >> For your database DATA disks, leaving the write cache on is 100% acceptable, >> even with power loss, and without a RAID controller. And even in high write >> environments. > > Really? How hard have you tested that config

Re: [PERFORM] SSD + RAID

2009-11-20 Thread Matthew Wakeling
On Thu, 19 Nov 2009, Greg Smith wrote: This is why turning the cache off can tank performance so badly--you're going to be writing a whole 128K block no matter what if it's forced to disk without caching, even if it's just to write an 8K page to it. Theoretically, this does not need to be the ca

Re: [PERFORM] SSD + RAID

2009-11-20 Thread Axel Rau
On 13.11.2009 at 14:57, Laszlo Nagy wrote: I was thinking about ARECA 1320 with 2GB memory + BBU. Unfortunately, I cannot find information about using ARECA cards with SSD drives. They told me: currently not supported, but they have positive customer reports. No date yet for implementatio

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Scott Marlowe
On Thu, Nov 19, 2009 at 2:39 PM, Merlin Moncure wrote: > On Thu, Nov 19, 2009 at 4:10 PM, Greg Smith wrote: >> You can use pgbench to either get interesting peak read results, or peak >> write ones, but it's not real useful for things in between.  The standard >> test basically turns into a huge

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Merlin Moncure
On Thu, Nov 19, 2009 at 4:10 PM, Greg Smith wrote: > You can use pgbench to either get interesting peak read results, or peak > write ones, but it's not real useful for things in between.  The standard > test basically turns into a huge stack of writes to a single table, and the > select-only one

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Greg Smith
Scott Marlowe wrote: On Thu, Nov 19, 2009 at 10:01 AM, Merlin Moncure wrote: pgbench is actually a pretty awesome i/o tester assuming you have big enough scaling factor Seeing as how pgbench only goes to scaling factor of 4000, are there any plans on enlarging that number? I'm doing pgbenc

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Greg Smith
Scott Carey wrote: Have PG wait a half second (configurable) after the checkpoint fsync() completes before deleting/ overwriting any WAL segments. This would be a trivial "feature" to add to a postgres release, I think. Actually, it already exists! Turn on log archiving, and have the script th

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Brad Nicholson
On Thu, 2009-11-19 at 19:01 +0100, Anton Rommerskirchen wrote: > On Thursday, 19 November 2009 13:29:56, Craig Ringer wrote: > > On 19/11/2009 12:22 PM, Scott Carey wrote: > > > 3: Have PG wait a half second (configurable) after the checkpoint > > > fsync() completes before deleting/ overwriti

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Anton Rommerskirchen
On Thursday, 19 November 2009 13:29:56, Craig Ringer wrote: > On 19/11/2009 12:22 PM, Scott Carey wrote: > > 3: Have PG wait a half second (configurable) after the checkpoint > > fsync() completes before deleting/ overwriting any WAL segments. This > > would be a trivial "feature" to add to a

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Scott Marlowe
On Thu, Nov 19, 2009 at 10:01 AM, Merlin Moncure wrote: > On Wed, Nov 18, 2009 at 11:39 PM, Scott Carey wrote: >> Well, that is sort of true for all benchmarks, but I do find that bonnie++ >> is the worst of the bunch. I consider it relatively useless compared to >> fio. It's just not a great be

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Merlin Moncure
On Wed, Nov 18, 2009 at 11:39 PM, Scott Carey wrote: > Well, that is sort of true for all benchmarks, but I do find that bonnie++ > is the worst of the bunch. I consider it relatively useless compared to > fio. It's just not a great benchmark for server type load and I find it > lacking in the ab

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Greg Smith
Scott Carey wrote: Moral of the story: Nothing is 100% safe, so sometimes a small bit of KNOWN risk is perfectly fine. There is always UNKNOWN risk. If one risks losing 256K of cached data on an SSD if you're really unlucky with timing, how dangerous is that versus the chance that the raid car

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Karl Denninger
Greg Smith wrote: > Scott Carey wrote: >> For your database DATA disks, leaving the write cache on is 100% >> acceptable, >> even with power loss, and without a RAID controller. And even in >> high write >> environments. >> >> That is what the XLOG is for, isn't it? That is where this behavior is

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Greg Smith
Scott Carey wrote: For your database DATA disks, leaving the write cache on is 100% acceptable, even with power loss, and without a RAID controller. And even in high write environments. That is what the XLOG is for, isn't it? That is where this behavior is critical. But that has completely di

Re: [PERFORM] SSD + RAID

2009-11-19 Thread Craig Ringer
On 19/11/2009 12:22 PM, Scott Carey wrote: > 3: Have PG wait a half second (configurable) after the checkpoint fsync() > completes before deleting/ overwriting any WAL segments. This would be a > trivial "feature" to add to a postgres release, I think. How does that help? It doesn't provide any

Re: [PERFORM] SSD + RAID

2009-11-18 Thread Scott Carey
On 11/17/09 10:58 PM, "da...@lang.hm" wrote: > > keep in mind that bonnie++ isn't always going to reflect your real > performance. > > I have run tests on some workloads that were definitely I/O limited where > bonnie++ results that differed by a factor of 10x made no measurable > difference in

Re: [PERFORM] SSD + RAID

2009-11-18 Thread Scott Carey
On 11/17/09 10:51 AM, "Greg Smith" wrote: > Merlin Moncure wrote: >> I am right now talking to someone on postgresql irc who is measuring >> 15k iops from x25-e and no data loss following power plug test. > The funny thing about Murphy is that he doesn't visit when things are > quiet. It's quit

Re: [PERFORM] SSD + RAID

2009-11-18 Thread Tom Lane
Scott Carey writes: > For your database DATA disks, leaving the write cache on is 100% acceptable, > even with power loss, and without a RAID controller. And even in high write > environments. Really? How hard have you tested that configuration? > That is what the XLOG is for, isn't it? Once

Re: [PERFORM] SSD + RAID

2009-11-18 Thread Scott Carey
On 11/15/09 12:46 AM, "Craig Ringer" wrote: > Possible fixes for this are: > > - Don't let the drive lie about cache flush operations, ie disable write > buffering. > > - Give Pg some way to find out, from the drive, when particular write > operations have actually hit disk. AFAIK there's no su

Re: [PERFORM] SSD + RAID

2009-11-18 Thread Scott Carey
On 11/13/09 10:21 AM, "Karl Denninger" wrote: > > One caution for those thinking of doing this - the incremental > improvement of this setup on PostgreSQL in WRITE SIGNIFICANT environment > isn't NEARLY as impressive. Indeed the performance in THAT case for > many workloads may only be 20 or

Re: [PERFORM] SSD + RAID

2009-11-18 Thread Kenny Gorman
I found a bit of time to play with this. I started up a test with 20 concurrent processes all inserting into the same table and committing after each insert. The db was achieving about 5000 inserts per second, and I kept it running for about 10 minutes. The host was doing about 5MB/s of P
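
A sketch of this kind of concurrency test using libpq (an illustration, not the poster's actual harness): 20 processes each run single-row INSERTs with one implicit commit per statement. The table name commit_test and the iteration counts are made up for the example; connection parameters come from the usual libpq environment variables. Create the table beforehand with CREATE TABLE commit_test (v integer); and build with something like cc insert_test.c -lpq.

/* insert_test.c: N processes, each doing autocommitted single-row INSERTs. */
#include <libpq-fe.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROCS   20
#define NINSERTS 10000

static void worker(void)
{
    PGconn *conn = PQconnectdb("");   /* connection parameters taken from the environment */
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connect failed: %s", PQerrorMessage(conn));
        exit(1);
    }
    for (int i = 0; i < NINSERTS; i++) {
        PGresult *res = PQexec(conn, "INSERT INTO commit_test VALUES (1)");
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "insert failed: %s", PQerrorMessage(conn));
        PQclear(res);                 /* each statement commits (and flushes WAL) on its own */
    }
    PQfinish(conn);
    exit(0);
}

int main(void)
{
    for (int i = 0; i < NPROCS; i++)
        if (fork() == 0)
            worker();
    for (int i = 0; i < NPROCS; i++)
        wait(NULL);                   /* divide NPROCS * NINSERTS by elapsed time for the rate */
    return 0;
}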

Re: [PERFORM] SSD + RAID

2009-11-17 Thread david
On Wed, 18 Nov 2009, Greg Smith wrote: Merlin Moncure wrote: But what's up with the 400 iops measured from bonnie++? I don't know really. SSD writes are really sensitive to block size and the ability to chunk writes into larger chunks, so it may be that Peter has just found the worst-case be

Re: [PERFORM] SSD + RAID

2009-11-17 Thread Greg Smith
Merlin Moncure wrote: But what's up with the 400 iops measured from bonnie++? I don't know really. SSD writes are really sensitive to block size and the ability to chunk writes into larger chunks, so it may be that Peter has just found the worst-case behavior and everybody else is seeing som

Re: [PERFORM] SSD + RAID

2009-11-17 Thread Mark Mielke
On 11/17/2009 01:51 PM, Greg Smith wrote: Merlin Moncure wrote: I am right now talking to someone on postgresql irc who is measuring 15k iops from x25-e and no data loss following power plug test. The funny thing about Murphy is that he doesn't visit when things are quiet. It's quite possible

Re: [PERFORM] SSD + RAID

2009-11-17 Thread Merlin Moncure
On Tue, Nov 17, 2009 at 1:51 PM, Greg Smith wrote: > Merlin Moncure wrote: >> >> I am right now talking to someone on postgresql irc who is measuring >> 15k iops from x25-e and no data loss following power plug test. > > The funny thing about Murphy is that he doesn't visit when things are quiet.

Re: [PERFORM] SSD + RAID

2009-11-17 Thread Greg Smith
Merlin Moncure wrote: I am right now talking to someone on postgresql irc who is measuring 15k iops from x25-e and no data loss following power plug test. The funny thing about Murphy is that he doesn't visit when things are quiet. It's quite possible the window for data loss on the drive is v

Re: [PERFORM] SSD + RAID

2009-11-17 Thread Peter Eisentraut
On Tue, 2009-11-17 at 11:36 -0500, Merlin Moncure wrote: > I am right now talking to someone on postgresql irc who is measuring > 15k iops from x25-e and no data loss following power plug test. I am > becoming increasingly suspicious that peter's results are not > representative: given that 90% of

Re: [PERFORM] SSD + RAID

2009-11-17 Thread Scott Marlowe
On Tue, Nov 17, 2009 at 9:54 AM, Brad Nicholson wrote: > On Tue, 2009-11-17 at 11:36 -0500, Merlin Moncure wrote: >> 2009/11/13 Greg Smith : >> > As far as what real-world apps have that profile, I like SSDs for small to >> > medium web applications that have to be responsive, where the user shows

Re: [PERFORM] SSD + RAID

2009-11-17 Thread Brad Nicholson
On Tue, 2009-11-17 at 11:36 -0500, Merlin Moncure wrote: > 2009/11/13 Greg Smith : > > As far as what real-world apps have that profile, I like SSDs for small to > > medium web applications that have to be responsive, where the user shows up > > and wants their randomly distributed and uncached dat

Re: [PERFORM] SSD + RAID

2009-11-17 Thread Merlin Moncure
2009/11/13 Greg Smith : > As far as what real-world apps have that profile, I like SSDs for small to > medium web applications that have to be responsive, where the user shows up > and wants their randomly distributed and uncached data with minimal latency. > SSDs can also be used effectively as se

Re: [PERFORM] SSD + RAID

2009-11-15 Thread Heikki Linnakangas
Craig James wrote: > I've wondered whether this would work for a read-mostly application: Buy > a big RAM machine, like 64GB, with a crappy little single disk. Build > the database, then make a really big RAM disk, big enough to hold the DB > and the WAL. Then build a duplicate DB on another mach

Re: [PERFORM] SSD + RAID

2009-11-15 Thread Craig James
I've wondered whether this would work for a read-mostly application: Buy a big RAM machine, like 64GB, with a crappy little single disk. Build the database, then make a really big RAM disk, big enough to hold the DB and the WAL. Then build a duplicate DB on another machine with a decent disk

Re: [PERFORM] SSD + RAID

2009-11-15 Thread Laszlo Nagy
- Pg doesn't know the erase block sizes or positions. It can't group writes up by erase block except by hoping that, within a given file, writing in page order will get the blocks to the disk in roughly erase-block order. So your write caching isn't going to do anywhere near as good a job as the

Re: [PERFORM] SSD + RAID

2009-11-15 Thread Craig Ringer
On 15/11/2009 2:05 PM, Laszlo Nagy wrote: > >> A change has been written to the WAL and fsync()'d, so Pg knows it's hit >> disk. It can now safely apply the change to the tables themselves, and >> does so, calling fsync() to tell the drive containing the tables to >> commit those changes to disk.

Re: [PERFORM] SSD + RAID

2009-11-15 Thread Laszlo Nagy
A change has been written to the WAL and fsync()'d, so Pg knows it's hit disk. It can now safely apply the change to the tables themselves, and does so, calling fsync() to tell the drive containing the tables to commit those changes to disk. The drive lies, returning success for the fsync when
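
A toy illustration of the ordering described above (assumptions: Linux, and that fsync() really forces data to stable storage): the record is flushed to a WAL-like file before the table-like file is modified, so a crash between the two steps can always be repaired by replaying the WAL copy. If the drive acknowledges the flush while the data is still sitting in its volatile cache, both copies can be lost together, which is the failure mode under discussion. File names are arbitrary.

/* Write-ahead ordering in miniature: WAL first, fsync, then the table. */
#define _XOPEN_SOURCE 500
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char record[] = "page 42: set balance = 100";

    int wal = open("toy_wal.log", O_WRONLY | O_CREAT | O_APPEND, 0600);
    int tbl = open("toy_table.dat", O_RDWR | O_CREAT, 0600);
    if (wal < 0 || tbl < 0) { perror("open"); return 1; }

    /* 1. write the intended change to the WAL and force it to disk */
    write(wal, record, sizeof record);
    if (fsync(wal) != 0) { perror("fsync wal"); return 1; }

    /* 2. only now is it safe to modify the table file itself */
    pwrite(tbl, record, sizeof record, 0);
    fsync(tbl);   /* in the real system this flush happens at checkpoint time */

    printf("change durable, assuming the drive honored the flush\n");
    close(wal);
    close(tbl);
    return 0;
}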

Re: [PERFORM] SSD + RAID

2009-11-15 Thread Craig Ringer
On 15/11/2009 11:57 AM, Laszlo Nagy wrote: > Ok, I'm getting confused here. There is the WAL, which is written > sequentially. If the WAL is not corrupted, then it can be replayed on > next database startup. Please somebody enlighten me! In my mind, fsync > is only needed for the WAL. If I could c

Re: [PERFORM] SSD + RAID

2009-11-14 Thread Laszlo Nagy
* I could buy two X25-E drives and have 32GB disk space, and some redundancy. This would cost about $1600, not counting the RAID controller. It is on the edge. This was the solution I went with (4 drives in a raid 10 actually). Not a cheap solution, but the performance is amazing.

Re: [PERFORM] SSD + RAID

2009-11-14 Thread Laszlo Nagy
Robert Haas wrote: 2009/11/14 Laszlo Nagy : 32GB is for one table only. This server runs other applications, and you need to leave space for sort memory, shared buffers etc. Buying 128GB memory would solve the problem, maybe... but it is too expensive. And it is not safe. Power out -> data lo

Re: [PERFORM] SSD + RAID

2009-11-14 Thread Merlin Moncure
On Sat, Nov 14, 2009 at 8:47 AM, Heikki Linnakangas wrote: > Merlin Moncure wrote: >> On Sat, Nov 14, 2009 at 6:17 AM, Heikki Linnakangas >> wrote: lots of ram doesn't help you if: *) your database gets written to a lot and you have high performance requirements >>> When all the (h

Re: [PERFORM] SSD + RAID

2009-11-14 Thread Robert Haas
2009/11/14 Laszlo Nagy : > 32GB is for one table only. This server runs other applications, and you > need to leave space for sort memory, shared buffers etc. Buying 128GB memory > would solve the problem, maybe... but it is too expensive. And it is not > safe. Power out -> data loss. Huh? ...Rob

Re: [PERFORM] SSD + RAID

2009-11-14 Thread Laszlo Nagy
Heikki Linnakangas wrote: Laszlo Nagy wrote: * I need at least 32GB disk space. So DRAM based SSD is not a real option. I would have to buy 8x4GB memory, costs a fortune. And then it would still not have redundancy. At 32GB database size, I'd seriously consider just buying

Re: [PERFORM] SSD + RAID

2009-11-14 Thread Heikki Linnakangas
Merlin Moncure wrote: > On Sat, Nov 14, 2009 at 6:17 AM, Heikki Linnakangas > wrote: >>> lots of ram doesn't help you if: >>> *) your database gets written to a lot and you have high performance >>> requirements >> When all the (hot) data is cached, all writes are sequential writes to >> the WAL,

Re: [PERFORM] SSD + RAID

2009-11-14 Thread Merlin Moncure
On Sat, Nov 14, 2009 at 6:17 AM, Heikki Linnakangas wrote: >> lots of ram doesn't help you if: >> *) your database gets written to a lot and you have high performance >> requirements > > When all the (hot) data is cached, all writes are sequential writes to > the WAL, with the occasional flushing

Re: [PERFORM] SSD + RAID

2009-11-14 Thread Heikki Linnakangas
Merlin Moncure wrote: > 2009/11/13 Heikki Linnakangas : >> Laszlo Nagy wrote: >>>* I need at least 32GB disk space. So DRAM based SSD is not a real >>> option. I would have to buy 8x4GB memory, costs a fortune. And >>> then it would still not have redundancy. >> At 32GB database size,

Re: [PERFORM] SSD + RAID

2009-11-14 Thread Ivan Voras
Lists wrote: Laszlo Nagy wrote: Hello, I'm about to buy SSD drive(s) for a database. For decision making, I used this tech report: http://techreport.com/articles.x/16255/9 http://techreport.com/articles.x/16255/10 Here are my concerns: * I need at least 32GB disk space. So DRAM based SS

Re: [PERFORM] SSD + RAID

2009-11-14 Thread Lists
Laszlo Nagy wrote: Hello, I'm about to buy SSD drive(s) for a database. For decision making, I used this tech report: http://techreport.com/articles.x/16255/9 http://techreport.com/articles.x/16255/10 Here are my concerns: * I need at least 32GB disk space. So DRAM based SSD is not a rea

Re: [PERFORM] SSD + RAID

2009-11-13 Thread Kenny Gorman
The FusionIO products are a little different. They are card based vs trying to emulate a traditional disk. In terms of volatility, they have an on-board capacitor that allows power to be supplied until all writes drain. They do not have a cache in front of them like a disk-type SSD might. I

Re: [PERFORM] SSD + RAID

2009-11-13 Thread Greg Smith
Fernando Hevia wrote: Shouldn't their write performance be more than a trade-off for fsync? Not if you have sequential writes that are regularly fsync'd--which is exactly how the WAL writes things out in PostgreSQL. I think there's a potential for SSD to reach a point where they can give go

Re: [PERFORM] SSD + RAID

2009-11-13 Thread Merlin Moncure
2009/11/13 Greg Smith : > As far as what real-world apps have that profile, I like SSDs for small to > medium web applications that have to be responsive, where the user shows up > and wants their randomly distributed and uncached data with minimal latency. > SSDs can also be used effectively as se

Re: [PERFORM] SSD + RAID

2009-11-13 Thread Greg Smith
Brad Nicholson wrote: Out of curiosity, what are those narrow use cases where you think SSD's are the correct technology? Dave Crooke did a good summary already, I see things like this: * You need to have a read-heavy app that's bigger than RAM, but not too big so it can still fit on SSD * You

Re: [PERFORM] SSD + RAID

2009-11-13 Thread Fernando Hevia
> -Original Message- > Laszlo Nagy > > My question is about the last option. Are there any good RAID > cards that are optimized (or can be optimized) for SSD > drives? Do any of you have experience in using many cheaper > SSD drives? Is it a bad idea? > > Thank you, >Laszlo >
