Re: [PERFORM] Experience with HP Smart Array P400 and SATA drives?

2008-12-11 Thread Matthew Wakeling

On Wed, 10 Dec 2008, Greg Smith wrote:
I'd be interested in recommendations for RAID cards for small SATA systems. 
It's not anything to do with Postgres - I'm just intending to set up a 
little four-drive array for my home computer, with cheap 1TB SATA drives.


Then why are you thinking of RAID cards?  On a Linux-only host, you might 
as well just get a standard cheap multi-port SATA card that's compatible 
with the OS, plug the four drives in, and run software RAID.  Anything 
else you put in the middle adds complications, such as getting SMART 
error data from the drives, and the software RAID will probably be 
faster too.
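
For instance, a four-drive RAID 10 under Linux md takes only a couple of 
commands.  A minimal sketch, assuming the drives show up as /dev/sdb 
through /dev/sde (check your own device names first):

    # Build a four-drive RAID 10 array:
    mdadm --create /dev/md0 --level=10 --raid-devices=4 \
        /dev/sdb /dev/sdc /dev/sdd /dev/sde

    # Because the drives stay plain SATA devices, SMART data remains
    # directly accessible:
    smartctl -a /dev/sdb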


A great source for checking Linux compatibility is 
http://linux-ata.org/driver-status.html


Thanks, that is the kind of info I was looking for.  It looks like most 
sensible SATA controller manufacturers are converging on the open 
AHCI controller standard, which is useful.
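
A quick way to check what your own controller is using, with two stock 
Linux tools:

    # Identify the SATA controller:
    lspci | grep -i sata

    # See whether the ahci driver is loaded for it:
    lsmod | grep ahci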


Matthew

--
The best way to accelerate a Microsoft product is at 9.8 metres per second
per second.
- Anonymous



Re: [PERFORM] Need help with 8.4 Performance Testing

2008-12-11 Thread Josh Berkus



I would expect higher shared_buffers to raise the curve before the first
breakpoint, but to make the drop after that breakpoint steeper and deeper.
The equilibrium point where the curve flattens out should be lower.


On SPECjAppServer specifically, I remember seeing a drop when the 
database size grew beyond the size of shared_buffers.
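
Reproducing that kind of breakpoint just means comparing two numbers; a 
quick way to check them (the database name here is made up):

    psql -d specdb -c "SHOW shared_buffers"
    psql -d specdb -c "SELECT pg_size_pretty(pg_database_size(current_database()))"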


--Josh



Re: [PERFORM] Need help with 8.4 Performance Testing

2008-12-11 Thread Josh Berkus

Greg Stark wrote:

On Sun, Dec 7, 2008 at 7:38 PM, Josh Berkus  wrote:

Also, the following patches currently still have bugs, but when the bugs are
fixed I'll be looking for performance testers, so please either watch the
wiki or watch this space:
...
-- posix_fadvise (Gregory Stark)


Eh?  Quite possibly, but none that I'm aware of.  The only problem is a
couple of trivial bits of bitrot.  I'll post an update now if you
want.


I'm just going off the status in the -hackers archives.  I didn't 
actually try to build it before posting that.


If you have an updated patch, could you link it on the CommitFest page?  
Thanks.

--Josh



Re: [PERFORM] Need help with 8.4 Performance Testing

2008-12-11 Thread Josh Berkus

Tom,


Hmm ... I wonder whether this means that the current work on
parallelizing I/O (the posix_fadvise patch in particular) is a dead
end.  Because what that is basically going to do is expend more CPU
to improve I/O efficiency.  If you believe this thesis then that's
not the road we want to go down.


Nope.  People who administer small databases keep forgetting that there 
is another class of users with multiple terabytes of data.  Those users 
aren't getting away from spinning disk any time in the next 5 years.


Additionally, by making PostgreSQL work better with OS-based filesystem 
optimizations, we are well positioned to take advantage of any special 
features which Linux, Solaris, BSD etc. add to utilize new hardware like 
SSDs.  posix_fadvise is a great example of this.
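
For anyone who hasn't seen the call: the idea is just to tell the kernel 
about a range you'll want soon, so the read overlaps with other work.  A 
minimal standalone sketch (the file name and range are made up for 
illustration):

    #define _XOPEN_SOURCE 600   /* for posix_fadvise() */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/tmp/example.dat", O_RDONLY);
        if (fd < 0)
            return 1;

        /* Ask the kernel to prefetch the first 8kB in the background;
         * the call returns immediately.  Note it returns an error
         * number directly rather than setting errno. */
        int rc = posix_fadvise(fd, 0, 8192, POSIX_FADV_WILLNEED);
        if (rc != 0)
            fprintf(stderr, "posix_fadvise: %s\n", strerror(rc));

        /* ... do other work; a later read() of that range should hit
         * the page cache instead of waiting on the spindle ... */

        close(fd);
        return 0;
    }

Issuing many such hints ahead of a string of random reads is what lets 
the patch keep several spindles busy at once.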


--Josh



Re: [PERFORM] Need help with 8.4 Performance Testing

2008-12-11 Thread James Mansion

Scott Marlowe wrote:

involves tiny bits of data scattered throughout the database.  Our
current database is about 20-25 Gig, which means it's quickly reaching
the point where it will not fit in our 32G of ram, and it's likely to
grow too big for 64Gig before a year or two is out.
  

...

I wonder how many hard drives it would take to be CPU bound on random
access patterns?  About 40 to 60?  And probably 15k / SAS drives to
  
Well, it's not a very big database and you're seek-bound, so what's 
wrong with the latest generation of flash drives?  They seem perfect for 
what you want to do, and you can probably get what you need using the 
new ARC-cache-on-flash support (L2ARC) in ZFS.
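
For reference, wiring an SSD in as a flash ARC extension is a single 
command in ZFS; the pool and device names below are made up:

    # Add an SSD as a cache (L2ARC) device to an existing pool:
    zpool add tank cache c1t5d0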





Re: [PERFORM] Need help with 8.4 Performance Testing

2008-12-11 Thread Greg Smith

On Tue, 9 Dec 2008, Scott Carey wrote:

My system is now CPU bound, the I/O can do sequential reads of more than 
1.2GB/sec but Postgres can't do a seqscan 30% as fast because it eats up 
CPU like crazy just reading and identifying tuples... In addition to the 
fadvise patch, postgres needs to merge adjacent I/O's into larger ones 
to reduce the overhead.


Do you have any profile data to back that up?  I think it's more likely 
that the bottlenecks are on the tuple-processing side of things, as you 
also suggested.  There's really no sense in guessing; one quick session 
with something like oprofile would be more informative than any amount 
of speculation about what's going on.
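
For anyone who hasn't used it, a basic oprofile session is only a few 
commands (a sketch; assumes the oprofile kernel support is installed):

    opcontrol --init          # load the oprofile kernel module
    opcontrol --no-vmlinux    # skip kernel-symbol resolution
    opcontrol --start         # begin collecting samples
    # ... run the seqscan workload here ...
    opcontrol --shutdown      # stop profiling and flush samples
    opreport --symbols        # per-symbol breakdown of CPU time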


Additionally, regarding the "If your operating system has any reasonable 
caching itself" comment earlier in this conversation: Linux (2.6.18, 
CentOS 5.2) does NOT.  I can easily make it spend 100% CPU in system 
time trying to figure out what to do with the system cache for an hour.


Have you ever looked at how much memory ends up showing up as 
"Writeback" in /proc/meminfo when this happens?  The biggest problem with 
that kernel out of the box on the kind of workload you're describing is 
that it will try to buffer way too much on the write side by default, 
which can easily get you into the sort of ugly situations you describe. 
I regularly adjust that kernel to lower dirty_ratio in particular, 
dramatically, from the default of 40 to keep that from happening.  I 
wrote a whole blog entry on one of those cases if you're not already 
familiar with this particular set of painful defaults: 
http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html
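
For reference, both the check and the tweak are one-liners (the value 10 
below is illustrative, not a recommendation for every system):

    # See how much dirty/writeback data the kernel is holding:
    grep -E 'Dirty|Writeback' /proc/meminfo

    # Lower the dirty-page ceiling from the 2.6.18 default of 40%:
    echo 10 > /proc/sys/vm/dirty_ratio

    # Or persistently, in /etc/sysctl.conf:
    #   vm.dirty_ratio = 10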


I feel confident in saying that in about a year, I could spec out a 
medium-sized hardware budget ($25k) for almost any postgres setup 
and make it almost purely CPU-bound.


The largest database I manage is running on a Sun X4500, which is right at 
that price point.  I've never seen it be anything but CPU-bound.  Even 
though people are pulling data that's spread across a few TB of disk, 
whenever I see it straining to keep up with something, there's always a 
single CPU pegged.


--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
