[GENERAL] postgresql.conf
Hey Guys, I am setting up a new dedicated Postgres server, and will serve about 60 databases to a web site serving 250,000 people at the rate of about 20,000 a day. That may all be irrellevent though for the purposes of this conversation. The main thing about the application is that we're talking about lots and lots of little transactions and onyl a few big ones. The machine in question will do nothing but serve databases. It's a dual 3.2Ghz Xeon with 100GB or 15K RPM RAID 5 and 8 GB of RAM. I'd like to configure it to get the most out of the server possible as far as shared memory, sort memore, etc. I haven't found a lot of documentation on this. By the way, for some reason Postgres 7.4.x wouldn't install properly - had a problem with initdb - had to use 7.3.6 Any help is greatly appreciated. -John ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [GENERAL] postgresql.conf
I'm running Red Hat Enterprise ES with all the most recent updates. The error - in initdb - was that the system couldn't find "ascii_and_mic" libraries. 7.3.6 ran without a hitch. The RAID 1+0 - is that a stripped / mirrored condifuration? How big of a difference will that make in performance do you think - are we talking 10 % or leaps and bounds? The help is much appreciated - I'm reading that tuning manual now. -John On Wed, 22 Dec 2004 10:59:43 -0500, Frank D. Engel, Jr. <[EMAIL PROTECTED]> wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > > On Dec 22, 2004, at 10:15 AM, John Cunningham wrote: > > > like to configure it to get the most out of the server possible as far > > as shared memory, sort memore, etc. I haven't found a lot of > > documentation on this. > > http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html > > - --- > Frank D. Engel, Jr. <[EMAIL PROTECTED]> > > $ ln -s /usr/share/kjvbible /usr/manual > $ true | cat /usr/manual | grep "John 3:16" > John 3:16 For God so loved the world, that he gave his only begotten > Son, that whosoever believeth in him should not perish, but have > everlasting life. > $ > -BEGIN PGP SIGNATURE- > Version: GnuPG v1.2.4 (Darwin) > > iD8DBQFByZnw7aqtWrR9cZoRAgl8AJ9PcpktGBeA/Oboh5+ZKPwS1niP9QCfZaL3 > yf1Yvav5baMeQZOjGmvmk6E= > =Hnti > -END PGP SIGNATURE- > > ___ > $0 Web Hosting with up to 120MB web space, 1000 MB Transfer > 10 Personalized POP and Web E-mail Accounts, and much more. > Signup at www.doteasy.com > > ---(end of broadcast)--- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match > ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [GENERAL] postgresql.conf
OK Guys - here's the config file as I've writtten it. I'll paste in the whole thing before, but this is the important stuff: max_connections = 256 shared_buffers = 32768 # (256 MB) sort_mem = 1024 # min 64, size in KB fsync = No wal_sync_method = fsync # the default varies across platforms: effective_cache_size = 786432 # (6 GB) random_page_cost = 2# units are one sequential page fetch cost I am seriously considering breaking the machine all the way down and changing to a stripped / mirrored config if that will be the fastest way to run it. Need advice on that. Here's the config file: # # # Connection Parameters # tcpip_socket = false #ssl = false max_connections = 256 superuser_reserved_connections = 2 port = 5432 #hostname_lookup = false #show_source_port = false #unix_socket_directory = '' #unix_socket_group = '' #unix_socket_permissions = 0777 # octal #virtual_host = '' #krb_server_keyfile = '' # # Shared Memory Size # shared_buffers = 32768 # min max_connections*2 or 16, 8KB each #max_fsm_relations = 1000 # min 10, fsm is free space map, ~40 bytes #max_fsm_pages = 1 # min 1000, fsm is free space map, ~6 bytes #max_locks_per_transaction = 64 # min 10 #wal_buffers = 8# min 4, typically 8KB each # # Non-shared Memory Sizes # sort_mem = 1024 # min 64, size in KB #vacuum_mem = 8192 # min 1024, size in KB # # Write-ahead log (WAL) # #checkpoint_segments = 3# in logfile segments, min 1, 16MB each #checkpoint_timeout = 300 # range 30-3600, in seconds # #commit_delay = 0 # range 0-10, in microseconds #commit_siblings = 5# range 1-1000 # fsync = No wal_sync_method = fsync # the default varies across platforms: # # fsync, fdatasync, open_sync, or open_datasync #wal_debug = 0 # range 0-16 # # Optimizer Parameters # #enable_seqscan = true #enable_indexscan = true #enable_tidscan = true #enable_sort = true #enable_nestloop = true #enable_mergejoin = true #enable_hashjoin = true effective_cache_size = 786432 # typically 8KB each random_page_cost = 2# units are one sequential page fetch cost #cpu_tuple_cost = 0.01 # (same) #cpu_index_tuple_cost = 0.001 # (same) #cpu_operator_cost = 0.0025 # (same) #default_statistics_target = 10 # range 1-1000 # # GEQO Optimizer Parameters # #geqo = true #geqo_selection_bias = 2.0 # range 1.5-2.0 #geqo_threshold = 11 #geqo_pool_size = 0 # default based on tables in statement, # range 128-1024 #geqo_effort = 1 #geqo_generations = 0 #geqo_random_seed = -1 # auto-compute seed # # Message display # #server_min_messages = notice # Values, in order of decreasing detail: # debug5, debug4, debug3, debug2, debug1, # info, notice, warning, error, log, fatal, # panic #client_min_messages = notice # Values, in order of decreasing detail: # debug5, debug4, debug3, debug2, debug1, # log, info, notice, warning, error #silent_mode = false #log_connections = false #log_pid = false #log_statement = false #log_duration = false #log_timestamp = false #log_min_error_statement = panic # Values in order of increasing severity: # debug5, debug4, debug3, debug2, debug1, # info, notice, warning, error, panic(off) #debug_print_parse = false #debug_print_rewritten = false #debug_print_plan = false #debug_pretty_print = false #explain_pretty_print = true # requires USE_ASSERT_CHECKING #debug_assertions = true # # Syslog # #syslog = 0 # range 0-2 #syslog_facility = 'LOCAL0' #syslog_ident = 'postgres' # # Statistics # #show_parser_stats = false #show_planner_stats = false #show_executor_stats = false #show_statement_stats = false # requires BTREE_BUILD_STATS #show_btree_build_stats = false # # Access statistics collection # #stats_start_collector = true #stats_reset_on_server_start = true #stats_command_string = false #stats_row_level = false #stats_block_level = false # # Lock Tracing # #trace_notify = false # requires LOCK_DEBUG #trace_locks = false #trace_userlocks = false #trace_lwlocks = false #debug_deadlocks = false #trace_lock_oidmin = 16384 #trace_lock_table = 0 # # Misc # #autocommit = true #dynamic_library_path = '$libdir' #search_path = '$user,public' #datestyle = 'iso, us' #timezone = unknown # actually, defaults to TZ environment setting #australian_timezones = false #client_encoding = sql_ascii# actually, defaults to database encoding #authentication_timeout = 60# 1-600, in seconds #deadlock_tim
Re: [GENERAL] postgresql.conf
The server is a DELL Poweredge 2650 with it's built in RAID - 4 disks currently in a RAID 5 config. I will check on the battery backup. I'm putting this server together and rebuilding our overall db structure all at the same time, so I have a good amount of flexiblity. I realized I was not taking as much advantage of the machine as possible before - hence the call out to the list. I'll look in to the persistent connections - last time we built this it was several versions ago and the overall attitude was that it didn't work terrible well. Is this a PG setting or something in PHP? The shared buffers was a big concern - I've read that there's a limit that helps, but as the machine will only do DB transactions, I don't know what else to do with the RAM. It's intended for PG's use. Obviously I'd rather have fsync on - I was really looking for some opinions on this. Better safe than sorry - but I am trying to sqeeze every bit of juice possible out of this machine. The machine has 4 drives - should I do a RAID 1+0 or a 5? Most of the searching is char fields then linking ids from one table to another. BTW - this is one of the best discussions I've been on - glad everyone can be so helpful. Thanks! -John On Wed, 22 Dec 2004 12:08:10 -0600, Scott Marlowe <[EMAIL PROTECTED]> wrote: > On Wed, 2004-12-22 at 11:30, John Cunningham wrote: > > OK Guys - here's the config file as I've writtten it. > > > > I'll paste in the whole thing before, but this is the important stuff: > > > > max_connections = 256 > > Are you using a connection pooling scheme (jdbc based pooling, pgpool, > etc...)? If not, you probably should, and then drop the max connections > to something much smaller, like 30 or 40 or so. If you can. > > > shared_buffers = 32768 # (256 MB) > > That's really high, even for a machine with 6+ gigs of ram. Unless > you're working data set is that big, it's too big. IF, on average, > you're working with smaller amounts of data at a time, it might be > better to drop it down to 5000 to 1. Few, if any benchmarks have > shown an improvement at settings over 1. OTOH, you might be the one > person out of a thousand or so who needs larger shared_buffers. Note > that shared buffers aren't cache, and when the last backend referencing > a particular data set in memory stops referencing it, the data it > dropped and the buffer memory released back to the pool, so to speak. > The kernel is generally better at caching than postgresql anyway. > > With 8.0's ARC cache algorithm in place, it might be time for someone to > start testing postgresql with a persistant buffer cache (i.e. make it > hold on to the old data sets intead of freeing up the space.) > > > sort_mem = 1024 # min 64, size in KB > > You can probably up this a bit, especially if you pool your > connections. Try 8192 for a starting point. Setting this too large can > be dangerous to the health of your OS, since you can starve the OS for > memory and make it start swapping processes out to come up with sort_mem > > > fsync = No > > Not really safe, and not the performance gain it once was, if I remember > a post from Tom recently correctly. On the other hand, IDE disks do > this by design, so if you were using those (with their cache enabled) > you'd be in the same boat. > > > wal_sync_method = fsync # the default varies across platforms: > > effective_cache_size = 786432 # (6 GB) > > random_page_cost = 2# units are one sequential page fetch cost > > You can probably drop this down to 1.2 to 1.4 or so on a machine with a > fast disk subsystem and caching controller. > > > I am seriously considering breaking the machine all the way down and > > changing to a stripped / mirrored config if that will be the fastest > > way to run it. Need advice on that. Here's the config file: > > Do you have a battery backed caching raid controller? If not, that's > one of the first steps to better performance. After that, if you've got > lots of disks, a RAID 5 or RAID 1+0 should both be pretty fast. If > you've got <8 or so disks, the RAID 1+0 will normally be faster, > assuming your RAID controller handles that configuration well. Some > older / cheaper controllers can't parallelize their I/O and run the same > speed in 1+0 as they would in plain old 1. > > > > #max_fsm_relations = 1000 # min 10, fsm is free space map, ~40 bytes > > #max_fsm_pages = 1 # min 1000, fsm is free space map, ~6 bytes > > Are you sure you're vacuuming often enough and that these settings are > high enough? What does 'vacuum
Re: [GENERAL] pgpool
Hmmm - so to bring one more idea here... The usage of each of these systems will be different on different days - Monday may have high usage on one or two databases and Wednesday may be high on 20 others. This makes me thing that configuring thigs this way would mean that pgpool would be limiting one connection for each database / user combination - whereas leaving the connections standard would mean that if one set was busy it could take multiple connections simultaneously. This seems to be another knock against pgpool - or am I way off? -John On Thu, 13 Jan 2005 22:06:12 +0100, Peter Eisentraut <[EMAIL PROTECTED]> wrote: > John Cunningham wrote: > > concerned that if I drop the number of connections to less than the > > number of databases I have, that pgpool would open the limit of > > connections, hold them open and not allow any connections to the > > remaining databases. Is this a concern? If I set up pgpool will I > > have to have the same number of connections as I have databases? > > That depends on how you configure pgpool. pgpool is not aware of the > connection limit count in the PostgreSQL server, so it will happily > open connections until there are no more slots available. > > pgpool will require max_pool * num_init_children connection slots. > max_pool should be the number of database/user combinations you use > (300 in your case, assuming only one database user account), and > num_init_children should be on the order of how many concurrent > connections you expect to each combination ("several" in your case). > So you should have at least 300 * several PostgreSQL connection slots, > which is probably more than the 1000 or so that is the default. > > -- > Peter Eisentraut > http://developer.postgresql.org/~petere/ > ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
[GENERAL] pgpool
I was considering putting pgpool in to place and was hoping to hear some feedback from those who use it. I am mostly concerned about the configuration I have. In my setup, any one database server contains between 100 and 300 databases on it, each of which may be accessed at any time by one of several web servers. The database servers I use are monsters - Dual Xeon 3.2 with 8GB of RAM. As I was querying this group earlier as to postgresql.conf tweaks I could do to speed things up the idea of setting fewer connections and using pgpool was brought up repeatedly. I am concerned that if I drop the number of connections to less than the number of databases I have, that pgpool would open the limit of connections, hold them open and not allow any connections to the remaining databases. Is this a concern? If I set up pgpool will I have to have the same number of connections as I have databases? Your input is appreciated. -John ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [GENERAL] pgpool
So - would it then be worth doing pgpool? On Sat, 15 Jan 2005 11:12:04 +0900 (JST), Tatsuo Ishii <[EMAIL PROTECTED]> wrote: > > John Cunningham wrote: > > > concerned that if I drop the number of connections to less than the > > > number of databases I have, that pgpool would open the limit of > > > connections, hold them open and not allow any connections to the > > > remaining databases. Is this a concern? If I set up pgpool will I > > > have to have the same number of connections as I have databases? > > > > That depends on how you configure pgpool. pgpool is not aware of the > > connection limit count in the PostgreSQL server, so it will happily > > open connections until there are no more slots available. > > > > pgpool will require max_pool * num_init_children connection slots. > > > > max_pool should be the number of database/user combinations you use > > (300 in your case, assuming only one database user account), > > Not really. If a user connects to pgpool and all onnection slots are > already full, then pgpool will release the oldest connection slot and > reuse it for the new connection. So even if there are 300 > database/user combinations, it's ok to set max_pool as low as, for > example, 4. Of course this will have unwanted side effect in that > connection caches are not very well kept, though. > -- > Tatsuo Ishii > > > and > > num_init_children should be on the order of how many concurrent > > connections you expect to each combination ("several" in your case). > > So you should have at least 300 * several PostgreSQL connection slots, > > which is probably more than the 1000 or so that is the default. > > > > -- > > Peter Eisentraut > > http://developer.postgresql.org/~petere/ > > > > ---(end of broadcast)--- > > TIP 4: Don't 'kill -9' the postmaster > > > ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])