Re: [PERFORM] Query take 101 minutes, help, please

2005-09-07 Thread Alex Hayward
On Wed, 7 Sep 2005, Meetesh Karia wrote:

> PG is creating the union of January, February and March tables first and
> that doesn't have an index on it. If you're going to do many queries using
> the union of those three tables, you might want to place their contents into
> one table and create an index on it.
>
> Otherwise, try something like this:
>
> SELECT "Rut Cliente"
> FROM "Internet_Abril"
> WHERE "Rut Cliente" NOT IN (SELECT "Rut Cliente" FROM
> "Internet_Enero")
> AND "Rut Cliente" NOT IN (SELECT "Rut Cliente" FROM
> "Internet_Febrero")
> AND "Rut Cliente" NOT IN (SELECT "Rut Cliente" FROM
> "Internet_Marzo");

You may also wish to try:

SELECT "Rut Cliente"
FROM "Internet_Abril"
WHERE NOT EXISTS
  (SELECT 1 FROM "Internet_Enero"
   WHERE "Internet_Enero"."Rut Cliente"="Internet_Abril"."Rut Cliente")
AND NOT EXISTS
  (SELECT 1 FROM "Internet_Febrero"
   WHERE "Internet_Febrero"."Rut Cliente"="Internet_Abril"."Rut Cliente")
AND NOT EXISTS
  (SELECT 1 FROM "Internet_Marzo"
   WHERE "Internet_Marzo"."Rut Cliente"="Internet_Abril"."Rut Cliente")

which will probably scan the indexes on the January, February and March
tables once for each row in the April table.
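
If you do go with the earlier suggestion of putting the three months'
contents into one table with an index, a rough sketch might look like the
following (the combined table and index names are just placeholders, and it
keeps only the column this query needs):

CREATE TABLE "Internet_Ene_Mar" AS
    SELECT "Rut Cliente" FROM "Internet_Enero"
    UNION
    SELECT "Rut Cliente" FROM "Internet_Febrero"
    UNION
    SELECT "Rut Cliente" FROM "Internet_Marzo";

CREATE INDEX "Internet_Ene_Mar_rut_idx"
    ON "Internet_Ene_Mar" ("Rut Cliente");

ANALYZE "Internet_Ene_Mar";

SELECT "Rut Cliente"
FROM "Internet_Abril"
WHERE NOT EXISTS
  (SELECT 1 FROM "Internet_Ene_Mar"
   WHERE "Internet_Ene_Mar"."Rut Cliente"="Internet_Abril"."Rut Cliente");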




Re: [PERFORM] Best OS & Configuration for Dual Xeon w/4GB & Adaptec

2006-03-20 Thread Alex Hayward
On Mon, 20 Mar 2006, Jim C. Nasby wrote:

> No, this is perfectly fine. Inactive memory in FreeBSD isn't the same as
> Free. It's the same as 'active' memory except that it's pages that
> haven't been accessed in X amount of time (between 100 and 200 ms, I
> think). When free memory starts getting low, FBSD will start moving
> pages from the inactive queue to the free queue (possibly resulting in
> writes to disk along the way).
>
> IIRC, Cache is the directory cache, and Buf is disk buffers, which is
> somewhat akin to shared_buffers in PostgreSQL.

I don't believe that's true. I'm not an expert in FreeBSD's VM internals,
but this is how I believe it works:

Active pages are pages currently mapped in to a process's address space.

Inactive pages are pages which are marked dirty (they must be written to
backing store before they can be freed) and which are not mapped in to a
process's address space. They're still associated with a VM object of some
kind - like part of a process's virtual address space or part of the cache
for a file on disk. If the page is still part of a process's virtual address
space and is accessed, a fault is generated and the page is put back in to
the address mappings.

Cached pages are like inactive pages but aren't dirty. They can be either
re-mapped or freed immediately.

Free pages are properly free. Wired pages are unswappable. Buf I'm not
sure about. It doesn't represent the amount of memory used to cache files
on disk, I'm sure of that. The sysctl -d description is 'KVA memory used
for bufs', so I suspect that it's the amount of kernel virtual address
space mapped to pages in the 'active', 'inactive' and 'cache' queues.

-- 
  Alex Hayward
  Seatbooker




Re: [PERFORM] Best OS & Configuration for Dual Xeon w/4GB & Adaptec

2006-03-21 Thread Alex Hayward
On Tue, 21 Mar 2006, Jim C. Nasby wrote:

> On Tue, Mar 21, 2006 at 11:03:26PM +1200, Mark Kirkwood wrote:
> >
> > So it's really all about accounting, in a sense - whether pages end up in
> > the 'Buf' or 'Inactive' queue, they are still cached!
>
> So what's the difference between Buf and Active then? Just that active
> means it's a code page, or that it's been directly mapped into a
> process's memory (perhaps via mmap)?

I don't think that Buf and Active are mutually exclusive. Try adding up
Active, Inactive, Cache, Wired, Buf and Free - it'll come to more than
your physical memory.

Active gives an amount of physical memory. Buf gives an amount of
kernel-space virtual memory which provides the kernel with a window on to
pages in the other categories. In fact, I don't think that 'Buf' really
belongs in the list as it doesn't represent a 'type' of page at all.

-- 
  Alex Hayward
  Seatbooker



Re: [PERFORM] Hardware: HP StorageWorks MSA 1500

2006-04-21 Thread Alex Hayward
On Thu, 20 Apr 2006, Mikael Carneholm wrote:

> We're going to get one for evaluation next week (equipped with dual
> 2Gbit HBA:s and 2x14 disks, iirc). Anyone with experience from them,
> performance wise?

We (Seatbooker) use one. It works well enough. Here's a sample bonnie
output:

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
        16384 41464 30.6 41393 10.0 16287  3.7 92433 83.2 119608 18.3 674.0 0.8

which is hardly bad (on a four-disk 15kRPM RAID 10 with 2Gbps FC).
Sequential scans on a table produce about 40MB/s of IO with the
'disk' something like 60-70% busy according to FreeBSD's systat.

Here's diskinfo -cvt output on a not quite idle system:

/dev/da1
512 # sectorsize
59054899200 # mediasize in bytes (55G)
115341600   # mediasize in sectors
7179# Cylinders according to firmware.
255 # Heads according to firmware.
63  # Sectors according to firmware.

I/O command overhead:
time to read 10MB block  0.279395 sec   =0.014 msec/sector
time to read 20480 sectors  11.864934 sec   =0.579 msec/sector
calculated command overhead =0.566 msec/sector

Seek times:
Full stroke:  250 iter in   0.836808 sec =3.347 msec
Half stroke:  250 iter in   0.861196 sec =3.445 msec
Quarter stroke:   500 iter in   1.415700 sec =2.831 msec
Short forward:400 iter in   0.586330 sec =1.466 msec
Short backward:   400 iter in   1.365257 sec =3.413 msec
Seq outer:   2048 iter in   1.184569 sec =0.578 msec
Seq inner:   2048 iter in   1.184158 sec =0.578 msec
Transfer rates:
outside:   102400 kbytes in   1.367903 sec =74859 kbytes/sec
middle:102400 kbytes in   1.472451 sec =69544 kbytes/sec
inside:102400 kbytes in   1.521503 sec =67302 kbytes/sec


It (or any FC SAN, for that matter) isn't an especially cheap way to get
storage. You don't get much option if you have an HP blade enclosure,
though.

HP's support was poor. Their Indian call-centre seems not to know much
about them and spectacularly failed to tell us if and how we could connect
this (with the 2/3-port FC hub option) to two of our blade servers, one of
which was one of the 'half-height' ones which require an arbitrated loop.
We ended up buying a FC switch.



Re: [PERFORM] Hardware: HP StorageWorks MSA 1500

2006-04-24 Thread Alex Hayward
On Mon, 24 Apr 2006, Mark Kirkwood wrote:

> If I'm reading the original post correctly, the biggest issue is likely
> to be that the 14 disks on each 2Gbit fibre channel will be throttled to
> 200MB/s by the channel, when in fact you could expect (in RAID 10
> arrangement) to get about 7 * 70MB/s = 490MB/s.

The two controllers and two FC switches/hubs are intended for redundancy,
rather than performance, so there's only one 2Gbit channel. I don't know
if it's possible to use both in parallel to get better performance.

I believe it's possible to join two or more FC ports on the switch
together, but as there's only one port going to the controller internally
this presumably wouldn't help.

There are two SCSI U320 buses, with seven bays on each. I don't know what
the overhead of SCSI is, but you're obviously not going to get 490MB/s
for each set of seven even if the FC could do it.
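
(U320 is nominally 320MB/s per bus, so each set of seven would top out
around there in any case.)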

Of course your database may not spend all day doing sequential scans one
at a time over 14 disks, so it doesn't necessarily matter...




Re: [PERFORM] Hardware: HP StorageWorks MSA 1500

2006-04-27 Thread Alex Hayward
On Tue, 25 Apr 2006, Mark Kirkwood wrote:

> Mikael Carneholm wrote:
> >
> >> There are two SCSI U320 buses, with seven bays on each. I don't know
> >> what the overhead of SCSI is, but you're obviously not going to get
> >> 490MB/s for each set of seven even if the FC could do it.
> >
>
> You should be able to get close to 300MB/s on each SCSI bus - provided
> the PCI bus on the motherboard is 64-bit and runs at 133MHz or better
> (64-bit and 66MHz give you a 524MB/s limit).

I've no idea if the MSA1500's controllers use PCI internally. Obviously
this argument applies to the PCI bus you plug your FC adapters in to,
though.

AIUI it's difficult to get PCI to actually give you its theoretical
maximum bandwidth. Those speeds are still a lot more than 200MB/s, though.
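
(For what it's worth, 64-bit/133MHz PCI-X works out to a bit over 1GB/s
nominal, and even 64-bit/66MHz is around 500MB/s.)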

> >> Of course your database may not spend all day doing sequential scans
> >> one at a time over 14 disks, so it doesn't necessarily matter...
> >
>
> Yeah, it depends on the intended workload, but at some point most
> databases end up IO bound... so you really want to ensure the IO system
> is as capable as possible IMHO.

IO bound doesn't imply IO bandwidth bound. 14 disks doing a 1ms seek
followed by an 8k read over and over again is a bit over 100MB/s. Adding
in write activity would make a difference, too, since it'd have to go to
at least two disks. There are presumably hot spares, too.
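
(Roughly: 1ms per operation is at most about 1000 operations per second per
disk, and 14 * 1000 * 8kB comes to around 112MB/s.)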

I still wouldn't really want to be limited to 200MB/s if I expected to use
a full set of 14 disks for active database data where utmost performance
really matters and where there may be some sequential scans going on,
though.

> > That's probably true, but *knowing* that the max seq scan speed is that
> > high gives you some confidence (true or fake) that the hardware will be
> > sufficient for the next 2 years or so. So, if dual 2Gbit FCs still don't
> > deliver more than 200MB/s, what does?
> >
>
> Most modern PCI-X or PCIe RAID cards will do better than 200MB/s (e.g.
> 3Ware 9550SX will do ~800MB/s).
>
> By way of comparison my old PIII with a Promise TX4000 plus 4 IDE drives
> will do 215MB/s... so being throttled to 200MB/s on modern hardware seems
> unwise to me.

Though, of course, these won't do many of the things you can do with a SAN
- like connect several computers, or split a single array in to two pieces
and have two computers access them as if they were separate drives, or
remotely shut down one database machine and then start up another using
the same disks and data. The number of IO operations per second they can
do is likely to be important, too...possibly more important.

There's 4Gbit FC, and so presumably 4Gbit SANs, but that's still not vast
bandwidth. Using multiple FC ports is the other obvious way to do it with
a SAN. I haven't looked, but I suspect you'll need quite a budget to get
that...



Re: [PERFORM] PostgreSQL performance issues

2006-08-30 Thread Alex Hayward
On Wed, 30 Aug 2006, Willo van der Merwe wrote:

> Merlin Moncure wrote:
> > On 8/29/06, Willo van der Merwe <[EMAIL PROTECTED]> wrote:
> >
> >>  and it has 743321 rows and an explain analyze select count(*) from
> >> property_values;
> >>
> >
> > you have a number of options:
> All good ideas and I'll be sure to implement them later.
>
> > I am curious why you need to query the count of records in the log
> > table to six digits of precision.
> I'm not with you here.
> I'm drawing statistics for my users on a per-user basis in real-time,
> so there are a couple of where clauses attached.

Most of the advice so far has been aimed at improving the performance of
the query you gave. If this query isn't representative of your load then
you'll get better advice if you post the queries you are actually making
along with EXPLAIN ANALYZE output.
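
For example, something along these lines for each of the slow queries (the
user_id column here is just a guess at your schema):

EXPLAIN ANALYZE
SELECT count(*)
FROM property_values
WHERE user_id = 42;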

> Hi Merlin,
>
> This was just an example. All queries have slowed down. Could it be that
> I've reached some cut-off and now my disk is thrashing?
>
> Currently the load looks like this:
> Cpu0  : 96.8% us,  1.9% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  1.0% si
> Cpu1  : 97.8% us,  1.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
> Cpu2  : 96.8% us,  2.6% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si
> Cpu3  : 96.2% us,  3.2% sy,  0.0% ni,  0.3% id,  0.0% wa,  0.0% hi,  0.3% si

It seems to be a sort of standing assumption on this list that databases
are much larger than memory and that database servers are almost always IO
bound. This isn't always true, but as we don't know the size of your
database or working set we can't tell. You'd have to look at your OS's IO
statistics to be sure, but it doesn't look to me to be likely that you're
IO bound.

If there are significant writes going on then it may also be interesting
to know your context switch rate and whether dropping your foreign key
constraint makes any difference. IIRC your foreign key constraint will
result in the row in log_sites being locked FOR UPDATE and cause updates
and inserts into your log table for a particular site to be serialized (I
may be out of date on this, it's a while since I heavily used foreign
keys).
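
A rough way to test that - the table, column and constraint names below are
made up, so check \d on your log table for the real ones - would be:

ALTER TABLE log DROP CONSTRAINT log_site_id_fkey;
-- re-run the workload and compare throughput and the context switch rate
ALTER TABLE log ADD CONSTRAINT log_site_id_fkey
    FOREIGN KEY (site_id) REFERENCES log_sites (site_id);
-- note that re-adding the constraint will re-check every existing row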



Re: [PERFORM] performance problems.

2006-08-30 Thread Alex Hayward
On Wed, 30 Aug 2006, Jim C. Nasby wrote:

> On Wed, Aug 30, 2006 at 10:10:28AM -0400, Vivek Khera wrote:
> > effective_cache_size = 27462   # `sysctl -n vfs.hibufspace` / 8192 (BLKSZ)
> > random_page_cost = 2
>
> You misunderstand how effective_cache_size is used. It's the *only*
> memory factor that plays a role in cost estimator functions. This means
> it should include the memory set aside for caching in shared_buffers.
>
> Also, hibufspace is only talking about filesystem buffers in FreeBSD,
> which AFAIK has nothing to do with total memory available for caching,
> since VM pages are also used to cache data.

I believe it's not talking about quantities of buffers at all, but about
kernel virtual address space. It's something like the amount of kernel
virtual address space available for mapping buffer-cache pages in to
kernel memory. It certainly won't tell you (or even approximate) how much
PostgreSQL data is being cached by the OS. Cached PostgreSQL data will
appear in the active, inactive and cached values - and (AFAIK) there isn't
any distinction between file-backed pages and swap-backed pages amongst
those.
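
(For scale: 27462 8kB pages is only about 215MB, which is presumably far
less than the memory the OS will actually use for caching on that machine.)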



Re: [PERFORM] Performance Bottleneck

2004-08-10 Thread Alex Hayward
On Sun, 8 Aug 2004, Matt Clark wrote:

> > And this is exactly where the pgpool advantage lies. Especially with the
> > TPC-W, the Apache is serving a mix of PHP (or whatever CGI technique is
> > used) and static content like images. Since the 200+ Apache kids serve
> > any of that content at random and the emulated browsers very much
> > encourage it to ramp up MaxClients children by using up to 4 concurrent
> > image connections, one does end up with MaxClients DB connections that
> > are all used relatively infrequently. In contrast to that the real
> > pgpool causes fewer, more active DB connections, which is better for
> > performance.
>
> There are two well-worn and very mature techniques for dealing with the
> issue of web apps using one DB connection per apache process, both of which
> work extremely well and attack the issue at its source.
>
> 1)Use a front-end caching proxy like Squid as an accelerator.  Static
> content will be served by the accelerator 99% of the time.  Additionally,
> large pages can be served immediately to the accelerator by Apache, which
> can then go on to serve another request without waiting for the end user's
> dial-up connection to pull the data down.  Massive speedup, fewer apache
> processes needed.

Squid also takes away the work of doing SSL (presuming you're running it
on a different machine). Unfortunately it doesn't support HTTP/1.1, which
means that most generated pages (those that don't set Content-Length) end
up forcing squid to close and then reopen the connection to the web
server.

Because you no longer need to worry about keeping Apache processes around
to dribble data to people on the wrong end of modems you can reduce
MaxClients quite a bit (to, say, 10 or 20 per web server). This keeps the
number of PostgreSQL connections down. I'd guess that above some point
you're going to reduce performance by increasing MaxClients and running
queries in parallel rather than queueing the requests and doing them
serially.

I've also had some problems when Squid had a large number of connections
open (several thousand); though that may have been because of my
half_closed_clients setting. Squid 3 coped a lot better when I tried it
(quite a few months ago now - and using FreeBSD and the special kqueue
system call) but crashed under some (admittedly synthetic) conditions.

> I'm sure pgpool and the like have their place, but being band-aids for
> poorly configured websites probably isn't the best use for them.

You still have periods of time when the web servers are busy using their
CPUs to generate HTML rather than waiting for database queries. This is
especially true if you cache a lot of data somewhere on the web servers
themselves (which, in my experience, reduces the database load a great
deal). If you REALLY need to reduce the number of connections (because you
have a large number of web servers doing a lot of computation, say) then
it might still be useful.
