Re: [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-21 Thread Magnus Hagander
Trevor Talbot wrote:
> On 10/20/07, Rainer Bauer <[EMAIL PROTECTED]> wrote:
> 
>> Anyway, the problem is the number of semaphores created by Postgres:
>> Every backend creates at least 4* semaphores. Just
>> increase max_connections to an unusually high value (say 1) and
>> start creating new connections while monitoring the handle count.
> 
> Hmm, they're actually the same semaphores, so the only cost is for
> slots in each process's handle table, which comes from kernel paged
> pool.  Testing shows I can easily create about 30 million handles to a
> given object on this machine.  This is under win2003 with 1.25GB RAM,
> which gives it a paged pool limit of 352MB.
> 
> I tried going up to 2 max_connections, and still blew postmaster's
> VM space long before paged pool was exhausted.  I couldn't test any
> higher values, as there's some interaction between max_connections and
> shared_buffers that prevents it from mapping the buffer contiguously.
> 
> Something's missing though, since I'm not hitting the same issue you
> are.  How are you generating the connections?  I just have an app
> calling PQconnectdb() in a loop, but I guess that's not good enough.

Yeah, something is obviously missing. Are you guys on exactly the
same Windows versions? WRT both version and service pack. Anybody on x64
Windows?

Another thing worth testing - check if the amount of shared memory used
makes a noticeable difference. Try both very small and very large values.

I don't think the paged pool is the problem - I think it's the nonpaged
pool. Would be interesting to track that one in the failing case (using
performance monitor, up to the point where it fails). And the nonpaged
one is smaller... If that looks like it's the problem, it could be
helpful to do a pooltag trace on it (see for example
http://blogs.msdn.com/ntdebugging/archive/2006/12/18/Understanding-Pool-Consumption-and-Event-ID_3A00_--2020-or-2019.aspx)

//Magnus



Re: [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-21 Thread Magnus Hagander
Trevor Talbot wrote:
> On 10/17/07, Magnus Hagander <[EMAIL PROTECTED]> wrote:
>> On Wed, Oct 17, 2007 at 02:40:14AM -0400, Tom Lane wrote:
> 
>>> Maybe we should put an #ifdef WIN32 into guc.c to limit max_connections
>>> to something we know the platform can stand?  It'd be more comfortable
>>> if we understood exactly where the limit was, but I think I'd rather
>>> have an "I'm sorry Dave, I can't do that" than random-seeming crashes.
>> Yeah, that's probably a good idea - except we never managed to figure out
>> where the limit is. It appears to vary pretty wildly between different
>> machines, for reasons we don't fully understand (total RAM has some effect
>> on it, but that's not the only factor, for example)
> 
> I tried generating idle connections in an effort to reproduce
> Laurent's problem, but I ran into a local limit instead: for each
> backend, postmaster creates a thread and burns 4MB of its 2GB address
> space.  It fails around 490.

Oh, that's interesting. That's actually a side effect of us increasing
the stack size for the postgres.exe executable in order to work on other
things. By default, it burns 1MB/thread, but ours will do 4MB. We never
really thought about the problem that it could run us out of address space.
Unfortunately, that size can't be changed in the CreateThread() call -
only the initially committed size can be changed there.

There are two ways to get around it - one is not using a thread for each
backend, but a single thread that handles them all and then some sync
objects around it. We originally considered this but said we won't
bother changing it because the current way is simpler, and the overhead
of a thread is tiny compared to a process. I don't think anybody even
thought about the fact that it'd run you out of address space...

The other way is to finish off win64 support :-) Which I plan to look
at, but I don't think that alone should be considered a solution.

The question is whether it's worth fixing that part, if it will just fall
down for other reasons before we reach those 500 connections anyway. Can
you try having your program actually run some queries and so on, and not
just do a PQconnect? To see if it falls over then, because it's been
doing more?


> Laurent's issue must depend on other load characteristics.  It's
> possible to get a trace of DLL loads, but I haven't found a
> noninvasive way of doing that.  It seems to require a debugger be
> attached.

AFAIK, it does require that, yes.

//Magnus



Re: [GENERAL] keeping an index in memory

2007-10-21 Thread Bill Moran
Rajarshi Guha <[EMAIL PROTECTED]> wrote:
>
> Hi, relating to my previous queries on doing spatial searches on 10M  
> rows, it seems that most of my queries return within 2 minutes.  
> Generally this is not too bad, though faster is always better.
> 
> Interestingly, it appears that the CUBE index for the table in  
> question is about 3GB (the table itself is about 14GB). Not knowing  
> the details of the postgres internals, I assume that when a query  
> tries to use the index, it will need to read a 3GB file. Is this a  
> correct assumption?
> 
> In such a situation, is there a way to keep the index in memory? My  
> machine has 8GB installed and currently has about 7.4GB free RAM (64  
> bit linux 2.6.9)

Free or cached/buffered?  Your OS should be using most of that to
buffer disk blocks.

> A side effect of the size of the index is that if I do a query that  
> performs a seq scan (say using cube_distance) it takes longer than  
> when an index is used, but not significantly longer. And this is on a  
> 10M row table.
> 
> What strategies do people follow when the index becomes very big?

What version of PG are you using and what is your shared_buffers setting?

With 8G of RAM, you should start with shared_buffers around 2 - 3G, if
you're using a modern version of PG.  With that much shared memory, a
large portion of that index should stay in RAM, as long as it's being
used often enough that PG doesn't swap it for other data.
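
For illustration only, a starting point in postgresql.conf for an 8GB box
might look something like this (rough sketch values to tune from, not a
prescription for any particular workload):

shared_buffers = 2GB           # 8.x handles large buffer caches well
effective_cache_size = 6GB     # roughly what the OS will keep cached anyway
maintenance_work_mem = 256MB   # helps index builds and VACUUM on big tables

Note that changing shared_buffers requires a server restart, and on Linux the
kernel's SHMMAX usually has to be raised to match.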

-- 
Bill Moran
http://www.potentialtech.com



Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread Bill Moran
snacktime <[EMAIL PROTECTED]> wrote:
>
> I'm working through the architecture design for a new product.  We
> have a small group working on this.  It's a web app that will be using
> ruby on rails.  The challenge I'm running into is that the latest
> conventional wisdom seems to be that since obviously databases don't
> scale on the web, you should just not use them at all.

Who are the people saying this?  It doesn't sound very wise to me.

Where are they proposing to put the data, if not in a database?  That's
what I'd like to know.

> I have a group
> of otherwise very bright people trying to convince me that a rdbms is
> not a good place to store relational data because eventually it won't
> scale.

What is _their_ evidence?

> And of course we don't even have version 1 of our product out
> of the door.

E.S. Raymond's "The Art of UNIX Programming": Rule #15: Write a
prototype before you optimize.

Nothing is funnier than watching people try to performance optimize
software that hasn't even been written yet.  Very few people are
smart enough to know where the performance bottlenecks will be before
they've coded anything.  If they insist on doing it wrong, at least
you'll have a good laugh.

> I'll admit we do have a very good chance of actually
> getting tons of traffic, but my position is to use a rdbms for
> relational data, and then if and when it won't scale any more, deal
> with it then.

That's sane.

> So what would really help me is some real world numbers on how
> postgresql is doing in the wild under pressure.  If anyone cares to
> throw some out I would really appreciate it.

http://people.freebsd.org/~kris/scaling/

Lots of interesting graphs on that page ... most of them seem to indicate
that RDBMSs scale rather nicely.

-- 
Bill Moran
http://www.potentialtech.com



Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread andy
I'll agree with Bill's response...  If they don't want an RDBMS, what do 
they want?  If they know of something that scales better and is faster, 
I'll bet they can make a lot of money.  Lots of high-traffic sites would 
love to hear what they think.


> conventional wisdom seems to be that since obviously databases don't
> scale on the web,

Conventional?  No, I don't think so.  If you have 200 Gig of data, 
what's going to search it faster than an RDBMS?


If you have 200 Gig of data, with very intensive database queries, what 
scales better than having one web server round-robin requests to 10 
database servers?


I think the conventional wisdom is that non-database people cannot set up 
a database to run quickly to save their lives.  And then they blame the database.


-Andy

snacktime wrote:

I'm working through the architecture design for a new product.  We
have a small group working on this.  It's a web app that will be using
ruby on rails.  The challenge I'm running into is that the latest
conventional wisdom seems to be that since obviously databases don't
scale on the web, you should just not use them at all.  I have a group
of otherwise very bright people trying to convince me that a rdbms is
not a good place to store relational data because eventually it won't
scale.  And of course we don't even have version 1 of our product out
of the door.  I'll admit we do have a very good chance of actually
getting tons of traffic, but my position is to use a rdbms for
relational data, and then if and when it won't scale any more, deal
with it then.

So what would really help me is some real world numbers on how
postgresql is doing in the wild under pressure.  If anyone cares to
throw some out I would really appreciate it.

Chris






Re: [GENERAL] Inheritance foreign key unexpected behaviour

2007-10-21 Thread Scott Marlowe
On 10/20/07, M. van Egmond <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I'm trying to use table inheritance in my database. I need it because I want
> to be able to link any object in the database to another. So I created a
> table my_object which has a serial, nothing more. All the other tables in
> the system are inherited from this my_object table. I'm having difficulties
> adding foreign keys to the tables. This is my test setup:

From the inheritance docs at
http://www.postgresql.org/docs/8.2/static/ddl-inherit.html

A serious limitation of the inheritance feature is that indexes
(including unique constraints) and foreign key constraints only apply
to single tables, not to their inheritance children. This is true on
both the referencing and referenced sides of a foreign key constraint.
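
A minimal sketch of how that bites (the table names here are made up, not
taken from the original post):

CREATE TABLE my_object (id serial PRIMARY KEY);
CREATE TABLE customer (name text) INHERITS (my_object);
CREATE TABLE note (
    body      text,
    object_id integer REFERENCES my_object (id)
);

-- the row is physically stored in customer, not in my_object
INSERT INTO customer (name) VALUES ('acme');

-- fails with a foreign key violation: the FK check only sees rows
-- stored in my_object itself, never rows in its children
INSERT INTO note (body, object_id)
     VALUES ('hello', currval('my_object_id_seq'));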



Re: [GENERAL] keeping an index in memory

2007-10-21 Thread Martijn van Oosterhout
On Sun, Oct 21, 2007 at 07:36:00AM -0400, Bill Moran wrote:
> What version of PG are you using and what is your shared_buffers setting?
> 
> With 8G of RAM, you should start with shared_buffers around 2 - 3G, if
> you're using a modern version of PG.  With that much shared memory, a
> large portion of that index should stay in RAM, as long as it's being
> used often enough that PG doesn't swap it for other data.

With that much memory, the index is likely to remain in memory no
matter what size shared_memory he has. Anything in shared_memory is
going to be in the system cache anyway. I wonder if there's something
else we haven't been told, like how big the actual table is and whether
there are any other large tables/indexes.

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.




Re: [GENERAL] keeping an index in memory

2007-10-21 Thread Rajarshi Guha


On Oct 21, 2007, at 7:36 AM, Bill Moran wrote:


Rajarshi Guha <[EMAIL PROTECTED]> wrote:


Hi, relating to my previous queries on doing spatial searches on 10M
rows, it seems that most of my queries return within 2 minutes.
Generally this is not too bad, though faster is always better.

Interestingly, it appears that the CUBE index for the table in
question is about 3GB (the table itself is about 14GB). Not knowing
the details of the postgres internals, I assume that when a query
tries to use the index, it will need to read a 3GB file. Is this a
correct assumption?

In such a situation, is there a way to keep the index in memory? My
machine has 8GB installed and currently has about 7.4GB free RAM (64
bit linux 2.6.9)


Free or cached/buffered?  Your OS should be using most of that to
buffer disk blocks.


Aah, correct. Yes they are cached/buffered


A side effect of the size of the index is that if I do a query that
performs a seq scan (say using cube_distance) it takes longer than
when an index is used, but not significantly longer. And this is on a
10M row table.

What strategies do people follow when the index becomes very big?


What version of PG are you using and what is your shared_buffers  
setting?


8.2.5

My original shared_buffers setting was 128MB.


With 8G of RAM, you should start with shared_buffers around 2 - 3G, if
you're using a modern version of PG.


I can do that but I'm a little confused. Earlier postings on the list  
indicate that shared_buffers should be about 10% of the system RAM  
and that effective_cache_size can be a large fraction of RAM.


As a result I had effective_cache_size set to 2500MB

Thanks for the pointers

---
Rajarshi Guha  <[EMAIL PROTECTED]>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04  06F7 1BB9 E634 9B87 56EE
---
How I wish I were what I was when I wished I were what I am.





Re: [GENERAL] keeping an index in memory

2007-10-21 Thread Rajarshi Guha


On Oct 21, 2007, at 10:40 AM, Martijn van Oosterhout wrote:


On Sun, Oct 21, 2007 at 07:36:00AM -0400, Bill Moran wrote:
What version of PG are you using and what is your shared_buffers  
setting?


With 8G of RAM, you should start with shared_buffers around 2 -  
3G, if

you're using a modern version of PG.  With that much shared memory, a
large portion of that index should stay in RAM, as long as it's being
used often enough that PG doesn't swap it for other data.


With that much memory, the index is likely to remain in memory no
matter what size shared_memory he has. Anything in shared_memory is
going to be in the system cache anyway. I wonder if there's something
else we haven't been told, like how big the actual table is and whether
there are any other large tables/indexes.


The table itself is about 10M rows corresponding to 14GB. The only  
other index on this table is a btree index whose size is ~300MB. The  
machine is not running anything else.


Now, it might just be the case that given the size of the index, I
cannot make bounding box queries (which will use the CUBE index) go
any faster. But I am surprised that the other type of query
(using cube_distance, which by definition must use a seq scan) only
takes slightly longer. If nothing else, scanning through 14GB of data
should be several times slower than scanning through 3GB of data.



---
Rajarshi Guha  <[EMAIL PROTECTED]>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04  06F7 1BB9 E634 9B87 56EE
---
All laws are simulations of reality.
-- John C. Lilly





Re: [GENERAL] keeping an index in memory

2007-10-21 Thread Tom Lane
Rajarshi Guha <[EMAIL PROTECTED]> writes:
> Now, it might just be the case that given the size of the index, I  
> cannot make bounding box queries (which will use the CUBE index) go  
> any faster. But I am surprised that the other type of query
> (using cube_distance which by definition must use a seq scan) is only  
> slightly longer. If nothing else, scanning through 14GB of data  
> should be 3 times slower than scanning through 3GB of data.

A single index probe should not touch anything like all of the index ---
unless your data is such that the index is very non-optimally laid out.
GiST should work well if there are lots of subsets of the data that
have bounding boxes disjoint from other subsets'.  If not, maybe you
need to reconsider your data representation.

Have you done any examination of how much of the index gets touched
during a typical query?  I'd try turning on stats_block_level and see
how the delta in pg_statio_all_indexes.idx_blks_read compares to the
index size.
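
For example, something along these lines (a sketch; the index name is made up):

-- with stats_block_level = on, snapshot the counters first
SELECT idx_blks_read, idx_blks_hit
  FROM pg_statio_all_indexes
 WHERE indexrelname = 'mytable_cube_idx';

-- run a typical bounding-box query, then repeat the SELECT above and
-- compare the increase in idx_blks_read against the index size:
SELECT pg_size_pretty(pg_relation_size('mytable_cube_idx'));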

regards, tom lane



[GENERAL] Photos from the PostgreSQL Conference Fall 2007

2007-10-21 Thread Daniel Browning
The PostgreSQL Conference Fall 2007 was informative, fun, and well-executed.
Thanks to Selena Deckelmann, Joshua Drake, and everyone else who made it 
happen. Here are my photos of the event:

http://db.endpoint.com/pgcon07/

--
Daniel Browning
End Point Corporation
http://www.endpoint.com/



Re: [GENERAL] keeping an index in memory

2007-10-21 Thread Gregory Stark
"Rajarshi Guha" <[EMAIL PROTECTED]> writes:

> The table itself is about 10M rows corresponding to 14GB. 

Each row is on average 1.4kB ? Perhaps you should send more details of the
table definition and the typical size of each column. It's possible you have
the columns you're selecting on being stored out of line ("toasted") which
would hurt performance if you're often accessing many of those columns.

If it's not true then you may have a lot of dead space in your table which
would decrease performance. 

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [GENERAL] keeping an index in memory

2007-10-21 Thread Scott Marlowe
On 10/21/07, Rajarshi Guha <[EMAIL PROTECTED]> wrote:
>
> > With 8G of RAM, you should start with shared_buffers around 2 - 3G, if
> > you're using a modern version of PG.
>
> I can do that but I'm a little confused. Earlier postings on the list
> indicate that shared_buffers should be about 10% of the system RAM
> and that effective_cache_size can be a large fraction of RAM.

That was true with 7.4 and before because their cache management
wasn't very efficient.  With 8.0 and above, PostgreSQL can handle much
larger shared_buffers sizes.



[GENERAL] Explicit Named Indexes for Constraints

2007-10-21 Thread Jeff Larsen
I'm new to PostgreSQL, a veteran with Informix. We are considering a migration.

In Informix, it is recommended to create explicitly named indexes on
columns for primary and foreign keys prior to creating the
constraints. Otherwise, the server creates the indexes for you with
meaningless names. This is not generally a problem, except that when you
dump the schema, you get all the constraint indexes in the DDL,
exported as if they were explicitly created, but with the
server-generated names. It's a mess to sort through.

What's the recommended procedure in PG?  At first glance it appears
that PG hides the implicit indexes from you at all times, including
pg_dump. So it appears that explicit index creation can be skipped
without leaving you with a mess later. Is this just a non-issue in PG?

Jeff



Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread Dave Cramer

snacktime wrote:

I'm working through the architecture design for a new product.  We
have a small group working on this.  It's a web app that will be using
ruby on rails.  The challenge I'm running into is that the latest
conventional wisdom seems to be that since obviously databases don't
scale on the web, you should just not use them at all.  I have a group
of otherwise very bright people trying to convince me that a rdbms is
not a good place to store relational data because eventually it won't
scale.  And of course we don't even have version 1 of our product out
of the door.  I'll admit we do have a very good chance of actually
getting tons of traffic, but my position is to use a rdbms for
relational data, and then if and when it won't scale any more, deal
with it then.

So what would really help me is some real world numbers on how
postgresql is doing in the wild under pressure.  If anyone cares to
throw some out I would really appreciate it.

  

I've got a client doing 18M page views/ day and postgresql isn't really
sweating.

Dave

Chris


  






Re: [GENERAL] 8.2.3: Server crashes on Windows using Eclipse/Junit

2007-10-21 Thread Rainer Bauer
Magnus Hagander wrote:

>Trevor Talbot wrote:
>> On 10/20/07, Rainer Bauer <[EMAIL PROTECTED]> wrote:
>> 
>>> Anyway, the problem is the number of semaphores created by Postgres:
>>> Every backend creates at least 4* semaphores. Just
>>> increase max_connections to an unusually high value (say 1) and
>>> start creating new connections while monitoring the handle count.
>> 
>> Hmm, they're actually the same semaphores, so the only cost is for
>> slots in each process's handle table, which comes from kernel paged
>> pool.  Testing shows I can easily create about 30 million handles to a
>> given object on this machine.  This is under win2003 with 1.25GB RAM,
>> which gives it a paged pool limit of 352MB.

On my system I can only create about 4 million semaphores.

>> I tried going up to 2 max_connections, and still blew postmaster's
>> VM space long before paged pool was exhausted.  I couldn't test any
>> higher values, as there's some interaction between max_connections and
>> shared_buffers that prevents it from mapping the buffer contiguously.
>> 
>> Something's missing though, since I'm not hitting the same issue you
>> are.  How are you generating the connections?  I just have an app
>> calling PQconnectdb() in a loop, but I guess that's not good enough.

I am using the ASCII version of the psqlODBC driver version 8.2.4.2 to
establish the test connections.

>Yeah, something is obviously missing. Are you guys on exactly the
>same Windows versions? WRT both version and service pack. Anybody on x64
>Windows?

No, I am using WinXP SP2 32 bit with 2GB RAM.

These are my altered settings from the default 8.2.5 Postgres installation:
ssl = on
shared_buffers = 512MB
work_mem = 16MB
maintenance_work_mem = 256MB
wal_sync_method = fsync_writethrough
checkpoint_segments = 15
checkpoint_timeout = 30min
random_page_cost = 3.0
effective_cache_size = 1GB
autovacuum_vacuum_scale_factor = 0.10
autovacuum_analyze_scale_factor = 0.05

>Another thing worth testing - check if the amount of shared memory used
>makes a noticeable difference. Try both very small and very large values.

Well I tried different shared_buffers settings, but the result was consistent:
with max_connections set to 1, I can create 150 database connections.

However, I checked the handle count at the moment the last connection fails
and it is only at 1.5 million. So it seems the handles are not the primary
problem.

Let me know if you want any other tests performed on this machine. I also have
VS2005 installed, but until now I haven't compiled Postgres here (I was
waiting for 8.3 which fully supports building with VS).

Rainer



Re: [GENERAL] Explicit Named Indexes for Constraints

2007-10-21 Thread Tom Lane
"Jeff Larsen" <[EMAIL PROTECTED]> writes:
> In Informix, it is recommended to create explicit named indexes on
> columns for primary and foreign keys prior to creating the
> constraints. Otherwise, the server create the indexes for you with
> meaningless names. This is not generally a problem, except when you
> dump the schema, you get all the constraint indexes in the DDL,
> exported as if they were explicitly created, but with the server
> generated names. It's a mess to sort through.

Ugh.  In PG, you can specify the names for server-generated indexes;
they're just the same names given to the constraints:

CREATE TABLE foo (f1 int constraint foo_primary_key primary key);

The index underlying this constraint will be named foo_primary_key.
If you leave off the "constraint name" clause then you get an
autogenerated name, but it's not so meaningless that there's a strong
need to override it --- in this example it'd be "foo_pkey".
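
To flesh that out a bit (hypothetical tables), the same clause works for
foreign keys too, and pg_dump then reproduces exactly the names you chose:

CREATE TABLE customer (
    cust_id integer CONSTRAINT customer_pk PRIMARY KEY
);
CREATE TABLE orders (
    order_id integer CONSTRAINT orders_pk PRIMARY KEY,
    cust_id  integer CONSTRAINT orders_cust_fk REFERENCES customer (cust_id)
);
-- pg_dump emits these as ALTER TABLE ... ADD CONSTRAINT customer_pk ... etc.;
-- no separate CREATE INDEX statements appear for them.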

Manual creation of indexes duplicating a constraint is definitely
*not* the thing to do in PG; you'll end up with redundant indexes.

> What's the recommended procedure in PG?  At first glance it appears
> that PG hides the implicit indexes from you at all times, including
> pg_dump.

I wouldn't say they are "hidden", you just don't need to mention them
separately in the DDL commands.

regards, tom lane



Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread paul rivers

snacktime wrote:

I'm working through the architecture design for a new product.  We
have a small group working on this.  It's a web app that will be using
ruby on rails.  The challenge I'm running into is that the latest
conventional wisdom seems to be that since obviously databases don't
scale on the web, you should just not use them at all.  I have a group
of otherwise very bright people trying to convince me that a rdbms is
not a good place to store relational data because eventually it won't
scale.  And of course we don't even have version 1 of our product out
of the door.  I'll admit we do have a very good chance of actually
getting tons of traffic, but my position is to use a rdbms for
relational data, and then if and when it won't scale any more, deal
with it then.

So what would really help me is some real world numbers on how
postgresql is doing in the wild under pressure.  If anyone cares to
throw some out I would really appreciate it.   


It sounds like the RoR people are talking about any relational database, 
and not just Postgres.  Many very busy sites do use relational 
databases successfully.  So it can work.  Many others have failed.  So it 
can fail, if the situation is exceptionally unusual or, IMHO more 
likely, it's poorly implemented.


What's the main argument of their "won't scale" stance?  Why not set up a 
test case to prove or disprove it?  I don't think anything we can 
suggest based on what we know of your project will help, unless someone 
happens to throw out a nearly identical case.


I would be surprised if avoiding a database is a better solution.  But 
regardless, I would be more worried about using a technology when most 
of the core group doesn't believe in it.  That often leads to bad 
results, regardless of whether it should.


Paul






Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread Thomas Kellerer

snacktime wrote on 21.10.2007 08:11:

I have a group
of otherwise very bright people trying to convince me that a rdbms is
not a good place to store relational data 


Hmm.

Those bright people say that a /relational/ database management system is not a 
good place to store /relational/ data?


I can't understand that reasoning...

Where else do they want to store relational data than in a RDBMS?


Thomas




Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread Tom Lane
Thomas Kellerer <[EMAIL PROTECTED]> writes:
> Where else do they want to store relational data than in a RDBMS?

Indeed.  It seems like we can hardly answer the OP's question without
asking "compared to what?"  If they're afraid an RDBMS won't scale,
what have they got in mind that they are so certain will scale?

regards, tom lane



[GENERAL] SQL spec/implementation question: UPDATE

2007-10-21 Thread Kevin Hunter
Hullo list,

A perhaps esoteric question:

Short version:

What do the specs say (if anything) about returning information from
UPDATE commands?  Or about handling update requests that don't
effectively do anything?

Longer version:

CREATE TABLE test (
  id  SERIAL NOT NULL,
  name    TEXT   NOT NULL,
  passion TEXT   NOT NULL,

  PRIMARY KEY( id )
);

INSERT INTO test (name, passion) VALUES ('colin', 'contra-dancing');
INSERT INTO test (name, passion) VALUES ('alex',  'contemplating');
INSERT INTO test (name, passion) VALUES ('kevin', 'soccer');
INSERT INTO test (name, passion) VALUES ('toby',  'biking');

BEGIN;
UPDATE test SET name = 'kevin' WHERE passion = 'soccer';
-- repeat the previous statement 5 times (or whatever)
COMMIT;

Even though the last 5 statements effectively do nothing, every UPDATE
returns "UPDATE 1".  If I do the same thing in MySQL, I get "Rows
matched: 1  Changed: 0  Warnings: 0".  (I used the INNODB engine in MySQL.)

In PHP, the {pg,mysql}_affected_rows functions return the same results:
1 from Postgres and 0 from MySQL.

So, two questions: which behavior is correct, or is it even defined?  If
Postgres' behavior is correct, why does it need to write to disk (since
the tuple isn't actually changing in value)?

Experience tells me that Postgres is probably doing the correct thing,
but it almost seems like a corner case that doesn't matter either
way, and could be just a consequence of the MVCC guarantees, etc.

TIA,

Kevin



[GENERAL] Select Command

2007-10-21 Thread Bob Pawley
I have a column with data structured as follows.

32TT - 0002
32LT- 0004
32PT-0005

Is there a way of selecting all of the rows containing LT in that column?


I have attempted variations of ' *LT* ' without success.

Bob Pawley

Re: [GENERAL] Select Command

2007-10-21 Thread Alvaro Herrera
Bob Pawley wrote:
> I have a column with data structured as follows.
> 
> 32TT - 0002
> 32LT- 0004
> 32PT-0005
> 
> Is there a way of selecting all of the rows containing LT in that column??
> 
> 
> I have attempted variations of ' *LT* ' with out success.

LIKE '%LT%' perhaps?

Or ~ 'LT' (unanchored regex)
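
A quick sketch against the sample values (assuming the column is called tag
and lives in a table called instruments):

SELECT tag FROM instruments WHERE tag LIKE '%LT%';
-- or as an unanchored, case-insensitive regex:
SELECT tag FROM instruments WHERE tag ~* 'lt';
-- both return '32LT- 0004' but not '32TT - 0002' or '32PT-0005'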

-- 
Alvaro Herrera   Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"El número de instalaciones de UNIX se ha elevado a 10,
y se espera que este número aumente" (UPM, 1972)



Re: [GENERAL] SQL spec/implementation question: UPDATE

2007-10-21 Thread andy

Kevin Hunter wrote:

Hullo list,

A perhaps esoteric question:

Short version:

What do the specs say (if anything) about returning information from
UPDATE commands?  Or about handling update request that don't
effectively do anything?

Longer version:

CREATE TABLE test (
  id  SERIAL NOT NULL,
  name    TEXT   NOT NULL,
  passion TEXT   NOT NULL,

  PRIMARY KEY( id )
);

INSERT INTO test (name, passion) VALUES ('colin', 'contra-dancing');
INSERT INTO test (name, passion) VALUES ('alex',  'contemplating');
INSERT INTO test (name, passion) VALUES ('kevin', 'soccer');
INSERT INTO test (name, passion) VALUES ('toby',  'biking');

BEGIN;
UPDATE test SET name = 'kevin' WHERE passion = 'soccer';
Previous statement 5 times (or whatever)
COMMIT;

Even though the last 5 statements effectively do nothing, every UPDATE
returns "UPDATE 1".  If I do the same thing in MySQL, I get "Rows
matched: 1  Changed: 0  Warnings: 0".  (I used the INNODB engine in MySQL.)

In PHP, the {pg,mysql}_affected_rows functions return the same results:
1 from Postgres and 0 from MySQL.

So, two questions: which behavior is correct, or is it even defined?  If
Postgres behavior is correct, why does it need to write to disk, (since
the tuple isn't actually changing in value)?

Experience tells me that Postgres is probably doing the correct thing,
but it almost seems that it could be corner case, doesn't matter either
way, and is could be just a consequence of the MVCC guarantees, etc.

TIA,

Kevin


I think you're comparing apples and oranges.  I'll bet that MySQL is 
taking a shortcut and testing the value before updating it.


The update is probably closer to:
update test set name = 'kevin' where passion = 'soccer' and name <> 'kevin';

In this case, PG, too, would only update once.

-Andy



Re: [GENERAL] Select Command

2007-10-21 Thread Adrian Klaver
On Sunday 21 October 2007 2:32 pm, Bob Pawley wrote:
> I have a column with data structured as follows.
>
> 32TT - 0002
> 32LT- 0004
> 32PT-0005
>
> Is there a way of selecting all of the rows containing LT in that column??
>
>
> I have attempted variations of ' *LT* ' with out success.
>
> Bob Pawley

select col from your_table where col LIKE '%LT%';

See also:
http://www.postgresql.org/docs/8.2/interactive/functions-matching.html#FUNCTIONS-LIKE
-- 
Adrian Klaver
[EMAIL PROTECTED]



Re: [GENERAL] SQL spec/implementation question: UPDATE

2007-10-21 Thread Tom Lane
andy <[EMAIL PROTECTED]> writes:
> I think your comparing apples and oranges.  I'll bet that mysql is 
> taking a shortcut and testing the value before updating it.

> The update is probably more close to:
> update test set name = 'kevin' where passion = 'soccer' and name <> 'kevin';

Yeah, that seems to be what they're doing.  PG does not bother to make
such a test, on the grounds that it would waste more net cycles than it
would save.  Most people are not in the habit of issuing lots of no-op
updates.

Also, if you have a case where you think that is what will happen, you
can add the WHERE-condition for yourself; whereas there is no way in
mysql to get rid of the useless test even if you know it's useless.
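
For instance, a guarded version of the earlier statement might look like this
(just a sketch of the idea):

UPDATE test
   SET name = 'kevin'
 WHERE passion = 'soccer'
   AND name IS DISTINCT FROM 'kevin';  -- skip rows that already have the value

Once the row has been set, repeated executions report "UPDATE 0" instead of
"UPDATE 1".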

regards, tom lane



Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread Guy Rouillier

Dave Cramer wrote:

snacktime wrote:

I'm working through the architecture design for a new product.  We
have a small group working on this.  It's a web app that will be using
ruby on rails.  The challenge I'm running into is that the latest
conventional wisdom seems to be that since obviously databases don't
scale on the web, you should just not use them at all.  I have a group
of otherwise very bright people trying to convince me that a rdbms is
not a good place to store relational data because eventually it won't
scale.  And of course we don't even have version 1 of our product out
of the door.  I'll admit we do have a very good chance of actually
getting tons of traffic, but my position is to use a rdbms for
relational data, and then if and when it won't scale any more, deal
with it then.

So what would really help me is some real world numbers on how
postgresql is doing in the wild under pressure.  If anyone cares to
throw some out I would really appreciate it.


I missed the original post on this, so I'm replying to Dave's response. 
 To the OP, I don't know where you obtain your conventional wisdom 
from, but I'd look for another source.  Just about any site you might 
visit that handles lots of data has a DBMS of some sort behind it; given 
that IMS and Adabas have been out of favor for 20 years, most of those 
DBMSs are relational.  So if it can work for your bank, E*Trade and 
eBay, chances are it can work for you.


As far as real world numbers, we have a data-intensive app (network data 
collection for a telecom company) that is currently inserting about 16 
million rows a day.  I benchmarked PG for that app and with some 
tweaking, PG could handle it.  The current app uses stored procedures 
for all inserts, and PG didn't do well with that approach; substituting 
embedded inserts fixed that problem.  So PG can definitely "handle" very 
large transaction volumes.  As with any DBMS and any application, you 
may encounter challenges (like the one I point out with using stored 
procs for high-volume inserts) that you'll need to address with some 
thought.


--
Guy Rouillier



Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread Ow Mun Heng

On Sun, 2007-10-21 at 22:54 -0400, Guy Rouillier wrote:
>  The current app uses stored procedures 
> for all inserts, and PG didn't do well with that approach; substituting 
> embedded inserts fixed that problem.  So PG can definitely "handle" very 


Can you explain what embedded inserts are?




Re: [GENERAL] SQL spec/implementation question: UPDATE

2007-10-21 Thread Ron Johnson

On 10/21/07 17:52, Tom Lane wrote:
> andy <[EMAIL PROTECTED]> writes:
>> I think your comparing apples and oranges.  I'll bet that mysql is 
>> taking a shortcut and testing the value before updating it.
> 
>> The update is probably more close to:
>> update test set name = 'kevin' where passion = 'soccer' and name <> 'kevin';
> 
> Yeah, that seems to be what they're doing.  PG does not bother to make
> such a test, on the grounds that it would waste more net cycles than it
> would save.  Most people are not in the habit of issuing lots of no-op
> updates.
> 
> Also, if you have a case where you think that is what will happen, you
> can add the WHERE-condition for yourself; whereas there is no way in
> mysql to get rid of the useless test even if you know it's useless.

Not to bash MySQL (much...) but ISTM that this is another example of
MySQL playing fast and loose with SQL.

IOW, the RDBMS shouldn't try to out-think me even if I seem to
be doing something odd.

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!




[GENERAL] command to view the tables

2007-10-21 Thread Shwe Yee Than
Hi all,
 
 I'm new to PostgreSQL and am running a PostgreSQL database on a Linux platform.  I 
just want to know the command to view all the tables inside a specific database.  
Can anyone tell me?
 
 Thanks!
 
 ~Shwe~
 

Re: [GENERAL] command to view the tables

2007-10-21 Thread Ow Mun Heng

On Sun, 2007-10-21 at 20:35 -0700, Shwe Yee Than wrote:

> I'm new to postgresql.  I'm running postgresql database on Linux
> platform.  I just wanna know the command to view all the tables inside
> a specific database.  Can anyone tell me?

\d would be what you use in psql

or just do a \? for help..
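
For example, in psql (the database and table names here are made up), or the
equivalent catalog query from any client:

\c mydb      -- connect to the database
\dt          -- list ordinary tables in the current database
\d mytable   -- describe one table

-- or in plain SQL:
SELECT schemaname, tablename
  FROM pg_tables
 WHERE schemaname NOT IN ('pg_catalog', 'information_schema');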






[GENERAL] Indexes & Primary Keys (based on the same columns)

2007-10-21 Thread Ow Mun Heng
I'm wondering if what I'm doing is redundant.

I have a primary key on columns (A,B,C,D)
and I've also defined an index based on the same columns (A,B,C,D)

and sometimes in the query explain, I see the pkey being used for the
scan instead of the index.

So.. That made me think perhaps the additional index on the _same_
parameter is redundant.

Appreciate comments.





Re: [GENERAL] Indexes & Primary Keys (based on the same columns)

2007-10-21 Thread Joshua D. Drake

Ow Mun Heng wrote:

I'm wondering if what I'm doing is redundant.

I have a primary key on columns (A,B,C,D)
and I've also defined an index based on the same columns (A,B,C,D)

and sometimes in the query explain, I see the pkey being used for the
scan instead of the index.

So.. That made me think perhaps the additional index on the _same_
parameter is redundant.


A primary key creates an index so having a second index with the same 
definition is redundant.
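
One way to see the duplication and clean it up (a sketch; the table and index
names are placeholders):

-- lists both the primary key's index and the manually created one
SELECT indexname, indexdef
  FROM pg_indexes
 WHERE tablename = 'mytable';

-- drop only the manually created duplicate; the primary key's index stays
-- and serves the same scans
DROP INDEX mytable_a_b_c_d_idx;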




Appreciate comments.









Re: [GENERAL] Indexes & Primary Keys (based on the same columns)

2007-10-21 Thread Ow Mun Heng

On Sun, 2007-10-21 at 20:49 -0700, Joshua D. Drake wrote:
> Ow Mun Heng wrote:
> > I'm wondering if what I'm doing is redundant.
> > 
> > I have a primary key on columns (A,B,C,D)
> > and I've also defined an index based on the same columns (A,B,C,D)
> > 
> > and sometimes in the query explain, I see the pkey being used for the
> > scan instead of the index.
> > 
> > So.. That made me think perhaps the additional index on the _same_
> > parameter is redundant.
> 
> A primary key creates an index so having a second index with the same 
> definition is redundant.

Many thanks for the confirmation.

I'm dropping them...



Re: [GENERAL] Indexes & Primary Keys (based on the same columns)

2007-10-21 Thread Tom Lane
Ow Mun Heng <[EMAIL PROTECTED]> writes:
> I'm wondering if what I'm doing is redundant.
> I have a primary key on columns (A,B,C,D)
> and I've also defined an index based on the same columns (A,B,C,D)

Yup, 100% redundant.

regards, tom lane



Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread Joshua D. Drake

Dave Cramer wrote:

snacktime wrote:

I'm working through the architecture design for a new product.  We
have a small group working on this.  It's a web app that will be using
ruby on rails.  The challenge I'm running into is that the latest
conventional wisdom seems to be that since obviously databases don't
scale on the web, you should just not use them at all.  I have a group
of otherwise very bright people trying to convince me that a rdbms is
not a good place to store relational data because eventually it won't
scale.  And of course we don't even have version 1 of our product out
of the door.  I'll admit we do have a very good chance of actually
getting tons of traffic, but my position is to use a rdbms for
relational data, and then if and when it won't scale any more, deal
with it then.

So what would really help me is some real world numbers on how
postgresql is doing in the wild under pressure.  If anyone cares to
throw some out I would really appreciate it.

  

I've got a client doing 18M page views/ day and postgresql isn't really
sweating.


We have a client doing 15k tps via a website. I would say these "very 
bright people" are "very bright but excessively ignorant".


Joshua D. Drake



Dave

Chris


  










Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread Harvey, Allan AC
> As far as real world numbers, we have a data-intensive app 
> (network data 
> collection for a telecom company) that is currently inserting 
> about 16 
> million rows a day.  I benchmarked PG for that app and with some 
> tweaking, PG could handle it.
Me too, though not telco: 5.5 million rows per day across 2240 tables and 4 
databases, with 50 days kept on-line.
Tweaked autovacuum to keep up with the daily deletes, and fsync is off - the 
slow disks aren't PG's fault, but we have a UPS.

Allan





Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread Guy Rouillier

Ow Mun Heng wrote:

On Sun, 2007-10-21 at 22:54 -0400, Guy Rouillier wrote:
 The current app uses stored procedures 
for all inserts, and PG didn't do well with that approach; substituting 
embedded inserts fixed that problem.  So PG can definitely "handle" very 



Can you explain what is embedded inserts?


Insert via embedded SQL insert statements in our Java code, as opposed 
to embedded SQL stored proc invocations, which in turn do the inserts. 
The existing code base used the latter approach, which didn't work well 
with PG.  I suspect it has to do with PG's stored proc overload 
capability.  The short of it is that *any* DBMS you use will have its 
own quirks that you become acquainted with and learn to work around.


--
Guy Rouillier



Re: [GENERAL] looking for some real world performance numbers

2007-10-21 Thread Ow Mun Heng

On Mon, 2007-10-22 at 01:23 -0400, Guy Rouillier wrote:
> Ow Mun Heng wrote:
> > On Sun, 2007-10-21 at 22:54 -0400, Guy Rouillier wrote:
> >>  The current app uses stored procedures 
> >> for all inserts, and PG didn't do well with that approach; substituting 
> >> embedded inserts fixed that problem.  So PG can definitely "handle" very 
> > 
> > 
> > Can you explain what is embedded inserts?
> 
> Insert via embedded SQL insert statements in our Java code, as opposed 
> to embedded SQL stored proc invocations, which in turn do the inserts. 

AH.. so you sort of hard-code

insert into table values($x,$y,$z,$w)

instead of 

execute sp_insert($x,$y,$z,$w)

> The existing code base used the latter approach, which didn't work well 
> with PG.  I suspect it has to do with PG's stored proc overload 
> capability.  The short of it is that *any* DBMS you use will have its 
> own quirks that you become acquainted with and learn to work around.

Yeah..

I found out that PG doesn't like this statement.

where audit_key_dtime >= (select last_refreshed from denorm_log where tablename 
= 'zon')
and   audit_key_dtime <  (select last_refreshed + refresh_interval from 
denorm_log where tablename = 'zon')

Index Cond: ((audit_key_dtime >= $0) AND (audit_key_dtime < $1))

but prefers this

where audit_key_dtime >= '2007-08-08 18:00:00'
and   audit_key_dtime <  '2007-08-08 18:01:00'

Index Cond: ((audit_key_dtime >= '2007-08-08 18:00:00'::timestamp without time 
zone) 
  AND (audit_key_dtime < '2007-08-08 18:01:00'::timestamp without time 
zone))

even though they cover the same 1-minute interval.
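
One common workaround (a sketch reusing the names from the snippet above) is
to fetch the window boundaries first and then run the range scan with literal
values, so the planner sees real timestamps at plan time:

-- step 1: fetch the window for this table
SELECT last_refreshed                    AS lo,
       last_refreshed + refresh_interval AS hi
  FROM denorm_log
 WHERE tablename = 'zon';

-- step 2: issue the main query with those values substituted in, e.g.
--   ... WHERE audit_key_dtime >= :lo AND audit_key_dtime < :hi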




Re: [GENERAL] SQL spec/implementation question: UPDATE

2007-10-21 Thread Kevin Hunter
At 6:00p -0400 on 21 Oct 2007, andy wrote:
> I think your comparing apples and oranges. 

That's why I ask the list!  To learn when I'm doing that.  ;-)

> I'll bet that mysql is
> taking a shortcut and testing the value before updating it.

Heh.  And as Tom points out downthread, that "shortcut" probably doesn't
gain anything in the long run.

Kevin



Re: [GENERAL] SQL spec/implementation question: UPDATE

2007-10-21 Thread Kevin Hunter
At 6:52p -0400 on 21 Oct 2007, Tom Lane wrote:
> andy <[EMAIL PROTECTED]> writes:
>> I think your comparing apples and oranges.  I'll bet that mysql is 
>> taking a shortcut and testing the value before updating it.
> 
>> The update is probably more close to:
>> update test set name = 'kevin' where passion = 'soccer' and name <> 'kevin';
> 
> Yeah, that seems to be what they're doing.  PG does not bother to make
> such a test, on the grounds that it would waste more net cycles than it
> would save.  Most people are not in the habit of issuing lots of no-op
> updates.

Makes sense.  In this particular case, it's a moot point as it's
guaranteed to update a single row only (or none at all), but I was idly
curious.  In fact, for the application in question, having the behavior
of Postgres would make it possible to clean up the application logic a
bit, but eh.  I'm stuck with MySQL for this project.  :-(

Kevin



Re: [GENERAL] SQL spec/implementation question: UPDATE

2007-10-21 Thread Kevin Hunter
At 11:33p -0400 on 21 Oct 2007, Ron Johnson wrote:
> Not to bash MySQL (much...) but ISTM that this is another example of
> MySQL playing fast and loose with SQL.

I don't have a spec guide handy.  Does this mean that MySQL is indeed
showing incorrect behavior?  I like what's been said upthread:

The query MySQL apparently sees:
UPDATE test SET name = 'kevin' WHERE passion = 'soccer' AND name <> 'kevin';

The query as I wrote:
UPDATE test SET name = 'kevin' WHERE passion = 'soccer';

Even though it is, in fact, the same, it should still be updated because
that's what I said.  Is that what you're saying?  Is that the spec?

Thanks,

Kevin



Re: [GENERAL] SQL spec/implementation question: UPDATE

2007-10-21 Thread Tom Lane
Kevin Hunter <[EMAIL PROTECTED]> writes:
> I don't have handy a spec guide.  Does this mean that MySQL is indeed
> showing incorrect behavior?

I think this is really outside the spec.  The relevant section of SQL92
seems to be 13.10 <update statement: searched>:

     <update statement: searched> ::=
          UPDATE <table name>
            SET <set clause list>
            [ WHERE <search condition> ]

...

         b) If a <search condition> is specified, then it is applied
            to each row of T with the <search condition> bound to that row,
            and the object rows are those rows for which the result of
            the <search condition> is true. The <search condition> is
            effectively evaluated for each row of T before updating any
            row of T.

...

     8) Each object row is updated as specified by each <set clause>.

There is not anything I can see addressing whether an "update" should or
should not be considered to occur if a target column happens to not
change as a result of a commanded update.  There is certainly not
anything specifically requiring mysql's behavior, but I don't see
anything specifically rejecting it either.

There is

 4) If the set of object rows is empty, then a completion condition
is raised: no data.

but this refers to the case where the given WHERE condition selects
no rows; I see no argument for claiming that it involves whether the
new field values match the old ones.

regards, tom lane
