Re: [GENERAL] Native DB replication for PG

2010-05-01 Thread Greg Smith

Scott Marlowe wrote:

I tested 8.4 what I thought was fairly hardly last year only
to have 8.4.1 die under the same load that 8.3 handled without a
problem, and reverted to the known working version putting testing
8.4.1 on hold.

So to ME, the choice is a fully functional 8.3 installation that has
NO problems with free space map because of configuration choices, or
an 8.4 with a known (to me) issue of crashing and dying.
  


FYI, since December of 2009 (release of 8.4.2) there have been 10 bugs 
fixed with the word "crash" in their description, as well as 7 memory 
leaks that could potentially lead to crash.  Even six months ago I was 
still hesitant to push 8.4 toward production systems; the number of bugs 
shaken out in the last two releases has been substantial.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Inheritance efficiency

2010-05-01 Thread Greg Smith

Vincenzo Romano wrote:

While I can agree that "Enterprise grade" is a buzzword, it does mean
something: "very large amount of data" among other.
  


http://thedailywtf.com/Articles/Bitten_by_the_Enterprise_Bug.aspx

It's quite straighforward to get PostgreSQL up and running with many 
terabytes of data, so long as you respect the design trade-offs in some 
options.  What you can't do is say those are wrong and reject 
alternative implementation suggestions just because they're not 
"enterprise".  Whenever anyone uses that word at me, I mentally replace 
it with "super duper", and



There's no "fundamentally good design", but only a design which takes
limitations and constraints into account.
  


You mean like taking into account the fact that partitioning performance 
has an unavoidable trade-off, where you have to balance the query 
optimizer overhead of supporting many partitions against the improvement 
from splitting data into smaller pieces? 


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Native DB replication for PG

2010-05-01 Thread Scott Marlowe
On Sat, May 1, 2010 at 12:59 AM, Greg Smith  wrote:
> Scott Marlowe wrote:
>>
>> I tested 8.4 what I thought was fairly hardly last year only
>> to have 8.4.1 die under the same load that 8.3 handled without a
>> problem, and reverted to the known working version putting testing
>> 8.4.1 on hold.
>>
>> So to ME, the choice is a fully functional 8.3 installation that has
>> NO problems with free space map because of configuration choices, or
>> an 8.4 with a known (to me) issue of crashing and dying.
>>
>
> FYI, since December of 2009 (release of 8.4.2) there have been 10 bugs fixed
> with the word "crash" in their description, as well as 7 memory leaks that
> could potentially lead to crash.  Even six months ago I was still hesitant
> to push 8.4 toward production systems; the number of bugs shaken out in the
> last two releases has been substantial.

Exactly, which is why I posted a followup saying I knew it was quite
possible the bug had been fixed.  But hope is not a method, so until I
can test to be sure the problem I was hitting was one of the ones
fixed, I'll keep production on 8.3 for now.  Because for me, it's
proven itself reliable over ~2 years of very heavy use.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Native DB replication for PG

2010-05-01 Thread Devrim GÜNDÜZ
On Sat, 2010-05-01 at 02:59 -0400, Greg Smith wrote:
> FYI, since December of 2009 (release of 8.4.2) there have been 10 bugs
> fixed with the word "crash" in their description, as well as 7 memory
> leaks that could potentially lead to crash.  Even six months ago I was
> still hesitant to push 8.4 toward production systems; the number of
> bugs shaken out in the last two releases has been substantial. 

FWIW, we are using 8.4.3 under heavy (thousands of transactions per
second) update+delete load (as compared to inserts), and new FSM helped
us a lot -- also it did not crash even for a second. We got rid of all
performance related issues after switching from 8.3 to 8.4. 
-- 
Devrim GÜNDÜZ
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
PostgreSQL RPM Repository: http://yum.pgrpms.org
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz


signature.asc
Description: This is a digitally signed message part


Re: [GENERAL] Native DB replication for PG

2010-05-01 Thread Scott Marlowe
2010/5/1 Devrim GÜNDÜZ :
> On Sat, 2010-05-01 at 02:59 -0400, Greg Smith wrote:
>> FYI, since December of 2009 (release of 8.4.2) there have been 10 bugs
>> fixed with the word "crash" in their description, as well as 7 memory
>> leaks that could potentially lead to crash.  Even six months ago I was
>> still hesitant to push 8.4 toward production systems; the number of
>> bugs shaken out in the last two releases has been substantial.
>
> FWIW, we are using 8.4.3 under heavy (thousands of transactions per
> second) update+delete load (as compared to inserts), and new FSM helped
> us a lot -- also it did not crash even for a second. We got rid of all
> performance related issues after switching from 8.3 to 8.4.

I tested 8.4.1 under three different loads.  A mostly transactional
load, a most write load for storing web site statistics, and a
mnogosearch db with moderate load.  They all ran great for a week or
so in testing.  In order to put it into production we started with the
least important of those dbs, statistics.  We can afford to lose an
hour here or there no a saturday, so we tried it.  And it worked fine
for 3 weeks when pg 8.4.1, on the same machine that was still running
8.3.latest at the time, crashed.  Happened again a few days later, and
we switched back over during maintenance.  The old 8.3 db for stats
was still there so it was just a matter of taking down stats, loading
in the data from the 8.4 db and switching it over.

Mnogo was switched over at the same time as the stats db and has never
once crashed the 8.4.x db it is running on.  So whatever we were doing
that made 8.4 crash may have had to do with the partitioned nature of
our stats db, or some other issue unique to our stats db.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Inheritance efficiency

2010-05-01 Thread Vincenzo Romano
2010/5/1 Greg Smith :
> Vincenzo Romano wrote:
>>
>> While I can agree that "Enterprise grade" is a buzzword, it does mean
>> something: "very large amount of data" among other.
>>
>
> http://thedailywtf.com/Articles/Bitten_by_the_Enterprise_Bug.aspx
>
> It's quite straighforward to get PostgreSQL up and running with many
> terabytes of data, so long as you respect the design trade-offs in some
> options.  What you can't do is say those are wrong and reject alternative
> implementation suggestions just because they're not "enterprise".  Whenever
> anyone uses that word at me, I mentally replace it with "super duper", and
>
>> There's no "fundamentally good design", but only a design which takes
>> limitations and constraints into account.
>>
>
> You mean like taking into account the fact that partitioning performance has
> an unavoidable trade-off, where you have to balance the query optimizer
> overhead of supporting many partitions against the improvement from
> splitting data into smaller pieces?

Or taking into account that some parts of the engine are not scalable.
Almost all current RDBMS can cope with terabytes.
Almost none (if any) can cope with large number of partial indexes (provided
they support them) or child tables or table level constraints and so on.
This is a fact as far as I've read so far.

Then we can discuss about the enterprise grade, the fault tolerance
and whatever else
buzzword pops up in our minds.
These ones maybe not.

-- 
Vincenzo Romano
NotOrAnd Information Technologies
NON QVIETIS MARIBVS NAVTA PERITVS

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Inheritance efficiency

2010-05-01 Thread John R Pierce

Greg Smith wrote:
Enterprise grade doesn't mean anything.  Partitioning designs that 
require thousands of child tables to work right are fundamentally 
misdesigned anyway, so there is no reason for any of the contributors 
to the project to work on improving support for them.  There are far 
too many obvious improvements that could be made to PostgreSQL, ones 
that will benefit vastly more people, to divert resources toward 
something you shouldn't be dong anyway like that.




my sql developer, who's been doing oracle for 15+ years, says postgres' 
partitioning is flawed from his perspective because if you have a 
prepared statement like..


   SELECT fields FROM partitioned_table WHERE primarykey = $1;

it doesn't optimize this very well and ends up looking at all the 
sub-table indicies.   ir you instead execute the statement


   SELECT fields FROM parritioned_table WHERE primarykey = constant;

he says the planner will go straight to the correct partition.

i haven't confirmed this for myself.



--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Native DB replication for PG

2010-05-01 Thread Scott Marlowe
On Fri, Apr 30, 2010 at 9:02 PM, Greg Smith  wrote:
> Joshua D. Drake wrote:
>>
>> On Fri, 2010-04-30 at 13:42 -0700, Gauthier, Dave wrote:
>>
>
> If I had to plan server deployments for the next year (and I do) I'd
> be sticking with pg 8.3 and a proven replication engine.  Next summer
>

 Surely you mean 8.4? :-)

>>
>> No, I would buy the 8.3 argument as well. Depending on your conservative
>> level. 8.4 is fine and all but 8.3 is about as rock solid as it gets.
>
> Unless you don't vacuum enough on a bigger database, run out of FSM pages,
> and the whole vacuum strategy goes to hell afterwards.  I would say that
> running into that issue is *probable* for an 8.3 install of any significant
> size, whereas the odds of running into a regression in 8.4 relative to 8.3
> is pretty low.  The whole "the older version is always more reliable" mantra
> doesn't make sense when you've got a major known issue in the older release
> that just goes away by using the newer one, and I feel that's the case with
> 8.4 vs. 8.3.

Exactly.  I've got a LOT of effort involved in free space map sizing
and monitoring on 8.3.  However, for me it's no longer a serious
problem.  Free space map is 10 to 20x what it needs to be on my
machines now and works like a charm in 8.3.  8.4 randomly crashed, and
honestly I can't afford to test and help fix it right now.  This
summer I can and will either with 9.0 or 8.4.  But we're talking db
crashes that were happening once every 2 to 3 weeks for me, so testing
it takes a lot of time for me.  And I can't do it with my productions
servers.  I tested 8.4 what I thought was fairly hardly last year only
to have 8.4.1 die under the same load that 8.3 handled without a
problem, and reverted to the known working version putting testing
8.4.1 on hold.

So to ME, the choice is a fully functional 8.3 installation that has
NO problems with free space map because of configuration choices, or
an 8.4 with a known (to me) issue of crashing and dying.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Select with string that has a lone hyphen yields nothing

2010-05-01 Thread Craig Ringer

On 30/04/2010 5:33 PM, Kenichiro Tanaka wrote:

Hi

The hyphen which written in 'Olympus E-PL1' is different from
the one which written in 'Camera - Black'.

em-dash
http://www.fileformat.info/info/unicode/char/2014/index.htm
en-dash
http://www.fileformat.info/info/unicode/char/2013/index.htm
figure-dash
http://www.fileformat.info/info/unicode/char/2012/index.htm

I have no idea to fix using PostgreSQL's function,because they don't equal.
I think you have to change the data or change the behavior of your
application .


The usual solution to this sort of thing is to provide a functional 
index on the problem field that computes a "simplified" version of the 
text - stripping accents, dumbing all dashes down to simple minus signs, 
etc.


I'm not aware of any canned tool to do this in PostgreSQL. Everyone's 
needs seem to vary, so it'd be hard to provide one.


--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Native DB replication for PG

2010-05-01 Thread Scott Marlowe
On Fri, Apr 30, 2010 at 11:33 PM, Scott Marlowe  wrote:
> On Fri, Apr 30, 2010 at 9:02 PM, Greg Smith  wrote:
>> Joshua D. Drake wrote:
>>>
>>> On Fri, 2010-04-30 at 13:42 -0700, Gauthier, Dave wrote:
>>>
>>
>> If I had to plan server deployments for the next year (and I do) I'd
>> be sticking with pg 8.3 and a proven replication engine.  Next summer
>>
>
> Surely you mean 8.4? :-)
>
>>>
>>> No, I would buy the 8.3 argument as well. Depending on your conservative
>>> level. 8.4 is fine and all but 8.3 is about as rock solid as it gets.
>>
>> Unless you don't vacuum enough on a bigger database, run out of FSM pages,
>> and the whole vacuum strategy goes to hell afterwards.  I would say that
>> running into that issue is *probable* for an 8.3 install of any significant
>> size, whereas the odds of running into a regression in 8.4 relative to 8.3
>> is pretty low.  The whole "the older version is always more reliable" mantra
>> doesn't make sense when you've got a major known issue in the older release
>> that just goes away by using the newer one, and I feel that's the case with
>> 8.4 vs. 8.3.
>
> Exactly.  I've got a LOT of effort involved in free space map sizing
> and monitoring on 8.3.  However, for me it's no longer a serious
> problem.  Free space map is 10 to 20x what it needs to be on my
> machines now and works like a charm in 8.3.  8.4 randomly crashed, and
> honestly I can't afford to test and help fix it right now.  This
> summer I can and will either with 9.0 or 8.4.  But we're talking db
> crashes that were happening once every 2 to 3 weeks for me, so testing
> it takes a lot of time for me.  And I can't do it with my productions
> servers.  I tested 8.4 what I thought was fairly hardly last year only
> to have 8.4.1 die under the same load that 8.3 handled without a
> problem, and reverted to the known working version putting testing
> 8.4.1 on hold.
>
> So to ME, the choice is a fully functional 8.3 installation that has
> NO problems with free space map because of configuration choices, or
> an 8.4 with a known (to me) issue of crashing and dying.

Note that it could well be fixed by now, and I'll first test either
9.0 beta or 8.4.latest. as slony slaves before I go any further.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Inheritance efficiency

2010-05-01 Thread Cédric Villemain
2010/5/1 John R Pierce :
> Greg Smith wrote:
>>
>> Enterprise grade doesn't mean anything.  Partitioning designs that require
>> thousands of child tables to work right are fundamentally misdesigned
>> anyway, so there is no reason for any of the contributors to the project to
>> work on improving support for them.  There are far too many obvious
>> improvements that could be made to PostgreSQL, ones that will benefit vastly
>> more people, to divert resources toward something you shouldn't be dong
>> anyway like that.
>>
>
> my sql developer, who's been doing oracle for 15+ years, says postgres'
> partitioning is flawed from his perspective because if you have a prepared
> statement like..
>
>   SELECT fields FROM partitioned_table WHERE primarykey = $1;
>
> it doesn't optimize this very well and ends up looking at all the sub-table
> indicies.   ir you instead execute the statement
>
>   SELECT fields FROM parritioned_table WHERE primarykey = constant;
>
> he says the planner will go straight to the correct partition.
>
> i haven't confirmed this for myself.

It has nothing to do with partitionning but how the planner works.
Even if the use case remain correct

>
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>



-- 
Cédric Villemain

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Inheritance efficiency

2010-05-01 Thread Alban Hertroys
On 1 May 2010, at 5:33, John R Pierce wrote:

> Greg Smith wrote:
> my sql developer, who's been doing oracle for 15+ years, says postgres' 
> partitioning is flawed from his perspective because if you have a prepared 
> statement like..
> 
>   SELECT fields FROM partitioned_table WHERE primarykey = $1;
> 
> it doesn't optimize this very well and ends up looking at all the sub-table 
> indicies.

Yes it would, for a very logical reason.

A prepared statement is nothing but a stored query plan - its benefits are 
mostly that you can skip the query planning step before performing a query, 
which helps queries that are performed very frequently in a short time or that 
take a long time planning.

But skipping the query planner also has a drawback; the planner has to make a 
general assumption about what kind of data you'll be querying. It can't vary 
the query plan depending on what data you're querying for.

If someone is writing a query on a partitioned table and wants to rely on 
constraint exclusion and they're trying to use a prepared statement then they 
don't understand what prepared statements are.

You could argue that some logic could be added to the handling of prepared 
statements to insert query-subplans depending on what data you use for your 
parameters, but then you're moving back in the direction of unprepared 
statements (namely invoking the query planner). It would help cases like this 
one, but it would hurt all other prepared statements. It would at the least add 
a parse tree back into the queries path, which would be a fairly simplistic one 
in the case of table partitioning, but would get fairly complex for prepared 
statements involving more parameters - so much so that the benefit of using a 
prepared statement (not spending time planning the query) would get reduced 
significantly.
It's possible that Oracle implemented something like this, but as you see it's 
not necessarily an improvement.

In practice people either query the correct table partition directly or do not 
use a prepared statement.

>   ir you instead execute the statement
> 
>   SELECT fields FROM parritioned_table WHERE primarykey = constant;
> 
> he says the planner will go straight to the correct partition.
> 
> i haven't confirmed this for myself.

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.


!DSPAM:737,4bdc08fc10416246414315!



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Inheritance efficiency

2010-05-01 Thread Alban Hertroys
On 1 May 2010, at 12:56, Alban Hertroys wrote:

> You could argue that some logic could be added to the handling of prepared 
> statements to insert query-subplans depending on what data you use for your 
> parameters, but then you're moving back in the direction of unprepared 
> statements (namely invoking the query planner). It would help cases like this 
> one, but it would hurt all other prepared statements. It would at the least 
> add a parse tree back into the queries path, which would be a fairly 
> simplistic one in the case of table partitioning, but would get fairly 
> complex for prepared statements involving more parameters - so much so that 
> the benefit of using a prepared statement (not spending time planning the 
> query) would get reduced significantly.


And of course it would add time for planning the query-tree to the creation of 
the prepared statement - which could be significant compared to the time people 
expect to save by not invoking the planner on later invocations of the same 
query. That said, the more frequent the query is executed the less that hurts 
performance, while it doesn't really matter for queries that are executed 
infrequently.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4bdc0ba010411331128920!



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Native DB replication for PG

2010-05-01 Thread Tom Lane
Greg Smith  writes:
> Scott Marlowe wrote:
>> I tested 8.4 what I thought was fairly hardly last year only
>> to have 8.4.1 die under the same load that 8.3 handled without a
>> problem, and reverted to the known working version putting testing
>> 8.4.1 on hold.

> FYI, since December of 2009 (release of 8.4.2) there have been 10 bugs 
> fixed with the word "crash" in their description, as well as 7 memory 
> leaks that could potentially lead to crash.  Even six months ago I was 
> still hesitant to push 8.4 toward production systems; the number of bugs 
> shaken out in the last two releases has been substantial.

Are we reading the same CVS log?  I find quite a few commit messages
mentioning the word "crash", but the *only* post-8.4.0 crash fix that
applied to 8.4 and didn't also get committed into 8.3 (and often a
lot further back than that) was this, which made it into 8.4.2:

2009-09-22 11:46  tgl

* src/: backend/catalog/dependency.c,
test/regress/expected/rules.out, test/regress/sql/rules.sql
(REL8_4_STABLE), backend/catalog/dependency.c,
test/regress/expected/rules.out, test/regress/sql/rules.sql: Fix
crash if a DROP is attempted on an internally-dependent object. 
Introduced in 8.4 rewrite of dependency.c.  Per bug #5072 from Amit
Khandekar.

I searched the CVS log in detail back to the start of 2010, and found
only half a dozen patches of any flavor that were applied to 8.4 but not
8.3 (discounting cosmetic changes such as docs-only patches).  Only one
of these was a server crash condition, and most of them applied to
new-in-8.4 features that you wouldn't even exercise if you were simply
migrating an 8.3-compatible application.

So in general I think there is no objective evidence to support a
position that 8.4 is less stable than 8.3, at least not since about
8.4.2.  This is fairly typical of our release series, IME.  I expect
9.0 will have a much longer period before its bug curve falls to match
the previous releases, of course.

I'm curious to get to the bottom of Scott's report.  It's possible that
he hit one of the two or three 8.4-only crashes we fixed since 8.4.1;
or the bug may still be lurking.

regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Native DB replication for PG

2010-05-01 Thread Scott Marlowe
On Sat, May 1, 2010 at 9:27 AM, Tom Lane  wrote:
>
> I'm curious to get to the bottom of Scott's report.  It's possible that
> he hit one of the two or three 8.4-only crashes we fixed since 8.4.1;
> or the bug may still be lurking.

I'll definitely be testing it this summer to see if it triggers a
crash on the latest 8.4 and possibly 9.0 as well.  The stats db is the
ne place I can run bleeding edge releases since the data there isn't
really critical for daily operations, especially the data created
during the summer lull.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Avoiding surrogate keys

2010-05-01 Thread Lew

Philippe Lang wrote:

I think nobody mentioned Object-Relational mappers.
If you intend to used one (or think you may be using one in the future), 
using surrogate keys is more straightforward, if not necessary.


Neither of those claims is even slightly true.  Using Hibernate, EclipseLink 
or OpenJPA (for Java applications), natural keys are sufficient and far more 
straightforward than surrogate keys.


--
Lew

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread Thomas Løcke
Anybody know of any recent comparisons made between the two?

I'm in the process of buying a new telephony related software suite,
and I'm getting mixed advice. Some say that MSSQL is _much_
better/faster than PostgreSQL, and others say the opposite.

The vendor is more or less indifferent, with a small plus to the
Microsoft solution because, well, they are a Microsoft shop. The
sales-people all bang on about MSSQL being the superior choice, and
PostgreSQL being a "toy compared to the Microsoft RDBMS". The tech
people though are divided into three groups: One group says the two
systems are more or less equal, another group who says the Microsoft
database is superior and finally a group who speaks highly of
PostgreSQL.

I've not been able to convince them to send me some actual benchmark
numbers, which actually should turn on quite a few alarms, come to
think about it.  :o)

Maybe you guys are aware of some recent generic tests/comparisons
between the two systems?

Regards,
Thomas

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Avoiding surrogate keys

2010-05-01 Thread Merlin Moncure
On Sat, May 1, 2010 at 12:09 PM, Lew  wrote:
> Philippe Lang wrote:
>>
>> I think nobody mentioned Object-Relational mappers.
>> If you intend to used one (or think you may be using one in the future),
>> using surrogate keys is more straightforward, if not necessary.
>
> Neither of those claims is even slightly true.  Using Hibernate, EclipseLink
> or OpenJPA (for Java applications), natural keys are sufficient and far more
> straightforward than surrogate keys.

right -- to be fair though is quite a bit of (generally bad) software
out there that assumes or at least heavily encourages surrogate keys.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread Scott Marlowe
On Sat, May 1, 2010 at 12:47 PM, Thomas Løcke  wrote:
> Anybody know of any recent comparisons made between the two?
>
> I'm in the process of buying a new telephony related software suite,
> and I'm getting mixed advice. Some say that MSSQL is _much_
> better/faster than PostgreSQL, and others say the opposite.
>
> The vendor is more or less indifferent, with a small plus to the
> Microsoft solution because, well, they are a Microsoft shop. The
> sales-people all bang on about MSSQL being the superior choice, and
> PostgreSQL being a "toy compared to the Microsoft RDBMS". The tech
> people though are divided into three groups: One group says the two
> systems are more or less equal, another group who says the Microsoft
> database is superior and finally a group who speaks highly of
> PostgreSQL.
>
> I've not been able to convince them to send me some actual benchmark
> numbers, which actually should turn on quite a few alarms, come to
> think about it.  :o)
>
> Maybe you guys are aware of some recent generic tests/comparisons
> between the two systems?

It's probably much more important what you're running each on than the
db itself for most of these situations.

Anyone who thinks pgsql is a toy is an idiot.  or a sales person.
Wait, that might be redundant here.

Anyone who judges one poorly while using only the other is a
cheerleader.  I like cheerleaders, when they're in a uniform dancing
around on the gym floor. I don't tend to look to them for advice on
which dbms to use.

I've only used MSSQL enough to know I don't really care for it that
much, and I've converted a dozen or so MSSQL users to PostgreSQL just
by letting them connect to my db instead of theirs (think corporate
intranet).  But that could have just been because my server was better
run and configured than theirs.

Anyway, both are reasonably good servers.  Don't listen to a sales
person for goodness sake, they're in it to sell you stuff, and a
bigger price tag is better for them.  Test it for yourself.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Avoiding surrogate keys

2010-05-01 Thread John R Pierce


If your 'natural key' is a large text field, I'd have to assume there's 
some point at which a surrogate index would be more efficient.  Would 
this be above a few dozen characters, or a few 100 characters?   I 
wouldn't want a PK based on a multi-K byte text field for a table that 
has many 10s or 100s of 1000s of rows, for sure.






--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread Rich Shepard

On Sat, 1 May 2010, Thomas Løcke wrote:


Anybody know of any recent comparisons made between the two?


  A Google search will turn up a lot of comparisons.


I'm in the process of buying a new telephony related software suite, and
I'm getting mixed advice. Some say that MSSQL is _much_ better/faster than
PostgreSQL, and others say the opposite.


  This opens a world of potential flames. The first thing you should ask is
on what basis the comparisons are being made. Initial price? Licensing fees?
Support? While MS-SQL requires a Microsoft OS underneath it, postgres can be
installed on that OS or on linux or the *BSDs. No charge for the underlying
OS, either.

  What is the basis for "fast?" How does the dbms relate to the telephone
software? Do you need real-time transaction processing or as a lookup for
phone numbers?

  I've been using postgres for more than a dozen years, and know that there
are a wide range of applications where it's the back end. Some applications
require fast read/write capability (which it has), others capability to
store hundreds of millions of rows per table, and it handles these, too.

  Ask more specific questions of your proposed vendor and have them back up
the answers with meaningful support, not marketing fluff from Redmond.

Rich

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread Merlin Moncure
On Sat, May 1, 2010 at 2:47 PM, Thomas Løcke  wrote:
> Anybody know of any recent comparisons made between the two?
>
> I'm in the process of buying a new telephony related software suite,
> and I'm getting mixed advice. Some say that MSSQL is _much_
> better/faster than PostgreSQL, and others say the opposite.
>
> The vendor is more or less indifferent, with a small plus to the
> Microsoft solution because, well, they are a Microsoft shop. The
> sales-people all bang on about MSSQL being the superior choice, and
> PostgreSQL being a "toy compared to the Microsoft RDBMS". The tech
> people though are divided into three groups: One group says the two
> systems are more or less equal, another group who says the Microsoft
> database is superior and finally a group who speaks highly of
> PostgreSQL.
>
> I've not been able to convince them to send me some actual benchmark
> numbers, which actually should turn on quite a few alarms, come to
> think about it.  :o)
>
> Maybe you guys are aware of some recent generic tests/comparisons
> between the two systems?

most comparisons you see are going to be feature checklists of no real
value.  what really matters is how you are going to access the
database and what you are going to do with it.  if you are going to
primarily use microsoft tools (c#/visual studio/reporting services)
you are probably going to be better off with sql server.  these
technologies integrate better with sql server.

if you are writing your client applications in jdbc/php/etc, your
database choice is a wash in terms of connectivity. postgresql with
its bsd license and strong open source pedigree has a lot of
advantages, so lean postgres but keep ms in mind if you have all
microsoft servers and/or ms administrators.

if you are writing stuff in C/C++, doing significant coding INSIDE the
database (pl/pgsql, C, etc) and/or think of the database as a self
contained development environment, then postgresql is definitely the
right choice.  our C interface (both libpq and internal) is excellent,
variety of server side programming options is second to none, and
postgres's flexible type system is fantastic and capable of doing all
kinds of things that bend the rules of what a database is 'supposed to
do'.

also, spend some time reviewing the postgresql mailing lists, and note
the quality and quantity of responses you get to difficult questions.
we are helpful bunch :-)

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread John R Pierce

Thomas Løcke wrote:

Anybody know of any recent comparisons made between the two?
  


for purely SQL,  I prefer Postgres by a wide margin.But, MS SQL 
Server comes with a whole infrastructure that includes a lot of powerful 
tools, like replication, data extraction and translation, active 
directory integrated single sign-on, and so forth.   If you're working 
in a pure MS world, with .NOT -er- .NET, etc etc,  its pretty hard to 
get away from MS SQL as there are so many tools



the bottom line tends to be driven by whatever the applications 
support.   I have one MS SQL database in my lab hosting a program which 
only supports MS SQL.   Luckily, the demands on this database are low 
enough that i can use the free SQL Express.





--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Indexing queries with bit masks

2010-05-01 Thread Filip Rembiałkowski
2010/4/30 Mike Christensen :
> Ok I've been blatantly lying, err, purposely simplifying the problem for the
> sake of the original email :)
>
> I've read over the responses, and am actually now considering just not using
> any index at all.  Here's why:
>
> First, this actually isn't the only thing on the WHERE clause.  It will only
> query for users who are "friends" with you so it can notify them of your
> activities.  That's done via a weird JOIN on a table that holds all the
> friend relationships.  So in reality, it will only load maybe a hundred
> rows, or maybe a thousand every once in a while if you're way popular.  If
> I'm not mistaken, it should use the index to narrow it down to the list of
> friends, and then use a sequential scan to weed out the ones who subscribe
> to that type of notification.
>
> Second, the only thing /ever/ that will do this query is the queue service
> whose job it is to process notifications (which are files dropped on the
> file system) and email people all day long.  This service handles one job at
> a time, and could potentially run on its own machine with its own read-only
> copy of the database.  Thus, even if it was a fairly slow query, it's not
> gonna bring down the rest of the site.
>
> Regarding the idea of putting an index on each bit, I thought about this
> earlier as well as just kinda cringed.  The users table gets updated quite a
> bit (last logon, session id, any time they change their profile info,
> etc)..  Too many indexes is bad.  I could just put the data in another table
> of course, which lead me to another idea.  Have a table called Subscriptions
> and have each row hold a user id and a notification type.  I could index
> both, and join on (Subscriptions.UserId = Users.UserId AND
> Subscriptions.Type = 8).  This would be pretty dang fast, however updates
> are kinda a royal pain.  When the user changes which types of subscriptions
> they want (via a list of checkboxes), I'd have to figure out which rows to
> delete and which new ones to insert.  However, I think I have an idea in
> mind for a PgSQL function you pass in the bitmask to and then it
> "translates" it to conditional deletes and inserts.
>
> A third idea I'm tossing around is just not worry about it.  Put the bitmask
> in the DB, but not filter on it.  Every "friend" would be loaded into the
> dataset, but the queue processor would just "skip" rows if they didn't
> subscribe to that event.  In other words, move the problem down to the
> business layer.  The drawback is potentially large number of rows are
> loaded, serialized, etc into memory that will just be ignored.  But of
> course the DB is probably a read-only copy and it's not even close to the
> bottle neck of the email queue under heavy load, so it's probably a
> non-issue.  If mailing is slow, I just add more queue services..
>
> I'm exploring all these ideas.  I predict using the bitwise AND on the where
> clause isn't gonna be the worst design ever, and it's sure easier to
> implement than a table of subscriptions.  What do you guys think?

I would say "normalize". Which means that I like your "separate table"
idea best.
It's clear, obvious, and 3NF - conforming solution.
Changing the set of subscriptions with delete-update-insert combo is
not so bad as you would think.
Encapsulating it in some kind of functional API looks nice too.

Filip

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Native DB replication for PG

2010-05-01 Thread Greg Smith

Tom Lane wrote:

Greg Smith  writes:
  
FYI, since December of 2009 (release of 8.4.2) there have been 10 bugs 
fixed with the word "crash" in their description, as well as 7 memory 
leaks that could potentially lead to crash.

Are we reading the same CVS log?  I find quite a few commit messages
mentioning the word "crash", but the *only* post-8.4.0 crash fix that
applied to 8.4 and didn't also get committed into 8.3 (and often a
lot further back than that) was this...


I based those figures on uses of the word "crash" in the release notes 
for things fixed in 8.4.2 and 8.4.3, not the CVS logs.  And I didn't 
filter out the stuff that was also backported, on the theory that Scott 
might be running into one of those issues in his test 8.4.1 install, but 
not his 8.3 one even though the 8.3 instance was theoretically 
vulnerable to it as well.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread Scott Ribe
On May 1, 2010, at 12:47 PM, Thomas Løcke wrote:

> The
> sales-people all bang on about MSSQL being the superior choice, and
> PostgreSQL being a "toy compared to the Microsoft RDBMS".

This is complete bullshit.

I say that as someone who spent years using MS SQL Server, and who very much 
enjoyed using it, and who still likes it. (Though I haven't used it much 
lately.)

The high-level overview is that from a technical standpoint they are both 
excellent products, and I consider them very much comparable for the databases 
I've done. They have some different specific strengths & weaknesses for sure. 
PG's locking scheme, MVCC, basically precludes certain specific optimizations 
that means a small number of very specific queries don't perform as well, while 
at the same time it means that throughput with multiple simultaneous 
connections scales extremely well with multiple processors.

-- 
Scott Ribe
scott_r...@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread Greg Smith

Scott Ribe wrote:

PG's locking scheme, MVCC, basically precludes certain specific optimizations 
that means a small number of very specific queries don't perform as well, while 
at the same time it means that throughput with multiple simultaneous 
connections scales extremely well with multiple processors.
  


SQL Server uses MVCC too as of their 2005 release, implemented with row 
versioning similarly to Postgres.  The main non-MVCC holdout at this 
point is DB2.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread Scott Ribe
On May 1, 2010, at 5:16 PM, Greg Smith wrote:

> SQL Server uses MVCC too as of their 2005 release, implemented with row 
> versioning similarly to Postgres.  The main non-MVCC holdout at this point is 
> DB2.

Funny, I've ported to even later versions than that, but missed the change. 
Well, OK, I'm sure I read it in the release notes at the time, but it made no 
important different to the app I was working on at the time.

-- 
Scott Ribe
scott_r...@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread jus...@magwerks.com
 On Sat, 2010-05-01 at 20:47 +0200, Thomas Løcke wrote:

-->Anybody know of any recent comparisons made between the two?

I'm in the process of buying a new telephony related software suite,
and I'm getting mixed advice. Some say that MSSQL is _much_
better/faster than PostgreSQL, and others say the opposite.
-->
-->No shocker there. Their salesmen out for a sale, having no under standing 
how to compare DB's .
-->


sales-people all bang on about MSSQL being the superior choice, and
PostgreSQL being a "toy compared to the Microsoft RDBMS".
-->I've been using MSSQL from 7.0 to MSSQL 2008. Been using Postgresql as of 
8.2 and think very highly of it.

The administrative overhead of PG IMHO is far less, hardware requirements are 
lower, easier to develop in and against.

MSSQL has lots of MS tools you can use which all cost lots of money. Hardware 
requirements far steeper, primary because the OS hardware requirements are 
higher.

My first choice is PG
-->
-->


The tech
people though are divided into three groups: One group says the two
systems are more or less equal, another group who says the Microsoft
database is superior and finally a group who speaks highly of
PostgreSQL.
-->
-->To say PG is superior to MSSQL and vice versus is a loaded argument. What is 
the bases of the comparison.
-->
-->Comparing the two strictly as DB to DB they are equal. Both support large 
section of SQL standard, they both have excellent track records not corrupting 
data, both do WAL, both scale up to thousands of transactions per second. both 
are ACID
-->
-->But the equality between the 2 stops there and each product has pluses and 
Minus.
-->
-->wiki has some comparison info
-->http://en.wikipedia.org/wiki/Comparison_of_relational_database_management_systems
-->http://en.wikipedia.org/wiki/Comparison_of_database_tools
-->
-->


I've not been able to convince them to send me some actual benchmark numbers, 
which actually should turn on quite a few alarms, come to think about it. :o) 
-->
-->Benchmarking is an important piece but should not be a deciding factor.
-->
-->Benchmarks on DB's are miss leading, because the way each may execute a 
given set of queries can and will result in drastically different numbers.
-->
-->Tweaking a DB and the queries is a time consuming process and normally 
results in rewriting the queries, add indexes, changing configurations or even 
changing table layouts.
-->
-->My experience shows both are very fast if properly configured and the 
developers understands how a specific DB works to properly write queries. Seen 
more than once a DB taken to its knees because poorly written SQL statements or 
design . Performance and General mailing list are packed with such examples.
-->


Maybe you guys are aware of some recent generic tests/comparisons between the 
two systems? -->
-->General tests and comparisons are worthless.

This list is packed with examples as are other DB mailing list where users pick 
apart benchmarks because the Tester missed some setting, or some other arcane 
trick that is Database specific.


-->
-->






All legitimate Magwerks Corporation quotations are sent in a .PDF file 
attachment with a unique ID number generated by our proprietary quotation 
system. Quotations received via any other form of communication will not be 
honored.

CONFIDENTIALITY NOTICE: This e-mail, including attachments, may contain legally 
privileged, confidential or other information proprietary to Magwerks 
Corporation and is intended solely for the use of the individual to whom it 
addresses. If the reader of this e-mail is not the intended recipient or 
authorized agent, the reader is hereby notified that any unauthorized viewing, 
dissemination, distribution or copying of this e-mail is strictly prohibited. 
If you have received this e-mail in error, please notify the sender by replying 
to this message and destroy all occurrences of this e-mail immediately.
Thank you.


Re: [GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread Craig Ringer
On 02/05/10 02:47, Thomas Løcke wrote:

> I've not been able to convince them to send me some actual benchmark
> numbers, which actually should turn on quite a few alarms, come to
> think about it.  :o)

Is performance really your #1 criterion anyway?

I'd be looking at management, reliability, backup, integration into the
rest of the infrastructure, product longevity, support, etc. Performance
you can always throw hardware at.

-- 
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] PostgreSQL vs. Microsoft SQL server

2010-05-01 Thread Scott Marlowe
On Sat, May 1, 2010 at 8:12 PM, Craig Ringer
 wrote:
>
> I'd be looking at management, reliability, backup, integration into the
> rest of the infrastructure, product longevity, support, etc. Performance
> you can always throw hardware at.

And given the relatively high costs of a MSSQL installation, you can
throw a LOT of hardware at a PostgreSQL server.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] Am I supposed to be all scared of compound primary keys?

2010-05-01 Thread Mike Christensen
I have a table that stores a user ID and a subscription type, and this is
really all it needs to store and any pair of values will always be unique.
In fact, I think this pair should be the primary key on the table.  However,
I'm using Castle ActiveRecord which says at:

http://www.castleproject.org/activerecord/documentation/v1rc1/usersguide/pks.html#CompositePK

And I quote:

Quick Note: Composite keys are highly discouraged. Use only when you
have no other alternative.

I get the feeling they're discouraged from a SQL point of view, but it
doesn't actually say why anywhere.  Is there any good reason to avoid using
composite keys on a table?  Why waste the space of an extra key if you don't
have to?  Thanks!

Mike


Re: [GENERAL] Am I supposed to be all scared of compound primary keys?

2010-05-01 Thread Scott Marlowe
On Sat, May 1, 2010 at 8:25 PM, Mike Christensen  wrote:
> I have a table that stores a user ID and a subscription type, and this is
> really all it needs to store and any pair of values will always be unique.
> In fact, I think this pair should be the primary key on the table.  However,
> I'm using Castle ActiveRecord which says at:
>
> http://www.castleproject.org/activerecord/documentation/v1rc1/usersguide/pks.html#CompositePK
>
> And I quote:
>
> Quick Note: Composite keys are highly discouraged. Use only when you have no
> other alternative.
>
> I get the feeling they're discouraged from a SQL point of view, but it
> doesn't actually say why anywhere.  Is there any good reason to avoid using
> composite keys on a table?  Why waste the space of an extra key if you don't
> have to?  Thanks!

From reading that, they're discouraged from a hibernate point of view.
 I've never had a problem with composite keys in SQL myself.

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Am I supposed to be all scared of compound primary keys?

2010-05-01 Thread Christophe Pettus


On May 1, 2010, at 7:25 PM, Mike Christensen wrote:


\And I quote:

Quick Note: Composite keys are highly discouraged. Use only when you  
have no other alternative.


I get the feeling they're discouraged from a SQL point of view, but  
it doesn't actually say why anywhere.  Is there any good reason to  
avoid using composite keys on a table?  Why waste the space of an  
extra key if you don't have to?  Thanks!


A composite key is generally better than creating a surrogate key just  
so you have a single-column key.  It's possible that the note is  
referring to systems that handle composite keys poorly (PostgreSQL  
handles them just fine), or are concerned about ORMs which don't  
support them at all (like Django's) or support them badly.


--
-- Christophe Pettus
   x...@thebuild.com


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Am I supposed to be all scared of compound primary keys?

2010-05-01 Thread jus...@magwerks.com
 On Sat, 2010-05-01 at 19:25 -0700, Mike Christensen wrote:


-->I have a table that stores a user ID and a subscription type, and this 
is really all it needs to store and any pair of values will always be unique. 
In fact, I think this pair should be the primary key on the table. However, I'm 
using Castle ActiveRecord which says at:


http://www.castleproject.org/activerecord/documentation/v1rc1/usersguide/pks.html#CompositePK

And I quote:



Quick Note: Composite keys are highly discouraged. Use only when you have 
no other alternative.

I get the feeling they're discouraged from a SQL point of view, but it 
doesn't actually say why anywhere. Is there any good reason to avoid using 
composite keys on a table? Why waste the space of an extra key if you don't 
have to? Thanks!
-->
-->I'm not familiar with this project.. That said it seems they have some 
automated SQL updating/insert/relation building going on in the classes. 
Nothing more than simplifying the class initialising of .net ADO record sets 
which are overly complicated.
-->
-->It seems the class automation can not work with composite keys directly to 
build relation between classes; create SQL commands to up the records, and make 
sure within castle framework the composite key has been updated to all the 
other classes having relations.
-->
-->This warning has no impact on any database but a limitation and warning 
about Castle-project framework itself.
-->
-->The database does not care either way. Given this limitation is within the 
framework i follow the advice and not use a composite key.  -->






All legitimate Magwerks Corporation quotations are sent in a .PDF file 
attachment with a unique ID number generated by our proprietary quotation 
system. Quotations received via any other form of communication will not be 
honored.

CONFIDENTIALITY NOTICE: This e-mail, including attachments, may contain legally 
privileged, confidential or other information proprietary to Magwerks 
Corporation and is intended solely for the use of the individual to whom it 
addresses. If the reader of this e-mail is not the intended recipient or 
authorized agent, the reader is hereby notified that any unauthorized viewing, 
dissemination, distribution or copying of this e-mail is strictly prohibited. 
If you have received this e-mail in error, please notify the sender by replying 
to this message and destroy all occurrences of this e-mail immediately.
Thank you.