Re: [mail] Re: [HACKERS] Windows Build System

Lamar Owen Thu, 30 Jan 2003 15:18:45 -0800

On Thursday 30 January 2003 15:29, Tom Lane wrote:
> Lamar Owen <[EMAIL PROTECTED]> writes:
> > While I understand (and agree with) your (and Vince's) reasoning on why
> > Windows should be considered less reliable, neither of you have provided


> Windows shares none of that heritage.  It is the first truly new port,
> onto a system without any Unix background, that we have ever done AFAIK.
> Claiming that it doesn't require an increased level of testing is
> somewhere between ridiculous and irresponsible.

I am saying that as we mature we need increased testing across the board.  And 
it is a very low percentage of code that is tied into the OS API, right?  The 
majority of the code (the vast majority) isn't touched by it. 

> that we suspect there will be problems.  And if you don't suspect
> there will be problems on Windows, you are being way too naive :-(

Reread my statement above.  I _agree_ with the rationale -- but I fear it will 
have the opposite impact.  And I am not convinced that just because we have 
good history with the unixoid ports means that we can slack on them -- Linux, 
*BSD, etc all change.  The strftime(3) breakage with RedHat of a cycle ago 
should show us that much.

I suspect there will be problems on Win32 -- it is, after all, a new port.  
But if we're going to immediately throw pathological test cases at it that 
we're not even bothering to test against now, that immediately throws up a 
flag to me.  And TESTING IS BEING DONE on the Win32 port; nobody is trying 
to put the PGDG blessing on it as yet, and progress is being made by 
those who wish to see it made.  It is still being touted as beta software, 
right?  The patches from Jan are very preliminary still, correct?  Katie 
hasn't issued a press release saying that it's not beta, right?

<hyperbole>
I don't see what the uproar is about, other than 'Win32 is so unstable that it 
can't possibly work as well as you are seeing it work -- you must be doing 
something wrong.  Test it harder.  Pull the plug repeatedly!! Test it until 
it breaks!  HA! Told you it would break! (yeah, firing up the old 
oxyacetylene torch and hitting the hard drive with a 6,000 degree flame did 
the trick -- this has got to be a bad operating system!)'
</hyperbole>

And, by the way, who in their right mind tests a database server by repeated 
yanking of the AC power?  To go to that extreme for Win32 when we caution 
against something as mundane as a kill -9 of postmaster on Unix is absurd.  
And, yes, I know the difference.  I also know that the AC power pull has 
nothing to do with PostgreSQL, but it has to do with the OS under it.  
Although a kill -9, from the point of view of the running process, is 
identical to a power failure. It simply dies (unless it becomes a zombie, in 
which case it is undead) either way.  The effects of a kill -9 shouldn't be 
as severe as a power fail, since the OS can properly flush written buffers 
even after the process writing them has died.
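
To make the kill -9 point concrete, here is a minimal sketch, assuming Python 
on a POSIX system.  The function name write_then_sigkill is made up for 
illustration, and this is not how PostgreSQL handles its own files.  A child 
process write()s some bytes without ever calling fsync(), the parent sends it 
SIGKILL, and the parent then reads the file back.  The bytes were already 
handed to the kernel, so they should outlive the process.  It shows nothing 
about a power failure, where data that never reached the disk can be lost.

```python
import os
import signal
import tempfile
import time


def write_then_sigkill(payload: bytes) -> bytes:
    """Fork a child that writes payload without fsync, SIGKILL it, read the file back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    ready_r, ready_w = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: write() hands the bytes to the kernel; no fsync() is issued.
        try:
            os.close(ready_r)
            out = os.open(path, os.O_WRONLY)
            os.write(out, payload)
            os.write(ready_w, b"x")  # tell the parent that write() has returned
            while True:              # sit here until the parent kills us
                time.sleep(1)
        finally:
            os._exit(1)              # only reached if something above failed

    # Parent: wait for the child's write, then kill it the hard way.
    os.close(ready_w)
    os.read(ready_r, 1)
    os.close(ready_r)
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)               # reap the child so it does not stay a zombie

    try:
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(write_then_sigkill(b"committed row\n"))
```

Run as a script, the payload should come back intact even though the writer 
never got a chance to clean up.  The pipe is only there so the parent does not 
send the signal before the child's write() has returned.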

And I also can point the finger at some Unix swervers (spelling intentional) 
that would fail that test in a miserable way.  I can also point at a few VMS 
machines that couldn't pass that test.  I've even seen machines blow up due 
to improper power cycling.  

And I've seen Win2k machines come right up after repeated power blips (I've 
also seen them not come up).  

It really depends upon what the hard disk is doing at the instant the 
regulators drop out the 5 and 12V supplies (and which supply goes out first, 
which can depend upon the respective loads -- for modern Pentium 4 systems 
the 12V will probably go down first since it is more heavily loaded than the 
5V supply in these systems).  Under certain conditions where the 12V goes 
down before the 5V does, the head might still be writing as the servo spirals 
towards park, causing all manner of damage (maybe even to servo information, 
which normally cannot be written). So the power cycle becomes a test of 
hardware, too, played Russian Roulette-style.

Talk about an unscientific test.

A database server that needs that kind of testing is going to be hardened 
hardware on a doubly redundant UPS anyway.

But then again, I've seen a Linux server survive a power cycle with no lost 
data (ext3 filesystem -- I've seen lost data with ext2).  And I've seen the 
same server barf all over itself due to a single bit error in memory.  Blew 
out the entire root filesystem, which was journaled and residing on a RAID 1 
partition (the corruption was perfectly mirrored, by the way).  Serves me 
right for not having ECC RAM installed at the time.

> If it passes the tests, good for it.  I honestly do not expect that it
> will.  My take on this is that we want to be able to document the
> problems in advance, rather than be blindsided.

I fully expect that Katie, Jan, Dave, and all the others working on this share 
your concerns and want the Win32 port to be as solid as is possible on that 
OS.
-- 
Lamar Owen
WGCR Internet Radio
1 Peter 4:11


