On Thu, Jan 30, 2003 at 01:27:59PM -0600, Greg Copeland wrote:
> That was going to be my question too.
>
> I thought NFS didn't have some of the requisite file system behaviors
> (locking, flushing, etc. IIRC) for PostgreSQL to function correctly or
> reliably.
I don't know what locking scheme Pos
[ On Friday, January 31, 2003 at 11:54:27 (-0500), D'Arcy J.M. Cain wrote: ]
> Subject: Re: PostgreSQL, NetBSD and NFS
>
> On Thursday 30 January 2003 18:32, Simon J. Gerraty wrote:
> > Is postgreSQL trying to lock a file perhaps? Would seem a sensible thing
> > for it to be doing...
>
> Is that
On Wed, Feb 05, 2003 at 03:09:09PM -0500, Tom Lane wrote:
> "D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> > On Wednesday 05 February 2003 13:04, Ian Fry wrote:
> >> How about adjusting the read and write-size used by the NetBSD machine? I
> >> think the default is 32k for both read and write on
On Wed, 5 Feb 2003, D'Arcy J.M. Cain wrote:
[DJC: This feels rather fragile. I doubt that it is hardware related because I had
[DJC: tried it on the other ethernet interface in the machine which was on a
[DJC: completely different network than the one I am on now.
All I can offer up is that at o
On Wed, Feb 05, 2003 at 12:18:29PM -0500, Tom Lane wrote:
> "D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> > On Wednesday 05 February 2003 11:49, Tom Lane wrote:
> >> I wonder if it is possible that, every so often,
> >> you are losing just the last few bytes of an NFS transfer?
> > Yah, that's k
On Wed, 5 Feb 2003, Tom Lane wrote:
[TL: Could be. By "heritage" I meant BSD-without-any-adjective. It is
[TL: perfectly clear from Leffler, McKusick et al. (_The Design and
[TL: Implementation of the 4.3BSD UNIX Operating System_) that back then,
[TL: 8K was the standard filesystem block size.
I've been watching this thread since the beginning, and now that y'all
brought up networking, I believe I may have some useful suggestions in that
arena.
Tom Lane <[EMAIL PROTECTED]> writes:
> I'm thinking maybe one or both LAN cards have a problem with packets
> exceeding a certain size.
>
Are
On Wed, Feb 05, 2003 at 03:45:11PM -0500, Tom Lane wrote:
> Thor Lancelot Simon <[EMAIL PROTECTED]> writes:
> >> Unless NetBSD has changed from its heritage, the kernel disk cache
> >> buffers are 8K, and so an 8K NFS read or write would never cross a
> >> cache buffer boundary. But 32K would.
>
> If he is using UDP rather than TCP
> as the transport layer, another potential issue is that 32K requests will
> end up as IP packets with a very large number of fragments, potentially
> exposing some kind of network stack bug in which the last fragment is
> dropped or corrupted.
Actually it is
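To put a rough number on the fragmentation concern quoted above, here is a minimal sketch of how many IP fragments one UDP datagram would need. It assumes a standard 1500-byte Ethernet MTU, a 20-byte IPv4 header with no options, and an 8-byte UDP header, and it ignores the RPC/NFS header overhead, so the real counts could differ slightly. The function name ip_fragments is made up for illustration.

```python
import math

UDP_HEADER = 8    # bytes
IPV4_HEADER = 20  # bytes, assuming no IP options


def ip_fragments(payload_bytes, mtu=1500):
    """Estimate the number of IPv4 fragments for one UDP datagram."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    datagram = payload_bytes + UDP_HEADER
    return math.ceil(datagram / per_fragment)


# Compare an 8K transfer with a 32K transfer (RPC/NFS overhead ignored).
for size in (8192, 32768):
    print(size, ip_fragments(size))
```

Under these assumptions an 8K transfer needs 6 fragments and a 32K transfer needs 23, and losing any single fragment loses the whole datagram.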
On February 06, 2003 at 03:50, Justin Clift wrote:
> Tom Lane wrote:
>
> >Hoo boy. I was already suspecting data corruption in the index, and
> >this looks like more of the same. My thoughts are definitely straying
> >in the direction of "the NFS server is dropping bits, somehow".
> >
> >Both th
On Wed, Feb 05, 2003 at 09:24:48PM +, David Laight wrote:
> > If he is using UDP rather than TCP
> > as the transport layer, another potential issue is that 32K requests will
> > end up as IP packets with a very large number of fragments, potentially
> > exposing some kind of network stack bug
> Hmmm... does anyone remember the name of that NFS testing tool the
> FreeBSD guys were using? Think it came from Apple. They used it to
> find and isolate bugs in the FreeBSD code a while ago.
fsx
Chris
---(end of broadcast)---
TIP 1: subscr
Thor Lancelot Simon <[EMAIL PROTECTED]> writes:
>> Unless NetBSD has changed from its heritage, the kernel disk cache
>> buffers are 8K, and so an 8K NFS read or write would never cross a
>> cache buffer boundary. But 32K would.
> I don't know what "heritage" you're referring to, but it has never
"D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> On Wednesday 05 February 2003 13:04, Ian Fry wrote:
>> How about adjusting the read and write-size used by the NetBSD machine? I
>> think the default is 32k for both read and write on i386 machines now.
>> Perhaps try setting them back to 8k (it's th
On Wednesday 05 February 2003 13:04, Ian Fry wrote:
> > Wild thought here: can you reduce the MTU on the LAN linking the NFS
> > server to the NetBSD box? If so, does it help?
>
> How about adjusting the read and write-size used by the NetBSD machine? I
> think the default is 32k for both read and
Tom Lane wrote:
> Greg Copeland <[EMAIL PROTECTED]> writes:
> > On Wed, 2003-02-05 at 11:18, Tom Lane wrote:
> >> Wild thought here: can you reduce the MTU on the LAN linking the NFS
> >> server to the NetBSD box? If so, does it help?
>
> > I'm curious as to why you think adjusting the MTU may ha
James Hubbard wrote:
Justin Clift wrote:
Hmmm... does anyone remember the name of that NFS testing tool the
FreeBSD guys were using? Think it came from Apple. They used it to
find and isolate bugs in the FreeBSD code a while ago.
Sounds like it might be useful here.
:-)
You can find a wri
Greg Copeland <[EMAIL PROTECTED]> writes:
> On Wed, 2003-02-05 at 11:18, Tom Lane wrote:
>> Wild thought here: can you reduce the MTU on the LAN linking the NFS
>> server to the NetBSD box? If so, does it help?
> I'm curious as to why you think adjusting the MTU may have an effect on
> this. Low
Justin Clift wrote:
Hmmm... does anyone remember the name of that NFS testing tool the
FreeBSD guys were using? Think it came from Apple. They used it to
find and isolate bugs in the FreeBSD code a while ago.
Sounds like it might be useful here.
:-)
You can find a write-up about it here:
http
On Wed, 2003-02-05 at 11:18, Tom Lane wrote:
> "D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> > On Wednesday 05 February 2003 11:49, Tom Lane wrote:
> >> I wonder if it is possible that, every so often,
> >> you are losing just the last few bytes of an NFS transfer?
>
> > Yah, that's kind of wha
Tom Lane wrote:
Hoo boy. I was already suspecting data corruption in the index, and
this looks like more of the same. My thoughts are definitely straying
in the direction of "the NFS server is dropping bits, somehow".
Both this and the (admittedly unproven) bt_moveright loop suggest
corrupted
"D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> On Wednesday 05 February 2003 11:49, Tom Lane wrote:
>> I wonder if it is possible that, every so often,
>> you are losing just the last few bytes of an NFS transfer?
> Yah, that's kind of what it looked like when I tried this before
> Christmas too
On Wednesday 05 February 2003 11:49, Tom Lane wrote:
> "D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> > Hmm. This time it passed that point but this happened:
> >
> > COPY "certificate" FROM stdin;
> > NOTICE: copy: line 253677, bt_insertonpg[certificate_pkey]: parent page
> > unfound - fixing
"D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> Hmm. This time it passed that point but this happened:
> COPY "certificate" FROM stdin;
> NOTICE: copy: line 253677, bt_insertonpg[certificate_pkey]: parent page
> unfound - fixing branch
> ERROR: copy: line 253677, bt_fixlevel[certificate_pkey]
On Wednesday 05 February 2003 10:12, Tom Lane wrote:
> "D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> > Well, it does appear to be working but it never finishes. Here are two
> > backtraces. One was taken while it was running and the other after a
> > kill -9. The primary key file should have
"D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> Well, it does appear to be working but it never finishes. Here are two
> backtraces. One was taken while it was running and the other after a kill
> -9. The primary key file should have had 322846720 bytes based on the
> database that I was cop
On Sunday 02 February 2003 12:26, Tom Lane wrote:
> At this point I think you need to rebuild with --enable-debug and
> --enable-cassert (if you didn't already) and then capture some
> stack traces from the stuck backend. We have to find out what the
> backend thinks it's doing.
Well, it does app
On Sunday 02 February 2003 12:26, Tom Lane wrote:
> "D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> > Also odd, why would running over NFS have any bearing on it if we
> > could find such a place?
>
> Yup, 'tis the question. The only theory I have been able to come up
> with is that there's somet
"D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> Also odd, why would running over NFS have any bearing on it if we
> could find such a place?
Yup, 'tis the question. The only theory I have been able to come up
with is that there's something flaky about your network hardware,
such that Postgres s
On Saturday 01 February 2003 15:48, Tom Lane wrote:
> More and more bizarre. What is the hardware platform --- does it have TAS?
NetBSD on a Pentium (i386 port) so yes, it does have TAS. I assume you were
thinking about the spinlock emulation.
I have been looking through backend/storage/lmgr/l
On Fri, 31 Jan 2003, mlw wrote:
> . There are always issues with file locking across various
> platforms. I recall reading about mmap issues across NFS a while ago...
Postgres uses neither of these, IIRC, so that should be fine. (Actually,
postgres does effectively use mmap for shared memory on N
"D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> On Saturday 01 February 2003 14:43, Tom Lane wrote:
>> What else was going on? As far as I can see, the code never does a
>> semop unless it's waiting for some other backend process.
> Nothing except the standard background processes are running.
On Saturday 01 February 2003 14:43, Tom Lane wrote:
> "D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> > Here's the log. As you can see, nothing was logged after the COPY
> > command.
>
> What else was going on? As far as I can see, the code never does a
> semop unless it's waiting for some other
"D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> 100Mb instead of 100Mb -->1000Mb. I tried mounting with and without the TCP
> option and it seemed to act the same but it was better than before. Now it
> doesn't crash but trying to load a large table hangs. It gets to a point
> where it is ca
On Saturday 01 February 2003 14:00, Tom Lane wrote:
> What was the query it failed on, exactly? That last page it read
> seems to be an empty index page --- it should have moved on to the
> next index page, I'd think, rather than doing anything that could
> hang up.
Here's the log. As you can se
What was the query it failed on, exactly? That last page it read
seems to be an empty index page --- it should have moved on to the
next index page, I'd think, rather than doing anything that could
hang up.
regards, tom lane
"D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> That's a 4.7 MB file. The dump might be quite huge.
I really just want to see the dump of that one page, and maybe the pages
before and after it for comparison's sake.
regards, tom lane
On Saturday 01 February 2003 13:09, Tom Lane wrote:
> Very bizarre. Looks like the last page it read was block 104
> (851968/8192) in file "/source/data/cert/base/16556/17063". Could you
> provide a formatted dump of that page? I'm partial to pg_filedump which
> you can get from http://sources.r
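The block number in the message above follows from dividing the file offset by the page size. A minimal sketch of that arithmetic, assuming the 8192-byte page size implied by the "851968/8192" in the message (the function name block_number is made up for illustration):

```python
BLCKSZ = 8192  # page size in bytes, taken from the "851968/8192" above


def block_number(offset, block_size=BLCKSZ):
    """Return the page (block) number that starts at the given file offset."""
    if offset % block_size != 0:
        raise ValueError("offset is not aligned to a page boundary")
    return offset // block_size


print(block_number(851968))  # 104
```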
On Thursday 30 January 2003 14:02, mlw wrote:
> Forgive my stupidity, are you running PostgreSQL with the data on an NFS
> share?
Yes, sorry. PostgreSQL is running from the local disk but the data is on the
mounted drive.
--
D'Arcy J.M. Cain | Democracy is three wolves
http://www.druid.net
On Thursday 30 January 2003 14:27, Greg Copeland wrote:
> That was going to be my question too.
>
> I thought NFS didn't have some of the requisite file system behaviors
> (locking, flushing, etc. IIRC) for PostgreSQL to function correctly or
> reliably.
>
> Please correct as needed.
Yes, doubly s
On Thursday 30 January 2003 12:07, Tom Lane wrote:
> "D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> > I have posted before about this but I am now posting to both NetBSD and
> > PostgreSQL since it seems to be some sort of interaction between the two.
> > I have a NetAPP filer on which I am putt
D'Arcy J.M. Cain wrote:
On Thursday 30 January 2003 14:02, mlw wrote:
Forgive my stupidity, are you running PostgreSQL with the data on an NFS
share?
Yes, sorry. PostgreSQL is running from the local disk but the data is on the
mounted drive.
I'm not sure, I guess it could work, but N
On Fri, 31 Jan 2003, D'Arcy J.M. Cain wrote:
> On Thursday 30 January 2003 12:07, Tom Lane wrote:
> > Perhaps the next thing to do is to strace (ktrace, trace, truss,
> > whatever system-call tracing utility you got) the postmaster and
> > child processes. If we could determine what system call i
On Thursday 30 January 2003 18:32, Simon J. Gerraty wrote:
> Is postgreSQL trying to lock a file perhaps? Would seem a sensible thing
> for it to be doing...
Is that a problem? FWIW I am running statd and lockd on the NetBSD box.
--
D'Arcy J.M. Cain | Democracy is three wolves
http://www.d
On Thu, 30 Jan 2003, D'Arcy J.M. Cain wrote:
> Does the shared memory stuff use disk at all? Perhaps that's the
> difference between PostgreSQL and other applications.
Shared memory in NetBSD is just an interface to mmap'd pages, so it can
be swapped to disk. But I assume your swap is not on NFS.
--On Thursday, January 30, 2003 16:02:17 -0500 Tom Lane <[EMAIL PROTECTED]>
wrote:
Greg Copeland <[EMAIL PROTECTED]> writes:
That was going to be my question too.
I thought NFS didn't have some of the requisite file system behaviors
(locking, flushing, etc. IIRC) for PostgreSQL to function cor
Greg Copeland <[EMAIL PROTECTED]> writes:
> That was going to be my question too.
> I thought NFS didn't have some of the requisite file system behaviors
> (locking, flushing, etc. IIRC) for PostgreSQL to function correctly or
> reliably.
Whether the thing is trustworthy is a different issue ;-).
That was going to be my question too.
I thought NFS didn't have some of the requisite file system behaviors
(locking, flushing, etc. IIRC) for PostgreSQL to function correctly or
reliably.
Please correct as needed.
Regards,
Greg
On Thu, 2003-01-30 at 13:02, mlw wrote:
> Forgive my stu
Forgive my stupidity, are you running PostgreSQL with the data on an NFS
share?
D'Arcy J.M. Cain wrote:
I have posted before about this but I am now posting to both NetBSD and
PostgreSQL since it seems to be some sort of interaction between the two. I
have a NetAPP filer on which I am puttin
"D'Arcy J.M. Cain" <[EMAIL PROTECTED]> writes:
> I have posted before about this but I am now posting to both NetBSD and
> PostgreSQL since it seems to be some sort of interaction between the two. I
> have a NetAPP filer on which I am putting a PostgreSQL database. I run
> PostgreSQL on a NetB
I have posted before about this but I am now posting to both NetBSD and
PostgreSQL since it seems to be some sort of interaction between the two. I
have a NetAPP filer on which I am putting a PostgreSQL database. I run
PostgreSQL on a NetBSD box. I used rsync to get the database onto the file