Re: rand() is broken

2003-02-02 Thread Don
> > Binary packages from third party software vendors.
>
> What about them? They either,
> a) link to a static libc, and use its rand() always; or
> b) link to a shared libc, and use its rand(), as the binary API hasn't
> changed; or
It isn't a question of the API. It's a question of expected function
output.

> c) if they really need their own specific RNG, they include it themselves, and
> don't rely on libc at all.
>
> So I fail to see the problem here.
The opinion of a random user:

I run FreeBSD and not Linux because of the stability and predictability of
the system. Changing a critical function like rand() when we know that
there are applications which depend on its output does not seem like a
good idea.

A separate function for those who need cryptographic randomness seems like
a _much_ better idea.

This is my personal opinion. I am not a developer, so please take my comments
as such.

-Don

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: rand() is broken

2003-02-02 Thread Don
> > It isn't a question of the API. It's a question of expected function
> > output.
>
> Then it's applicable not only to binary packages as Terry states, but any
> source that uses rand().
I think Terry mentioned binary packages simply because it is harder to fix
them than something available as source, but I could be mistaken.

> I would say that depending on the internal algorithm used by rand() (or
> random()) is a bad idea;  however, I don't know what the relevant standards
> say about this, so I won't say any further.
>
> (Why is it a bad idea?  Because I'm not going to write software which makes
> this assumption; I'm sure that even if at some point in time all systems use
> an identical algorithm, at some point my software will have to run on a
> system which uses something different.  So if I really need it, I will take
> rand() from libc and place it in my own code.)
If only all developers were as good as you we would not have a problem.
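
For anyone who does need a repeatable sequence, here is a minimal sketch of
carrying a private generator instead of relying on libc. It assumes the
portable sample generator given in the C standard's description of rand();
the name sketch_rand_r is made up for illustration, and this is not
necessarily the algorithm any particular libc uses.

```c
#include <stdint.h>

/*
 * Private linear congruential generator (the portable sample from the
 * C standard).  The caller owns the state, so the output sequence for a
 * given seed does not change when libc's rand() changes.
 */
static uint32_t sketch_rand_r(uint32_t *state)
{
    /* 32-bit unsigned arithmetic wraps modulo 2^32 by definition. */
    *state = *state * 1103515245u + 12345u;

    /* Discard the weak low 16 bits and keep a 15-bit result. */
    return (*state / 65536u) % 32768u;
}
```

Two callers that start from the same seed always see the same values, which
is the property that software depending on rand() output actually wants.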

> > A separate function for those who need cryptographic randomness seems like
> > a _much_ better idea.
>
> I'm not sure Yet Another RNG API (of course arc4random() already exists) gains
> anything unless rand()/random() absolutely cannot be changed; and as I say
> I'm not convinced this is the case.
I am by no means convinced either. I do, however, think this is something
that should not be changed without a lot of consideration and testing.

Your point about arc4random() is a good one. Why depend on rand() for
cryptographic randomness when we already have arc4random()?

> Doesn't even the 0 / RAND_MAX fix change
> the algorithm?  Software which relies on that behaviour will break ..
Any software which always needs to get back maxint when it calls rand() is
hopelessly broken :) Besides which, I don't recall advocating that change
either.

-Don




Re: xmms looping forever

2003-02-06 Thread Don
> I had problems with almost all my apps, at first. Third-party
> applications that weren't recompiled during the buildworld (ports) had
> linking problems. Take wget, for example:
>
> anarcat@lenny[~]% wget
> /usr/libexec/ld-elf.so.1: /usr/local/lib/libintl.so.4: Undefined symbol "stpcpy"
> anarcat@lenny[~]%
>
> Recompiling wget solves the problem.
Have you tried simply recompiling xmms? I have been using xmms on -CURRENT
and -RELEASE for about a month with 0 problems.

-Don




Re: TR : IPFilter

2003-02-09 Thread Don
> Btw, I was looking for some docs on the FreeBSD website and didn't find
> anything interesting; the only firewall that FreeBSD seems to support
> nowadays is the old IPFW, which is quite obsolete now imo. Why do the
> documentation pages not deal with IPF at all? Is there any reason?
Try ipfw2

-Don




Re: make kernel bombs out

2003-03-23 Thread Don
> I'm using source that was sync'd about 18 hours ago.
> Buildworld built properly and kernel looks like it's almost completely
> built then I get what is below.
>
> Any suggestions/ideas?
> This is the second time it's bombed out like this and I CVSup'd and
> rebuilt world inbetween the two attempts.
Please read /usr/src/UPDATING

-Don



Re: playing mp3s and burning a cd

2003-03-24 Thread Don
> Should a PR be filed or some QA team contacted to make sure this
> problem doesn't stay alive in 5.2? :)
This isn't, by chance, a problem with your setting for the
sysctl "hw.ata.atapi_dma" is it?

-Don



Re: Solved??? Re: playing mp3s and burning a cd

2003-03-24 Thread Don
> On Mon Mar 24, 2003 at 02:14:48PM -0500, Don wrote:
> > > Should a PR be filed or some QA team contacted to make sure this
> > > problem doesn't stay alive in 5.2? :)
> > This isn't, by chance, a problem with your setting for the
> > sysctl "hw.ata.atapi_dma" is it?
> How extraordinarily cute! This solves it! I'm currently listening to
> Me, Mom and Morgentaler and burning a 4x CD without any slowdown, this
> is great.
Glad to be of help. Anytime you have odd system problems like that give
"sysctl -a" a perusal. Sometimes you will be surprised at what you find.

There was definitely a reason for turning off DMA access for atapi devices
by default; I just cannot remember why. I'm sure this issue and the
reason were mentioned on the list already.

> PS: what's the proper way to enable ATAPI DMA in the loader.conf file?
> I don't see any flag WRT that there.. I'm tempted to add:
>
> set hw.ata.atapi_cam=1
hw.ata.atapi_dma not _cam :)

-Don



Re: Removing Sendmail

2003-04-02 Thread Don
> Don't you think that if syslog is unreliable, then it should be fixed ?
> If things are as you say, we have 2 problems: Sendmail getting CERTs
> every other day and an unreliable system logger. Would you rather just
> let things be as they are ?
Absolutely not! Fix the problems and they would be happy to commit your
fixes.

Seriously though, I _always_ replace sendmail with postfix and I have
never had a problem doing so, other than one or two really trivial
ones.

What problems do people run into when replacing sendmail? How many of
those problems come as a result of not reading the install messages for
the particular port?

-Don
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: half-fix for stream.c

2000-01-20 Thread Don Lewis

On Jan 20,  5:41pm, Alfred Perlstein wrote:
} Subject: half-fix for stream.c
} you can find it at:
} 
} http://www.freebsd.org/~alfred/tcp_fix.diff

Don't you want to defer the checksum even further (after the bogus
packets have been dropped)?  It doesn't look like the change you
made will save any unnecessary work.

Also, it looks like you can save a few CPU cycles by only searching
for wildcard sockets if the SYN flag is set, so only set the 6th
argument to in_pcblookup_hash() if (thflags & TH_SYN) is true.
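
To illustrate the second suggestion, a rough sketch follows. It only shows
the flag test: want_wildcard and SKETCH_TH_SYN are hypothetical names, the
bit value 0x02 mirrors TH_SYN from <netinet/tcp.h>, and the real
in_pcblookup_hash() call is not reproduced here.

```c
#include <stdint.h>

/* Same bit value as TH_SYN in <netinet/tcp.h>; renamed to avoid clashes. */
#define SKETCH_TH_SYN 0x02

/*
 * Hypothetical helper: compute the value to pass as the wildcard
 * argument of the PCB lookup.  Only a segment carrying SYN can be
 * opening a connection to a listening (wildcard) socket, so every
 * other segment can skip the wildcard search.
 */
static int want_wildcard(uint8_t thflags)
{
    return (thflags & SKETCH_TH_SYN) != 0;
}
```

A plain ACK (0x10) yields 0 and skips the wildcard search, while SYN or
SYN+ACK yields 1.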





Re: bzip2 in src tree (Was Re: ports/16252: bsd.port.mk: Add bzip2 support for distribution patches)

2000-01-22 Thread Don Lewis

On Jan 22,  5:04pm, Alex Zepeda wrote:
} Subject: Re: bzip2 in src tree (Was Re: ports/16252: bsd.port.mk: Add bzip

} What if we began to use bzip2 instead of gzip for things like man pages,
} or releases, etc?

Doesn't bzip2 require a lot more memory for decompression?  As I
recall, someone mentioned that this would cause problems for installing
releases on machines with only a small amount of RAM.





lots of ad_timeouts

2000-02-12 Thread Don Croyle

I recently moved one of my machines to -current and I see a lot of
these when I do buildworlds and other large builds.  With a kernel
built yesterday after the last set of ATA changes and with boot_verbose
set, the exact message is:

ad0: ad_timeout: lost disk contact - resetting
ata0: resetting devices .. ata0: mask=03 status0=52 status1=00
ata0-master: success setting up WDMA2 mode on PIIX4 chip
ata0-slave: timeout waiting for command=ef s=10 e=60
ata0-slave: failed setting up PIO3 mode on generic chip
ata0-slave: using PIO mode set by BIOS
done

Relevant sections of my kernel config:

# ATA and ATAPI devices
device  ata
device  atadisk # ATA disk drives
device  atapicd # ATAPI CDROM drives

And extracts from the boot-time dmesg:

ata-pci0:  port 0xf000-0xf00f at device 7.1 on pci0
ata0: iobase=0x01f0 altiobase=0x03f6 bmaddr=0xf000
ata0: mask=03 status0=50 status1=00
ata0: mask=03 status0=50 status1=00
ata0: devices = 0x9
ata0 at 0x01f0 irq 14 on ata-pci0
ata1: iobase=0x0170 altiobase=0x0376 bmaddr=0xf008
ata1: mask=03 status0=1c status1=1c
ata1: mask=03 status0=0c status1=0c
ata1: devices = 0x0

ata0-master: success setting up WDMA2 mode on PIIX4 chip
ad0:  ATA-3 disk at ata0 as master
ad0: 3815MB (7814016 sectors), 7752 cyls, 16 heads, 63 S/T, 512 B/S
ad0: 16 secs/int, 1 depth queue, WDMA2
ad0: piomode=4 dmamode=2 udmamode=-1 cblid=0
Creating DISK ad0
Creating DISK wd0
ata0-slave: piomode=3 dmamode=1 udmamode=-1 dmaflag=1
ata0-slave: timeout waiting for command=ef s=10 e=00
ata0-slave: failed setting up PIO3 mode on generic chip
ata0-slave: using PIO mode set by BIOS
acd0:  CDROM drive at ata0 as slave
acd0: read 689KB/s (689KB/s), 128KB buffer, BIOSPIO
acd0: Reads: CD-DA
acd0: Audio: play, 256 volume levels
acd0: Mechanism: ejectable tray
acd0: Medium: CD-ROM 120mm data disc loaded, unlocked, lock protected
Mounting root from ufs:/dev/ad0s1a
ad0s1: type 0xa5, start 0, end = 7814015, size 7814016 
ad0s1: C/H/S end 486/101/63 (3129461) != end 7814015: invalid
-- 
I've always wanted to be a dilettante, but I've never quite been ready
to make the commitment.





Re: lots of ad_timeouts

2000-02-12 Thread Don Croyle

I really can't tell you for sure.  This machine had been running
2.2.8 before, and I didn't think to save a dmesg before I wiped it and
reinstalled from a snapshot.

If it would help, I'll try setting it to PIO with sysctl and doing
another buildworld.

Soren Schmidt <[EMAIL PROTECTED]> writes:

> It seems Don Croyle wrote:
> > I recently moved one of my machines to -current and I see a lot of
> > these when I do buildworlds and other large builds.  With a kernel
> > built yesterday after the last set of ATA changes and with boot_verbose
> > set, the exact message is:
> > 
> > ad0: ad_timeout: lost disk contact - resetting
> > ata0: resetting devices .. ata0: mask=03 status0=52 status1=00
> > ata0-master: success setting up WDMA2 mode on PIIX4 chip
> > ata0-slave: timeout waiting for command=ef s=10 e=60
> > ata0-slave: failed setting up PIO3 mode on generic chip
> > ata0-slave: using PIO mode set by BIOS
> > done
> 
> Did you run with DMA enabled before the upgrade ??
> 
> -Søren
> 

-- 
I've always wanted to be a dilettante, but I've never quite been ready
to make the commitment.





Re: lots of ad_timeouts

2000-02-12 Thread Don Croyle

Soren Schmidt <[EMAIL PROTECTED]> writes:

> > If it would help, I'll try setting it to PIO with sysctl and doing
> > another buildworld.
> 
> It probably will...

Made it through the world safely.  I've just started a make release.
Assuming that this isn't something that's readily fixable, where would
be the best place to put the sysctl command so I don't have to remember
it every time I reboot?
-- 
I've always wanted to be a dilettante, but I've never quite been ready
to make the commitment.





Re: Crashing netscape?

2000-02-21 Thread Don Lewis

On Feb 21,  7:51pm, Alex Le Heux wrote:
} Subject: Crashing netscape?
} Hi,
} 
} Am I the only one who's experiencing an amazing amount of crashes on
} Netscape?
} 
} It's been going on for quite some time now (months), upgrading Netscape or
} switching from the Linux to the FreeBSD to the BSDI version doesn't help.
} The most stable version seems to be the Linux version, but that even
} crashes 5-10 times per day. It will *always* crash when a page uses java,
} but I've not been able to find a non-java page that will always crash it.

Have you tried "netscape -sync"?





current hangs during boot if ET/5025-16 card is installed

2000-03-02 Thread Don Lewis


I happened to try to install 4.0-CURRENT on a box that has an
Emerging Technologies ET/5025-16 ISA card installed and found that
the kernel wedges during boot.  It hangs hard and won't respond to
anything except the reset switch.  The motherboard is an Asus P3B-F
and I believe I have the BIOS properly configured with the correct
settings to match the IRQ and memory addresses used by the ET card.

I also discovered that older versions of -CURRENT will boot correctly
on this box.  I did a binary search on the -CURRENT snapshots and found
that the floppies from the January 12th and earlier snapshots boot,
while the floppies from the January 14th and later snapshots hang.

Here's the dmesg.boot file that I get by doing a "boot -v" using
a late January kernel with the offending card removed.  With
the card installed, the boot process gets as far as
Trying Read_Port at 3c3
but doesn't get to
isa_probe_children: disabling PnP devices


drtype=0x00, mfdev=0
subordinatebus=0 secondarybus=0
map[20]: type 1, range 32, base d800, size  4
found-> vendor=0x8086, dev=0x7112, revid=0x01
class=0c-03-00, hdrtype=0x00, mfdev=0
subordinatebus=0 secondarybus=0
intpin=d, irq=255
map[20]: type 1, range 32, base d400, size  5
found-> vendor=0x8086, dev=0x7113, revid=0x02
class=06-80-00, hdrtype=0x00, mfdev=0
subordinatebus=0 secondarybus=0
map[90]: type 1, range 32, base e800, size  4
found-> vendor=0x8086, dev=0x1229, revid=0x08
class=02-00-00, hdrtype=0x00, mfdev=0
subordinatebus=0 secondarybus=0
intpin=a, irq=9
map[10]: type 1, range 32, base e180, size 12
map[14]: type 1, range 32, base d000, size  6
map[18]: type 1, range 32, base e100, size 20
found-> vendor=0x8086, dev=0x1229, revid=0x08
class=02-00-00, hdrtype=0x00, mfdev=0
subordinatebus=0 secondarybus=0
intpin=a, irq=15
map[10]: type 1, range 32, base e080, size 12
map[14]: type 1, range 32, base b800, size  6
map[18]: type 1, range 32, base e000, size 20
found-> vendor=0x8086, dev=0x1229, revid=0x08
class=02-00-00, hdrtype=0x00, mfdev=0
subordinatebus=0 secondarybus=0
intpin=a, irq=10
map[10]: type 1, range 32, base df80, size 12
map[14]: type 1, range 32, base b400, size  6
map[18]: type 1, range 32, base df00, size 20
found-> vendor=0x8086, dev=0x1229, revid=0x08
class=02-00-00, hdrtype=0x00, mfdev=0
subordinatebus=0 secondarybus=0
intpin=a, irq=11
map[10]: type 1, range 32, base de80, size 12
map[14]: type 1, range 32, base b000, size  6
map[18]: type 1, range 32, base de00, size 20
pci0:  on pcib0
pcib1:  at device 1.0 on pci0
found-> vendor=0x104c, dev=0x3d07, revid=0x01
class=03-00-00, hdrtype=0x00, mfdev=0
subordinatebus=0 secondarybus=0
intpin=a, irq=11
map[10]: type 1, range 32, base e300, size 17
map[14]: type 1, range 32, base e280, size 23
map[18]: type 1, range 32, base e200, size 23
pci1:  on pcib1
vga-pci0:  mem 
0xe200-0xe27f,0xe280-0xe2ff,0xe300-0xe301 irq 11 at device 0.0 
on pci1
isab0:  at device 4.0 on pci0
isa0:  on isab0
ata-pci0:  port 0xd800-0xd80f at device 4.1 on pci0
ata-pci0: Busmastering DMA supported
ata0: iobase=0x01f0 altiobase=0x03f6 bmaddr=0xd800
ata0: mask=03 status0=50 status1=50
ata0: mask=03 status0=50 status1=00
ata0: devices = 0x9
ata0 at 0x01f0 irq 14 on ata-pci0
ata1: iobase=0x0170 altiobase=0x0376 bmaddr=0xd808
ata1: mask=00 status0=ff status1=ff
pci0: Intel 82371AB/EB (PIIX4) USB controller (vendor=0x8086, dev=0x7112) at 4.2
chip1:  port 0xe800-0xe80f at device 4.3 on 
pci0
fxp0:  port 0xd000-0xd03f mem 
0xe100-0xe10f,0xe180-0xe1800fff irq 9 at device 9.0 on pci0
fxp0: Ethernet address 00:90:27:c6:bb:b8
bpf: fxp0 attached
fxp1:  port 0xb800-0xb83f mem 
0xe000-0xe00f,0xe080-0xe0800fff irq 15 at device 10.0 on pci0
fxp1: Ethernet address 00:90:27:c6:bc:b9
bpf: fxp1 attached
fxp2:  port 0xb400-0xb43f mem 
0xdf00-0xdf0f,0xdf80-0xdf800fff irq 10 at device 11.0 on pci0
fxp2: Ethernet address 00:90:27:c6:bc:c0
bpf: fxp2 attached
fxp3:  port 0xb000-0xb03f mem 
0xde00-0xde0f,0xde80-0xde800fff irq 11 at device 12.0 on pci0
fxp3: Ethernet address 00:90:27:c6:a6:12
bpf: fxp3 attached
Trying Read_Port at 203
Trying Read_Port at 243
Trying Read_Port at 283
Trying Read_Port at 2c3
Trying Read_Port at 303
Trying Read_Port at 343
Trying Read_Port at 383
Trying Read_Port at 3c3
isa_probe_children: disabling PnP devices
isa_probe_children: probing non-PnP devices
fe0: not probed (disabled)
fdc0:  at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
a

4.0-CURRENT hangs in ex_isa_identify() (was: current hangs during boot if ET/5025-16 card is installed)

2000-03-03 Thread Don Lewis

On Mar 2,  4:09am, Don Lewis wrote:
} Subject: current hangs during boot if ET/5025-16 card is installed
} 
} I happened to try to install 4.0-CURRENT on a box that has an
} Emerging Technologies ET/5025-16 ISA card installed and found that
} the kernel wedges during boot.  It hangs hard and won't respond to
} anything except the reset switch.  The motherboard is an Asus P3B-F
} and I believe I have the BIOS properly configured with the correct
} settings to match the IRQ and memory addresses used by the ET card.
} 
} I also discovered that older versions of -CURRENT will boot correctly
} on this box.  I did a binary search on the -CURRENT snapshots and found
} that the floppies from the January 12th and earlier snapshots boot,
} while the floppies from the January 14th and later snapshots hang.

By adding a whole bunch of printf statements to the code, I was able
to track this problem to ex_isa_identify().  The ET card is jumpered
to I/O address 0x240, and it appears to consume 32 bytes starting at
this address.  When the ioport loop in ex_isa_identify() gets to 0x250,
look_for_card() appears to wedge.  I don't see how that can happen
unless the CPU gets stuck in inb().  I haven't looked at ISA hardware
in ages; can an ISA I/O read really hang forever?
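
The overlap can be sketched as follows. The io_range structure and both
helper names are hypothetical, and the 0x10 probe stride and the 0x200 to
0x3a0 bounds mentioned in the note below are assumptions rather than values
quoted from the driver.

```c
#include <stdint.h>

/* Hypothetical description of an I/O range already used by another card. */
struct io_range {
    uint16_t base;   /* first port in the range */
    uint16_t len;    /* number of consecutive ports */
};

/* Return 1 if port falls inside any of the listed ranges, else 0. */
static int port_is_claimed(uint16_t port, const struct io_range *r, int nranges)
{
    for (int i = 0; i < nranges; i++) {
        uint32_t end = (uint32_t)r[i].base + r[i].len;   /* exclusive */
        if (port >= r[i].base && (uint32_t)port < end)
            return 1;
    }
    return 0;
}

/*
 * Walk candidate ports in [first, limit) at a stride of 0x10 and count
 * the ones that do not land inside a claimed range, i.e. the ones an
 * identify routine could touch without poking someone else's hardware.
 */
static int count_probeable(uint16_t first, uint16_t limit,
                           const struct io_range *r, int nranges)
{
    int n = 0;

    for (uint32_t port = first; port < limit; port += 0x10) {
        if (!port_is_claimed((uint16_t)port, r, nranges))
            n++;
    }
    return n;
}
```

With a single claimed range of 32 bytes at 0x240, a walk from 0x200 up to
0x3a0 skips both 0x240 and 0x250, which are exactly the probes that touch
the ET card.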

What really sucks is that there is no way to disable the ex driver
at boot time, so the standard install floppies can no longer be used
to boot a box that contains one of these ET cards.

Should the ex driver be doing all this stuff at identify time, or was
the older method of doing this at probe time more correct?





Re: 4.0-CURRENT hangs in ex_isa_identify() (was: current hangs during boot if ET/5025-16 card is installed)

2000-03-03 Thread Don Lewis

On Mar 3, 11:16am, "Matthew N. Dodd" wrote:
} Subject: Re: 4.0-CURRENT hangs in ex_isa_identify() (was: current hangs du
} On Fri, 3 Mar 2000, Don Lewis wrote:
} > What really sucks is that there is no way to disable the ex driver
} > at boot time, so the standard install floppies can no longer be used
} > to boot a box that contains one of these ET cards.
} > 
} > Should the ex driver be doing all this stuff at identify time, or was
} > the older method of doing this at probe time more correct?
} 
} That's really the only place for such a routine.  What needs to happen is
} for if_ex to be a little more selective about which addresses it
} probes.  While it is using a non-destructive probe (see
} look_for_card()) it should also use the resource manager to check and see
} if a port is assigned before it does anything else.

Unfortunately the GENERIC kernel doesn't have a driver that could claim
the ET card.  Also ex_isa_identify() is called before the legacy ISA
probes are done.

IMHO, the best way to fix this would be for the dual-mode PnP/legacy
drivers to identify any cards in PnP mode, then do legacy ISA probes
using the old hard-wired port numbers, where legacy ISA probes can
be controlled by userconfig.  This is really ugly, but then we all
agree that ISA sucks.





Re: Why not gzip iso images?

2000-03-15 Thread Don Lewis

On Mar 15,  9:03am, Kris Kennaway wrote:
} Subject: Re: Why not gzip iso images?
} On Wed, 15 Mar 2000, Alfred Perlstein wrote:
} 
} > I feel pretty confident assuming that most people that burn ISOs probably
} > keep enough disk space free to hold one and not much more, going from
} > a requirement of ~650MB to ~1.2GB wouldn't be a smart move imo.
} 
} fetch -o - ftp://path/to/iso.gz | gunzip -c - > /path/to/image.iso

This doesn't allow you to restart a failed transfer, which you might
want to be able to do if it takes two or three days to transfer the
entire file.





Re: kern/8324

2000-03-18 Thread Don Lewis

On Mar 17,  6:27pm, Alfred Perlstein wrote:
} Subject: Re: kern/8324
} * Archie Cobbs <[EMAIL PROTECTED]> [000317 17:55] wrote:
} > This bug has been around since at least 2.2.6 and is still present
} > in RELENG_3, RELENG_4, and -current.
} > 
} >   http://www.freebsd.org/cgi/query-pr.cgi?pr=8324
} > 
} > Is anyone planning to tackle it? What would be required to fix it?
} > (it's not clear (to me anyway) from Bruce's description how hard
} > this is to fix..)

I had never heard of using SIGIO for output, but section 6.4 of the daemon
book says that SIGIO is sent "when a read or write becomes possible".
On the other hand, section 10.8 (Terminal Operations) mentions SIGIO
for input but not for output.  I also looked at rev 1.1 of kern/tty.c
and it only sends a SIGIO when input is ready, so this seems to be
the historical behaviour.  I'm surprised that this program even
worked with plain tty devices.

} I think Bruce sort of went off into a tangent with his diagnosis,
} anyhow this is untested (of course :) ), but looks like the right
} thing to do (from sys_pipe.c).
} 
} Perhaps the fcntls and ioctls aren't being propagated enough to set
} the flags properly, but if they are then it should work sort of the
} way SIGIO does, basically generating a signal for /some condition/
} on a descriptor.

This patch (vs the 3.4-STABLE version of tty.c) causes SIGIO to be
sent when a regular or pseudo tty becomes writeable.


--- tty.c.orig  Sun Aug 29 09:26:09 1999
+++ tty.c   Sat Mar 18 03:09:32 2000
@@ -2133,6 +2133,8 @@
 
     if (tp->t_wsel.si_pid != 0 && tp->t_outq.c_cc <= tp->t_olowat)
         selwakeup(&tp->t_wsel);
+    if (ISSET(tp->t_state, TS_ASYNC) && tp->t_sigio != NULL)
+        pgsigio(tp->t_sigio, SIGIO, (tp->t_session != NULL));
     if (ISSET(tp->t_state, TS_BUSY | TS_SO_OCOMPLETE) ==
         TS_SO_OCOMPLETE && tp->t_outq.c_cc == 0) {
         CLR(tp->t_state, TS_SO_OCOMPLETE);


BTW, I had to add:
    fcntl(1, F_SETOWN, getpid());
to the test program since there is no longer a default target to send
the signal to.  The old scheme had the defect of sending SIGIO to the
process group that owned the terminal, which implied that the terminal
had to be the controlling terminal for the process group.  This limited
a process to only receiving SIGIO from one terminal device even if it
had more than one open and it wanted to receive SIGIO from all of them.
Also, SIGIO was sent to the entire process group, but it may be desirable
to limit this to one process.  I wonder if it might make sense to go
back to the old default for tty devices so that processes only receive
SIGIO when they are in the foreground ...





Re: More benchmarking stuff...

1999-09-17 Thread Don Lewis

On Sep 17,  2:03pm, Brad Knowles wrote:
} Subject: Re: More benchmarking stuff...
} 
}   Sadly, when I go to the second set of tests (20,000 files and 
} 50,000 transactions), my performance goes into the crapper.  I know 
} that softupdates trades memory for speed, and I guess this PPro 200 
} w/ 128MB RAM just doesn't have enough memory to keep up.
} 
}   For this stage, I now get:
} 
}   Transactions per second:        33
}   KBytes Read per second:         79.66
}   KBytes Written per second:      144.31

I'd expect a NetApp to do a lot better than UFS on FreeBSD if there are
large directories.  Directory lookups in UFS require a sequential scan
whereas the NetApp filesystem uses some sort of hashing scheme.
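
The difference can be sketched with a toy in-memory directory. Everything
here is hypothetical (the dir_sketch structure, the fixed sizes, the string
hash); it is not the UFS or NetApp on-disk format, only a comparison of how
many name comparisons each lookup strategy needs.

```c
#include <string.h>

#define NBUCKETS 8     /* hypothetical hash table size */
#define MAXENT   64    /* hypothetical cap on entries */

struct dir_sketch {
    const char *name[MAXENT];
    int next[MAXENT];      /* next entry in the same hash chain, or -1 */
    int head[NBUCKETS];    /* first entry of each chain, or -1 */
    int count;
};

/* Simple multiplicative string hash reduced to a bucket index. */
static unsigned hash_name(const char *s)
{
    unsigned h = 5381;

    while (*s != '\0')
        h = h * 33u + (unsigned char)*s++;
    return h % NBUCKETS;
}

static void dir_init(struct dir_sketch *d)
{
    d->count = 0;
    for (int i = 0; i < NBUCKETS; i++)
        d->head[i] = -1;
}

/* Append a name and link it at the head of its hash chain. */
static int dir_add(struct dir_sketch *d, const char *name)
{
    unsigned b = hash_name(name);

    if (d->count >= MAXENT)
        return -1;
    d->name[d->count] = name;
    d->next[d->count] = d->head[b];
    d->head[b] = d->count;
    d->count++;
    return 0;
}

/* Sequential scan: compare against every entry in order until a match. */
static int dir_scan(const struct dir_sketch *d, const char *want, int *cmps)
{
    *cmps = 0;
    for (int i = 0; i < d->count; i++) {
        (*cmps)++;
        if (strcmp(d->name[i], want) == 0)
            return i;
    }
    return -1;
}

/* Hashed lookup: compare only against entries in the matching chain. */
static int dir_hashed(const struct dir_sketch *d, const char *want, int *cmps)
{
    *cmps = 0;
    for (int i = d->head[hash_name(want)]; i != -1; i = d->next[i]) {
        (*cmps)++;
        if (strcmp(d->name[i], want) == 0)
            return i;
    }
    return -1;
}
```

Looking up the most recently added name costs one comparison per existing
entry with dir_scan but a single comparison with dir_hashed, and the gap
grows with directory size.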

Also FreeBSD only caches a limited number of directory blocks.   This
was discussed on -hackers in April.  Search for the subject "Directories
not VMIO cached at all!".  Matt Dillon posted a patch to better
cache directories (at the possible expense of wasted RAM and which breaks
NFS) in Message-ID <[EMAIL PROTECTED]>.





On hub.freebsd.org refusing to talk to dialups

1999-09-23 Thread Don Croyle

Mark Murray <[EMAIL PROTECTED]> writes:

[hub.freebsd.org now blocking IP addresses on the DUL]

> If you use your ISP's mailer as a "smarthost", you will avoid this
> problem.
> 
> Those of us in the anti-spam community think that this is a Good
> Thing.

That's one way to cut down on support mail from new users, I suppose.
This is yet another feel-good measure that will end up doing at least
as much harm as good.

At a minimum the default freebsd.mc needs something like the attached
patch.  Something prominent in the FAQ and support for defining
SMART_HOST in sysinstall is probably called for as well.

--- freebsd.mc.orig Sun Feb 28 14:52:06 1999
+++ freebsd.mc  Thu Sep 23 16:31:33 1999
@@ -53,6 +53,8 @@
 FEATURE(virtusertable, `hash -o /etc/mail/virtusertable')dnl
 dnl Uncomment to activate Realtime Blackhole List (recommended!)
 dnl FEATURE(rbl)dnl
+dnl Dialup users should uncomment and define this appropriately
+dnl define(`SMART_HOST', `your.isp.mail.server')dnl
 FEATURE(local_lmtp)dnl
 define(`LOCAL_MAILER_FLAGS', LOCAL_MAILER_FLAGS`'P)dnl
 define(`confCW_FILE', `-o /etc/mail/sendmail.cw')dnl

-- 
I've always wanted to be a dilettante, but I've never quite been ready
to make the commitment.





Re: HEADS UP: sigset_t changes committed

1999-09-30 Thread Don Lewis

On Sep 30, 11:24pm, Marcel Moolenaar wrote:
} Subject: Re: HEADS UP: sigset_t changes committed

} As for me, I'm trying to define the problem as detailed and concise as
} possible. I already have some specific thoughts and ideas. I'm thinking
} large here: real cross-compilation capabilities and such (it may be
} handy for FreeBSD/IA64)...

While proper cross-compilation would be really nice to have, it won't solve
the "make world" problem.  It would get you through "make buildworld", but
"make installworld" will overwrite the system binaries with new versions that
use the new signal syscalls that the currently running kernel doesn't support.
It would even be possible to cross-compile a new kernel, but it still has
to be installed and the system rebooted before installing userland.

In this particular case, the only thing cross-compilation would buy us
is the ability to build (but not install) 4.x binaries on a machine
running 3.x.  It sounds like some folks would be satisfied just having
that.





Re: HEADS UP: sigset_t changes committed

1999-09-30 Thread Don Lewis

On Sep 30,  4:14pm, John-Mark Gurney wrote:
} Subject: Re: HEADS UP: sigset_t changes committed
} > 
} > In this particular case, the only thing cross-compilation would buy us
} > is the ability to build (but not install) 4.x binaries on a machine
} > running 3.x.  It sounds like some folks would be satisfied just having
} > that.
} 
} I'm sorry, this is easy to fix... have a set of tools you copy to /ibin
} that are used for the install (all statically compiled binaries hopefully)
} and run the install world out of /ibin...  maybe include some binaries
} for system recovery to make sure...

... but as soon as you run the stuff in /ibin to install the new userland,
you won't even be able to run a shell script, because the newly installed
/bin/sh will be using the new signal syscalls.  The install process will
have to include installing a new kernel and will have to be followed by
an immediate reboot.





Re: kern/8324

2000-03-30 Thread Don Lewis

On Mar 20, 11:00am, Archie Cobbs wrote:
} Subject: Re: kern/8324
} Don Lewis writes:

} > This patch (vs the 3.4-STABLE version of tty.c) causes SIGIO to be
} > sent when a regular or pseudo tty becomes writeable.
} > 
} > 
} > --- tty.c.orig  Sun Aug 29 09:26:09 1999
} > +++ tty.c   Sat Mar 18 03:09:32 2000
} > @@ -2133,6 +2133,8 @@
} >  
} > if (tp->t_wsel.si_pid != 0 && tp->t_outq.c_cc <= tp->t_olowat)
} > selwakeup(&tp->t_wsel);
} > +   if (ISSET(tp->t_state, TS_ASYNC) && tp->t_sigio != NULL)
} > +   pgsigio(tp->t_sigio, SIGIO, (tp->t_session != NULL));
} > if (ISSET(tp->t_state, TS_BUSY | TS_SO_OCOMPLETE) ==
} > TS_SO_OCOMPLETE && tp->t_outq.c_cc == 0) {
} > CLR(tp->t_state, TS_SO_OCOMPLETE);
} > 
} > 
} > BTW, I had to add:
} > fcntl(1, F_SETOWN, getpid());
} > to the test program since there is no longer a default target to send
} > the signal to.  The old scheme had the defect of sending SIGIO to the
} > process group that owned the terminal, which implied that the terminal
} > had to be the controlling terminal for the process group.  This limited
} > a process to only receiving SIGIO from one terminal device even if it
} > had more than one open and it wanted to receive SIGIO from all of them.
} > Also, SIGIO was sent to the entire process group, but it may be desirable
} > to limit this to one process.  I wonder if it might make sense to go
} > back to the old default for tty devices so that processes only receive
} > SIGIO when they are in the foreground ...
} 
} Don-
} 
} After applying your patch to kern/tty.c and adding the F_SETOWN,
} the problem indeed seems to go away..
} 
} Is this patch ready to be committed, or do we need more reviewers?

Sorry for the delay, I was out of town most of last week and sick most
of this week.

It's probably safe to commit to -current if someone can give it a quick
test there.  Unfortunately I don't have a box running -current to test
it on.
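
For anyone following along, the F_SETOWN step mentioned in the quoted
message can be sketched roughly as below.  This is only an illustration
and not the test program from the PR: the names enable_sigio() and
on_sigio() are made up here, and it assumes a platform that exposes
O_ASYNC through fcntl().

```c
#define _DEFAULT_SOURCE /* glibc: expose O_ASYNC; assumed harmless elsewhere */
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Bumped by the handler; sig_atomic_t is safe to modify from a handler. */
static volatile sig_atomic_t sigio_count = 0;

static void on_sigio(int signo)
{
    (void)signo;
    sigio_count++;
}

/*
 * Ask the kernel to send SIGIO to this process for I/O events on fd.
 * Returns 0 on success, -1 if any step fails (errno is left as set by
 * the failing call).
 */
int enable_sigio(int fd)
{
    struct sigaction sa;
    int flags;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) == -1)
        return -1;

    /* There is no default target, so name the recipient explicitly. */
    if (fcntl(fd, F_SETOWN, getpid()) == -1)
        return -1;

    /* Turn on asynchronous notification for this descriptor. */
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_ASYNC) == -1)
        return -1;
    return 0;
}
```

A process that wants SIGIO from several terminals would call this once
per descriptor, which is the case the old process-group default could
not handle.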

Now, on to some more of my 6280 unread email messages :-(






Re: cvs commit: src/sys/contrib/softupdates softdep.h ffs_softdep.c

2000-06-22 Thread Don Lewis

On Jun 22, 10:30am, Adrian Chadd wrote:
} Subject: Re: cvs commit: src/sys/contrib/softupdates softdep.h ffs_softdep
} 
} [shifting conversation to -current .. ]
} 
} On Thu, Jun 22, 2000, Anders Andersson wrote:
} > on Tor, Jun 22, 2000 at 01:46:34pm +0900, Akinori -Aki- MUSHA wrote:
} > > 
} > > Yes, it has been working quite stably here too.  Besides, one must do
} > > a "tunefs -n enable" for every partition that he or she wants to do
} > > softupdates anyway, so just adding the support for softupdates to the
} > > GENERIC kernel won't hurt anyone who don't want to turn that feature
} > > on by default, except a little code increase.
} > 
} > Please take a look at what NetBSD just recently did:
} > http://www.netbsd.org/Changes/#softdepsmount
} > 
} > These changes disables the whole 'tunefs' process, and let you control
} > softupdates state with mount, (-o softdep). So all you have to do is to
} > tune your /etc/fstab to enable softupdates. I think this will make it
} > more easy to enable SOFTUPDATES by default.
} 
} I like this. Would anyone object if this was brought over from NetBSD ?

I'm pretty sure that Kirk had some reason for using tunefs.  It might take
me a while to dig up the information, though, assuming I still have it.





Re: cvs commit: src/sys/contrib/softupdates softdep.h ffs_softdep.c

2000-06-22 Thread Don Lewis

On Jun 22,  2:21am, Don Lewis wrote:
} Subject: Re: cvs commit: src/sys/contrib/softupdates softdep.h ffs_softdep
} On Jun 22, 10:30am, Adrian Chadd wrote:
} } Subject: Re: cvs commit: src/sys/contrib/softupdates softdep.h ffs_softdep
} } 
} } [shifting conversation to -current .. ]
} } 
} } On Thu, Jun 22, 2000, Anders Andersson wrote:

} } > Please take a look at what NetBSD just recently did:
} } > http://www.netbsd.org/Changes/#softdepsmount
} } > 
} } > These changes disables the whole 'tunefs' process, and let you control
} } > softupdates state with mount, (-o softdep). So all you have to do is to
} } > tune your /etc/fstab to enable softupdates. I think this will make it
} } > more easy to enable SOFTUPDATES by default.
} } 
} } I like this. Would anyone object if this was brought over from NetBSD ?
} 
} I'm pretty sure that Kirk had some reason for using tunefs.  It might take
} me a while to dig up the information, though, assuming I still have it.

Found it, see
<http://www.FreeBSD.org/cgi/getmsg.cgi?fetch=106647+109142+/usr/local/www/db/text/1998/freebsd-current/19980510.freebsd-current>.





RE: Using serial console to debug system hangs ...

2001-03-03 Thread Don Lewis

On Mar 3,  9:20pm, John Baldwin wrote:
} Subject: RE: Using serial console to debug system hangs ...
} 
} On 04-Mar-01 The Hermit Hacker wrote:
} > 
} > Wow, that was painful ... after 2 hrs, I got as far as:
} 
} Yeah, it spews out a lot of crap. :-/  You prolly want to use a 115200 serial
} console if at all possible.  Should've mentioned that earlier..

.. so I shouldn't plan on using my ASR-33 for this, I guess.




Re: Request for review [Re: /bin/ls patch round #2]

2001-03-21 Thread Don Croyle

"Andrey A. Chernov" <[EMAIL PROTECTED]> writes:

> I fully agree. wctype.h and isw*() must be implemented first instead of
> hacking or using private interface (like runes) in userland program.
> It will be easy to implement them over existing ctype mechanism masking
> runes with wchar_t. Any takers?

If we're not going to bring in CITRUS, I'd prefer to see runes junked
as an unnecessary layer of abstraction.  Doing so would break
backwards compatibility for locales, but I think we're going to end up
doing that eventually anyway.
-- 
I've always wanted to be a dilettante, but I've never quite been ready
to make the commitment.




Re: /usr/local/etc/rc.d and /etc/rc.d

2000-09-08 Thread Don Lewis

On Sep 9, 12:05am, Matthew Thyer wrote:
} Subject: Re: /usr/local/etc/rc.d and /etc/rc.d
} Neil Blakey-Milner wrote:

} > I'd prefer a dependency based system.  (cf. Eivind Eklund's newrc, at
} > http://people.FreeBSD.org/~eivind/newrc.tar.gz)

How does this compare with what NetBSD implemented?

} I haven't looked at this yet but off the top of my head, a dependency
} based system sounds overly complicated (consider ports authors) and
} unnecessarily different from other systems.

NetBSD switched to a dependency based system a while back.  Judging by
the traffic on their mail lists, it was somewhat controversial ...





Re: Repeated panic out of chgsbsize

2000-09-29 Thread Don Lewis

On Sep 29, 11:30am, Greg Lehey wrote:
} Subject: Repeated panic out of chgsbsize
} In the past couple of days, I've had a couple of panics out of chgsbsize:
} 
} (kgdb) bt

 [ snip ]

} #12 0xc01cbac9 in panic (fmt=0xc0356920 "reducing sbsize: lost count, uid = %d") at ../../kern/kern_shutdown.c:553
} #13 0xc01c8d7b in chgsbsize (uid=50, diff=-17520, max=9223372036854775807) at ../../kern/kern_proc.c:206
} #14 0xc01ee6aa in sbrelease (sb=0xcdc091f4, so=0xcdc09180) at ../../kern/uipc_socket2.c:453
} #15 0xc01eb9fb in sofree (so=0xcdc09180) at ../../kern/uipc_socket.c:261
} #16 0xc0221e0b in in_pcbdetach (inp=0xce1c3aa0) at ../../netinet/in_pcb.c:542
} #17 0xc022c462 in tcp_close (tp=0xce1c3b60) at ../../netinet/tcp_subr.c:711
} #18 0xc0229bf6 in tcp_input (m=0xc0e96500, off0=20, proto=6) at ../../netinet/tcp_input.c:2012
} #19 0xc02247ee in ip_input (m=0xc0e96500) at ../../netinet/ip_input.c:756
} #20 0xc022484b in ipintr () at ../../netinet/ip_input.c:784
} #21 0xc0309195 in swi_net_next ()

That version of the per-uid accounting implementation has some race
conditions between the kernel top and bottom halves.  I'd recommend
upgrading to PRE_SMPNG.





Re: installworld failure - libsdbm.a

2000-11-05 Thread Don Lewis

On Nov 4, 11:54am, Kent Stewart wrote:
} Subject: Re: installworld failure - libsdbm.a
} 
} 
} Steven Farmer wrote:
} > 
} > After this morning's cvsup and buildworld, installworld failed trying
} > to build libsdbm.a.  I worked around the problem by adding chmod to
} > Makefile.inc1 as shown below.  BTW - isn't it kind of weird for a
} > library to be _built_ at installworld time?
} 
} Yes, it is. It is supposed to be built in buildworld where it is also
} chmod'ed appropriately. Something triggers the build during
} installworld, which is a place they don't want to add chmod to. I have
} had it hit me once.

I had the same thing happen to me yesterday about six hours into
a -current "make release".  The problem didn't recur when I reran
"make release".  One possible quirk is that I am mounting the scratch
area from a 4.1-stable NFS server.  Notice that only the .a file is
getting built, and not the .o files.  I suspect that the file
timestamps are getting messed up, causing make to rebuild the .a
file.

} I added chmod to the progs line like you did and
} it did the build. I have an idea that something didn't trigger the
} build in buildworld and it was needed during the installworld. It has
} never been a problem since. I had a patch like you created and ran it
} after every cvsup but then I found out that I didn't need it. I
} capture the make output for buildworld and installworld and it hasn't
} failed since I started doing that.
} 
} Kent
} 
} > 
} > Cheers,
} > 
} > Steve
} > 
} > -
} > ===> gnu/usr.bin/perl/library/SDBM_File
} > cd /usr/obj/usr/src/gnu/usr.bin/perl/library/SDBM_File/ext/SDBM_File ; make -B install  INSTALLPRIVLIB=/usr/libdata/perl/5.00503  INSTALLARCHLIB=/usr/libdata/perl/5.00503/mach
} > cd sdbm && make all
} > rm -rf libsdbm.a
} > ar cr libsdbm.a sdbm.o  pair.o  hash.o && : libsdbm.a
} > chmod 755 libsdbm.a
} > chmod:No such file or directory
} > *** Error code 1





Re: keymaps

1999-01-25 Thread Don Lewis
On Jan 21,  9:40pm, Warner Losh wrote:
} Subject: Re: keymaps
} In message <199901220043.laa22...@lightning.itga.com.au> Gregory Bond writes:
} : my vote: A version of the standard keymap with CapsLock and LeftCtl
} : functions swapped so the control key is under my left finger like
} : God intended!
} 
} What's wrong with us.unix.kbd?

Two things for me:

It's not in the sysinstall menu.

I'm not sure I like the Esc <-> ~` swap.  

Does anyone know of any decent PC keyboards with a Unix-friendly layout?
I'm pretty happy with the layout on a Sun Type-5 keyboard, which puts
Esc right above Tab and to the left of 1 (where PC's generally have ~`).
The Return key is wide, but is confined to the home row, and Backspace
is also wide and is in the row immediately above it.  This leaves room
in the top row (below the function keys, where  PC's put Backspace),
for |\, which PC keyboards put in various random places, and ~`.

To Unsubscribe: send mail to majord...@freebsd.org
with "unsubscribe freebsd-current" in the body of the message


Re: Heads up! New swapper and VM changes have been committed to -4.x

1999-01-26 Thread Don Lewis
On Jan 26, 12:20pm, Dag-Erling Smorgrav wrote:
} Subject: Re: Heads up! New swapper and VM changes have been committed to -
} Brian Feldman  writes:
} > On 24 Jan 1999, Dag-Erling Smorgrav wrote:

} > > These are dynamically linked, and will automatically pick up the new
} > > libkvm.
} > But (most) still require the structures to be the exact same way,
} > which is the reason for the recompile anyway... don't forget that!
} 
} No, because the libkvm interface has not changed, only its internals.
} libkvm must be updated to be able to talk to the kernel, and
} applications which use it must be relinked with it. In the case of
} dynamically linked applications, this is done automatically at load
} time. Or am I reading this wrong?

It depends on what has changed.  If the application asks libkvm to
fetch some structure from the kernel, and the application's idea of
what the structure looks like is different than what is compiled into
the kernel and libkvm, the application will not work correctly.  For
instance, if the layout of the proc structure changes, an application
that was compiled with the old structure definition that calls
kvm_getprocs() will get a pointer to a structure with the new layout.
When the application dereferences the pointer that kvm_getprocs() returns
at some offset into the structure, it will be looking at some other part
of the proc structure than what it wants.
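
To make the failure mode concrete, here is a small sketch.  The two
structures are invented for illustration and are NOT the real struct
proc; stale_read_uid() is likewise a made-up name.  It mimics a binary
built against the old layout reading a field at the old offset from data
that actually has the new layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layout the application was compiled against. */
struct proc_old {
    int p_pid;
    int p_uid;
};

/* Hypothetical layout after a kernel change inserted a field. */
struct proc_new {
    int p_pid;
    int p_flag;   /* new field: shifts everything after it */
    int p_uid;
};

/*
 * Read "p_uid" the way a stale binary would: fetch the bytes at the OLD
 * offset out of an object that really has the NEW layout.
 */
int stale_read_uid(const struct proc_new *kp)
{
    int uid;

    memcpy(&uid, (const char *)kp + offsetof(struct proc_old, p_uid),
        sizeof(uid));
    return uid;
}
```

With these layouts the stale read returns whatever is stored in p_flag
rather than p_uid.  No amount of relinking against a new libkvm fixes
that; only recompiling against the new header does.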



Re: btokup().. patch to STYLE(9) (fwd)

1999-01-31 Thread Don Lewis
On Jan 29, 12:05pm, Sheldon Hearn wrote:
} Subject: Re: btokup().. patch to STYLE(9) (fwd)

} The reason I'm interested in this (now tiresome) thread is that I'd much
} rather have to read
} 
}   /*
}    * Bail out if the time left to next transaction is less than
}    * the duration of the previous transaction.
}    */
}   if (t % u - n % u < d % u) {
} 
} than
} 
}   if (((t % u) - (n % u)) < (d % u)) {
} 
} Giving folks the go-ahead to use parens as a form of documentation is
} misguided and will end in tears. MHO.

This is a fairly trivial example, but I find the second version slightly
easier to read at a glance.  I do think it's overly parenthesized, though.
I prefer

if ((t % u - n % u) < (d % u)) {

or 

if ((t % u - n % u) < d % u) {

because they are less cluttered.



Re: btokup().. patch to STYLE(9) (fwd)

1999-01-31 Thread Don Lewis
On Jan 29,  9:13am, Poul-Henning Kamp wrote:
} Subject: Re: btokup().. patch to STYLE(9) (fwd)
} 
} On the other hand style(9) should still firmly outlaw stuff like:
} 
}   /* wait 10 ms */
}   if (((error = tsleep((caddr_t)dev, PPBPRI | PCATCH,
}   "ppbpoll", hz/100)) != EWOULDBLOCK) != 0) {
}   return (error);
}   }

The "!= 0" is obviously bogus, but what about:

	if ((error = tsleep((caddr_t)dev, PPBPRI | PCATCH, "ppbpoll", hz/100))
	    != EWOULDBLOCK) {
		return (error);
	}

It would be better if the "!=" fit on the previous line.

What if the expression fit on one 80 character line?

BTW, something I like that I picked up from Paul Vixie's code is indenting
all the arguments to a function by the same amount.  Forcing an unnecessary
line wrap:

	if ((error = tsleep((caddr_t)dev, PPBPRI | PCATCH,
			    "ppbpoll", hz/100)) != EWOULDBLOCK) {
		return (error);
	}

which isn't real clean because of the trailing "!= EWOULDBLOCK".  The
downside of this style is that some arguments won't fit in the available
space or the argument list will occupy quite a few lines if the arguments
start too far to the right.




Re: btokup().. patch to STYLE(9) (fwd)

1999-01-31 Thread Don Lewis
On Jan 29,  8:34am, Brian Somers wrote:
} Subject: Re: btokup().. patch to STYLE(9) (fwd)
} 
} My argument is that this sort of thing gets out of hand.  I've seen 
} things such as
} 
}   if (((a == b) || (c == d)))
} 
} where a, b, c & d are just simple variables - there are so many 
} redundant brackets that you have to double-check that there isn't 
} some weird grouping

You can pretty clearly dump the outer parens, since it makes no sense
to write "(expression)" instead of "expression".  In general, "a OP b"
should not be parenthesized if both "a" and "b" are atoms unless the
context requires it.

In general my preferred style doesn't use parentheses in expressions
using "+-*/" according to their natural precedence rules.  I might
drop the whitespace around "*", just like you'd write "2n" in mathematics.
Likewise, I don't use parentheses in logical expressions or bitwise
expressions where the terms are atoms.  Expressions used as terms in
logical expressions or comparison expressions might be parenthesized
if they are complicated so I can find the extent of the expression by
using '%' in vi.  I always parenthesize the interfaces between bitwise
and other expressions, since K&R admits that C botched the precedence
of the bitwise operators and this seems to be one common place for
bugs to occur.  In general, I always parenthesize non-atomic arguments to
the shift and ternary operators.
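
A short sketch of the bitwise case, for anyone who hasn't been bitten by
it yet.  FLAG_BUSY and both function names are invented for this
example.  The second function is written with explicit parentheses that
show how the unparenthesized form "state & FLAG_BUSY == 0" is actually
parsed, since "==" binds tighter than "&".

```c
#define FLAG_BUSY 0x4u

/* What the programmer means: "is the BUSY bit clear?" */
int busy_clear_intended(unsigned state)
{
    return (state & FLAG_BUSY) == 0;
}

/*
 * How "state & FLAG_BUSY == 0" parses without the parentheses:
 * FLAG_BUSY == 0 is evaluated first and yields 0, so the result is
 * state & 0, which is 0 for every input.
 */
int busy_clear_as_parsed(unsigned state)
{
    return state & (FLAG_BUSY == 0);
}
```

The intended version returns 1 for 0x1 and 0 for 0x4; the as-parsed
version returns 0 for both, which is why the parentheses at that
boundary are worth the clutter.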





Re: Slow seq. write on Seagate ST36530N

1999-02-22 Thread Don Lewis
On Feb 19,  2:15pm, "Kenneth D. Merry" wrote:
} Subject: Re: Slow seq. write on Seagate ST36530N
} 
} The Write Cache Enable (WCE) bit is in mode page 8.  To check it:
} 
} camcontrol modepage -n da -u 1 -v -m 8
} 
} To edit the mode page:
} 
} camcontrol modepage -n da -u 1 -v -m 8 -e

To make this change permanent, you need to do

camcontrol modepage -n da -u 1 -v -m 8 -e -P 3





ELF interpreter /compat/linux/lib/ld-linux.so.1 not found

1999-02-22 Thread Don Sullivan

RSI's IDL/ENVI product's license manager (lmgrd) previously (2.2.8)
ran just fine. (linux emulation) Now that I've upgraded to 3.1, it
exits with an abort trap, and the above message...

Any thoughts/suggestions would be REALLY welcome.

Thanks in advance,
Don

P.S. the library indicated (/compat/linux/lib/ld-linux.so.1) most
definitely IS there, new Globetrotter lmgrd acts identically, the
balance of the (linux) distribution, i.e. IDL and ENVI run just 
fine, albeit in demo mode.
---
   Don Sullivan
   NASA Ames Research Center
   MS 242-4
   Moffett Field, CA 94035-1000
   Voice: 650 604 0526
   Fax:   650 604 4680
   email: dsulli...@gaia.arc.nasa.gov
---







RE: Hyperthreading and machdep.cpu_idle_hlt

2003-01-31 Thread Don Bowman
> From: Matthew Dillon [mailto:[EMAIL PROTECTED]]
> 
> :The cache and most of the execution hardware is shared.  The execution
> :units can run something like 4 instructions per clock.  If the "idle"
> :logical core is in a spinloop, then it is generating instructions for
> :execution, so you are dividing the execution resources between one context
> :that is doing real work, and the other context that is burning off the
> :"excess" resources.  Overall, it is a huge loss.  It is absolutely essential
> :that logical cpus be halted when they are not doing useful work.
> 
> Ah, that makes sense.  Are the two logical cpus shared 50-50?

Hyperthreading is also called simultaneous multi-threading (Hyper-Threading
is a trademark of Intel, SMT is the general term).
The two logical CPUs are like a co-operative scheduler. Whenever there
is a stall on one, the other wakes up on the same tick.
The most common cause for the stall is an access to memory, i.e. when
the first 'CPU' does a load-word, the memory controller tries to
get that from L1->L2->L3->memory, with increasing latency. The
other 'CPU' starts executing on the same cycle as the latency
to the memory starts, and only stops when it too stalls.

Thus the worst thing you could have would be a nop-loop with
no stalls, which would squeeze the other to death.

This is common in the network-processor world (e.g. AMCC, etc)
since those applications are governed by memory latency.

As the clock rate of memory has gone up, the overall latency to
the first word has stayed relatively constant, so even though
DDR 266 memory may have a much faster throughput, it takes
just as long for that first access.

Intel also has a speculative prefetch which tries to guess
which memory will be needed next, and brings that in. There is
an explicit prefetch in the SSE2/MMX set if you know better
than the processor. This is good for, e.g., prefetching both
halves of a tree before you do the compare.
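
A rough sketch of the tree case, under some assumptions: it uses the
GCC/Clang builtin __builtin_prefetch() rather than a specific SSE
instruction, and struct node and tree_find() are invented for the
example.  Whether it helps at all depends on the tree size and the
memory system, so treat it as something to measure and not a
guaranteed win.

```c
#include <stddef.h>

/* Hypothetical binary search tree node. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/*
 * Look up key, hinting both children into the cache before the compare
 * decides which one is needed.  A prefetch is only a hint and does not
 * fault, so passing a NULL child pointer is harmless.
 */
const struct node *tree_find(const struct node *n, int key)
{
    while (n != NULL) {
        __builtin_prefetch(n->left);
        __builtin_prefetch(n->right);
        if (key == n->key)
            return n;
        n = (key < n->key) ? n->left : n->right;
    }
    return NULL;
}
```

The idea is that the memory latency for the next level overlaps with the
compare on the current one, at the cost of fetching one child that will
not be used.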

In practise I've found Intel's numbers to be true, that the
SMT gives you a ~20% boost, implying that there is nowhere
close to a 50-50% split in normal use.

--don




RE: L440gx+ serial BIOS needs text mode

2003-02-03 Thread Don Bowman
> From: Lucky Green [mailto:[EMAIL PROTECTED]]
>
> 1) Is there a way to prevent the FreeBSD kernel from ever switching into
> this different video mode, thus allowing me to continue to use the
> built-in serial terminal? The machine is a headless server, I don't care
> if video works as long as I can pull out a serial terminal.

enable the comconsole option, set the baud rate.
As for the colours etc, our machines have BIOS that do that
too... I added a:

set console=comconsole

\ What's all this about then? It resets the console (to fix up the bios
\ colour change), and outputs a banner.
cr .( ^[c)

to the start of /boot/loader.rc

which is a vt100 reset code. This isn't a general purpose
solution, but works for me. Our BIOS also changes the colour
just as it exits, leaving it black on black :)

--don




RE: L440gx+ serial BIOS needs text mode

2003-02-03 Thread Don Bowman
> From: Terry Lambert [mailto:[EMAIL PROTECTED]]
 ...
> 
> Anyway, it was a particular problem with the SuperMicro motherboards
> with the AMI BIOS that's been the subject of the rest of this
> discussion (i.e. the ones that kick out the escape sequence at the
> end, for no good reason, except to screw up non-monochrome VTxxx
> emulators, and make it hard to use a UNIX box as the serial console).

FYI, I've found that running under 'screen' fixes the problem
with the odd escape sequences.
One type of system we have puts out an escape sequence that
is escape followed by 12 [. Not sure what that is :)
The other works just great, except it switches to black on
black just as it switches to the OS :)

--don




problem with if_em and polling with RXSEQ interrupt? (and patch)

2003-02-28 Thread Don Bowman
In the if_em driver, it appears there is a possibility of 
a system crash when POLLING is enabled in conjunction with
a link change.

em_disable_intr leaves the RXSEQ interrupt enabled (which
occurs when a link goes up or down). The em_intr routine,
when in polling mode, just returns (with the interrupt
still active), which then re-enters it. The kern_poll
will never get a chance to do its POLL_AND_CHECK_STATUS in
this case.

I'm proposing to add to em_intr a call to em_poll with
POLL_AND_CHECK_STATUS if there is an interrupt pending
and we are in polling mode:

1086c1086,1087
<   if (ifp->if_ipending & IFF_POLLING)
---
>   if (ifp->if_ipending & IFF_POLLING) {
>   em_poll(ifp, POLL_AND_CHECK_STATUS, 1);
1087a1089
>   }

I also propose to enable the link status change interrupt in
em_disable_intr():
2259c2261
<   (0x & ~E1000_IMC_RXSEQ));
---
>   (0x & ~(E1000_IMC_RXSEQ | E1000_ICR_LSC)));

Does anybody have any comments on this?



Re: LOR tcp_input.c vs. tcp_usrreq.c (was: Re: 2 LORs on my NFSserver.)

2003-08-16 Thread Don Lewis
On 16 Aug, Tilman Linneweh wrote:
> * Tilman Linneweh [Fr, 15 Aug 2003 at 16:17 GMT]:
>> 
>> My CURRENT is already a bit old:
>> 
>> # uname -a
>> FreeBSD polly.arved.de 5.1-CURRENT FreeBSD 5.1-CURRENT #1: Sun Jul 20
>> 01:00:14 CEST 2003
>> [EMAIL PROTECTED]:/usr/obj/usr/src/CURRENT/sys/POLLY  i386
> 
> I updated my CURRENT to 
> 
> polly# uname -a
> FreeBSD polly.arved.de 5.1-CURRENT FreeBSD 5.1-CURRENT #1: Sat Aug 16
> 10:11:52 CEST 2003
> [EMAIL PROTECTED]:/usr/obj/usr/source/CURRENT/sys/POLLY  i386
> 
> and this LOR is reproducible.
>  
>> This happened while the machine was NFS-serving around 3 clients with
>> normal udp NFS and a fourth client tried to mount something via
>> mount_nfs -T -a 2
> 
> The problem is the client with TCP mounts. I tried this time with a single
> NetBSD client that does a TCP mount and cd'd to the mounted directory.
> 
> lock order reversal
>  1st 0xc1a17278 inp (inp) @ /usr/source/CURRENT/sys/netinet/tcp_input.c:654
>  2nd 0xc046bd6c tcp (tcp) @ /usr/source/CURRENT/sys/netinet/tcp_usrreq.c:621
> Stack backtrace:
> backtrace(1,0,,c0445068,c04451d0) at backtrace+0x12
> witness_lock(c046bd6c,8,c03c334c,26d,0) at witness_lock+0x55e
> _mtx_lock_flags(c046bd6c,0,c03c334c,26d) at _mtx_lock_flags+0x7d
> tcp_usr_rcvd(c1ce8800,80) at tcp_usr_rcvd+0x1b
> soreceive(c1ce8800,c891ab1c,c891ab28,c891ab20,0) at soreceive+0x815
> nfsrv_rcv(c1ce8800,c1a70780,4) at nfsrv_rcv+0x75
> sowakeup(c1ce8800,c1ce884c) at sowakeup+0x7f
> tcp_input(c0b9ac00,14) at tcp_input+0x11f6
> ip_input(c0b9ac00) at ip_input+0x7c8
> swi_net(0) at swi_net+0xe6
> ithread_loop(c0b87180,c891ad48,c0b87180,c0221660,0) at ithread_loop+0x11c
> fork_exit(c0221660,c0b87180,c891ad48) at fork_exit+0xab
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xc891ad7c, ebp = 0 ---
> Debugger("witness_lock")
> Stopped at  Debugger+0x45:  xchgl   %ebx,in_Debugger.0
> 

This is a known issue.

-- Forwarded message --
From: Don Lewis <[EMAIL PROTECTED]>
 Subject: Re: LOR in NFS server
Date: Thu, 24 Apr 2003 21:20:56 -0700 (PDT)
  To: [EMAIL PROTECTED]
  Cc: [EMAIL PROTECTED]

On 24 Apr, Gordon Tetlow wrote:
> I generated it while running nessus against my local machine.
> 
> lock order reversal
>  1st 0xc9384c44 inp (inp) @ /local/usr.src/sys/netinet/tcp_input.c:649
>  2nd 0xc05aa84c tcp (tcp) @ /local/usr.src/sys/netinet/tcp_usrreq.c:621
> Stack backtrace:
> backtrace(c04e9f03,c05aa84c,c04f0770,c04f0770,c04f1ae4) at backtrace+0x17
> witness_lock(c05aa84c,8,c04f1ae4,26d,0) at witness_lock+0x692
> _mtx_lock_flags(c05aa84c,0,c04f1ae4,26d,0) at _mtx_lock_flags+0xb2
> tcp_usr_rcvd(c8a63800,80,c04ea514,df0e9a9c,3b9aca00) at tcp_usr_rcvd+0x30
> soreceive(c8a63800,df0e9ad8,df0e9ae4,df0e9adc,0) at soreceive+0x86a
> nfsrv_rcv(c8a63800,c6d4fb00,4,34,10430) at nfsrv_rcv+0x8a
> sowakeup(c8a63800,c8a6384c,c04f11d5,434,108) at sowakeup+0x97
> tcp_input(c21f5400,14,c0304f91,df0e9c5c,c02f60ba) at tcp_input+0x1341
> ip_input(c21f5400,0,c04efede,e9,c21bd280) at ip_input+0x7b0
> swi_net(0,0,c04e4eed,217,c21c73c0) at swi_net+0x111
> ithread_loop(c21c6100,df0e9d48,c04e4d5d,314,c21c8d10) at ithread_loop+0x16c
> fork_exit(c02ec2d0,c21c6100,df0e9d48) at fork_exit+0xc0
> fork_trampoline() at fork_trampoline+0x1a
> --- trap 0x1, eip = 0, esp = 0xdf0e9d7c, ebp = 0 ---


Hmn ... does NFS over TCP even work with a -current box as the server?
It looks like tcp_input() has grabbed the locks in tcbinfo and inp, and
then tcp_usr_rcvd() attempts to grab the same locks.


I can think of three possible ways of fixing this problem.

1) Drop the locks in tcp_input() before calling sorwakeup() and grab
   them again if necessary.  One has to be careful not to break
   anything by doing this.  This also adds overhead for non-NFS
   traffic.

2) Never call soreceive() from nfsrv_rcv(), always wake nfsd instead.
   This has the advantage of minimizing the amount of time that the
   locks are held, but increases overhead under lightly loaded
   conditions.

3) Somehow tell tcp_usr_rcvd() not to attempt to grab the locks in
   this specific case.

-- End forwarded message --

-- Forwarded message --
From: Jeffrey Hsu <[EMAIL PROTECTED]>
 Subject: Re: LOR in NFS server
Date: Fri, 25 Apr 2003 01:02:56 -0700
  To: [EMAIL PROTECTED]
  Cc: [EMAIL PROTECTED]

  > 1st 0xc9384c44 inp (inp) @ /local/usr.src/sys/netinet/tcp_input.c:649
  > 2nd 0xc05aa84c tcp (tcp) @ /local/usr.src/sys/netinet/tcp_usrreq.c:621

This old nag warning has been there since last year and was first reported
by Lars Eggert <[EMAIL PROTECTED]>.  I ma

freebsd-current@freebsd.org

2003-08-19 Thread Don Lewis
On 19 Aug, Mark Sergeant wrote:
> Hi All,
> 
>   When trying to compile a kernel for my 8 cpu DELL 8450's I receive an
> extremely puzzling error, I get a bunch of errors when compiling a kernel
> that has the following options in it...
> 
> options WITNESS
> options NETSMB
> options NETSMBCRYPTO
> options LIBMCHAIN
> options LIBICONV
> options PAE
> options SMP 
> options APIC_IO
> 
> Without  PAE SMP or APIC_IO the kernel will compile fine. With these
> options I get the following error when compiling the sym scsi driver.

Take a look at /usr/src/sys/i386/conf/PAE.  It says:

# What follows is a list of drivers that are normally in GENERIC, but either
# don't work or are untested with PAE.  Be very careful before enabling any
# of these drivers.  Drivers which use DMA and don't handle 64 bit physical
# address properly may cause data corruption when used in a machine with more
# than 4 gigabytes of memory.

and under this comment it lists the sym and usb drivers.
___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: make buildworld errors (libcam)

2003-09-03 Thread Don Lewis
On  3 Sep, Michael Bretterklieber wrote:
> Hi,
> 
> buildworld fails (cvsup some minutes ago):
> In file included from /usr/src/sys/cam/scsi/scsi_da.c:51:
> /usr/src/sys/sys/taskqueue.h:33:2: #error "no user-servicable parts
> inside"
> mkdep: compile failed

The following patch works for me:

Index: sys/cam/scsi/scsi_da.c
===
RCS file: /home/ncvs/src/sys/cam/scsi/scsi_da.c,v
retrieving revision 1.157
diff -u -r1.157 scsi_da.c
--- sys/cam/scsi/scsi_da.c  3 Sep 2003 04:46:28 -   1.157
+++ sys/cam/scsi/scsi_da.c  3 Sep 2003 07:35:54 -
@@ -48,7 +48,9 @@
 #include 
 #include 
 #include 
+#ifdef _KERNEL
 #include 
+#endif /* _KERNEL */
 
 #include 
 



dirtybuf: 0xc643f000 interlock is not locked but should be

2003-09-03 Thread Don Lewis
I just upgraded to a fresh version of -current and started getting a lot
of these vnode lock violation messages when running with the
DEBUG_VFS_LOCKS kernel option.

I only ever saw the stack trace below, but it is not obvious to me that
other callers of getdirtybuf() would not have the same problem with the
vnode interlock.



dirtybuf: 0xc643f000 interlock is not locked but should be
Debugger("Lock violation.
")
Stopped at  Debugger+0x54:  xchgl   %ebx,in_Debugger.0
db> bt
No such command
db> tr
Debugger(c055816e,c0565751,c643f000,c05581a5,eb68) at Debugger+0x54
vfs_badlock(c05581a5,c0565751,c643f000,0,eba0) at vfs_badlock+0x45
assert_vi_locked(c643f000,c0565751,0,c61e8850,c0616f80) at assert_vi_locked+0x3a
getdirtybuf(ebb4,0,1,d2899610,1) at getdirtybuf+0xee
flush_deplist(c64532cc,1,ebdc,ebe0,0) at flush_deplist+0x43
flush_inodedep_deps(c641c000,6d45b,,c6507a44,124) at flush_inodedep_deps+0xa3
softdep_sync_metadata(eca4,0,c0565af4,124,0) at softdep_sync_metadata+0x87
ffs_fsync(eca4,c054a4a8,c0558b14,ad8,0) at ffs_fsync+0x3b9
fsync(c61e8850,ed10,c056cc5c,3eb,1) at fsync+0x1d4
syscall(2f,2f,2f,8054cdc,bfbfeda0) at syscall+0x273
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (95, FreeBSD ELF32, fsync), eip = 0x480cf38f, esp = 0xbfbfe7ec, ebp = 0xbfbfedc8 ---



using tip on machine that has COMCONSOLE set to serial

2003-09-16 Thread Don Bowman

This may be a dumb question, but I have 
a situation where machine A and B both have
enabled serial console. I'm ssh'ing into A to
try and debug a problem on B. I'm trying to
use tip, but am getting interference from the 
fact that A also has a serial console.

If I disable the getty, it's a bit better.

Is there a way to make this work reliably, or
am I SOL?

--don


RE: using tip on machine that has COMCONSOLE set to serial

2003-09-17 Thread Don Bowman
From: Terry Lambert [mailto:[EMAIL PROTECTED]
> Don Bowman wrote:
> > This may be a dumb question, but I have
> > a situation where machine A and B both have
> > enabled serial console. I'm ssh'ing into A to
> > try and debug a problem on B. I'm trying to
> > use tip, but am getting interference from the
> > fact that A also has a serial console.
> > 
> > If I disable the getty, it's a bit better.
> > 
> > Is there a way to make this work reliably, or
> > am I SOL?
> 
> Use or modify a getty to require multiple CR's to activate.  Or
> use one that only activates on a break.
> 
> Best would be to use a getty that respected lock files, needed
> 2 CR's to start after off-to-on DTR/DCD transition (you will be
> using a NULL-modem cable), and your tip/cu/whatever program did
> appropriate locking, and knew how to back off.
> 
> Then you could put the getty's back-to-back and they would not
> chat each other to death, and you could call out of the one
> machine into the other, and your local getty would not eat half
> the characters.
> 
> See also "uugetty" and "mgetty" in ports.
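
The lock file idea mentioned above can be sketched roughly as follows.
This is only an assumption about how a cooperating getty and tip/cu
might coordinate: the helper names are hypothetical, and the lock path
is whatever both programs agree on (traditionally something like a
LCK..ttyd0 file in a spool directory).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: take a UUCP-style lock on a serial port.
 * Returns 0 on success, -1 on failure (errno is EEXIST if the port
 * is already locked by someone else).
 */
static int
tty_lock_acquire(const char *lockpath)
{
	char buf[32];
	int fd, len, saved;

	assert(lockpath != NULL);
	/* O_EXCL makes creation fail if another process holds the lock. */
	fd = open(lockpath, O_WRONLY | O_CREAT | O_EXCL, 0644);
	if (fd == -1)
		return (-1);
	/* Record our PID so other programs can tell who owns the port. */
	len = snprintf(buf, sizeof(buf), "%10ld\n", (long)getpid());
	if (write(fd, buf, (size_t)len) != len) {
		saved = errno;
		close(fd);
		unlink(lockpath);
		errno = saved;
		return (-1);
	}
	close(fd);
	return (0);
}

/* Hypothetical helper: drop the lock taken by tty_lock_acquire(). */
static int
tty_lock_release(const char *lockpath)
{
	assert(lockpath != NULL);
	return (unlink(lockpath));
}
```

A dial-out program would call tty_lock_acquire() before opening the
port and back off when it fails with EEXIST; a getty that respects the
same file would stay off the line until the lock is released.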

What I ended up doing, which worked OK, was to change /etc/ttys
on the machine I wanted to run tip on to comment out the 'ttyd0'
line, and HUP init. I then installed the 'minicom' port and used
it. The machine I was running it on has to be quiescent so no
kernel printfs occur. minicom used /dev/cuaa0.

Bruce Evans suggested using 'db' to poke a '0xc3' into the
kernel printf start to make it return right away. If this
were a bigger problem for me, I would give that a try.

--don


RE: Bad performance

2003-09-17 Thread Don Bowman
From: sebastian ssmoller [mailto:[EMAIL PROTECTED]
 ...
> 
> I turned off ACPI on startup and voila :) : gdm starts twice as
> fast as before (!) (30s -> 15-17s)
> 
> Can anyone explain to me why, please?

I wonder how hot your processor is.  Perhaps ACPI is throttling
the clock back, either by duty cycle or by frequency.
You may be able to set the power mode in your BIOS, perhaps
to 'full power always'.

lmmon might show something.


ThinkPad R40 hangs during ACPI power down

2003-09-24 Thread Don Lewis
I've got an IBM ThinkPad R40 that hangs when I do a "shutdown -p".  It
wedges after printing "Powering system off using ACPI".  The display
stays on, and judging by the heat, it seems that the CPU is on as well.
It doesn't respond to the keyboard, so I haven't been able to get into
DDB.  The only thing I can do at this point is to hold the power button
down to force it to power off.  The next boot is clean.

I've seen the same behaviour with September 8th and September 21st
versions of 5.1-CURRENT.

Attempting to use "acpiconf -s" to suspend produces similar hangs.


I tried compiling a version of the kernel with the ACPI_DEBUG option
listed in NOTES, but buildkernel dies here:

cc -c -O -pipe -mcpu=pentiumpro -Wall -Wredundant-decls -Wnested-externs -Wstric
t-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -fforma
t-extensions -std=c99 -g -nostdinc -I-  -I. -I/usr/src/sys -I/usr/src/sys/contri
b/dev/acpica -I/usr/src/sys/contrib/ipfilter -I/usr/src/sys/contrib/dev/ath -I/u
sr/src/sys/contrib/dev/ath/freebsd -D_KERNEL -include opt_global.h -fno-common -
finline-limit=15000 -fno-strict-aliasing  -mno-align-long-strings -mpreferred-st
ack-boundary=2 -ffreestanding -Werror  /usr/src/sys/dev/acpica/acpi_button.c
/usr/src/sys/dev/acpica/acpi_button.c: In function `acpi_button_fixed_handler':
/usr/src/sys/dev/acpica/acpi_button.c:246: error: `_Dbg' undeclared (first use in this function)
/usr/src/sys/dev/acpica/acpi_button.c:246: error: (Each undeclared identifier is reported only once
/usr/src/sys/dev/acpica/acpi_button.c:246: error: for each function it appears in.)
*** Error code 1

Stop in /usr/obj/usr/src/sys/ACPI_DEBUG.
*** Error code 1

Stop in /usr/src.

NOTES says that to use this option, the Intel code must have the
USE_DEBUGGER flag set.  I didn't see references to this in the code, but
there are a bunch of #ifdefs that refer to ACPI_DEBUGGER.  I tried
adding this to my kernel configuration, but config barfs on it.

The ACPI asl file is at 
and the dmesg.boot is at
.  I downloaded the
ACPI spec and attempted to use it to decipher my asl file, but I decided
it was hopeless since I didn't even know how far the ACPI code was getting.


Re: ThinkPad R40 hangs during ACPI power down

2003-09-26 Thread Don Lewis
On 25 Sep, Nate Lawson wrote:
>> I've got an IBM ThinkPad R40 that hangs when I do a "shutdown -p".  It
>> wedges after printing "Powering system off using ACPI".
>>
>> Attempting to use "acpiconf -s" to suspend produces similar hangs.
> 
> Your system is halting correctly but powering off is failing.  A cursory
> glance at your ASL shows nothing particularly amiss.  It's very similar to
> my laptop (T23).
> 
>> I tried compiling a version of the kernel with the ACPI_DEBUG option
>> listed in NOTES, but buildkernel dies
> 
> This was fixed on Sept 21 so cvsup and recompile.  Set hw.acpi.verbose=1
> in loader.conf to get more messages.

I didn't get much more ...

acpi_cmbat1: battery initialization failed, giving up  
Waiting (max 60 seconds) for system process `vnlru' to stop...stopped  
Waiting (max 60 seconds) for system process `bufdaemon' to stop...stopped  
Waiting (max 60 seconds) for system process `syncer' to stop...stopped  
  
syncing disks, buffers remaining... 11 11   
done  
Uptime: 1m10s  
Powering system off using ACPI 


> To debug this, please boot a newer kernel with the ACPI_DEBUG option with
> the following options in loader.conf:
> 
> debug.acpi.layer="ACPI_ALL_COMPONENTS ACPI_ALL_DRIVERS"
> debug.acpi.level="ACPI_LV_FUNCTIONS"
> 
> You'll get spammed with way too many messages on boot but just ignore
> these.

You're not kidding.  I gave up after 2 hours and 7.5 MB of output.  It
looks like it's looping, see below ...


> Then do shutdown -p and log the printed messages (hopefully you
> have a serial console).
> 
> I'll map the debugging tunables to a sysctl since it would be better if
> you could just set this just before testing rather than for the full boot.

Yeah, that would be a lot better.



 psscope-0236 [10] PsPushScope   : Entry 0xc402e9a8
   SYNCH-0156 [11] AcpiOsWaitSemaphore   : Entry
   SYNCH-0162 [11] AcpiOsWaitSemaphore   : Exit- AE_OK
   SYNCH-0314 [11] AcpiOsSignalSemaphore : Entry
   SYNCH-0336 [11] AcpiOsSignalSemaphore : Exit- AE_OK
  utmisc-1062 [11] UtPushGenericState: Entry
  utmisc-1070 [11] UtPushGenericState: Exit-
 psscope-0271 [10] PsPushScope   : Exit- AE_OK
   SYNCH-0156 [10] AcpiOsWaitSemaphore   : Entry
   SYNCH-0162 [10] AcpiOsWaitSemaphore   : Exit- AE_OK
   SYNCH-0314 [10] AcpiOsSignalSemaphore : Entry
   SYNCH-0336 [10] AcpiOsSignalSemaphore : Exit- AE_OK
 psparse-0405 [10] PsNextParseState  : Entry 0xc403b128
 psparse-0501 [10] PsNextParseState  : Exit- AE_OK
  psargs-0489 [10] PsGetNextSimpleArg: Entry 0001
  psargs-0562 [10] PsGetNextSimpleArg: Exit-
 psparse-0405 [10] PsNextParseState  : Entry 0xc403b128
 psparse-0501 [10] PsNextParseState  : Exit- AE_OK
 psparse-0231 [10] PsCompleteThisOp  : Entry 0xc403b128
 psparse-0378 [10] PsCompleteThisOp  : Exit-
 psscope-0301 [10] PsPopScope: Entry
  utmisc-1093 [11] UtPopGenericState : Entry
  utmisc-1106 [11] UtPopGenericState : Exit- 0xc40248a8
  utmisc-1333 [11] UtDeleteGenericState  : Entry
   SYNCH-0156 [12] AcpiOsWaitSemaphore   : Entry
   SYNCH-0162 [12] AcpiOsWaitSemaphore   : Exit- AE_OK
   SYNCH-0314 [12] AcpiOsSignalSemaphore : Entry
   SYNCH-0336 [12] AcpiOsSignalSemaphore : Exit- AE_OK
  utmisc-1337 [11] UtDeleteGenericState  : Exit-
 psscope-0334 [10] PsPopScope: Exit-
 nsutils-0982 [10] NsOpensScope  : Entry Integer
 nsutils-0993 [10] NsOpensScope  : Exit-0   0
 psparse-0405 [10] PsNextParseState  : Entry 0xc402e9a8
 psparse-0501 [10] PsNextParseState  : Exit- AE_OK
 psparse-0231 [10] PsCompleteThisOp  : Entry 0xc402e9a8
  pswalk-0340 [11] PsDeleteParseTree : Entry 0xc402e9a8
  utmisc-1165 [12] UtCreateThreadState   : Entry
   SYNCH-0156 [13] AcpiOsWaitSemaphore   : Entry
   SYNCH-0162 [13] AcpiOsWaitSemaphore   : Exit- AE_OK
   SYNCH-0314 [13] AcpiOsSignalSemaphore : Entry
   SYNCH-0336 [13] AcpiOsSignalSemaphore : Exit- AE_OK
  utmisc-1181 [12] UtCreateThreadState   : Exit- 0xc40248a8
dswstate-0950 [12] DsCreateWalkState : Entry
   SYNCH-0156 [13] AcpiOsWaitSemaphore   : Entry
   SYNCH-0162 [13] AcpiOsWaitSemaphore   : Exit- AE_OK
   SYNCH-0314 [13] AcpiOsSignalSemaphore : Entry
   SYNCH-0336 [13] AcpiOsSignalSemaphore : Exit- AE_OK
dsmthdat-0158 [13] DsMethodDataInit  : Entry
dsmthdat-0186 [13] DsMethodDataInit  : Exit-
   SYNCH-0156 [13] AcpiOsWaitSemaphore   : Entry
   SYNCH-0162 [13] AcpiOsWaitSemaphore   : Exit- AE_OK
   SYNCH-0314 [13] AcpiOsSignalSemaphore : Entry
   SYNCH-0336 [13] AcpiOsSignalSemaphore : Exit- AE_OK
  utmisc-1062 [13] UtPushGenericState: Entry
  utmisc-1070 [13] UtPushGenericState: Exit-
dswstate-0872 [13] DsPushWalkState   : Entry
dswstate-0878 [13] DsPushWalkState   : 

Re: ThinkPad R40 hangs during ACPI power down

2003-09-26 Thread Don Lewis
On 26 Sep, To: [EMAIL PROTECTED] wrote:
> On 25 Sep, Nate Lawson wrote:

>> To debug this, please boot a newer kernel with the ACPI_DEBUG option with
>> the following options in loader.conf:
>> 
>> debug.acpi.layer="ACPI_ALL_COMPONENTS ACPI_ALL_DRIVERS"
>> debug.acpi.level="ACPI_LV_FUNCTIONS"
>> 
>> You'll get spammed with way too many messages on boot but just ignore
>> these.
> 
> You're not kidding.  I gave up after 2 hours and 7.5 MB of output.  It
> looks like it's looping, see below ...

I let it run overnight and it spewed 25 MB of output before I killed it.
It didn't get very far into the boot, just after the CPU is probed.  I
uploaded the full trace to
.

The system boots ok if I remove these two debug options.


Re: Improvements to fsck performance in -current ...?

2003-10-02 Thread Don Lewis
On  2 Oct, Terry Lambert wrote:
> Jens Rehsack wrote:
>> Kevin Oberman wrote:
>> > Current has two major changes re speeding up fsck.
>> >
>> > The most significant is the background operation of fsck on file
>> > systems with soft updates enabled. Because of the way softupdates
>> > works, you are assured of metadata consistency on reboot, so the file
>> > systems can be mounted and used immediately with fsck started up in
>> > the background about a minute after the system comes up.
>> 
>> Be careful what you promise :-)
>> Most new disks have their own disk cache and some of them have the
>> write cache enabled. In case of a hardware failure (or power
>> failure) this data may get lost and the disk's metadata isn't
>> consistent. Consistency is only assured when no write cache below
>> the system is active.
> 
> Actually, write caching is not so much the problem, as the disk
> reporting that the write has completed before the contents of
> the transaction saved in the write cache have actually been
> committed to stable storage.
> 
> Unfortunately, IDE disks do not permit disconnected writes, due
> to a bug in the original IDE implementation, which has been
> carried forward for [insert no good reason here].
> 
> Therefore IDE disks almost universally lie to the driver any
> time write caching is enabled on an IDE drive.
> 
> In most cases, if you use SCSI, the problem will go away.

Nope, they "lie" as well unless you turn off the WCE bit.  Fortunately
with tagged command queuing there is very little performance penalty for
doing this in most cases.  The main exception to this is when you run
newfs which talks to the raw partition and only has one command
outstanding at a time.

Back in the days when our SCSI implementation would spam the console
whenever it reduced the number of tagged openings because the drive
indicated that its queue was full, I'd see the number of tagged openings
stay at 63 if write caching was disabled, but the number would drop
significantly under load (50%?) if write caching was enabled.  I always
suspected that the drive's cache was full of data for write commands
that it had indicated to the host as being complete even though the data
hadn't been written to stable storage.

Unfortunately SCSI drives all seem to ship with the WCE bit set,
probably for "benchmarking" reasons, so I always have to remember to
turn this bit off whenever I install a new drive.


cardbus code still broken in current

2003-10-06 Thread Don Lewis
cc -c -O -pipe -mcpu=pentiumpro -Wall -Wredundant-decls -Wnested-externs -Wstric
t-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -fforma
t-extensions -std=c99 -g -nostdinc -I-  -I. -I/usr/src/sys -I/usr/src/sys/contri
b/dev/acpica -I/usr/src/sys/contrib/ipfilter -I/usr/src/sys/contrib/dev/ath -I/u
sr/src/sys/contrib/dev/ath/freebsd -D_KERNEL -include opt_global.h -fno-common -
finline-limit=15000 -fno-strict-aliasing  -mno-align-long-strings -mpreferred-st
ack-boundary=2 -ffreestanding -Werror  /usr/src/sys/dev/cardbus/cardbus_cis.c
/usr/src/sys/dev/cardbus/cardbus_cis.c:80: warning: `decode_tuple_copy' declared `static' but never defined
*** Error code 1



RE: Why is em nic generating interrupts?

2003-10-09 Thread Don Bowman
From: Terry Lambert [mailto:[EMAIL PROTECTED]
> "Michael O. Boev" wrote:
> > I've got a [uniprocessor 5.1-RELEASE] router machine with
> > fxp and em nics.
> > I've built my kernel with the following included:
> > 
> > options DEVICE_POLLING
> > options HZ=2500
> > 
> > and enabled polling in /etc/sysctl.conf.
> [ ... ]
> > What's happening? Is polling working in my case?
> > If yes, why is vmstat showing interrupts? I see clearly,
> > that fxp's counter doesn't increase, and em's is constantly growing.
> > 
> > Is there anyone who knows for sure that em's polling works?
> 
> You may want to ask Luigi; polling is his code.
> 
> However, I believe the issue is that polling doesn't start
> until you take an interrupt, and it stops as soon as there is
> no more data to process, and waits for the next interrupt.
> 
> If you were to jack your load way up, you would probably see
> an increase in interrupts, then them dropping off dramatically.
> 
> If all else fails, read the source code... 8-).

FWIW, this works for me with 4.7. As Terry says, you
do see a couple of initial interrupts, as below, and
then they stop.

$ vmstat -i
interrupt        total   rate
em0 irq16            4      0
em1 irq17            4      0
ahc0 irq18        5653      0
ahc1 irq19          15      0
em2 irq20            4      0
mux irq21            3      0
sio0 irq4         1583      0
sio1 irq3            1      0
clk irq0      14545633   2501
rtc irq8        744224    128
Total         15297124   2631


Re: Seeing system-lockups on recent current

2003-10-10 Thread Don Lewis
On 10 Oct, Dag-Erling Smørgrav wrote:
> Doug White <[EMAIL PROTECTED]> writes:
>> On Fri, 10 Oct 2003, Garance A Drosihn wrote:
>> > For the past week or so, I have been having a frustrating time
>> > with my freebsd-current/i386 system.  It is a dual Athlon
>> > system.  [...]
>> It would be useful to isolate exactly what day the problem started
>> occurring.
> 
> I experienced similar problems on a dual Athlon system (MSI K7D
> Master-L motherboard, AMD 760MPX chipset, dual Athlon MP 2200+) which
> is barely a couple of months old.  I ended up reverting to RELENG_5_1.
> With -CURRENT, both UP and SMP kernels will crash with symptoms which
> suggest hardware trouble.  With RELENG_5_1, UP is rock solid (knock on
> wood) while SMP crashes within minutes of booting.  I've run out of
> patience with this system, so I'll keep running RELENG_5_1 on it until
> someone manages to convince me that -CURRENT will run properly on AMD
> hardware (maybe around 5.3 or so...)

My Athlon XP 1900+/AMD 761 UP box is happily running a late October 6th
version of -current.


Re: panic: pmap_zero_page: CMAP3 busy

2003-10-11 Thread Don Lewis
On 11 Oct, Steve Kargl wrote:
> Upgrade tonight (7pm PST) and received the following
> on rebooting
> 
> panic: pmap_zero_page: CMAP3 busy
> 
> Unfortunately, this system does not have a serial
> console and the panic locked it up tight.  Only
> a hard reset brought the system back.

I was just about to type "make installworld" when I got this message.

I checked the commit logs and didn't see any recent commits that looked
suspicious, and since I do have a serial console I decided to throw
caution to the wind and give the new kernel a try.

Other than an annoyingly long pause while GEOM waits for my SCSI cdrom
drive to figure out that it is empty (which has been noted in another
thread), my system booted without any problems.  My kernel has
everything committed to the present time except:

tjr 2003/10/11 21:25:26 PDT

  FreeBSD src repository

  Modified files:
sys/i386/ibcs2   ibcs2_misc.c ibcs2_signal.c
 ibcs2_socksys.c ibcs2_util.c ibcs2_util.h
 imgact_coff.c
  Log:
  Fix a multitude of security bugs in the iBCS2 emulator:
  - Return NULL instead of returning memory outside of the stackgap
in stackgap_alloc() (FreeBSD-SA-00:42.linux)
  - Check for stackgap_alloc() returning NULL in ibcs2_emul_find();
other calls to stackgap_alloc() have not been changed since they
are small fixed-size allocations.
  - Replace use of strcpy() with strlcpy() in exec_coff_imgact()
to avoid buffer overflow
  - Use strlcat() instead of strcat() to avoid a one byte buffer
overflow in ibcs2_setipdomainname()
  - Use copyinstr() instead of copyin() in ibcs2_setipdomainname()
to ensure that the string is null-terminated
  - Avoid integer overflow in ibcs2_setgroups() and ibcs2_setgroups()
by checking that gidsetsize argument is non-negative and
no larger than NGROUPS_MAX.
  - Range-check signal numbers in ibcs2_wait(), ibcs2_sigaction(),
ibcs2_sigsys() and ibcs2_kill() to avoid accessing array past
the end (or before the start)

  Revision  ChangesPath
  1.52  +21 -3 src/sys/i386/ibcs2/ibcs2_misc.c
  1.32  +7 -2  src/sys/i386/ibcs2/ibcs2_signal.c
  1.19  +5 -3  src/sys/i386/ibcs2/ibcs2_socksys.c
  1.17  +4 -2  src/sys/i386/ibcs2/ibcs2_util.c
  1.17  +4 -1  src/sys/i386/ibcs2/ibcs2_util.h
  1.61  +1 -1  src/sys/i386/ibcs2/imgact_coff.c


Maybe this problem only affects certain hardware.  Here is my dmesg.boot
for comparison:

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #28: Sat Oct 11 21:58:42 PDT 2003
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERICSMB
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a8f000.
Preloaded elf module "/boot/kernel/aout.ko" at 0xc0a8f244.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc0a8f2f0.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) XP 1900+ (1608.23-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x662  Stepping = 2
  
Features=0x383fbff
  AMD Features=0xc048
real memory  = 1073676288 (1023 MB)
avail memory = 1033592832 (985 MB)
Pentium Pro MTRR support enabled
npx0: [FAST]
npx0:  on motherboard
npx0: INT 16 interface
acpi0:  on motherboard
pcibios: BIOS version 2.10
Using $PIR table, 11 entries at 0xc00fdc30
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
acpi_cpu0:  on acpi0
acpi_button0:  on acpi0
acpi_button1:  on acpi0
pcib0:  port 
0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib0: slot 7 INTD is routed to irq 10
pcib0: slot 7 INTD is routed to irq 10
pcib0: slot 10 INTA is routed to irq 11
pcib0: slot 12 INTA is routed to irq 15
agp0:  port 0xc000-0xc003 mem 
0xef02-0xef020fff,0xe800-0xebff at device 0.0 on pci0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
pci_cfgintr: 1:5 INTA BIOS irq 15
pci1:  at device 5.0 (no driver attached)
isab0:  at device 7.0 on pci0
isa0:  on isab0
atapci0:  port 0xc400-0xc40f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
uhci0:  port 0xc800-0xc81f irq 10 at device 7.2 on pci0
usb0:  on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub0: port error, restarting port 1
uhub0: port error, giving up port 1
uhub0: port error, restarting port 2
uhub0: port error, giving up port 2
uhci1:  port 0xcc00-0xcc1f irq 10 at device 7.3 on pci0
usb1:  on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub1: port error, restarting port 1
uhub1: port error, giving up port 1
uhub1: port error, restarting port 2
uh

Re: Unable to boot cvsup 20031011

2003-10-14 Thread Don Lewis
On 12 Oct, Anish Mistry wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> I finally re-cvsupped today as some problems with my ata stuff were 
> fixed.  Went through the normal buildworld/kernel process and on 
> reboot with the new kernel, it loads the kernel and modules and 
> then as it starts booting it just causes my machine to restart.  It 
> doesn't have a serial port so I can't get any debug info that way.  I 
> can still boot in with an old kernel, so I can get debug info that 
> way if needed.  Old dmesg and pciconf attached.

What version of sys/i386/i386/pmap.c do you have?  If you are getting
the "pmap_zero_page: CMAP3 busy" panic, it should be fixed by version
1.446, which phk checked in 2003/10/12 10:55:45.


Re: samba 3 on CURRENT and net.inet.tcp.blackhole

2003-10-19 Thread Don Lewis
On 14 Oct, Michal wrote:
> Hello,
> I have a problem with samba 3.0.
> I had to reinstall FreeBSD-CURRENT after known problems with ATAng and
> atapicam (beginning of September(?)); since then I can't set
> net.inet.tcp.blackhole=2 in /etc/sysctl.conf. If I add the option to
> sysctl.conf then samba will hang until I press ^C. If I boot without
> this option then samba starts fine. However, running
> sysctl net.inet.tcp.blackhole=2 now prevents smbclient from running.
> I will still be able to connect to the smb share from another computer.

How long did you wait before interrupting samba?  My suspicion is that
it is attempting to connect to a port that doesn't have a listener, and
it is relying on receiving the ICMP unreachable to cause a connect()
call to fail with ECONNREFUSED before it makes a connection attempt to
another port that succeeds.  If this is the situation, then the
connect() call should eventually fail with ETIMEDOUT if you wait long
enough.

You might try disabling net.inet.tcp.blackhole and enabling
net.inet.tcp.log_in_vain instead.  That will tell you if samba is
attempting to connect to a port without a listener when it starts up and
might give you a hint about possible configuration changes that you
could make so that you can re-enable net.inet.tcp.blackhole.
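
To see which of the two failure modes you are hitting, a small test
program along these lines might help.  It is only a sketch: try_connect
is a hypothetical helper, and the address and port you pass are up to
you.  It makes one blocking connect() and hands back the errno value.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: try a blocking TCP connect to ip:port.
 * Returns 0 on success or the errno value describing the failure.
 */
static int
try_connect(const char *ip, unsigned short port)
{
	struct sockaddr_in sin;
	int error, s;

	assert(ip != NULL);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
		return (EINVAL);	/* not a dotted-quad IPv4 address */

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s == -1)
		return (errno);
	error = 0;
	/* Blocks until the peer answers, refuses, or the attempt times out. */
	if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) == -1)
		error = errno;
	close(s);
	return (error);
}
```

Pointed at a port with no listener, I would expect this to return
ECONNREFUSED almost immediately with blackhole disabled, and to sit for
a long time before returning ETIMEDOUT with blackhole set to 2.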


Re: Random signals in {build,install}world recently?

2003-10-21 Thread Don Lewis
On 21 Oct, Peter Jeremy wrote:
> On Mon, Oct 20, 2003 at 11:45:21PM -0700, Terry Lambert wrote:
>>I've noticed a lot of bad problems with Hynix memory lately; your
>>mileage may vary.  At Whistle we had a problem with memory with Gold
>>contacts, and didn't have any problems with the ones with Tin.
> 
> A good rule of thumb is to make sure that the finish on the DIMM
> contacts is the same as the finish on the DIMM socket - both gold or
> both tin.  Note that whilst gold doesn't oxidise, it's fairly easy
> to make a gold coating so thin that it's gas permeable allowing the
> underlying metal to oxidise.

Mixing tin and gold can cause galvanic corrosion.  You can also have
connection problems due to the buildup of tin oxide on the gold surface.

I had memory corruption problems on my Athlon UP system that I tracked
down to a memory timing misconfiguration.  My memory was rated for a CAS
Latency of 2.5, but the motherboard BIOS in "auto" memory configuration
mode set the timing to 2.0.  When I manually configured the memory
timing, the errors went away.  The memory errors would show up as random
file corruption in "make buildworld", and I also saw errors on one of
the last tests that memtest86 performs.


select + signal + truss => LOR

2003-11-09 Thread Don Lewis
I don't believe I've seen any reports of this particular lock order
reversal.  I got it by pointing truss at syslogd.  My kernel and world
were built from a cvsup run slightly before Fri Nov  7 14:50:18 PST
2003.


 Sleeping on "stopevent" with the following non-sleepable locks held:
exclusive sleep mutex sigacts r = 0 (0xc6bc0aa8) locked @ /usr/src/sys/kern/kern_condvar.c:289
lock order reversal
 1st 0xc6bc0aa8 sigacts (sigacts) @ /usr/src/sys/kern/kern_condvar.c:289
 2nd 0xc6bbabc4 process lock (process lock) @ /usr/src/sys/kern/kern_synch.c:309
Stack backtrace:
backtrace(c08a4327,c6bbabc4,c08a0922,c08a0922,c08a1964) at backtrace+0x17
witness_lock(c6bbabc4,8,c08a1964,135,c08a05a9) at witness_lock+0x672
_mtx_lock_flags(c6bbabc4,0,c08a1964,135,) at _mtx_lock_flags+0xba
msleep(c6bbac98,c6bbabc4,5c,c08a4b24,0) at msleep+0x794
stopevent(c6bbab58,2,e,822,c096d440) at stopevent+0x85
issignal(c640,2,c08a1463,bd,c6bbab58) at issignal+0x168
cursig(c640,0,c089e483,121,0) at cursig+0xf0
cv_wait_sig(c0991f34,c0991f00,c08a492e,348,4) at cv_wait_sig+0x448
kern_select(c640,7,8055060,0,0) at kern_select+0x526
select(c640,e5f26d10,c08bea62,3ee,5) at select+0x66
syscall(2f,2f,2f,8,1) at syscall+0x2c0
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (93), eip = 0x480d03ff, esp = 0xbfbff79c, ebp = 0xbfbffd98 ---
Sleeping on "stopevent" with the following non-sleepable locks held:
exclusive sleep mutex sigacts r = 0 (0xc6bc0aa8) locked @ /usr/src/sys/kern/subr_trap.c:260
Sleeping on "stopevent" with the following non-sleepable locks held:
exclusive sleep mutex sigacts r = 0 (0xc6bc0aa8) locked @ /usr/src/sys/kern/subr_trap.c:260





serial console oddity

2003-11-09 Thread Don Lewis
I've been seeing some weird things for many months when using a serial
console on my -CURRENT box.  I finally had a chance to take a closer
look today.

It looks like the problem is some sort of interference between kernel
output to the console and userland writes to /dev/console.  I typically
see syslogd output to the console get corrupted.  Each message that
syslogd writes seems to get truncated or otherwise corrupted.  The most
common thing I see is that each syslog message is reduced to a space and
the first character of the month, or sometimes just a space, or
sometimes nothing at all.  This is totally consistent until I "kill
-HUP" syslogd, which I believe causes syslogd to close and open
/dev/console, after which the syslog output appears correct on the
console. When the syslogd output is being corrupted, I can cat a file to
/dev/console and the output appears to be correct.

I truss'ed syslogd, and it appears to be working normally.  The writev()
call that writes the data to the console appears to be writing the
correct character count, so it would appear that the fault is in the
kernel.

The problem doesn't appear to be specific to syslogd, because I have
seen the output from the shutdown scripts that goes to the console get
truncated as well.

I have my serial console running at the default 9600 bps.

I dug around in the source in search of the problem, but I got lost in a
maze of twisty little passages.


Re: serial console oddity

2003-11-09 Thread Don Lewis
On  9 Nov, Bruce Evans wrote:
> On Sat, 8 Nov 2003, Don Lewis wrote:
> 
>> I've been seeing some weird things for many months when using a serial
>> console on my -CURRENT box.  I finally had a chance to take a closer
>> look today.
>>
>> It looks like the problem is some sort of interference between kernel
>> output to the console and userland writes to /dev/console.  I typically
>> see syslogd output to the console get corrupted.  Each message that
>> syslogd writes seems to get truncated or otherwise corrupted.  The most
>> common thing I see is that each syslog message is reduced to a space and
>> the first character of the month, or sometimes just a space, or
>> sometimes nothing at all.
> 
> This is (at least primarily) a longstanding bug in ttymsg().  It uses
> nonblocking mode so that it doesn't block in write() or close().  For
> the same reason, it doesn't wait for output to drain before close().
> If the close happens to be the last one on the device, this causes any
> data buffered in the tty and lower software layers to be discarded
> cleanly and any data in lower hardware layers to be discarded in a
> driver plus hardware-dependent way (usually not so cleanly, especially
> for the character being transmitted).

I didn't think of a flush on close problem because I thought syslogd
always kept the console open.

>> This is totally consistent until I "kill
>> -HUP" syslogd, which I believe causes syslogd to close and open
>> /dev/console, after which the syslog output appears correct on the
>> console. When the syslogd output is being corrupted, I can cat a file to
>> /dev/console and the output appears to be correct.
> 
> When I debugged this, syslogd didn't seem to keep the console open,
> so the open()/close() in ttymsg() always caused the problem.  I didn't
> notice killing syslogd makes a difference.  Perhaps it helps due to a
> missing close.  Holding the console open may be a workaround or even
> the correct fix.  It's not clear where this should be done (should all
> clients of ttymsg() do it?).  Running getty on the console or on the
> underlying tty device should do it accidentally.

It looks to me like syslogd keeps the console open in addition to the
open()/close() in ttymsg().  cfline() calls open() on anything that
begins with '/' and calls isatty() to figure out whether it should set
the type to F_CONSOLE, F_TTY, or F_FILE, and init() closes the file
descriptor for all of these when syslogd is HUPed.

I wonder if the console descriptor is getting revoked ...

>> I truss'ed syslogd, and it appears to be working normally: the writev()
>> call that writes the data to the console appears to be writing the
>> correct character count, so it would appear that the fault is in the
>> kernel.
> 
> If there are any kernel bugs in this area, then they would be that
> last close of the console affects the underlying tty.  The multiple
> console changes are quite likely to have broken this if getty is run
> on the underlying tty (they silently discarded the half-close of the
> underlying tty which was needed to avoid trashing some of its state
> when only the console is closed).

I'm not running getty on my serial console.  It is running on ttyv*. I'm
only using the serial console to capture kernel stack traces, etc.

>> The problem doesn't appear to be specific to syslogd, because I have
>> seen the output from the shutdown scripts that goes to the console get
>> truncated as well.
> 
> Yes, in theory it should affect anything that uses ttymsg() or does
> direct non-blocking writes without waiting for the output to drain.

> Here are some half-baked fixes.  The part that clears O_NONBLOCK is
> wrong, and the usleep() part is obviously a hack.  ttymsg() shouldn't
> block even in close(), since if the close is in the parent ttymsg()
> might block forever and if the close() is in a forked child then
> blocking could create zillions of blocked children.
> 
> Another part of the patch is concerned with limiting forked children.
> If I were happy with that part then blocking would not be so bad.  In
> practice, I don't have enough system activity for blocked children to
> be a problem.  To see the problem with blocked children, do something
> like the following:
> - turn off clocal on the console so that the console can block better.
>   For sio consoles this often requires turning it off in the lock-state
>   device, since the driver defends against this foot shooting by locking
>   it on.
> - hold the console open or otherwise avoid the original bug in this
>   thread, else messages will just be discarded in close() faster than
>   they 

Re: serial console oddity

2003-11-09 Thread Don Lewis
On  9 Nov, Bruce Evans wrote:

> For a non-half-baked fix, do something like:
> - never block in ttymsg(), but always wait for output to drain using
>   tcdrain() in a single child process.  It's probably acceptable for
>   this to not report errors to ttymsg()'s caller.
> - limit children better.  I think we now fork children iff writev()
>   returns EWOULDBLOCK and this happens mainly when the tty buffers
>   fill up due to clocal being off and the external console not
>   listening.  Handling this right seems to require handing off the
>   messages to a single child process that can buffer the messages
>   in userland and can block writing and draining them.  Blocked write()s
>   and tcdrain()s are easy enough to handle in a specialized process by
>   sending signals to abort them.

Another way of handling EWOULDBLOCK would be to add the descriptor to
syslogd's select loop instead of forking a child process.  There is
still the issue of how to handle blocking tcdrain()s or close()s;
perhaps a thread could do that.  Syslogd should not attempt to re-open
the device while a tcdrain() or close() is in progress.

BTW, it sounds like the pending output should not be discarded by the
close().  The termios(4) man page says:

   Closing a Terminal Device File
 The last process to close a terminal device file causes any output to be
 sent to the device and any input to be discarded.

If output is discarded in the O_NONBLOCK case, it seems to be
undocumented.  Should close() return ENOSPC in this case?
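
The drain-before-close idea discussed above can be sketched in userland
roughly as follows.  This is only an illustration, not the real ttymsg():
write_drain_close() is a hypothetical helper name, and unlike what Bruce
asks for it may block in tcdrain(), so a real fix would have to run it in
a child process or thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <termios.h>
#include <unistd.h>

/*
 * Write len bytes to fd, wait for any tty output to drain, then close fd.
 * Hypothetical helper, not the real ttymsg(); it may block in tcdrain().
 * Returns 0 on success, -1 on failure (errno is preserved).
 */
int
write_drain_close(int fd, const char *buf, size_t len)
{
	size_t off = 0;
	int saved;

	assert(buf != NULL);

	while (off < len) {
		ssize_t n = write(fd, buf + off, len - off);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry the write */
			goto fail;
		}
		off += (size_t)n;
	}

	/* Only terminals have an output queue that tcdrain() can wait on. */
	if (isatty(fd)) {
		while (tcdrain(fd) < 0) {
			if (errno != EINTR)
				goto fail;
		}
	}
	return (close(fd));

fail:
	saved = errno;
	(void)close(fd);
	errno = saved;
	return (-1);
}
```

On a descriptor in non-blocking mode the write() can still fail with
EWOULDBLOCK; the sketch just reports that to the caller instead of
queueing the message.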


Re: serial console oddity

2003-11-09 Thread Don Lewis
On  9 Nov, Don Lewis wrote:
> On  9 Nov, Bruce Evans wrote:
>> On Sat, 8 Nov 2003, Don Lewis wrote:

>>> This is totally consistent until I "kill
>>> -HUP" syslogd, which I believe causes syslogd to close and open
>>> /dev/console, after which the syslog output appears correct on the
>>> console. When the syslogd output is being corrupted, I can cat a file to
>>> /dev/console and the output appears to be correct.
>> 
>> When I debugged this, syslogd didn't seem to keep the console open,
>> so the open()/close() in ttymsg() always caused the problem.  I didn't
>> notice killing syslogd makes a difference.  Perhaps it helps due to a
>> missing close.  Holding the console open may be a workaround or even
>> the correct fix.  It's not clear where this should be done (should all
>> clients of ttymsg() do it?).  Running getty on the console or on the
>> underlying tty device should do it accidentally.
> 
> It looks to me like syslogd keeps the console open in addition to the
> open()/close() in ttymsg().  cfline() calls open() on anything that
> begins with '/' and calls isatty() to figure out whether it should set
> the type to F_CONSOLE, F_TTY, or F_FILE, and init() closes the file
> descriptor for all of these when syslogd is HUPed.
> 
> I wonder if the console descriptor is getting revoked ...

That appears to be the situation:

scratch:~ 101>cat /var/run/syslog.pid 
275
scratch:~ 102>fstat -p 275
USER     CMD          PID   FD MOUNT    INUM MODE         SZ|DV R/W
root     syslogd      275 root /           2 drwxr-xr-x    1024  r
root     syslogd      275   wd /           2 drwxr-xr-x    1024  r
root     syslogd      275 text /      575452 -r-xr-xr-x   32204  r
root     syslogd      275    0 /dev        8 crw-rw-rw-    null rw
root     syslogd      275    1 /dev        8 crw-rw-rw-    null rw
root     syslogd      275    2 /dev        8 crw-rw-rw-    null rw
root     syslogd      275    3* local dgram c6c97000
root     syslogd      275    4* internet6 dgram udp c6c84ee0
root     syslogd      275    5* internet dgram udp c6c85000
root     syslogd      275    6 /dev       17 crw---        klog  r
root     syslogd      275    8 -           - bad              -
root     syslogd      275    9 /      447635 -rw-r--r--   45602  w
root     syslogd      275   10 /      450144 -rw---           0  w
root     syslogd      275   11 /      448526 -rw---       85593  w
root     syslogd      275   12 /      447600 -rw-r-        3119  w
root     syslogd      275   13 /      450142 -rw-r--r--   19324  w
root     syslogd      275   14 /      447744 -rw-r--r--     274  w
root     syslogd      275   15 /      447492 -rw---       19063  w
root     syslogd      275   16 /      448732 -rw---       15508  w
root     syslogd      275   17 /      450145 -rw-r-           0  w
root     syslogd      275   18 /      450146 -rw-r-           0  w

If we could somehow keep the console open, that would probably be a
sufficient fix for the problem of discarded output.  We probably don't
care in the case of messages to users' terminals, since the users
presumably have those devices open.  There's no such guarantee in the
case of the console.


BTW, here's an example where I HUPed syslogd so that it works, but the
rc script output is truncated.  I think the partial message at the
beginning of the 'vnlru' line should be "Stopping cron.".

Nov  9 12:46:54 scratch shutdown: reboot by dl: 
Stopping inetd.
Shutting down daemon processes:killall: Nov  9 12:46:56 scratch upsmon[504]: upsmon 
parent: exiting (child exited)
warning: kill -TERM 504: No such process
Nov  9 12:46:56 scratch kernel: pid 502 (upsd), uid 66: exited on signal 6
.
Stopping cWaiting (max 60 seconds) for system process `vnlru' to stop...stopped
Waiting (max 60 seconds) for system process `bufdaemon' to stop...stopped
Waiting (max 60 seconds) for system process `syncer' to stop...stopped

syncing disks, buffers remaining... 12 12 
done
Uptime: 13h34m21s
Shutting down ACPI
Rebooting...




kernel trap 12 with interrupts disabled

2003-11-09 Thread Don Lewis
I just got one of these shortly after I rebooted my November 7th
-CURRENT box. DDB doesn't show much interesting.

 kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xbc04d753
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc0685ddf
stack pointer           = 0x10:0xe5f3bca8
frame pointer           = 0x10:0xe5f3bcfc
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 519 (setiathome)
kernel: type 12 trap, code=0
Stopped at      mi_switch+0xcf: cmpl    0x8(%esi),%ebx
db> tr
mi_switch(c6bbc500,df,c08a3da3,f8,0) at mi_switch+0xcf
ast(e5f3bd48) at ast+0x3f2
doreti_ast() at doreti_ast+0x17

Alas, I didn't have enough free space to capture a core file B-(


Re: kernel trap 12 with interrupts disabled

2003-11-09 Thread Don Lewis
On  9 Nov, I wrote:
> I just got one of these shortly after I rebooted my November 7th
> -CURRENT box. DDB doesn't show much interesting.
> 
>  kernel trap 12 with interrupts disabled
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0xbc04d753
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc0685ddf
> stack pointer           = 0x10:0xe5f3bca8
> frame pointer           = 0x10:0xe5f3bcfc
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = resume, IOPL = 0
> current process         = 519 (setiathome)
> kernel: type 12 trap, code=0
> Stopped at      mi_switch+0xcf: cmpl    0x8(%esi),%ebx
> db> tr
> mi_switch(c6bbc500,df,c08a3da3,f8,0) at mi_switch+0xcf
> ast(e5f3bd48) at ast+0x3f2
> doreti_ast() at doreti_ast+0x17
> 
> Alas, I didn't have enough free space to capture a core file B-(

This problem doesn't appear to be reproducible and doesn't seem to be
load dependent.  I had portupgrade cranking for many hours yesterday
without a hiccup.  I cleaned up a bunch of old distfiles and packages,
so I should have sufficient space to get a core dump in case this
problem happens again.

I forgot to include the vital statistics.  BTW, the RAM is ECC.

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.1-CURRENT #35: Fri Nov  7 14:50:18 PST 2003
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERICSMB
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a9e000.
Preloaded elf module "/boot/kernel/aout.ko" at 0xc0a9e244.
ACPI APIC Table: 
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) XP 1900+ (1608.23-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x662  Stepping = 2
  Features=0x383fbff
  AMD Features=0xc048
real memory  = 1073676288 (1023 MB)
avail memory = 1033592832 (985 MB)
ioapic0  irqs 0-23 on motherboard
Pentium Pro MTRR support enabled
acpi0:  on motherboard
pcibios: BIOS version 2.10
Using $PIR table, 11 entries at 0xc00fdc30
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
acpi_cpu0:  on acpi0
acpi_button0:  on acpi0
acpi_button1:  on acpi0
pcib0:  port 0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib0: slot 7 INTD is routed to irq 10
pcib0: slot 7 INTD is routed to irq 10
agp0:  port 0xc000-0xc003 mem 0xef02-0xef020fff,0xe800-0xebff at device 0.0 on pci0
pcib1:  at device 1.0 on pci0
pci1:  on pcib1
pci_cfgintr: 1:5 INTA BIOS irq 15
pci1:  at device 5.0 (no driver attached)
isab0:  at device 7.0 on pci0
isa0:  on isab0
atapci0:  port 0xc400-0xc40f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
uhci0:  port 0xc800-0xc81f irq 10 at device 7.2 on pci0
usb0:  on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub0: port error, restarting port 1
uhub0: port error, giving up port 1
uhub0: port error, restarting port 2
uhub0: port error, giving up port 2
uhci1:  port 0xcc00-0xcc1f irq 10 at device 7.3 on pci0
usb1:  on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub1: port error, restarting port 1
uhub1: port error, giving up port 1
uhub1: port error, restarting port 2
uhub1: port error, giving up port 2
viapropm0: SMBus I/O base at 0x5000
viapropm0:  port 0x5000-0x500f at device 7.4 on pci0
viapropm0: SMBus revision code 0x40
smbus0:  on viapropm0
smb0:  on smbus0
fxp0:  port 0xe000-0xe03f mem 0xef00-0xef01,0xef021000-0xef021fff irq 18 at device 10.0 on pci0
fxp0: Ethernet address 00:02:b3:5c:8c:e0
miibus0:  on fxp0
inphy0:  on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc0:  port 0xe400-0xe4ff mem 0xef022000-0xef022fff irq 16 at device 12.0 on pci0
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
fdc0:  port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A, console
sio1 port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0 port 0x778-0x77b,0x378-0x37f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
ppbus0:  on ppc0
plip0:  on ppbus0
lpt0:  on ppbus0
lpt0: Interrupt-driven port
ppi0:  on ppbus0
atkbdc0:  port 0x64,0x60 irq 1 on acpi0
atkbd0:  flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0:  irq 12 on atkbdc0
psm0: model IntelliMouse Explorer, device ID 4
npx0: [FAST]
npx0:  on moth

Re: named pipes memory leak?

2003-11-10 Thread Don Lewis
On 10 Nov, Lukas Ertl wrote:
> Hi,
> 
> is there a known problem with named pipes in -CURRENT?
> 
> The following shell script freezes a machine in several minutes and needs
> a power cycle.  You can see the increasing memory in vmstat -z (unpcb) and
> netstat -u.  The kernel is FreeBSD 5.1-CURRENT Tue Nov 4 14:08:23 CET 2003.
> 
> ---8<---
> #!/bin/sh
> 
> FIFO=/tmp/foo
> 
> for i in `jot 5 1`; do
>     mkfifo ${FIFO}
>     echo blubb > ${FIFO} &
>     kill $!
>     rm ${FIFO}
> done
> ---8<---

If fifo_open() is interrupted, fifo_close() never gets called, and the
resources are not recovered.  I wish doing the resource recovery in
fifo_inactive() would have worked ...

Try this patch:

Index: sys/fs/fifofs/fifo_vnops.c
===================================================================
RCS file: /home/ncvs/src/sys/fs/fifofs/fifo_vnops.c,v
retrieving revision 1.89
diff -u -r1.89 fifo_vnops.c
--- sys/fs/fifofs/fifo_vnops.c	16 Jun 2003 17:17:09 -0000	1.89
+++ sys/fs/fifofs/fifo_vnops.c	10 Nov 2003 19:11:00 -0000
@@ -154,6 +154,26 @@
 }
 
 /*
+ * Dispose of fifo resources.
+ * Should be called with vnode locked
+ */
+static void
+fifo_cleanup(struct vnode *vp)
+{
+	struct fifoinfo *fip = vp->v_fifoinfo;
+
+	VI_LOCK(vp);
+	if (vp->v_usecount == 1) {
+		vp->v_fifoinfo = NULL;
+		VI_UNLOCK(vp);
+		(void)soclose(fip->fi_readsock);
+		(void)soclose(fip->fi_writesock);
+		FREE(fip, M_VNODE);
+	} else
+		VI_UNLOCK(vp);
+}
+
+/*
  * Open called to set up a new instance of a fifo or
  * to find an active instance of a fifo.
  */
@@ -249,6 +269,7 @@
 				fip->fi_readers--;
 				if (fip->fi_readers == 0)
 					socantsendmore(fip->fi_writesock);
+				fifo_cleanup(vp);
 				return (error);
 			}
 			VI_LOCK(vp);
@@ -268,6 +289,7 @@
 				fip->fi_writers--;
 				if (fip->fi_writers == 0)
 					socantrcvmore(fip->fi_readsock);
+				fifo_cleanup(vp);
 				return (error);
 			}
 			/*
@@ -554,15 +576,7 @@
 		if (fip->fi_writers == 0)
 			socantrcvmore(fip->fi_readsock);
 	}
-	VI_LOCK(vp);
-	if (vp->v_usecount == 1) {
-		vp->v_fifoinfo = NULL;
-		VI_UNLOCK(vp);
-		(void)soclose(fip->fi_readsock);
-		(void)soclose(fip->fi_writesock);
-		FREE(fip, M_VNODE);
-	} else
-		VI_UNLOCK(vp);
+	fifo_cleanup(vp);
 	VOP_UNLOCK(vp, 0, td);
 	return (0);
 }



Re: named pipes memory leak?

2003-11-10 Thread Don Lewis
On 10 Nov, Lukas Ertl wrote:
> On Mon, 10 Nov 2003, Don Lewis wrote:
> 
>> On 10 Nov, Lukas Ertl wrote:
>> >
>> > The following shell script freezes a machine in several minutes and needs
>> > a power cycle.  You can see the increasing memory in vmstat -z (unpcb) and
>> > netstat -u.  The kernel is FreeBSD 5.1-CURRENT Tue Nov 4 14:08:23 CET 2003.
>> >
>> > ---8<---
>> > #!/bin/sh
>> >
>> > FIFO=/tmp/foo
>> >
>> > for i in `jot 5 1`; do
>> >     mkfifo ${FIFO}
>> >     echo blubb > ${FIFO} &
>> >     kill $!
>> >     rm ${FIFO}
>> > done
>> > ---8<---
>>
>> If fifo_open() is interrupted, fifo_close() never gets called, and the
>> resources are not recovered.  I wish doing the resource recovery in
>> fifo_inactive() would have worked ...
>>
>> Try this patch:
> 
> Thanks, your patch seems to solve this problem effectively.

The patch has been committed.  Thanks for testing it.

BTW, I encountered a process leak when running your script.  The kill
would sometimes fail to find the process, maybe about 10% of the time. I
think maybe $! hadn't yet been updated and the shell was trying to kill
the previous echo process a second time.  The mkfifo would also fail
sometimes because the file already existed.  I think the background
shell didn't get around to opening the fifo until after rm had nuked it,
causing a plain file to get created.  Hmm, these events seemed to be
associated, so maybe the shell was creating a file and the next echo
command would write to the file and exit before the kill command was
executed.  This doesn't explain all those copies of sh stuck in the
"fifow" state, though.  Adding a "sleep 1" before the kill command seems
to make things work better.
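
For comparison, here is a rough C version of one iteration of the test
loop that avoids the races guessed at above.  It is only a sketch:
fifo_kill_once() is a made-up name, and it sends SIGKILL rather than the
SIGTERM that a plain "kill" sends, so that the result doesn't depend on
any inherited signal disposition.  The parent gets the pid straight from
fork(), and it reaps the writer before removing the fifo.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * One iteration: create a fifo, start a writer that blocks in open(),
 * kill it, reap it, and only then remove the fifo.
 * Returns 0 if the writer was terminated by SIGKILL, -1 otherwise.
 */
int
fifo_kill_once(const char *path)
{
	pid_t pid;
	int status;

	assert(path != NULL);
	(void)unlink(path);		/* start from a clean slate */
	if (mkfifo(path, 0600) != 0)
		return (-1);

	pid = fork();
	if (pid < 0) {
		(void)unlink(path);
		return (-1);
	}
	if (pid == 0) {
		/* Child: open() blocks because the fifo has no reader. */
		int fd = open(path, O_WRONLY);
		if (fd >= 0 && write(fd, "blubb\n", 6) < 0)
			_exit(1);
		_exit(0);
	}

	/* The pid comes from fork(), so it can't be a stale "$!". */
	if (kill(pid, SIGKILL) != 0 || waitpid(pid, &status, 0) != pid) {
		(void)unlink(path);
		return (-1);
	}
	/* The writer is gone, so it can't touch the path after this. */
	(void)unlink(path);
	return ((WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 0 : -1);
}
```

Since open() is called without O_CREAT here, a late writer could not
create a plain file the way the shell redirection can.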


Re: named pipes memory leak?

2003-11-11 Thread Don Lewis
On 11 Nov, Lukas Ertl wrote:
> On Tue, 11 Nov 2003, Lukas Ertl wrote:
> 
>> Unfortunately, we are still seeing a problem here: we are running uvscan
>> (virus scanner), and while running it we are still seeing increasing unpcb
>> usage and orphaned unix domain sockets.
>>
>> We added some debug printfs to the fifo routines and found out that
>> fifo_cleanup is called from fifo_close with a vnode which has v_usecount
>> == 2 so the socket close calls aren't reached.
> 
> Sorry, I probably missed an important part: we're creating the FIFOs on
> nullfs mounts - the test script works great on plain UFS mounts, but the
> null layer seems to VREF the vnode once again, so v_usecount is 2, thus it
> is missing the check in fifo_cleanup().

Grrr ...  At least I didn't break this, our fifo implementation would
have always leaked when used this way.
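
A userland toy model of why the "v_usecount == 1" test leaks once a
stacking layer holds its own reference.  None of this is kernel code; the
toy_* names are made up for illustration, and the only thing taken from
the patch is the shape of the check.

```c
#include <assert.h>
#include <stdlib.h>

/* Userland toy model; these types only imitate the kernel structures. */
struct toy_fifoinfo {
	int readsock;
	int writesock;
};

struct toy_vnode {
	int usecount;			/* models vp->v_usecount */
	struct toy_fifoinfo *fifoinfo;	/* models vp->v_fifoinfo */
};

/* Same test as the patch: release resources only for the last user. */
static int
toy_fifo_cleanup(struct toy_vnode *vp)
{
	assert(vp != NULL);
	if (vp->usecount == 1 && vp->fifoinfo != NULL) {
		free(vp->fifoinfo);
		vp->fifoinfo = NULL;
		return (1);	/* resources released */
	}
	return (0);		/* skipped: another reference is held */
}

/* Returns 1 if cleanup at close time released the fifo state. */
int
toy_close_releases(int extra_refs)
{
	struct toy_vnode vn;
	int released;

	vn.usecount = 1 + extra_refs;
	vn.fifoinfo = calloc(1, sizeof(*vn.fifoinfo));
	assert(vn.fifoinfo != NULL);
	released = toy_fifo_cleanup(&vn);
	free(vn.fifoinfo);	/* don't leak in the demo itself */
	return (released);
}
```

toy_close_releases(0) models the plain UFS case and returns 1, while
toy_close_releases(1) models the extra reference from the null layer and
returns 0, which is the leak.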

Doing the cleanup in fifo_inactive() would have worked better in this
case.  I think I figured out a way to make that work properly, but I
really need to test it.

Is there any particular reason that you are nuking and re-creating the
fifo?  If you don't delete the fifo, the same sockets will get used each
time.

As a workaround could you create a little mdfs to hold the fifo?


Re: named pipes memory leak?

2003-11-11 Thread Don Lewis
On 11 Nov, Don Lewis wrote:
> On 11 Nov, Lukas Ertl wrote:

>> Sorry, I probably missed an important part: we're creating the FIFOs on
>> nullfs mounts - the test script works great on plain UFS mounts, but the
>> null layer seems to VREF the vnode once again, so v_usecount is 2, thus it
>> is missing the check in fifo_cleanup().
> 
> Grrr ...  At least I didn't break this, our fifo implementation would
> have always leaked when used this way.
> 
> Doing the cleanup in fifo_inactive() would have worked better in this
> case.  I think I figured out a way to make that work properly, but I
> really need to test it.
> 
> Is there any particular reason that you are nuking and re-creating the
> fifo?  If you don't delete the fifo, the same sockets will get used each
> time.

Now that I've had some time to think about it, if you reuse the same
fifo, you'll run into the same problem that caused me to abandon my
previous fifo_inactive() version of the cleanup code, which is stale
data being left in the fifo after both ends have been closed.  You may
be stuck with plan B below ...

> As a workaround could you create a little mdfs to hold the fifo?



Re: Still getting NFS client locking up

2003-11-11 Thread Don Lewis
On 31 Oct, Kelley Reynolds wrote:
> --- Original Message ---
> From: Matt Smith <[EMAIL PROTECTED]>
> Sent: Fri, 31 Oct 2003 08:55:49 +
> To: Robert Watson <[EMAIL PROTECTED]>
> Subject: Re: Still getting NFS client locking up
> 
>> Robert Watson wrote:
>> > On Tue, 28 Oct 2003, Soren Schmidt wrote:
>> > 
>> > 
>> >>>I'm now running a kernel/world of October 26th on both NFS client
>> >>>and server machines. I am still seeing NFS lockups as reported by
>> >>>several people in these threads:
>> >>
>> >>Me too!!
>> > 
>> > 
>> > Hmm.  I'm unable to reproduce this so far, and I'm pounding several
>> > 5.x NFS clients and servers.  I've been checking out using CVS over
>> > NFS, performing dd's of big files, etc.  There must be something
>> > more I'm missing in reproducing this.  What network interface cards
>> > are you using (client, server)? Are you using DHCP on the client or
>> > server?  What commands trigger it -- what part of the NFS
>> > namespace, etc?  Are you running the commands as root, or another
>> > user?
>> > 
>> > Robert N M Watson FreeBSD Core Team, TrustedBSD
>> > Projects [EMAIL PROTECTED]  Network Associates
>> > Laboratories
>> > 
> 
> I'm also experiencing lockups with NFS, but it's the server that locks
> up on mine. Both client and server are -CURRENT. Server was fresh as
> of two days ago, and the client is a week or two old. They are
> connected via bfe (server) and vr (client). The server, I've found,
> will last much longer if the mount options on the client include 'tcp'
> and 'nfsv3' (supposed to be default, but I'm just calling it like it
> is). Reading files seems to be okay, and I've managed to get as far as
> compiling a kernel on an NFS-mounted /usr, but a buildworld will hang
> in < 30 minutes. The server is running dhcp and pf. All commands are
> being run as root.

I'm not having any problems with my -CURRENT client.  My server is
running 4.9-STABLE, so I can't comment on the state of the NFS server
code in -CURRENT.  For what it's worth, my NFS usage is not very heavy,
and is mostly reading, with very little writing.


Re: checking stopevent 2!

2003-11-15 Thread Don Lewis
On 15 Nov, Robert Watson wrote:
> 
> On Sat, 15 Nov 2003, Andy Farkas wrote:
> 
>> These messages spew onto my console and into syslogd once every second:
> 
> Heh.  Sounds like your box is having a really bad day, we'll see if we
> can't get it fixed up over the next couple of weeks as things settle out
> :-).

Mine is worse.  If I try to boot it multiuser, it pegs its serial
console with these messages.  It seems to bring up the network to the
point where it can be pinged, but the ping latency averages about
200 ms, and even after a half an hour it still hadn't gotten sshd
started.

I can't drop back to the old kernel because I just did an installworld
with the new version of statfs.

I hope the fix isn't too extensive, since I'll probably be typing it in
by hand in single user mode ...


Re: named pipes memory leak?

2003-11-15 Thread Don Lewis
On 11 Nov, Lukas Ertl wrote:
> On Tue, 11 Nov 2003, Lukas Ertl wrote:
> 
>> Unfortunately, we are still seeing a problem here: we are running uvscan
>> (virus scanner), and while running it we are still seeing increasing unpcb
>> usage and orphaned unix domain sockets.
>>
>> We added some debug printfs to the fifo routines and found out that
>> fifo_cleanup is called from fifo_close with a vnode which has v_usecount
>> == 2 so the socket close calls aren't reached.
> 
> Sorry, I probably missed an important part: we're creating the FIFOs on
> nullfs mounts - the test script works great on plain UFS mounts, but the
> null layer seems to VREF the vnode once again, so v_usecount is 2, thus it
> is missing the check in fifo_cleanup().

I just committed a fix for this.  Fifo_vnops.c 1.91 should not leak
memory and sockets when used with nullfs.


vnode lock violation in today's -CURRENT

2003-11-18 Thread Don Lewis
I just ran into this while running portupgrade.

VOP_GETATTR: 0xc741e000 is not locked but should be
Debugger("Lock violation.
")
Stopped at  Debugger+0x55:  xchgl   %ebx,in_Debugger.0
db> tr
Debugger(c08bf9aa,c9749c78,c741e000,c08bf9eb,e820c984) at Debugger+0x55
vfs_badlock(c08bf9eb,c9749c78,c741e000,c09590e0,c741e000) at vfs_badlock+0x45
assert_vop_locked(c741e000,c9749c78,e820c9dc,0,e820c9c0) at assert_vop_locked+0x62
getdents_common(c6ede500,e820cd10,1,e820cd40,c0848b70) at getdents_common+0xfa
linux_getdents64(c6ede500,e820cd10,c08d6778,3ee,3) at linux_getdents64+0x20
syscall(2f,2f,2f,5,0) at syscall+0x2c0
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (220, Linux ELF, linux_getdents64), eip = 0x805a028, esp = 0xbfbfdc5c, ebp = 0xbfbfdcb8 ---


It looks to me like the call to vn_lock() in getdents_common() needs to
be moved to before the call to VOP_GETATTR().  The malloc() call should
probably be moved as well, which means that the intervening error
handling needs to be tweaked.
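
The reordering I have in mind looks roughly like the userland toy below.
It is not the real getdents_common(); every toy_* name is made up, and
allocating the buffer before taking the lock is my reading of how the
malloc() should move.  The point is that the attribute fetch happens
under the lock and every error path goes through one exit that unlocks
and frees.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Toy stand-in for a directory vnode; all names are hypothetical. */
struct toy_dirnode {
	int locked;	/* 1 while the "vnode lock" is held */
	int is_dir;	/* 1 if this node is a directory */
};

static void
toy_lock(struct toy_dirnode *vp)
{
	assert(!vp->locked);
	vp->locked = 1;
}

static void
toy_unlock(struct toy_dirnode *vp)
{
	assert(vp->locked);
	vp->locked = 0;
}

/* Mirrors the kernel assertion: attributes are read under the lock. */
static int
toy_getattr(struct toy_dirnode *vp, int *is_dir)
{
	assert(vp->locked);
	*is_dir = vp->is_dir;
	return (0);
}

int
toy_getdents(struct toy_dirnode *vp, size_t buflen)
{
	char *buf;
	int error, is_dir;

	buf = malloc(buflen);		/* allocate before taking the lock */
	if (buf == NULL)
		return (ENOMEM);

	toy_lock(vp);			/* lock first ... */
	error = toy_getattr(vp, &is_dir);	/* ... then fetch attributes */
	if (error != 0)
		goto out;
	if (!is_dir) {
		error = EINVAL;
		goto out;
	}
	buf[0] = '\0';			/* placeholder for reading entries */
out:
	toy_unlock(vp);			/* single exit: unlock and free */
	free(buf);
	return (error);
}
```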


Re: Updated acpi_cpu patch

2003-11-18 Thread Don Lewis
On 18 Nov, Lukas Ertl wrote:
> On Tue, 18 Nov 2003, Nate Lawson wrote:

>> This excerpt from truckman@'s asl shows that 4 Cx states are only
>> available when the AC adapter is not attached.  (The C*NA memory addresses
>> appear to be managed by the BIOS and not the AML but the PSR access is
>> clear).
> 
> This part of the ASL looks like the one here - let me guess, is it a ThinkPad?
> :-)

Yup, a Thinkpad R40, which refuses to actually power down in ACPI mode
when I run "shutdown -p".


Re: Unfortunate dynamic linking for everything

2003-11-18 Thread Don Lewis
On 18 Nov, Garance A Drosihn wrote:
> At 8:07 AM -0500 11/18/03, [EMAIL PROTECTED] wrote:
> 
>>  If there hadn't been a noticed increase in cost by using
>>  all-shared-libs, then the measurements were done
>>  incorrectly.  If the decision is made based upon allowing
>>  for 1.5X (at least) times increase in fork/exec times, and
>>  larger memory usage (due to sparse allocations), ...
> 
> I do remember some comments about benchmarks, and it was
> true that the all-dynamic bin/sbin does come out slower.  I
> don't remember if the benchmarks were ours or from NetBSD's
> investigation.  However, I think we measured increase in
> overall time for some set of commands, instead of "increase
> in the fork() routine".  Thus, the penalty observed was much
> less than 50%.  I think it was under 2%, but I don't remember
> the exact number.  When we're dealing with a 100% increase
> in the cost of compiling something with the newer gcc, the
> increase due to this change seemed pretty insignificant...

I thought there were some NetBSD benchmark numbers posted, but after
digging through my mail archives, I now think the results that I'm
remembering were posted by Gordon and were run with rcNG, which is
somewhat more shell intensive than our previous rc system:

On  2 Jun, Gordon Tetlow wrote:
> I'm planning on making a dynamically-linked root partition by 5.2. To
> that end, I'm planning on doing the following:
> 
> Integrate Tim Kientzle's /rescue patches into the tree
> Create /lib and populate with all the libs needed to support dynamically
>   linked binaries in /bin and /sbin
> Have a big (probably NO_DYNAMIC_ROOT) knob to switch from static to
>   dynamic.
> 
> There will be a performance hit associated with this. I did a quick
> measurement at boot and my boot time (from invocation of /etc/rc to
> the login prompt) went from 12 seconds with a static root to 15
> seconds with a dynamic root. I have yet to perform a worldstone on
> it.

I was thinking the difference was smaller than that.


Re: Unfortunate dynamic linking for everything

2003-11-18 Thread Don Lewis
On 18 Nov, Robert Watson wrote:

> (2) Shells again, because they will be fork()d and exec()d frequently
> during heavily scripted activities, such as system boot, periodic
> events, large make jobs, etc.  And presumably the only shell of
> interest is sh, although some of the supporting non-builtin binaries
> may also be of interest. 

You left out my favorite fork()/exec() intensive example, our ports
system.  During portupgrade, visible activity can grind to a stop for quite a
while at the "Registering installation" stage, while top's "last pid"
field increases rapidly and system CPU time is an embarrassingly large
number, and this is with a static /bin and /sbin.

Rather than trying to re-"optimize" this by converting /bin/sh back to
being static, I think a lot more could be gained by re-writing this part
of the ports infrastructure to be more efficient.  I'm not volunteering
...


dumb question 'Bad system call' after make world

2003-11-21 Thread Don Bowman

So i have a machine freshly installed from 5.1 mini iso.
I did a cvs co of latest current sources, and accidentally
did a 'make world' instead of 'make buildworld'.
Now i just get 'Bad system call' when i try to do anything. 
i need to get the correct kernel on there, does anyone have a 
suggestion for how to fix this?

--don


RE: dumb question 'Bad system call' after make world

2003-11-22 Thread Don Bowman
From: Barney Wolff [mailto:[EMAIL PROTECTED]
> Sent: Saturday, November 22, 2003 1:14 PM
> To: Bruce Evans
> Cc: [EMAIL PROTECTED]
> Subject: Re: dumb question 'Bad system call' after make world
> 
> 
> On Sat, Nov 22, 2003 at 11:42:04PM +1100, Bruce Evans wrote:
> > On Fri, 21 Nov 2003, Barney Wolff wrote:
> > 
> > > Will somebody please tell me when "make world" is ever 
> correct in the
> > > environment of the last several years?  I've been unable 
> to understand
> > > its continued existence as a target.
> > 
> > From my normal world-building script:
> > 
> > DESTDIR=/c/z/root \
> > MAKEOBJDIRPREFIX=/c/z/obj \
> > time -l make -s world > /tmp/world.out 2>&1
> 
> Oh, so it's only correct when you're not really installing world on
> the system you're building on?  Would replacing this with
> ( make buildworld && make installworld ) really be a hardship?
> Must we continue to invite innocents to clobber their systems?

For interest, in case this happens to someone else, I 'fixed'
it by booting from the mini ISO disk, inserting disk 2 (live),
going to a shell, and copying all of /bin, /usr/bin, /usr/lib,
/usr/libexec, and /lib over to the hard disk, rebooting, and then
doing the rest of the normal steps.

--don


RE: What's changed relating to localhost then?

2003-11-22 Thread Don Bowman
From: Christian Laursen [mailto:[EMAIL PROTECTED]
> Steve Ames <[EMAIL PROTECTED]> writes:
> 
> > If you telnet to 127.0.0.1 the system still believes you are
> > coming from your public IP. Bizarre that. Other IPs don't act
> > that way. My system has two public IPs and 127.0.0.1. If I
> > telnet to myself on either of the public IPs then I appear
> > from the correct IP. However 127.0.0.1 no longer seems to 
> > work that way and that does break a number of things that
> > expect to be connected to by 127.0.0.1
> 
> I can confirm this behaviour. It is possible to force the local
> address to 127.0.0.1 though.
> 
> [EMAIL PROTECTED] ~]$ telnet 127.0.0.1 25  
>   [19:39]
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> 220 borg.borderworlds.dk ESMTP Postfix
> 
> Nov 22 19:39:44 borg postfix/smtpd[2683]: connect from 
> borg.borderworlds.dk[10.1.0.2]
> 
> [EMAIL PROTECTED] ~]$ telnet -s 127.0.0.1 127.0.0.1 25 
>   [19:40]
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> 220 borg.borderworlds.dk ESMTP Postfix
> 
> Nov 22 19:40:06 borg postfix/smtpd[2683]: connect from 
> localhost[127.0.0.1]
> 
> Fortunately this behaviour didn't break anything here, but it 
> does seem
> broken nonetheless.

This seems to break amd:

amd[751]: Map support for: root, passwd, hesiod, union, nis, ndbm, file, error.
amd[751]: AMFS: nfs, link, nfsx, nfsl, host, linkx, program, union, inherit, ufs,
amd[751]:   cdfs, pcfs, auto, direct, toplvl, error.
amd[751]: FS: cd9660, nfs, nfs3, msdosfs, ufs, unionfs.
amd[751]: Network 1: wire="10.128.2.0" (netnumber=10.128.2).
amd[751]: Network 2: wire="192.168.3.0" (netnumber=192.168.3).
amd[751]: My ip addr is 127.0.0.1  
amd[752]: released controlling tty using setsid()
amd[752]: file server localhost, type local, state starts up
amd[753]: /phaedrus: disabling nfs congestion window   
amd[752]: ignoring request from 10.128.2.57:1018, expected 127.0.0.1
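
The "telnet -s 127.0.0.1" workaround quoted above amounts to binding the local end of the socket before connecting.  A minimal sketch of that in C follows, assuming a POSIX sockets API; connect_from_loopback() and observed_peer_address() are names invented for this example and are not taken from telnet or amd.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Connect to 127.0.0.1:port after explicitly binding the local end to
 * 127.0.0.1.  Returns the connected socket, or -1 on error.
 */
static int
connect_from_loopback(unsigned short port)
{
	struct sockaddr_in src, dst;
	int s = socket(AF_INET, SOCK_STREAM, 0);

	if (s < 0)
		return -1;
	memset(&src, 0, sizeof(src));
	src.sin_family = AF_INET;
	src.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	src.sin_port = 0;			/* any local port */
	dst = src;
	dst.sin_port = htons(port);

	if (bind(s, (struct sockaddr *)&src, sizeof(src)) < 0 ||
	    connect(s, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
		close(s);
		return -1;
	}
	return s;
}

/*
 * Self-check helper: listen on an ephemeral loopback port, connect to
 * it with connect_from_loopback(), and return the peer address the
 * listener sees (network byte order, 0 on failure).
 */
static in_addr_t
observed_peer_address(void)
{
	struct sockaddr_in addr, peer;
	socklen_t len = sizeof(addr);
	in_addr_t seen = 0;
	int a = -1, c = -1;
	int l = socket(AF_INET, SOCK_STREAM, 0);

	if (l < 0)
		return 0;
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(l, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
	    listen(l, 1) == 0 &&
	    getsockname(l, (struct sockaddr *)&addr, &len) == 0) {
		c = connect_from_loopback(ntohs(addr.sin_port));
		if (c >= 0) {
			len = sizeof(peer);
			a = accept(l, (struct sockaddr *)&peer, &len);
			if (a >= 0)
				seen = peer.sin_addr.s_addr;
		}
	}
	if (a >= 0)
		close(a);
	if (c >= 0)
		close(c);
	close(l);
	return seen;
}
```

With the explicit bind(), the listener should see 127.0.0.1 as the peer regardless of which source address the kernel would otherwise pick.  That only helps clients that can be told to bind; it does not change what a daemon like amd sees from unmodified callers.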



serial console, inappropriate ioctl?

2003-11-22 Thread Don Bowman

I have a current system configured to use a serial console. I'm 
getting a 'login_tty /dev/console: Inappropriate ioctl for device'
to the console every few seconds.

bash-2.05b# cat /boot.config
-Dh

In /etc/ttys, I have set console as:
console  "/usr/libexec/getty std.115200" vt100   on  secure
and also have:
ttyd0   "/usr/libexec/getty std.115200" vt100   on secure

In /boot/device.hints, I have:
hint.sio.0.flags="0x30"

bash-2.05b# uname -a
FreeBSD bsd7.phaedrus.sandvine.com 5.1-CURRENT FreeBSD 5.1-CURRENT #2: Sat
Nov 22 16:39:04 EST 2003
[EMAIL PROTECTED]:/d2/obj/d2/src/sys/GENERIC  i386

Nov 22 18:03:16 bsd7 getty[940]: login_tty /dev/console: Inappropriate ioctl
for device
Nov 22 18:03:16 bsd7 init: getty repeating too quickly on port /dev/console,
sleeping 30 secs

Any suggestions on what this might be coming from?

--don


null_lookup() vnode locking wierdness

2003-11-23 Thread Don Lewis
I was trying to figure out why the VOP_UNLOCK() call in null_lookup()
was violating a vnode locking assertion, so I tossed a bunch of
ASSERT_VOP_LOCKED() calls into null_lookup().  I found something I don't
understand ...

        ASSERT_VOP_LOCKED(dvp, "null_lookup 1");
        if ((flags & ISLASTCN) && (dvp->v_mount->mnt_flag & MNT_RDONLY) &&
            (cnp->cn_nameiop == DELETE || cnp->cn_nameiop == RENAME))
                return (EROFS);
        /*
         * Although it is possible to call null_bypass(), we'll do
         * a direct call to reduce overhead
         */
        ASSERT_VOP_LOCKED(dvp, "null_lookup 2");
        ldvp = NULLVPTOLOWERVP(dvp);
        ASSERT_VOP_LOCKED(dvp, "null_lookup 3");
        vp = lvp = NULL;
        error = VOP_LOOKUP(ldvp, &lvp, cnp);
        ASSERT_VOP_LOCKED(dvp, "null_lookup 4");
        if (error == EJUSTRETURN && (flags & ISLASTCN) &&
            (dvp->v_mount->mnt_flag & MNT_RDONLY) &&
            (cnp->cn_nameiop == CREATE || cnp->cn_nameiop == RENAME))
                error = EROFS;

        /*
         * Rely only on the PDIRUNLOCK flag which should be carefully
         * tracked by underlying filesystem.
         */
        if (cnp->cn_flags & PDIRUNLOCK)
                VOP_UNLOCK(dvp, LK_THISLAYER, td);

The null_lookup {1,2,3} assertions pass, but null_lookup 4 fails.  It
appears that the VOP_LOOKUP() call to the underlying file system is
unlocking the directory vnode in the nullfs file system.  How can that
happen?


Re: null_lookup() vnode locking wierdness

2003-11-23 Thread Don Lewis
On 23 Nov, I wrote:
> I was trying to figure out why the VOP_UNLOCK() call in null_lookup()
> was violating a vnode locking assertion, so I tossed a bunch of
> ASSERT_VOP_LOCKED() calls into null_lookup().  I found something I don't
> understand ...
> 
> ASSERT_VOP_LOCKED(dvp, "null_lookup 1");
> if ((flags & ISLASTCN) && (dvp->v_mount->mnt_flag & MNT_RDONLY) &&
> (cnp->cn_nameiop == DELETE || cnp->cn_nameiop == RENAME))
> return (EROFS);
> /*
>  * Although it is possible to call null_bypass(), we'll do
>  * a direct call to reduce overhead
>  */
> ASSERT_VOP_LOCKED(dvp, "null_lookup 2");
> ldvp = NULLVPTOLOWERVP(dvp);
> ASSERT_VOP_LOCKED(dvp, "null_lookup 3");
> vp = lvp = NULL;  
> error = VOP_LOOKUP(ldvp, &lvp, cnp);
> ASSERT_VOP_LOCKED(dvp, "null_lookup 4");
> if (error == EJUSTRETURN && (flags & ISLASTCN) &&
> (dvp->v_mount->mnt_flag & MNT_RDONLY) &&
> (cnp->cn_nameiop == CREATE || cnp->cn_nameiop == RENAME))  
> error = EROFS;
>
> /*
>  * Rely only on the PDIRUNLOCK flag which should be carefully
>  * tracked by underlying filesystem.
>  */
> if (cnp->cn_flags & PDIRUNLOCK)
> VOP_UNLOCK(dvp, LK_THISLAYER, td);
> 
> The null_lookup {1,2,3} assertions pass, but null_lookup 4 fails.  It
> appears that the VOP_LOOKUP() call to the underlying file system is
> unlocking the directory vnode in the nullfs file system.  How can that
> happen?

I think I just answered my own question.  It appears that both vnodes
can share the same lock according to the following code fragment in
null_nodeget():

/*
 * From NetBSD:
 * Now lock the new node. We rely on the fact that we were passed
 * a locked vnode. If the lower node is exporting a struct lock
 * (v_vnlock != NULL) then we just set the upper v_vnlock to the
 * lower one, and both are now locked. If the lower node is exporting
 * NULL, then we copy that up and manually lock the new vnode.
 */

vp->v_vnlock = lowervp->v_vnlock;
error = VOP_LOCK(vp, LK_EXCLUSIVE | LK_THISLAYER, td);

It looks like the easiest fix is to skip the VOP_UNLOCK() call in
null_lookup() if dvp->v_vnlock == ldvp->v_vnlock.
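
To make the aliasing concrete, here is a toy userland model in C.  It is only a sketch: the toy_* types and functions are invented for illustration and do not correspond to the real vnode or lockmgr structures.  It shows that when the upper node's lock pointer aliases the lower node's lock, unlocking the lower node also leaves the upper node unlocked, so a second unlock of the upper node has to be skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a lock and a vnode; not the kernel structures. */
struct toy_lock {
	int held;
};

struct toy_vnode {
	struct toy_lock own_lock;	/* storage for a private lock */
	struct toy_lock *vnlock;	/* lock actually used; may be shared */
};

static void
toy_vn_lock(struct toy_vnode *vp)
{
	vp->vnlock->held = 1;
}

static void
toy_vn_unlock(struct toy_vnode *vp)
{
	vp->vnlock->held = 0;
}

static bool
toy_is_locked(const struct toy_vnode *vp)
{
	return vp->vnlock->held != 0;
}

/* Stack an upper node on a lower one, sharing the lower node's lock. */
static void
toy_stack(struct toy_vnode *upper, struct toy_vnode *lower)
{
	upper->own_lock.held = 0;
	upper->vnlock = lower->vnlock;
}

/*
 * Model of the guarded unlock: the lower layer has already dropped the
 * lock, so only unlock the upper node if it has a separate lock.
 * Returns true if an unlock was actually performed.
 */
static bool
toy_unlock_upper_if_separate(struct toy_vnode *upper, struct toy_vnode *lower)
{
	if (upper->vnlock == lower->vnlock)
		return false;
	toy_vn_unlock(upper);
	return true;
}
```

In this model, locking through the upper node and unlocking through the lower node leaves both unlocked, which matches the failing "null_lookup 4" assertion.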

Index: sys/fs/nullfs/null_vnops.c
===
RCS file: /home/ncvs/src/sys/fs/nullfs/null_vnops.c,v
retrieving revision 1.63
diff -u -r1.63 null_vnops.c
--- sys/fs/nullfs/null_vnops.c  17 Jun 2003 08:52:45 -  1.63
+++ sys/fs/nullfs/null_vnops.c  24 Nov 2003 00:10:41 -
@@ -392,7 +392,7 @@
 * Rely only on the PDIRUNLOCK flag which should be carefully
 * tracked by underlying filesystem.
 */
-   if (cnp->cn_flags & PDIRUNLOCK)
+   if ((cnp->cn_flags & PDIRUNLOCK) && dvp->v_vnlock != ldvp->v_vnlock)
VOP_UNLOCK(dvp, LK_THISLAYER, td);
if ((error == 0 || error == EJUSTRETURN) && lvp != NULL) {
if (ldvp == lvp) {



Re: 40% slowdown with dynamic /bin/sh

2003-11-24 Thread Don Lewis
On 25 Nov, Daniel O'Connor wrote:
> On Tuesday 25 November 2003 11:52, Dan Nelson wrote:
>> > > I'd greatly prefer that the dynamic root default be backed out
>> > > until a substantial amount of this performance can be recovered.
>> >
>> > What _REAL WORLD_ task does this slow down?
>>
>> Try timing "cd /usr/ports/www/mozilla-devel ; make clean" with static
>> and dynamic /bin.  bsd.port.mk spawns many many many /bin/sh processes.
> 
> OK my bad, it will probably slow down the ports building.

The ports infrastructure is horribly slow even with a static sh, though
not as glacially slow as installing and patching Solaris 9.



Re: pcm(4) related panic

2003-11-25 Thread Don Lewis
On 25 Nov, Artur Poplawski wrote:
> Artur Poplawski <[EMAIL PROTECTED]> wrote:
> 
>> Hello,  
>> 
>> On a 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic 
>> like this:

>> Sleeping on "swread" with the following non-sleepable locks held:
>> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \   
>> /usr/src/sys/dev/sound/pcm/dsp.c:146

This enables the panic.

>> panic: sleeping thread (pid 583) owns a non-sleepable lock

Then the panic happens when another thread tries to grab the mutex.


The problem is that the pcm code attempts to hold a mutex across a call
to uiomove(), which can sleep if the userland buffer that it is trying
to access is paged out.  Either the buffer has to be pre-wired before
calling getchns(), or the mutex has to be dropped around the call to
uiomove().  The amount of memory to be wired should be limited to
'sz' as calculated by chn_read() and chn_write(), which complicates the
logic.  Dropping the mutex probably has other issues.




Re: pcm(4) related panic

2003-11-25 Thread Don Lewis
On 25 Nov, Don Lewis wrote:
> On 25 Nov, Artur Poplawski wrote:
>> Artur Poplawski <[EMAIL PROTECTED]> wrote:
>> 
>>> Hello,  
>>> 
>>> On a 5.1-RELEASE and 5.2-BETA machines I have been able to cause a panic 
>>> like this:
> 
>>> Sleeping on "swread" with the following non-sleepable locks held:
>>> exclusive sleep mutex pcm0:play:0 (pcm channel) r = 0 (0xc1c3d740) locked @ \   
>>> /usr/src/sys/dev/sound/pcm/dsp.c:146
> 
> This enables the panic.
> 
>>> panic: sleeping thread (pid 583) owns a non-sleepable lock
> 
> Then the panic happens when another thread tries to grab the mutex.
> 
> 
> The problem is that the pcm code attempts to hold a mutex across a call
> to uiomove(), which can sleep if the userland buffer that it is trying
> to access is paged out.  Either the buffer has to be pre-wired before
> calling getchns(), or the mutex has to be dropped around the call to
> uiomove().  The amount of memory to be wired should be limited to
> 'sz' as calculated by chn_read() and chn_write(), which complicates the
> logic.  Dropping the mutex probably has other issues.

Following up to myself ...

It might be safe to drop the mutex for the uiomove() call if the code
set flags to enforce a limit of one reader and one writer at a time to
keep the code from being re-entered.  The buffer pointer manipulations
in sndbuf_dispose() and sndbuf_acquire() would probably still have to be
protected by the mutex.  If this can be made to work, it would probably
be preferable to wiring the buffer.  It would have a lot less CPU
overhead, and would work better with large buffers, which could still be
allowed to page normally.
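
A rough userland sketch of that idea in C, using a pthread mutex in place of the channel mutex and memcpy() in place of uiomove().  Everything named toy_* is hypothetical, and returning EBUSY to a second reader is just one possible policy (sleeping until the first reader finishes is another).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <string.h>

/* Toy channel; the fields are assumptions, not the pcm structures. */
struct toy_chan {
	pthread_mutex_t lock;
	int reading;		/* nonzero while a reader is in the copy */
	char buf[64];
	size_t rp;		/* read offset into buf */
	size_t avail;		/* bytes ready, with rp + avail <= sizeof(buf) */
};

/*
 * Copy up to len bytes out of the channel.  The mutex protects the
 * bookkeeping but is dropped around the copy, which stands in for a
 * uiomove() that may sleep.  The 'reading' flag keeps a second reader
 * from re-entering while the mutex is dropped.
 */
static int
toy_chan_read(struct toy_chan *c, char *dst, size_t len, size_t *copied)
{
	size_t n, off;

	pthread_mutex_lock(&c->lock);
	if (c->reading) {
		pthread_mutex_unlock(&c->lock);
		return EBUSY;
	}
	c->reading = 1;
	n = len < c->avail ? len : c->avail;
	off = c->rp;
	pthread_mutex_unlock(&c->lock);

	memcpy(dst, c->buf + off, n);	/* may "sleep"; no mutex held */

	pthread_mutex_lock(&c->lock);
	c->rp += n;			/* pointer updates stay under the mutex */
	c->avail -= n;
	c->reading = 0;
	pthread_mutex_unlock(&c->lock);

	*copied = n;
	return 0;
}
```

The sketch ignores the writer side and buffer wrap-around; it is only meant to show the flag-plus-mutex shape of the approach.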


Re: panic on 5.2 BETA: blockable sleep lock

2003-11-26 Thread Don Lewis
On 26 Nov, Stefan Ehmann wrote:
> I got the following panic twice when starting xawtv using 5.2 BETA (CVS
> from Oct 23)
> panic: blockable sleep lock (sleep mutex) sellck
> @/usr/src/sys/kern/sys_generic.c:1145
> 
> I don't think it is directly related to bktr since the last commit there
> was ~3 months ago. Here is the dmesg output for completeness though.
> bktr0:  mem 0xdf103000-0xdf103fff irq 10 at device 12.0
> on pci0
> bktr0: Card has no configuration EEPROM. Cannot determine card make.
> bktr0: IMS TV Turbo, Philips FR1236 NTSC FM tuner.
> 
> 
> Here is a backtrace:
> 
> GNU gdb 5.2.1 (FreeBSD)
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you
> are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for
> details.
> This GDB was configured as "i386-undermydesk-freebsd"...
> panic: blockable sleep lock (sleep mutex) sellck @
> /usr/src/sys/kern/sys_generic.c:1145
> panic messages:
> --
> panic: blockable sleep lock (sleep mutex) sellck @
> /usr/src/sys/kern/sys_generic.c:1145
> 
> syncing disks, buffers remaining... 3023 3023 panic: mi_switch: switch
> in a critical section

> #5  0xc05153b7 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
> #6  0xc053c010 in witness_lock (lock=0xc076cf40, flags=8, 
> file=0xc06d2139 "/usr/src/sys/kern/sys_generic.c", line=1145)
> at /usr/src/sys/kern/subr_witness.c:609
> #7  0xc050b57a in _mtx_lock_flags (m=0xc3a56dc0, opts=0, 
> file=0xc06ff79c "\037#nÀ\t", line=-1065955520)
> at /usr/src/sys/kern/kern_mutex.c:221
> #8  0xc0540436 in selrecord (selector=0xc076cf40, sip=0xc3a56dc0)
> at /usr/src/sys/kern/sys_generic.c:1145
> #9  0xc08509f1 in bktr_poll () from /boot/kernel/bktr.ko


The bktr_poll() implementation invokes the macros DISABLE_INTR() and
ENABLE_INTR() as a wrapper around a block of code that calls
selrecord().

        DISABLE_INTR(s);

        if (events & (POLLIN | POLLRDNORM)) {

                switch ( FUNCTION( minor(dev) ) ) {
                case VBI_DEV:
                        if(bktr->vbisize == 0)
                                selrecord(td, &bktr->vbi_select);
                        else
                                revents |= events & (POLLIN | POLLRDNORM);
                        break;
                }
        }

        ENABLE_INTR(s);

The implementation of DISABLE_INTR() and ENABLE_INTR() is defined in
bktr_os.h as:

#if defined(__FreeBSD__)
#if (__FreeBSD_version >=50)
#define DECLARE_INTR_MASK(s)    /* no need to declare 's' */
#define DISABLE_INTR(s)         critical_enter()
#define ENABLE_INTR(s)          critical_exit()
#else


The man page for critical_enter() and friends says:

DESCRIPTION
 These functions are used to prevent preemption in a critical region of
 code.  All that is guaranteed is that the thread currently executing on a
 CPU will not be preempted.  Specifically, a thread in a critical region
 will not migrate to another CPU while it is in a critical region.  The
 current CPU may still trigger faults and exceptions during a critical
 section; however, these faults are usually fatal.

 The cpu_critical_enter() and cpu_critical_exit() functions provide the
 machine dependent disabling of preemption, normally by disabling
 interrupts on the local CPU.

 The critical_enter() and critical_exit() functions provide a machine
 independent wrapper around the machine dependent API.  This wrapper
 currently saves state regarding nested critical sections.  Nearly all code
 should use these versions of the API.

 Note that these functions are not required to provide any inter-CPU
 synchronization, data protection, or memory ordering guarantees and thus
 should not be used to protect shared data structures.

 These functions should be used with care as an infinite loop within a
 critical region will deadlock the CPU.  Also, they should not be
 interlocked with operations on mutexes, sx locks, semaphores, or other
 synchronization primitives.


The problem is that selrecord() wants to lock a MTX_DEF mutex, which can
cause a context switch if the mutex is already locked by another thread.
This is contrary to what bktr_poll() wants to accomplish by calling
critical_enter().

I'm guessing that bktr_poll() wants to prevent bktr->vbisize from being
updated by an interrupt while the above mentioned block of code is
executing, possibly to prevent a select wakeup from being missed.  If
this is correct, the proper fix would probably be for a mutex to be
added to the bktr structure, and the DISABLE_INTR() and ENABLE_INTR()
calls above to lock and unlock this mutex.  Other places in the code
that manipulate bktr->vbisize should grab the mutex before doing so, and
there are places in the code that are cu

Re: panic on 5.2 BETA: blockable sleep lock

2003-11-27 Thread Don Lewis
On 27 Nov, Stefan Ehmann wrote:
> On Wed, 2003-11-26 at 08:33, Don Lewis wrote:
>> The problem is that selrecord() wants to lock a MTX_DEF mutex, which can
>> cause a context switch if the mutex is already locked by another thread.
>> This is contrary to what bktr_poll() wants to accomplish by calling
>> critical_enter().
> 
> Strange enough that does not seem to happen with a kernel built without
> INVARIANTS and WITNESS. Does this make any sense or is this just by
> chance?

You might try the patch below with WITNESS enabled.  I don't have the
hardware, so I can't test it.  It compiles for me, but for all I know it
could delete all your files if you run it.

Index: sys/dev/bktr/bktr_core.c
===
RCS file: /home/ncvs/src/sys/dev/bktr/bktr_core.c,v
retrieving revision 1.131
diff -u -r1.131 bktr_core.c
--- sys/dev/bktr/bktr_core.c9 Nov 2003 09:17:21 -   1.131
+++ sys/dev/bktr/bktr_core.c27 Nov 2003 23:58:19 -
@@ -526,6 +526,9 @@
}
 #endif /* FreeBSD or BSDi */
 
+#ifdef USE_VBIMUTEX
+   mtx_init(&bktr->vbimutex, "bktr vbi lock", NULL, MTX_DEF);
+#endif
 
 /* If this is a module, save the current contiguous memory */
 #if defined(BKTR_FREEBSD_MODULE)
@@ -807,6 +810,7 @@
 * both Odd and Even VBI data is captured. Therefore we do this
 * in the Even field interrupt handler.
 */
+   LOCK_VBI(bktr);
if (  (bktr->vbiflags & VBI_CAPTURE)
&&(bktr->vbiflags & VBI_OPEN)
 &&(field==EVEN_F)) {
@@ -826,6 +830,7 @@
 
 
}
+   UNLOCK_VBI(bktr);
 
/*
 *  Register the completed field
@@ -1066,8 +1071,13 @@
 int
 vbi_open( bktr_ptr_t bktr )
 {
-   if (bktr->vbiflags & VBI_OPEN)  /* device is busy */
+
+   LOCK_VBI(bktr);
+
+   if (bktr->vbiflags & VBI_OPEN) {/* device is busy */
+   UNLOCK_VBI(bktr);
return( EBUSY );
+   }
 
bktr->vbiflags |= VBI_OPEN;
 
@@ -1081,6 +1091,8 @@
bzero((caddr_t) bktr->vbibuffer, VBI_BUFFER_SIZE);
bzero((caddr_t) bktr->vbidata,  VBI_DATA_SIZE);
 
+   UNLOCK_VBI(bktr);
+
return( 0 );
 }
 
@@ -1166,8 +1178,12 @@
 vbi_close( bktr_ptr_t bktr )
 {
 
+   LOCK_VBI(bktr);
+
bktr->vbiflags &= ~VBI_OPEN;
 
+   UNLOCK_VBI(bktr);
+
return( 0 );
 }
 
@@ -1232,19 +1248,32 @@
 int
 vbi_read(bktr_ptr_t bktr, struct uio *uio, int ioflag)
 {
-   int readsize, readsize2;
+   int readsize, readsize2, start;
int status;
 
+   /*
+* XXX - vbi_read() should be protected against being re-entered
+* while it is unlocked for the uiomove.
+*/
+   LOCK_VBI(bktr);
 
while(bktr->vbisize == 0) {
if (ioflag & IO_NDELAY) {
-   return EWOULDBLOCK;
+   status = EWOULDBLOCK;
+   goto out;
}
 
bktr->vbi_read_blocked = TRUE;
+#ifdef USE_VBIMUTEX
+   if ((status = msleep(VBI_SLEEP, &bktr->vbimutex, VBIPRI, "vbi",
+   0))) {
+   goto out;
+   }
+#else
if ((status = tsleep(VBI_SLEEP, VBIPRI, "vbi", 0))) {
-   return status;
+   goto out;
}
+#endif
}
 
/* Now we have some data to give to the user */
@@ -1262,19 +1291,28 @@
/* We need to wrap around */
 
readsize2 = VBI_BUFFER_SIZE - bktr->vbistart;
-   status = uiomove((caddr_t)bktr->vbibuffer + bktr->vbistart, 
readsize2, uio);
-   status += uiomove((caddr_t)bktr->vbibuffer, (readsize - readsize2), 
uio);
+   start =  bktr->vbistart;
+   UNLOCK_VBI(bktr);
+   status = uiomove((caddr_t)bktr->vbibuffer + start, readsize2, 
uio);
+   if (status == 0)
+   status = uiomove((caddr_t)bktr->vbibuffer, (readsize - 
readsize2), uio);
} else {
+   UNLOCK_VBI(bktr);
/* We do not need to wrap around */
status = uiomove((caddr_t)bktr->vbibuffer + bktr->vbistart, readsize, 
uio);
}
 
+   LOCK_VBI(bktr);
+
/* Update the number of bytes left to read */
bktr->vbisize -= readsize;
 
/* Update vbistart */
bktr->vbistart += readsize;
bktr->vbistart = bktr->vbistart % VBI_BUFFER_SIZE; /* wrap around if needed */
+
+out:
+   UNLOCK_VBI(bktr);
 
return( status );
 
Index: sys/dev/bktr/bktr_os.c
===
RCS file: /home/ncvs/src/sys/dev/bktr/bktr

[patch] mtx_init() API violations

2003-11-27 Thread Don Lewis
It's a good thing that the value of MTX_DEF is 0 ;-)

This isn't a critical fix, but it probably should be done soon after the
code freeze is lifted to prevent the spread of infection via cut and
paste programming.
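
For anyone wondering why the swapped arguments compiled and worked at all, here is a small stand-alone C illustration.  The toy_mtx names are invented for this sketch; only the argument order (name, type, opts) and MTX_DEF being 0 come from the patch and remark in this message.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MTX_DEF	0x00000000	/* mirrors MTX_DEF having the value 0 */

/* Toy mutex that just records what the initializer was given. */
struct toy_mtx {
	const char *name;
	const char *type;
	int opts;
};

/* Same parameter order as the corrected calls: name, type, opts. */
static void
toy_mtx_init(struct toy_mtx *m, const char *name, const char *type, int opts)
{
	m->name = name;
	m->type = type;
	m->opts = opts;
}

/*
 * A constant 0 is a valid null pointer constant in C, so passing
 * TOY_MTX_DEF in the 'type' slot and 0 in the 'opts' slot compiles
 * cleanly and ends up equivalent to (NULL, TOY_MTX_DEF).  A nonzero
 * flag in the 'type' slot would be an int-to-pointer conversion that
 * the compiler would be expected to complain about.
 */
static void
toy_init_swapped(struct toy_mtx *m)
{
	toy_mtx_init(m, "swapped", TOY_MTX_DEF, 0);
}

static void
toy_init_correct(struct toy_mtx *m)
{
	toy_mtx_init(m, "correct", NULL, TOY_MTX_DEF);
}
```

Both helpers leave the mutex with a NULL type and opts of 0, which is why the mistake is harmless today but would stop being so the moment someone copies the pattern with a different flag.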

Index: dev/ata/ata-all.c
===
RCS file: /home/ncvs/src/sys/dev/ata/ata-all.c,v
retrieving revision 1.197
diff -u -r1.197 ata-all.c
--- dev/ata/ata-all.c   11 Nov 2003 14:55:35 -  1.197
+++ dev/ata/ata-all.c   28 Nov 2003 01:34:26 -
@@ -120,7 +120,7 @@
 ch->dev = dev;
 ch->state = ATA_IDLE;
 bzero(&ch->queue_mtx, sizeof(struct mtx));
-mtx_init(&ch->queue_mtx, "ATA queue lock", MTX_DEF, 0);
+mtx_init(&ch->queue_mtx, "ATA queue lock", NULL, MTX_DEF);
 TAILQ_INIT(&ch->ata_queue);
 
 /* initialise device(s) on this channel */
Index: dev/ata/ata-disk.c
===
RCS file: /home/ncvs/src/sys/dev/ata/ata-disk.c,v
retrieving revision 1.164
diff -u -r1.164 ata-disk.c
--- dev/ata/ata-disk.c  11 Nov 2003 14:55:35 -  1.164
+++ dev/ata/ata-disk.c  28 Nov 2003 01:35:05 -
@@ -94,7 +94,7 @@
adp->sectors = 17;
adp->heads = 8;
 }
-mtx_init(&adp->queue_mtx, "ATA disk bioqueue lock", MTX_DEF, 0);
+mtx_init(&adp->queue_mtx, "ATA disk bioqueue lock", NULL, MTX_DEF);
 bioq_init(&adp->queue);
 
 lbasize = (u_int32_t)atadev->param->lba_size_1 |
Index: dev/ata/atapi-cd.c
===
RCS file: /home/ncvs/src/sys/dev/ata/atapi-cd.c,v
retrieving revision 1.156
diff -u -r1.156 atapi-cd.c
--- dev/ata/atapi-cd.c  24 Nov 2003 14:20:19 -  1.156
+++ dev/ata/atapi-cd.c  28 Nov 2003 01:35:18 -
@@ -222,7 +222,7 @@
 if (!(cdp = malloc(sizeof(struct acd_softc), M_ACD, M_NOWAIT | M_ZERO)))
return NULL;
 bioq_init(&cdp->queue);
-mtx_init(&cdp->queue_mtx, "ATAPI CD bioqueue lock", MTX_DEF, 0);
+mtx_init(&cdp->queue_mtx, "ATAPI CD bioqueue lock", NULL, MTX_DEF);
 cdp->device = atadev;
 cdp->lun = ata_get_lun(&acd_lun_map);
 cdp->block_size = 2048;
Index: dev/ata/atapi-fd.c
===
RCS file: /home/ncvs/src/sys/dev/ata/atapi-fd.c,v
retrieving revision 1.89
diff -u -r1.89 atapi-fd.c
--- dev/ata/atapi-fd.c  11 Nov 2003 14:55:35 -  1.89
+++ dev/ata/atapi-fd.c  28 Nov 2003 01:35:28 -
@@ -80,7 +80,7 @@
 fdp->lun = ata_get_lun(&afd_lun_map);
 ata_set_name(atadev, "afd", fdp->lun);
 bioq_init(&fdp->queue);
-mtx_init(&fdp->queue_mtx, "ATAPI FD bioqueue lock", MTX_DEF, 0);  
+mtx_init(&fdp->queue_mtx, "ATAPI FD bioqueue lock", NULL, MTX_DEF);  
 
 if (afd_sense(fdp)) {
free(fdp, M_AFD);
Index: dev/ata/atapi-tape.c
===
RCS file: /home/ncvs/src/sys/dev/ata/atapi-tape.c,v
retrieving revision 1.84
diff -u -r1.84 atapi-tape.c
--- dev/ata/atapi-tape.c11 Nov 2003 14:55:36 -  1.84
+++ dev/ata/atapi-tape.c28 Nov 2003 01:35:46 -
@@ -103,7 +103,7 @@
 stp->lun = ata_get_lun(&ast_lun_map);
 ata_set_name(atadev, "ast", stp->lun);
 bioq_init(&stp->queue);
-mtx_init(&stp->queue_mtx, "ATAPI TAPE bioqueue lock", MTX_DEF, 0);
+mtx_init(&stp->queue_mtx, "ATAPI TAPE bioqueue lock", NULL, MTX_DEF);
 
 if (ast_sense(stp)) {
free(stp, M_AST);
Index: dev/led/led.c
===
RCS file: /home/ncvs/src/sys/dev/led/led.c,v
retrieving revision 1.3
diff -u -r1.3 led.c
--- dev/led/led.c   23 Nov 2003 10:22:51 -  1.3
+++ dev/led/led.c   28 Nov 2003 01:35:59 -
@@ -216,7 +216,7 @@
struct sbuf *sb;
 
if (next_minor == 0) {
-   mtx_init(&led_mtx, "LED mtx", MTX_DEF, 0);
+   mtx_init(&led_mtx, "LED mtx", NULL, MTX_DEF);
timeout(led_timeout, NULL, hz / 10);
}
 
Index: dev/pst/pst-pci.c
===
RCS file: /home/ncvs/src/sys/dev/pst/pst-pci.c,v
retrieving revision 1.5
diff -u -r1.5 pst-pci.c
--- dev/pst/pst-pci.c   24 Aug 2003 17:54:17 -  1.5
+++ dev/pst/pst-pci.c   28 Nov 2003 01:36:09 -
@@ -96,7 +96,7 @@
 sc->phys_ibase = vtophys(sc->ibase);
 sc->reg = (struct i2o_registers *)sc->ibase;
 sc->dev = dev;
-mtx_init(&sc->mtx, "pst lock", MTX_DEF, 0);
+mtx_init(&sc->mtx, "pst lock", NULL, MTX_DEF);
 
 if (!iop_init(sc))
return 0;
Index: dev/sound/pcm/sndstat.c
===
RCS file: /home/ncvs/src/sys/dev/sound/pcm/sndstat.c,v
retrieving revision 1.14
diff -u -r1.14 sndstat.c
--- dev/sound/pcm/sndstat.c 7 Sep 2003 16:28:03 -   1.14
+++ dev/sound/pcm/sndstat.c 28 Nov 2003 01:36:21 -
@@ -340,7 +340,7 @@
 static int
 sndstat_init(void)
 {
-   mtx_i

Re: panic inserting CF card.

2003-11-27 Thread Don Lewis
On 27 Nov, masta wrote:
> I'm able to reproduce a panic on demand. I simply insert my IBM Microdrive
> CF card into the PCM/CIA slot of the laptop, which is a Dell CPi, and
> panic! I build a kernel.debug in the hope somebody can help me analyze the
> back-trace.
> 
> Oh by the way, the sources from this backtrace are from 11/27/2003, and
> inserting the CF card used to be functional in early November. I have no
> begun the process of targeting various dates/time to find when
> functionality was lost.

> (kgdb) where
> #10 0xc06995d8 in calltrap () at {standard input}:94
> #11 0xc06be8de in __udivdi3 (a=0, b=0) at ../../../libkern/udivdi3.c:51
> #12 0xc044ac53 in ad_print (adp=0x0) at ../../../dev/ata/ata-disk.c:384
> #13 0xc044a3fb in ad_attach (atadev=0xc243d4a4) at
> ../../../dev/ata/ata-disk.c:162
> #14 0xc0435e99 in ata_attach (dev=0x0) at ../../../dev/ata/ata-all.c:165
> #15 0xc04752f5 in pccard_compat_do_attach (bus=0xc13a6b00, dev=0xc243d400)
> at card_if.h:120
> #16 0xc043c927 in pccard_compat_attach (dev=0xc243d400) at card_if.h:138
> #17 0xc052fdf9 in device_probe_and_attach (dev=0xc243d400) at device_if.h:39
> #18 0xc0473fa8 in pccard_attach_card (dev=0xc13a6b00) at
> ../../../dev/pccard/pccard.c:262
> #19 0xc04628c8 in exca_insert (exca=0xc138d204) at card_if.h:66
> #20 0xc047c4f3 in cbb_insert (sc=0xc138d204) at
> ../../../dev/pccbb/pccbb.c:1078
> #21 0xc047c30b in cbb_event_thread (arg=0xc138d200) at
> ../../../dev/pccbb/pccbb.c:1028
> #22 0xc0502b84 in fork_exit (callout=0xc047c1d0 ,
> arg=0x0, frame=0x0) at ../../../kern/kern_fork.c:793
> (kgdb) q

In ad_print() there is the following statement:

ata_prtdev(adp->device,"%lluMB <%.40s> [%lld/%d/%d] at ata%d-%s %s%s\n",
   (unsigned long long)(adp->total_secs /
((1024L * 1024L) / DEV_BSIZE)),
   adp->device->param->model,
   (unsigned long long)(adp->total_secs /
(adp->heads * adp->sectors)),
   adp->heads, adp->sectors,
   device_get_unit(adp->device->channel->dev),
   (adp->device->unit == ATA_MASTER) ? "master" : "slave",
   (adp->flags & AD_F_TAG_ENABLED) ? "tagged " : "",
   ata_mode2str(adp->device->mode));   

Offhand I'd guess that adp->heads and/or adp->sectors is zero.  If
you've got a core file, try backtracking from there with gdb, otherwise
sprinkle some printf's around.  Either this calculation is new, or some
recent change is causing the heads and sectors to be initialized to
zero.
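
If the guess about a zero divisor is right, a defensive version of the cylinder calculation could look like the following C sketch.  toy_cylinders() is a hypothetical helper written for this note, not code from ata-disk.c, and reporting 0 for unknown geometry is an arbitrary choice.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the cylinder count from the total sector count and the
 * heads/sectors geometry, returning 0 instead of dividing by zero when
 * the geometry has not been filled in.
 */
static uint64_t
toy_cylinders(uint64_t total_secs, unsigned int heads, unsigned int sectors)
{
	uint64_t per_cyl = (uint64_t)heads * sectors;

	if (per_cyl == 0)
		return 0;		/* geometry unknown; avoid the trap */
	return total_secs / per_cyl;
}
```

A guard like this would only hide the symptom in the printout; the interesting question is still why the geometry ends up zero for this card.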


RE: Time jumping on both 4.x and 5.x ...

2003-11-29 Thread Don Bowman
From: Kris Kennaway [mailto:[EMAIL PROTECTED]
> 
> On Sat, Nov 29, 2003 at 11:32:28AM +0100, Michael Nottebrock wrote:
> Content-Description: signed data
> > On Saturday 29 November 2003 09:19, Kris Kennaway wrote:
> > >
> > > Are all affected machines multi-processor?
> > 
> > None. Both are i386 UP (although the 4.9-RELEASE box is 
> running an SMP-enabled 
> > kernel).
> 
> I didn't think 4.x SMP kernels could run on a UP machine.

They can if the machine has an APIC, i.e. most P3 and newer
boards will run an MP kernel even if only one processor
is installed.

For me, the bug reproduces on 4.7, and if I set
kern.timecounter.method=1, the problem goes away. I've
reproduced on both TSC and i8254.

--don


RE: Corrected gettimeofday() test code

2003-11-29 Thread Don Bowman
From: Kris Kennaway [mailto:[EMAIL PROTECTED]
> 
> I forwarded the reports of timecounter problems to phk, and he asked
> that people who are seeing timecounter problems provide FULL details
> of their system configuration, including:
> 
> * dmesg
> 
> * kernel configuration
> 
> * compiler options
> 
> * time-related system configuration (whether ntpd/timed/ntpdate is
> running, and if so whether it's correcting for a seriously drifting
> clock)
> 
> * The kernel timecounter configuration, e.g. the
> kern.timecounter.method and kern.timecounter.hardware sysctls, and
> whether changing them has any effect.
> 
> * The exact output of the corrected test program below (the original
> would give spurious errors if it didn't run at least once a second,
> which may have been confusing some people if their systems were
> sufficiently loaded).
> 
> * The system status when the problem is observed (i.e. does it only
> occur under load; what else is running at the time)
> 

For this config (below), kern.timecounter.method=0 reproduces the
problem, kern.timecounter.method=1 does not.

Output in 'error' case:
1070147643.248866 1070147651.028646 1070147643.248866 1070147651.028646
1070147656.287818 1070147664.067692 1070147656.287818 1070147664.067692
1070147659.326429 1070147667.106238 1070147659.326429 1070147667.106238
1070147668.370071 1070147676.149884 1070147668.370071 1070147676.149884
1070147681.433111 1070147689.212926 1070147681.433111 1070147689.212926
1070147683.418743 1070147691.198632 1070147683.418743 1070147691.198632

problem shows up within ~30s of starting the test program, and the messages
will come out about once per 1-5s period after that, not regularly.
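
The test program itself is not reproduced in this message.  As a rough idea of the kind of check involved, here is a minimal C sketch that samples gettimeofday() in a loop and counts samples that are earlier than the previous one.  The names tv_before() and count_backward_steps() are made up for this sketch, and it is not the program Kris posted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/time.h>

/* Return true if 'cur' is strictly earlier than 'prev'. */
static bool
tv_before(const struct timeval *cur, const struct timeval *prev)
{
	if (cur->tv_sec != prev->tv_sec)
		return cur->tv_sec < prev->tv_sec;
	return cur->tv_usec < prev->tv_usec;
}

/* Sample gettimeofday() n times and count the backward steps seen. */
static int
count_backward_steps(int n)
{
	struct timeval prev, cur;
	int steps = 0;

	gettimeofday(&prev, NULL);
	for (int i = 0; i < n; i++) {
		gettimeofday(&cur, NULL);
		if (tv_before(&cur, &prev))
			steps++;
		prev = cur;
	}
	return steps;
}
```

On a healthy clock that is not being stepped, count_backward_steps() would be expected to return 0 no matter how long it runs.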

kern.timecounter.hardware: TSC

on this machine; I have others using the i8254 that do it too.

hw.ncpu=1

compiler flags:
COPTFLAGS= -O2 -pipe -malign-loops=4 -malign-jumps=4 -malign-functions=4
-mcpu=i686 -march=i686 -fno-gcse -g



machine is running 4.7-RELEASE-p2.

dmesg, kernel config attached.


Intel-specific functions:
Version 0673:
Type 0 - Original OEM
Family 6 - Pentium Pro
Model 7 - Pentium III/Pentium III Xeon - external L2 cache
Stepping 3
Reserved 0

from cpuid.

ntpd -p /var/run/ntpd.pid -b -g -A
runs.

Clock does not drift much, less than 2s/day. Clock is 
stepped when system boots. There is a chimer on our router 
which broadcasts ntp time every minute. ntpd is not
observed to step clock.

On the 8254 machine [a dual 0f27 xeon, 533 FSB with HTT enabled]
1070147645.119531 1070148340.497729 1070148339.-693880469 1070148340.497729

both systems were unloaded. This was with the corrected program
you included.

--don



dmesg.boot
Description: Binary data
#
# GENERIC -- Generic kernel configuration file for FreeBSD/i386
#
# For more information on this file, please read the handbook section on
# Kernel Configuration Files:
#
#    http://www.FreeBSD.org/handbook/kernelconfig-config.html
#
# The handbook is also available locally in /usr/share/doc/handbook
# if you've installed the doc distribution, otherwise always see the
# FreeBSD World Wide Web server (http://www.FreeBSD.org/) for the
# latest information.
#
# An exhaustive list of options and more detailed explanations of the
# device lines is also present in the ./LINT configuration file. If you are
# in doubt as to the purpose or necessity of a line, check first in LINT.
#
# $FreeBSD: src/sys/i386/conf/GENERIC,v 1.246.2.39 2002/03/24 13:19:10 wilko Exp $
machine         i386
#cpu            I386_CPU
#cpu            I486_CPU
cpu             I586_CPU
cpu             I686_CPU
ident           TPC
maxusers        0
makeoptions     DEBUG=-g        #Build kernel with gdb(1) debug symbols
#options        MATH_EMULATE    #Support for x87 emulation
options         INET            #InterNETworking
#options        INET6           #IPv6 communications protocols
options         FFS             #Berkeley Fast Filesystem
options         FFS_ROOT        #FFS usable as root device [keep this!]
options         SOFTUPDATES     #Enable FFS soft updates support
options         UFS_DIRHASH     #Improve performance on big directories
options         MFS             #Memory Filesystem
options         MD_ROOT         #MD is a potential root device
options         NFS             #Network Filesystem
options         NFS_ROOT        #NFS usable as root device, NFS required
options         MSDOSFS         #MSDOS Filesystem
options         CD9660          #ISO 9660 Filesystem
options         CD9660_ROOT     #CD-ROM usable as root, CD9660 required
options         PROCFS          #Process filesystem
options         COMPAT_43       #Compatible with BSD 4.3 [KEEP THIS!]
options         SCSI_DELAY=20

Re: panic on 5.2 BETA: blockable sleep lock

2003-11-30 Thread Don Lewis
On 30 Nov, Stefan Ehmann wrote:
> On Fri, 2003-11-28 at 01:02, Don Lewis wrote:
>> On 27 Nov, Stefan Ehmann wrote:
>> > On Wed, 2003-11-26 at 08:33, Don Lewis wrote:
>> >> The problem is that selrecord() wants to lock a MTX_DEF mutex, which can
>> >> cause a context switch if the mutex is already locked by another thread.
>> >> This is contrary to what bktr_poll() wants to accomplish by calling
>> >> critical_enter().
>> > 
>> > Strangely enough, that does not seem to happen with a kernel built without
>> > INVARIANTS and WITNESS. Does this make any sense or is this just by
>> > chance?
>> 
>> You might try the patch below with WITNESS enabled.  I don't have the
>> hardware, so I can't test it.  It compiles for me, but for all I know it
>> could delete all your files if you run it.
> 
> Any chance for getting this committed?

I've been forwarding these messages to the bktr maintainer listed in
/usr/src/MAINTAINERS, in case he isn't subscribed to [EMAIL PROTECTED]  I'm not
surprised that I haven't heard from him because this issue came up at the
start of the Thanksgiving holiday weekend.  Committing the patch will
also require re approval because of the code freeze in preparation for
5.2-RELEASE.

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[EMAIL PROTECTED]"


Re: 5.2-BETA panic: page fault

2003-11-30 Thread Don Lewis
On 30 Nov, Stefan Ehmann wrote:
> On Sun, 2003-11-30 at 11:13, Stefan Ehmann wrote:
>> This happens to me several times a day (cvsup from yesterday didn't
>> change anything). The panic message is always the same, the backtrace is
>> different though (but always seems to be file system related in some
>> way)
>> 
>> Here's one from today:
> 
> As per request I made a (hopefully more useful) backtrace with a patched
> gdb version:
> 
> (kgdb) bt

> #12 0xc050f8f8 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
> #13 0xc068248c in trap_fatal (frame=0xd7f2ea48, eva=0)
> at /usr/src/sys/i386/i386/trap.c:821
> #14 0xc0682152 in trap_pfault (frame=0xd7f2ea48, usermode=0, eva=0)
> at /usr/src/sys/i386/i386/trap.c:735
> #15 0xc0681d63 in trap (frame=
>   {tf_fs = -672006120, tf_es = -672006128, tf_ds = -1068105712,
> tf_edi = -104931, tf_esi = 228, tf_ebp = -671946076, tf_isp =
> -671946124, tf_ebx = 0, tf_edx = 16777217, tf_ecx = -1011687424, tf_eax
> = -1011660928, tf_trapno = 12, tf_err = 0, tf_eip = -1068475565, tf_cs =
> 8, tf_eflags = 66178, tf_esp = 2, tf_ss = -1011660928}) at
> /usr/src/sys/i386/i386/trap.c:420
> #16 0xc06743d8 in calltrap () at {standard input}:94
> #17 0xc0505b53 in _mtx_lock_flags (m=0x0, opts=0, 
> file=0xc06bfc1d "/usr/src/sys/kern/kern_lock.c", line=228)
> at /usr/src/sys/kern/kern_mutex.c:214
> #18 0xc0502b54 in lockmgr (lkp=0xc3b2e028, flags=0, interlkp=0xe4, 
> td=0xc06bfc1d) at /usr/src/sys/kern/kern_lock.c:228
> #19 0xc0566d87 in vfs_busy (mp=0x0, flags=0, interlkp=0x0, td=0x0)
> at /usr/src/sys/kern/vfs_subr.c:527
> #20 0xc056374c in lookup (ndp=0xd7f2ec00) at
> /usr/src/sys/kern/vfs_lookup.c:559

It seems pretty clear that the panic is caused by passing a null pointer
to mtx_lock().  The evidence is the eva=0 argument to trap_pfault() and
the m=0x0 argument to _mtx_lock_flags().  If you have INVARIANTS defined,
the first thing that _mtx_lock_flags() does is to dereference
m->mtx_object, which is at the beginning of struct mtx.
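
To spell out why a NULL m gives a fault address of exactly 0: a minimal user-space sketch, using cut-down stand-in structures (struct lock_object_sketch and struct mtx_sketch are hypothetical, not the real kernel definitions). The only property carried over from the description above is that the lock object is the first member.

```c
#include <assert.h>	/* only needed if you add assert() checks around these */
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-ins for illustration only; the real kernel structures have more
 * (and different) members.  All that matters here is that the lock object
 * is the first member.
 */
struct lock_object_sketch {
	const char *lo_name;
	unsigned int lo_flags;
};

struct mtx_sketch {
	struct lock_object_sketch mtx_object;	/* first member: offset 0 */
	volatile uintptr_t mtx_lock;
};

/*
 * Address that would be touched when reading m->mtx_object through a
 * pointer whose value is `base`.  Computed arithmetically, so no invalid
 * pointer is ever actually dereferenced.
 */
uintptr_t
mtx_object_access_address(uintptr_t base)
{
	return (base + offsetof(struct mtx_sketch, mtx_object));
}
```

With base 0 this returns 0, which matches eva=0 in the trap frame. Had the first access been to a member further into the structure, the fault address would have been a small nonzero offset instead.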

There appears to be some stack spammage happening, and it is pretty much
consistent between this stack trace and the previous one displayed by
the unpatched version of gdb.  Notice how all the arguments to
vfs_busy() are NULL/0, but td and interlkp are passed directly to
lockmgr(), which has non-NULL td and interlkp arguments, though the
interlkp argument looks seriously bogus.  Looking at the lockmgr() call
in vfs_busy(), I don't see how the flags argument to lockmgr() could be
0.  If the mp argument to vfs_busy() were really NULL, vfs_busy() would
have panicked before calling lockmgr().

I wonder if an interrupt handler is stomping on the stack ...


Re: 5.2-BETA panic: page fault

2003-11-30 Thread Don Lewis
Can you reproduce this problem without bktr?

> #6  0xc06743d8 in calltrap () at {standard input}:94
> #7  0xc0505b53 in _mtx_lock_flags (m=0x0, opts=0, 
> file=0xc06bfc1d "/usr/src/sys/kern/kern_lock.c", line=228)
> at /usr/src/sys/kern/kern_mutex.c:214
> #8  0xc0502b54 in lockmgr (lkp=0xc3b2e028, flags=0, interlkp=0xe4, 
> td=0xc06bfc1d) at /usr/src/sys/kern/kern_lock.c:228
> #9  0xc0566d87 in vfs_busy (mp=0x0, flags=16, interlkp=0xc075d0e0,
> td=0x0)
> at /usr/src/sys/kern/vfs_subr.c:527
> #10 0xc056cfff in sync (td=0xc0730dc0, uap=0x0)

> #16 0xc06743d8 in calltrap () at {standard input}:94
> #17 0xc0505b53 in _mtx_lock_flags (m=0x0, opts=0, 
> file=0xc06bfc1d "/usr/src/sys/kern/kern_lock.c", line=228)
> at /usr/src/sys/kern/kern_mutex.c:214
> #18 0xc0502b54 in lockmgr (lkp=0xc3b2e028, flags=0, interlkp=0xe4, 
> td=0xc06bfc1d) at /usr/src/sys/kern/kern_lock.c:228
> #19 0xc0566d87 in vfs_busy (mp=0x0, flags=0, interlkp=0x0, td=0x0)
> at /usr/src/sys/kern/vfs_subr.c:527
> #20 0xc056374c in lookup (ndp=0xd7f2ec00) at
> /usr/src/sys/kern/vfs_lookup.c:559

You are getting a double panic, with the second happening during the
file system sync.  The code seems to be tripping over the same mount
list entry each time.  Maybe the mount list is getting corrupted.  Are
you using amd?  Print *lkp in the lockmgr() stack frame.


You might want to add
KASSERT(mp->mnt_lock.lk_interlock != NULL,
    ("vfs_busy: NULL mount pointer interlock"));
at the top of vfs_busy() and right before the lockmgr() call.
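
A minimal user-space sketch of what that assertion checks, for anyone who wants to try the idea outside the kernel. Everything with "sketch" in its name is a hypothetical stand-in: the structures are cut down to the one field of interest, and sketch_panic() records the message instead of halting. The macro mirrors the kernel convention that the message argument carries its own parentheses.

```c
#include <assert.h>	/* only needed if you add assert() checks around these */
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* User-space stand-in for panic(): records the message, does not halt. */
char sketch_panic_msg[128];
int sketch_panicked;

void
sketch_panic(const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(sketch_panic_msg, sizeof(sketch_panic_msg), fmt, ap);
	va_end(ap);
	sketch_panicked = 1;
}

/*
 * Same shape as the kernel macro: msg is pasted directly after the panic
 * function, so the caller must wrap it in parentheses.
 */
#define SKETCH_KASSERT(exp, msg) do {					\
	if (!(exp))							\
		sketch_panic msg;					\
} while (0)

/* Cut-down, hypothetical versions of the structures involved. */
struct lock_sketch {
	void *lk_interlock;
};

struct mount_sketch {
	struct lock_sketch mnt_lock;
};

/* Returns 0 if the interlock looks sane, 1 if the assertion fired. */
int
check_mount_interlock(struct mount_sketch *mp)
{
	sketch_panicked = 0;
	SKETCH_KASSERT(mp->mnt_lock.lk_interlock != NULL,
	    ("vfs_busy: NULL mount pointer interlock"));
	return (sketch_panicked);
}
```

Placing the same check both on entry and right before the lock call narrows down whether the structure was already damaged when the function was entered or got damaged while it was running.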



Re: 5.2-BETA panic: page fault

2003-11-30 Thread Don Lewis
On  1 Dec, Stefan Ehmann wrote:
> On Mon, 2003-12-01 at 01:10, Don Lewis wrote:
>> Can you reproduce this problem without bktr?
>> 
> 
>> You are getting a double panic, with the second happening during the
>> file system sync.  The code seems to be tripping over the same mount
>> list entry each time.  Maybe the mount list is getting corrupted.  Are
>> you using amd?  Print *lkp in the lockmgr() stack frame.
>> 
>> 
>> You might want to add
>>  KASSERT(mp->mnt_lock.lk_interlock != NULL,
>>      ("vfs_busy: NULL mount pointer interlock"));
>> at the top of vfs_busy() and right before the lockmgr() call.
> 
> No, I'm not using amd.
> 
> (kgdb) print *lkp
> $1 = {lk_interlock = 0x0, lk_flags = 0, lk_sharecount = 0, lk_waitcount
> = 0, 
>   lk_exclusivecount = 0, lk_prio = 0, lk_wmesg = 0x0, lk_timo = 0, 
>   lk_lockholder = 0x0, lk_newlock = 0x0}
> 
> This is indeed just NULLs.

Not good.  Nothing should be writing to lk_interlock once it has been
initialized.  Either something is stomping on an active struct mount,
we're still using it after it has been put on the free list, or
dp->v_mountedhere is pointing somewhere bogus.  I don't suspect the
latter because the second panic() in the sync() code doesn't follow this
path to get to the struct mount.

> I haven't tried without bktr yet but I hope I'll have time for that (and
> the KASSERT) tomorrow.
> 
> The panic only seems to happen when accessing my read-only mounted ext2
> partition. Today I tried not to access any data there and uptime is
> 14h30min now. The panic always happened after a few hours. So this is
> probably the core of the problem.

That sounds like a possibility.  I might be able to try that here when I
have some idle time on my -CURRENT box.

Can you print *dp->v_mountedhere in the lookup() frame?  That should
show the mount point information and might show if anything else in
struct mount is damaged.


Re: panic on 5.2 BETA: blockable sleep lock

2003-12-01 Thread Don Lewis
On  1 Dec, Roger Hardiman wrote:
> Hi
> 
> Please can someone commit the bktr patch for me to fix 5.2-BETA
> (as long as re@ approve). I don't have the resources.

If you're happy with the patch, I'll pursue re@ approval for the commit.

>> I'm not surprised that I haven't heard from him because this issue came up
>> at the start of the Thanksgiving holiday weekend.
> 
> If only it were that simple. Actually I'm English and we don't have
> Thanksgiving.

Sorry, I guess I should have fired up xearth ;-)


Re: panic on 5.2 BETA: blockable sleep lock

2003-12-01 Thread Don Lewis
On  1 Dec, Roger Hardiman wrote:
> Hi
> 
> Please can someone commit the bktr patch for me to fix 5.2-BETA
> (as long as re@ approve). I don't have the resources.

The patch has been committed with re@ approval.

