On 10/30/2007 4:58 PM, Karsten McMinn wrote:
> ddb(4) (trace and ps). Have a remotely accessible console on the server.
> Check for hardware problems. Check for irregular network traffic.
Thanks for your reply. As I mentioned, the system didn't end up in ddb
this time, so there is no info from it. The network traffic looked quite
light: 34 web hits between 11 am and ~11:44, when the system stopped
responding, and only 2 unsuccessful ftp access attempts that morning
(ftp requires TLS).
I'm wondering if this might be caused by a lack of memory (RAM)?
This is a wild and undereducated guess: did the system still answer
pings and accept connections, but nothing more, because newly spawned
processes couldn't get the memory they needed to run?
Currently the memory stats of the system are (with nearly no load):
# sysctl -n hw.physmem
66678784
# sysctl -n hw.usermem
66256896
# vmstat
 procs   memory        page                    disks     traps         cpu
 r b w    avm    fre   flt  re  pi  po  fr  sr wd0 cd0  int   sys   cs us sy id
 0 0 0  46456   4908    46   0   0   0   0   2   7   0  232    74   18  1  1 99
# top -b
load averages: 0.39, 0.22, 0.18 09:15:24
62 processes: 61 idle, 1 on processor
CPU states: 0.7% user, 0.0% nice, 0.4% system, 0.3% interrupt, 98.7% idle
Memory: Real: 24M/51M act/tot Free: 5476K Swap: 26M/256M used/to
While monitoring the system with "vmstat -w 1" and accessing web pages, I
noticed that the free memory can drop significantly. I once got "fre" down
to 748 (from about 4900 when idle) by running multiple http sessions
simultaneously:
 procs   memory        page                    disks     traps         cpu
 r b w    avm    fre   flt  re  pi  po  fr  sr wd0 cd0  int   sys   cs us sy id
 2 0 0  56096    748  1903   0   7   0   0   0 240   0  355  1545  354 50 10 40
So I assume adding more memory to the system would be a good investment
and not money wasted, right?
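To put the numbers above in proportion, here is a rough Python sketch. It assumes the vmstat "fre" column is in kilobytes (top's "Free: 5476K" suggests so); the constant and function names are made up for illustration.

```python
# Figures copied from the sysctl and vmstat output above.
PHYSMEM_BYTES = 66678784   # sysctl -n hw.physmem
FREE_IDLE_KB = 4908        # vmstat "fre", nearly no load
FREE_LOADED_KB = 748       # vmstat "fre", several http sessions at once


def free_percent(free_kb, physmem_bytes=PHYSMEM_BYTES):
    """Free memory as a percentage of physical RAM (assumes fre is in KB)."""
    return 100.0 * free_kb * 1024 / physmem_bytes


if __name__ == "__main__":
    print("physical RAM: %.1f MB" % (PHYSMEM_BYTES / (1024.0 * 1024.0)))
    print("free at idle: %.1f%%" % free_percent(FREE_IDLE_KB))
    print("free under load: %.1f%%" % free_percent(FREE_LOADED_KB))
```

Under that assumption the box has about 63.6 MB of RAM, with roughly 7.5% of it free at idle and roughly 1.1% free under the test load, while 26M of swap is already in use.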
On 10/30/2007 5:47 PM, Claus Niesen wrote:
The console terminal didn't respond either. I could use Ctrl-Alt-F2 to switch
consoles, but the console terminal wouldn't respond to keystrokes at all. I
didn't see any error messages on the console itself either. Is this faulty
hardware, or a lack of RAM due to the multiple Apache instances?
OpenBSD 4.0-stable (GENERIC) #3: Wed Mar 14 14:13:09 CDT 2007
[EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel Pentium II ("GenuineIntel" 686-class, 512KB L2 cache) 266 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,MMX
real mem = 66678784 (65116K)
avail mem = 52568064 (51336K)
using 839 buffers containing 3436544 bytes (3356K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(96) BIOS, date 08/22/99, BIOS32 rev. 0 @ 0xec800,
SMBIOS rev. 2.1 @ 0xf13e6 (54 entries)
bios0: Compaq Deskpro
pcibios0 at bios0: rev 2.1 @ 0xec800/0x3800
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xf6ff0/176 (9 entries)
pcibios0: PCI Interrupt Router at 000:20:0 ("Intel 82371AB PIIX4 ISA" rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc0000/0x8000 0xe0000/0x8000!
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "Intel 82443BX AGP" rev 0x02
ppb0 at pci0 dev 1 function 0 "Intel 82443BX AGP" rev 0x02
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 "ATI Mach64 GD" rev 0x5c
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
xl0 at pci0 dev 16 function 0 "3Com 3c905C 100Base-TX" rev 0x6c: irq 11,
address 00:01:02:66:8e:45
bmtphy0 at xl0 phy 24: Broadcom 3C905C internal PHY, rev. 4
pcib0 at pci0 dev 20 function 0 "Intel 82371AB PIIX4 ISA" rev 0x02
pciide0 at pci0 dev 20 function 1 "Intel 82371AB IDE" rev 0x01: DMA, channel 0
wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: <IC35L060AVER07-0>
wd0: 16-sector PIO, LBA, 58644MB, 120103200 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: <, 40X CD-ROM, 2.B1> SCSI0 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
uhci0 at pci0 dev 20 function 2 "Intel 82371AB USB" rev 0x01: irq 11
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
piixpm0 at pci0 dev 20 function 3 "Intel 82371AB Power" rev 0x02: SMI
iic0 at piixpm0
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
sb0 at isa0 port 0x220/24 irq 5 drq 1: dsp v3.01
midi0 at sb0: <SB MIDI UART>
audio0 at sb0
opl0 at sb0: model OPL3
midi1 at opl0: <SB Yamaha OPL3>
pcppi0 at isa0 port 0x61
midi2 at pcppi0: <PC speaker>
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask ff45 netmask ff45 ttymask ffc7
pctr: 686-class user-level performance counters enabled
mtrr: Pentium Pro MTRR support
dkcsum: wd0 matches BIOS drive 0x80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302
WARNING: / was not properly unmounted
-------- Original Message --------
Date: Tue, 30 Oct 2007 13:49:46 -0500
From: Claus <[EMAIL PROTECTED]>
To: misc@openbsd.org
Subject: Server trouble shooting
Background:
I'm running a web server with the Apache from the base install, php,
pureftp and a postgresql database to serve multiple websites. Each
website runs in its own instance of Apache, and one extra instance of
Apache does reverse proxying based on the domain name. In all, 5
independent Apache instances are started. I've done this to separate the
domains so that php can't access the data of another domain.
A simplified graphic representation:
       Internet
           |
NAT Firewall (OpenBSD)
           |
+----------------------+
|          |           |
| Apache Reverse proxy |  Web Server (OpenBSD 4.0)
|    |           |     |
| dom1.com    dom2.com |
+----------------------+
Problem:
This is the second time that, after running for a while (1 to 3 months),
the server has stopped responding to http, ftp and ssh. The connection
seems to be established, but the service does not respond. Ping responds
fine. The first time this happened the system was at the ddb> prompt.
Since I'm not too familiar with kernel debugging, I simply restarted the
system. :(
Question:
Instead of just rebooting the system, I would like to learn how to
troubleshoot the problem. Currently I'm physically away from the system
and can't look at the console. Since I can't connect successfully via
ssh, is there anything else I could be doing remotely?
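One check that can be done from a distance is to confirm the "connects but never answers" symptom per service. The kernel completes the TCP handshake on its own, before the daemon accepts the connection, so a port that opens but stays silent suggests the userland process is stuck rather than the network. sshd and ftp servers normally send a greeting right after the connection opens; http does not, so this only makes sense for ports 22 and 21. A minimal Python sketch; the function name, the result labels and the host name are made up for illustration:

```python
import socket


def probe(host, port, timeout=5.0):
    """Connect to host:port and wait for the service greeting.

    Returns one of these (hypothetical) labels:
      "connect-failed"        TCP connection could not be opened
      "connected-but-silent"  handshake completed, nothing received in time
      "closed-without-banner" peer closed the connection without sending data
      "banner"                the service sent some data
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "connect-failed"
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(256)
        except socket.timeout:
            return "connected-but-silent"
        except OSError:
            return "connect-failed"
    if not data:
        return "closed-without-banner"
    return "banner"


if __name__ == "__main__":
    # www.example.com stands in for the real server name.
    for port in (21, 22):
        print(port, probe("www.example.com", port))
```

A "connected-but-silent" result on both ports while ping still works would match the symptom described above. It does not say why the daemons are stuck; that still needs the console or ddb.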