I managed to narrow down this isssue significantly.

My hardware setup on this box is a soekris 4801 board + a 4-port serial card
by Sunix (
http://www.sunix.com.tw/it/en/Product_Detail.php?cate=2&class_a_id=34&sid=36
1 ), full dmesg at the end.

The serial ports are:
....
puc0 at pci0 dev 10 function 0 "Sunix 40XX" rev 0x01: ports: 4 com
pccom3 at puc0 port 0 irq 11: ti16750, 64 byte fifo
pccom4 at puc0 port 1 irq 11: ti16750, 64 byte fifo
pccom5 at puc0 port 2 irq 11: ti16750, 64 byte fifo
pccom6 at puc0 port 3 irq 11: ti16750, 64 byte fifo
...
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom0: console
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
.....


OpenBSD 3.6-3.8: modem comms using an ISDN modem over /dev/cua03 to cua06
work OK.

OpenBSD 3.9 and later up to -current of a few days ago: using /dev/cua0[3,4]
fails, /dev/cua01 works OK. When I use the sunix ports, on the remote end it
appears that my ppp process is sending corrupted data mixed with valid data
fields in LCP proposals. The fact that the on-board isa-attached pccom1
works shows that userland ppp is not at fault for this breakage.

So far everything points at the changes in puc(4) between 3.8 and 3.9.

Before I start reverting the (luckily few) changes one by one, does anyone
have a better idea? Or a hint on how to debug this better.


Thanks, Mitja
(and apologies for not submiting a dmesg the first time, but I felt that the
original mail was long enough even without it)


OpenBSD 4.2-current (GENERIC) #10: Sat Oct 20 15:59:16 CEST 2007
    [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Geode(TM) Integrated Processor by National Semi ("Geode by NSC"
586-class) 267 MHz
cpu0: FPU,TSC,MSR,CX8,CMOV,MMX
cpu0: TSC disabled
real mem  = 133787648 (127MB)
avail mem = 121565184 (115MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 20/70/08, BIOS32 rev. 0 @ 0xf7840
pcibios0 at bios0: rev 2.0 @ 0xf0000/0x10000
pcibios0: pcibios_get_intr_routing - function not supported
pcibios0: PCI IRQ Routing information unavailable.
pcibios0: PCI bus #0 is the last bus
bios0: ROM list: 0xc8000/0x9000
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Cyrix GXm PCI" rev 0x00
sis0 at pci0 dev 6 function 0 "NS DP83815 10/100" rev 0x00, DP83816A: irq
10, address 00:00:24:c6:15:4c
nsphyter0 at sis0 phy 0: DP83815 10/100 PHY, rev. 1
sis1 at pci0 dev 7 function 0 "NS DP83815 10/100" rev 0x00, DP83816A: irq
10, address 00:00:24:c6:15:4d
nsphyter1 at sis1 phy 0: DP83815 10/100 PHY, rev. 1
sis2 at pci0 dev 8 function 0 "NS DP83815 10/100" rev 0x00, DP83816A: irq
10, address 00:00:24:c6:15:4e
nsphyter2 at sis2 phy 0: DP83815 10/100 PHY, rev. 1
puc0 at pci0 dev 10 function 0 "Sunix 40XX" rev 0x01: ports: 4 com
pccom3 at puc0 port 0 irq 11: ti16750, 64 byte fifo
pccom4 at puc0 port 1 irq 11: ti16750, 64 byte fifo
pccom5 at puc0 port 2 irq 11: ti16750, 64 byte fifo
pccom6 at puc0 port 3 irq 11: ti16750, 64 byte fifo
gscpcib0 at pci0 dev 18 function 0 "NS SC1100 ISA" rev 0x00
gpio0 at gscpcib0: 64 pins
"NS SC1100 SMI" rev 0x00 at pci0 dev 18 function 1 not configured
pciide0 at pci0 dev 18 function 2 "NS SCx200 IDE" rev 0x01: DMA, channel 0
wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 1: <HITACHI_DK23AA-60B>
wd0: 16-sector PIO, LBA, 5729MB, 11733120 sectors
wd0(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 2
geodesc0 at pci0 dev 18 function 5 "NS SC1100 X-Bus" rev 0x00: iid 6
revision 3 wdstatus 0
ohci0 at pci0 dev 19 function 0 "Compaq USB OpenHost" rev 0x08: irq 5,
version 1.0, legacy support
isa0 at gscpcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
nsclpcsio0 at isa0 port 0x2e/2: NSC PC87366 rev 9: GPIO VLM TMS
gpio1 at nsclpcsio0: 29 pins
gscsio0 at isa0 port 0x15c/2: SC1100 SIO rev 1:
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom0: console
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
usb0 at ohci0: USB revision 1.0
uhub0 at usb0 "Compaq OHCI root hub" rev 1.00/1.00 addr 1
biomask f3e5 netmask f7e5 ttymask ffe7
dkcsum: wd0 matches BIOS drive 0x80
root on wd0a swap on wd0b dump on wd0b



> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On Behalf Of Mitja Mu>enih
> Sent: Saturday, October 20, 2007 1:41 PM
> To: misc@openbsd.org
> Subject: weird ppp upgrade problems
>
> Hi!
>
> I have a box with two external ISDN modems attached to it
> that acts as an
> outgoing modem pool to a number of remote located ISDN
> routers (zyxel P-202H
> Plus v2).
>
> Recently I was given the go-ahead to upgrade this 3.6 box so
> I swapped the
> disk with a new one with a fresh install of 4.1 (4.2 cds
> haven't yet arrived
> then). Unfortuantely, this upgrade caused the outgoing ppp
> connections to
> fail.
>
> My initial investigation didn't show me a clear reason for
> this so I went
> back to 3.6 and then upgraded version by version, until I
> found the moment
> when the unchanged very simple ppp.conf (at the bottom of
> this mail) didn't
> work anymore.
>
> To cut a long story shirt, something happened between 3.8 and 3.9 that
> changes the behaviour of the LCP negotiation protocol in my
> specific case:
>
> 3.8:
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: LayerStart
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink:
> SendConfigReq(1)
> state = Stopped
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP:  ACFCOMP[2]
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP:  PROTOCOMP[2]
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP:  ACCMAP[6] 0x00000000
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP:  MRU[4] 1500
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP:  MAGICNUM[6] 0x434b14d1
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: State
> change Stopped
> --> Req-Sent
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink:
> RecvConfigRej(1)
> state = Req-Sent
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP:  PROTOCOMP[2]
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP:  ACFCOMP[2]
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink:
> SendConfigReq(2)
> state = Req-Sent
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP:  ACCMAP[6] 0x00000000
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP:  MRU[4] 1500
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP:  MAGICNUM[6] 0x434b14d1
> Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink:
> RecvConfigAck(2)
> state = Req-Sent
>
> 3.9:
>
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: LayerStart
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink:
> SendConfigReq(1) state
> = Stopped
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  ACFCOMP[2]
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  PROTOCOMP[2]
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  ACCMAP[6] 0x00000000
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  MRU[4] 1500
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  MAGICNUM[6] 0x1c0be38b
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: State
> change Stopped
> --> Req-Sent
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink:
> RecvConfigReq(235)
> state = Req-Sent
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  MRU[4] 1524
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  AUTHPROTO[5]
> 0xc223 (CHAP
> 0x05)
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  MAGICNUM[6] 0x4b27d550
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  ACFCOMP[2]
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  MRRU[4] 1524
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  ENDDISC[9] MAC
> 00:13:49:33:c8:84
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  LDBACP[4] 002b
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  ACCMAP[6] 0x00000000
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink:
> SendConfigRej(235)
> state = Req-Sent
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  MRRU[4] 1524
> Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP:  LDBACP[4] 002b
>
> Under 3.8, my end makes some proposals, the remote end
> rejects a couple of
> them and eventually we reach an agreement. Under 3.9 and all the later
> versions including -current, we send the exactly same
> proposals but the
> remote end instead of replying to them, sends its own set of
> proposals -
> which we ultimately find unnacceptable as we cannot grok LDBACP.
>
> Under 3.8 the flow goes: SendConfigReq ->  RecvConfigRej ->
> SendConfigReq ->
> RecvConfigAck.
> Under 3.9 the flow goes: SendConfigReq ->  RecvConfigReq ->
> SendConfigRej ->
> RecvConfigReq -> SendConfigRej ...
>
> The funny part is that our config file and proposal remain
> the same during
> the OS version change, even the physical bits being pushed through the
> serial port seem not to have changed except for MAGICNUM - so
> what makes the
> remote router change its behaviour??? What makes Zyxel react
> differently?
>
> cvs diff on /usr/src/usr.sbin/ppp/ppp between 3.8 and 3.9
> shows a good 2k
> lines have changed, but it seems to be mostly the inclusion of radius
> support and some ipv6-related work, nothing that even
> remotely appears to be
> involved in LCP.  Out of desperation I have tried to build
> 3.8's ppp sources
> on a 3.9 system, but surprisingly the resulting ppp gave the
> 3.9 behaviour.
> Could this mean that change was caused by kernel or other
> system components?
>
>
> Regards, Mitja
>
> -------------------
> default:
>  set log Phase Chat LCP IPCP CCP tun command
>  set dial "ABORT BUSY ABORT NO\\sCARRIER TIMEOUT 5 \"\" AT
> OK-AT-OK ATE1Q0
> OK \\dATDT\\T TIMEOUT 40 CONNECT"
>  disable ipv6cp
>  enable mssfixup
>
> stpn:
>  set device /dev/cua03
>  set phone 004912345678901
>  set login
>  set authname xxxxxx
>  set authkey yyyyyy
>  set ifaddr 0.0.0.0/16 10.32.0.1/0 255.255.0.0 0.0.0.0
>  add! 10.32.0.0/16 HISADDR
> ------------------

Reply via email to