I managed to narrow down this isssue significantly. My hardware setup on this box is a soekris 4801 board + a 4-port serial card by Sunix ( http://www.sunix.com.tw/it/en/Product_Detail.php?cate=2&class_a_id=34&sid=36 1 ), full dmesg at the end.
The serial ports are: .... puc0 at pci0 dev 10 function 0 "Sunix 40XX" rev 0x01: ports: 4 com pccom3 at puc0 port 0 irq 11: ti16750, 64 byte fifo pccom4 at puc0 port 1 irq 11: ti16750, 64 byte fifo pccom5 at puc0 port 2 irq 11: ti16750, 64 byte fifo pccom6 at puc0 port 3 irq 11: ti16750, 64 byte fifo ... pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo pccom0: console pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo ..... OpenBSD 3.6-3.8: modem comms using an ISDN modem over /dev/cua03 to cua06 work OK. OpenBSD 3.9 and later up to -current of a few days ago: using /dev/cua0[3,4] fails, /dev/cua01 works OK. When I use the sunix ports, on the remote end it appears that my ppp process is sending corrupted data mixed with valid data fields in LCP proposals. The fact that the on-board isa-attached pccom1 works shows that userland ppp is not at fault for this breakage. So far everything points at the changes in puc(4) between 3.8 and 3.9. Before I start reverting the (luckily few) changes one by one, does anyone have a better idea? Or a hint on how to debug this better. Thanks, Mitja (and apologies for not submiting a dmesg the first time, but I felt that the original mail was long enough even without it) OpenBSD 4.2-current (GENERIC) #10: Sat Oct 20 15:59:16 CEST 2007 [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC cpu0: Geode(TM) Integrated Processor by National Semi ("Geode by NSC" 586-class) 267 MHz cpu0: FPU,TSC,MSR,CX8,CMOV,MMX cpu0: TSC disabled real mem = 133787648 (127MB) avail mem = 121565184 (115MB) mainbus0 at root bios0 at mainbus0: AT/286+ BIOS, date 20/70/08, BIOS32 rev. 0 @ 0xf7840 pcibios0 at bios0: rev 2.0 @ 0xf0000/0x10000 pcibios0: pcibios_get_intr_routing - function not supported pcibios0: PCI IRQ Routing information unavailable. pcibios0: PCI bus #0 is the last bus bios0: ROM list: 0xc8000/0x9000 cpu0 at mainbus0 pci0 at mainbus0 bus 0: configuration mode 1 (bios) pchb0 at pci0 dev 0 function 0 "Cyrix GXm PCI" rev 0x00 sis0 at pci0 dev 6 function 0 "NS DP83815 10/100" rev 0x00, DP83816A: irq 10, address 00:00:24:c6:15:4c nsphyter0 at sis0 phy 0: DP83815 10/100 PHY, rev. 1 sis1 at pci0 dev 7 function 0 "NS DP83815 10/100" rev 0x00, DP83816A: irq 10, address 00:00:24:c6:15:4d nsphyter1 at sis1 phy 0: DP83815 10/100 PHY, rev. 1 sis2 at pci0 dev 8 function 0 "NS DP83815 10/100" rev 0x00, DP83816A: irq 10, address 00:00:24:c6:15:4e nsphyter2 at sis2 phy 0: DP83815 10/100 PHY, rev. 1 puc0 at pci0 dev 10 function 0 "Sunix 40XX" rev 0x01: ports: 4 com pccom3 at puc0 port 0 irq 11: ti16750, 64 byte fifo pccom4 at puc0 port 1 irq 11: ti16750, 64 byte fifo pccom5 at puc0 port 2 irq 11: ti16750, 64 byte fifo pccom6 at puc0 port 3 irq 11: ti16750, 64 byte fifo gscpcib0 at pci0 dev 18 function 0 "NS SC1100 ISA" rev 0x00 gpio0 at gscpcib0: 64 pins "NS SC1100 SMI" rev 0x00 at pci0 dev 18 function 1 not configured pciide0 at pci0 dev 18 function 2 "NS SCx200 IDE" rev 0x01: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility wd0 at pciide0 channel 0 drive 1: <HITACHI_DK23AA-60B> wd0: 16-sector PIO, LBA, 5729MB, 11733120 sectors wd0(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 2 geodesc0 at pci0 dev 18 function 5 "NS SC1100 X-Bus" rev 0x00: iid 6 revision 3 wdstatus 0 ohci0 at pci0 dev 19 function 0 "Compaq USB OpenHost" rev 0x08: irq 5, version 1.0, legacy support isa0 at gscpcib0 isadma0 at isa0 pckbc0 at isa0 port 0x60/5 pckbd0 at pckbc0 (kbd slot) pckbc0: using irq 1 for kbd slot wskbd0 at pckbd0: console keyboard pcppi0 at isa0 port 0x61 midi0 at pcppi0: <PC speaker> spkr0 at pcppi0 nsclpcsio0 at isa0 port 0x2e/2: NSC PC87366 rev 9: GPIO VLM TMS gpio1 at nsclpcsio0: 29 pins gscsio0 at isa0 port 0x15c/2: SC1100 SIO rev 1: npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16 pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo pccom0: console pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo usb0 at ohci0: USB revision 1.0 uhub0 at usb0 "Compaq OHCI root hub" rev 1.00/1.00 addr 1 biomask f3e5 netmask f7e5 ttymask ffe7 dkcsum: wd0 matches BIOS drive 0x80 root on wd0a swap on wd0b dump on wd0b > -----Original Message----- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of Mitja Mu>enih > Sent: Saturday, October 20, 2007 1:41 PM > To: misc@openbsd.org > Subject: weird ppp upgrade problems > > Hi! > > I have a box with two external ISDN modems attached to it > that acts as an > outgoing modem pool to a number of remote located ISDN > routers (zyxel P-202H > Plus v2). > > Recently I was given the go-ahead to upgrade this 3.6 box so > I swapped the > disk with a new one with a fresh install of 4.1 (4.2 cds > haven't yet arrived > then). Unfortuantely, this upgrade caused the outgoing ppp > connections to > fail. > > My initial investigation didn't show me a clear reason for > this so I went > back to 3.6 and then upgraded version by version, until I > found the moment > when the unchanged very simple ppp.conf (at the bottom of > this mail) didn't > work anymore. > > To cut a long story shirt, something happened between 3.8 and 3.9 that > changes the behaviour of the LCP negotiation protocol in my > specific case: > > 3.8: > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: LayerStart > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: > SendConfigReq(1) > state = Stopped > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACFCOMP[2] > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: PROTOCOMP[2] > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACCMAP[6] 0x00000000 > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MRU[4] 1500 > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MAGICNUM[6] 0x434b14d1 > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: State > change Stopped > --> Req-Sent > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: > RecvConfigRej(1) > state = Req-Sent > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: PROTOCOMP[2] > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACFCOMP[2] > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: > SendConfigReq(2) > state = Req-Sent > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACCMAP[6] 0x00000000 > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MRU[4] 1500 > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MAGICNUM[6] 0x434b14d1 > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: > RecvConfigAck(2) > state = Req-Sent > > 3.9: > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: LayerStart > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: > SendConfigReq(1) state > = Stopped > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: ACFCOMP[2] > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: PROTOCOMP[2] > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: ACCMAP[6] 0x00000000 > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MRU[4] 1500 > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MAGICNUM[6] 0x1c0be38b > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: State > change Stopped > --> Req-Sent > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: > RecvConfigReq(235) > state = Req-Sent > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MRU[4] 1524 > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: AUTHPROTO[5] > 0xc223 (CHAP > 0x05) > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MAGICNUM[6] 0x4b27d550 > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: ACFCOMP[2] > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MRRU[4] 1524 > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: ENDDISC[9] MAC > 00:13:49:33:c8:84 > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: LDBACP[4] 002b > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: ACCMAP[6] 0x00000000 > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: > SendConfigRej(235) > state = Req-Sent > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MRRU[4] 1524 > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: LDBACP[4] 002b > > Under 3.8, my end makes some proposals, the remote end > rejects a couple of > them and eventually we reach an agreement. Under 3.9 and all the later > versions including -current, we send the exactly same > proposals but the > remote end instead of replying to them, sends its own set of > proposals - > which we ultimately find unnacceptable as we cannot grok LDBACP. > > Under 3.8 the flow goes: SendConfigReq -> RecvConfigRej -> > SendConfigReq -> > RecvConfigAck. > Under 3.9 the flow goes: SendConfigReq -> RecvConfigReq -> > SendConfigRej -> > RecvConfigReq -> SendConfigRej ... > > The funny part is that our config file and proposal remain > the same during > the OS version change, even the physical bits being pushed through the > serial port seem not to have changed except for MAGICNUM - so > what makes the > remote router change its behaviour??? What makes Zyxel react > differently? > > cvs diff on /usr/src/usr.sbin/ppp/ppp between 3.8 and 3.9 > shows a good 2k > lines have changed, but it seems to be mostly the inclusion of radius > support and some ipv6-related work, nothing that even > remotely appears to be > involved in LCP. Out of desperation I have tried to build > 3.8's ppp sources > on a 3.9 system, but surprisingly the resulting ppp gave the > 3.9 behaviour. > Could this mean that change was caused by kernel or other > system components? > > > Regards, Mitja > > ------------------- > default: > set log Phase Chat LCP IPCP CCP tun command > set dial "ABORT BUSY ABORT NO\\sCARRIER TIMEOUT 5 \"\" AT > OK-AT-OK ATE1Q0 > OK \\dATDT\\T TIMEOUT 40 CONNECT" > disable ipv6cp > enable mssfixup > > stpn: > set device /dev/cua03 > set phone 004912345678901 > set login > set authname xxxxxx > set authkey yyyyyy > set ifaddr 0.0.0.0/16 10.32.0.1/0 255.255.0.0 0.0.0.0 > add! 10.32.0.0/16 HISADDR > ------------------