On Sun, 28 Oct 2007, Mitja Mu>enih wrote: > I managed to narrow down this isssue significantly. > > My hardware setup on this box is a soekris 4801 board + a 4-port serial card > by Sunix ( > http://www.sunix.com.tw/it/en/Product_Detail.php?cate=2&class_a_id=34&sid=36 > 1 ), full dmesg at the end. > > The serial ports are: > .... > puc0 at pci0 dev 10 function 0 "Sunix 40XX" rev 0x01: ports: 4 com > pccom3 at puc0 port 0 irq 11: ti16750, 64 byte fifo > pccom4 at puc0 port 1 irq 11: ti16750, 64 byte fifo > pccom5 at puc0 port 2 irq 11: ti16750, 64 byte fifo > pccom6 at puc0 port 3 irq 11: ti16750, 64 byte fifo
Can you try a more recent snap? It has fifo probe code enabled, and it is interesteing to see if the fifo probe reports a different depth. That commit was done to resolve an issue for a 2 port Sunix card. For this card (4036A) the actual fifo depth is 32 instead of 64. See http://www.openbsd.org/cgi-bin/cvsweb/src/sys/dev/ic/com_subr.c.diff?r1=1.10& r2=1.11 -Otto > ... > pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo > pccom0: console > pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo > ..... > > > OpenBSD 3.6-3.8: modem comms using an ISDN modem over /dev/cua03 to cua06 > work OK. > > OpenBSD 3.9 and later up to -current of a few days ago: using /dev/cua0[3,4] > fails, /dev/cua01 works OK. When I use the sunix ports, on the remote end it > appears that my ppp process is sending corrupted data mixed with valid data > fields in LCP proposals. The fact that the on-board isa-attached pccom1 > works shows that userland ppp is not at fault for this breakage. > > So far everything points at the changes in puc(4) between 3.8 and 3.9. > > Before I start reverting the (luckily few) changes one by one, does anyone > have a better idea? Or a hint on how to debug this better. > > > Thanks, Mitja > (and apologies for not submiting a dmesg the first time, but I felt that the > original mail was long enough even without it) > > > OpenBSD 4.2-current (GENERIC) #10: Sat Oct 20 15:59:16 CEST 2007 > [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC > cpu0: Geode(TM) Integrated Processor by National Semi ("Geode by NSC" > 586-class) 267 MHz > cpu0: FPU,TSC,MSR,CX8,CMOV,MMX > cpu0: TSC disabled > real mem = 133787648 (127MB) > avail mem = 121565184 (115MB) > mainbus0 at root > bios0 at mainbus0: AT/286+ BIOS, date 20/70/08, BIOS32 rev. 0 @ 0xf7840 > pcibios0 at bios0: rev 2.0 @ 0xf0000/0x10000 > pcibios0: pcibios_get_intr_routing - function not supported > pcibios0: PCI IRQ Routing information unavailable. > pcibios0: PCI bus #0 is the last bus > bios0: ROM list: 0xc8000/0x9000 > cpu0 at mainbus0 > pci0 at mainbus0 bus 0: configuration mode 1 (bios) > pchb0 at pci0 dev 0 function 0 "Cyrix GXm PCI" rev 0x00 > sis0 at pci0 dev 6 function 0 "NS DP83815 10/100" rev 0x00, DP83816A: irq > 10, address 00:00:24:c6:15:4c > nsphyter0 at sis0 phy 0: DP83815 10/100 PHY, rev. 1 > sis1 at pci0 dev 7 function 0 "NS DP83815 10/100" rev 0x00, DP83816A: irq > 10, address 00:00:24:c6:15:4d > nsphyter1 at sis1 phy 0: DP83815 10/100 PHY, rev. 1 > sis2 at pci0 dev 8 function 0 "NS DP83815 10/100" rev 0x00, DP83816A: irq > 10, address 00:00:24:c6:15:4e > nsphyter2 at sis2 phy 0: DP83815 10/100 PHY, rev. 1 > puc0 at pci0 dev 10 function 0 "Sunix 40XX" rev 0x01: ports: 4 com > pccom3 at puc0 port 0 irq 11: ti16750, 64 byte fifo > pccom4 at puc0 port 1 irq 11: ti16750, 64 byte fifo > pccom5 at puc0 port 2 irq 11: ti16750, 64 byte fifo > pccom6 at puc0 port 3 irq 11: ti16750, 64 byte fifo > gscpcib0 at pci0 dev 18 function 0 "NS SC1100 ISA" rev 0x00 > gpio0 at gscpcib0: 64 pins > "NS SC1100 SMI" rev 0x00 at pci0 dev 18 function 1 not configured > pciide0 at pci0 dev 18 function 2 "NS SCx200 IDE" rev 0x01: DMA, channel 0 > wired to compatibility, channel 1 wired to compatibility > wd0 at pciide0 channel 0 drive 1: <HITACHI_DK23AA-60B> > wd0: 16-sector PIO, LBA, 5729MB, 11733120 sectors > wd0(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 2 > geodesc0 at pci0 dev 18 function 5 "NS SC1100 X-Bus" rev 0x00: iid 6 > revision 3 wdstatus 0 > ohci0 at pci0 dev 19 function 0 "Compaq USB OpenHost" rev 0x08: irq 5, > version 1.0, legacy support > isa0 at gscpcib0 > isadma0 at isa0 > pckbc0 at isa0 port 0x60/5 > pckbd0 at pckbc0 (kbd slot) > pckbc0: using irq 1 for kbd slot > wskbd0 at pckbd0: console keyboard > pcppi0 at isa0 port 0x61 > midi0 at pcppi0: <PC speaker> > spkr0 at pcppi0 > nsclpcsio0 at isa0 port 0x2e/2: NSC PC87366 rev 9: GPIO VLM TMS > gpio1 at nsclpcsio0: 29 pins > gscsio0 at isa0 port 0x15c/2: SC1100 SIO rev 1: > npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16 > pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo > pccom0: console > pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo > usb0 at ohci0: USB revision 1.0 > uhub0 at usb0 "Compaq OHCI root hub" rev 1.00/1.00 addr 1 > biomask f3e5 netmask f7e5 ttymask ffe7 > dkcsum: wd0 matches BIOS drive 0x80 > root on wd0a swap on wd0b dump on wd0b > > > > > -----Original Message----- > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > > On Behalf Of Mitja Mu>enih > > Sent: Saturday, October 20, 2007 1:41 PM > > To: misc@openbsd.org > > Subject: weird ppp upgrade problems > > > > Hi! > > > > I have a box with two external ISDN modems attached to it > > that acts as an > > outgoing modem pool to a number of remote located ISDN > > routers (zyxel P-202H > > Plus v2). > > > > Recently I was given the go-ahead to upgrade this 3.6 box so > > I swapped the > > disk with a new one with a fresh install of 4.1 (4.2 cds > > haven't yet arrived > > then). Unfortuantely, this upgrade caused the outgoing ppp > > connections to > > fail. > > > > My initial investigation didn't show me a clear reason for > > this so I went > > back to 3.6 and then upgraded version by version, until I > > found the moment > > when the unchanged very simple ppp.conf (at the bottom of > > this mail) didn't > > work anymore. > > > > To cut a long story shirt, something happened between 3.8 and 3.9 that > > changes the behaviour of the LCP negotiation protocol in my > > specific case: > > > > 3.8: > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: LayerStart > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: > > SendConfigReq(1) > > state = Stopped > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACFCOMP[2] > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: PROTOCOMP[2] > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACCMAP[6] 0x00000000 > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MRU[4] 1500 > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MAGICNUM[6] 0x434b14d1 > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: State > > change Stopped > > --> Req-Sent > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: > > RecvConfigRej(1) > > state = Req-Sent > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: PROTOCOMP[2] > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACFCOMP[2] > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: > > SendConfigReq(2) > > state = Req-Sent > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: ACCMAP[6] 0x00000000 > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MRU[4] 1500 > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: MAGICNUM[6] 0x434b14d1 > > Oct 18 16:29:27 ringie ppp[11094]: tun0: LCP: deflink: > > RecvConfigAck(2) > > state = Req-Sent > > > > 3.9: > > > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: LayerStart > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: > > SendConfigReq(1) state > > = Stopped > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: ACFCOMP[2] > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: PROTOCOMP[2] > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: ACCMAP[6] 0x00000000 > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MRU[4] 1500 > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MAGICNUM[6] 0x1c0be38b > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: State > > change Stopped > > --> Req-Sent > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: > > RecvConfigReq(235) > > state = Req-Sent > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MRU[4] 1524 > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: AUTHPROTO[5] > > 0xc223 (CHAP > > 0x05) > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MAGICNUM[6] 0x4b27d550 > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: ACFCOMP[2] > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MRRU[4] 1524 > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: ENDDISC[9] MAC > > 00:13:49:33:c8:84 > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: LDBACP[4] 002b > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: ACCMAP[6] 0x00000000 > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: deflink: > > SendConfigRej(235) > > state = Req-Sent > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: MRRU[4] 1524 > > Oct 18 16:28:45 ringie ppp[6602]: tun0: LCP: LDBACP[4] 002b > > > > Under 3.8, my end makes some proposals, the remote end > > rejects a couple of > > them and eventually we reach an agreement. Under 3.9 and all the later > > versions including -current, we send the exactly same > > proposals but the > > remote end instead of replying to them, sends its own set of > > proposals - > > which we ultimately find unnacceptable as we cannot grok LDBACP. > > > > Under 3.8 the flow goes: SendConfigReq -> RecvConfigRej -> > > SendConfigReq -> > > RecvConfigAck. > > Under 3.9 the flow goes: SendConfigReq -> RecvConfigReq -> > > SendConfigRej -> > > RecvConfigReq -> SendConfigRej ... > > > > The funny part is that our config file and proposal remain > > the same during > > the OS version change, even the physical bits being pushed through the > > serial port seem not to have changed except for MAGICNUM - so > > what makes the > > remote router change its behaviour??? What makes Zyxel react > > differently? > > > > cvs diff on /usr/src/usr.sbin/ppp/ppp between 3.8 and 3.9 > > shows a good 2k > > lines have changed, but it seems to be mostly the inclusion of radius > > support and some ipv6-related work, nothing that even > > remotely appears to be > > involved in LCP. Out of desperation I have tried to build > > 3.8's ppp sources > > on a 3.9 system, but surprisingly the resulting ppp gave the > > 3.9 behaviour. > > Could this mean that change was caused by kernel or other > > system components? > > > > > > Regards, Mitja > > > > ------------------- > > default: > > set log Phase Chat LCP IPCP CCP tun command > > set dial "ABORT BUSY ABORT NO\\sCARRIER TIMEOUT 5 \"\" AT > > OK-AT-OK ATE1Q0 > > OK \\dATDT\\T TIMEOUT 40 CONNECT" > > disable ipv6cp > > enable mssfixup > > > > stpn: > > set device /dev/cua03 > > set phone 004912345678901 > > set login > > set authname xxxxxx > > set authkey yyyyyy > > set ifaddr 0.0.0.0/16 10.32.0.1/0 255.255.0.0 0.0.0.0 > > add! 10.32.0.0/16 HISADDR > > ------------------