On 11/13/2016 08:55 PM, Florian Fainelli wrote:
> On 13/11/2016 at 11:51, Mason wrote:
>> On 13/11/2016 04:09, Andrew Lunn wrote:
>>
>>> Mason wrote:
>>>
>>>> When connected to a Gigabit switch
>>>> 3.4 negotiates a LAN DHCP setup instantly
>>>> 4.7 requires over 5 seconds to do so
>>>
>>> When you run tcpdump on the DHCP server, are you noticing the first
>>> request is missing?
>>>
>>> What can happen is the dhclient gets started immediately and sends out
>>> its first request before auto-negotiation has finished. So this first packet
>>> gets lost. The retransmit after a few seconds is then successful.
>>
>> I will run tcpdump on the server as I run udhcpc on the client
>> for Linux 3.4 vs 4.7
>>
>> Do you know what would make auto-negotiation fail at 100 Mbps
>> on 4.7? (whereas it succeeds on 3.4)
>>
>> (Thinking out loud) If the problem were in auto-negotiation,
>> then it should work if I hard-code speed and duplex using
>> ethtool, right? (IIRC, hard-coding doesn't help.)
>
> I would start with checking basic things:
>
> - does your Ethernet driver get a link UP being reported correctly
> (netif_carrier_ok returns 1)?
> - if you let the bootloader configure the PHY and utilize the Generic
> PHY driver instead of the Atheros PHY driver, does the problem appear as
> well?

Would using a "fixed-link" serve the same purpose?

It appears that using a fixed-link

&eth0 {
	#address-cells = <1>;
	#size-cells = <0>;

#ifdef WITH_FIXED_LINK
	phy-connection-type = "rgmii";
	fixed-link {
		speed = <100>;
		full-duplex;
	};
#else
	phy-connection-type = "rgmii";
	phy-handle = <&eth0_phy>;

	/* Atheros AR8035 */
	eth0_phy: ethernet-phy@4 {
		interrupt-parent = <&irq0>;
		compatible = "ethernet-phy-id004d.d072",
			     "ethernet-phy-ieee802.3-c22";
		interrupts = <37 IRQ_TYPE_EDGE_RISING>;
		reg = <4>;
	};
#endif
};

works.

----

For what it's worth, the patch that Mason was talking about earlier in
the thread:

"...After much hair-pulling, it turned out that *some* of the breakage
was caused by a local patch..."

was changing the following delay in
'drivers/net/phy/phy.c:phy_state_machine()'

	/* Only re-schedule a PHY state machine change if we are polling the
	 * PHY, if PHY_IGNORE_INTERRUPT is set, then we will be moving
	 * between states from phy_mac_interrupt()
	 */
	if (phydev->irq == PHY_POLL)
		queue_delayed_work(system_power_efficient_wq, &phydev->state_queue,
				   PHY_STATE_TIME * HZ);

from "PHY_STATE_TIME * HZ" to "0".

That caused 2 of 3 types of boards to fail, while one of them always
worked regardless of the delay.

In a nutshell:
- Board A, chip X: works with delay "PHY_STATE_TIME * HZ" or "0".
- Board B, chip X: does not work with delay "0"
- Board C, chip Y: does not work with delay "0"

Does board A work by "luck" when this delay is "0"?
(this delay has always been there, but it is not clear why)

> - what do transmit/receive counters on the Ethernet driver/MAC return?
>
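
One way to collect the carrier state and the packet counters asked about
above, from user space on each kernel, is a small helper like the one
below. This is only a sketch under assumptions that are not from the
thread: the interface is assumed to be called eth0, and read_attr() and
the LINK_CHECK_MAIN macro are hypothetical names chosen for this example.
The "carrier", "statistics/tx_packets" and "statistics/rx_packets" files
are the standard sysfs attributes under /sys/class/net/<iface>/.

```c
#include <stdio.h>

/*
 * Read one decimal value from "<dir>/<attr>".
 * Returns 0 on success, -1 if the path is too long, the file cannot be
 * opened, or no number can be parsed from it.
 */
static int read_attr(const char *dir, const char *attr, long long *out)
{
	char path[512];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s", dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%lld", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

#ifdef LINK_CHECK_MAIN
/* Build with -DLINK_CHECK_MAIN to get a small command-line tool. */
int main(int argc, char **argv)
{
	/* "eth0" is an assumed name; pass another sysfs directory as argv[1]. */
	const char *dir = argc > 1 ? argv[1] : "/sys/class/net/eth0";
	long long carrier, tx, rx;

	if (read_attr(dir, "carrier", &carrier) != 0) {
		fprintf(stderr, "cannot read %s/carrier\n", dir);
		return 1;
	}
	printf("carrier=%lld\n", carrier);

	if (read_attr(dir, "statistics/tx_packets", &tx) == 0 &&
	    read_attr(dir, "statistics/rx_packets", &rx) == 0)
		printf("tx_packets=%lld rx_packets=%lld\n", tx, rx);
	return 0;
}
#endif
```

Running it right before udhcpc starts, and again after the lease is
obtained, on both 3.4 and 4.7 should show whether the first DHCP request
goes out while carrier is still 0, and whether tx_packets increases
without a matching increase in rx_packets.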