On Wed, Dec 09, 2020 at 06:57:44PM +0200, Martin Zaharinov wrote: > > On 9 Dec 2020, at 18:40, Guillaume Nault <gna...@redhat.com> wrote: > > On Wed, Dec 09, 2020 at 04:47:52PM +0200, Martin Zaharinov wrote: > >> Hi All > >> > >> I have problem with latest kernel release > >> And the problem is base on this late problem : > >> > >> > >> https://www.mail-archive.com/search?l=netdev@vger.kernel.org&q=subject:%22Re%5C%3A+ppp%5C%2Fpppoe%2C+still+panic+4.15.3+in+ppp_push%22&o=newest&f=1 > >> > >> I have same problem in kernel 5.6 > now I use kernel 5.9.13 and have same > >> problem. > >> > >> > >> In kernel 5.9.13 now don’t have any crashes in dimes but in one moment > >> accel service stop with defunct and in log have many of this line : > >> > >> > >> error: vlan608: ioctl(PPPIOCCONNECT): Transport endpoint is not connected > >> error: vlan617: ioctl(PPPIOCCONNECT): Transport endpoint is not connected > >> error: vlan679: ioctl(PPPIOCCONNECT): Transport endpoint is not connected > >> > >> In one moment connected user bump double or triple and after that service > >> defunct and need wait to drop all session to start . > >> > >> I talk with accel-ppp team and they said this is kernel related problem > >> and to back to kernel 4.14 there is not this problem. > >> > >> Problem is come after kernel 4.15 > and not have solution to this moment. > > > > I'm sorry, I don't understand. > > Do you mean that v4.14 worked fine (no crash, no ioctl() error)? > > Did the problem start appearing in v4.15? Or did v4.15 work and the > > problem appeared in v4.16? > > In Telegram group I talk with Sergey and Dimka and told my the problem is > come after changes from 4.14 to 4.15 > Sergey write this : "as I know, there was a similar issue in kernel 4.15 so > maybe it is still not fixed"
Ok, but what is your experience? Do you have a kernel version where accel-ppp reports no ioctl() error and doesn't crash the kernel? There wasn't a lot of changes between 4.14 and 4.15 for PPP. The only PPP patch I can see that might have been risky is commit 0171c4183559 ("ppp: unlock all_ppp_mutex before registering device"). > I don’t have options to test with this old kernel 4.14.xxx i don’t have > support for them. > > > > > >> Please help to find the problem. > >> > >> Last time in link I see is make changes in ppp_generic.c > >> > >> ppp_lock(ppp); > >> spin_lock_bh(&pch->downl); > >> if (!pch->chan) { > >> /* Don't connect unregistered channels */ > >> spin_unlock_bh(&pch->downl); > >> ppp_unlock(ppp); > >> ret = -ENOTCONN; > >> goto outl; > >> } > >> spin_unlock_bh(&pch->downl); > >> > >> > >> But this fix only to don’t display error and freeze system > >> The problem is stay and is to big. > > > > Do you use accel-ppp's unit-cache option? Does the problem go away if > > you stop using it? > > > > No I don’t use unit-cache , if I set unit-cache accel-ppp defunct same but > user Is connect and disconnet more fast. > > The problem is same with unit and without . > Only after this patch I don’t see error in dimes but this is not solution. Soryy, what's "in dimes"? Do you mean that reverting commit 77f840e3e5f0 ("ppp: prevent unregistered channels from connecting to PPP units") fixes your problem? > In network have customer what have power cut problem, when drop 600 user and > back Is normal but in this moment kernel is locking and start to make this : > sessions: > starting: 4235 > active: 3882 > finishing: 378 > The problem is starting session is not real user normal user in this server > is ~4k customers . What type of session is it? L2TP, PPPoE, PPTP? > I use pppd_compat . > > Any idea ? > > >> > >> Please help to fix. > Martin >