And one other From other mailing I see you send patch to Denys Fedoryshchenko this patch is :
diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c index 255a5def56e9..2acf4b0eabd1 100644 --- a/drivers/net/ppp/ppp_generic.c +++ b/drivers/net/ppp/ppp_generic.c @@ -3161,6 +3161,15 @@ ppp_connect_channel(struct channel *pch, int unit) goto outl; ppp_lock(ppp); + spin_lock_bh(>downl); + if (!pch->chan) { + /* Don't connect unregistered channels */ + ppp_unlock(ppp); + spin_unlock_bh(>downl); + ret = -ENOTCONN; + goto outl; + } + spin_unlock_bh(>downl); if (pch->file.hdrlen > ppp->file.hdrlen) ppp->file.hdrlen = pch->file.hdrlen; hdrlen = pch->file.hdrlen + 2; /* for protocol bytes */ But in official stable kernel three In ppp_generic.c is this : spin_lock_bh(&pch->downl); if (!pch->chan) { /* Don't connect unregistered channels */ spin_unlock_bh(&pch->downl); ppp_unlock(ppp); ret = -ENOTCONN; goto outl; } spin_unlock_bh(&pch->downl); It is normal to unlock ppp after spin_unlock ? shouldn't it be as you wrote it? In your patch first : + ppp_unlock(ppp); + spin_unlock_bh(>downl); But in stable kernel is : spin_unlock_bh(&pch->downl); ppp_unlock(ppp); > On 9 Dec 2020, at 20:10, Guillaume Nault <gna...@redhat.com> wrote: > > On Wed, Dec 09, 2020 at 06:57:44PM +0200, Martin Zaharinov wrote: >>> On 9 Dec 2020, at 18:40, Guillaume Nault <gna...@redhat.com> wrote: >>> On Wed, Dec 09, 2020 at 04:47:52PM +0200, Martin Zaharinov wrote: >>>> Hi All >>>> >>>> I have problem with latest kernel release >>>> And the problem is base on this late problem : >>>> >>>> >>>> https://www.mail-archive.com/search?l=netdev@vger.kernel.org&q=subject:%22Re%5C%3A+ppp%5C%2Fpppoe%2C+still+panic+4.15.3+in+ppp_push%22&o=newest&f=1 >>>> >>>> I have same problem in kernel 5.6 > now I use kernel 5.9.13 and have same >>>> problem. >>>> >>>> >>>> In kernel 5.9.13 now don’t have any crashes in dimes but in one moment >>>> accel service stop with defunct and in log have many of this line : >>>> >>>> >>>> error: vlan608: ioctl(PPPIOCCONNECT): Transport endpoint is not connected >>>> error: vlan617: ioctl(PPPIOCCONNECT): Transport endpoint is not connected >>>> error: vlan679: ioctl(PPPIOCCONNECT): Transport endpoint is not connected >>>> >>>> In one moment connected user bump double or triple and after that service >>>> defunct and need wait to drop all session to start . >>>> >>>> I talk with accel-ppp team and they said this is kernel related problem >>>> and to back to kernel 4.14 there is not this problem. >>>> >>>> Problem is come after kernel 4.15 > and not have solution to this moment. >>> >>> I'm sorry, I don't understand. >>> Do you mean that v4.14 worked fine (no crash, no ioctl() error)? >>> Did the problem start appearing in v4.15? Or did v4.15 work and the >>> problem appeared in v4.16? >> >> In Telegram group I talk with Sergey and Dimka and told my the problem is >> come after changes from 4.14 to 4.15 >> Sergey write this : "as I know, there was a similar issue in kernel 4.15 so >> maybe it is still not fixed" > > Ok, but what is your experience? Do you have a kernel version where > accel-ppp reports no ioctl() error and doesn't crash the kernel? > > There wasn't a lot of changes between 4.14 and 4.15 for PPP. > The only PPP patch I can see that might have been risky is commit > 0171c4183559 ("ppp: unlock all_ppp_mutex before registering device"). > >> I don’t have options to test with this old kernel 4.14.xxx i don’t have >> support for them. >> >> >>> >>>> Please help to find the problem. >>>> >>>> Last time in link I see is make changes in ppp_generic.c >>>> >>>> ppp_lock(ppp); >>>> spin_lock_bh(&pch->downl); >>>> if (!pch->chan) { >>>> /* Don't connect unregistered channels */ >>>> spin_unlock_bh(&pch->downl); >>>> ppp_unlock(ppp); >>>> ret = -ENOTCONN; >>>> goto outl; >>>> } >>>> spin_unlock_bh(&pch->downl); >>>> >>>> >>>> But this fix only to don’t display error and freeze system >>>> The problem is stay and is to big. >>> >>> Do you use accel-ppp's unit-cache option? Does the problem go away if >>> you stop using it? >>> >> >> No I don’t use unit-cache , if I set unit-cache accel-ppp defunct same but >> user Is connect and disconnet more fast. >> >> The problem is same with unit and without . >> Only after this patch I don’t see error in dimes but this is not solution. > > Soryy, what's "in dimes"? > Do you mean that reverting commit 77f840e3e5f0 ("ppp: prevent > unregistered channels from connecting to PPP units") fixes your problem? > >> In network have customer what have power cut problem, when drop 600 user and >> back Is normal but in this moment kernel is locking and start to make this : >> sessions: >> starting: 4235 >> active: 3882 >> finishing: 378 >> The problem is starting session is not real user normal user in this server >> is ~4k customers . > > What type of session is it? L2TP, PPPoE, PPTP? > >> I use pppd_compat . >> >> Any idea ? >> >>>> >>>> Please help to fix. >> Martin