[Bug 197921] scheduler: Allow non-migratable threads to bind to their current CPU

2024-01-09 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=197921

Zhenlei Huang  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |z...@freebsd.org

--- Comment #3 from Zhenlei Huang  ---
It seems we have no callers that bind a thread to its local CPU; otherwise
`KASSERT(THREAD_CAN_MIGRATE(td), ("%p must be migratable", td))` would complain
(when the kernel is built with option INVARIANTS).

(In reply to Ed Maste from comment #1)
> but, what about just moving the KASSERT after the `if (PCPU_GET(cpuid) == 
> cpu)` test?
I think that is much simpler.
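
The reordering suggested in comment #1 can be illustrated with a small
userland model. This is only a sketch: struct model_thread, the enum and both
functions are hypothetical stand-ins, not the scheduler's real definitions,
and the assertion is modelled as a return value instead of a panic.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the parts of a thread that matter here. */
struct model_thread {
	int  pinned;   /* > 0: the thread is not allowed to migrate */
	int  cur_cpu;  /* CPU the thread is currently running on */
	bool bound;    /* set once the thread is bound to a CPU */
};

enum bind_result { BIND_OK_LOCAL, BIND_OK_MIGRATED, BIND_ASSERT_FAILED };

/* Models THREAD_CAN_MIGRATE(); the real macro checks more than this. */
bool
model_can_migrate(const struct model_thread *td)
{
	return (td->pinned == 0);
}

/* Assertion before the "already on that CPU" test. */
enum bind_result
bind_assert_first(struct model_thread *td, int cpu)
{
	if (!model_can_migrate(td))
		return (BIND_ASSERT_FAILED);	/* KASSERT would fire here */
	td->bound = true;
	if (td->cur_cpu == cpu)
		return (BIND_OK_LOCAL);
	td->cur_cpu = cpu;
	return (BIND_OK_MIGRATED);
}

/* Assertion moved after the "already on that CPU" test. */
enum bind_result
bind_local_check_first(struct model_thread *td, int cpu)
{
	if (td->cur_cpu == cpu) {
		td->bound = true;		/* no migration needed */
		return (BIND_OK_LOCAL);
	}
	if (!model_can_migrate(td))
		return (BIND_ASSERT_FAILED);	/* still caught for remote CPUs */
	td->bound = true;
	td->cur_cpu = cpu;
	return (BIND_OK_MIGRATED);
}
```

In this model a pinned thread binding to the CPU it already runs on trips the
assertion with the first ordering and succeeds with the second, while a pinned
thread asking for a different CPU is rejected either way.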



Re: route ipv6 errors on bootup in -current main-n267425-aa1223ac3afc on arm64

2024-01-09 Thread void

On Mon, Jan 08, 2024 at 01:07:30PM -0800, Enji Cooper wrote:


Was the kernel/utility built with IPv6? If not, that’s a general 
bug which should be filed (which can be easily checked/avoided 
using the FEATURES(9) subsystem)…

Cheers!
-Enji


world/kernel was built with WITHOUT_INET6= in /etc/src.conf

I made the problem go away by removing WITHOUT_INET6= and rebuilding.
The system was installed by taking
FreeBSD-15.0-CURRENT-arm64-aarch64-RPI-20240104-8bf0882e186e-267378.img
and dd-ing it to a usb3-connected hd.

Where can I read about features?

% man features
No manual entry for "features"

it's not in apropos
thanks,
--
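
For reference, a userland program can ask the running kernel whether it
advertises a feature. The sketch below assumes the libc call
feature_present(3) and assumes that the IPv6 feature is registered under the
name "inet6"; kernel_has_feature() and describe_inet6_support() are
hypothetical helper names. On a non-FreeBSD build the check is reported as
unavailable rather than guessed.

```c
#include <stddef.h>
#ifdef __FreeBSD__
#include <unistd.h>	/* feature_present(3) */
#endif

/*
 * Returns 1 if the running kernel advertises the named feature, 0 if it
 * does not, and -1 if the check cannot be made on this platform.
 */
int
kernel_has_feature(const char *name)
{
#ifdef __FreeBSD__
	return (feature_present(name) ? 1 : 0);
#else
	(void)name;
	return (-1);
#endif
}

/* "inet6" is the assumed feature name for IPv6 support. */
const char *
describe_inet6_support(void)
{
	switch (kernel_has_feature("inet6")) {
	case 1:
		return ("kernel reports IPv6 support");
	case 0:
		return ("kernel reports no IPv6 support; skip IPv6 setup");
	default:
		return ("feature check unavailable on this platform");
	}
}
```

A utility could call describe_inet6_support() (or test the return value of
kernel_has_feature() directly) before attempting any IPv6 configuration.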



Re: noatime on ufs2

2024-01-09 Thread void

On Mon, Jan 08, 2024 at 12:41:02PM -0800, Xin LI wrote:

On Sun, Jan 7, 2024 at 5:27 AM void  wrote:


Hi,

Does /var/mail still need atime?

I've installed a ufs2-based -current main-n267425-aa1223ac3afc on
rpi4/8GB which installs into one / . If it's mounted with noatime,
will it have consequences for /var/mail ?



It doesn't matter if you don't normally receive emails locally (nowadays,
it's rare).


I read the periodic ones locally


If you do receive emails locally, it depends on which application(s)
you are using.


In this type of context it'll be either exim or dma with mutt used for reading
the emails. It'll be in whatever the default format is (mbox?)


Most applications nowadays check both mtime and atime plus
sizes of the mailbox file and do not rely on atime (because they saved the
previous mtime).  Without atime updates, some application may claim that
you have new mail when the mailbox is not empty when they first start.

That said, if I were you and using flash-based storage (highly likely with
an rpi), I would mount with noatime regardless of whether I read mail locally;
most of the time the atime data is not really useful for anything, and updating
it does increase the wear on your storage.


Good to know. When installing, the opportunity to define partitions
didn't arise because the image was installed by dd to
media that will run the system (in this case a hard disk)

I was concerned that email might not work right without atime.
So far, it seems to be working OK.

My fstab in part looks like this:

/dev/ufs/rootfs / ufs rw,noatime,async  1   1

(async is fine because the system won't be used for data that
needs to be kept locally, and it's connected to a UPS)
--



Re: noatime on ufs2

2024-01-09 Thread void

On Tue, Jan 09, 2024 at 09:47:59AM +0100, Olivier Certner wrote:

So, to me, at this point, it still sounds more like a gimmick 
than something really useful.  If someone has a precise use case 
for it and motivation, then of course please go ahead.


The only use-cases I [1] can think of are either with an email system 
that needs it or with something like a webserver where there's
a team of devops working on the web service who need elevated access
and a couple of sysadmins who need root for their general job, and audit
or similar is used to log these accesses.

But maybe there are more use-cases for atime? 


but as has been mentioned, most modern mail systems don't need it
and I'm not sure how much something like audit would. Do things
like tripwire/mtree need it? It's an interesting question.

[1] in my limited experience. i've only seen email "needing" it
and that's only in some contexts
--



Re: noatime on ufs2

2024-01-09 Thread robert
On Tue, Jan 9, 2024, at 05:13, void wrote:
> On Tue, Jan 09, 2024 at 09:47:59AM +0100, Olivier Certner wrote:
>
>> So, to me, at this point, it still sounds more like a gimmick 
>> than something really useful.  If someone has a precise use case 
>> for it and motivation, then of course please go ahead.
>
> The only use-cases I [1] can think of are either with an email system 
> that needs it or with something like a webserver where there's
> a team of devops working on the web service who need elevated access
> and a couple of sysadmins who need root for their general job, and audit
> or similar is used to log these accesses.
>
> But maybe there are more use-cases for atime? 
>
> but as has been mentioned, most modern mail systems don't need it
> and I'm not sure how much something like audit would. Do things
> like tripwire/mtree need it? It's an interesting question.

No, they use other data and checksums instead of access times.

>
> [1] in my limited experience. i've only seen email "needing" it
> and that's only in some contexts
> --



Re: noatime on ufs2

2024-01-09 Thread robert
On Tue, Jan 9, 2024, at 04:47, void wrote:
> On Mon, Jan 08, 2024 at 12:41:02PM -0800, Xin LI wrote:
>>On Sun, Jan 7, 2024 at 5:27 AM void  wrote:
>>
>>> Hi,
>>>
>>> Does /var/mail still need atime?
>>>
>>> I've installed a ufs2-based -current main-n267425-aa1223ac3afc on
>>> rpi4/8GB which installs into one / . If it's mounted with noatime,
>>> will it have consequences for /var/mail ?
>>
>>
>>It doesn't matter if you don't normally receive emails locally (nowadays,
>>it's rare).
>
> I read the periodic ones locally
>
>>If you do receive emails locally, it depends on what application(s) that
>>you are using.  
>
> In this type of context it'll be either exim or dma with mutt used for reading
> the emails. It'll be in whatever the default format is (mbox?)

set mbox_type=Maildir forces Mutt to use the Maildir format. I remember one or 
more of Postfix, Procmail, Mutt, and Dovecot assuming the Maildir format for 
any folder or spool file that ended in a /. 

>
>>Most applications nowadays check both mtime and atime plus
>>sizes of the mailbox file and do not rely on atime (because they saved the
>>previous mtime).  Without atime updates, some application may claim that
>>you have new mail when the mailbox is not empty when they first start.
>>
>>That said, if I were you and using flash-based storage (highly likely with
>>an rpi), I would mount with noatime regardless of whether I read mail locally;
>>most of the time the atime data is not really useful for anything, and updating
>>it does increase the wear on your storage.
>
> Good to know. When installing, the opportunity to define partitions
> didn't arise because the image was installed by dd to
> media that will run the system (in this case a hard disk)
>
> I was concerned that email might not work right without atime.
> So far, it seems to be working OK.
>
> My fstab in part looks like this:
>
> /dev/ufs/rootfs / ufs rw,noatime,async  1   1
>
> (async is fine because the system won't be used for data that
> needs to be kept locally, and it's connected to a UPS)
> --



Re: route ipv6 errors on bootup in -current main-n267425-aa1223ac3afc on arm64

2024-01-09 Thread void

On Tue, Jan 09, 2024 at 10:24:53AM +, void wrote:

On Mon, Jan 08, 2024 at 01:07:30PM -0800, Enji Cooper wrote:


Was the kernel/utility built with IPv6? If not, that’s a general bug 
which should be filed (which can be easily checked/avoided using the 
FEATURES(9) subsystem)…

Cheers!
-Enji


world/kernel was built with WITHOUT_INET6= in /etc/src.conf

I made the problem go away with removing WITHOUT_INET6= and rebuilding.


I'll re-add this to try and replicate the problem with the same sources
(main-n267425-aa1223ac3afc) and if it happens again I'll make a PR for it
--



Re: route ipv6 errors on bootup in -current main-n267425-aa1223ac3afc on arm64

2024-01-09 Thread void

On Tue, Jan 09, 2024 at 12:24:40PM +, void wrote:

On Tue, Jan 09, 2024 at 10:24:53AM +, void wrote:

On Mon, Jan 08, 2024 at 01:07:30PM -0800, Enji Cooper wrote:


Was the kernel/utility built with IPv6? If not, that’s a general 
bug which should be filed (which can be easily checked/avoided 
using the FEATURES(9) subsystem)…

Cheers!
-Enji


world/kernel was built with WITHOUT_INET6= in /etc/src.conf

I made the problem go away with removing WITHOUT_INET6= and rebuilding.


I'll re-add this to try and replicate the problem with the same sources
(main-n267425-aa1223ac3afc) and if it happens again I'll make a PR for it


I forgot about this line:

options INET6   # IPv6 communications protocols

which, on current/arm64 lives in std.arm64 which gets included by
GENERIC which is included by GENERIC-MMCCAM which is included by
GENERIC-MMCCAM-NODEBUG

commenting it out and having WITHOUT_INET6= in /etc/src.conf and rebuilding
fixes the problem. Sorry for the noise.
--



Re: noatime on ufs2

2024-01-09 Thread Xin LI
On Tue, Jan 9, 2024 at 2:47 AM void  wrote:

> I was concerned that email might not work right without atime.
> So far, it seems to be working OK.
>

That depends on how you define "correct".  Deliveries won't be affected by
the atime setting in any way; telling whether you have new mail _may_ be
affected, but at worst it would be annoying (the shell / MUA claims you have
new mail when you don't, and even then only at startup; once the shell is
running it won't rely on atime).
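
The startup case described above can be modelled in a few lines. This is a
hypothetical sketch, not the code of any particular shell or MUA: struct
model_mailbox and the three functions are invented for illustration, and the
checker is assumed to have no remembered size or mtime when it first starts,
so it falls back to comparing atime with mtime.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical view of the mailbox file's metadata. */
struct model_mailbox {
	time_t atime;	/* last access time */
	time_t mtime;	/* last modification time */
	long   size;	/* bytes in the mailbox */
};

/* Delivery appends to the file, so mtime and size always change. */
void
model_deliver(struct model_mailbox *mb, time_t now, long bytes)
{
	mb->mtime = now;
	mb->size += bytes;
}

/* Reading only touches atime, and only if the mount allows it. */
void
model_read(struct model_mailbox *mb, time_t now, bool noatime)
{
	if (!noatime)
		mb->atime = now;
}

/*
 * Startup check with no saved state: a non-empty mailbox that was
 * modified after it was last accessed is reported as new mail.
 */
bool
startup_reports_new_mail(const struct model_mailbox *mb)
{
	return (mb->size > 0 && mb->atime < mb->mtime);
}
```

With atime updates enabled, reading after a delivery clears the condition.
With noatime, atime stays older than mtime, so the startup check keeps
reporting new mail for a mailbox that has already been read. Delivery itself
behaves the same in both cases.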

Cheers,


kernel: fatal trap 12 on CURRENT, when using WireGuard

2024-01-09 Thread Rainer Hurling
I tried to update my 15.0-CURRENT box from n267335-499e84e16f56 to a 
very recent commit. The build and install went fine. After booting with 
new base, I got a page fault with the following error:



Kernel page fault with the following non-sleepable locks held:
shared rm netlink lock (netlink lock) r = 0 (0xf8005fc8ca20) locked @ /usr/src/sys/netlink/netlink_domain.c:241
exclusive rw lle (lle) r = 0 (0xf801951dce90) locked @ /usr/src/sys/netinet/in.c:1716

stack backtrace:
#0 0x80bc6c45 at witness_debugger+0x65
#1 0x80bc7d89 at witness_warn+0x3e9
#2 0x81056b18 at trap_pfault+0x88
#3 0x81028708 at calltrap+0x8
#4 0x80dbd6a2 at nl_send_group+0x1d2
#5 0x80dc0e27 at _nlmsg_flush+0x37
#6 0x80dc4fdc at rtnl_lle_event+0x10c
#7 0x80d15e32 at arp_mark_lle_reachable+0xd2
#8 0x80d15b43 at arp_check_update_lle+0x293
#9 0x80d151c5 at arpintr+0xa65
#10 0x80caaaed at netisr_dispatch_src+0xad
#11 0x80c8d57a at ether_demux+0x17a
#12 0x80c8ec53 at ether_nh_input+0x403
#13 0x80caaaed at netisr_dispatch_src+0xad
#14 0x80c8d9c9 at ether_input+0xd9
#15 0x80ca66ac at iflib_rxeof+0xe4c
#16 0x80ca0b5a at _task_fn_rx+0x7a
#17 0x80ba0118 at gtaskqueue_run_locked+0xa8

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x3
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0x80dc0a10
stack pointer           = 0x28:0xfe006a3a8760
frame pointer           = 0x28:0xfe006a3a8790
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (if_io_tqg_0)
rdi: fe006a3a8850 rsi: fe006a3a86f0 rdx: fe006a3a87b0
rcx: f80001f88740  r8: 83210090  r9: 
rax:  rbx: 0003 rbp: fe006a3a8790
r10: 0001 r11:  r12: f8005fc8ca00
r13: f8005fc8ca20 r14: fe006a3a8850 r15: 
trap number = 12
panic: page fault
cpuid = 0
time = 1704824328
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe006a3a8430
vpanic() at vpanic+0x131/frame 0xfe006a3a8560
panic() at panic+0x43/frame 0xfe006a3a85c0
trap_fatal() at trap_fatal+0x40f/frame 0xfe006a3a8620
trap_pfault() at trap_pfault+0xae/frame 0xfe006a3a8690
calltrap() at calltrap+0x8/frame 0xfe006a3a8690
--- trap 0xc, rip = 0x80dc0a10, rsp = 0xfe006a3a8760, rbp = 0xfe006a3a8790 ---
nl_send_one() at nl_send_one+0x20/frame 0xfe006a3a8790
nl_send_group() at nl_send_group+0x1d2/frame 0xfe006a3a8820
_nlmsg_flush() at _nlmsg_flush+0x37/frame 0xfe006a3a8840
rtnl_lle_event() at rtnl_lle_event+0x10c/frame 0xfe006a3a88e0
arp_mark_lle_reachable() at arp_mark_lle_reachable+0xd2/frame 0xfe006a3a8930
arp_check_update_lle() at arp_check_update_lle+0x293/frame 0xfe006a3a8a00
arpintr() at arpintr+0xa65/frame 0xfe006a3a8b60
netisr_dispatch_src() at netisr_dispatch_src+0xad/frame 0xfe006a3a8bc8
ether_demux() at ether_demux+0x17a/frame 0xfe006a4a8bf0
ether_nh_input() at ether_nh_input+0x403/frame 0xfe006a3a8c40
netisr_dispatch_src() at netisr_dispatch_src+0xad/frame 0xfe006a3a8ca0
ether_input() at ether_input+0xd9/frame 0xfe006a3a8d00
iflib_rxeof() at iflib_rxeof+0xe4c/frame 0xfe006a3a8e00
_task_fn_rx() at _task_fn_rx+0x7a/frame 0xfe006a3a8e40
gtaskqueue_run_locked() at gtaskqueue_run_locked+0xa8/frame 0xfe006a3a8ec0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0xd3/frame 0xfe006a3a8ef0
fork_exit() at fork_exit+0x82/frame 0xfe006a3a8f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe006a3a8f30
--- trap 0xf2b9f109, rip = 0x7afef8a176bef8a5, rsp = 0xddc963edd18963e9, rbp = 0x61f64fc36db64fc7

KDB: enter: panic
[ thread pid 0 tid 100067 ]
Stopped at      kdb_enter+0x33: movq    $0,0xe3a582(%rip)
db>


Since the current process 'if_io_tqg_0' and problems with netlink are 
mentioned, I searched in the area of my network connections. I 
discovered that this page fault only occurs when a connection is 
established with WireGuard (wg-quick up wg0). Without using WireGuard, 
this error does not occur.


I was able to find out at which commit this behavior occurs with my box:
- Up to commit main-n267347-660bd40a598a everything is fine.
- The two following commits n267348-67d9023f07a4 and 
n267349-0ad011ececb9 do not build on my box (module/netlink broken ...).
- From commit n267349-0ad011ececb9 (netlink) onwards this page fault 
occurs when WireGuard is started.


Any help is greatly appreciated.
CC'ed Gleb Smirnoff due to the affected commits.

Regards,
Rainer Hurling



Re: kernel: fatal trap 12 on CURRENT, when using WireGuard

2024-01-09 Thread Gleb Smirnoff
  Rainer,

On Tue, Jan 09, 2024 at 09:23:54PM +0100, Rainer Hurling wrote:
R> I tried to update my 15.0-CURRENT box from n267335-499e84e16f56 to a very
R> recent commit. The build and install went fine. After booting with new
R> base, I got a page fault with the following error:

Sorry for that, my fault. Can you please test this patch?

-- 
Gleb Smirnoff
diff --git a/sys/netlink/netlink_domain.c b/sys/netlink/netlink_domain.c
index 7660dcada103..4790845d1d31 100644
--- a/sys/netlink/netlink_domain.c
+++ b/sys/netlink/netlink_domain.c
@@ -233,7 +233,7 @@ nl_send_group(struct nl_writer *nw)
 copy = nl_buf_copy(nb);
 if (copy != NULL) {
 	nw->buf = copy;
-	(void)nl_send_one(nw);
+	(void)nl_send(nw, nlp_last);
 } else {
 	NLP_LOCK(nlp_last);
 	if (nlp_last->nl_socket != NULL)
@@ -246,7 +246,7 @@ nl_send_group(struct nl_writer *nw)
 	}
 	if (nlp_last != NULL) {
 		nw->buf = nb;
-		(void)nl_send_one(nw);
+		(void)nl_send(nw, nlp_last);
 	} else
 		nl_buf_free(nb);
 
diff --git a/sys/netlink/netlink_io.c b/sys/netlink/netlink_io.c
index fb8e0a46e8dd..5f50c40f71d8 100644
--- a/sys/netlink/netlink_io.c
+++ b/sys/netlink/netlink_io.c
@@ -194,9 +194,8 @@ nl_taskqueue_handler(void *_arg, int pending)
  * If no queue overrunes happened, wakes up socket owner.
  */
 bool
-nl_send_one(struct nl_writer *nw)
+nl_send(struct nl_writer *nw, struct nlpcb *nlp)
 {
-	struct nlpcb *nlp = nw->nlp;
 	struct socket *so = nlp->nl_socket;
 	struct sockbuf *sb = &so->so_rcv;
 	struct nl_buf *nb;
diff --git a/sys/netlink/netlink_message_writer.c b/sys/netlink/netlink_message_writer.c
index 0b85378b41b6..50305e3d9d80 100644
--- a/sys/netlink/netlink_message_writer.c
+++ b/sys/netlink/netlink_message_writer.c
@@ -65,6 +65,13 @@ nlmsg_get_buf(struct nl_writer *nw, u_int len, bool waitok)
 	return (true);
 }
 
+static bool
+nl_send_one(struct nl_writer *nw)
+{
+
+	return (nl_send(nw, nw->nlp));
+}
+
 bool
 _nlmsg_get_unicast_writer(struct nl_writer *nw, int size, struct nlpcb *nlp)
 {
diff --git a/sys/netlink/netlink_var.h b/sys/netlink/netlink_var.h
index c8f0d02a0dab..ddf30b373446 100644
--- a/sys/netlink/netlink_var.h
+++ b/sys/netlink/netlink_var.h
@@ -130,9 +130,7 @@ void nl_osd_unregister(void);
 void nl_set_thread_nlp(struct thread *td, struct nlpcb *nlp);
 
 /* netlink_io.c */
-#define	NL_IOF_UNTRANSLATED	0x01
-#define	NL_IOF_IGNORE_LIMIT	0x02
-bool nl_send_one(struct nl_writer *);
+bool nl_send(struct nl_writer *, struct nlpcb *);
 void nlmsg_ack(struct nlpcb *nlp, int error, struct nlmsghdr *nlmsg,
 struct nl_pstate *npt);
 void nl_on_transmit(struct nlpcb *nlp);
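
Reading the diff, the group path now names its destination explicitly
(nlp_last) instead of letting the send routine take it from nw->nlp. The
userland model below sketches that shape only. All model_* names and struct
layouts are hypothetical, the assumption that a group writer carries no
usable nw->nlp is an inference from the patch, and the NULL check exists
only so the model can show the failure without crashing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the layouts from netlink_var.h. */
struct model_socket {
	int delivered;			/* payload units received */
};

struct model_nlpcb {
	struct model_socket *nl_socket;
};

struct model_writer {
	struct model_nlpcb *nlp;	/* assumed valid for unicast writers only */
	int payload;
};

/* Patched shape: the destination is an explicit argument. */
bool
model_nl_send(struct model_writer *nw, struct model_nlpcb *nlp)
{
	/* Model-only guard; the kernel routine in the diff has no such check. */
	if (nlp == NULL || nlp->nl_socket == NULL)
		return (false);
	nlp->nl_socket->delivered += nw->payload;
	return (true);
}

/* Unicast wrapper, mirroring the new static nl_send_one() in the diff. */
bool
model_nl_send_one(struct model_writer *nw)
{
	return (model_nl_send(nw, nw->nlp));
}
```

In the model, a group writer (nlp left NULL) fails through
model_nl_send_one() but delivers when a group member is passed to
model_nl_send() directly, which corresponds to the two call sites changed in
nl_send_group().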


Re: noatime on ufs2

2024-01-09 Thread Warner Losh
On Tue, Jan 9, 2024, 11:11 AM Steffen Nurpmeso  wrote:

> rob...@rrbrussell.com wrote in
>  <5f370bce-bcdb-47ea-aaa7-551ee092a...@app.fastmail.com>:
>  |On Tue, Jan 9, 2024, at 05:13, void wrote:
>  |> On Tue, Jan 09, 2024 at 09:47:59AM +0100, Olivier Certner wrote:
>  |>> So, to me, at this point, it still sounds more like a gimmick
>  |>> than something really useful.  If someone has a precise use case
>
> Email existence checks are in UNIX for many decades.
> In fact since 1974-11-26 when Ken Thompson added that to login(1).
> "You have new mail" is in BSD since
>
>   Commit: Bill Joy 
>   CommitDate: 1978-11-05 19:59:54 -0800
>

It has also been used for almost as long to see if log files have changed
if you set your MAIL variable to that. So not just for email...

Warner

> Start development on BSD 3
> Create reference copy of all prior development files
>
> in BSD Mail and csh(1).
> And today in bash(1), for example, there can be read
>
> /* If the user has just run a program which manipulates the
>    mail file, then don't bother explaining that the mail
>    file has been manipulated.  Since some systems don't change
>    the access time to be equal to the modification time when
>    the mail in the file is manipulated, check the size also.  If
>    the file has not grown, continue. */
> if ((atime >= mtime) && !file_is_bigger)
>   continue;
>
> /* If the mod time is later than the access time and the file
>    has grown, note the fact that this is *new* mail. */
> if (use_user_notification == 0 && (atime < mtime) && file_is_bigger)
>   message = _("You have new mail in $_");
>
> I would not exactly call this a gimmick.
> On Linux mount(8) from https://github.com/karelzak/util-linux says
>
>relatime
>Update inode access times relative to modify or change time. Access
>time is only updated if the previous access time was earlier than
>or equal to the current modify or change time. (Similar to noatime,
>but it doesn’t break mutt(1) or other applications that need to
>know if a file has been read since the last time it was modified.)
>
> and this is what i use, except for some noatime mount points
> (/x/doc, /x/music, /x/pub, to be exact).
>
> --steffen
> |
> |Der Kragenbaer,The moon bear,
> |der holt sich munter   he cheerfully and one by one
> |einen nach dem anderen runter  wa.ks himself off
> |(By Robert Gernhardt)
>
>
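
The relatime rule quoted above from mount(8) reduces to a single comparison.
The sketch below implements only the sentence as quoted; the function name
and parameters are hypothetical, and any further conditions a real kernel
applies are out of scope.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Quoted rule: update the access time only if the previous access time
 * was earlier than or equal to the current modify or change time.
 */
bool
relatime_should_update(time_t atime, time_t mod_time, time_t chg_time)
{
	return (atime <= mod_time || atime <= chg_time);
}
```

A mailbox modified at time 20 and last accessed at time 10 gets its atime
updated on the next read, so a reader can still tell that new data has been
seen. A file last accessed at time 30 and unmodified since time 20 does not,
which avoids a metadata write for every read.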


Re: e179d973 insta-panics in nl_send_one()

2024-01-09 Thread Gleb Smirnoff
On Mon, Jan 08, 2024 at 10:40:52AM +0100, Jakob Alvermark wrote:
J> > >  --- trap 0xc, rip = 0x...f80d97b78, rsp = 0x...
J> > >  nl_send_one() at nl_send_one+0x18/frame 0xf
J> > >  nl_send_group() at nl_send_group+0x1bc/frame 0xf...
J> > >  _nlmsg_flush() at _nlmsg_flush+0x37/frame 0xf...
J> > >  rtnl_handle_ifevent() + 0xa1
J> > >  if_attach_internal + 0x3df
J> > > 
J> > > I have a picture of the full panic if desired...
J> > > 
J> > > -- 
J> > > Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
J> > > p...@freebsd.org | TCP/IP since RFC 956
J> > > FreeBSD committer   | BSD since 4.3-tahoe
J> > > Never attribute to malice what can adequately be explained by incompetence.
J> 
J> I get the same panic, with kernel and userland both installed.

Sorry, that was my failure. Fix pushed and now working on
a regression test that would cover Netlink group writers.

-- 
Gleb Smirnoff



Re: kernel: fatal trap 12 on CURRENT, when using WireGuard

2024-01-09 Thread Rainer Hurling

Am 09.01.24 um 21:40 schrieb Gleb Smirnoff:

   Rainer,

On Tue, Jan 09, 2024 at 09:23:54PM +0100, Rainer Hurling wrote:
R> I tried to update my 15.0-CURRENT box from n267335-499e84e16f56 to a very
R> recent commit. The build and install went fine. After booting with new
R> base, I got a page fault with the following error:

Sorry for that, my fault. Can you please test this patch?



Hi Gleb,

Thanks for the very fast response.

I tried your patch and it seems to work as expected. I have a running 
system, with WireGuard on, at commit main-n267469-0013741108bc-dirty.


Many thanks again and best wishes,
Rainer