Re: weekly locate error Was: September 2024 stabilization week
On 10/1/24 11:29, Rodney W. Grimes wrote:
>> On 9/30/24 19:36, Jamie Landeg-Jones wrote:
>>> Kyle Evans wrote:
>>>
>>>> It might be that the better long-term approach is to teach updatedb.sh
>>>> how to drop privileges and push that out of the periodic script to avoid
>>>> surprises like this from the different execution environments. This
>>>> /feels/ like the kind of thing we could take an opinionated stance on,
>>>> maybe providing an escape hatch of some sort if someone really wants to
>>>> complain that they can't document all filenames on the system.
>>>
>>> This is how it already works. It calls locate.updatedb as "nobody", so
>>> only files readable by "nobody" are indexed:
>>>
>>>   echo /usr/libexec/locate.updatedb | nice -n 5 su -fm nobody || rc=3
>>
>> Yes, my proposal is that it stops doing that and we teach updatedb to
>> handle the priv-dropping instead, so that you get the same behavior no
>> matter how you execute it.
>
> If you do this please make it possible to run it WITHOUT dropping
> privilege; some of us actually run locate.updatedb with full access to
> file systems to produce more complete locate databases where this
> information is not considered private.
>
>> Thanks,
>> Kyle Evans

This is the problem I have with mailing lists; 2/3 responses didn't go back and read the critical bit of context to my stance (but at least you still included it in your quote, the other one trimmed it entirely):

> [...] surprises like this from the different execution environments.
> This /feels/ like the kind of thing we could take an opinionated
> stance on, maybe providing an escape hatch of some sort if someone
> really wants to complain that they can't document all filenames on
> the system.

I don't disagree that there are probably valid cases; this is a proposal of a possible change, not a change itself. Admittedly I didn't see it as likely as it apparently is, but it's not like I completely ignored the possibility.

Thanks,

Kyle Evans
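For readers following the proposal, here is a rough sketch of what "updatedb drops privileges itself, with an escape hatch" could look like. It is not taken from any FreeBSD source: the function name choose_runner and the LOCATE_KEEP_PRIVS variable are invented for illustration, and only the su invocation is borrowed from the periodic script quoted in this thread. The function only prints the command it would run, so it can be tried without root.

```shell
#!/bin/sh
# Hypothetical sketch only: choose_runner and LOCATE_KEEP_PRIVS are
# made-up names; nothing here exists in FreeBSD's locate.updatedb.

# Print the command that should run the indexer.
#   $1  uid of the caller
#   $2  "yes" if the admin asked to keep privileges (the escape hatch)
#   $3  path of the indexer
choose_runner() {
    uid=$1
    keep=$2
    indexer=$3
    if [ "$uid" -eq 0 ] && [ "$keep" != "yes" ]; then
        # Started as root without the opt-out: hand the walk to nobody,
        # using the same su call the periodic script uses today.
        echo "echo $indexer | su -fm nobody"
    else
        # Already unprivileged, or the admin explicitly opted out.
        echo "$indexer"
    fi
}

# Dry run: show what would be executed for the current user.
choose_runner "$(id -u)" "${LOCATE_KEEP_PRIVS:-no}" /usr/libexec/locate.updatedb
```

With something along these lines the periodic job and a direct invocation would take the same path, and people who want a privileged index (as several replies describe) would set the opt-out explicitly.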
Re: weekly locate error Was: September 2024 stabilization week
> On 9/30/24 19:36, Jamie Landeg-Jones wrote:
> > Kyle Evans wrote:
> >
> >> It might be that the better long-term approach is to teach updatedb.sh
> >> how to drop privileges and push that out of the periodic script to avoid
> >> surprises like this from the different execution environments. This
> >> /feels/ like the kind of thing we could take an opinionated stance on,
> >> maybe providing an escape hatch of some sort if someone really wants to
> >> complain that they can't document all filenames on the system.
> >
> > This is how it already works. It calls locate.updatedb as "nobody", so
> > only files readable by "nobody" are indexed:
> >
> >   echo /usr/libexec/locate.updatedb | nice -n 5 su -fm nobody || rc=3
>
> Yes, my proposal is that it stops doing that and we teach updatedb to
> handle the priv-dropping instead, so that you get the same behavior no
> matter how you execute it.

If you do this please make it possible to run it WITHOUT dropping privilege; some of us actually run locate.updatedb with full access to file systems to produce more complete locate databases where this information is not considered private.

> Thanks,
> Kyle Evans

-- 
Rod Grimes rgri...@freebsd.org
Re: weekly locate error Was: September 2024 stabilization week
> Yes, my proposal is that it stops doing that and we teach updatedb to
> handle the priv-dropping instead, so that you get the same behavior no
> matter how you execute it.

Please don't, or at least don't without an option to avoid that. Having a files DB for some projects is really handy, and I don't want to have to give permissions to 'nobody' for these.

Thanks and regards.

-- 
Olivier Certner
Re: panic: curthread not pinned
Konstantin Belousov wrote:
> On Tue, Oct 01, 2024 at 11:53:33AM +0300, Konstantin Belousov wrote:
>> On Tue, Oct 01, 2024 at 01:16:31PM +0700, dsdq...@hotmail.com wrote:
>>> Hyper-V Gen2 VM, 8 cores, 8GB RAM. The panic is reproducible while
>>> running `make -j8 buildworld`. Initially installed from
>>> FreeBSD-15.0-CURRENT-amd64-20240926-6a4f0c063718-272495-disc1.iso,
>>> updating the kernel (that I could build) to b35f0aa4952 does not help.
>>>
>>> panic: curthread not pinned
>>> cpuid = 3
>>> time = 1727763183
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe008e34a1f0
>>> vpanic() at vpanic+0x13f/frame 0xfe008e34a320
>>> panic() at panic+0x43/frame 0xfe008e34a380
>>> smp_targeted_tlb_shootdown_native() at smp_targeted_tlb_shootdown_native+0x472/frame 0xfe008e34a4c0
>>> pmap_remove_all() at pmap_remove_all+0x560/frame 0xfe008e34a540
>>
>> Can you obtain the core dump and then backtrace from kgdb?
>
> I think I found a place where this occurs. Please try the patch below.
>
> commit 6dcffb980fa3026092f79107ee7668918c9f5490
> Author: Konstantin Belousov
> Date:   Tue Oct 1 14:45:23 2024 +0300
>
>     hyperv: call smp_targeted_tlb_shootdown_native() with pin
>
>     Sponsored by: The FreeBSD Foundation
>     MFC after: 1 week
>
> diff --git a/sys/dev/hyperv/vmbus/hyperv_mmu.c b/sys/dev/hyperv/vmbus/hyperv_mmu.c
> index 7c29fe294093..8e982974161c 100644
> --- a/sys/dev/hyperv/vmbus/hyperv_mmu.c
> +++ b/sys/dev/hyperv/vmbus/hyperv_mmu.c
> @@ -241,7 +241,6 @@ hv_vm_tlb_flush(pmap_t pmap, vm_offset_t addr1, vm_offset_t addr2,
>  	critical_exit();
>  	return;
>  native:
> -	sched_unpin();
>  	critical_exit();
>  	return smp_targeted_tlb_shootdown_native(pmap, addr1,
>  	    addr2, curcpu_cb, op);

It seems to have helped; at least I was able to finish the buildworld successfully with the patched kernel, thanks.
Re: panic: curthread not pinned
On Tue, Oct 01, 2024 at 11:53:33AM +0300, Konstantin Belousov wrote:
> On Tue, Oct 01, 2024 at 01:16:31PM +0700, dsdq...@hotmail.com wrote:
> > Hyper-V Gen2 VM, 8 cores, 8GB RAM. The panic is reproducible while
> > running `make -j8 buildworld`. Initially installed from
> > FreeBSD-15.0-CURRENT-amd64-20240926-6a4f0c063718-272495-disc1.iso,
> > updating the kernel (that I could build) to b35f0aa4952 does not help.
> >
> > panic: curthread not pinned
> > cpuid = 3
> > time = 1727763183
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe008e34a1f0
> > vpanic() at vpanic+0x13f/frame 0xfe008e34a320
> > panic() at panic+0x43/frame 0xfe008e34a380
> > smp_targeted_tlb_shootdown_native() at smp_targeted_tlb_shootdown_native+0x472/frame 0xfe008e34a4c0
> > pmap_remove_all() at pmap_remove_all+0x560/frame 0xfe008e34a540
>
> Can you obtain the core dump and then backtrace from kgdb?

I think I found a place where this occurs. Please try the patch below.

commit 6dcffb980fa3026092f79107ee7668918c9f5490
Author: Konstantin Belousov
Date:   Tue Oct 1 14:45:23 2024 +0300

    hyperv: call smp_targeted_tlb_shootdown_native() with pin

    Sponsored by: The FreeBSD Foundation
    MFC after: 1 week

diff --git a/sys/dev/hyperv/vmbus/hyperv_mmu.c b/sys/dev/hyperv/vmbus/hyperv_mmu.c
index 7c29fe294093..8e982974161c 100644
--- a/sys/dev/hyperv/vmbus/hyperv_mmu.c
+++ b/sys/dev/hyperv/vmbus/hyperv_mmu.c
@@ -241,7 +241,6 @@ hv_vm_tlb_flush(pmap_t pmap, vm_offset_t addr1, vm_offset_t addr2,
 	critical_exit();
 	return;
 native:
-	sched_unpin();
 	critical_exit();
 	return smp_targeted_tlb_shootdown_native(pmap, addr1,
 	    addr2, curcpu_cb, op);
Re: weekly locate error Was: September 2024 stabilization week
On Mon, 30 Sep 2024, at 22:32, Kyle Evans wrote:
> -	install $tmp $FCODES
> +	cat $tmp > $FCODES

thank you!
Re: panic: curthread not pinned
On Tue, Oct 01, 2024 at 01:16:31PM +0700, dsdq...@hotmail.com wrote:
> Hyper-V Gen2 VM, 8 cores, 8GB RAM. The panic is reproducible while
> running `make -j8 buildworld`. Initially installed from
> FreeBSD-15.0-CURRENT-amd64-20240926-6a4f0c063718-272495-disc1.iso,
> updating the kernel (that I could build) to b35f0aa4952 does not help.
>
> panic: curthread not pinned
> cpuid = 3
> time = 1727763183
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe008e34a1f0
> vpanic() at vpanic+0x13f/frame 0xfe008e34a320
> panic() at panic+0x43/frame 0xfe008e34a380
> smp_targeted_tlb_shootdown_native() at smp_targeted_tlb_shootdown_native+0x472/frame 0xfe008e34a4c0
> pmap_remove_all() at pmap_remove_all+0x560/frame 0xfe008e34a540

Can you obtain the core dump and then backtrace from kgdb?
Re: weekly locate error Was: September 2024 stabilization week
Kyle Evans wrote:

> Yes, my proposal is that it stops doing that and we teach updatedb to
> handle the priv-dropping instead, so that you get the same behavior no
> matter how you execute it.

Ahhh OK, I get you now. Sorry, I misunderstood: I thought you meant the current "periodic" method runs the filesystem walk as root, but when you said "if someone really wants to complain that they can't document all filenames on the system", I guess you were referring to those who may call /usr/libexec/locate.updatedb directly as root.

For what it's worth, in addition to the periodic job, I do actually run a less frequent privileged direct run of /usr/libexec/locate.updatedb (with the output in a suitably locked directory!). This proposed change wouldn't be an issue for me, but as a data point, there may be quite a few others who do so too.

Cheers, Jamie
Re: September 2024 stabilization week
On Sun, Sep 29, 2024 at 09:42:17AM -0700, Simon J. Gerraty wrote:
S> Michael Butler wrote:
S> > > I have found that *only* on arm64, locate errors like so:
S> > >
S> > > # sh /etc/periodic/weekly/310.locate
S>
S> This runs /usr/libexec/locate.updatedb as nobody
S> and ensures that /var/db/locate.database exists and is owned by nobody,
S> but /var/db itself is root:wheel and 755 so the error from install does
S> not seem surprising.
S>
S> Though that begs the question of how this ever works ;-)

The way it always worked is that /var/db/locate.database always exists and is owned by nobody. This is done by the periodic job before doing su:

    locdb="$FCODES"
    touch "$locdb" && rc=0 || rc=3
    chown nobody "$locdb" || rc=3
    chmod 644 "$locdb" || rc=3

After that it runs su:

    echo /usr/libexec/locate.updatedb | nice -n 5 su -fm nobody || rc=3

Before f62c1f3f8e91 the file was installed with cat(1):

    cat $tmp > $FCODES # should be cp?

After f62c1f3f8e91, install(1) is used. The latter is designed to use a temporary file to avoid race conditions. But we can't create a temporary file in /var/db when we are nobody.

I'm going to change this line back to cat(1) in a week unless Dag-Erling responds.

-- 
Gleb Smirnoff
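The difference between the two approaches can be tried out in a scratch directory. The sketch below is only an illustration under assumptions: it uses throwaway temp files rather than /var/db, and it models the temporary-file approach with a plain create-then-mv, which is the general shape of such a replace and not a copy of what install(1) does internally. It compares inode numbers to show that cat rewrites the existing file in place (needing write permission on the file only), while the rename path puts a new file into the directory (needing write permission on the directory, which nobody lacks on a root:wheel 755 /var/db).

```shell
#!/bin/sh
# Illustration only: scratch files in a temp directory, not /var/db.
set -e
dir=$(mktemp -d)
db="$dir/locate.database"
tmp="$dir/new.contents"

echo old > "$db"
echo new > "$tmp"

# Print the inode number of a file.
inode_of() { ls -i "$1" | awk '{print $1}'; }

before=$(inode_of "$db")

# cat(1) style: the shell truncates and rewrites the existing file.
# Only the file itself has to be writable; the inode stays the same.
cat "$tmp" > "$db"
after_cat=$(inode_of "$db")

# Temporary-file style (a simplified model): create a new file next to
# the target, then rename it over the target. Creating and renaming
# both need write permission on the directory; the inode changes.
echo newer > "$dir/locate.database.tmp"
mv "$dir/locate.database.tmp" "$db"
after_rename=$(inode_of "$db")

echo "cat kept the inode:    $([ "$before" = "$after_cat" ] && echo yes || echo no)"
echo "rename kept the inode: $([ "$after_cat" = "$after_rename" ] && echo yes || echo no)"

rm -rf "$dir"
```

The in-place rewrite is not atomic, which is presumably the race the temporary file was meant to avoid; the trade-off is that it works with the ownership the periodic job sets up.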
Re: Wake (resume) regression?
On 01/10/2024 03:21, Graham Perrin wrote:
> Has anything changed in the past two months that might cause wake to become unreliable?
>
> …
>
> IIRC the unreliability became noticeable not long after the bump to 1500023. That was around the end of August (with pkgbase). …

Correction: that was around the end of July (not August).