Re: weekly locate error Was: September 2024 stabilization week

2024-10-01 Thread Kyle Evans

On 10/1/24 11:29, Rodney W. Grimes wrote:
>> On 9/30/24 19:36, Jamie Landeg-Jones wrote:
>>> Kyle Evans wrote:
>>>
>>>> It might be that the better long-term approach is to teach updatedb.sh
>>>> how to drop privileges and push that out of the periodic script to avoid
>>>> surprises like this from the different execution environments.  This
>>>> /feels/ like the kind of thing we could take an opinionated stance on,
>>>> maybe providing an escape hatch of some sort if someone really wants to
>>>> complain that they can't document all filenames on the system.
>>>
>>> This is how it already works. It calls locate.updatedb as "nobody", so
>>> only files readable by "nobody" are indexed:
>>>
>>>   echo /usr/libexec/locate.updatedb | nice -n 5 su -fm nobody || rc=3
>>
>> Yes, my proposal is that it stops doing that and we teach updatedb to
>> handle the priv-dropping instead, so that you get the same behavior no
>> matter how you execute it.
>
> If you do this please make it possible to run it WITHOUT dropping
> privilege, some of us actually run locate.updatedb with full access
> to file systems to produce more complete locate databases where
> this information is not considered private.
>
>> Thanks,
>> Kyle Evans


This is the problem I have with mailing lists; 2/3 responses didn't go 
back and read the critical bit of context to my stance (but at least you 
still included it in your quote, the other one trimmed it entirely):


> [...] surprises like this from the different execution environments.
> This /feels/ like the kind of thing we could take an opinionated
> stance on, maybe providing an escape hatch of some sort if someone
> really wants to complain that they can't document all filenames on
> the system.

I don't disagree that there are probably valid cases; this is a proposal
of a possible change, not a change itself.  Admittedly I didn't see it
as being as likely as it apparently is, but it's not like I completely
ignored the possibility.


Thanks,

Kyle Evans
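
For illustration only, the proposal plus the requested escape hatch could be sketched roughly as below. Nothing here is taken from updatedb.sh: the function name decide_privs and the LOCATE_KEEP_PRIVS knob are hypothetical names invented for this sketch, and the actual privilege drop is left out on purpose.

```shell
#!/bin/sh
# Hypothetical sketch, not code from updatedb.sh.
# decide_privs UID KEEP prints what the script would do:
#   run-as-is       caller is already unprivileged, nothing to drop
#   run-as-root     caller is root and explicitly opted out of dropping
#   drop-to-nobody  caller is root and took the default
decide_privs() {
	uid=$1
	keep=$2
	if [ "$uid" -ne 0 ]; then
		echo run-as-is
	elif [ "$keep" = yes ]; then
		echo run-as-root
	else
		echo drop-to-nobody
	fi
}

# LOCATE_KEEP_PRIVS is an assumed name for the escape hatch.
action=$(decide_privs "$(id -u)" "${LOCATE_KEEP_PRIVS:-no}")
echo "locate.updatedb would: $action"
```

With a decision like this inside the script itself, the behavior would be the same whether it is started from periodic or by hand, and a full-access run would need an explicit opt-in.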



Re: weekly locate error Was: September 2024 stabilization week

2024-10-01 Thread Rodney W. Grimes
> On 9/30/24 19:36, Jamie Landeg-Jones wrote:
> > Kyle Evans  wrote:
> > 
> >> It might be that the better long-term approach is to teach updatedb.sh
> >> how to drop privileges and push that out of the periodic script to avoid
> >> surprises like this from the different execution environments.  This
> >> /feels/ like the kind of thing we could take an opinionated stance on,
> >> maybe providing an escape hatch of some sort if someone really wants to
> >> complain that they can't document all filenames on the system.
> > 
> > This is how it already works. It calls locate.updatedb as "nobody", so
> > only files readable by "nobody" are indexed:
> > 
> >  echo /usr/libexec/locate.updatedb | nice -n 5 su -fm nobody || rc=3
> 
> Yes, my proposal is that it stops doing that and we teach updatedb to 
> handle the priv-dropping instead, so that you get the same behavior no 
> matter how you execute it.

If you do this please make it possible to run it WITHOUT dropping
privilege, some of us actually run locate.updatedb with full access
to file systems to produce more complete locate databases where
this information is not considered private.

> Thanks,
> Kyle Evans
-- 
Rod Grimes rgri...@freebsd.org



Re: weekly locate error Was: September 2024 stabilization week

2024-10-01 Thread Olivier Certner
> Yes, my proposal is that it stops doing that and we teach updatedb to 
> handle the priv-dropping instead, so that you get the same behavior no 
> matter how you execute it.

Please don't, or at least don't without an option to avoid that.  Having a 
files DB for some projects is really handy, and I don't want to have to give 
permissions to 'nobody' for these.

Thanks and regards.

-- 
Olivier Certner



Re: panic: curthread not pinned

2024-10-01 Thread dsdqmzk
Konstantin Belousov wrote:
> On Tue, Oct 01, 2024 at 11:53:33AM +0300, Konstantin Belousov wrote:
>> On Tue, Oct 01, 2024 at 01:16:31PM +0700, dsdq...@hotmail.com wrote:
>>> Hyper-V Gen2 VM, 8 cores, 8GB RAM.  The panic is reproducible while
>>> running `make -j8 buildworld`.  Initially installed from
>>> FreeBSD-15.0-CURRENT-amd64-20240926-6a4f0c063718-272495-disc1.iso,
>>> updating the kernel (that I could build) to b35f0aa4952 does not help.
>>>
>>> panic: curthread not pinned
>>> cpuid = 3
>>> time = 1727763183
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>>> 0xfe008e34a1f0
>>> vpanic() at vpanic+0x13f/frame 0xfe008e34a320
>>> panic() at panic+0x43/frame 0xfe008e34a380
>>> smp_targeted_tlb_shootdown_native() at
>>> smp_targeted_tlb_shootdown_native+0x472/frame 0xfe008e34a4c0
>>> pmap_remove_all() at pmap_remove_all+0x560/frame 0xfe008e34a540
>>
>> Can you obtain the core dump and then backtrace from kgdb?
> 
> I think I found a place where this occurs.  Please try the patch below.
> 
> commit 6dcffb980fa3026092f79107ee7668918c9f5490
> Author: Konstantin Belousov 
> Date:   Tue Oct 1 14:45:23 2024 +0300
> 
> hyperv: call smp_targeted_tlb_shootdown_native() with pin
> 
> Sponsored by:   The FreeBSD Foundation
> MFC after:  1 week
> 
> diff --git a/sys/dev/hyperv/vmbus/hyperv_mmu.c 
> b/sys/dev/hyperv/vmbus/hyperv_mmu.c
> index 7c29fe294093..8e982974161c 100644
> --- a/sys/dev/hyperv/vmbus/hyperv_mmu.c
> +++ b/sys/dev/hyperv/vmbus/hyperv_mmu.c
> @@ -241,7 +241,6 @@ hv_vm_tlb_flush(pmap_t pmap, vm_offset_t addr1, 
> vm_offset_t addr2,
>   critical_exit();
>   return;
>  native:
> - sched_unpin();
>   critical_exit();
>   return smp_targeted_tlb_shootdown_native(pmap, addr1,
>   addr2, curcpu_cb, op);

It seems to have helped, at least I was able to finish the buildworld
successfully with patched kernel, thanks.



Re: panic: curthread not pinned

2024-10-01 Thread Konstantin Belousov
On Tue, Oct 01, 2024 at 11:53:33AM +0300, Konstantin Belousov wrote:
> On Tue, Oct 01, 2024 at 01:16:31PM +0700, dsdq...@hotmail.com wrote:
> > Hyper-V Gen2 VM, 8 cores, 8GB RAM.  The panic is reproducible while
> > running `make -j8 buildworld`.  Initially installed from
> > FreeBSD-15.0-CURRENT-amd64-20240926-6a4f0c063718-272495-disc1.iso,
> > updating the kernel (that I could build) to b35f0aa4952 does not help.
> > 
> > panic: curthread not pinned
> > cpuid = 3
> > time = 1727763183
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> > 0xfe008e34a1f0
> > vpanic() at vpanic+0x13f/frame 0xfe008e34a320
> > panic() at panic+0x43/frame 0xfe008e34a380
> > smp_targeted_tlb_shootdown_native() at
> > smp_targeted_tlb_shootdown_native+0x472/frame 0xfe008e34a4c0
> > pmap_remove_all() at pmap_remove_all+0x560/frame 0xfe008e34a540
> 
> Can you obtain the core dump and then backtrace from kgdb?

I think I found a place where this occurs.  Please try the patch below.

commit 6dcffb980fa3026092f79107ee7668918c9f5490
Author: Konstantin Belousov 
Date:   Tue Oct 1 14:45:23 2024 +0300

hyperv: call smp_targeted_tlb_shootdown_native() with pin

Sponsored by:   The FreeBSD Foundation
MFC after:  1 week

diff --git a/sys/dev/hyperv/vmbus/hyperv_mmu.c b/sys/dev/hyperv/vmbus/hyperv_mmu.c
index 7c29fe294093..8e982974161c 100644
--- a/sys/dev/hyperv/vmbus/hyperv_mmu.c
+++ b/sys/dev/hyperv/vmbus/hyperv_mmu.c
@@ -241,7 +241,6 @@ hv_vm_tlb_flush(pmap_t pmap, vm_offset_t addr1, vm_offset_t addr2,
 	critical_exit();
 	return;
 native:
-	sched_unpin();
 	critical_exit();
 	return smp_targeted_tlb_shootdown_native(pmap, addr1,
 	addr2, curcpu_cb, op);



Re: weekly locate error Was: September 2024 stabilization week

2024-10-01 Thread void
On Mon, 30 Sep 2024, at 22:32, Kyle Evans wrote:

> -   install $tmp $FCODES
> +   cat $tmp > $FCODES

thank you!
-- 



Re: panic: curthread not pinned

2024-10-01 Thread Konstantin Belousov
On Tue, Oct 01, 2024 at 01:16:31PM +0700, dsdq...@hotmail.com wrote:
> Hyper-V Gen2 VM, 8 cores, 8GB RAM.  The panic is reproducible while
> running `make -j8 buildworld`.  Initially installed from
> FreeBSD-15.0-CURRENT-amd64-20240926-6a4f0c063718-272495-disc1.iso,
> updating the kernel (that I could build) to b35f0aa4952 does not help.
> 
> panic: curthread not pinned
> cpuid = 3
> time = 1727763183
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfe008e34a1f0
> vpanic() at vpanic+0x13f/frame 0xfe008e34a320
> panic() at panic+0x43/frame 0xfe008e34a380
> smp_targeted_tlb_shootdown_native() at
> smp_targeted_tlb_shootdown_native+0x472/frame 0xfe008e34a4c0
> pmap_remove_all() at pmap_remove_all+0x560/frame 0xfe008e34a540

Can you obtain the core dump and then backtrace from kgdb?



Re: weekly locate error Was: September 2024 stabilization week

2024-10-01 Thread Jamie Landeg-Jones
Kyle Evans  wrote:

> Yes, my proposal is that it stops doing that and we teach updatedb to 
> handle the priv-dropping instead, so that you get the same behavior no 
> matter how you execute it.

Ahhh OK, I get you now. Sorry, I misunderstood. I thought you meant the
current "periodic" method runs the filesystem walk as root, but when you
said "if someone really wants to complain that they can't document all
filenames on the system.", I guess you were referring to those who
may call /usr/libexec/locate.updatedb directly as root.

For what it's worth, in addition to the periodic job, I do actually run
a less frequent privileged direct run of
/usr/libexec/locate.updatedb (with the output in a suitably locked
directory!). This proposed change wouldn't be an issue to me, but as a
data point, there may be quite a few others who do so too.

Cheers, Jamie



Re: September 2024 stabilization week

2024-10-01 Thread Gleb Smirnoff
On Sun, Sep 29, 2024 at 09:42:17AM -0700, Simon J. Gerraty wrote:
S> Michael Butler  wrote:
S> > > I have found that *only* on arm64, locate errors like so:
S> > >
S> > > # sh /etc/periodic/weekly/310.locate
S> 
S> This runs /usr/libexec/locate.updatedb as nobody
S> and ensures that /var/db/locate.database exists and is owned by nobody,
S> but /var/db itself is root:wheel and 755 so the error from install does
S> not seem surprising.
S> 
S> Though that begs the question of how this ever works ;-)

The way it always worked is that /var/db/locate.database always
exists and is owned by nobody.  This is done by the periodic job
before doing su:

locdb="$FCODES"
touch "$locdb" && rc=0 || rc=3
chown nobody "$locdb" || rc=3
chmod 644 "$locdb" || rc=3

After that it runs su:

echo /usr/libexec/locate.updatedb | nice -n 5 su -fm nobody || rc=3

Before f62c1f3f8e91 the file was installed with cat(1):

cat $tmp > $FCODES  # should be cp?

After f62c1f3f8e91 install(1) is used.  The latter is designed to
use a temporary file to avoid race conditions.  But we can't create a
temporary file in /var/db when we are nobody.

I'm going to change this line back to cat(1) in a week unless Dag-Erling
responds.
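
The difference between the two approaches can be seen without root by comparing inode numbers. This is only a sketch: it works in a scratch directory, the paths and variable names are made up for the demonstration, and mv(1) of a temporary file stands in for the temporary-file behavior of install(1) described above.

```shell
#!/bin/sh
# Demonstration in a scratch directory; nothing here touches /var/db.
dir=$(mktemp -d)
db="$dir/locate.database"
new="$dir/new"
: > "$db"
printf 'fresh contents\n' > "$new"

# Print the inode number of a file.
inode() { ls -i "$1" | awk '{print $1}'; }

ino_before=$(inode "$db")

# cat(1) with redirection truncates and rewrites the existing file in
# place, so only write permission on the file itself is needed.
cat "$new" > "$db"
ino_after_cat=$(inode "$db")

# Temporary file plus rename: a new directory entry is created next to
# the target, which needs write permission on the directory.
tmp=$(mktemp "$dir/tmp.XXXXXX")
cat "$new" > "$tmp"
mv -f "$tmp" "$db"
ino_after_mv=$(inode "$db")

[ "$ino_before" = "$ino_after_cat" ] && echo "cat: same inode, file rewritten in place"
[ "$ino_before" != "$ino_after_mv" ] && echo "rename: new inode, directory was modified"
rm -rf "$dir"
```

In the periodic case the first form is the one that works for nobody, because the file was pre-created and chowned while /var/db itself stays root:wheel 755.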

-- 
Gleb Smirnoff



Re: Wake (resume) regression?

2024-10-01 Thread Graham Perrin

On 01/10/2024 03:21, Graham Perrin wrote:


Has anything changed in the past two months that might cause wake to 
become unreliable?


… IIRC the unreliability became noticeable not long after the bump to 
1500023. That was around the end of August (with pkgbase). …



Correction: that was around the end of July (not August).