On Tue, Dec 22, 2020 at 04:17:21PM +1100, Jonathan Gray wrote:
> On Tue, Dec 22, 2020 at 02:56:58AM +0000, James Cook wrote:
> > On Sun, Dec 20, 2020 at 06:52:39PM +0000, James Cook wrote:
> > > On Sun, Dec 20, 2020 at 08:41:00PM +1100, Jonathan Gray wrote:
> > > > On Sun, Dec 20, 2020 at 06:27:33AM +0000, James Cook wrote:
> > > > > On Sat, Dec 19, 2020 at 07:33:42AM +0000, James Cook wrote:
> > > > > > > Suggestions are welcome. In the meantime I am slowly trying to 
> > > > > > > debug
> > > > > > > this myself, mostly as a learning exercise. I've successfully 
> > > > > > > built my
> > > > > > > own bsd.rd (using the instructions on the release(8) man page) 
> > > > > > > with the
> > > > > > > intention of adding some debug output to narrow down where it's
> > > > > > > getting stuck, but I don't know my way around the kernel code.
> > > > > > 
> > > > > > Minor progress: I have determined that the kernel gets at least as 
> > > > > > far
> > > > > > as exec-ing the init process (more precisely, calling sys_execve in
> > > > > > init_main.c).
> > > > > 
> > > > > I found out init gets stuck calling sleep(2) in setctty in
> > > > > sbin/init/init.c. (Details below on how I determined that.)
> > > > > 
> > > > > Any idea what could cause a call to sleep to just hang indefinitely?
> > > > > 
> > > > 
> > > > Can you try hpet instead of tsc?
> > > > 
> > > > Either sysctl kern.timecounter.hardware=acpihpet0
> > > 
> > > With (my custom) bsd.rd, that sysctl does not seem to exist, and it still 
> > > hangs
> > > on boot with the below code change.
> > > 
> > > With bsd.sp and bsd.mp, the sysctl was already set to acpihpet0. If I set 
> > > it to
> > > tsc, then "sleep 1" gets somewhat slower but doesn't take on the order of
> > > minutes to an hour like it does with bsd.rd.
> > > 
> > > If it's relevant: I noticed the time reported by the "date" command 
> > > advances
> > > normally with bsd.sp/bsd.mp and acpihpet0 (I ran two date commands 15s 
> > > apart
> > > according to my phone timer, and got dates 15s apart). With the sysctl 
> > > set to
> > > tsc, or with bsd.rd, the time returned by date advances slowly.
> > > 
> > > (With bsd.sp and bsd.mp, I tried "disable acpihpet*" at the boot_config 
> > > prompt,
> > > but the sysctl was still set to acpihpet0 when I booted.)
> > 
> > Sorry Jonathan, I somehow didn't include you in my reply, which is
> > quoted above.
> > 
> > I have a bit more data to share.
> > 
> > It appears bsd.rd's tsleep_nsec is sleeping for about 2.5x as long as
> > needed, but its uptime increases at only 1/2500 the rate of real time,
> > so sys_nanosleep is calling tsleep_nsec about 1000 times to compensate.
> > (I didn't wait around for all 1000 loop iterations).
> > 
> > By contrast, bsd.sp's tsleep_nsec sleeps for about 4.2x as long as
> > requested, but it is able to measure real time accurately.
> > 
> > Based on my previous email, I guess bsd.rd is using tsc and bsd.sp is
> > using hpet.
> > 
> > In more detail:
> > 
> > I added some debug output to the sys_nanosleep function, printing the
> > "start" and "end" values returned by getnanouptime on each iteration of
> > the loop. Diff at bottom of email.
> > 
> > I've summarized in these two tables and explained them in the text
> > below.
> > 
> > 
> > bsd.rd:
> > 
> > requested nsecs | actual time per loop iteration | end - start
> > ----------------+--------------------------------+-------------
> >      2000000000 |                             5s |         2ms
> > 
> > 
> > bsd.sp:
> > 
> > requested nsecs | end - start (plausibly matches actual time)
> > ----------------+--------------------------------------------
> >      1000000000 |   4.22s
> >      2000000000 |   8.40s
> >      5000000000 |  20.93s
> >     10000000000 |  41.82s
> > 
> > 
> > requested nsecs:
> > 
> > This is the "nsecs" variable in the inner loop, matching what was
> > passed to the nanosleep system call. In the case of bsd.rd, I list the
> > value from the first loop iteration; for bsd.sp, there was always just
> > one iteration.
> > 
> > 
> > end - start:
> > 
> > This is the difference between the "stop" and "start" timespecs, filled
> > by getnanouptime before and after the call to tsleep_nsec.
> > 
> > Note that for bsd.rd, it only goes up by 2ms, only 1/1000 of what was
> > requested.
> > 
> > For bsd.sp, it seems to be about 0.04 + 4.18 * (requested time).
> > 
> > 
> > "actual time per loop iteration":
> > 
> > For bsd.rd, each loop iteration takes about 5s (as measured by my
> > phone's timer). So, real time is passing about 2500x as fast as the
> > kernel thinks it is.
> > 
> > As far as I can tell, bsd.sp measures real time intervals accurately
> > (as reported by date, or by comparing end-start to my phone's timer).
> 
> I'm not sure if it will change anything but if the problem is with
> tsc perhaps this will help?
> 
> Index: sys/arch/amd64/amd64/tsc.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.21
> diff -u -p -r1.21 tsc.c
> --- sys/arch/amd64/amd64/tsc.c        6 Sep 2020 20:50:00 -0000       1.21
> +++ sys/arch/amd64/amd64/tsc.c        20 Dec 2020 09:29:16 -0000
> @@ -72,6 +72,7 @@ tsc_freq_cpuid(struct cpu_info *ci)
>                       case 0x5e: /* Skylake desktop */
>                       case 0x8e: /* Kabylake mobile */
>                       case 0x9e: /* Kabylake desktop */
> +                     case 0xa6: /* Coffeelake mobile */
>                               khz = 24000; /* 24.0 MHz */
>                               break;
>                       case 0x5f: /* Atom Denverton */

That fixed it! With both bsd.rd and bsd.sp, "sleep 10" sleeps for 10
seconds, and date;sleep 10;date shows dates 10s apart.

I also noticed kern.timecounter.hardware is now "tsc" when I boot
bsd.sp. Without the above diff, it is acpihpet0. With the diff applied
I tried setting it to acpihpet0, and everything still works (sleep 10
sleeps for 10 seconds, and date;sleep 10;date shows dates 10s apart).

(I have been reading the code and am still scratching my head about how
tsleep_nsec and getnanouptime could have disagreed with each other, no
matter how broken the underlying timer is. As far as I can tell,
getnanouptime gets the time from timehands which is updated by
tc_windup, and timers are determined by "elapsed" time in
timeout_hardclock_update, and both functions seem to rely on
essentially the same mechanism to decide how much time has passed. I
guess I've only scratched the surface...)

-- 
James

Reply via email to