it seems /dev/bintime will continue to be used no matter what. The Blue Gene approach, I understand, won't quite do the trick, although it was what we came up with after rejecting the "magic negative fd" and kread/kwrite system call approach (we tried a lot of options in 2007). The "magic negative fd" starts to look like ioctl, and we weren't comfortable going down that road. The new system calls seem to only complicate the picture. If I had to do this all over again, given the options presented so far, I'd do what we did on Blue Gene. It preserves the use of /dev/bintime as a source. Note, also, that the fd need not stay allocated from user mode. Part of the currying involves getting the chan for the fd, and doing an incref on that. User mode can close the fd once the process private system call is set up. But, I understand if this seems a bit too much for the task at hand.
In the HPC days we realized that we could just return the time in nsecs in an unused register, since the toolchain is caller-save; user code could ignore that register, save for the one or two programs that use it. This had the nice property of making system call measurement easy. Further, one could shortstop sysr1 very early in the assembly language syscall prolog, and the net effect would be a very low overhead system call that happens to return time. We ended the project before trying that out.

This is a bit nicer than using TOS because it has zero impact on cache lines or any other memory subsystem hardware. In the best case, it's just a register-to-register move. I'm experimenting with that now on RISC-V on Linux.

The low-res-clock-on-TOS is really nice, esp. if it derives from a stable clock. How precise does the clock need to be for Go? nsec? microsecond?

Also, for many reasons, I'd rather not have to do a system call to get a precise clock. This kind of thing matters in HPC, and I'm back in that world, so I care again :-)

Could there be two more TOS variables, M and N, such that nsec = (rdtsc * N) / M? I.e. the kernel passes calibration on TOS and user code uses it to compute nsec, without having to leave user mode? This M and N approach is very commonly used in clock trees, which use integers, not floating point, to implement clock scaling.

On Mon, Mar 10, 2025 at 7:27 PM Charles Forsyth <charles.fors...@gmail.com> wrote:

> I've done that often enough!
>
> On Mon, 10 Mar 2025 at 22:46, <o...@eigenstate.org> wrote:
>
>> Quoth o...@eigenstate.org:
>> > If the goal is performance, any syscall is much of a muchness, and keeping
>> > things in userspace is the way to go. As far as this:
>> >
>> > > Or the kernel would have to share a memory page with the parameters with
>> > > the user process, and then that layout becomes part of the kernel interface.
>> >
>> > We already have that with the TOS. It even has a low res clock in there.
>> >
>>
>> Also, before I forget: on 9front, userspace uses /dev/bintime; this has
>> come in handy in the past because I've bound static files over it for
>> testing. Losing that interposability would be a downside.
>>

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T59810df4fe34a033-Mfc8a8d467f4076826272ca27
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
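
For concreteness, here is a rough sketch of the M and N scaling asked about above, nsec = (ticks * N) / M, done entirely in integer math. Everything named here is hypothetical: Clockscale, its n and m fields, and ticks2ns are made up for illustration and are not existing Tos fields or library calls. The tick count is passed in as a plain argument, so the sketch does not depend on rdtsc or any particular counter.

```c
#include <stdint.h>

/*
 * Hypothetical calibration pair, as a kernel might publish it.
 * Nothing here is an existing Tos field; n and m are illustrative names.
 * The intent is ns = ticks * n / m, e.g. n=5, m=12 for a 2.4 GHz counter.
 */
typedef struct Clockscale Clockscale;
struct Clockscale {
	uint32_t n;	/* numerator */
	uint32_t m;	/* denominator, must be nonzero */
};

/*
 * Convert raw counter ticks to nanoseconds with integer math only.
 * Computing ticks*n directly can overflow 64 bits, so split ticks into
 * quotient and remainder by m first:
 *   ticks*n/m == (ticks/m)*n + ((ticks%m)*n)/m
 * The remainder term fits in 64 bits because both factors are < 2^32.
 */
uint64_t
ticks2ns(uint64_t ticks, Clockscale s)
{
	uint64_t q = ticks / s.m;
	uint64_t r = ticks % s.m;

	return q * s.n + (r * s.n) / s.m;
}
```

With n=5 and m=12, a count of 2400000000 ticks (one second at 2.4 GHz) converts to 1000000000 nsec. Splitting into quotient and remainder is one way to avoid needing a 128-bit intermediate; a multiply-and-shift pair would be another option if a divide per reading is too costly.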