date:20230212

[RFC PATCH 0/12] Towards glibc on x86_64-gnu

2023-02-12 Thread Sergey Bugaev

Hello!

Seeing that the patches to make binutils and GCC support --target=x86_64-gnu
have recently landed, I thought it would be interesting to try to configure
glibc with --host=x86_64-gnu and see what breaks.

To make things work at least somewhat, I've tried to put some basic directory
& Implies structure in place, then fixed various 64-bit issues in generic code
(these fixes should hopefully be uncontroversial!). Then I have even tried to
actually write some of the missing arch-specific code for x86_64-gnu.

With this patch series, I'm able to 'make install-headers' successfully.
Moreover, I'm able to 'make io/subdir_objs' successfully - which builds a lot
of Hurd-specific, but mostly arch-independent, routines. 'subdir_objs' in
'mach', 'hurd', and 'htl' do not succeed, but a lot of files do get built.

The main missing pieces seem to be, in order of increasing scariness:
- missing x86_64 thread state definition in the kernel headers;
- TLS things;
- INTR_MSG_TRAP.

Thread state (i.e. a struct with all the register values) should not be that
hard for GNU Mach to define (unless it is defined somewhere already and I'm
just not seeing it).

It seems that GCC expects TLS on x86_64 to be done relative to %fs, not %gs, so
that's what I attempted to do in tls.h. The main thing missing there is the
ability to actually set (and read) the %fs base address of a thread. It is my
understanding (but note that I have no idea what I'm talking about) that on
x86_64 the segment descriptors (as in GDT/LDT) are not used for this, and
instead the address can be set by writing to an MSR. Linux exposes the
arch_prctl (ARCH_[GS]ET_[FG]S) syscall for this; so maybe GNU Mach could also
have an explicit routine for this, perhaps like this:

routine i386_set_fgs_base (
    target_thread: thread_t;
    which: int;
    value: rpc_vm_address_t);

We should not need a getter routine, because one can simply inspect the target
thread's state (unless, again, I misunderstand things horribly).

Anyway; this is an RFC. I don't expect the latter patches to actually get
merged. "I have no idea what I'm doing" applies. Maybe I'm doing things
horribly wrong; tell me!

Note: I'm sending this as a single patch series that consists of patches for
the glibc, MIG, and the Hurd repos. The repo that a patch is for is indicated
in the subject line of each patch.

Sergey

[RFC PATCH glibc 12/12] C11 thrd: Downgrade the default alignment of mtx_t

2023-02-12 Thread Sergey Bugaev

...so that it can match the alignment of pthread_mutex_t on x86_64-gnu.
This likely breaks many other arches (if not all of them), though.
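
To illustrate why the type of the __align member matters: a union is aligned
as strictly as its most-aligned member, so swapping 'long int' for 'int'
lowers the alignment of the whole union. A minimal sketch follows; the union
names and the 32-byte size are made up for illustration and are not glibc's
actual definitions, and __LOCK_ALIGNMENT is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for mtx_t.  32 is an arbitrary size, not the
   real __SIZEOF_PTHREAD_MUTEX_T.  */
union demo_mtx_long
{
  char size[32];
  long int align;
};

union demo_mtx_int
{
  char size[32];
  int align;
};

/* The char array only needs byte alignment, so the alignment of each
   union is dictated by its 'align' member.  */
static_assert (_Alignof (union demo_mtx_long) == _Alignof (long int),
               "union follows its long int member");
static_assert (_Alignof (union demo_mtx_int) == _Alignof (int),
               "union follows its int member");

size_t
demo_align_long (void)
{
  return _Alignof (union demo_mtx_long);
}

size_t
demo_align_int (void)
{
  return _Alignof (union demo_mtx_int);
}
```

On a typical LP64 target the first function would report the alignment of
'long int' and the second that of 'int'; the exact numbers depend on the ABI.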

Signed-off-by: Sergey Bugaev 
---
 sysdeps/pthread/threads.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sysdeps/pthread/threads.h b/sysdeps/pthread/threads.h
index 860d597d..207c1dee 100644
--- a/sysdeps/pthread/threads.h
+++ b/sysdeps/pthread/threads.h
@@ -64,7 +64,7 @@ typedef __once_flag once_flag;
 typedef union
 {
   char __size[__SIZEOF_PTHREAD_MUTEX_T];
-  long int __align __LOCK_ALIGNMENT;
+  int __align __LOCK_ALIGNMENT;
 } mtx_t;
 
 typedef union
-- 
2.39.1

[RFC PATCH glibc 3/12] mach, hurd: Cast through uintptr_t

2023-02-12 Thread Sergey Bugaev

When casting between a pointer and an integer of a different size, GCC
emits a warning (which is escalated to a build failure by -Werror).
Indeed, if what you start with is a pointer, which you then cast to a
shorter integer and then back again, you're going to cut off some bits
of the pointer.

But if you start with an integer (such as mach_port_t), then cast it to
a longer pointer (void *), and then back to a shorter integer, you are
fine. To keep GCC happy, cast through an intermediary uintptr_t, which
is always the same size as a pointer.
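
As a minimal sketch of the pattern (demo_port_t is a hypothetical 32-bit
handle type standing in for mach_port_t, and both helper names are made up
for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit handle type, standing in for mach_port_t.  */
typedef uint32_t demo_port_t;

/* Integer -> pointer: widen to pointer size first, then convert, so the
   compiler never sees a direct cast between differently sized types.  */
void *
demo_port_to_cookie (demo_port_t port)
{
  return (void *) (uintptr_t) port;
}

/* Pointer -> integer: go through uintptr_t, then narrow explicitly.  */
demo_port_t
demo_cookie_to_port (void *cookie)
{
  uintptr_t bits = (uintptr_t) cookie;

  /* Only cookies produced by demo_port_to_cookie are expected here, so
     the value must fit back into 32 bits.  */
  assert (bits <= UINT32_MAX);
  return (demo_port_t) bits;
}
```

The round trip is lossless only because the value started out as the
narrower integer; a genuine pointer passed to demo_cookie_to_port would
still be truncated.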

Signed-off-by: Sergey Bugaev 
---
 htl/cthreads-compat.c |  5 +++--
 hurd/fopenport.c  | 15 ++-
 hurd/hurd/port.h  |  2 +-
 hurd/port-cleanup.c   |  3 ++-
 hurd/vpprintf.c   |  6 --
 mach/devstream.c  |  9 +
 6 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/htl/cthreads-compat.c b/htl/cthreads-compat.c
index 55043a8b..2a0a95b6 100644
--- a/htl/cthreads-compat.c
+++ b/htl/cthreads-compat.c
@@ -25,8 +25,9 @@ void
 __cthread_detach (__cthread_t thread)
 {
   int err;
+  pthread_t pthread = (pthread_t) (uintptr_t) thread;
 
-  err = __pthread_detach ((pthread_t) thread);
+  err = __pthread_detach (pthread);
   assert_perror (err);
 }
 weak_alias (__cthread_detach, cthread_detach)
@@ -40,7 +41,7 @@ __cthread_fork (__cthread_fn_t func, void *arg)
   err = __pthread_create (&thread, NULL, func, arg);
   assert_perror (err);
 
-  return (__cthread_t) thread;
+  return (__cthread_t) (uintptr_t) thread;
 }
 weak_alias (__cthread_fork, cthread_fork)
 
diff --git a/hurd/fopenport.c b/hurd/fopenport.c
index 60afb445..be6aa30c 100644
--- a/hurd/fopenport.c
+++ b/hurd/fopenport.c
@@ -28,9 +28,10 @@ readio (void *cookie, char *buf, size_t n)
   mach_msg_type_number_t nread;
   error_t err;
   char *bufp = buf;
+  io_t io = (io_t) (uintptr_t) cookie;
 
   nread = n;
-  if (err = __io_read ((io_t) cookie, &bufp, &nread, -1, n))
+  if (err = __io_read (io, &bufp, &nread, -1, n))
 return __hurd_fail (err);
 
   if (bufp != buf)
@@ -50,8 +51,9 @@ writeio (void *cookie, const char *buf, size_t n)
 {
   vm_size_t wrote;
   error_t err;
+  io_t io = (io_t) (uintptr_t) cookie;
 
-  if (err = __io_write ((io_t) cookie, buf, n, -1, &wrote))
+  if (err = __io_write (io, buf, n, -1, &wrote))
 return __hurd_fail (err);
 
   return wrote;
@@ -65,7 +67,8 @@ seekio (void *cookie,
off64_t *pos,
int whence)
 {
-  error_t err = __io_seek ((file_t) cookie, *pos, whence, pos);
+  io_t io = (io_t) (uintptr_t) cookie;
+  error_t err = __io_seek (io, *pos, whence, pos);
   return err ? __hurd_fail (err) : 0;
 }
 
@@ -74,8 +77,9 @@ seekio (void *cookie,
 static int
 closeio (void *cookie)
 {
+  io_t io = (io_t) (uintptr_t) cookie;
   error_t error = __mach_port_deallocate (__mach_task_self (),
- (mach_port_t) cookie);
+ io);
   if (error)
 return __hurd_fail (error);
   return 0;
@@ -127,6 +131,7 @@ __fopenport (mach_port_t port, const char *mode)
   return NULL;
 }
 
-  return fopencookie ((void *) port, mode, funcsio);
+  return fopencookie ((void *) (uintptr_t) port,
+  mode, funcsio);
 }
 weak_alias (__fopenport, fopenport)
diff --git a/hurd/hurd/port.h b/hurd/hurd/port.h
index fdba8db5..1e473978 100644
--- a/hurd/hurd/port.h
+++ b/hurd/hurd/port.h
@@ -96,7 +96,7 @@ _hurd_port_locked_get (struct hurd_port *port,
   if (result != MACH_PORT_NULL)
 {
   link->cleanup = &_hurd_port_cleanup;
-  link->cleanup_data = (void *) result;
+  link->cleanup_data = (void *) (uintptr_t) result;
   _hurd_userlink_link (&port->users, link);
 }
   __spin_unlock (&port->lock);
diff --git a/hurd/port-cleanup.c b/hurd/port-cleanup.c
index 08ab3d47..ad43b830 100644
--- a/hurd/port-cleanup.c
+++ b/hurd/port-cleanup.c
@@ -26,7 +26,8 @@
 void
 _hurd_port_cleanup (void *cleanup_data, jmp_buf env, int val)
 {
-  __mach_port_deallocate (__mach_task_self (), (mach_port_t) cleanup_data);
+  mach_port_t port = (mach_port_t) (uintptr_t) cleanup_data;
+  __mach_port_deallocate (__mach_task_self (), port);
 }
 
 /* We were cancelled while using a port, and called from the cleanup unwinding.
diff --git a/hurd/vpprintf.c b/hurd/vpprintf.c
index fe734946..010f04ca 100644
--- a/hurd/vpprintf.c
+++ b/hurd/vpprintf.c
@@ -25,8 +25,9 @@
 static ssize_t
 do_write (void *cookie,const char *buf, size_t n)
 {
+  io_t io = (io_t) (uintptr_t) cookie;
   vm_size_t amount = n;
-  error_t error = __io_write ((io_t) cookie, buf, n, -1, &amount);
+  error_t error = __io_write (io, buf, n, -1, &amount);
   if (error)
 return __hurd_fail (error);
   return n;
@@ -51,7 +52,8 @@ vpprintf (io_t port, const char *format, va_list arg)
 #endif
 
   _IO_cookie_init (&temp_f.cfile, _IO_NO_READS,
-  (void *) port, (cookie_io_functions_t) { write: do_write });
+   (void *) (uintptr_t) port,
+   (co

[RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Sergey Bugaev

tls.h in particular is very unfinished.

Signed-off-by: Sergey Bugaev 
---
 sysdeps/mach/hurd/x86_64/static-start.S |  27 +++
 sysdeps/mach/hurd/x86_64/tls.h  | 182 
 sysdeps/mach/hurd/x86_64/tlsdesc.sym|  22 +++
 sysdeps/x86_64/htl/bits/pthreadtypes-arch.h |  36 
 sysdeps/x86_64/htl/machine-sp.h |  29 
 sysdeps/x86_64/htl/pt-machdep.h |  28 +++
 6 files changed, 324 insertions(+)
 create mode 100644 sysdeps/mach/hurd/x86_64/static-start.S
 create mode 100644 sysdeps/mach/hurd/x86_64/tls.h
 create mode 100644 sysdeps/mach/hurd/x86_64/tlsdesc.sym
 create mode 100644 sysdeps/x86_64/htl/bits/pthreadtypes-arch.h
 create mode 100644 sysdeps/x86_64/htl/machine-sp.h
 create mode 100644 sysdeps/x86_64/htl/pt-machdep.h

diff --git a/sysdeps/mach/hurd/x86_64/static-start.S 
b/sysdeps/mach/hurd/x86_64/static-start.S
new file mode 100644
index ..982d3d52
--- /dev/null
+++ b/sysdeps/mach/hurd/x86_64/static-start.S
@@ -0,0 +1,27 @@
+/* Startup code for statically linked Hurd/x86_64 binaries.
+   Copyright (C) 1998-2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+   .text
+   .globl _start
+_start:
+   call _hurd_stack_setup
+   xorq %rdx, %rdx
+   jmp _start1
+
+#define _start _start1
+#include 
diff --git a/sysdeps/mach/hurd/x86_64/tls.h b/sysdeps/mach/hurd/x86_64/tls.h
new file mode 100644
index ..027304d9
--- /dev/null
+++ b/sysdeps/mach/hurd/x86_64/tls.h
@@ -0,0 +1,182 @@
+/* Definitions for thread-local data handling.  Hurd/x86_64 version.
+   Copyright (C) 2003-2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _X86_64_TLS_H
+#define _X86_64_TLS_H
+
+
+/* Some things really need not be machine-dependent.  */
+#include 
+
+
+#ifndef __ASSEMBLER__
+# include 
+
+/* Type of the TCB.  */
+typedef struct
+{
+  void *tcb;   /* Points to this structure.  */
+  dtv_t *dtv;  /* Vector of pointers to TLS data.  */
+  thread_t self;   /* This thread's control port.  */
+  int __glibc_padding1;
+  int multiple_threads;
+  int gscope_flag;
+  uintptr_t sysinfo;
+  uintptr_t stack_guard;
+  uintptr_t pointer_guard;
+  long __glibc_padding2[2];
+  int private_futex;
+  int __glibc_padding3;
+  /* Reservation of some values for the TM ABI.  */
+  void *__private_tm[4];
+  /* GCC split stack support.  */
+  void *__private_ss;
+  /* The lowest address of shadow stack.  */
+  unsigned long long int ssp_base;
+
+  /* Keep these fields last, so offsets of fields above can continue being
+ compatible with the x86_64 NPTL version.  */
+  mach_port_t reply_port;  /* This thread's reply port.  */
+  struct hurd_sigstate *_hurd_sigstate;
+
+  /* Used by the exception handling implementation in the dynamic loader.  */
+  struct rtld_catch *rtld_catch;
+} tcbhead_t;
+
+/* GCC generates %fs:0x28 to access the stack guard.  */
+_Static_assert (offsetof (tcbhead_t, stack_guard) == 0x28,
+"stack guard offset");
+/* libgcc uses %fs:0x70 to access the split stack pointer.  */
+_Static_assert (offsetof (tcbhead_t, __private_ss) == 0x70,
+"split stack pointer offset");
+
+/* FIXME */
+# define __LIBC_NO_TLS() 0
+
+/* The TCB can have any size and the memory following the address the
+   thread pointer points to is unspecified.  Allocate the TCB there.  */
+# define TLS_TCB_AT_TP 1
+# define TLS_DTV_AT_TP 0
+
+# define TCB_ALIGNMENT 64
+
+# define TLS_INIT_TP(descr) 0
+
+# if __GNUC_PREREQ (6, 0)
+
+#  defin

[RFC PATCH mig 7/12] Drop -undef -ansi from cpp flags

2023-02-12 Thread Sergey Bugaev

Since GNU Mach commit d30481122a5d24ad6b921062f93b9172ef922fc3,
i386/machine_types.defs defines types based on defined(__x86_64__).
Suppressing the built-in macro definitions will now result in the wrong
type being silently selected.

-undef was initially introduced in commit
78b6a7665db7b2eae367e17102821cbdca231d19 without much of an explanation.
-ansi was introduced in commit 6940fb91859e46b2e96a331a029f2dc2a0ee51c9
"to avoid -Di386=1 and the like".

Since glibc has been using MIG with CPP set to a custom GCC invocation
which did *not* use either flag, it appears that everything works well
enough even without them. On the other hand, not having __x86_64__
defined most definitely causes issues for anything that does not set a
custom CPP when invoking MIG (i.e., most users). Other built-in
definitions could be used in the future in a similar way (e.g. on other
architectures); it's really more of a coincidence that they have not
been used until now, and things kept working with -undef.
---
 mig.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mig.in b/mig.in
index 63e0269..94fd500 100644
--- a/mig.in
+++ b/mig.in
@@ -38,7 +38,7 @@ migcom=${MIGDIR-$libexecdir}/${MIGCOM-@MIGCOM@}
 # The expansion of TARGET_CC might refer to ${CC}, so make sure it is defined.
 default_cc="@CC@"
 CC="${CC-${default_cc}}"
-default_cpp="@TARGET_CC@ -E -x c -undef -ansi"
+default_cpp="@TARGET_CC@ -E -x c"
 cpp="${CPP-${default_cpp}}"
 
 cppflags=
-- 
2.39.1

[RFC PATCH glibc 5/12] htl: Fix semaphore reference

2023-02-12 Thread Sergey Bugaev

'sem' is the opaque 'sem_t', 'isem' is the actual 'struct new_sem'.

Signed-off-by: Sergey Bugaev 
---
 sysdeps/htl/sem-timedwait.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/sysdeps/htl/sem-timedwait.c b/sysdeps/htl/sem-timedwait.c
index 8f2df6e7..9974e9ae 100644
--- a/sysdeps/htl/sem-timedwait.c
+++ b/sysdeps/htl/sem-timedwait.c
@@ -60,7 +60,7 @@ __sem_timedwait_internal (sem_t *restrict sem,
   int cancel_oldtype = LIBC_CANCEL_ASYNC();
 
 #if __HAVE_64B_ATOMICS
-  uint64_t d = atomic_fetch_add_relaxed (&sem->data,
+  uint64_t d = atomic_fetch_add_relaxed (&isem->data,
 (uint64_t) 1 << SEM_NWAITERS_SHIFT);
 
   pthread_cleanup_push (__sem_wait_cleanup, isem);
@@ -72,11 +72,11 @@ __sem_timedwait_internal (sem_t *restrict sem,
  /* No token, sleep.  */
  if (timeout)
err = __lll_abstimed_wait_intr (
- ((unsigned int *) &sem->data) + SEM_VALUE_OFFSET,
+ ((unsigned int *) &isem->data) + SEM_VALUE_OFFSET,
  0, timeout, flags, clock_id);
  else
err = __lll_wait_intr (
- ((unsigned int *) &sem->data) + SEM_VALUE_OFFSET,
+ ((unsigned int *) &isem->data) + SEM_VALUE_OFFSET,
  0, flags);
 
  if (err != 0 && err != KERN_INVALID_ARGUMENT)
@@ -92,12 +92,12 @@ __sem_timedwait_internal (sem_t *restrict sem,
}
 
  /* Token changed */
- d = atomic_load_relaxed (&sem->data);
+ d = atomic_load_relaxed (&isem->data);
}
   else
{
  /* Try to acquire and dequeue.  */
- if (atomic_compare_exchange_weak_acquire (&sem->data,
+ if (atomic_compare_exchange_weak_acquire (&isem->data,
  &d, d - 1 - ((uint64_t) 1 << SEM_NWAITERS_SHIFT)))
{
  /* Success */
-- 
2.39.1

[RFC PATCH glibc 1/12] hurd: Refactor readlinkat()

2023-02-12 Thread Sergey Bugaev

Make the code flow more linear using early returns where possible. This
makes it so much easier to reason about what runs on error / successful
code paths.

Signed-off-by: Sergey Bugaev 
---
 sysdeps/mach/hurd/readlinkat.c | 55 --
 1 file changed, 32 insertions(+), 23 deletions(-)

diff --git a/sysdeps/mach/hurd/readlinkat.c b/sysdeps/mach/hurd/readlinkat.c
index 1efb09ca..dabdbb37 100644
--- a/sysdeps/mach/hurd/readlinkat.c
+++ b/sysdeps/mach/hurd/readlinkat.c
@@ -31,38 +31,47 @@ __readlinkat (int fd, const char *file_name, char *buf, 
size_t len)
   error_t err;
   file_t file_stat;
   struct stat64 st;
+  enum retry_type doretry;
+  char retryname[1024];
+  file_t file;
+  char *rbuf = buf;
 
   file_stat = __file_name_lookup_at (fd, 0, file_name, O_NOLINK, 0);
   if (file_stat == MACH_PORT_NULL)
 return -1;
 
   err = __io_stat (file_stat, &st);
-  if (! err)
-if (S_ISLNK (st.st_mode))
-  {
-   enum retry_type doretry;
-   char retryname[1024];
-   file_t file;
-   char *rbuf = buf;
+  if (err)
+goto out;
+  if (!S_ISLNK (st.st_mode))
+{
+  err = EINVAL;
+  goto out;
+}
 
-   err = __dir_lookup (file_stat, "", O_READ | O_NOLINK, 0, &doretry, 
retryname, &file);
-   if (! err && (doretry != FS_RETRY_NORMAL || retryname[0] != '\0'))
- err = EGRATUITOUS;
-   if (! err)
- {
-   err = __io_read (file, &rbuf, &len, 0, len);
-   if (!err && rbuf != buf)
- {
-   memcpy (buf, rbuf, len);
-   __vm_deallocate (__mach_task_self (), (vm_address_t)rbuf, len);
- }
+  err = __dir_lookup (file_stat, "", O_READ | O_NOLINK,
+  0, &doretry, retryname, &file);
+  if (err)
+goto out;
+  if (doretry != FS_RETRY_NORMAL || retryname[0] != '\0')
+{
+  err = EGRATUITOUS;
+  goto out;
+}
+
+  err = __io_read (file, &rbuf, &len, 0, len);
+  __mach_port_deallocate (__mach_task_self (), file);
+  if (err)
+goto out;
+
+  if (rbuf != buf)
+{
+  memcpy (buf, rbuf, len);
+  __vm_deallocate (__mach_task_self (), (vm_address_t) rbuf, len);
+}
 
-   __mach_port_deallocate (__mach_task_self (), file);
- }
-  }
-else
-  err = EINVAL;
 
+ out:
   __mach_port_deallocate (__mach_task_self (), file_stat);
 
   return err ? __hurd_fail (err) : len;
-- 
2.39.1

[RFC PATCH glibc 9/12] mach: Look for mach_i386.defs on x86_64 too

2023-02-12 Thread Sergey Bugaev

Signed-off-by: Sergey Bugaev 
---
 mach/Makefile | 3 ++-
 sysdeps/mach/configure| 6 +++---
 sysdeps/mach/configure.ac | 6 +++---
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/mach/Makefile b/mach/Makefile
index 39358fdb..f2fdd7da 100644
--- a/mach/Makefile
+++ b/mach/Makefile
@@ -64,7 +64,8 @@ CFLAGS-RPC_i386_set_ldt.o = $(no-stack-protector)
 CFLAGS-RPC_task_get_special_port.o = $(no-stack-protector)
 
 # Translate GNU names for CPUs into the names used in Mach header files.
-mach-machine = $(patsubst powerpc,ppc,$(base-machine))
+mach-machine := $(patsubst powerpc,ppc,$(base-machine))
+mach-machine := $(patsubst x86_64,i386,$(mach-machine))
 
 # Define mach-syscalls and sysno-*.
 ifndef inhibit_mach_syscalls
diff --git a/sysdeps/mach/configure b/sysdeps/mach/configure
index 3f0a9029..8c341d59 100644
--- a/sysdeps/mach/configure
+++ b/sysdeps/mach/configure
@@ -249,7 +249,7 @@ for ifc in mach mach4 gnumach \
   clock clock_priv host_priv host_security ledger lock_set \
   processor processor_set task task_notify thread_act vm_map \
   memory_object memory_object_default default_pager \
-  i386/mach_i386 \
+  machine/mach_i386 \
   ; do
   as_ac_Header=`$as_echo "ac_cv_header_mach/${ifc}.defs" | $as_tr_sh`
 ac_fn_c_check_header_preproc "$LINENO" "mach/${ifc}.defs" "$as_ac_Header"
@@ -440,7 +440,7 @@ if ${libc_cv_mach_i386_ioports+:} false; then :
 else
   cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
-#include 
+#include 
 
 _ACEOF
 if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
@@ -466,7 +466,7 @@ if ${libc_cv_mach_i386_gdt+:} false; then :
 else
   cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
-#include 
+#include 
 
 _ACEOF
 if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
diff --git a/sysdeps/mach/configure.ac b/sysdeps/mach/configure.ac
index a57cb259..579c0021 100644
--- a/sysdeps/mach/configure.ac
+++ b/sysdeps/mach/configure.ac
@@ -64,7 +64,7 @@ for ifc in mach mach4 gnumach \
   clock clock_priv host_priv host_security ledger lock_set \
   processor processor_set task task_notify thread_act vm_map \
   memory_object memory_object_default default_pager \
-  i386/mach_i386 \
+  machine/mach_i386 \
   ; do
   AC_CHECK_HEADER(mach/${ifc}.defs, [dnl
   mach_interface_list="$mach_interface_list $ifc"],, -)
@@ -89,7 +89,7 @@ AC_CHECK_HEADER(machine/ndr_def.h, [dnl
 
 AC_CACHE_CHECK(for i386_io_perm_modify in mach_i386.defs,
   libc_cv_mach_i386_ioports, [dnl
-AC_EGREP_HEADER(i386_io_perm_modify, mach/i386/mach_i386.defs,
+AC_EGREP_HEADER(i386_io_perm_modify, mach/machine/mach_i386.defs,
libc_cv_mach_i386_ioports=yes,
libc_cv_mach_i386_ioports=no)])
 if test $libc_cv_mach_i386_ioports = yes; then
@@ -98,7 +98,7 @@ fi
 
 AC_CACHE_CHECK(for i386_set_gdt in mach_i386.defs,
   libc_cv_mach_i386_gdt, [dnl
-AC_EGREP_HEADER(i386_set_gdt, mach/i386/mach_i386.defs,
+AC_EGREP_HEADER(i386_set_gdt, mach/machine/mach_i386.defs,
libc_cv_mach_i386_gdt=yes,
libc_cv_mach_i386_gdt=no)])
 if test $libc_cv_mach_i386_gdt = yes; then
-- 
2.39.1

[RFC PATCH mig 8/12] Set max type alignment to sizeof(long)

2023-02-12 Thread Sergey Bugaev

...not sizeof(int). This appears to be required on x86_64.

PLEASE NOTE: There is now a patch from Flavio Cruz ("Make MIG work for
pure 64 bit kernel and userland") that appears to tackle this same issue
(and more). Most likely this simplistic patch should be dropped in favor
of the one from Flavio.
---
 type.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/type.c b/type.c
index b104c66..7765200 100644
--- a/type.c
+++ b/type.c
@@ -542,7 +542,7 @@ itLongDecl(u_int inname, const_string_t instr, u_int 
outname,
 it->itOutName = outname;
 it->itOutNameStr = outstr;
 it->itSize = size;
-it->itAlignment = MIN(word_size, size / 8);
+it->itAlignment = MIN(sizeof_long, size / 8);
 if (inname == MACH_MSG_TYPE_STRING_C)
 {
it->itStruct = false;
-- 
2.39.1

[RFC PATCH glibc 4/12] hurd: Fix xattr error value

2023-02-12 Thread Sergey Bugaev

This does not seem like it is supposed to return negative error codes.

Signed-off-by: Sergey Bugaev 
---
 hurd/xattr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hurd/xattr.c b/hurd/xattr.c
index 48914bcf..5a0fc263 100644
--- a/hurd/xattr.c
+++ b/hurd/xattr.c
@@ -68,7 +68,7 @@ _hurd_xattr_get (io_t port, const char *name, void *value, 
size_t *size)
{
  if (buf != value)
__munmap (buf, bufsz);
- return -ERANGE;
+ return ERANGE;
}
   if (buf != value && bufsz > 0)
{
-- 
2.39.1

[RFC PATCH glibc 10/12] hurd: Set up the basic tree for x86_64-gnu

2023-02-12 Thread Sergey Bugaev

And move pt-setup.c to the generic x86 tree.

Signed-off-by: Sergey Bugaev 
---
 sysdeps/mach/hurd/Implies  | 1 +
 sysdeps/mach/hurd/{i386 => x86}/htl/pt-setup.c | 4 ++--
 sysdeps/mach/hurd/x86_64/Implies   | 2 ++
 sysdeps/mach/hurd/x86_64/htl/Implies   | 3 +++
 sysdeps/mach/x86_64/Implies| 1 +
 5 files changed, 9 insertions(+), 2 deletions(-)
 rename sysdeps/mach/hurd/{i386 => x86}/htl/pt-setup.c (97%)
 create mode 100644 sysdeps/mach/hurd/x86_64/Implies
 create mode 100644 sysdeps/mach/hurd/x86_64/htl/Implies
 create mode 100644 sysdeps/mach/x86_64/Implies

diff --git a/sysdeps/mach/hurd/Implies b/sysdeps/mach/hurd/Implies
index d2d5234c..e19dd1fd 100644
--- a/sysdeps/mach/hurd/Implies
+++ b/sysdeps/mach/hurd/Implies
@@ -3,3 +3,4 @@
 gnu
 # The Hurd provides a rough superset of the functionality of 4.4 BSD.
 unix/bsd
+mach/hurd/htl
diff --git a/sysdeps/mach/hurd/i386/htl/pt-setup.c 
b/sysdeps/mach/hurd/x86/htl/pt-setup.c
similarity index 97%
rename from sysdeps/mach/hurd/i386/htl/pt-setup.c
rename to sysdeps/mach/hurd/x86/htl/pt-setup.c
index 94caed82..3abd92b2 100644
--- a/sysdeps/mach/hurd/i386/htl/pt-setup.c
+++ b/sysdeps/mach/hurd/x86/htl/pt-setup.c
@@ -1,4 +1,4 @@
-/* Setup thread stack.  Hurd/i386 version.
+/* Setup thread stack.  Hurd/x86 version.
Copyright (C) 2000-2023 Free Software Foundation, Inc.
This file is part of the GNU C Library.
 
@@ -22,7 +22,7 @@
 
 #include 
 
-/* The stack layout used on the i386 is:
+/* The stack layout used on the x86 is:
 
 -
|  ARG|
diff --git a/sysdeps/mach/hurd/x86_64/Implies b/sysdeps/mach/hurd/x86_64/Implies
new file mode 100644
index ..6b5e6f47
--- /dev/null
+++ b/sysdeps/mach/hurd/x86_64/Implies
@@ -0,0 +1,2 @@
+mach/hurd/x86
+mach/hurd/x86_64/htl
diff --git a/sysdeps/mach/hurd/x86_64/htl/Implies 
b/sysdeps/mach/hurd/x86_64/htl/Implies
new file mode 100644
index ..1ae8f16e
--- /dev/null
+++ b/sysdeps/mach/hurd/x86_64/htl/Implies
@@ -0,0 +1,3 @@
+mach/hurd/htl
+mach/hurd/x86/htl
+x86_64/htl
diff --git a/sysdeps/mach/x86_64/Implies b/sysdeps/mach/x86_64/Implies
new file mode 100644
index ..da8291f4
--- /dev/null
+++ b/sysdeps/mach/x86_64/Implies
@@ -0,0 +1 @@
+mach/x86
-- 
2.39.1

[RFC PATCH glibc 2/12] hurd: Use mach_msg_type_number_t where appropriate

2023-02-12 Thread Sergey Bugaev

It has been decided that on x86_64, mach_msg_type_number_t stays 32-bit.
Therefore, it's not possible to use mach_msg_type_number_t
interchangeably with size_t. In particular, this breaks when a pointer to
a size_t variable is passed to a MIG routine that expects a pointer to
mach_msg_type_number_t.
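
The shape of the fix can be sketched as follows. Everything here is
hypothetical: demo_count_t stands in for a 32-bit mach_msg_type_number_t,
and demo_rpc_read is a fake stub that merely mimics an in/out count
parameter; it is not a real MIG routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit mach_msg_type_number_t.  */
typedef uint32_t demo_count_t;

/* Fake stub with an in/out count, loosely shaped like a MIG routine:
   on entry *count is the buffer capacity, on return the bytes stored.  */
int
demo_rpc_read (char *buf, demo_count_t *count)
{
  static const char reply[3] = { 'a', 'b', 'c' };
  demo_count_t n = *count < 3 ? *count : 3;

  for (demo_count_t i = 0; i < n; i++)
    buf[i] = reply[i];
  *count = n;
  return 0;
}

/* The caller works in size_t but hands the stub a correctly typed
   local, then copies the result back.  Passing &len directly would
   give the stub a pointer to an object of the wrong width.  */
size_t
demo_read (char *buf, size_t len)
{
  assert (len <= UINT32_MAX);

  demo_count_t nread = (demo_count_t) len;
  if (demo_rpc_read (buf, &nread) != 0)
    return 0;
  return nread;
}
```

This mirrors the 'nread' temporary introduced in readlinkat.c in this patch.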

Signed-off-by: Sergey Bugaev 
---
 hurd/hurdioctl.c   | 2 +-
 hurd/hurdprio.c| 2 +-
 hurd/lookup-retry.c| 2 +-
 hurd/xattr.c   | 4 ++--
 sysdeps/mach/hurd/getcwd.c | 2 +-
 sysdeps/mach/hurd/readlinkat.c | 6 --
 sysdeps/mach/hurd/sendfile64.c | 2 +-
 7 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/hurd/hurdioctl.c b/hurd/hurdioctl.c
index e9a6a7ea..59e6a5c1 100644
--- a/hurd/hurdioctl.c
+++ b/hurd/hurdioctl.c
@@ -311,7 +311,7 @@ static int
 siocgifconf (int fd, int request, struct ifconf *ifc)
 {
   error_t err;
-  size_t data_len = ifc->ifc_len;
+  mach_msg_type_number_t data_len = ifc->ifc_len;
   char *data = ifc->ifc_buf;
 
   if (data_len <= 0)
diff --git a/hurd/hurdprio.c b/hurd/hurdprio.c
index dbeb272b..954d3987 100644
--- a/hurd/hurdprio.c
+++ b/hurd/hurdprio.c
@@ -58,7 +58,7 @@ _hurd_priority_which_map (enum __priority_which which, int 
who,
  int *oldpi = pi;
  mach_msg_type_number_t oldpisize = pisize;
  char *tw = 0;
- size_t twsz = 0;
+ mach_msg_type_number_t twsz = 0;
  err = __USEPORT (PROC, __proc_getprocinfo (port, pids[i],
 &pi_flags,
 &pi, &pisize,
diff --git a/hurd/lookup-retry.c b/hurd/lookup-retry.c
index 0d344026..8850c4fd 100644
--- a/hurd/lookup-retry.c
+++ b/hurd/lookup-retry.c
@@ -162,7 +162,7 @@ __hurd_file_name_lookup_retry (error_t (*use_init_port)
{
  char buf[1024];
  char *trans = buf;
- size_t translen = sizeof buf;
+ mach_msg_type_number_t translen = sizeof buf;
  err = __file_get_translator (*result,
   &trans, &translen);
  if (!err
diff --git a/hurd/xattr.c b/hurd/xattr.c
index f98e548c..48914bcf 100644
--- a/hurd/xattr.c
+++ b/hurd/xattr.c
@@ -60,7 +60,7 @@ _hurd_xattr_get (io_t port, const char *name, void *value, 
size_t *size)
   if (!strcmp (name, "translator"))
 {
   char *buf = value;
-  size_t bufsz = value ? *size : 0;
+  mach_msg_type_number_t bufsz = value ? *size : 0;
   error_t err = __file_get_translator (port, &buf, &bufsz);
   if (err)
return err;
@@ -144,7 +144,7 @@ _hurd_xattr_set (io_t port, const char *name, const void 
*value, size_t size,
{
  /* Must make sure it's already there.  */
  char *buf = NULL;
- size_t bufsz = 0;
+ mach_msg_type_number_t bufsz = 0;
  error_t err = __file_get_translator (port, &buf, &bufsz);
  if (err)
return err;
diff --git a/sysdeps/mach/hurd/getcwd.c b/sysdeps/mach/hurd/getcwd.c
index fc5e78e7..f24b35b3 100644
--- a/sysdeps/mach/hurd/getcwd.c
+++ b/sysdeps/mach/hurd/getcwd.c
@@ -117,7 +117,7 @@ __hurd_canonicalize_directory_name_internal (file_t thisdir,
   int mount_point;
   file_t newp;
   char *dirdata;
-  size_t dirdatasize;
+  mach_msg_type_number_t dirdatasize;
   int direntry, nentries;
 
 
diff --git a/sysdeps/mach/hurd/readlinkat.c b/sysdeps/mach/hurd/readlinkat.c
index dabdbb37..5bb8b753 100644
--- a/sysdeps/mach/hurd/readlinkat.c
+++ b/sysdeps/mach/hurd/readlinkat.c
@@ -35,6 +35,7 @@ __readlinkat (int fd, const char *file_name, char *buf, 
size_t len)
   char retryname[1024];
   file_t file;
   char *rbuf = buf;
+  mach_msg_type_number_t nread = len;
 
   file_stat = __file_name_lookup_at (fd, 0, file_name, O_NOLINK, 0);
   if (file_stat == MACH_PORT_NULL)
@@ -59,15 +60,16 @@ __readlinkat (int fd, const char *file_name, char *buf, 
size_t len)
   goto out;
 }
 
-  err = __io_read (file, &rbuf, &len, 0, len);
+  err = __io_read (file, &rbuf, &nread, 0, len);
   __mach_port_deallocate (__mach_task_self (), file);
   if (err)
 goto out;
 
+  len = nread;
   if (rbuf != buf)
 {
   memcpy (buf, rbuf, len);
-  __vm_deallocate (__mach_task_self (), (vm_address_t) rbuf, len);
+  __vm_deallocate (__mach_task_self (), (vm_address_t) rbuf, nread);
 }
 
 
diff --git a/sysdeps/mach/hurd/sendfile64.c b/sysdeps/mach/hurd/sendfile64.c
index 658b0282..e36b9790 100644
--- a/sysdeps/mach/hurd/sendfile64.c
+++ b/sysdeps/mach/hurd/sendfile64.c
@@ -35,7 +35,7 @@ __sendfile64 (int out_fd, int in_fd, off64_t *offset, size_t 
count)
  which might blow if it's huge or address space is real tight.  */
 
   char *data = 0;
-  size_t datalen = 0;
+  mach_msg_type_number_t datalen = 0;
   error_t err = HURD_DPORT_USE (in_fd,
__io_rea

[RFC PATCH hurd 6/12] hurd: Fix modes_t and speeds_t types on 64-bit

2023-02-12 Thread Sergey Bugaev

---
 hurd/tioctl.defs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hurd/tioctl.defs b/hurd/tioctl.defs
index 905a4a38..a04f5ae4 100644
--- a/hurd/tioctl.defs
+++ b/hurd/tioctl.defs
@@ -34,9 +34,9 @@ import ; /* XXX */
 
 /* These are the pieces of a struct termios as specified by the
definition of _IOT_termios in . */
-type modes_t = array[4] of int;
+type modes_t = array[4] of long;
 type ccs_t = array[20] of char;
-type speeds_t = array[2] of int;
+type speeds_t = array[2] of long;
 
 /* This is the arg for a struct winsize as specified by the
definition of _IOT_winsize in . */
-- 
2.39.1

Re: [PATCH mig] Make MIG work for pure 64 bit kernel and userland.

2023-02-12 Thread Luca


On 12/02/23 07:15, Flavio Cruz wrote:

Made changes to MIG so that generated stubs can be compiled when using
a 64 bit userland. The most important change here is to ensure
that data is always 8-byte aligned. Not doing that will lead to
undefined behavior since on 64 bits many structures can contain 64 bit
scalars which require 8-byte alignment. Even if x86 is usually fine with
unaligned accesses, compilers are taking advantage of this UB which may
break us in the future
(http://pzemtsov.github.io/2016/11/06/bug-story-alignment-on-x86.html)

The downside is that messages get significantly larger since we have
to pad the initial mach_msg_type_t so that the data item can be 8-byte
aligned. In the future, I want to make MIG layout the data in a way
that non-complex data can be packed together (essentially, how XNU messages
work) and to force userland to create larger structures so that we don't
have to do any structure resizing while in the kernel. For now, this unblocks
future work on 64 bits and is quite simple to understand.

For the 64 bit / 32 bit configuration, we can pass --enable-user32 when
targeting x86_64, and MIG will generate stubs as before. I am unsure
whether there exists undefined behavior in that particular configuration
since most of the structures used by the kernel do not seem to use 8
byte scalars so potentially we should be OK there.


As mentioned in the article, using memcpy() to a correctly aligned 
pointer seems to avoid the possible alignment issues. Since in the 
kernel we need to use copyin/copyout anyway, wouldn't this be enough?


As for the mig stubs, if I understand correctly the issue stems from the 
fact that they use a struct to create the message fields in memory, so 
the binary format of a message needs to have the same alignment 
requirements. Would using memcpy here avoid the alignment issue?
I think it would even simplify the stub code, e.g. for array types, and 
I'm not sure it would decrease performance.
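
A minimal sketch of the memcpy approach raised above, assuming a plain byte buffer stands in for the message body. The helpers load_u64 and store_u64 are hypothetical names for illustration, not MIG or gnumach functions:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 64-bit value from an arbitrary (possibly misaligned) offset in a
   message buffer.  memcpy places no alignment requirement on its source,
   so no misaligned uint64_t pointer is ever dereferenced.  */
static uint64_t
load_u64 (const unsigned char *msg, size_t offset)
{
  uint64_t value;
  memcpy (&value, msg + offset, sizeof value);
  return value;
}

/* Store counterpart: write a 64-bit value at an arbitrary offset.  */
static void
store_u64 (unsigned char *msg, size_t offset, uint64_t value)
{
  memcpy (msg + offset, &value, sizeof value);
}
```

Compilers typically turn such fixed-size memcpy calls into plain loads and stores where the target allows it, which is why the performance cost is expected to be small; that is a general observation, not a measurement on MIG stubs.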



> As a follow up, we will need to change gnumach to iterate over these
> messages correctly (to follow the 8-byte alignment rules).


I see that currently even the 64-bit version is compiled with
-mno-3dnow -mno-mmx -mno-sse -mno-sse2; do you mean to also remove these
flags?




> diff --git a/utils.h b/utils.h
> index 4be4f4c..477ea74 100644
> --- a/utils.h
> +++ b/utils.h
> @@ -39,6 +39,8 @@
> ({ __typeof__ (a) _a = (a); \
> __typeof__ (b) _b = (b); \
>   _a > _b ? _b : _a; })
> +#define ALIGN(x, alignment) \
> +((x) + (alignment)-1) & ~((alignment)-1)


Maybe this could be replaced by the mach_msg_align() macro from
mach/message.h. As far as I understand, they serve a similar purpose, and
in the future it may already account for the USER32 setting.
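
For reference, a sketch of a round-up macro with its whole expansion parenthesized. ALIGN_UP is a hypothetical name used only here; it assumes the alignment is a power of two and is not the mach_msg_align() definition:

```c
#include <assert.h>
#include <stddef.h>

/* Round x up to the next multiple of alignment (a power of two).
   The outer parentheses make the macro safe inside larger expressions.  */
#define ALIGN_UP(x, alignment) \
  (((x) + (alignment) - 1) & ~((size_t) (alignment) - 1))
```

The ALIGN macro as quoted in the patch has no outer parentheses, so an expression such as 2 * ALIGN(5, 8) would parse as (2 * 12) & ~7 and yield 24, whereas 2 * ALIGN_UP(5, 8) yields 16.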



Luca

Re: [PATCH mig] Make MIG work for pure 64 bit kernel and userland.

2023-02-12 Thread Samuel Thibault

Hello,

Flavio Cruz, on Sun, 12 Feb 2023 01:15:05 -0500, wrote:
> diff --git a/configure.ac b/configure.ac
> index e4645bd..8d52d5f 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -55,6 +55,13 @@ AC_SUBST([MIGCOM])
>  [MIG=`echo mig | sed "$program_transform_name"`]
>  AC_SUBST([MIG])
>  
> +AC_ARG_ENABLE(user32,
> +  AS_HELP_STRING([--enable-user32],
> + [Build mig for 64 bit kernel that works with 32 
> bit userland]))
> +AS_IF([test "x$enable_user32" = "xyes"], [
> +   AC_DEFINE([USER32], [1], ["Build mig for 64 bit kernel that works 
> with 32 bit userland"])
> +])
> +
>  AC_OUTPUT([Makefile mig tests/Makefile \
> tests/good/Makefile \
> tests/generate-only/Makefile \

> diff --git a/parser.y b/parser.y
> index 03c5ec8..1ebeae5 100644
> --- a/parser.y
> +++ b/parser.y
> @@ -212,6 +212,16 @@ Subsystem:   SubsystemStart 
> SubsystemMods
>  IsKernelUser ? ", KernelUser" : "",
>  IsKernelServer ? ", KernelServer" : "");
>  }
> +if (IsKernelUser || IsKernelServer) {
> +port_size = vm_offset_size;
> +port_size_in_bits = vm_offset_size_in_bits;
> +#ifdef USER32
> +/* Since we are using MIG to generate stubs for a 64 bit kernel that 
> operates
> +   with a 32 bit userland, then we need to consider that `word_size` 
> is 4 byte aligned.
> + */
> +word_size = sizeof(uint32_t);
> +#endif
> +}
>  init_type();
>  }
>   ;

Mmm, I thought Luca's work was already handling the message translation,
so that the stub parts of the kernel only see 64-bit messages, already
with proper alignment etc.?

At least, I believe we'd rather aim for that, to have only a 64bit mig
and a 32bit mig, and not need a separate 64/32bit mig.

Samuel

Re: [PATCH mig] Make MIG work for pure 64 bit kernel and userland.

2023-02-12 Thread Samuel Thibault

Thanks for your work on the 64bit RPC part, that's tricky :)

Flavio Cruz, on Sun, 12 Feb 2023 01:15:05 -0500, wrote:
> diff --git a/cpu.sym b/cpu.sym
> index 6bcb878..7858b47 100644
> --- a/cpu.sym
> +++ b/cpu.sym
> @@ -95,8 +95,8 @@ expr MACH_MSG_TYPE_POLYMORPHIC
>  
>  
>  /* Types used in interfaces */
> -expr sizeof(integer_t)   word_size
> -expr (sizeof(integer_t)*8)   word_size_in_bits
> +/* The minimum alignment required for this architecture */
> +expr sizeof(uintptr_t)   desired_machine_alignment

I'm really not at ease with such general "architecture-required
alignment". Alignment always depends on the kind of data you are
manipulating. If you throw a __float128 field in your structure, it'll
have to be 16-byte-aligned.

I'd tend to think that we probably want to follow C on that: each
type has its own alignment constraint that we can encode in mig,
and the alignment constraint of a structure is the max of the alignment
requirements of its elements.
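
The C rule referred to above can be sketched as follows. The struct and function names are hypothetical, and the equality holds on mainstream ABIs rather than being spelled out in these exact words by the C standard:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two sample structures with different strictest members.  */
struct demo_small { char c; uint16_t h; };
struct demo_big   { char c; uint64_t q; uint32_t w; };

/* Alignment of demo_big computed by hand as the maximum over its members.  */
static size_t
demo_big_max_member_align (void)
{
  size_t a = _Alignof (char);
  if (_Alignof (uint64_t) > a)
    a = _Alignof (uint64_t);
  if (_Alignof (uint32_t) > a)
    a = _Alignof (uint32_t);
  return a;
}
```

A per-type itAlignment field in mig would play the role of _Alignof here, with the structure's value derived from its elements rather than from one architecture-wide constant.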

> diff --git a/type.c b/type.c
> index b104c66..05ee201 100644
> --- a/type.c
> +++ b/type.c
> @@ -35,18 +35,6 @@
>  #include "cpu.h"
>  #include "utils.h"
>  
> -#if word_size_in_bits == 32
> -#define  word_size_name MACH_MSG_TYPE_INTEGER_32
> -#define  word_size_name_string "MACH_MSG_TYPE_INTEGER_32"
> -#else
> -#if word_size_in_bits == 64
> -#define  word_size_name MACH_MSG_TYPE_INTEGER_64
> -#define  word_size_name_string "MACH_MSG_TYPE_INTEGER_64"
> -#else
> -#error Unsupported word size!
> -#endif
> -#endif

I agree on dropping the "word size" term that is ambiguous on a 64bit
architecture, where you can still want to consider 32bit as a word.

I'd however say to introduce int_name and int_name_string, so that:

> -it->itInName = word_size_name;
> -it->itInNameStr = word_size_name_string;
> -it->itOutName = word_size_name;
> -it->itOutNameStr = word_size_name_string;
> -it->itSize = word_size_in_bits;
> +it->itInName = MACH_MSG_TYPE_INTEGER_32;
> +it->itInNameStr = "MACH_MSG_TYPE_INTEGER_32";
> +it->itOutName = MACH_MSG_TYPE_INTEGER_32;
> +it->itOutNameStr = "MACH_MSG_TYPE_INTEGER_32";
> +it->itSize = sizeof_int * 8;
> +it->itAlignment = sizeof_int;

These become coherent, and less surprising to change if we ever need to.

> @@ -873,6 +865,7 @@ itMakeDeallocType(void)
>  it->itOutName = MACH_MSG_TYPE_BOOLEAN;
>  it->itOutNameStr = "MACH_MSG_TYPE_BOOLEAN";
>  it->itSize = 32;
> +it->itAlignment = sizeof_int;

I'd say also use sizeof_int * 8 for itSize.

> @@ -909,7 +903,8 @@ init_type(void)
>  itRetCodeType->itInNameStr = "MACH_MSG_TYPE_INTEGER_32";
>  itRetCodeType->itOutName = MACH_MSG_TYPE_INTEGER_32;
>  itRetCodeType->itOutNameStr = "MACH_MSG_TYPE_INTEGER_32";
> -itRetCodeType->itSize = 32;
> +itRetCodeType->itSize = sizeof_int * 8;
> +itRetCodeType->itAlignment = sizeof_int;

Also here, use int_name and int_name_string.

> @@ -919,7 +914,8 @@ init_type(void)
>  itDummyType->itInNameStr = "MACH_MSG_TYPE_UNSTRUCTURED";
>  itDummyType->itOutName = MACH_MSG_TYPE_UNSTRUCTURED;
>  itDummyType->itOutNameStr = "MACH_MSG_TYPE_UNSTRUCTURED";
> -itDummyType->itSize = word_size_in_bits;
> +itDummyType->itSize = word_size * 8;
> +itDummyType->itAlignment = word_size;

So, now the question arises: what do we want to permit in an
unstructured type? If we're fine with not including __float128, we can
indeed go with alignof(intptr_t). But I'd really name it accordingly (e.g.
max_alignof), rather than word_size, which does not convey much meaning.

> @@ -1041,17 +1040,15 @@ itCheckIntType(identifier_t name, const ipc_type_t 
> *it)
>   error("argument %s isn't a proper integer", name);
>  }
>  void
> -itCheckNaturalType(name, it)
> -identifier_t name;
> -ipc_type_t *it;
> +itCheckNaturalType(identifier_t name, ipc_type_t *it)
>  {
> -if ((it->itInName != word_size_name) ||
> - (it->itOutName != word_size_name) ||
> +if ((it->itInName != MACH_MSG_TYPE_INTEGER_32) ||
> + (it->itOutName != MACH_MSG_TYPE_INTEGER_32) ||

Also here, use int_name for consistency with sizeof_int:

>   (it->itNumber != 1) ||
> - (it->itSize != word_size_in_bits) ||
> + (it->itSize != sizeof_int * 8) ||
>   !it->itInLine ||
>   it->itDeallocate != d_NO ||
>   !it->itStruct ||
>   it->itVarArray)
> - error("argument %s should have been a %s", name, word_size_name_string);
> + error("argument %s should have been a MACH_MSG_TYPE_INTEGER_32", name);
>  }


> diff --git a/utils.c b/utils.c
> index 2fb8c2e..e8aec9a 100644
> --- a/utils.c
> +++ b/utils.c
> @@ -304,6 +304,10 @@ WriteFieldDeclPrim(FILE *file, const argument_t *arg,
>  
>  fprintf(file, "\t\tmach_msg_type_%st %s;\n",
>   arg->argLongForm ? "long_" : "", arg->argTTName);
> +if (!arg->argLongForm && word_size > sizeof_mach_msg_type_t) {
> +/* Pad mach_msg_type_t in c

Re: [RFC PATCH glibc 1/12] hurd: Refactor readlinkat()

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Sergey Bugaev via Libc-alpha, on Sun, 12 Feb 2023 14:10:32 +0300, wrote:
> Make the code flow more linear using early returns where possible. This
> makes it so much easier to reason about what runs on error / successful
> code paths.
> 
> Signed-off-by: Sergey Bugaev 
> ---
>  sysdeps/mach/hurd/readlinkat.c | 55 --
>  1 file changed, 32 insertions(+), 23 deletions(-)
> 
> diff --git a/sysdeps/mach/hurd/readlinkat.c b/sysdeps/mach/hurd/readlinkat.c
> index 1efb09ca..dabdbb37 100644
> --- a/sysdeps/mach/hurd/readlinkat.c
> +++ b/sysdeps/mach/hurd/readlinkat.c
> @@ -31,38 +31,47 @@ __readlinkat (int fd, const char *file_name, char *buf, 
> size_t len)
>error_t err;
>file_t file_stat;
>struct stat64 st;
> +  enum retry_type doretry;
> +  char retryname[1024];
> +  file_t file;
> +  char *rbuf = buf;
>  
>file_stat = __file_name_lookup_at (fd, 0, file_name, O_NOLINK, 0);
>if (file_stat == MACH_PORT_NULL)
>  return -1;
>  
>err = __io_stat (file_stat, &st);
> -  if (! err)
> -if (S_ISLNK (st.st_mode))
> -  {
> - enum retry_type doretry;
> - char retryname[1024];
> - file_t file;
> - char *rbuf = buf;
> +  if (err)
> +goto out;
> +  if (!S_ISLNK (st.st_mode))
> +{
> +  err = EINVAL;
> +  goto out;
> +}
>  
> - err = __dir_lookup (file_stat, "", O_READ | O_NOLINK, 0, &doretry, 
> retryname, &file);
> - if (! err && (doretry != FS_RETRY_NORMAL || retryname[0] != '\0'))
> -   err = EGRATUITOUS;
> - if (! err)
> -   {
> - err = __io_read (file, &rbuf, &len, 0, len);
> - if (!err && rbuf != buf)
> -   {
> - memcpy (buf, rbuf, len);
> - __vm_deallocate (__mach_task_self (), (vm_address_t)rbuf, len);
> -   }
> +  err = __dir_lookup (file_stat, "", O_READ | O_NOLINK,
> +  0, &doretry, retryname, &file);
> +  if (err)
> +goto out;
> +  if (doretry != FS_RETRY_NORMAL || retryname[0] != '\0')
> +{
> +  err = EGRATUITOUS;
> +  goto out;
> +}
> +
> +  err = __io_read (file, &rbuf, &len, 0, len);
> +  __mach_port_deallocate (__mach_task_self (), file);
> +  if (err)
> +goto out;
> +
> +  if (rbuf != buf)
> +{
> +  memcpy (buf, rbuf, len);
> +  __vm_deallocate (__mach_task_self (), (vm_address_t) rbuf, len);
> +}
>  
> - __mach_port_deallocate (__mach_task_self (), file);
> -   }
> -  }
> -else
> -  err = EINVAL;
>  
> + out:
>__mach_port_deallocate (__mach_task_self (), file_stat);
>  
>return err ? __hurd_fail (err) : len;
> -- 
> 2.39.1
> 

-- 
Samuel
---
For an independent, transparent, and rigorous evaluation!
I support Inria's Evaluation Committee (Commission d'Évaluation de l'Inria).
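
The control-flow shape adopted by the refactoring quoted above (early exits that all funnel through one cleanup label) can be sketched in isolation. Everything here is a hypothetical stand-in: acquire/release model the file_stat port, and the two flags simulate the outcome of the intermediate steps:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical resource that must be released on every path.  */
static int live_handles;
static void acquire (void) { ++live_handles; }
static void release (void) { --live_handles; }

/* Returns 0 on success or an errno-style code.  */
static int
read_link_demo (int is_link, int lookup_ok)
{
  int err = 0;

  acquire ();

  if (!is_link)
    {
      err = EINVAL;
      goto out;
    }
  if (!lookup_ok)
    {
      err = EIO;
      goto out;
    }

  /* The success path continues here without extra nesting.  */

 out:
  release ();
  return err;
}
```

Each failure is handled where it is detected, and the single label guarantees the cleanup runs exactly once regardless of which step failed.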

Re: [RFC PATCH glibc 2/12] hurd: Use mach_msg_type_number_t where appropriate

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Sergey Bugaev via Libc-alpha, on Sun, 12 Feb 2023 14:10:33 +0300, wrote:
> It has been decided that on x86_64, mach_msg_type_number_t stays 32-bit.
> Therefore, it's not possible to use mach_msg_type_number_t
> interchangeably with size_t, in particular this breaks when a pointer to
> a variable is passed to a MIG routine.
> 
> Signed-off-by: Sergey Bugaev 
> ---
>  hurd/hurdioctl.c   | 2 +-
>  hurd/hurdprio.c| 2 +-
>  hurd/lookup-retry.c| 2 +-
>  hurd/xattr.c   | 4 ++--
>  sysdeps/mach/hurd/getcwd.c | 2 +-
>  sysdeps/mach/hurd/readlinkat.c | 6 --
>  sysdeps/mach/hurd/sendfile64.c | 2 +-
>  7 files changed, 11 insertions(+), 9 deletions(-)
> 
> diff --git a/hurd/hurdioctl.c b/hurd/hurdioctl.c
> index e9a6a7ea..59e6a5c1 100644
> --- a/hurd/hurdioctl.c
> +++ b/hurd/hurdioctl.c
> @@ -311,7 +311,7 @@ static int
>  siocgifconf (int fd, int request, struct ifconf *ifc)
>  {
>error_t err;
> -  size_t data_len = ifc->ifc_len;
> +  mach_msg_type_number_t data_len = ifc->ifc_len;
>char *data = ifc->ifc_buf;
>  
>if (data_len <= 0)
> diff --git a/hurd/hurdprio.c b/hurd/hurdprio.c
> index dbeb272b..954d3987 100644
> --- a/hurd/hurdprio.c
> +++ b/hurd/hurdprio.c
> @@ -58,7 +58,7 @@ _hurd_priority_which_map (enum __priority_which which, int 
> who,
> int *oldpi = pi;
> mach_msg_type_number_t oldpisize = pisize;
> char *tw = 0;
> -   size_t twsz = 0;
> +   mach_msg_type_number_t twsz = 0;
> err = __USEPORT (PROC, __proc_getprocinfo (port, pids[i],
>&pi_flags,
>&pi, &pisize,
> diff --git a/hurd/lookup-retry.c b/hurd/lookup-retry.c
> index 0d344026..8850c4fd 100644
> --- a/hurd/lookup-retry.c
> +++ b/hurd/lookup-retry.c
> @@ -162,7 +162,7 @@ __hurd_file_name_lookup_retry (error_t (*use_init_port)
>   {
> char buf[1024];
> char *trans = buf;
> -   size_t translen = sizeof buf;
> +   mach_msg_type_number_t translen = sizeof buf;
> err = __file_get_translator (*result,
>  &trans, &translen);
> if (!err
> diff --git a/hurd/xattr.c b/hurd/xattr.c
> index f98e548c..48914bcf 100644
> --- a/hurd/xattr.c
> +++ b/hurd/xattr.c
> @@ -60,7 +60,7 @@ _hurd_xattr_get (io_t port, const char *name, void *value, 
> size_t *size)
>if (!strcmp (name, "translator"))
>  {
>char *buf = value;
> -  size_t bufsz = value ? *size : 0;
> +  mach_msg_type_number_t bufsz = value ? *size : 0;
>error_t err = __file_get_translator (port, &buf, &bufsz);
>if (err)
>   return err;
> @@ -144,7 +144,7 @@ _hurd_xattr_set (io_t port, const char *name, const void 
> *value, size_t size,
>   {
> /* Must make sure it's already there.  */
> char *buf = NULL;
> -   size_t bufsz = 0;
> +   mach_msg_type_number_t bufsz = 0;
> error_t err = __file_get_translator (port, &buf, &bufsz);
> if (err)
>   return err;
> diff --git a/sysdeps/mach/hurd/getcwd.c b/sysdeps/mach/hurd/getcwd.c
> index fc5e78e7..f24b35b3 100644
> --- a/sysdeps/mach/hurd/getcwd.c
> +++ b/sysdeps/mach/hurd/getcwd.c
> @@ -117,7 +117,7 @@ __hurd_canonicalize_directory_name_internal (file_t 
> thisdir,
>int mount_point;
>file_t newp;
>char *dirdata;
> -  size_t dirdatasize;
> +  mach_msg_type_number_t dirdatasize;
>int direntry, nentries;
>  
>  
> diff --git a/sysdeps/mach/hurd/readlinkat.c b/sysdeps/mach/hurd/readlinkat.c
> index dabdbb37..5bb8b753 100644
> --- a/sysdeps/mach/hurd/readlinkat.c
> +++ b/sysdeps/mach/hurd/readlinkat.c
> @@ -35,6 +35,7 @@ __readlinkat (int fd, const char *file_name, char *buf, 
> size_t len)
>char retryname[1024];
>file_t file;
>char *rbuf = buf;
> +  mach_msg_type_number_t nread = len;
>  
>file_stat = __file_name_lookup_at (fd, 0, file_name, O_NOLINK, 0);
>if (file_stat == MACH_PORT_NULL)
> @@ -59,15 +60,16 @@ __readlinkat (int fd, const char *file_name, char *buf, 
> size_t len)
>goto out;
>  }
>  
> -  err = __io_read (file, &rbuf, &len, 0, len);
> +  err = __io_read (file, &rbuf, &nread, 0, len);
>__mach_port_deallocate (__mach_task_self (), file);
>if (err)
>  goto out;
>  
> +  len = nread;
>if (rbuf != buf)
>  {
>memcpy (buf, rbuf, len);
> -  __vm_deallocate (__mach_task_self (), (vm_address_t) rbuf, len);
> +  __vm_deallocate (__mach_task_self (), (vm_address_t) rbuf, nread);
>  }
>  
>  
> diff --git a/sysdeps/mach/hurd/sendfile64.c b/sysdeps/mach/hurd/sendfile64.c
> index 658b0282..e36b9790 100644
> --- a/sysdeps/mach/hurd/sendfile64.c
> +++ b/sysdeps/mach/hurd/sendfile64.c
> @@ -35,7 +35
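
The hazard described in the commit message above can be sketched without any Mach headers. demo_msg_type_number_t is a stand-in for mach_msg_type_number_t (assumed 32-bit, as the message states), and demo_read is a hypothetical routine with an in/out count parameter:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for mach_msg_type_number_t, which stays 32-bit on x86_64.  */
typedef uint32_t demo_msg_type_number_t;

/* Hypothetical MIG-style routine: fills at most 3 bytes and reports the
   number of bytes produced through *count.  */
static int
demo_read (char *buf, demo_msg_type_number_t *count)
{
  demo_msg_type_number_t n = *count < 3 ? *count : 3;
  for (demo_msg_type_number_t i = 0; i < n; i++)
    buf[i] = 'x';
  *count = n;
  return 0;
}

/* The caller keeps its length as size_t.  Passing a casted &len would
   update only part of the 8-byte variable on a 64-bit target, so a
   correctly typed temporary is used and copied back afterwards.  */
static size_t
read_into (char *buf, size_t len)
{
  demo_msg_type_number_t nread = len;
  if (demo_read (buf, &nread) != 0)
    return 0;
  return nread;
}
```

This mirrors the nread temporary introduced in the readlinkat.c hunk of the patch.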

Re: [RFC PATCH glibc 3/12] mach, hurd: Cast through uintptr_t

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Sergey Bugaev, on Sun, 12 Feb 2023 14:10:34 +0300, wrote:
> When casting between a pointer and an integer of a different size, GCC
> emits a warning (which is escalated to a build failure by -Werror).
> Indeed, if what you start with is a pointer, which you then cast to a
> shorter integer and then back again, you're going to cut off some bits
> of the pointer.
> 
> But if you start with an integer (such as mach_port_t), then cast it to
> a longer pointer (void *), and then back to a shorter integer, you are
> fine. To keep GCC happy, cast through an intermediary uintptr_t, which
> is always the same size as a pointer.
> 
> Signed-off-by: Sergey Bugaev 
> ---
>  htl/cthreads-compat.c |  5 +++--
>  hurd/fopenport.c  | 15 ++-
>  hurd/hurd/port.h  |  2 +-
>  hurd/port-cleanup.c   |  3 ++-
>  hurd/vpprintf.c   |  6 --
>  mach/devstream.c  |  9 +
>  6 files changed, 25 insertions(+), 15 deletions(-)
> 
> diff --git a/htl/cthreads-compat.c b/htl/cthreads-compat.c
> index 55043a8b..2a0a95b6 100644
> --- a/htl/cthreads-compat.c
> +++ b/htl/cthreads-compat.c
> @@ -25,8 +25,9 @@ void
>  __cthread_detach (__cthread_t thread)
>  {
>int err;
> +  pthread_t pthread = (pthread_t) (uintptr_t) thread;
>  
> -  err = __pthread_detach ((pthread_t) thread);
> +  err = __pthread_detach (pthread);
>assert_perror (err);
>  }
>  weak_alias (__cthread_detach, cthread_detach)
> @@ -40,7 +41,7 @@ __cthread_fork (__cthread_fn_t func, void *arg)
>err = __pthread_create (&thread, NULL, func, arg);
>assert_perror (err);
>  
> -  return (__cthread_t) thread;
> +  return (__cthread_t) (uintptr_t) thread;
>  }
>  weak_alias (__cthread_fork, cthread_fork)
>  
> diff --git a/hurd/fopenport.c b/hurd/fopenport.c
> index 60afb445..be6aa30c 100644
> --- a/hurd/fopenport.c
> +++ b/hurd/fopenport.c
> @@ -28,9 +28,10 @@ readio (void *cookie, char *buf, size_t n)
>mach_msg_type_number_t nread;
>error_t err;
>char *bufp = buf;
> +  io_t io = (io_t) (uintptr_t) cookie;
>  
>nread = n;
> -  if (err = __io_read ((io_t) cookie, &bufp, &nread, -1, n))
> +  if (err = __io_read (io, &bufp, &nread, -1, n))
>  return __hurd_fail (err);
>  
>if (bufp != buf)
> @@ -50,8 +51,9 @@ writeio (void *cookie, const char *buf, size_t n)
>  {
>vm_size_t wrote;
>error_t err;
> +  io_t io = (io_t) (uintptr_t) cookie;
>  
> -  if (err = __io_write ((io_t) cookie, buf, n, -1, &wrote))
> +  if (err = __io_write (io, buf, n, -1, &wrote))
>  return __hurd_fail (err);
>  
>return wrote;
> @@ -65,7 +67,8 @@ seekio (void *cookie,
>   off64_t *pos,
>   int whence)
>  {
> -  error_t err = __io_seek ((file_t) cookie, *pos, whence, pos);
> +  io_t io = (io_t) (uintptr_t) cookie;
> +  error_t err = __io_seek (io, *pos, whence, pos);
>return err ? __hurd_fail (err) : 0;
>  }
>  
> @@ -74,8 +77,9 @@ seekio (void *cookie,
>  static int
>  closeio (void *cookie)
>  {
> +  io_t io = (io_t) (uintptr_t) cookie;
>error_t error = __mach_port_deallocate (__mach_task_self (),
> -   (mach_port_t) cookie);
> +   io);
>if (error)
>  return __hurd_fail (error);
>return 0;
> @@ -127,6 +131,7 @@ __fopenport (mach_port_t port, const char *mode)
>return NULL;
>  }
>  
> -  return fopencookie ((void *) port, mode, funcsio);
> +  return fopencookie ((void *) (uintptr_t) port,
> +  mode, funcsio);
>  }
>  weak_alias (__fopenport, fopenport)
> diff --git a/hurd/hurd/port.h b/hurd/hurd/port.h
> index fdba8db5..1e473978 100644
> --- a/hurd/hurd/port.h
> +++ b/hurd/hurd/port.h
> @@ -96,7 +96,7 @@ _hurd_port_locked_get (struct hurd_port *port,
>if (result != MACH_PORT_NULL)
>  {
>link->cleanup = &_hurd_port_cleanup;
> -  link->cleanup_data = (void *) result;
> +  link->cleanup_data = (void *) (uintptr_t) result;
>_hurd_userlink_link (&port->users, link);
>  }
>__spin_unlock (&port->lock);
> diff --git a/hurd/port-cleanup.c b/hurd/port-cleanup.c
> index 08ab3d47..ad43b830 100644
> --- a/hurd/port-cleanup.c
> +++ b/hurd/port-cleanup.c
> @@ -26,7 +26,8 @@
>  void
>  _hurd_port_cleanup (void *cleanup_data, jmp_buf env, int val)
>  {
> -  __mach_port_deallocate (__mach_task_self (), (mach_port_t) cleanup_data);
> +  mach_port_t port = (mach_port_t) (uintptr_t) cleanup_data;
> +  __mach_port_deallocate (__mach_task_self (), port);
>  }
>  
>  /* We were cancelled while using a port, and called from the cleanup 
> unwinding.
> diff --git a/hurd/vpprintf.c b/hurd/vpprintf.c
> index fe734946..010f04ca 100644
> --- a/hurd/vpprintf.c
> +++ b/hurd/vpprintf.c
> @@ -25,8 +25,9 @@
>  static ssize_t
>  do_write (void *cookie,  const char *buf, size_t n)
>  {
> +  io_t io = (io_t) (uintptr_t) cookie;
>vm_size_t amount = n;
> -  error_t error = __io_write ((io_t) cookie, buf, n, -1, &amount);
> +  error_t error = __io_write (io, buf, n,
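
The round trip described in the commit message above can be sketched on its own. demo_port_t is a stand-in for mach_port_t, assumed here to be a 32-bit integer, and the helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for an integer handle narrower than a pointer on 64-bit.  */
typedef uint32_t demo_port_t;

/* Widen the integer to pointer size first, then convert to a pointer.  */
static void *
port_to_cookie (demo_port_t port)
{
  return (void *) (uintptr_t) port;
}

/* Convert back through uintptr_t; the narrowing is lossless because the
   value originally fit in demo_port_t.  */
static demo_port_t
cookie_to_port (void *cookie)
{
  return (demo_port_t) (uintptr_t) cookie;
}
```

The intermediate uintptr_t makes both the size change and the integer/pointer conversion explicit, so GCC has no mismatched-size cast to warn about.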

Re: [RFC PATCH glibc 4/12] hurd: Fix xattr error value

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Sergey Bugaev, on Sun, 12 Feb 2023 14:10:35 +0300, wrote:
> This does not seem like it is supposed to return negative error codes.
> 
> Signed-off-by: Sergey Bugaev 
> ---
>  hurd/xattr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hurd/xattr.c b/hurd/xattr.c
> index 48914bcf..5a0fc263 100644
> --- a/hurd/xattr.c
> +++ b/hurd/xattr.c
> @@ -68,7 +68,7 @@ _hurd_xattr_get (io_t port, const char *name, void *value, 
> size_t *size)
>   {
> if (buf != value)
>   __munmap (buf, bufsz);
> -   return -ERANGE;
> +   return ERANGE;
>   }
>if (buf != value && bufsz > 0)
>   {
> -- 
> 2.39.1
> 
> 

-- 
Samuel

Re: [RFC PATCH glibc 5/12] htl: Fix semaphore reference

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Sergey Bugaev, on Sun, 12 Feb 2023 14:10:36 +0300, wrote:
> 'sem' is the opaque 'sem_t', 'isem' is the actual 'struct new_sem'.
> 
> Signed-off-by: Sergey Bugaev 
> ---
>  sysdeps/htl/sem-timedwait.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/sysdeps/htl/sem-timedwait.c b/sysdeps/htl/sem-timedwait.c
> index 8f2df6e7..9974e9ae 100644
> --- a/sysdeps/htl/sem-timedwait.c
> +++ b/sysdeps/htl/sem-timedwait.c
> @@ -60,7 +60,7 @@ __sem_timedwait_internal (sem_t *restrict sem,
>int cancel_oldtype = LIBC_CANCEL_ASYNC();
>  
>  #if __HAVE_64B_ATOMICS
> -  uint64_t d = atomic_fetch_add_relaxed (&sem->data,
> +  uint64_t d = atomic_fetch_add_relaxed (&isem->data,
>(uint64_t) 1 << SEM_NWAITERS_SHIFT);
>  
>pthread_cleanup_push (__sem_wait_cleanup, isem);
> @@ -72,11 +72,11 @@ __sem_timedwait_internal (sem_t *restrict sem,
> /* No token, sleep.  */
> if (timeout)
>   err = __lll_abstimed_wait_intr (
> -   ((unsigned int *) &sem->data) + SEM_VALUE_OFFSET,
> +   ((unsigned int *) &isem->data) + SEM_VALUE_OFFSET,
> 0, timeout, flags, clock_id);
> else
>   err = __lll_wait_intr (
> -   ((unsigned int *) &sem->data) + SEM_VALUE_OFFSET,
> +   ((unsigned int *) &isem->data) + SEM_VALUE_OFFSET,
> 0, flags);
>  
> if (err != 0 && err != KERN_INVALID_ARGUMENT)
> @@ -92,12 +92,12 @@ __sem_timedwait_internal (sem_t *restrict sem,
>   }
>  
> /* Token changed */
> -   d = atomic_load_relaxed (&sem->data);
> +   d = atomic_load_relaxed (&isem->data);
>   }
>else
>   {
> /* Try to acquire and dequeue.  */
> -   if (atomic_compare_exchange_weak_acquire (&sem->data,
> +   if (atomic_compare_exchange_weak_acquire (&isem->data,
> &d, d - 1 - ((uint64_t) 1 << SEM_NWAITERS_SHIFT)))
>   {
> /* Success */
> -- 
> 2.39.1
> 
> 

-- 
Samuel

Re: [RFC PATCH hurd 6/12] hurd: Fix modes_t and speeds_t types on 64-bit

2023-02-12 Thread Samuel Thibault

We don't need these to be 64bits, 32bit will be completely large enough.

Which issue did you actually encounter?

Samuel

Sergey Bugaev, on Sun, 12 Feb 2023 14:10:37 +0300, wrote:
> ---
>  hurd/tioctl.defs | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hurd/tioctl.defs b/hurd/tioctl.defs
> index 905a4a38..a04f5ae4 100644
> --- a/hurd/tioctl.defs
> +++ b/hurd/tioctl.defs
> @@ -34,9 +34,9 @@ import ; /* XXX */
>  
>  /* These are the pieces of a struct termios as specified by the
> definition of _IOT_termios in . */
> -type modes_t = array[4] of int;
> +type modes_t = array[4] of long;
>  type ccs_t = array[20] of char;
> -type speeds_t = array[2] of int;
> +type speeds_t = array[2] of long;
>  
>  /* This is the arg for a struct winsize as specified by the
> definition of _IOT_winsize in . */
> -- 
> 2.39.1

Re: [RFC PATCH mig 7/12] Drop -undef -ansi from cpp flags

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun, 12 Feb 2023 14:10:38 +0300, wrote:
> Since GNU Mach commit d30481122a5d24ad6b921062f93b9172ef922fc3,
> i386/machine_types.defs defines types based on defined(__x86_64__).
> Suppressing the built-in macro definitions will now result in the wrong
> type being silently selected.
> 
> -undef was initially introduced in commit
> 78b6a7665db7b2eae367e17102821cbdca231d19 without much of an explanation.
> -ansi was introduced in commit 6940fb91859e46b2e96a331a029f2dc2a0ee51c9
> "to avoid -Di386=1 and the like".
> 
> Since glibc has been using MIG with CPP set to a custom GCC invocation
> which did *not* use either flag, it appears that everything works well
> enough even without them. On the other hand, not having __x86_64__
> defined most definitely causes issues for anything that does not set a
> custom CPP when invoking MIG (i.e., most users). Other built-in
> definitions could be used in the future in a similar way (e.g. on other
> architectures); it's really more of a coincidence that they have not
> been used until now, and things kept working with -undef.

That looks alright to me. Flavio, what do you think about it?

> ---
>  mig.in | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mig.in b/mig.in
> index 63e0269..94fd500 100644
> --- a/mig.in
> +++ b/mig.in
> @@ -38,7 +38,7 @@ migcom=${MIGDIR-$libexecdir}/${MIGCOM-@MIGCOM@}
>  # The expansion of TARGET_CC might refer to ${CC}, so make sure it is 
> defined.
>  default_cc="@CC@"
>  CC="${CC-${default_cc}}"
> -default_cpp="@TARGET_CC@ -E -x c -undef -ansi"
> +default_cpp="@TARGET_CC@ -E -x c"
>  cpp="${CPP-${default_cpp}}"
>  
>  cppflags=
> -- 
> 2.39.1
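
The failure mode described in the commit message above can be sketched with a type chosen by a built-in macro. demo_offset_t and both helpers are hypothetical and only imitate the kind of selection that i386/machine_types.defs is said to perform:

```c
#include <assert.h>
#include <stdint.h>

/* Pick a width based on a compiler-provided macro.  If the preprocessor
   runs with -undef, __x86_64__ is not predefined and the #else branch is
   taken silently, even when targeting x86_64.  */
#if defined (__x86_64__)
typedef uint64_t demo_offset_t;
#else
typedef uint32_t demo_offset_t;
#endif

/* Width of the selected type in bits.  */
static unsigned
demo_offset_bits (void)
{
  return (unsigned) (sizeof (demo_offset_t) * 8);
}

/* Whether the preprocessor saw __x86_64__ for this translation unit.  */
static int
demo_saw_x86_64 (void)
{
#if defined (__x86_64__)
  return 1;
#else
  return 0;
#endif
}
```

No diagnostic is emitted in either case, which is what makes the wrong selection easy to miss.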

Re: [RFC PATCH glibc 9/12] mach: Look for mach_i386.defs on x86_64 too

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun, 12 Feb 2023 14:10:40 +0300, wrote:
> Signed-off-by: Sergey Bugaev 
> ---
>  mach/Makefile | 3 ++-
>  sysdeps/mach/configure| 6 +++---
>  sysdeps/mach/configure.ac | 6 +++---
>  3 files changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/mach/Makefile b/mach/Makefile
> index 39358fdb..f2fdd7da 100644
> --- a/mach/Makefile
> +++ b/mach/Makefile
> @@ -64,7 +64,8 @@ CFLAGS-RPC_i386_set_ldt.o = $(no-stack-protector)
>  CFLAGS-RPC_task_get_special_port.o = $(no-stack-protector)
>  
>  # Translate GNU names for CPUs into the names used in Mach header files.
> -mach-machine = $(patsubst powerpc,ppc,$(base-machine))
> +mach-machine := $(patsubst powerpc,ppc,$(base-machine))
> +mach-machine := $(patsubst x86_64,i386,$(mach-machine))

Mmm, no, it's not actually a translation. What exactly does this fix?

To build a 64bit glibc, we'll use headers installed by a 64bit gnumach,
i.e. into /usr/include/x86_64-gnu/

I have applied the rest, thanks!

Samuel

> diff --git a/sysdeps/mach/configure b/sysdeps/mach/configure
> index 3f0a9029..8c341d59 100644
> --- a/sysdeps/mach/configure
> +++ b/sysdeps/mach/configure
> @@ -249,7 +249,7 @@ for ifc in mach mach4 gnumach \
>  clock clock_priv host_priv host_security ledger lock_set \
>  processor processor_set task task_notify thread_act vm_map \
>  memory_object memory_object_default default_pager \
> -i386/mach_i386 \
> +machine/mach_i386 \
>  ; do
>as_ac_Header=`$as_echo "ac_cv_header_mach/${ifc}.defs" | $as_tr_sh`
>  ac_fn_c_check_header_preproc "$LINENO" "mach/${ifc}.defs" "$as_ac_Header"
> @@ -440,7 +440,7 @@ if ${libc_cv_mach_i386_ioports+:} false; then :
>  else
>cat confdefs.h - <<_ACEOF >conftest.$ac_ext
>  /* end confdefs.h.  */
> -#include <mach/i386/mach_i386.defs>
> +#include <mach/machine/mach_i386.defs>
>  
>  _ACEOF
>  if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
> @@ -466,7 +466,7 @@ if ${libc_cv_mach_i386_gdt+:} false; then :
>  else
>cat confdefs.h - <<_ACEOF >conftest.$ac_ext
>  /* end confdefs.h.  */
> -#include <mach/i386/mach_i386.defs>
> +#include <mach/machine/mach_i386.defs>
>  
>  _ACEOF
>  if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
> diff --git a/sysdeps/mach/configure.ac b/sysdeps/mach/configure.ac
> index a57cb259..579c0021 100644
> --- a/sysdeps/mach/configure.ac
> +++ b/sysdeps/mach/configure.ac
> @@ -64,7 +64,7 @@ for ifc in mach mach4 gnumach \
>  clock clock_priv host_priv host_security ledger lock_set \
>  processor processor_set task task_notify thread_act vm_map \
>  memory_object memory_object_default default_pager \
> -i386/mach_i386 \
> +machine/mach_i386 \
>  ; do
>AC_CHECK_HEADER(mach/${ifc}.defs, [dnl
>mach_interface_list="$mach_interface_list $ifc"],, -)
> @@ -89,7 +89,7 @@ AC_CHECK_HEADER(machine/ndr_def.h, [dnl
>  
>  AC_CACHE_CHECK(for i386_io_perm_modify in mach_i386.defs,
>  libc_cv_mach_i386_ioports, [dnl
> -AC_EGREP_HEADER(i386_io_perm_modify, mach/i386/mach_i386.defs,
> +AC_EGREP_HEADER(i386_io_perm_modify, mach/machine/mach_i386.defs,
>   libc_cv_mach_i386_ioports=yes,
>   libc_cv_mach_i386_ioports=no)])
>  if test $libc_cv_mach_i386_ioports = yes; then
> @@ -98,7 +98,7 @@ fi
>  
>  AC_CACHE_CHECK(for i386_set_gdt in mach_i386.defs,
>  libc_cv_mach_i386_gdt, [dnl
> -AC_EGREP_HEADER(i386_set_gdt, mach/i386/mach_i386.defs,
> +AC_EGREP_HEADER(i386_set_gdt, mach/machine/mach_i386.defs,
>   libc_cv_mach_i386_gdt=yes,
>   libc_cv_mach_i386_gdt=no)])
>  if test $libc_cv_mach_i386_gdt = yes; then
> -- 
> 2.39.1
> 
> 

-- 
Samuel

Re: [RFC PATCH glibc 10/12] hurd: Set up the basic tree for x86_64-gnu

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun, 12 Feb 2023 14:10:41 +0300, wrote:
> And move pt-setup.c to the generic x86 tree.
> 
> Signed-off-by: Sergey Bugaev 
> ---
>  sysdeps/mach/hurd/Implies  | 1 +
>  sysdeps/mach/hurd/{i386 => x86}/htl/pt-setup.c | 4 ++--
>  sysdeps/mach/hurd/x86_64/Implies   | 2 ++
>  sysdeps/mach/hurd/x86_64/htl/Implies   | 3 +++
>  sysdeps/mach/x86_64/Implies| 1 +
>  5 files changed, 9 insertions(+), 2 deletions(-)
>  rename sysdeps/mach/hurd/{i386 => x86}/htl/pt-setup.c (97%)
>  create mode 100644 sysdeps/mach/hurd/x86_64/Implies
>  create mode 100644 sysdeps/mach/hurd/x86_64/htl/Implies
>  create mode 100644 sysdeps/mach/x86_64/Implies
> 
> diff --git a/sysdeps/mach/hurd/Implies b/sysdeps/mach/hurd/Implies
> index d2d5234c..e19dd1fd 100644
> --- a/sysdeps/mach/hurd/Implies
> +++ b/sysdeps/mach/hurd/Implies
> @@ -3,3 +3,4 @@
>  gnu
>  # The Hurd provides a rough superset of the functionality of 4.4 BSD.
>  unix/bsd
> +mach/hurd/htl
> diff --git a/sysdeps/mach/hurd/i386/htl/pt-setup.c 
> b/sysdeps/mach/hurd/x86/htl/pt-setup.c
> similarity index 97%
> rename from sysdeps/mach/hurd/i386/htl/pt-setup.c
> rename to sysdeps/mach/hurd/x86/htl/pt-setup.c
> index 94caed82..3abd92b2 100644
> --- a/sysdeps/mach/hurd/i386/htl/pt-setup.c
> +++ b/sysdeps/mach/hurd/x86/htl/pt-setup.c
> @@ -1,4 +1,4 @@
> -/* Setup thread stack.  Hurd/i386 version.
> +/* Setup thread stack.  Hurd/x86 version.
> Copyright (C) 2000-2023 Free Software Foundation, Inc.
> This file is part of the GNU C Library.
>  
> @@ -22,7 +22,7 @@
>  
>  #include 
>  
> -/* The stack layout used on the i386 is:
> +/* The stack layout used on the x86 is:
>  
>  -
> |  ARG|
> diff --git a/sysdeps/mach/hurd/x86_64/Implies 
> b/sysdeps/mach/hurd/x86_64/Implies
> new file mode 100644
> index ..6b5e6f47
> --- /dev/null
> +++ b/sysdeps/mach/hurd/x86_64/Implies
> @@ -0,0 +1,2 @@
> +mach/hurd/x86
> +mach/hurd/x86_64/htl
> diff --git a/sysdeps/mach/hurd/x86_64/htl/Implies 
> b/sysdeps/mach/hurd/x86_64/htl/Implies
> new file mode 100644
> index ..1ae8f16e
> --- /dev/null
> +++ b/sysdeps/mach/hurd/x86_64/htl/Implies
> @@ -0,0 +1,3 @@
> +mach/hurd/htl
> +mach/hurd/x86/htl
> +x86_64/htl

Always remember to also try the 32-bit build; I added this to your
commit:

diff --git a/sysdeps/mach/hurd/i386/htl/Implies 
b/sysdeps/mach/hurd/i386/htl/Implies
index 7a0f99d772..fe3bd983b8 100644
--- a/sysdeps/mach/hurd/i386/htl/Implies
+++ b/sysdeps/mach/hurd/i386/htl/Implies
@@ -1,2 +1,3 @@
 mach/hurd/htl
+mach/hurd/x86/htl
 i386/htl

Thanks!

> diff --git a/sysdeps/mach/x86_64/Implies b/sysdeps/mach/x86_64/Implies
> new file mode 100644
> index ..da8291f4
> --- /dev/null
> +++ b/sysdeps/mach/x86_64/Implies
> @@ -0,0 +1 @@
> +mach/x86
> -- 
> 2.39.1
> 
> 

-- 
Samuel

Re: [RFC PATCH hurd 6/12] hurd: Fix modes_t and speeds_t types on 64-bit

2023-02-12 Thread Sergey Bugaev

On Sun, Feb 12, 2023 at 6:00 PM Samuel Thibault  wrote:
> We don't need these to be 64bits, 32bit will be completely large enough.
>
> Which issue did you actually encounter?

Well, these MIG definitions need to match the ones in C/glibc. In
hurd/ioctl_types.h (in the Hurd tree) there is

typedef tcflag_t modes_t[4];
typedef speed_t speeds_t[2];

and then in bits/termios.h (in the glibc tree)

/* Type of terminal control flag masks.  */
typedef unsigned long int tcflag_t;

/* Type of control characters.  */
typedef unsigned char cc_t;

/* Type of baud rate specifiers.  */
typedef long int speed_t;

which is why I changed the MIG ones to long. We could do it the other
way around: make our own bits/termios.h and define them as fixed-size
32-bit ints.
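
To make the size mismatch concrete, here is a minimal sketch in C. All demo_* names are made up for illustration; the first array mirrors modes_t built on a 'long'-based tcflag_t, the second assumes a fixed 32-bit element on the MIG side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: modes_t built from a "long"-based tcflag_t,
   as in the generic bits/termios.h.  */
typedef unsigned long int demo_tcflag_long_t;
typedef demo_tcflag_long_t demo_modes_long_t[4];

/* Hypothetical stand-in: the same array with a fixed 32-bit element,
   which is what a 32-bit MIG definition would assume.  */
typedef uint32_t demo_tcflag_32_t;
typedef demo_tcflag_32_t demo_modes_32_t[4];

/* Size in bytes of each variant of the array.  */
size_t demo_modes_long_size (void) { return sizeof (demo_modes_long_t); }
size_t demo_modes_32_size (void) { return sizeof (demo_modes_32_t); }

/* Nonzero when both sides agree on the size of the array.  */
int demo_modes_sizes_match (void)
{
  return demo_modes_long_size () == demo_modes_32_size ();
}
```

Where 'long' is 4 bytes the two variants coincide; where it is 8 bytes the C side is twice as large as what the 32-bit definition describes.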

Sergey

Re: [RFC PATCH glibc 12/12] C11 thrd: Downgrade the default alignment of mtx_t

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun 12 Feb 2023 14:10:43 +0300, wrote:
> ..so that it can match the alignment of pthread_mutex_t on x86_64-gnu.

I'd say rather make pthread_mutex_t aligned on long int, so we can
possibly in the future put some pointers in it without breaking the ABI.

> This likely breaks many other arches (if not all of them), though.
> 
> Signed-off-by: Sergey Bugaev 
> ---
>  sysdeps/pthread/threads.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sysdeps/pthread/threads.h b/sysdeps/pthread/threads.h
> index 860d597d..207c1dee 100644
> --- a/sysdeps/pthread/threads.h
> +++ b/sysdeps/pthread/threads.h
> @@ -64,7 +64,7 @@ typedef __once_flag once_flag;
>  typedef union
>  {
>char __size[__SIZEOF_PTHREAD_MUTEX_T];
> -  long int __align __LOCK_ALIGNMENT;
> +  int __align __LOCK_ALIGNMENT;
>  } mtx_t;
>  
>  typedef union
> -- 
> 2.39.1

Re: [RFC PATCH hurd 6/12] hurd: Fix modes_t and speeds_t types on 64-bit

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun 12 Feb 2023 18:15:57 +0300, wrote:
> On Sun, Feb 12, 2023 at 6:00 PM Samuel Thibault  
> wrote:
> > We don't need these to be 64bits, 32bit will be completely large enough.
> >
> > Which issue did you actually encounter?
> 
> Well, these MIG definitions need to match the ones in C/glibc. In
> hurd/ioctl_types.h (in the Hurd tree) there is
> 
> typedef tcflag_t modes_t[4];
> typedef speed_t speeds_t[2];
> 
> and then in bits/termios.h (in the glibc tree)
> 
> /* Type of terminal control flag masks.  */
> typedef unsigned long int tcflag_t;
> 
> /* Type of control characters.  */
> typedef unsigned char cc_t;
> 
> /* Type of baud rate specifiers.  */
> typedef long int speed_t;
> 
> which is why I changed the MIG ones to long. We could do it the other
> way around, make our own bits/termios.h and define them as fixed-size
> 32 bit ints.

I'd rather say drop the "long" part, to avoid having to pull the
stdint.h header in. Nowadays' BSD headers just use the int type,
notably.

Samuel

Re: [RFC PATCH glibc 9/12] mach: Look for mach_i386.defs on x86_64 too

2023-02-12 Thread Sergey Bugaev

> >  # Translate GNU names for CPUs into the names used in Mach header files.
> > -mach-machine = $(patsubst powerpc,ppc,$(base-machine))
> > +mach-machine := $(patsubst powerpc,ppc,$(base-machine))
> > +mach-machine := $(patsubst x86_64,i386,$(mach-machine))
>
> Mmm, no, it's not actually a translation. What exactly does this fix?

The only thing the mach-machine variable is used for (as the comment
above it kind of explains) is for guessing the include guard name in
the machine-specific Mach header. They need this because, uh,

For generating mach-syscalls.mk, the build system passes '#include
<mach/syscall_sw.h>' to the compiler to be macro-expanded.
<mach/syscall_sw.h> contains lines like
kernel_trap(mach_msg_trap,-25,7), and at the very top, this:

/*
 *  The machine-dependent "syscall_sw.h" file should
 *  define a macro for
 *  kernel_trap(trap_name, trap_number, arg_count)
 *  which will expand into assembly code for the
 *  trap.
 *
 *  N.B.: When adding calls, do not put spaces in the macros.
 */
#include <mach/machine/syscall_sw.h>

So when using <mach/syscall_sw.h> normally, those kernel_trap() macros
get expanded into arch-specific assembly definitions, such as on i386:

#define kernel_trap(trap_name,trap_number,number_args) \
ENTRY(trap_name) \
movl $ trap_number,%eax; \
SVC; \
ret; \
END(trap_name)

Now back to the glibc buildsystem and the generation of
mach-syscalls.mk: the glibc buildsystem preprocesses
<mach/syscall_sw.h> (the one that contains all those kernel_trap ()
"calls") with the -D_MACH_`echo $(mach-machine) | tr a-z
A-Z`_SYSCALL_SW_H_=1 incantation. This expands to something like
-D_MACH_I386_SYSCALL_SW_H_, which "fakes" the include guard of the
<mach/machine/syscall_sw.h>! So those kernel_trap () "calls" do not
get expanded, and the glibc buildsystem then postprocesses them with a
sed invocation to strip the kernel_trap () wrappers, leaving just the
arguments:

sed -n -e 's/^kernel_trap(\(.*\),\([-0-9]*\),\([0-9]*\))$$/\1 \2 \3/p'

And *that* (faking the include guard) is what the mach-machine
variable is used for. As another comment there says, "Go kludges!!!"

Since mach/machine/syscall_sw.h is the i386 version on x86_64 (or --
is it not supposed to be?), the _MACH_I386_SYSCALL_SW_H_ guard is the
one to fake, hence setting mach-machine to i386.
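
The guard-faking trick can be reproduced in a single C file. This is only a sketch: DEMO_MACHINE_SYSCALL_SW_H and the demo_* helpers are made-up names, the guarded block stands in for the machine-specific header, and sscanf stands in for the sed expression.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pretend the build passed -DDEMO_MACHINE_SYSCALL_SW_H=1, faking the
   include guard of the (hypothetical) machine-specific header.  */
#define DEMO_MACHINE_SYSCALL_SW_H 1

/* Stand-in for the machine-specific header.  It is skipped entirely
   because its guard is already defined, so kernel_trap never becomes
   a macro.  */
#ifndef DEMO_MACHINE_SYSCALL_SW_H
# define DEMO_MACHINE_SYSCALL_SW_H 1
# define kernel_trap(trap_name, trap_number, number_args) \
  int trap_name (void) { return trap_number; }
#endif

/* Two-level stringification: expand the argument if possible, then
   turn whatever is left into a string literal.  */
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2 (x)

/* The preprocessed text of one line of the generic header.  */
static const char demo_trap_line[] =
  DEMO_STR (kernel_trap(mach_msg_trap,-25,7));

/* Rough equivalent of the sed expression: split the unexpanded line
   into name, trap number and argument count.  Returns 1 on success.  */
int demo_parse_trap (char name[64], int *number, int *nargs)
{
  return sscanf (demo_trap_line, "kernel_trap(%63[^,],%d,%d)",
                 name, number, nargs) == 3;
}
```

Because the guard is predefined, the kernel_trap (...) text survives preprocessing untouched and can be picked apart as plain text, which is all the generation of mach-syscalls.mk needs.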

P.S. As you can see, even when the patch is a simple one-line change,
there can be quite a story (and some pulled hairs) behind it :)

> To build a 64bit glibc, we'll use headers installed by a 64bit gnumach,
> i.e. into /usr/include/x86_64-gnu/

Yes, sure, I'm building with the headers from the 64-bit Mach (gnumach
master configured with --host=x86_64-gnu).

Sergey

> I have applied the rest, thanks!
>
> Samuel

Re: [RFC PATCH glibc 9/12] mach: Look for mach_i386.defs on x86_64 too

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun 12 Feb 2023 18:38:03 +0300, wrote:
> Since mach/machine/syscall_sw.h is the i386 version on x86_64 (or --
> is it not supposed to be?)

Nobody yet decided that the system call interface would be the same on
i386 and on x86_64 :)

Most probably we'll need a different header, to put the trap number in
rax instead of eax, notably. And the system call instruction will most
probably not be an lcall.

> the _MACH_I386_SYSCALL_SW_H_ guard is the one to fake, hence setting
> mach-machine to i386.

This can already be fixed by shipping a different file in mach, as we'll
most probably want in the end anyway.

Samuel

Re: [RFC PATCH glibc 12/12] C11 thrd: Downgrade the default alignment of mtx_t

2023-02-12 Thread Sergey Bugaev

On Sun, Feb 12, 2023 at 6:18 PM Samuel Thibault  wrote:
> I'd say rather make pthread_mutex_t aligned on long int, so we can
> possibly in the future put some pointers in it without breaking the ABI.

I honestly have 0 idea how it is not 8-aligned right now (nor how its
size is 32 and not like 56), given that this is its definition...

no, strike that, I see that there are *two* versions of
struct___pthread_mutex.h, one in sysdeps/mach/hurd/htl/bits and the
other one in sysdeps/htl/bits/types, and the former one wins. The
latter one seems unused then, its only point was to confuse the hell
out of me.
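
How the __align member drives the alignment of such a union can be sketched as follows. The demo_* names are hypothetical, and the size of 32 is taken from the figure mentioned above rather than from __SIZEOF_PTHREAD_MUTEX_T.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size; the real value comes from
   __SIZEOF_PTHREAD_MUTEX_T.  */
#define DEMO_SIZEOF_MUTEX 32

/* Same shape as mtx_t: an opaque byte array plus a member whose only
   job is to raise the alignment of the union.  */
typedef union
{
  char size[DEMO_SIZEOF_MUTEX];
  long int align;
} demo_mtx_long_t;

typedef union
{
  char size[DEMO_SIZEOF_MUTEX];
  int align;
} demo_mtx_int_t;

/* Alignment requirement of each variant.  */
size_t demo_mtx_long_alignment (void) { return _Alignof (demo_mtx_long_t); }
size_t demo_mtx_int_alignment (void) { return _Alignof (demo_mtx_int_t); }
```

Both unions have the same size; only the alignment requirement differs, which is what matters when one type has to be layout-compatible with the other.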

Sergey

Re: [RFC PATCH glibc 9/12] mach: Look for mach_i386.defs on x86_64 too

2023-02-12 Thread Sergey Bugaev

On Sun, Feb 12, 2023 at 6:46 PM Samuel Thibault  wrote:
>
> Sergey Bugaev, le dim. 12 févr. 2023 18:38:03 +0300, a ecrit:
> > Since mach/machine/syscall_sw.h is the i386 version on x86_64 (or --
> > is it not supposed to be?)
>
> Nobody yet decided that the system call interface would be the same on
> i386 and on x86_64 :)
>
> Most probably we'll need a different header, to put the trap number of
> rax instead of eax, notably. And the systemcall instruction will most
> probably not be an lcall.
>
> > the _MACH_I386_SYSCALL_SW_H_ guard is the one to fake, hence setting
> > mach-machine to i386.
>
> This can be already fixed by shipping a different file in mach, as we'll
> most probably want in the end anyway.

I've kind of assumed that the x86_64 Mach using / installing the i386
headers is intentional, and if some things (like the actual syscall
ABI) are to be different, they'd both be defined in the same file,
predicated on an ifdef. This is what XNU does for sure, see
osfmk/mach/i386/syscall_sw.h

Sergey

Re: [PATCH mig] Make MIG work for pure 64 bit kernel and userland.

2023-02-12 Thread Luca


Hi,

On 12/02/23 14:37, Luca wrote:

On 12/02/23 07:15, Flavio Cruz wrote:

For the 64 bit / 32 bit configuration, we can pass --enable-user32 when
targeting x86_64, and MIG will generate stubs as before. I am unsure
whether there exists undefined behavior in that particular configuration
since most of the structures used by the kernel do not seem to use 8
byte scalars so potentially we should be OK there.


As mentioned in the article, using memcpy() to a correctly aligned 
pointer seems to avoid the possible alignment issues. Since in the 
kernel we need to use copyin/copyout anyway, wouldn't this be enough?


As for the mig stubs, if I understand correctly the issue stems from the 
fact that they use a struct to create the message fields in memory, so 
the binary format of a message needs to have the same alignment 
requirements. Would using memcpy here avoid the alignment issue?
I think it would even simplify the stub code, e.g. for array types, and 
I'm not sure it would decrease performance.
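
A minimal sketch of the memcpy approach, assuming the message is handled as a plain byte buffer. The demo_* helpers are hypothetical and are not part of MIG or gnumach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 64-bit value at an arbitrary (possibly unaligned) byte
   offset in a message buffer.  memcpy has no alignment requirement,
   so this is well defined wherever the field lands.  */
void demo_store_u64 (unsigned char *buf, size_t offset, uint64_t value)
{
  memcpy (buf + offset, &value, sizeof value);
}

/* Load it back into a properly aligned local variable.  */
uint64_t demo_load_u64 (const unsigned char *buf, size_t offset)
{
  uint64_t value;
  memcpy (&value, buf + offset, sizeof value);
  return value;
}
```

Compilers commonly turn such fixed-size memcpy calls into plain loads and stores where the target allows it, so the cost relative to a struct access may well be negligible; that would need measuring.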


In addition, wouldn't it be risky to rely on the C memory layout for
packing message fields? What if this breaks in the future because of
some new optimization, or on some other CPU architecture?


I'm referring mainly to the generic message processing in the kernel, 
e.g. in ipc/kmsg.c and similar, which should then have some 
architecture-specific parts, or depend on compiler features, but the 
same applies to the mig-generated code.


It seems XNU's mig [0] always sets the alignment to natural_t (=4) with 
a #pragma... I still have to understand how they handle these issues.



Luca


[0] I found https://github.com/markmentovai/bootstrap_cmds which is easy 
to compile outside Apple systems, and it doesn't seem too different from 
the official version here 
https://github.com/apple-oss-distributions/bootstrap_cmds

Re: [PATCH mig] Make MIG work for pure 64 bit kernel and userland.

2023-02-12 Thread Sergey Bugaev

On Sun, Feb 12, 2023 at 7:01 PM Luca  wrote:
> It seems XNU's mig [0] always sets the alignment to natural_t (=4) with
> a #pragma... I still have to understand how they handle these issues.

Please note that XNU uses "untyped messaging", and Apple's version of
MIG is "untyped MIG". They don't have the kernel parse the message
body, it just passes it to the receiver task as a blob, where it's up
to MIG again to make sense of it.

Sergey

Re: [RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun 12 Feb 2023 14:10:42 +0300, wrote:
> It seems that GCC expects TLS on x86_64 to be done relative to %fs, not %gs, 
> so
> that's what I attempted to do in tls.h. The main thing missing there is the
> ability to actually set (and read) the %fs base address of a thread. It is my
> understanding (but note that I have no idea what I'm talking about) that on
> x86_64 the segment descriptors (as in GDT/LDT) are not used for this,

segmentation has somewhat disappeared in x86_64, yes.

> and instead the address can be set by writing to a MSR. Linux exposes
> the arch_prctl (ARCH_[GS]ET_[FG]S) syscall for this; so maybe GNU Mach
> could also have an explicit routine for this, perhaps like this:
> 
> routine i386_set_fgs_base (
>   target_thread: thread_t;
>   which: int;
>   value: rpc_vm_address_t);

Indeed.

> We should not need a getter routine, because one can simply inspect the target
> thread's state (unless, again, I misunderstand things horribly).

For 16bit fs/gs values we could read them from userland yes. But for
fs/gs base, the FSGSBASE instruction is not available on all 64bit
processors. And ATM in THREAD_TCB we want to be able to get the base of
another thread.

> diff --git a/sysdeps/mach/hurd/x86_64/static-start.S 
> b/sysdeps/mach/hurd/x86_64/static-start.S
> new file mode 100644
> index ..982d3d52
> --- /dev/null
> +++ b/sysdeps/mach/hurd/x86_64/static-start.S
> @@ -0,0 +1,27 @@
> +/* Type of the TCB.  */
> +typedef struct
> +{
> +  void *tcb; /* Points to this structure.  */
> +  dtv_t *dtv;/* Vector of pointers to TLS data.  */
> +  thread_t self; /* This thread's control port.  */
> +  int __glibc_padding1;
> +  int multiple_threads;
> +  int gscope_flag;
> +  uintptr_t sysinfo;
> +  uintptr_t stack_guard;
> +  uintptr_t pointer_guard;
> +  long __glibc_padding2[2];
> +  int private_futex;

? Isn't that rather feature_1 ?

> +  int __glibc_padding3;
> +  /* Reservation of some values for the TM ABI.  */
> +  void *__private_tm[4];
> +  /* GCC split stack support.  */
> +  void *__private_ss;
> +  /* The lowest address of shadow stack.  */
> +  unsigned long long int ssp_base;
> +
> +  /* Keep these fields last, so offsets of fields above can continue being
> + compatible with the x86_64 NPTL version.  */
> +  mach_port_t reply_port;  /* This thread's reply port.  */
> +  struct hurd_sigstate *_hurd_sigstate;
> +
> +  /* Used by the exception handling implementation in the dynamic loader.  */
> +  struct rtld_catch *rtld_catch;
> +} tcbhead_t;
> +


> +/* GCC generates %fs:0x28 to access the stack guard.  */
> +_Static_assert (offsetof (tcbhead_t, stack_guard) == 0x28,
> +"stack guard offset");
> +/* libgcc uses %fs:0x70 to access the split stack pointer.  */
> +_Static_assert (offsetof (tcbhead_t, __private_ss) == 0x70,
> +"split stack pointer offset");

Indeed. Could you perhaps also add them to the i386 tls.h?
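
The offsets checked by those assertions can be verified with a stand-alone mirror of the structure. This is a sketch under stated assumptions: demo_tcbhead_t is hypothetical, pointers and longs are replaced by 8-byte integers, and the thread port is assumed to be 4 bytes wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the quoted tcbhead_t, using fixed-width
   stand-ins for the x86_64 sizes (8-byte pointers and longs, 4-byte
   port names).  Only the fields up to ssp_base are reproduced.  */
typedef struct
{
  uint64_t tcb;               /* void *tcb */
  uint64_t dtv;               /* dtv_t *dtv */
  uint32_t self;              /* thread_t self (assumed 4 bytes) */
  int32_t padding1;
  int32_t multiple_threads;
  int32_t gscope_flag;
  uint64_t sysinfo;
  uint64_t stack_guard;
  uint64_t pointer_guard;
  uint64_t padding2[2];       /* long __glibc_padding2[2] */
  int32_t feature_1;          /* 'private_futex' in the quoted patch */
  int32_t padding3;
  uint64_t private_tm[4];     /* void *__private_tm[4] */
  uint64_t private_ss;        /* void *__private_ss */
  uint64_t ssp_base;
} demo_tcbhead_t;

/* Same checks as in the patch, applied to the mirror.  */
_Static_assert (offsetof (demo_tcbhead_t, stack_guard) == 0x28,
                "stack guard offset");
_Static_assert (offsetof (demo_tcbhead_t, private_ss) == 0x70,
                "split stack pointer offset");

size_t demo_stack_guard_offset (void)
{
  return offsetof (demo_tcbhead_t, stack_guard);
}

size_t demo_split_stack_offset (void)
{
  return offsetof (demo_tcbhead_t, private_ss);
}
```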

> +/* FIXME */
> +# define __LIBC_NO_TLS() 0

We'll want an efficient way to know whether we have configured TLS
indeed. At worst we can make it a global variable.

> +/* The TCB can have any size and the memory following the address the
> +   thread pointer points to is unspecified.  Allocate the TCB there.  */
> +# define TLS_TCB_AT_TP   1
> +# define TLS_DTV_AT_TP   0
> +

Also copy the comment above TCB_ALIGNMENT.

> +/* Install new dtv for current thread.  */
> +# define INSTALL_NEW_DTV(dtvp) THREAD_SETMEM (THREAD_SELF, dtv, dtvp)
> +/* Return the address of the dtv for the current thread.  */
> +# define THREAD_DTV() THREAD_GETMEM (THREAD_SELF, dtv)

While at it, try to make the i386 version use that too?

Samuel

Re: [RFC PATCH 0/12] Towards glibc on x86_64-gnu

2023-02-12 Thread Samuel Thibault

Hello,

Sergey Bugaev via Libc-alpha, on Sun 12 Feb 2023 14:10:31 +0300, wrote:
> The main missing pieces seem to be, ordered by scariness increasing:
> - missing x86_64 thread state definition in the kernel headers;
> - TLS things;
> - INTR_MSG_TRAP.

INTR_MSG_TRAP shouldn't be hard. It's the trampoline part which will
be.  Also, sysdeps/mach/hurd/i386/init-first.c. But possibly that part
could be simplified on 32bit first, now that libpthread just takes the
existing stack rather than forcing the use of one, see 9cec82de715b which
started to simplify things.

Samuel

Re: [RFC PATCH hurd 6/12] hurd: Fix modes_t and speeds_t types on 64-bit

2023-02-12 Thread Sergey Bugaev

On Sun, Feb 12, 2023 at 6:22 PM Samuel Thibault  wrote:
> I'd rather say drop the "long" part, to avoid having to pull the
> stdint.h header in.

That's what I meant, yes.

> Nowadays' BSD headers just use the int type,
> notably.

So given that the Linux port has its own bits/termios.h, is it fine to
just modify the generic one inline, rather than creating a
Hurd-specific version? The patch for that follows.

Sergey

-- >8 --

>From 625b774141823a8f504cce8a92b9f45f5e6f050f Mon Sep 17 00:00:00 2001
From: Sergey Bugaev 
Date: Sun, 12 Feb 2023 19:08:57 +0300
Subject: [PATCH] hurd: Fix tcflag_t and speed_t types on 64-bit

These are supposed to stay 32-bit even on 64-bit systems. This matches
BSD and Linux, as well as how these types are already defined in
tioctl.defs

Signed-off-by: Sergey Bugaev 
---
 bits/termios.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/bits/termios.h b/bits/termios.h
index 4439c2f1..6a883ceb 100644
--- a/bits/termios.h
+++ b/bits/termios.h
@@ -99,13 +99,13 @@
`tcflag_t', `cc_t', `speed_t' and the `TC*' constants appropriately.  */

 /* Type of terminal control flag masks.  */
-typedef unsigned long int tcflag_t;
+typedef unsigned int tcflag_t;

 /* Type of control characters.  */
 typedef unsigned char cc_t;

 /* Type of baud rate specifiers.  */
-typedef long int speed_t;
+typedef int speed_t;

 /* Terminal control structure.  */
 struct termios
-- 
2.39.1

Re: [RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Sergey Bugaev

On Sun, Feb 12, 2023 at 7:11 PM Samuel Thibault  wrote:
>
> Sergey Bugaev, le dim. 12 févr. 2023 14:10:42 +0300, a ecrit:
> > We should not need a getter routine, because one can simply inspect the 
> > target
> > thread's state (unless, again, I misunderstand things horribly).
>
> For 16bit fs/gs values we could read them from userland yes. But for
> fs/gs base, the FSGSBASE instruction is not available on all 64bit
> processors. And ATM in THREAD_TCB we want to be able to get the base of
> another thread.

What I meant is:

__thread_get_state (whatever_thread, &state);
uintptr_t its_fs_base = state->fs_base;

You can't really do the same to *write* [fg]s_base, because doing
thread_set_state on your own thread is bound to end badly.

> > diff --git a/sysdeps/mach/hurd/x86_64/static-start.S 
> > b/sysdeps/mach/hurd/x86_64/static-start.S
> > new file mode 100644
> > index ..982d3d52
> > --- /dev/null
> > +++ b/sysdeps/mach/hurd/x86_64/static-start.S
> > @@ -0,0 +1,27 @@
> > +/* Type of the TCB.  */
> > +typedef struct
> > +{
> > +  void *tcb; /* Points to this structure.  */
> > +  dtv_t *dtv;/* Vector of pointers to TLS data.  */
> > +  thread_t self; /* This thread's control port.  */
> > +  int __glibc_padding1;
> > +  int multiple_threads;
> > +  int gscope_flag;
> > +  uintptr_t sysinfo;
> > +  uintptr_t stack_guard;
> > +  uintptr_t pointer_guard;
> > +  long __glibc_padding2[2];
> > +  int private_futex;
>
> ? Isn't that rather feature_1 ?

sysdeps/mach/hurd/i386/tls.h has 'int private_futex;', which is where
I stole this from. A quick grep confirms that it's never used, so we
might rename both to feature_1, or maybe another instance of
__glibc_padding.

> > +/* GCC generates %fs:0x28 to access the stack guard.  */
> > +_Static_assert (offsetof (tcbhead_t, stack_guard) == 0x28,
> > +"stack guard offset");
> > +/* libgcc uses %fs:0x70 to access the split stack pointer.  */
> > +_Static_assert (offsetof (tcbhead_t, __private_ss) == 0x70,
> > +"split stack pointer offset");
>
> Indeed. Could you perhaps also add them to the i386 tls.h?

> > +/* Install new dtv for current thread.  */
> > +# define INSTALL_NEW_DTV(dtvp) THREAD_SETMEM (THREAD_SELF, dtv, dtvp)
> > +/* Return the address of the dtv for the current thread.  */
> > +# define THREAD_DTV() THREAD_GETMEM (THREAD_SELF, dtv)
>
> While at it, try to make the i386 version use that too?

Yeah, I have not ported the improvements back to the 32-bit version;
maybe I should. Another cool one is doing fs/gs-relative access using
GCC's __seg_fs/__seg_gs when supported.

Sergey

Re: [RFC PATCH glibc 12/12] C11 thrd: Downgrade the default alignment of mtx_t

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun 12 Feb 2023 18:52:37 +0300, wrote:
> On Sun, Feb 12, 2023 at 6:18 PM Samuel Thibault  
> wrote:
> > I'd say rather make pthread_mutex_t aligned on long int, so we can
> > possibly in the future put some pointers in it without breaking the ABI.
> 
> I honestly have 0 idea how it is not 8-aligned right now (nor how its
> size is 32 and not like 56), given that this is its definition...
> 
> no, strike that, I see that there are *two* versions of
> struct___pthread_mutex.h, one in sysdeps/mach/hurd/htl/bits and the
> other one in sysdeps/htl/bits/types, and the former one wins. The
> latter one seems unused then, its only point was to confuse the hell
> out of me.

History is full of remnants. That's why one just has to take the time to
clean things up and avoid leaving things behind oneself.  It seems the
a9915c21 cleanup missed this one.

Samuel

Re: [RFC PATCH hurd 6/12] hurd: Fix modes_t and speeds_t types on 64-bit

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun 12 Feb 2023 19:13:58 +0300, wrote:
> On Sun, Feb 12, 2023 at 6:22 PM Samuel Thibault  
> wrote:
> > I'd rather say drop the "long" part, to avoid having to pull the
> > stdint.h header in.
> 
> That's what I meant, yes.
> 
> > Nowadays' BSD headers just use the int type,
> > notably.
> 
> So given that the Linux port has its own bits/termios.h, is it fine to
> just modify the generic one inline, rather than creating a
> Hurd-specific version?

Yes.

> The patch for that follows.
> 
> Sergey
> 
> -- >8 --
> 
> From 625b774141823a8f504cce8a92b9f45f5e6f050f Mon Sep 17 00:00:00 2001
> From: Sergey Bugaev 
> Date: Sun, 12 Feb 2023 19:08:57 +0300
> Subject: [PATCH] hurd: Fix tcflag_t and speed_t types on 64-bit
> 
> These are supposed to stay 32-bit even on 64-bit systems. This matches
> BSD and Linux, as well as how these types are already defined in
> tioctl.defs
> 
> Signed-off-by: Sergey Bugaev 
> ---
>  bits/termios.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/bits/termios.h b/bits/termios.h
> index 4439c2f1..6a883ceb 100644
> --- a/bits/termios.h
> +++ b/bits/termios.h
> @@ -99,13 +99,13 @@
> `tcflag_t', `cc_t', `speed_t' and the `TC*' constants appropriately.  */
> 
>  /* Type of terminal control flag masks.  */
> -typedef unsigned long int tcflag_t;
> +typedef unsigned int tcflag_t;
> 
>  /* Type of control characters.  */
>  typedef unsigned char cc_t;
> 
>  /* Type of baud rate specifiers.  */
> -typedef long int speed_t;
> +typedef int speed_t;
> 
>  /* Terminal control structure.  */
>  struct termios
> -- 
> 2.39.1
> 

-- 
Samuel
---
For an independent, transparent and rigorous evaluation!
I support the Inria Evaluation Committee (Commission d'Évaluation de l'Inria).

Re: [PATCH mig] Make MIG work for pure 64 bit kernel and userland.

2023-02-12 Thread Luca


On 12/02/23 17:05, Sergey Bugaev wrote:
> On Sun, Feb 12, 2023 at 7:01 PM Luca wrote:
>> It seems XNU's mig [0] always sets the alignment to natural_t (=4) with
>> a #pragma... I still have to understand how they handle these issues.
>
> Please note that XNU uses "untyped messaging", and Apple's version of
> MIG is "untyped MIG". They don't have the kernel parse the message
> body, it just passes it to the receiver task as a blob, where it's up
> to MIG again to make sense of it.


It seems the body is parsed more or less in the same way as in gnumach,
at least to translate ports. Other fields seem to be just forwarded to the
mig stubs, also as in gnumach. Could you be more specific about the
"untyped messaging"?



Luca

Re: [RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun 12 Feb 2023 19:25:11 +0300, wrote:
> On Sun, Feb 12, 2023 at 7:11 PM Samuel Thibault  
> wrote:
> > Sergey Bugaev, le dim. 12 févr. 2023 14:10:42 +0300, a ecrit:
> > > We should not need a getter routine, because one can simply inspect the 
> > > target
> > > thread's state (unless, again, I misunderstand things horribly).
> >
> > For 16bit fs/gs values we could read them from userland yes. But for
> > fs/gs base, the FSGSBASE instruction is not available on all 64bit
> > processors. And ATM in THREAD_TCB we want to be able to get the base of
> > another thread.
> 
> What I've meant is:
> 
> __thread_get_state (whatever_thread, &state);
> uintptr_t its_fs_base = state->fs_base;
> 
> You can't really do the same to *write* [fg]s_base, because doing
> thread_set_state on your own thread is bound to end badly.

? Well, sure, just like setting fs/gs through thread state was not done
for i386.

I don't see what you're aiming at. Getting fs/gs from __thread_get_state
won't actually give you the base, you'll just read something like 0.

> > > diff --git a/sysdeps/mach/hurd/x86_64/static-start.S 
> > > b/sysdeps/mach/hurd/x86_64/static-start.S
> > > new file mode 100644
> > > index ..982d3d52
> > > --- /dev/null
> > > +++ b/sysdeps/mach/hurd/x86_64/static-start.S
> > > @@ -0,0 +1,27 @@
> > > +/* Type of the TCB.  */
> > > +typedef struct
> > > +{
> > > +  void *tcb; /* Points to this structure.  */
> > > +  dtv_t *dtv;/* Vector of pointers to TLS data.  
> > > */
> > > +  thread_t self; /* This thread's control port.  */
> > > +  int __glibc_padding1;
> > > +  int multiple_threads;
> > > +  int gscope_flag;
> > > +  uintptr_t sysinfo;
> > > +  uintptr_t stack_guard;
> > > +  uintptr_t pointer_guard;
> > > +  long __glibc_padding2[2];
> > > +  int private_futex;
> >
> > ? Isn't that rather feature_1 ?
> 
> sysdeps/mach/hurd/i386/tls.h has 'int private_futex;', which is where
> I stole this from. A quick grep confirms that it's never used,

Yes, this was just to align on the nptl tls.h. But apparently that got
renamed and hurd's tls wasn't updated.

> so we might rename both to feature_1, or maybe another instance of
> __glibc_padding.

Better stay coherent with the nptl version.

> > > +/* GCC generates %fs:0x28 to access the stack guard.  */
> > > +_Static_assert (offsetof (tcbhead_t, stack_guard) == 0x28,
> > > +"stack guard offset");
> > > +/* libgcc uses %fs:0x70 to access the split stack pointer.  */
> > > +_Static_assert (offsetof (tcbhead_t, __private_ss) == 0x70,
> > > +"split stack pointer offset");
> >
> > Indeed. Could you perhaps also add them to the i386 tls.h?
> 
> > > +/* Install new dtv for current thread.  */
> > > +# define INSTALL_NEW_DTV(dtvp) THREAD_SETMEM (THREAD_SELF, dtv, dtvp)
> > > +/* Return the address of the dtv for the current thread.  */
> > > +# define THREAD_DTV() THREAD_GETMEM (THREAD_SELF, dtv)
> >
> > While at it, try to make the i386 version use that too?
> 
> Yeah, I have not ported the improvements back to the 32-bit version;
> maybe I should.

Better always keep things as coherent as possible. Otherwise another you
will later wonder why in the hell we have differences between the two
versions.

> Another cool one is doing fs/gs-relative access using
> GCC's __seg_fs/__seg_gs when supported.

Yes, that's nice indeed!

Samuel

Re: [RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Florian Weimer

* Samuel Thibault via Libc-alpha:

> Sergey Bugaev, le dim. 12 févr. 2023 19:25:11 +0300, a ecrit:
>> On Sun, Feb 12, 2023 at 7:11 PM Samuel Thibault  
>> wrote:
>> > Sergey Bugaev, le dim. 12 févr. 2023 14:10:42 +0300, a ecrit:
>> > > We should not need a getter routine, because one can simply inspect the 
>> > > target
>> > > thread's state (unless, again, I misunderstand things horribly).
>> >
>> > For 16bit fs/gs values we could read them from userland yes. But for
>> > fs/gs base, the FSGSBASE instruction is not available on all 64bit
>> > processors. And ATM in THREAD_TCB we want to be able to get the base of
>> > another thread.
>> 
>> What I've meant is:
>> 
>> __thread_get_state (whatever_thread, &state);
>> uintptr_t its_fs_base = state->fs_base;
>> 
>> You can't really do the same to *write* [fg]s_base, because doing
>> thread_set_state on your own thread is bound to end badly.
>
> ? Well, sure, just like setting fs/gs through thread state was not done
> for i386.
>
> I don't see where you're aiming. Getting fs/gs from __thread_get_state
> won't actually give you the base, you'll just read something like 0.

The convention is that the FSBASE address is at %fs:0.  The x86-64 TLS
ABI assumes this, so that it's possible to compute the global (not
thread-specific) address of a TLS variable.
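
The self-pointer convention can be modelled in plain C without touching %fs. This is a sketch only: demo_tcb_t and the helpers are hypothetical, and the fs_base argument stands in for the segment base that a real %fs:0 load would use implicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical TCB whose first word points to the TCB itself.  */
typedef struct demo_tcb
{
  void *tcb;       /* self pointer, i.e. what lives at %fs:0 */
  long payload;    /* stand-in for some per-thread datum */
} demo_tcb_t;

/* Set up the self pointer, as done when a TCB is installed.  */
void demo_tcb_init (demo_tcb_t *t)
{
  t->tcb = t;
}

/* Models a load from %fs:0 for a thread whose segment base is FS_BASE:
   the value read is the base address itself.  */
void *demo_load_fs_0 (const demo_tcb_t *fs_base)
{
  return fs_base->tcb;
}

/* With the base known as an ordinary pointer, a thread-relative offset
   turns into a global address usable from any thread.  */
void *demo_tls_address (const demo_tcb_t *fs_base, ptrdiff_t offset)
{
  return (char *) demo_load_fs_0 (fs_base) + offset;
}
```

In the real thing the load is only meaningful for the thread executing it, since the segment base is implicit in the instruction.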

Re: [RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Samuel Thibault

Florian Weimer, on Sun 12 Feb 2023 17:40:58 +0100, wrote:
> * Samuel Thibault via Libc-alpha:
> 
> > Sergey Bugaev, le dim. 12 févr. 2023 19:25:11 +0300, a ecrit:
> >> On Sun, Feb 12, 2023 at 7:11 PM Samuel Thibault  
> >> wrote:
> >> > Sergey Bugaev, le dim. 12 févr. 2023 14:10:42 +0300, a ecrit:
> >> > > We should not need a getter routine, because one can simply inspect 
> >> > > the target
> >> > > thread's state (unless, again, I misunderstand things horribly).
> >> >
> >> > For 16bit fs/gs values we could read them from userland yes. But for
> >> > fs/gs base, the FSGSBASE instruction is not available on all 64bit
> >> > processors. And ATM in THREAD_TCB we want to be able to get the base of
> >> > another thread.
> >> 
> >> What I've meant is:
> >> 
> >> __thread_get_state (whatever_thread, &state);
> >> uintptr_t its_fs_base = state->fs_base;
> >> 
> >> You can't really do the same to *write* [fg]s_base, because doing
> >> thread_set_state on your own thread is bound to end badly.
> >
> > ? Well, sure, just like setting fs/gs through thread state was not done
> > for i386.
> >
> > I don't see where you're aiming. Getting fs/gs from __thread_get_state
> > won't actually give you the base, you'll just read something like 0.
> 
> The convention is that the FSBASE address is at %fs:0.

Yes, but that works only for reading your own base, not the base of
another thread.

Samuel

Re: [RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Sergey Bugaev

On Sun, Feb 12, 2023 at 7:36 PM Samuel Thibault  wrote:
>
> Sergey Bugaev, le dim. 12 févr. 2023 19:25:11 +0300, a ecrit:
> > On Sun, Feb 12, 2023 at 7:11 PM Samuel Thibault  
> > wrote:
> > > Sergey Bugaev, le dim. 12 févr. 2023 14:10:42 +0300, a ecrit:
> > > > We should not need a getter routine, because one can simply inspect the 
> > > > target
> > > > thread's state (unless, again, I misunderstand things horribly).
> > >
> > > For 16bit fs/gs values we could read them from userland yes. But for
> > > fs/gs base, the FSGSBASE instruction is not available on all 64bit
> > > processors. And ATM in THREAD_TCB we want to be able to get the base of
> > > another thread.
> >
> > What I've meant is:
> >
> > __thread_get_state (whatever_thread, &state);
> > uintptr_t its_fs_base = state->fs_base;
> >
> > You can't really do the same to *write* [fg]s_base, because doing
> > thread_set_state on your own thread is bound to end badly.
>
> ? Well, sure, just like setting fs/gs through thread state was not done
> for i386.
>
> I don't see where you're aiming. Getting fs/gs from __thread_get_state
> won't actually give you the base, you'll just read something like 0.

It is my understanding that the actual values of fs/gs (i.e. the index
of a descriptor) are not useful on x86_64. But fs_base and gs_base are
now things that you have to store in the thread state and save/restore
on every context switch. fs_base and gs_base are like registers in
their own right (well, MSRs are registers). Thus, it should be easy to
read them from the state structure exposed by the kernel.

But again, I really have very little understanding of this, so maybe
I'm talking nonsense.

Sergey

Re: [RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Samuel Thibault

Sergey Bugaev, on Sun 12 Feb 2023 19:51:33 +0300, wrote:
> On Sun, Feb 12, 2023 at 7:36 PM Samuel Thibault  
> wrote:
> >
> > Sergey Bugaev, le dim. 12 févr. 2023 19:25:11 +0300, a ecrit:
> > > On Sun, Feb 12, 2023 at 7:11 PM Samuel Thibault  
> > > wrote:
> > > > Sergey Bugaev, le dim. 12 févr. 2023 14:10:42 +0300, a ecrit:
> > > > > We should not need a getter routine, because one can simply inspect 
> > > > > the target
> > > > > thread's state (unless, again, I misunderstand things horribly).
> > > >
> > > > For 16bit fs/gs values we could read them from userland yes. But for
> > > > fs/gs base, the FSGSBASE instruction is not available on all 64bit
> > > > processors. And ATM in THREAD_TCB we want to be able to get the base of
> > > > another thread.
> > >
> > > What I've meant is:
> > >
> > > __thread_get_state (whatever_thread, &state);
> > > uintptr_t its_fs_base = state->fs_base;
> > >
> > > You can't really do the same to *write* [fg]s_base, because doing
> > > thread_set_state on your own thread is bound to end badly.
> >
> > ? Well, sure, just like setting fs/gs through thread state was not done
> > for i386.
> >
> > I don't see where you're aiming. Getting fs/gs from __thread_get_state
> > won't actually give you the base, you'll just read something like 0.
> 
> It is my understanding that the actual values of fs/gs (i.e. the index
> of a descriptor) are not useful on x86_64. But fs_base and gs_base are
> now things that you have to store in the thread state and save/restore
> on every context switch. fs_base and gs_base are like registers in
> their own right (well, MSRs are registers). Thus, it should be easy to
> read them from the state structure exposed by the kernel.

Ah, so you mean adding the fs/gs bases to the thread_state content.

I'd rather make it a separate state content, like we have a separate
i386_DEBUG_STATE content.

(And thus no need for a new RPC).

Samuel
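
A rough sketch of what a separate state flavor could look like, by
analogy with the existing i386_DEBUG_STATE. The flavor number, the
struct and the getter below are all hypothetical (sketch_ prefix); they
only illustrate dispatching on a flavor and copying out a small,
dedicated structure instead of extending the main thread state.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_FSGS_BASE_STATE 100   /* hypothetical flavor number */

struct sketch_fsgs_base_state {
    uint64_t fs_base;
    uint64_t gs_base;
};

/* State sizes are counted in 32-bit words, as Mach state counts are. */
#define SKETCH_FSGS_BASE_STATE_COUNT \
    (sizeof(struct sketch_fsgs_base_state) / sizeof(uint32_t))

/* Stand-in for the per-thread data the kernel would keep. */
struct sketch_pcb {
    uint64_t fs_base;
    uint64_t gs_base;
};

/* Returns 0 on success, -1 for an unknown flavor or a short buffer. */
static inline int
sketch_get_state(const struct sketch_pcb *pcb, int flavor,
                 uint32_t *out, unsigned *count)
{
    switch (flavor) {
    case SKETCH_FSGS_BASE_STATE: {
        struct sketch_fsgs_base_state s;

        if (*count < SKETCH_FSGS_BASE_STATE_COUNT)
            return -1;
        s.fs_base = pcb->fs_base;
        s.gs_base = pcb->gs_base;
        memcpy(out, &s, sizeof s);
        *count = SKETCH_FSGS_BASE_STATE_COUNT;
        return 0;
    }
    default:
        return -1;
    }
}
```

With this shape, setting the base would reuse the matching set-state
path for the same flavor, which is why no new RPC is needed.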

[PATCH 4/6] fix rpc time value for 64 bit

2023-02-12 Thread Luca Dariz

* include/mach/task_info.h: use rpc variant of time_value_t
* include/mach/thread_info.h: Likewise
* kern/mach_clock.c: use rpc variant of time_value_t in
  read_time_stamp()
* kern/mach_clock.h: Likewise
* kern/thread.c: use rpc variant of thread_read_times()
* kern/timer.h: add thread_read_times_rpc() by converting time_value_t
  to the corresponding rpc structures inline.
---
 include/mach/task_info.h   | 10 +-
 include/mach/thread_info.h |  6 +++---
 kern/mach_clock.c  |  2 +-
 kern/mach_clock.h  |  2 +-
 kern/thread.c  |  2 +-
 kern/timer.h   | 12 
 6 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/include/mach/task_info.h b/include/mach/task_info.h
index 3aaa7cd6..f448ee04 100644
--- a/include/mach/task_info.h
+++ b/include/mach/task_info.h
@@ -56,11 +56,11 @@ struct task_basic_info {
     integer_t   base_priority;  /* base scheduling priority */
     rpc_vm_size_t   virtual_size;   /* number of virtual pages */
     rpc_vm_size_t   resident_size;  /* number of resident pages */
-    time_value_t    user_time;  /* total user run time for
+    rpc_time_value_t    user_time;  /* total user run time for
                        terminated threads */
-    time_value_t    system_time;    /* total system run time for
+    rpc_time_value_t    system_time;    /* total system run time for
                        terminated threads */
-    time_value_t    creation_time;  /* creation time stamp */
+    rpc_time_value_t    creation_time;  /* creation time stamp */
 };
 
 typedef struct task_basic_info task_basic_info_data_t;
@@ -89,9 +89,9 @@ typedef struct task_events_info *task_events_info_t;
                        only accurate if suspended */
 
 struct task_thread_times_info {
-    time_value_t    user_time;  /* total user run time for
+    rpc_time_value_t    user_time;  /* total user run time for
                        live threads */
-    time_value_t    system_time;    /* total system run time for
+    rpc_time_value_t    system_time;    /* total system run time for
                        live threads */
 };
 
diff --git a/include/mach/thread_info.h b/include/mach/thread_info.h
index 569c8c84..46c1ceca 100644
--- a/include/mach/thread_info.h
+++ b/include/mach/thread_info.h
@@ -55,8 +55,8 @@ typedef integer_t thread_info_data_t[THREAD_INFO_MAX];
 #define THREAD_BASIC_INFO  1   /* basic information */
 
 struct thread_basic_info {
-    time_value_t    user_time;  /* user run time */
-    time_value_t    system_time;    /* system run time */
+    rpc_time_value_t    user_time;  /* user run time */
+    rpc_time_value_t    system_time;    /* system run time */
     integer_t   cpu_usage;  /* scaled cpu usage percentage */
     integer_t   base_priority;  /* base scheduling priority */
     integer_t   cur_priority;   /* current scheduling priority */
@@ -65,7 +65,7 @@ struct thread_basic_info {
     integer_t   suspend_count;  /* suspend count for thread */
     integer_t   sleep_time; /* number of seconds that thread
                        has been sleeping */
-    time_value_t    creation_time;  /* time stamp of creation */
+    rpc_time_value_t    creation_time;  /* time stamp of creation */
 };
 
 typedef struct thread_basic_info   thread_basic_info_data_t;
diff --git a/kern/mach_clock.c b/kern/mach_clock.c
index 09717d16..ed38c76b 100644
--- a/kern/mach_clock.c
+++ b/kern/mach_clock.c
@@ -429,7 +429,7 @@ record_time_stamp(time_value_t *stamp)
  * real-time clock frame.
  */
 void
-read_time_stamp (const time_value_t *stamp, time_value_t *result)
+read_time_stamp (const time_value_t *stamp, rpc_time_value_t *result)
 {
time_value64_t result64;
TIME_VALUE_TO_TIME_VALUE64(stamp, &result64);
diff --git a/kern/mach_clock.h b/kern/mach_clock.h
index 7e8d3046..9a670011 100644
--- a/kern/mach_clock.h
+++ b/kern/mach_clock.h
@@ -98,7 +98,7 @@ extern void record_time_stamp (time_value_t *stamp);
  * Read a timestamp in STAMP into RESULT.  Returns values in the
  * real-time clock frame.
  */
-extern void read_time_stamp (const time_value_t *stamp, time_value_t *result);
+extern void read_time_stamp (const time_value_t *stamp, rpc_time_value_t *result);
 
 extern void mapable_time_init (void);
 
diff --git a/kern/thread.c b/kern/thread.c
index 17cc458c..4a6b9eda 100644
--- a/kern/thread.c
+++ b/kern/thread.c
@@ -1522,7 +1522,7 @@ kern_return_t thread_info(
 
/* fill in info */
 
-   thread_read_times(thread,
+   thread_read_times_rpc(thread,
&basic_info->user_time,
&basic_info->system_time);
basic_info->base_
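
The conversion that the commit message describes can be sketched as
follows. This is only an illustration under stated assumptions: the two
structs are stand-ins (sketch_ prefix) for a native time value whose
fields follow the machine word size and an rpc variant with fixed
32-bit fields; the real GNU Mach definitions may differ.

```c
#include <stdint.h>

/* Stand-in for the kernel-internal time value (word-sized fields). */
struct sketch_time_value {
    long seconds;
    long microseconds;
};

/* Stand-in for the rpc variant: same layout for 32-bit and 64-bit users. */
struct sketch_rpc_time_value {
    int32_t seconds;
    int32_t microseconds;
};

/* Narrow each field explicitly so the wire layout does not depend on
   the kernel's word size. */
static inline void
sketch_time_value_to_rpc(const struct sketch_time_value *in,
                         struct sketch_rpc_time_value *out)
{
    out->seconds = (int32_t) in->seconds;
    out->microseconds = (int32_t) in->microseconds;
}
```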

[PATCH 3/6] add L4 kmem cache for x86_64

2023-02-12 Thread Luca Dariz

* i386/intel/pmap.c: allocate the L4 page table from a dedicated kmem
  cache instead of the generic kernel map.
  Also improve readability of nested ifdefs.
---
 i386/intel/pmap.c | 34 +++---
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
index ccbb03fc..615b0fff 100644
--- a/i386/intel/pmap.c
+++ b/i386/intel/pmap.c
@@ -397,13 +397,14 @@ boolean_t cpu_update_needed[NCPUS];
 struct pmap kernel_pmap_store;
 pmap_t kernel_pmap;
 
-struct kmem_cache  pmap_cache; /* cache of pmap structures */
-struct kmem_cache  pd_cache;   /* cache of page directories */
+struct kmem_cache pmap_cache;  /* cache of pmap structures */
+struct kmem_cache pd_cache;/* cache of page directories */
 #if PAE
-struct kmem_cache  pdpt_cache; /* cache of page
-  directory pointer
-  tables */
-#endif
+struct kmem_cache pdpt_cache;  /* cache of page directory pointer tables */
+#ifdef __x86_64__
+struct kmem_cache l4_cache;/* cache of L4 tables */
+#endif /* __x86_64__ */
+#endif /* PAE */
 
 boolean_t  pmap_debug = FALSE; /* flag for debugging prints */
 
@@ -1046,7 +1047,12 @@ void pmap_init(void)
kmem_cache_init(&pdpt_cache, "pdpt",
INTEL_PGBYTES, INTEL_PGBYTES, NULL,
KMEM_CACHE_PHYSMEM);
-#endif
+#ifdef __x86_64__
+   kmem_cache_init(&l4_cache, "L4",
+   INTEL_PGBYTES, INTEL_PGBYTES, NULL,
+   KMEM_CACHE_PHYSMEM);
+#endif /* __x86_64__ */
+#endif /* PAE */
s = (vm_size_t) sizeof(struct pv_entry);
kmem_cache_init(&pv_list_cache, "pv_entry", s, 0, NULL, 0);
 
@@ -1287,10 +1293,8 @@ pmap_t pmap_create(vm_size_t size)
  );
}
 #ifdef __x86_64__
-   // FIXME: use kmem_cache_alloc instead
-   if (kmem_alloc_wired(kernel_map,
-(vm_offset_t *)&p->l4base, INTEL_PGBYTES)
-   != KERN_SUCCESS)
+   p->l4base = (pt_entry_t *) kmem_cache_alloc(&l4_cache);
+   if (p->l4base == NULL)
panic("pmap_create");
memset(p->l4base, 0, INTEL_PGBYTES);
     WRITE_PTE(&p->l4base[0], pa_to_pte(kvtophys((vm_offset_t) p->pdpbase)) | INTEL_PTE_VALID | INTEL_PTE_WRITE | INTEL_PTE_USER);
@@ -1426,16 +1430,16 @@ void pmap_destroy(pmap_t p)
pmap_set_page_readwrite(p->l4base);
pmap_set_page_readwrite(p->user_l4base);
pmap_set_page_readwrite(p->user_pdpbase);
-#endif
+#endif /* __x86_64__ */
pmap_set_page_readwrite(p->pdpbase);
 #endif /* MACH_PV_PAGETABLES */
 #ifdef __x86_64__
-   kmem_free(kernel_map, (vm_offset_t)p->l4base, INTEL_PGBYTES);
+kmem_cache_free(&l4_cache, (vm_offset_t) p->l4base);
 #ifdef MACH_PV_PAGETABLES
kmem_free(kernel_map, (vm_offset_t)p->user_l4base, INTEL_PGBYTES);
kmem_free(kernel_map, (vm_offset_t)p->user_pdpbase, INTEL_PGBYTES);
-#endif
-#endif
+#endif /* MACH_PV_PAGETABLES */
+#endif /* __x86_64__ */
kmem_cache_free(&pdpt_cache, (vm_offset_t) p->pdpbase);
 #endif /* PAE */
kmem_cache_free(&pmap_cache, (vm_offset_t) p);
-- 
2.30.2
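
For readers unfamiliar with the pattern, here is a toy sketch of a
dedicated fixed-size object cache of the kind the patch switches to.
It is not the GNU Mach kmem_cache implementation: the sketch_ names are
hypothetical, it uses malloc as its backing store, and it ignores
alignment and physical-memory constraints. It only shows why freed
page-sized objects can be handed back quickly from a per-type free list.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

struct sketch_cache {
    void *free_list;    /* singly linked list threaded through free objects */
    size_t obj_size;
};

static inline void
sketch_cache_init(struct sketch_cache *c, size_t obj_size)
{
    c->free_list = NULL;
    c->obj_size = obj_size;
}

/* Pop a previously freed object if there is one, else allocate anew. */
static inline void *
sketch_cache_alloc(struct sketch_cache *c)
{
    void *obj = c->free_list;

    if (obj != NULL) {
        /* The first bytes of a free object hold the next pointer. */
        memcpy(&c->free_list, obj, sizeof(void *));
        return obj;
    }
    return malloc(c->obj_size);
}

/* Push the object on the free list instead of releasing it. */
static inline void
sketch_cache_free(struct sketch_cache *c, void *obj)
{
    memcpy(obj, &c->free_list, sizeof(void *));
    c->free_list = obj;
}
```

As in the patch, the caller still has to check for a NULL return and
clear the object before use; a recycled object keeps its old contents.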

[PATCH gnumach 0/6] minor fixes and last 32on64 compatibility issues

2023-02-12 Thread Luca Dariz

This series contains some minor fixes:
  set unused members of thread state to 0
  fix hardcoded physical address
  add L4 kmem cache for x86_64
and fixes for the last two rpc compatibility issues:
  fix rpc time value for 64 bit
  fix port name size in notifications
At this stage it seems fine to enable syscalls.

I tested this by booting a ramdisk (manually adding the Debian patches),
and the system boots until it reaches a shell. User-space drivers do
not work yet (the intr RPCs are not yet implemented for x86_64), and
there are various strange things, for example:
* the reported memory amount is not accurate, e.g. for 8G I see only
  1GB of available memory, with 3GB of free memory. This might be
  related to the memory amount type being fixed to 32-bit in host_info().
* the startup task fails to set the args vector on the kernel task
  (invalid address), but this doesn't seem fatal.

There could still be issues with internal mach devices; I didn't test
them thoroughly.

Luca Dariz (6):
  set unused members of thread state to 0
  fix hardcoded physical address
  add L4 kmem cache for x86_64
  fix rpc time value for 64 bit
  fix port name size in notifications
  enable syscalls on x86_64

 i386/i386/pcb.c|  1 +
 i386/i386at/com.c  |  2 +-
 i386/intel/pmap.c  | 34 +++---
 include/mach/task_info.h   | 10 +-
 include/mach/thread_info.h |  6 +++---
 ipc/ipc_machdep.h  |  1 +
 ipc/ipc_notify.c   |  8 
 kern/mach_clock.c  |  2 +-
 kern/mach_clock.h  |  2 +-
 kern/thread.c  |  2 +-
 kern/timer.h   | 12 
 x86_64/locore.S|  3 ---
 12 files changed, 49 insertions(+), 34 deletions(-)

-- 
2.30.2

[PATCH 6/6] enable syscalls on x86_64

2023-02-12 Thread Luca Dariz

Signed-off-by: Luca Dariz 
---
 x86_64/locore.S | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/x86_64/locore.S b/x86_64/locore.S
index 5b9c8ef4..95ece3cc 100644
--- a/x86_64/locore.S
+++ b/x86_64/locore.S
@@ -1075,9 +1075,6 @@ syscall_entry_2:
     pushq   %rax        /* save system call number */
     pushq   $0          /* clear trap number slot */
 
-// TODO: test it before dropping ud2
-    ud2
-
     pusha               /* save the general registers */
     movq    %ds,%rdx    /* and the segment registers */
     pushq   %rdx
-- 
2.30.2

[PATCH 1/6] set unused members of thread state to 0

2023-02-12 Thread Luca Dariz

* i386/i386/pcb.c: always set esp to 0; it seems unused.
---
 i386/i386/pcb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
index 9ac55a1c..924ed08b 100644
--- a/i386/i386/pcb.c
+++ b/i386/i386/pcb.c
@@ -706,6 +706,7 @@ kern_return_t thread_getstatus(
state->eip = saved_state->eip;
state->efl = saved_state->efl;
state->uesp = saved_state->uesp;
+   state->esp = 0;  /* unused */
 
state->cs = saved_state->cs;
state->ss = saved_state->ss;
-- 
2.30.2

[PATCH 2/6] fix hardcoded physical address

2023-02-12 Thread Luca Dariz

* i386/i386at/com.c: use the proper helper to convert a physical
  address to a virtual one.
---
 i386/i386at/com.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/i386/i386at/com.c b/i386/i386at/com.c
index 000475db..de21206c 100644
--- a/i386/i386at/com.c
+++ b/i386/i386at/com.c
@@ -276,7 +276,7 @@ comcninit(struct consdev *cp)
 
{
         char    msg[128];
-        volatile unsigned char *p = (volatile unsigned char *)0xb8000;
+        volatile unsigned char *p = (volatile unsigned char *)phystokv(0xb8000);
         int i;
 
sprintf(msg, " using COM port %d for console ",
-- 
2.30.2
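
A small sketch of why the helper matters, assuming (this is an
assumption, not taken from the patch) that the physical-to-virtual
conversion is a constant offset into a direct map. The base value and
the sketch_ names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical start of the kernel's direct mapping of physical memory. */
#define SKETCH_KERNEL_BASE 0xffffffff80000000ULL

/* Direct-map conversion: virtual = physical + base. */
static inline uint64_t
sketch_phystokv(uint64_t pa)
{
    return pa + SKETCH_KERNEL_BASE;
}
```

With a base of zero the raw constant 0xb8000 happens to work, which is
why the hardcoded address only breaks once the kernel is mapped
somewhere else.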

[PATCH 5/6] fix port name size in notifications

2023-02-12 Thread Luca Dariz

* ipc/ipc_machdep.h: define PORT_NAME_T_SIZE_IN_BITS
* ipc/ipc_notify.c: fix port name size in notification message
  templates
---
 ipc/ipc_machdep.h | 1 +
 ipc/ipc_notify.c  | 8 
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/ipc/ipc_machdep.h b/ipc/ipc_machdep.h
index 29878dc9..2871fc31 100755
--- a/ipc/ipc_machdep.h
+++ b/ipc/ipc_machdep.h
@@ -34,5 +34,6 @@
  */
 
 #define PORT_T_SIZE_IN_BITS (sizeof(mach_port_t)*8)
+#define PORT_NAME_T_SIZE_IN_BITS (sizeof(mach_port_name_t)*8)
 
 #endif /* _IPC_IPC_MACHDEP_H_ */
diff --git a/ipc/ipc_notify.c b/ipc/ipc_notify.c
index eea60116..d0b71cf2 100644
--- a/ipc/ipc_notify.c
+++ b/ipc/ipc_notify.c
@@ -72,7 +72,7 @@ ipc_notify_init_port_deleted(mach_port_deleted_notification_t *n)
m->msgh_id = MACH_NOTIFY_PORT_DELETED;
 
t->msgt_name = MACH_MSG_TYPE_PORT_NAME;
-   t->msgt_size = PORT_T_SIZE_IN_BITS;
+   t->msgt_size = PORT_NAME_T_SIZE_IN_BITS;
t->msgt_number = 1;
t->msgt_inline = TRUE;
t->msgt_longform = FALSE;
@@ -102,7 +102,7 @@ ipc_notify_init_msg_accepted(mach_msg_accepted_notification_t *n)
m->msgh_id = MACH_NOTIFY_MSG_ACCEPTED;
 
t->msgt_name = MACH_MSG_TYPE_PORT_NAME;
-   t->msgt_size = PORT_T_SIZE_IN_BITS;
+   t->msgt_size = PORT_NAME_T_SIZE_IN_BITS;
t->msgt_number = 1;
t->msgt_inline = TRUE;
t->msgt_longform = FALSE;
@@ -164,7 +164,7 @@ ipc_notify_init_no_senders(
m->msgh_id = MACH_NOTIFY_NO_SENDERS;
 
t->msgt_name = MACH_MSG_TYPE_INTEGER_32;
-   t->msgt_size = PORT_T_SIZE_IN_BITS;
+   t->msgt_size = 32;
t->msgt_number = 1;
t->msgt_inline = TRUE;
t->msgt_longform = FALSE;
@@ -215,7 +215,7 @@ ipc_notify_init_dead_name(
m->msgh_id = MACH_NOTIFY_DEAD_NAME;
 
t->msgt_name = MACH_MSG_TYPE_PORT_NAME;
-   t->msgt_size = PORT_T_SIZE_IN_BITS;
+   t->msgt_size = PORT_NAME_T_SIZE_IN_BITS;
t->msgt_number = 1;
t->msgt_inline = TRUE;
t->msgt_longform = FALSE;
-- 
2.30.2
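
The size mismatch this patch addresses can be sketched with stand-in
types. The typedefs are assumptions for illustration: a kernel-side
port type that is as wide as a pointer, and a port name that is always
32 bits wide. On a 64-bit kernel the two macros then disagree, so a
message template describing a port name must use the second one.

```c
#include <stdint.h>

typedef uintptr_t sketch_mach_port_t;       /* pointer-sized (assumption) */
typedef uint32_t  sketch_mach_port_name_t;  /* always 32 bits (assumption) */

#define SKETCH_PORT_T_SIZE_IN_BITS \
    (sizeof(sketch_mach_port_t) * 8)
#define SKETCH_PORT_NAME_T_SIZE_IN_BITS \
    (sizeof(sketch_mach_port_name_t) * 8)

/* Nonzero when describing a port name with the port size would be wrong. */
static inline int
sketch_port_sizes_differ(void)
{
    return SKETCH_PORT_T_SIZE_IN_BITS != SKETCH_PORT_NAME_T_SIZE_IN_BITS;
}
```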

Re: [RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Sergey Bugaev

On Sun, Feb 12, 2023 at 8:02 PM Samuel Thibault  wrote:
> I'd rather make it a separate state content, like we have a separate
> i386_DEBUG_STATE content.
>
> (And thus no need for a new RPC).

Makes sense to me! 👍

Sergey

Re: [PATCH 1/6] set unused members of thread state to 0

2023-02-12 Thread Samuel Thibault

Better to avoid uninitialized data indeed; applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:03:08 +0100, wrote:
> * i386/i386/pcb.c: always set esp to 0; it seems unused.
> ---
>  i386/i386/pcb.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
> index 9ac55a1c..924ed08b 100644
> --- a/i386/i386/pcb.c
> +++ b/i386/i386/pcb.c
> @@ -706,6 +706,7 @@ kern_return_t thread_getstatus(
>   state->eip = saved_state->eip;
>   state->efl = saved_state->efl;
>   state->uesp = saved_state->uesp;
> + state->esp = 0;  /* unused */
>  
>   state->cs = saved_state->cs;
>   state->ss = saved_state->ss;
> -- 
> 2.30.2
> 
> 

-- 
Samuel
---
For an independent, transparent and rigorous evaluation!
I support the Inria Evaluation Commission.

Re: [PATCH 2/6] fix hardcoded physical address

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:03:09 +0100, wrote:
> * i386/i386at/com.c: use the proper helper to convert a physical
>   address to a virtual one.
> ---
>  i386/i386at/com.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/i386/i386at/com.c b/i386/i386at/com.c
> index 000475db..de21206c 100644
> --- a/i386/i386at/com.c
> +++ b/i386/i386at/com.c
> @@ -276,7 +276,7 @@ comcninit(struct consdev *cp)
>  
>   {
>   charmsg[128];
> - volatile unsigned char *p = (volatile unsigned char *)0xb8000;
> + volatile unsigned char *p = (volatile unsigned char 
> *)phystokv(0xb8000);
>   int i;
>  
>   sprintf(msg, " using COM port %d for console ",
> -- 
> 2.30.2
> 
> 

-- 
Samuel

Re: [PATCH 3/6] add L4 kmem cache for x86_64

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:03:10 +0100, wrote:
> * i386/intel/pmap.c: allocate the L4 page table from a dedicated kmem
>   cache instead of the generic kernel map.
>   Also improve readability of nested ifdefs.
> ---
>  i386/intel/pmap.c | 34 +++---
>  1 file changed, 19 insertions(+), 15 deletions(-)
> 
> diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
> index ccbb03fc..615b0fff 100644
> --- a/i386/intel/pmap.c
> +++ b/i386/intel/pmap.c
> @@ -397,13 +397,14 @@ boolean_t   cpu_update_needed[NCPUS];
>  struct pmap  kernel_pmap_store;
>  pmap_t   kernel_pmap;
>  
> -struct kmem_cache    pmap_cache; /* cache of pmap structures */
> -struct kmem_cache    pd_cache;   /* cache of page directories */
> +struct kmem_cache pmap_cache;  /* cache of pmap structures */
> +struct kmem_cache pd_cache;/* cache of page directories */
>  #if PAE
> -struct kmem_cachepdpt_cache; /* cache of page
> -directory pointer
> -tables */
> -#endif
> +struct kmem_cache pdpt_cache;  /* cache of page directory pointer tables */
> +#ifdef __x86_64__
> +struct kmem_cache l4_cache;/* cache of L4 tables */
> +#endif /* __x86_64__ */
> +#endif /* PAE */
>  
>  boolean_tpmap_debug = FALSE; /* flag for debugging prints */
>  
> @@ -1046,7 +1047,12 @@ void pmap_init(void)
>   kmem_cache_init(&pdpt_cache, "pdpt",
>   INTEL_PGBYTES, INTEL_PGBYTES, NULL,
>   KMEM_CACHE_PHYSMEM);
> -#endif
> +#ifdef __x86_64__
> + kmem_cache_init(&l4_cache, "L4",
> + INTEL_PGBYTES, INTEL_PGBYTES, NULL,
> + KMEM_CACHE_PHYSMEM);
> +#endif /* __x86_64__ */
> +#endif /* PAE */
>   s = (vm_size_t) sizeof(struct pv_entry);
>   kmem_cache_init(&pv_list_cache, "pv_entry", s, 0, NULL, 0);
>  
> @@ -1287,10 +1293,8 @@ pmap_t pmap_create(vm_size_t size)
> );
>   }
>  #ifdef __x86_64__
> - // FIXME: use kmem_cache_alloc instead
> - if (kmem_alloc_wired(kernel_map,
> -  (vm_offset_t *)&p->l4base, INTEL_PGBYTES)
> - != KERN_SUCCESS)
> + p->l4base = (pt_entry_t *) kmem_cache_alloc(&l4_cache);
> + if (p->l4base == NULL)
>   panic("pmap_create");
>   memset(p->l4base, 0, INTEL_PGBYTES);
>   WRITE_PTE(&p->l4base[0], pa_to_pte(kvtophys((vm_offset_t) p->pdpbase)) 
> | INTEL_PTE_VALID | INTEL_PTE_WRITE | INTEL_PTE_USER);
> @@ -1426,16 +1430,16 @@ void pmap_destroy(pmap_t p)
>   pmap_set_page_readwrite(p->l4base);
>   pmap_set_page_readwrite(p->user_l4base);
>   pmap_set_page_readwrite(p->user_pdpbase);
> -#endif
> +#endif /* __x86_64__ */
>   pmap_set_page_readwrite(p->pdpbase);
>  #endif   /* MACH_PV_PAGETABLES */
>  #ifdef __x86_64__
> - kmem_free(kernel_map, (vm_offset_t)p->l4base, INTEL_PGBYTES);
> +kmem_cache_free(&l4_cache, (vm_offset_t) p->l4base);
>  #ifdef MACH_PV_PAGETABLES
>   kmem_free(kernel_map, (vm_offset_t)p->user_l4base, INTEL_PGBYTES);
>   kmem_free(kernel_map, (vm_offset_t)p->user_pdpbase, INTEL_PGBYTES);
> -#endif
> -#endif
> +#endif /* MACH_PV_PAGETABLES */
> +#endif /* __x86_64__ */
>   kmem_cache_free(&pdpt_cache, (vm_offset_t) p->pdpbase);
>  #endif   /* PAE */
>   kmem_cache_free(&pmap_cache, (vm_offset_t) p);
> -- 
> 2.30.2
> 
> 

-- 
Samuel

[PATCH 6/9] add more explicit names for user space virtual space limits

2023-02-12 Thread Luca Dariz

* i386/i386/vm_param.h: add VM_MAX/MIN_USER_ADDRESS to kernel headers.
* i386/i386/db_interface.c
* i386/i386/ldt.c
* i386/i386/pcb.c
* i386/intel/pmap.c
* kern/task.c: replace VM_MAX/MIN_ADDRESS with VM_MAX/MIN_USER_ADDRESS
---
 i386/i386/db_interface.c |  4 ++--
 i386/i386/ldt.c  |  8 
 i386/i386/pcb.c  |  6 +++---
 i386/i386/vm_param.h |  6 +-
 i386/intel/pmap.c| 18 +-
 kern/task.c  |  4 ++--
 6 files changed, 25 insertions(+), 21 deletions(-)

diff --git a/i386/i386/db_interface.c b/i386/i386/db_interface.c
index 3a331490..8e2dc6a2 100644
--- a/i386/i386/db_interface.c
+++ b/i386/i386/db_interface.c
@@ -119,8 +119,8 @@ kern_return_t db_set_debug_state(
int i;
 
for (i = 0; i <= 3; i++)
-   if (state->dr[i] < VM_MIN_ADDRESS
-|| state->dr[i] >= VM_MAX_ADDRESS)
+   if (state->dr[i] < VM_MIN_USER_ADDRESS
+|| state->dr[i] >= VM_MAX_USER_ADDRESS)
return KERN_INVALID_ARGUMENT;
 
pcb->ims.ids = *state;
diff --git a/i386/i386/ldt.c b/i386/i386/ldt.c
index 3f9ac8ff..70fa24e2 100644
--- a/i386/i386/ldt.c
+++ b/i386/i386/ldt.c
@@ -64,13 +64,13 @@ ldt_fill(struct real_descriptor *myldt, struct real_descriptor *mygdt)
  (vm_offset_t)&syscall, KERNEL_CS,
  ACC_PL_U|ACC_CALL_GATE, 0);
fill_ldt_descriptor(myldt, USER_CS,
-   VM_MIN_ADDRESS,
-   VM_MAX_ADDRESS-VM_MIN_ADDRESS-4096,
+   VM_MIN_USER_ADDRESS,
+   VM_MAX_USER_ADDRESS-VM_MIN_USER_ADDRESS-4096,
/* XXX LINEAR_... */
ACC_PL_U|ACC_CODE_R, SZ_32);
fill_ldt_descriptor(myldt, USER_DS,
-   VM_MIN_ADDRESS,
-   VM_MAX_ADDRESS-VM_MIN_ADDRESS-4096,
+   VM_MIN_USER_ADDRESS,
+   VM_MAX_USER_ADDRESS-VM_MIN_USER_ADDRESS-4096,
ACC_PL_U|ACC_DATA_W, SZ_32);
 
/* Activate the LDT.  */
diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
index 924ed08b..3ae9e095 100644
--- a/i386/i386/pcb.c
+++ b/i386/i386/pcb.c
@@ -622,10 +622,10 @@ kern_return_t thread_setstatus(
int_table = state->int_table;
int_count = state->int_count;
 
-   if (int_table >= VM_MAX_ADDRESS ||
+   if (int_table >= VM_MAX_USER_ADDRESS ||
int_table +
int_count * sizeof(struct v86_interrupt_table)
-   > VM_MAX_ADDRESS)
+   > VM_MAX_USER_ADDRESS)
return KERN_INVALID_ARGUMENT;
 
thread->pcb->ims.v86s.int_table = int_table;
@@ -834,7 +834,7 @@ thread_set_syscall_return(
 vm_offset_t
 user_stack_low(vm_size_t stack_size)
 {
-   return (VM_MAX_ADDRESS - stack_size);
+   return (VM_MAX_USER_ADDRESS - stack_size);
 }
 
 /*
diff --git a/i386/i386/vm_param.h b/i386/i386/vm_param.h
index 314fdb35..5e7f149a 100644
--- a/i386/i386/vm_param.h
+++ b/i386/i386/vm_param.h
@@ -31,6 +31,10 @@
 #include 
 #endif
 
+/* To avoid ambiguity in kernel code, make the name explicit */
+#define VM_MIN_USER_ADDRESS VM_MIN_ADDRESS
+#define VM_MAX_USER_ADDRESS VM_MAX_ADDRESS
+
 /* The kernel address space is usually 1GB, usually starting at virtual 
address 0.  */
 /* This can be changed freely to separate kernel addresses from user addresses
  * for better trace support in kdb; the _START symbol has to be offset by the
@@ -77,7 +81,7 @@
 #else
 /* On x86, the kernel virtual address space is actually located
at high linear addresses. */
-#define LINEAR_MIN_KERNEL_ADDRESS  (VM_MAX_ADDRESS)
+#define LINEAR_MIN_KERNEL_ADDRESS  (VM_MAX_USER_ADDRESS)
 #define LINEAR_MAX_KERNEL_ADDRESS  (0xUL)
 #endif
 
diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
index 9e9f91db..a9ff6f3e 100644
--- a/i386/intel/pmap.c
+++ b/i386/intel/pmap.c
@@ -1341,7 +1341,7 @@ pmap_t pmap_create(vm_size_t size)
  );
}
 #ifdef __x86_64__
-   // TODO alloc only PDPTE for the user range VM_MIN_ADDRESS, 
VM_MAX_ADDRESS
+   // TODO alloc only PDPTE for the user range VM_MIN_USER_ADDRESS, 
VM_MAX_USER_ADDRESS
// and keep the same for kernel range, in l4 table we have different 
entries
p->l4base = (pt_entry_t *) kmem_cache_alloc(&l4_cache);
if (p->l4base == NULL)
@@ -1349,7 +1349,7 @@ pmap_t pmap_create(vm_size_t size)
memset(p->l4base, 0, INTEL_PGBYTES);
WRITE_PTE(&p->l4base[lin2l4num(VM_MIN_KERNEL_ADDRESS)],
  pa_to_pte(kvtophys((vm_offset_t) pdp_kernel)) | 
INTEL_PTE_VALID | INTEL_PTE_WRITE | INTEL_PTE_USER);
-#if lin2l4num(VM_MIN_KERNEL_ADDRESS) != lin2l4num(VM_MAX_ADDRESS)
+#if lin2l4num(VM_MIN_KERNEL_ADDRESS) != lin2l4num(VM_MAX_USER_ADDRESS)
// TODO k

[PATCH 2/9] fix x86_64 asm for higher kernel addresses

2023-02-12 Thread Luca Dariz

* x86_64/interrupt.S: use 64-bit registers as variables could be
  stored at high addresses
* x86_64/locore.S: Likewise
---
 x86_64/interrupt.S | 4 ++--
 x86_64/locore.S| 6 ++
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/x86_64/interrupt.S b/x86_64/interrupt.S
index fe2b3858..31f386ec 100644
--- a/x86_64/interrupt.S
+++ b/x86_64/interrupt.S
@@ -59,10 +59,10 @@ ENTRY(interrupt)
 
     movl    S_IRQ,%eax      /* copy irq number */
     shll    $2,%eax         /* irq * 4 */
-    movl    EXT(iunit)(%eax),%edi   /* get device unit number as 1st arg */
+    movl    EXT(iunit)(%rax),%edi   /* get device unit number as 1st arg */
 
     shll    $1,%eax         /* irq * 8 */
-    call    *EXT(ivect)(%eax)   /* call interrupt handler */
+    call    *EXT(ivect)(%rax)   /* call interrupt handler */
 
     movl    S_IPL,%edi      /* restore previous ipl */
     call    splx_cli        /* restore previous ipl */
diff --git a/x86_64/locore.S b/x86_64/locore.S
index 95ece3cc..c54b5cd8 100644
--- a/x86_64/locore.S
+++ b/x86_64/locore.S
@@ -1152,7 +1152,7 @@ syscall_native:
 #endif
     shll    $5,%eax         /* manual indexing of mach_trap_t */
     xorq    %r10,%r10
-    movl    EXT(mach_trap_table)(%eax),%r10d
+    mov     EXT(mach_trap_table)(%rax),%r10
                             /* get number of arguments */
     andq    %r10,%r10
     jz      mach_call_call  /* skip argument copy if none */
@@ -1199,9 +1199,7 @@ mach_call_call:
/* will return with syscallofs still (or again) in eax */
 0:
 #endif /* DEBUG */
-
-    call    *EXT(mach_trap_table)+8(%eax)
-                            /* call procedure */
+    call    *EXT(mach_trap_table)+8(%rax)  /* call procedure */
     movq    %rsp,%rcx       /* get kernel stack */
     or      $(KERNEL_STACK_SIZE-1),%rcx
     movq    -7-IKS_SIZE(%rcx),%rsp  /* switch back to PCB stack */
-- 
2.30.2
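
The effect of indexing through %eax instead of %rax can be modelled in
C. This is a sketch of the address arithmetic only, with hypothetical
sketch_ names: with a 32-bit base register the effective address is
computed modulo 2^32 and zero-extended, so a symbol that lives above
4G is no longer reachable.

```c
#include <stdint.h>

/* Effective address with a 32-bit base register, as in sym(%eax):
   the sum is truncated to 32 bits and then zero-extended. */
static inline uint64_t
sketch_ea32(uint64_t symbol, uint32_t index)
{
    return (uint64_t) (uint32_t) (symbol + index);
}

/* Effective address with a 64-bit base register, as in sym(%rax). */
static inline uint64_t
sketch_ea64(uint64_t symbol, uint64_t index)
{
    return symbol + index;
}
```

For a table at 0xffffffff80100000 and an index of 8, the 32-bit form
yields 0x80100008 while the 64-bit form yields 0xffffffff80100008. Both
agree only while the kernel's data sits below 4G, which is why the old
code worked before the kernel was moved to higher addresses.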

[PATCH 9/9] move kernel virtual address space to upper addresses

2023-02-12 Thread Luca Dariz

* i386/i386/vm_param.h: adjust constants to the new kernel map
  - the boothdr.S code already sets up a temporary map to higher
    addresses, so we can use INIT_VM_MIN_KERNEL_ADDRESS as in xen
  - increase the kernel map size to accommodate bigger structures
    and more memory
  - adjust kernel max address and directmap limit
* i386/i386at/biosmem.c: enable directmap check also on x86_64
* i386/include/mach/i386/vm_param.h: increase user virtual memory
  limit as it no longer conflicts with the kernel's
* i386/intel/pmap.h: adjust lin2pdenum_cont() and INTEL_PTE_PFN to the
  new kernel map
* x86_64/Makefrag.am: change KERNEL_MAP_BASE to be above 4G, in
  accordance with mcmodel=kernel. This will allow using the full
  memory address space.
---
 i386/i386/vm_param.h  | 20 
 i386/i386at/biosmem.c |  2 --
 i386/include/mach/i386/vm_param.h |  2 +-
 i386/intel/pmap.h | 12 ++--
 x86_64/Makefrag.am| 12 ++--
 5 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/i386/i386/vm_param.h b/i386/i386/vm_param.h
index c2e623a6..8264ea11 100644
--- a/i386/i386/vm_param.h
+++ b/i386/i386/vm_param.h
@@ -45,7 +45,7 @@
 #define VM_MIN_KERNEL_ADDRESS  0xC000UL
 #endif
 
-#ifdef MACH_XEN
+#if defined(MACH_XEN) || defined (__x86_64__)
 /* PV kernels can be loaded directly to the target virtual address */
 #define INIT_VM_MIN_KERNEL_ADDRESS VM_MIN_KERNEL_ADDRESS
 #else  /* MACH_XEN */
@@ -72,12 +72,22 @@
  * Reserve mapping room for the kernel map, which includes
  * the device I/O map and the IPC map.
  */
+#ifdef __x86_64__
+/*
+ * Vm structures are quite bigger on 64 bit.
+ * This should be well enough for 8G of physical memory; on the other hand,
+ * maybe not all of them need to be in directly-mapped memory, see the parts
+ * allocated with pmap_steal_memory().
+ */
+#define VM_KERNEL_MAP_SIZE (512 * 1024 * 1024)
+#else
 #define VM_KERNEL_MAP_SIZE (152 * 1024 * 1024)
+#endif
 
 /* This is the kernel address range in linear addresses.  */
 #ifdef __x86_64__
 #define LINEAR_MIN_KERNEL_ADDRESS  VM_MIN_KERNEL_ADDRESS
-#define LINEAR_MAX_KERNEL_ADDRESS  (0xUL)
+#define LINEAR_MAX_KERNEL_ADDRESS  (0xUL)
 #else
 /* On x86, the kernel virtual address space is actually located
at high linear addresses. */
@@ -141,8 +151,10 @@
 #else /* MACH_XEN */
 #ifdef __LP64__
 #define VM_PAGE_MAX_SEGS 4
-#define VM_PAGE_DMA32_LIMIT DECL_CONST(0x1, UL)
-#define VM_PAGE_DIRECTMAP_LIMIT DECL_CONST(0x4000, UL)
+#define VM_PAGE_DMA32_LIMIT DECL_CONST(0x1000, UL)
+#define VM_PAGE_DIRECTMAP_LIMIT (VM_MAX_KERNEL_ADDRESS \
+- VM_MIN_KERNEL_ADDRESS \
+- VM_KERNEL_MAP_SIZE + 1)
 #define VM_PAGE_HIGHMEM_LIMIT   DECL_CONST(0x10, UL)
 #else /* __LP64__ */
 #define VM_PAGE_DIRECTMAP_LIMIT (VM_MAX_KERNEL_ADDRESS \
diff --git a/i386/i386at/biosmem.c b/i386/i386at/biosmem.c
index 78e7bb21..880989fe 100644
--- a/i386/i386at/biosmem.c
+++ b/i386/i386at/biosmem.c
@@ -637,10 +637,8 @@ biosmem_setup_allocator(const struct multiboot_raw_info 
*mbi)
  */
 end = vm_page_trunc((mbi->mem_upper + 1024) << 10);
 
-#ifndef __LP64__
 if (end > VM_PAGE_DIRECTMAP_LIMIT)
 end = VM_PAGE_DIRECTMAP_LIMIT;
-#endif /* __LP64__ */
 
 max_heap_start = 0;
 max_heap_end = 0;
diff --git a/i386/include/mach/i386/vm_param.h b/i386/include/mach/i386/vm_param.h
index a684ed97..e98f032c 100644
--- a/i386/include/mach/i386/vm_param.h
+++ b/i386/include/mach/i386/vm_param.h
@@ -74,7 +74,7 @@
*/
 #define VM_MIN_ADDRESS (0)
 #ifdef __x86_64__
-#define VM_MAX_ADDRESS (0x4000UL)
+#define VM_MAX_ADDRESS (0xC000UL)
 #else
 #define VM_MAX_ADDRESS (0xc000UL)
 #endif
diff --git a/i386/intel/pmap.h b/i386/intel/pmap.h
index 34c7cc89..78d27bc8 100644
--- a/i386/intel/pmap.h
+++ b/i386/intel/pmap.h
@@ -77,10 +77,10 @@ typedef phys_addr_t pt_entry_t;
 #define PDPNUM_KERNEL  (((VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS) >> PDPSHIFT) + 1)
 #define PDPNUM_USER    (((VM_MAX_USER_ADDRESS - VM_MIN_USER_ADDRESS) >> PDPSHIFT) + 1)
 #define PDPMASK        0x1ff   /* mask for page directory pointer index */
-#else
+#else /* __x86_64__ */
 #define PDPNUM         4       /* number of page directory pointers */
 #define PDPMASK        3       /* mask for page directory pointer index */
-#endif
+#endif /* __x86_64__ */
 #define PDPSHIFT       30      /* page directory pointer */
 #define PDESHIFT       21      /* page descriptor shift */
 #define PDEMASK        0x1ff   /* mask for page descriptor index */
@@ -109,7 +109,11 @@ typedef phys_addr_t pt_entry_t;
 #if PAE
 /* Special version assuming contiguous page directories.  Making it
include the page directory pointer table index too.  */
+#ifdef __x86_64__
+#define lin2pdenum_cont(a) (((a

Re: [PATCH 4/6] fix rpc time value for 64 bit

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:03:11 +0100, wrote:
> * include/mach/task_info.h: use rpc variant of time_value_t
> * include/mach/thread_info.h: Likewise
> * kern/mach_clock.c: use rpc variant of time_value_t in
>   read_time_stamp()
> * kern/mach_clock.h: Likewise
> * kern/thread.c: use rpc variant of thread_read_times()
> * kern/timer.h: add thread_read_times_rpc() by converting time_value_t
>   to the corresponding rpc structures inline.
> ---
>  include/mach/task_info.h   | 10 +-
>  include/mach/thread_info.h |  6 +++---
>  kern/mach_clock.c  |  2 +-
>  kern/mach_clock.h  |  2 +-
>  kern/thread.c  |  2 +-
>  kern/timer.h   | 12 
>  6 files changed, 23 insertions(+), 11 deletions(-)
> 
> diff --git a/include/mach/task_info.h b/include/mach/task_info.h
> index 3aaa7cd6..f448ee04 100644
> --- a/include/mach/task_info.h
> +++ b/include/mach/task_info.h
> @@ -56,11 +56,11 @@ struct task_basic_info {
>   integer_t   base_priority;  /* base scheduling priority */
>   rpc_vm_size_t   virtual_size;   /* number of virtual pages */
>   rpc_vm_size_t   resident_size;  /* number of resident pages */
> -    time_value_t    user_time;  /* total user run time for
> +    rpc_time_value_t    user_time;  /* total user run time for
>                         terminated threads */
> -    time_value_t    system_time;    /* total system run time for
> +    rpc_time_value_t    system_time;    /* total system run time for
>                         terminated threads */
> -    time_value_t    creation_time;  /* creation time stamp */
> +    rpc_time_value_t    creation_time;  /* creation time stamp */
>  };
>  
>  typedef struct task_basic_info   task_basic_info_data_t;
> @@ -89,9 +89,9 @@ typedef struct task_events_info 
> *task_events_info_t;
>  only accurate if suspended */
>  
>  struct task_thread_times_info {
> -    time_value_t    user_time;  /* total user run time for
> +    rpc_time_value_t    user_time;  /* total user run time for
>                         live threads */
> -    time_value_t    system_time;    /* total system run time for
> +    rpc_time_value_t    system_time;    /* total system run time for
>                         live threads */
>  };
>  
> diff --git a/include/mach/thread_info.h b/include/mach/thread_info.h
> index 569c8c84..46c1ceca 100644
> --- a/include/mach/thread_info.h
> +++ b/include/mach/thread_info.h
> @@ -55,8 +55,8 @@ typedef integer_t   
> thread_info_data_t[THREAD_INFO_MAX];
>  #define THREAD_BASIC_INFO1   /* basic information */
>  
>  struct thread_basic_info {
> - time_value_tuser_time;  /* user run time */
> - time_value_tsystem_time;/* system run time */
> + rpc_time_value_tuser_time;  /* user run time */
> + rpc_time_value_tsystem_time;/* system run time */
>   integer_t   cpu_usage;  /* scaled cpu usage percentage */
>   integer_t   base_priority;  /* base scheduling priority */
>   integer_t   cur_priority;   /* current scheduling priority */
> @@ -65,7 +65,7 @@ struct thread_basic_info {
>   integer_t   suspend_count;  /* suspend count for thread */
>   integer_t   sleep_time; /* number of seconds that thread
>  has been sleeping */
> - time_value_tcreation_time;  /* time stamp of creation */
> + rpc_time_value_tcreation_time;  /* time stamp of creation */
>  };
>  
>  typedef struct thread_basic_info thread_basic_info_data_t;
> diff --git a/kern/mach_clock.c b/kern/mach_clock.c
> index 09717d16..ed38c76b 100644
> --- a/kern/mach_clock.c
> +++ b/kern/mach_clock.c
> @@ -429,7 +429,7 @@ record_time_stamp(time_value_t *stamp)
>   * real-time clock frame.
>   */
>  void
> -read_time_stamp (const time_value_t *stamp, time_value_t *result)
> +read_time_stamp (const time_value_t *stamp, rpc_time_value_t *result)
>  {
>   time_value64_t result64;
>   TIME_VALUE_TO_TIME_VALUE64(stamp, &result64);
> diff --git a/kern/mach_clock.h b/kern/mach_clock.h
> index 7e8d3046..9a670011 100644
> --- a/kern/mach_clock.h
> +++ b/kern/mach_clock.h
> @@ -98,7 +98,7 @@ extern void record_time_stamp (time_value_t *stamp);
>   * Read a timestamp in STAMP into RESULT.  Returns values in the
>   * real-time clock frame.
>   */
> -extern void read_time_stamp (const time_value_t *stamp, time_value_t 
> *result);
> +extern void read_time_stamp (const time_value_t *stamp, rpc_time_value_t 
> *result);
>  
>  extern void mapable_time_init (void);
>  
> diff --git a/kern/thread.c b/kern/thread.c
> index 17cc458c..4a6b9eda 100644
> --- a/kern/thread.c
> +++ b/kern/thread.c
> @@ -1522,7 +1522,7 @@ kern_return_t thread_info(
>  
>   /

[PATCH 5/9] use L4 page table directly on x86_64 instead of short-circuiting to pdpbase

2023-02-12 Thread Luca Dariz

This prepares for running the kernel at high addresses, where the
user vm region and the kernel vm region will use different L3 page
tables.

* i386/intel/pmap.c: on x86_64, retrieve the value of pdpbase from the
  L4 table, and add the pmap_pdp() helper (useful also for PAE).
* i386/intel/pmap.h: remove pdpbase on x86_64.
---
 i386/intel/pmap.c | 97 ---
 i386/intel/pmap.h |  7 ++--
 2 files changed, 78 insertions(+), 26 deletions(-)

diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
index 470be744..9e9f91db 100644
--- a/i386/intel/pmap.c
+++ b/i386/intel/pmap.c
@@ -430,14 +430,11 @@ pt_entry_t *kernel_page_dir;
 static pmap_mapwindow_t mapwindows[PMAP_NMAPWINDOWS];
 def_simple_lock_data(static, pmapwindows_lock)
 
+#ifdef PAE
 static inline pt_entry_t *
-pmap_pde(const pmap_t pmap, vm_offset_t addr)
+pmap_ptp(const pmap_t pmap, vm_offset_t addr)
 {
-   pt_entry_t *page_dir;
-   if (pmap == kernel_pmap)
-   addr = kvtolin(addr);
-#if PAE
-   pt_entry_t *pdp_table, pdp, pde;
+   pt_entry_t *pdp_table, pdp;
 #ifdef __x86_64__
pdp = pmap->l4base[lin2l4num(addr)];
if ((pdp & INTEL_PTE_VALID) == 0)
@@ -446,6 +443,19 @@ pmap_pde(const pmap_t pmap, vm_offset_t addr)
 #else /* __x86_64__ */
pdp_table = pmap->pdpbase;
 #endif /* __x86_64__ */
+   return pdp_table;
+}
+#endif
+
+static inline pt_entry_t *
+pmap_pde(const pmap_t pmap, vm_offset_t addr)
+{
+   pt_entry_t *page_dir;
+   if (pmap == kernel_pmap)
+   addr = kvtolin(addr);
+#if PAE
+   pt_entry_t *pdp_table, pde;
+   pdp_table = pmap_ptp(pmap, addr);
pde = pdp_table[lin2pdpnum(addr)];
if ((pde & INTEL_PTE_VALID) == 0)
return PT_ENTRY_NULL;
@@ -585,6 +595,7 @@ vm_offset_t pmap_map_bd(
 static void pmap_bootstrap_pae(void)
 {
vm_offset_t addr;
+   pt_entry_t *pdp_kernel;
 
 #ifdef __x86_64__
 #ifdef MACH_HYP
@@ -595,13 +606,15 @@ static void pmap_bootstrap_pae(void)
memset(kernel_pmap->l4base, 0, INTEL_PGBYTES);
 #endif /* x86_64 */
 
+   // TODO: allocate only the PDPTE for kernel virtual space
+   // this means all directmap and the stupid limit above it
init_alloc_aligned(PDPNUM * INTEL_PGBYTES, &addr);
kernel_page_dir = (pt_entry_t*)phystokv(addr);
 
-   kernel_pmap->pdpbase = (pt_entry_t*)phystokv(pmap_grab_page());
-   memset(kernel_pmap->pdpbase, 0, INTEL_PGBYTES);
+   pdp_kernel = (pt_entry_t*)phystokv(pmap_grab_page());
+   memset(pdp_kernel, 0, INTEL_PGBYTES);
for (int i = 0; i < PDPNUM; i++)
-   WRITE_PTE(&kernel_pmap->pdpbase[i],
+   WRITE_PTE(&pdp_kernel[i],
  pa_to_pte(_kvtophys((void *) kernel_page_dir
  + i * INTEL_PGBYTES))
  | INTEL_PTE_VALID
@@ -611,10 +624,14 @@ static void pmap_bootstrap_pae(void)
);
 
 #ifdef __x86_64__
-   WRITE_PTE(&kernel_pmap->l4base[0], 
pa_to_pte(_kvtophys(kernel_pmap->pdpbase)) | INTEL_PTE_VALID | INTEL_PTE_WRITE);
+/* only fill the kernel pdpte during bootstrap */
+   WRITE_PTE(&kernel_pmap->l4base[lin2l4num(VM_MIN_KERNEL_ADDRESS)],
+  pa_to_pte(_kvtophys(pdp_kernel)) | INTEL_PTE_VALID | 
INTEL_PTE_WRITE);
 #ifdef MACH_PV_PAGETABLES
pmap_set_page_readonly_init(kernel_pmap->l4base);
-#endif
+#endif /* MACH_PV_PAGETABLES */
+#else  /* x86_64 */
+kernel_pmap->pdpbase = pdp_kernel;
 #endif /* x86_64 */
 }
 #endif /* PAE */
@@ -1243,7 +1260,7 @@ pmap_page_table_page_dealloc(vm_offset_t pa)
  */
 pmap_t pmap_create(vm_size_t size)
 {
-   pt_entry_t  *page_dir[PDPNUM];
+   pt_entry_t  *page_dir[PDPNUM], *pdp_kernel;
int i;
pmap_t  p;
pmap_statistics_t   stats;
@@ -1301,34 +1318,40 @@ pmap_t pmap_create(vm_size_t size)
 #endif /* MACH_PV_PAGETABLES */
 
 #if PAE
-   p->pdpbase = (pt_entry_t *) kmem_cache_alloc(&pdpt_cache);
-   if (p->pdpbase == NULL) {
+   pdp_kernel = (pt_entry_t *) kmem_cache_alloc(&pdpt_cache);
+   if (pdp_kernel == NULL) {
for (i = 0; i < PDPNUM; i++)
kmem_cache_free(&pd_cache, (vm_address_t) page_dir[i]);
kmem_cache_free(&pmap_cache, (vm_address_t) p);
return PMAP_NULL;
}
 
-   memset(p->pdpbase, 0, INTEL_PGBYTES);
+   memset(pdp_kernel, 0, INTEL_PGBYTES);
{
for (i = 0; i < PDPNUM; i++)
-   WRITE_PTE(&p->pdpbase[i],
+   WRITE_PTE(&pdp_kernel[i],
  pa_to_pte(kvtophys((vm_offset_t) page_dir[i]))
  | INTEL_PTE_VALID
 #if (defined(__x86_64__) && !defined(MACH_HYP)) || defined(MACH_PV_PAGETABLES)
  | INTEL_PTE_WRITE
 #ifdef __x86_64__

[PATCH 4/9] factor out PAE-specific bootstrap

2023-02-12 Thread Luca Dariz

* i386/intel/pmap.c: move it to pmap_bootstrap_pae()
---
 i386/intel/pmap.c | 72 ++-
 1 file changed, 40 insertions(+), 32 deletions(-)

diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
index 15577a09..470be744 100644
--- a/i386/intel/pmap.c
+++ b/i386/intel/pmap.c
@@ -581,8 +581,46 @@ vm_offset_t pmap_map_bd(
return(virt);
 }
 
+#ifdef PAE
+static void pmap_bootstrap_pae(void)
+{
+   vm_offset_t addr;
+
+#ifdef __x86_64__
+#ifdef MACH_HYP
+   kernel_pmap->user_l4base = NULL;
+   kernel_pmap->user_pdpbase = NULL;
+#endif
+   kernel_pmap->l4base = (pt_entry_t*)phystokv(pmap_grab_page());
+   memset(kernel_pmap->l4base, 0, INTEL_PGBYTES);
+#endif /* x86_64 */
+
+   init_alloc_aligned(PDPNUM * INTEL_PGBYTES, &addr);
+   kernel_page_dir = (pt_entry_t*)phystokv(addr);
+
+   kernel_pmap->pdpbase = (pt_entry_t*)phystokv(pmap_grab_page());
+   memset(kernel_pmap->pdpbase, 0, INTEL_PGBYTES);
+   for (int i = 0; i < PDPNUM; i++)
+   WRITE_PTE(&kernel_pmap->pdpbase[i],
+ pa_to_pte(_kvtophys((void *) kernel_page_dir
+ + i * INTEL_PGBYTES))
+ | INTEL_PTE_VALID
+#if (defined(__x86_64__) && !defined(MACH_HYP)) || defined(MACH_PV_PAGETABLES)
+ | INTEL_PTE_WRITE
+#endif
+   );
+
+#ifdef __x86_64__
+   WRITE_PTE(&kernel_pmap->l4base[0], 
pa_to_pte(_kvtophys(kernel_pmap->pdpbase)) | INTEL_PTE_VALID | INTEL_PTE_WRITE);
+#ifdef MACH_PV_PAGETABLES
+   pmap_set_page_readonly_init(kernel_pmap->l4base);
+#endif
+#endif /* x86_64 */
+}
+#endif /* PAE */
+
 #ifdef MACH_PV_PAGETABLES
-void pmap_bootstrap_xen()
+static void pmap_bootstrap_xen(void)
 {
/* We don't actually deal with the CR3 register content at all */
hyp_vm_assist(VMASST_CMD_enable, VMASST_TYPE_pae_extended_cr3);
@@ -691,37 +729,7 @@ void pmap_bootstrap(void)
/* Note: initial Xen mapping holds at least 512kB free mapped page.
 * We use that for directly building our linear mapping. */
 #if PAE
-   {
-   vm_offset_t addr;
-   init_alloc_aligned(PDPNUM * INTEL_PGBYTES, &addr);
-   kernel_page_dir = (pt_entry_t*)phystokv(addr);
-   }
-   kernel_pmap->pdpbase = (pt_entry_t*)phystokv(pmap_grab_page());
-   memset(kernel_pmap->pdpbase, 0, INTEL_PGBYTES);
-   {
-   int i;
-   for (i = 0; i < PDPNUM; i++)
-   WRITE_PTE(&kernel_pmap->pdpbase[i],
- pa_to_pte(_kvtophys((void *) kernel_page_dir
- + i * INTEL_PGBYTES))
- | INTEL_PTE_VALID
-#if (defined(__x86_64__) && !defined(MACH_HYP)) || defined(MACH_PV_PAGETABLES)
- | INTEL_PTE_WRITE
-#endif
- );
-   }
-#ifdef __x86_64__
-#ifdef MACH_HYP
-   kernel_pmap->user_l4base = NULL;
-   kernel_pmap->user_pdpbase = NULL;
-#endif
-   kernel_pmap->l4base = (pt_entry_t*)phystokv(pmap_grab_page());
-   memset(kernel_pmap->l4base, 0, INTEL_PGBYTES);
-   WRITE_PTE(&kernel_pmap->l4base[0], 
pa_to_pte(_kvtophys(kernel_pmap->pdpbase)) | INTEL_PTE_VALID | INTEL_PTE_WRITE);
-#ifdef MACH_PV_PAGETABLES
-   pmap_set_page_readonly_init(kernel_pmap->l4base);
-#endif
-#endif /* x86_64 */
+   pmap_bootstrap_pae();
 #else  /* PAE */
kernel_pmap->dirbase = kernel_page_dir = 
(pt_entry_t*)phystokv(pmap_grab_page());
 #endif /* PAE */
-- 
2.30.2

[PATCH 8/9] separate initialization of kernel and user PTP tables

2023-02-12 Thread Luca Dariz

* i386/i386/vm_param.h: temporarily fix the kernel upper address
* i386/intel/pmap.c: split kernel and user L3 map initialization. For
  simplicity in handling the different configurations, on 32-bit
  (+PAE) the name PDPNUM_KERNEL is used in place of PDPNUM, while only
  on x86_64 are PDPNUM_USER and PDPNUM_KERNEL treated differently.
  Also, change the iteration over PTP tables to handle the case where
  the kernel map does not immediately follow the user map.
* i386/intel/pmap.h: define PDPNUM_USER and PDPNUM_KERNEL and move
  PDPSHIFT to simplify ifdefs.
---
 i386/i386/vm_param.h |  2 +-
 i386/intel/pmap.c| 62 ++--
 i386/intel/pmap.h|  8 +++---
 3 files changed, 52 insertions(+), 20 deletions(-)

diff --git a/i386/i386/vm_param.h b/i386/i386/vm_param.h
index 5e7f149a..c2e623a6 100644
--- a/i386/i386/vm_param.h
+++ b/i386/i386/vm_param.h
@@ -77,7 +77,7 @@
 /* This is the kernel address range in linear addresses.  */
 #ifdef __x86_64__
 #define LINEAR_MIN_KERNEL_ADDRESS  VM_MIN_KERNEL_ADDRESS
-#define LINEAR_MAX_KERNEL_ADDRESS  (0x7fffUL)
+#define LINEAR_MAX_KERNEL_ADDRESS  (0xUL)
 #else
 /* On x86, the kernel virtual address space is actually located
at high linear addresses. */
diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
index a9ff6f3e..7d4ad341 100644
--- a/i386/intel/pmap.c
+++ b/i386/intel/pmap.c
@@ -604,17 +604,22 @@ static void pmap_bootstrap_pae(void)
 #endif
kernel_pmap->l4base = (pt_entry_t*)phystokv(pmap_grab_page());
memset(kernel_pmap->l4base, 0, INTEL_PGBYTES);
+#else
+   const int PDPNUM_KERNEL = PDPNUM;
 #endif /* x86_64 */
 
-   // TODO: allocate only the PDPTE for kernel virtual space
-   // this means all directmap and the stupid limit above it
-   init_alloc_aligned(PDPNUM * INTEL_PGBYTES, &addr);
+   init_alloc_aligned(PDPNUM_KERNEL * INTEL_PGBYTES, &addr);
kernel_page_dir = (pt_entry_t*)phystokv(addr);
+   memset(kernel_page_dir, 0, PDPNUM_KERNEL * INTEL_PGBYTES);
 
pdp_kernel = (pt_entry_t*)phystokv(pmap_grab_page());
memset(pdp_kernel, 0, INTEL_PGBYTES);
-   for (int i = 0; i < PDPNUM; i++)
-   WRITE_PTE(&pdp_kernel[i],
+   for (int i = 0; i < PDPNUM_KERNEL; i++) {
+   int pdp_index = i;
+#ifdef __x86_64__
+   pdp_index += lin2pdpnum(VM_MIN_KERNEL_ADDRESS);
+#endif
+   WRITE_PTE(&pdp_kernel[pdp_index],
  pa_to_pte(_kvtophys((void *) kernel_page_dir
  + i * INTEL_PGBYTES))
  | INTEL_PTE_VALID
@@ -622,6 +627,7 @@ static void pmap_bootstrap_pae(void)
  | INTEL_PTE_WRITE
 #endif
);
+   }
 
 #ifdef __x86_64__
 /* only fill the kernel pdpte during bootstrap */
@@ -749,12 +755,12 @@ void pmap_bootstrap(void)
pmap_bootstrap_pae();
 #else  /* PAE */
kernel_pmap->dirbase = kernel_page_dir = 
(pt_entry_t*)phystokv(pmap_grab_page());
-#endif /* PAE */
{
unsigned i;
for (i = 0; i < NPDES; i++)
kernel_page_dir[i] = 0;
}
+#endif /* PAE */
 
 #ifdef MACH_PV_PAGETABLES
pmap_bootstrap_xen()
@@ -1260,6 +1266,10 @@ pmap_page_table_page_dealloc(vm_offset_t pa)
  */
 pmap_t pmap_create(vm_size_t size)
 {
+#ifdef __x86_64__
+   // needs to be reworked if we want to dynamically allocate PDPs
+   const int PDPNUM = PDPNUM_KERNEL;
+#endif
pt_entry_t  *page_dir[PDPNUM], *pdp_kernel;
int i;
pmap_t  p;
@@ -1328,8 +1338,12 @@ pmap_t pmap_create(vm_size_t size)
 
memset(pdp_kernel, 0, INTEL_PGBYTES);
{
-   for (i = 0; i < PDPNUM; i++)
-   WRITE_PTE(&pdp_kernel[i],
+   for (i = 0; i < PDPNUM; i++) {
+   int pdp_index = i;
+#ifdef __x86_64__
+   pdp_index += lin2pdpnum(VM_MIN_KERNEL_ADDRESS);
+#endif
+   WRITE_PTE(&pdp_kernel[pdp_index],
  pa_to_pte(kvtophys((vm_offset_t) page_dir[i]))
  | INTEL_PTE_VALID
 #if (defined(__x86_64__) && !defined(MACH_HYP)) || defined(MACH_PV_PAGETABLES)
@@ -1339,19 +1353,39 @@ pmap_t pmap_create(vm_size_t size)
 #endif /* __x86_64__ */
 #endif
  );
+   }
}
 #ifdef __x86_64__
-   // TODO alloc only PDPTE for the user range VM_MIN_USER_ADDRESS, 
VM_MAX_USER_ADDRESS
-   // and keep the same for kernel range, in l4 table we have different 
entries
p->l4base = (pt_entry_t *) kmem_cache_alloc(&l4_cache);
if (p->l4base == NULL)
panic("pmap_create");
memset(p->l4base, 0, INTEL_PGBYTES);
WRITE_PTE(&p->l4base[lin2l4num(VM_MIN_KERNEL_ADDRESS)],
- pa_to_pte(kvtophys((vm_offset

[PATCH 0/9 gnumach] move kernel vm map to high addresses on x86_64

2023-02-12 Thread Luca Dariz

The kernel vm region is moved to the last 2GB of the 64-bit address
space. The -mcmodel=kernel code model requires the code to be placed in
this range, but other addresses can probably be used for other data
structures, direct memory mapping and so on (as in Linux).

This is just the first step towards being able to use more memory:
user-space still has the same 3GB as on a 32-bit kernel, and the
memory layout has not changed much otherwise. The kernel map no longer
immediately follows the user map, so the main changes are in
the pmap module.
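
To make "the last 2GB" concrete, here is a small sketch of where such an
address lands in the page-table tree. It assumes standard x86_64 4-level
paging with 9 index bits per level; KERNEL_BASE, l4_index() and pdp_index()
are illustrative names, not the gnumach definitions (the patches use
VM_MIN_KERNEL_ADDRESS, lin2l4num() and lin2pdpnum()).

```c
#include <stdint.h>

/* Assumed for illustration: start of the last 2 GiB of the 64-bit
   address space, the range that -mcmodel=kernel expects code to be in. */
#define KERNEL_BASE 0xffffffff80000000ULL

/* Hypothetical helpers: index into the L4 and PDP (L3) tables,
   taking 9 bits of the linear address per level. */
static unsigned l4_index(uint64_t addr)
{
    return (unsigned)((addr >> 39) & 0x1ff);
}

static unsigned pdp_index(uint64_t addr)
{
    return (unsigned)((addr >> 30) & 0x1ff);
}
```

Under these assumptions KERNEL_BASE maps to L4 slot 511 and PDP slot 510,
so the kernel no longer lives in L4 slot 0 next to user space. That is why
the series has to stop assuming a single PDP table per pmap.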

Luca Dariz (9):
  prepare pmap helpers for full 64 bit memory map
  fix x86_64 asm for higher kernel addresses
  factor out xen-specific bootstrap
  factor out PAE-specific bootstrap
  use L4 page table directly on x86_64 instead of short-circuiting to
pdpbase
  add more explicit names for user space virtual space limits
  extend data types to hold a 64-bit address
  separate initialization of kernel and user PTP tables
  move kernel virtual address space to upper addresses

 i386/i386/db_interface.c  |   4 +-
 i386/i386/ldt.c   |   8 +-
 i386/i386/pcb.c   |   6 +-
 i386/i386/trap.c  |  12 +-
 i386/i386/vm_param.h  |  26 ++-
 i386/i386at/biosmem.c |   2 -
 i386/include/mach/i386/vm_param.h |   2 +-
 i386/intel/pmap.c | 328 --
 i386/intel/pmap.h |  27 ++-
 kern/task.c   |   4 +-
 x86_64/Makefrag.am|  12 +-
 x86_64/interrupt.S|   4 +-
 x86_64/locore.S   |  10 +-
 13 files changed, 290 insertions(+), 155 deletions(-)

-- 
2.30.2

[PATCH 3/9] factor out xen-specific bootstrap

2023-02-12 Thread Luca Dariz

* i386/intel/pmap.c: move it to pmap_bootstrap_xen()
---
 i386/intel/pmap.c | 107 --
 1 file changed, 56 insertions(+), 51 deletions(-)

diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
index 9fe16368..15577a09 100644
--- a/i386/intel/pmap.c
+++ b/i386/intel/pmap.c
@@ -581,6 +581,61 @@ vm_offset_t pmap_map_bd(
return(virt);
 }
 
+#ifdef MACH_PV_PAGETABLES
+void pmap_bootstrap_xen()
+{
+   /* We don't actually deal with the CR3 register content at all */
+   hyp_vm_assist(VMASST_CMD_enable, VMASST_TYPE_pae_extended_cr3);
+   /*
+* Xen may only provide as few as 512KB extra bootstrap linear memory,
+* which is far from enough to map all available memory, so we need to
+* map more bootstrap linear memory. We here map 1 (resp. 4 for PAE)
+* other L1 table(s), thus 4MiB extra memory (resp. 8MiB), which is
+* enough for a pagetable mapping 4GiB.
+*/
+#ifdef PAE
+#define NSUP_L1 4
+#else
+#define NSUP_L1 1
+#endif
+   pt_entry_t *l1_map[NSUP_L1];
+   vm_offset_t la;
+   int n_l1map;
+   for (n_l1map = 0, la = VM_MIN_KERNEL_ADDRESS; la >= 
VM_MIN_KERNEL_ADDRESS; la += NPTES * PAGE_SIZE) {
+   pt_entry_t *base = (pt_entry_t*) boot_info.pt_base;
+#ifdef PAE
+#ifdef __x86_64__
+   base = (pt_entry_t*) ptetokv(base[0]);
+#endif /* x86_64 */
+   pt_entry_t *l2_map = (pt_entry_t*) 
ptetokv(base[lin2pdpnum(la)]);
+#else  /* PAE */
+   pt_entry_t *l2_map = base;
+#endif /* PAE */
+   /* Like lin2pdenum, but works with non-contiguous boot L3 */
+   l2_map += (la >> PDESHIFT) & PDEMASK;
+   if (!(*l2_map & INTEL_PTE_VALID)) {
+   struct mmu_update update;
+   unsigned j, n;
+
+   l1_map[n_l1map] = (pt_entry_t*) 
phystokv(pmap_grab_page());
+   for (j = 0; j < NPTES; j++)
+   l1_map[n_l1map][j] = 
(((pt_entry_t)pfn_to_mfn(lin2pdenum(la - VM_MIN_KERNEL_ADDRESS) * NPTES + j)) 
<< PAGE_SHIFT) | INTEL_PTE_VALID | INTEL_PTE_WRITE;
+   pmap_set_page_readonly_init(l1_map[n_l1map]);
+   if (!hyp_mmuext_op_mfn (MMUEXT_PIN_L1_TABLE, kv_to_mfn 
(l1_map[n_l1map])))
+   panic("couldn't pin page %p(%lx)", 
l1_map[n_l1map], (vm_offset_t) kv_to_ma (l1_map[n_l1map]));
+   update.ptr = kv_to_ma(l2_map);
+   update.val = kv_to_ma(l1_map[n_l1map]) | 
INTEL_PTE_VALID | INTEL_PTE_WRITE;
+   hyp_mmu_update(kv_to_la(&update), 1, kv_to_la(&n), 
DOMID_SELF);
+   if (n != 1)
+   panic("couldn't complete bootstrap map");
+   /* added the last L1 table, can stop */
+   if (++n_l1map >= NSUP_L1)
+   break;
+   }
+   }
+}
+#endif /* MACH_PV_PAGETABLES */
+
 /*
  * Bootstrap the system enough to run with virtual memory.
  * Allocate the kernel page directory and page tables,
@@ -677,57 +732,7 @@ void pmap_bootstrap(void)
}
 
 #ifdef MACH_PV_PAGETABLES
-   /* We don't actually deal with the CR3 register content at all */
-   hyp_vm_assist(VMASST_CMD_enable, VMASST_TYPE_pae_extended_cr3);
-   /*
-* Xen may only provide as few as 512KB extra bootstrap linear memory,
-* which is far from enough to map all available memory, so we need to
-* map more bootstrap linear memory. We here map 1 (resp. 4 for PAE)
-* other L1 table(s), thus 4MiB extra memory (resp. 8MiB), which is
-* enough for a pagetable mapping 4GiB.
-*/
-#ifdef PAE
-#define NSUP_L1 4
-#else
-#define NSUP_L1 1
-#endif
-   pt_entry_t *l1_map[NSUP_L1];
-   {
-   vm_offset_t la;
-   int n_l1map;
-   for (n_l1map = 0, la = VM_MIN_KERNEL_ADDRESS; la >= 
VM_MIN_KERNEL_ADDRESS; la += NPTES * PAGE_SIZE) {
-   pt_entry_t *base = (pt_entry_t*) boot_info.pt_base;
-#ifdef PAE
-#ifdef __x86_64__
-   base = (pt_entry_t*) ptetokv(base[0]);
-#endif /* x86_64 */
-   pt_entry_t *l2_map = (pt_entry_t*) 
ptetokv(base[lin2pdpnum(la)]);
-#else  /* PAE */
-   pt_entry_t *l2_map = base;
-#endif /* PAE */
-   /* Like lin2pdenum, but works with non-contiguous boot 
L3 */
-   l2_map += (la >> PDESHIFT) & PDEMASK;
-   if (!(*l2_map & INTEL_PTE_VALID)) {
-   struct mmu_update update;
-   unsigned j, n;
-
-   l1_map[n_l1map] = (pt_entry_t*) 
phystokv(pmap_grab_page());
-   for (j = 0; j < NPTES; j++)
-   l1_map[n_l1map][j] = 
(((pt_entry_t)pfn_to_mfn(lin2pden

[PATCH 7/9] extend data types to hold a 64-bit address

2023-02-12 Thread Luca Dariz

* i386/i386/trap.c: change from int to a proper type to hold a
  register value
* x86_64/locore.S: use 64-bit register to avoid address truncation
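
A rough C model of the locore.S truncation, for readers less familiar with
x86_64: a write to a 32-bit register zero-extends into the upper half of
the 64-bit register. The function names are made up for this sketch and
the KERNEL_STACK_SIZE value is an assumption, not the value from the tree.

```c
#include <stdint.h>

/* Assumed value, for illustration only. */
#define KERNEL_STACK_SIZE 0x4000ULL

/* Models `or $(KERNEL_STACK_SIZE-1),%ecx`: the result is written to the
   32-bit register, so the upper 32 bits of %rcx become zero. */
static uint64_t stack_top_32(uint64_t rsp)
{
    uint32_t ecx = (uint32_t)rsp | (uint32_t)(KERNEL_STACK_SIZE - 1);
    return (uint64_t)ecx;
}

/* Models `or $(KERNEL_STACK_SIZE-1),%rcx`: all 64 bits are preserved. */
static uint64_t stack_top_64(uint64_t rsp)
{
    return rsp | (KERNEL_STACK_SIZE - 1);
}
```

For a stack below 4GB both variants agree, which is why the old code
worked. For a stack pointer such as 0xffffffff80123456 the 32-bit variant
yields 0x80123fff, which is no longer a kernel address.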
---
 i386/i386/trap.c | 12 ++--
 x86_64/locore.S  |  4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/i386/i386/trap.c b/i386/i386/trap.c
index 1e04ae7d..9a35fb42 100644
--- a/i386/i386/trap.c
+++ b/i386/i386/trap.c
@@ -154,9 +154,9 @@ char *trap_name(unsigned int trapnum)
  */
 void kernel_trap(struct i386_saved_state *regs)
 {
-   int code;
-   int subcode;
-   int type;
+   unsigned long   code;
+   unsigned long   subcode;
+   unsigned long   type;
vm_map_tmap;
kern_return_t   result;
thread_tthread;
@@ -357,9 +357,9 @@ dump_ss(regs);
 int user_trap(struct i386_saved_state *regs)
 {
int exc = 0;/* Suppress gcc warning */
-   int code;
-   int subcode;
-   int type;
+   unsigned long   code;
+   unsigned long   subcode;
+   unsigned long   type;
thread_t thread = current_thread();
 
 #ifdef __x86_64__
diff --git a/x86_64/locore.S b/x86_64/locore.S
index c54b5cd8..a2663aff 100644
--- a/x86_64/locore.S
+++ b/x86_64/locore.S
@@ -590,7 +590,7 @@ trap_from_kernel:
 ENTRY(thread_exception_return)
 ENTRY(thread_bootstrap_return)
movq%rsp,%rcx   /* get kernel stack */
-   or  $(KERNEL_STACK_SIZE-1),%ecx
+   or  $(KERNEL_STACK_SIZE-1),%rcx
movq-7-IKS_SIZE(%rcx),%rsp  /* switch back to PCB stack */
jmp _return_from_trap
 
@@ -603,7 +603,7 @@ ENTRY(thread_bootstrap_return)
 ENTRY(thread_syscall_return)
movqS_ARG0,%rax /* get return value */
movq%rsp,%rcx   /* get kernel stack */
-   or  $(KERNEL_STACK_SIZE-1),%ecx
+   or  $(KERNEL_STACK_SIZE-1),%rcx
movq-7-IKS_SIZE(%rcx),%rsp  /* switch back to PCB stack */
movq%rax,R_EAX(%rsp)/* save return value */
jmp _return_from_trap
-- 
2.30.2

[PATCH 1/9] prepare pmap helpers for full 64 bit memory map

2023-02-12 Thread Luca Dariz

* i386/intel/pmap.c: start walking the page table tree from the L4
  table instead of the PDP table in pmap_pte() and pmap_pde(),
  preparing for the kernel to run on high addresses.
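
The walk described above can be sketched in plain C as follows. This is a
toy model, not the gnumach code: the types and helpers (pte_t, make_entry(),
entry_table(), walk_to_pde()) are hypothetical, and a table entry stores a
host pointer instead of a physical address, so no ptetokv() conversion is
needed.

```c
#include <stddef.h>
#include <stdint.h>

#define PTE_VALID 0x1ULL   /* stands in for INTEL_PTE_VALID */
#define NENTRIES  512      /* entries per table with 9 index bits */

typedef uint64_t pte_t;

/* Toy encoding: pointer to the next table with the valid bit set.
   Bit 0 is free because pte_t tables are at least 8-byte aligned. */
static pte_t make_entry(pte_t *next)
{
    return (pte_t)(uintptr_t)next | PTE_VALID;
}

static pte_t *entry_table(pte_t e)
{
    return (pte_t *)(uintptr_t)(e & ~PTE_VALID);
}

/* Walk L4 -> PDP -> page directory, returning a pointer to the PDE
   for addr, or NULL as soon as an intermediate entry is not valid. */
static pte_t *walk_to_pde(pte_t *l4, uint64_t addr)
{
    pte_t e = l4[(addr >> 39) & 0x1ff];
    if ((e & PTE_VALID) == 0)
        return NULL;
    pte_t *pdp = entry_table(e);

    e = pdp[(addr >> 30) & 0x1ff];
    if ((e & PTE_VALID) == 0)
        return NULL;
    pte_t *pd = entry_table(e);

    return &pd[(addr >> 21) & 0x1ff];
}
```

The two early returns correspond to the PT_ENTRY_NULL checks added in
pmap_pde(); they are also why pmap_pte() has to test the returned pointer
before dereferencing it.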
---
 i386/intel/pmap.c | 28 +++-
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
index 615b0fff..9fe16368 100644
--- a/i386/intel/pmap.c
+++ b/i386/intel/pmap.c
@@ -437,10 +437,22 @@ pmap_pde(const pmap_t pmap, vm_offset_t addr)
if (pmap == kernel_pmap)
addr = kvtolin(addr);
 #if PAE
-   page_dir = (pt_entry_t *) ptetokv(pmap->pdpbase[lin2pdpnum(addr)]);
-#else
+   pt_entry_t *pdp_table, pdp, pde;
+#ifdef __x86_64__
+   pdp = pmap->l4base[lin2l4num(addr)];
+   if ((pdp & INTEL_PTE_VALID) == 0)
+   return PT_ENTRY_NULL;
+   pdp_table = (pt_entry_t *) ptetokv(pdp);
+#else /* __x86_64__ */
+   pdp_table = pmap->pdpbase;
+#endif /* __x86_64__ */
+   pde = pdp_table[lin2pdpnum(addr)];
+   if ((pde & INTEL_PTE_VALID) == 0)
+   return PT_ENTRY_NULL;
+   page_dir = (pt_entry_t *) ptetokv(pde);
+#else /* PAE */
page_dir = pmap->dirbase;
-#endif
+#endif /* PAE */
return &page_dir[lin2pdenum(addr)];
 }
 
@@ -457,14 +469,20 @@ pmap_pte(const pmap_t pmap, vm_offset_t addr)
pt_entry_t  *ptp;
pt_entry_t  pte;
 
-#if PAE
+#ifdef __x86_64__
+   if (pmap->l4base == 0)
+   return(PT_ENTRY_NULL);
+#elif PAE
if (pmap->pdpbase == 0)
return(PT_ENTRY_NULL);
 #else
if (pmap->dirbase == 0)
return(PT_ENTRY_NULL);
 #endif
-   pte = *pmap_pde(pmap, addr);
+   ptp = pmap_pde(pmap, addr);
+   if (ptp == 0)
+   return(PT_ENTRY_NULL);
+   pte = *ptp;
if ((pte & INTEL_PTE_VALID) == 0)
return(PT_ENTRY_NULL);
ptp = (pt_entry_t *)ptetokv(pte);
-- 
2.30.2

Re: [PATCH mig] Make MIG work for pure 64 bit kernel and userland.

2023-02-12 Thread Flávio Cruz

On Sun, Feb 12, 2023 at 11:34 AM Luca  wrote:

> On 12/02/23 17:05, Sergey Bugaev wrote:
> > On Sun, Feb 12, 2023 at 7:01 PM Luca  wrote:
> >> It seems XNU's mig [0] always sets the alignment to natural_t (=4) with
> >> a #pragma... I still have to understand how they handle these issues.
> >
> > Please note that XNU uses "untyped messaging", and Apple's version of
> > MIG is "untyped MIG". They don't have the kernel parse the message
> > body, it just passes it to the receiver task as a blob, where it's up
> > to MIG again to make sense of it.
>
> It seems the body is parsed more or less in the same way as in gnumach,
> at least to translate ports. Other fields seem to be just forwarded to the
> mig stubs, also as in gnumach. Could you be more specific about the
> "untyped messaging"?
>

From a cursory look at their code, they copy the arguments onto the stack,
which allows the compiler to ensure the correct alignment (plus using
memcpy for arrays). Note that they also ensure all the message structs
(including mach_msg_header_t) are defined as 4-byte aligned, which ensures
that the kernel sees the correct layout. I think this could work here too,
but I am not sure which approach would give us the best performance.
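
As a minimal sketch of the copy-to-stack idea (not taken from the XNU or
GNU MIG sources; read_u64_arg() and the buffer layout are made up for
illustration), a stub can read a 64-bit argument from a message that is
only 4-byte aligned like this:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch a 64-bit argument stored at an arbitrary
   byte offset of a message body. Copying into a local variable lets the
   compiler place `value` with its natural alignment, so no misaligned
   64-bit load is issued on the message buffer itself. */
static uint64_t read_u64_arg(const unsigned char *msg, size_t offset)
{
    uint64_t value;
    memcpy(&value, msg + offset, sizeof value);
    return value;
}
```

Compilers usually turn a fixed-size memcpy like this into a plain load on
targets that tolerate unaligned access, so the cost may be small, but that
would need measuring.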


>
> Luca
>
>
>

Re: [PATCH 5/6] fix port name size in notifications

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:03:12 +0100, wrote:
> * ipc/ipc_machdep.h: define PORT_NAME_T_SIZE_IN_BITS
> * ipc/ipc_notify.c: fix port name size in notification message
>   templates
> ---
>  ipc/ipc_machdep.h | 1 +
>  ipc/ipc_notify.c  | 8 
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/ipc/ipc_machdep.h b/ipc/ipc_machdep.h
> index 29878dc9..2871fc31 100755
> --- a/ipc/ipc_machdep.h
> +++ b/ipc/ipc_machdep.h
> @@ -34,5 +34,6 @@
>   */
>  
>  #define PORT_T_SIZE_IN_BITS (sizeof(mach_port_t)*8)
> +#define PORT_NAME_T_SIZE_IN_BITS (sizeof(mach_port_name_t)*8)
>  
>  #endif /* _IPC_IPC_MACHDEP_H_ */
> diff --git a/ipc/ipc_notify.c b/ipc/ipc_notify.c
> index eea60116..d0b71cf2 100644
> --- a/ipc/ipc_notify.c
> +++ b/ipc/ipc_notify.c
> @@ -72,7 +72,7 @@ 
> ipc_notify_init_port_deleted(mach_port_deleted_notification_t *n)
>   m->msgh_id = MACH_NOTIFY_PORT_DELETED;
>  
>   t->msgt_name = MACH_MSG_TYPE_PORT_NAME;
> - t->msgt_size = PORT_T_SIZE_IN_BITS;
> + t->msgt_size = PORT_NAME_T_SIZE_IN_BITS;
>   t->msgt_number = 1;
>   t->msgt_inline = TRUE;
>   t->msgt_longform = FALSE;
> @@ -102,7 +102,7 @@ 
> ipc_notify_init_msg_accepted(mach_msg_accepted_notification_t *n)
>   m->msgh_id = MACH_NOTIFY_MSG_ACCEPTED;
>  
>   t->msgt_name = MACH_MSG_TYPE_PORT_NAME;
> - t->msgt_size = PORT_T_SIZE_IN_BITS;
> + t->msgt_size = PORT_NAME_T_SIZE_IN_BITS;
>   t->msgt_number = 1;
>   t->msgt_inline = TRUE;
>   t->msgt_longform = FALSE;
> @@ -164,7 +164,7 @@ ipc_notify_init_no_senders(
>   m->msgh_id = MACH_NOTIFY_NO_SENDERS;
>  
>   t->msgt_name = MACH_MSG_TYPE_INTEGER_32;
> - t->msgt_size = PORT_T_SIZE_IN_BITS;
> + t->msgt_size = 32;
>   t->msgt_number = 1;
>   t->msgt_inline = TRUE;
>   t->msgt_longform = FALSE;
> @@ -215,7 +215,7 @@ ipc_notify_init_dead_name(
>   m->msgh_id = MACH_NOTIFY_DEAD_NAME;
>  
>   t->msgt_name = MACH_MSG_TYPE_PORT_NAME;
> - t->msgt_size = PORT_T_SIZE_IN_BITS;
> + t->msgt_size = PORT_NAME_T_SIZE_IN_BITS;
>   t->msgt_number = 1;
>   t->msgt_inline = TRUE;
>   t->msgt_longform = FALSE;
> -- 
> 2.30.2
> 
> 

-- 
Samuel
---
For an independent, transparent and rigorous evaluation!
I support the Inria Evaluation Committee.

Re: [PATCH 6/6] enable syscalls on x86_64

2023-02-12 Thread Samuel Thibault

Applied, thanks


Luca Dariz, on Sun, 12 Feb 2023 18:03:13 +0100, wrote:
> Signed-off-by: Luca Dariz 
> ---
>  x86_64/locore.S | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/x86_64/locore.S b/x86_64/locore.S
> index 5b9c8ef4..95ece3cc 100644
> --- a/x86_64/locore.S
> +++ b/x86_64/locore.S
> @@ -1075,9 +1075,6 @@ syscall_entry_2:
>   pushq   %rax/* save system call number */
>   pushq   $0  /* clear trap number slot */
>  
> -// TODO: test it before dropping ud2
> - ud2
> -
>   pusha   /* save the general registers */
>   movq%ds,%rdx/* and the segment registers */
>   pushq   %rdx
> -- 
> 2.30.2
> 
> 

-- 
Samuel

Re: [PATCH 0/9 gnumach] move kernel vm map to high addresses on x86_64

2023-02-12 Thread Luca


On 12/02/23 18:28, Luca Dariz wrote:

The kernel vm region is moved to the last 2GB of the 64-bit address
space. The -mcmodel=kernel code model requires the code to be placed in
this range, but other addresses can probably be used for other data
structures, direct memory mapping and so on (as in Linux).

This is just the first step towards being able to use more memory:
user-space still has the same 3GB as on a 32-bit kernel, and the
memory layout has not changed much otherwise. The kernel map no longer
immediately follows the user map, so the main changes are in
the pmap module.


I forgot to add: I didn't change the Xen code, as I am not familiar with
Xen in general. I hope the code can be reused in that case as well.

Re: [PATCH 1/9] prepare pmap helpers for full 64 bit memory map

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:28:10 +0100, wrote:
> * i386/intel/pmap.c: start walking the page table tree from the L4
>   table instead of the PDP table in pmap_pte() and pmap_pde(),
>   preparing for the kernel to run on high addresses.
> ---
>  i386/intel/pmap.c | 28 +++-
>  1 file changed, 23 insertions(+), 5 deletions(-)
> 
> diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
> index 615b0fff..9fe16368 100644
> --- a/i386/intel/pmap.c
> +++ b/i386/intel/pmap.c
> @@ -437,10 +437,22 @@ pmap_pde(const pmap_t pmap, vm_offset_t addr)
>   if (pmap == kernel_pmap)
>   addr = kvtolin(addr);
>  #if PAE
> - page_dir = (pt_entry_t *) ptetokv(pmap->pdpbase[lin2pdpnum(addr)]);
> -#else
> + pt_entry_t *pdp_table, pdp, pde;
> +#ifdef __x86_64__
> + pdp = pmap->l4base[lin2l4num(addr)];
> + if ((pdp & INTEL_PTE_VALID) == 0)
> + return PT_ENTRY_NULL;
> + pdp_table = (pt_entry_t *) ptetokv(pdp);
> +#else /* __x86_64__ */
> + pdp_table = pmap->pdpbase;
> +#endif /* __x86_64__ */
> + pde = pdp_table[lin2pdpnum(addr)];
> + if ((pde & INTEL_PTE_VALID) == 0)
> + return PT_ENTRY_NULL;
> + page_dir = (pt_entry_t *) ptetokv(pde);
> +#else /* PAE */
>   page_dir = pmap->dirbase;
> -#endif
> +#endif /* PAE */
>   return &page_dir[lin2pdenum(addr)];
>  }
>  
> @@ -457,14 +469,20 @@ pmap_pte(const pmap_t pmap, vm_offset_t addr)
>   pt_entry_t  *ptp;
>   pt_entry_t  pte;
>  
> -#if PAE
> +#ifdef __x86_64__
> + if (pmap->l4base == 0)
> + return(PT_ENTRY_NULL);
> +#elif PAE
>   if (pmap->pdpbase == 0)
>   return(PT_ENTRY_NULL);
>  #else
>   if (pmap->dirbase == 0)
>   return(PT_ENTRY_NULL);
>  #endif
> - pte = *pmap_pde(pmap, addr);
> + ptp = pmap_pde(pmap, addr);
> + if (ptp == 0)
> + return(PT_ENTRY_NULL);
> + pte = *ptp;
>   if ((pte & INTEL_PTE_VALID) == 0)
>   return(PT_ENTRY_NULL);
>   ptp = (pt_entry_t *)ptetokv(pte);
> -- 
> 2.30.2
> 
> 

-- 
Samuel

Re: [PATCH 2/9] fix x86_64 asm for higher kernel addresses

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:28:11 +0100, wrote:
> * x86_64/interrupt.S: use 64-bit registers as variables could be
>   stored at high addresses
> * x86_64/locore.S: Likewise
> ---
>  x86_64/interrupt.S | 4 ++--
>  x86_64/locore.S| 6 ++
>  2 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/x86_64/interrupt.S b/x86_64/interrupt.S
> index fe2b3858..31f386ec 100644
> --- a/x86_64/interrupt.S
> +++ b/x86_64/interrupt.S
> @@ -59,10 +59,10 @@ ENTRY(interrupt)
>  
>   movlS_IRQ,%eax  /* copy irq number */
>   shll$2,%eax /* irq * 4 */
> - movlEXT(iunit)(%eax),%edi   /* get device unit number as 1st arg */
> + movlEXT(iunit)(%rax),%edi   /* get device unit number as 1st arg */
>  
>   shll$1,%eax /* irq * 8 */
> - call*EXT(ivect)(%eax)   /* call interrupt handler */
> + call*EXT(ivect)(%rax)   /* call interrupt handler */
>  
>   movlS_IPL,%edi  /* restore previous ipl */
>   callsplx_cli/* restore previous ipl */
> diff --git a/x86_64/locore.S b/x86_64/locore.S
> index 95ece3cc..c54b5cd8 100644
> --- a/x86_64/locore.S
> +++ b/x86_64/locore.S
> @@ -1152,7 +1152,7 @@ syscall_native:
>  #endif
>   shll $5,%eax /* manual indexing of mach_trap_t */
>   xorq %r10,%r10
> - movl EXT(mach_trap_table)(%eax),%r10d
> + mov EXT(mach_trap_table)(%rax),%r10
>   /* get number of arguments */
>   andq %r10,%r10
>   jz  mach_call_call  /* skip argument copy if none */
> @@ -1199,9 +1199,7 @@ mach_call_call:
>   /* will return with syscallofs still (or again) in eax */
>  0:
>  #endif /* DEBUG */
> -
> - call *EXT(mach_trap_table)+8(%eax)
> - /* call procedure */
> + call *EXT(mach_trap_table)+8(%rax)  /* call procedure */
>   movq %rsp,%rcx   /* get kernel stack */
>   or  $(KERNEL_STACK_SIZE-1),%rcx
>   movq -7-IKS_SIZE(%rcx),%rsp  /* switch back to PCB stack */
> -- 
> 2.30.2
> 
> 

-- 
Samuel
---
For an independent, transparent and rigorous evaluation!
I support Inria's Commission d'Évaluation.

Re: [PATCH 3/9] factor out xen-specific bootstrap

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:28:12 +0100, wrote:
> * i386/intel/pmap.c: move it to pmap_bootstrap_xen()
> ---
>  i386/intel/pmap.c | 107 --
>  1 file changed, 56 insertions(+), 51 deletions(-)
> 
> diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
> index 9fe16368..15577a09 100644
> --- a/i386/intel/pmap.c
> +++ b/i386/intel/pmap.c
> @@ -581,6 +581,61 @@ vm_offset_t pmap_map_bd(
>   return(virt);
>  }
>  
> +#ifdef   MACH_PV_PAGETABLES
> +void pmap_bootstrap_xen()
> +{
> + /* We don't actually deal with the CR3 register content at all */
> + hyp_vm_assist(VMASST_CMD_enable, VMASST_TYPE_pae_extended_cr3);
> + /*
> +  * Xen may only provide as few as 512KB extra bootstrap linear memory,
> +  * which is far from enough to map all available memory, so we need to
> +  * map more bootstrap linear memory. We here map 1 (resp. 4 for PAE)
> +  * other L1 table(s), thus 4MiB extra memory (resp. 8MiB), which is
> +  * enough for a pagetable mapping 4GiB.
> +  */
> +#ifdef PAE
> +#define NSUP_L1 4
> +#else
> +#define NSUP_L1 1
> +#endif
> + pt_entry_t *l1_map[NSUP_L1];
> + vm_offset_t la;
> + int n_l1map;
> + for (n_l1map = 0, la = VM_MIN_KERNEL_ADDRESS; la >= VM_MIN_KERNEL_ADDRESS; la += NPTES * PAGE_SIZE) {
> + pt_entry_t *base = (pt_entry_t*) boot_info.pt_base;
> +#ifdef   PAE
> +#ifdef __x86_64__
> + base = (pt_entry_t*) ptetokv(base[0]);
> +#endif /* x86_64 */
> + pt_entry_t *l2_map = (pt_entry_t*) ptetokv(base[lin2pdpnum(la)]);
> +#else /* PAE */
> + pt_entry_t *l2_map = base;
> +#endif   /* PAE */
> + /* Like lin2pdenum, but works with non-contiguous boot L3 */
> + l2_map += (la >> PDESHIFT) & PDEMASK;
> + if (!(*l2_map & INTEL_PTE_VALID)) {
> + struct mmu_update update;
> + unsigned j, n;
> +
> + l1_map[n_l1map] = (pt_entry_t*) phystokv(pmap_grab_page());
> + for (j = 0; j < NPTES; j++)
> + l1_map[n_l1map][j] = (((pt_entry_t)pfn_to_mfn(lin2pdenum(la - VM_MIN_KERNEL_ADDRESS) * NPTES + j)) << PAGE_SHIFT) | INTEL_PTE_VALID | INTEL_PTE_WRITE;
> + pmap_set_page_readonly_init(l1_map[n_l1map]);
> + if (!hyp_mmuext_op_mfn (MMUEXT_PIN_L1_TABLE, kv_to_mfn (l1_map[n_l1map])))
> + panic("couldn't pin page %p(%lx)", l1_map[n_l1map], (vm_offset_t) kv_to_ma (l1_map[n_l1map]));
> + update.ptr = kv_to_ma(l2_map);
> + update.val = kv_to_ma(l1_map[n_l1map]) | INTEL_PTE_VALID | INTEL_PTE_WRITE;
> + hyp_mmu_update(kv_to_la(&update), 1, kv_to_la(&n), DOMID_SELF);
> + if (n != 1)
> + panic("couldn't complete bootstrap map");
> + /* added the last L1 table, can stop */
> + if (++n_l1map >= NSUP_L1)
> + break;
> + }
> + }
> +}
> +#endif   /* MACH_PV_PAGETABLES */
> +
>  /*
>   *   Bootstrap the system enough to run with virtual memory.
>   *   Allocate the kernel page directory and page tables,
> @@ -677,57 +732,7 @@ void pmap_bootstrap(void)
>   }
>  
>  #ifdef   MACH_PV_PAGETABLES
> - /* We don't actually deal with the CR3 register content at all */
> - hyp_vm_assist(VMASST_CMD_enable, VMASST_TYPE_pae_extended_cr3);
> - /*
> -  * Xen may only provide as few as 512KB extra bootstrap linear memory,
> -  * which is far from enough to map all available memory, so we need to
> -  * map more bootstrap linear memory. We here map 1 (resp. 4 for PAE)
> -  * other L1 table(s), thus 4MiB extra memory (resp. 8MiB), which is
> -  * enough for a pagetable mapping 4GiB.
> -  */
> -#ifdef PAE
> -#define NSUP_L1 4
> -#else
> -#define NSUP_L1 1
> -#endif
> - pt_entry_t *l1_map[NSUP_L1];
> - {
> - vm_offset_t la;
> - int n_l1map;
> - for (n_l1map = 0, la = VM_MIN_KERNEL_ADDRESS; la >= 
> VM_MIN_KERNEL_ADDRESS; la += NPTES * PAGE_SIZE) {
> - pt_entry_t *base = (pt_entry_t*) boot_info.pt_base;
> -#ifdef   PAE
> -#ifdef __x86_64__
> - base = (pt_entry_t*) ptetokv(base[0]);
> -#endif /* x86_64 */
> - pt_entry_t *l2_map = (pt_entry_t*) 
> ptetokv(base[lin2pdpnum(la)]);
> -#else/* PAE */
> - pt_entry_t *l2_map = base;
> -#endif   /* PAE */
> - /* Like lin2pdenum, but works with non-contiguous boot 
> L3 */
> - l2_map += (la >> PDESHIFT) & PDEMASK;
> - if (!(*l2_map & INTEL_PTE_VALID)) {
> - struct mmu_update update;
> - unsigned j, n;
> -
> -

Re: [PATCH 4/9] factor out PAE-specific bootstrap

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:28:13 +0100, wrote:
> * i386/intel/pmap.c: move it to pmap_bootstrap_pae()
> ---
>  i386/intel/pmap.c | 72 ++-
>  1 file changed, 40 insertions(+), 32 deletions(-)
> 
> diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
> index 15577a09..470be744 100644
> --- a/i386/intel/pmap.c
> +++ b/i386/intel/pmap.c
> @@ -581,8 +581,46 @@ vm_offset_t pmap_map_bd(
>   return(virt);
>  }
>  
> +#ifdef PAE
> +static void pmap_bootstrap_pae(void)
> +{
> + vm_offset_t addr;
> +
> +#ifdef __x86_64__
> +#ifdef MACH_HYP
> + kernel_pmap->user_l4base = NULL;
> + kernel_pmap->user_pdpbase = NULL;
> +#endif
> + kernel_pmap->l4base = (pt_entry_t*)phystokv(pmap_grab_page());
> + memset(kernel_pmap->l4base, 0, INTEL_PGBYTES);
> +#endif   /* x86_64 */
> +
> + init_alloc_aligned(PDPNUM * INTEL_PGBYTES, &addr);
> + kernel_page_dir = (pt_entry_t*)phystokv(addr);
> +
> + kernel_pmap->pdpbase = (pt_entry_t*)phystokv(pmap_grab_page());
> + memset(kernel_pmap->pdpbase, 0, INTEL_PGBYTES);
> + for (int i = 0; i < PDPNUM; i++)
> + WRITE_PTE(&kernel_pmap->pdpbase[i],
> +   pa_to_pte(_kvtophys((void *) kernel_page_dir
> +   + i * INTEL_PGBYTES))
> +   | INTEL_PTE_VALID
> +#if (defined(__x86_64__) && !defined(MACH_HYP)) || defined(MACH_PV_PAGETABLES)
> +   | INTEL_PTE_WRITE
> +#endif
> + );
> +
> +#ifdef __x86_64__
> + WRITE_PTE(&kernel_pmap->l4base[0], pa_to_pte(_kvtophys(kernel_pmap->pdpbase)) | INTEL_PTE_VALID | INTEL_PTE_WRITE);
> +#ifdef   MACH_PV_PAGETABLES
> + pmap_set_page_readonly_init(kernel_pmap->l4base);
> +#endif
> +#endif   /* x86_64 */
> +}
> +#endif /* PAE */
> +
>  #ifdef   MACH_PV_PAGETABLES
> -void pmap_bootstrap_xen()
> +static void pmap_bootstrap_xen(void)
>  {
>   /* We don't actually deal with the CR3 register content at all */
>   hyp_vm_assist(VMASST_CMD_enable, VMASST_TYPE_pae_extended_cr3);
> @@ -691,37 +729,7 @@ void pmap_bootstrap(void)
>   /* Note: initial Xen mapping holds at least 512kB free mapped page.
>* We use that for directly building our linear mapping. */
>  #if PAE
> - {
> - vm_offset_t addr;
> - init_alloc_aligned(PDPNUM * INTEL_PGBYTES, &addr);
> - kernel_page_dir = (pt_entry_t*)phystokv(addr);
> - }
> - kernel_pmap->pdpbase = (pt_entry_t*)phystokv(pmap_grab_page());
> - memset(kernel_pmap->pdpbase, 0, INTEL_PGBYTES);
> - {
> - int i;
> - for (i = 0; i < PDPNUM; i++)
> - WRITE_PTE(&kernel_pmap->pdpbase[i],
> -   pa_to_pte(_kvtophys((void *) kernel_page_dir
> -   + i * INTEL_PGBYTES))
> -   | INTEL_PTE_VALID
> -#if (defined(__x86_64__) && !defined(MACH_HYP)) || defined(MACH_PV_PAGETABLES)
> -   | INTEL_PTE_WRITE
> -#endif
> -   );
> - }
> -#ifdef __x86_64__
> -#ifdef MACH_HYP
> - kernel_pmap->user_l4base = NULL;
> - kernel_pmap->user_pdpbase = NULL;
> -#endif
> - kernel_pmap->l4base = (pt_entry_t*)phystokv(pmap_grab_page());
> - memset(kernel_pmap->l4base, 0, INTEL_PGBYTES);
> - WRITE_PTE(&kernel_pmap->l4base[0], pa_to_pte(_kvtophys(kernel_pmap->pdpbase)) | INTEL_PTE_VALID | INTEL_PTE_WRITE);
> -#ifdef   MACH_PV_PAGETABLES
> - pmap_set_page_readonly_init(kernel_pmap->l4base);
> -#endif
> -#endif   /* x86_64 */
> + pmap_bootstrap_pae();
>  #else/* PAE */
>   kernel_pmap->dirbase = kernel_page_dir = (pt_entry_t*)phystokv(pmap_grab_page());
>  #endif   /* PAE */
> -- 
> 2.30.2
> 
> 

-- 
Samuel
---
For an independent, transparent and rigorous evaluation!
I support Inria's Commission d'Évaluation.

[PATCH mig] Make MIG work for pure 64 bit kernel and userland (V2)

2023-02-12 Thread Flavio Cruz

Made changes to MIG so that generated stubs can be compiled when using
a 64 bit userland. The most important change here is to ensure
that data is always 8-byte aligned. Not doing that will lead to
undefined behavior since on 64 bits many structures can contain 64 bit
scalars which require 8-byte alignment. Even if x86 is usually fine with
unaligned accesses, compilers are taking advantage of this UB, which may
break us in the future
(http://pzemtsov.github.io/2016/11/06/bug-story-alignment-on-x86.html).

The downside is that messages get significantly larger since we have
to pad the initial mach_msg_type_t so that the data item can be 8-byte
aligned. In the future, I want to make MIG lay out the data in a way
that non-complex data can be packed together (essentially, how XNU messages
work) and to force userland to create larger structures so that we don't
have to do any structure resizing while in the kernel. For now, this unblocks
future work on 64 bits and is quite simple to understand.
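
As a rough illustration of the padding cost described above, here is a
minimal C sketch. It assumes a 4-byte type descriptor followed by data that
needs 8-byte alignment; the helper names align_up and padding_after are made
up for this example and are not part of MIG.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align.
   Assumes align is a power of two. */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Bytes of padding needed after a header of header_size bytes
   so that the following data item starts on a data_align boundary. */
static size_t padding_after(size_t header_size, size_t data_align)
{
    return align_up(header_size, data_align) - header_size;
}
```

With these assumptions, padding_after(4, 8) is 4, so every 4-byte descriptor
in front of 8-byte-aligned data costs another 4 bytes; with a 4-byte
alignment requirement the padding is 0.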

For the 64 bit / 32 bit configuration, we can pass --enable-user32 when
targeting x86_64, and MIG will generate stubs as before. I am unsure
whether there exists undefined behavior in that particular configuration
since most of the structures used by the kernel do not seem to use 8
byte scalars so potentially we should be OK there.

As a follow up, we will need to change gnumach to iterate over these
messages correctly (to follow the 8-byte alignment rules).

Tested this patch by building Hurd and comparing the generated stubs with
the previous ones; they are byte-for-byte identical.
---
 configure.ac |  7 
 cpu.sym  |  4 +--
 global.c |  5 +--
 global.h |  5 +--
 parser.y | 14 +---
 routine.c|  5 ++-
 server.c | 20 +--
 type.c   | 95 ++--
 user.c   | 22 ++--
 utils.c  |  6 +++-
 utils.h  |  2 ++
 11 files changed, 105 insertions(+), 80 deletions(-)

diff --git a/configure.ac b/configure.ac
index e4645bd..8d52d5f 100644
--- a/configure.ac
+++ b/configure.ac
@@ -55,6 +55,13 @@ AC_SUBST([MIGCOM])
 [MIG=`echo mig | sed "$program_transform_name"`]
 AC_SUBST([MIG])
 
+AC_ARG_ENABLE(user32,
+  AS_HELP_STRING([--enable-user32],
+ [Build mig for 64 bit kernel that works with 32 bit userland]))
+AS_IF([test "x$enable_user32" = "xyes"], [
+   AC_DEFINE([USER32], [1], ["Build mig for 64 bit kernel that works with 32 bit userland"])
+])
+
 AC_OUTPUT([Makefile mig tests/Makefile \
tests/good/Makefile \
tests/generate-only/Makefile \
diff --git a/cpu.sym b/cpu.sym
index 6bcb878..205ab3e 100644
--- a/cpu.sym
+++ b/cpu.sym
@@ -95,8 +95,8 @@ expr MACH_MSG_TYPE_POLYMORPHIC
 
 
 /* Types used in interfaces */
-expr sizeof(integer_t) word_size
-expr (sizeof(integer_t)*8) word_size_in_bits
+/* The minimum alignment required for this architecture */
+expr sizeof(uintptr_t) desired_max_alignof
 expr sizeof(void*) sizeof_pointer
 expr sizeof(char)  sizeof_char
 expr sizeof(short) sizeof_short
diff --git a/global.c b/global.c
index bf3d10d..ad477c2 100644
--- a/global.c
+++ b/global.c
@@ -66,8 +66,9 @@ string_t InternalHeaderFileName = strNULL;
 string_t UserFileName = strNULL;
 string_t ServerFileName = strNULL;
 
-int port_size = port_name_size;
-int port_size_in_bits = port_name_size_in_bits;
+size_t port_size = port_name_size;
+size_t port_size_in_bits = port_name_size_in_bits;
+size_t max_alignof = desired_max_alignof;
 
 void
 more_global(void)
diff --git a/global.h b/global.h
index 8e57df2..bf8eb38 100644
--- a/global.h
+++ b/global.h
@@ -67,8 +67,9 @@ extern string_t InternalHeaderFileName;
 extern string_t UserFileName;
 extern string_t ServerFileName;
 
-extern int port_size;
-extern int port_size_in_bits;
+extern size_t port_size;
+extern size_t port_size_in_bits;
+extern size_t max_alignof;
 
 extern void more_global(void);
 
diff --git a/parser.y b/parser.y
index 03c5ec8..3aa18c2 100644
--- a/parser.y
+++ b/parser.y
@@ -212,6 +212,16 @@ Subsystem  :   SubsystemStart SubsystemMods
   IsKernelUser ? ", KernelUser" : "",
   IsKernelServer ? ", KernelServer" : "");
 }
+if (IsKernelUser || IsKernelServer) {
+port_size = vm_offset_size;
+port_size_in_bits = vm_offset_size_in_bits;
+#ifdef USER32
+/* Since we are using MIG to generate stubs for a 64 bit kernel that operates
+   with a 32 bit userland, then we need to consider that `max_alignof` is 4 byte aligned.
+ */
+max_alignof = sizeof(uint32_t);
+#endif
+}
 init_type();
 }
;
@@ -237,16 +247,12 @@ SubsystemMod  :   syKernelUser
 if (IsKernelUser)
warn("duplicate KernelUser keyword");
 IsKernelUser = true;
-port_size = vm_offset_size;
-port_size_in_bits = vm_offset_size_in_

Re: [PATCH mig] Make MIG work for pure 64 bit kernel and userland.

2023-02-12 Thread Flávio Cruz

On Sun, Feb 12, 2023 at 9:44 AM Samuel Thibault wrote:

> Thanks for your work on the 64bit RPC part, that's tricky :)
>
> Flavio Cruz, on Sun, 12 Feb 2023 01:15:05 -0500, wrote:
> > diff --git a/cpu.sym b/cpu.sym
> > index 6bcb878..7858b47 100644
> > --- a/cpu.sym
> > +++ b/cpu.sym
> > @@ -95,8 +95,8 @@ expr MACH_MSG_TYPE_POLYMORPHIC
> >
> >
> >  /* Types used in interfaces */
> > -expr sizeof(integer_t)   word_size
> > -expr (sizeof(integer_t)*8)   word_size_in_bits
> > +/* The minimum alignment required for this architecture */
> > +expr sizeof(uintptr_t)   desired_machine_alignment
>
> I'm really not at ease with such general "architecture-required
> alignment". Alignment always depends on the kind of data you are
> manipulating. If you throw a __float128 field in your structure, it'll
> have to be 16-byte-aligned.
>
> I'd tend to think that we probably want to follow C on that: each
> type has its own alignment constraint that we can encode in mig,
> and alignment constraints of structures is the max of the alignment
> requirements of the elements of the structure.
>

Fair enough. I changed it to max_alignof. I don't think we allow scalars
larger than 8 bytes so I don't think this can happen today.
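
To illustrate the C rule referred to above (a structure's alignment is the
maximum of its members' alignments), here is a small sketch. The struct names
and the max_align helper are hypothetical and only serve the example; they
are not MIG code.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message bodies, for illustration only. */
struct body_small { uint32_t a; uint16_t b; };
struct body_wide  { uint32_t a; uint64_t b; };

/* Maximum of two alignment requirements. */
static size_t max_align(size_t x, size_t y)
{
    return x > y ? x : y;
}

/* Alignment derived from the members, following the C rule. */
static size_t body_small_align(void)
{
    return max_align(alignof(uint32_t), alignof(uint16_t));
}

static size_t body_wide_align(void)
{
    return max_align(alignof(uint32_t), alignof(uint64_t));
}
```

The derived values match what the compiler reports through
alignof(struct body_small) and alignof(struct body_wide), which is the
per-type bookkeeping the review suggests encoding in MIG.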


>
> > diff --git a/type.c b/type.c
> > index b104c66..05ee201 100644
> > --- a/type.c
> > +++ b/type.c
> > @@ -35,18 +35,6 @@
> >  #include "cpu.h"
> >  #include "utils.h"
> >
> > -#if word_size_in_bits == 32
> > -#define  word_size_name MACH_MSG_TYPE_INTEGER_32
> > -#define  word_size_name_string "MACH_MSG_TYPE_INTEGER_32"
> > -#else
> > -#if word_size_in_bits == 64
> > -#define  word_size_name MACH_MSG_TYPE_INTEGER_64
> > -#define  word_size_name_string "MACH_MSG_TYPE_INTEGER_64"
> > -#else
> > -#error Unsupported word size!
> > -#endif
> > -#endif
>
> I agree on dropping the "word size" term that is ambiguous on a 64bit
> architecture, where you can still want to consider 32bit as a word.
>
> I'd however say to introduce int_name and int_name_string, so that:
>

Done


>
> > -it->itInName = word_size_name;
> > -it->itInNameStr = word_size_name_string;
> > -it->itOutName = word_size_name;
> > -it->itOutNameStr = word_size_name_string;
> > -it->itSize = word_size_in_bits;
> > +it->itInName = MACH_MSG_TYPE_INTEGER_32;
> > +it->itInNameStr = "MACH_MSG_TYPE_INTEGER_32";
> > +it->itOutName = MACH_MSG_TYPE_INTEGER_32;
> > +it->itOutNameStr = "MACH_MSG_TYPE_INTEGER_32";
> > +it->itSize = sizeof_int * 8;
> > +it->itAlignment = sizeof_int;
>
> These become coherent, and less surprising to change if we ever need to.
>
> > @@ -873,6 +865,7 @@ itMakeDeallocType(void)
> >  it->itOutName = MACH_MSG_TYPE_BOOLEAN;
> >  it->itOutNameStr = "MACH_MSG_TYPE_BOOLEAN";
> >  it->itSize = 32;
> > +it->itAlignment = sizeof_int;
>
> I'd say also use sizeof_int * 8 for itSize.
>
> > @@ -909,7 +903,8 @@ init_type(void)
> >  itRetCodeType->itInNameStr = "MACH_MSG_TYPE_INTEGER_32";
> >  itRetCodeType->itOutName = MACH_MSG_TYPE_INTEGER_32;
> >  itRetCodeType->itOutNameStr = "MACH_MSG_TYPE_INTEGER_32";
> > -itRetCodeType->itSize = 32;
> > +itRetCodeType->itSize = sizeof_int * 8;
> > +itRetCodeType->itAlignment = sizeof_int;
>
> Also here, use int_name and int_name_string.
>
> > @@ -919,7 +914,8 @@ init_type(void)
> >  itDummyType->itInNameStr = "MACH_MSG_TYPE_UNSTRUCTURED";
> >  itDummyType->itOutName = MACH_MSG_TYPE_UNSTRUCTURED;
> >  itDummyType->itOutNameStr = "MACH_MSG_TYPE_UNSTRUCTURED";
> > -itDummyType->itSize = word_size_in_bits;
> > +itDummyType->itSize = word_size * 8;
> > +itDummyType->itAlignment = word_size;
>
> So, now the question arises: what do we want to permit in an
> unstructured? If we're fine with not including __float128, we can indeed
> go with alignof(intptr_t). But I'd really say to call this so (e.g.
> max_alignof), rather than word_size which does not convey much meaning.
>
> > @@ -1041,17 +1040,15 @@ itCheckIntType(identifier_t name, const ipc_type_t *it)
> >   error("argument %s isn't a proper integer", name);
> >  }
> >  void
> > -itCheckNaturalType(name, it)
> > -identifier_t name;
> > -ipc_type_t *it;
> > +itCheckNaturalType(identifier_t name, ipc_type_t *it)
> >  {
> > -if ((it->itInName != word_size_name) ||
> > - (it->itOutName != word_size_name) ||
> > +if ((it->itInName != MACH_MSG_TYPE_INTEGER_32) ||
> > + (it->itOutName != MACH_MSG_TYPE_INTEGER_32) ||
>
> Also here, use int_name for consistency with sizeof_int:
>
> >   (it->itNumber != 1) ||
> > - (it->itSize != word_size_in_bits) ||
> > + (it->itSize != sizeof_int * 8) ||
> >   !it->itInLine ||
> >   it->itDeallocate != d_NO ||
> >   !it->itStruct ||
> >   it->itVarArray)
> > - error("argument %s should have been a %s", name, word_size_name_string);
> > + error("argument %s should have been a MACH_MSG_TYPE_IN

Re: [PATCH 5/9] use L4 page table directly on x86_64 instead of short-circuiting to pdpbase

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:28:14 +0100, wrote:
> This is a preparation to run the kernel on high addresses, where the
> user vm region and the kernel vm region will use different L3 page
> tables.
> 
> * i386/intel/pmap.c: on x86_64, retrieve the value of pdpbase from the
>   L4 table, and add the pmap_pdp() helper (useful also for PAE).
> * i386/intel/pmap.h: remove pdpbase on x86_64.
> ---
>  i386/intel/pmap.c | 97 ---
>  i386/intel/pmap.h |  7 ++--
>  2 files changed, 78 insertions(+), 26 deletions(-)
> 
> diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
> index 470be744..9e9f91db 100644
> --- a/i386/intel/pmap.c
> +++ b/i386/intel/pmap.c
> @@ -430,14 +430,11 @@ pt_entry_t *kernel_page_dir;
>  static pmap_mapwindow_t mapwindows[PMAP_NMAPWINDOWS];
>  def_simple_lock_data(static, pmapwindows_lock)
>  
> +#ifdef PAE
>  static inline pt_entry_t *
> -pmap_pde(const pmap_t pmap, vm_offset_t addr)
> +pmap_ptp(const pmap_t pmap, vm_offset_t addr)
>  {
> - pt_entry_t *page_dir;
> - if (pmap == kernel_pmap)
> - addr = kvtolin(addr);
> -#if PAE
> - pt_entry_t *pdp_table, pdp, pde;
> + pt_entry_t *pdp_table, pdp;
>  #ifdef __x86_64__
>   pdp = pmap->l4base[lin2l4num(addr)];
>   if ((pdp & INTEL_PTE_VALID) == 0)
> @@ -446,6 +443,19 @@ pmap_pde(const pmap_t pmap, vm_offset_t addr)
>  #else /* __x86_64__ */
>   pdp_table = pmap->pdpbase;
>  #endif /* __x86_64__ */
> + return pdp_table;
> +}
> +#endif
> +
> +static inline pt_entry_t *
> +pmap_pde(const pmap_t pmap, vm_offset_t addr)
> +{
> + pt_entry_t *page_dir;
> + if (pmap == kernel_pmap)
> + addr = kvtolin(addr);
> +#if PAE
> + pt_entry_t *pdp_table, pde;
> + pdp_table = pmap_ptp(pmap, addr);
>   pde = pdp_table[lin2pdpnum(addr)];
>   if ((pde & INTEL_PTE_VALID) == 0)
>   return PT_ENTRY_NULL;
> @@ -585,6 +595,7 @@ vm_offset_t pmap_map_bd(
>  static void pmap_bootstrap_pae(void)
>  {
>   vm_offset_t addr;
> + pt_entry_t *pdp_kernel;
>  
>  #ifdef __x86_64__
>  #ifdef MACH_HYP
> @@ -595,13 +606,15 @@ static void pmap_bootstrap_pae(void)
>   memset(kernel_pmap->l4base, 0, INTEL_PGBYTES);
>  #endif   /* x86_64 */
>  
> + // TODO: allocate only the PDPTE for kernel virtual space
> + // this means all directmap and the stupid limit above it
>   init_alloc_aligned(PDPNUM * INTEL_PGBYTES, &addr);
>   kernel_page_dir = (pt_entry_t*)phystokv(addr);
>  
> - kernel_pmap->pdpbase = (pt_entry_t*)phystokv(pmap_grab_page());
> - memset(kernel_pmap->pdpbase, 0, INTEL_PGBYTES);
> + pdp_kernel = (pt_entry_t*)phystokv(pmap_grab_page());
> + memset(pdp_kernel, 0, INTEL_PGBYTES);
>   for (int i = 0; i < PDPNUM; i++)
> - WRITE_PTE(&kernel_pmap->pdpbase[i],
> + WRITE_PTE(&pdp_kernel[i],
> pa_to_pte(_kvtophys((void *) kernel_page_dir
> + i * INTEL_PGBYTES))
> | INTEL_PTE_VALID
> @@ -611,10 +624,14 @@ static void pmap_bootstrap_pae(void)
>   );
>  
>  #ifdef __x86_64__
> - WRITE_PTE(&kernel_pmap->l4base[0], 
> pa_to_pte(_kvtophys(kernel_pmap->pdpbase)) | INTEL_PTE_VALID | 
> INTEL_PTE_WRITE);
> +/* only fill the kernel pdpte during bootstrap */
> + WRITE_PTE(&kernel_pmap->l4base[lin2l4num(VM_MIN_KERNEL_ADDRESS)],
> +  pa_to_pte(_kvtophys(pdp_kernel)) | INTEL_PTE_VALID | 
> INTEL_PTE_WRITE);
>  #ifdef   MACH_PV_PAGETABLES
>   pmap_set_page_readonly_init(kernel_pmap->l4base);
> -#endif
> +#endif /* MACH_PV_PAGETABLES */
> +#else/* x86_64 */
> +kernel_pmap->pdpbase = pdp_kernel;
>  #endif   /* x86_64 */
>  }
>  #endif /* PAE */
> @@ -1243,7 +1260,7 @@ pmap_page_table_page_dealloc(vm_offset_t pa)
>   */
>  pmap_t pmap_create(vm_size_t size)
>  {
> - pt_entry_t  *page_dir[PDPNUM];
> + pt_entry_t  *page_dir[PDPNUM], *pdp_kernel;
>   int i;
>   pmap_t  p;
>   pmap_statistics_t   stats;
> @@ -1301,34 +1318,40 @@ pmap_t pmap_create(vm_size_t size)
>  #endif   /* MACH_PV_PAGETABLES */
>  
>  #if PAE
> - p->pdpbase = (pt_entry_t *) kmem_cache_alloc(&pdpt_cache);
> - if (p->pdpbase == NULL) {
> + pdp_kernel = (pt_entry_t *) kmem_cache_alloc(&pdpt_cache);
> + if (pdp_kernel == NULL) {
>   for (i = 0; i < PDPNUM; i++)
>   kmem_cache_free(&pd_cache, (vm_address_t) page_dir[i]);
>   kmem_cache_free(&pmap_cache, (vm_address_t) p);
>   return PMAP_NULL;
>   }
>  
> - memset(p->pdpbase, 0, INTEL_PGBYTES);
> + memset(pdp_kernel, 0, INTEL_PGBYTES);
>   {
>   for (i = 0; i < PDPNUM; i++)
> - WRITE_PTE(&p->pdpbase[i],
> + WRITE_PTE(&pdp_kernel[i],
>
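
For readers following the L4 / PDP / PDE walk in the patch above, here is a
minimal sketch of how a linear address splits into table indices under
standard x86_64 4-level paging with 4 KiB pages (9 index bits per level).
The function names are hypothetical; they only approximate what helpers such
as lin2l4num() and lin2pdpnum() are presumably doing and are not copied from
GNU Mach.

```c
#include <stdint.h>

#define PT_INDEX_BITS 9
#define PT_INDEX_MASK ((1u << PT_INDEX_BITS) - 1)   /* 0x1ff: 512 entries per table */

/* Index into the L4 table (bits 47..39 of the linear address). */
static unsigned l4_index(uint64_t lin)
{
    return (unsigned)(lin >> 39) & PT_INDEX_MASK;
}

/* Index into the L3 table, the PDP table (bits 38..30). */
static unsigned pdp_index(uint64_t lin)
{
    return (unsigned)(lin >> 30) & PT_INDEX_MASK;
}

/* Index into the L2 table, the page directory (bits 29..21). */
static unsigned pde_index(uint64_t lin)
{
    return (unsigned)(lin >> 21) & PT_INDEX_MASK;
}

/* Index into the L1 table, the page table (bits 20..12). */
static unsigned pte_index(uint64_t lin)
{
    return (unsigned)(lin >> 12) & PT_INDEX_MASK;
}
```

For example, the high address 0xffffffff80000000 selects L4 entry 511 and PDP
entry 510, whereas a low user address such as 0x400000 selects L4 entry 0.
Since the two ranges go through different L4 entries, they can use different
L3 tables, as the commit message describes.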

Re: [PATCH 6/9] add more explicit names for user space virtual space limits

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:28:15 +0100, wrote:
> * i386/i386/vm_param.h: add VM_MAX/MIN_USER_ADDRESS to kernel headers.
> * i386/i386/db_interface.c
> * i386/i386/ldt.c
> * i386/i386/pcb.c
> * i386/intel/pmap.c
> * kern/task.c: replace VM_MAX/MIN_ADDRESS with VM_MAX/MIN_USER_ADDRESS
> ---
>  i386/i386/db_interface.c |  4 ++--
>  i386/i386/ldt.c  |  8 
>  i386/i386/pcb.c  |  6 +++---
>  i386/i386/vm_param.h |  6 +-
>  i386/intel/pmap.c| 18 +-
>  kern/task.c  |  4 ++--
>  6 files changed, 25 insertions(+), 21 deletions(-)
> 
> diff --git a/i386/i386/db_interface.c b/i386/i386/db_interface.c
> index 3a331490..8e2dc6a2 100644
> --- a/i386/i386/db_interface.c
> +++ b/i386/i386/db_interface.c
> @@ -119,8 +119,8 @@ kern_return_t db_set_debug_state(
>   int i;
>  
>   for (i = 0; i <= 3; i++)
> - if (state->dr[i] < VM_MIN_ADDRESS
> -  || state->dr[i] >= VM_MAX_ADDRESS)
> + if (state->dr[i] < VM_MIN_USER_ADDRESS
> +  || state->dr[i] >= VM_MAX_USER_ADDRESS)
>   return KERN_INVALID_ARGUMENT;
>  
>   pcb->ims.ids = *state;
> diff --git a/i386/i386/ldt.c b/i386/i386/ldt.c
> index 3f9ac8ff..70fa24e2 100644
> --- a/i386/i386/ldt.c
> +++ b/i386/i386/ldt.c
> @@ -64,13 +64,13 @@ ldt_fill(struct real_descriptor *myldt, struct real_descriptor *mygdt)
> (vm_offset_t)&syscall, KERNEL_CS,
> ACC_PL_U|ACC_CALL_GATE, 0);
>   fill_ldt_descriptor(myldt, USER_CS,
> - VM_MIN_ADDRESS,
> - VM_MAX_ADDRESS-VM_MIN_ADDRESS-4096,
> + VM_MIN_USER_ADDRESS,
> + VM_MAX_USER_ADDRESS-VM_MIN_USER_ADDRESS-4096,
>   /* XXX LINEAR_... */
>   ACC_PL_U|ACC_CODE_R, SZ_32);
>   fill_ldt_descriptor(myldt, USER_DS,
> - VM_MIN_ADDRESS,
> - VM_MAX_ADDRESS-VM_MIN_ADDRESS-4096,
> + VM_MIN_USER_ADDRESS,
> + VM_MAX_USER_ADDRESS-VM_MIN_USER_ADDRESS-4096,
>   ACC_PL_U|ACC_DATA_W, SZ_32);
>  
>   /* Activate the LDT.  */
> diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
> index 924ed08b..3ae9e095 100644
> --- a/i386/i386/pcb.c
> +++ b/i386/i386/pcb.c
> @@ -622,10 +622,10 @@ kern_return_t thread_setstatus(
>   int_table = state->int_table;
>   int_count = state->int_count;
>  
> - if (int_table >= VM_MAX_ADDRESS ||
> + if (int_table >= VM_MAX_USER_ADDRESS ||
>   int_table +
>   int_count * sizeof(struct v86_interrupt_table)
> - > VM_MAX_ADDRESS)
> + > VM_MAX_USER_ADDRESS)
>   return KERN_INVALID_ARGUMENT;
>  
>   thread->pcb->ims.v86s.int_table = int_table;
> @@ -834,7 +834,7 @@ thread_set_syscall_return(
>  vm_offset_t
>  user_stack_low(vm_size_t stack_size)
>  {
> - return (VM_MAX_ADDRESS - stack_size);
> + return (VM_MAX_USER_ADDRESS - stack_size);
>  }
>  
>  /*
> diff --git a/i386/i386/vm_param.h b/i386/i386/vm_param.h
> index 314fdb35..5e7f149a 100644
> --- a/i386/i386/vm_param.h
> +++ b/i386/i386/vm_param.h
> @@ -31,6 +31,10 @@
>  #include 
>  #endif
>  
> +/* To avoid ambiguity in kernel code, make the name explicit */
> +#define VM_MIN_USER_ADDRESS VM_MIN_ADDRESS
> +#define VM_MAX_USER_ADDRESS VM_MAX_ADDRESS
> +
>  /* The kernel address space is usually 1GB, usually starting at virtual 
> address 0.  */
>  /* This can be changed freely to separate kernel addresses from user 
> addresses
>   * for better trace support in kdb; the _START symbol has to be offset by the
> @@ -77,7 +81,7 @@
>  #else
>  /* On x86, the kernel virtual address space is actually located
> at high linear addresses. */
> -#define LINEAR_MIN_KERNEL_ADDRESS(VM_MAX_ADDRESS)
> +#define LINEAR_MIN_KERNEL_ADDRESS(VM_MAX_USER_ADDRESS)
>  #define LINEAR_MAX_KERNEL_ADDRESS(0xUL)
>  #endif
>  
> diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
> index 9e9f91db..a9ff6f3e 100644
> --- a/i386/intel/pmap.c
> +++ b/i386/intel/pmap.c
> @@ -1341,7 +1341,7 @@ pmap_t pmap_create(vm_size_t size)
> );
>   }
>  #ifdef __x86_64__
> - // TODO alloc only PDPTE for the user range VM_MIN_ADDRESS, 
> VM_MAX_ADDRESS
> + // TODO alloc only PDPTE for the user range VM_MIN_USER_ADDRESS, 
> VM_MAX_USER_ADDRESS
>   // and keep the same for kernel range, in l4 table we have different 
> entries
>   p->l4base = (pt_entry_t *) kmem_cache_alloc(&l4_cache);
>   if (p->l4base == NULL)
> @@ -1349,7 +1349,7 @@ pmap_t pmap_create(vm_size_t size)
>   memset(p->l4base, 0, INTEL_PGBYTES);
>   WRITE_PTE(&p->l4base[lin2l4num(VM_MIN_KERNEL_ADDRESS)],
> pa_to_pte(kvtophys((vm_offse

Re: [PATCH 7/9] extend data types to hold a 64-bit address

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:28:16 +0100, wrote:
> * i386/i386/trap.c: change from int to a proper type to hold a
>   register value
> * x86_64/locore.S: use 64-bit register to avoid address truncation
> ---
>  i386/i386/trap.c | 12 ++--
>  x86_64/locore.S  |  4 ++--
>  2 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/i386/i386/trap.c b/i386/i386/trap.c
> index 1e04ae7d..9a35fb42 100644
> --- a/i386/i386/trap.c
> +++ b/i386/i386/trap.c
> @@ -154,9 +154,9 @@ char *trap_name(unsigned int trapnum)
>   */
>  void kernel_trap(struct i386_saved_state *regs)
>  {
> - int code;
> - int subcode;
> - int type;
> + unsigned long   code;
> + unsigned long   subcode;
> + unsigned long   type;
>   vm_map_t map;
>   kern_return_t   result;
>   thread_t thread;
> @@ -357,9 +357,9 @@ dump_ss(regs);
>  int user_trap(struct i386_saved_state *regs)
>  {
>   int exc = 0;/* Suppress gcc warning */
> - int code;
> - int subcode;
> - int type;
> + unsigned long   code;
> + unsigned long   subcode;
> + unsigned long   type;
>   thread_t thread = current_thread();
>  
>  #ifdef __x86_64__
> diff --git a/x86_64/locore.S b/x86_64/locore.S
> index c54b5cd8..a2663aff 100644
> --- a/x86_64/locore.S
> +++ b/x86_64/locore.S
> @@ -590,7 +590,7 @@ trap_from_kernel:
>  ENTRY(thread_exception_return)
>  ENTRY(thread_bootstrap_return)
>   movq %rsp,%rcx   /* get kernel stack */
> - or  $(KERNEL_STACK_SIZE-1),%ecx
> + or  $(KERNEL_STACK_SIZE-1),%rcx
>   movq -7-IKS_SIZE(%rcx),%rsp  /* switch back to PCB stack */
>   jmp _return_from_trap
>  
> @@ -603,7 +603,7 @@ ENTRY(thread_bootstrap_return)
>  ENTRY(thread_syscall_return)
>   movq S_ARG0,%rax /* get return value */
>   movq %rsp,%rcx   /* get kernel stack */
> - or  $(KERNEL_STACK_SIZE-1),%ecx
> + or  $(KERNEL_STACK_SIZE-1),%rcx
>   movq -7-IKS_SIZE(%rcx),%rsp  /* switch back to PCB stack */
>   movq %rax,R_EAX(%rsp) /* save return value */
>   jmp _return_from_trap
> -- 
> 2.30.2
> 
> 
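
The truncation fixed in locore.S above comes from the x86_64 rule that a
32-bit register write zero-extends into the full 64-bit register. A minimal C
model of the two instruction forms, with hypothetical function names chosen
for this sketch:

```c
#include <stdint.h>

/* Models `or $mask,%ecx`: the operation is done on the low 32 bits and
   the result is zero-extended, so the upper half of the register is lost. */
static uint64_t or_low32(uint64_t reg, uint32_t mask)
{
    return (uint64_t)((uint32_t)reg | mask);
}

/* Models `or $mask,%rcx`: all 64 bits of the register are preserved. */
static uint64_t or_full64(uint64_t reg, uint64_t mask)
{
    return reg | mask;
}
```

With a kernel stack pointer such as 0xffffffff80123456 and a mask of 0xfff,
or_low32 yields 0x80123fff (the high bits are gone), while or_full64 yields
0xffffffff80123fff. Stacks below 4 GiB give the same result either way, which
is why the 32-bit form only breaks once the kernel runs at high addresses.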

-- 
Samuel
---
For an independent, transparent and rigorous evaluation!
I support Inria's Commission d'Évaluation.

[PATCH gnumach] Consider protected payloads in mach_msg_header_t when resizing messages.

2023-02-12 Thread Flavio Cruz

Protected payloads will be 8-byte longs which are the same size as
kernel ports.

Also aligned all the structures to be 4-byte aligned since it makes it
easier to parse them as padding won't be added to mach_msg_user_header_t
before the protected payload.
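
A small sketch of the layout effect described above, assuming an 8-byte
payload sharing a union with a 4-byte port name. The struct and field names
are simplified stand-ins for this example, not the real
mach_msg_user_header_t definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural layout: the 8-byte union member sits on its natural boundary,
   so the compiler may insert padding before the union. */
struct hdr_natural {
    uint32_t bits;
    uint32_t size;
    uint32_t remote_port;
    union {
        uint32_t local_port;
        uint64_t payload;
    };
    uint32_t seqno;
    int32_t  id;
};

/* Same fields, but member alignment is capped at 4 bytes. */
#pragma pack(push, 4)
struct hdr_packed4 {
    uint32_t bits;
    uint32_t size;
    uint32_t remote_port;
    union {
        uint32_t local_port;
        uint64_t payload;
    };
    uint32_t seqno;
    int32_t  id;
};
#pragma pack(pop)

static size_t natural_payload_offset(void)
{
    return offsetof(struct hdr_natural, payload);
}

static size_t packed_payload_offset(void)
{
    return offsetof(struct hdr_packed4, payload);
}
```

On a typical x86_64 ABI the natural layout puts payload at offset 16 (4 bytes
of padding after remote_port), while the packed layout puts it at offset 12
with no padding, for a total size of 28 bytes.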
---
 include/mach/message.h |  9 -
 x86_64/copy_user.c | 22 ++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/include/mach/message.h b/include/mach/message.h
index 21c551c2..63341e6f 100644
--- a/include/mach/message.h
+++ b/include/mach/message.h
@@ -132,6 +132,9 @@ typedef unsigned int mach_msg_size_t;
 typedef natural_t mach_msg_seqno_t;
 typedef integer_t mach_msg_id_t;
 
+/* Force 4-byte alignment to simplify how the kernel parses the messages */
+#pragma pack(push, 4)
+
 /* full header structure, may have different size in user/kernel spaces */
 typedef struct mach_msg_header {
     mach_msg_bits_t msgh_bits;
@@ -151,7 +154,10 @@ typedef struct {
     mach_msg_bits_t msgh_bits;
     mach_msg_size_t msgh_size;
 mach_port_name_t   msgh_remote_port;
-mach_port_name_t   msgh_local_port;
+union {
+mach_port_name_t   msgh_local_port;
+rpc_uintptr_t msgh_protected_payload;
+};
 mach_port_seqno_t  msgh_seqno;
 mach_msg_id_t  msgh_id;
 } mach_msg_user_header_t;
@@ -224,6 +230,7 @@ typedef struct  {
 natural_t  msgtl_number;
 } mach_msg_type_long_t;
 
+#pragma pack(pop)
 
 /*
  * Known values for the msgt_name field.
diff --git a/x86_64/copy_user.c b/x86_64/copy_user.c
index b340d6f0..ae062645 100644
--- a/x86_64/copy_user.c
+++ b/x86_64/copy_user.c
@@ -16,6 +16,7 @@
  * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
 
+#include 
 #include 
 
 #include 
@@ -181,8 +182,21 @@ int copyinmsg (const void *userbuf, void *kernelbuf, const size_t usize)
   /* kmsg->msgh_size is filled in later */
   if (copyin_port(&umsg->msgh_remote_port, &kmsg->msgh_remote_port))
 return 1;
+#ifdef USER32
+  /* This could contain a payload, but for 32 bits it will be the same size as a mach_port_name_t */
+  _Static_assert(sizeof(rpc_uintptr_t) == sizeof(mach_port_name_t),
+ "rpc_uintptr_t and mach_port_name_t expected to have the same size");
   if (copyin_port(&umsg->msgh_local_port, &kmsg->msgh_local_port))
 return 1;
+#else
+  /* For pure 64 bits, the protected payload is as large as a port pointer. */
+  _Static_assert(sizeof(rpc_uintptr_t) == sizeof(mach_port_t),
+ "rpc_uintptr_t and mach_port_t expected to have the same size");
+  if (copyin((char*)umsg + offsetof(mach_msg_user_header_t, msgh_local_port),
+  (char*)kmsg + offsetof(mach_msg_header_t, msgh_local_port),
+ sizeof(rpc_uintptr_t)))
+return 1;
+#endif
   if (copyin(&umsg->msgh_seqno, &kmsg->msgh_seqno,
  sizeof(kmsg->msgh_seqno) + sizeof(kmsg->msgh_id)))
 return 1;
@@ -275,8 +289,16 @@ int copyoutmsg (const void *kernelbuf, void *userbuf, const size_t ksize)
   /* umsg->msgh_size is filled in later */
   if (copyout_port(&kmsg->msgh_remote_port, &umsg->msgh_remote_port))
 return 1;
+#ifdef USER32
   if (copyout_port(&kmsg->msgh_local_port, &umsg->msgh_local_port))
 return 1;
+#else
+  /* Handle protected payloads correctly, same as copyinmsg. */
+  if (copyout((char*)kmsg + offsetof(mach_msg_header_t, msgh_local_port),
+  (char*)umsg + offsetof(mach_msg_user_header_t, msgh_local_port),
+ sizeof(rpc_uintptr_t)))
+return 1;
+#endif
   if (copyout(&kmsg->msgh_seqno, &umsg->msgh_seqno,
  sizeof(kmsg->msgh_seqno) + sizeof(kmsg->msgh_id)))
 return 1;
-- 
2.39.1

Re: [PATCH 8/9] separate initialization of kernel and user PTP tables

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Luca Dariz, on Sun, 12 Feb 2023 18:28:17 +0100, wrote:
> * i386/i386/vm_param.h: temporarily fix kernel upper address
> * i386/intel/pmap.c: split kernel and user L3 map initialization. For
>   simplicity in handling the different configurations, on 32-bit
>   (+PAE) the name PDPNUM_KERNEL is used in place of PDPNUM, while only
>   on x86_64 the PDPNUM_USER and PDPNUM_KERNEL are treated differently.
>   Also, change iterating over PTP tables in case the kernel map is not
>   right after the user map.
> * i386/intel/pmap.h: define PDPNUM_USER and PDPNUM_KERNEL and move
>   PDPSHIFT to simplify ifdefs.
> ---
>  i386/i386/vm_param.h |  2 +-
>  i386/intel/pmap.c| 62 ++--
>  i386/intel/pmap.h|  8 +++---
>  3 files changed, 52 insertions(+), 20 deletions(-)
> 
> diff --git a/i386/i386/vm_param.h b/i386/i386/vm_param.h
> index 5e7f149a..c2e623a6 100644
> --- a/i386/i386/vm_param.h
> +++ b/i386/i386/vm_param.h
> @@ -77,7 +77,7 @@
>  /* This is the kernel address range in linear addresses.  */
>  #ifdef __x86_64__
>  #define LINEAR_MIN_KERNEL_ADDRESS VM_MIN_KERNEL_ADDRESS
> -#define LINEAR_MAX_KERNEL_ADDRESS (0x7fffUL)
> +#define LINEAR_MAX_KERNEL_ADDRESS (0xUL)
>  #else
>  /* On x86, the kernel virtual address space is actually located
> at high linear addresses. */
> diff --git a/i386/intel/pmap.c b/i386/intel/pmap.c
> index a9ff6f3e..7d4ad341 100644
> --- a/i386/intel/pmap.c
> +++ b/i386/intel/pmap.c
> @@ -604,17 +604,22 @@ static void pmap_bootstrap_pae(void)
>  #endif
>   kernel_pmap->l4base = (pt_entry_t*)phystokv(pmap_grab_page());
>   memset(kernel_pmap->l4base, 0, INTEL_PGBYTES);
> +#else
> + const int PDPNUM_KERNEL = PDPNUM;
>  #endif   /* x86_64 */
>  
> - // TODO: allocate only the PDPTE for kernel virtual space
> - // this means all directmap and the stupid limit above it
> - init_alloc_aligned(PDPNUM * INTEL_PGBYTES, &addr);
> + init_alloc_aligned(PDPNUM_KERNEL * INTEL_PGBYTES, &addr);
>   kernel_page_dir = (pt_entry_t*)phystokv(addr);
> + memset(kernel_page_dir, 0, PDPNUM_KERNEL * INTEL_PGBYTES);
>  
>   pdp_kernel = (pt_entry_t*)phystokv(pmap_grab_page());
>   memset(pdp_kernel, 0, INTEL_PGBYTES);
> - for (int i = 0; i < PDPNUM; i++)
> - WRITE_PTE(&pdp_kernel[i],
> + for (int i = 0; i < PDPNUM_KERNEL; i++) {
> + int pdp_index = i;
> +#ifdef __x86_64__
> + pdp_index += lin2pdpnum(VM_MIN_KERNEL_ADDRESS);
> +#endif
> + WRITE_PTE(&pdp_kernel[pdp_index],
> pa_to_pte(_kvtophys((void *) kernel_page_dir
> + i * INTEL_PGBYTES))
> | INTEL_PTE_VALID
> @@ -622,6 +627,7 @@ static void pmap_bootstrap_pae(void)
> | INTEL_PTE_WRITE
>  #endif
>   );
> + }
>  
>  #ifdef __x86_64__
>  /* only fill the kernel pdpte during bootstrap */
> @@ -749,12 +755,12 @@ void pmap_bootstrap(void)
>   pmap_bootstrap_pae();
>  #else/* PAE */
>   kernel_pmap->dirbase = kernel_page_dir = 
> (pt_entry_t*)phystokv(pmap_grab_page());
> -#endif   /* PAE */
>   {
>   unsigned i;
>   for (i = 0; i < NPDES; i++)
>   kernel_page_dir[i] = 0;
>   }
> +#endif   /* PAE */
>  
>  #ifdef   MACH_PV_PAGETABLES
>   pmap_bootstrap_xen()
> @@ -1260,6 +1266,10 @@ pmap_page_table_page_dealloc(vm_offset_t pa)
>   */
>  pmap_t pmap_create(vm_size_t size)
>  {
> +#ifdef __x86_64__
> + // needs to be reworked if we want to dynamically allocate PDPs
> + const int PDPNUM = PDPNUM_KERNEL;
> +#endif
>   pt_entry_t  *page_dir[PDPNUM], *pdp_kernel;
>   int i;
>   pmap_t  p;
> @@ -1328,8 +1338,12 @@ pmap_t pmap_create(vm_size_t size)
>  
>   memset(pdp_kernel, 0, INTEL_PGBYTES);
>   {
> - for (i = 0; i < PDPNUM; i++)
> - WRITE_PTE(&pdp_kernel[i],
> + for (i = 0; i < PDPNUM; i++) {
> + int pdp_index = i;
> +#ifdef __x86_64__
> + pdp_index += lin2pdpnum(VM_MIN_KERNEL_ADDRESS);
> +#endif
> + WRITE_PTE(&pdp_kernel[pdp_index],
> pa_to_pte(kvtophys((vm_offset_t) page_dir[i]))
> | INTEL_PTE_VALID
>  #if (defined(__x86_64__) && !defined(MACH_HYP)) || 
> defined(MACH_PV_PAGETABLES)
> @@ -1339,19 +1353,39 @@ pmap_t pmap_create(vm_size_t size)
>  #endif /* __x86_64__ */
>  #endif
> );
> + }
>   }
>  #ifdef __x86_64__
> - // TODO alloc only PDPTE for the user range VM_MIN_USER_ADDRESS, 
> VM_MAX_USER_ADDRESS
> - // and keep the same for kernel range, in l4 table we have different 
> entries
>   p->l4base = (pt_entry_t *) kmem_cache_a

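The slot arithmetic used in the two loops of this patch can be looked at in
isolation. The following is only a sketch: PDPSHIFT and PDPMASK are the values
quoted from i386/intel/pmap.h in this thread, but the sketch_ names, the
shift-and-mask form assumed for lin2pdpnum(), and the kernel address window
(the top 2 GiB of the address space) are illustrative assumptions, not values
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted from the i386/intel/pmap.h hunks in this thread. */
#define PDPSHIFT 30     /* each page directory pointer entry maps 1 GiB */
#define PDPMASK  0x1ff  /* 512 entries per table on x86_64 */

static_assert(PDPMASK == (1u << 9) - 1, "a PDP index is 9 bits wide");

/* ASSUMPTION: an illustrative kernel window covering the top 2 GiB.
   These are not the constants used by the patch. */
#define SKETCH_VM_MIN_KERNEL_ADDRESS UINT64_C(0xffffffff80000000)
#define SKETCH_VM_MAX_KERNEL_ADDRESS UINT64_C(0xffffffffffffffff)

/* ASSUMPTION: lin2pdpnum() is modelled as a plain shift and mask. */
static inline unsigned sketch_lin2pdpnum(uint64_t addr)
{
    return (unsigned)((addr >> PDPSHIFT) & PDPMASK);
}

/* Number of PDP entries needed to cover [min, max], as in PDPNUM_KERNEL. */
static inline unsigned sketch_pdpnum_kernel(uint64_t min, uint64_t max)
{
    return (unsigned)(((max - min) >> PDPSHIFT) + 1);
}

/* Slot written for the i-th kernel page directory page: the loop index is
   offset by the slot of the lowest kernel address. */
static inline unsigned sketch_kernel_pdp_slot(unsigned i)
{
    return i + sketch_lin2pdpnum(SKETCH_VM_MIN_KERNEL_ADDRESS);
}
```

Under these assumed addresses the kernel needs two entries and they land in
the last two slots of the table, which is why the loops can no longer index
pdp_kernel[] with the bare loop counter.
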
Re: [PATCH 9/9] move kernel virtual address space to upper addresses

2023-02-12 Thread Samuel Thibault

Applied, thanks!!

Luca Dariz, on Sun, 12 Feb 2023 18:28:18 +0100, wrote:
> * i386/i386/vm_param.h: adjust constants to the new kernel map
>   - the boothdr.S code already sets up a temporary map to higher
> addresses, so we can use INIT_VM_MIN_KERNEL_ADDRESS as in xen
>   - increase the kernel map size to accommodate bigger structures
> and more memory
>   - adjust kernel max address and directmap limit
> * i386/i386at/biosmem.c: enable directmap check also on x86_64
> * i386/include/mach/i386/vm_param.h: increase user virtual memory
>   limit as it's not conflicting with the kernel's anymore
> * i386/intel/pmap.h: adjust lin2pdenum_cont() and INTEL_PTE_PFN to the
>   new kernel map
> * x86_64/Makefrag.am: change KERNEL_MAP_BASE to be above 4G, in
>   accordance with mcmodel=kernel. This allows using the full memory
>   address space.
> ---
>  i386/i386/vm_param.h  | 20 
>  i386/i386at/biosmem.c |  2 --
>  i386/include/mach/i386/vm_param.h |  2 +-
>  i386/intel/pmap.h | 12 ++--
>  x86_64/Makefrag.am| 12 ++--
>  5 files changed, 33 insertions(+), 15 deletions(-)
> 
> diff --git a/i386/i386/vm_param.h b/i386/i386/vm_param.h
> index c2e623a6..8264ea11 100644
> --- a/i386/i386/vm_param.h
> +++ b/i386/i386/vm_param.h
> @@ -45,7 +45,7 @@
>  #define VM_MIN_KERNEL_ADDRESS0xC000UL
>  #endif
>  
> -#ifdef   MACH_XEN
> +#if defined(MACH_XEN) || defined (__x86_64__)
>  /* PV kernels can be loaded directly to the target virtual address */
>  #define INIT_VM_MIN_KERNEL_ADDRESS   VM_MIN_KERNEL_ADDRESS
>  #else/* MACH_XEN */
> @@ -72,12 +72,22 @@
>   * Reserve mapping room for the kernel map, which includes
>   * the device I/O map and the IPC map.
>   */
> +#ifdef __x86_64__
> +/*
> + * Vm structures are quite bigger on 64 bit.
> + * This should be well enough for 8G of physical memory; on the other hand,
> + * maybe not all of them need to be in directly-mapped memory, see the parts
> + * allocated with pmap_steal_memory().
> + */
> +#define VM_KERNEL_MAP_SIZE (512 * 1024 * 1024)
> +#else
>  #define VM_KERNEL_MAP_SIZE (152 * 1024 * 1024)
> +#endif
>  
>  /* This is the kernel address range in linear addresses.  */
>  #ifdef __x86_64__
>  #define LINEAR_MIN_KERNEL_ADDRESS VM_MIN_KERNEL_ADDRESS
> -#define LINEAR_MAX_KERNEL_ADDRESS (0xUL)
> +#define LINEAR_MAX_KERNEL_ADDRESS (0xUL)
>  #else
>  /* On x86, the kernel virtual address space is actually located
> at high linear addresses. */
> @@ -141,8 +151,10 @@
>  #else /* MACH_XEN */
>  #ifdef __LP64__
>  #define VM_PAGE_MAX_SEGS 4
> -#define VM_PAGE_DMA32_LIMIT DECL_CONST(0x1, UL)
> -#define VM_PAGE_DIRECTMAP_LIMIT DECL_CONST(0x4000, UL)
> +#define VM_PAGE_DMA32_LIMIT DECL_CONST(0x1000, UL)
> +#define VM_PAGE_DIRECTMAP_LIMIT (VM_MAX_KERNEL_ADDRESS \
> +  - VM_MIN_KERNEL_ADDRESS \
> +  - VM_KERNEL_MAP_SIZE + 1)
>  #define VM_PAGE_HIGHMEM_LIMIT   DECL_CONST(0x10, UL)
>  #else /* __LP64__ */
>  #define VM_PAGE_DIRECTMAP_LIMIT (VM_MAX_KERNEL_ADDRESS \
> diff --git a/i386/i386at/biosmem.c b/i386/i386at/biosmem.c
> index 78e7bb21..880989fe 100644
> --- a/i386/i386at/biosmem.c
> +++ b/i386/i386at/biosmem.c
> @@ -637,10 +637,8 @@ biosmem_setup_allocator(const struct multiboot_raw_info 
> *mbi)
>   */
>  end = vm_page_trunc((mbi->mem_upper + 1024) << 10);
>  
> -#ifndef __LP64__
>  if (end > VM_PAGE_DIRECTMAP_LIMIT)
>  end = VM_PAGE_DIRECTMAP_LIMIT;
> -#endif /* __LP64__ */
>  
>  max_heap_start = 0;
>  max_heap_end = 0;
> diff --git a/i386/include/mach/i386/vm_param.h 
> b/i386/include/mach/i386/vm_param.h
> index a684ed97..e98f032c 100644
> --- a/i386/include/mach/i386/vm_param.h
> +++ b/i386/include/mach/i386/vm_param.h
> @@ -74,7 +74,7 @@
> */
>  #define VM_MIN_ADDRESS   (0)
>  #ifdef __x86_64__
> -#define VM_MAX_ADDRESS   (0x4000UL)
> +#define VM_MAX_ADDRESS   (0xC000UL)
>  #else
>  #define VM_MAX_ADDRESS   (0xc000UL)
>  #endif
> diff --git a/i386/intel/pmap.h b/i386/intel/pmap.h
> index 34c7cc89..78d27bc8 100644
> --- a/i386/intel/pmap.h
> +++ b/i386/intel/pmap.h
> @@ -77,10 +77,10 @@ typedef phys_addr_t pt_entry_t;
>  #define PDPNUM_KERNEL (((VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS) >> PDPSHIFT) + 1)
>  #define PDPNUM_USER  (((VM_MAX_USER_ADDRESS - VM_MIN_USER_ADDRESS) >> PDPSHIFT) + 1)
>  #define PDPMASK  0x1ff   /* mask for page directory pointer index */
> -#else
> +#else /* __x86_64__ */
>  #define PDPNUM   4   /* number of page directory pointers */
>  #define PDPMASK  3   /* mask for page directory pointer index */
> -#endif
> +#endif /* __x86_64__ */
>  #define PDPSHIFT 30  /* page directory pointer */
>  #define PDESHIFT 21

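To make the new VM_PAGE_DIRECTMAP_LIMIT expression and the now unconditional
clamp in biosmem_setup_allocator() concrete, here is a small sketch. The
512 MiB map size is the value from the patch; the sketch_ names are
hypothetical, and any kernel address bounds passed in (for instance a window
covering the top 2 GiB) are assumptions, since the actual constants are not
legible in this archive.

```c
#include <assert.h>
#include <stdint.h>

/* From the patch: VM_KERNEL_MAP_SIZE on x86_64. */
#define SKETCH_VM_KERNEL_MAP_SIZE (UINT64_C(512) * 1024 * 1024)

/* Mirrors VM_MAX_KERNEL_ADDRESS - VM_MIN_KERNEL_ADDRESS
   - VM_KERNEL_MAP_SIZE + 1 from the patch. */
static inline uint64_t sketch_directmap_limit(uint64_t min_kaddr,
                                              uint64_t max_kaddr,
                                              uint64_t map_size)
{
    /* The kernel window must be large enough to hold the kernel map. */
    assert(max_kaddr >= min_kaddr);
    assert(max_kaddr - min_kaddr >= map_size - 1);
    return max_kaddr - min_kaddr - map_size + 1;
}

/* Mirrors the clamp in biosmem_setup_allocator(): physical memory above
   the direct-map limit is not handed to the early heap. */
static inline uint64_t sketch_clamp_heap_end(uint64_t end, uint64_t limit)
{
    return end > limit ? limit : end;
}
```

With an assumed 2 GiB kernel window the limit comes out at 1.5 GiB, so a
machine reporting 8 GiB of memory has its early heap end clamped to that
value, while a machine with 1 GiB is unaffected.
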
[PATCH gnumach] Add x86_64 registers to i386_thread_state

2023-02-12 Thread Flavio Cruz

This is required to implement ptrace.
---
 i386/i386/pcb.c| 42 +-
 i386/include/mach/i386/thread_status.h | 28 +
 2 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
index 9ac55a1c..ba856523 100644
--- a/i386/i386/pcb.c
+++ b/i386/i386/pcb.c
@@ -500,6 +500,25 @@ kern_return_t thread_setstatus(
/*
 * General registers
 */
+#if defined(__x86_64__) && !defined(USER32)
+   saved_state->r8 = state->r8;
+   saved_state->r9 = state->r9;
+   saved_state->r10 = state->r10;
+   saved_state->r11 = state->r11;
+   saved_state->r12 = state->r12;
+   saved_state->r13 = state->r13;
+   saved_state->r14 = state->r14;
+   saved_state->r15 = state->r15;
+   saved_state->edi = state->rdi;
+   saved_state->esi = state->rsi;
+   saved_state->ebp = state->rbp;
+   saved_state->uesp = state->ursp;
+   saved_state->ebx = state->rbx;
+   saved_state->edx = state->rdx;
+   saved_state->ecx = state->rcx;
+   saved_state->eax = state->rax;
+   saved_state->eip = state->rip;
+#else
saved_state->edi = state->edi;
saved_state->esi = state->esi;
saved_state->ebp = state->ebp;
@@ -509,6 +528,7 @@ kern_return_t thread_setstatus(
saved_state->ecx = state->ecx;
saved_state->eax = state->eax;
saved_state->eip = state->eip;
+#endif /* __x86_64__ && !USER32 */
saved_state->efl = (state->efl & ~EFL_USER_CLEAR)
| EFL_USER_SET;
 
@@ -696,6 +716,25 @@ kern_return_t thread_getstatus(
/*
 * General registers.
 */
+#if defined(__x86_64__) && !defined(USER32)
+   state->r8 = saved_state->r8;
+   state->r9 = saved_state->r9;
+   state->r10 = saved_state->r10;
+   state->r11 = saved_state->r11;
+   state->r12 = saved_state->r12;
+   state->r13 = saved_state->r13;
+   state->r14 = saved_state->r14;
+   state->r15 = saved_state->r15;
+   state->rdi = saved_state->edi;
+   state->rsi = saved_state->esi;
+   state->rbp = saved_state->ebp;
+   state->rbx = saved_state->ebx;
+   state->rdx = saved_state->edx;
+   state->rcx = saved_state->ecx;
+   state->rax = saved_state->eax;
+   state->rip = saved_state->eip;
+   state->ursp = saved_state->uesp;
+#else
state->edi = saved_state->edi;
state->esi = saved_state->esi;
state->ebp = saved_state->ebp;
@@ -704,8 +743,9 @@ kern_return_t thread_getstatus(
state->ecx = saved_state->ecx;
state->eax = saved_state->eax;
state->eip = saved_state->eip;
-   state->efl = saved_state->efl;
state->uesp = saved_state->uesp;
+#endif /* __x86_64__ && !USER32 */
+   state->efl = saved_state->efl;
 
state->cs = saved_state->cs;
state->ss = saved_state->ss;
diff --git a/i386/include/mach/i386/thread_status.h 
b/i386/include/mach/i386/thread_status.h
index ba1e3dea..2d05947e 100644
--- a/i386/include/mach/i386/thread_status.h
+++ b/i386/include/mach/i386/thread_status.h
@@ -67,6 +67,26 @@ struct i386_thread_state {
unsigned int fs;
unsigned int es;
unsigned int ds;
+
+#if defined(__x86_64__) && !defined(USER32)
+   uint64_t r8;
+   uint64_t r9;
+   uint64_t r10;
+   uint64_t r11;
+   uint64_t r12;
+   uint64_t r13;
+   uint64_t r14;
+   uint64_t r15;
+   uint64_t rdi;
+   uint64_t rsi;
+   uint64_t rbp;
+   uint64_t rsp;
+   uint64_t rbx;
+   uint64_t rdx;
+   uint64_t rcx;
+   uint64_t rax;
+   uint64_t rip;
+#else
unsigned int edi;
unsigned int esi;
unsigned int ebp;
@@ -76,9 +96,17 @@ struct i386_thread_state {
unsigned int ecx;
unsigned int eax;
unsigned int eip;
+#endif  /* __x86_64__ && !USER32 */
+
unsigned int cs;
+#if defined(__x86_64__) && !defined(USER32)
+   uint64_t efl;
+   uint64_t ursp;
+#else
unsigned int efl;
unsigned int uesp;
+#endif  /* __x86_64__ and !USER32 */
+
unsigned int ss;
 };
 #define i386_THREAD_STATE_COUNT (sizeof (struct i386_thread_state)/sizeof(unsigned int))
-- 
2.39.1

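One consequence of the new layout is worth spelling out: the
i386_THREAD_STATE_COUNT macro measures the structure in units of unsigned int,
so any alignment padding around the 64-bit members is counted as well. The
sketch below uses a hypothetical sketch_thread_state that follows the field
order of the patch; the leading gs member is an assumption, since it is not
visible in the hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in following the 64-bit branch of the patch.
   ASSUMPTION: gs precedes fs/es/ds, as it is not shown in the hunk. */
struct sketch_thread_state {
    unsigned int gs, fs, es, ds;
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
    uint64_t rdi, rsi, rbp, rsp, rbx, rdx, rcx, rax, rip;
    unsigned int cs;
    uint64_t efl, ursp;
    unsigned int ss;
};

/* Same shape as i386_THREAD_STATE_COUNT: size in 32-bit words. */
#define SKETCH_THREAD_STATE_COUNT \
    (sizeof (struct sketch_thread_state) / sizeof (unsigned int))

/* The division must be exact for the count to describe the whole struct. */
static_assert(sizeof (struct sketch_thread_state) % sizeof (unsigned int) == 0,
              "thread state must be a whole number of 32-bit words");

/* Words occupied by the declared members alone: 6 ints + 19 uint64_t. */
static inline unsigned sketch_member_words(void)
{
    return 6 + 19 * 2;
}
```

Any difference between SKETCH_THREAD_STATE_COUNT and sketch_member_words() is
padding inserted by the compiler next to cs and ss, which callers passing the
count to thread_get_state and thread_set_state end up transferring too.
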
Re: [PATCH mig] Make MIG work for pure 64 bit kernel and userland (V2)

2023-02-12 Thread Samuel Thibault

Flavio Cruz, on Sun, 12 Feb 2023 12:49:44 -0500, wrote:
> For the 64 bit / 32 bit configuration, we can pass --enable-user32 when
> targeting x86_64, and MIG will generate stubs as before. I am unsure
> whether there exists undefined behavior in that particular configuration
> since most of the structures used by the kernel do not seem to use 8
> byte scalars so potentially we should be OK there.
> 
> diff --git a/parser.y b/parser.y
> index 03c5ec8..3aa18c2 100644
> --- a/parser.y
> +++ b/parser.y
> @@ -212,6 +212,16 @@ Subsystem:   SubsystemStart 
> SubsystemMods
>  IsKernelUser ? ", KernelUser" : "",
>  IsKernelServer ? ", KernelServer" : "");
>  }
> +if (IsKernelUser || IsKernelServer) {
> +port_size = vm_offset_size;
> +port_size_in_bits = vm_offset_size_in_bits;
> +#ifdef USER32
> +/* Since we are using MIG to generate stubs for a 64 bit kernel that 
> operates
> +   with a 32 bit userland, then we need to consider that 
> `max_alignof` is 4 byte aligned.
> + */
> +max_alignof = sizeof(uint32_t);
> +#endif
> +}
>  init_type();
>  }

As I mentioned in another mail: I thought Luca's work inside gnumach was
already handling conversion?

At the very least I'd rather see a runtime option to set max_alignof,
rather than having a mig32, a mig64, and a mig32/64.

Samuel

Re: [PATCH gnumach] Consider protected payloads in mach_msg_header_t when resizing messages.

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Flavio Cruz, on Sun, 12 Feb 2023 13:11:13 -0500, wrote:
> Protected payloads will be 8-byte longs which are the same size as
> kernel ports.
> 
> Also aligned all the structures to be 4-byte aligned since it makes it
> easier to parse them as padding won't be added to mach_msg_user_header_t
> before the protected payload.
> ---
>  include/mach/message.h |  9 -
>  x86_64/copy_user.c | 22 ++
>  2 files changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/include/mach/message.h b/include/mach/message.h
> index 21c551c2..63341e6f 100644
> --- a/include/mach/message.h
> +++ b/include/mach/message.h
> @@ -132,6 +132,9 @@ typedef   unsigned int mach_msg_size_t;
>  typedef natural_t mach_msg_seqno_t;
>  typedef integer_t mach_msg_id_t;
>  
> +/* Force 4-byte alignment to simplify how the kernel parses the messages */
> +#pragma pack(push, 4)
> +
>  /* full header structure, may have different size in user/kernel spaces */
>  typedef  struct mach_msg_header {
>  mach_msg_bits_t  msgh_bits;
> @@ -151,7 +154,10 @@ typedef  struct {
>  mach_msg_bits_t  msgh_bits;
>  mach_msg_size_t  msgh_size;
>  mach_port_name_t msgh_remote_port;
> -mach_port_name_t msgh_local_port;
> +union {
> +mach_port_name_t msgh_local_port;
> +rpc_uintptr_t msgh_protected_payload;
> +};
>  mach_port_seqno_tmsgh_seqno;
>  mach_msg_id_tmsgh_id;
>  } mach_msg_user_header_t;
> @@ -224,6 +230,7 @@ typedef   struct  {
>  natural_tmsgtl_number;
>  } mach_msg_type_long_t;
>  
> +#pragma pack(pop)
>  
>  /*
>   *   Known values for the msgt_name field.
> diff --git a/x86_64/copy_user.c b/x86_64/copy_user.c
> index b340d6f0..ae062645 100644
> --- a/x86_64/copy_user.c
> +++ b/x86_64/copy_user.c
> @@ -16,6 +16,7 @@
>   * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
>   */
>  
> +#include 
>  #include 
>  
>  #include 
> @@ -181,8 +182,21 @@ int copyinmsg (const void *userbuf, void *kernelbuf, 
> const size_t usize)
>/* kmsg->msgh_size is filled in later */
>if (copyin_port(&umsg->msgh_remote_port, &kmsg->msgh_remote_port))
>  return 1;
> +#ifdef USER32
> +  /* This could contain a payload, but for 32 bits it will be the same size 
> as a mach_port_name_t */
> +  _Static_assert(sizeof(rpc_uintptr_t) == sizeof(mach_port_name_t),
> + "rpc_uintptr_t and mach_port_name_t expected to have the 
> same size");
>if (copyin_port(&umsg->msgh_local_port, &kmsg->msgh_local_port))
>  return 1;
> +#else
> +  /* For pure 64 bits, the protected payload is as large as a port pointer. 
> */
> +  _Static_assert(sizeof(rpc_uintptr_t) == sizeof(mach_port_t),
> + "rpc_uintptr_t and mach_port_t expected to have the same 
> size");
> +  if (copyin((char*)umsg + offsetof(mach_msg_user_header_t, msgh_local_port),
> +  (char*)kmsg + offsetof(mach_msg_header_t, msgh_local_port),
> +   sizeof(rpc_uintptr_t)))
> +return 1;
> +#endif
>if (copyin(&umsg->msgh_seqno, &kmsg->msgh_seqno,
>   sizeof(kmsg->msgh_seqno) + sizeof(kmsg->msgh_id)))
>  return 1;
> @@ -275,8 +289,16 @@ int copyoutmsg (const void *kernelbuf, void *userbuf, 
> const size_t ksize)
>/* umsg->msgh_size is filled in later */
>if (copyout_port(&kmsg->msgh_remote_port, &umsg->msgh_remote_port))
>  return 1;
> +#ifdef USER32
>if (copyout_port(&kmsg->msgh_local_port, &umsg->msgh_local_port))
>  return 1;
> +#else
> +  /* Handle protected payloads correctly, same as copyinmsg. */
> +  if (copyout((char*)kmsg + offsetof(mach_msg_header_t, msgh_local_port),
> +  (char*)umsg + offsetof(mach_msg_user_header_t, 
> msgh_local_port),
> +   sizeof(rpc_uintptr_t)))
> +return 1;
> +#endif
>if (copyout(&kmsg->msgh_seqno, &umsg->msgh_seqno,
>   sizeof(kmsg->msgh_seqno) + sizeof(kmsg->msgh_id)))
>  return 1;
> -- 
> 2.39.1
> 
> 

-- 
Samuel
---
For an independent, transparent and rigorous evaluation!
I support the Inria Evaluation Committee.

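The effect of the 4-byte packing described in the commit message above can be
checked with a reduced structure. This is a sketch only: sketch_hdr_natural
and sketch_hdr_packed are hypothetical stand-ins for mach_msg_user_header_t,
with fixed-width integers substituted for the Mach typedefs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with default alignment: the 8-byte payload may force
   4 bytes of padding after remote_port on ABIs that align uint64_t to 8. */
struct sketch_hdr_natural {
    uint32_t bits;
    uint32_t size;
    uint32_t remote_port;
    union {
        uint32_t local_port;
        uint64_t protected_payload;
    };
    uint32_t seqno;
    uint32_t id;
};

/* Same members, but packed to 4 bytes as the patch does in message.h. */
#pragma pack(push, 4)
struct sketch_hdr_packed {
    uint32_t bits;
    uint32_t size;
    uint32_t remote_port;
    union {
        uint32_t local_port;
        uint64_t protected_payload;
    };
    uint32_t seqno;
    uint32_t id;
};
#pragma pack(pop)

/* With packing, the union sits directly after the three 32-bit fields. */
static_assert(offsetof(struct sketch_hdr_packed, protected_payload) == 12,
              "no padding before the protected payload");

/* Bytes of padding removed by the pragma (0 where uint64_t aligns to 4). */
static inline size_t sketch_padding_saved(void)
{
    return sizeof(struct sketch_hdr_natural) - sizeof(struct sketch_hdr_packed);
}
```

Because the packed layout has no holes, copyinmsg and copyoutmsg can move the
payload with a plain offsetof-based copy, as the patch does.
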
Re: [RFC PATCH mig 7/12] Drop -undef -ansi from cpp flags

2023-02-12 Thread Flávio Cruz

On Sun, Feb 12, 2023 at 10:02 AM Samuel Thibault wrote:

> Sergey Bugaev, on Sun, 12 Feb 2023 14:10:38 +0300, wrote:
> > Since GNU Mach commit d30481122a5d24ad6b921062f93b9172ef922fc3,
> > i386/machine_types.defs defines types based on defined(__x86_64__).
> > Suppressing the built-in macro definitions will now result in the wrong
> > type being silently selected.
> >
> > -undef was initially introduced in commit
> > 78b6a7665db7b2eae367e17102821cbdca231d19 without much of an explanation.
> > -ansi was introduced in commit 6940fb91859e46b2e96a331a029f2dc2a0ee51c9
> > "to avoid -Di386=1 and the like".
> >
> > Since glibc has been using MIG with CPP set to a custom GCC invocation
> > which did *not* use either flag, it appears that everything works well
> > enough even without them. On the other hand, not having __x86_64__
> > defined most definitely causes issues for anything that does not set a
> > custom CPP when invoking MIG (i.e., most users). Other built-in
> > definitions could be used in the future in a similar way (e.g. on other
> > architectures); it's really more of a coincidence that they have not
> > been used until now, and things kept working with -undef.
>
> That looks alright to me. Flavio, what do you think about it?
>

Seems fine to me.


>
> > ---
> >  mig.in | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mig.in b/mig.in
> > index 63e0269..94fd500 100644
> > --- a/mig.in
> > +++ b/mig.in
> > @@ -38,7 +38,7 @@ migcom=${MIGDIR-$libexecdir}/${MIGCOM-@MIGCOM@}
> >  # The expansion of TARGET_CC might refer to ${CC}, so make sure it is defined.
> >  default_cc="@CC@"
> >  CC="${CC-${default_cc}}"
> > -default_cpp="@TARGET_CC@ -E -x c -undef -ansi"
> > +default_cpp="@TARGET_CC@ -E -x c"
> >  cpp="${CPP-${default_cpp}}"
> >
> >  cppflags=
> > --
> > 2.39.1
>

Re: [RFC PATCH mig 7/12] Drop -undef -ansi from cpp flags

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Sergey Bugaev, on Sun, 12 Feb 2023 14:10:38 +0300, wrote:
> Since GNU Mach commit d30481122a5d24ad6b921062f93b9172ef922fc3,
> i386/machine_types.defs defines types based on defined(__x86_64__).
> Suppressing the built-in macro definitions will now result in the wrong
> type being silently selected.
> 
> -undef was initially introduced in commit
> 78b6a7665db7b2eae367e17102821cbdca231d19 without much of an explanation.
> -ansi was introduced in commit 6940fb91859e46b2e96a331a029f2dc2a0ee51c9
> "to avoid -Di386=1 and the like".
> 
> Since glibc has been using MIG with CPP set to a custom GCC invocation
> which did *not* use either flag, it appears that everything works well
> enough even without them. On the other hand, not having __x86_64__
> defined most definitely causes issues for anything that does not set a
> custom CPP when invoking MIG (i.e., most users). Other built-in
> definitions could be used in the future in a similar way (e.g. on other
> architectures); it's really more of a coincidence that they have not
> been used until now, and things kept working with -undef.
> ---
>  mig.in | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mig.in b/mig.in
> index 63e0269..94fd500 100644
> --- a/mig.in
> +++ b/mig.in
> @@ -38,7 +38,7 @@ migcom=${MIGDIR-$libexecdir}/${MIGCOM-@MIGCOM@}
>  # The expansion of TARGET_CC might refer to ${CC}, so make sure it is 
> defined.
>  default_cc="@CC@"
>  CC="${CC-${default_cc}}"
> -default_cpp="@TARGET_CC@ -E -x c -undef -ansi"
> +default_cpp="@TARGET_CC@ -E -x c"
>  cpp="${CPP-${default_cpp}}"
>  
>  cppflags=
> -- 
> 2.39.1
> 
> 

-- 
Samuel

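The failure mode described in the commit message is easy to reproduce outside
MIG. In the sketch below, sketch_uintptr_t and SKETCH_SELECTED_BITS are
hypothetical names standing in for the kind of selection that
i386/machine_types.defs performs; the point is only that when __x86_64__ is
not defined, the preprocessor takes the 32-bit branch without any diagnostic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type selection keyed on a compiler built-in macro. If the
   preprocessor runs with -undef on an x86_64 target, __x86_64__ is missing
   and the #else branch is taken silently. */
#if defined(__x86_64__)
typedef uint64_t sketch_uintptr_t;
#define SKETCH_SELECTED_BITS 64
#else
typedef uint32_t sketch_uintptr_t;
#define SKETCH_SELECTED_BITS 32
#endif

static_assert(sizeof(sketch_uintptr_t) * 8 == SKETCH_SELECTED_BITS,
              "selected type and advertised width agree");

/* Returns 1 when the selected type is as wide as a real pointer on this
   target, 0 when the selection is wrong for it. */
static inline int sketch_selection_matches_target(void)
{
    return sizeof(sketch_uintptr_t) == sizeof(void *);
}
```

Nothing in the translation unit fails to compile in the wrong branch;
only a size comparison like the one in sketch_selection_matches_target()
would reveal the mismatch, which is why dropping -undef is the safer default.
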
Re: [PATCH gnumach] Add x86_64 registers to i386_thread_state

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Flavio Cruz, on Sun, 12 Feb 2023 13:26:29 -0500, wrote:
> This is required to implement ptrace.
> ---
>  i386/i386/pcb.c| 42 +-
>  i386/include/mach/i386/thread_status.h | 28 +
>  2 files changed, 69 insertions(+), 1 deletion(-)
> 
> diff --git a/i386/i386/pcb.c b/i386/i386/pcb.c
> index 9ac55a1c..ba856523 100644
> --- a/i386/i386/pcb.c
> +++ b/i386/i386/pcb.c
> @@ -500,6 +500,25 @@ kern_return_t thread_setstatus(
>   /*
>* General registers
>*/
> +#if defined(__x86_64__) && !defined(USER32)
> + saved_state->r8 = state->r8;
> + saved_state->r9 = state->r9;
> + saved_state->r10 = state->r10;
> + saved_state->r11 = state->r11;
> + saved_state->r12 = state->r12;
> + saved_state->r13 = state->r13;
> + saved_state->r14 = state->r14;
> + saved_state->r15 = state->r15;
> + saved_state->edi = state->rdi;
> + saved_state->esi = state->rsi;
> + saved_state->ebp = state->rbp;
> + saved_state->uesp = state->ursp;
> + saved_state->ebx = state->rbx;
> + saved_state->edx = state->rdx;
> + saved_state->ecx = state->rcx;
> + saved_state->eax = state->rax;
> + saved_state->eip = state->rip;
> +#else
>   saved_state->edi = state->edi;
>   saved_state->esi = state->esi;
>   saved_state->ebp = state->ebp;
> @@ -509,6 +528,7 @@ kern_return_t thread_setstatus(
>   saved_state->ecx = state->ecx;
>   saved_state->eax = state->eax;
>   saved_state->eip = state->eip;
> +#endif /* __x86_64__ && !USER32 */
>   saved_state->efl = (state->efl & ~EFL_USER_CLEAR)
>   | EFL_USER_SET;
>  
> @@ -696,6 +716,25 @@ kern_return_t thread_getstatus(
>   /*
>* General registers.
>*/
> +#if defined(__x86_64__) && !defined(USER32)
> + state->r8 = saved_state->r8;
> + state->r9 = saved_state->r9;
> + state->r10 = saved_state->r10;
> + state->r11 = saved_state->r11;
> + state->r12 = saved_state->r12;
> + state->r13 = saved_state->r13;
> + state->r14 = saved_state->r14;
> + state->r15 = saved_state->r15;
> + state->rdi = saved_state->edi;
> + state->rsi = saved_state->esi;
> + state->rbp = saved_state->ebp;
> + state->rbx = saved_state->ebx;
> + state->rdx = saved_state->edx;
> + state->rcx = saved_state->ecx;
> + state->rax = saved_state->eax;
> + state->rip = saved_state->eip;
> + state->ursp = saved_state->uesp;
> +#else
>   state->edi = saved_state->edi;
>   state->esi = saved_state->esi;
>   state->ebp = saved_state->ebp;
> @@ -704,8 +743,9 @@ kern_return_t thread_getstatus(
>   state->ecx = saved_state->ecx;
>   state->eax = saved_state->eax;
>   state->eip = saved_state->eip;
> - state->efl = saved_state->efl;
>   state->uesp = saved_state->uesp;
> +#endif /* __x86_64__ && !USER32 */
> + state->efl = saved_state->efl;
>  
>   state->cs = saved_state->cs;
>   state->ss = saved_state->ss;
> diff --git a/i386/include/mach/i386/thread_status.h 
> b/i386/include/mach/i386/thread_status.h
> index ba1e3dea..2d05947e 100644
> --- a/i386/include/mach/i386/thread_status.h
> +++ b/i386/include/mach/i386/thread_status.h
> @@ -67,6 +67,26 @@ struct i386_thread_state {
>   unsigned int fs;
>   unsigned int es;
>   unsigned int ds;
> +
> +#if defined(__x86_64__) && !defined(USER32)
> + uint64_t r8;
> + uint64_t r9;
> + uint64_t r10;
> + uint64_t r11;
> + uint64_t r12;
> + uint64_t r13;
> + uint64_t r14;
> + uint64_t r15;
> + uint64_t rdi;
> + uint64_t rsi;
> + uint64_t rbp;
> + uint64_t rsp;
> + uint64_t rbx;
> + uint64_t rdx;
> + uint64_t rcx;
> + uint64_t rax;
> + uint64_t rip;
> +#else
>   unsigned int edi;
>   unsigned int esi;
>   unsigned int ebp;
> @@ -76,9 +96,17 @@ struct i386_thread_state {
>   unsigned int ecx;
>   unsigned int eax;
>   unsigned int eip;
> +#endif  /* __x86_64__ && !USER32 */
> +
>   unsigned int cs;
> +#if defined(__x86_64__) && !defined(USER32)
> + uint64_t efl;
> + uint64_t ursp;
> +#else
>   unsigned int efl;
>   unsigned int uesp;
> +#endif  /* __x86_64__ and !USER32 */
> +
>   unsigned int ss;
>  };
>  #define i386_THRE

Re: [RFC PATCH hurd 6/12] hurd: Fix modes_t and speeds_t types on 64-bit

2023-02-12 Thread Samuel Thibault

Applied, thanks!
Sergey Bugaev, on Sun, 12 Feb 2023 19:13:58 +0300, wrote:
> On Sun, Feb 12, 2023 at 6:22 PM Samuel Thibault wrote:
> > I'd rather say drop the "long" part, to avoid having to pull the
> > stdint.h header in.
> 
> That's what I meant, yes.
> 
> > Nowadays' BSD headers just use the int type,
> > notably.
> 
> So given that the Linux port has its own bits/termios.h, is it fine to
> just modify the generic one inline, rather than creating a
> Hurd-specific version? The patch for that follows.
> 
> Sergey
> 
> -- >8 --
> 
> From 625b774141823a8f504cce8a92b9f45f5e6f050f Mon Sep 17 00:00:00 2001
> From: Sergey Bugaev 
> Date: Sun, 12 Feb 2023 19:08:57 +0300
> Subject: [PATCH] hurd: Fix tcflag_t and speed_t types on 64-bit
> 
> These are supposed to stay 32-bit even on 64-bit systems. This matches
> BSD and Linux, as well as how these types are already defined in
> tioctl.defs.
> 
> Signed-off-by: Sergey Bugaev 
> ---
>  bits/termios.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/bits/termios.h b/bits/termios.h
> index 4439c2f1..6a883ceb 100644
> --- a/bits/termios.h
> +++ b/bits/termios.h
> @@ -99,13 +99,13 @@
> `tcflag_t', `cc_t', `speed_t' and the `TC*' constants appropriately.  */
> 
>  /* Type of terminal control flag masks.  */
> -typedef unsigned long int tcflag_t;
> +typedef unsigned int tcflag_t;
> 
>  /* Type of control characters.  */
>  typedef unsigned char cc_t;
> 
>  /* Type of baud rate specifiers.  */
> -typedef long int speed_t;
> +typedef int speed_t;
> 
>  /* Terminal control structure.  */
>  struct termios
> -- 
> 2.39.1
> 

-- 
Samuel

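The width change in this patch can be stated as compile-time checks. The
sketch below uses hypothetical sketch_ typedefs rather than the real
bits/termios.h names, and it assumes that int is 32 bits wide, which holds on
the ABIs discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical copies of the new definitions from the patch. */
typedef unsigned int sketch_tcflag_t;
typedef int sketch_speed_t;

/* ASSUMPTION: int is 32 bits on every target of interest. */
static_assert(sizeof(sketch_tcflag_t) == 4, "tcflag_t stays 32-bit");
static_assert(sizeof(sketch_speed_t) == 4, "speed_t stays 32-bit");

/* Width the old 'unsigned long int' definition would have had on this
   target: 8 on LP64 systems, 4 on ILP32 ones. */
static inline size_t sketch_old_tcflag_width(void)
{
    return sizeof(unsigned long int);
}
```

On an LP64 target sketch_old_tcflag_width() returns 8, which no longer
matches the 32-bit types used in tioctl.defs; with the new typedefs the width
is the same on 32-bit and 64-bit builds.
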
Re: [RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Florian Weimer

* Samuel Thibault:

> Florian Weimer, on Sun, 12 Feb 2023 17:40:58 +0100, wrote:
>> * Samuel Thibault via Libc-alpha:
>> 
>> > Sergey Bugaev, on Sun, 12 Feb 2023 19:25:11 +0300, wrote:
>> >> On Sun, Feb 12, 2023 at 7:11 PM Samuel Thibault wrote:
>> >> > Sergey Bugaev, on Sun, 12 Feb 2023 14:10:42 +0300, wrote:
>> >> > > We should not need a getter routine, because one can simply inspect the target
>> >> > > thread's state (unless, again, I misunderstand things horribly).
>> >> >
>> >> > For 16bit fs/gs values we could read them from userland yes. But for
>> >> > fs/gs base, the FSGSBASE instruction is not available on all 64bit
>> >> > processors. And ATM in THREAD_TCB we want to be able to get the base of
>> >> > another thread.
>> >> 
>> >> What I've meant is:
>> >> 
>> >> __thread_get_state (whatever_thread, &state);
>> >> uintptr_t its_fs_base = state->fs_base;
>> >> 
>> >> You can't really do the same to *write* [fg]s_base, because doing
>> >> thread_set_state on your own thread is bound to end badly.
>> >
>> > ? Well, sure, just like setting fs/gs through thread state was not done
>> > for i386.
>> >
>> > I don't see where you're aiming. Getting fs/gs from __thread_get_state
>> > won't actually give you the base, you'll just read something like 0.
>> 
>> The convention is that the FSBASE address is at %fs:0.
>
> Yes, but that works only for reading your own base, not the base of
> another thread.

Well, yes, but how do you identify the other thread?  Usually by the
address of its TCB.

Re: [RFC PATCH glibc 11/12] hurd, htl: Add some x86_64-specific code

2023-02-12 Thread Samuel Thibault

Florian Weimer, on Sun, 12 Feb 2023 20:29:43 +0100, wrote:
> * Samuel Thibault:
> 
> > Florian Weimer, on Sun, 12 Feb 2023 17:40:58 +0100, wrote:
> >> * Samuel Thibault via Libc-alpha:
> >> 
> >> > Sergey Bugaev, on Sun, 12 Feb 2023 19:25:11 +0300, wrote:
> >> >> On Sun, Feb 12, 2023 at 7:11 PM Samuel Thibault wrote:
> >> >> > Sergey Bugaev, on Sun, 12 Feb 2023 14:10:42 +0300, wrote:
> >> >> > > We should not need a getter routine, because one can simply inspect the target
> >> >> > > thread's state (unless, again, I misunderstand things horribly).
> >> >> >
> >> >> > For 16bit fs/gs values we could read them from userland yes. But for
> >> >> > fs/gs base, the FSGSBASE instruction is not available on all 64bit
> >> >> > processors. And ATM in THREAD_TCB we want to be able to get the base 
> >> >> > of
> >> >> > another thread.
> >> >> 
> >> >> What I've meant is:
> >> >> 
> >> >> __thread_get_state (whatever_thread, &state);
> >> >> uintptr_t its_fs_base = state->fs_base;
> >> >> 
> >> >> You can't really do the same to *write* [fg]s_base, because doing
> >> >> thread_set_state on your own thread is bound to end badly.
> >> >
> >> > ? Well, sure, just like setting fs/gs through thread state was not done
> >> > for i386.
> >> >
> >> > I don't see where you're aiming. Getting fs/gs from __thread_get_state
> >> > won't actually give you the base, you'll just read something like 0.
> >> 
> >> The convention is that the FSBASE address is at %fs:0.
> >
> > Yes, but that works only for reading your own base, not the base of
> > another thread.
> 
> Well, yes, but how do you identify the other thread?  Usually by the
> address of its TCB.

Yes, but that's not (yet) the case in htl.

Samuel

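The %fs:0 convention mentioned in this exchange can be observed from within a
thread. The sketch below is restricted to x86_64 GNU/Linux, where the C
library is assumed to store the thread control block's own address in its
first word; sketch_fs0_is_self_pointer is a hypothetical helper, and on any
other target it simply reports that the check does not apply. As noted above,
this only reads the calling thread's own base, not that of another thread.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the word at %fs:0 points back at itself, 0 if it does not,
   and -1 where the check is not applicable.
   ASSUMPTION: on x86_64 GNU/Linux the TCB starts with a self pointer. */
static inline int sketch_fs0_is_self_pointer(void)
{
#if defined(__x86_64__) && defined(__linux__)
    void *self;

    /* Load the 8-byte word stored at offset 0 of the %fs segment. */
    __asm__ volatile ("movq %%fs:0, %0" : "=r" (self));
    assert(self != NULL);

    /* If %fs:0 holds the FS base, dereferencing it yields the same value. */
    return *(void **) self == self;
#else
    return -1;
#endif
}
```

This works without the FSGSBASE instructions because it is an ordinary
segment-relative load, which is what makes the self-pointer convention useful
on processors that lack them.
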
[PATCH hurd] Use rpc_time_value_t in user space to avoid build errors

2023-02-12 Thread Flavio Cruz

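rpc_time_value_t was introduced in GNU Mach commit
https://git.savannah.gnu.org/cgit/hurd/gnumach.git/commit/?id=5fdc928d3d29fdc93ad00cea5f5c877a19013d44
and user-space code that still declares these values as time_value_t no
longer builds. Use rpc_time_value_t in libps and procfs to fix the build.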
---
 libps/spec.c     | 8 ++++----
 procfs/process.c | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/libps/spec.c b/libps/spec.c
index adca2e9c..38ececc1 100644
--- a/libps/spec.c
+++ b/libps/spec.c
@@ -198,7 +198,7 @@ const struct ps_getter ps_max_priority_getter =
 static void
 ps_get_usr_time (struct proc_stat *ps, struct timeval *tv)
 {
-  time_value_t tvt = proc_stat_thread_basic_info (ps)->user_time;
+  rpc_time_value_t tvt = proc_stat_thread_basic_info (ps)->user_time;
   tv->tv_sec = tvt.seconds;
   tv->tv_usec = tvt.microseconds;
 }
@@ -208,7 +208,7 @@ const struct ps_getter ps_usr_time_getter =
 static void
 ps_get_sys_time (struct proc_stat *ps, struct timeval *tv)
 {
-  time_value_t tvt = proc_stat_thread_basic_info (ps)->system_time;
+  rpc_time_value_t tvt = proc_stat_thread_basic_info (ps)->system_time;
   tv->tv_sec = tvt.seconds;
   tv->tv_usec = tvt.microseconds;
 }
@@ -218,7 +218,7 @@ const struct ps_getter ps_sys_time_getter =
 static void
 ps_get_tot_time (struct proc_stat *ps, struct timeval *tv)
 {
-  time_value_t tvt = proc_stat_thread_basic_info (ps)->user_time;
+  rpc_time_value_t tvt = proc_stat_thread_basic_info (ps)->user_time;
   time_value_add (&tvt, &proc_stat_thread_basic_info (ps)->system_time);
   tv->tv_sec = tvt.seconds;
   tv->tv_usec = tvt.microseconds;
@@ -229,7 +229,7 @@ const struct ps_getter ps_tot_time_getter =
 static void
 ps_get_start_time (struct proc_stat *ps, struct timeval *tv)
 {
-  time_value_t *const tvt = &proc_stat_task_basic_info (ps)->creation_time;
+  rpc_time_value_t *const tvt = &proc_stat_task_basic_info (ps)->creation_time;
   tv->tv_sec = tvt->seconds;
   tv->tv_usec = tvt->microseconds;
 }
diff --git a/procfs/process.c b/procfs/process.c
index f608bed6..0ae01db7 100644
--- a/procfs/process.c
+++ b/procfs/process.c
@@ -75,7 +75,7 @@ static const char *state_string (struct proc_stat *ps)
   return "? (unknown)";
 }
 
-static long long int timeval_jiffies (time_value_t tv)
+static long long int timeval_jiffies (rpc_time_value_t tv)
 {
   double secs = tv.seconds * 100. + tv.microseconds;
   return secs * opt_clk_tck / 100.;
-- 
2.39.1

Re: [PATCH hurd] Use rpc_time_value_t in user space to avoid build errors

2023-02-12 Thread Samuel Thibault

Flavio Cruz wrote on Sun, 12 Feb 2023 16:54:52 -0500:
> rpc_time_value_t was introduced in 
> https://git.savannah.gnu.org/cgit/hurd/gnumach.git/commit/?id=5fdc928d3d29fdc93ad00cea5f5c877a19013d44
> but it breaks the build.

Argl, I was wondering a bit about whether exposing rpc_ names could
pose a problem, and for structures it does indeed make them incompatible.
But in userland there is actually no difference between the two; it's
only C that refuses to treat them as compatible.

I'd say we should rather avoid such a flag-day issue and make the kernel
header use rpc_* only #if KERNEL?
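
A rough sketch of what that could look like, under stated assumptions: the
demo_* types below are stand-ins with made-up names and fixed-width fields,
and DEMO_KERNEL plays the role of the KERNEL macro. This is not the actual
gnumach header.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the long-standing user-visible type.  */
typedef struct demo_time_value {
    int32_t seconds;
    int32_t microseconds;
} demo_time_value_t;

/* Stand-in for the RPC-layout type.  Same members, but a distinct struct
   type as far as C is concerned, so the two cannot be assigned to each
   other.  */
typedef struct demo_rpc_time_value {
    int32_t seconds;
    int32_t microseconds;
} demo_rpc_time_value_t;

/* The shared header picks the rpc_ name only for kernel builds, so
   user-space code that still says demo_time_value_t keeps compiling.  */
struct demo_basic_info {
#ifdef DEMO_KERNEL
    demo_rpc_time_value_t user_time;
#else
    demo_time_value_t user_time;
#endif
};

/* Existing user code written against the old type name.  */
static int64_t demo_total_usec(demo_time_value_t tv)
{
    return (int64_t) tv.seconds * 1000000 + tv.microseconds;
}
```

Built without DEMO_KERNEL, passing info.user_time to demo_total_usec
compiles unchanged; with DEMO_KERNEL defined the same call is rejected,
which is the incompatibility the build hit.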

> This change fixes it.

-- 
Samuel
---
For an independent, transparent and rigorous evaluation!
I support the Inria Evaluation Committee.

[PATCH mig] Introduce max_alignof to replace word_size

2023-02-12 Thread Flavio Cruz

Remove the concept of word_size since it is meaningless in some
architectures. This is also done in preparation to possibly introduce
8-byte aligned messages.
---
 cpu.sym   |  3 +-
 global.c  |  5 +--
 global.h  |  5 +--
 parser.y  |  8 ++---
 routine.c |  5 ++-
 server.c  | 20 ++--
 type.c| 95 ---
 user.c| 22 ++---
 utils.c   |  6 +++-
 utils.h   |  2 ++
 10 files changed, 91 insertions(+), 80 deletions(-)

diff --git a/cpu.sym b/cpu.sym
index 6bcb878..57084d0 100644
--- a/cpu.sym
+++ b/cpu.sym
@@ -95,8 +95,7 @@ expr MACH_MSG_TYPE_POLYMORPHIC
 
 
 /* Types used in interfaces */
-expr sizeof(integer_t) word_size
-expr (sizeof(integer_t)*8) word_size_in_bits
+expr sizeof(natural_t) desired_max_alignof
 expr sizeof(void*) sizeof_pointer
 expr sizeof(char)  sizeof_char
 expr sizeof(short) sizeof_short
diff --git a/global.c b/global.c
index bf3d10d..ad477c2 100644
--- a/global.c
+++ b/global.c
@@ -66,8 +66,9 @@ string_t InternalHeaderFileName = strNULL;
 string_t UserFileName = strNULL;
 string_t ServerFileName = strNULL;
 
-int port_size = port_name_size;
-int port_size_in_bits = port_name_size_in_bits;
+size_t port_size = port_name_size;
+size_t port_size_in_bits = port_name_size_in_bits;
+size_t max_alignof = desired_max_alignof;
 
 void
 more_global(void)
diff --git a/global.h b/global.h
index 8e57df2..bf8eb38 100644
--- a/global.h
+++ b/global.h
@@ -67,8 +67,9 @@ extern string_t InternalHeaderFileName;
 extern string_t UserFileName;
 extern string_t ServerFileName;
 
-extern int port_size;
-extern int port_size_in_bits;
+extern size_t port_size;
+extern size_t port_size_in_bits;
+extern size_t max_alignof;
 
 extern void more_global(void);
 
diff --git a/parser.y b/parser.y
index 03c5ec8..ccf4726 100644
--- a/parser.y
+++ b/parser.y
@@ -212,6 +212,10 @@ Subsystem  :   SubsystemStart SubsystemMods
   IsKernelUser ? ", KernelUser" : "",
   IsKernelServer ? ", KernelServer" : "");
 }
+if (IsKernelUser || IsKernelServer) {
+port_size = vm_offset_size;
+port_size_in_bits = vm_offset_size_in_bits;
+}
 init_type();
 }
;
@@ -237,16 +241,12 @@ SubsystemMod  :   syKernelUser
 if (IsKernelUser)
warn("duplicate KernelUser keyword");
 IsKernelUser = true;
-port_size = vm_offset_size;
-port_size_in_bits = vm_offset_size_in_bits;
 }
|   syKernelServer
 {
 if (IsKernelServer)
warn("duplicate KernelServer keyword");
 IsKernelServer = true;
-port_size = vm_offset_size;
-port_size_in_bits = vm_offset_size_in_bits;
 }
;
 
diff --git a/routine.c b/routine.c
index c0a016c..5ed0064 100644
--- a/routine.c
+++ b/routine.c
@@ -47,6 +47,7 @@
 #include "routine.h"
 #include "message.h"
 #include "cpu.h"
+#include "utils.h"
 
 u_int rtNumber = 0;
 
@@ -316,19 +317,21 @@ rtFindSize(const argument_t *args, u_int mask)
 const argument_t *arg;
 u_int size = sizeof_mach_msg_header_t;
 
+size = ALIGN(size, max_alignof);
 for (arg = args; arg != argNULL; arg = arg->argNext)
if (akCheck(arg->argKind, mask))
{
ipc_type_t *it = arg->argType;
 
/* might need proper alignment on demanding 64bit archies */
-   size = (size + word_size-1) & ~(word_size-1);
if (arg->argLongForm) {
size += sizeof_mach_msg_type_long_t;
} else {
size += sizeof_mach_msg_type_t;
}
+   size = ALIGN(size, max_alignof);
 
+   /* Note itMinTypeSize is already aligned to max_alignof. */
size += it->itMinTypeSize;
}
 
diff --git a/server.c b/server.c
index 95b0056..14b129b 100644
--- a/server.c
+++ b/server.c
@@ -483,7 +483,7 @@ WriteCheckArgSize(FILE *file, const argument_t *arg)
arg->argLongForm ? ".msgtl_header" : "");
 }
 
-if (btype->itTypeSize % word_size != 0)
+if (btype->itTypeSize % max_alignof != 0)
fprintf(file, "(");
 
 if (multiplier > 1)
@@ -491,10 +491,10 @@ WriteCheckArgSize(FILE *file, const argument_t *arg)
 
 fprintf(file, "In%dP->%s", arg->argRequestPos, count->argMsgField);
 
-/* If the base type size of the data field isn`t a multiple of word_size,
+/* If the base type size of the data field isn`t a multiple of max_alignof,
we have to round up. */
-if (btype->itTypeSize % word_size != 0)
-   fprintf(file, " + %d) & ~%d", word_size - 1, word_size - 1);
+if (btype->itTypeSize % max_alignof != 0)
+   fprintf(file, " + %d) & ~%d", max_alignof - 1, max_alignof - 1);
 
 if (ptype->itIndefinite) {
fprintf(file, " : sizeof(%s *)", FetchServerType(btype));
@@ -530,8 +530,8 @@ WriteCheckMsgSize(FILE *file, const argument_t *arg)

Re: [PATCH mig] Introduce max_alignof to replace word_size

2023-02-12 Thread Samuel Thibault

Renamed to complex_alignof and committed, thanks!
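
For readers following along, the rounding that the patch's
ALIGN(size, max_alignof) calls rely on can be sketched as below. The real
ALIGN definition lives in utils.h, which is not shown in this excerpt, so
demo_align_up is an assumption modelled on the expression the patch
removes; demo_message_size is a hypothetical helper in the spirit of
rtFindSize, not MIG code.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of alignment.
   Assumes alignment is a non-zero power of two.  */
static size_t demo_align_up(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (size + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical size computation: align after the header, then after each
   type descriptor, then add the payload.  The patch notes that
   itMinTypeSize is already aligned; this sketch aligns each payload
   explicitly instead of relying on that.  */
static size_t demo_message_size(size_t header_size, size_t descriptor_size,
                                const size_t *payload_sizes, size_t count,
                                size_t alignment)
{
    size_t size = demo_align_up(header_size, alignment);
    for (size_t i = 0; i < count; i++) {
        size += descriptor_size;
        size = demo_align_up(size, alignment);
        size += demo_align_up(payload_sizes[i], alignment);
    }
    return size;
}
```

With a 24-byte header, 4-byte descriptors and payloads of 4 and 8 bytes,
this gives 44 bytes at 4-byte alignment and 56 bytes at 8-byte alignment,
which is the kind of growth the move towards 8-byte aligned messages would
imply.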

Flavio Cruz wrote on Sun, 12 Feb 2023 17:14:23 -0500:
> Remove the concept of word_size since it is meaningless in some
> architectures. This is also done in preparation to possibly introduce
> 8-byte aligned messages.

[PATCH gnumach] Make mach_msg_header_t have the same size for both 64 bit kernel and userland.

2023-02-12 Thread Flavio Cruz

This has several advantages:
1) We don't need to resize mach_msg_header_t, it is just a copy.
2) Mig won't require any changes because it statically computes the size
   of mach_msg_header_t, otherwise we would need two sizes (28 vs 32 bytes).
---
 include/mach/message.h | 15 +++++++++++++--
 x86_64/copy_user.c     | 35 +++++++++++++----------------------
 2 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/include/mach/message.h b/include/mach/message.h
index 63341e6f..eb3b34c0 100644
--- a/include/mach/message.h
+++ b/include/mach/message.h
@@ -139,7 +139,15 @@ typedef integer_t mach_msg_id_t;
 typedef struct mach_msg_header {
     mach_msg_bits_t    msgh_bits;
     mach_msg_size_t    msgh_size;
-    mach_port_t        msgh_remote_port;
+    union {
+        mach_port_t    msgh_remote_port;
+        /*
+         * Ensure msgh_remote_port is wide enough to hold a kernel pointer
+         * to avoid message resizing for the 64 bits case. This field should
+         * not be used since it is here just for padding purposes.
+         */
+        rpc_uintptr_t  msgh_remote_port_do_not_use;
+    };
     union {
         mach_port_t    msgh_local_port;
         rpc_uintptr_t  msgh_protected_payload;
@@ -153,7 +161,10 @@ typedef struct mach_msg_header {
 typedef struct {
     mach_msg_bits_t    msgh_bits;
     mach_msg_size_t    msgh_size;
-    mach_port_name_t   msgh_remote_port;
+    union {
+        mach_port_name_t   msgh_remote_port;
+        rpc_uintptr_t      msgh_remote_port_do_not_use;
+    };
     union {
         mach_port_name_t   msgh_local_port;
         rpc_uintptr_t      msgh_protected_payload;
diff --git a/x86_64/copy_user.c b/x86_64/copy_user.c
index ae062645..dd9fe2d7 100644
--- a/x86_64/copy_user.c
+++ b/x86_64/copy_user.c
@@ -177,29 +177,24 @@ int copyinmsg (const void *userbuf, void *kernelbuf, const size_t usize)
   const mach_msg_user_header_t *umsg = userbuf;
   mach_msg_header_t *kmsg = kernelbuf;
 
+#ifdef USER32
   if (copyin(&umsg->msgh_bits, &kmsg->msgh_bits, sizeof(kmsg->msgh_bits)))
 return 1;
   /* kmsg->msgh_size is filled in later */
   if (copyin_port(&umsg->msgh_remote_port, &kmsg->msgh_remote_port))
 return 1;
-#ifdef USER32
-  /* This could contain a payload, but for 32 bits it will be the same size as a mach_port_name_t */
-  _Static_assert(sizeof(rpc_uintptr_t) == sizeof(mach_port_name_t),
-                 "rpc_uintptr_t and mach_port_name_t expected to have the same size");
   if (copyin_port(&umsg->msgh_local_port, &kmsg->msgh_local_port))
 return 1;
-#else
-  /* For pure 64 bits, the protected payload is as large as a port pointer. */
-  _Static_assert(sizeof(rpc_uintptr_t) == sizeof(mach_port_t),
-                 "rpc_uintptr_t and mach_port_t expected to have the same size");
-  if (copyin((char*)umsg + offsetof(mach_msg_user_header_t, msgh_local_port),
-  (char*)kmsg + offsetof(mach_msg_header_t, msgh_local_port),
- sizeof(rpc_uintptr_t)))
-return 1;
-#endif
   if (copyin(&umsg->msgh_seqno, &kmsg->msgh_seqno,
  sizeof(kmsg->msgh_seqno) + sizeof(kmsg->msgh_id)))
 return 1;
+#else
+  /* The 64 bit interface ensures the header is the same size, so it does not need any resizing. */
+  _Static_assert(sizeof(mach_msg_header_t) == sizeof(mach_msg_user_header_t),
+                 "mach_msg_header_t and mach_msg_user_header_t expected to be of the same size");
+  if (copyin(umsg, kmsg, sizeof(mach_msg_header_t)))
+    return 1;
+#endif
 
   vm_offset_t usaddr, ueaddr, ksaddr;
   ksaddr = (vm_offset_t)(kmsg + 1);
@@ -283,25 +278,21 @@ int copyoutmsg (const void *kernelbuf, void *userbuf, const size_t ksize)
 {
   const mach_msg_header_t *kmsg = kernelbuf;
   mach_msg_user_header_t *umsg = userbuf;
-
+#ifdef USER32
   if (copyout(&kmsg->msgh_bits, &umsg->msgh_bits, sizeof(kmsg->msgh_bits)))
 return 1;
   /* umsg->msgh_size is filled in later */
   if (copyout_port(&kmsg->msgh_remote_port, &umsg->msgh_remote_port))
 return 1;
-#ifdef USER32
   if (copyout_port(&kmsg->msgh_local_port, &umsg->msgh_local_port))
 return 1;
-#else
-  /* Handle protected payloads correctly, same as copyinmsg. */
-  if (copyout((char*)kmsg + offsetof(mach_msg_header_t, msgh_local_port),
-  (char*)umsg + offsetof(mach_msg_user_header_t, msgh_local_port),
- sizeof(rpc_uintptr_t)))
-return 1;
-#endif
   if (copyout(&kmsg->msgh_seqno, &umsg->msgh_seqno,
  sizeof(kmsg->msgh_seqno) + sizeof(kmsg->msgh_id)))
 return 1;
+#else
+  if (copyout(kmsg, umsg, sizeof(mach_msg_header_t)))
+    return 1;
+#endif  /* USER32 */
 
   vm_offset_t ksaddr, keaddr, usaddr;
   ksaddr = (vm_offset_t)(kmsg + 1);
-- 
2.39.1
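
As a stand-alone illustration of the layout idea in the patch above, the
sketch below widens the 32-bit port-name slots with a pointer-sized union
member so that the user and kernel views line up. All demo_* names are
hypothetical and fixed-width types replace the real Mach typedefs, so this
is a model of the technique rather than the gnumach definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: a 32-bit port name as seen by user space and a
   64-bit pointer-sized value as used inside a 64-bit kernel.  */
typedef uint32_t demo_port_name_t;
typedef uint64_t demo_uintptr_t;

/* User-space view: each port name shares storage with a pointer-sized
   member, so the slot is as wide as the kernel's.  */
struct demo_user_header {
    uint32_t bits;
    uint32_t size;
    union {
        demo_port_name_t remote_port;
        demo_uintptr_t   remote_port_pad;   /* padding only, never read */
    };
    union {
        demo_port_name_t local_port;
        demo_uintptr_t   protected_payload;
    };
    uint32_t seqno;
    int32_t  id;
};

/* Kernel view: the same slots hold full-width values.  */
struct demo_kernel_header {
    uint32_t bits;
    uint32_t size;
    demo_uintptr_t remote_port;
    demo_uintptr_t local_port;
    uint32_t seqno;
    int32_t  id;
};

/* With identical sizes and offsets, the header can be copied as one block
   instead of field by field.  */
_Static_assert(sizeof(struct demo_user_header)
               == sizeof(struct demo_kernel_header),
               "user and kernel headers must have the same size");
_Static_assert(offsetof(struct demo_user_header, seqno)
               == offsetof(struct demo_kernel_header, seqno),
               "trailing fields must sit at the same offset");
```

Both structures come out at 32 bytes here, which is what lets a single
block copy replace the per-field copies on the 64-bit path.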

Re: [PATCH gnumach] Make mach_msg_header_t have the same size for both 64 bit kernel and userland.

2023-02-12 Thread Samuel Thibault

Applied, thanks!

Flavio Cruz wrote on Sun, 12 Feb 2023 18:54:58 -0500:
> This has several advantages:
> 1) We don't need to resize mach_msg_header_t, it is just a copy.
> 2) Mig won't require any changes because it statically computes the size

> +#else
> +  if (copyout(kmsg, umsg, sizeof(mach_msg_header_t)))
> +    return 1;
> +#endif  /* USER32 */

I'm wondering, though: for kernel-produced messages, are we sure that we
fill the padding bytes with zeroes? We don't want to leak information
through such holes.
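
The concern can be reproduced in miniature. In the sketch below the unused
upper half of the port slot is modelled as an explicit field so that its
contents are well defined in portable C; demo_header and both fill
functions are hypothetical, not kernel code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header in which remote_port_pad stands for the bytes that
   only exist to widen the port slot.  */
struct demo_header {
    uint32_t bits;
    uint32_t size;
    uint32_t remote_port;
    uint32_t remote_port_pad;
};

/* Fills the named fields only.  Whatever was in the buffer before stays
   in remote_port_pad and would be copied out with the rest.  */
static void demo_fill_fields_only(struct demo_header *h, uint32_t bits,
                                  uint32_t size, uint32_t port)
{
    h->bits = bits;
    h->size = size;
    h->remote_port = port;
}

/* Clears the whole header first, so no stale bytes survive.  */
static void demo_fill_zeroed(struct demo_header *h, uint32_t bits,
                             uint32_t size, uint32_t port)
{
    memset(h, 0, sizeof *h);
    demo_fill_fields_only(h, bits, size, port);
}
```

Starting from a buffer full of stale bytes, demo_fill_fields_only leaves
them visible in remote_port_pad, while demo_fill_zeroed does not. Whether
the kernel's message construction paths already behave like the second
variant is exactly the open question.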

Samuel

Re: [PATCH v3] i386: Refactor int stacks to be per cpu for SMP

2023-02-12 Thread Samuel Thibault

Applied and fixed for PAE, thanks!
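
For orientation, the per-CPU layout that interrupt_stack_alloc sets up in
the quoted patch can be modelled in user space as follows. The demo_*
names and the constants 4 and 4096 are assumptions chosen for the sketch;
only the base/top arithmetic mirrors the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NCPUS         4
#define DEMO_INTSTACK_SIZE 4096

/* One contiguous block holding every CPU's interrupt stack.  */
static uint8_t demo_intstack[DEMO_NCPUS * DEMO_INTSTACK_SIZE];
static uintptr_t demo_stack_base[DEMO_NCPUS];
static uintptr_t demo_stack_top[DEMO_NCPUS];

/* Carve the block into per-CPU slices.  As in the patch, the top is
   placed 4 bytes below the end of each slice.  */
static void demo_interrupt_stack_alloc(void)
{
    for (int i = 0; i < DEMO_NCPUS; i++) {
        demo_stack_base[i] =
            (uintptr_t) &demo_intstack[i * DEMO_INTSTACK_SIZE];
        demo_stack_top[i] =
            (uintptr_t) &demo_intstack[(i + 1) * DEMO_INTSTACK_SIZE] - 4;
    }
}

/* Return the CPU whose slice contains addr, or -1 if none does.  */
static int demo_stack_owner(uintptr_t addr)
{
    for (int i = 0; i < DEMO_NCPUS; i++)
        if (addr >= demo_stack_base[i] && addr <= demo_stack_top[i])
            return i;
    return -1;
}
```

After demo_interrupt_stack_alloc, consecutive bases are exactly
DEMO_INTSTACK_SIZE apart, so a stack address identifies its CPU without
any per-CPU allocation bookkeeping.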

Damien Zammit wrote on Sat, 04 Feb 2023 01:15:06 +:
> This also serialises the AP bringup, so paging can be enabled per cpu
> one by one.
> 
> Also-by: Almudena Garcia 
> ---
>  i386/i386/mp_desc.c | 230 
>  i386/i386/mp_desc.h |   7 +-
>  i386/i386at/boothdr.S   |  18 +++-
>  i386/i386at/ioapic.c|   5 +-
>  i386/i386at/model_dep.c |  33 +++---
>  5 files changed, 201 insertions(+), 92 deletions(-)
> 
> diff --git a/i386/i386/mp_desc.c b/i386/i386/mp_desc.c
> index bcf2fbe7..03b2ea68 100644
> --- a/i386/i386/mp_desc.c
> +++ b/i386/i386/mp_desc.c
> @@ -24,25 +24,36 @@
>   * the rights to redistribute these changes.
>   */
> 
> -#if  NCPUS > 1
> -
> -#include 
> -
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> 
> +#include 
> +#include 
> +
>  /*
>   * The i386 needs an interrupt stack to keep the PCB stack from being
>   * overrun by interrupts.  All interrupt stacks MUST lie at lower addresses
> @@ -52,20 +63,35 @@
>  /*
>   * Addresses of bottom and top of interrupt stacks.
>   */
> -vm_offset_t  interrupt_stack[NCPUS];
>  vm_offset_t  int_stack_top[NCPUS];
>  vm_offset_t  int_stack_base[NCPUS];
> 
> -/*
> - * Barrier address.
> - */
> -vm_offset_t  int_stack_high;
> +/* Interrupt stack allocation */
> +uint8_t solid_intstack[NCPUS*INTSTACK_SIZE] __aligned(NCPUS*INTSTACK_SIZE);
> +
> +void
> +interrupt_stack_alloc(void)
> +{
> + int i;
> +
> + /*
> +  * Set up pointers to the top of the interrupt stack.
> +  */
> +
> + for (i = 0; i < NCPUS; i++) {
> + int_stack_base[i] = (vm_offset_t) &solid_intstack[i * INTSTACK_SIZE];
> + int_stack_top[i] = (vm_offset_t) &solid_intstack[(i + 1) * INTSTACK_SIZE] - 4;
> + }
> +}
> 
> +#if  NCPUS > 1
>  /*
> - * First cpu`s interrupt stack.
> + * Flag to mark SMP init by BSP complete
>   */
> -extern char  _intstack[];/* bottom */
> -extern char  _eintstack[];   /* top */
> +int bspdone;
> +
> +extern void *apboot, *apbootend;
> +extern volatile ApicLocalUnit* lapic;
> 
>  /*
>   * Multiprocessor i386/i486 systems use a separate copy of the
> @@ -77,7 +103,7 @@ extern char _eintstack[]; /* top */
>   */
> 
>  /*
> - * Allocated descriptor tables.
> + * Descriptor tables.
>   */
>  struct mp_desc_table *mp_desc_table[NCPUS] = { 0 };
> 
> @@ -102,12 +128,13 @@ extern struct real_descriptor   ldt[LDTSZ];
>   * Allocate and initialize the per-processor descriptor tables.
>   */
> 
> -struct mp_desc_table *
> +int
>  mp_desc_init(int mycpu)
>  {
>   struct mp_desc_table *mpt;
> + vm_offset_t mem;
> 
> - if (mycpu == master_cpu) {
> + if (mycpu == 0) {
>   /*
>* Master CPU uses the tables built at boot time.
>* Just set the TSS and GDT pointers.
> @@ -118,61 +145,28 @@ mp_desc_init(int mycpu)
>   }
>   else {
>   /*
> -  * Other CPUs allocate the table from the bottom of
> -  * the interrupt stack.
> +  * Allocate tables for other CPUs
>*/
> - mpt = (struct mp_desc_table *) interrupt_stack[mycpu];
> + if (!init_alloc_aligned(sizeof(struct mp_desc_table), &mem))
> + panic("not enough memory for descriptor tables");
> + mpt = (struct mp_desc_table *)phystokv(mem);
> 
>   mp_desc_table[mycpu] = mpt;
>   mp_ktss[mycpu] = &mpt->ktss;
>   mp_gdt[mycpu] = mpt->gdt;
> 
>   /*
> -  * Copy the tables
> +  * Zero the tables
>*/
> - memcpy(mpt->idt,
> -   idt,
> -   sizeof(idt));
> - memcpy(mpt->gdt,
> -   gdt,
> -   sizeof(gdt));
> - memcpy(mpt->ldt,
> -   ldt,
> -   sizeof(ldt));
> - memset(&mpt->ktss, 0,
> -   sizeof(struct task_tss));
> + memset(mpt->idt, 0, sizeof(idt));
> + memset(mpt->gdt, 0, sizeof(gdt));
> + memset(mpt->ldt, 0, sizeof(ldt));
> + memset(&mpt->ktss, 0, sizeof(struct task_tss));
> 
> - /*
> -  * Fix up the entries in the GDT to point to
> -  * this LDT and this TSS.
> -  */
> -#ifdef   MACH_RING1
> - panic("TODO %s:%d\n",__FILE__,__LINE__);
> -#else/* MACH_RING1 */
> - _fill_gdt_sys_descriptor(mpt->gdt, KERNEL_LDT,
> - (unsigned)&mpt->ldt,
> - LDTSZ * sizeof(struct real_descriptor) - 1,
> - ACC_P|ACC_PL_K|ACC_LDT, 0);
> -
