On 25.11.19 14:46, Aleksandar Markovic wrote: > On Sun, Nov 24, 2019 at 2:31 PM Aleksandar Markovic > <aleksandar.m.m...@gmail.com> wrote: >> >> >> >> On Sunday, November 24, 2019, Helge Deller <del...@gmx.de> wrote: >>> >>> On 24.11.19 13:10, Aleksandar Markovic wrote: >>>> >>>> >>>> On Sunday, November 24, 2019, Helge Deller <del...@gmx.de >>>> <mailto:del...@gmx.de>> wrote: >>>> >>>> On 23.11.19 12:34, Aleksandar Markovic wrote: >>>> > On Thursday, November 21, 2019, Helge Deller <del...@gmx.de >>>> <mailto:del...@gmx.de> <mailto:del...@gmx.de <mailto:del...@gmx.de>>> >>>> wrote: >>>> > >>>> > The strace functionality in qemu-user lacks the possibility to >>>> trace >>>> > which real values get returned to pointers in userspace by >>>> syscalls. >>>> > >>>> > For example, the read() and getcwd() syscalls currently only >>>> show the >>>> > destination address where the syscalls should put the return >>>> values: >>>> > 2532 read(3,0xff80038c,512) = 512 >>>> > 2532 getcwd(0x18180,4096) = 9 >>>> > >>>> > With the patch below, one now can specify in >>>> print_syscall_late() which >>>> > syscalls should be executed first, before they get printed. >>>> > After adding the read() and getcwd() syscalls, we now get this >>>> output in >>>> > with strace instead: >>>> > 1708 >>>> read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) >>>> = 512 >>>> > 1708 getcwd("/usr/bin",4096) = 9 >>>> > >>>> > This patch adds just the framework with the respective >>>> implemenations for >>>> > read() and getcwd(). If applied, more functions can be added >>>> easily later. >>>> > >>>> > >>>> > Great out-of-the-box idea! However, there are some things that are >>>> > not thought over yet. There is a good reason why strace happens >>>> > "early": if something crash-like happens during syscall translation, >>>> > we still have trace of original target syscall. In case of "late" >>>> > flavor, we don't have anything. Another potentially problematic >>>> > aspect is that end user certainly should know whether the trace was >>>> > done "early" or "late" - otherwise, there will be, for certain, cases >>>> > of misinterpretation / misleading / confusion of end users. >>>> >>>> Thanks for the feedback - I'm shortly sending a v2 version which >>>> prints the syscall name with an open parenthesis, e.g. "read(" or >>>> "getcwd(", >>>> before calling the syscall. That way you can see where it >>>> crashed/hangs... >>>> >>>> >>>> ... but I cannot see passed arguments (for example, whether NULL is >>>> passed, or somrthing that looks like a real address)... >>> >>> Right. >>> And you won't see that in native strace either... >>> If you have an idea, just let me know. >>> >> >> Helge, I do have an idea how to amend your proposal that should give us, >> believe it or not, "best of both worlds", but let's wait few days, I don't >> have any dev setup ar hand at the moment to check it in action, and on top >> of that I am swamped with other unrelated tasks. In the meantime, maybe >> other guys will say something too, or perhaps you would have something new >> as well. >> >> Still, by no means, I see this as a (potential) *huge* improvement of QEMU >> strace. Thanks! >> > > Helge, Laurent, > > I don't have time to experiment, but I will write down here my > suggestion to extending Helge's system in a textual form, based on > example from prevous Helge's patches: > > 1. CURRENT STATE: > > early: read(3,0xff80038c,512) > late: = 512 > > early: getcwd(0x18180,4096) > late: = 9 > > 2. HELGE'S PROPOSAL 1: > > late: > read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) > = 512 > late: getcwd("/usr/bin",4096) = 9 > > 3. HELGE'S PROPOSAL 2: > > early: read > late: > (3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) > = 512 > early: getcwd > late: ("/usr/bin",4096) = 9
My v2 patch actually includes the opening parenthesis in the early part: early: read( late: 3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) = 512 The open parenthesis should make it visible, that the call hasn't happened yet. > 4. NEW PROPOSAL : > > early: read(3,0xff80038c,512) > late: = 512 > [(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)] > early: getcwd(0x18180,4096) > late: = 9 [("/usr/bin",4096)] > > In other words, we would print for (selected) system calls content of > the parameters both before and after actual ioctl > translation/execution. I don't like this. IMHO it just adds unnecessary noise and differs from what a native strace shows. Helge >>>> > >>>> > Signed-off-by: Helge Deller <del...@gmx.de >>>> <mailto:del...@gmx.de> <mailto:del...@gmx.de <mailto:del...@gmx.de>>> >>>> > >>>> > diff --git a/linux-user/strace.c b/linux-user/strace.c >>>> > index de43238fa4..ff254732af 100644 >>>> > --- a/linux-user/strace.c >>>> > +++ b/linux-user/strace.c >>>> > @@ -61,6 +61,7 @@ UNUSED static void print_open_flags(abi_long, >>>> int); >>>> > UNUSED static void print_syscall_prologue(const struct >>>> syscallname *); >>>> > UNUSED static void print_syscall_epilogue(const struct >>>> syscallname *); >>>> > UNUSED static void print_string(abi_long, int); >>>> > +UNUSED static void print_encoded_string(abi_long addr, unsigned >>>> int maxlen, int last); >>>> > UNUSED static void print_buf(abi_long addr, abi_long len, int >>>> last); >>>> > UNUSED static void print_raw_param(const char *, abi_long, int); >>>> > UNUSED static void print_timeval(abi_ulong, int); >>>> > @@ -1204,6 +1205,37 @@ print_syscall_epilogue(const struct >>>> syscallname *sc) >>>> > gemu_log(")"); >>>> > } >>>> > >>>> > +#define MAX_ENCODED_CHARS 32 >>>> > +static void >>>> > +print_encoded_string(abi_long addr, unsigned int maxlen, int >>>> last) >>>> > +{ >>>> > + unsigned int maxout; >>>> > + char *s, *str; >>>> > + >>>> > + s = lock_user_string(addr); >>>> > + if (s == NULL) { >>>> > + /* can't get string out of it, so print it as pointer */ >>>> > + print_pointer(addr, last); >>>> > + return; >>>> > + } >>>> > + >>>> > + str = s; >>>> > + gemu_log("\""); >>>> > + maxout = MIN(maxlen, MAX_ENCODED_CHARS); >>>> > + while (maxout--) { >>>> > + unsigned char c = *str++; >>>> > + if (isprint(c)) { >>>> > + gemu_log("%c", c); >>>> > + } else { >>>> > + gemu_log("\\%o", (unsigned int) c); >>>> > + } >>>> > + } >>>> > + unlock_user(s, addr, 0); >>>> > + >>>> > + gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." : "", >>>> > + get_comma(last)); >>>> > +} >>>> > + >>>> > static void >>>> > print_string(abi_long addr, int last) >>>> > { >>>> > @@ -1633,6 +1665,19 @@ print_futimesat(const struct syscallname >>>> *name, >>>> > } >>>> > #endif >>>> > >>>> > +#ifdef TARGET_NR_getcwd >>>> > +static void >>>> > +print_getcwd(const struct syscallname *name, >>>> > + abi_long arg0, abi_long arg1, abi_long arg2, >>>> > + abi_long arg3, abi_long arg4, abi_long arg5) >>>> > +{ >>>> > + print_syscall_prologue(name); >>>> > + print_string(arg0, 0); >>>> > + print_raw_param("%u", arg1, 1); >>>> > + print_syscall_epilogue(name); >>>> > +} >>>> > +#endif >>>> > + >>>> > #ifdef TARGET_NR_settimeofday >>>> > static void >>>> > print_settimeofday(const struct syscallname *name, >>>> > @@ -2428,6 +2473,20 @@ print_fstatat64(const struct syscallname >>>> *name, >>>> > #define print_newfstatat print_fstatat64 >>>> > #endif >>>> > >>>> > +#ifdef TARGET_NR_read >>>> > +static void >>>> > +print_read(const struct syscallname *name, >>>> > + abi_long arg0, abi_long arg1, abi_long arg2, >>>> > + abi_long arg3, abi_long arg4, abi_long arg5) >>>> > +{ >>>> > + print_syscall_prologue(name); >>>> > + print_raw_param("%d", arg0, 0); >>>> > + print_encoded_string(arg1, arg2, 0); >>>> > + print_raw_param("%u", arg2, 1); >>>> > + print_syscall_epilogue(name); >>>> > +} >>>> > +#endif >>>> > + >>>> > #ifdef TARGET_NR_readlink >>>> > static void >>>> > print_readlink(const struct syscallname *name, >>>> > diff --git a/linux-user/strace.list b/linux-user/strace.list >>>> > index d49a1e92a8..220b1f4c46 100644 >>>> > --- a/linux-user/strace.list >>>> > +++ b/linux-user/strace.list >>>> > @@ -272,7 +272,7 @@ >>>> > { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL }, >>>> > #endif >>>> > #ifdef TARGET_NR_getcwd >>>> > -{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL }, >>>> > +{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL }, >>>> > #endif >>>> > #ifdef TARGET_NR_getdents >>>> > { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL }, >>>> > @@ -1080,7 +1080,7 @@ >>>> > { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL }, >>>> > #endif >>>> > #ifdef TARGET_NR_read >>>> > -{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL }, >>>> > +{ TARGET_NR_read, "read" , NULL, print_read, NULL }, >>>> > #endif >>>> > #ifdef TARGET_NR_readahead >>>> > { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL }, >>>> > diff --git a/linux-user/syscall.c b/linux-user/syscall.c >>>> > index ce399a55f0..c0079ca2b7 100644 >>>> > --- a/linux-user/syscall.c >>>> > +++ b/linux-user/syscall.c >>>> > @@ -12069,6 +12069,21 @@ static abi_long do_syscall1(void >>>> *cpu_env, int num, abi_long arg1, >>>> > return ret; >>>> > } >>>> > >>>> > +/* >>>> > + * True if this syscall should be printed after having called >>>> the native >>>> > + * syscall, so that values which are fed back to userspace gets >>>> printed. >>>> > + */ >>>> > +static int print_syscall_late(int syscall) >>>> > +{ >>>> > + switch (syscall) { >>>> > + case TARGET_NR_getcwd: >>>> > + case TARGET_NR_read: >>>> > + return 1; >>>> > + default: >>>> > + return 0; >>>> > + } >>>> > +} >>>> > + >>>> > abi_long do_syscall(void *cpu_env, int num, abi_long arg1, >>>> > abi_long arg2, abi_long arg3, abi_long arg4, >>>> > abi_long arg5, abi_long arg6, abi_long arg7, >>>> > @@ -12095,9 +12110,16 @@ abi_long do_syscall(void *cpu_env, int >>>> num, abi_long arg1, >>>> > arg2, arg3, arg4, arg5, arg6, arg7, >>>> arg8); >>>> > >>>> > if (unlikely(do_strace)) { >>>> > - print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6); >>>> > + int late_printing; >>>> > + late_printing = print_syscall_late(num); >>>> > + if (!late_printing) { >>>> > + print_syscall(num, arg1, arg2, arg3, arg4, arg5, >>>> arg6); >>>> > + } >>>> > ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4, >>>> > arg5, arg6, arg7, arg8); >>>> > + if (late_printing) { >>>> > + print_syscall(num, arg1, arg2, arg3, arg4, arg5, >>>> arg6); >>>> > + } >>>> > print_syscall_ret(num, ret); >>>> > } else { >>>> > ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4, >>>> > >>>> >>>