Re: [Qemu-devel] QEMU/MIPS & dyntick kernel
Paul Brook wrote:
> > There seem to be specific problems when using dynticks in QEMU. What I
> > can see is that it makes the PowerPC emulation quite unusable, at
> > least on my PC, which is an amd64 (with a fixed CPU frequency), no
> > matter if I run in 32- or 64-bit mode.
>
> I'd expect to see the same problems running a non-dynticks qemu on a
> heavily loaded host, or a host that did not have /dev/rtc available.
> IIRC vanilla amd64-linux does not yet use the new kernel high resolution
> timer infrastructure, so posix timers (as used by dynticks) have a
> fairly high jitter+latency compared to /dev/rtc.

As of linux 2.6.24 there is dyn-tick kernel support for 64 bits. Except
for dyn-tick there is also the hpet option.

> The tradeoff is that on hosts that do implement high resolution timers
> (e.g. i386) you get lower overhead and/or more accurate emulation. I
> don't think there's any deterministic way of figuring out which is
> best. It may be feasible to switch mechanisms dynamically if it's
> obvious one is sucking, but I'm not sure how well that would work in
> practice.
>
> The only reliable solution is to isolate qemu from the host realtime
> characteristics, though that has its own set of issues. I hope to have
> this implemented fairly soon.
>
> Paul
[Qemu-devel] [PATCH] tb_flush() fix
This patch removes the unused parameter of the tb_flush() function.

diff --git a/cpu-exec.c b/cpu-exec.c
index 0f55229..76660e0 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -123,7 +123,7 @@ static TranslationBlock *tb_find_slow(target_ulong pc,
     tb = tb_alloc(pc);
     if (!tb) {
         /* flush must be done */
-        tb_flush(env);
+        tb_flush();
         /* cannot fail at this point */
         tb = tb_alloc(pc);
         /* don't forget to invalidate previous TB info */
diff --git a/exec-all.h b/exec-all.h
index 57086f3..7b0103d 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -233,7 +233,7 @@ static inline unsigned int tb_phys_hash_func(unsigned long pc)
 }

 TranslationBlock *tb_alloc(target_ulong pc);
-void tb_flush(CPUState *env);
+void tb_flush(void);
 void tb_link_phys(TranslationBlock *tb,
                   target_ulong phys_pc, target_ulong phys_page2);
diff --git a/exec.c b/exec.c
index 0daeaab..fb98e9e 100644
--- a/exec.c
+++ b/exec.c
@@ -335,7 +335,7 @@ static void page_flush_tb(void)

 /* flush all the translation blocks */
 /* XXX: tb_flush is currently not thread safe */
-void tb_flush(CPUState *env1)
+void tb_flush(void)
 {
     CPUState *env;
 #if defined(DEBUG_FLUSH)
@@ -613,7 +613,7 @@ static void tb_gen_code(CPUState *env,
     tb = tb_alloc(pc);
     if (!tb) {
         /* flush must be done */
-        tb_flush(env);
+        tb_flush();
         /* cannot fail at this point */
         tb = tb_alloc(pc);
     }
@@ -1077,7 +1077,7 @@ int cpu_watchpoint_insert(CPUState *env, target_ulong addr)
     /* FIXME: This flush is needed because of the hack to make memory ops
        terminate the TB.  It can be removed once the proper IO trap and
        re-execute bits are in.  */
-    tb_flush(env);
+    tb_flush();
     return i;
 }
@@ -1151,7 +1151,7 @@ void cpu_single_step(CPUState *env, int enabled)
         env->singlestep_enabled = enabled;
         /* must flush all the translated code to avoid inconsistancies */
         /* XXX: only flush what is necessary */
-        tb_flush(env);
+        tb_flush();
     }
 #endif
 }
diff --git a/gdbstub.c b/gdbstub.c
index 139bc25..c5d6a30 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -1006,7 +1006,7 @@ static int gdb_handle_packet(GDBState *s, CPUState *env, const char *line_buf)
     return RS_IDLE;
 }

-extern void tb_flush(CPUState *env);
+extern void tb_flush(void);

 #ifndef CONFIG_USER_ONLY
 static void gdb_vm_stopped(void *opaque, int reason)
@@ -1030,7 +1030,7 @@ static void gdb_vm_stopped(void *opaque, int reason)
             s->env->watchpoint_hit = 0;
             return;
         }
-        tb_flush(s->env);
+        tb_flush();
         ret = SIGTRAP;
     } else if (reason == EXCP_INTERRUPT) {
         ret = SIGINT;
@@ -1198,7 +1198,7 @@ gdb_handlesig (CPUState *env, int sig)
     /* disable single step if it was enabled */
     cpu_single_step(env, 0);
-    tb_flush(env);
+    tb_flush();

     if (sig != 0) {
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 4f851ce..3bac8b3 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -680,7 +680,7 @@ void helper_set_cp15(CPUState *env, uint32_t insn, uint32_t val)
             goto bad_reg;
         env->cp15.c1_coproc = val;
         /* ??? Is this safe when called from within a TB?  */
-        tb_flush(env);
+        tb_flush();
         break;
     default:
         goto bad_reg;
@@ -841,7 +841,7 @@ void helper_set_cp15(CPUState *env, uint32_t insn, uint32_t val)
         if (op2 == 0 && crm == 1) {
             if (env->cp15.c15_cpar != (val & 0x3fff)) {
                 /* Changes cp0 to cp13 behavior, so needs a TB flush.  */
-                tb_flush(env);
+                tb_flush();
                 env->cp15.c15_cpar = val & 0x3fff;
             }
             break;
diff --git a/target-i386/translate.c b/target-i386/translate.c
index cd95412..0bd75d2 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -5520,7 +5520,7 @@ static target_ulong disas_insn(DisasContext *s, target_ulong pc_start)
             gen_debug(s, pc_start - s->cs_base);
 #else
             /* start debug */
-            tb_flush(cpu_single_env);
+            tb_flush();
             cpu_set_log(CPU_LOG_INT | CPU_LOG_TB_IN_ASM);
 #endif
             break;
Re: [Qemu-devel] RFC: Code fetch optimisation
On Mon, 2007-10-15 at 03:30 +0100, Paul Brook wrote:
> On Sunday 14 October 2007, J. Mayer wrote:
> > Here's an updated version of the code fetch optimisation patch
> > against current CVS.
> > As a reminder, this patch avoids use of the softmmu helpers to fetch
> > code in most cases. A new target define TARGET_HAS_VLE_INSNS has been
> > added which is used to handle the case of an instruction that spans 2
> > pages, when the target CPU uses a variable-length instruction
> > encoding. For pure RISC, the code fetch is done using raw access
> > routines.
>
> > +unsigned long phys_pc;
> > +unsigned long phys_pc_start;
>
> These are ram offsets, not physical addresses. I recommend naming them
> as such to avoid confusion.

Well, those are host addresses. Fabrice even suggested I replace them
with void * to prevent confusion, but I kept using unsigned long because
the _p functions API does not use pointers. As those values are defined
as phys_ram_base + offset, they are likely to be host addresses, not RAM
offsets, and are used directly to dereference host pointers in the
ldxxx_p functions. Did I miss something?

> > +opc = glue(glue(lds,SUFFIX),MEMSUFFIX)(virt_pc);
> > +/* Avoid softmmu access on next load */
> > +/* XXX: dont: phys PC is not correct anymore
> > + * We could call get_phys_addr_code(env, pc); and remove the else
> > + * condition, here.
> > + */
> > +//*start_pc = phys_pc;
>
> The commented out code is completely bogus, please remove it. The
> comment is also somewhat misleading/incorrect. The else would still be
> required for accesses that span a page boundary.

I guess trying to optimize this case by retrieving the physical address
would not bring any optimization, as in fact only the last translated
instruction of a TB (and then only a few code loads) may hit this case.
I'd like to keep a comment here to show that it may not be a good idea
(or may not be as simple as it seems at first sight) to try to do more
optimisation here, but you're right, this comment is not correct.

> The code itself looks ok, though I'd be surprised if it made a
> significant difference. We're always going to hit the fast-path TLB
> lookup case anyway.

It seems that the generated code for the code fetch is much more
efficient than the one we get when using the softmmu routines. But it's
true we do not get any significant performance boost. As it was
previously mentioned, the idea of the patch is more a 'don't do unneeded
things during code translation' than a great performance improvement.

--
J. Mayer <[EMAIL PROTECTED]> Never organized
Re: [Qemu-devel] RFC: reverse-endian softmmu memory accessors
On Sun, 2007-10-14 at 15:59 +0300, Blue Swirl wrote:
> On 10/14/07, J. Mayer <[EMAIL PROTECTED]> wrote:
> > Here's an updated version of the patch against current CVS.
> > This patch provides reverse-endian, little-endian and big-endian
> > memory accessors, available with and without softmmu. It also
> > provides an IO_MEM_REVERSE TLB flag to allow future support of
> > per-page endianness control, which is required by some target CPU
> > emulations. Having reverse-endian memory accessors also makes it
> > possible to optimise reverse-endian memory accesses when the target
> > CPU has dedicated instructions. For now, it includes optimisations
> > for the PowerPC target.
>
> This breaks Sparc32 softmmu, I get a black screen. Your changes to
> target-sparc and hw/sun4m.c look fine, so the problem could be in IO?

Did it work before my commits? I may have done something wrong during
the merge...
I will do more checks and more tests...

--
J. Mayer <[EMAIL PROTECTED]> Never organized
Re: [Qemu-devel] RFC: reverse-endian softmmu memory accessors
On Sun, 2007-10-14 at 14:22 +0100, Thiemo Seufer wrote:
> J. Mayer wrote:
> [snip]
> > > > Here's a new version. The only change is that, for consistency, I
> > > > did add the big-endian and little-endian accessors that were
> > > > documented in cpu-all.h as unimplemented. The implementation is
> > > > quite trivial, having native and reverse-endian accessors
> > > > available, and changes functionally nothing from the previous
> > > > version.
> > >
> > > The patch does not apply anymore. The Sparc part looks OK.
> > >
> > > The benefits from the patch can be gained by mapping Sparc64 lduw
> > > and ldsw in op_mem.h directly to ldul and ldsl using SPARC_LD_OP
> > > and replacing the ldl+bswap etc. for the LE cases with ldlr in
> > > op_helper.c. If you prefer, I can do this after you have applied
> > > the patch.
> >
> > Yes, there are conflicts between this patch and the mmu_idx one I
> > just committed. I will regenerate an updated diff in the hours to
> > come, after I finish committing the PowerPC fixes and improvements I
> > have waiting in stock.
> > For the Sparc improvements, as I merged the PowerPC improvements in
> > the patch, I think it can be a good idea to include it directly in
> > the patch.
> > I'm also wondering if it would not be a good idea to define lduq/ldsq
> > even if they in fact do exactly what ldq does now, just to have a
> > fully consistent API.
>
> Some architecture specs mention the possibility of 128 bit integers,
> so this sounds like a good idea.

OK, then I'll add this. And I guess we can avoid the #if
(TARGET_LONG_BITS == 64) for ldsl / ldul by changing the return type to
target_ulong for those accessors.

--
J. Mayer <[EMAIL PROTECTED]> Never organized
[Qemu-devel] [PATCH] Enable ACPI interrupts.
Hi,

There was a problem with ACPI interrupt delivery to Windows guests. This
patch fixes the problem and enables the ACPI interrupts. According to
the 82371AB spec, we need to reset bit 0 of the PIRQRCA register to
enable interrupt routing.

Cheers,
Igor Lvovsky

acpi-irq.diff
Description: acpi-irq.diff
Re: [Qemu-devel] QEMU/MIPS & dyntick kernel
Aurelien Jarno wrote:
> Hi,
>
> As announced by Ralf Baechle, dyntick is now available on MIPS. I gave
> it a try on QEMU/MIPS, and unfortunately it doesn't work correctly.
>
> In some cases the kernel schedules an event very near in the future,
> which means the timer is scheduled only a few cycles from its current
> value. Unfortunately under QEMU, the timer runs too fast compared to
> the speed at which instructions are executed. This causes a lockup of
> the kernel. This can be triggered by running hwclock --hctosys in the
> guest (which is run at boot time by most distributions). Until now, I
> haven't found any other way to trigger the bug.

I found Qemu/MIPS locks up in the emulated kernel's calibrate_delay
function. Switching the kernel option off works around the problem.

> The problematic code in the kernel from arch/mips/kernel/time.c is the
> following:
>
>     cnt = read_c0_count();
>     cnt += delta;
>     write_c0_compare(cnt);
>     res = ((long)(read_c0_count() - cnt) > 0) ? -ETIME : 0;
>
> Note that there is a minimum value for delta (currently set to 48) to
> avoid lockup.
>
> In QEMU, the emulated kernel runs at 100 MHz, i.e. very fast, which
> means that more than 48 timer cycles happen between the two calls of
> read_c0_count(). The code returns -ETIME, and the routine is called
> again with 48 and so on.
>
> I have tried to reduce the speed of the timer, and the problem
> disappears when running at 1 MHz (on my machine).
>
> Here are a few proposed ways to fix the problem (they are not
> exclusive):
>
> 1) Improve the emulation speed for timer instructions. This is what I
> did with the attached patch. I see no obvious reason for stopping the
> translation after a write to CP0_Compare, so I removed that part.

The write could trigger an exception.


Thiemo
Re: [Qemu-devel] QEMU/MIPS & dyntick kernel
On Mon, Oct 15, 2007 at 04:05:14PM +0100, Thiemo Seufer wrote:
> I found Qemu/MIPS locks up in the emulated kernel's calibrate_delay
> function. Switching the kernel option off works around the problem.

I still haven't patched up the issue which was causing the problem for
Aurel. Is the slow execution of the emulator also the cause of what
you're seeing?

  Ralf
Re: [Qemu-devel] RFC: Code fetch optimisation
> > > +unsigned long phys_pc;
> > > +unsigned long phys_pc_start;
> >
> > These are ram offsets, not physical addresses. I recommend naming
> > them as such to avoid confusion.
>
> Well, those are host addresses. Fabrice even suggested I replace them
> with void * to prevent confusion, but I kept using unsigned long
> because the _p functions API does not use pointers. As those values
> are defined as phys_ram_base + offset, they are likely to be host
> addresses, not RAM offsets, and are used directly to dereference host
> pointers in the ldxxx_p functions. Did I miss something?

You are correct, they are host addresses. I still think calling them
phys_pc is confusing. It took me a while to convince myself that
"unsigned long" was an appropriate type (ignoring 64-bit windows hosts
for now).

How about host_pc?

> > > +    /* Avoid softmmu access on next load */
> > > +    /* XXX: dont: phys PC is not correct anymore
> > > +     * We could call get_phys_addr_code(env, pc); and remove the
> > > +     * else condition, here.
> > > +     */
> > > +    //*start_pc = phys_pc;
> >
> > The commented out code is completely bogus, please remove it. The
> > comment is also somewhat misleading/incorrect. The else would still
> > be required for accesses that span a page boundary.
>
> I guess trying to optimize this case by retrieving the physical
> address would not bring any optimization, as in fact only the last
> translated instruction of a TB (and then only a few code loads) may
> hit this case.

VLE targets (x86, m68k) can translate almost a full page of
instructions, and a page boundary can be anywhere within that block.
Once we've spanned multiple pages there's no point stopping translation
immediately. We may as well translate as many instructions as we can on
the second page.

I'd guess most TBs are much smaller than a page, so on average only a
few instructions are going to come after the page boundary.

> I'd like to keep a comment here to show that it may not be a good idea
> (or may not be as simple as it seems at first sight) to try to do more
> optimisation here, but you're right this comment is not correct.

Agreed.

> > The code itself looks ok, though I'd be surprised if it made a
> > significant difference. We're always going to hit the fast-path TLB
> > lookup case anyway.
>
> It seems that the generated code for the code fetch is much more
> efficient than the one we get when using the softmmu routines. But
> it's true we do not get any significant performance boost. As it was
> previously mentioned, the idea of the patch is more a 'don't do
> unneeded things during code translation' than a great performance
> improvement.

OTOH it does make the code more complicated. I'm agnostic about whether
this patch should be applied.

Paul
Re: [Qemu-devel] RFC: Code fetch optimisation
Paul Brook wrote:
> [...]
> > > The code itself looks ok, though I'd be surprised if it made a
> > > significant difference. We're always going to hit the fast-path TLB
> > > lookup case anyway.
> >
> > It seems that the generated code for the code fetch is much more
> > efficient than the one we get when using the softmmu routines. But
> > it's true we do not get any significant performance boost. As it was
> > previously mentioned, the idea of the patch is more a 'don't do
> > unneeded things during code translation' than a great performance
> > improvement.
>
> OTOH it does make the code more complicated. I'm agnostic about
> whether this patch should be applied.

If it does not correct the existing x86 issues (no code segment limit
tests and no explicit handling of code fetch exceptions in the
translation phase in the VLE case) I see no advantage in committing it
in its current form.

Regards,

Fabrice.
Re: [Qemu-devel] QEMU/MIPS & dyntick kernel
Ralf Baechle wrote:
> On Mon, Oct 15, 2007 at 04:05:14PM +0100, Thiemo Seufer wrote:
> > I found Qemu/MIPS locks up in the emulated kernel's calibrate_delay
> > function. Switching the kernel option off works around the problem.
>
> I still haven't patched up the issue which was causing the problem for
> Aurel. Is the slow execution of the emulator also the cause of what
> you're seeing?

I haven't analysed it further.


Thiemo
Re: [Qemu-devel] RFC: reverse-endian softmmu memory accessors
On 10/15/07, Blue Swirl <[EMAIL PROTECTED]> wrote:
> On 10/15/07, J. Mayer <[EMAIL PROTECTED]> wrote:
> > On Sun, 2007-10-14 at 15:59 +0300, Blue Swirl wrote:
> > > On 10/14/07, J. Mayer <[EMAIL PROTECTED]> wrote:
> > > > Here's an updated version of the patch against current CVS.
> > > > This patch provides reverse-endian, little-endian and big-endian
> > > > memory accessors, available with and without softmmu. It also
> > > > provides an IO_MEM_REVERSE TLB flag to allow future support of
> > > > per-page endianness control, which is required by some target CPU
> > > > emulations. Having reverse-endian memory accessors also makes it
> > > > possible to optimise reverse-endian memory accesses when the
> > > > target CPU has dedicated instructions. For now, it includes
> > > > optimisations for the PowerPC target.
> > >
> > > This breaks Sparc32 softmmu, I get a black screen. Your changes to
> > > target-sparc and hw/sun4m.c look fine, so the problem could be in
> > > IO?
> >
> > Did it work before my commits? I may have done something wrong
> > during the merge...
> > I will do more checks and more tests...
>
> If I disable the IOSWAP code, black screen is gone. I think this is
> logical: the io accessors return host CPU values, therefore no byte
> swapping needs to be performed.
>
> The attached version works for me.

This patch takes the reverse-endian functions into use for Sparc. I
added hypervisor versions of the functions. This is getting a bit ugly,
time for #include magic? Physical versions could be useful too.
Index: qemu/target-sparc/op_helper.c
===================================================================
--- qemu.orig/target-sparc/op_helper.c	2007-10-15 16:47:11.0 +
+++ qemu/target-sparc/op_helper.c	2007-10-15 17:34:28.0 +
@@ -649,8 +649,6 @@
     switch (asi) {
     case 0x80: // Primary
     case 0x82: // Primary no-fault
-    case 0x88: // Primary LE
-    case 0x8a: // Primary no-fault LE
         {
             switch(size) {
             case 1:
@@ -669,6 +667,26 @@
             }
         }
         break;
+    case 0x88: // Primary LE
+    case 0x8a: // Primary no-fault LE
+        {
+            switch(size) {
+            case 1:
+                ret = ldub_raw(T0);
+                break;
+            case 2:
+                ret = lduwr_raw(T0 & ~1);
+                break;
+            case 4:
+                ret = ldulr_raw(T0 & ~3);
+                break;
+            default:
+            case 8:
+                ret = ldqr_raw(T0 & ~7);
+                break;
+            }
+        }
+        break;
     case 0x81: // Secondary
     case 0x83: // Secondary no-fault
     case 0x89: // Secondary LE
@@ -679,29 +697,6 @@
         break;
     }

-    /* Convert from little endian */
-    switch (asi) {
-    case 0x88: // Primary LE
-    case 0x89: // Secondary LE
-    case 0x8a: // Primary no-fault LE
-    case 0x8b: // Secondary no-fault LE
-        switch(size) {
-        case 2:
-            ret = bswap16(ret);
-            break;
-        case 4:
-            ret = bswap32(ret);
-            break;
-        case 8:
-            ret = bswap64(ret);
-            break;
-        default:
-            break;
-        }
-    default:
-        break;
-    }
-
     /* Convert to signed number */
     if (sign) {
         switch(size) {
@@ -726,30 +721,8 @@
     if (asi < 0x80)
         raise_exception(TT_PRIV_ACT);

-    /* Convert to little endian */
-    switch (asi) {
-    case 0x88: // Primary LE
-    case 0x89: // Secondary LE
-        switch(size) {
-        case 2:
-            T0 = bswap16(T0);
-            break;
-        case 4:
-            T0 = bswap32(T0);
-            break;
-        case 8:
-            T0 = bswap64(T0);
-            break;
-        default:
-            break;
-        }
-    default:
-        break;
-    }
-
     switch(asi) {
     case 0x80: // Primary
-    case 0x88: // Primary LE
         {
             switch(size) {
             case 1:
@@ -768,6 +741,25 @@
             }
         }
         break;
+    case 0x88: // Primary LE
+        {
+            switch(size) {
+            case 1:
+                stb_raw(T0, T1);
+                break;
+            case 2:
+                stwr_raw(T0 & ~1, T1);
+                break;
+            case 4:
+                stlr_raw(T0 & ~3, T1);
+                break;
+            case 8:
+            default:
+                stqr_raw(T0 & ~7, T1);
+                break;
+            }
+        }
+        break;
     case 0x81: // Secondary
     case 0x89: // Secondary LE
         // XXX
@@ -795,11 +787,8 @@
     switch (asi) {
     case 0x10: // As if user primary
-    case 0x18: // As if user primary LE
     case 0x80: // Primary
     case 0x82: // Primary no-fault
-    case 0x88: // Primary LE
-    case 0x8a: // Primary no-fault LE
         if ((asi & 0x80) && (env->pstate & PS_PRIV)) {
             if (env->hpstate & HS_PRIV) {
                 switch(size) {
@@ -852,10 +8
Re: [Qemu-devel] [PATCH] syscall_target_errno.patch
On Thu, 2007-10-11 at 14:10 +0200, J. Mayer wrote:
> On Wed, 2007-10-10 at 21:38 -0600, Thayne Harbaugh wrote:
> > I have noticed that many functions in syscall.c return a *host* errno
> > when a *target* errno should be returned. At the same time, there are
> > several places in syscall.c:do_syscall() that immediately return an
> > errno rather than setting the return value and exiting through the
> > syscall return value reporting at the end of do_syscall().
> >
> > This patch addresses both of those problems at once rather than
> > touching the exact same errno return lines twice in do_syscall(). It
> > also touches a few functions in linux-user/signal.c that are called
> > from do_syscall().
>
> Hi,
>
> there are still a lot of problems hidden in syscall.c and signal.c, as
> you noticed.
> Your patch seems OK to me and adding all those comments is imho really
> great.
> My only remark is a cosmetic one: I don't like too much hiding 'goto'
> in macros...

This is another version of the same patch that doesn't hide "goto" in a
macro. It also makes do_sigaltstack() return target values, which is
more like the other do_*() functions called from do_syscall().

Index: qemu/linux-user/syscall.c
===================================================================
--- qemu.orig/linux-user/syscall.c	2007-10-15 08:31:52.0 -0600
+++ qemu/linux-user/syscall.c	2007-10-15 08:32:45.0 -0600
@@ -166,6 +166,8 @@
 #ifdef __NR_gettid
 _syscall0(int, gettid)
 #else
+/* This is a replacement for the host gettid() and must return a host
+ * errno. */
 static int gettid(void) {
     return -ENOSYS;
 }
@@ -387,6 +389,7 @@
     target_original_brk = target_brk = HOST_PAGE_ALIGN(new_brk);
 }

+/* do_brk() must return target values and target errnos. */
 target_long do_brk(target_ulong new_brk)
 {
     target_ulong brk_page;
@@ -396,7 +399,7 @@
     if (!new_brk)
         return target_brk;
     if (new_brk < target_original_brk)
-        return -ENOMEM;
+        return -TARGET_ENOMEM;

     brk_page = HOST_PAGE_ALIGN(target_brk);
@@ -530,6 +533,7 @@
 }

+/* do_select() must return target values and target errnos. */
 static target_long do_select(int n,
                              target_ulong rfd_p, target_ulong wfd_p,
                              target_ulong efd_p, target_ulong target_tv)
@@ -704,6 +708,7 @@
     msgh->msg_controllen = tswapl(space);
 }

+/* do_setsockopt() Must return target values and target errnos. */
 static target_long do_setsockopt(int sockfd, int level, int optname,
                                  target_ulong optval, socklen_t optlen)
 {
@@ -714,7 +719,7 @@
     case SOL_TCP:
         /* TCP options all take an 'int' value. */
         if (optlen < sizeof(uint32_t))
-            return -EINVAL;
+            return -TARGET_EINVAL;
         val = tget32(optval);
         ret = get_errno(setsockopt(sockfd, level, optname,
                                    &val, sizeof(val)));
@@ -812,7 +817,7 @@
             goto unimplemented;
         }
         if (optlen < sizeof(uint32_t))
-            return -EINVAL;
+            return -TARGET_EINVAL;
         val = tget32(optval);
         ret = get_errno(setsockopt(sockfd, SOL_SOCKET, optname,
                                    &val, sizeof(val)));
@@ -820,11 +825,12 @@
     default:
     unimplemented:
         gemu_log("Unsupported setsockopt level=%d optname=%d \n", level, optname);
-        ret = -ENOSYS;
+        ret = -TARGET_ENOSYS;
     }
     return ret;
 }

+/* do_getsockopt() Must return target values and target errnos. */
 static target_long do_getsockopt(int sockfd, int level, int optname,
                                  target_ulong optval, target_ulong optlen)
 {
@@ -851,7 +857,7 @@
     int_case:
         len = tget32(optlen);
         if (len < 0)
-            return -EINVAL;
+            return -TARGET_EINVAL;
         lv = sizeof(int);
         ret = get_errno(getsockopt(sockfd, level, optname, &val, &lv));
         if (ret < 0)
@@ -884,7 +890,7 @@
     case IP_MULTICAST_LOOP:
         len = tget32(optlen);
         if (len < 0)
-            return -EINVAL;
+            return -TARGET_EINVAL;
         lv = sizeof(int);
         ret = get_errno(getsockopt(sockfd, level, optname, &val, &lv));
         if (ret < 0)
@@ -908,7 +914,7 @@
     unimplemented:
         gemu_log("getsockopt level=%d optname=%d not yet supported\n",
                  level, optname);
-        ret = -ENOSYS;
+        ret = -TARGET_ENOSYS;
         break;
     }
     return ret;
@@ -945,6 +951,7 @@
     unlock_user (target_vec, target_addr, 0);
 }

+/* do_socket() Must return target values and target errnos. */
 static target_long do_socket(int domain, int type, int protocol)
 {
 #if defined(TARGET_MIPS)
@@ -972,6 +979,7 @@
     return get_errno(socket(domain, type, protocol));
 }

+/* do_bind() Must return target values and target errnos. */
 static target_long do_bind(int sockfd, target_ulong target_addr,
                            socklen_t addrlen)
 {
@@ -981,6 +989,7 @@
     return get_errno(bind(so
Re: [Qemu-devel] RFC: reverse-endian softmmu memory accessors
On Mon, 2007-10-15 at 19:02 +0300, Blue Swirl wrote:
> On 10/15/07, J. Mayer <[EMAIL PROTECTED]> wrote:
> > On Sun, 2007-10-14 at 15:59 +0300, Blue Swirl wrote:
> > > On 10/14/07, J. Mayer <[EMAIL PROTECTED]> wrote:
> > > > Here's an updated version of the patch against current CVS.
> > > > This patch provides reverse-endian, little-endian and big-endian
> > > > memory accessors, available with and without softmmu. It also
> > > > provides an IO_MEM_REVERSE TLB flag to allow future support of
> > > > per-page endianness control, which is required by some target CPU
> > > > emulations. Having reverse-endian memory accessors also makes it
> > > > possible to optimise reverse-endian memory accesses when the
> > > > target CPU has dedicated instructions. For now, it includes
> > > > optimisations for the PowerPC target.
> > >
> > > This breaks Sparc32 softmmu, I get a black screen. Your changes to
> > > target-sparc and hw/sun4m.c look fine, so the problem could be in
> > > IO?
> >
> > Did it work before my commits? I may have done something wrong
> > during the merge...
> > I will do more checks and more tests...
>
> If I disable the IOSWAP code, black screen is gone. I think this is
> logical: the io accessors return host CPU values, therefore no byte
> swapping needs to be performed.

Memory mapped I/O access functions hopefully return data in the target
endianness. This is the reason why there are so many #ifdef
TARGET_WORDS_BIGENDIAN in the emulated devices' memory mapped access
routines, and also in the io_read and io_write functions for 64 bit
accesses. And the emulated CPU expects data to always come in its
endianness when doing a "load from memory", even if the access is a
device one. Your patch works as long as you don't use load/store with
reverse-endian accessor routines nor a TLB with the reverse-endian bit
set.
On PowerPC, using reverse-endian loads and stores, the byteswap in the
I/O routines is needed so that most MMIO devices (like IDE, which always
returns little-endian data) can be accessed correctly.

The bug you report just means there's a logical error somewhere in my
code. I did download the Sparc test and was able to reproduce it. I'm
working to find the bug.

And I finally found it. The bug is just that I did something completely
stupid, defining IO_MEM_REVERSE as 3 instead of 4: it's obvious that it
has to be a power of 2 to be combined with the other TLB bits. I wonder
how the PowerPC case was able to run with such a huge bug... Please
accept my apologies.

I'm going to do more tests with this fix, try to merge the
sparc_reverse_endian changes into my code, and repost an updated patch.

--
J. Mayer <[EMAIL PROTECTED]> Never organized
Re: [Qemu-devel] RFC: Code fetch optimisation
On Mon, 2007-10-15 at 17:01 +0100, Paul Brook wrote:
> > > > +unsigned long phys_pc;
> > > > +unsigned long phys_pc_start;
> > >
> > > These are ram offsets, not physical addresses. I recommend naming
> > > them as such to avoid confusion.
> >
> > Well, those are host addresses. Fabrice even suggested I replace
> > them with void * to prevent confusion, but I kept using unsigned
> > long because the _p functions API does not use pointers. As those
> > values are defined as phys_ram_base + offset, they are likely to be
> > host addresses, not RAM offsets, and are used directly to
> > dereference host pointers in the ldxxx_p functions. Did I miss
> > something?
>
> You are correct, they are host addresses. I still think calling them
> phys_pc is confusing. It took me a while to convince myself that
> "unsigned long" was an appropriate type (ignoring 64-bit windows hosts
> for now).
>
> How about host_pc?

It's OK for me.

> > > > +    /* Avoid softmmu access on next load */
> > > > +    /* XXX: dont: phys PC is not correct anymore
> > > > +     * We could call get_phys_addr_code(env, pc); and remove
> > > > +     * the else condition, here.
> > > > +     */
> > > > +    //*start_pc = phys_pc;
> > >
> > > The commented out code is completely bogus, please remove it. The
> > > comment is also somewhat misleading/incorrect. The else would
> > > still be required for accesses that span a page boundary.
> >
> > I guess trying to optimize this case by retrieving the physical
> > address would not bring any optimization, as in fact only the last
> > translated instruction of a TB (and then only a few code loads) may
> > hit this case.
>
> VLE targets (x86, m68k) can translate almost a full page of
> instructions, and a page boundary can be anywhere within that block.
> Once we've spanned multiple pages there's no point stopping
> translation immediately. We may as well translate as many instructions
> as we can on the second page.
>
> I'd guess most TBs are much smaller than a page, so on average only a
> few instructions are going to come after the page boundary.

This leads me to another reflection. For fixed-length encoding targets,
we always stop translation when reaching a page boundary. If we keep
using the current model and we optimize the slow case, it would be
possible to stop only if we cross two page boundaries during code
translation, and it seems that this case is not likely to happen. If we
keep the current behavior, we could remove the second page_addr element
in the tb structure and maybe optimize parts of the tb management and
invalidation code.

> > I'd like to keep a comment here to show that it may not be a good
> > idea (or may not be as simple as it seems at first sight) to try to
> > do more optimisation here, but you're right this comment is not
> > correct.
>
> Agreed.
>
> > > The code itself looks ok, though I'd be surprised if it made a
> > > significant difference. We're always going to hit the fast-path
> > > TLB lookup case anyway.
> >
> > It seems that the generated code for the code fetch is much more
> > efficient than the one we get when using the softmmu routines. But
> > it's true we do not get any significant performance boost. As it was
> > previously mentioned, the idea of the patch is more a 'don't do
> > unneeded things during code translation' than a great performance
> > improvement.
>
> OTOH it does make the code more complicated. I'm agnostic about
> whether this patch should be applied.

I agree that this proposal was more an answer to a challenging idea I
received than a real need. The worst thing in this patch, imho, is that
you need to increase 2 values each time you want to change the PC. This
is likely to bring bugs when someone forgets to increase one of the two.

I was thinking of hiding the pc, host_pc and host_pc_start (and maybe
also pc_start) in a structure and adding inline helpers:
* get_pc would return the current virtual PC, as needed by the jump and
  relative memory access functions.
* get_tb_len would return the difference between the virtual PC and the
  virtual pc_start, as is done at the end of the
  gen_intermediate_code functions.
* move_pc would add an offset to the virtual and the physical PC. This
  has to be target dependent, due to the special case for Sparc.
* update_phys_pc would be void for most targets, except for Sparc where
  the phys_pc needs to be adjusted after the translation of each target
  instruction.
and maybe more, if needed.

This structure could also contain target specific information. To
address the problem of segment limit checks reported by Fabrice Bellard,
we could for example add the address of the next segment limit for the
x86 target and add a target specific check at the start of the
ldx_code_p function. But I don't know much about segmentation
"subtleties" on x86, so this idea may not be appropriate to solve this
problem.

--
J. Mayer <[EMAIL PROTECTED]> Never organized
Re: [Qemu-devel] RFC: Code fetch optimisation
> > VLE targets (x86, m68k) can translate almost a full page of
> > instructions, and a page boundary can be anywhere within that block.
> > Once we've spanned multiple pages there's no point stopping
> > translation immediately. We may as well translate as many
> > instructions as we can on the second page.
> >
> > I'd guess most TBs are much smaller than a page, so on average only
> > a few instructions are going to come after the page boundary.
>
> This leads me to another reflection. For fixed-length encoding
> targets, we always stop translation when reaching a page boundary. If
> we keep using the current model and we optimize the slow case, it
> would be possible to stop only if we cross two page boundaries during
> code translation, and it seems that this case is not likely to happen.
> If we keep the current behavior, we could remove the second page_addr
> element in the tb structure and maybe optimize parts of the tb
> management and invalidation code.

The latter may be the only feasible option. Some targets (ARMv5, maybe
others) do not have an explicit fault address for MMU instruction
faults. The faulting address is the address of the current instruction
when the fault occurs. Prefetch aborts are generated at translation
time, which effectively means the faulting instruction must be at the
start of a TB. Terminating the TB on a page boundary guarantees this
behavior.

For VLE targets we already get this wrong (the prefetch abort occurs
some time before the faulting instruction executes). I don't know if
this behavior is permitted by the ISA, but it's definitely possible to
construct cases where it has a visible effect.

Paul
[Qemu-devel] handling SIGWINCH with qemu -nographic
Has anyone looked into finding a way to pass SIGWINCH through to the guest? I started looking at linux-user/signal.c. For those that might not be familiar with SIGWINCH, supporting it would allow qemu -nographic to tell when the terminal is a different size than 80x24.
Re: [Qemu-devel] handling SIGWINCH with qemu -nographic
On Tuesday 16 October 2007, Jeff Carr wrote:
> Has anyone looked into finding a way to pass SIGWINCH through to the
> guest? I started looking at linux-user/signal.c.
>
> For those that might not be familiar with SIGWINCH, supporting it
> would allow qemu -nographic to tell when the terminal is a different
> size than 80x24.

qemu emulates a real machine. Signals are an operating system concept,
so your question makes no sense. Configure your guest OS exactly the
same way you would a real machine with a serial console.

The linux-user code is for the userspace emulation. You're obviously
using the full system emulation.

Paul