On 9/2/19 12:45 AM, liuzhiwei wrote:
>
> On 2019/8/29 11:09 PM, Richard Henderson wrote:
>> On 8/29/19 5:45 AM, liuzhiwei wrote:
>>> Even in qemu, it may be some situations that VSTART != 0. For example, a load
>>> instruction leads to a page fault exception in a middle position. If VSTART ==
>>> 0, some elements that had been loaded before the exception will be loaded once
>>> again.
>> Alternately, you can validate all of the pages before performing any memory
>> operations.  At which point there will never be an exception in the middle.
>
> As a vector instruction may access memory across many pages, is there any way
> to validate the pages? Page table walk? Or some TLB APIs?

Yes, there are TLB APIs.  Several of them, depending on what is needed.

> #0  cpu_watchpoint_address_matches (wp=0x555556228110, addr=536871072, len=1)
>     at qemu/exec.c:1094
> #1  0x000055555567204f in check_watchpoint (offset=160, len=1, attrs=...,
>     flags=2) at qemu/exec.c:2803
> #2  0x0000555555672379 in watch_mem_write (opaque=0x0, addr=536871072,
>     val=165, size=1, attrs=...) at qemu/exec.c:2878
> #3  0x00005555556d44bb in memory_region_write_with_attrs_accessor
>     (mr=0x5555561292e0 <io_mem_watch>, addr=536871072, value=0x7fffedffe2c8,
>     size=1, shift=0, mask=255, attrs=...) at qemu/memory.c:553
> #4  0x00005555556d45de in access_with_adjusted_size (addr=536871072,
>     value=0x7fffedffe2c8, size=1, access_size_min=1, access_size_max=8,
>     access_fn=0x5555556d43cd <memory_region_write_with_attrs_accessor>,
>     mr=0x5555561292e0 <io_mem_watch>, attrs=...) at qemu/memory.c:594
> #5  0x00005555556d7247 in memory_region_dispatch_write (mr=0x5555561292e0
>     <io_mem_watch>, addr=536871072, data=165, size=1, attrs=...)
>     at qemu/memory.c:1480
> #6  0x00005555556f0d13 in io_writex (env=0x5555561efb58,
>     iotlbentry=0x5555561f5398, mmu_idx=1, val=165, addr=536871072, retaddr=0,
>     recheck=false, size=1) at qemu/accel/tcg/cputlb.c:909
> #7  0x00005555556f19a6 in io_writeb (env=0x5555561efb58, mmu_idx=1, index=0,
>     val=165 '\245', addr=536871072, retaddr=0, recheck=false)
>     at qemu/accel/tcg/softmmu_template.h:268
> #8  0x00005555556f1b54 in helper_ret_stb_mmu (env=0x5555561efb58,
>     addr=536871072, val=165 '\245', oi=1, retaddr=0)
>     at qemu/accel/tcg/softmmu_template.h:304
> #9  0x0000555555769f06 in cpu_stb_data_ra (env=0x5555561efb58, ptr=536871072,
>     v=165, retaddr=0) at qemu/include/exec/cpu_ldst_template.h:182
> #10 0x0000555555769f80 in cpu_stb_data (env=0x5555561efb58, ptr=536871072,
>     v=165) at /qemu/include/exec/cpu_ldst_template.h:194
> #11 0x000055555576a913 in csky_cpu_stb_data (env=0x5555561efb58,
>     vaddr=536871072, data=165 '\245') at qemu/target/csky/csky_ldst.c:48
> #12 0x000055555580ba7d in helper_vdsp2_vstru_n (env=0x5555561efb58,
>     insn=4167183360) at qemu/target/csky/op_vdsp2.c:1317
>
> The path is not related to probe_write in the patch().

Of course.  It wasn't supposed to be.

> Could you give more details or a test case where watchpoint doesn't work
> correctly?

Consider a store that partially, but not completely, overlaps the watchpoint.
This is obviously much easier to do with large vector operations than with
normal integer operations.

In this case, we may have completed some of the stores before encountering the
watchpoint.  Hitting the watchpoint will, inside check_watchpoint(), longjmp
back to the cpu main loop.  Now we have a problem: the store is partially
complete and it should not be.

Therefore, we now have patches queued in tcg-next that adjust probe_write to
perform both access and watchpoint tests.  There is still target-specific code
that must be adjusted to match, so there are not currently any examples in the
tree to show.

However, the idea is:

(1) Instructions that perform more than one host store must probe
    the entire range to be stored before performing any stores.

(2) Instructions that perform more than one host load must either
    probe the entire range to be loaded, or collect the data in
    temporary storage.  If not using probes, writeback to the
    register file must be delayed until after all loads are done.

(3) Any one probe may not cross a page boundary; splitting of the
    access across pages must be done by the helper.


r~
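
For illustration only, the three rules above could be sketched roughly as
follows.  This is a standalone model, not QEMU code: guest_mem, the
page_writable/page_readable tables, mock_probe_write(), probe_write_range(),
vector_store() and vector_load() are all hypothetical names invented for the
sketch, and a "false" return stands in for the exception or watchpoint exit
that a real probe would take via longjmp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUEST_PAGE_SIZE 4096u
#define MEM_PAGES       4u
#define MAX_VLEN        256u   /* assumed upper bound on one vector access */

/* Hypothetical guest memory model: page 2 is read-only, page 3 unreadable. */
static uint8_t guest_mem[MEM_PAGES * GUEST_PAGE_SIZE];
static const bool page_writable[MEM_PAGES] = { true, true, false, true };
static const bool page_readable[MEM_PAGES] = { true, true, true, false };

/*
 * Hypothetical stand-in for a single-page write probe.  Returning false
 * models the fault/watchpoint exit a real probe would raise.
 */
static bool mock_probe_write(uint64_t addr, size_t len)
{
    uint64_t page = addr / GUEST_PAGE_SIZE;

    /* Rule (3): one probe never crosses a page boundary. */
    assert(len > 0 && (addr % GUEST_PAGE_SIZE) + len <= GUEST_PAGE_SIZE);
    return page < MEM_PAGES && page_writable[page];
}

/* Rule (3): the helper splits the access into per-page probes. */
static bool probe_write_range(uint64_t addr, size_t len)
{
    while (len > 0) {
        size_t in_page = GUEST_PAGE_SIZE - (size_t)(addr % GUEST_PAGE_SIZE);
        size_t chunk = len < in_page ? len : in_page;

        if (!mock_probe_write(addr, chunk)) {
            return false;
        }
        addr += chunk;
        len -= chunk;
    }
    return true;
}

/* Rule (1): probe the whole range first, then perform the stores. */
static bool vector_store(uint64_t addr, const uint8_t *src, size_t len)
{
    if (!probe_write_range(addr, len)) {
        return false;               /* no element has been stored */
    }
    for (size_t i = 0; i < len; i++) {
        guest_mem[addr + i] = src[i];
    }
    return true;
}

/* Rule (2), second option: collect loads in a temporary, write back last. */
static bool vector_load(uint8_t *vreg, uint64_t addr, size_t len)
{
    uint8_t tmp[MAX_VLEN];

    if (len > MAX_VLEN) {
        return false;
    }
    for (size_t i = 0; i < len; i++) {
        uint64_t page = (addr + i) / GUEST_PAGE_SIZE;

        if (page >= MEM_PAGES || !page_readable[page]) {
            return false;           /* fault mid-way: vreg is untouched */
        }
        tmp[i] = guest_mem[addr + i];
    }
    memcpy(vreg, tmp, len);         /* delayed writeback to the register */
    return true;
}
```

In this model a store that straddles a writable and a read-only page leaves
memory unchanged, and a load that faults part-way leaves the destination
register unchanged, which is the all-or-nothing behaviour the rules are
meant to guarantee.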