On 8/9/15 09:10, Chen Gang wrote:
>
> On 8/9/15 01:23, Chen Gang wrote:
>> Hello all:
>>
>> Below is my current idea for all floating point insns. It is not a
>> precise implementation, and not even a complete one -- it assumes the
>> pack insns are only used for packing (u)int32_t when they are used
>> individually:
>>
>>   fsingle_add1     ; return calc flags, save calc result to env.
>>
>>   fsingle_sub1     ; return calc flags, save calc result to env.
>>
>>   fsingle_addsub2  ; set "has result" flag.
>>
>>   fsingle_mul1     ; skip return value, save calc result to env.
>>                      set "has result" flag.
>>
>>   fsingle_mul2     ; skipped.
>>
>>
>>   fsingle_pack1    ; skipped.
>>
>>   fsingle_pack2    ; if "has result"
>>                        reset "has result" flag.
>>                        return calc result from env.
>>                      else
>>                        pack srca
>>                          reference from tilegx.md: float(uns)sisf2.
>>                          get (u)int32_t a, then (u)int32_to_float32.
>
> For "pack srca and srcb", the related demo is like below (srca and srcb
> are uint64_t):
>

Oh, sorry, that demo is for "pack srca" (not for "pack srca and srcb").

>   switch (srca & 0x3ff) {
>
>   /* treat it as uint32_t */
>   case 0x9e:
>       return uint32_to_float32(srca >> 32, &FP_STATUS);
>
>   /* treat it as int32_t, must be negative number */
>   case 0x29e:
>       return int32_to_float32(srca >> 32 | 0x80000000, &FP_STATUS);
>
>   default:
>       unimplemented (gen_exception).
>   }
>
>>
>>   fdouble_unpack_max:  ; skipped.
>>
>>   fdouble_unpack_min:  ; skipped.
>>
>>   fdouble_add_flags:   ; return calc flags, save calc result to env.
>>
>>   fdouble_sub_flags:   ; return calc flags, save calc result to env.
>>
>>   fdouble_addsub:      ; set "has result" flag.
>>
>>   fdouble_mul_flags:   ; skip return flags, save calc result to env.
>>                          set "has result" flag.
>>
>>   fdouble_pack1:       ; if "has result"
>>                            reset "has result" flag.
>>                            return calc result from env.
>>                          else
>>                            pack srca and srcb.
>>                              reference from tilegx.md: float(uns)sidf2.
>>                              get (u)int32_t a, then (u)int32_to_float64.
>>
>
> For "pack srca and srcb", the related demo is like below (srca and srcb
> are uint64_t):
>
>   switch (srcb & 0xffff) {
>

Oh, sorry, this should use 0xfffff instead of 0xffff.

>   /* treat it as uint32_t */
>   case 0x21b00:
>       return uint32_to_float64(srca >> 4, &FP_STATUS);
>
>   /* treat it as int32_t, must be negative number */
>   case 0xa1b00:
>       return int32_to_float64(srca >> 4 | 0x80000000, &FP_STATUS);
>
>   default:
>       unimplemented (gen_exception).
>   }
>
>>   fdouble_pack2:       ; skipped.
>>
>>
>> (fsingle_add1/sub1 and fdouble_add/sub_flags can be used individually,
>> e.g. in the gcc testsuite for complex numbers).
>>
>>
>> Next, I shall implement the floating point insns; any related ideas,
>> suggestions, and completions are welcome.
>>
>> Thanks.
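
The "has result" handshake in the plan above can be sketched in plain C. This is only a minimal sketch under stated assumptions: `FpEnv` is a hypothetical stand-in for the CPU state (not QEMU's real struct), host `float` arithmetic stands in for softfloat, the operands are assumed to sit in the low 32 bits of each register, and the returned flags are a placeholder. The final pack step is called `fsingle_pack2` here, following the gcc sequence quoted later in this thread. The standalone "pack srca" path is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-CPU state (not QEMU's real struct). */
typedef struct {
    uint32_t fp_result;  /* raw float bits saved by add1/sub1/mul1 */
    bool has_result;     /* set by addsub2/mul1, consumed by pack2 */
} FpEnv;

static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

static uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* fsingle_add1: do the whole add, save it in env, return the "flags". */
static uint64_t fsingle_add1(FpEnv *env, uint64_t srca, uint64_t srcb)
{
    /* Assumption: the operands sit in the low 32 bits of each register. */
    float sum = bits_to_float((uint32_t)srca) + bits_to_float((uint32_t)srcb);
    env->fp_result = float_to_bits(sum);
    return 0;  /* placeholder: the real flag bits are not modelled here */
}

/* fsingle_addsub2: only marks that env holds a pending result. */
static void fsingle_addsub2(FpEnv *env)
{
    env->has_result = true;
}

/* fsingle_pack2: hand back the pending result, if there is one.
 * Returns false for a standalone pack, which this sketch leaves out. */
static bool fsingle_pack2(FpEnv *env, uint64_t *dst)
{
    if (!env->has_result) {
        return false;
    }
    env->has_result = false;
    *dst = env->fp_result;
    return true;
}
```

Calling `fsingle_add1`, then `fsingle_addsub2`, then `fsingle_pack2` yields the saved sum and clears the flag, so a second `fsingle_pack2` falls through to the standalone-pack path.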
>>
>>
>> On 8/5/15 22:16, Chen Gang wrote:
>>> On 8/4/15 23:04, Richard Henderson wrote:
>>>> On 08/04/2015 06:56 AM, Chen Gang wrote:
>>>>>
>>>>> On 8/4/15 04:47, Chen Gang wrote:
>>>>>> On 8/4/15 00:40, Richard Henderson wrote:
>>>>>>> On 08/01/2015 02:47 AM, Chen Gang wrote:
>>>>>>>> I am just adding floating point instructions (e.g. fsingle_add1),
>>>>>>>> but I cannot find any details about them (the ISA documents
>>>>>>>> only give a summary description, but not details), e.g.
>>>>>>>
>>>>>>> The tilegx splits the four/six cycle arithmetic into multiple
>>>>>>> black-box instructions. You need only really implement one of the
>>>>>>> four, with the rest of them being implemented as nops or moves.
>>>>>>>
>>>>>>> Looking at what gcc produces gives the hints:
>>>>>>>
>>>>>>>   fdouble_unpack_min  min, srca, srcb
>>>>>>>   fdouble_unpack_max  max, srca, srcb
>>>>>>>   fdouble_add_flags   flg, srca, srcb
>>>>>>>   fdouble_addsub      max, min, flg
>>>>>>>   fdouble_pack1       dst, max, flg
>>>>>>>   fdouble_pack2       dst, max, zero
>>>>>>>
>>>>>>> The unpack, addsub, and pack2 insns can be ignored, the add_flags
>>>>>>> insn can perform the whole operation, the pack1 insn performs a move
>>>>>>> from "flg" to "dst".
>>>>>>>
>>>>>>> Similarly for the single-precision:
>>>>>>>
>>>>>>>   fsingle_add1     tmp, srca, srcb
>>>>>>>   fsingle_addsub2  tmp, srca, srcb
>>>>>>>   fsingle_pack1    flg, tmp
>>>>>>>   fsingle_pack2    dst, tmp, flg
>>>>>>>
>>>>>>> The add1 insn performs the whole operation, the addsub2 and pack1
>>>>>>> insns are ignored, and the pack2 insn is a move from tmp to dst.
>>>>>>>
>>>>>
>>>>> After checking tilegx.md completely, I think we still need to
>>>>> implement each of them precisely, or we cannot emulate all cases
>>>>> (e.g. muldf3).
>>>>
>>>> No, you can still implement all of muldf3 in fdouble_mul_flags.
>>>> Again, the fdouble_pack1 copies from the flag input to the output.
>>>>
>>>> Yes, there is a 64-bit multiply in there, but the tcg optimizer
>>>> should be able to delete all of that as unused. Especially if you
>>>> have the fdouble_unpack* insns store zero into their destinations.
>>>>
>>>
>>> I am not quite sure, but I guess what you said should be OK (at
>>> least, it is very useful for the implementation).
>>>
>>>
>>>> Don't get me wrong -- more accurate implementation of the actual
>>>> insns would be nice, especially for debugging. But if the insns
>>>> aren't accurately documented I don't see what choice we have.
>>>>
>>>
>>> I guess we can still try to implement the details.
>>>
>>> - The document has a summary of every floating point instruction, so
>>>   we can work out, or guess, its implementation entirely.
>>>
>>> - gcc uses them all and completely, so it is a good sample and a good
>>>   reference for us (but we should not assume gcc must be correct,
>>>   since we just use qemu for the gcc testsuite).
>>>
>>> - The Tilegx floating point format should be standard (or at least
>>>   refer to the standard format), so we can look up the related
>>>   information via google/baidu.
>>>
>>>
>>>> On the good side, implementing the entire operation as part of the
>>>> "flags" step probably results in faster emulation.
>>>>
>>>
>>> I guess so, too.
>>>
>>>
>>> I shall try to finish the simple implementation first, then try to
>>> implement the floating point instructions in detail in the future (it
>>> should be lower priority).
>>>
>>>
>>> Thanks.
>>>
>>
>

--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed
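
For reference, the simpler scheme Richard describes for the double-precision add (the flags step does the whole operation, pack1 is a move from "flg" to "dst", everything else is a nop or stores zero) can be sketched as below. This is a sketch under stated assumptions, not the QEMU implementation: host `double` arithmetic stands in for softfloat, each insn is modelled as a plain C function with hypothetical signatures, and `emulate_adddf3` is an illustrative name for the six-insn sequence gcc emits.

```c
#include <stdint.h>
#include <string.h>

static double bits_to_double(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

static uint64_t double_to_bits(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return bits;
}

/* fdouble_unpack_min / fdouble_unpack_max: ignored, store zero. */
static uint64_t fdouble_unpack(uint64_t srca, uint64_t srcb)
{
    (void)srca;
    (void)srcb;
    return 0;
}

/* fdouble_add_flags: performs the whole add; the result travels in "flg". */
static uint64_t fdouble_add_flags(uint64_t srca, uint64_t srcb)
{
    return double_to_bits(bits_to_double(srca) + bits_to_double(srcb));
}

/* fdouble_addsub: ignored, the destination keeps its old value. */
static uint64_t fdouble_addsub(uint64_t dest, uint64_t srca, uint64_t srcb)
{
    (void)srca;
    (void)srcb;
    return dest;
}

/* fdouble_pack1: a move from the flag input to the output. */
static uint64_t fdouble_pack1(uint64_t srca, uint64_t flg)
{
    (void)srca;
    return flg;
}

/* fdouble_pack2: ignored, the destination keeps its old value. */
static uint64_t fdouble_pack2(uint64_t dest, uint64_t srca, uint64_t srcb)
{
    (void)srca;
    (void)srcb;
    return dest;
}

/* The six-insn sequence gcc emits for a double add, run through the stubs. */
static double emulate_adddf3(double a, double b)
{
    uint64_t srca = double_to_bits(a), srcb = double_to_bits(b);
    uint64_t min = fdouble_unpack(srca, srcb);
    uint64_t max = fdouble_unpack(srca, srcb);
    uint64_t flg = fdouble_add_flags(srca, srcb);
    max = fdouble_addsub(max, min, flg);
    uint64_t dst = fdouble_pack1(max, flg);
    dst = fdouble_pack2(dst, max, 0);
    return bits_to_double(dst);
}
```

Only `fdouble_add_flags` and `fdouble_pack1` contribute to the result, which is why the remaining insns can be dropped by an optimizer once their outputs are unused.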