https://bugs.kde.org/show_bug.cgi?id=385409
--- Comment #72 from Andreas Arnez <ar...@linux.ibm.com> --- (In reply to Julian Seward from comment #71) > Thank you for finishing this up. OK to land provided the stuff below > is fixed (none of it is a big deal) and the test in the next para > passes. Thanks for reviewing! > It is also important to check that all new or modified test cases run > on Memcheck without asserting or otherwise failing. The regression > test driver will only check them with the "none" tool since the tests > are inside none/tests/s390. You can do this manually by running the > relevant tests directly on memcheck (just run "by hand"). Done. > [...] > +typedef union { > .. > +} s390x_vec_op_details_t; > > add STATIC_ASSERT(sizeof(s390x_vec_op_details_t) == sizeof(ULong)) Done. > +static void > +s390_cc_set_expr(IRExpr* cc) > +{ > + vassert(typeOfIRExpr(irsb->tyenv, cc) == Ity_I64); > + > + s390_cc_thunk_fill(mkU64(S390_CC_OP_SET), cc, mkU64(0), mkU64(0)); > +} > > Similar to conversation last week this is again a case where, if |cc| > is used more than once (maybe in the callee of this function) then > code for it will get generated to evaluate it multiple times, and it's > quite likely that those duplicates will not get removed during IR > optimisation. It would be safer to pass around an IRTemp and use > |mkexpr(cc)| instead. Done. Most of the time there was an IRTemp already, so it wasn't a real problem. But I agree we're on the safe side this way. Also renamed the function to s390_cc_set and the previously named s390_cc_set to s390_cc_set_val. > +static IRExpr* > +s390_V128_compareLT128x1(IRExpr* arg1, IRExpr* arg2, Bool allow_equal) > > Just a sanity check: is the logic with |allow_equal| here correct? I > am not saying it isn't, but I think a second look would not be a bad > idea. Yes, the logic looks correct to me: `allow_equal' is checked when the high halves are equal, because in that case the operands might be fully equal. Otherwise they certainly differ and `allow_equal' doesn't matter. > +/* Permorms "arg1 + arg2 + carry_out_bit(arg1 + arg2)". > > "Performs" Done. > [...] have you checked that s390_disasm can print all of these new > insns? [...] I haven't checked whether all insns can be printed correctly. The ones I looked at were OK. I will look into that more thoroughly tomorrow. > + const IROp ops[] = { Iop_InterleaveHI8x16, Iop_InterleaveHI16x8, > + Iop_InterleaveHI32x4, Iop_InterleaveHI64x2 }; > + vassert(m4 < sizeof(ops)); > + put_vr_qw(v1, binop(ops[m4], get_vr_qw(v2), get_vr_qw(v3))); > > That assertion is surely not right [...] > > Please fix, also any other similar that you can find in this diff. > [...] Hm, yeah, this is broken -- luckily it doesn't impact the execution of correct code. Fixed all occurrences I found. > static const HChar * > s390_irgen_VSEL(UChar v1, UChar v2, UChar v3, UChar v4) > { > - IRExpr* vA = get_vr_qw(v3); > - IRExpr* vB = get_vr_qw(v2); > - IRExpr* vC = get_vr_qw(v4); > - > - /* result = (vA & ~vC) | (vB & vC) */ > - put_vr_qw(v1, > - binop(Iop_OrV128, > - binop(Iop_AndV128, vA, unop(Iop_NotV128, vC)), > - binop(Iop_AndV128, vB, vC) > - ) > - ); > + IRExpr* vIfTrue = get_vr_qw(v2); > + IRExpr* vIfFalse = get_vr_qw(v3); > + IRExpr* vCond = get_vr_qw(v4); > + > + put_vr_qw(v1, s390_V128_bitwiseITE(vCond, vIfTrue, vIfFalse)); > return "vsel"; > } > > This kind of thing potentially also suffers from the IRExpr* > duplication problem, assuming that s390_V128_bitwiseITE uses vCond > twice. You could fix it directly in s390_V128_bitwiseITE by > allocating a new temp, binding (assign()-ing) vCond to that and then > using the temp twice (if it doesn't already do that). Done. > --- a/VEX/priv/host_s390_defs.h > +++ b/VEX/priv/host_s390_defs.h > > + S390_VEC_COUNT_LEADING_ZEROEZ, > + S390_VEC_COUNT_TRAILING_ZEROEZ, > > Really, _ZEROEZ and not _ZEROES ? Can we use the traditional > spelling, unless ZEROEZ is really what was intended? Hehe, these were cool names though ;-). Anyhow -- fixed. > --- a/VEX/priv/host_s390_isel.c > +++ b/VEX/priv/host_s390_isel.c > > +#define IRCONST_IS_EQUAL_U8(arg, val) \ > + ( ((arg)->tag == Iex_Const) && ((arg)->Iex.Const.con->Ico.U8 == (val)) ) > > This misses out (or assumes) the check that |arg| has type Ity_I8. > This might work, but it's somewhat dangerous in the case of > programming mistakes. Please change it to be > > ((arg)->tag == Iex_Const) > && ((arg)->Iex.Const.tag == Ico_U8) <--- the missing check > && ((arg)->Iex.Const.con->Ico.U8 == (val)) Done. > - s390_unop_t vec_op = 0; > + UInt vec_op = 0; > > Is it necessary to "weaken" the type of vec_op here? Reverted, and also removed the dubious initialization. The compiler doesn't complain, so the weakening probably wasn't necessary to please the compiler, and I haven't found any other reason. > + case Iop_CmpNEZ128x1: { > [...] > + } > > This is (I think!) correct but it's also very inefficient. We're > given a 128 bit value |arg| and we want to make |dst| be 0..(126)..0 > if |arg| is |all| zeroes and 1..(126)..1 if arg is all ones. > > Better would be like this > > IRExpr* low64 = IRExpr_Unop(Iop_V128to64, arg); > IRExpr* high64 = IRExpr_Unop(Iop_V128HIto64, arg); > IRExpr* both = IRExpr_Binop(Iop_Or64, low64, high64); > IRExpr* anyNonZ = IRExpr_Unop(Iop_CmpNEZ64, both); > IRExpr* anyNonZ64 = IRExpr_Unop(Iop_1Sto64, anyNonZ); > reg1 = s390_isel_int_expr(env, anyNonZ64); Done. > > Even then, it's pretty much a hack to create s390 code by first > generating *even more* IR and then handing that off to some other part > of the instruction selector. It also means that the tree for |arg| > will get instruction-selected twice. > > Much better would just be to generate the relevant s390 insns > directly. But fixing that is beyond the scope of this round of > reviewing/bug fixing. Yeah, I didn't spend time on that yet. > + case Iop_PwAddL8Ux16: { > [...] > + } > > ditto (re "pretty much a hack".) At least, please format to fit > inside 80 cols. Done. > Please write the arguments to IRConst_V128 here as 0x0000. This is > consistent with the style of other use of it and emphasises the fact > that the argument is a 16-bit value (one bit for each byte of a 16 > byte vector). Done. > + case Iop_PwAddL16Ux8: > [...] > > Please change to use the house style, not so verbose: > > + case Iop_PwAddL16Ux8: > + if (arg->tag == Iex_Unop && arg->Iex.Unop.op == Iop_PwAddL8Ux16) { > + size = 1; > + arg = arg->Iex.Unop.arg; > + } else { > + size = 2; > + } > + vec_op = S390_VEC_PWSUM_W; > + goto Iop_Pairwise_wrk; Done. -- You are receiving this mail because: You are watching all bug changes.