On 28/09/2019 18:45, Aleksandar Markovic wrote: Hi Aleksandar,
Thanks for taking a look at this! > Mark and Paul (and Stefan), > > Thanks for spotting this and pinpointing the culprit commit. I guess Stefan > is going > to respond soon, but, in the meantime, I took a look at the commit in > question: > > https://github.com/qemu/qemu/commit/4e6d0920e7547e6af4bbac5ffe9adfe6ea621822 > > I don't have at the moment any dev/test environment handy, but I did manual > inspection of the code, and here is what I found (in order of importance, > perceived > by me): > > 1. The code will not work correctly if the shift ammount (variable 'sh') is > 0. This > is because, in that case, one of succeeding invocations of TCG shift > functions will > be required to shift a 64-bit TCG variable by 64 bits, and the result of such > TCG > operation is undefined (shift amount must be 63 or less) - see > https://github.com/qemu/qemu/blob/master/tcg/README. Yes I think you're right here - the old helper got around this by doing an explicit copy from a to r if the shift value is zero. In fact the case that Paul reported is exactly this: vsl VRT, VRA, VRB => 0x100006e0 <vec_slq+132>: vsl v0,v0,v1 (gdb) p $vr0.uint128 $21 = 0x10111213141516172021222324252650 (gdb) p $vr1.uint128 $22 = 0x0 (gdb) stepi 0x00000000100006e4 in vec_slq () 1: x/i $pc => 0x100006e4 <vec_slq+136>: xxlor vs0,vs32,vs32 (gdb) p $vr0.uint128 $23 = 0x10111213141516172021222324252650 I guess the solution is check for sh == 0 and if this is the case, execute a copy instead. > 2. Variable naming is better in the old helper than in the new translator. In > that > light, I would advise Stefan to change 'sh' to 'shift', and 'shifted' to > 'carry'. It looks like the name "sh" comes from the ISA documentation, so whilst it's a little tricky to compare with the previous implementation it does make sense when comparing with the algorithm shown there. Note: this implementation also drops the check for each byte of VRB having the same shift value - should we care about this? > 3. Lines > > tcg_gen_andi_i64(shifted, shifted, 0x7fULL); > > and > > tcg_gen_andi_i64(shifted, shifted, 0xfe00000000000000ULL); > > appear to be spurious (albait in a harmless way). Therefore, they should be > deleted, > or, alternatevely, a justification for them should be provided. I'm not sure why they are needed either - there's certainly no mention of it in the ISA documentation. Stefan? > 4. In the commit message, variable names were used without quotation mark, > resulting > in puzzling and unclear wording. > > 5. (a question for Mark) After all recent changes, does get_avr64(..., ..., > true) > always (for any endian configuration) return the "high" half of an Altivec > register, > and get_avr64(..., ..., false) the "low" one? Yes - the new functions always return the MSB (high) and LSB (low) correctly regardless of host endian. > Given all these circumstances, perhaps the most reasonable solution would be > to > revert the commit in question, and allow Stefan enough dev and test time to > hopefully > submit a new, better, version later on. Given that it has been broken for 3 months now, I don't think we're in any major rush to revert ASAP. I'd prefer to give Stefan a bit more time first since he does report some substantial speed improvements from these new implementations. ATB, Mark.