On 28.05.19 15:03, Richard Henderson wrote:
> On 5/28/19 8:02 AM, David Hildenbrand wrote:
>> On 28.05.19 14:55, Richard Henderson wrote:
>>> On 5/24/19 4:33 AM, David Hildenbrand wrote:
>>>> +    /* identify the smaller element */
>>>> +    if (first_inequal < 16) {
>>>> +        uint8_t enr = first_inequal / (1 << es);
>>>> +        uint32_t a = s390_vec_read_element(v2, enr, es);
>>>> +        uint32_t b = s390_vec_read_element(v3, enr, es);
>>>> +
>>>> +        smaller = a < b;
>>>> +    }
>>>> +
>>>> +    if (zs) {
>>>> +        z0 = zero_search(a0, mask);
>>>> +        z1 = zero_search(a1, mask);
>>>> +        first_zero = match_index(z0, z1);
>>>> +    }
>>>> +
>>>> +    s390_vec_write_element64(v1, 0, MIN(first_inequal, first_zero));
>>>> +    s390_vec_write_element64(v1, 1, 0);
>>>> +    if (first_zero == 16 && first_inequal == 16) {
>>>> +        return 3;
>>>> +    } else if (first_zero < first_inequal) {
>>>> +        return 0;
>>>> +    }
>>>> +    return smaller ? 1 : 2;
>>>
>>> Perhaps move the computation of smaller down here where it is used.
>>
>> Wanted to do that but then I realized that I would have to move
>> s390_vec_write_element64() as well, because v1 and v2/v3 could overlap.
>
> Oh, yes of course.  R-B without any changes.  ;-)
>

Thanks Richard, will send a pull request to Conny for this part soon.
I'll start getting the vector floating-point instruction into shape this
week. So don't start to relax ;)

Cheers!

>
> r~
>

-- 

Thanks,

David / dhildenb
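
The overlap concern from the quoted exchange (v1 may be the same register
as v2 or v3) can be illustrated with a small standalone sketch. This is not
the QEMU helper: the Vec type and the functions first_inequal_byte(),
write_result(), smaller_read_first() and smaller_write_first() are
hypothetical names made up for this example, and it only handles byte-sized
elements. It just shows why the operands have to be read before the
destination is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector register (16 byte elements). */
typedef struct {
    uint8_t b[16];
} Vec;

/* Index of the first differing byte, or 16 if both vectors are equal. */
static int first_inequal_byte(const Vec *x, const Vec *y)
{
    for (int i = 0; i < 16; i++) {
        if (x->b[i] != y->b[i]) {
            return i;
        }
    }
    return 16;
}

/* Zero the destination and store the index in byte 7 (assumed layout). */
static void write_result(Vec *d, int idx)
{
    assert(idx >= 0 && idx <= 16);
    memset(d, 0, sizeof(*d));
    d->b[7] = (uint8_t)idx;
}

/* Safe order: compare the operands first, then write the destination. */
static bool smaller_read_first(Vec *v1, const Vec *v2, const Vec *v3)
{
    int idx = first_inequal_byte(v2, v3);
    bool smaller = idx < 16 && v2->b[idx] < v3->b[idx];

    write_result(v1, idx);
    return smaller;
}

/* Unsafe order: if v1 aliases v2 or v3, the operand is already clobbered. */
static bool smaller_write_first(Vec *v1, const Vec *v2, const Vec *v3)
{
    int idx = first_inequal_byte(v2, v3);

    write_result(v1, idx);
    return idx < 16 && v2->b[idx] < v3->b[idx];
}
```

With v1 and v2 pointing at the same Vec, the second variant compares a
zeroed byte instead of the original element, so the two functions can
disagree. Moving the comparison below the write would therefore only be
safe if the write moved down as well.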