On 9/22/15 22:45, Richard Henderson wrote:
> On 09/21/2015 10:54 PM, Chen Gang wrote:
>> On 2015-09-19 10:34, Richard Henderson wrote:
>>>
>>> There's a trick for this that's more efficient for 4 or more elements
>>> per vector (i.e. good for v2 and v1, but not v4):
>>>
>>>a + b = (a & 0x7f7f
On 09/21/2015 10:54 PM, Chen Gang wrote:
> On 2015-09-19 10:34, Richard Henderson wrote:
>>
>> There's a trick for this that's more efficient for 4 or more elements
>> per vector (i.e. good for v2 and v1, but not v4):
>>
>> a + b = ((a & 0x7f7f7f7f) + (b & 0x7f7f7f7f)) ^ ((a ^ b) & 0x80808080)
>
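For readers following along, here is a minimal sketch of the quoted addition trick applied to eight byte lanes in a 64-bit word. The function name swar_add_u8x8 is hypothetical, and the 64-bit masks are an assumed widening of the 32-bit constants quoted above; this is not code from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: lane-wise 8-bit add on a 64-bit word.
 * The masks are the quoted 32-bit constants widened to 64 bits (assumption). */
static uint64_t swar_add_u8x8(uint64_t a, uint64_t b)
{
    const uint64_t high = 0x8080808080808080ULL; /* top bit of every byte */
    const uint64_t low  = ~high;                 /* 0x7f7f7f7f7f7f7f7f */

    /* Adding only the low 7 bits of each lane cannot carry into the next byte;
     * bit 7 of each lane now holds the carry out of bit 6. */
    uint64_t sum = (a & low) + (b & low);

    /* XOR in the original top bits; the carry out of each lane is dropped. */
    return sum ^ ((a ^ b) & high);
}
```

The whole vector is handled with one add, three ANDs and two XORs, instead of one extract/add/insert per lane, which is where the saving for four or more lanes comes from.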
On 2015-09-19 10:34, Richard Henderson wrote:
>
> There's a trick for this that's more efficient for 4 or more elements
> per vector (i.e. good for v2 and v1, but not v4):
>
> a + b = ((a & 0x7f7f7f7f) + (b & 0x7f7f7f7f)) ^ ((a ^ b) & 0x80808080)
>
>a - b = (a | 0x80808080) - (b & 0x7f7f7
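The subtraction line is cut off in this archive. As an illustration only, the sketch below uses the commonly documented form of the lane-wise subtraction identity. The part after "(b & 0x7f7f..." is an assumption and is not recovered from the message. The function name swar_sub_u8x8 and the 64-bit masks are likewise hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: lane-wise 8-bit subtract on a 64-bit word.
 * Uses the well-known identity
 *   a - b = ((a | H) - (b & ~H)) ^ ((a ^ ~b) & H)
 * which is assumed here, not taken from the truncated message. */
static uint64_t swar_sub_u8x8(uint64_t a, uint64_t b)
{
    const uint64_t high = 0x8080808080808080ULL; /* top bit of every byte */
    const uint64_t low  = ~high;                 /* 0x7f7f7f7f7f7f7f7f */

    /* Forcing the top bit of each lane of a, and clearing it in b, means
     * no lane can borrow from its neighbour. */
    uint64_t diff = (a | high) - (b & low);

    /* Correct the top bit of each lane from the original operands. */
    return diff ^ ((a ^ ~b) & high);
}
```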
On 9/19/15 10:34, Richard Henderson wrote:
> On 09/18/2015 05:03 PM, gang.chen.5...@gmail.com wrote:
>> +uint64_t helper_v1add(uint64_t a, uint64_t b)
>> +{
>> +    uint64_t r = 0;
>> +    int i;
>> +
>> +    for (i = 0; i < 64; i += 8) {
>> +        int64_t ae = (int8_t)(a >> i);
>> +        int64
On 09/18/2015 05:03 PM, gang.chen.5...@gmail.com wrote:
+uint64_t helper_v1add(uint64_t a, uint64_t b)
+{
+    uint64_t r = 0;
+    int i;
+
+    for (i = 0; i < 64; i += 8) {
+        int64_t ae = (int8_t)(a >> i);
+        int64_t be = (int8_t)(b >> i);
+        r |= ((ae + be) & 0xff) << i;
+
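The patch excerpt stops before the loop closes. For reference, a self-contained per-byte loop of the same shape might look like the sketch below. The name v1add_reference is hypothetical, and the closing lines are assumed rather than taken from the patch. It uses unsigned lanes so that the final left shift never shifts a signed value into the sign bit.

```c
#include <stdint.h>

/* Hypothetical stand-in for a per-byte add helper; not the patch's code. */
static uint64_t v1add_reference(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    int i;

    for (i = 0; i < 64; i += 8) {
        /* Extract the byte lane that starts at bit i from each operand. */
        uint64_t ae = (a >> i) & 0xff;
        uint64_t be = (b >> i) & 0xff;

        /* Wrap the lane sum to 8 bits and put it back in position. */
        r |= ((ae + be) & 0xff) << i;
    }
    return r;
}
```

A loop like this is handy as a reference when checking a mask-based version, since each lane is computed independently.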
From: Chen Gang
Implemented by following the existing helper_v1shrs.
Signed-off-by: Chen Gang
---
target-tilegx/helper.h | 8 +
target-tilegx/simd_helper.c | 77 +
target-tilegx/translate.c | 26 +--
3 files changed, 109 insertions(+), 2 deletions(-)