On Thu, Nov 28, 2019 at 2:08 AM Konstantin Knizhnik
<k.knizh...@postgrespro.ru> wrote:
> calls float4_accum for each row of T, the same query in VOPS will call
> vops_float4_avg_accumulate for each tile which contains 64 elements.
> So vops_float4_avg_accumulate is called 64 times less often than
> float4_accum. Inside, it contains a straightforward loop:
>
>              for (i = 0; i < TILE_SIZE; i++) {
>                  sum += opd->payload[i];
>              }
>
> which can be optimized by the compiler (loop unrolling, use of SIMD
> instructions, ...).

Part of the reason why the compiler can optimize that so well is
probably related to the fact that it includes no overflow checks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

