Unfortunately, I didn't see any visible performance gain on x86. :(
The difference is within measurement error.
That's probably because reading from the L1 data cache is very cheap.

Thanks. Dmitry.

On Mon, Dec 3, 2012 at 7:33 PM, Nikita Popov <nikita....@gmail.com> wrote:

> On Mon, Dec 3, 2012 at 10:35 AM, Dmitry Stogov <dmi...@zend.com> wrote:
>
>> The new proposed patch: http://pastebin.com/pj5fQTfN
>>
>> Now both execute_data->Ts and execute_data->CVs are removed, and the
>> corresponding temporary and compiled variables are accessed using
>> "execute_data" as the base pointer. Temporary variables are allocated
>> directly before "execute_data" in reverse order, and compiled variables
>> right after it. So they can be accessed without any additional computations. The
>> patch reduces the number of executed instructions and number of memory
>> reads (about 8% less).
>>
>
> Did you also test how much these 8% less memory reads improve performance?
> Would be interesting to know :)
>
> Nikita
>
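
For reference, the layout described in the quoted patch summary (temporaries
directly before the base pointer in reverse order, compiled variables right
after it) can be sketched roughly as below. This is only an illustration and
is not taken from the patch: frame_t, slot_t and the frame_* helpers are
hypothetical stand-ins for execute_data and the engine's real variable types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, NOT the real Zend engine structures. */
typedef union { long long i; void *p; } slot_t;  /* one variable slot */
typedef struct {
    size_t n_tmp;  /* temporaries stored before this header */
    size_t n_cv;   /* compiled variables stored after this header */
} frame_t;         /* plays the role of execute_data */

/* The layout only works if header and slots keep each other aligned. */
_Static_assert(sizeof(slot_t) % _Alignof(frame_t) == 0, "header misaligned");
_Static_assert(sizeof(frame_t) % _Alignof(slot_t) == 0, "CV slots misaligned");

/* One allocation: [tmp n-1 ... tmp 0][frame header][cv 0 ... cv m-1] */
static frame_t *frame_alloc(size_t n_tmp, size_t n_cv) {
    size_t bytes = (n_tmp + n_cv) * sizeof(slot_t) + sizeof(frame_t);
    char *base = calloc(1, bytes);
    if (base == NULL) {
        return NULL;
    }
    frame_t *ex = (frame_t *)(base + n_tmp * sizeof(slot_t));
    ex->n_tmp = n_tmp;
    ex->n_cv = n_cv;
    return ex;
}

/* Temporary i sits i+1 slots below the base pointer (reverse order). */
static slot_t *frame_tmp(frame_t *ex, size_t i) {
    return (slot_t *)ex - (i + 1);
}

/* Compiled variable i sits i slots past the end of the header. */
static slot_t *frame_cv(frame_t *ex, size_t i) {
    return (slot_t *)(ex + 1) + i;
}

/* The allocation starts n_tmp slots below the base pointer. */
static void frame_free(frame_t *ex) {
    if (ex != NULL) {
        free((slot_t *)ex - ex->n_tmp);
    }
}
```

With this shape, both kinds of variables are reached at a fixed offset from
the single base pointer, so no separate Ts or CVs pointer has to be loaded
first. That is the saving in memory reads the summary refers to.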
