Re: [Mesa-dev] [PATCH v3 4/4] nv50/ir: further optimize multiplication by immediates

2018-08-11 Thread Karol Herbst
Yeah, I was mainly commenting on the questionable performance gains. We can't just assume fewer instructions == more perf, as we don't really know what changing instructions actually means. And right, I wasn't really taking LoadPropagation into account, but it seems that at least nvidia prefers XM

Re: [Mesa-dev] [PATCH v3 4/4] nv50/ir: further optimize multiplication by immediates

2018-08-11 Thread Rhys Perry
It seems multiplication by negative powers of two is nonexistent in shader-db, so a specialized optimization for it would probably not be worth it. It seems my approach gives better instruction counts in shader-db than your approach, since it can generate shorter (for things like a * 7) an
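The "a * 7" case mentioned above can be illustrated with a small sketch. This is not code from the patch: the helper names are hypothetical, and it only shows the arithmetic identity a * (2^k - 1) == (a << k) - a on wrapping 32-bit integers, which is presumably the kind of shorter sequence meant here.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): multiply by 7 with one shift
// and one subtract, using a * 7 == (a << 3) - a modulo 2^32.
constexpr std::uint32_t mulBy7(std::uint32_t a)
{
    return (a << 3) - a;
}

// Hypothetical generalization: multiply by (2^k - 1), valid for 0 < k < 32.
// Unsigned arithmetic is used so that overflow wraps instead of being
// undefined behaviour.
constexpr std::uint32_t mulByPow2Minus1(std::uint32_t a, unsigned k)
{
    return (a << k) - a;
}
```

For example, mulByPow2Minus1(a, 4) computes a * 15. Whether such a sequence is actually faster than a single multiply on a given chipset is exactly the open question raised in this thread.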

Re: [Mesa-dev] [PATCH v3 4/4] nv50/ir: further optimize multiplication by immediates

2018-08-11 Thread Karol Herbst
I think we could do something else (which may even cover more cases): 1. try to use a shl (we already do that) 2. use shladd for all power-of-two negative immediates (are we already doing it? I think we miss a lot of opts where "worse" instructions could include modifier
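As a rough sketch of step 2: assuming a shift-and-add operation of the form (a << shift) + b that can negate its shifted operand, a multiplication by -(2^k) becomes a single operation. The shladd function below is only a software model written for illustration; its signature and the negate flag are assumptions and do not describe the real hardware instruction or the nv50/ir API.

```cpp
#include <cstdint>

// Software model of a shift-left-and-add: (a << shift) + b.
// The negA flag (an assumption for this sketch) negates the shifted term,
// standing in for an operand modifier on the instruction.
constexpr std::uint32_t shladd(std::uint32_t a, unsigned shift,
                               std::uint32_t b, bool negA = false)
{
    return (negA ? (0u - (a << shift)) : (a << shift)) + b;
}

// Hypothetical helper: a * -(2^k) as one shladd with a negated shifted
// operand and a zero addend. Valid for 0 <= k < 32, wrapping modulo 2^32.
constexpr std::uint32_t mulByNegPow2(std::uint32_t a, unsigned k)
{
    return shladd(a, k, 0u, true);
}
```

Here mulByNegPow2(a, 3) yields the 32-bit two's-complement representation of a * -8.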

[Mesa-dev] [PATCH v3 4/4] nv50/ir: further optimize multiplication by immediates

2018-07-23 Thread Rhys Perry
Strongly mitigates the harm from the previous commit, which made many integer multiplications much heavier on register and instruction counts.

total instructions in shared programs : 5839715 -> 5801926 (-0.65%)
total gprs used in shared programs: 670553 -> 669853 (-0.10%)
total shared us