On Wed, Mar 18 2015, Denys Vlasenko wrote:
> Your code does four 16-bit stores.
> The version below does two 32-bit ones instead,
> and it is also marginally smaller.
>
> char *put_dec_full8(char *buf, unsigned r)
> {
> unsigned q;
> u32 v;
>
> /* 0 <= r < 10^8 */
>
On Wed, Mar 18, 2015 at 1:50 AM, Denys Vlasenko wrote:
> On Sat, Feb 21, 2015 at 12:51 AM, Rasmus Villemoes wrote:
>> The most expensive part of decimal conversion is the divisions by 10
>> (albeit done using reciprocal multiplication with appropriately chosen
>> constants). I decided to see if
On Sat, Feb 21, 2015 at 12:51 AM, Rasmus Villemoes wrote:
> The most expensive part of decimal conversion is the divisions by 10
> (albeit done using reciprocal multiplication with appropriately chosen
> constants). I decided to see if one could eliminate around half of
> these multiplications by
On Tue, Mar 10 2015, Tejun Heo wrote:
> Hello,
>
> On Tue, Mar 10, 2015 at 11:47:47AM +0100, Rasmus Villemoes wrote:
>> I can't explain why num_to_str apparently becomes slightly slower (the
>> patch essentially didn't touch it), but the put_dec_ helpers in any case
>> make up for that.
>
> Unrel
Hello,
On Tue, Mar 10, 2015 at 11:47:47AM +0100, Rasmus Villemoes wrote:
> I can't explain why num_to_str apparently becomes slightly slower (the
> patch essentially didn't touch it), but the put_dec_ helpers in any case
> make up for that.
Unrelated code changes affecting performance in seemingl
On Thu, Mar 05 2015, Rasmus Villemoes wrote:
> On Thu, Mar 05 2015, Tejun Heo wrote:
>
>> I'd like to see how this actually affects larger operations - sth
>> along the line of top consumes D% less CPU cycles w/ N processes - if
>> for nothing else, just to get the sense of scale,
>
> That makes
On Thu, Mar 05 2015, Tejun Heo wrote:
> On Thu, Mar 05, 2015 at 08:03:33AM -0800, Joe Perches wrote:
>> On Thu, 2015-03-05 at 16:22 +0100, Rasmus Villemoes wrote:
>>
>> > I'm assuming the underwhelming response means NAK.
>>
>> Dunno why you assume that, sometimes it just takes
>> awhile for peo
On Thu, Mar 05, 2015 at 08:03:33AM -0800, Joe Perches wrote:
> On Thu, 2015-03-05 at 16:22 +0100, Rasmus Villemoes wrote:
> > On Sat, Feb 21 2015, Rasmus Villemoes wrote:
> >
> > > [...] decimal conversion [...] it does indeed seem like there is
> > > something to be gained, especially on 64 bits
On Thu, 2015-03-05 at 16:22 +0100, Rasmus Villemoes wrote:
> On Sat, Feb 21 2015, Rasmus Villemoes wrote:
>
> > [...] decimal conversion [...] it does indeed seem like there is
> > something to be gained, especially on 64 bits.
> >
> > $ ./test64
> > Distribution Function Cyc
On Sat, Feb 21 2015, Rasmus Villemoes wrote:
> [...] decimal conversion [...] it does indeed seem like there is
> something to be gained, especially on 64 bits.
>
> $ ./test64
> Distribution Function Cycles/conv Conv/1 sec
> uniform([10, 2^64-1]) linux_put_dec 1
The most expensive part of decimal conversion is the divisions by 10
(albeit done using reciprocal multiplication with appropriately chosen
constants). I decided to see if one could eliminate around half of
these multiplications by emitting two digits at a time, at the cost of
a 200 byte lookup tab
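
To illustrate the idea described above, here is a minimal userspace sketch of two-digits-at-a-time conversion. It is not the code from the patch: the names digit_pairs, init_digit_pairs and u64_to_dec are hypothetical, and the kernel's put_dec helpers are organised differently. The sketch only shows why a 200-byte table (100 entries of two ASCII digits each) lets each division by 100 produce two output characters, so roughly half as many divisions are needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical 200-byte lookup table: entry i (0..99) holds the two
 * ASCII digits of i at offsets 2*i and 2*i + 1.
 */
static char digit_pairs[200];

/* Fill the table once before the first conversion. */
static void init_digit_pairs(void)
{
	for (int i = 0; i < 100; i++) {
		digit_pairs[2 * i]     = (char)('0' + i / 10);
		digit_pairs[2 * i + 1] = (char)('0' + i % 10);
	}
}

/*
 * Write the decimal representation of v into out as a NUL-terminated
 * string and return its length. out must have room for 21 bytes
 * (20 digits for the largest 64-bit value plus the terminator).
 */
static size_t u64_to_dec(uint64_t v, char *out)
{
	char tmp[20];
	char *p = tmp + sizeof(tmp);
	size_t len;

	/* One division by 100 yields two digits via the table. */
	while (v >= 100) {
		unsigned r = (unsigned)(v % 100);

		v /= 100;
		p -= 2;
		memcpy(p, &digit_pairs[2 * r], 2);
	}

	/* One or two leading digits remain; avoid a leading zero. */
	if (v >= 10) {
		p -= 2;
		memcpy(p, &digit_pairs[2 * v], 2);
	} else {
		*--p = (char)('0' + v);
	}

	len = (size_t)(tmp + sizeof(tmp) - p);
	memcpy(out, p, len);
	out[len] = '\0';
	return len;
}
```

A caller would run init_digit_pairs() once and then pass a buffer of at least 21 bytes to u64_to_dec(). Whether this is actually faster than one-digit-at-a-time conversion depends on the compiler, the CPU, and cache behaviour of the table, which is what the benchmarks in this thread set out to measure.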