How long a story to tell(?)... I have a procedural random generator that uses the result of sha2 as a stream of bits, generating a fresh sha2 hash once all 256 bits are consumed. I started using it for Perlin noise generation and thought I'd found that sha2 was the culprit consuming most of the time. Since I only need something generally random, I went looking for alternative, lightweight RNGs. After digging for a while I stumbled on PCG (http://www.pcg-random.org/using-pcg.html). It's basically a header-only library, since it tries to generate everything as inline functions...
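For context, the minimal pcg32 generator is just a 64-bit multiply-add on the state plus a small output permutation. Roughly this, sketched in JavaScript with BigInt standing in for the 64-bit math (the gist does the 64-bit math with 32-bit pieces instead, so this shows the algorithm rather than the ported code):

    const PCG_MULT = 6364136223846793005n;   // the LCG multiplier from pcg_basic.c
    const MASK64   = (1n << 64n) - 1n;

    // rng is { state: BigInt, inc: BigInt }; inc must be odd.
    function pcg32_random( rng ) {
        const oldstate = rng.state;
        rng.state = (oldstate * PCG_MULT + rng.inc) & MASK64;
        // xorshift the state down, then rotate by its top 5 bits (the output permutation)
        const xorshifted = Number( (((oldstate >> 18n) ^ oldstate) >> 27n) & 0xFFFFFFFFn );
        const rot = Number( oldstate >> 59n );
        return ( (xorshifted >>> rot) | (xorshifted << ((-rot) & 31)) ) >>> 0;
    }

    // seeding, same shape as pcg32_srandom_r()
    function pcg32_seed( rng, initstate, initseq ) {
        rng.state = 0n;
        rng.inc = ((initseq << 1n) | 1n) & MASK64;
        pcg32_random( rng );
        rng.state = (rng.state + initstate) & MASK64;
        pcg32_random( rng );
    }

    // usage
    const rng = { state: 0n, inc: 0n };
    pcg32_seed( rng, 42n, 54n );
    console.log( pcg32_random( rng ).toString( 16 ) );  // an unsigned 32-bit value

In C this boils down to a 64-bit multiply, an add, a couple of shift/xors and a rotate per 32 bits of output, which is why it's so much cheaper than re-hashing with sha2.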
I made this JS port of it: https://gist.github.com/d3x0r/345b256be6569c0086c328a8d1b4be01
This is the first revision: https://gist.github.com/d3x0r/345b256be6569c0086c328a8d1b4be01/fffa8e906d5723e66f7e9baa950b3b3d5b4895c7 ; it follows the flow of the C code more closely. The current version is fast, generating 115k bits per millisecond (vs 9.3k bits/ms for sha2); however, compared to the C version, which generates 1.1M bits per millisecond, it's a factor of 10 off. And the routine is generally only doing 64-bit integer math (though the C test was compiled in 32-bit mode, so it was really just 32-bit registers emulating 64-bit). If I just change the arrays created in getState() (the first function) to Uint32Array(), it runs MUCH slower...

----

As I write this I've been updating things, and some of my numbers from before are a factor of 8 off because I was counting bytes, not bits; except for sha2, which really is that slow... But I would like to take this opportunity to say: crypto.subtle.digest("SHA-256", buffer).then(hash=>hash); gives the same output type as the JavaScript version I'm using (forked from a fork of the forge library and consolidated to just the one return type...), but is another 10x slower than my JavaScript SHA-256. I keep thinking, "Oh, I'll just compile this, maybe even use the Intel-accelerated SHA instructions (sha256msg1/sha256msg2) to make the C version 8x faster than it is in straight C (which itself was already faster than the JS version), and hook it into a..." (oh wait, I want to do this on a webpage! Can't use a Node addon there...). Well... back to optimizing.

----

I was also working on a simple test case to show where using a simple Array vs a typed array causes the speed difference, but it's not immediately obvious what I'm doing that causes it to deoptimize... so I'll keep building that up until it breaks, or conversely strip the other one down until it speeds up.

https://github.com/d3x0r/-/blob/master/org.d3x0r.common/salty_random_generator.js#L86
This is getting the bits from a typed array, and it's really not that complex (especially when getting only 1 bit at a time, which is what I was last speed-testing with). But it turns out all the time is really here: swapping sha2 for PCG (without typed arrays) dropped that part from 150ms to 50ms, but the remainder was still 3500ms... so I misread the initial performance graph, I guess.

There's a stack of what were C macros to make the whole thing more readable:
https://github.com/d3x0r/-/blob/master/org.d3x0r.common/salty_random_generator.js#L25
If I inline these by hand there's no improvement, so I guess they're all small enough to qualify for automatic inlining anyway.

The version currently on GitHub ended up creating a new Uint32Array(1) for every result; locally I moved that out so a single buffer is reused for the result, which sped up initialization from 700ms to 200ms (cumulative times). But something like 80% of the time is still in the remainder of the getBuffer routine; maybe I need to move things out of the Uint8Arrays (the data from sha2/PCG).
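To sketch the shape of that hot path: reading n bits at a time out of a stream of 32-bit words, with one reused scratch Uint32Array instead of a new Uint32Array(1) per result, looks roughly like this. This is a generic sketch, not the code at the link above; refill and the local names are placeholders.

    // Pull n bits (1..32) at a time out of a stream of 32-bit words,
    // reusing one scratch Uint32Array instead of allocating per result.
    // "refill" stands in for whatever produces the next block of random
    // words (a PCG output, a sha256 digest, ...).
    function makeBitReader( refill /* () => Uint32Array */ ) {
        let words = refill();                 // current block of 32-bit words
        let wordIndex = 0;                    // which word we're consuming
        let bitIndex = 0;                     // bits of that word already used
        const scratch = new Uint32Array( 1 ); // single reused result buffer

        return function getBits( n ) {        // n in 1..32
            let result = 0, got = 0;
            while( got < n ) {
                if( wordIndex >= words.length ) {
                    words = refill();
                    wordIndex = 0;
                    bitIndex = 0;
                }
                const take = Math.min( n - got, 32 - bitIndex );
                const mask = take === 32 ? 0xFFFFFFFF : ( 1 << take ) - 1;
                const chunk = ( words[wordIndex] >>> bitIndex ) & mask;
                result |= chunk << got;       // low bits first
                got += take;
                bitIndex += take;
                if( bitIndex === 32 ) { wordIndex++; bitIndex = 0; }
            }
            scratch[0] = result;              // store through the typed array...
            return scratch[0];                // ...so the result comes back unsigned
        };
    }

    // e.g. in a browser:
    //   const getBits = makeBitReader( () => crypto.getRandomValues( new Uint32Array( 8 ) ) );
    //   getBits( 13 );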
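And on the simple-Array-vs-typed-array test case: a bare-bones harness to build that up from might look like this (just a sketch; fillAndSum is a made-up stand-in for the real workload):

    // Compare the same fill/sum workload over a plain Array and a Uint32Array.
    function fillAndSum( buf, len ) {
        let sum = 0;
        for( let i = 0; i < len; i++ ) buf[i] = ( i * 2654435761 ) >>> 0;  // cheap pseudo-random fill
        for( let i = 0; i < len; i++ ) sum = ( sum + buf[i] ) >>> 0;
        return sum;
    }

    function time( label, buf, len, iterations ) {
        for( let i = 0; i < 10; i++ ) fillAndSum( buf, len );   // warm-up so it can optimize
        const start = Date.now();
        let check = 0;
        for( let i = 0; i < iterations; i++ ) check = fillAndSum( buf, len );
        console.log( label, Date.now() - start, "ms  (check:", check.toString( 16 ), ")" );
    }

    const LEN = 1 << 16;
    time( "Array       ", new Array( LEN ).fill( 0 ), LEN, 200 );
    time( "Uint32Array ", new Uint32Array( LEN ),     LEN, 200 );

One caveat: feeding both container types through the same function makes its element accesses polymorphic, which is itself one of the things that can skew this kind of comparison; duplicating the workload per type gives a cleaner baseline.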