Re: [v8-users] V8 optimizations causing spikes in execution times

Jakob Kummerow Fri, 18 Jan 2019 12:58:35 -0800

Answering what I can:

I changed my code from using a "while(true)" loop to calling "setImmediate"
> every once in a while so that it would yield to the event loop, that way
> the GC shouldn't interrupt a turn while it's executing. I didn't notice a
> difference doing this.



Not seeing a difference there makes sense; the GC doesn't care about
yielding to the event loop.

apparently 137 ms were spent optimizing (20 ms to prepare, 115 ms to
> execute, and 2 ms to finalize).


That means 20ms were spent on the main thread some time in the past, 115 ms
in the background, and then 2 ms on the main thread just now (so those 2 ms
would be part of the 37 ms of that turn).

Running that same test with "--minimal"


You mean V8's --minimal flag? That is absolutely not what you want. It's a
development tool that turns off everything except the bare minimum: no more
type feedback, no more optimizing compiler, >100x performance hit. (And
FWIW, that flag has recently been removed.)

 I tried "--predictable" without seeing any differences


Yes, --predictable is another development/debugging tool that is not useful
for production purposes. It turns off all sources of randomness, which
includes all threading. You would only see worse performance and longer
main-thread stalls from this.

Are there any other V8 flags that may be of help here?


The default configuration is the recommended configuration for best
performance.

Is it possible to choose when to run optimizations?


Effectively no. (There is the special intrinsic
%OptimizeFunctionOnNextCall, which does what the name suggests, but
optimization is only useful if the right type feedback is available, so
manually choosing when to optimize a given function only makes sense for
tests where you know exactly which function you want to have optimized
when; I can't imagine it being useful for your use case: especially for
user-provided code, there is no way to tell when it is ready for
optimization, and if you manually optimize it too early, it'll just get
deoptimized right away, so the manually triggered optimization would be a
waste of time and only introduce more/longer delays.)

Based on the data you've provided, I'm not convinced that optimizations are
the reason for those long turns, but I also don't have a good guess for
what else it might be.

If it were optimizations, then warmup would probably be the best
mitigation, but you report that warmup doesn't help, so it's probably
something else. The same reasoning speaks against the other
frequently-given advice, which is to avoid deoptimizations at least in the
code you control (while deopts themselves are fast, a deoptimized function
might get optimized again later, and that might take time).

One thing you could try is wrapping the whole thing into a website so you
can use Chrome DevTools' profiling facilities. Maybe that'll make it easier
to drill down into those rarely-occurring delays.


On Fri, Jan 18, 2019 at 12:19 PM Adam Damiano <onaim...@gmail.com> wrote:

> I've tried to be as descriptive as possible below without covering
> irrelevant details. The summary of the problem is that I think V8
> optimizations may be causing unpredictable processing spikes in my game
> that I need to be able to either defer or measure.
>
> Background
>
>    - The code is closed-source, and there are a *lot* of moving parts, so
>    I'm wondering mostly about theory here and not necessarily implementation.
>    More on this in the "Questions" section at the bottom.
>    - I'm making a game where players write code for bots that battle each
>    other. The game is turn-based, and each match can take thousands of turns.
>    All of these turns get simulated at once in a NodeJS process (v10.15)
>    running in a 1-vCPU Alpine container on AWS. After a simulation is done,
>    the battle is essentially replayed on the client without any input from the
>    player.
>    - Each simulation is CPU-intensive given that it's just running
>    player-defined code. There is no network or disk utilization during this
>    time. The most important performance metrics for me are:
>       - How long (in milliseconds) does an average turn take to simulate?
>       - How long does the longest turn take to simulate?
>       - The average is important so that I know how many turns to limit
>    matches to. The measured average over tens of thousands of real matches was
>    about 0.3 ms/turn, so I chose 3000 turns for the limit since 0.3 ms/turn *
>    3000 turns is still less than a second (which I consider to be a reasonable
>    amount of time for players to have to wait for a result).
>    - The longest turn is important so that I can figure out when to stop
>    executing scripts that will probably never complete on their own. For
>    example, if someone writes an infinite loop, then I need to cut the
>    execution off *somewhere*. I chose 50ms for this.
>    - Code is executed using "eval" (I know about the security risks and
>    have hopefully mitigated any attacks, but I want this issue to stay focused
>    on performance rather than security). Here is a simplified version of the
>    steps that I run, but the summary is that I "eval" a user-defined function
>    once, then I can call that function multiple times per match:
>       - Give each bot a "namespace" for the user to put their code into:
>       global.bot1 = {};
>       global.bot2 = {};
>       ...
>       global.botN = {};
>
>
>    - Get code from the user for a particular bot:
>       const exampleUserDefinedScript = `update = function() {
>           fireLasers();
>       };`;
>
>
>    - Transform it slightly to be able to place it into one of the
>       namespaces:
>       const exampleUserDefinedScript = `global.bot1.update = function() {
>           fireLasers();
>       };`;
>
>
>    - Evaluate that to put it into my global scope:
>
> eval(exampleUserDefinedScript);
>
>
>    - Simulate a match in a loop using those global functions:
>
> while (true) {
>     const bot = findNextBotToUpdate();
>     global[bot].update(); // note: this is what is timed to give me my
> performance metrics
>     if (gameEnded()) break;
> }
>
>
>    - Free references to the global functions:
>
> global.bot1 = null;
> global.bot2 = null;
> // etc.
>
>
> Problem
> The problem I'm running into is that individual turns sometimes take
> longer than 50ms. I investigated heavily (see next section) and I think
> that it's due to V8 optimizing on the main thread.
>
> Investigation
>
>    - As mentioned, this is running in a container with no other
>    processes, so the only threads that could be interrupting would be from the
>    OS or Node itself.
>    - The environment that I'm running in doesn't seem to matter too much.
>    Running on AWS in a 1-vCPU container is my production environment, but even
>    in development, I see turns that take a disproportionately long time.
>    - I ran the Node profiler
>    <https://nodejs.org/en/docs/guides/simple-profiling/>, but nothing
>    stood out to me. Then again, I could just be interpreting the results
>    incorrectly. I get something like this:
>
>     Statistical profiling result from converted, (6468 ticks, 126
> unaccounted, 0 excluded).
>     [Summary]:
>       ticks  total  nonlib   name
>       3184   49.2%   96.2%  JavaScript
>          0    0.0%    0.0%  C++
>        314    4.9%    9.5%  GC
>       3158   48.8%          Shared libraries
>        126    1.9%          Unaccounted
>
>
>    - When I run with "--trace-gc", I see that the garbage collector is
>    mostly running scavenges that take 0.0 ms. Regardless, I changed my code
>    from using a "while(true)" loop to calling "setImmediate" every once in a
>    while so that it would yield to the event loop, that way the GC shouldn't
>    interrupt a turn while it's executing. I didn't notice a difference doing
>    this.
>    - Running with --trace-deopt doesn't show any deoptimizations taking a
>    long time.
>    - Running with --trace-opt shows some optimizations taking a long
>    time, but I don't fully understand the results. For example, on one turn
>    that took 37.33 ms (which was 9 times the average for that match), 1 ms was
>    spent deoptimizating, and apparently 137 ms were spent optimizing (20 ms to
>    prepare, 115 ms to execute, and 2 ms to finalize).
>    - I ended up working perfhooks.performance.now() calls all over my
>    code: one set was placed around all of the code that I execute on behalf of
>    the user, and one set was placed into individual function calls that are
>    called by the user. What I saw was that the time spent in individual
>    function calls didn't add up to the total amount of time that I measured.
>    To me, this indicated that the time spent was not a direct result of my
>    JavaScript code, but rather an indirect result (e.g. optimization).
>    - Running that same test with "--minimal" would not reproduce those
>    results. Instead, the sum of the individual times would indeed add up to
>    the total time measured.
>    - I tried "warming up" the optimizations by running ~200 random
>    simulated matches at startup time, but I still ran into long-running turns.
>    This might have been more effective if instead of randomly simulating, I'd
>    exhaustively simulated such that every line of code was being hit.
>
>
> Questions
>
>    - Is it possible to choose when to run optimizations? For example, I
>    could have them run in between turns, that way I don't count it against a
>    player when optimizations take a while. I realize that it would mean that
>    I'd be running unoptimized code for a turn.
>    - Is it possible to tell exactly how long was spent optimizing from
>    within JavaScript? E.g.
>
> const startTime = Date.now();
> runSomeCode();
> const elapsedTime = Date.now() - startTime;
> const timeSpentOptimizing = aV8FunctionThatMayOrMayNotExist(); // can I do
> something for this?
> const actualElapsedTime = elapsedTime - timeSpentOptimizing;
>
>
>    - Are there any other V8 flags that may be of help here? I tried
>    "--predictable" without seeing any differences, but maybe there's something
>    I'm missing.
>
>
> Thanks for reading through this large blob of text!
> -Adam
>
> --
> --
> v8-users mailing list
> v8-users@googlegroups.com
> http://groups.google.com/group/v8-users
> ---
> You received this message because you are subscribed to the Google Groups
> "v8-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to v8-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

-- 
-- 
v8-users mailing list
v8-users@googlegroups.com
http://groups.google.com/group/v8-users
--- 
You received this message because you are subscribed to the Google Groups 
"v8-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to v8-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: [v8-users] V8 optimizations causing spikes in execution times

Reply via email to