I've tried to be as descriptive as possible below without covering 
irrelevant details. In short, I suspect that V8 optimizations are causing 
unpredictable processing spikes in my game, and I need a way to either 
defer them or measure them.

Background

   - The code is closed-source, and there are a *lot* of moving parts, so 
   I'm wondering mostly about theory here and not necessarily implementation. 
   More on this in the "Questions" section at the bottom.
   - I'm making a game where players write code for bots that battle each 
   other. The game is turn-based, and each match can take thousands of turns. 
   All of these turns get simulated at once in a NodeJS process (v10.15) 
   running in a 1-vCPU Alpine container on AWS. After a simulation is done, 
   the battle is essentially replayed on the client without any input from the 
   player.
   - Each simulation is CPU-intensive given that it's just running 
   player-defined code. There is no network or disk utilization during this 
   time. The most important performance metrics for me are:
      - How long (in milliseconds) does an average turn take to simulate?
      - How long does the longest turn take to simulate?
      - The average is important so that I know how many turns to limit 
   matches to. The measured average over tens of thousands of real matches was 
   about 0.3 ms/turn, so I chose 3000 turns for the limit since 0.3 ms/turn * 
   3000 turns is still less than a second (which I consider to be a reasonable 
   amount of time for players to have to wait for a result).
   - The longest turn is important so that I can figure out when to stop 
   executing scripts that will probably never complete on their own. For 
   example, if someone writes an infinite loop, then I need to cut the 
   execution off *somewhere*. I chose 50ms for this.
   - Code is executed using "eval" (I know about the security risks and 
   have hopefully mitigated any attacks, but I want this issue to stay focused 
   on performance rather than security). Here is a simplified version of the 
   steps that I run, but the summary is that I "eval" a user-defined function 
   once, then I can call that function multiple times per match:
      - Give each bot a "namespace" for the user to put their code into:
      global.bot1 = {};
      global.bot2 = {};
      ...
      global.botN = {};
      

   - Get code from the user for a particular bot:
      const exampleUserDefinedScript = `update = function() {
          fireLasers();
      };`;
      

   - Transform it slightly to be able to place it into one of the 
      namespaces:
      const exampleUserDefinedScript = `global.bot1.update = function() {
          fireLasers();
      };`;
      

   - Evaluate that to put it into my global scope:
   
eval(exampleUserDefinedScript);


   - Simulate a match in a loop using those global functions:
   
while (true) {
    const bot = findNextBotToUpdate();
    global[bot].update(); // note: this is what I time to get my performance
                          // metrics (see the timing sketch after these steps)
    if (gameEnded()) break;
}


   - Free references to the global functions:
   
global.bot1 = null;
global.bot2 = null;
// etc.
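
For concreteness, here is a stripped-down sketch of how the timing fits 
together. The bookkeeping below is illustrative; findNextBotToUpdate, 
gameEnded, and the 3000-turn limit are the same as described above, and 
performance.now comes from Node's perf_hooks module.

const { performance } = require('perf_hooks');

let totalMs = 0;
let longestMs = 0;
let turns = 0;

while (true) {
    const bot = findNextBotToUpdate();

    const start = performance.now();
    global[bot].update();                     // the player-defined code
    const elapsed = performance.now() - start;

    totalMs += elapsed;
    longestMs = Math.max(longestMs, elapsed); // the "longest turn" metric
    turns += 1;

    if (gameEnded() || turns >= 3000) break;
}

console.log(`average: ${(totalMs / turns).toFixed(2)} ms/turn; ` +
            `longest: ${longestMs.toFixed(2)} ms`);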


Problem
The problem I'm running into is that individual turns sometimes take longer 
than 50ms. I investigated heavily (see next section) and I think that it's 
due to V8 optimizing on the main thread.

Investigation

   - As mentioned, this is running in a container with no other processes, 
   so the only threads that could be interrupting would be from the OS or Node 
   itself.
   - The environment that I'm running in doesn't seem to matter too much. 
   Running on AWS in a 1-vCPU container is my production environment, but even 
   in development, I see turns that take a disproportionately long time.
   - I ran the Node profiler 
   <https://nodejs.org/en/docs/guides/simple-profiling/>, but nothing stood 
   out to me. Then again, I could just be interpreting the results 
   incorrectly. I get something like this:

    Statistical profiling result from converted, (6468 ticks, 126 unaccounted, 0 excluded).
    [Summary]:
      ticks  total  nonlib   name
      3184   49.2%   96.2%  JavaScript
         0    0.0%    0.0%  C++
       314    4.9%    9.5%  GC
      3158   48.8%          Shared libraries
       126    1.9%          Unaccounted


   - When I run with "--trace-gc", I see that the garbage collector is 
   mostly running scavenges that take 0.0 ms. Regardless, I changed my code 
   from a "while (true)" loop to one that calls "setImmediate" every so 
   often so that it yields to the event loop, the idea being that the GC 
   shouldn't interrupt a turn mid-execution (see the first sketch after 
   this list). It made no noticeable difference.
   - Running with --trace-deopt doesn't show any deoptimizations taking a 
   long time.
   - Running with --trace-opt shows some optimizations taking a long time, 
   but I don't fully understand the results. For example, on one turn that 
   took 37.33 ms (which was 9 times the average for that match), 1 ms was 
   spent deoptimizing, and apparently 137 ms were spent optimizing (20 ms to 
   prepare, 115 ms to execute, and 2 ms to finalize).
   - I ended up sprinkling perf_hooks.performance.now() calls all over my 
   code: one pair was placed around all of the code that I execute on behalf 
   of the user, and another set was placed inside the individual functions 
   that the user's code calls (see the second sketch after this list). What 
   I saw was that the time spent in the individual function calls didn't add 
   up to the total time measured around the outside. To me, this indicated 
   that the extra time was not a direct result of my JavaScript code, but 
   rather an indirect result (e.g. optimization).
   - Running that same test with "--minimal" would not reproduce those 
   results. Instead, the sum of the individual times would indeed add up to 
   the total time measured.
   - I tried "warming up" the optimizations by running ~200 random 
   simulated matches at startup time, but I still ran into long-running turns. 
   This might have been more effective if, instead of simulating randomly, I'd 
   simulated exhaustively so that every line of code was hit.
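
To be concrete about the setImmediate change: the loop now looks roughly 
like the sketch below. simulateOneTurn stands in for the body of the 
original "while (true)" loop, and the batch size of 100 is arbitrary.

function runTurnBatch(resolve) {
    // Run a chunk of turns synchronously...
    for (let i = 0; i < 100 && !gameEnded(); i++) {
        simulateOneTurn();
    }
    if (gameEnded()) {
        resolve();
    } else {
        // ...then yield to the event loop so GC (or anything else Node
        // needs to do) can run between turns instead of during one.
        setImmediate(() => runTurnBatch(resolve));
    }
}

function simulateMatch() {
    return new Promise(runTurnBatch);
}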
   
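Here is the shape of that nested measurement, simplified: one pair of 
timestamps around the whole user call (inside the match loop shown 
earlier), and one pair inside each engine function that the user's code 
can call.

const { performance } = require('perf_hooks');

let innerMs = 0; // time accounted for inside my own engine functions

function fireLasers() {
    const start = performance.now();
    // ... actual game logic ...
    innerMs += performance.now() - start;
}

// Inside the match loop, around the user's code:
innerMs = 0;
const outerStart = performance.now();
global[bot].update();
const outerMs = performance.now() - outerStart;

// With the default JIT, outerMs is sometimes much larger than innerMs;
// with "--minimal", the two add up as expected.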

Questions

   - Is it possible to choose when to run optimizations? For example, I 
   could have them run in between turns so that optimization time doesn't 
   count against a player (roughly the shape sketched after these 
   questions). I realize that this would mean running unoptimized code for 
   a turn.
   - Is it possible to tell, from within JavaScript, exactly how much time 
   was spent optimizing? E.g.

const startTime = Date.now();
runSomeCode();
const elapsedTime = Date.now() - startTime;
const timeSpentOptimizing = aV8FunctionThatMayOrMayNotExist(); // can I do something for this?
const actualElapsedTime = elapsedTime - timeSpentOptimizing;


   - Are there any other V8 flags that may be of help here? I tried 
   "--predictable" without seeing any differences, but maybe there's something 
   I'm missing.
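
For the first question, this is roughly the shape I have in mind (the 
function name is hypothetical, in the same spirit as the one in the second 
question):

while (true) {
    const bot = findNextBotToUpdate();
    global[bot].update(); // timed; counts against the player's 50 ms budget
    if (gameEnded()) break;

    // Hypothetical: let V8 do any pending optimization work here, between
    // turns, where it doesn't count against anyone.
    aV8FunctionThatRunsPendingOptimizations();
}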
   

Thanks for reading through this large blob of text!
-Adam
