Debugging Tips: JavaScript CPU Profile

JavaScript CPU Profile#

Identify expensive functions using the Chrome DevTools CPU Profiler.

  • Record exactly which functions were called and how long each took with the CPU Profiler.
  • Visualize your profiles as a flame chart.

I should mention: I have a lot to learn about the CPU profiler. So, take this info with a grain of salt. Improvements are definitely welcome.

General Process#

  • Choose what you want to improve – be specific
  • Figure out a page-state that aggravates the issue – important
  • Isolate and profile the functionality – the more isolated the better
  • Decipher the profile
  • Explore improvements
  • Repeat to measure the effect of improvements

Prep#

  • Be specific in selecting what to improve
  • Aggravate the issue as much as possible
    • Example: With common designer issues, more legos is usually better
    • Example: Pause the JS execution at key points to bloat the page state via the console
      • Generally, you want to make the page-state (metadata in our case) massive, if possible
    • Example: Hack in server-side changes to return bloated data, even if it's completely artificial – recall that only a single piece of functionality matters (but it needs to behave authentically)
  • Plan to profile the minimal amount of information that also includes the poor performance
    • The functionality only needs to be executed one time
    • As an example: Don't profile loading the page, entering a sub-designer, etc., if all you want is clicking a lego while in the tile permissions designer
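One way to aggravate the issue, sketched in code: generate a pile of artificial page state from the DevTools console while paused at a breakpoint. Everything here is hypothetical – `makeFakeLegos` and the metadata shape are made up, so adapt them to your app's actual page-state structure.

```javascript
// Hypothetical sketch: bloat the page state from the DevTools console.
// The lego shape below is an assumption -- match your real metadata.
function makeFakeLegos(count) {
  const legos = [];
  for (let i = 0; i < count; i++) {
    legos.push({
      id: `fake-lego-${i}`,
      type: 'button',
      permissions: { read: true, write: i % 2 === 0 },
    });
  }
  return legos;
}

// While paused at a breakpoint (or straight from the console), append
// thousands of fake legos so the slow path has plenty of work to chew on.
const bloat = makeFakeLegos(10000);
console.log(bloat.length); // 10000
```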

Collect The Profile#

  • Choose some functionality to profile, likely something slow
    • Example: clicking a lego to highlight it
  • Prep the page to be ready to do whatever it is you have chosen
    • Usually, it's best to put the page in a state that aggravates the issue – like a tile loaded with legos instead of a new tile with only a handful
  • Start the JavaScript CPU Profiler
  • Execute the functionality you want to improve a single time
  • Wait for execution of the functionality to finish
  • Immediately stop the profiler
  • Side note: If the profile turns out to be useful (clean, to the point) it's worth saving it to disk so it can be viewed later, in particular to compare it to the post-optimization profile.
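The start/execute/stop sequence above can also be done programmatically with Chrome's Console API, which is handy for bracketing exactly one execution of the slow path. `console.profile()` / `console.profileEnd()` are real Console API methods; `highlightLego` is a hypothetical stand-in for whatever functionality you're measuring.

```javascript
// Hypothetical stand-in for the functionality under investigation.
function highlightLego(lego) {
  // ...whatever the real highlight logic is
  return { ...lego, highlighted: true };
}

console.profile('highlight-lego');        // start recording a CPU profile
const result = highlightLego({ id: 'lego-1' });
console.profileEnd('highlight-lego');     // stop; the profile shows up in DevTools

console.log(result.highlighted); // true
```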

Viewing The Profile#

There are three views:

  • Heavy (Bottom Up) – All functions are listed, ordered by time-spent in the function (aggregated)
    • Default View
  • Tree (Top Down) – If I understand correctly, allows the call-stack to be explored, where entry points are the top level, the functions they call the next, and so on (aggregated)
  • Chart – Flame graph of the JavaScript call stack

Heavy (Bottom Up)#

[Screenshot: Heavy (Bottom Up) view]

  • Self time: Aggregate time for all invocations of the function across the recording, not including functions called by this function.
    • I use the percentages as a relative indicator of significance
    • In the image above, s is the most costly function, in terms of self-time, by far
  • Total time: Aggregate total time for all invocations of the function, including functions called by this function.
  • Function name / location: The function name followed by the location of the function's source
    • Avoiding obfuscation is a huge perk of NODE_ENV=development (i.e. local dev code)
  • Expanding a function: Shows the calling paths to the function, the call stack from the callee ⇢ caller
    • In the image above, we can see the function s was called by:
      • s (recursively)
      • e
      • classnames
    • From the look of things, when s calls itself, it doesn't spend much time doing anything else
    • s and e are the first place to look
    • classnames is probably not worth looking into, at all

Tree (Top Down)#

[Screenshot: Tree (Top Down) view]

  • Self-time, total-time, function / location all have the same significance
  • The functions listed are the entry points of the execution flow
  • Expanding a function shows the execution path, i.e. the call stack from the caller ⇢ callee

In this view, things can be a bit of a labyrinth, but it's good to know about it.

Chart – Flame Graph#

[Screenshot: flame graph view]

  • Timeline chart: A chart is shown along the top of the view
    • Vertical axis: Height of the call stack
    • Horizontal axis: Time, the full duration of the profile
    • Clicking and dragging in the top chart zooms in, making detail in sub-parts of the profile easier to view
  • Call stack chart: Below the timeline chart, the call stack at any given point in time is shown – callers are on top of callees, and the width of each rectangle indicates the duration of the call
    • Hovering over a call stack rectangle shows a box with additional information:
      • The function name
      • Self-time and total-time for that particular invocation
      • Aggregate self-time and total-time for the function, in general
      • A note about optimization (not always shown)
    • Clicking a call navigates to the source (if the source is available)
      • Very useful for navigating the code
    • Clicking and dragging on the call stack chart pans the view
    • Scrolling affects the zoom

A side note: It seems like there may be artificial boundaries in the flame graph. That is, a rectangle representing a call will end and another will start immediately next to it; the new one is for the same function, and the same split appears in the caller's rectangle. These boundaries might be artifacts of the sampling method the dev tools use.

[Screenshot: searching the flame graph]

Search in the flame graph is really handy, but it's more helpful when the code is not obfuscated / uglified. For instance, in this particular profile, searching for s yields 1,134,791 matches – anything with an "s" in the name. That's far less useful than searching unobfuscated code for _clone, the s function's actual name, would be.

Search works in each of the views, but I generally only use it in the flame graph.

What To Look For#

First, in the Heavy (Bottom Up) view, I look at the self-time column; functions with a high self-time are prime candidates for optimization.

Next, in the same view, functions with a high total-time but a low self-time warrant attention. These were probably ignored in the first pass, where self-time was the main focus. Such functions are likely the gatekeepers that call the very costly functions, and the rationale for looking them over is: maybe the slow functions can be called less often.
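"Call the slow function less often" can take many forms; one generic sketch is caching results keyed by input. This is not the document's actual fix – it's a memoization sketch that only helps when the function is pure and the same inputs recur.

```javascript
// Generic memoization sketch (assumption: fn is pure and keyed by one arg).
function memoize(fn) {
  const cache = new Map();
  return function (key) {
    if (!cache.has(key)) cache.set(key, fn(key));
    return cache.get(key);
  };
}

let calls = 0;
const slow = (n) => { calls++; return n * n; }; // stand-in for a costly function
const fast = memoize(slow);

fast(4); fast(4); fast(4);
console.log(calls);   // 1 -- the slow function ran only once
console.log(fast(4)); // 16
```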

Then, I go to the Flame Graph. If I found candidates for optimization, I look over the flame graph to increase my general understanding of what's going on and to verify the candidate functions are actually high impact. If I didn't find candidates for optimization in the Heavy (Bottom Up) view, I explore the flame graph and essentially hunt for something interesting.

Verify The Significance Of The Candidates For Optimization#

Generally, I like to verify I'm on the right trail before actually optimizing anything. To do this, I basically take out the functionality I'm considering optimizing. For instance, in the profiles shown above the culprit is the s function, AKA _clone, which is the implementation of clone. So, to make sure it was the problem, I put a break-point in the code, found the webpack export name, and changed clone to be the identity function:

clone2._default = value => value;

Then, I test out the functionality I want to improve (without making a new profile) and verify things move along much faster.

Sometimes this step is not possible.

Battle#

  • Implement the optimization, whatever it may be (hopefully this is not a nightmare)
  • Verify the performance has improved without profiling
  • Profile for more detail or to figure out why the change didn't help
  • Repeat until things reach a fully crystallized state of awesomeness
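Verifying the improvement without profiling can be as simple as timing the operation directly. A minimal sketch, assuming `performance.now()` is available (it's global in browsers and modern Node); `runSlowPath` is a hypothetical stand-in for the functionality under test.

```javascript
// Hypothetical stand-in for the slow functionality being optimized.
function runSlowPath() {
  let total = 0;
  for (let i = 0; i < 1e6; i++) total += i;
  return total;
}

// Time a single execution -- compare this number before and after
// the optimization, no profiler needed.
const start = performance.now();
runSlowPath();
const elapsed = performance.now() - start;
console.log(`took ${elapsed.toFixed(1)} ms`);
```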

Make sure the functionality has not changed#

Absolutely critical verification step.