Unlocking a Mystery: Visualizing the Common Webpack Bundles - Redfin Real Estate News


by Kevin Jonson
Updated on October 5th, 2020

TL;DR I wrote a tool to visualize webpack common bundles.  There is a link to the source at the bottom of this post that you can play with.

Background

Webpack is a tool in our front-end build chain that gathers up and bundles all of the front-end assets that we need to send to our users.  It does this by following the dependency tree from our routes file, analyzing things as it encounters them, and “doing things” with the files it finds. If you’re unfamiliar with webpack, there is a good overview here.  After using webpack for a few months at Redfin, we noticed that our bundle sizes seemed large, very large.  On a local developer build with sourcemaps turned on, I once saw a bundle of JavaScript that was nearly 7 MB in size, which is bordering on Twilight Zone levels of absurdity. Some of our production bundles were over 2 MB. Something was very wrong.

Large bundle sizes are very bad for everyone, particularly with regard to performance. For the end user they increase the time to download and parse, which blocks our react-server framework from waking up and making the page interactive. For developers they can also substantially increase build times, both at initial startup and in incremental builds. Many of our developers are quite familiar with long-running webpack builds and have been justifiably vocal about it.

 


There are a couple things to know about our specific webpack setup before we dive deeper:

  • Webpack walks all the feature code that we wrote, starting from a routes.js file, and creates a bundle for each route. Each route generally corresponds to a page or URL.
  • We use the CommonsChunkPlugin to find dependencies that are used on every route and create a separate bundle for those things. This is so that when users move from page to page they don’t re-download things like react and can instead use the cached version from the previous page. This also helps to parallelize the initial JS and CSS requests, which decreases page load time.
  • We use the ExtractTextPlugin to find CSS in the dependency tree and create a bundle of all the styles. Stylesheets are required (“included” is more accurate since we generally enforce ES6 module syntax) by JS imports from their corresponding react component.

[Screenshot: the Chrome network tab showing our bundles on the UI Tests page]
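
To make the "used on every route" idea concrete, here is a minimal sketch that treats it as a plain set intersection. It is only a model of the behavior described above, not the CommonsChunkPlugin's actual implementation, and the route names and module lists are made up for illustration.

```javascript
// Hypothetical route -> module lists. A real build derives these by
// walking the dependency tree from routes.js.
const routeModules = {
  "/home": ["react", "lodash", "./components/Header", "./pages/Home"],
  "/search": ["react", "lodash", "./components/Header", "./pages/Search"],
  "/listing": ["react", "./components/Header", "./pages/Listing"],
};

// Return the modules that appear in every route's dependency list.
function findCommonModules(routes) {
  const lists = Object.values(routes);
  if (lists.length === 0) return [];
  return lists[0].filter((mod) => lists.every((list) => list.includes(mod)));
}

// Split each route into "common" modules and the per-route leftovers.
function splitRouteBundles(routes) {
  const common = new Set(findCommonModules(routes));
  const perRoute = {};
  for (const [route, list] of Object.entries(routes)) {
    perRoute[route] = list.filter((mod) => !common.has(mod));
  }
  return { common: [...common], perRoute };
}

console.log(splitRouteBundles(routeModules));
```

In this toy data `lodash` stays out of the common bundle because one route does not use it. That is exactly the kind of thing that is hard to know without looking.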

Unfortunately, webpack can be rather opaque. At a company as large as ours we have many dozens of developers contributing code. Webpack happily finds new dependencies and features and slurps them up. Nobody actually knows what lands in the common bundle because it’s done automatically, yet it affects every page on the website. To start optimizing it, we asked the following questions:

  • What actually is in the common bundle?
  • How much space does each thing take up?
  • Are there duplicates?
  • How many files are being parsed?

I set out to open Pandora’s box, and shed some light on what we’re shipping down the wire.

Existing Analysis Tools

The first thing to know is that webpack can produce an exhaustive manifest of everything that it found via the stats object. There are a number of tools out there for writing this file to disk after your build so that you can further inspect it. I used the webpack-stats-plugin by Ryan Roemer who works at Substantial here in Seattle. You can feed this output into a variety of tools such as Webpack Visualizer by Chris Bateman which produces a really neat visualization, or the Webpack Analyze tool that is hosted on the webpack site itself and allows deeper inspection. Both are fantastic tools, and offered quite a bit of insight into our bundles.

Unfortunately for us, neither of those tools is able to segment by the output of the CommonsChunkPlugin, which makes our common bundle. They only look at the webpack stats as a whole, which really doesn’t tell us what we want to know. Not only that, but the stats that are generated from our site have a tendency to crash node while they’re being written to disk. That seems to be because they’re over 100 MB of JSON, which is also rather mind boggling, and we certainly don’t want to do that on a regular basis. There is a very real possibility that I don’t understand the contents of the stats object and missed some data in there that would have allowed us to trace dependencies backwards from the bundle to a file list. I would love to be wrong about that.

So I had to write our own tooling for exporting the information we need from the webpack build process and then displaying the results in a meaningful way.

Disclaimer

This tool is looking at the source that’s being parsed as it runs through webpack. This means that the percentages it shows will not exactly match the output bundles. After webpack we run the bundles through uglify.js, which optimizes the files and further reduces output file size. Also, over-the-wire compression can compress some bundles better than others. This tool purely shows the bundle percentages based on the number of characters in the source file and makes no attempt to understand how that relates to over-the-wire size in production. This still gives us a rough estimate of the relative size of things, but don’t start a land war in Asia over the results. These results are meant as a conversation starter and a jumping-off point for deeper investigations.
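
As a small illustration of why character counts and wire size can disagree, the sketch below gzips two made-up strings of identical length using Node's built-in `zlib` module. The sample strings are assumptions chosen to exaggerate the effect and have nothing to do with our real bundles.

```javascript
const zlib = require("zlib");
const crypto = require("crypto");

// Two made-up "sources" with exactly the same character count.
const repetitiveSource = "var x = 1;\n".repeat(1000); // 11 chars * 1000
const noisySource = crypto.randomBytes(5500).toString("hex"); // 11000 hex chars

// Report the character count next to the gzipped byte count.
function compareSizes(name, source) {
  const gzippedBytes = zlib.gzipSync(Buffer.from(source, "utf8")).length;
  return { name, characters: source.length, gzippedBytes };
}

const sizeReport = [
  compareSizes("repetitive", repetitiveSource),
  compareSizes("noisy", noisySource),
];
console.log(sizeReport);
```

Both entries would get the same percentage in the visualizer, even though the repetitive one compresses far better.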

Our Common Bundle

The charts below show the output of the analysis of the corvair customer-facing common bundle. They were created while running Node 6 and npm 3, which changes the structure of the corvair node_modules folder to be much flatter, as we can see below.

Bundle Overview

This is the basic view of a bundle that first shows when a report is selected. This chart shows the nesting structure of our dependencies by file path. Since our modules are organized by folder, this gives us a great overview of the major parts that make up our bundles. It’s easy to see which dependencies have a large number of tiny files, such as `react`, and which ones are just a single long file, such as `lodash`. The colors of the “modules” are based on what percentage of the total bundle size they take up, and are dynamically computed on a gradient scale. In this example, things in the `/corvair` folder make up 97% of the bundle size, so it’s very bright.
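
A minimal sketch of how a flat list of file paths can be folded into that kind of nested structure is below. It assumes records with a `request` path and a `sourceLength`, like the report format shown later in this post. The paths and sizes are made up, and the function names are hypothetical rather than taken from the tool.

```javascript
// Sample records (paths shortened, sizes invented for illustration).
const flatModules = [
  { request: "/corvair/node_modules/react/lib/React.js", sourceLength: 300 },
  { request: "/corvair/node_modules/react/lib/ReactDOM.js", sourceLength: 100 },
  { request: "/corvair/node_modules/lodash/index.js", sourceLength: 400 },
  { request: "/corvair/middleware/util/AwsSdk.js", sourceLength: 200 },
];

// Fold the flat list into a tree keyed by path segment, summing the
// size and file count on every folder along the way.
function buildTree(modules) {
  const root = { name: "/", size: 0, files: 0, children: {} };
  for (const mod of modules) {
    const segments = mod.request.split("/").filter(Boolean);
    let node = root;
    node.size += mod.sourceLength;
    node.files += 1;
    for (const segment of segments) {
      if (!node.children[segment]) {
        node.children[segment] = { name: segment, size: 0, files: 0, children: {} };
      }
      node = node.children[segment];
      node.size += mod.sourceLength;
      node.files += 1;
    }
  }
  return root;
}

// Share of the whole bundle taken by a node, as a percentage.
function percentOfTotal(node, root) {
  return (node.size * 100) / root.size;
}

const tree = buildTree(flatModules);
const reactNode = tree.children.corvair.children.node_modules.children.react;
console.log(reactNode.files, percentOfTotal(reactNode, tree));
```

Each node's percentage is what would then drive its color in the chart.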


Highlighting by Name

Typing into the highlight field at the top of the chart colorizes nodes that match that name.  I knew that both react-server and corvair had declared `lodash` as a dependency, and was unsure if npm was installing two copies. Sure enough, you can clearly see that we have two versions of lodash in the bundle!
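
The same check can be done without the chart. Here is a rough sketch that groups files by package name and reports any package installed in more than one `node_modules` location. The paths are hypothetical, and it only looks at install locations, not at version numbers.

```javascript
// Hypothetical paths. The nested copy mimics a dependency npm could not dedupe.
const modulePaths = [
  "/corvair/node_modules/lodash/index.js",
  "/corvair/node_modules/react-server/node_modules/lodash/index.js",
  "/corvair/node_modules/react/lib/React.js",
  "/corvair/node_modules/react/lib/ReactDOM.js",
];

// Find the package name and install location from the last "node_modules"
// segment of a path. Returns null for files that are not in a package.
function packageInfo(filePath) {
  const marker = "/node_modules/";
  const index = filePath.lastIndexOf(marker);
  if (index === -1) return null;
  const rest = filePath.slice(index + marker.length).split("/");
  // Scoped packages look like "@scope/name".
  const name = rest[0].startsWith("@") ? rest.slice(0, 2).join("/") : rest[0];
  return { name, location: filePath.slice(0, index + marker.length) + name };
}

// Map each package name to its install locations and keep only the
// names that were found in more than one place.
function findDuplicatePackages(paths) {
  const locations = new Map();
  for (const filePath of paths) {
    const info = packageInfo(filePath);
    if (!info) continue;
    if (!locations.has(info.name)) locations.set(info.name, new Set());
    locations.get(info.name).add(info.location);
  }
  const duplicates = {};
  for (const [name, found] of locations) {
    if (found.size > 1) duplicates[name] = [...found];
  }
  return duplicates;
}

console.log(findDuplicatePackages(modulePaths));
```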


Inspecting a File

Hovering over items in the chart shows information about them. As we hover over the file `AwsSdk.js` we can see that it’s taking up 6.38% of our total bundle and isn’t even being required as a module; it’s just a file that we checked in. We can also make assumptions about why it’s in the bundle, since it’s clearly in the corvair middleware, which is included in all routes. There may be an opportunity here to reduce the bundle size by moving this to a non-blocking script tag lower in the body, or to only use it on specific pages that need it.


Inspecting a Folder

Hovering over folders shows information about them as well. Here we can see information about `react`: it takes up 15.2% of our total bundle size and includes 141 files. We can also assume that it’s a properly installed node module, since it’s a direct child of the corvair `node_modules` folder. This may be an opportunity to speed up our build times by pre-compiling react as a DLL (deeper tech articles on webpack dynamically linked libraries here and here), so that every time webpack runs it doesn’t have to parse through all those files and build them.
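
The numbers in that hover panel boil down to a filter and a sum. A minimal sketch, assuming the same `request` and `sourceLength` fields as the report format and using invented sizes, might look like this. `summarizeFolder` is a hypothetical helper name.

```javascript
// Made-up records in the report's shape. Sizes are illustrative only.
const reportModules = [
  { request: "/corvair/node_modules/react/lib/React.js", sourceLength: 150 },
  { request: "/corvair/node_modules/react/lib/ReactElement.js", sourceLength: 50 },
  { request: "/corvair/node_modules/lodash/index.js", sourceLength: 600 },
  { request: "/corvair/middleware/util/AwsSdk.js", sourceLength: 200 },
];

// Summarize everything under a folder: file count, character count,
// and share of the whole bundle rounded to two decimal places.
function summarizeFolder(modules, folder) {
  const prefix = folder.endsWith("/") ? folder : folder + "/";
  const total = modules.reduce((sum, mod) => sum + mod.sourceLength, 0);
  const inside = modules.filter((mod) => mod.request.startsWith(prefix));
  const size = inside.reduce((sum, mod) => sum + mod.sourceLength, 0);
  return {
    folder,
    files: inside.length,
    size,
    percent: total === 0 ? 0 : Number(((size * 100) / total).toFixed(2)),
  };
}

console.log(summarizeFolder(reportModules, "/corvair/node_modules/react"));
```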


Colorizing By File Type

Some people are suspicious of raster assets (images) or other binary files getting into our bundles. Changing the “color” control at the top to “type” colorizes modules into three categories. JavaScript is green, CSS is light mangoes, and other stuff is a bit of a light burnt umber. Since the CommonsChunkPlugin is unaware of the other plugins we may be running, such as ExtractText, which pulls out the CSS, and the file-loader, which inlines woff (font) assets, we can only see the output files all combined.
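
Sorting modules into those three buckets can be as simple as looking at the file extension. The sketch below is an assumption about how that might be done, not the tool's actual code. In particular, which extensions count as JS or CSS is my guess.

```javascript
const path = require("path");

// The three categories described above. The extension lists are assumptions.
const JS_EXTENSIONS = new Set([".js", ".jsx"]);
const CSS_EXTENSIONS = new Set([".css", ".less", ".scss"]);

function classifyModule(request) {
  // Drop any loader prefix ("loader!file") and query string before
  // reading the extension.
  const file = request.split("!").pop().split("?")[0];
  const ext = path.extname(file).toLowerCase();
  if (JS_EXTENSIONS.has(ext)) return "js";
  if (CSS_EXTENSIONS.has(ext)) return "css";
  return "other";
}

// Count how many modules fall into each category.
function countByType(requests) {
  const counts = { js: 0, css: 0, other: 0 };
  for (const request of requests) counts[classifyModule(request)] += 1;
  return counts;
}

console.log(
  countByType([
    "/corvair/node_modules/react/lib/React.js",
    "/corvair/node_modules/css-loader/index.js!/corvair/styles/Button.less",
    "/corvair/fonts/icons.woff?v=1",
  ])
);
```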


How it Works

Now, for more bad news. This is not an easy report to generate or display.  I was unable to find any open source tools to do this, so I had to write the whole shebang by hand.

Generating the Report

I was digging around the webpack source of the CommonsChunkPlugin and immediately saw an object that looked like it contained the information that we needed to understand the common bundle. This was great news, since the other ideas I had involved buying a bottle of whiskey and crying a lot because trying to understand the ludicrously complex webpack stats object that it normally outputs had left me in a pretty bleak place. A ray of light in the chaos.

Unfortunately, at the time of this post, in order to generate the information needed to make the visualization tool work, you need to modify the webpack source in its installed location. I have not yet made it commonly consumable other than directions to paste a bunch of stuff into a certain line number in a certain file, then run the build. I’m certain there is a way to make this reusable, but it would most likely require me working with the stats object, which would send me back into bouts of rage and depression, so I’m going to avoid that for now.

The output of this pasted-in code is a JSON file that is written to my local desktop during a normal build. It’s about 600 KB, which by most maths is significantly smaller than the 100 MB+ of the normal stats object, generates rather quickly, and doesn’t crash node. This output is a flat array of things that webpack will be placing in the common bundle and a small amount of information about them. It looks something like this:

[
    {
        "request": "/Volumes/code/main/corvair/target/corvair/middleware/util/AwsSdk.js",
        "userRequest": "/Volumes/code/main/corvair/target/corvair/middleware/util/AwsSdk.js",
        "rawRequest": "./util/AwsSdk",
        "sourceLength": 268993
    },
    {
        "request": "/Volumes/code/main/corvair/node_modules/lodash/index.js",
        "userRequest": "/Volumes/code/main/corvair/node_modules/lodash/index.js",
        "rawRequest": "lodash",
        "sourceLength": 411453
    }
]
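
To give a feel for how little is involved, here is a rough sketch of turning a list of in-memory module objects into that flat format and writing it to disk. This is not the code that gets pasted into webpack. The input objects, their `source` field, the helper names, and the output location are all assumptions made for illustration.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical stand-ins for the module objects a bundler holds in memory.
// The `source` field is an assumption made for this sketch.
const commonModules = [
  {
    request: "/code/app/util/a.js",
    userRequest: "/code/app/util/a.js",
    rawRequest: "./util/a",
    source: "module.exports = 1;",
  },
  {
    request: "/code/app/node_modules/b/index.js",
    userRequest: "/code/app/node_modules/b/index.js",
    rawRequest: "b",
    source: "module.exports = function b() { return 2; };",
  },
];

// Keep only the fields the visualizer needs, plus the character count.
function toReportEntry(mod) {
  return {
    request: mod.request,
    userRequest: mod.userRequest,
    rawRequest: mod.rawRequest,
    sourceLength: mod.source.length,
  };
}

// Write the flat array as indented JSON and return the entries.
function writeReport(modules, filePath) {
  const entries = modules.map(toReportEntry);
  fs.writeFileSync(filePath, JSON.stringify(entries, null, 4));
  return entries;
}

// Written to the temp directory here. Any writable path would do.
const reportPath = path.join(os.tmpdir(), "common-bundle-report.json");
writeReport(commonModules, reportPath);
console.log("wrote", reportPath);
```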

If you want to do this yourself, I’d be glad to send you the code, file path, and line number to paste into webpack. I apologize for such a janky solution, but I prioritized immediately satisfying my curiosity about what’s in the bundle over making the tooling user friendly.

Visualization

To dive into the output, I wanted something similar to the open source webpack visualization tool that I mentioned above. Since it needed to consume a hacked-together, very custom data structure, I just started from scratch rather than try to modify anything preexisting. This means that it’s fragile, hackily coded, and in some browsers doesn’t allow users to hover over anything.

Eventually I plan on making this an open source project, so I wanted it to be standalone and not part of the Redfin codebase or build. The only real requirement, other than doing its job, was that I wanted to build it in a way that was not server dependent so that devs can use it locally, without uploading or otherwise transferring stats off their machine which may contain information that would otherwise not be publicly accessible.

[Screenshot: the default view of the visualization tool]

The visualization tool is hand coded with vanilla JavaScript and CSS, and lives in an HTML file that you can open and use locally. Just open it in a modern browser (I’ve only tried it in Chrome), click the file chooser, and select a stats file that was written to disk. I find D3 overkill for most visualizations, so the tool simply constructs an SVG based on the data that it’s fed right there with the JS in the file.
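
Building an SVG by hand really is just string construction. The sketch below is a stripped-down illustration of that approach rather than the tool's code. The layout (a single row of rectangles), the HSL gradient, and the function names are all arbitrary choices for the example.

```javascript
// Made-up entries: a name plus its percentage share of the bundle.
const chartItems = [
  { name: "react", percent: 50 },
  { name: "lodash", percent: 30 },
  { name: "AwsSdk.js", percent: 20 },
];

// Escape text so file names cannot break the generated markup.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Map a 0-100 share onto a dark-to-bright HSL color.
function shareToColor(percent) {
  const clamped = Math.max(0, Math.min(100, percent));
  const lightness = Math.round(20 + clamped * 0.6); // 20% to 80%
  return `hsl(200, 70%, ${lightness}%)`;
}

// Lay items out left to right, each as wide as its share of the bundle.
// The <title> child gives a native hover tooltip in most browsers.
function renderSvg(items, width = 600, height = 40) {
  let x = 0;
  const rects = items.map((item) => {
    const w = (item.percent / 100) * width;
    const rect =
      `<rect x="${x}" y="0" width="${w}" height="${height}" ` +
      `fill="${shareToColor(item.percent)}">` +
      `<title>${escapeXml(item.name)} (${item.percent}%)</title></rect>`;
    x += w;
    return rect;
  });
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
    rects.join("") +
    "</svg>"
  );
}

console.log(renderSvg(chartItems));
```

In a browser the resulting string could be dropped into the page as-is, which is part of why a library felt unnecessary here.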

The tool is available as a public gist here: https://gist.github.com/kcjonson/cef39fb0d745c40f0c02e37249a1a304

The stats output from our common public bundle are here:  https://gist.github.com/kcjonson/9cde937cc82ab51073a16747c7a76dd0

Enhancements

There are many things I would love to add to the tool, so if anyone wants to get involved I’d be thrilled to have help:

  • Create a webpack plugin or submit a patch back to webpack itself to make this data more accessible
  • Display the “reasons” that something ended up in the common bundle (what its parents are in the dependency tree)
  • Make the tool work for not just the common bundle, but also the non-common bundles (the stuff that landed in the page bundle)
  • Allow it to better segment by things that were extracted by other loaders, such as css

Moving Forward

There is still a bit of work to do on this before I can turn it into a more widely reusable open source tool.  I’m hoping that someone will point out an easier way to get those stats out of webpack and I can laugh a bit then change the visualizer to consume a widely available data format.   I’ll be watching the comments here if you have a better way to do this, or you can email me directly at your leisure. Thanks for reading!


Kevin Jonson

Northwest native, drinker of beer, climber of mountains.
