Rodrigo Rosenfeld Rosas
Scripts loading trade-offs: a performance analysis
This has been written to serve as some background for two other articles focused on SPA performance:
- Getting an SPA to load the fastest possible way (and how Webpack can help you)
- Improving SPA loading time with webpack (and why Sprockets is in your way)
I’ve been developing Single Page Applications (SPA) since 2009 and I can tell you something for sure. Developing web applications is hard! If you are a full-stack developer like me you have to learn about relational databases, caching technology (Redis, Memcached), a server-side language and framework, sometimes other kinds of databases, full-text search (Solr, ElasticSearch), server configuration and automation tools (Chef/Puppet/Ansible), deploy tools (Capistrano), continuous integration, automated test coverage, network infrastructure, HTTP proxy configuration, load balancers, back-up and monitoring services, just to name a few.
But even if we leave all these technologies out and only focus on front-end development, it’s still hard! JavaScript is not a great language, and I certainly do not like it, but we don’t really have any affordable options since it’s all web browsers understand. If you want your application to load fast you must use JavaScript, and you must learn it well.
Code modularization in JavaScript
The lack of some sort of require/import mechanism built into the language is by far its worst part. It is the reason why people spend so much time figuring out how to implement modularization as the code gets big, which often happens early when implementing an SPA.
On the other hand, a require mechanism applied to a client-server architecture, where the code is stored on the server (which is how browsers work), is much trickier than it is for most languages, which assume the code is locally available. In such an architecture, if you want your code to load as fast as possible you should transfer only the required bits as you need them. Requiring code on demand is possible in many languages, like Ruby, but in JavaScript it is even trickier because JavaScript doesn’t allow threaded code (workers only popped up very recently) and works by processing events one at a time, the so-called async programming.
This means a require in JavaScript should also work asynchronously (Node.js is a different beast as it allows some code to work synchronously by blocking the code execution until the operation is finished, while any I/O operation in the browser is implemented asynchronously). I don’t think this is an excuse for JavaScript not providing such a mechanism out of the box, but this is not an article to say bad things about JavaScript. There are already tons of those out there. I’m just explaining why modularization is a complex subject in JavaScript and front-end development.
Solutions to JS modularization
There are many attempts to implement code modularization in JavaScript. I won’t get into the details since there are many articles covering only this subject. If you are curious you can search for CommonJS, Require.js, AMD and JavaScript modularization in general. I’m just going to review the solutions from a higher-level perspective and talk about their trade-offs, as it’s important to understand them in order to explain how to load applications fast.
Sequence of script tags
When JavaScript was first introduced in Netscape people would simply add each module to the page by adding a script tag in the head for each module. This blocks the page rendering until the scripts are downloaded, and a user navigating to the site sees a blank page until all script sources are downloaded and executed. When you have big scripts and bad network bandwidth (which is especially true for mobile devices running on 2G, 3G and even 4G) it leads to a really bad user experience.
The main advantage of this approach is that it’s easy to set up and understand, since the scripts are executed in the specified order of the script tags. If your links and buttons depend on the scripts to work properly (which is usually the case) then, by putting the scripts in the document head you wouldn’t have to worry about that. This is the simplest solution to develop. But it’s also the one that will perform worst.
Even if you decide to put your scripts at the end of the page, it’s still a problem if you want your page to load really fast. That’s because it delays the DOMContentLoaded and load events, so any code listening for those events has to wait until all scripts are downloaded and executed. If your code doesn’t depend on those events and your page is fully functional even before the scripts are downloaded (links and buttons work as expected), then it might be a good strategy for your case if you target browsers supporting HTTP/2. It gives you great control over per-module caching: if your users visit your application very often and you only change a few files in a new deploy, then, with proper caching headers in place, those users only have to download the changed files.
But most browsers limit the number of concurrent resource downloads, which means that if your application depends on many script tags they won’t all be downloaded in parallel, which can add to the application loading time.
Another drawback of putting the scripts at the end of the document body is that their download only starts after the document download is mostly completed. This is not a big deal if your document is small, but if it takes 1 second just to finish downloading your main document, your scripts only start downloading 1s after the user requests your application, which means your application may take a second longer to load than it should.
Async scripts
An alternative to putting the script tags at the end of the body is to keep them in the head but flag them as async scripts (or defer too if you target older IE versions, which do not support the async attribute; even though defer and async behave differently, defer is still better than the default script-blocking behavior). The main advantage over scripts at the bottom is that the scripts start downloading very soon without blocking the page rendering or the DOMContentLoaded event (defer works a bit differently: deferred scripts run in document order right before DOMContentLoaded fires).
However, your scripts must be async safe for that to work. For example, you can’t load jquery and jquery-ui from CDN in two async script tags because if jquery-ui is loaded before jquery it will fail to run as it assumes jQuery is loaded already.
This strategy is usually used in combination with custom script bundles. It could be a single bundle, which is easier to implement, or, if multiple bundles are created, they should be prepared to wait for their dependencies to be loaded before running their code.
Dynamically created script tags (Injected Scripts)
Script tags created dynamically do not block and could be used to implement some async require, which is the strategy adopted by Require.js and similar frameworks. Taking care of implementing this strategy correctly while still supporting old browsers is not an easy task and that’s why there are many frameworks providing this feature and why I think it’s a big failure of JavaScript to not provide such feature out of the box.
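To make the injected-script technique concrete, here is a minimal sketch of a loader. The `loadScript` name, its callback signature and the optional `doc` parameter (there only so the function can be tried outside a browser) are assumptions for this example; it is not how Require.js is implemented.

```javascript
// Hypothetical helper: injects a script tag and reports when it has loaded.
// `doc` defaults to the page's document; it is a parameter only so the
// function can be exercised with a stand-in object outside a browser.
function loadScript(src, callback, doc) {
  doc = doc || document;
  var script = doc.createElement('script');
  script.src = src;
  // Injected scripts do not block parsing; setting async makes that explicit.
  script.async = true;
  script.onload = function () { callback(null, script); };
  script.onerror = function () { callback(new Error('Failed to load ' + src)); };
  doc.head.appendChild(script);
  return script;
}

// Possible usage in a page (the path and showReports are made up):
// loadScript('/static/reports.js', function (err) {
//   if (!err) showReports();
// });
```

A real loader also has to deal with duplicate requests, timeouts and older browsers, which is part of why dedicated frameworks exist for this.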
There are some different strategies for using this technique though. One might simply add all required scripts dynamically to ensure they won’t block page rendering (although I think async scripts are cleaner in this case), or they could be used to dynamically load code on demand, which I will refer to as code splitting in this article from now on.
At first it may sound like a good idea to load just the code your application needs, as it is needed, because it reduces the number of bytes transferred. But it also increases the number of requests and, more importantly, it shifts the moment when that code download starts.
If you concatenate all this code and put it in an async script in the head it will load the application faster. Otherwise, even if you started the download of all dependencies in parallel, it would be equivalent to putting the script tags at the end of the body, which means your application will finish loading 1s later than in the optimal case (see the last comment in the “Sequence of script tags” section).
But it’s more tempting to load each module only when it is needed when using this strategy, which makes things even worse. If you need module A, which depends on B, which depends on C, the browser has to finish downloading A to figure out it should also download B, and only after B has finished loading would the request for C start. In real code it may not always be obvious that A depends on both B and C, so you can’t simply require A, B and C at the same time. That’s why Require.js offers a bundling tool to deliver an optimized JS to production environments.
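The cost of that A, B, C chain can be put into numbers with a small sketch. The download times below are made up for illustration, and the model ignores bandwidth sharing and connection limits.

```javascript
// Hypothetical download times in milliseconds for three dependent modules.
var downloadMs = { A: 300, B: 200, C: 100 };

// On-demand loading: each request starts only after the previous module
// has arrived, so the times add up.
function sequentialLoadTime(times, chain) {
  return chain.reduce(function (total, name) { return total + times[name]; }, 0);
}

// All dependencies known up front: requests start together, so the
// slowest download dominates.
function parallelLoadTime(times, names) {
  return Math.max.apply(null, names.map(function (name) { return times[name]; }));
}

console.log(sequentialLoadTime(downloadMs, ['A', 'B', 'C'])); // 600
console.log(parallelLoadTime(downloadMs, ['A', 'B', 'C']));   // 300
```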
Creating script tags inside scripts has a performance issue, though, which is explained in depth here. Since the scripts could interact with CSSOM, it means it will block until all previous CSS resources have finished downloading, introducing an unnecessary latency. Async script tags are preprocessed by the browsers and their download will start immediately (just like regular script tags with the src attribute, the difference being that async tags won’t block the DOM). That’s why we should prefer async script tags over dynamically created scripts for the initial application loading process (loading code on demand is a separate case).
Scripts bundling - single bundle
This is considered a best practice by many currently and several tools adopt this strategy, including Sprockets, the resources build tool integrated with Ruby on Rails default stack.
How the bundles are built depends on the bundler tool. Sprockets requires the resources to specify their dependencies as special comments at the top of each resource (JS or CSS). Other tools use the AMD or CommonJS require syntax, for example, to specify the dependencies and parse the JS to find them. This is more complex than the strategy used by Sprockets, but on the other hand it allows more powerful features, like code splitting (more on that when I talk about webpack). There’s also another technique of specifying the dependencies outside the resources themselves, which is used by the Grails resources plugin for example, or by some build tools similar to Make.
Which strategy is better also depends on personal taste. Personally, I prefer to specify the dependencies directly in the code rather than in a separate file, as happens with the Grails resources plugin. But when code splitting is desirable it’s not just a matter of taste. Implementing code splitting while using Sprockets would require a huge amount of effort, for example. That’s why I think Sprockets doesn’t suit big SPAs and why it should be replaced with a better tool.
Such bundling tools are usually able to perform other preprocessing before generating the final optimized resource, including minifying them with uglifyjs to reduce the download size and compiling from other languages to JS and CSS (after all, as I said, many people dislike those languages and fortunately there are better alternatives out there when you can use preprocessors and transpilers).
By having a single JS file to download and run you reduce the number of concurrent requests to your server, and you can even serve it through a CDN to improve load time further, as the limit of concurrent connections works on a per-domain basis (even though a CDN may not be the best choice under some conditions if HTTP/2 is enabled).
For a first, uncached request this is probably the strategy with the best results, provided the bundle contains only the code required for the initial page loading, which is hardly ever the case.
So, here are some drawbacks of this approach. Usually all code is bundled in a single file, creating big files which take a while to finish downloading, even if it only happens once until the next deploy. And it gets worse if you are able to deploy very often. If you deploy every day then the user will often make a request which is not cached. And I wouldn’t say this is an unrealistic scenario for many healthy products.
This might be a good enough solution if your bundle is small, or if you deploy once a month or every 6 months and most of your user accesses are cached ones, but if you are targeting a great experience for first-time users, you should look for a better alternative.
Script bundling - multiple bundles
Even if you deploy often, it’s likely that your vendored libraries don’t change that often. So it may make sense to pack your vendored libraries in a separate bundle so that it stays cached most of the time even after new deploys. Since you should be loading the vendors and application bundles asynchronously, you must add some simple code to ensure the application code only runs after the vendors bundle has finished loading.
This will usually add just a little overhead for the first user access when compared to the single bundle but on the other hand it will often speed up other page loads after a new version is deployed while the vendors bundle hasn’t changed.
If your application bundle only contains code for the initial page rendering and implements lazy code loading as the user takes action (code splitting) this gets even better.
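Lazy loading on user action can be sketched with a small wrapper that requests a chunk on first use only. The `lazy` helper and the stand-in loader are assumptions made for this example; a bundler with code splitting support would normally generate the actual loading code.

```javascript
// Hypothetical helper: wraps a loader so the chunk is requested on first
// use only, and every later call reuses the same pending or finished load.
function lazy(loader) {
  var pending = null;
  return function () {
    if (!pending) pending = loader();
    return pending;
  };
}

// Example with a stand-in loader; a real one would fetch a script chunk.
var loadCount = 0;
var loadReports = lazy(function () {
  loadCount += 1;
  return Promise.resolve({ render: function () { return 'reports'; } });
});

// Possible usage in a click handler:
// loadReports().then(function (reports) { reports.render(); });
```

Note that this sketch also remembers a failed load, so a real implementation would probably reset `pending` on rejection to allow a retry.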
In the remaining sections I’ll show how webpack enables such a strategy to be implemented and compare it to Sprockets, since I have switched from Sprockets to webpack and should be able to highlight the weak and strong points of each.
Server-side vs client-side template rendering
Feel free to skip this subsection if you don’t care about this subject.
Some respected developers often state the clients should get a fully rendered HTML partial from the server and simply add it to some container or replace its content trying to convince us that this is the best and fastest approach. To give you one example, David, the creator of Rails, writes about the reasons why he thinks this is the best approach:
Benefit #1 is “Reuse templates without sacrificing performance”. While I agree with the reuse part in the case the content should also be rendered in the server-side and then updated with JS, I wouldn’t blindly trust the “without sacrificing performance” part. Reuse may not be a problem for many SPA, including the one I maintain, so we should evaluate whether there’s any performance difference for both approaches in a per case basis and which one is actually faster.
It’s important to understand the full concepts to get the full picture so that you can pick the right choice. First, I’d like to point out that I don’t agree with David’s terminology: “unless you’re doing a single-page JavaScript app where even the first response is done with JSON/client-side generation”. SPA should mean an application that won’t leave the initial page and use XHR to update the view. Both approaches apply to SPA in my opinion.
Then, you have to understand the specific case in which David recommends rendering on the server, and with which I would agree. If your application is able to render an initial view which is useful and functional even before your JS code has finished loading, then I’d also recommend rendering it on the server. But please notice that even this approach won’t always be the fastest. It will be the fastest when the static resources are not in cache. But if they are cached, the application can load much faster if the rendering is performed on the client, depending on the template and data. So it depends on the kind of access you are optimizing for: cached resources or first user access.
You’ll notice I’m inviting you to think about the reasons behind each statement, because they often assume something which is not always true, so you should understand them to see whether they apply to your case or not. Much more often there are trade-offs in all choices, and that’s the reason I try to provide context around every statement I make in this article.
In that same article we can extract another example of such statement which is not always true:
“While the JavaScript with the embedded HTML template might result in a response that’s marginally larger than the same response in JSON (although that’s usually negligible when you compress with gzip)”. This is not always true. If you are working with big templates where just a small percentage of the markup depends on dynamic data, transferring that data as JSON will often be much faster. Or if you are transferring some big table where the cell contents (the dynamic part) represent only about 30% of the total HTML, chances are that it will be much faster to transfer the data as JSON.
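A rough way to check this for your own templates is to build both payloads and compare their sizes. The table below is made-up data, and the sketch compares uncompressed sizes only; gzip narrows the gap, so the real difference should be measured with compression enabled.

```javascript
// Hypothetical table data: 100 rows of [id, name, price].
var rows = [];
for (var i = 1; i <= 100; i++) rows.push([i, 'Item ' + i, 9.99]);

// Server-rendered markup: every cell carries tags and class attributes.
function renderRowsHtml(data) {
  return data.map(function (r) {
    return '<tr class="row"><td class="cell id">' + r[0] +
      '</td><td class="cell name">' + r[1] +
      '</td><td class="cell price">' + r[2] + '</td></tr>';
  }).join('');
}

// All characters here are ASCII, so string length equals byte count.
var htmlSize = renderRowsHtml(rows).length;
var jsonSize = JSON.stringify(rows).length;
console.log('HTML: ' + htmlSize + ' bytes, JSON: ' + jsonSize + ' bytes');
```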
I’d also like to note that if your application depends on your resources being loaded to behave properly (so that links, menus, tabs and so on work), then I can’t see any great advantage in rendering the initial template on the server, since you wouldn’t be able to display it to the user anyway because it wouldn’t be functional until the code is fully loaded. In that case (which is the case for the SPAs I have worked with since 2009) I’d suggest creating a minimal document with a basic layout (footer, header, …) which is fully functional without JS, plus some message like “Loading application… please wait” until the code is fully loaded, even if that message is displayed for just 1 or 2 seconds… With the techniques suggested in this article, you would be able to show such a “Loading application…” state to the user within half a second, much faster than a big full HTML document, leading the user to think the application is very responsive even on mobile devices, even if it requires a few extra seconds to finish loading the application.
Overall I have been noticing that the actual reason why most people prefer to render on the server is that they don’t like JS or feel more comfortable with their back-end language and tools. I don’t enjoy programming in JS either, but it shouldn’t matter if the goal is to provide the best user experience. I had to learn JS and learn it well. I’ve spent a lot of time learning about JS, browser performance and much more, even though I don’t enjoy the language, nor do I enjoy IE8, but I have to learn about it because our application sadly still has to support it. So here is my advice for those of you who avoid JS at all costs just because you don’t like it. Get over it.
On the other hand, there are some developers who are exactly the opposite. They prefer working with JS so much that they also run the back-end on Node.js. There are some cases where the “Rails Way” (or DHH way if you prefer) is the right one. For example, if your application is publicly available rather than only for authenticated users, you’d probably want it to be indexed by search engines, like Google. Even though Google’s engine can now understand JS, I’d still recommend rendering those pages on the server if possible. Also, in those cases it’s very likely a user would like to bookmark some specific page or send the link to someone, and this works more like a traditional web site than a real application. This is exactly what Turbolinks was designed for. If the Turbolinks code is not loaded yet the application should keep working as expected, but switching to another page may take longer than when the Turbolinks code is loaded. That’s the kind of application for which I would recommend adopting DHH’s suggestion. If that’s your case, I’m afraid you won’t be much interested in the content of this article, as it is focused on real applications rather than optimizations over traditional web sites, which is what Turbolinks does.
XHR requests and caching
One of the arguments for the server-rendering approach is that the responses can be cached. But XHR requests can be cached too, although they require some additional work since caching is usually disabled by default by libraries like jQuery, for good reasons of course.
The main problem with allowing cache in XHR requests is that the browser leaves it to the code to handle caching, which is not always possible and often requires quite some code to handle properly. I enable caching of XHR requests in the application I maintain and it is worth it in our case, but the sad news is that it’s only useful if you make some request at least twice, as the first request can’t be retrieved from cache unless you enable localStorage and add some extra code… This article is already too long so I won’t explain the details, but if you are curious and want to see some code, just leave a comment and I may consider writing another article just to explain how this works in practice.
When you perform a regular request to the server, the browser sends the If-None-Match (carrying the stored ETag) or If-Modified-Since headers when it has a cached copy, and if the server responds with 304 (Not Modified) it loads that cached response transparently to the user. But for XHR requests your code would have to handle the 304 status, and it won’t get a copy of the cached content from the browser, so it’s not that useful. It’s only useful if you have stored the response of some previous request to the same address so that you could use that response when handling a 304 status response. It’s sad that the browser doesn’t provide a better mechanism for conditional caching of XHR requests or even handle them transparently.
So, for the initial XHR requests, they have a point for rendering on the server to take advantage of conditional caching headers. But as you can see in the next sections, such XHR requests for the initial page loading should be avoided anyway, and it’s possible to cache the initial data in separate script tags loaded async (assuming the initial data, or part of it, is cacheable). Keep reading.
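The idea of keeping a previous response around to answer a 304 can be sketched as follows. This is only an illustration under assumptions: `request(url, headers, done)` is a hypothetical transport (for example a thin XHR wrapper) that calls `done(status, etag, body)`, and the cache lives in memory rather than in localStorage.

```javascript
// `request(url, headers, done)` is a hypothetical transport that calls
// done(status, etag, body). It is not a real library API.
function createConditionalCache(request) {
  var stored = {}; // url -> { etag: string, body: string }
  return function get(url, callback) {
    var entry = stored[url];
    // Only send a validator when we hold a copy to fall back on.
    var headers = entry ? { 'If-None-Match': entry.etag } : {};
    request(url, headers, function (status, etag, body) {
      if (status === 304 && entry) {
        // The server confirmed our copy is current: reuse the stored body.
        callback(entry.body, true);
        return;
      }
      if (status === 200 && etag) stored[url] = { etag: etag, body: body };
      callback(body, false);
    });
  };
}
```

The second argument passed to `callback` tells the caller whether the body came from the stored copy, which can help when deciding if a view needs re-rendering.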
Initial client-side rendering performance considerations
If you decide to render your templates on the client, you must consider how to do so without sacrificing performance. Suppose your application relies on some JSON to render the initial page. It’s usual for the application to perform some AJAX requests right after it loads in order to finish rendering the page, and you should avoid this technique if you want your application to load as fast as possible.
The reason is that the AJAX request will only happen after your application code is downloaded and executed, which means it will add some overhead while that data could be downloaded in parallel or embedded in the main document. Let’s discuss each case.
Embedding all data required for the initial loading in the document body
It’s possible to avoid those extra AJAX requests upon the initial load by embedding all the data you need in script tags at the end of your document body. This should be fine if your data is small and embedding it doesn’t prevent your main page from being cacheable.
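When embedding data this way, the serialized JSON must not be able to close the script tag early. Below is a minimal sketch of a server-side helper, assuming the back-end (or a build step) runs JavaScript; the function names are made up, and `window.initialData` is just one possible place to store the data.

```javascript
// Hypothetical server-side helper: serializes data for embedding inside a
// <script> tag. Escaping "<" keeps a value such as "</script>" from
// closing the tag early; U+2028/U+2029 are escaped for older JS engines.
function serializeForScriptTag(data) {
  return JSON.stringify(data)
    .replace(/</g, '\\u003c')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

// Builds the tag to append at the end of the document body.
function renderInitialDataTag(data) {
  return '<script>window.initialData = ' + serializeForScriptTag(data) + ';</script>';
}
```

The escaped output is still valid JSON and valid JavaScript, so the browser reads back the original values.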
If your main document would be cacheable otherwise, or if your data is big enough to require some considerable extra time to finish loading the main document, which would delay some DOM load events, then this technique may not be your best bet.
I don’t recommend permanent caching (even for a specific time span) for an SPA main document. If it has some bug that needs to be fixed urgently, a permanently cached copy will prevent the fix from reaching some users. But it doesn’t mean the main document can’t be cached. Your application may use ETag or Last-Modified headers so that the browser can revalidate it.
Suppose the main document could benefit from such caching while your extra data would invalidate such caching due to its dynamic nature. In that case, you should consider whether embedding it in the end of the document body would still be a good idea. As you can see in the next subsection, it’s not the only alternative.
On the other hand, if your data is big but a great part of it is cacheable, then it’s also a good idea to extract the cacheable part and load it separately, so that you can take advantage of caching to speed up subsequent application loads.
Using separate async scripts to load initial data
The alternative to embedding the initial data in the application document is to load that data in async script tags in the head. This way, the data starts downloading very soon, in parallel with the other required resources. In that case, you should either wrap the JSON data in a function call (a JSON-P like solution) or add some custom code to store that data in some global variable (window.initialData for instance) or wherever makes sense for your application (attaching data to your body element or anything you could imagine).
When combined with code splitting, where multiple scripts are loaded concurrently, I’d recommend the JSON-P style to avoid time-based polling with setTimeout to check whether all pieces have been downloaded and evaluated. Here’s what the document head could look like:
<link rel="stylesheet" href="app.css" />
<!-- this script could be external but should be sync rather than async. Since
it's small, I'd usually embed it, although it's advised that your server-side
technology would minify it before embedding it, but it's outside the scope of
this article explaining how to do that. -->
<script>
;(function() {
  var appSettings = { loadedContent: [], handlers: {}, loaded: {} }
  window.onContentLoaded = function(id, handler, once) {
    var alreadyLoaded = appSettings.loaded[id];
    if (alreadyLoaded && once) return;
    if (!alreadyLoaded) {
      appSettings.loadedContent.push(id);
      appSettings.loaded[id] = true;
      if (handler) {
        if (once) handler(appSettings);
        else appSettings.handlers[id] = handler;
      }
    }
    for (var i in appSettings.handlers) appSettings.handlers[i](appSettings);
  }
})()
</script>
<!-- remaining async scripts: -->

<script async defer src="/static/vendors-a98fed.js"></script>
<script async defer src="/static/app-76ea865b.js"></script>
<script async defer src="/app/initial-data.js"></script>
As usual, any static build should contain a content hash in its filename so that it can be permanently cached, and your initial-data request(s) should use other cache headers like ETag and Last-Modified when possible.
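Bundlers usually generate such names for you, but the idea can be sketched in a few lines of Node.js. The `hashedFileName` function and the 8-character hash length are assumptions for this example.

```javascript
var crypto = require('crypto');

// Hypothetical build step: derives a file name from the bundle's content,
// so the name changes only when the content does.
function hashedFileName(baseName, content) {
  var hash = crypto.createHash('sha256').update(content).digest('hex').slice(0, 8);
  return baseName + '-' + hash + '.js';
}
```

Because an unchanged bundle keeps its name across deploys, browsers can keep serving it from cache, while any change yields a new URL.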
Each script could call onContentLoaded passing an id for that script (‘vendors’, ‘app’, ‘initial-data’) and an optional handler to be called whenever some resource is loaded (or just once if the once parameter is true). The handler gets the appSettings instance which can be used to check which resources have been loaded already for deciding when to take action. This way no polling should be required.
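Here is a sketch of how the three scripts might use that helper. The helper is repeated so the example runs on its own, `root` stands in for `window` outside a browser, and `startApp` as well as the sample data are hypothetical.

```javascript
// Stand-in for the browser's window so the sketch also runs outside a page.
var root = typeof window !== 'undefined' ? window : globalThis;

// Same helper as in the document head above, repeated to keep this runnable.
(function () {
  var appSettings = { loadedContent: [], handlers: {}, loaded: {} };
  root.onContentLoaded = function (id, handler, once) {
    var alreadyLoaded = appSettings.loaded[id];
    if (alreadyLoaded && once) return;
    if (!alreadyLoaded) {
      appSettings.loadedContent.push(id);
      appSettings.loaded[id] = true;
      if (handler) {
        if (once) handler(appSettings);
        else appSettings.handlers[id] = handler;
      }
    }
    for (var i in appSettings.handlers) appSettings.handlers[i](appSettings);
  };
})();

// What the app bundle might end with. `startApp` is hypothetical.
var started = false;
function startApp(data) { started = true; root.startedWith = data; }

root.onContentLoaded('app', function (settings) {
  // Handlers run on every load notification, so guard against starting twice.
  if (started) return;
  if (settings.loaded.vendors && settings.loaded['initial-data']) {
    startApp(root.initialData);
  }
});

// What the vendors bundle might end with.
root.onContentLoaded('vendors');

// What /app/initial-data.js might contain.
root.initialData = { user: 'example' };
root.onContentLoaded('initial-data');
```

The three calls can arrive in any order; the application starts on whichever notification completes the set.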
Security concerns
When loading user-sensitive data in the initial-data scripts one should be concerned about security, so that another site cannot steal the user’s data by including that script in its own pages. I think it should be enough to check the Referer HTTP header and compare it to a whitelist of domains allowed to load that script. If you want to use a CDN for these requests you should set up your CDN to forward the Referer header in that case. It’s always a good idea to check with your security team if you have one. If you think the solution proposed here is not good enough or if you have other suggestions, please comment or send me an e-mail. I’d love your feedback.
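A minimal sketch of such a Referer check, assuming a JavaScript back-end; the function name and the example hosts are made up. It compares the parsed host instead of doing substring matching, so look-alike domains are rejected.

```javascript
// Hypothetical server-side check: accept the request only when the Referer
// header parses as a URL whose host is on the allow-list.
function isAllowedReferer(referer, allowedHosts) {
  if (!referer) return false; // header missing: reject
  var host;
  try {
    host = new URL(referer).host;
  } catch (e) {
    return false; // not a valid absolute URL
  }
  return allowedHosts.indexOf(host) !== -1;
}

var allowed = ['app.example.com'];
console.log(isAllowedReferer('https://app.example.com/dashboard', allowed));  // true
console.log(isAllowedReferer('https://app.example.com.evil.test/', allowed)); // false
```

Keep in mind that some browsers and proxies strip the Referer header, so rejecting requests without it may also block some legitimate users.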