Cargo Cult Web Performance Optimization
Summarized using AI

by Ilya Grigorik

The video, titled "Cargo Cult Web Performance Optimization" by Ilya Grigorik, presented at GoGaRuCo 2012, explores the intricacies of web performance and how understanding the browser's architecture can lead to better web applications. Grigorik emphasizes the shift from browsers being a black box to being an open system that developers can investigate, particularly focusing on the WebKit and Chromium engines that power many devices today. Here are the key points discussed throughout the video:

  • Importance of Browser Fundamentals: Many developers lack a fundamental understanding of how browsers work, which is essential for optimizing web performance.
  • Architecture of WebKit/Chromium: WebKit is a browser engine whose WebCore component handles essential tasks such as parsing HTML and constructing the DOM tree. It also includes a JavaScript engine and relies on platform APIs for rendering and network operations.
  • Components of Web Performance: The average web page is over a megabyte and requires optimization techniques such as DNS prefetching and TCP preconnect, which help to reduce load times significantly.
  • Minimizing Resource Requests: Understanding which resources block the browser's parser is critical; using the async and defer attributes on script tags keeps external scripts from blocking parsing.
  • Building the Render Tree: This process involves creating a visual representation from the DOM and CSS objects, focusing on maintaining a high frame rate for smooth rendering.
  • Impact of Hardware Acceleration: Implementing CSS transforms can enhance performance, but developers must consider the trade-offs associated with this approach.
  • Call for Education: Grigorik stresses the need for more educational resources about browsers in computer science curricula to fill existing knowledge gaps among developers.

The video concludes with a reminder that understanding browser mechanisms not only aids in performance optimization but also empowers developers to make informed choices in web design, ultimately improving user experience. Grigorik advocates for expanding educational opportunities surrounding browser technologies to spark innovation in the future.

00:00:08.840 So, Ilya Grigorik has been with us at every GoGaRuCo event. It's great to have him back; he's one of the speakers we enjoy most. When he started, he was the founder of his own startup company, and last year it was acquired by Google, where he now works on the "Make the Web Fast" team. He's going to drop some knowledge on you.
00:00:27.439 I think many of us were watching a one-hour infomercial earlier this week done by Apple. Everyone took something different away from that, but the number that stood out to me the most was when Apple announced that they have shipped or activated more than 400 million devices, which is remarkable. Not to be outdone, on the same day, Android announced that they activated half a billion devices. Combining those two numbers means we have almost a billion smartphones. This is interesting because all of those devices run WebKit, they have a browser, and we see about 2 million new activations of these devices every day.
00:01:04.720 From an engineering standpoint, we spend more than 50% of our time on our computers within the browser. It's the tool into which we throw HTML, and while it often gets it wrong in terms of layout, there are a billion devices relying on WebKit, making it the largest deployment platform available to us. Regardless of whether you're developing for iOS, Android, or another platform, at some point you're using WebKit, whether it's in a UIWebView, a WebView, or a native browser app.
00:01:40.439 I feel that we're missing actual education or understanding of the fundamentals of how the browser works. When looking under the hood, it’s actually an entire operating system. You could teach an entire Computer Science curriculum on the various elements that function within the browser. We have graphics, high-performance networking, and even machine learning. Through technologies like WebRTC, we're dealing with distributed computing. Virtually every field of computer science is applied here, and browsers are pushing the boundaries of a lot of research in these areas.
00:02:04.800 Every year, I like to step back and examine what I’m doing as a developer and what I need to do to make good progress. Early this year, I realized that I don't really understand the browser. I write apps and use the browser regularly, but I didn’t know what was happening underneath the surface. Over the last six months, I’ve taken the time to dive into the source code, which is a remarkable thing because, until very recently, the browser was a black box.
00:02:45.959 With these billion devices running WebKit, the code is available. Whether you prefer to use Firefox or Chromium, we can inspect this code. Unfortunately, educational institutions, including universities, currently do not teach you anything about the browser. Technology is advancing too quickly; it may take a decade before we see browser-building courses in Computer Science curricula.
00:03:11.519 In the meantime, we need to fill these gaps ourselves, and that, over the last six months, has been one of the best investments I've made. Understanding how browsers work pays high dividends, and in the next 25 minutes, I hope to help you start your own journey into this world. A browser is a complex system with a lot of code; for example, checking out WebKit or Chromium will yield about 4.5 gigabytes of data.
00:03:51.920 There are many major moving blocks or components, so first, let's talk about the architecture. The browser is not a black box; if you’ve paid attention to the WebKit logo, it’s a white box—an open white box. So what exactly is WebKit? It's a browser engine, not a browser itself. You can't build WebKit and directly render a page without a lot of other components.
00:04:18.960 To understand how it works, think of WebKit at its core, with two primary interfaces: one at the top for embedding APIs (like the Chrome interface) and one at the bottom for platform APIs, which grant access to machine capabilities. WebKit includes many components, and you can choose to swap them out. However, the central component, WebCore, is shared across all WebKit browsers.
00:05:02.479 WebCore handles essential tasks such as parsing HTML, constructing the DOM tree, and dealing with the CSS object model, all of which require substantial engineering effort. Thus, many browsers avoid the insanity of building separate object models by reusing WebCore to build one unified model.
00:05:39.960 WebKit also comes with a JavaScript engine, which is essential for most web applications. Originally based on the KJS engine from KDE, it has undergone numerous revisions and improvements, and now provides just-in-time (JIT) compilation and generational garbage collection for efficient JavaScript execution. While WebKit includes this engine, JavaScriptCore, browsers like Chrome have opted for their own implementation, the V8 engine.
00:06:19.200 Additionally, the platform APIs are crucial as they include network stacks and graphics engines, which manage how content is displayed on the screen. By default, WebKit does not output anything visually; it necessitates the inclusion of these platform components based on the device being used. Handling fonts is another complex area within the graphics engine, and browser capabilities can vary widely, impacting performance.
00:06:54.280 To illustrate, take Chrome on macOS. It reuses as many components as possible across platforms to ensure consistency and reliability. For rendering, it uses Skia; it has its own networking stack; for fonts, it depends on the macOS Quartz renderer; and V8 handles JavaScript execution. In contrast, the Android browser implemented many components on its own, which contributes to the different performance characteristics among browsers powered by WebKit.
00:07:46.400 Now, what does it actually take to put a page together in WebKit? The W3C Web Performance Working Group developed a comprehensive diagram that outlines all the components involved, with timers tracking various aspects of page rendering. The three major components are network, server, and browser execution. Although the diagram is not drawn to scale, it provides insight into the numerous moving parts.
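To make the network, server, and browser split concrete, here is a minimal sketch that divides one page load into those three phases. The dictionary keys follow the attribute names of the W3C Navigation Timing interface; the timestamp values and the phase_breakdown helper are hypothetical, and treating time to first byte as "server" time is a simplifying assumption.

```python
# Hypothetical timestamps in milliseconds since the start of navigation.
# The keys mirror attribute names from the Navigation Timing interface.
sample_timing = {
    "navigationStart": 0,
    "domainLookupStart": 5,
    "domainLookupEnd": 45,
    "connectStart": 45,
    "connectEnd": 105,
    "requestStart": 106,
    "responseStart": 306,
    "responseEnd": 356,
    "loadEventEnd": 920,
}


def phase_breakdown(t):
    """Split one navigation into rough network, server, and browser time."""
    dns = t["domainLookupEnd"] - t["domainLookupStart"]
    tcp = t["connectEnd"] - t["connectStart"]
    download = t["responseEnd"] - t["responseStart"]
    # Assumption: time to first byte stands in for server time, although it
    # also contains one network round trip.
    server = t["responseStart"] - t["requestStart"]
    # Everything after the last response byte is parsing, layout, scripts.
    browser = t["loadEventEnd"] - t["responseEnd"]
    return {"network": dns + tcp + download, "server": server, "browser": browser}


breakdown = phase_breakdown(sample_timing)
print(breakdown)
```

With these made-up numbers the browser phase dominates, which is the kind of observation the diagram is meant to enable.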
00:08:53.919 In my exploration of the network stack, I learned that today, the average web page is over a megabyte in size, connecting to over 30 different hosts and sending around 80 different requests. Despite this, we demand that pages load within 300 milliseconds. The browser has become increasingly intelligent; for instance, it can implement DNS prefetching to resolve hostnames before they're clicked, significantly reducing load times.
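The idea behind DNS prefetching can be sketched in a few lines: resolve the hostnames a page links to before the user clicks, and keep the answers in a cache. This is only an illustration of the technique, not how any browser implements it; prefetch_dns and its resolve parameter are hypothetical names, and the default resolver is the standard library's socket.gethostbyname.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def prefetch_dns(hostnames, resolve=socket.gethostbyname):
    """Resolve hostnames in parallel and return a {hostname: address} cache.

    Hosts that fail to resolve are skipped, since a prefetch is only a hint.
    """
    def lookup(name):
        try:
            return name, resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            return name, None

    unique = list(dict.fromkeys(hostnames))  # de-duplicate, keep order
    if not unique:
        return {}
    with ThreadPoolExecutor(max_workers=min(8, len(unique))) as pool:
        results = list(pool.map(lookup, unique))
    return {name: addr for name, addr in results if addr is not None}
```

Passing the resolver as a parameter keeps the sketch testable without network access; a later navigation would consult the returned cache instead of waiting on a fresh lookup.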
00:09:37.680 In addition to DNS prefetching, there's TCP preconnect, a method where the browser opens connections in advance so that when a link is clicked, the connection is already established. This approach saves 50 to 100 milliseconds on average. Browsers also optimize connections and leverage caching effectively, especially on mobile devices, which have limited storage.
00:10:28.200 These optimization strategies, such as prefetch, preconnect, pooling, and caching, are implemented variably across different browsers but are fundamental to improving performance. Chrome performs uniquely by remembering the resources associated with frequently visited sites. For example, on returning to CNN.com, it preconnects to previously established hosts automatically, facilitating faster load times.
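The "remember which hosts a site needs" behaviour can be pictured as a small lookup table keyed by site. The sketch below is a toy model under that assumption, not Chrome's actual data structure; PreconnectPredictor, its method names, and the example hostnames are all hypothetical.

```python
from collections import Counter, defaultdict


class PreconnectPredictor:
    """Remember which hosts a site's pages fetched subresources from."""

    def __init__(self):
        self._seen = defaultdict(Counter)

    def record_visit(self, site, subresource_hosts):
        # Count each host once per visit, however many requests it served.
        self._seen[site].update(set(subresource_hosts))

    def hosts_to_preconnect(self, site, limit=3):
        # Most frequently seen hosts first; ties are broken alphabetically.
        counts = self._seen.get(site, Counter())
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        return [host for host, _ in ranked[:limit]]


predictor = PreconnectPredictor()
predictor.record_visit("news.example", ["cdn.example", "ads.example", "cdn.example"])
predictor.record_visit("news.example", ["cdn.example", "img.example"])
print(predictor.hosts_to_preconnect("news.example", limit=2))
```

On the next visit to the same site, the returned hosts are the ones worth opening connections to before the HTML has even been parsed.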
00:11:24.520 Throughout my exploration of web performance, I realized the significance of minimizing requests and understanding what blocks the browser's parser. For instance, a simple HTML5 page can be blocked while parsing if it encounters a script tag that needs to load JavaScript before proceeding, leading to noticeable latency and visual delays.
00:12:12.880 This issue can be mitigated through the use of the async and defer attributes on script tags, which tell the browser that these scripts need not block the parser. The preload scanner, another tool within the browser, seeks out critical resources that must be loaded early, minimizing performance impediments. It scans ahead in the document for tags that reference external resources, so that critical items can be requested first.
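A toy version of a preload scanner shows the idea: walk the markup, collect the URLs of external resources, and flag external scripts that carry neither async nor defer. This sketch uses Python's standard html.parser module and is far simpler than a real browser's scanner; the PreloadScanner class and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class PreloadScanner(HTMLParser):
    """Collect resource URLs and flag scripts that would block the parser."""

    def __init__(self):
        super().__init__()
        self.resources = []        # (kind, url) pairs in document order
        self.parser_blocking = []  # external scripts without async/defer

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "script" and attr.get("src"):
            self.resources.append(("script", attr["src"]))
            # A valueless attribute such as <script async> shows up as a key
            # whose value is None, so test for the key itself.
            if "async" not in attr and "defer" not in attr:
                self.parser_blocking.append(attr["src"])
        elif tag == "link" and (attr.get("rel") or "").lower() == "stylesheet":
            if attr.get("href"):
                self.resources.append(("stylesheet", attr["href"]))
        elif tag == "img" and attr.get("src"):
            self.resources.append(("image", attr["src"]))


page = """<html><head>
<link rel="stylesheet" href="style.css">
<script src="app.js"></script>
<script async src="analytics.js"></script>
<script defer src="widgets.js"></script>
</head><body><img src="hero.png"></body></html>"""

scanner = PreloadScanner()
scanner.feed(page)
print(scanner.resources)
print(scanner.parser_blocking)
```

Only app.js is reported as parser-blocking here, because the other two scripts are marked async or defer.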
00:13:09.600 The key takeaway is that the network stack feeds data into the tokenizer without waiting for the entire page to download. The browser constructs the DOM tree and utilizes preload scanners to identify blocking resources without incurring delays. The goal is to schedule resources efficiently, where script execution can degrade performance if not managed correctly.
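Incremental tokenizing can be demonstrated with Python's standard html.parser, which also accepts input in pieces: tags are reported as soon as they are complete, and a tag split across two chunks is held back until the rest arrives. This is an analogy for the behaviour described above, not WebKit's tokenizer; TagLogger, tags_seen_per_chunk, and the chunk boundaries are hypothetical.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record the name of every start tag the parser has seen so far."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tags_seen_per_chunk(chunks):
    """Feed chunks one at a time and snapshot the known tags after each."""
    parser = TagLogger()
    progress = []
    for chunk in chunks:
        parser.feed(chunk)  # incomplete trailing markup stays buffered
        progress.append(list(parser.tags))
    parser.close()
    return progress


# The chunks deliberately split tags mid-name, as network packets might.
chunks = ["<html><bo", "dy><h1>Hi</h1><p>Hel", "lo</p><img src='x.png'></body></html>"]
progress = tags_seen_per_chunk(chunks)
for seen in progress:
    print(seen)
```

After the first chunk only html is known; body, h1, and p appear with the second chunk, well before the document is complete.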
00:14:01.120 Managing dependencies in JavaScript is popular, but it can hinder the browser's efficiency: resources requested from script stay hidden from the browser until that script runs, which withholds information it needs for optimal resource scheduling. The document preload scanner demonstrates that recognizing early which resources to load can vastly improve page performance.
00:15:03.840 Building the render tree, which is constructed from the DOM tree and the CSS object model, is another essential aspect of web performance. The render tree comprises only the elements that will be displayed on the page, excluding those such as meta tags. The process also involves render layers for specific elements, such as videos, which require additional resources to be rendered properly.
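The relationship between the DOM tree and the render tree can be sketched as a filter: copy the DOM, but drop nodes that never produce anything visible. This is a minimal model under stated assumptions, not WebCore's implementation; DomNode, RenderNode, NON_VISUAL_TAGS, and the style dictionary (standing in for resolved CSS) are hypothetical.

```python
from dataclasses import dataclass, field

# Assumption: a small, non-exhaustive set of tags that never render a box.
NON_VISUAL_TAGS = {"head", "meta", "title", "script", "link", "style"}


@dataclass
class DomNode:
    tag: str
    style: dict = field(default_factory=dict)    # stand-in for resolved CSS
    children: list = field(default_factory=list)


@dataclass
class RenderNode:
    tag: str
    children: list = field(default_factory=list)


def build_render_tree(node):
    """Return a RenderNode for `node`, or None if it produces no box."""
    if node.tag in NON_VISUAL_TAGS or node.style.get("display") == "none":
        return None  # the whole subtree is skipped
    rendered = (build_render_tree(child) for child in node.children)
    return RenderNode(node.tag, [r for r in rendered if r is not None])


def outline(render_node):
    """Flatten a render tree into nested tuples for easy inspection."""
    return (render_node.tag, [outline(child) for child in render_node.children])


dom = DomNode("html", children=[
    DomNode("head", children=[DomNode("meta"), DomNode("title")]),
    DomNode("body", children=[
        DomNode("h1"),
        DomNode("div", style={"display": "none"}, children=[DomNode("p")]),
        DomNode("p"),
    ]),
])
print(outline(build_render_tree(dom)))
```

The head subtree and the hidden div, including its child paragraph, are absent from the result, while the visible h1 and p survive.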
00:16:23.840 As we aim for 60 frames per second in web design, it's critical to ensure that rendering times remain below 16 milliseconds per frame. In practice, this requires significant optimization, given that certain pages can exceed this time due to poorly timed JavaScript execution and resource loading.
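The 16 millisecond figure follows directly from the frame rate: 1000 ms divided by 60 frames is about 16.67 ms per frame. The short sketch below does that arithmetic and counts frames that overran the budget; the function names and the sample frame times are hypothetical.

```python
def frame_budget_ms(fps=60):
    """Time available per frame, in milliseconds, at the given frame rate."""
    return 1000.0 / fps


def dropped_frames(frame_times_ms, fps=60):
    """Count frames whose work took longer than the per-frame budget."""
    budget = frame_budget_ms(fps)
    return sum(1 for t in frame_times_ms if t > budget)


# Made-up per-frame work times in milliseconds.
samples = [4.0, 12.5, 18.0, 16.0, 33.4]
print(round(frame_budget_ms(60), 2))
print(dropped_frames(samples))
```

Any JavaScript, layout, or paint work that pushes a frame past the budget shows up as a dropped frame, which is why long-running scripts are a common source of jank.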
00:17:43.440 It’s essential to maintain a consistent frame rate rather than fluctuating between high and low rates. Maintaining a constant frame rate helps avoid perceptual discrepancies for users. Google has discussed different strategies to manage frame rates effectively in development talks.
00:18:57.360 When discussing performance optimizations like hardware acceleration, it's crucial to understand that forcing an element onto the GPU, for example by applying a CSS transform such as translateZ(0) through WebKit's -webkit-transform property, can have a significant impact on performance. While some developers recommend applying this transform to improve page speed, it's vital to consider that pushing elements to the GPU has trade-offs, like increased bandwidth use and potential battery drain.
00:20:02.720 In conclusion, while there are substantial technical intricacies regarding how browsers function and how web pages are rendered, this knowledge is pivotal for developers. You don't need to tackle everything at once but rather focus on individual components over time. Understanding the browser's underlying mechanisms will facilitate better design choices that enhance user experience.
00:21:56.440 As a final note, exploring the document parser and the preload scanner can provide significant insights into how a browser constructs a web page and optimizes resource loading. This knowledge is essential for any developer engaging in web design, as it directly impacts user experience.
00:22:49.640 With that said, I will take questions. By the way, this is not an official logo.
00:23:27.020 Audience question: Imagine I'm Larry Page, and you have unlimited funding for the next two years. What are the top three things you would work on to make the web faster? Answer: In the context of a browser, from my perspective, I wouldn't focus on three things specifically; instead, I would dedicate the funding to education about browsers, which is crucial.
00:24:21.280 There’s a common misconception that education about browsers isn't necessary because they seem mundane compared to topics like file systems or graphics. However, it's imperative to comprehend the underlying technology because what often happens is that after graduating, most developers end up building for the web over other systems.
00:24:55.280 Thus, I would advocate for more courses on these topics in Computer Science departments, much like how courses on MapReduce are emerging now. In the future, I hope we can expand educational offerings around browser technology.
00:25:19.440 It’s going to be essential for developers to fill those gaps that currently exist. Gaining a better understanding of these issues will pave the way for more innovation in the future.