Getting Started With the WebKit Layout Code

I am a newcomer to the world of WebKit. Before starting here at Adobe I had never even looked at the source code, so I had quite a bit of learning to do once I started. I’ve done a few things in WebKit since then, but I’m definitely still in the early stages – it is a large, complex project! WebKit has a lot of documentation, but due to the rapid pace of change in the code base, much of it is very high level or outdated. As a beginner, I found this to be rather challenging when trying to learn the ins and outs of the layout system.

Out of this challenge grew the desire to make things less challenging for others, and to help my own learning process at the same time. I’ve found that the best way to learn something is to explain it to someone else, so I am beginning a series of posts about learning WebKit and WebKit layout. This first post will focus on the basics and on pointers to documentation that has helped me along the way. Note that I assume that you have a working knowledge of HTML, CSS, and JavaScript. I also assume that you have some familiarity with C-family languages, as WebKit itself is written in C++. You shouldn’t need any great knowledge of C++ to follow along, but you will definitely need to learn C++ if you want to contribute to the code.

What is WebKit?

The first thing that may stump a beginner is quite simply the definition of WebKit itself. WebKit is not a browser; it is a rendering engine. Web browsers use a rendering engine to translate the code written by developers into the visual representation familiar to end users. WebKit is used in many different browsers, from Chrome and Safari to the browser on the PlayStation 3.

In order to support all of these platforms, the WebKit code base is separated into platform dependent code and platform independent code. Things like reading HTML and CSS, executing JavaScript, and figuring out where different elements should be drawn on the screen (layout) are shared between all of the platforms. However, things like networking and drawing on the screen are implemented differently for each platform (or “port” in WebKit speak), because each platform has its own way of doing these things.
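To make the split concrete, here is a minimal sketch of shared code drawing through an interface that each port would implement. Every name in it (Rect, PlatformPainter, RecordingPainter, paintBoxes) is invented for illustration and is not one of WebKit’s real classes; the "port" here simply records the calls it receives instead of drawing anything.

```cpp
#include <cassert>
#include <string>
#include <vector>

// All names here are hypothetical; they are not WebKit's real classes.
struct Rect {
    int x, y, width, height;
};

// Interface that the platform independent code draws through.
class PlatformPainter {
public:
    virtual ~PlatformPainter() = default;
    virtual void fillRect(const Rect& rect) = 0;
};

// Stand-in for one port. It records each call instead of drawing.
class RecordingPainter : public PlatformPainter {
public:
    void fillRect(const Rect& rect) override {
        log.push_back("fillRect " + std::to_string(rect.x) + "," + std::to_string(rect.y) + " "
                      + std::to_string(rect.width) + "x" + std::to_string(rect.height));
    }
    std::vector<std::string> log;
};

// Shared code: it knows where each box goes, but not how a platform draws it.
void paintBoxes(const std::vector<Rect>& boxes, PlatformPainter& painter) {
    for (const Rect& box : boxes) {
        assert(box.width >= 0 && box.height >= 0);
        painter.fillRect(box);
    }
}
```

Under this arrangement, supporting another platform would mean writing another PlatformPainter subclass while paintBoxes stays untouched.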

WebKit is further separated into components for different functions:

[Figure: WebKit block diagram]

WebKit is not only the name of the entire project, it is also the name of the API that’s used by browsers and other applications. There is a second version of this API, called WebKit2, which was developed to bring multi-process capabilities to WebKit. JavaScriptCore is the JavaScript engine that comes with WebKit, but many browsers replace it with their own implementations. The part of the system responsible for layout is WebCore, which also covers everything to do with CSS, HTML, the DOM, rendering, and more. All of these are built on top of WTF (Web Template Framework), which is a utility library containing things like common data structures and threading primitives.

What is Layout?

When WebKit consumes a web page, it generates a DOM tree from the HTML source. The details of how this is done are outside the scope of this post, but there is an excellent article that describes the general workings and data structures behind both WebKit and Gecko and is highly recommended reading. The task of layout is to examine the DOM tree and determine the size and position of all of the elements so that the rendering step can then draw them.

Understanding the rules of layout

Unfortunately, the mechanics of layout are much too complex to summarize here. The current standard rules for laying out content are defined in the CSS 2.1 Specification, but that can be very hard to understand for the uninitiated. Both Web Platform Docs and Mozilla Developer Network have more accessible descriptions of the rules for layout. Most of what is in these documents should be familiar to seasoned web developers, as it is impossible to get a web site to behave the way you want without understanding how things will get placed on the page. While these are not a substitute for reading the spec if you are a browser implementer, being familiar with the concepts makes the spec much easier to read.

The W3C (the organization that defines many of the standards for the web) acknowledges that the CSS standard can be hard to read, and provides a helpful guide to reading the CSS specifications. The most important sections of the spec for a developer working in WebKit’s layout engine are Box model, Visual formatting model, and Visual formatting model details. Many of the variable and function names in WebKit’s layout code are taken straight out of the specification, so knowing the terminology can greatly help with understanding the system.

As you get more into the layout code, you will find things that are not specified in CSS 2.1, as WebKit has already started implementing features from CSS 3. Unlike CSS 2.1, CSS 3 isn’t a single specification, as it has been broken into modules for each feature area. The modules are all versioned independently, so there isn’t really a canonical list of what makes up CSS 3. The CSS current work page at the W3C is a very good place to see the latest status of the specifications.

How is layout done in WebKit?

While understanding the other parts of the spec is very useful, once you have an understanding of the CSS Box Model, you should be able to dive in and understand much of the layout code.
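As a quick refresher on the box model, the sketch below computes the nested box widths that CSS 2.1 describes: the content box, wrapped by padding, then border, then margin. The struct and function names are made up for this example and are not taken from WebKit.

```cpp
#include <cassert>

// Hypothetical types for illustration; these are not WebKit's classes.
struct BoxEdges {
    int top, right, bottom, left;
};

struct Box {
    int contentWidth;
    int contentHeight;
    BoxEdges padding;
    BoxEdges border;
    BoxEdges margin; // margins may be negative; padding and border may not
};

int horizontal(const BoxEdges& edges) {
    return edges.left + edges.right;
}

// Content box plus left and right padding.
int paddingBoxWidth(const Box& box) {
    assert(box.contentWidth >= 0);
    return box.contentWidth + horizontal(box.padding);
}

// Padding box plus left and right border.
int borderBoxWidth(const Box& box) {
    return paddingBoxWidth(box) + horizontal(box.border);
}

// Border box plus left and right margin.
int marginBoxWidth(const Box& box) {
    return borderBoxWidth(box) + horizontal(box.margin);
}
```

For a box with 100px of content, 10px of padding, a 1px border, and 5px margins on every side, these give 120, 122, and 132 respectively.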

I said earlier that layout is done on the DOM tree, which isn’t entirely true. Before starting layout, a tree of render objects is created. Most DOM elements get corresponding render objects, but some special ones like head and elements with display: none don’t have renderers, since they have no visual representation. Also, some elements get more than one renderer. For example, when text must be wrapped to fit into the width of its container, a new render object is created for each line, even though there is only one DOM element for the run of text. The following diagram shows a simplified version of the render tree and its relation to the DOM tree:

[Figure: DOM and render trees]
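The relationship between the two trees can also be sketched in code. The following is a simplified illustration, assuming a toy DomNode type that stores its display value as a string; DomNode, RenderObject, needsRenderer, and buildRenderTree are all hypothetical names, and real render tree construction handles many more cases (including elements that get several renderers). It needs C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy DOM node; hypothetical, far simpler than WebKit's DOM classes.
struct DomNode {
    std::string tag;
    std::string display;
    std::vector<DomNode> children;

    explicit DomNode(std::string t, std::string d = "block", std::vector<DomNode> c = {})
        : tag(std::move(t)), display(std::move(d)), children(std::move(c)) {}
};

// Toy render object that points back at the DOM node it represents.
struct RenderObject {
    const DomNode* node = nullptr;
    std::vector<std::unique_ptr<RenderObject>> children;
};

// head and display: none elements have no visual representation.
bool needsRenderer(const DomNode& node) {
    return node.tag != "head" && node.display != "none";
}

// Returns nullptr when the node (and therefore its subtree) gets no renderer.
std::unique_ptr<RenderObject> buildRenderTree(const DomNode& node) {
    if (!needsRenderer(node))
        return nullptr;
    auto renderer = std::make_unique<RenderObject>();
    renderer->node = &node;
    for (const DomNode& child : node.children) {
        if (auto childRenderer = buildRenderTree(child)) {
            assert(childRenderer->node == &child);
            renderer->children.push_back(std::move(childRenderer));
        }
    }
    return renderer;
}

// Counts the renderers in a subtree, including its root.
std::size_t countRenderers(const RenderObject& root) {
    std::size_t count = 1;
    for (const auto& child : root.children)
        count += countRenderers(*child);
    return count;
}
```

Note that skipping a display: none element also skips everything inside it, so the render tree can be considerably smaller than the DOM tree.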

The layout process is managed by the FrameView C++ class, which can start layout from the root of the render object tree (a full layout), or from a subtree (a partial layout). Full layout is the most common, as partial layout can only be done when it is certain that laying out the subtree cannot affect any elements that are not part of that subtree. For example, partial layout is used when text fields are updated by the user typing in new text.
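One way to picture the choice between full and partial layout is the sketch below. It is an assumption-laden illustration, not FrameView’s actual logic: the Renderer struct, the isLayoutBoundary flag, and chooseLayoutRoot are all invented here. The idea is to walk up from the renderer that changed, marking each ancestor as needing layout, and to stop at the first renderer known to contain the effects of the change (such as a text field). If no such renderer is found, the walk reaches the root and the layout is a full one.

```cpp
#include <cassert>

// Hypothetical renderer with just enough state for this sketch.
struct Renderer {
    Renderer* parent = nullptr;
    bool isLayoutBoundary = false; // invented flag: changes inside cannot affect the outside
    bool needsLayout = false;
};

// Marks the changed renderer and its ancestors, and returns where layout should start.
Renderer* chooseLayoutRoot(Renderer& changed) {
    for (Renderer* current = &changed; current; current = current->parent) {
        current->needsLayout = true;
        // A boundary gives a partial layout; running out of parents gives a full layout.
        if (current->isLayoutBoundary || !current->parent)
            return current;
    }
    assert(false && "unreachable: the loop always returns");
    return nullptr;
}
```

With a text field marked as a boundary, a change to the text inside it returns the text field and leaves the tree root untouched, while a change in an ordinary paragraph returns the root.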

Regardless of whether a full or partial layout is initiated, once the layout method is called on the renderer at the root of the tree (or subtree), it follows this recursive algorithm:

1. The current renderer computes its width.
2. For each child, the current renderer:
   a. Determines the position of the child.
   b. Asks the child to compute its dimensions.
3. The current renderer computes its height.
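The recursive steps above can be sketched in a few lines of C++. This is a deliberately simplified model that assumes every box is a block filling the width of its container, with children stacked vertically; RenderBox and intrinsicHeight are hypothetical names and not WebKit’s real classes or fields. It needs C++17.

```cpp
#include <cassert>
#include <vector>

// Hypothetical block-level box; real renderers handle far more cases.
struct RenderBox {
    int x = 0;
    int y = 0;
    int width = 0;
    int height = 0;
    int intrinsicHeight = 0; // height of a leaf's own content (assumed to be known)
    std::vector<RenderBox> children;

    void layout(int availableWidth) {
        assert(availableWidth >= 0);

        // The current renderer computes its width.
        width = availableWidth;

        // For each child: determine its position, then ask it to lay itself out.
        int nextY = 0;
        for (RenderBox& child : children) {
            child.x = 0;
            child.y = nextY;
            child.layout(width);
            nextY += child.height;
        }

        // The current renderer computes its height, which depends on its children.
        height = children.empty() ? intrinsicHeight : nextY;
    }
};
```

The ordering matters: width flows down the tree before the children are laid out, while height flows back up once they are done, which is why the height can only be computed last.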

Of course there are exceptions and special cases to this, which are as complex as the rules for layout themselves. For much more detail, you should read the render tree construction and the layout sections of How Browsers Work.

Once the layout process has finished, the tree can then be drawn. Each render object has a paint method to accomplish this, but that’s no longer layout and thus outside of the scope of this post.

That’s great, but how does this match up with the code?

It can be a bit dangerous to give links to specific parts of the WebKit code base, as it changes so rapidly that the links are soon out of date. Thus, I will stick to generalizations here. The C++ classes that make up the DOM tree live under Source/WebCore/dom, while the render classes live in Source/WebCore/rendering. Instead of duplicating what has already been written, I will defer to Dave Hyatt’s blog post on Layout and Rendering Basics for a list of the major C++ classes that are used in the layout process. The links in his post are broken, but the names of the classes and the files that contain them are the same. (You can look in the directories I mentioned above for the classes, or just insert “Source/” in the URLs between “trunk/” and “WebCore/”.)

The basics article is the first in a series of blog posts that Dave Hyatt wrote about WebKit layout. They are a bit old and incomplete, but are one of the most useful sources of information about WebKit layout.

I would be remiss if I didn’t mention Eric Seidel’s talk on rendering in WebKit. It is of more recent vintage than Hyatt’s blog posts, and packs a lot of knowledge into a 30-minute presentation.

What now?

If that didn’t give you enough reading, I’ll be following up with a post on a way to visualize which render classes are responsible for which parts of the page. After that, I intend to cover layout tests and DumpRenderTree. And you’ll just have to stay tuned to find out what happens after that! I’ll be sure to update this post with links to the future posts in this series.