Rendering in WebKit


>>SEIDEL: So, this talk is a little bit longer.
We’re going to go into the guts of WebKit. If you guys are sitting with laptops, you
can follow along because we’re going to go through some of the sources. I am Eric. I’ve been
here at Google for a couple of years. I work in the San Francisco office, which is why many
of you have never seen me before. And I’ve been with WebKit since ’05 and done lots
of stuff in WebKit. So, WebKit, for those of you who are just Chrome people and don’t
actually know what’s way under the covers: it’s the engine that actually draws the web pages.
It’s almost 2 million lines of code, most of it is written in C++, it’s used everywhere.
About 10 percent of the total web browser share is covered by WebKit-based browsers
at this point. We have 150 committers, 80 of which are active. That’s actually a number
that’s been growing very quickly in the last couple of years. And 40 of those work here
at Google, 6 of us are reviewers. So, who uses WebKit? This list actually goes on and
on and on, but every browser that’s not Firefox, IE, or Opera is a WebKit-based browser,
basically. So, WebKit started back from KDE’s KHTML, which was forked by Apple in 2001 and open
sourced again in ’05. There are no official releases that come from WebKit itself, but
they do provide nightlies. So, WebKit is somewhat of a confusing term because it refers both
to the project and to a library itself called WebKit. That thin library that sits on top,
that you actually link against, is what is responsible for interacting with the operating
system. WebCore, which is most of what we’re going to be talking about today, is where all
the rendering actually happens. And then, there’s a JavaScript library below that, which
is normally JavaScriptCore, but we use V8. So, here’s a pictorial representation,
the platform directory which you see there in gray is all the hooks for talking to the
operating system which Darren talked about some in his talk. So, what does a browser
do? A lot of things. This is a few of them. And we’re going to talk about these today,
mostly from a 30 thousand foot perspective but we’ll show enough of the guts that you
can–well, at least, be dangerous. So, the first thing that we’re going to talk about
is loading. This is how we actually get the data into our engine, so we can do stuff with
it. Loading is unfortunately ridiculously complicated. It’s split: part of the responsibility
lives up at the platform layer and part of the responsibility lives in WebCore itself.
Most of the code is inside WebCore/loader if you’re playing along at home. There is
some that’s in WebCore/platform/network, and then there’s the FrameLoaderClient, which is
the primary way in which WebCore talks back to the WebKit layer to actually do the network
request. And then, there are two types of loads inside WebKit. One is for loading an
actual page, a frame, and the other is for loading everything else that the frame depends
on. And these go through completely separate paths inside WebKit. So, loading a frame:
most of this is in one huge monolithic class called FrameLoader. It touches a bunch of
other pieces, but that’s where most of it resides. There are three phases that you care about if
you’re looking at this from a high level. The first of which is the policy stage, this
is where WebCore kicks off initial load. We go into the policy question. We ask the embedder
WebKit, Do you want to even–do you want to allow us to open this window? This is a pop-up,
shall we block it? This is where we as Chromium make a decision. You are navigating to a tab
that is no longer in the same domain. So, you’re in GMail and you click on an apple.com
link. We’re going to fork off a new process during this policy decision stage. The second
stage is the provisional phase, and this is where we make a decision as to whether this
is a download, so we’re going to shoot the process that we just created
in the head and let the browser handle the download, or whether we’re going
to commit this load and this is going to replace the current contents. So, you click on the
link and if it’s a committed load, it replaces the contents of the page. And then once you
get committed, that’s when you actually start parsing, and this is where the actual data
transfer, incremental display, that sort of thing happens.
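The three phases just described (policy, provisional, committed) can be pictured as a small state machine. What follows is only a sketch under stated assumptions: `FrameLoadSketch`, `EmbedderPolicy`, and `LoadState` are hypothetical names invented for illustration and are not FrameLoader's actual interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout; this is not WebCore's FrameLoader API.
enum class LoadState { Policy, Provisional, Committed, HandedOffAsDownload, Blocked };

// Stands in for the answers the embedder (the WebKit layer) gives back.
struct EmbedderPolicy {
    bool allowNavigation;  // policy stage: e.g. "is this a pop-up we should block?"
    bool treatAsDownload;  // provisional stage: download, or a page to commit?
};

class FrameLoadSketch {
public:
    LoadState state() const { return m_state; }

    // Policy stage: ask the embedder whether the load may proceed at all.
    void decidePolicy(const EmbedderPolicy& policy) {
        if (m_state != LoadState::Policy)
            return;
        m_state = policy.allowNavigation ? LoadState::Provisional : LoadState::Blocked;
    }

    // Provisional stage: either hand the load off as a download or commit it.
    void receiveResponse(const EmbedderPolicy& policy) {
        if (m_state != LoadState::Provisional)
            return;
        m_state = policy.treatAsDownload ? LoadState::HandedOffAsDownload
                                         : LoadState::Committed;
    }

    // Committed stage: data arrives incrementally and would be fed to the parser.
    // Returns false if the load has not been committed.
    bool receiveData(const std::string& chunk) {
        if (m_state != LoadState::Committed)
            return false;
        m_received += chunk;
        return true;
    }

    const std::string& received() const { return m_received; }

private:
    LoadState m_state = LoadState::Policy;
    std::string m_received;
};
```

In this sketch, data is accepted only after commit, which mirrors the point that parsing and incremental display begin once the load is committed.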
>>Is creating a new process part of WebCore or just part of Chromium content?
>>SEIDEL: Creating a new process is all part of Chromium. I was just identifying that
as where we hook in. Loading a subresource: this is done by an interface called DocLoader,
and this is where it hits the WebCore in-memory cache, called Cache. DocLoader is a device
that you hand a URL and it gives you back a CachedResource. It may have to talk
to the loader to do that, it may have to talk to the cache to do that, but in the end it gives
you a CachedResource. And the CachedResource is what actually handles all the callbacks
and produces an object in the end that you can deal with like a font or an image. So
that’s a simplified, very simplified view of loading. The next stage that we go through
is Parsing. This is where we make a DOM Tree. So, there’s several parsers inside WebKit,
but the two that you care about are the HTML Parser and the XML Parser. We take the data
stream that comes off of the network and we feed it in to a Tokenizer, either the HTML
Tokenizer or the XML Tokenizer. And then from those, we parse it. The XML Tokenizer is actually
libxml under the covers, at least on most platforms, and so it actually hands
us SAX-style callbacks and we don’t have to have separate parsing logic. At this time
we also do things like handling DNS Prefetch, we do pre-load scanning for starting image
loads and CSS loads before we’ve actually finished parsing the entire page, and this
is when the XSS-Auditor runs. Okay, so, in order to talk any further, we need to talk
about some of the data structures that WebKit actually builds from this data that we just
parsed off the network. The first of which is the DOM Tree in this forest of trees, lots
of trees. We’ll talk about the DOM Tree, we’ll talk about the ones that are used in the rendering
tree, which Darren mentioned some of in his talk and Brett mentioned some of in his talk,
and then Line Layout of the actual lines of the page. So, these are the stages that we go
through: we load, we parse, produce a DOM Tree, we attach that DOM Tree to produce
a render tree, we resolve style on that tree, and then we lay out the tree and then we paint,
which are things that Brett covered in his talk. So, the DOM Tree (you’ve seen DOM APIs in
JavaScript, document.createElement, that sort of thing) is a tree of values of the HTML
page: element nodes, attribute nodes, CSS stylesheets. In this example of a DOM Tree, you
can see how the HTML on the left maps to actual WebCore elements on the right. The DOM Tree
is one piece; the rendering tree is what we produce from the DOM Tree. This is where we hold
all the style information, links to things like plug-ins, shadow nodes for forms. This is what
we actually lay out, resolve style on, and finally tell to paint.
>>[INDISTINCT]
>>SEIDEL: I’m sorry?
>>[INDISTINCT]
>>SEIDEL: When you click on the link, what did you actually click on? When you hover
over a page, where did the hover event go? So, you have to figure out what node that
hits. So, yes. So, within what we refer to as the rendering tree—there are actually
four trees that we care about; RenderObjects, RenderStyle, RenderLayers, and Line Boxes.
The RenderObject Tree is everything we need to paint, it is hung off of the DOM Tree and
it’s only created for rendered content. So, if you say display: none [INDISTINCT], the
entire render tree that might exist underneath that is gone; we never create one for it.
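As a rough illustration of "only created for rendered content", here is a minimal sketch of attaching a DOM tree to build a render tree while skipping display: none subtrees. The types `DomNode` and `Renderer` and the functions `attach` and `countRenderers` are hypothetical simplifications made up for this example; WebCore's Node and RenderObject classes are far richer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a DOM node.
struct DomNode {
    std::string tag;
    bool displayNone = false;
    std::vector<std::unique_ptr<DomNode>> children;
};

// Hypothetical, simplified stand-in for a RenderObject.
struct Renderer {
    const DomNode* node = nullptr;  // points back into the DOM tree
    std::vector<std::unique_ptr<Renderer>> children;
};

// Builds renderers for a DOM subtree. A display:none node gets no renderer,
// and neither does anything underneath it.
std::unique_ptr<Renderer> attach(const DomNode& node) {
    if (node.displayNone)
        return nullptr;
    auto renderer = std::make_unique<Renderer>();
    renderer->node = &node;
    for (const auto& child : node.children) {
        if (auto childRenderer = attach(*child))
            renderer->children.push_back(std::move(childRenderer));
    }
    return renderer;
}

// Counts the renderers in a render subtree (0 for a null root).
std::size_t countRenderers(const Renderer* root) {
    if (!root)
        return 0;
    std::size_t total = 1;
    for (const auto& child : root->children)
        total += countRenderers(child.get());
    return total;
}
```

With a body containing one visible div and one display: none div that has its own child, only the body and the visible div would end up with renderers in this sketch.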
If you display: none a plug-in, we never bother to load the plug-in because it never gets
a renderer. This is what a RenderObject class would look like. If you have the source in
front of you, there are a lot more methods than this, but you can see it knows how to lay out
itself, it knows how to paint, it can give you the size of the tree underneath it, it
points back to the DOM tree, that’s what the node is, it hangs style off of it and it has
layers that hang off of it. We’ll talk about layers in a minute. Some example RenderObjects,
blocks, inlines, images, text, there’s some base classes for the render tree, one of which
is the BoxModelObject: basically, everything that follows the CSS rules is a BoxModelObject, and
everything that follows SVG’s rules, which are completely separate, inherits from the SVG tree.
Here’s an example of mapping, on the left hand side we see a DOM Tree, on the right
hand side we see the render tree so DivElement maps to a block, the text maps to render text.
So, going back to our set of stages, we’ve now talked about loading, getting the source
text, we’ve created a DOM Tree from it. We’ve created a RenderObject Tree from it. Now,
we are going to talk about resolving style and layout of this tree. So, RenderStyles
are the computed style values. So, when you do div style equals background-color: red, that
is computed into a color value with an RGB value that’s stored off of this render style,
so it’s ready to paint. There’s a whole phase about resolving style–that would be another,
at least 10 minutes of talk and we are going to skip over that but this is where the style
information is held, and we use it for layout. Two things are important to point out here.
One is that RenderObjects can share the same styles for memory efficiency, but it means that, say,
if you’re hacking on WebCore, you can’t just grab a style and start modifying it because
somebody else might be using it. Also styles, again for memory efficiency, inherit from
their parent and commonly share data members. So, like when we look at the actual RenderStyle
class, you see down at the bottom the DataRef, inheritedData, you probably don’t have your
own inheritedData object, you point to your parents. So, as you can see here, the render
style class, commonly you instantiate them by inheriting from a RenderStyle, and then
it has a zillion of these accessor methods to get the color, to set the color, the original
color, to get the shadow value, to set the shadow value, the original shadow value, all
those. Uh-hmm.
>>When you say it’s owned by the render
tree, does that mean your RenderObjects will hold a reference to the RenderStyles and
then apply them to the object or something?
>>SEIDEL: The rendering tree will create styles and hold pointers to them, yes.
>>Do you RenderObject Tree [INDISTINCT]
>>SEIDEL: Technically the method is actually
in the DOM Tree but the styles are held off of the Rendering Tree. Yes.
>>Are the RenderStyleObjects immutable since they’re shared?
>>SEIDEL: The RenderStyleObjects are not immutable but they are shared.
>>So, you have to be careful to do all the changing before you…
>>SEIDEL: You have to clone one if you for some reason are modifying a style outside
the CSS system, normally you would just tell the CSS system, I’m changing this attribute
or whatever and it would take care of making style shared correctly et cetera. But if you
are manually overwriting a style, you need to clone it first.
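The "clone it first" rule can be sketched with shared ownership and copy-on-write. This is an illustration under assumptions, not WebCore code: `Style` and `Box` are hypothetical stand-ins for RenderStyle and RenderObject, and `std::shared_ptr` is swapped in for WebKit's own reference counting.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for RenderStyle: one computed value, ready to paint.
struct Style {
    std::uint32_t backgroundRGB = 0xFFFFFF;
};

// Hypothetical stand-in for RenderObject: several boxes may share one Style.
struct Box {
    std::shared_ptr<Style> style;

    // Returns a style that is safe to modify. If anyone else is sharing the
    // current style, clone it first so the other owners are unaffected.
    Style& mutableStyle() {
        if (style.use_count() > 1)
            style = std::make_shared<Style>(*style);
        return *style;
    }
};
```

In this sketch, writing through `mutableStyle()` on one of two boxes that share a style leaves the other box's colour untouched, while a box that is the sole owner is modified in place without a copy.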
>>Okay.
>>SEIDEL: So, okay. So, we’ve covered all
the way to style resolution, now let’s talk about layout. We’ve covered RenderObjects,
RenderStyles, there’s RenderLayers which is another sparse tree which is connected to
the RenderObject Tree. Layers are for things like transparency, scrolling, the whole clipping,
that sort of thing. And they actually end up as textures on the iPhone, so on the graphics
card, and we use a similar API for passing from… Brett talked about passing textures
from the renderer to the browser that uses CG layer. We use something similar here in WebCore.
Here’s some example HTML and an annotation of where the layers are. So you can see when
we give that span an opacity it gets a layer. And we give the div overflow, it gets another
layer. And we create a tree of these. So more of these–more of these individual elements
have render objects than those that have layers. Another tree that’s used inside the Rendering
Tree is the LineBox Tree. So when we actually lay out the lines of a block, we use a separate
data structure called the LineBox Tree to do that. This is what actually does the text
flow. And again, there’s a–it’s a sparse mapping, you might have–you create a render
text which holds the actual text content, but many lines are going to point back into
that text. So here is an example. We have a Rendering Tree on the left-hand side, and
then for this Rendering Tree we create InlineBoxes in the LineBox Tree. We get one RootLineBox for
every line and then InlineBoxes within that line. So every image that’s laid out in a
line will get its own InlineBox, every tag effectively. So here’s an example of one RenderText being
split into two line boxes because it wraps. One other thing that’s also held in the Rendering
Tree are Shadow Trees. These are DOM Trees which are held off of the Rendering Tree,
which is a little confusing because normally the DOM Tree holds the Rendering Tree, but
here is a Rendering Tree which is holding a DOM Tree. And this exists for things that
are hidden from JavaScript. When you have a form control, those are actually rendered
by the engine, by the WebCore engine, but those are not exposed. Their DOM Trees, at
least, are not exposed to JavaScript, and so they’re held off of the Rendering Tree.
And then we render them using special theme images from the OS. Like, in this example,
down at the bottom, that button is actually using Mac OS X’s underlying button-drawing
routines, but we’re making the paint call back from WebCore. Okay, so
that’s… We’ve talked about the data structures that get us to the point where we can actually
do a layout. Now, let’s talk about layout. And there’s actually a really good example
of what layout looks like, thanks to our friends at Mozilla. This is an example of Gecko laying
out Google.com. You can see those are rects representing the Rendering Tree. And it’s
creating them all and then it’s moving them into place. So reflow is what Gecko calls
it. We call it layout, but they’re very similar. Gecko also has a DOM Tree, they also have
a Rendering Tree. They also have a layer concept. So that’s actually what we’re doing under
the covers. Well, that’s what it looks like to the human eye. So, layout is all done from
this layout method. The first thing we do is we save the old repaint rect, we pull in
any changes from the DOM, we then go and lay out our children, and we repaint the difference
using the metrics that our parents had.
>>[INDISTINCT]
>>SEIDEL: I’m sorry? So, we can ask at the start of our layout method, “What are our current
bounds?” and we shove those off into a rect. And then we go and do the layout, and then
we ask again, “What are our current bounds?” And whatever the difference is, we pass that
to the system and say, “Obviously, something changed, so please repaint it.” But the general
layout method is agnostic to these. So, how do you get a layout? Layout is actually
done on demand. It’s not asynchronous in the sense that it’s done on some other thread, but
you don’t generally say, “Oh, layout now.” You say, “I need a layout,” and then by the
time we next paint, paint says, “Oh, make sure you’re laid out before we paint.” So
you mark something as “Needing Layout.” Say you’re in the DOM Tree and you’re parsing
some new value; you tell your renderer, “By the way, you need
to re-layout before you ever paint again.” There are a few times that we do immediate
layout that you as a webpage author would notice or someone working on Chrome, the new
tab page or something like that. And that’s when you access properties that require a
layout, like offsetHeight. In order to tell you the height of an object, just like in that
Gecko example, you have to compute where the heck the object is, and that’s what layout
does. So, an overview of what we covered in our structures and our layout: parsing
produces the DOM Tree, we build the Rendering Tree from the DOM. The Rendering Tree has
the four parts that we talked about. These are the objects, the layers, the styles, and
the lines. And then we do layout lazily on demand. So, one more time back to our little
diagram, we’ve covered all of these, and we will talk briefly about painting. This was
entirely covered by Brett, so we’re just going to touch on it here. When we paint, we paint
the Rendering Tree, and we actually take the root layer of the Rendering Tree and we tell
it to paint itself, and we tell it to do so 12 times because that’s what CSS 2.1 requires.
There’s a whole bunch of different phases, and painting is actually done incrementally.
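The idea of painting the tree once per phase can be sketched as follows. This is an assumption-laden illustration: the three phases and the names `PaintPhase`, `PaintNode`, and `paintAll` are invented here and are much reduced from whatever list WebCore actually uses.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, much-reduced phase list for illustration only.
enum class PaintPhase { Background, Foreground, Decorations };

// Hypothetical stand-in for a node in the tree being painted.
struct PaintNode {
    std::string name;
    std::vector<PaintNode> children;

    // Paints this node for one phase, then its children for the same phase.
    // "Painting" here just records what would have been drawn.
    void paint(PaintPhase phase, std::vector<std::string>& log) const {
        static const char* const labels[] = {"bg", "fg", "deco"};
        log.push_back(name + ":" + labels[static_cast<int>(phase)]);
        for (const auto& child : children)
            child.paint(phase, log);
    }
};

// Walks the whole tree once per phase, starting from the root each time.
std::vector<std::string> paintAll(const PaintNode& root) {
    std::vector<std::string> log;
    for (PaintPhase phase : {PaintPhase::Background, PaintPhase::Foreground,
                             PaintPhase::Decorations})
        root.paint(phase, log);
    return log;
}
```

The point of the sketch is the ordering: every node's background is recorded before any node's foreground, because the root is asked to paint itself again for each phase.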
I got some strange looks there, people don’t believe me. Well, we actually–we paint first
backgrounds, and then we’ll paint foregrounds, and then we paint underlines, it’s a whole–you
can read the spec. Yeah. So, only Render classes paint, and as Brett talked about, there’s
a GraphicsContext abstraction; that is where we actually handle talking with the OS bits.
And then, RenderTheme exists for every different platform, where it handles things like what
should my form control look like. So that was the rendering tree and the DOM Tree. We’re
going to talk briefly about a couple of other things that WebCore does, one of which is
HTML Editing. If you’ve ever used your iPhone, all of the text entry in there is HTML Editing
inside WebCore, or if you’ve written in Mail.app. I mean, if you’ve ever written in
Gmail, which you do every day, the rich text editing is all done by WebCore if you’re using
Chrome. So, things that fall under the category of editing,
Hit Testing being one: we find out what is actually under the mouse pointer. Handling selection,
handling focus, doing execCommands, undo/redo, serialization, copy/paste. Hit Testing works
like this: we start at the root layer, just like painting, and we walk from the root asking each RenderObject,
“Does this point fall inside your bounds?” And if it does, it hands us back itself
or its associated node. There are multiple phases to Hit Testing because you might hit
test backgrounds or you might hit test foregrounds for different purposes like a mouse over,
only cares about backgrounds. But if you are actually clicking on a link, you only want
to actually click the text on the link. You hit test all the time in a web engine. Every
time you move a mouse, you’re Hit Testing at least once. Selection is handled through
a few abstractions. The easiest to think about is just positionForPoint. When you click,
to figure out where the heck you actually clicked in the text content, we do a Hit Test
to find out what node, and then we find the closest adjacent character break, and that’s
positionForPoint. VisiblePosition is an abstraction that we use to hold a position in the DOM
that represents a position that a user could get to. Not just a range or a node offset
pair, but actually one that a user could click on. And then Selection is one of the 12 paint
phases. EditCommands, this is where we do all the editing logic. We have some examples
at the bottom. We build compoundable commands, so when you type, say, you type five characters,
you wait a second, you type five more, those all get into one command group, so when you
undo, it undoes the whole slew, depending on your platform. Here’s an example of an
EditCommand. You apply the command, you Unapply, you Reapply, this is for Undo/Redo. But if
you’re implementing a specific command, you would just implement these methods. EditCommands
know what their selection should look like before and after for doing Undo/Redo, and
they also form trees. We let the OS handle Undo/Redo but we actually execute the Undo/Redo
when we’re told to. This is all built off of EditCommands. One last thing we should
talk about before we get to Q and A is the actual JavaScript DOM Bindings. So when I
originally wrote this talk, which was for the Wave team, there was confusion as to what’s
provided by the JavaScript engine versus what’s provided by WebCore; to you as a Chrome engineer,
they might as well be one and the same. DOM Bindings are how we take this DOM Tree that we talked
about and expose it to different languages. Here, the example languages are JS, Objective-C,
and COM C++. We take IDL files, or Web IDL as they are now called, and we generate some code
using a bunch of Perl scripts, we instantiate these objects and we cache them, and WebCore
manages their lifetime by keeping them around until they’re no longer needed. So here’s
an example IDL file. We run a Perl script across this IDL file and generate a bunch
of C++. And that would be for JavaScript or we generate some other C++ that would be COM,
or generate some other C++ that will be for JNI bindings. Objects that are provided by
the JavaScript engine itself: all the basic primitive types, access of prototype, up to
callee chain, getters and setters, insiders, toString. Things that WebCore provides are
all of these things you actually think of as the DOM. That’s pretty much all of the talk. We can
talk briefly about bugs. You guys should know how to file bugs at this point. Basically,
you file them at bugs.WebKit.org, then we fix them, because although we have a lot
of committers here, we do know a lot of people who work 100% on WebKit, and so a lot of
bug fixing actually happens upstream. Pam covered how to write a test case. Examples
of what a good test case looks like. These are resources for how you find out more about
WebKit. To see what WebKit implements, you can actually look at the IDL files and the .in
files. How you contact the WebKit folks including some Google specific resources including the
last one which reaches the Safari team. And that’s it. So questions, comments, concerns?
>>So two slides back, at the bottom, you mentioned something I’d never heard of before: what’s
a .in file?
>>SEIDEL: So these are also used in more
Perl scripts, for creating AtomicString caches. An AtomicString
is a type of string in WebKit that we only have one copy of. We create AtomicString caches
out of attribute and element names. So if you are a webpage implementer and want to
know what elements WebKit supports, you can look at the .in files. If you want to know what
attributes or what properties and elements it supports, you can look at the .in files.
>>Okay. Thanks. The other question I had is, you said in talking about editing with
Undo/Redo you let the OS handle Undo/Redo. What do you mean by that?
>>SEIDEL: So we don’t catch the Command-Z, and we leave stack management to the
OS level. We pass it to Mac OS X and say, “We have an undo event that just happened,
add it to your undo stack.” And it manages the menus for us, et cetera. WebCore itself
doesn’t deal with that. The WebKit Layer takes care of that.
>>Okay, but we are actually the ones that are implementing the command actions that
happen when you have [INDISTINCT].
>>SEIDEL: Correct, those edit commands do
those. I think your mike just died. Any other questions from that torrent of information?
Great.
>>So you talked a little bit about how a
bunch of those trees aren’t actually built if, you know, say display equal, or display
is none. What exactly happens, like what’s the process when display becomes, you know,
something else like block. Could you explain how those trees are then built on, you know,
on demand or whatever?
>>SEIDEL: Okay. So, if you change a
property, the DOM Tree has a method called parseMappedAttribute, which handles attribute
parsing when you change an attribute. That would learn about the change. It would pass
it off to the style system. The style system would change the display, notice that “oh
my God, we need a renderer now,” and create a renderer for the DOM object. We’d resolve
style for it and all of its children. We’d go through the normal process that you would
during parsing, when you create that element and you attach it by creating a renderer for
it.
>>Then also, what’s the level of dependency
in terms of the different trees depending on the previous one. Like, do they all have
pointers bidirectionally or how does that work?
>>SEIDEL: The only bidirectional pointers are from the DOM tree to the Render Object
Tree. The Render Object has pointers to the Render Layers which actually do also have
pointers back to the render objects. The render objects have pointers to the render styles,
but the render styles do not have pointers back to the render objects. And the LineBoxes,
the block has a pointer to the RootLineBoxes, the list of them. And the individual LineBoxes
do point back to their render objects. So, there are more bidirectional pointers than I initially
implied.
>>You mentioned the issue that at times you
can query for attributes or positions which force a paint operation. Does WebCore have
the concept of a partial paint, so that it only does a paint or layout sufficient to answer
the query, leaving it midstream, or does it require a full layout of the entire display?
Sorry if it’s a dumb question.
>>There’s an open bug, I can cite it to
you if you want.
>>SEIDEL: So, not to confuse paint and layout,
paint is actually very simple: we just basically put the rendering tree to the buffer. But
layout itself is the complicated part, and we do not support incomplete layouts. We ensure
that the entire document is laid out before we will answer the question.
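The "mark as needing layout, then lay out on demand" behavior can be sketched with a dirty flag. Everything named here (`LayoutRoot`, `setContentHeight`, `layoutCount`) is hypothetical and invented for illustration; only the general pattern, and the fact that a query such as offsetHeight forces layout, comes from the talk.

```cpp
#include <cassert>

// Hypothetical stand-in for the root of a render tree.
class LayoutRoot {
public:
    // A DOM or style change only marks the tree dirty; nothing is computed yet.
    void setContentHeight(int height) {
        m_pendingHeight = height;
        m_needsLayout = true;
    }

    // A geometry query needs an up-to-date answer, so it forces layout first.
    int offsetHeight() {
        layoutIfNeeded();
        return m_height;
    }

    // Painting also makes sure layout is current before drawing.
    void paint() {
        layoutIfNeeded();
        ++m_paintCount;
    }

    int layoutCount() const { return m_layoutCount; }
    int paintCount() const { return m_paintCount; }

private:
    // Lays out the whole document if anything is dirty; otherwise does nothing.
    void layoutIfNeeded() {
        if (!m_needsLayout)
            return;
        m_height = m_pendingHeight;
        m_needsLayout = false;
        ++m_layoutCount;
    }

    bool m_needsLayout = false;
    int m_pendingHeight = 0;
    int m_height = 0;
    int m_layoutCount = 0;
    int m_paintCount = 0;
};
```

Two changes in a row followed by one query cost a single layout in this sketch, which is the benefit of deferring the work; the cost is that the first query after a change pays for the whole layout.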
>>Yeah, there is support for noting that only part of the tree needs layout, so that
when the paint happens it will only lay out that subtree, but it’s somewhat limited and
it doesn’t really apply for queries.
>>Yup, but we should fix that.
>>So, just to be clear, if I have some, you know, node really deep in the tree and
I say, like, .clientWidth on it, is it going to have to lay out the whole page?
>>SEIDEL: As it’s currently implemented?
>>Yes.
>>SEIDEL: Okay.
>>To the most extreme, we give you the top
element which is just completely obvious from the first 12, yet the entire page down to…
>>SEIDEL: So, when that happens, when it’s just the top element that you’re trying to
lay out, you may not even have finished loading the entire page, and in that case we will do
a layout of the three elements that are in the DOM and it won’t take very long. But if
you’ve loaded the million-element page and then you finally ask for some piece and we
haven’t laid out the rest of the page, then yes, it would take a while, or longer than it
needs to.
>>There are some objects that manage object
lifecycle, so in some places there are raw pointers passed around and sometimes there are
these other objects. Are there any rules as to when you should use a raw pointer and when
you should use one of these other objects?
>>SEIDEL: So, the other objects that are
most common are what are called RefPtr and PassRefPtr, and these are for indicating
that you own the object. Raw pointers are what you expose when you own the lifecycle. So,
I ask some object to get its frame pointer and it’s saying, “I’m holding on to the frame
for you, I’m just allowing you access to it.” If it is taking a PassRefPtr,
that means that it’s taking ownership. And RefPtrs are generally not used in arguments
and returns; RefPtrs are used to maintain ownership and take care of
letting go of the ref during the destructor.
>>Thanks.
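To make the ownership rule concrete, here is a minimal sketch of an intrusive reference-counted smart pointer. It is not WebKit's RefPtr: `Counted`, `RefPtrSketch`, and `Frame` are hypothetical names, the count starts at zero for simplicity, and PassRefPtr's ownership transfer is deliberately left out.

```cpp
#include <cassert>
#include <utility>

// Hypothetical base class: the object carries its own reference count.
class Counted {
public:
    void ref() { ++m_refs; }
    void deref() {
        if (--m_refs == 0)
            delete this;
    }
    int refCount() const { return m_refs; }

protected:
    virtual ~Counted() = default;

private:
    int m_refs = 0;
};

// Hypothetical owning pointer: takes a ref on construction and copy,
// and lets go of it in the destructor.
template <typename T>
class RefPtrSketch {
public:
    explicit RefPtrSketch(T* ptr = nullptr) : m_ptr(ptr) {
        if (m_ptr)
            m_ptr->ref();
    }
    RefPtrSketch(const RefPtrSketch& other) : m_ptr(other.m_ptr) {
        if (m_ptr)
            m_ptr->ref();
    }
    // Copy-and-swap: the by-value parameter holds the old pointer and derefs it.
    RefPtrSketch& operator=(RefPtrSketch other) {
        std::swap(m_ptr, other.m_ptr);
        return *this;
    }
    ~RefPtrSketch() {
        if (m_ptr)
            m_ptr->deref();
    }

    // Hands out a raw, non-owning pointer: "I'm holding on to it for you."
    T* get() const { return m_ptr; }

private:
    T* m_ptr;
};

// Hypothetical counted object that reports its own destruction.
struct Frame : Counted {
    explicit Frame(bool* destroyedFlag) : destroyed(destroyedFlag) {}
    ~Frame() override { *destroyed = true; }
    bool* destroyed;
};
```

In this sketch the raw pointer returned by `get()` never changes the count; only the owning wrappers do, and the object is destroyed when the last wrapper goes away.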
>>If you want to know anything more about PassRefPtr and RefPtr and all
that stuff in WebKit, if you just Google for PassRefPtr, I think it’s the first result.
It’s a really, really great document on WebKit.org about when you use the different ones and that
kind of stuff, and it explains the history.
>>SEIDEL: Yup.
>>If you have to write WebKit code, you should read that document.
>>Peter says, if you have to write WebKit code, you should definitely read that.
>>SEIDEL: Maybe like three times. Darin Adler wrote it. As you said, you Google PassRefPtr
and it will be the first hit. Any other questions? Great.
