Browser Rendering

The web browser is possibly the single most important piece of software on many computers. It is, of course, one of the primary gateways to the internet. Understanding a bit more about what a browser does, and how, may help shine further light on TripleLift's value proposition.

The browser consists of a few different components - a user interface (what you see), the browser engine, the rendering engine, and a data layer. The rendering engine is the component that actually draws the webpage that the user interacts with.

webkitflow.png

First and foremost, the browser "renders." This means it takes a representation of where things should go and how they should interact with each other and actually makes them do that. So, for example, if you say that you want a table with 3 columns, each 1/3 the width of the browser window - with text in each one - then the browser has to figure out the size of the browser window, divide it up appropriately, "paint" the table with the right dimensions, and place the text in the table. As you then resize the browser window, it needs to "repaint" the table given the new dimensions. This is, of course, a very simple example.
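
The three-column example above comes down to a small amount of arithmetic that has to be redone on every resize. Here is a minimal sketch in Python, assuming a made-up helper called layout_columns and widths in whole pixels. A real rendering engine deals with far more (borders, padding, text wrapping), so treat this as an illustration only.

```python
def layout_columns(window_width, n_columns=3):
    """Return an (x, width) pair for each equal-width column, in pixels.

    Hypothetical helper: leftover pixels are handed to the leftmost columns
    so the columns always add up to the full window width.
    """
    base, extra = divmod(window_width, n_columns)
    boxes = []
    x = 0
    for i in range(n_columns):
        width = base + (1 if i < extra else 0)
        boxes.append((x, width))
        x += width
    return boxes


# Initial layout, then a "repaint" after the window is resized.
print(layout_columns(1200))  # [(0, 400), (400, 400), (800, 400)]
print(layout_columns(1000))  # [(0, 334), (334, 333), (667, 333)]
```

Every resize means calling something like this again with the new width and painting the result.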

HTML is the language used to construct web pages (we'll discuss CSS and JavaScript in a minute). A simple example of HTML is:

<html>
 <head><title>Example</title></head>
  <body>
   <h1>Hi!</h1>
   <p>TripleLift Rules</p>
  </body>
</html>

The start and end of an "element" of HTML are marked by <tag> and </tag>, respectively. Without getting into what HTML does, if you look at the text above, you can see that every tag except the outermost <html> is a child of another tag - and some tags have multiple children. The browser builds this tree structure for the entire page and keeps it as the Document Object Model (DOM). The browser - at all times - has a representation of the webpage as a DOM state. This can be manipulated by JavaScript adding or removing elements - but it's always done with reference to the DOM.
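
To make the parent/child idea concrete, here is a rough sketch that turns the example page into a tree using Python's built-in html.parser module. The Node and TreeBuilder classes and the outline function are names invented for this illustration. A real browser's parser is far more forgiving of broken HTML than this one, which assumes every tag is properly closed.

```python
from html.parser import HTMLParser


class Node:
    """One element in a simplified DOM-like tree (illustrative only)."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a parent/child tree from well-formed HTML (no error recovery)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, parent=self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Assumes proper nesting: a closing tag moves back up to the parent.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        # Whitespace between tags strips down to an empty string.
        self.current.text += data.strip()


def outline(node, depth=0):
    """Return the tree as indented lines, one per element."""
    label = node.tag + (f": {node.text}" if node.text else "")
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


PAGE = """<html>
 <head><title>Example</title></head>
  <body>
   <h1>Hi!</h1>
   <p>TripleLift Rules</p>
  </body>
</html>"""

builder = TreeBuilder()
builder.feed(PAGE)
dom_lines = outline(builder.root)
print("\n".join(dom_lines))
```

The printed outline shows html at the top, head and body as its two children, and the title, h1 and p elements one level below those.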

CSS is the styling of the HTML - sizes, colors, and more complex attributes. This is often a separate file or files - so the browser needs to fetch the HTML and then the CSS file specified in the HTML. Like the DOM, the CSS is turned into a CSS Object Model (CSSOM) by the browser.

When the DOM and CSSOM are complete on the page's initial load, the next step is creating the render tree. This is basically a set of rectangles with attributes like sizes, colors, etc., in the correct order to be shown on the screen (pairing the DOM with the CSSOM). DOM elements that are invisible or not meant to be shown (e.g. metadata) won't be in the render tree. Similarly, a single DOM element that floats on the page, has multiple states, or is text flowing onto multiple lines (each line is its own element), etc., may have a complex representation in the render tree.
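
A toy version of that pairing step might look like the following. The DOM and CSSOM structures here are invented, drastically simplified stand-ins (tuples and a dictionary keyed by tag name), and the only rule modelled is that anything with display set to none is left out along with everything beneath it.

```python
# Hypothetical, heavily simplified DOM: (tag, [children]).
DOM = ("html", [
    ("head", [("title", [])]),
    ("body", [("h1", []), ("p", []), ("div", [])]),
])

# Hypothetical CSSOM: styles looked up by tag name only.
CSSOM = {
    "head": {"display": "none"},  # metadata is never drawn
    "h1": {"color": "black", "font-size": "32px"},
    "p": {"color": "gray"},
    "div": {"display": "none"},   # hidden by the page's stylesheet
}


def build_render_tree(node):
    """Pair each DOM node with its style, skipping anything not displayed."""
    tag, children = node
    style = CSSOM.get(tag, {})
    if style.get("display") == "none":
        return None  # the element and everything under it is skipped
    rendered = [build_render_tree(child) for child in children]
    return (tag, style, [r for r in rendered if r is not None])


def tags(render_node):
    """Flatten a render tree into a list of tag names, in paint order."""
    tag, _style, children = render_node
    return [tag] + [t for child in children for t in tags(child)]


render_tree = build_render_tree(DOM)
print(tags(render_tree))  # ['html', 'body', 'h1', 'p']
```

The head, title and hidden div are all still in the DOM, but none of them make it into the render tree.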

The browser then creates a layout of the render tree, by giving each element a precise coordinate for where it should actually be placed on the screen. Finally, each element is "painted" to the screen in a specific order (background color / image, border, children). Layout can be re-triggered on the entire render tree when there is a global style change, such as JavaScript changing the site's default font or the browser window being resized. In this instance, you may see a flicker or non-responsiveness. There may be incremental layouts for smaller changes, like a modification of the DOM by JavaScript. In this case, only a small piece of the render tree is laid out and painted.

TripleLift ads are generally rendered by injecting our ad content into the DOM of the page, using the same sort of content structure as the rest of the page, and generally using that page's CSSOM. This makes for a significantly more integrated ad experience that's better for user experience and often much quicker and more effective in the way the ad loads.

ASCII Characters

Many years ago, long before Snapchat, emails used to be limited to English letters and a few extra characters. Now you can throw in all the Asian letters, as well as some emoji nonsense. What's going on behind the scenes is a shift from ASCII to Unicode and the Internationalization of the internet.

ASCII was, more or less, the original way text was represented online. Because computers take binary data and make human-readable versions, a standard was needed to say what binary data would correspond to what piece of text. That standard was ASCII (American Standard Code for Information Interchange), which dates back to the 1960s. In ASCII, each character is represented with 7 bits. That means there are 128 possibilities - upper case, lower case, numbers, basic symbols, and a few other things. Back in the day, data interchange was more constrained, so ASCII was a reasonable way to send representations of text. This was especially true because the early internet was a largely American and Western European effort.

1200px-ASCII_Code_Chart.svg.png

ASCII couldn't handle much of the required complexity of a more sophisticated internet. UTF-8 (Unicode Transformation Format - 8 bit) was invented as a way to preserve interoperability with ASCII and add a much more robust character representation. UTF-8 allows 1,112,064 different character representations - including basically every character in every language. The difference between ASCII and UTF-8 is that ASCII is a simple mapping of 7 bits to one character. UTF-8 is more complicated: each character takes a variable number of bytes (one to four, each byte being 8 bits), and the first byte tells the decoder how many more bytes belong to that character.
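
The variable-length idea is easy to see by encoding a few characters with Python's built-in str.encode. The sample characters and the sequence_length helper are choices made for this sketch. The helper only reads the leading bits of the first byte and does not validate the bytes that follow.

```python
samples = ["A", "ç", "日", "😀"]

for ch in samples:
    encoded = ch.encode("utf-8")
    bits = " ".join(f"{byte:08b}" for byte in encoded)
    print(f"U+{ord(ch):04X} {ch!r}: {len(encoded)} byte(s)  {bits}")


def sequence_length(first_byte):
    """How many bytes a UTF-8 character occupies, read from its first byte."""
    if first_byte < 0x80:
        return 1  # 0xxxxxxx: identical to ASCII
    if first_byte >> 5 == 0b110:
        return 2  # 110xxxxx
    if first_byte >> 4 == 0b1110:
        return 3  # 1110xxxx
    if first_byte >> 3 == 0b11110:
        return 4  # 11110xxx
    raise ValueError("not a valid leading byte")


# Plain English text is byte-for-byte the same in ASCII and UTF-8.
same_bytes = "TripleLift".encode("ascii") == "TripleLift".encode("utf-8")

# 1,112,064 is every Unicode code point minus the 2,048 reserved surrogates.
encodable = 0x110000 - (0xDFFF - 0xD800 + 1)
```

The English letter takes one byte, the ç two, the Japanese character three and the emoji four, which is why a decoder has to look at the first byte before it knows where a character ends.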

So when your browser sends a request for a web page, the server will respond with a header including something like the following:

Content-Type:text/html; charset=UTF-8

This means that after the headers, there will be a stream of bits that the browser must interpret using the UTF-8 decoding, instead of ASCII or whatever. All HTTP headers, ironically, are in ASCII. UTF-8 now accounts for about 88% of all web page encoding online, whereas 10 years ago, ASCII had a similar share of the web.
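
As a rough sketch of what the receiving side does with that header, the snippet below pulls the charset out and uses it to decode the bytes that follow. It leans on Python's standard email.message.Message class to parse the header value. The function name charset_from_content_type, the fallback of utf-8 when no charset is given, and the sample page body are all assumptions made for this example.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Pull the charset parameter out of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the charset lowercased, or None if absent.
    return msg.get_content_charset() or default


header = "text/html; charset=UTF-8"
body_bytes = "<p>TripleLift ルール</p>".encode("utf-8")  # what arrives on the wire

charset = charset_from_content_type(header)
page = body_bytes.decode(charset)        # decoded as the server intended
garbled = body_bytes.decode("latin-1")   # same bytes, wrong decoding

print(charset)
print(page)
print(garbled)
```

Decoding the same bytes with the wrong charset does not fail outright. It just produces nonsense where the Japanese characters should be.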

And as the web gets internationalized, so too do ads. Even something like French and its ç's or German and its ü's needed something more robust than ASCII. The transition from ASCII to UTF-8 has been very important to the internationalization of the web and for ads. And it's the same UTF-8 that underlies our support for internationalization.

When you ask whether TripleLift ads support, for example, Japanese characters, you're actually asking a number of questions:

  1. Does our implementation of the OpenRTB protocol support UTF-8? Yes.
  2. Do our databases store text as UTF-8? Yes, where necessary. But if we know 100% that we will only need English characters - and there will be a lot of text - then ASCII is more efficient.
  3. Do we respond with a UTF-8 encoding for our ads? Yes.
  4. Do browsers support rendering Japanese characters - meaning do they support UTF-8? Yes: all modern browsers natively support UTF-8.

So yes, end-to-end we support all the characters supported by UTF-8. One of the challenges has been ensuring that we use UTF-8 at every step of the way - and that no step supports only ASCII.
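
To see why a single ASCII-only step matters, here is a small illustration that passes text through a series of steps, each with its own encoding. The survives function and the three-step pipeline are hypothetical and are not a description of TripleLift's actual systems.

```python
def survives(text, step_encodings):
    """True if every step in the chain can represent the text.

    Each step stands in for one hop (protocol, database, response) that
    stores or transmits the text using the given encoding.
    """
    try:
        for encoding in step_encodings:
            text = text.encode(encoding).decode(encoding)
    except UnicodeError:
        return False
    return True


japanese_ad = "日本語の広告"
english_ad = "TripleLift Rules"

print(survives(japanese_ad, ["utf-8", "utf-8", "utf-8"]))  # True
print(survives(japanese_ad, ["utf-8", "ascii", "utf-8"]))  # False
print(survives(english_ad, ["utf-8", "ascii", "utf-8"]))   # True
```

English text sails through either way, which is exactly why an ASCII-only step can go unnoticed until the first non-English ad shows up.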

Internet Basics: DNS

We talk a lot about digital advertising, but sometimes it's worth taking a step back to think about what exactly is happening on the web. Today we're talking about DNS, or the Domain Name System. As you'll see, in isolation, DNS establishes some of the framework necessary for a fully-functioning internet, which we'll go into more in future Internet Basics Lift Letters™. 

When you type adexchanger.com into your browser and hit enter, what happens? It's just like your phone: when you dial "Mom" (which you should do more, she misses you), you're actually calling a number. There's a translation that happens to go from the human-readable, text version of the domain to a computer-readable numeric form. These computer versions are called internet protocol addresses (IP addresses). DNS is what makes that translation possible.

 

IC195483.gif

Behind the scenes, websites can have multiple IP addresses associated with them, and they can change at any given time. Also, if a website were mapped to the wrong IP address, perhaps by a malicious actor, that actor could do some bad things. Finally, you want a system that's resilient. If the system went down, for whatever reason, nobody would be able to access anything that they didn't already have the mapping for - so being stable and distributed are vital.

When you type in adexchanger.com, the browser needs the DNS translation, so it has to issue a request to a DNS server. But who exactly is the DNS server and how does it know the mappings? Basically, at various levels - your ISP or cell company, or a company's network - there is a cached list of the "root" name servers. There are 13 groups of root name servers, each of which consists of many actual servers, and they are managed by a variety of entities including Verisign, the US military, NASA, the University of Maryland, etc. These root servers contain a mapping from each suffix (e.g. .com, .org, etc.) to a top level domain ("TLD") name server. So the initial request for adexchanger.com would go to a root server to find the name server for the .com TLD, followed by a DNS lookup directed to that .com TLD server. It's worth noting that all but 3 "root" server groups, which at some level control the operation of the internet, are managed by American entities - and the control over who manages the root server groups belongs ultimately to the US Department of Commerce.

The actual requests are generally not made by the "browser" itself, but by a "recursive resolver" which is either in the operating system or at the ISP, but that doesn't really matter. Each TLD server stores the information for the next level down, meaning the .com servers store the mappings for all the .com domains. The TLD server will respond with the name server (not the IP address) for the domain you requested. Finally, your recursive resolver would send a request to the domain's name server - in this case, the name server for adexchanger.com - and get the IP address. If you had requested events.adexchanger.com instead of adexchanger.com, and that subdomain had its own name server, your recursive resolver would have gotten the name server for events.adexchanger.com from the adexchanger.com domain name server, then issued a followup request for the final IP. Only after getting the IP address does the browser issue the actual web request.
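
The chain of referrals described above can be sketched without touching the network. Everything in the SERVERS table below is made up for illustration: the server names are invented, and the IP addresses come from the 192.0.2.0/24 block that is reserved for documentation. Real DNS also involves record types, timeouts and caching that this leaves out.

```python
# Hypothetical records. "NS" means "go ask that server", "A" is a final address.
SERVERS = {
    "root-server": {"com": ("NS", "com-tld-server")},
    "com-tld-server": {"adexchanger.com": ("NS", "adexchanger-name-server")},
    "adexchanger-name-server": {
        "adexchanger.com": ("A", "192.0.2.10"),
        "events.adexchanger.com": ("NS", "events-name-server"),
    },
    "events-name-server": {"events.adexchanger.com": ("A", "192.0.2.20")},
}


def query(server, name):
    """Ask one server about a name and return its most specific record."""
    records = SERVERS[server]
    labels = name.split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])  # longest matching suffix first
        if candidate in records:
            return records[candidate]
    raise LookupError(f"{server} knows nothing about {name}")


def resolve(name, max_hops=10):
    """Follow referrals from the root until a server returns an address."""
    server = "root-server"
    path = [server]
    for _ in range(max_hops):
        kind, value = query(server, name)
        if kind == "A":
            return value, path
        server = value  # a referral: repeat the question one level down
        path.append(server)
    raise LookupError(f"too many referrals while resolving {name}")


print(resolve("adexchanger.com"))
print(resolve("events.adexchanger.com"))
```

The first lookup passes through three servers and the second through four, matching the walk described above. In everyday Python code none of this is written by hand: a call such as socket.getaddrinfo("adexchanger.com", 443) hands the whole job to the operating system's resolver.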

As you can imagine, there are a lot of DNS lookups. The root and TLD servers are particularly resilient, with sophisticated load balancing techniques. There is also a lot of upstream caching by various layers, including your computer, company networks, ISPs, etc. Each cache entry has a time-to-live (TTL), meaning you cannot instantaneously expect changes in DNS entries to be reflected in requests - and meaning that requests for the same information won't be done for the duration of that TTL period. 
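
Here is a minimal sketch of what a cache with a TTL does, assuming an invented TTLCache class and a fake clock so the example does not have to wait around. The 300-second TTL and the address are arbitrary values chosen for illustration.

```python
import time


class TTLCache:
    """Remembers name -> address answers until their TTL runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expires_at)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: forget it
            return None
        return address


now = [0.0]  # fake clock, advanced by hand below
cache = TTLCache(clock=lambda: now[0])
cache.put("adexchanger.com", "192.0.2.10", ttl_seconds=300)

fresh = cache.get("adexchanger.com")         # answered from the cache
now[0] = 299.0
still_cached = cache.get("adexchanger.com")  # still inside the TTL
now[0] = 300.0
expired = cache.get("adexchanger.com")       # None: a new lookup is needed

print(fresh, still_cached, expired)
```

If the real DNS entry changed at second 10, anyone relying on this cache would keep getting the old address until second 300, which is the delay described above.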

Finally, you may be curious whether malicious attackers ever try to take down the internet by taking down the root servers. It is, indeed, the case that if the root servers went down - after the various levels of caching expired in a few days - there would be some chaos. There have been several attempts to do just this, including one last year that featured 5 million requests per second aimed at a single server with the intent of breaking it and, possibly, eventually the internet. It didn't work.