How a Browser Works: A Beginner-Friendly Guide to Browser Internals


Have you ever typed a URL in your browser and wondered:

“What actually happens after I press Enter?”

Most people think a browser just opens websites. In reality, browsers are complex systems that transform code into pixels on your screen, often in well under a second. Let’s take a step-by-step journey and understand how this happens.

What is a Browser?

A browser is an application that retrieves websites and displays them on the user’s screen. Internally it:

  • Fetches resources like HTML, CSS, and JavaScript documents from a server

  • Understands the structure of the page

  • Applies styles and layouts

  • Executes scripts

  • Renders pixels on your screen

Now let’s see how the browser does all of this, and what is inside it.

Main Parts of a Browser

A browser is a collection of components working together. Each component does its own part, and together they produce the result we see on screen. Let’s take a look at each component:

  1. User Interface (UI) – address bar, tabs, buttons, bookmarks. This is what we see.

  2. Browser Engine – the conductor, coordinating other parts. Acts like a bridge between UI and Rendering Engine.

  3. Rendering Engine – the artist, painting the web page. Examples of rendering engines: Blink (Chrome, Edge), Gecko (Firefox), WebKit (Safari)

  4. Networking – fetches resources from the server. Uses protocols like HTTP, HTTPS, DNS.

  5. JavaScript Engine – executes scripts and manipulates the page. Examples of JS engines: V8 (Chrome, Edge), SpiderMonkey (Firefox), JavaScriptCore (Safari)

  6. UI Backend – interacts with your operating system to draw windows and widgets.

  7. Data Storage – stores cookies, cache, localStorage, IndexedDB.

How a Browser Works, Step by Step

  1. The networking component sends an HTTP request to the server, and the server sends a response back to the browser.

  2. As soon as the browser receives the first chunk of data, it begins parsing; it does not wait for the whole document to arrive. Parsing is done by the rendering engine.

    Parsing is the step the browser takes to turn the data it receives over the network into the DOM and CSSOM, which are used by the rendering engine to paint a page to the screen.

    DOM Tree

    • The browser first reads the raw HTML bytes from the network and converts them to characters based on the file’s encoding (e.g. UTF-8).

    • Then the HTML parser (part of the rendering engine) converts these characters into distinct tokens. Each token represents a part of the HTML structure, such as a tag like <html> or <body>, or a text string.

    • Each token is then converted into an object, or node: an HTML tag becomes an "element node", text becomes a "text node", and a comment becomes a "comment node".

    • Finally, the parser links these nodes together in hierarchical order, which creates the DOM tree.
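The four DOM steps above can be sketched with Python’s standard `html.parser` module, which does the tokenizing and calls a method for each token. This is only an illustration, not how a real browser is implemented: the `Node`, `TreeBuilder`, `build_dom`, and `dump` names are made up for this example, and the sketch assumes well-formed HTML with no void elements such as `<br>` and no error recovery.

```python
from html.parser import HTMLParser

class Node:
    """A simplified DOM node (hypothetical shape, not a browser API)."""
    def __init__(self, kind, value, parent=None):
        self.kind = kind          # "document", "element", "text", or "comment"
        self.value = value        # tag name or text content
        self.children = []
        self.parent = parent

class TreeBuilder(HTMLParser):
    """Turns the token callbacks from HTMLParser into a tree of Node objects."""
    def __init__(self):
        super().__init__()
        self.root = Node("document", "#document")
        self.current = self.root

    def _append(self, kind, value):
        node = Node(kind, value, self.current)
        self.current.children.append(node)
        return node

    def handle_starttag(self, tag, attrs):
        # A start-tag token opens an element, which becomes the current parent.
        self.current = self._append("element", tag)

    def handle_endtag(self, tag):
        # An end-tag token closes the current element (no mismatch handling).
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        if data.strip():
            self._append("text", data.strip())

    def handle_comment(self, data):
        self._append("comment", data.strip())

def build_dom(raw_bytes, encoding="utf-8"):
    text = raw_bytes.decode(encoding)   # step 1: bytes -> characters
    builder = TreeBuilder()
    builder.feed(text)                  # steps 2-4: tokens -> nodes -> tree
    builder.close()
    return builder.root

def dump(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + f"{node.kind}: {node.value}"]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines

raw = b"<html><body><!-- greeting --><p>Hello</p></body></html>"
dom = build_dom(raw)
print("\n".join(dump(dom)))
```

The printed output shows `html` containing `body`, which contains the comment node and a `p` element with its text node, mirroring the nesting of the markup.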

CSSOM

  • The browser first fetches the CSS files from the network (or reads <style> in HTML).

  • The CSS parser (part of the Rendering Engine) reads the raw CSS text character by character.

  • It tokenizes the CSS: selectors, properties, and values.

  • Each rule is converted into a CSS rule object that holds its selector and its property/value declarations.

  • These objects are then linked together into a hierarchical tree called the CSS Object Model (CSSOM).

    • For example, rules inside a media query are nested under that query node.
  • The CSSOM represents all style information the browser will apply to the DOM nodes.
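As a rough sketch of the tokenize-and-build-rules idea above, the snippet below turns CSS text into a list of rule objects. It is deliberately simplified and rests on stated assumptions: `parse_css` is a hypothetical helper, it uses regular expressions rather than a real CSS tokenizer, and it handles only flat rules (no media queries or other nesting, no braces inside strings).

```python
import re

def parse_css(css_text):
    """Parse flat CSS rules into a list of (selector, {property: value}) pairs.

    Simplification: no at-rules, no nesting, no strings containing braces.
    """
    # Drop /* ... */ comments before looking for rules.
    css_text = re.sub(r"/\*.*?\*/", "", css_text, flags=re.DOTALL)
    rules = []
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css_text):
        declarations = {}
        for declaration in body.split(";"):
            if ":" not in declaration:
                continue  # skip empty pieces such as trailing whitespace
            prop, value = declaration.split(":", 1)
            declarations[prop.strip()] = value.strip()
        rules.append((selector.strip(), declarations))
    return rules

stylesheet = """
/* base styles */
p { color: blue; font-size: 16px; }
.hidden { display: none }
"""
rules = parse_css(stylesheet)
for selector, declarations in rules:
    print(selector, declarations)
```

Each tuple plays the role of one rule object; a real CSSOM also keeps rule order, origin, and nesting so the cascade can be resolved later.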

  3. Render Tree

    • The Rendering Engine combines the DOM tree (structure) and the CSSOM tree (styles).

    • This combination creates the Render Tree.

    • The Render Tree contains only visible elements.

      • Elements styled with display: none are excluded.
    • Each node in the Render Tree knows:

      • What element it represents

      • What styles apply to it
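A minimal sketch of combining the two trees might look like the following. The dictionary shapes and the `build_render_tree` name are assumptions made for this example, and styles are matched by tag name only, with no cascade or inheritance, which is far simpler than what a real engine does.

```python
def build_render_tree(dom_node, styles):
    """Return a render node for dom_node, or None if it is not rendered.

    dom_node: {"tag": str, "children": [...]}  (hypothetical minimal DOM shape)
    styles:   {tag: {property: value}}          (stands in for the CSSOM)
    """
    computed = styles.get(dom_node["tag"], {})
    if computed.get("display") == "none":
        return None  # the element and its whole subtree are left out
    children = []
    for child in dom_node.get("children", []):
        render_child = build_render_tree(child, styles)
        if render_child is not None:
            children.append(render_child)
    # Each render node knows which element it represents and its styles.
    return {"tag": dom_node["tag"], "style": computed, "children": children}

dom = {"tag": "body", "children": [
    {"tag": "h1", "children": []},
    {"tag": "aside", "children": [{"tag": "p", "children": []}]},
]}
styles = {"h1": {"color": "red"}, "aside": {"display": "none"}}

render_tree = build_render_tree(dom, styles)
print([child["tag"] for child in render_tree["children"]])  # ['h1']
```

The `aside` element and the `p` inside it stay in the DOM but never appear in the render tree.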

  4. JavaScript Execution

    • While parsing HTML, if the browser encounters JavaScript:

      • The JavaScript Engine executes the code.
    • JavaScript can:

      • Modify the DOM (add/remove elements)

      • Modify styles (which updates the CSSOM)

    • If DOM or CSSOM changes:

      • The Render Tree may need to be rebuilt

      • Layout and painting may happen again

  5. Frame Construction

    • After the Render Tree is ready, the Rendering Engine performs Frame Construction.

    • Each render tree node is converted into a frame (box).

    • Frames store:

      • Size

      • Position

      • Style information

Think of frames as boxes that will be placed on the page.

  6. Layout (Reflow)

    • Now the browser calculates:

      • Exact width and height of each frame

      • Exact position on the screen

    • This step is called Layout or Reflow.

    • Any change in size (font-size, width, viewport resize) can trigger reflow.
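To make the layout step concrete, here is a small sketch of one very simple layout rule: boxes fill the available width and stack vertically. The `Box` class and `layout` function are hypothetical, and real layout (inline text, floats, flexbox, margins) is much more involved.

```python
from dataclasses import dataclass, field

@dataclass
class Box:
    name: str
    height: int = 0                 # given for leaf boxes, computed for parents
    children: list = field(default_factory=list)
    x: int = 0
    y: int = 0
    width: int = 0

def layout(box, x, y, width):
    """Assign position and size: children stack vertically and fill the width."""
    box.x, box.y, box.width = x, y, width
    if box.children:
        cursor = y
        for child in box.children:
            layout(child, x, cursor, width)
            cursor += child.height
        box.height = cursor - y     # a parent is as tall as its stacked children
    return box

page = Box("body", children=[
    Box("header", height=50),
    Box("main", children=[Box("p1", height=20), Box("p2", height=30)]),
    Box("footer", height=40),
])
layout(page, x=0, y=0, width=800)

for box in [page, *page.children, *page.children[1].children]:
    print(box.name, box.x, box.y, box.width, box.height)
```

Calling `layout(page, x=0, y=0, width=400)` again after a viewport resize recomputes every box, which is the reflow described above.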

  7. Painting

    • After layout, the browser starts painting.

    • Painting means:

      • Filling pixels with colors

      • Drawing text, borders, shadows, images

    • Each visual part of a frame is painted separately.
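Painting can be pictured as filling cells of a grid. The toy sketch below uses characters in place of colored pixels; the `paint` function and the tuple format `(x, y, width, height, color)` are assumptions for illustration only.

```python
def paint(boxes, width, height):
    """Fill a width x height grid; later boxes are drawn over earlier ones."""
    canvas = [["." for _ in range(width)] for _ in range(height)]
    for x, y, w, h, color in boxes:
        # Clip each box to the canvas, then fill its cells.
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                canvas[row][col] = color
    return ["".join(row) for row in canvas]

boxes = [
    (0, 0, 8, 4, "b"),   # background
    (1, 1, 4, 2, "r"),   # a box painted on top of the background
]
picture = paint(boxes, width=8, height=4)
print("\n".join(picture))
```

The order of the list matters: the second box overwrites part of the background, just as later paint operations cover earlier ones.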

  8. Final Render

    • The painted parts are then combined into a single image (a step called compositing), and the final image is sent to the screen.

    • You now see the web page.

Very Basic Idea of Parsing

Imagine you see this written:

2 + 3 × 4

Step 1: Read the raw characters

The computer first sees characters, not math:

'2'  '+'  '3'  '×'  '4'

This is just text.

Step 2: Tokenization (breaking into meaningful pieces)

Now the parser groups characters into tokens:

  • 2 → Number token

  • + → Operator token

  • 3 → Number token

  • × → Operator token

  • 4 → Number token

So the parser understands: “These are numbers and operators”

Step 3: Build a structure (tree)

Math has rules (operator precedence).

So the parser builds a tree structure like this:

    +
   / \
  2   ×
     / \
    3   4

This tree means:

  • First calculate 3 × 4

  • Then add 2

Step 4: Use the structure

Now the computer can:

  • Evaluate it → 2 + (3 × 4) = 14

  • Or transform it

  • Or optimize it
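The four steps above can be sketched in a few lines of Python. The function names `tokenize`, `parse`, and `evaluate` are chosen for this example, and the sketch supports only non-negative integers with `+` and `×`; anything else is outside its scope.

```python
import re

def tokenize(text):
    """Group raw characters into ("number", value) and ("op", symbol) tokens."""
    tokens = []
    for number, op in re.findall(r"\s*(?:(\d+)|([+×]))", text):
        tokens.append(("number", int(number)) if number else ("op", op))
    return tokens

def parse(tokens):
    """Build a tree in which × binds tighter than +."""
    pos = 0

    def parse_number():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != "number":
            raise ValueError("expected a number")
        value = tokens[pos][1]
        pos += 1
        return value

    def parse_product():
        nonlocal pos
        node = parse_number()
        while pos < len(tokens) and tokens[pos] == ("op", "×"):
            pos += 1
            node = ("×", node, parse_number())
        return node

    def parse_sum():
        nonlocal pos
        node = parse_product()
        while pos < len(tokens) and tokens[pos] == ("op", "+"):
            pos += 1
            node = ("+", node, parse_product())
        return node

    return parse_sum()

def evaluate(node):
    """Walk the tree bottom-up to compute the result."""
    if isinstance(node, int):
        return node
    op, left, right = node
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) * evaluate(right)

tokens = tokenize("2 + 3 × 4")
tree = parse(tokens)
print(tree)            # ('+', 2, ('×', 3, 4))
print(evaluate(tree))  # 14
```

Because `parse_sum` asks `parse_product` for each operand, the multiplication ends up deeper in the tree and is evaluated first, which is how operator precedence shows up in the structure.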

Summary

When you type a URL and press Enter, the browser fetches data from the server using the networking layer. The rendering engine then parses HTML to build the DOM and parses CSS to build the CSSOM. These two trees are combined to create the render tree, which is used to calculate layout, paint pixels, and display the page on the screen. JavaScript can modify the DOM and CSSOM during this process. Overall, a browser works as a collection of components that cooperate to turn raw web data into a visible, interactive webpage.

Thank you for reading & Happy Coding 👨‍💻

