Many people, when opening a Chrome extension project for the first time, see a bunch of filenames and immediately get confused. index.html, style.css, manifest.json, background.js, content.js, app.js all look like "code files," but they don't operate on the same level at all. To truly understand an extension, the key isn't memorizing filenames, but understanding where these files are positioned, what responsibilities they bear, and how they work together with the browser, web pages, and user interface.
What this article aims to do is straightforward: to thoroughly connect the logic behind these files, so that when you see an extension directory, you can quickly determine what each file is actually doing.
index.html is the skeleton of the extension interface
In a Chrome extension, index.html is typically used to describe the content and structure of a page. It determines what elements will appear on the page, such as headings, buttons, input fields, text areas, and is also responsible for linking in style files and script files. You can think of it as the skeleton of an interface, because what "exists" on the page is primarily determined by HTML.
If an extension has a popup interface, the content seen after clicking the extension icon usually comes from an HTML file. Some projects name it popup.html, others might call it index.html. The name isn't important; what matters is whether it's actually loaded by the browser as an extension page.
A minimal popup page is enough to illustrate this role. Its HTML tells the browser that the page has a heading and a button, and it links in a style file. What content is on the page is determined entirely by that HTML; every interface element you see essentially starts here.
style.css determines what the interface looks like
If HTML is responsible for laying out page elements, then style.css is responsible for making these elements readable, giving them visual hierarchy, and making the result feel like a truly usable interface. Font size, color, background, margins, button appearance, and the arrangement of elements all belong to the domain of CSS.
For example, a few rules that set the font size, add margins, and restyle the button don't change what "exists" on the page, but they significantly change what the page "looks like." This is precisely the role of CSS. Many beginners initially confuse HTML and CSS, but they solve two different problems: HTML determines content and structure; CSS determines visuals and layout.
It's the same in the extension environment. Whether the page is a popup page, an options page, or another page provided by the extension, the responsibility of CSS remains stable: to turn a bare structure into something readable, operable, and in line with interface conventions.
In extensions, HTML, CSS, and JavaScript each occupy different positions
Looking only at HTML and CSS, you've only understood the static part of the extension interface. A truly usable extension must also make the interface "dynamic," and that's when JavaScript comes in.
HTML is responsible for building the page structure, CSS is responsible for giving it visual effects, and JavaScript is responsible for making the page react to user actions. Fetching information after a user clicks a button, or displaying a result on the page, is the job of JavaScript.
The following code is simple, but it accurately demonstrates the role of JavaScript:
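A minimal sketch of such a script, assuming the popup page contains a button with the id `show-message` and a paragraph with the id `result`. Both ids are hypothetical and chosen only for illustration.

```javascript
// Popup script sketch. The element ids "show-message" and "result"
// are assumptions for illustration; use whatever ids your HTML defines.

// Pure helper: writes a message into the given element.
function showClicked(resultElement) {
  resultElement.textContent = "The button was clicked.";
}

// Only wire up the DOM when one exists (that is, inside the extension page).
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => {
    const button = document.getElementById("show-message");
    const result = document.getElementById("result");
    if (button && result) {
      button.addEventListener("click", () => showClicked(result));
    }
  });
}
```

Keeping the click behavior in a small named function makes it easy to see where the "do something when clicked" part lives.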
Now you can see the collaborative relationship between the three. HTML places a button, CSS makes that button clearer and more usable, and JavaScript gives that button the ability to "do something when clicked." Together, they form the complete logic of the extension interface layer.
manifest.json is the entry point and rule center of the extension
When you shift your focus away from the interface, you'll see the extension's core configuration file: manifest.json. This file is extremely important because when the Chrome browser installs and loads an extension, it reads this file first. Without it, the extension cannot be recognized. If it's written incorrectly, the extension may not run at all.
Its responsibilities can be summarized as one thing: telling the browser who this extension is, what pages it has, what scripts it has, what permissions it wants to request, and how these capabilities should be organized.
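The simplest manifest usually contains just three fields: manifest_version, name, and version.
These record the basic identity information of the extension. Next, you'll also see it declare the extension's popup page, typically through a default_popup entry under the action key that points at popup.html.
When the browser reads this, it knows that when the user clicks the extension icon, it should open popup.html. If the extension has a background script, a similar entry appears under the background key; in Manifest V3 it is a service_worker field naming the script file.
If the extension needs to inject scripts into web pages, it declares them under content_scripts, pairing a matches list of URL patterns with a js list of files.
If the extension needs to access certain browser capabilities, it must also explicitly request them in the permissions array.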
So, from a more fundamental perspective, the role of manifest.json is to define what the extension "can do, where it does it, and through whom." This is why it's like a master switch. The pages, scripts, and permissions you see later ultimately need to be confirmed here.
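To see these declarations side by side, here is a sketch that writes a Manifest V3 manifest as a JavaScript object and prints it as JSON. The real manifest.json is a plain JSON file, not a script. The extension name, the file names, and the match pattern are assumptions for illustration.

```javascript
// Sketch only: the object below mirrors what a manifest.json could contain.
// The name, file names, and URL pattern are hypothetical.
const manifest = {
  manifest_version: 3,                              // identity
  name: "Page Title Reader",
  version: "1.0.0",
  action: { default_popup: "popup.html" },          // popup page
  background: { service_worker: "background.js" },  // background script
  content_scripts: [
    { matches: ["https://example.com/*"], js: ["content.js"] }, // injected script
  ],
  permissions: ["activeTab"],                       // requested capability
};

// Printing it shows the JSON text that would go into manifest.json.
console.log(JSON.stringify(manifest, null, 2));
```

Each top-level key answers one of the questions above: who the extension is, which page opens from the icon, which script runs in the background, which script enters web pages, and which permissions are requested.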
background.js is like the extension's background dispatcher
After understanding manifest.json, looking at background.js becomes much easier. This file is typically not responsible for displaying interfaces, nor is it directly embedded in a web page. It's more like the extension's background control layer, responsible for listening to browser events, handling global logic, and coordinating communication between different modules.
For example, when the extension is first installed, it can execute initialization logic:
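A minimal sketch of such initialization, using the `chrome.runtime.onInstalled` event. The helper name `describeInstall` and the log messages are hypothetical; real setup work would replace the logging.

```javascript
// background.js sketch. describeInstall is a hypothetical helper.

// Decides what to do based on why the event fired.
function describeInstall(details) {
  return details.reason === "install"
    ? "first install: run one-time setup"
    : `not a first install (reason: ${details.reason})`;
}

// Register the listener only when the extension API is available.
if (typeof chrome !== "undefined" && chrome.runtime?.onInstalled) {
  chrome.runtime.onInstalled.addListener((details) => {
    console.log(describeInstall(details));
  });
}
```

The event also fires on updates, which is why the sketch checks `details.reason` before treating it as a first install.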
It can also listen for certain global events or receive messages from interface pages and content scripts:
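A minimal sketch of a message listener, assuming a hypothetical message shape `{ type: "PING" }`. The `sender.tab` property is present when the message comes from a content script, which lets the background tell the two kinds of sender apart.

```javascript
// background.js sketch. The "PING" message type is an assumption.

// Builds the reply for one incoming message.
function handleMessage(message, sender) {
  if (message && message.type === "PING") {
    // sender.tab is set for content scripts, absent for extension pages.
    const from = sender && sender.tab ? "content script" : "extension page";
    return { ok: true, from };
  }
  return { ok: false, error: "unknown message type" };
}

if (typeof chrome !== "undefined" && chrome.runtime?.onMessage) {
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    sendResponse(handleMessage(message, sender));
  });
}
```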
Why does an extension need such a file? Because some things aren't suitable to be written in interface scripts or web page injection scripts. Managing state uniformly, coordinating multiple tabs, handling browser-level events, and calling extension APIs that content scripts are not allowed to use are all tasks better placed in the background script.
If you're using Manifest V3, there's an important change here. The background script is registered as a service worker (the service_worker field in the manifest). It doesn't stay resident all the time; it wakes up when an event arrives and may go to sleep after completing its task. In practice, this means state kept only in global variables can be lost between events. This reflects Chrome's design orientation: it wants extensions to use fewer resources and to be easier to keep secure.
content.js is the executor after the extension enters the web page
If background scripts handle logic at the browser level, then content.js works on-site within the web page. It is injected into a web page, so it can directly access that page's DOM: the actual element structures on the page, such as headings, buttons, body text, and input fields.
Take a simple example:
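A minimal sketch of a content script that reads a page's heading. It assumes "title" means the first `h1` element on the page; the function name is hypothetical.

```javascript
// content.js sketch. Assumes the title of interest is the first <h1>.

// Returns the trimmed text of the first <h1>, or "" if there is none.
function readFirstHeading(doc) {
  const heading = doc.querySelector("h1");
  return heading ? heading.textContent.trim() : "";
}

// Inside a web page, log what the extension can see.
if (typeof document !== "undefined") {
  console.log("Heading found by the extension:", readFirstHeading(document));
}
```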
This code can directly read the title content on the web page. It can also modify the page:
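A minimal sketch of modifying the page from a content script. The outline style and the text prefix are arbitrary choices for illustration.

```javascript
// content.js sketch. The visual change below is an arbitrary example.

// Outlines the first <h1> and prefixes its text; returns whether it found one.
function markHeading(doc) {
  const heading = doc.querySelector("h1");
  if (!heading) return false;
  heading.style.outline = "2px solid orange";
  heading.textContent = `[seen by extension] ${heading.textContent}`;
  return true;
}

if (typeof document !== "undefined") {
  markHeading(document);
}
```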
It can even listen for certain operations on the page. In other words, the core value of content.js lies in allowing the extension to truly enter the web page environment, see the page content, and read or modify the page.
However, there's a boundary here that's very easy to overlook. Although content.js runs in the web page, it still belongs to the extension. It has the capabilities granted by the extension and is also subject to the restrictions of the extension environment. It shares the page's DOM with the page's own scripts, but not their JavaScript variables and functions, because the browser runs content scripts in an isolated world so the two sides cannot interfere with each other. This detail is crucial: many beginners assume content scripts and the web page's own JavaScript are exactly the same thing, and they are not.
The core difficulty of extensions lies in the simultaneous existence of multiple runtime environments
When you look at popup.js, background.js, and content.js together, it's easy to think that, since they're all JavaScript, they differ only in how they're written. The real difference isn't in the syntax, but in the runtime environment.
Interface scripts run in the extension's own pages and are only active when those pages are open. Background scripts run in the extension background, specifically handling global events and relay logic. Content scripts run in the target web page, responsible for interacting with the page itself. Although these three types of scripts are all written as .js files, the objects they can access, the permissions they have, and their lifecycles are all different.
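One rough way to make the three environments concrete is a check that a script could run on itself. This is a heuristic sketch, not an official API: it assumes a Manifest V3 service worker has no `document`, and that extension pages are served from the `chrome-extension:` scheme. The function name is hypothetical.

```javascript
// Heuristic sketch: guesses which environment a script is running in.
// The function name and the returned labels are assumptions for illustration.
function detectContext(globalObject) {
  // A Manifest V3 background service worker has no DOM.
  if (typeof globalObject.document === "undefined") {
    return "background (no DOM)";
  }
  // Extension pages such as the popup use the chrome-extension: scheme.
  const protocol = globalObject.location ? globalObject.location.protocol : "";
  if (protocol === "chrome-extension:") {
    return "extension page";
  }
  // Anything else with a DOM is a web page, where content scripts run.
  return "web page (content script)";
}

// Logs the environment this file was loaded into.
console.log(detectContext(globalThis));
```

The same file would report a different answer depending on where it is loaded, which is the point: the environment, not the file, determines what the code can reach.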
This is where the learning curve for Chrome extensions truly becomes steep. What often holds you back isn't the API, but the lack of a mental picture of "multi-environment collaboration." Once you establish this picture, looking at the file structure becomes much clearer.
How does the most common collaboration flow actually run?
Let's assume we're building a very simple extension. The user clicks the extension icon, a small window pops up, there's a button inside the window, and clicking the button reads the title of the current web page and displays it. This feature is small, but it's enough to connect all the roles mentioned earlier.
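First, the browser reads manifest.json. For this example it needs only the basic identity fields, an action entry whose default_popup is popup.html, a background entry for background.js, and the activeTab permission.
After this step, the browser knows what the extension is, which popup page and background script it has, and that it has requested the activeTab permission.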
When the user clicks the extension icon, the browser opens popup.html based on default_popup. Once the page opens, HTML renders the structure, CSS handles the styling, and the page script handles the interaction logic. If popup.html has a button and an area to display the result, the script can be written like this:
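A minimal sketch of such a popup script, assuming a button with the id `read-title` and a result area with the id `result` (both hypothetical), and assuming the script is loaded at the end of popup.html so the elements already exist. It uses `chrome.tabs.query`, which returns a promise in Manifest V3.

```javascript
// popup.js sketch. The ids "read-title" and "result" are assumptions.

// Asks the browser for the active tab in the current window and returns its title.
// The API object is passed in so the function is easy to exercise in isolation.
async function getActiveTabTitle(chromeApi) {
  const [tab] = await chromeApi.tabs.query({ active: true, currentWindow: true });
  return tab && tab.title ? tab.title : "(no title available)";
}

if (typeof chrome !== "undefined" && typeof document !== "undefined") {
  const button = document.getElementById("read-title");
  const result = document.getElementById("result");
  if (button && result) {
    button.addEventListener("click", async () => {
      result.textContent = await getActiveTabTitle(chrome);
    });
  }
}
```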
If the requirement is just to read the title of the current tab, this is enough. But if you want to read more detailed content inside the web page, like a paragraph of body text, a button's text, or an element's attribute, then relying solely on the popup script usually isn't enough; you need content.js to run inside the web page.
It can first read the web page content and then send the result back to the extension system via the messaging mechanism:
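A minimal sketch of that step, assuming a hypothetical message shape with the type `PAGE_INFO` and assuming the content of interest is the first paragraph on the page.

```javascript
// content.js sketch. The "PAGE_INFO" type and field names are assumptions.

// Gathers a small summary of the page from its DOM.
function collectPageInfo(doc) {
  const paragraph = doc.querySelector("p");
  return {
    type: "PAGE_INFO",
    title: doc.title,
    firstParagraph: paragraph ? paragraph.textContent.trim() : "",
  };
}

if (
  typeof chrome !== "undefined" &&
  chrome.runtime?.sendMessage &&
  typeof document !== "undefined"
) {
  // Send the summary to the rest of the extension.
  chrome.runtime.sendMessage(collectPageInfo(document)).catch(() => {
    // No listener is available right now; this sketch simply ignores that.
  });
}
```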
At this point, if the process is slightly more complex, the background script comes into play as coordinator. For example, the popup first sends a message to the background, the background then contacts the content script in the current tab, the content script gets the web page data and returns it to the background, and the background forwards the result to the popup. This chain seems to add an extra layer, but the division of labor is clear: the interface handles only user interaction, the background handles coordination and dispatching, and the content script handles only the web page itself.
The sample code can roughly be written like this.
popup.js:
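A minimal sketch of the popup side, assuming the background answers a hypothetical `GET_PAGE_TEXT` message with either `{ ok: true, data: { text } }` or `{ ok: false, error }`. The element ids are also assumptions.

```javascript
// popup.js sketch. Message type, reply shape, and element ids are assumptions.

// Asks the background for the page text and turns the reply into display text.
async function requestPageText(chromeApi) {
  const response = await chromeApi.runtime.sendMessage({ type: "GET_PAGE_TEXT" });
  if (!response || !response.ok) {
    return `Error: ${response ? response.error : "no response"}`;
  }
  return response.data.text;
}

if (typeof chrome !== "undefined" && typeof document !== "undefined") {
  const button = document.getElementById("read-page");
  const result = document.getElementById("result");
  if (button && result) {
    button.addEventListener("click", async () => {
      result.textContent = await requestPageText(chrome);
    });
  }
}
```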
background.js:
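A minimal sketch of the coordinator, assuming the popup sends a hypothetical `GET_PAGE_TEXT` message and the content script answers a hypothetical `READ_PAGE` message. Returning `true` from the listener keeps the reply channel open while the asynchronous work finishes.

```javascript
// background.js sketch. Both message types are assumptions for illustration.

// Finds the active tab, asks its content script for data, and wraps the reply.
async function relayToActiveTab(chromeApi) {
  const [tab] = await chromeApi.tabs.query({ active: true, lastFocusedWindow: true });
  if (!tab || tab.id === undefined) {
    return { ok: false, error: "no active tab" };
  }
  try {
    const reply = await chromeApi.tabs.sendMessage(tab.id, { type: "READ_PAGE" });
    return { ok: true, data: reply };
  } catch (err) {
    // Typically means no content script is listening in that tab.
    return { ok: false, error: String(err && err.message ? err.message : err) };
  }
}

if (typeof chrome !== "undefined" && chrome.runtime?.onMessage) {
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (!message || message.type !== "GET_PAGE_TEXT") return false;
    relayToActiveTab(chrome).then(sendResponse);
    return true; // keep the channel open for the asynchronous reply
  });
}
```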
content.js:
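A minimal sketch of the page side, assuming it is asked with a hypothetical `READ_PAGE` message and that the content of interest is the first paragraph. It replies synchronously, so the listener does not need to return `true`.

```javascript
// content.js sketch. The "READ_PAGE" type and reply fields are assumptions.

// Reads the page title and the first paragraph from the DOM.
function readPage(doc) {
  const paragraph = doc.querySelector("p");
  return { title: doc.title, text: paragraph ? paragraph.textContent.trim() : "" };
}

// Returns a reply for messages this script understands, otherwise null.
function handleRequest(message, doc) {
  return message && message.type === "READ_PAGE" ? readPage(doc) : null;
}

if (
  typeof chrome !== "undefined" &&
  chrome.runtime?.onMessage &&
  typeof document !== "undefined"
) {
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    const reply = handleRequest(message, document);
    if (reply) sendResponse(reply);
  });
}
```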
Once you truly understand this process, you have a working picture of the overall architecture of extensions. You'll realize that the essence of extension development isn't simply writing a page, but connecting the browser environment, the extension interface, and the web page environment into one system.
What exactly is app.js, and why is it always confusing?
Many people who have followed along to this point see a new filename, app.js, and start wondering whether they've missed some "official role." The most important point to clarify is that app.js is not a file required by the Chrome extension specification. It's usually just a name the developer chose.
This means that when you see app.js in an extension project, you can't directly judge what it's responsible for based on its name. The key to determining its responsibility always comes down to two things: where it's loaded and in what environment it runs.
If popup.html loads it with a tag such as <script src="app.js"></script>, then this app.js is likely the main logic script for the popup page. It might be responsible for listening to button clicks, reading input field content, calling browser APIs, updating page text, and other interactive behaviors. For example:
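A minimal sketch of an app.js playing that popup role. The element ids `name-input`, `greet-button`, and `output`, and the greeting itself, are hypothetical.

```javascript
// app.js sketch, loaded by popup.html. All element ids are assumptions.

// Turns the raw input value into the text to display.
function buildGreeting(rawName) {
  const name = rawName.trim();
  return name ? `Hello, ${name}!` : "Please enter a name.";
}

if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => {
    const input = document.getElementById("name-input");
    const button = document.getElementById("greet-button");
    const output = document.getElementById("output");
    if (!input || !button || !output) return;
    // Read the input field on click and update the page text.
    button.addEventListener("click", () => {
      output.textContent = buildGreeting(input.value);
    });
  });
}
```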
If it's imported by options.html, it might be the settings page script. If it appears in the background configuration of manifest.json (in Manifest V3, as the value of background.service_worker), then even though it's named app.js, it actually takes on the responsibility of a background script. If it's listed in the js array of a content_scripts entry, then it's essentially a content script.
So, the most important sentence for understanding app.js is: The filename itself does not determine identity; the loading location and runtime environment determine identity. Many beginners are easily misled by filenames, thinking that something named a certain way must be that thing. In actual development, names like main.js, index.js, and app.js are very common; they reflect engineering naming habits more than browser specifications.
What is the most reliable method for determining a script's role?
If you ever open an unfamiliar extension project again, the safest way to read it is to first look at manifest.json, then at the HTML files, and only then at the specific script content.
manifest.json will tell you what pages there are, what background scripts there are, what content scripts there are, and what permissions are requested. HTML files will tell you which JavaScript files are page scripts because they are directly imported via <script src="...">. The script content itself will further tell you what specific business logic is written inside this file.
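The reading order above can be sketched as a small lookup. This is an illustrative helper, not part of any Chrome API: `classifyScript` is a hypothetical name, and `htmlPages` is an assumed map from each HTML file to the script files it imports, which you would collect by reading the pages yourself.

```javascript
// Illustrative helper: decides a script's role from where it is loaded.
// classifyScript and the htmlPages map are assumptions, not a Chrome API.
function classifyScript(manifest, htmlPages, file) {
  // 1. The manifest's background entry.
  if (manifest.background && manifest.background.service_worker === file) {
    return "background script";
  }
  // 2. The manifest's content_scripts entries.
  const contentScripts = manifest.content_scripts || [];
  if (contentScripts.some((entry) => (entry.js || []).includes(file))) {
    return "content script";
  }
  // 3. Scripts imported by HTML pages.
  for (const [page, scripts] of Object.entries(htmlPages)) {
    if (scripts.includes(file)) return `page script for ${page}`;
  }
  return "unknown";
}

const exampleManifest = {
  background: { service_worker: "app.js" },
  content_scripts: [{ matches: ["<all_urls>"], js: ["inject.js"] }],
};
const examplePages = { "popup.html": ["main.js"] };

console.log(classifyScript(exampleManifest, examplePages, "app.js"));    // background script
console.log(classifyScript(exampleManifest, examplePages, "inject.js")); // content script
console.log(classifyScript(exampleManifest, examplePages, "main.js"));   // page script for popup.html
```

Note that the file named app.js comes out as a background script here, purely because of where the manifest loads it.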
This reading order is important because it forces you to understand the code from the "runtime context" rather than from "filename guessing." Once you develop this habit, not only will you understand Chrome extensions much