There have been a lot of posts and much flutter about Google’s purported new Chrome OS: how it will be a great battle with Microsoft, how it will confuse and/or kill Android, etc. And, of course, taking off from the Google Blog post on Chrome OS (“For application developers, the web is the platform. All web-based applications will automatically work and new applications can be written using your favorite web technologies”), people are also throwing in “Web OS” everywhere.
I thought I’d write a post detailing my (technical) thoughts on this entire issue.
First off, let us define what a “Web OS” is. The market uses this term mostly for a “web based desktop environment”, which is nothing but a standard OS that happens to export a desktop view over HTTP, so people can launch a browser and work on a remote OS via the web. Technically speaking, this is not a “Web OS”. It is any standard OS (Linux, Windows, etc.) that happens to give users desktops which can be accessed via the web. The innovation here is that, with the advent of broadband and AJAX, web based access to a remote desktop is much closer to realtime than it was years ago. To put it succinctly, in this “definition” of Web OS (which is a misleading term), the innovation is AJAX: how to update the UI quickly and smoothly so that users don’t feel the latency of the web. Examples of such “Web OSs” are EyeOS, Ghost OS, etc.
Then, more recently, the term “Web OS” has been used in a different sense, which to me is still not purely correct, but is more accurate than the above usage. While such efforts are not new, the increasing popularity of the web as a platform that powers social interactions has given them renewed vigor. In this definition, the “Web” is not used as a platform to “deliver the OS UI” as in the previous example. Instead, “Web Components” are integrated “as closely as possible” into the core OS architecture, thereby making web based access and web based programming faster and more powerful. Let us get into this in a little more detail, as this is really where Google Chrome OS, Palm’s WebOS and others are headed.
The path to a “Web OS”
Let’s first take a look at the traditional OS and application layer architecture (there are variations, but let’s take the most general case).
Basically, at a layman level, the “core system OS and utilities” is the heart of the operating system. It takes care of access to hardware, memory management, thread management and application lifecycle (how to start, suspend or terminate any application). Typically, in addition to this, it includes a TCP/UDP/IP stack as a device driver that any application layered on top can use for network communication. All of this lives in ‘kernel space’, which is ‘sacrosanct’ in an operating system.

At a high level, any OS is divided into two spaces, the ‘kernel space’ and the ‘user space’. Typically, the ‘kernel space’ uses a single memory address space, which makes code running there execute fast, but it also means that a crash there can crash the entire system. In contrast, the user space hosts all ‘userland’ applications, with each one having its own virtual memory space. This architecture attempts to make sure that a rogue application does not bring down the complete system. In addition, userland applications do not get direct access to system hardware. For example, if they need accelerated graphics, they have to use a system library to access it (DirectX, for example).

Now, let us assume someone is writing a “Facebook application” on this OS via the Facebook APIs. To the Core OS, there is no difference between this application and an Excel program. It is just another process/thread (or a set of them) that the OS is managing and multi-tasking. In this model, the Core OS is completely unaware of the functions that the app is performing at a macro level (I mean, it knows this app is using TCP/IP, but doesn’t really know whether it is carrying a Facebook payload or a SOAP payload on top of HTTP on top of TCP). In userland, an application has very limited control over its own prioritization and scheduling (besides making requests to the OS).

One of the key delay factors introduced by userland applications is the concept of the “context switch”. Each time the userland application invokes a system API, the OS “switches context” from user space into the kernel, which is an expensive operation (it involves saving data space, registers, etc.). This “context switch” also happens if the OS decides to swap this application out for another one (why? Because that is how multitasking is done; context switches happen every few milliseconds) or if an interrupt occurs (more details here).
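To make the cost discussion a little more concrete, here is a minimal sketch that asks the kernel how many context switches the current process has gone through while running a given piece of code. It assumes a Unix-like system, since it relies on Python's standard `resource` module; the helper names (`context_switches`, `switches_during`, `sleep_briefly`, `compute_only`) are made up for this illustration. Note that these counters only track scheduler-level switches (the process being swapped out), not every individual system call.

```python
import resource
import time


def context_switches():
    """Return (voluntary, involuntary) context-switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def switches_during(action):
    """Run action() and return how many context switches were counted meanwhile."""
    vol_before, invol_before = context_switches()
    action()
    vol_after, invol_after = context_switches()
    return vol_after - vol_before, invol_after - invol_before


def sleep_briefly():
    # Each sleep blocks in the kernel, so the scheduler is free to run
    # something else; this normally shows up as voluntary switches.
    for _ in range(20):
        time.sleep(0.001)


def compute_only():
    # Pure user-space arithmetic: no blocking calls, so any switches here
    # are the scheduler preempting us (involuntary), if they occur at all.
    total = 0
    for i in range(100_000):
        total += i * i
    return total


if __name__ == "__main__":
    print("sleeping :", switches_during(sleep_briefly))
    print("computing:", switches_during(compute_only))
```

The exact numbers depend on the machine and on whatever else is running, so treat the output as a rough indication rather than a benchmark.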
To cut a long story short, this is a prime example of an architecture in which the OS treats the “web application” as just another application. No special recognition. No special treatment. Furthermore, there are several issues. For instance, there is no guarantee that different web apps use the same stacks: some systems provide basic stacks like HTTP, while others need the application to bundle one. While it is technically possible to ‘share’ these resources, they are very often not shared, and each application uses its own (more bloat, more memory).
On top of that, browser engines such as WebKit are also embedded in userland, which means that every time a UI written in HTML and CSS has to be actually rendered on the screen, there is a context switch going from user to kernel space.
So really, that is where I see a “Web OS” coming in.
The core principles of a “Web OS”, to me, are:
- The browser will be converted into an application canvas. To put it another way, there will be no difference in capabilities or speed between building a native application as we know it today and writing an HTML/JS/CSS application that is rendered by the browser. Think of it this way: today, when a program gets executed, it is represented in a particular format (say, ELF on Linux), and the loader decodes the ELF headers and transfers execution as specified in them. In a Web OS, the browser object takes the place of that ELF concept. The ‘browser’ engine is the system execution engine, so programs written for the browser are ‘native’ to the system. It is also important to realize that a “browser” here is not the end application with an address bar and menu. That is why I keep using the term “browser engine”: it is just a plain canvas that happens to understand protocols, data formats and specifications like HTTP/HTML/JS/Ruby/CSS in order to ‘display and execute’ program logic
- As part of the programming interface, the programmer will have the capability to extend the core JS (or Ruby or whatever) engine and write plugins to access system resources where needed (theoretically they should not have to, but practically, we will always find resources that are not available by normal API access)
- Access to any system resource will be via a URL scheme (again, a URL does not mean it needs to connect to a resource outside of your computer)
- Browser memory management will be part of the kernel memory management. Threads and processes instantiated within the browser engine will be directly mappable to system threads/processes
- The OS (or a privileged layer on top) will provide ‘building blocks’ that will allow developers to build applications that can tie into social networks better (what this means depends on what aspects of social networks are considered to be core by the designer)
- The OS will have the capability to completely and natively manage whether a resource is available locally or remotely. However, it is important to note that ‘having the capability’ is different from ‘doing it’. To me, a ‘Web OS’ is about adapting web programming standards to build powerful, native applications effectively. It is not about forcing the developer to use or not use any resource outside of his/her computer. In other words, it will eventually be up to the developer to access or not access remote resources. However, should the developer choose to do so, he/she will not have to manage them separately: they will be transparent, but configurable.
- The ‘Web OS’ will not reinvent scheduling, pooling, switching, etc. from scratch. It will base itself on robust kernels (like Linux) and extend the code to make the web components discussed above more integral and the prime focus for application developers
- The desktop presented by the OS will effectively be a headless browser, with every object made dynamic and updatable in real time via technologies such as (reverse) AJAX, RSS, push/pull, etc.
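Two of the principles above, access to system resources via a URL scheme and transparent handling of local versus remote resources, can be illustrated with a small sketch. This is only a toy model under stated assumptions: the `device` and `cloud` schemes, the `ResourceResolver` class and both handlers are hypothetical names invented for this example, and no real driver or network call is involved. The only real API used is `urlparse` from Python's standard library.

```python
from urllib.parse import urlparse


class ResourceResolver:
    """Dispatch resource URLs to handlers by scheme (a hypothetical design)."""

    def __init__(self):
        self._handlers = {}

    def register(self, scheme, handler):
        # Configurable: the same scheme could later be re-pointed elsewhere.
        self._handlers[scheme] = handler

    def open(self, url):
        parts = urlparse(url)
        handler = self._handlers.get(parts.scheme)
        if handler is None:
            raise ValueError(f"no handler for scheme {parts.scheme!r}")
        return handler(parts)


def local_device(parts):
    # Stand-in for a local resource; a real system would talk to a driver.
    return {"where": "local", "device": parts.netloc, "path": parts.path}


def remote_store(parts):
    # Stand-in for a remote resource; this sketch makes no network call.
    return {"where": "remote", "host": parts.netloc, "path": parts.path}


resolver = ResourceResolver()
resolver.register("device", local_device)  # hypothetical scheme
resolver.register("cloud", remote_store)   # hypothetical scheme

if __name__ == "__main__":
    # The calling code looks the same for local and remote resources.
    print(resolver.open("device://camera/front"))
    print(resolver.open("cloud://example.com/photos/1"))
```

The point of the design is that the application only ever sees a URL; whether the handler behind a scheme is local or remote is a configuration decision, which matches the “transparent, but configurable” idea above.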