Author: Kay Ramme
Created: 01/25/2006
Type: white paper
From a user and consumer perspective, computers and networks of computers are currently just what they seem to be: a collection of loosely coupled entities. Consumers and users need to take explicit care of where they store their data, where they run their processes and where they display their content. Consumers need to ensure that their data stays consistent when distributed and backed up, that software gets updated regularly and that processes are restarted after a system shutdown or crash. They need to ensure that links (e.g. URLs) stay valid, and so on. In effect, users and consumers are forced to do what classically were the jobs of system administrators.
This paper suggests completely removing the burden of system administration from users, consumers and administrators by establishing a software system / framework which deals with administration automagically, falling back to viable defaults for most scenarios and providing a new freedom and simplicity when interacting with computers or computer networks.
The author believes that this is technically feasible and desirable, and that the importance of such a framework grows at least linearly with the number of installed systems and computer appliances (such as cell phones, PDAs, set-top boxes, etc.).
Some scenarios are described which are in one way or another unsatisfactory; each is contrasted with an ideal scenario. Requirements are derived from the ideal scenarios, giving a brief understanding of what a supporting (software) system would look like.
Because of the very important aspect of continuity (see below), a system fulfilling these requirements is called a Software Continuum.
- 1 Scenarios
- 2 Requirements
1 Scenarios
Below are some usage scenarios describing the current state of affairs. All scenarios emphasize problems with current systems, be it that the user needs to fulfill administrative tasks or that things are inconvenient or error-prone.
System Shutdown / Crash
Current: If for any reason a system crashes or needs to shut down, the user / administrator has to ensure that all provided services / running applications become available again after the system restarts or after migration to another system. This can partly be automated with initialization scripts etc., which is particularly difficult when switching hardware or hard disks, and which is in general not possible for the user's applications (word processors, spreadsheets, etc.). In particular, most runtime data, such as open connections and state information (e.g. window positions, open documents, recent changes), gets lost, as current processes are neither persistent nor transactional, and a crash may leave the system in an inconsistent state, even with journaling file systems.
Ideal: A system always works transactionally, on all levels, ensuring that at most a specified amount of data may be lost during a crash and guaranteeing that the system always comes up in a consistent state. All data and state is persistent and platform independent, which allows running processes / applications to be migrated to another system, ensuring seamless operation in case of a necessary shutdown.
Deployment of Software
Current: Today, users need to explicitly deploy or install software on (exactly) one computer, which basically means that the software has been deployed for this one computer only. A user therefore always needs to be aware of which software has been deployed where, while his/her ability to roam from one system to another is limited. The user also needs to explicitly ensure that all installed software is kept current. The user has to administer all his computers, including PCs, set-top boxes, cell phones, PDAs, game consoles, car systems etc.
Ideal: All of a user's software is explicitly listed in the user's profile. All software is always available while the user roams from one computer to another. On the availability of a new version of a particular piece of software, the system automatically synchronizes the local copies with the provider. The user does not need to carry out any administrative tasks.
Remote Hardware Access
Current: There is a trend towards remote hardware access in current computer networks, providing access to video or sound devices of remote systems. This access is implemented explicitly, by inventing another level of abstraction (the first level is the operating system's driver abstraction), creating diverse streaming protocols etc., and adapting current software subsystems to be able to use these abstractions. Examples are NAS, NFS, X11, VNC, Samba, ssh, rlogin and telnet. All current solutions differ in user interfaces, abilities and configuration, leaving the administration to the user. The end goal of such systems is to eventually reach remote transparency.
Ideal: Any kind of hardware may be used remotely; the user just selects what he wants to use via a common interface. Programmers only develop standard drivers for the different hardware, while leaving remote accessibility to the runtime system, so that remote accessibility becomes the default. Only in case of special requirements does the driver programmer need to implement a custom protocol to solve any streaming problems.
Data Sharing
Current: Users with the need to share data can only do this explicitly, e.g. by sending document files around via e-mail. This inherently injects multiple copies into the system, which need to be versioned and synchronized properly.
Ideal: Users would just make data available as services. E.g. a document to be worked on collaboratively could be exported as a service. Others would then be able to alter this document directly.
Physical Context
Current: Currently, there is no generic notion of a user's current physical context, namely the set of nearby devices participating in the network. That means that users explicitly have to select the printers they want to print on, the monitors where they want their applications to show up and the speakers they want to listen to. This is an administrative burden, in particular when a user roams, because he / she has to ensure that the current default devices reflect his / her location or time zone.
Ideal: The set of default devices for a user reflects the user's location, ensuring that printouts appear on the most logical (nearest) printer and that applications appear on the nearest monitor. All locations are described as contexts, which the user enters when logging into the system. Contexts may be nested, e.g. to reflect different levels of locality, such as table / room / building / city / state / country / continent / planet / galaxy. Even security privileges can be associated with a particular context, for example ensuring that some types of documents may only be viewed while physically inside the company. In another example, the outer context of a cell phone could provide the built-in headphone and microphone as the defaults for playing and recording sound, which could be overridden by a car's stereo system the moment the user enters the car.
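The nested-context lookup described above can be sketched in a few lines. The following Python sketch is purely illustrative; the class and all device names are invented for the example, not part of any existing system.

```python
# Sketch: nested contexts resolving a user's default devices.
class Context:
    """A nested capsule of defaults; lookups fall back to the parent."""
    def __init__(self, name, parent=None, **defaults):
        self.name = name
        self.parent = parent
        self.defaults = defaults

    def resolve(self, kind):
        """Return the nearest default device of the given kind."""
        ctx = self
        while ctx is not None:
            if kind in ctx.defaults:
                return ctx.defaults[kind]
            ctx = ctx.parent
        raise LookupError(f"no default {kind!r} in any enclosing context")

building = Context("building", printer="lobby-printer")
room = Context("room 42", parent=building, monitor="wall-display")
phone = Context("cell phone", parent=room, audio="built-in headset")
car = Context("car", parent=room, audio="car stereo")  # overrides the phone's audio

print(phone.resolve("audio"))    # built-in headset
print(car.resolve("audio"))      # car stereo
print(phone.resolve("printer"))  # falls back outward to lobby-printer
```

Entering the car simply means resolving against the car context instead of the phone context; no device has to be reconfigured.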
Offline Operation
Current: Users of remote services are basically lost in case the connection breaks down unexpectedly. In many cases the current state of work is lost after reestablishing the connection, forcing the user to recreate the state of affairs. In case of announced downtimes, users may copy their data to local storage beforehand to be able to work offline, but are then also forced to explicitly synchronize this data back when going online again.
Ideal: The system automatically takes care of distributing data and processes, leveraging local storage to its full extent for caching and ensuring that the user can continue to work as long as the desired data is cached. The user is even able to advise the system to go offline and to specify data to be cached, while the system takes care of automatic synchronization when reconnecting to the network, resolving potential conflicts automatically.
Latency
Current: Long distances, with their typically high latency, can render remote services unusable. E.g. an NFS volume exported by a server in China and imported by a client in Germany does not provide performance similar to a co-located export.
Ideal: Accessing distant resources which can be migrated or copied automatically leads to a migration or copying of the accessed data, giving the same performance as accessing local resources (replicate on demand).
Data Distribution
Current: Currently, users have to distribute and synchronize their data explicitly. Users distribute their data while roaming from one system to another. Users have to take care that modifications to one copy of the data get merged with the other copies. Depending on the type of data, this ranges from easy to impossible. Users are forced to do data administration.
Ideal: The user roams from system to system, while all data is instantly available, without the need to distribute it explicitly. The user may even alter his data on multiple unconnected systems, while the system ensures that all changes get merged when the systems get in touch again.
Interoperability
Current: Current software mostly interoperates only on an OS-defined level (drag & drop, clipboard, Object Linking and Embedding (OLE)). An application's components / objects cannot easily be integrated with other applications if not explicitly designed for interoperation. E.g. an application cannot easily store a document into a database or WebDAV server without dedicated support. The most generic level of interoperation currently is the file system, mostly providing access to byte sequences only.
Ideal: Applications are sets of components / objects which are addressable and combinable independently. A component / object which accesses databases may be used by every other component or object. Therefore, new software can easily be constructed out of existing parts.
Multiple Views
Current: Applications wanting to support multiple (distributed) views on the same data (model) need to support this explicitly. There is actually no generic way to realize e.g. teacher / learner or collaboration scenarios.
Ideal: Ideally, two (distributed) views would just share the same data (model), allowing e.g. concurrent manipulation of a text document.
Sessions
Current: A user logging into a system has to start and adjust (e.g. position the windows of) all needed programs. This can partly be automated by saving a session, but that only works for programs supporting it and typically includes very little state.
Ideal: A user logging into a system finds it in the same state as when he / she logged out. All program state has been saved (including transient changes, e.g. the last typed characters of the current document) and all data is instantly available again.
2 Requirements
The following requirements are roughly derived from the above scenarios. They are mostly independent of each other, though some directly or indirectly imply others. A system supporting these requirements supports all of the above scenarios inherently.
- Continuous: A system needs to be continuous, there must be no breaks in abstraction or interoperation.
- Extensible: A system needs to be fully extensible. First-class parts, e.g. types, can be added at any time (built-in vs. add-on types are indistinguishable) and may behave exactly like built-in parts.
- Contextual: Contextual data access must be supported. Object orientation can be seen as one way of implementing contexts.
- Transparent: The system must be transparent for the user and the programmer. A transparent system allows the programmer or user to focus on their tasks, without the need to understand or care about other things. Any transparency must be escapable, to allow the programmer to solve special aspects explicitly.
- Consistent: The system must be transactional, guaranteeing consistency all the time.
- Replicatable: The system must allow the whole of it, or parts, to be distributed / replicated over a network or into a file.
- Accessible: All parts of the system must always be directly accessible by the user. Systems only accessible through an API are not directly accessible (this is not to be confused with Section 508 accessibility requirements).
- Comparable: The system and its parts must be comparable. That means that parts can be diffed and joined (as in classical source code versioning systems such as CVS).
- Scalable: The system must scale, e.g. allowing different CPUs to alter parts of it simultaneously.
To achieve continuity, one may take a look at basic software operations with similar semantics (e.g. reading data from a file, reading data from memory) and provide a way to support these operations in the same way, by introducing transparency.
Examples for mismatches and their solution:
- primary vs. secondary storage (RAM vs. hard disk) -> Persistence
- local vs. remote access -> Remote Transparency
- file-system vs. database -> Transactional Transparency
- explicit vs. automatic concurrency -> Transparent Concurrency
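The first mismatch, primary vs. secondary storage, is already approximated by existing tools. As a minimal illustration (not the mechanism this paper proposes), Python's standard shelve module lets a program access values like an in-memory dictionary while they actually reside on disk and survive the process:

```python
# Persistence sketch: a dict-like object transparently backed by disk.
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), "continuum-demo")

with shelve.open(path) as store:   # looks like an ordinary dict ...
    store["window"] = {"x": 10, "y": 20}

with shelve.open(path) as store:   # ... but the data outlives the process
    assert store["window"] == {"x": 10, "y": 20}
```

The client code never distinguishes RAM from hard disk; the runtime decides where the bytes live.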
Remote transparency typically involves the requirement to access remote resources (resources of another process on the same or another machine), the same way as local resources. Remote transparency is currently not the default, but the exception. These exceptions are domain specific (e.g. NFS for files, NAS for audio, X11 for rendering 2D) and do not apply in general.
With full remote transparency, a client would not be able to distinguish local from remote resources at all. Unfortunately, this goal is not a hundred percent reachable. Fortunately, it seems that full transparency is not necessary for the system to be useful. Some problems of remote transparency are:
- Latency or time transparency: hardly solvable, but it does not need to be, as most systems (except realtime ones) have no hard time constraints anyway.
- Unexpected errors: Local objects are never unreachable, while remote objects may be.
To be fully extensible, the system must support the addition of first-class parts. Built-in parts are typically examples of first-class parts. E.g. if a system supports template types, such as arrays or structures, it needs to support the addition of new such template types.
Every operation is executed in a context. The contexts can be seen as a nested set of capsules, providing data (e.g. “global” variables) etc.
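The nested capsules can be modelled directly with Python's standard collections.ChainMap, where a lookup searches the innermost capsule first and falls back outward; a minimal sketch:

```python
# Contextual data access: inner capsules shadow outer ones.
from collections import ChainMap

global_ctx = {"editor": "default-editor", "lang": "en"}
session_ctx = {"lang": "de"}                      # overrides the global value
operation_ctx = ChainMap(session_ctx, global_ctx) # innermost capsule first

assert operation_ctx["lang"] == "de"                # found in the session
assert operation_ctx["editor"] == "default-editor"  # falls back outward
```

Object orientation provides the same shadowing along the inheritance chain, which is why it can be seen as one implementation of contexts.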
Ideally, a programmer developing software for a “transparent” system does not need to know about the system, but only about what should be achieved. Anything that can be taken care of by the system should be taken care of by the system; examples are:
- Transparent Concurrency: Ensuring mutual exclusion of multiple threads of execution for a piece of code and its associated data can either be realized by the programmer, by implementing proper locking internally, or by the system, externally.
- Transparent Transactionality: Transactionality can either be programmed explicitly into an object's methods by the programmer, or it can be achieved by the runtime system, e.g. by creating a backup of an object before altering it. In case of a rollback, the original object could then be restored from the backup.
- Remote Transparency: Distributing objects over processes or hosts can either be done explicitly, e.g. by programming communication protocols, or implicitly, by system-provided proxies. The latter achieves remote transparency.
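Externally supplied mutual exclusion, the first item above, can be sketched as a generic wrapper: the programmer writes a plain, lock-free class, and a hypothetical system component serializes all method calls. The wrapper name and API below are invented for the example.

```python
# Transparent concurrency sketch: locking added from the outside.
import threading

class Synchronized:
    """Wraps any object so at most one thread runs a method at a time."""
    def __init__(self, obj):
        self._obj = obj
        self._lock = threading.RLock()
    def __getattr__(self, name):
        attr = getattr(self._obj, name)
        if not callable(attr):
            return attr              # plain data: read through directly
        def locked(*args, **kwargs):
            with self._lock:         # the system, not the programmer, locks
                return attr(*args, **kwargs)
        return locked

class PlainCounter:                  # written without any locking at all
    def __init__(self):
        self.n = 0
    def increment(self):
        self.n += 1

counter = Synchronized(PlainCounter())
threads = [threading.Thread(target=lambda: [counter.increment() for _ in range(1000)])
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.n)  # 4000 — consistent despite four concurrent callers
```

The same wrapping pattern carries over to the other two items: a transaction wrapper snapshots the object instead of locking it, and a remote proxy forwards the call over a channel instead of into the local object.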
Any transparency must be escapable, for the programmer to gain full control. E.g. remote transparency must be dissolvable, allowing the programmer to explicitly detect remote objects and to send messages to and receive messages from these objects.
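Such an escape hatch might look like the following sketch; `is_remote` and `unwrap` are hypothetical names invented here, not an existing API, and the wrapper merely stands in for a system-generated proxy.

```python
# Escapable transparency sketch: detecting and unwrapping remote objects.
class Remote:
    """Marker wrapper standing in for a system-generated remote proxy."""
    def __init__(self, channel):
        self.channel = channel

def is_remote(obj):
    """Hypothetical escape hatch: is this object backed by a channel?"""
    return isinstance(obj, Remote)

def unwrap(obj):
    """Escape the transparency: expose the raw communication channel."""
    if not is_remote(obj):
        raise TypeError("local object has no channel")
    return obj.channel

local_doc = {"title": "draft"}
remote_doc = Remote(channel="tcp://host:4242")

assert not is_remote(local_doc)
assert is_remote(remote_doc)
assert unwrap(remote_doc) == "tcp://host:4242"
```

With the raw channel in hand, the programmer can send and receive messages explicitly, e.g. to batch requests over a high-latency link.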
Continuous consistency may be achieved by transactionality. Basic operations for transactionality are the creation of a transaction and the final commit or rollback. All data in the system needs to be initialized and finalized properly.
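The backup-and-restore approach to transactionality mentioned earlier can be sketched as follows; the Transaction class and the Account example are illustrative inventions, showing begin (snapshot), commit (fall through) and rollback (restore on error).

```python
# Transactionality sketch: snapshot on begin, restore on rollback.
import copy

class Transaction:
    def __init__(self, obj):
        self.obj = obj
        self.backup = None
    def __enter__(self):
        # begin: the runtime snapshots the object's state
        self.backup = copy.deepcopy(self.obj.__dict__)
        return self.obj
    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:     # rollback: restore the snapshot
            self.obj.__dict__.clear()
            self.obj.__dict__.update(self.backup)
        return False                 # commit is simply leaving without error

class Account:
    def __init__(self, balance):
        self.balance = balance

acct = Account(100)
try:
    with Transaction(acct) as a:
        a.balance -= 150
        if a.balance < 0:
            raise ValueError("overdraft")   # triggers the rollback
except ValueError:
    pass
assert acct.balance == 100                  # restored from the backup

with Transaction(acct) as a:
    a.balance -= 50                         # no error: the change commits
assert acct.balance == 50
```

The object's methods contain no transaction logic at all; consistency is imposed from the outside, as the transparency section suggests.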
Replicatable systems provide mechanisms to distribute all or parts of the system. This may be done explicitly by the user (“copy all my data to this laptop”) or automatically (“cache data with high access frequency locally”). Basic operations for replication are move, copy and merge.
All data must be accessible by the user at any time. This may be restricted to the user's own data, to ensure that the system does not become vulnerable. Accessibility may be realized by providing human-readable textual descriptions. Supporting operations for accessibility are parse and un-parse.
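The parse / un-parse pair can be illustrated with Python's standard json module standing in for the two support operations: every value can be turned into a readable textual description and reconstructed from it.

```python
# Accessibility sketch: un-parse to readable text, parse back losslessly.
import json

state = {"open_documents": ["report.odt"], "window": {"x": 10, "y": 20}}
text = json.dumps(state, indent=2)   # un-parse: value -> readable text
assert json.loads(text) == state     # parse: text -> the same value
```

Because the textual form is human-readable, the user can inspect or edit the data directly, without going through an API.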
All data must be comparable, to allow the user or the system to find what the differences are between two particular values of a type. This can be compared to what is provided by classical source code control systems, e.g. CVS or SCCS. Basic operations are join, merge, diff and branch.
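Diffing two values of a type can be illustrated with Python's standard difflib module, much as CVS diffs source files; the values here are simple line lists chosen for the example.

```python
# Comparability sketch: diffing two versions of a value.
import difflib

original = ["line one", "line two", "line three"]
changed  = ["line one", "line 2", "line three", "line four"]

diff = list(difflib.unified_diff(original, changed, lineterm=""))
# the diff records exactly what was removed and what was added
assert "-line two" in diff
assert "+line 2" in diff
assert "+line four" in diff
```

Join and merge then work in the opposite direction, applying such recorded differences to another copy of the value.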
The system must scale along the different dimensions in which it can be extended. E.g. if the system runs on a network of computer systems, it must be able to fully utilize added computers and storage.