KnownSpace Symphony

What's wrong with the desktop that a million hobbyists couldn't solve?

The ITS system is not the result of a human wave or crash effort. The system has been incrementally developed almost continuously since its inception. It is indeed true that large systems are never finished.... In general, the ITS system can be said to have been designer implemented and user designed. The problem of unrealistic software design is greatly diminished when the designer is the implementor. The implementor's ease in programming and pride in the result is increased when he, in an essential sense, is the designer. Features are less likely to turn out to be of low utility if users are their designers and they are less likely to be difficult to use if their designers are their users.
Donald Eastlake, 1972, quoted in Hackers, page 127

Task

To develop KnownSpace Symphony, a multi-user, persistent KnownSpace.

Summary

KnownSpace Symphony is intended to connect remote Java developers through a distributed Java development environment, making possible transparent network establishment, peer and resource discovery, peer authentication, code development, code sharing, and direct component integration into running sessions. By reducing code development friction (geographic separation, barriers to resource discovery, authentication issues, data and code non-persistence, limitations on code sharing, and indirect code integration because of separate checkout, ftp, compile, and link steps) it is intended to speed up any Java development project. Think of it as a cross between AIM, Napster, cvs, and a Java IDE, all on steroids---and targeted solely at Java developers.

KnownSpace Introduction

KnownSpace is a personal data manager that eventually could be a common platform for worldwide research and collaboration in artificial intelligence, networking, and user interfaces. There are numerous details of the project's overall code structure, so here's a brief summary:

Entity - a data holder. Any number of entities can be linked to any entity as attributes of that entity. Entities can also be clusters of entities, generalizing the notion of "folder" or "directory" into category, topic, or concept. Any entity can belong to any number of clusters. Entities have names but those are not unique; their identifiers, however, are. Entities and their linkages store data relationships. The set of entities is arbitrarily extensible.
Entity Value - a piece of data (a webpage, an email message, a file, an image, a person, a corporation, a university, a website, an ip address, a single line of text in an instant message session, an entire instant message session, whatever). The set of entity values is arbitrarily extensible.
Entity Pool - an entity server.
Event - what it says. The set of events is arbitrarily extensible.
Event Pool - an event server; it acts as a distributed and anonymous event channel, as in the Event Notifier design pattern.
Constraint - a predicate or filter or search criterion describing and serving specific events (Event Constraint) or entities (Entity Constraint). The set of constraints is arbitrarily extensible.
Simpleton - a piece of code. Simpletons only communicate with each other through events; there is no static method-binding between simpletons. The set of simpletons is arbitrarily extensible.
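
The entity and constraint definitions above can be made concrete with a small sketch. This is only an illustration under assumptions: the class name EntitySketch, its fields, and the select helper are hypothetical and are not taken from the KnownSpace code base. It shows non-unique names, unique identifiers, attribute links, cluster membership, and a constraint treated as a plain predicate.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.function.Predicate;

// Hypothetical sketch of the entity model; not the actual KnownSpace API.
final class EntitySketch {
    final String id = UUID.randomUUID().toString(); // unique identifier
    final String name;                              // names need not be unique
    final Object value;                             // the entity value (any data), may be null
    final Set<EntitySketch> attributes = new LinkedHashSet<>(); // entities linked as attributes
    final Set<EntitySketch> members = new LinkedHashSet<>();    // non-empty when used as a cluster

    EntitySketch(String name, Object value) {
        this.name = name;
        this.value = value;
    }

    // An entity constraint is modelled here as a predicate over entities.
    static List<EntitySketch> select(Iterable<EntitySketch> pool,
                                     Predicate<EntitySketch> constraint) {
        List<EntitySketch> matches = new ArrayList<>();
        for (EntitySketch entity : pool) {
            if (constraint.test(entity)) {
                matches.add(entity);
            }
        }
        return matches;
    }
}
```

Because membership is stored on the cluster rather than on the member, the same entity can be added to the members set of any number of clusters.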

KnownSpace Hydrogen - the original alpha release of the data manager.
Cerulean - one of several pluggable Hydrogen user interfaces (one other interesting one is EverSphere, so far though it's just vapor).
KnownSpace Spectra - an interface builder built on top of Sun's BeanBox to make entire desktops more easily buildable and sharable by normal (non-programmer) users, just as web pages are. Now superseded (in the design phase anyway) by Fluency.

KnownSpace Symphony needs

Java developers need to discover each other on the net, they need to authenticate themselves to each other, and they need to form dynamic developer teams:

discovery and membership [myjxta]

peers can announce themselves to be findable by other peers
peers can find other peers regardless of firewalls and dynamic ips
peers can join a group and establish its membership requirements controlling the joining of other peers
peers can agree on what each peer in the peergroup can do
peers can find out something about what other peergroup members are doing or at least whether they are even connected presently

Developers need to exchange data and programs, do shared designs across the network in realtime, and chat with each other:

transport [jxta or jetty & xml]

peers can chat with other peers [visual chat], use a shared whiteboard, and talk with voice over ip
peers can exchange entities, events, and constraints
peers can exchange simpletons and integrate them into running sessions

Developers need to be able to make some parts of their environments group-accessible while keeping other parts private:

sharing [permissions & entityproxies]

peers can see (some portion of) other peergroup members' data spaces as if they were sharing a lan, without actually being forced to fetch the entityvalues associated with those entities unless they need them
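
One way to get this "see it without fetching it" behaviour is a proxy that carries only an entity's identifier and name and defers loading the value until someone asks for it. The following is a minimal sketch under assumptions: EntityProxy and its members are hypothetical names, and the Supplier stands in for whatever remote call would actually fetch the value from the owning peer.

```java
import java.util.function.Supplier;

// Hypothetical proxy: metadata is held locally, the entity value is fetched on demand.
final class EntityProxy<V> {
    final String id;
    final String name;
    private final Supplier<V> fetcher; // stands in for a remote fetch from the owning peer
    private V cached;
    private boolean fetched;

    EntityProxy(String id, String name, Supplier<V> fetcher) {
        this.id = id;
        this.name = name;
        this.fetcher = fetcher;
    }

    // Fetches the value on first use and reuses it afterwards.
    synchronized V value() {
        if (!fetched) {
            cached = fetcher.get();
            fetched = true;
        }
        return cached;
    }

    synchronized boolean isFetched() {
        return fetched;
    }
}
```

Browsing a remote peer's data space then costs only the identifiers and names; the potentially large entity values cross the network once, and only for the entities actually opened.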

permissions

permissions specify capabilities of the peer, any peergroup member, and other peers
pools can control the operations they support, which peers can execute which operations on them, how many operations per unit time they allow, and when those operations are allowed, if ever
sessions can control operations allowed in the session

Developers need to have their data and programs and communications (and session state) be persistent:

persistence [xspaces or mysql & xml]

entities are persistent
events may be persistent
communications are persistent
sessions are persistible (i.e., not just the entities, but all the currently executing simpletons, currently existing events, and currently used constraints) so that they can be saved in cvs

versioning

simpletons are versionable even when they are being jointly edited by multiple peers

Developers need to be able to modify their interface on the fly to make it better suit improved development:

interface [contextbox]

designers need a flexible interface to snap together shared interface components as peers in the group develop them

pools [currently multiple pools are contested]

sessions support multiple pools for email, news, ftp sites, websites, search engines, remote databases, local and remote file systems, remote sessions, etc, yet have all entity pools hidden behind one entity pool and all event pools hidden behind one event pool

sessions [postponable features]

sessions are hot-restartable
sessions can be checked out from cvs into another running session (seems hard)
sessions have transaction support (logs, undo, redo) and an expandable operation set

struts

reliable and fast persistent sessions
inter-user event, entity, constraint, and simpleton transport
seamless multiple local and remote pools
session-level and pool-level permissions

vision

let's say we built the four struts above. what could we then do? suppose that bryan is running a symphony session and gordon is also running a symphony session. bryan's just finished developing some cerulean simpleton that has a visual frontend with some controls he can click on to generate events that can then be noticed by other simpleton instances of the same class. bryan notices gordon, registers with him, and sends him a message wrapped in an entity: "hey gordon, i just built this neat thing. wanna see it?". gordon accepts and bryan's session ships him an xml descriptor file for the new simpleton, including source code. gordon browses the code and clicks "integrate". his session uploads the code, compiles it, then creates an instance of the simpleton inside his running session. gordon sees the simpleton's visual frontend appear on his screen. bryan then clicks on one of the controls in the visual frontend of his instance of the simpleton running on his machine and that event is transported to gordon's session, where the simpleton instance running there can notice it. the event generated in bryan's session by bryan's actions triggers some visual change in the simpleton's display on gordon's machine, which gordon notes. meanwhile gordon can also play with the widget as it's running in his session. perhaps there's something else to click in the widget and thus perhaps bryan's simpleton instance can in turn notice gordon's remote actions. depending on how the simpleton is written, they could each control a little piece of the other's session.

when gordon closes his session and restarts it, it is in exactly the same state as before---including the new simpleton, which he can of course modify or he can build another simpleton that works with the first one---using only events for communication between them---and then transport it to bryan. he can also save his entire session to a cvs server, for checkout by anyone, or he might check in only the public parts (the simpletons, mainly) and keep a private copy for himself of the entire session (containing his private data---for example, his mail and so on). the two could work together to rebuild all of cerulean in maybe a few months of evening hobby work, even though one is living in austin, the other is living in houston, and both are working fulltime. anyone else willing to run their code can get the entire thing just as easily and have it integrated into their running sessions. with symphony, gordon and bryan could rebuild cerulean a tiny piece at a time in realtime armed only with parts of the above four struts: multiple pools, pool persistence, pool-level and session-level permissions, dynamic registration, and dynamic event, simpleton, and entity transport between them, with no centralized cvs server or separate compile step or loss of data or state on session close.

the next day, gregory wakes up, fires up his symphony session, which he has set to accept all new simpletons from either gordon or bryan because he's a trusting soul, and automagically has a completely new system to play with. he is very happy.

basically, this vision is that of KnownSpace Spectra, but intended to support knownspace developers rather than non-programmer users. it makes the development environment very malleable and so should allow much more rapid development. it is, of course, entirely insecure.

building the struts

jack and matt have already made a start on the first strut: persistent sessions. that leaves just three: multiple pools, inter-user transport, and permissions. supporting those three leads to many smaller problems, but all seem solvable.

design problems

registration: how to handle user registration so that multiple users can become aware of each other and join with each other in teams?
communication: how to handle inter-user communication to support team tasks?
permissions: how to handle permissions so that multiple users can team?
multiple pools: how to handle multiple sources of entities: mail, news, ftp sites, websites, search engines, remote databases, local and remote file systems, remote knownspace sessions, whatever?
snapshotting: how to handle session snapshotting to allow hot restarts? (if this proves to be hard, we could postpone it; it's not needed for the core multi-user functionality.)

design proposals

communications

factor out events from the default pool and make a separate EventPool and an EntityPool. each will be a server, the first of events, the second of entities. every simpleton will have access to both default pools. the two default pools will be automatically persisted; however, events will have a time-to-live (ttl) rather than be persisted forever. eventually, even entities might have a ttl. further, although the default EntityPool may proxy, lock, and cache entities, the default EventPool does not. proxying events isn't needed because they are small objects and so can be loaded into memory at will. locking events isn't needed because they are only needed read-only, so multiple copies can be cloned as needed. and they don't need to be cached for the same reason.
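
a rough sketch of what such an event pool might look like follows. it is only an illustration: TimedEvent, EventPoolSketch, and their methods are hypothetical names, constraints are modelled as plain predicates, and the current time is passed in explicitly so the ttl behaviour is easy to follow. events are immutable, so they can be handed to any number of listeners without locking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical immutable event with an absolute expiry time derived from its ttl.
final class TimedEvent {
    final String type;
    final String payload;
    final long expiresAtMillis;

    TimedEvent(String type, String payload, long createdAtMillis, long ttlMillis) {
        this.type = type;
        this.payload = payload;
        this.expiresAtMillis = createdAtMillis + ttlMillis;
    }

    boolean isExpired(long nowMillis) {
        return nowMillis >= expiresAtMillis;
    }
}

// Hypothetical event pool: an anonymous channel between publishers and listeners.
final class EventPoolSketch {
    private static final class Subscription {
        final Predicate<TimedEvent> constraint;
        final Consumer<TimedEvent> listener;

        Subscription(Predicate<TimedEvent> constraint, Consumer<TimedEvent> listener) {
            this.constraint = constraint;
            this.listener = listener;
        }
    }

    private final List<Subscription> subscriptions = new ArrayList<>();
    private final List<TimedEvent> retained = new ArrayList<>(); // candidates for persistence

    // Listeners name the events they want with a constraint, never a publisher.
    void subscribe(Predicate<TimedEvent> constraint, Consumer<TimedEvent> listener) {
        subscriptions.add(new Subscription(constraint, listener));
    }

    // Retains a live event and delivers it to every listener whose constraint matches.
    void publish(TimedEvent event, long nowMillis) {
        if (event.isExpired(nowMillis)) {
            return;
        }
        retained.add(event);
        for (Subscription s : subscriptions) {
            if (s.constraint.test(event)) {
                s.listener.accept(event);
            }
        }
    }

    // Drops events whose ttl has run out and returns how many were removed.
    int purgeExpired(long nowMillis) {
        int before = retained.size();
        retained.removeIf(e -> e.isExpired(nowMillis));
        return before - retained.size();
    }

    int size() {
        return retained.size();
    }
}
```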

allow simpletons to be transported and reified in a new session by creating an xml descriptor file with a link to the source code file. transporting the simpleton is then just a matter of transporting the xml descriptor along with the simpleton's source file. on entry into a new session the user is notified of its arrival and is given the choice of browsing the simpleton's code and compiling and integrating the simpleton into his running session. if asked to create an instance of the simpleton, the session executes the java compiler contained in the jdk's tools.jar and compiles the simpleton, then creates an instance of it inside the running session.
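
the compile-and-integrate step can be sketched with the standard javax.tools compiler api, which on current jdks plays the role that calling into tools.jar played. this is an assumption-laden illustration, not the project's implementation: SimpletonIntegrator and compileAndInstantiate are hypothetical names, the simpleton is assumed to be a public class in the default package with a public no-argument constructor, and nothing here sandboxes the received code.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper: compiles received source and instantiates it in the running JVM.
final class SimpletonIntegrator {
    static Object compileAndInstantiate(String className, String source) {
        // The system compiler is only present when running on a JDK.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no java compiler available; run on a jdk");
        }
        try {
            // Write the transported source into a scratch directory.
            Path dir = Files.createTempDirectory("simpleton");
            Path file = dir.resolve(className + ".java");
            Files.write(file, source.getBytes(StandardCharsets.UTF_8));

            // Compile it; class files land next to the source. 0 means success.
            int status = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
            if (status != 0) {
                throw new IllegalArgumentException("compilation failed for " + className);
            }

            // Load the fresh class; the loader is left open because the instance needs it.
            URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() });
            return Class.forName(className, true, loader)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("could not integrate " + className, e);
        }
    }
}
```

a real session would hand the new instance to the event pool rather than return a bare Object, and would want to ask the user first, since this runs arbitrary foreign code with the session's full privileges.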

allow entity and event transport by converting them to an xml representation. or perhaps just use raw rmi? conversion to xml would be easy and efficient for events, but harder for entities because of the variability of entity values....hmmm...rmi might be the better bet here. we could also transport constraints, exceptions, and anything else in a knownspace session. transporting constraints, for example, would be nice because then a pool can ask its "upstream" pools not to send it events unless they match a particular constraint. and of course, if we can transport anything in a session, we can also store the entire session! if we can store an entire session, we could make that part of regular session shutdown.
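
to give a feel for the xml option, here is a minimal round trip for an event using only the jdk's built-in dom and transformer classes. the element and attribute names (event, type, source, ttl) and the WireEvent class are made up for this sketch; no wire format has been decided.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical wire form of an event: one element with three attributes.
final class WireEvent {
    final String type;
    final String source;
    final long ttlMillis;

    WireEvent(String type, String source, long ttlMillis) {
        this.type = type;
        this.source = source;
        this.ttlMillis = ttlMillis;
    }

    // Serializes the event; the DOM layer takes care of escaping attribute values.
    String toXml() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element event = doc.createElement("event");
            event.setAttribute("type", type);
            event.setAttribute("source", source);
            event.setAttribute("ttl", Long.toString(ttlMillis));
            doc.appendChild(event);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("could not serialize event", e);
        }
    }

    // Parses an event back out of its xml form.
    static WireEvent fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element event = doc.getDocumentElement();
            return new WireEvent(event.getAttribute("type"),
                                 event.getAttribute("source"),
                                 Long.parseLong(event.getAttribute("ttl")));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse event xml", e);
        }
    }
}
```

events fit this shape well because they are small and flat. entity values can be arbitrary objects, which is exactly where a fixed xml schema starts to hurt.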

snapshotting

allow events to be persisted, so that system state can be stored. on session restart the session could be in exactly the same state it was last left in---including running simpletons. that gives hot restarts. persisted events would have a ttl and the hot restart support initially need not be at the level of specific instructions being executed by each simpleton at the last session close---session close should put each simpleton into a state that can be restarted from the start of that simpleton.

registration

allow users to register with each other's running sessions. for example, say that bryan's session registers as a running session (or is running off some standard knownspace port on his machine). gordon's session becomes aware of bryan's session on its startup (either by pinging the "knownspace port" of bryan's machine or by querying some centralized remote registration server---for example, just a simple servlet running on www.knownspace.org). gordon's session then presents gordon with the option of registering itself with bryan's session. if he agrees, that connects their default EventPool and their default EntityPool by creating two new local pools mirroring the other's remote pools in each session and placing each beneath (that is, as servers to) their respective default pools. consequently both entities and events can pass from one to the other session in either direction, depending on the permissions that both have set up for their respective pools.

allow pools to register() with each other so that the local default pool (either for events or for entities) can see all other local pools of that same type, as well as any registered remote pools of that same type (either events or entities). the default pool can then transparently serve all the data available from any of the local or remote pools to the local application layer simpletons.
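
the registration idea can be sketched as a default pool that simply asks each registered pool in turn. all names here (EntitySource, MapBackedPool, DefaultEntityPool) are hypothetical, entity values are reduced to strings, and a registered remote pool is assumed to look just like a local one to the caller.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical view of anything that can serve entities by identifier.
interface EntitySource {
    Optional<String> lookup(String entityId);
}

// A trivial in-memory pool, standing in for mail, news, a website, a remote session, etc.
final class MapBackedPool implements EntitySource {
    private final Map<String, String> entities = new HashMap<>();

    MapBackedPool put(String entityId, String value) {
        entities.put(entityId, value);
        return this;
    }

    @Override
    public Optional<String> lookup(String entityId) {
        return Optional.ofNullable(entities.get(entityId));
    }
}

// The default pool: the only pool that application-layer simpletons ever see.
final class DefaultEntityPool implements EntitySource {
    private final List<EntitySource> registered = new ArrayList<>();

    void register(EntitySource pool) {
        registered.add(pool);
    }

    // Asks each registered pool in registration order and returns the first hit.
    @Override
    public Optional<String> lookup(String entityId) {
        for (EntitySource pool : registered) {
            Optional<String> hit = pool.lookup(entityId);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }
}
```

since the default pool implements the same interface as the pools it fronts, one session's default pool could itself be registered beneath another session's default pool, which is roughly what mirroring a remote session amounts to.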

permissions

allow multiple pools with one (default) pool as the central server to the application layer simpletons and give each pool its own permissions. simpletons without the right permissions (perhaps the right private keys in a public key system?) can't even see various entity or event servers. all any application simpleton can ever see is the two local default pools, one for each type (entity and event).

each pool must have its own permissions, not just to control who can see what, but also to control who can do what. for example: a pool that represents a remote website likely won't allow entity writes, even though it allows entity reads and entity attachment (this last of course can only happen in the local application layer/local default pool if the remote pool doesn't allow writes; in fact, the remote pool may not even know about entities and their identifiers, etc.; all it may know is its webpages, say). similarly, a user's knownspace session may accept new events but perhaps it won't accept new entities. or perhaps it accepts new entities but won't allow local entities to be modified.
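
a small sketch of pool-level permissions along these lines: each pool declares which operations it supports at all, and separately records which peers have been granted which operations. PoolOperation and PoolPermissions are hypothetical names, and peers are identified by plain strings here rather than by keys.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical operation set; the real set is meant to be extensible.
enum PoolOperation { READ, WRITE, ATTACH, SEARCH }

final class PoolPermissions {
    private final Set<PoolOperation> supported = EnumSet.noneOf(PoolOperation.class);
    private final Map<String, Set<PoolOperation>> grants = new HashMap<>();

    PoolPermissions(Set<PoolOperation> supportedOperations) {
        supported.addAll(supportedOperations);
    }

    void grant(String peerId, PoolOperation operation) {
        grants.computeIfAbsent(peerId, k -> EnumSet.noneOf(PoolOperation.class)).add(operation);
    }

    // Allowed only if the pool supports the operation and the peer was granted it.
    boolean allows(String peerId, PoolOperation operation) {
        Set<PoolOperation> granted = grants.getOrDefault(peerId, Collections.emptySet());
        return supported.contains(operation) && granted.contains(operation);
    }
}
```

a pool fronting a remote website would be built with only READ and ATTACH supported, so a WRITE grant to any peer is harmless: the pool still refuses it.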

each session also needs its own separate set of permissions, to control, for example, searches, because even though they make no changes in the entities and they may be looking at public entities only (that is, entities in public entity pools) each search still takes computational effort. for example: i may not want other knownspace sessions to search as many times as they want in my local pool. even if they have permission to search, each search will take time on my machine---it's a remote command executing on my machine---and i may want to limit the number of foreign searches to, say, 5 an hour.
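
the "5 an hour" limit could be enforced with a per-peer sliding window, sketched below. SearchQuota and tryAcquire are hypothetical names, and the clock is passed in as a parameter so the behaviour is easy to check.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical session-level quota: at most maxPerWindow searches per peer per window.
final class SearchQuota {
    private final int maxPerWindow;
    private final long windowMillis;
    private final Map<String, Deque<Long>> recent = new HashMap<>();

    SearchQuota(int maxPerWindow, long windowMillis) {
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
    }

    // Returns true and records the search if the peer is still under its quota.
    synchronized boolean tryAcquire(String peerId, long nowMillis) {
        Deque<Long> times = recent.computeIfAbsent(peerId, k -> new ArrayDeque<>());
        // Forget searches that have slid out of the window.
        while (!times.isEmpty() && nowMillis - times.peekFirst() >= windowMillis) {
            times.removeFirst();
        }
        if (times.size() >= maxPerWindow) {
            return false;
        }
        times.addLast(nowMillis);
        return true;
    }
}
```

a session would call something like new SearchQuota(5, 60 * 60 * 1000L) once and check tryAcquire before running each foreign search, refusing the search when it returns false.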