The main architectural property we seek is flexibility, the ability to dynamically reconfigure the design at runtime. An ideal system would let us, at runtime, insert a completely new kind of simpleton, delete any simpleton, or replace any simpleton by a completely different one, all without having to change anything else or recompile the system. One big step toward that goal is to make all simpletons as simple as possible.
We therefore distinguish between simpletons that see things (Collectors), simpletons that notice what the attributes of those things are (Examiners), simpletons that compare those attributes against the attributes of things the user already likes or dislikes (Clusterers), simpletons that decide whether the user actually likes or dislikes the new things (Evaluators), and simpletons that figure out how to find new things the user is likely to like (Meta-Analysis).
Filter Advisors, for example, generate probabilistic tests based on the attributes presently attached to the pages in each cluster that distinguish those pages from other clusters of pages. Filters, however, actually apply those tests to incoming pages. Both of those actions are different from identifying the clusters in the first place (Clusterers).
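The split between advising and filtering can be sketched as two small functions. This is only an illustration under stated assumptions: pages are modeled as sets of attribute names, and the "probabilistic test" is replaced by a crude frequency difference (how much more often an attribute occurs inside a cluster than outside it). The names `advise` and `apply_filter` are hypothetical.

```python
from collections import Counter

def advise(clusters):
    """Hypothetical Filter Advisor: weight each attribute by how much more
    often it occurs inside a cluster than in all the other clusters."""
    tests = {}
    for name, pages in clusters.items():
        others = [p for other, ps in clusters.items() if other != name for p in ps]
        inside = Counter(attr for page in pages for attr in page)
        outside = Counter(attr for page in others for attr in page)
        weights = {}
        for attr, seen in inside.items():
            p_in = seen / len(pages)
            p_out = outside[attr] / len(others) if others else 0.0
            weights[attr] = p_in - p_out
        tests[name] = weights
    return tests

def apply_filter(tests, page):
    """Hypothetical Filter: score an incoming page against every cluster's
    test and return the name of the best-matching cluster."""
    scores = {
        name: sum(w for attr, w in weights.items() if attr in page)
        for name, weights in tests.items()
    }
    return max(scores, key=scores.get)

# Clusters are assumed to have been identified already (the Clusterers' job).
clusters = {
    "sports": [{"ball", "score"}, {"ball", "team"}],
    "cooking": [{"recipe", "oven"}, {"recipe", "score"}],
}
tests = advise(clusters)
```

The advisor never touches incoming pages and the filter never looks at clusters, so either one could be replaced without changing the other.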
Breaking computations down this far makes recombining simpletons to produce a new design much easier, which in turn makes the architecture more flexible---and more extensible.
To increase flexibility further, all simpletons appear in families. Instead of one purger, for example, there is a family of purgers, all roaming over the same pool of pages and marking any pages they come across with attributes of various kinds.
Writing one large, complex purger to do the entire purging job is complicated; it's much simpler to write several small, simple ones whose individual decisions combine to determine whether a page should be purged. Further, since purging is now distributed among many simpletons, several programmers can write purgers independently, without the continual communication that would otherwise slow the entire programming task. Finally, it's much easier to add new capabilities to the purger family by adding new (and simple) purgers than by overhauling (and debugging) one giant, complex purger. The same flexibility lessons apply to all the other simpleton families.
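A minimal sketch of a purger family follows. The individual purgers, the page fields (`age_days`, `text`, `source`), and the vote-counting rule in `should_purge` are all assumptions made for illustration; the text does not say how purger decisions are actually combined.

```python
# Each purger is deliberately tiny: it inspects one thing and returns a vote.
def too_old(page):
    return page["age_days"] > 30

def too_short(page):
    return len(page["text"]) < 20

def from_disliked_source(page):
    return page["source"] in {"spam.example"}

def mark_with_family(page, family):
    """Let every purger in the family attach its vote to the page."""
    votes = page.setdefault("purge_votes", {})
    for purger in family:
        votes[purger.__name__] = purger(page)
    return page

def should_purge(page, threshold=2):
    """Assumed combination rule: purge when enough purgers vote yes."""
    return sum(page["purge_votes"].values()) >= threshold

family = [too_old, too_short]
page = {"age_days": 45, "text": "tiny", "source": "spam.example"}
mark_with_family(page, family)
purge_before = should_purge(page)

# Extending the family means appending one small function;
# no existing purger is edited.
family.append(from_disliked_source)
mark_with_family(page, family)
```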
This granular approach to programming also increases robustness enormously. For example, since purger simpletons appear in families, if one purger dies for some reason, purging does not cease, nor need the program as a whole crash.
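One way to picture this failure isolation is a loop that runs each family member independently and records, rather than propagates, any failure. This is a sketch only: the text does not say how simpletons are isolated (separate processes would be another option), and the purger names and page fields here are hypothetical.

```python
def flags_empty(page):
    return page["text"] == ""

def flags_stale(page):
    return page["age_days"] > 30   # raises KeyError if the field is missing

def flags_long_url(page):
    return len(page["url"]) > 200

def run_family(page, family):
    """Run every purger; a purger that dies is noted and skipped, so the
    remaining family members still get to mark the page."""
    marks, failures = {}, []
    for purger in family:
        try:
            marks[purger.__name__] = purger(page)
        except Exception as exc:
            failures.append((purger.__name__, type(exc).__name__))
    return marks, failures

page = {"text": "", "url": "http://example.com/a"}   # no "age_days" field
marks, failures = run_family(page, [flags_empty, flags_stale, flags_long_url])
```

Here `flags_stale` fails on the malformed page, yet the other two purgers still record their marks.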
Loose simpleton coupling allows dynamic runtime reconfiguration of the whole system. For example, it's possible to introduce a simpleton that modifies the work of another simpleton yet still leaves the original simpleton in place and unchanged. Suppose, say, that it became necessary to add an intermediate simpleton between pollsters and filters. The new simpleton simply roams over the pages marked by the "upstream" simpletons (in this case, the pollsters) and puts the marked pages in the pool the "downstream" simpletons read from (here, the filters). There is no need for a complete rewrite or recompilation; only the new simpleton must be compiled. Neither of the two old families of simpletons needs to be modified.
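The following sketch shows one way such an insertion could work. It assumes that the pool each simpleton reads from and writes to is recorded in a wiring table held outside the simpleton code, so that inserting a simpleton changes only that table. The `normalizer` simpleton, the pool names, and `insert_between` are hypothetical.

```python
from collections import defaultdict

def pollster(page):                       # existing upstream simpleton
    page["attributes"]["polled"] = True
    return page

def page_filter(page):                    # existing downstream simpleton
    page["attributes"]["passed_filter"] = len(page["text"]) > 3
    return page

def normalizer(page):                     # the new intermediate simpleton
    page["text"] = page["text"].strip().lower()
    page["attributes"]["normalized"] = True
    return page

def run(wiring, pools):
    """Let each simpleton mark the pages in its input pool and place them
    in its output pool. Pages are shared, not copied."""
    for simpleton, source, target in wiring:
        for page in list(pools[source]):
            pools[target].append(simpleton(page))

def insert_between(wiring, shared_pool, new_simpleton, new_pool):
    """Return a wiring in which new_simpleton reads shared_pool and every
    former reader of shared_pool reads new_pool instead."""
    rewired, placed = [], False
    for simpleton, source, target in wiring:
        if source == shared_pool:
            if not placed:
                rewired.append((new_simpleton, shared_pool, new_pool))
                placed = True
            source = new_pool
        rewired.append((simpleton, source, target))
    return rewired

wiring = [(pollster, "incoming", "polled"), (page_filter, "polled", "filtered")]
wiring = insert_between(wiring, "polled", normalizer, "normalized")

pools = defaultdict(list)
pools["incoming"].append({"text": "  Hello ", "attributes": {}})
run(wiring, pools)
```

The functions `pollster` and `page_filter` are untouched by the insertion; only the table describing which pools they use is updated.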
The fundamental objects are: pages, page clusters, pools, simpletons, and attributes. Pools have pages and clusters. Pages have attributes and appear in pools and clusters. Clusters have attributes and pages and appear in pools and other clusters. All simpletons, pages, clusters, pools, and attributes have identifiers.
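These relationships can be written down as a handful of record types. The sketch below is one possible reading, not a description of the actual implementation: the field names, the `family` field on `Simpleton`, and the identifier scheme in `new_id` are assumptions.

```python
from dataclasses import dataclass, field
from itertools import count

_next_id = count(1)

def new_id(kind):
    """Assumed identifier scheme: a kind prefix plus a running number."""
    return f"{kind}-{next(_next_id)}"

@dataclass
class Attribute:
    name: str
    value: object
    id: str = field(default_factory=lambda: new_id("attribute"))

@dataclass
class Page:
    url: str
    attributes: list = field(default_factory=list)
    id: str = field(default_factory=lambda: new_id("page"))

@dataclass
class Cluster:
    pages: list = field(default_factory=list)
    clusters: list = field(default_factory=list)    # clusters may nest
    attributes: list = field(default_factory=list)
    id: str = field(default_factory=lambda: new_id("cluster"))

@dataclass
class Pool:
    pages: list = field(default_factory=list)
    clusters: list = field(default_factory=list)
    id: str = field(default_factory=lambda: new_id("pool"))

@dataclass
class Simpleton:
    family: str
    id: str = field(default_factory=lambda: new_id("simpleton"))

page = Page("http://example.com/")
page.attributes.append(Attribute("language", "en"))
inner = Cluster(pages=[page])
outer = Cluster(clusters=[inner])
pool = Pool(pages=[page], clusters=[outer])
purger = Simpleton("purger")
```

The same `Page` object sits in both the pool and the inner cluster, which is how a page can appear in several containers at once.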
The fundamental operation is one of simpletons marking pages or clusters by attaching attributes to them. Each simpleton reads from an input pool, marks the pages it finds there, and writes them to an output pool. Each page appears in a variety of pools, each holding pages marked by some simpleton. Thus, the pages flow down the data stream, becoming more and more enriched the further they flow.
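The read, mark, write cycle can be sketched in a few lines. Pools are modeled here as plain lists and pages as dictionaries, and the two marking rules are invented for illustration; only the overall shape follows the description above.

```python
def run_simpleton(name, mark, input_pool, output_pool):
    """Read each page from the input pool, attach one attribute to it,
    and place the same page object in the output pool."""
    for page in input_pool:
        page["attributes"][name] = mark(page)
        output_pool.append(page)

collected, examined, evaluated = [], [], []
page = {"text": "Simple agents cooperate", "attributes": {}}
collected.append(page)

# Two hypothetical simpletons, chained through a shared pool.
run_simpleton("word_count", lambda p: len(p["text"].split()), collected, examined)
run_simpleton("is_short", lambda p: p["attributes"]["word_count"] < 10,
              examined, evaluated)
```

After both simpletons have run, the page carries two attributes and is present in all three pools, so a later simpleton can build on the marks left by an earlier one.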