
Introduction

Trying to find information on the World Wide Web is like trying to find something at a jumble sale: it's fun, and you can make serendipitous discoveries, but for directed search it's better to go to a department store, where someone has already done most of the arranging for you. Unfortunately, the web's continuing explosion in size, its enormous diversity of topics, and its great volatility make unaided human indexing impossible.

The Library of Congress has to deal with a mere 100 million items and a change rate of 31,000 items a day. The Excite web search engine indexes over 50 million webpages. The number of internet hosts alone is doubling every year and should reach 100 million by January 1st, 2000. The number of webpages is doubling even faster. If the web keeps doubling every six months or so, then by January 1st, 2000, there could be 6.4 billion webpages. And unlike the fairly static Library of Congress collection, many of those webpages are constantly changing--and constantly moving.
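
The 6.4 billion figure corresponds to seven doublings of the 50 million webpages cited above, that is, roughly three and a half years of growth at one doubling every six months. A minimal sketch of that arithmetic follows; the function name and the 42-month horizon are assumptions made for illustration, not figures stated in the text.

```python
def projected_pages(baseline, doubling_months, elapsed_months):
    """Project a count that doubles once every `doubling_months` months."""
    return baseline * 2 ** (elapsed_months / doubling_months)


# Assumptions: 50 million pages as the baseline count, and 42 months
# (seven six-month doublings) between that count and January 1st, 2000.
estimate = projected_pages(50e6, 6, 42)
print(f"{estimate:,.0f}")  # 6,400,000,000
```

The projection is sensitive to when the baseline count was taken: starting from the same 50 million only 24 months before the target date gives four doublings, or 800 million webpages.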

The web has brought us a world of electronic information glut, and in such a world, information about information is more valuable than information. The widespread popularity of web search engines proves that. However, as the volume of electronic information continues to leapfrog ahead, the ability of even the most powerful search engines to keep us aware of what's where will continue to fall behind. All too often, web searches produce either no documents or thousands. What the web needs is the same mechanism we use to manage search in the real world: locality.

There needs to be a Shops Neighborhood, in which there is a Food Neighborhood, in which there is a Chinese Food Neighborhood. Of course, the Chinese Food Neighborhood should link to the Chinese Cooking Utensils Neighborhood, which links to the Chinese Culture Neighborhood, which links to the Chinese History Neighborhood, which links to the Ming Dynasty History Neighborhood, which links to the Antiques Neighborhood, and so on forever. However, the main links out of the Chinese Food Neighborhood should be to other food neighborhoods, say the Italian Food Neighborhood, the French Food Neighborhood, and so on, because, as is evident from its name, food is the chosen way of organizing this particular neighborhood.

The point is that if food is your dominant interest at present then you don't want to get sidetracked into the intricacies of Antiques. Such information is, at present, only cluttering up your mental workspace and should be hidden from you. Further, if you find some appropriate sites but then decide that Chinese food isn't what you really want after all, you are probably still interested in food. So sites featuring Italian food should be near at hand.

The Chinese Food Neighborhood should be "closer" to the Italian Food Neighborhood than it is to the Antiques Neighborhood. A notion of nearness and farness imposes locality, and that in turn imposes neighborhoods. Users interested in a particular topic, service, or artifact should be assured that one big jump to a specific virtual location, followed by a number of small hops "around" that location, is sufficient to find it or its nearest relatives on the web. Today this is not the case.
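
One simple way to make nearness concrete is to count the fewest links needed to get from one neighborhood to another. The sketch below is only an illustration under that assumption: the text does not commit to any particular distance measure, and the `LINKS` table and `hops` function are hypothetical names built from the examples above.

```python
from collections import deque

# Hypothetical neighborhood links, taken from the examples in the text.
# The main links out of Chinese Food go to other food neighborhoods.
LINKS = {
    "Chinese Food": ["Italian Food", "French Food", "Chinese Cooking Utensils"],
    "Chinese Cooking Utensils": ["Chinese Culture"],
    "Chinese Culture": ["Chinese History"],
    "Chinese History": ["Ming Dynasty History"],
    "Ming Dynasty History": ["Antiques"],
}


def hops(start, goal, links=LINKS):
    """Return the fewest links from start to goal, or None if unreachable."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


print(hops("Chinese Food", "Italian Food"))  # 1
print(hops("Chinese Food", "Antiques"))      # 5
```

Under this measure the Italian Food Neighborhood is one hop from the Chinese Food Neighborhood while the Antiques Neighborhood is five hops away, which matches the intended ordering.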

Given the web's continuing explosion, it's infeasible to hire enough people to do all of this mapping for us. Even if we could, the web changes far too fast for anything more than a cursory stab at mapping. So a human-only solution to mapping is unlikely to work (Yahoo, for example, is one attempt at such a solution). On the other hand, this mapping needs human supervision, since choosing the right linkages is often a semantic matter and not simple to deduce automatically. Finally, it is unlikely that we can create global linkage maps of all webpages with sufficient discrimination to satisfy all users. If the resulting ordering must seem natural to everyone, deciding where each topic should most logically go raises serious difficulties. We have too many different agendas and tastes, and they change too often. In sum, what we need is a single-user-based yet semi-automated mapping utility. This proposal sketches a possible solution to that problem.


Gregory J. E. Rawlins
1/13/1998