Joi Ito points out that Wikipedia just passed one million articles: “Wikipedia is in more than 100 languages with 14 currently having over 10,000 articles… At the current rate of growth, Wikipedia will double in size again by next spring.” Wikipedia itself demonstrates the power of a massive, decentralized content-authoring effort.

Ross Mayfield adds, “To put this in perspective, if each article took 1 person week to produce, getting the next million would take 40,000 full-time equivalent resources to get it done in the same amount of predicted time. Co-incidentally Wikipedia has about the same amount of registered users, but they have day jobs too.”
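A quick back-of-envelope check of that figure (the half-year horizon below is my own assumption, read off Joi’s “by next spring”):

```python
# Back-of-envelope check of Ross's 40,000-FTE figure, assuming
# "by next spring" means roughly half a year, i.e. about 26 weeks.
articles, person_weeks_each, weeks_available = 1_000_000, 1, 26
print(articles * person_weeks_each / weeks_available)  # about 38,500, on the order of 40,000
```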

Even more impressive, “Wikipedia is a volunteer effort supported by the non-profit Wikimedia Foundation.” When I clicked on their fundraising effort, I discovered that they’re looking to raise fifty thousand dollars — a tiny amount by corporate standards. It speaks to the fact that a massive, decentralized effort need not cost a tremendous amount to have a huge impact.

It’s taken a few weeks for this to sink in: John Battelle’s post on Sell Side Advertising (inspired by Ross Mayfield’s post on Cost Per Influence).

Ross wrote: “An important facet of this format is the amount of user choice. Users decide what feeds to subscribe to and ads to block. Bloggers should be able to choose what ads to both host and pass through.”

John’s addition:

Instead of advertisers buying either PPC networks or specific publishers/sites, they simply release their ads to the net, perhaps on specified servers where they can easily be found, or on their own sites, and/or through seed buys on one or two exemplar sites. These ads are tagged with information supplied by the advertiser, for example, who they are attempting to reach, what kind of environments they want to be in (and environments they expressly forbid, like porn sites or affiliate sites), and how much money they are willing to spend on the ad.

Once the ads are let loose, here’s the cool catch – ANYONE who sees those ads can cut and paste them, just like a link, into their own sites (providing their sites conform to the guidelines the ad explicates in its tags). The ads track their own progress, and through feeds they “talk” to their “owner” – the advertiser (or their agent/agency). These feeds report back on who has pasted the ad into what sites, how many clicks that publisher has delivered, and how much juice is left in the ad’s bank account. The ad propagates until it runs out of money, then it… disappears! If the ad is working, the advertiser can fill up the tank with more money and let it ride.

This concept of decentralizing ads (instead of “classified ads”, they’re “declassifieds”) empowers multiple agencies — not just advertisers and ad networks but publishers, too — to determine which ads propagate.

Taking this a step further to create a truly decentralized advertising network means asking who is empowered to determine what goes in the square inch of real estate that web browsers and feed readers use to display ads. Not just advertisers, ad networks, and publishers, but the software writers and the actual people reading the web and feeds as well.
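To make the mechanics concrete, here is a minimal sketch of what one of these self-describing, self-funding ads might look like as a data structure. Every name and field below is a hypothetical reading of the tags John describes (targeting, forbidden environments, a click-funded budget), not any real ad network’s format.

```python
from dataclasses import dataclass, field

@dataclass
class SellSideAd:
    # Hypothetical "sell side" ad: the advertiser tags it with targeting
    # rules and a budget, anyone may republish it, and it reports its own
    # spend back until the money runs out.
    advertiser: str
    creative_url: str
    wants: set            # environments the advertiser is trying to reach
    forbids: set          # environments expressly forbidden (e.g. porn, affiliate sites)
    cost_per_click: float
    budget: float
    click_log: list = field(default_factory=list)

    def may_run_on(self, site_tags: set) -> bool:
        """A publisher's site qualifies if it matches the ad's wants
        and touches none of its forbidden environments."""
        return bool(site_tags & self.wants) and not (site_tags & self.forbids)

    def record_click(self, publisher: str) -> None:
        """Debit the ad's 'bank account' and log who delivered the click,
        so the feed back to the advertiser can report per-publisher totals."""
        if self.budget >= self.cost_per_click:
            self.budget -= self.cost_per_click
            self.click_log.append(publisher)

    @property
    def alive(self) -> bool:
        # When the tank is empty, the ad stops propagating.
        return self.budget >= self.cost_per_click


ad = SellSideAd("ExampleCo", "http://example.com/ad.html",
                wants={"gadgets", "weblogs"}, forbids={"porn", "affiliate"},
                cost_per_click=0.25, budget=1.00)
if ad.may_run_on({"weblogs", "python"}):
    ad.record_click("some-weblog.example")
print(ad.budget, ad.alive)  # 0.75 True
```

The point of the sketch is that the publisher-side check (may_run_on) and the advertiser-side accounting (record_click, alive) are both purely local decisions; no central ad network has to sit in between.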

Jeremy Zawodny talks about the inevitability of search results as RSS that can be subscribed to, quoting Tim Bray:

They’ve also done something way cool with their Google appliance; one of the bright geeks there has set up a thing where you can subscribe to a search and get an RSS feed. Well, duh. Anyone could fix up one of those using the Google API, I wonder why Google isn’t supporting this already?
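Mechanically there is not much to it: a saved search becomes something you can subscribe to once its results are serialized as RSS 2.0. Here is a minimal sketch using only the standard library; the results structure and the example.org URLs are placeholders for whatever search backend you actually have (this is not a Google API call).

```python
from xml.etree import ElementTree as ET

def search_to_rss(query, results):
    """Serialize search results as a minimal RSS 2.0 feed.
    `results` is assumed to be a list of dicts with 'title', 'link',
    and 'snippet' keys, produced by whatever search backend you use."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Search: {query}"
    ET.SubElement(channel, "link").text = "http://example.org/search?q=" + query  # placeholder URL
    ET.SubElement(channel, "description").text = f"Results for '{query}', as a subscribable feed"
    for r in results:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = r["title"]
        ET.SubElement(item, "link").text = r["link"]
        ET.SubElement(item, "description").text = r["snippet"]
    return ET.tostring(rss, encoding="unicode")

# A reader subscribes to this feed's URL and sees new matches as they appear.
print(search_to_rss("decentralization", [
    {"title": "Coral CDN", "link": "http://example.org/coral", "snippet": "peer-to-peer web caching"},
]))
```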

This in turn reminds us of Jeff Barr’s real-time headline view (more thoughts), which he talked about this weekend at Foo Camp, also attended by Sam Ruby, who talked about FeedMesh, a working group to establish a “peering network” for decentralized web(site|log) update notifications and content distribution. This is the start of something potentially wonderful…

I came across this while preparing to visit Prof. Plott at my alma mater next week… here are some snippets from one of his latest papers; he’s been interested in information aggregation of late.

DECENTRALIZED MANAGEMENT OF COMMON PROPERTY RESOURCES:

For several centuries, villages in the Italian Alps employed a special system for managing the common properties. The experiments and analysis of this paper are motivated by an attempt to understand why that particular system might have been successful in comparisons with other systems that have similar institutional features. The heart of the system was a special monitoring device that allowed individual users to inspect other users at their own cost and impose a predetermined sanction (a fine) when a free rider was discovered. The fine was paid to the user who found a violator. In addition to the replication of the results of others, the paper finds three classes of results. First, in comparison with a classical model of identical, selfish agents, the data can best be captured by a model with heterogeneous and other-regarding preferences where altruism and especially spite play an important role. Second, the model with heterogeneous agents suggests that the success of the institution is related to its ability to turn these individual differences to socially useful purposes. Third, the model also explains important paradoxes that can be found in the existing literature.

The success of the Carte di Regola system appears to be related to its ability to use the
heterogeneity of preferences to socially advantageous ends. The system also appears to have
a type of robustness against institutional and parameter changes. Notice first that the Carte di
Regola channels attitudes that might normally be considered as socially dysfunctional, such as
spiteful preferences, into socially useful purposes. People with spiteful preferences choose
to monitor and sanction at a monetary loss. But when their preferences are considered as
part of system efficiency, they are the ones who can perform the function most efficiently
and are channeled into the activity for which they have a comparative advantage.

One might think that the Carte di Regola is similar to a system of vigilantes but there are
important differences. In the model, spiteful people do not care who they hurt, they just
enjoy hurting others, so it is important to direct and constrain them. The Carte di Regola
directs them by reserving the judgment of guilt for the courts, as opposed to the vigilantes,
who would be happy to judge anyone guilty. The court convicts a person only when the guilt
is consistent with social purposes. The magnitude of punishment is also reserved for the
courts in the Carte di Regola system, while in a vigilante system the inspector is allowed to
judge and determine punishment. So, the Carte di Regola constrains what the spiteful can do
to the guilty. Thus, there are important differences (OWG, 1992).

The Carte di Regola also channels arbitrary or random behavior toward useful ends. Such
behavior might ordinarily be regarded as dysfunctional from the point of view of economic
efficiency. Mistaken inspections or impulsively random inspections are costly to the
inspector and thus involve efficiency losses, but the fact that inspections take place has
consequences for those who are excessive users of the common pool resource by increasing
the likelihood that a sanction is imposed. Thus random inspection behavior that would
appear irrational helps preserve the commons.
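For readers who want the mechanism in miniature, below is a toy, single-round caricature of the inspect-at-your-own-cost, fine-paid-to-the-inspector rule the paper studies. The quota, fine, inspection cost, and the crude “only spiteful types inspect” behavioral rule are all invented for illustration; they are not the paper’s experimental parameters or model.

```python
import random

# Toy, one-round caricature of the Carte di Regola mechanism: each user
# takes some amount from the commons; anyone may pay a cost to inspect
# another user, and a violator's fine goes to the inspector who caught them.
# All parameters are invented for illustration.
QUOTA, FINE, INSPECT_COST = 2, 5.0, 1.0

def play_round(agents):
    """agents: name -> (amount_taken, is_spiteful)."""
    payoffs = {name: float(take) for name, (take, _) in agents.items()}
    names = list(agents)
    for name, (_, spiteful) in agents.items():
        # Crude behavioral rule: only "spiteful" types bother to inspect,
        # even though inspecting costs them money.
        if not spiteful:
            continue
        target = random.choice([n for n in names if n != name])
        payoffs[name] -= INSPECT_COST
        if agents[target][0] > QUOTA:   # free rider discovered
            payoffs[target] -= FINE     # predetermined sanction on the violator...
            payoffs[name] += FINE       # ...paid to the user who found the violation
    return payoffs

agents = {"A": (2, False), "B": (6, False), "C": (2, True)}  # (take, spiteful?)
print(play_round(agents))
```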

BoingBoing, in “Wikipedia proves its amazing self-healing powers,” pointed us to The Isuzu Experiment, which goes like this:

Joi Ito points to an ongoing discussion regarding the authority of wikipedia as a source of information and knowledge. The discussion was prompted by an article in the Syracuse Post-Standard that suggests, in part, that wikipedia “take[s] the idea of open source one step too far” by allowing the user to make corrections.

The article has been correctly ridiculed by many, including Mike at Techdirt. In a later posting, he suggests an experiment: why not go to a certain page, insert something provably incorrect, and see how long it lasts.

No matter which side of the debate you find yourself on, this sounds like an interesting experiment. So, I have made not one, but 13 changes to the wikipedia site. I will leave them there for a bit (probably two weeks) to see how quickly they get cleaned up. I’ll report the results here, and repair any damage I’ve done after the period is complete. My hypothesis is that most of the errors will remain intact.

Does that invalidate Wikipedia? Certainly not! If anything, the general correctness and extent of Wikipedia is a tribute to humankind. It suggests that Kropotkin may be right: that the “survival of the fittest” requires that the fittest cooperate. It means that there are very few Vandals like me who are interfering with its mission.

A terrible experiment, but it demonstrates how decentralized authoring can be self-healing. As Phil wrote in the Boing Boing summary:

Remember Al Fasoldt, the journalist who disparaged Wikipedia? He was challenged by a Techdirt writer to change an item and see if his change was found. While Fasoldt dismissed the idea, Alex Halavais thought it was an interesting idea. He made 13 changes to 13 different Wikipedia pages, ranging from obvious to subtle. He figured he’d give them a couple of weeks and then fix the ones that weren’t caught. Every single change was found and changed within hours.

It’s a terrible idea to vandalize Wikipedia like this. But it’s wonderful to see how quickly Wikipedia heals itself from such attacks.

A step in the direction of “decentralizing Akamai” (though one that still relies on centralized DNS to build an interesting distributed web caching network) is the Coral Content Distribution Network.

Mike Dierken talks about the Coral CDN by quoting Gordon Mohr quoting Michael J. Freedman’s post to the p2p-hackers list:

To take advantage of CoralCDN, a content publisher, user, or some third party posting to a high-traffic portal, simply appends .nyud.net:8090 to the hostname in a URL. For example:

http://news.google.com/ -> http://news.google.com.nyud.net:8090/

Through DNS redirection, oblivious clients with unmodified web browsers are transparently redirected to nearby Coral web caches. These caches cooperate to transfer data from nearby peers whenever possible, minimizing the load on the origin web server and possibly reducing client latency.

Mike writes: “DNS/HTTP based P2P — Wicked cool and finally a REST based scalable p2p network. I wonder how I could use that at Amazon…”
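The rewrite itself is easy to automate. Here is a small sketch that “Coralizes” a URL following the convention quoted above; it assumes the simple case shown in the example (no explicit port or userinfo in the original URL).

```python
from urllib.parse import urlsplit, urlunsplit

def coralize(url: str) -> str:
    """Append .nyud.net:8090 to the hostname so the request is served
    through Coral's DNS-redirected web caches.  Simplifying assumption:
    the original URL carries no explicit port or userinfo."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme,
                       parts.hostname + ".nyud.net:8090",
                       parts.path, parts.query, parts.fragment))

print(coralize("http://news.google.com/"))
# -> http://news.google.com.nyud.net:8090/
```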

Rohit asks if this technique could help Slashdot alleviate “The Slashdot Effect.” According to the Slashdot post on Coral, apparently not.

We’ve recently written a new position paper titled Agoric Architectural Styles for Decentralized Space Exploration. It’s been submitted to the 2004 Workshop on Self-Managing Systems (WOSS’04), to be held at FSE-12 in Newport Beach. It was originally based on some notices of intent (NOIs) for a NASA Broad Agency Announcement (BAA) on innovative Human & Robotic Technologies (H&RT) for future space exploration missions.

Abstract: This position paper discusses an architectural approach to managing decentralized space exploration missions. Developing control applications in this domain is complicated by more than just the challenging computing and communication constraints of space-based mission elements; future exploration missions will depend on ad-hoc cooperation between independent space agencies’ elements. Currently, the frontier of interoperability is providing communication relays, as shown by recent Mars missions, where NASA rovers relayed data via ESA satellites.

Future mission planning envisions more extensive autonomy and integration. Examples include taking advantage of excess storage capacity at another node, multicasting messages along several paths through deep space, or even scheduling concurrent observations of an object using several instruments at different locations. An architectural style for developing mission control applications that does not depend on positive ground control from Earth could provide (a) increased margins for space-based computing systems, (b) increased reusability through an effective build-it-for-autonomy-first strategy, and (c) avoidance of the single-point-of-failure bias in standard distributed system design approaches.

In particular, we propose combining an architectural style for decentralized applications based on the Web (ARRESTED) with agoric computing to apply market discipline for allocating resources dynamically among coalitions of mission elements in space. Similar approaches may have applicability in other domains, such as crisis management or battle management.
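To give a flavor of what “market discipline for allocating resources dynamically among coalitions of mission elements” could mean in practice, here is a deliberately simplified sealed-bid sketch for parceling out one element’s excess storage. The element names, bids, and highest-price-first rule are invented for illustration; they are not taken from the paper.

```python
# Toy sealed-bid allocation of one mission element's spare storage.
# Bidders, prices, and the simple highest-price-first rule are invented;
# a real agoric design would be far richer (escrow, futures, reputations...).
def allocate_storage(capacity_mb, bids):
    """bids: list of (element, mb_wanted, price_per_mb); highest price wins first."""
    awards = []
    for element, mb_wanted, price in sorted(bids, key=lambda b: -b[2]):
        granted = min(mb_wanted, capacity_mb)
        if granted <= 0:
            break
        awards.append((element, granted, granted * price))
        capacity_mb -= granted
    return awards

bids = [("ESA-relay", 40, 0.05), ("NASA-rover", 30, 0.12), ("JAXA-orbiter", 50, 0.07)]
print(allocate_storage(80, bids))
# NASA-rover gets 30 MB, JAXA-orbiter the remaining 50, ESA-relay nothing.
```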

The following are just a few excerpted lines from an extremely important argument about the “culture of design” surrounding software. It is a critical aspect of any effort to design “software that works the way society works,” to cite the credo of the decentralized software architecture crowd.

It may have an important impact on how CommerceNet Labs refines its own mission, too…

Software That Lasts 200 Years

In many human endeavors, we create infrastructure to support our lives which we then rely upon for a long period of time…

By contrast, software has historically been built assuming that it will be replaced in the near future (remember the Y2K problem). Most developers observe the constant upgrading and replacement of software written before them and follow in those footsteps with their creations…

In accounting, common depreciation terms for software are 3 to 5 years; 10 at most. Contrast this to residential rental property which is depreciated over 27.5 years and water mains and brick walls which are depreciated over 60 years or more… I can go to city hall and find out the details of ownership relating to my house going back to when it was built in the late 1800’s.

[Dan Bricklin] will call this software that forms a basis on which society and individuals build and run their lives “Societal Infrastructure Software”. This is the software that keeps our societal records, controls and monitors our physical infrastructure (from traffic lights to generating plants), and directly provides necessary non-physical aspects of society such as connectivity.

What is needed is some hybrid combination of custom and prepackaged development that better meets the requirements of societal infrastructure software.

How should such development look? What is the “ecosystem” of entities that are needed to support it? Here are some thoughts:

* Funding for initial development should come from the users…

* The projects need to be viewed as for more than one customer… Funding or cost-sharing “cooperatives” need to exist.

* The requirements for the project must be set by the users, not the developers. The long-term aspects of the life of the results must be very explicit…

* … Impediments such as intellectual property restrictions and “digital rights management” chokepoints must be avoided…

* The actual development may be done by business entities which are built around implementing such projects, and not around long-term upgrade revenue…

* The attributes of open source software need to be exploited. This includes the transparency of the source code and the availability for modification and customization… The availability of the source code, as well as the multi-customer targeting and other aspects, enables a market for the various services needed for support, maintenance, and training as well as connected and adjunct products.

* The development may be done in-house if that is appropriate, but in many cases there are legal as well as structural advantages to using independent entities…

* Unlike much of the discussion about open source, serendipitous volunteer labor must not be a major required element. A very purposeful ecosystem of workers, doing their normal scheduled work, needs to be established to ensure quality, compatibility, modifications, testing, security, etc… The health of the applications being performed by the software must not be dependent upon the hope that someone will be interested in it; like garbage collecting, sewer cleaning, and probate court judging, people must be paid.

The ecosystem of software development this envisions is different than that most common today. The details must be worked out. Certain entities that do not now exist need to be bootstrapped and perhaps subsidized. There must be a complete ecosystem, and as many aspects of a market economy as possible must be present.

This is all very old school – 1988! – but it’s always refreshing to take another look at the basics. While a survey paper from U. Florida says that the concept can be traced to Ivan Sutherland auctioning timesharing slots in 1968, the likely origin of the term “agoric systems” (from the Greek agora, or market) is Mark S. Miller and K. Eric Drexler’s chapter in The Ecology of Computation.

Like all systems involving goals, resources, and actions, computation can be viewed in economic terms. Computer science has moved from centralized toward increasingly decentralized models of control and action; use of market mechanisms would be a natural extension of this development. The ability of trade and price mechanisms to combine local decisions by diverse parties into globally effective behavior suggests their value for organizing computation in large systems.

This paper examines markets as a model for computation and proposes a framework—agoric systems—for applying the power of market mechanisms to the software domain. It then explores the consequences of this model at a variety of levels. Initial market strategies are outlined which, if used by objects locally, lead to distributed resource allocation algorithms that encourage adaptive modification based on local knowledge. If used as the basis for large, distributed systems, open to the human market, agoric systems can serve as a software publishing and distribution marketplace providing strong incentives for the development of reusable software components. It is argued that such a system should give rise to increasingly intelligent behavior as an emergent property of interactions among software entities and people.
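As a minimal illustration of “the ability of trade and price mechanisms to combine local decisions by diverse parties into globally effective behavior,” here is a toy price-adjustment loop that allocates CPU shares among tasks with fixed spending budgets. The budgets, demand rule, and step size are invented; this is a sketch of the general idea, not Miller and Drexler’s actual proposals.

```python
# Toy price-adjustment loop: each task decides locally how much CPU it
# wants at the posted price (demand = budget / price), and the market
# nudges the price until total demand matches the machine's capacity.
# Budgets, step size, and the linear demand rule are invented.
def clearing_price(budgets, capacity, price=1.0, step=0.1, rounds=200):
    for _ in range(rounds):
        demand = sum(b / price for b in budgets)   # purely local decisions
        if abs(demand - capacity) < 1e-6:
            break
        price *= 1 + step * (demand - capacity) / capacity
    return price

budgets = [10.0, 5.0, 1.0]             # per-task spending money
price = clearing_price(budgets, capacity=8.0)
shares = [b / price for b in budgets]  # the CPU share each task can afford
print(round(price, 3), [round(s, 2) for s in shares])  # 2.0 [5.0, 2.5, 0.5]
```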

Jon Udell wrote in InfoWorld this week:

In the end, scalability isn’t an inherent property of programming languages, application servers, or even databases. It arises from the artful combination of ingredients into an effective solution. There’s no single recipe. No matter how mighty your database, for example, it can become a bottleneck when used inappropriately.

It’s tempting to conclude that the decentralized, loosely coupled Web architecture is intrinsically scalable.

Not so. We’ve simply learned — and are still learning — how to mix those ingredients properly. Formats and protocols that people can read and write enhance scalability along the human axis. Caching and load-balancing techniques help us with bandwidth and availability.

But some kinds of problems will always require a different mix of ingredients. Microsoft has consolidated its internal business applications, for example, onto a single instance of SAP. In this case, the successful architecture is centralized and tightly coupled.

For any technology, the statement “X doesn’t scale” is a myth. The reality is that there are ways X can be made to scale and ways to screw up trying. Understanding the possibilities and avoiding the pitfalls requires experience that doesn’t (yet) come in a box.

Decentralization is not a hammer that can hit every nail. Experience is still the mother of good judgment.