What Does “Web Paradigm” Mean, Anyway? (Part 3)

In Part 1 and Part 2 of “What Does ‘Web Paradigm’ Mean, Anyway?” I put forward the view that web integration with the desktop has already been done to the extent possible, and that integrating the desktop into the web browser is a long-term losing prospect. It is clear, however, that these answers are not satisfactory. There must be ways to make more effective use of the web in our daily lives than Firefox and Internet Explorer, right? We’ve taken pretty good advantage of the hyperlink, but can we finally take full advantage of the common languages and ubiquitous protocols, the final two things the web offers us?

The answer is yes, but not with our current idea of the web. We need to fundamentally re-define it, we need a new version, we need a Web 2.0.

Yeah, okay, so it’s the buzzword of the month. I do my best to avoid over-use of buzzwords, but this time I’m guilty. (Although this is the first time I’ve used the phrase on my site.) However, if we are truly going to bind the web into our lives, we need this re-thinking of the fundamentals of web development; from now on I will refer explicitly to Web 1.0 and Web 2.0.

As I’ve said, one of the biggest promises of Web 1.0 was language commonality. Web 1.0 achieved this mostly through the standardization of HTML, which is also why it failed to meet our secret expectations. At its core, HTML is a language for rendering text, and it is very good at that. However, we yearn for it to represent generalized abstract concepts, a task it was not designed for and thus cannot fulfill. Representing a business card in HTML requires tables and images and paragraphs, all of which a human being readily recognizes as a “business card,” but none of which a computer can easily handle.
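
To make the problem concrete, here is a hedged sketch of such a card - the names and markup are invented, but the shape is typical of presentational HTML:

```html
<!-- A business card marked up purely for presentation: a person looking at
     the rendered page sees a "business card," but a program sees only a
     table containing an image and some text. -->
<table class="card">
  <tr><td><img src="logo.png" alt="ACME Corp"></td></tr>
  <tr><td><p><b>Jane Doe</b><br>Senior Widget Engineer</p></td></tr>
  <tr><td><p>jane@example.com &middot; (555) 555-0100</p></td></tr>
</table>
```

Nothing in that markup tells a program which cell holds the name and which holds the phone number; that knowledge lives only in the reader’s head.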

As an anecdote, Chris Schultz and I were chatting yesterday, and he happened to mention an obscure feature of Microsoft Excel. Apparently, you can point Excel at a URL containing an HTML table, and it will import that table data into the current spreadsheet, even going so far as to dynamically recalculate the spreadsheet if the remote data changes. At one of his previous employers, a client required reporting data to be delivered in Excel format. Instead of building some sort of component to generate the spreadsheet in its entirety, they built the reporting spreadsheet to use this “table scraping” feature in Excel to pull down the most current data and then do the calculations in Excel. It’s quite an ingenious solution, and it should be obvious that this is a very early kind of “web service.”
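
To illustrate, the page being scraped needs nothing more exotic than an ordinary HTML table - something like the sketch below, with invented columns and figures - which Excel’s Web Query feature can pull into a worksheet and refresh on demand:

```html
<!-- A plain report table published at some internal URL. Excel imports the
     rows and cells as spreadsheet data; all names and numbers here are made up. -->
<table id="sales">
  <tr><th>Region</th><th>Q1</th><th>Q2</th></tr>
  <tr><td>East</td><td>1200</td><td>1350</td></tr>
  <tr><td>West</td><td>980</td><td>1100</td></tr>
</table>
```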

Despite its cleverness, this story makes the architect and purist in me gag a little bit. The solution is extremely fragile: since there is no semantic information associated with the table - it has only structure - modifying that structure even slightly will break the reports. This fragility makes the reporting system very difficult to change. If new reports are needed, or existing spreadsheets require additional data, everyone must update their local spreadsheets or they will break. What is needed is a generic, web-accessible language in which the reporting data - or really any data - can be defined.

I am sure there will be no gasps of surprise when I say that XML is that language. I won’t dwell on it long, since we all know how great it is. Suffice it to say that XML is general and extensible, and thus finally provides a common, general-purpose language for web communication. That’s not to say that HTML will vanish - it will always be useful for displaying text - but it now becomes just one vocabulary within a more powerful, generalized framework. This new language can represent concepts, not just text.
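
Returning to the business card, here is a hedged sketch of what the same information might look like as XML - the element names are my own invention, not any particular standard:

```xml
<!-- The same card expressed as data: every field is named, so a program can
     pull out the email address without guessing at table structure. -->
<businessCard>
  <name>Jane Doe</name>
  <title>Senior Widget Engineer</title>
  <email>jane@example.com</email>
  <phone type="voice">(555) 555-0100</phone>
</businessCard>
```

Any vocabulary would do; the point is that the meaning travels with the data instead of living in a human reader’s head.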

So then what about the protocols? It seems that HTTP has become the de facto web protocol, and it will continue to be so. Specialized protocols will still be developed where they are needed - BitTorrent is a good example of a system for which HTTP is ill-suited - but HTTP will be used for as much as humanly possible. Continuing with BitTorrent: the tracker protocol works over HTTP even if the data-exchange protocol does not. If this standardized protocol is coupled with XML as a standard language, then suddenly anything can connect to anything and just work. The graceful degradation of XSLT and other XML processing tools permits meaningful processing even when the conversational language is not fully understood by one of the participants. As an analogy, if a Web 2.0 client were an English-speaking person, and a Web 2.0 server were a Spanish-speaking person, then HTTP would be the telephone system. One may contact the other and initiate a conversation, and the conversation will be as meaningful as possible - as meaningful as picking out English-sounding words in a stream of Spanish.
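
To show what that graceful degradation looks like in practice, here is a hedged XSLT sketch against the business-card document above - again, the vocabulary is invented. The stylesheet renders the fields it understands and silently skips anything it does not:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Wrap the whole card and process whatever children we know about. -->
  <xsl:template match="/businessCard">
    <div class="card"><xsl:apply-templates/></div>
  </xsl:template>

  <!-- Fields this particular client understands. -->
  <xsl:template match="name">
    <strong><xsl:value-of select="."/></strong>
  </xsl:template>
  <xsl:template match="email">
    <p>Email: <xsl:value-of select="."/></p>
  </xsl:template>

  <!-- Anything unrecognized (title, phone, future extensions) is skipped
       rather than breaking the exchange. -->
  <xsl:template match="*"/>

</xsl:stylesheet>
```

A client that only knows about names and email addresses still gets a useful rendering; a smarter client with more templates gets more - picking out the English-sounding words in the stream of Spanish.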

Now mix in the one shining achievement of Web 1.0: the link. Suddenly Metcalfe’s Law no longer applies just to hyperlinked HTML, but to all of the data made available on the network in a publicly addressable form. Instead of screen-scraping HTML, our programs may communicate with one another in a language they understand rather than one we understand - but one easily translated for humans. The power and potential are indisputable. This is the promise of the new Web 2.0.

Feel free to laugh maniacally here.

Done? Good. Now that we have this new definition, let’s just start calling Web 2.0 the web, and explicitly refer to Web 1.0 only when talking about the past. Next time, we’ll play the “What If” game and explore some of the possible concrete ways we might harness this power.