RDF: You Went Too Far

The updated RDF and new OWL specifications have been released by the W3C. The purpose of these languages is to help define machine-understandable semantic relationships between resources both on and off the Internet. It’s a noble goal, and one that promises to hold the future of the Internet in its hand, but I think RDF has started down a dangerous path that will forever dominate its destiny.

A large portion of the RDF spec is concerned with the construction of RDF Schemas. An RDF schema is intended to define RDF vocabularies. By defining a vocabulary, you can then talk in more complex terms. For example, rather than simply having an element with the creator’s name, you can reference a Person resource as the creator. This theoretical Person resource can then contain a name, email address, or any other information deemed appropriate. This is all well and good; after all, it’s difficult to speak using only monosyllabic words, so creating bigger words makes it easier to express larger concepts.
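To make the difference concrete, here is a rough sketch in Python that treats RDF statements as plain (subject, predicate, object) tuples. The example.org URIs, the `Person` vocabulary, and the `describe` helper are all made up for illustration; only the Dublin Core `creator` and `rdf:type` URIs are real.

```python
# Real URIs: Dublin Core "creator" and rdf:type.
CREATOR = "http://purl.org/dc/elements/1.1/creator"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical URIs, invented for this sketch.
DOC = "http://example.org/docs/report"
PERSON = "http://example.org/people/jane"
EX = "http://example.org/vocab#"

# Flat form: the creator is just a literal name.
flat = [(DOC, CREATOR, "Jane Doe")]

# Structured form: the creator is a Person resource with its own properties.
structured = [
    (DOC, CREATOR, PERSON),
    (PERSON, RDF_TYPE, EX + "Person"),
    (PERSON, EX + "name", "Jane Doe"),
    (PERSON, EX + "email", "jane@example.org"),
]


def describe(triples, subject):
    """Collect the predicate/object pairs attached to one subject."""
    return {p: o for s, p, o in triples if s == subject}


# Follow the creator link from the document to the Person resource.
creator = describe(structured, DOC)[CREATOR]
print(describe(structured, creator)[EX + "name"])
```

In the flat form the name is a dead end; in the structured form the creator is itself a node that can carry more statements.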

The problem is that this makes RDF “Yet Another Class Language.” RDF defines a very nice, elegant way of associating name-value properties with resources and then describing the relationship graphs between those resources, but it should have stopped there. Imagine for a moment that we are modelling a Person object. Today we need to, in no particular order:

1) Define a class in our favorite object-oriented language (in this case C#).

using System;

public class Person
{
    public String   Name;
    public DateTime BirthDate;   // C# has no Date type; DateTime is the closest match
    public String   SocialSecurityNumber;
}

2) Define an XML schema.

<xsd:element name="Person">
    <xsd:complexType>
        <xsd:sequence>
            <xsd:element name="Name" type="xsd:string" />
            <xsd:element name="BirthDate" type="xsd:date" />
            <xsd:element name="SocialSecurityNumber" type="xsd:string" />
        </xsd:sequence>
    </xsd:complexType>
</xsd:element>

So now we have two definitions of our class structure that must be maintained together. With the addition of RDF Schema (RDFS), we have to maintain a third representation of the same class!

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">

    <rdfs:Class rdf:ID="Person" />

    <rdf:Property rdf:ID="Name">
        <rdfs:domain rdf:resource="#Person" />
        <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string" />
    </rdf:Property>

    <rdf:Property rdf:ID="BirthDate">
        <rdfs:domain rdf:resource="#Person" />
        <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#date" />
    </rdf:Property>

    <rdf:Property rdf:ID="SocialSecurityNumber">
        <rdfs:domain rdf:resource="#Person" />
        <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#string" />
    </rdf:Property>
</rdf:RDF>

Of course, a lot of the required generation, translation, and maintenance could be handled by software tools, but I don’t think we should have to resort to this. I posit that, for its class structures, RDFS should have yielded to XSD to define new complex and simple types. Like RDFS, XSD also allows for the definition of complex type hierarchies. Heck, the RDF documents use the XSD primitive types all over the place in their examples! Why not just cut the crap and use XSD?
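For what it’s worth, the kind of tool I have in mind is not hard to sketch. The Python below is only an illustration, not an existing tool: `PERSON_FIELDS`, `to_xsd`, and `to_rdfs` are names I made up, and it assumes every property maps onto an XSD primitive type. It renders both the XSD and the RDFS definitions of Person from a single field list using the standard library’s `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"

# Use the same prefixes as the hand-written examples when serializing.
ET.register_namespace("xsd", XSD_NS)
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("rdfs", RDFS_NS)

# Hypothetical single source of truth: (property name, XSD primitive type).
PERSON_FIELDS = [
    ("Name", "string"),
    ("BirthDate", "date"),
    ("SocialSecurityNumber", "string"),
]


def to_xsd(class_name, fields):
    """Render the field list as an xsd:element holding a sequence."""
    schema = ET.Element(f"{{{XSD_NS}}}schema")
    element = ET.SubElement(schema, f"{{{XSD_NS}}}element", name=class_name)
    complex_type = ET.SubElement(element, f"{{{XSD_NS}}}complexType")
    sequence = ET.SubElement(complex_type, f"{{{XSD_NS}}}sequence")
    for name, xsd_type in fields:
        ET.SubElement(sequence, f"{{{XSD_NS}}}element",
                      name=name, type=f"xsd:{xsd_type}")
    return ET.tostring(schema, encoding="unicode")


def to_rdfs(class_name, fields):
    """Render the same field list as an rdfs:Class plus rdf:Property nodes."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    ET.SubElement(root, f"{{{RDFS_NS}}}Class", {f"{{{RDF_NS}}}ID": class_name})
    for name, xsd_type in fields:
        prop = ET.SubElement(root, f"{{{RDF_NS}}}Property",
                             {f"{{{RDF_NS}}}ID": name})
        ET.SubElement(prop, f"{{{RDFS_NS}}}domain",
                      {f"{{{RDF_NS}}}resource": f"#{class_name}"})
        ET.SubElement(prop, f"{{{RDFS_NS}}}range",
                      {f"{{{RDF_NS}}}resource": f"{XSD_NS}#{xsd_type}"})
    return ET.tostring(root, encoding="unicode")


print(to_xsd("Person", PERSON_FIELDS))
print(to_rdfs("Person", PERSON_FIELDS))
```

It works, but notice that the field list is now effectively a fourth description of Person, which is rather the point.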