Ok, folksonomies are the super-simple way to let people annotate their documents. And when we use tags in blogs, Flickr, wikis, etc., the world is better and way cooler. Websites like Technorati can show us what is hip at the moment, and I can customize these services to work together. I can even add a photo to Flickr, press the “blog this” button there, and it will be posted here. Cool: they use web services, they offer their services through standardized interfaces, and it works.
But then …
people find out that the tags aren’t really that good, because they lack stemming.
A stemmer is a computer program or algorithm which determines a stem form of a given inflected (or, sometimes, derived) word form — generally a written word form. The stem need not be identical to the morphological root of the word; it is usually sufficient that related words map to the same stem, even if this stem is not in itself a valid root.
wikipedia – stemmer
So the tags don’t line up, because people use different word forms for the same tag. One uses “book”, the other chooses “books”. Or “surfing”, “surfed”, “surfs”, etc., which all mean similar things. So hooray for the programmer who added stemming to services like Technorati.
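To make the idea concrete, here is a minimal sketch of what stemming does to tags. It is a toy suffix stripper I made up for illustration, not the Porter algorithm and not whatever Technorati actually runs; the function names `naive_stem` and `group_by_stem` are hypothetical.

```python
def naive_stem(tag: str) -> str:
    """Toy suffix stripper (an assumption for illustration, not a real stemmer)."""
    word = tag.lower()
    # Try the longest suffixes first and keep at least three letters of stem.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def group_by_stem(tags):
    """Collect tags that share the same stem under one key."""
    groups = {}
    for tag in tags:
        groups.setdefault(naive_stem(tag), []).append(tag)
    return groups


if __name__ == "__main__":
    print(group_by_stem(["book", "books", "surfing", "surfed", "surfs"]))
```

With this, “book” and “books” end up under “book”, and the three surf variants end up under “surf”. A real stemmer handles far more cases (and far more exceptions) than these three suffixes.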
Then people see that they mean the same thing with different tags: they use “surfing” and “browsing” to refer to web browsing. This is the problem of synonyms.
Synonyms (in ancient Greek syn ‘συν’ = plus and onoma ‘όνομα’ = name) are different words with similar or identical meanings and are interchangeable.
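A rough sketch of how a service could fold synonyms together: map every known variant to one preferred tag before indexing or searching. The mapping and the preferred term “websurfing” are made up for this example; a real service would need a proper thesaurus.

```python
# Hypothetical synonym table: variant tag -> preferred tag.
SYNONYMS = {
    "browsing": "websurfing",
    "surfing": "websurfing",  # ignores that "surfing" can also mean the watersport
}


def canonical_tag(tag: str, synonyms=SYNONYMS) -> str:
    """Return the preferred tag for a variant, or the lowercased tag itself."""
    lowered = tag.lower()
    return synonyms.get(lowered, lowered)


def same_topic(tag_a: str, tag_b: str) -> bool:
    """Two tags count as the same topic if they share a preferred tag."""
    return canonical_tag(tag_a) == canonical_tag(tag_b)


if __name__ == "__main__":
    print(same_topic("Surfing", "browsing"))
    print(same_topic("surfing", "books"))
```

The first call reports a match, the second does not. The weak spot is obvious: a flat table like this treats every “surfing” as web surfing.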
And the next thing is that one word can have different meanings depending on context: surfing can be the watersport or web surfing. These are homonyms.
Homonyms (in Greek homoios = identical and onoma = name) are words that have the same phonetic form (homophones) or orthographic form (homographs) but unrelated meaning.
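One simple way to guess which meaning is intended is to look at the other tags on the same item. The sketch below does exactly that; the sense names and the context words in `SENSES` are invented for illustration, and counting overlaps is only one of many possible heuristics.

```python
# Hypothetical sense inventory: tag -> {sense label: typical co-occurring tags}.
SENSES = {
    "surfing": {
        "surfing (watersport)": {"wave", "beach", "surfboard"},
        "surfing (web)": {"browser", "internet", "web"},
    }
}


def disambiguate(tag, co_tags, senses=SENSES):
    """Pick the sense whose context words overlap most with the item's other tags."""
    candidates = senses.get(tag)
    if not candidates:
        return tag  # not a known homonym
    others = set(co_tags)
    best = max(candidates, key=lambda sense: len(candidates[sense] & others))
    if not candidates[best] & others:
        return tag  # no evidence either way, leave the tag ambiguous
    return best


if __name__ == "__main__":
    print(disambiguate("surfing", ["beach", "hawaii"]))
    print(disambiguate("surfing", ["browser", "firefox"]))
```

A photo tagged “surfing” and “beach” resolves to the watersport, a bookmark tagged “surfing” and “browser” to the web meaning, and a lone “surfing” tag stays ambiguous.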
Ok, and if I search for “surfing” and I mean the watersport, then it might be good to include terms like “bodyboarding”, “surfspot”, “surfer” and “bigwavesurfing” in my search terms. Or if I search for “surfboard”, I might also want items tagged with “longboard”, “shortboard”, “bodyboard”, etc.
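This kind of query expansion can be sketched with a small lookup of related terms. The table below just reuses the terms from the paragraph above; the structure (`RELATED`) and the function name `expand_query` are my assumptions, not how any particular service does it.

```python
# Hypothetical table: search term -> terms that should also be searched.
RELATED = {
    "surfing": ["bodyboarding", "surfspot", "surfer", "bigwavesurfing"],
    "surfboard": ["longboard", "shortboard", "bodyboard"],
}


def expand_query(term, related=RELATED):
    """Return the term plus everything reachable through the related-terms table."""
    result = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in result:
            continue  # already visited, this also guards against cycles
        result.append(current)
        # Push in reverse so terms come out in the order they are listed.
        stack.extend(reversed(related.get(current, [])))
    return result


if __name__ == "__main__":
    print(expand_query("surfboard"))
```

Searching for “surfboard” then also covers “longboard”, “shortboard” and “bodyboard”. Because the expansion follows the table transitively, deeper levels of a taxonomy are picked up too.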
To structure terms that belong together, we use something called taxonomies. Homonyms and synonyms are usually recorded in a thesaurus. The stemmer is a basic tool used before we start. Altogether, we would like a data format to exchange and store those taxonomies and thesauri. Hey, there is one: it’s called RDF, and you can even store ontologies with it. I know, it’s hard to learn and you can’t really do everything you want, but it’s a start and there are tools out there that work with it.
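To give a feeling for what that looks like, here is a small sketch that writes a tag with its synonyms and narrower terms as RDF triples in N-Triples syntax. Using the SKOS vocabulary (`prefLabel`, `altLabel`, `narrower`) is my choice for the example, the `http://example.org/tags/` namespace is a placeholder, and the code assumes tag names contain no quotes or spaces, since nothing is escaped.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/tags/"  # placeholder namespace, an assumption


def triple(subject, predicate, obj):
    """Format one N-Triples line; obj must already be a <uri> or a "literal"."""
    return f"<{subject}> <{predicate}> {obj} ."


def concept_triples(name, alt_labels=(), narrower=()):
    """Describe one tag: its preferred label, its synonyms and its narrower tags."""
    uri = BASE + name
    lines = [triple(uri, SKOS + "prefLabel", f'"{name}"')]
    for label in alt_labels:
        lines.append(triple(uri, SKOS + "altLabel", f'"{label}"'))
    for child in narrower:
        lines.append(triple(uri, SKOS + "narrower", f"<{BASE + child}>"))
    return lines


if __name__ == "__main__":
    for line in concept_triples("surfboard", narrower=["longboard", "shortboard"]):
        print(line)
```

Each printed line is one statement of the form subject, predicate, object. In practice you would let an RDF library do the serialization instead of formatting strings by hand.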
So what will happen? I predict that the major tools that build today on what we know as folksonomies will include the core elements of the semantic web, and at a certain point in time, the necessity of RDF and the good stuff will be obvious to the masses of developers and users out there. The good thing in all this is the learning approach: if someone knows what tags are, really knows what tags are from using them for a year, and really knows what the problems of synonyms, stemming, and taxonomies are from having tried to search for something, then this someone will understand what the semantic web is about.