According to a press release, the new Oracle server supports RDF. Does this mean the Semantic Web is mainstream now? Hurray!
Hello Argentina and Inkel
Ingrid and I are in Argentina right now. If you don't get emails from me, or wonder why I am so silent: it's permanent barbecue and sitting around drinking mate at the moment.
Btw: I have met Inkel, as arranged through this weblog. The blogosphere worked: with the help of DanBri I got to know Inkel, and together with Daniel we had an evening of tough FOAF and Semantic Web chat. We met in Buenos Aires at our hotel and then went out to eat together; it was like a Semantic Web Lounge meeting in Vienna, cool.
Photos will follow – and be warned – the photos will definitely rock (surprise surprise…)
Java code generators
A question often faced by RDF developers: how do you change those RDF graphs?
Usually, you would use the Jena or Sesame API to add triples directly, delete them, list classes using (?x rdf:type rdfs:Class), and so on…
Then, some developers make libraries that allow you to generate code that wraps the RDF in an object model.
This means that instead of adding a triple:
X.addProperty(FOAF.name, "Leo");
you say:
X.setFoafName("Leo");
where the setFoafName method was generated by such a code generator.
Now some code generators for you:
- rdf2java by Sven Schwarz, DFKI. Jena-based, works on RDFS and OWL; used in some research projects.
- SWeDE by BBN. They mention Kazuki code generation.
- rdfreactor by some Karlsruhe guys.
From my perspective, they all do the same thing; pick whichever you like.
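To make the contrast concrete, here is a minimal sketch of what such a generated wrapper might look like, using no RDF library at all: triples are plain String arrays, and all class and method names (FoafPersonWrapper, setFoafName) are hypothetical illustrations of the pattern, not the output of any of the generators above.

```java
import java.util.*;

// Sketch of a "generated" wrapper class around a bare triple set.
// A triple is modeled as a String[3]: {subject, predicate, object}.
class FoafPersonWrapper {
    static final String FOAF_NAME = "http://xmlns.com/foaf/0.1/name";

    private final String subjectUri;
    private final Set<String[]> triples;

    FoafPersonWrapper(String subjectUri, Set<String[]> triples) {
        this.subjectUri = subjectUri;
        this.triples = triples;
    }

    // The generated setter hides the raw addProperty(FOAF.name, ...) call.
    public void setFoafName(String name) {
        triples.add(new String[] { subjectUri, FOAF_NAME, name });
    }

    // The generated getter hides the triple lookup.
    public String getFoafName() {
        for (String[] t : triples) {
            if (t[0].equals(subjectUri) && t[1].equals(FOAF_NAME)) {
                return t[2];
            }
        }
        return null;
    }
}
```

A real generator would of course emit this against the Jena or Sesame API instead of a raw set, but the shape of the generated accessors is the same.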
foaf:knows in Argentina
Ingrid and I are going to Argentina on August 13 to be guests at the wedding of my cousin Robert Sauermann. We will first be in Buenos Aires from the 14th to about the 17th of August, and then in Santiago del Estero; the wedding is there on the 20th.
Later on we'll travel through the country and see a few places. I don't know anything about Argentina except the Evita movie and that Che was born there. Hm, steaks and Churrascaria.
So how will I find out whom I foaf:know in Argentina, so I can visit a few Swhack and rdfig people?
SPARQL fast as hell
In the last two months we shifted the gnowsis search services to SPARQL. Our problem was that the common ARQ implementation is slow and does not work in a named-graph scenario.
A fine twist of history brought Richard Cyganiak to HP Labs for a while, where he hacked sparql2sql, a mapping of SPARQL to the Jena database schema with support for named graphs.
I asked him whether I could use it in gnowsis, and he was quite happy to have it as a test case. The result is that Richard added a special framework for full-text search and I hacked some stuff around MySQL's fine FULLTEXT index functions (if you don't know them: they are similar to Lucene and can do relevance ranking, query expansion, and so on).
Outcome: an hour ago I transformed the SQL index to the new full-text format and got hit in the face by the new answer times:
SPARQL of this kind works practically instantly (10–20 ms):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?x ?label WHERE {
  GRAPH ?source {
    ?x rdfs:label ?label .
    FILTER REGEX(?label, "test", "i")
  }
}
But the really astonishing thing is that SPARQL of this kind is also fast as hell (10–500 ms):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?x ?p ?label WHERE {
  GRAPH ?source {
    ?x ?p ?label .
    FILTER REGEX(?label, "test", "i")
  }
}
In simple words: this is a full-text scan over all properties of all statements. Don't be bothered by the warmup time; the thing needs about 10–20 seconds to warm up, but after that it's great.
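In plain Java, the semantics of that second query amount to something like the following sketch (no Jena, no SQL; the triple representation and class name are my own invention for illustration): scan every statement and keep those whose object matches the keyword case-insensitively.

```java
import java.util.*;
import java.util.regex.*;

// Naive illustration of what the query means logically: a case-insensitive
// substring scan over the object of every triple, regardless of predicate.
// Triples are modeled as String[3]: {subject, predicate, object}.
class FulltextScan {
    static List<String[]> matchAllProperties(List<String[]> triples, String keyword) {
        // Mimics FILTER REGEX(?label, keyword, "i")
        Pattern p = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE);
        List<String[]> hits = new ArrayList<>();
        for (String[] t : triples) {
            if (p.matcher(t[2]).find()) {
                hits.add(t);
            }
        }
        return hits;
    }
}
```

The point of the sparql2sql plus FULLTEXT work is precisely that the database avoids this naive scan by answering the match from its full-text index.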
Triple count: 371994
For the freaks, we will probably pack a little executable jar that wraps everything into a nice demo. For the real freaks:
http://gnowsis.opendfki.de/repos/gnowsis/trunk/sparql2sql/
Smushing
I am putting together more about smushing, which will be a key factor in the global Semantic Web: connecting annotations made by different people.
A typical smushing algorithm would be:
- take a large data store DS that contains a set of triples Tset = {Ta, Tb, Tc, …}
- iterate through the known InverseFunctionalProperties IFPset = {Ia, Ib, Ic, …}
- for each InverseFunctionalProperty Iy that appears in Tset as a predicate, do a check for smushing:
- find all triples TxIy whose predicate is Iy
- find one triple Txc in TxIy that points to a grounding resource / canonical resource (see below)
- take the subject Sx from the triple Txc and move all triples of the other subjects in TxIy to Sx; that is, change the subject of those triples to Sx
- add owl:sameAs triples connecting all subjects of TxIy to Sx
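As a rough sketch of these steps in plain Java (hypothetical class name, triples as String arrays, no RDF library), using the shortest-URI heuristic discussed just below to pick the canonical subject:

```java
import java.util.*;

// Sketch of IFP-based smushing over a plain list of triples {s, p, o}.
class Smusher {
    static List<String[]> smush(List<String[]> triples, String ifp) {
        // 1. Group subjects by the value of the inverse-functional property.
        Map<String, List<String>> byValue = new LinkedHashMap<>();
        for (String[] t : triples) {
            if (t[1].equals(ifp)) {
                byValue.computeIfAbsent(t[2], k -> new ArrayList<>()).add(t[0]);
            }
        }
        // 2. Per group, pick a canonical subject (here: shortest URI heuristic).
        Map<String, String> canonical = new HashMap<>();
        for (List<String> subjects : byValue.values()) {
            String best = Collections.min(subjects, Comparator.comparingInt(String::length));
            for (String s : subjects) canonical.put(s, best);
        }
        // 3. Rewrite the subject of every triple to its canonical subject.
        List<String[]> out = new ArrayList<>();
        for (String[] t : triples) {
            String s = canonical.getOrDefault(t[0], t[0]);
            out.add(new String[] { s, t[1], t[2] });
        }
        // 4. Record the identifications as owl:sameAs triples.
        for (Map.Entry<String, String> e : canonical.entrySet()) {
            if (!e.getKey().equals(e.getValue())) {
                out.add(new String[] { e.getKey(), "http://www.w3.org/2002/07/owl#sameAs", e.getValue() });
            }
        }
        return out;
    }
}
```

Picking the shortest URI is only one of the heuristics; swapping in another one only changes the line that selects `best`.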
The problem is this: when a set of triples TxIy has several subjects that should be the same – as defined by the IFP – you must choose which subject is the "canonical" one that should receive the triples.
There are different approaches to finding the canonical resource:
- pick one at random
- prefer the resource annotated in a special ontology (i.e. prefer SKOS concepts over foaf:Persons)
- prefer the more public resource (googlefight; public URIs win over private URIs)
- prefer the best-annotated resource (the resource with the most triples – attention, this self-amplifies single resources)
- prefer the resource with the shortest / longest URI
- prefer named resources over anonymous resources (this is very important, you must not smush onto anonymous nodes)
Another question is what to do with the smushing result. Different approaches:
- store the smushed triples in an extra graph
- delete the old triples and add the smushed ones
- add the smushed triples alongside the old triples (tricky)
Each has obvious advantages and disadvantages. For gnowsis I would prefer (1), smushing into an extra graph, which is similar to (3) but separates the data.
In gnowsis we have the additional problem of incremental smushing: we crawl thousands of emails per day and then want to smush the persons in the addresses, but only for the new messages.
I have also posted this algorithm in the ESW wiki, where you can comment on it.
new diploma thesis topics available
I updated my diploma thesis website, where I offer topics related to the Semantic Desktop as master's theses.
http://www.dfki.uni-kl.de/~sauermann/projekte/index.html
This always takes so much time: writing down my ideas and the things that will be needed. The next student starts a thesis in October, so there are still places open.
Here are the current topics. Well, I should add another one for GUIs!
- your own idea – "I have an idea for the Semantic Web, it's X, and Leo, you will like it!" Mail me.
- business card annotation – annotating business events
- managing concepts using SKOS – concepts for the masses
- semantic tools – evaluate semantic tools for our future
- semantic search – search inside the Semantic Desktop (taken by a student)
- semantic email – how to annotate emails and how to classify emails. Eat spam!
- text annotation – using the Semantic Desktop to annotate documents or write notes about sport events
- metadata integration – integrate information using RDF
- cyberspace in Ultima Online – are you a mage or a hacker? Or both?
i-know interview and link to me
http://futurezone.orf.at/futurezone.orf?read=detail&id=270140&tmp=17566
There is an ORF Futurezone entry about the i-know conference that goes into more detail about the work of Matteo Bonifacio and me.
Some of the gnowsis and Semantic Desktop features are mentioned there; it's a great thing to be linked on the FuZo.
And also Matteo – I am always happy to see this guy, he is full of energy and good science.
the conference was a nice place to be!
i-know 2005 article and radio program
Dr. Sonja Bettel from the Austrian radio station Ö1 was at the i-know conference and wrote an article about it.
There is also a radio feature about the conference; if you are an Ö1 club member, you can download it.
I was interviewed for this broadcast about the Semantic Desktop, and I hope to get a copy of the audio soon 🙂
I-Know 2005 conference
I am at I-Know 2005 right now, and if you add 1+1 you might infer that I am listening to a keynote so breathtaking that I am web-surfing and blogging.
I-Know itself is going really well. I am meeting dozens of people from i-know 2003 and the WM conference, and old buddies from Vienna like this one (Andreas Blumauer).
Harald Holz gave an interesting talk on his stuff in the BPOKI track, and Oleg Rostanin gave insights into some e-learning stuff in PROLEARN.
Yesterday we had a welcome reception at the Schlossberg, and Hermann Maurer, the legendary professor here in Graz, gave a humorous explanation of the historical background of Graz. At the end of the night, just before the last train went down from the Schlossberg, Hermann Maurer was one of the last guests to leave (the others were Harald, myself, Keith Andrews and Sylvia). Another good bottle of wine with even finer people.
Yesterday was also a great day of coffee talks about the Semantic Desktop and crazy Swiss guys. Happy to be here; my own talk is on Friday in Room 3 at 11:15. I am still improving my slides, which get better from minute to minute: the talks here either inspire me to change my slides, or they are so uninteresting that I have enough time to improve gnowsis and my personal semantic web.