gnowsis alpha release planned for 28th September

I plan to publish the gnowsis alpha on 28th September, 12:00 Berlin time.

So I still have 20 workdays to go, to ….

write the documentation, test it, build tarballs, write an installer, test it, drink coffee, hate my computer, write ontologies, hire new employees, obfuscate the dirty hacks, work on my todo lists, create new todos, hate todo lists, talk to developers, talk to my people, get the website running, get the website looking decent, test the website, use gnowsis, integrate our research project, write papers, love god + my neighbor + myself …

So the time is not enough, but something will be done.

I will post updates here from time to time (probably daily) to keep you updated and to keep myself on track.

Integrated foafnaut and gnowsis

A good thing to announce:

Jim Ley and I integrated www.foafnaut.org and www.gnowsis.com. A little foafnaut now runs here on my local desktop and shows me my foaf network.

http://foafcamp.asemantics.org rules!

see more at this announcement
http://rdfweb.org/topic/FoafCampGnowsisAndFoafNaut

and the screenshots:

http://www.dfki.uni-kl.de/~sauermann/2004_08_21_foafcamp/foafnaut_gnowsis.png

http://www.dfki.uni-kl.de/~sauermann/2004_08_21_foafcamp/monochrom.png

the smell of libraries

The future is the paperless office.

Actually, we have a project in our institute, which is exactly about this. The Virtual Office of the Future. Paperless office. So no libraries needed anymore?

But libraries were great, and I will tell you why. In my office there is a shelf of old file folders holding hundreds of documents and contracts. My office (which I share with Lars) used to be the secretaries' office a while ago, and they left these folders behind. I have never seen anybody come here to get them, so the contents are probably dead. A look at the spines of the folders shows titles like “project proposals” or the like.

Imagine the time when these were still in a central place: project reports, project proposals, accounting stuff. Another such place was the library, where all the books were: “PHP Cookbook”, “Stroustrup”, “Peopleware”, etc. Libraries have this special smell. It is not generated by the books, it is the sweat of anticipation of finding a book. It also smells when you peek over the shoulders of other people in public libraries to see what they read (to get inspired to borrow the book once they have returned it). For thousands of years we had to walk there to get to the information. This means you meet people there who search for the same stuff as you do. These people are interested in the same stuff as you. You may like to meet them.

Now imagine a virtual office of the future, paperless, with an Organisational Memory. Today, you just browse a website. Not in the future: I would like to walk to the library. I want to walk to our shared file folders on the file server. I want to see the other people who are committing to CVS at the moment. I want to see who is browsing this blog at the moment, and who else is browsing planetrdf.

That would be a nice facet of cyberspace. The smell of libraries.

Calendaring

I tried to get an overview of calendaring but got a headache instead.

There are several authoritative documents about getting iCal to work in RDF, but the schema is not consistent and some use other dialects, so pity me.

the mailing list
http://lists.w3.org/Archives/Public/www-rdf-calendar/

the wiki:
http://esw.w3.org/topic/RdfCalendar

the webpage
http://www.w3.org/2002/12/cal/

the good presentation
http://esw.w3.org/topic/RdfCalendarPresentation

this is the official test-case collection
http://www.w3.org/2002/12/cal/test/

the example by Masahide Kanzaki
http://kanzaki.com/works/2004/cal/concerts-tokyo.rdf

some of the cases use other ways to note dtend and dtstart, uff.

my Timezone…
http://www.w3.org/2002/12/cal/tzd/Europe/Vienna#tz

Principles of Boundaries in the Semantic Web

Introduction
While hacking here at www.dfki.de we ran into some “knee deep in the dirt” problems of Semantic Web querying and triple transmission.

We are of the opinion that a Semantic Web server should not expose “models”, and that you should not query by “passing a model uri”. That is not feasible in a world that is moving towards a global triplespace. So what we do instead is have one big virtual model that is internally built out of many different models that contain the data. These models can be backed by adapters (like a gnowsis adapter, or think of something like D2RQ).
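To make the “one big virtual model” idea a bit more concrete, here is a minimal Python sketch. It is not the gnowsis code: the class names `ListModel` and `VirtualModel`, the `urn:` identifiers and the use of plain string tuples as triples are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]


class ListModel:
    """A trivial in-memory model; stands in for an adapter-backed source."""

    def __init__(self, triples: Iterable[Triple]) -> None:
        self.triples = list(triples)

    def find(self, s: Optional[str], p: Optional[str],
             o: Optional[str]) -> Iterator[Triple]:
        # None acts as a wildcard for that position.
        for t in self.triples:
            if all(want is None or want == got
                   for want, got in zip((s, p, o), t)):
                yield t


class VirtualModel:
    """One model from the outside, many models on the inside."""

    def __init__(self, *models: ListModel) -> None:
        self.models = models

    def find(self, s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> Iterator[Triple]:
        # The caller never names a model URI; every sub-model is asked.
        for model in self.models:
            yield from model.find(s, p, o)


# Hypothetical data: two sources that both know something about one person.
mail = ListModel([("urn:msg:1", "dc:creator", "urn:person:leo")])
files = ListModel([("urn:file:7", "dc:creator", "urn:person:leo")])
server = VirtualModel(mail, files)
hits = sorted(server.find(o="urn:person:leo"))
print(hits)
```

The point of the sketch is only that the client asks one question and never has to know which sub-model held the answer.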

So from the outside you have a “Semantic Web Server” that answers your queries. The queries are in three different forms:

  1. find (s p o) patterns
  2. RDQL queries
  3. Chatty Bounded Descriptions

Find(s p o) is easy to understand, every hacker has called one of those before. RDQL is also well known: you pass a few patterns and get a result as an RDF subgraph or as variable bindings. The third, Chatty Bounded Descriptions, is the gnowsis way of handling “Concise Bounded Descriptions”. In short, you ask for data about a resource (by passing the url of the resource) and get back a subgraph of RDF around the resource, mostly literals and links to other resources.
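The bounded description idea can be sketched in a few lines of Python. This is only an illustration of the closure around anonymous nodes, not the actual gnowsis implementation: the function name `bounded_description`, the `_:label` notation for anonymous nodes and the sample data are assumptions.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def is_anon(node: str) -> bool:
    # Assumption for this sketch: anonymous nodes are written "_:label".
    return node.startswith("_:")


def bounded_description(triples: List[Triple], resource: str) -> Set[Triple]:
    """Collect the triples about `resource`, following anonymous objects."""
    result: Set[Triple] = set()
    todo = [resource]
    seen: Set[str] = set()
    while todo:
        subject = todo.pop()
        if subject in seen:
            continue
        seen.add(subject)
        for t in triples:
            if t[0] == subject:
                result.add(t)
                # Named objects stay as links; anonymous ones are expanded.
                if is_anon(t[2]):
                    todo.append(t[2])
    return result


# Hypothetical store with one anonymous node hanging off the resource.
store = [
    ("urn:person:leo", "foaf:name", '"Leo"'),
    ("urn:person:leo", "foaf:knows", "_:b1"),
    ("_:b1", "foaf:name", '"Sven"'),
    ("urn:person:leo", "foaf:workplaceHomepage", "http://www.dfki.de/"),
    ("http://www.dfki.de/", "dc:title", '"DFKI"'),
]
description = bounded_description(store, "urn:person:leo")
```

The named resource `http://www.dfki.de/` is returned as a link only, while the anonymous node `_:b1` is expanded, so the client never ends up holding an anonymous id it cannot ask about.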

The problem is: When you have anonymous nodes in your result, what do you do?

In case (3) it is no problem, as ChBD return a subgraph that has a closure around anonymous resources.

But in (2) and (1) you have a problem. Consider querying a remote store that returns an anonymous resource as part of the result, like
find (?person foaf:name “Leo”)
and the result is an anon identifier
?person = “234234:234234:243234”

Ok, if the server has just one big model then no problem – but what if the server is an aggregation engine, embedded in an enterprise integration environment?

We have this problem right now: we implemented the above search and it returns an anonymous resource, but just by looking at the anonymous resource it is not possible to guess where to look for more information about it. Sven Schwarz and I thought about writing a buffering system that holds the triples with anon resources, but a buffering system for outgoing triples would be a source of many bugs.
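Here is a small Python illustration of why the bare anonymous id is useless to the client. Everything in it is made up for the example: the two sources, the `_:a1` label and the `example.org` mailbox. It assumes, as is usual for anonymous nodes, that a label only means something inside the source that produced it.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Two hypothetical sources behind one aggregating server. Anonymous labels
# are only meaningful inside the source that minted them.
SOURCES: Dict[str, List[Triple]] = {
    "addressbook": [("_:a1", "foaf:name", '"Leo"'),
                    ("_:a1", "foaf:mbox", "mailto:leo@example.org")],
    "mailstore": [("_:a1", "foaf:name", '"Sven"')],
}


def find(s: Optional[str], p: Optional[str], o: Optional[str]) -> List[Triple]:
    """Aggregated find; None is a wildcard. The source name is lost."""
    pattern = (s, p, o)
    return [t
            for triples in SOURCES.values()
            for t in triples
            if all(w is None or w == g for w, g in zip(pattern, t))]


first = find(None, "foaf:name", '"Leo"')
person = first[0][0]                   # a bare anonymous id
follow_up = find(person, None, None)   # the server has to ask every source
names = sorted(o for _, p, o in follow_up if p == "foaf:name")
print(names)
```

The follow-up query mixes up two different people, because the id alone does not say which source it came from. Fixing that on the server side means remembering where every outgoing anonymous id was minted, which is exactly the buffering we wanted to avoid.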

So we decided to create the principles of boundaries

Principles of Boundaries
In the Semantic Web we always talk about models or chunks of RDF to do something with them.

The principle of boundaries is that a Semantic Server only returns closed boundaries, with no anonymous resources at the end. If you ask a “find (s p o)” question and get an anonymous resource at the end (o=anonid), that is your problem. The server does not have to answer “find (anonid p1 o1)”. Instead, you have to ask the question again in RDQL, with “SELECT (s p o) (o p1 o2)”.

So no anonymous resources take part in the communication between servers. They may still be passed in models, but only inside chunks of RDF.
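A rough Python sketch of a server that follows this principle. It is not RDQL and not our implementation: `find`, `select`, the `?var` and `_:label` notations and the sample store are assumptions chosen to keep the example short.

```python
from typing import Dict, Iterator, List, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical store: a document whose creator is an anonymous node.
STORE: List[Triple] = [
    ("urn:doc:1", "dc:creator", "_:b1"),
    ("_:b1", "foaf:name", '"Leo"'),
]


def is_anon(term: str) -> bool:
    return term.startswith("_:")   # assumed notation for anonymous ids


def is_var(term: str) -> bool:
    return term.startswith("?")


def match(pattern: Triple, binding: Binding) -> Iterator[Binding]:
    """Extend `binding` with every store triple that fits `pattern`."""
    for triple in STORE:
        new = dict(binding)
        for want, got in zip(pattern, triple):
            if is_var(want):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(want, got) != got:
                    break
            elif want != got:
                break
        else:
            yield new


def find(s: str, p: str, o: str) -> List[Binding]:
    """Single-pattern query; anonymous ids are not accepted from outside."""
    if any(is_anon(term) for term in (s, p, o)):
        raise ValueError("anonymous ids are not valid across the boundary")
    return list(match((s, p, o), {}))


def select(patterns: List[Triple]) -> List[Binding]:
    """Multi-pattern query; the join over anonymous nodes happens inside."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, b)]
    return bindings


rows = select([("urn:doc:1", "dc:creator", "?who"),
               ("?who", "foaf:name", "?name")])
```

Calling `find("_:b1", "?p", "?o")` raises a `ValueError`, while the two-pattern `select` gets to the name in one round trip. The anonymous id may still show up in a binding, but the client is not supposed to send it back.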

This approach of Boundaries does help us very much here to implement our Semantic Web Service. If you understand what I mean, you are a real hacker.

Semantic Web is alive!

IBM Semantics Toolkit

I heard of the IBM Semantics Toolkit and had to d/l it to give it a try.
http://www.alphaworks.ibm.com/tech/semanticstk

Roughly, it consists of three parts (see above site):

  • Orient – Visual editor
  • EODM – like Jena, but less
  • RStar – storage like kowari or Sesame

I focused on Orient, as it is Eclipse based and a visual editor. The other two may be interesting, but –
1. I am Jena based
2. they are not open source
3. RStar only works with IBM databases (it needs either IBM DB2 or IBM Cloudscape, which I don’t have and am too lazy to install. Cloudscape seems to be an embedded database, whatever, I stick to Orient first)

Orient

Installation is easy, read their readme.txt and unzip a few files. One mistake I made was to add Orient to my existing Java Development Eclipse – that does not work. You need the Eclipse platform, 3.0.

Then I switched to Window > Open Perspective > Other > Orient. Voila. Easy. Good looking. Reminds me of Protege, only neater. Nice typos: “Please select a Konwledge I/O”. First try: new project. A lot of stunning options come up, I keep pressing “next”. Ah, next experiment: import FOAF. Like Danny Ayers’ FOAF driven Development. WORKS. I see foaf:Person and other stuff.

Then I tried some editing. Disturbing things happen: when I create a new resource, to add its rdf:type and therefore its class, I have to enter the URI of the class. At least I can select the namespace, but the existing classes are not listed. Drag-and-drop doesn’t work either. Can this be true… let’s see. Yes, this is low tech. I am not even able to add properties to this resource, or maybe I have to try harder. Ok, it is not a resource editor, it is an ontology editor. On the ontology side, it is better. I can add properties to classes, again by entering the URI of the property by hand.
Undo does not work either, at least not always.

And then the obvious RDFS limit. The thing I often do is have one property with two classes as domain. I do this and add “foaf:depicts” as a property to “foaf:Project” (ok, not very wise). Boom: it has two rdfs:domain entries, which has a disadvantage (RDFS reads this as: any resource that uses the property is of the type of both classes). A property that is valid for either one class or the other is only expressible in OWL, so RDFS falls short here anyway, and maybe this is the desired behaviour.
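The effect of two rdfs:domain entries can be shown with a tiny Python sketch of the RDFS domain rule. It is a hand-rolled illustration, not what Orient or Jena do internally: `infer_types` and the `urn:` identifiers are invented, and it assumes foaf:depicts already had foaf:Image as its domain before I added foaf:Project.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def infer_types(schema: List[Triple], data: List[Triple]) -> Set[Triple]:
    """Apply the RDFS domain rule: using a property types its subject."""
    domains = [(s, o) for s, p, o in schema if p == "rdfs:domain"]
    inferred: Set[Triple] = set()
    for subject, prop, _ in data:
        for domain_prop, cls in domains:
            if prop == domain_prop:
                # Every domain entry applies, not just one of them.
                inferred.add((subject, "rdf:type", cls))
    return inferred


schema = [
    ("foaf:depicts", "rdfs:domain", "foaf:Image"),
    ("foaf:depicts", "rdfs:domain", "foaf:Project"),
]
data = [("urn:img:1", "foaf:depicts", "urn:person:leo")]
types = sorted(cls for _, _, cls in infer_types(schema, data))
print(types)
```

One use of foaf:depicts makes `urn:img:1` both an image and a project, which is hardly what I meant when I added the second domain.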

Export to RDF is good. The ontology comes out as clean, nice RDFS.

My bit: cool looking UI, Eclipse rules. Good stable IBM ground, the EODM framework looks like it can handle this. Very easy to install, no exceptions, everything works, end user compatible. But few features, so we have to wait and see until they have implemented all of Protege 🙂