stop fumbling the semantic web, do science

A short note to myself and the community:

Stop fumbling around with the semantic web; do quality science.

A prototype for a search engine with a bad user interface, an implementation of an RDF database that only half works, an ontology that is never used – we all know these projects.

Our discipline is a crossover: we need results from artificial intelligence, web 2.0, usability, personalization, databases, data integration, software engineering…

So – science would be to concentrate on one aspect and improve that, for example to focus on a scalable RDF database. You develop a scalable algorithm and prove in a test setup that it works – voila. But then it takes YEARS until you yourself or others can use this result in other projects. Saying “we need named graphs” is far away from having an RDF store that supports them in a scalable way, and we often underestimate that distance.

So, I should concentrate on writing down the good ideas we have and wait – YEARS – until I can benefit from my own ideas through software written by somebody else. Like TimBl using Firefox.
😉

making sesame2 SPARQL protocol conformant

At the moment we need a SPARQL protocol conformant interface to Sesame2, and as there is none that I know of, the power of open source allows us to write one.
openrdf

My first question is: did somebody already write a servlet that maps Sesame2 servers to the SPARQL protocol?

At the moment Sesame2 does not support full SPARQL querying, but it will soon. We don’t have to insist on SPARQL as the query language; we can pass in SeRQL queries and treat them as if they were SPARQL, but we have to start with a conformant servlet 🙂
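To make the idea concrete, here is a rough sketch of what such a read-only query servlet could look like. The servlet plumbing is standard javax.servlet; the openrdf calls follow my current reading of the Sesame2 API and will probably need adjusting to whatever the alpha code actually exposes, and the repository setup is omitted.

```java
import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.openrdf.query.QueryLanguage;
import org.openrdf.query.TupleQuery;
import org.openrdf.query.resultio.sparqlxml.SPARQLResultsXMLWriter;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;

public class SparqlReadServlet extends HttpServlet {

    private Repository repository; // assumed to be set up in init(), omitted here

    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // The SPARQL protocol's query operation passes the query in the "query" parameter.
        String queryString = req.getParameter("query");
        if (queryString == null) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "missing query parameter");
            return;
        }
        resp.setContentType("application/sparql-results+xml");
        try {
            RepositoryConnection con = repository.getConnection();
            try {
                // Until full SPARQL support lands, QueryLanguage.SERQL could be used
                // here and the incoming query treated as if it were SPARQL.
                TupleQuery query = con.prepareTupleQuery(QueryLanguage.SPARQL, queryString);
                // Stream the bindings out in the SPARQL Query Results XML Format.
                query.evaluate(new SPARQLResultsXMLWriter(resp.getOutputStream()));
            } finally {
                con.close();
            }
        } catch (Exception e) {
            throw new ServletException(e);
        }
    }
}
```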

Then some longer questions I also asked on the Sesame developer mailinglist:

We think that a SPARQL protocol conformant HTTP servlet is most important for any use of Sesame2 and are willing to invest 10 hours a week into this – more precisely, a clever student worker will. We hope to get this done by the end of September.

We would implement a SPARQL protocol conformant query server and a SPARQL protocol conformant query client (issue tracker) for the reading operations of an HTTPSail. For updates of the model, we would stick to the current implementation in the latest CVS of Sesame2.
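The client side of the protocol is plain HTTP. A minimal sketch, using only the JDK (the endpoint URL and helper class are made up for illustration); turning the XML back into bindings would then be the job of a SPARQL/XML result parser in the HTTPSail:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class SparqlHttpClient {

    public static String query(String endpoint, String sparql) throws Exception {
        // The protocol's query operation: GET with a URL-encoded "query" parameter.
        URL url = new URL(endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8"));
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestProperty("Accept", "application/sparql-results+xml");

        StringBuilder result = new StringBuilder();
        BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), "UTF-8"));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                result.append(line).append('\n');
            }
        } finally {
            in.close();
        }
        // SPARQL Query Results XML, to be parsed on the client side
        return result.toString();
    }
}
```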

want to know what the sparql protocol is?

I understand that these are MANY questions; I tried to think of all the calamities we are going to face in the next months. I expect that some hackers out there have already handled half of them, so don’t hesitate to write me, comment here, or post to the Sesame devel list.

* The org.openrdf.sesame.server.http.RepositoryServlet is not conformant to the SPARQL protocol as defined in the WSDL, is it?
http://www.w3.org/TR/rdf-sparql-protocol/#query-bindings-http
The protocol described in org.openrdf.sesame.server.http.protocol.txt does not say anything about SPARQL.

* If not, does anybody know how to generate stubs for the servlets automatically (so that they strictly conform to the protocol)?

* If not, we would examine the Jena/Joseki implementation, as it serves as the reference implementation.

* When we implement a SPARQL conformant servlet – can we put it directly into the latest CVS as org.openrdf.sesame.server.http.SparqlReadServlet, to get the best uptake and feedback possible?

* If yes, is there also a parser for query results that can be used in the client-side HTTP Sail to read results written by the server?

* What is the status of the HTTP client? Did anybody do anything since we last mailed? If yes, please add comments to this ticket:
http://www.openrdf.org/issues/browse/SES-205

* Is the query string already part of the Query object? Jeen said this is a prerequisite for this hack. If not, Jeen: could you do this? It is such a core thing that I don’t want to touch it, and for you it’s probably only 50 lines of code.
I mean the solution 1) suggested here:
http://www.openrdf.org/issues/browse/SES-205#action_10533

* Can Sesame2 serialize query results according to the SPARQL protocol?
I see QueryResultFormat.SPARQL, which would indicate that.

* last but not least: any news about SPARQL query support?

* Do you have a debug environment to test the existing servlets from org.openrdf.sesame.server.http?

* Does the WebClient work? (the code looks SOOO COOL! spring rocks)
– I cannot find any code in the webclient project that actually *changes* triples… hm.

* When Sebastian starts hacking, whom can he jabber/icq for help?

going to burningman – who will be there?

Ok, who from the Semantic Web guys and girls is going to Burning Man this year?

I am going, and we are blogging here about that:

I tried to find you guys by using search engines, but I am not witty enough to get the right results, so blogging and asking this question is probably ok for today.

Here is the story of how I tried to find you:

This time I swoogled the semantic web to find out who lives near SFO, which would suggest they might go to burningman.

My first try was to go to swoogle.com, a very interesting website where you may lose hours during your office time. Then I found the search engine at swoogle.umbc.edu.

I tried to search for:

  • ns:foaf SFO but this broke. Internal Server Error in /work/swoogle/www/swoogle/3.1/components/com_frontpage/writer.search.php on line 61
  • perhaps not search documents, search ontologies instead. SFO – no results
  • just entering francisco returned too much.
  • refining to ns:foaf francisco, zero.
  • at this point, I registered for an account at swoogle, perhaps then more. Hm, no.

Ok, perhaps the search syntax is wrong. Let’s go to Intellidimension’s Semantic Web Search engine at www.semanticwebsearch.com.

ok, change of tactics: shoot straight at the target of burning man.

  • semanticwebsearch for burningman. Brings some RSS feeds. Ok, those are bloggers. I need semweb bloggers though.
  • perhaps some foaf person said something in interests or so? search, nothing.
  • with “burning man” we get two livejournal users.

ok, let’s see what the market leader does.

ok, so I hadn’t heard of these guys before, and I see that this use case is interesting. Anyway:

Ok, who from the Semantic Web guys and girls is going to Burning Man this year?

Talk on Semantic Desktop at ZGDV, 19.10.2006

There is a congress on “Semantic Web und Wissenstechnologien” (Semantic Web and Knowledge Technologies) in Darmstadt at ZGDV on 19th October 2006.

There are many interesting people from Germany giving talks there: Benjamin Novack from the hacker side, Georg Lausen, Andreas Kupfer, Michael Stollberg from Innsbruck, Torsten Priebe from Capgemini Austria, Holger Rath from Empolis, Achim Steinacker from intelligent views, and myself, Leo Sauermann.

If you are a university student or academic, there is a reduced conference fee of 120€/290€.

semwebzgdv

trying out eclipse RCP and RDF

We created a sample Eclipse RCP application that shows how to use Sesame and a bit of gnowsis inside SWT and Eclipse. We plan to see if we can benefit from Eclipse RCP in Nepomuk. More about this hack here on the gnowsis site.

rcp-pimo

This is what we did:
We started at 17:30 by downloading Eclipse 3.2 and slavishly following the Hello World Tutorial here.

ok, all worked; the empty “Hello World” deployment thingy with .exe file and so on weighs 7MB. wuff, but ok.

Then Leo decided to rename the packages from “semanticdesktop” to “com.example.semanticdesktop”, and that was the last time we saw our Hello World. Shoot, restart.

After Benny took control, he changed the plugin’s first window title from “Hello World” to “Semantic Desktop”.

Hours later….

ok, now we try the “mail demo” and extend it with an RDF view showing the PIMO tree. First problem: we don’t want to import openrdf directly but instead import it as an OSGi bundle (woa, cool). Hm, the best approach seems to be the wizard “plug-in from existing JAR archives”; that generates useful output.

Result at 19:30: we managed to include Sesame and gnowsis as Eclipse OSGi bundles and were able to load the PIMO ontology language from a local file and display it in a tree. It’s a lot of work but it looks cool. So, beer now.
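For the curious, here is roughly what the tree part of the hack looks like. Only the SWT/JFace plumbing is the real thing; the PimoClassNode type and its methods are hypothetical stand-ins for whatever the gnowsis/PIMO API actually offers.

```java
import java.util.List;

import org.eclipse.jface.viewers.ITreeContentProvider;
import org.eclipse.jface.viewers.LabelProvider;
import org.eclipse.jface.viewers.TreeViewer;
import org.eclipse.jface.viewers.Viewer;
import org.eclipse.swt.SWT;
import org.eclipse.swt.widgets.Composite;

public class PimoTreeView {

    /** Hypothetical node in the PIMO class tree; gnowsis would supply the real thing. */
    public static class PimoClassNode {
        private final String label;
        private final List<PimoClassNode> subClasses;
        public PimoClassNode(String label, List<PimoClassNode> subClasses) {
            this.label = label;
            this.subClasses = subClasses;
        }
        public String getLabel() { return label; }
        public List<PimoClassNode> getSubClasses() { return subClasses; }
    }

    public TreeViewer createViewer(Composite parent, PimoClassNode root) {
        TreeViewer viewer = new TreeViewer(parent, SWT.SINGLE | SWT.BORDER);
        viewer.setContentProvider(new ITreeContentProvider() {
            public Object[] getElements(Object input) {
                return getChildren(input);
            }
            public Object[] getChildren(Object parentElement) {
                return ((PimoClassNode) parentElement).getSubClasses().toArray();
            }
            public Object getParent(Object element) {
                return null; // good enough for a read-only demo tree
            }
            public boolean hasChildren(Object element) {
                return !((PimoClassNode) element).getSubClasses().isEmpty();
            }
            public void dispose() {}
            public void inputChanged(Viewer v, Object oldInput, Object newInput) {}
        });
        viewer.setLabelProvider(new LabelProvider() {
            public String getText(Object element) {
                return ((PimoClassNode) element).getLabel(); // rdfs:label of the class
            }
        });
        viewer.setInput(root); // root of the PIMO class hierarchy
        return viewer;
    }
}
```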

Cypher: a product that translates language to RDF

cypher by monrai


The Cypher™ alpha release is a program which generates the .rdf (RDF graph) and .serql (SeRQL query) representation of a natural language input. With robust definition languages, Cypher’s grammar and lexicon can quickly and easily be extended to process highly complex sentences and phrases of any natural language, and can cover any vocabulary. Equipped with Cypher, programmers can now begin building next generation semantic web applications that harness what is already the most widely used tool known to man – natural language.

So, a company is entering the market! Hooray. Best wishes to them, and I sure want to check if that baby can help gnowsis.

Validating ontologies in Gnowsis and Named Graphs for provenance

In the latest SVN version of gnowsis, we have various improvements in handling ontologies, validating them, and so on.

All information about that is here:

Managing Domain Ontologies

Adding, removing, and updating ontologies is implemented in the PimoService. A convenient interface to these functions is implemented in the web GUI:

A list of ontologies that work with gnowsis is at DomainOntologies.

The implementation of domain ontologies is done using named graphs in Sesame.
Read on at Named graphs in Pimo.

Domain ontologies are added/deleted/updated using methods of the PimoService.
You can interact directly with the triples of an ontology in the store, but then you have to take care of inference and the correct context yourself.
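As a rough illustration of the named graph idea (not the actual PimoService code), each domain ontology can be loaded into its own Sesame context, named by the ontology URI, so it can later be updated or removed as a unit. The openrdf calls below are the Sesame2 API as I understand it; the ontology file and URI are made up for the example.

```java
import java.io.File;

import org.openrdf.model.URI;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.rio.RDFFormat;

public class DomainOntologyStore {

    public void addOntology(Repository repo, File ontologyFile, String ontologyUri)
            throws Exception {
        RepositoryConnection con = repo.getConnection();
        try {
            // The ontology URI doubles as the name of its graph/context.
            URI context = repo.getValueFactory().createURI(ontologyUri);
            con.add(ontologyFile, ontologyUri, RDFFormat.RDFXML, context);
        } finally {
            con.close();
        }
    }

    public void removeOntology(Repository repo, String ontologyUri) throws Exception {
        RepositoryConnection con = repo.getConnection();
        try {
            URI context = repo.getValueFactory().createURI(ontologyUri);
            con.clear(context); // drops exactly the triples of that ontology
        } finally {
            con.close();
        }
    }
}
```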

Validation of PIMO Models – PimoChecker

The semantics of the PIMO language allow us to verify the integrity of the data. With plain RDF/S semantics, such verification is not possible: for example, setting the domain of the property knows to the class Person and then using this property on an instance Rome Business Plan of class Document creates, under RDF/S, the new information that the document is also a Person. In the PIMO language, domain and range restrictions are instead used to validate the data.
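To illustrate that RDF/S behaviour, here is a minimal Jena sketch (the URIs are made up for the example): a domain declaration infers a new type instead of flagging an error.

```java
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.vocabulary.RDF;
import com.hp.hpl.jena.vocabulary.RDFS;

public class RdfsDomainDemo {
    public static void main(String[] args) {
        String ns = "http://example.org/pimo#";
        Model m = ModelFactory.createDefaultModel();

        Resource person = m.createResource(ns + "Person");
        Resource document = m.createResource(ns + "Document");
        Property knows = m.createProperty(ns + "knows");
        m.add(knows, RDFS.domain, person);

        // A modeling error: 'knows' used on a Document.
        Resource plan = m.createResource(ns + "RomeBusinessPlan");
        Resource paul = m.createResource(ns + "Paul");
        m.add(plan, RDF.type, document);
        m.add(plan, knows, paul);

        // RDFS inference happily concludes that the document is a Person...
        InfModel inf = ModelFactory.createRDFSModel(m);
        System.out.println("Document is a Person: " + inf.contains(plan, RDF.type, person));
        // ...whereas the PimoChecker treats the same situation as a validation error.
    }
}
```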
The PIMO is checked using a Java object called PimoChecker, which encapsulates a Jena reasoner to do the checking and also does a few more tricks:

The following rules describe what is validated in the PIMO; a formal description is given in the gnowsis implementation’s PIMO rule file, and a small sketch of one such check follows after the list.

  • All relating properties need inverse properties.
  • Domain and range of relating and describing properties are checked.
  • Domain and range of rdf:type statements are checked.
  • Cardinality restrictions are checked using the Protégé statements.
  • rdfs:label is mandatory for instances of “Thing” and for classes.
  • Every resource that is used as the object of a triple has to have an rdf:type set; this is a prerequisite for checking domains and ranges.
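To make the flavour of these checks concrete, here is a small sketch of the rdfs:label rule implemented directly against the Jena API. This is not the gnowsis PimoChecker code; the real check lives in the PIMO rule file and restricts the rule to instances of Thing and to classes, while this sketch simply looks at every typed resource.

```java
import java.util.ArrayList;
import java.util.List;

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ResIterator;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.vocabulary.RDF;
import com.hp.hpl.jena.vocabulary.RDFS;

public class LabelCheck {

    /** Returns an error message for every typed resource that has no rdfs:label. */
    public static List<String> missingLabels(Model model) {
        List<String> errors = new ArrayList<String>();
        ResIterator typed = model.listSubjectsWithProperty(RDF.type);
        while (typed.hasNext()) {
            Resource r = typed.nextResource();
            if (!r.hasProperty(RDFS.label)) {
                errors.add("missing rdfs:label on " + r.getURI());
            }
        }
        return errors;
    }
}
```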

The above rules check for semantic modeling errors made by programmers or human users.
The following rules check whether the inference engine correctly created the closure of the model:

  • Every statement whose predicate has an inverse defined requires another triple in the model representing the inverse statement.

The rules only work when the language constructs and the upper ontology are part of the model that is validated. For example, validating Paul’s PIMO is only possible when PIMO-Basic and PIMO-Upper are available to the inference engine; otherwise the definitions of the basic classes and properties are missing.

The validation can be used to restrict updates to the data model so that only valid data can be stored in the database, or the model can be validated on a regular basis after changes were made. In the gnowsis prototype, validation was activated during automatic tests of the system, to verify that the software generates valid data in different situations.

Ontologies are also validated during import to the ontology store. Before a new ontology is validated, its import declarations have to be satisfied. The test begins by building a temporary ontology model, to which first the ontology under test and then all imported ontologies are added. If an import cannot be satisfied, because the required ontology is not already part of the system, either the missing part can be fetched from the internet using the ontology identifier as URL, or the user can be prompted to import the missing part first. When all imports are satisfied, the new ontology under test is validated and added to the system. A common mistake at this point is to omit the PIMO-Basic and PIMO-Upper import declarations. By using this strict testing of ontologies, conceptual errors show up at an early stage. Strict usage of import declarations makes dependencies between ontologies explicit, whereas current practice in the RDF/S-based semantic web community relies on many implicit imports.
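A condensed sketch of that import-checking workflow, using Jena models. The OntologyStore and Checker interfaces here are illustrative stand-ins for the real gnowsis PimoService/PimoChecker; the real classes may look quite different.

```java
import java.util.List;

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class OntologyImporter {

    /** Hypothetical store of already imported domain ontologies, keyed by URI. */
    public interface OntologyStore {
        boolean contains(String ontologyUri);
        Model get(String ontologyUri);
        void add(String ontologyUri, Model ontology);
    }

    /** Hypothetical wrapper around the PimoChecker validation. */
    public interface Checker {
        List<String> validate(Model model); // returns a list of error messages
    }

    public void importOntology(String uri, Model candidate, List<String> importedUris,
            OntologyStore store, Checker checker) throws Exception {
        // Build a temporary model: the ontology under test plus all its imports.
        Model temp = ModelFactory.createDefaultModel();
        temp.add(candidate);
        for (String importedUri : importedUris) {
            if (!store.contains(importedUri)) {
                // Missing import: fetch it from the web using the URI as URL,
                // or ask the user to import it first. Here we simply give up.
                throw new Exception("unsatisfied import: " + importedUri);
            }
            temp.add(store.get(importedUri));
        }
        // Only an ontology whose temporary model passes validation is added.
        List<String> errors = checker.validate(temp);
        if (!errors.isEmpty()) {
            throw new Exception("ontology not valid: " + errors);
        }
        store.add(uri, candidate);
    }
}
```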