Over the last two months we moved the gnowsis search services to SPARQL. Our problem was that the standard ARQ implementation was slow and did not work in a named graph scenario.
A fine twist of history brought Richard Cyganiak to HP Labs for a while, and there he hacked together sparql2sql, a mapping of SPARQL to the Jena database schema with support for named graphs.
I asked him if I could use it in gnowsis, and he was quite happy to have this as a test case. The result is that Richard added a special framework for fulltext search, and I hacked some stuff around MySQL’s fine FULLTEXT index functions (if you don’t know them: they are similar to Lucene and can do relevance ranking, query expansion and so on).
The outcome: an hour ago I converted the SQL index to the new fulltext format and got hit in the face by the new response times.
SPARQL of this kind is answered practically instantly (10-20 ms):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?x ?label WHERE
{ GRAPH ?source {
    ?x rdfs:label ?label .
    FILTER REGEX(?label, "test", "i")
} }
But the really astonishing thing is that SPARQL of this kind is also fast as hell (10-500 ms):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?x ?p ?label WHERE
{ GRAPH ?source {
    ?x ?p ?label .
    FILTER REGEX(?label, "test", "i")
} }
In simple words: this is a fulltext scan over all properties of all statements. Don’t be bothered by the warmup time; the thing needs about 10-20 seconds to warm up, but after that it’s great.
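To give a rough idea of why such a filter can be answered from an index, here is a minimal Python sketch of how a simple REGEX filter could be turned into a MySQL FULLTEXT lookup. It is not the sparql2sql code: the table name `statements`, the columns `subj`, `prop` and `obj`, and the function `regex_filter_to_sql` are all made up for illustration. Note also that a FULLTEXT match is word-based, so it is only an approximation of a regex substring match.

```python
import re

# Hypothetical table and column names, chosen for illustration only.
# They are NOT the real Jena schema and NOT what sparql2sql generates.
TABLE = "statements"
OBJECT_COLUMN = "obj"

# A pattern made of word characters only contains no regex operators,
# so a word index can plausibly answer it.
_PLAIN_WORD = re.compile(r"^\w+$")


def regex_filter_to_sql(pattern, flags=""):
    """Return (sql, params) for FILTER REGEX(?label, pattern, flags).

    Assumption: a plain, case-insensitive word is sent to the FULLTEXT
    index; everything else falls back to MySQL's slower REGEXP operator.
    """
    select = f"SELECT subj, prop, {OBJECT_COLUMN} FROM {TABLE} WHERE "
    if _PLAIN_WORD.match(pattern) and "i" in flags:
        # Word-based index lookup; %s is a driver-side placeholder.
        return select + f"MATCH({OBJECT_COLUMN}) AGAINST (%s)", (pattern,)
    # Fallback: row-by-row regular expression match, no index involved.
    return select + f"{OBJECT_COLUMN} REGEXP %s", (pattern,)


if __name__ == "__main__":
    print(regex_filter_to_sql("test", "i"))
    print(regex_filter_to_sql("^te.t$", "i"))
```

The pattern is passed as a parameter rather than pasted into the SQL string, so the search term never has to be escaped by hand.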
Triple count: 371994
For the freaks, we will probably build a little executable jar that packs everything into a nice demo. For the real freaks:
http://gnowsis.opendfki.de/repos/gnowsis/trunk/sparql2sql/