Does anyone know of (or want to write) a tutorial for querying JSON in postgres for people who know SPARQL..?
sparql (12 out of 12)
- Wrote some of the longest SPARQL queries I've ever seen.
- Disambiguated all the things.
- Added all the Things to DBpedia.
- Got my very own BBC/things guid.
- Corrected typos in public ontology documentation.
- Discovered rewriting a script from scratch in Python is easier than learning Scala to make a couple of small changes.
- Wrote an IRC bot script that replied in the affirmative if anyone mentioned falafel in the team channel.
- Accidentally made every artist in the triplestore owl:sameAs David Bowie.
- Socially Aware Cloud Storage by Tim Berners-Lee (2011)
- WebDAV
- WebID
- MyProfile: unified user profile service which you sign into with WebID and stores data about you with FOAF. There's a hosted demo, but it's intended for people to host their own instance I think. Code on Github
- RWW.IO: a personal Linked Data store; looks like RemoteStorage but for LD. Code on Github
- Read/Write Linked Data by Tim Berners-Lee (2013)
- Not particularly helpful list of things that support editing Linked Data on the Web
- Identity Interoperability: need to read this properly.
Tried to write an SQL query and discovered I now only know SPARQL.
So SPARQL DESCRIBE queries are literally 8x slower than an equivalent CONSTRUCT { ?s ?p ?o . }. Who knew? Not me. Maybe this isn't news to anyone who knows anything about graphs and/or query languages.
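For anyone who wants to try the same comparison on their own endpoint, something like this would do it (the endpoint and resource here are placeholders, not the actual setup I was timing):
import time
import requests

# Placeholder endpoint and resource, purely to illustrate the shape of the comparison
endpoint = "http://example.org/sparql"
resource = "http://example.org/resource/x"

describe_q = "DESCRIBE <%s>" % resource
construct_q = "CONSTRUCT { <%s> ?p ?o . } WHERE { <%s> ?p ?o . }" % (resource, resource)

for label, query in [("DESCRIBE", describe_q), ("CONSTRUCT", construct_q)]:
    start = time.time()
    requests.get(endpoint, params={'query': query})
    print label, time.time() - start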
Live demo'd my blog sparql endpoint at the pub, pretty sure I'm the coolest
TA and Marker, Semantic Web Systems
School of Informatics, University of Edinburgh
Same as last year, but with twice as many students. Tirelessly answered student emails, made a few supplementary materials, mostly got the feedback sent on time; was nominated for a Best Teaching Award ^^ Oh, also organised a hands-on workshop this year, because I generally disagree with lectures.
Junior Data Architect, Linked Data Platform
BBC, London
Worked with an amazing team to boost the profits of Mr. Falafel in Shepherd's Bush and on the side helped with modelling the world as the BBC sees it, and learnt all of the corners to cut and ideologies to give up in order to develop linked data applications to improve the lives of people who don't know/care about linked data.
Key achievements include:
ARC2 SPARQL Endpoint
So Slog'd got stuck for a little while because the fast, nice-looking, somewhat magical SPARQL endpoint provided by ARC2 stopped working for no discernible reason.
I thought I'd try leaving it alone for a few weeks to see if it started working again by itself, but alas, it has not.
Everything is fine until I try to query for a specific predicate. (Specific objects or subjects are fine). The query runs, it just returns no results. I know the data is in there, because I can get it out with less specific queries. Also because I can see it all in the MySQL database on which it is based. When I left it, it was working fine.
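To illustrate, with stand-in URIs rather than my actual data:
# Stand-in URIs, just to show the shape of each query
works_subject   = "SELECT * WHERE { <http://example.org/post/1> ?p ?o }"            # returns results
works_object    = "SELECT * WHERE { ?s ?p \"Some title\" }"                         # returns results
fails_predicate = "SELECT * WHERE { ?s <http://purl.org/dc/terms/title> ?o }"       # runs, but returns nothing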
I'm going to kill the database and set it up again.
I did this by - and oh, it was joyous - going into the database settings and appending '2' to the name of the database. I then reloaded the endpoint page, and it set everything up by itself :)
I inserted two triples, and successfully queried for a specific predicate. So, it works. I wonder what will happen if I dump all my old data back in there? (I validated the raw file with all the triples in RDF/XML, and they're fine.)
I inserted the rest: LOAD <path/to/rdf.rdf> INTO <>
Ran a test query, aaaand... it's fine.
So what the hell was wrong with my other database? Perhaps I'll never know...
TA and Marker, Multi-Agent Semantic Web Systems
School of Informatics, University of Edinburgh
Marked courseworks where undergrad/taught masters students had to convert an open dataset to linked data and query it and stuff. It was pretty fun. Also helped students understand the course materials by email.
Remote SPARQL endpoints and RDF parsing
Didn't have much success talking to the Dydra SPARQL endpoint yesterday. I was briefly worried as there are no docs describing how to write back to the SPARQL endpoint, so I thought that was a write-off at once, but then I found a blog post from 2011 about how that has been introduced. Just not documented yet, apparently.
But to start with, I imported some test triples using the Web interface, into dydra.com/rhiaro/about-me and tried to read them back.
With ARC2, along the lines of:
include_once("ARC2/ARC2.php");
$config = array(
'remote_store_endpoint' => 'http://dydra.com/rhiaro/about-me/sparql'
);
$store = ARC2::getRemoteStore($config);
$query = 'select * where {?s ?p ?o} limit 20';
$rows = $store->query($query, 'rows');
But all I got back was an empty array. I tried with the DBpedia endpoint, which fell over a couple of times, but I got results... except... they were different from the results I got when I queried the endpoint directly through their interface. They seemed sort of metadata-y, rather than actual triples from the store. But it's hard to tell.
So I had a go with Python's RDFLib to try to figure out who had the problem.
import rdflib
# Register the rdfextras SPARQL processor and result plugins
rdflib.plugin.register('sparql', rdflib.query.Processor, 'rdfextras.sparql.processor', 'Processor')
rdflib.plugin.register('sparql', rdflib.query.Result, 'rdfextras.sparql.query', 'SPARQLQueryResult')

g = rdflib.Graph()
query = """
SELECT *
FROM <http://dydra.com/rhiaro/about-me/sparql>
WHERE {
    ?s ?p ?o .
} LIMIT 10
"""
for row in g.query(query):
    print row
And with that I got some triples... but not from the triplestore. It parsed, I presume, whatever semantic markup it could find in the page itself, the page you see when you visit dydra.com/rhiaro/about-me/sparql. Eg.
(rdflib.term.URIRef(u'https://s3.amazonaws.com/public.dydra.com/stylesheets/style.css?1337867890'),
rdflib.term.URIRef(u'http://www.w3.org/1999/xhtml/vocab#stylesheet'),
rdflib.term.URIRef(u'http://dydra.com/rhiaro/about-me/sparql'))
Do I have to send an accept header? Surely RDFLib is supposed to take care of that for me... Whatever.
If that's how you're going to play it, I'll just make the request with CURL directly. (I used Python's Requests because the Web says it's nicer than urllib2):
import requests
import rdflib
q = "select * where {?s ?p ?o}"
url = "http://dydra.com/rhiaro/about-me/sparql"
p = {'query': q}
h = {'Accept': 'application/json'}
r = requests.get(url, params=p, headers=h)
print r.text
Boom! Triples! Better yet... the ones in the triplestore! By default (with no Accept header set) they come through as RDF/XML, and it won't give me Turtle, so JSON seems to be the nicest looking option. That doesn't really matter though, as nobody really needs to look at it.
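Pulling the actual values back out of the JSON is easy enough too, assuming the standard SPARQL JSON results layout:
import json
import requests

r = requests.get("http://dydra.com/rhiaro/about-me/sparql",
                 params={'query': "select * where {?s ?p ?o} limit 20"},
                 headers={'Accept': 'application/json'})
# Standard SPARQL JSON results: the rows live in results.bindings
for row in json.loads(r.text)['results']['bindings']:
    print row['s']['value'], row['p']['value'], row['o']['value']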
I guess I'll try CURL with PHP for Slog'd, and just parse it with ARC2. It seems a shame that ARC2's remote endpoint querying didn't Just Work with Dydra, but I don't have the time or energy to try to figure out why right now.
Then I need to figure out if I can write to it or not. If I can't... In the name of progressing, I'll have to ditch it and use ARC2's built in MySQL-based triplestore.
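If it does support writes, I'd expect it to take a SPARQL Update POSTed in the usual way; whether Dydra actually accepts this, and at what URL, is exactly what I still need to find out:
import requests

update = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA { <https://rhiaro.co.uk/about#me> foaf:nick "rhiaro" . }
"""
# The endpoint URL here is a guess, and Dydra would presumably want authentication as well
r = requests.post("http://dydra.com/rhiaro/about-me/sparql", data={'update': update})
print r.status_code
print r.text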
Update: Parsing the results with RDFLib
Because I want to understand exactly what Dydra is giving back to me, I wanted to quickly parse the results and use them like I should be able to use a graph.
The XML that Dydra is returning is not straightforward RDF/XML that RDFLib can just understand. It's a SPARQL Results document. It looks like this:
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="s"/>
    <variable name="p"/>
    <variable name="o"/>
  </head>
  <results>
    <result>
      <binding name="s"><uri>https://rhiaro.co.uk/about#me</uri></binding>
      <binding name="p"><uri>http://xmlns.com/foaf/0.1/homepage</uri></binding>
      <binding name="o"><uri>https://rhiaro.co.uk</uri></binding>
    </result>
    ...etc
  </results>
</sparql>
So later I either have to work out how to make RDFLib understand this, or make RDFLib understand the JSON alternative. I really don't want to have to write a custom parser to deal with it.
Update: Solved
Turns out it's as simple as using CONSTRUCT instead of SELECT in the query. Rookie mistake? I don't know. I feel like RDFLib ought to be able to handle the SPARQL results format somehow though.
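That is, something along these lines, since CONSTRUCT gives back plain RDF/XML that RDFLib can parse straight into a graph:
import rdflib
import requests

q = "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }"
r = requests.get("http://dydra.com/rhiaro/about-me/sparql",
                 params={'query': q},
                 headers={'Accept': 'application/rdf+xml'})

g = rdflib.Graph()
g.parse(data=r.text, format='xml')  # RDF/XML this time, not a SPARQL results document
for s, p, o in g:
    print s, p, o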
Owning your Linked Data
Thinking about the options for hosting the Linked Data for Slog'd implementations, and also thinking about hosting personal and shared Linked Data in general, and Linked Data authored by an Agent who isn't the subject, because this is likely to be appropriate in a lot of cases.
For Slog'd it's a bit easier, because people setting up a Slog'd implementation would have a bit of knowledge of / interest in the Semantic Web and data ownership, in theory. Such people will probably already have a unique URI (like a WebID) or the ability to set one up. The data is also most likely to be authored by the subject.
I'm thinking about it in broader terms because (PhD-related) I want content creators (who don't know or care about the Semantic Web) to publish Linked Data about their involvement in the creation of different digital media and online creative works.
Where does this data live, to give them absolute and definitive control over it? If they can author data about anyone and anything (which of course, they can) how do we verify what they say when there are no conflicts, or deal with conflicts that arise? Is content attribution data stored in a giant graph that anyone can update?
How do we handle IDs? Is there a central service for generating URIs for people who can't use their own domains (like the MyProfile demo)? How do we link - or not - multiple identities of the same person, according to what they want other people to know?
Things found during this braindump:
Putting a blog on the Semantic Web
It's easy to see why not many people do. I say not many, because I'm sure there are people using the various RDF Drupal or Wordpress plugins. But I don't really know if they count, because that's really just people annotating their content for SEO purposes, and maybe a bit of interlinking with DBpedia to let visitors find out a bit more about topics of interest.
But what I'm really talking about is the whole blog running from a triplestore, with a publicly accessible SPARQL endpoint.
All the publicly accessible SPARQL endpoints I can find are big 'useful' datasets that people might want to mash up with other things. Archives of data. Not live or frequently updated stuff. Certainly not the contents and metadata of a personal blog.
So what's the point?
There's not a lot else 'on' the Semantic Web to tap into. There's no hidden Semantic Blogosphere that would yield great worth if I could only tap into it. There are no smart agents traversing the Semantic Web and aggregating interesting blog content for intelligent readers (are there?).
And there never will be, if nobody publishes their content this way.
So I suppose I'd better get on with it.
My point is, it's easy to see why nobody does.
Scotland Public Notices experiments
Tell Me Scotland publish public notices for things like traffic, planning permissions... And they have a SPARQL endpoint!
I'm hacking around with the ultimate goal of creating an interface that allows people to generate a GeoRSS feed for a particular area. Ally of GreenerLeith suggested this, so that they can feed it into their own apps. A stage beyond that is to smoosh the whole lot into a Wordpress plugin, to make it accessible to anyone (who uses WP).
So far I've got the notices on an OpenStreetMap. I haven't had a whole lot of time, but will make more soon.
I'm using the PHP library ARC2 to deal with the linked data.
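The plugin itself will be PHP with ARC2, but roughly the shape of query I have in mind for pulling out notices with coordinates (sketched in Python here, with a made-up endpoint URL and made-up predicates rather than whatever the TMS data really uses):
import json
import requests

# Made-up endpoint URL and predicates; check what the TMS data actually uses
endpoint = "http://example.org/tellmescotland/sparql"
q = """
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?notice ?title ?lat ?long WHERE {
    ?notice dct:title ?title ;
            geo:lat ?lat ;
            geo:long ?long .
} LIMIT 50
"""
r = requests.get(endpoint, params={'query': q}, headers={'Accept': 'application/json'})
for row in json.loads(r.text)['results']['bindings']:
    print row['title']['value'], row['lat']['value'], row['long']['value']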
NB. the TMS endpoint is, at the moment, flaky at best. I think they're working on this.