
In reply to: https://rhiaro.co.uk/2018/04/thewebconf-timbl

My transcription of TimBL's talk in the Web History track at TheWebConf 2018.

So much stuff I could bore you with.

I'll tell you about some random bits, some of the other systems.

I start with the tip of the hat to Mum and Dad for bringing me up as two programmers, some of the earlier programmers. Mum and Dad met designing and working on the team that put together the Ferranti Mark I computer, the commercialization of the Ferranti Mark I. The spirit then was very much that all computers were the same and whatever you can do with one computer you can do with all and it's really up to your imagination. They imagined they'd have early computers translating to Russian and back by the end of the week. Figuring out that some things were easy and some things were more difficult.

One of the early themes was my Dad talking to people about computers, explaining things to people, using intersecting water jets to explain how binary worked for people.

One of the things he tried to explain was the difference between what people can do and what computers can do. People can do that random association, they can think about something technical they learned and every time they smell the cheese they were eating when they learned it (at the WebConf in Lyon) then they can make that connection, random connection, between cheese and the idea. Computers couldn't do that.

One of the early fascinations was with things that could do that. The first time I went to CERN in 1980 was write a program called ENQUIRE. Two versions, one for PC, COMPAQ portable luggable PC. This was a notepad thing, basically ran on a terminal so you could look at a notepad, create a note and at the bottom of the note would be links, you could add extra links, browse through the thing, only go through it by following links, start with the homepage and organise your stuff with an hierarchy, remember the place, mark this point, go to another place and say you want to link to this place, mark and link, a link would store to the place I'd previously marked and when you link between the two it would give you a sentence you had to fill in from a popup menu with the relationship between the two things. This thing in this note is described by this or had things like 'created by', and so on. Pretty semantic webby in a sense, in that you had a choice of different predicates. Useful for describing particular projects, the sort of thing a British programmer would end up going to Switzerland to work to help program.
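The mark-and-link workflow with a fixed menu of relationships can be pictured with a small Python sketch. This is only an illustration and not ENQUIRE's actual design: the class names, the direction of the link, and the two-item predicate menu (taken from the examples mentioned above) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Note:
    """A note whose links each carry a named relationship (hypothetical model)."""
    title: str
    links: List[Tuple[str, "Note"]] = field(default_factory=list)


class Notebook:
    # Assumed menu of relationships; the talk only mentions these two.
    PREDICATES = ("described by", "created by")

    def __init__(self) -> None:
        self.marked: Optional[Note] = None

    def mark(self, note: Note) -> None:
        """Remember a note so that a later link can point at it."""
        self.marked = note

    def link(self, source: Note, predicate: str) -> None:
        """Link `source` to the previously marked note with a chosen predicate."""
        if self.marked is None:
            raise ValueError("nothing has been marked yet")
        if predicate not in self.PREDICATES:
            raise ValueError(f"unknown relationship: {predicate!r}")
        source.links.append((predicate, self.marked))


# Hypothetical notes, for illustration only.
notebook = Notebook()
author = Note("A contract programmer")
module = Note("Control system module")
notebook.mark(author)
notebook.link(module, "created by")
for predicate, target in module.links:
    print(module.title, predicate, target.title)
```

Restricting the predicate to a menu is what makes the links "semantic webby": every link says what kind of connection it is, not just that one exists.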

I came in as a contract programmer for 6 months, during that 6 months we had to get up to speed. So the idea that any organisation ought to have some sort of system for allowing people to write down what they've done, explain why they did it and then go away, at the end of the student's summer or whatever. One of the critical things was you could make arbitrary links, when you realised there was a connection between two things you could capture it.

Spreadsheets were ridiculously popular, a huge amount of human knowledge is.. there's a great TED talk, someone who looked at how much knowledge is locked up in spreadsheets, and nobody who wrote the spreadsheet has the faintest idea of what it means, and people use the spreadsheet as a critical part of the business end up abandoning parts of it because they don't understand it. The spreadsheet can't make arbitrary connections between different things.

When you say, when we tell it that I want this thing, I put the same formula in all these rows it doesn't even realise you've got an array, you just made a lot of things that are the same, but it doesn't realise that you're talking about an array of objects let alone what the array of objects actually represents or the semantic relationship between the columns.

A lot of fascination with trying to capture the semantics of things and of arbitrary links.

I won't talk about the history of the Web itself because it's already been talked about, so I'll just pick some things.

One thing about the architecture of the Web. The crucial thing about the Web you realise you've been told is taking the Internet stuff, the idea of getting things over the Internet world of documents which existed before Wikipedia, and the world of hypertext. The world of hypertext, at the time, you could make links on my little hypertext note thing between different files, you could typically make links within a CD-ROM but you couldn't make links between different CD-ROMs, they were used for manuals and things, there was a whole thing there.

I hadn't come across Ted Nelson at all, not until afterwards.

The idea of Web architecture, if you think about URLs, http://yadayad/yada - which is the most important symbol in the URL? The key one is the #. Everything hangs around the hash. The thing that connects one world to the other world and makes them work together.

On the left hand side of the URL that's all the name of a document. And then there's a hash and after that means: okay when you've got that document, within that document - whatever that document is, whatever system whatever language - then that thing. So it's global identifier for a document, followed by # local identifier within the document.
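The split at the hash can be shown with Python's standard `urllib.parse.urldefrag`. This is a minimal sketch; the helper name `split_at_hash` and the example URL are made up for illustration.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_at_hash(url: str) -> Tuple[str, str]:
    """Split a URL into (global document identifier, local identifier)."""
    document, local = urldefrag(url)
    return document, local


# Hypothetical URL, not one from the talk.
document, local = split_at_hash("http://example.org/notes/talk.html#section-2")
print("document:", document)  # everything left of the hash
print("within it:", local)    # everything right of the hash
```

Only the left part is used to fetch the document; what the right part means is up to whatever language the document turns out to be written in.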

I didn't expect the whole Web to be a Web of HTML, I imagined lots of PDFs and SVGs and things, and that HTML would be mainly used for finding those things so you could follow the links and eventually you'd end up with the jewel of information you were after. Like gopher had menus and menus and then the text file. HTML would be used for the menus. Wasn't like that at all, HTML became powerful enough to use for all the documents. It became an HTML Web.

Relatively soon afterwards at the first conference I remember having a slide about Web semantics and saying that we have links between different documents but actually... when you have my birth certificate and the title deeds for my house and the title deeds for the house are saying this house and this person, and that this person owns this house. With hypertext links you clicked on one thing but yeah, it's more interesting to work on semantics, certainly from a computer science point of view, to build systems which process the actual knowledge underneath.

You've heard about the Web consortium.

The browser wars... we only just had the consortium running in time, Microsoft and Netscape were furiously battling each other for the domination of the net.

I'll summarise Web history from then on.. a lot of people were terrified that Netscape dominated the Web, then that Microsoft dominated the Web and all computing. Then the Web realising they're not worried about Microsoft dominating because there are lots of browsers, but then the world worrying that the browser is irrelevant and the dominant search engine has lots more power, and then Facebook login and so on..

The eras of the Web you can characterise by what the dominant commercial, by what the threat was. Always when you have a monopoly it threatens the duration.. the person who is running the monopoly can decide arbitrary standards themselves, they can just write specs and if you're lucky they can let you know in due course how they work.

There are mainly 4 dominant companies, not so many code bases... this has always been an issue. When there are times when you wake up and you're not worried about the AT&T any more, the Netscape monopoly, the AOL monopoly any more.. things can change very quickly, maybe there will be one day you don't worry about the Facebook monopoly. Maybe things will use Solid (solid.mit.edu) where we're using the Web to build systems that don't have the problem that everyone is in one silo.

The hash was a key part of the web architecture generally. And it took off largely with HTTP and HTML. The idea of the http-colon-yada.. the idea was that you should be able to change out the protocols every now and again. When you found that you wanted a new space of documents then you could change that out. Initially it could be gopher:, ftp:. By allowing it to be ftp: - you were told the ftp address, you probably were given the ftp instructions before the Web. Go to this site and log in as anonymous, give your email address, cd to here, get the thing. Wrap those instructions up into a URL and wow! All of the legacy ftp system became part of the Web.
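To make the "wrap those instructions up into a URL" point concrete, here is a rough Python sketch that unpacks an ftp: URL back into the manual steps it replaces. The function name, the placeholder email address and the example URL are assumptions, and the wording of the steps is illustrative rather than real FTP client syntax.

```python
import posixpath
from typing import List
from urllib.parse import urlsplit


def ftp_url_to_steps(url: str, email: str = "user@example.org") -> List[str]:
    """Turn an ftp: URL back into the instructions people used to be given."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("expected an ftp: URL")
    directory, filename = posixpath.split(parts.path)
    return [
        f"go to {parts.hostname}",
        f"log in as {parts.username or 'anonymous'}",
        f"give {email} as the password",
        f"cd {directory or '/'}",
        f"get {filename}",
    ]


# Hypothetical URL, for illustration only.
for step in ftp_url_to_steps("ftp://ftp.example.org/pub/docs/readme.txt"):
    print(step)
```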

The colon was the second most important part of Web architecture, that says even though it's HTTP at the moment we can change that.

One of the things we messed up quite early on when people said this is not secure enough, and there was competition between two designs: shttp and https... wrooongg.. we should.. what we ended up doing is saying we should have https everywhere. This means that every Web page, wherever you have an HTTP URL, you should be changing it to an https URL. I bet when you do that you don't even keep it all the same, you change one thing you change some other things too.

The push for https everywhere basically breaks the entire Web. This is the only technical change that breaks the entire Web. The links to W3C TRs, the link to semantic technologies, we had to put in a lot more things to allow computers to understand that http and https are interchangeable.
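One simple way a program can be told that http and https links are interchangeable is to normalise the scheme before comparing. This is a minimal Python sketch, assuming (as argued above) that both spellings name the same document. The function names and URLs are illustrative, and real systems may need more careful rules.

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_scheme(url: str) -> str:
    """Rewrite https: to http: so both spellings compare as equal."""
    parts = urlsplit(url)
    scheme = "http" if parts.scheme == "https" else parts.scheme
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, parts.fragment))


def same_link(a: str, b: str) -> bool:
    """Treat two URLs as the same link if they differ only by the added s."""
    return normalise_scheme(a) == normalise_scheme(b)


# Hypothetical URLs, for illustration only.
print(same_link("http://example.org/TR/spec/#term", "https://example.org/TR/spec/#term"))
```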

Like HSTS.. upgrade-insecure-requests.. should have used that way back at the beginning, instead of adding an s to the URL. People felt they wanted to be able to give the guarantee of security to the link follower.

The colon is an interesting thing, fun now to see that.

We have this project called Solid which allows you to store your data on datastores that you control and uses https, and somebody just wrote all the code for safe:. SAFE is a decentralised network where your data is stored by lots of people and is encrypted and out of the eyes of governments and companies, they just added safe: handling to the Solid stack. You can now run your Solid apps over a SAFE infrastructure.

The hash is something that is important for moving on bit by bit, decade after decade, about every ten years we can afford to change those protocols. Most of the people in the W3C TAG, Henry [Thompson], had been involved in this and is just getting around.. finding people who introduce new namespaces, want them to go to their system, their protocol, pay them for a name.. The TAG goes around saying no just put it in http space. It works. We all want to take over the world, but...

How many minutes have I got? Who's in charge?

I'll tell you the history of the Semantic Web.

> Wendy: Can we still get lunch?

I had no idea.

Okay I won't tell you the history.

In one sentence... the Semantic Web which was introduced, we started talking about back in the days of this conference, which later became the crazy focus of a whole bunch of dedicated logicians. And it was a lot.. much pride was lost and face was lost and which was much poo-poo-ed by the incumbents, shot down by Google and Microsoft, by people whose careers had been built on existing systems like XML. And poo-poo-ed by people like the CTO, the chief researcher of Google, Peter Norvig, as severely.. that Semantic Web was attacked by all these people, it battled on. It is now huge. I was just in the Linked Open Data track, there is a ridiculous amount of Linked Open Data in the world. A serious proportion of webpages have embedded microdata and Google will honour that data if you have it. A product, a band, if you put RDFa in that webpage Google will understand what it's about.

Google created schema.org in the way that a large company with a couple of friends can do... gobbled some standards up gave them a different name and a different URL, and because of that the Semantic Web is a thing, there's a ridiculous amount of it.

The people who had been plugging along at it for a long time, it was great to see them celebrating in the session today.

If you were talking about graphs people would look at you weirdly, 'excuse me, we use trees'. But now if you're not using a graph database you're just missing out. It's the year of the graph, the year of the Semantic Web stuff. A story with a happy ending.

🏷 timbl TheWebConf