Posts between 2013/01 and 2014/01 (130 out of 130)
Tutor and demonstrator, Dynamic Web Design
MSc Design & Digital Media, University of Edinburgh
Taught basic web design and development, including responsive design and progressive enhancement, to students from a variety of (mostly non-technical) backgrounds. Marked coursework.
As a farewell present to Beth, who
was vacating sunny Scotland for the harsh and unforgiving shores of the US, I
made a squirrel. But not just any squirrel.
A Morrissquirrel.
That's Morrissey-squirrel. Don't question it.
I largely followed this pattern for the normal squirrel parts, then improvised to Moz it
up.
I'm supervising a Digital Media Studio Project in the Edinburgh College of
Art: I created the brief, and am overseeing, guiding and grading a group of
six MSc students.
For the second year, I'm tutoring PHP, MySQL, HTML, CSS and JavaScript to MSc
Design and Digital Media students for the Dynamic Web
Design course in Edinburgh College
of Art.
The 1st International Open Data Dialogue in Berlin in December was broadly a
discussion about real-world applications of Open Data. Lots of practice, less
theory. Despite this (or perhaps because of this, now I think about it) it
wasn't as technical as I expected. Felix Sasaki [1] talked about some basic
technicalities of Linked Data and the Semantic Web, kind of the first things
you'd learn if you were studying it in a structured way, and I heard a lot of
people afterwards complaining that that had been too technical.
Importantly, there was a real message of getting things done at this event,
and plenty of evidence that a world built on Open Data is not an idealistic
pipe dream, but a reality right now. Challenges are being articulated, and
solutions are being created, and problems are being overcome.
I stress this particularly because a couple of sceptics who weren't at the
conference tweeted things along the lines of "Sounds like your conference is a
bunch of idealist hippies preaching to the choir…" A genuine concern, but
what's really exciting is that this definitely wasn't the case. It was
instead a bunch of realist technologists with the expertise and influence to
actively overcome barriers to improving the world.
Open Data is about social change and empowerment. It is about
accountability of organisations with massive influence over the lives of
ordinary people. It is not about an abandonment of personal privacy, or
everybody knowing everything about everyone else.
It should go without saying (yet it still needs to be said) that it is not
appropriate to blindly make all data available to everyone about every aspect
of everybody's life. But what if you had access to all of the data anyone
had ever collected about your life? Think about purchase history (shop
loyalty cards, travel tickets), online activities (searches, browsing history,
social networking). All this stuff is being stored anyway, all over the
place. Often by organisations who fully intend to profit from it, presumably
with your unwitting consent. They went to the trouble of collecting it, but
you went to the trouble of providing it. It's your data too. What could you
do with it (or hire a software developer to do with it)? Then imagine you had
access to the same data from everyone in your town, aggregated and anonymised,
and visualised in a nice way. Maybe you could team up with your neighbours
for cheaper bulk food purchases? Maybe you'd realise that others had similar
hobbies or problems nearby, and could form special interest or support groups?
Reduce costs by sharing transport to similar destinations (or just have some
company on the journey)?
There's so much potential within data that's already held.
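To make the aggregation idea concrete, here's a minimal sketch in Python. The record shapes are invented, and the "at least k contributors" cut-off is only a crude nod towards k-anonymity; real anonymisation is much harder than this:

```python
# Sketch: aggregate anonymised purchase records by category, suppressing
# any category with fewer than k distinct contributors so that no one
# individual stands out in the output. Record shape is invented.

def aggregate(records, k=2):
    per_category = {}
    for r in records:  # r: {"user": opaque_id, "category": label}
        per_category.setdefault(r["category"], set()).add(r["user"])
    # Only report categories big enough to hide any one contributor.
    return {cat: len(users)
            for cat, users in per_category.items()
            if len(users) >= k}

records = [
    {"user": "u1", "category": "groceries"},
    {"user": "u2", "category": "groceries"},
    {"user": "u3", "category": "cycling"},  # only one cyclist: suppressed
]
totals = aggregate(records)
```

With something like this pooled across a town, "others nearby share your hobby" is one lookup away; the genuinely hard part is doing the anonymisation responsibly.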
The UK government's Midata initiative is a massive step in the right direction
[3] toward compelling commercial enterprises to hand over machine-readable
datasets to consumers upon request.
In Slovakia and Kenya (and possibly others, but these were the ones that came
up), there is a constitutional right to data held by the government. Not
without loopholes and other problems, of course [5, 2].
One of the obvious problems is convincing large organisations that hold lots
of data (like commercial enterprise and governments) of the circumstances in
which it would be in everybody's best interest to release (some of) it.
Reasons for not doing so include: a lack of understanding of the benefits;
disproportionate assessment of risks; aversion to change; a lack of technical
expertise and infrastructure; "data hugging syndrome" [2]; licensing issues;
and outdated business models.
Nigel Shadbolt's experience says that large organisations that open up their
data always see benefits. It's always worth the effort. When the data is there,
suddenly developers start doing things with it; applications appear, many
unexpected, and usually free. He stressed that it's important to have a
stockpile of success stories in case you need to convince someone in charge of
the value of Open Data, and his favourite one was the publication of MRSA
rates in hospitals (resulting in sharing of good practice, and an 85%
reduction in MRSA over two years). See a list at the end of this post for all
of the success stories I came across over the course of the two days.
There were lots of discussions about the users or audiences of Open Data, and
the various different roles people can have. Most consumers of Open Data are
developers, and 'ordinary people' see the data via an application. Many won't
know (or care) about the source of the data that powers the app, even if it's
about them. Many will, though, and trust must be built so that people see the
value that such apps could bring to their day-to-day lives. Ideally, releasing a dataset
would be part of an ecosystem, rather than a one-time thing. Data providers
should value consumer feedback, and commit to good quality, up-to-date data.
Rufus Pollock wonders why every dataset doesn't have a public issue tracker,
and notes that poor quality data creates wasted time, especially at hack
events [4].
A successful Open Data world needs partnership between the public, media and
organisations. All of these parties need educating on appropriate
combinations of the realistic potential of Open Data, and the technicalities
of releasing and using it. Michael Hörz [6] discussed the journalist
perspective on Open Data; they're desperate for data about everything, and
often manage to get hold of it. But they find themselves begging for
spreadsheets or CSV files, because what they get given are PDFs. Eugh! Yet
they're not asking for Linked Data formats, which means, presumably, that
after they've gone to the trouble of extracting data from the PDFs, they're
putting it into a spreadsheet or something, and there's still a whole level of
usefulness missing. I assume that's because they don't know otherwise, or
perhaps don't have the resources to learn even if they're aware of the
possibilities: similar sorts of reasons to why they're given PDFs by
organisations in the first place.
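To illustrate that missing level of usefulness, here's a sketch of lifting spreadsheet-style CSV into Linked Data. The CSV content and the `ex:` vocabulary are invented for illustration; a real publisher would reuse shared vocabularies:

```python
import csv
import io

# A hypothetical CSV a journalist might have extracted from a PDF.
CSV_DATA = """hospital,mrsa_cases,year
St Elsewhere,12,2011
St Elsewhere,3,2012
"""

def csv_to_turtle(text):
    """Turn each CSV row into RDF triples in Turtle syntax, so the data
    can be linked and queried rather than just read."""
    lines = ["@prefix ex: <http://example.org/vocab#> ."]
    for i, row in enumerate(csv.DictReader(io.StringIO(text))):
        lines.append(f'<http://example.org/report/{i}> '
                     f'ex:hospital "{row["hospital"]}" ;')
        lines.append(f'    ex:mrsaCases {row["mrsa_cases"]} ;')
        lines.append(f'    ex:year {row["year"]} .')
    return "\n".join(lines)

turtle = csv_to_turtle(CSV_DATA)
```

The triples can then be merged with anyone else's data about the same hospitals, which is exactly what a spreadsheet on a journalist's laptop can't do.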
So awareness, and easily digestible educational resources (how about
SchoolOfData.org?), need to be promoted.
Now then, about those success stories... This list includes data publishing
projects, groups and apps that have been built on Open Data.
That'll do for now. Lots of the portals and competitions have links to app
examples etc. There's lots to explore.
Finally, I highlighted in my notes quite a lot of things that I need to find
out more about. A lot of them are technology or platforms for publishing or
sharing Open Data, and various standards or studies I need to read in more
detail.
I have a couple of questions to ponder on, too:
There's a massive focus around hacks (more often than not one off events) as a
way of using and promoting Open Data. What other ways are there? What will
the path to a deeper integration of Open Data in society look like?
There are lots of datasets and vocabularies about public services and society,
as well as science and education. What arts, culture and media datasets are
out there? (And what has been done with them?) Ooh, or online social
interactions? Maybe I'll do a survey.
[1] Prof. Dr. Felix Sasaki, keynote: "Linked Open Data @ W3C-Vocabularies,
Working Groups, Usage Scenarios."
[2] Prof. Dr. Simon M. Onywere, talk: "The Kenya Open Data Incubator Project –
Outreach to Research Community."
[3] https://www.gov.uk/government/consultations/midata-2012-review-and-consultation via Nigel Shadbolt
[4] Dr. Rufus Pollock, keynote: "Open Data, Building the Ecosystem"
[5] Peter Hanełák, talk: "Open Data and Open Government Partnership in
Slovakia."
[6] Michael Hörz, talk: "Open Data in Local Journalism: An Excel file?"
Notes as I scribbled them. Read a proper
review.
**Peter Webster - Digital Resources for Social Scientists at the British Library**
British Library research methods guide under social sciences.
Web about individuals and organisations is fragile, things disappear. Most
site owners ignore requests for permission to archive stuff. New legal deposit
legislation next year will allow scraping the Web and archiving everything
without worrying about infringing copyright. Using this data is restricted;
library premises only, or print one copy for non-commercial use. Can't use an item
simultaneously with any other user. (Regulation is for print, derp.)
Full text search for internet archive... open question, very complicated.
Consultation going on about this. Legislation is very restrictive; need to
look at data derived from dataset and how that can be made available.
Mike Thelwall - Sentiment Analysis for the Social Web
Sentiment is seen as peripheral, and often ignored, but actually it's core.
Emotional reaction to tea or coffee.
sentistrength.wlv.ac.uk SentiStrength - detect strength of +ve and -ve
sentiment in short text. Takes into account that text might not be
grammatically correct. In social media, sentiment is expressed in different ways
(eg. emoticons, deliberate misspellings to embed sentiment: haaaapppyyyyyy).
List of +ve and -ve term stems and strengths from -5 to 4. People disagree
about sentiment surprisingly much. Something lots of people tweet about must
arouse sentiment. But for big events, surge of quantity of tweets, but not
surge in +ve or -ve sentiment. Lots of sentiment is implicit.
YT comments are easy to get, so a good source of social data. No ethical
concerns about getting permission to analyse because it's already public. (Hmm, no?)
Longer the text, less well it works. But does have a long text mode, with
slightly different scoring.
Used by Yahoo question answering system to work out best people to give
answers. Companies use it for product reactions.
Would like to see these techniques applied to smaller scale case studies. The most
focused data stream still has loads of junk, like what they had for breakfast...
Why are 1 and -1 neutral instead of 0? Because psychology - two scales, not
one.
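The dual-scale idea can be sketched in a few lines of Python. This is not SentiStrength itself, just an illustration of the approach with a tiny invented lexicon; the letter-repetition booster mimics the haaaapppyyyyyy observation above:

```python
import re

# Tiny invented lexicon; the real SentiStrength list is far larger.
LEXICON = {"happy": 3, "love": 4, "sad": -3, "hate": -4}

def _collapse(word):
    # Collapse runs of repeated letters: 'haaapppyyy' -> 'hapy'.
    return re.sub(r"(.)\1+", r"\1", word)

# Index the lexicon by collapsed spelling so creative repetitions match.
_COLLAPSED = {_collapse(k): v for k, v in LEXICON.items()}

def sentiment(text):
    """Return (positive, negative) strengths, SentiStrength-style:
    +1..+5 and -1..-5, where +1/-1 mean no sentiment on that scale
    (two scales, not one, hence no single neutral 0)."""
    pos, neg = 1, -1
    for word in re.findall(r"[a-z]+", text.lower()):
        score = _COLLAPSED.get(_collapse(word))
        if score is None:
            continue
        if re.search(r"(.)\1\1", word):  # a letter repeated 3+ times boosts
            score += 1 if score > 0 else -1
        if score > 0:
            pos = max(pos, min(score, 5))
        else:
            neg = min(neg, max(score, -5))
    return pos, neg
```

Even this toy shows why annotators disagree: most of the decisions are baked into the word list and the boosting rules.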
Jo Belcher - mixing traditional and digital methods to research hidden carers.
Is online support for parents and carers socially patterned like social media
use? Needs to reach people who don't use the Internet, too, for comparisons.
People respond in the medium they were first contacted. Treat online and
offline results differently for analysis.
Lorenza Antonucci - Using social media in different phases of research process
Digital methods allow you to do something different. A lot of focus on big
data and secondary digital data (twitter). Not about collecting own data, but
using data already available on social media.
Might be a problem fitting secondary data into existing theoretical
frameworks. So need to collect. She's looking at real vs virtual identities.
Not possible to do follow up interviews just over facebook. DM help to select
people. Also cross-national research. Age/generational aspects.
Jeremy Knox - MOOCs
Is learning simply the consumption of information online?
Web enabled sensors. GPS to record where he goes within MOOC - physical
location and digital space together (cool).
Locations tweet when he's in the MOOC.
RFID system that allows office books to tweet content.
Experiments to disrupt and criticise. Playful methods to think about the MOOC in
a different way. Assemblage of human, technology and place; learning might be
post-human.
Final outcome of research? Not sure.. a way to provoke thinking in what is
often a closed area.
How is learning affected by physical space?
Sue Thomas - technobiophilia
O'Reilly's topsoil metaphor is cool.
Five categories about how people talk about online experiences.
Lots of nature metaphors. Metaphors aren't deliberate. We bring nature into
computing because we innately want it to be there (biophilia)
Carole Kirk - Digital reflection: A method for arts practice-led research?
Questions and methods come from practice. Tacit knowledge. Capturing creative
processes.
How to leave a trace of an action for reflection?
Digital methods can help.
Not a complete record, but a trace. Digital technologies that involve a high
level of manipulation stimulate greater reflection. Only archives - metadata,
feedback & discussion. Visible record of reflection. Process of creating
records doesn't replace the practice itself. Might trigger embodied memory.
Help to articulate fleeting things.
In arts, practice-led research is more about creating digital data.
Emma Hutchinson - Asynchronous online interviews and image elicitation
Async, like email or forum.
Complement interview with photos, but not used much online yet.
Identity performance of online gamers.
Images help with articulation. Lots to talk about.
Photos that do/don't get uploaded to facebook, why / why not?
Eve Stirling - Facebook profile as research tool
Undergraduate transition to university.
Lots of HE happens on fb. Looking at the every day.
Digital and physical spaces.
Personal fb is not academic, and hidden. Twitter is academic.
Ethnography is about understanding every day culture and developing trust and
rapport with participants.
Fb friends are linked to study. Intrinsically linked. Does becoming fb friend
need a disclaimer? Informed consent? Personal and professional lives blurring.
After study - delete friends?
Me! Digital Media on the Semantic Web.
My slides are here.
danah boyd - Making Sense of Teen Life: Strategies for Capturing Ethnographic Data in a Networked Era
Understanding social networks before there were SNSs.
The rise and fall of myspace.
How much can be made sense of from a distance? Engaging with own friends not
like working with young people. How radically difficult it is to interpret
what she sees. Young people better at encoding the information they make
available, because of adult surveillance. Just because she can see their
content, doesn't mean she knows what's going on.
Observes offline too. Adults help to recruit youths with different
perspectives.
Thought about recruiting online but stopped. Not a good norm to start.
Make sense of online things with offline interviews (ethical things
considered, parents/friends nearby).
Don't begin the conversation with online material; need them to feel
comfortable first; usually an hour into the interview.
How to coordinate data? Serious challenge. Blogging about things as she's
thinking through them; trying to make sense of them in a public way.
Thinking out loud, can be corrected and challenged during sense-making
process. Not just experts, but her participants too.
Controversial piece about shift from myspace to fb. Got picked up overnight
and got over 10k responses. Most people frustrated and angry and didn't
understand where she was coming from. People came forward with quantitative
data that helped. Adults attacked her for being racist; young people responded
with their stories.
How public to make the young people? Don't expose them, no real names unless
they're already public figures. Never quotes online material exactly so people
can't search to identify the young people. Visibility has consequences people
don't expect or understand. Can make people more vulnerable by making them
visible.
The young people could choose to make themselves visible through the process
(some do).
Speak for them or help them speak?
Public in the media to make young peoples' voices heard as much as possible.
Never publishes in a closed access journal.
Collaboration is in sense-making, not writing. More in intervention-driven
projects.
Generally don't want to, but a few exceptions - published two papers with
teenagers.
Tension between MS and research?
No, MSR is academic institution. Lots of freedom.
Teenagers expect her to fix Xbox... External perception is more confusing than
internal.
Can make the case for open access in a way that lots of university scholars
can't.
Problems with paraphrasing quotes from websites?
Tries to make quotes more common, found everywhere.
Her ethics about making people not more vulnerable is worth more than skimping
on real quotes. Helps that she doesn't rely entirely on online data.
Says she's not good at articulating her methods.
Ibrar Bhatt - e-focus groups and e-interviews
Separate summer project on student experience for postgraduate research
students, distance learners, part time students globally.
Needed in depth focus group without them being there.
How involved do the students feel in activities in the School of Education?
Different doing focus groups online - affordances and challenges.
Used Adobe Connect. Facilitator echoes some questions.
Multidimensional focus group. Participants could also discuss with each other,
and with researcher, over personal chat.
Guidelines - rehearsal, drop-in session, beforehand - recording of everything;
can integrate their video into a transcription.
Temitayo Abinusawa - Social Networking and innovation
Technical background. How organisations use IT to promote activities.
Social networking - Internet was loads of words, chaos. Make sense of the
words, then you can innovate. Can create products and services to meet needs.
Good ideas that need funding. People discuss ideas online. Organisations are
looking for new ideas too. Organisation can search the Web for ideas to create
innovation.
Outcomes: organisations can create more for less.
Feedback is consumer interaction that takes place on the Internet. Dell
IdeaStorm - turn your ideas into reality.
Openness is important, not exploitation. (Seems to be reward focussed, rather
than transparent exchange culture, query).
People don't read t&c so think they're being exploited.
Evelyn McElhinney - Social virtual worlds: a new place for the avatar researcher
Focus groups in Second Life for data collection, with avatars sat in chairs like IRL.
Most people aren't roleplaying; their avatars are just themselves.
Closing Discussion
Ethics
Commercial research is sometimes ahead with digital methods practices. Things
happening at one end aren't being noticed at the other. Need to communicate and
look out for each other. Mixed methods research.
Digital methods need legitimising to be taken seriously. Sometimes people not
knowing much about it can help to get stuff done.
DMMM was a conference aimed at sociologists and anthropologists and the like,
so, having never studied these disciplines in any way, I was worried I'd have
no idea what was going on.
Fortunately everyone was friendly, and everyone's research was really
interesting, relevant and mostly made sense to me. You can read all of the
notes I took here.
Humanities researchers are using and gathering digital data in lots of
interesting and unique ways. Using social media and other digital methods to
engage with study participants (Jo Belcher, Lorenza Antonucci, Eve Stirling);
sentiment analysis (Mike Thelwall); examining archives; image use in online
interviews (Emma Hutchinson); e-focus groups (Ibrar Bhatt); digital records
(reflections) of a creative arts process (Carole Kirk); crowd-sourcing of
commercial ideas (Temitayo Abinusawa); avatars and virtual interaction spaces
like SecondLife (Evelyn McElhinney); brilliant playful use of hacking to
disrupt discussions about online learning (Jeremy Knox on MOOCs).
danah boyd, whose work I've followed more or less since my undergraduate,
teleconferenced in to give a really interesting keynote called "Making Sense
of Teen Life: Strategies for Capturing Ethnographic Data in a Networked Era."
She discussed working with young people for the last few years to examine
their use of social networks (mostly MySpace), and all of the challenges and
considerations that came up along the way. She was surprised a lot.
The open discussion at the end raised a lot of questions about ethics. It
was implied at one point that the content of tweets or YouTube comments is
ripe for the picking with no strings attached because it's already in the
public space. It's definitely not that simple.
There's also a danger of humanities researchers being out of touch with modern
techniques and best practices. Commercial research is sometimes way ahead,
but there's little communication between the two ends of the spectrum, so
methods get redundantly re-developed and re-optimised.
A lot of people had experiences indicating that digital methods in humanities
are often not taken seriously. Supervisors, ethics committees, funding
bodies, who have only worked with traditional methods can struggle to see the
legitimacy of results gathered by digital means. On the other hand, certain
levels of ignorance can sometimes work to the researcher's advantage in terms
of being allowed to get stuff done with minimal red tape (because authorities
don't know what questions to ask).
Personally I was interested by this general feeling of novelty about digital
methods. Having been in computing for almost my whole academic career (not to
mention a child of the Web), a lot of things were being critically and
confusedly discussed that I just take for granted. Things like the validity
of friendships that exist entirely online, and feelings expressed through
short-lived text alone. I think arts and humanities researchers who really
want to get to grips with digital methods as legitimate research tools should
consider orchestrating placements alongside technical researchers and
immersing themselves in a world where the main options are all digital by
default.
17th December - 23rd December
Mostly things not strictly PhD related...
18th - Initial meeting about organising a hack for undergraduates during Innovative Learning Week in February
19th - Discussed how local small scale community groups can benefit from Open
Data, and how they could get involved in the ILW hack, with Freda O'Bryne and
Euan Jackson.
- Also discussed these things with Ewan, and started on a spreadsheet of decent apps that use Open Data, and links to their data sources.
7th January - 13th January
I was the subject of Paolo's procedural task labelling pilot experiment. It was quite complicated. I don't know if I did a very good job.
I went to TechMeetup and networked like crazy.
I talked to Andy Hyde about ALISS getting involved with the ILW hack, and Edinburgh OD scene in general.
And a few other people about supporting the ILW hack in various ways.
I talked to Felix Gilfedder of Popcorn Horror about novel ways of engaging with creative digital media communities (PhD related!)
I briefly met Russell, with whom I hope to discuss digital video metadata (PhD related!)
One other thing which isn't related to Open Data or my PhD but is very exciting nonetheless.
I went to a Design Informatics talk about the Internet of Things.
14th January - 20th January
Further discussed ILW hack with various parties, and released a website with information for anyone interested in being involved, and registration for students.
I went to the first Machine Learning and Pattern Recognition class. It looks
interesting, but might be a bit maths heavy... I might concentrate on learning
the maths now, and take the class next year, instead of killing myself with
both.
At an event hosted by the National Library of Scotland, I reported on the 1st
International Open Data Dialogue in Berlin that I'd been to in December, but
then had to immediately leave, so I don't have any notes on the rest of the
talks.
21st - 27th January
Brainstormed about how to work with amateur digital media creators, mostly with the aim of starting to put together a vocabulary for representing various digital media creation processes, collaboration dynamics and audience engagement. More coherent thoughts on that next week, I should think.
Lightning-talked about the Berlin Open Data Dialogue at an OKFN meetup in the
National Library of Scotland on Thursday 24th.
Recently I read and brainstormed some things about interesting ways of
creating a decentralised network (social or otherwise).
Tent.io is still coming out on top as having the most
potential for something I could actually implement and build upon. Tent
server is originally Ruby, but there are already Python and PHP
implementations. PHP might be the most useful, as it's the most widely supported
by cheap shared hosting providers, reducing the barrier for people wanting to
run their own Tent server. Depot in particular has the
goal of targeting the technical lowest common denominator, and making
installation and maintenance as easy as possible for a non-expert.
Problems with having nodes of a network, whereby people who don't want to set
up their own node can sign up to someone else's node, include ensuring
consistent URIs for things. If someone wants to up and move to another node,
what are the best ways of maintaining connections? Obviously anyone who buys
their own domain name and hooks it up won't have a problem, but not everyone
can or will. A centralised permanent URI service like
PURL (or just use PURL..)?
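As a sketch of the PURL idea: a resolver only needs a mutable mapping from permanent identifiers to current locations, so that moving node means changing one registry entry. Names and URLs here are invented, and a real service would answer with HTTP redirects:

```python
# Permanent-ID -> current-node mapping. Published URIs embed only the
# permanent ID, so every existing link survives a node move.
registry = {"alice": "https://node-one.example/alice"}

def resolve(permanent_id):
    """Return the current location for a permanent ID (a PURL-style
    server would send this as an HTTP 302/303 redirect, or 404)."""
    return registry.get(permanent_id)

def move_node(permanent_id, new_location):
    # All previously published permanent URIs keep working after this.
    registry[permanent_id] = new_location

before = resolve("alice")
move_node("alice", "https://node-two.example/alice")
after = resolve("alice")
```

The obvious catch is that the registry itself is then a centralised point of trust and failure, which is the same trade-off PURL makes.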
Unhosted and their
remoteStorage protocol are interesting. The idea
is that a user registers with a remoteStorage provider, then signs into
Unhosted apps with that identity. Unhosted apps are all frontend JavaScript,
and no data is sent to a server. Anything that needs to be stored, such as
content you create whilst using the app, is sent to your personal remoteStorage
account. There are only two remoteStorage providers listed though, and one of
those is a test one that could get wiped at any time. The other,
5apps.com, doesn't appear to provide an interface for
browsing and exporting what you have stored with them. Obviously this is an
early project, and since anyone with a webserver can theoretically set up a
remoteStorage service, has lots of potential. In the short time I spent
investigating, I couldn't work out possibilities for sharing data in your
remoteStorage, so Unhosted apps might just be useful for personal,
non-collaborative activities. Some examples they have are a simple text/code
editor, a really pretty simple writing interface, to-do lists, time
management, favourite drinks list. Oh wait, friendsunhosted.com is like a
twitter service. Because I don't know anyone else using friendsUnhosted, I
haven't tested it properly, but it appears to offer a stream of peoples'
statuses, which are presumably stored in their personal remoteStorage. I'll
investigate further. The mailing list is active and the main developer seems
to be on top of things, so this is something to keep a close eye on.
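The shape of the idea, as I understand it, can be sketched with an in-memory stand-in. This mimics only the general get/put shape, not the actual remoteStorage protocol (which is HTTP on access-controlled paths), and the app is invented:

```python
class RemoteStorageStub:
    """In-memory stand-in for a user's personal storage account."""
    def __init__(self):
        self._docs = {}

    def put(self, path, body):
        self._docs[path] = body

    def get(self, path):
        return self._docs.get(path)

# An 'unhosted' to-do app: all logic runs client-side, and the only
# state lives in storage the user controls, not on the app's server.
def add_todo(storage, text):
    todos = storage.get("/todos") or []
    storage.put("/todos", todos + [text])

user_storage = RemoteStorageStub()
add_todo(user_storage, "try 5apps")
add_todo(user_storage, "read the spec")
```

Swapping apps then means pointing a different frontend at the same storage, which is exactly the separation Unhosted is after.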
Idea: Possible to set up 'remoteStorage' on peoples' own Google Drive accounts, via that API?
I don't know anything technically about peer-to-peer, but it seemed like a
good avenue to pursue with regards to very established, very decentralised
networks of people and files.
At an extreme end of 'owning one's own data', I'm curious about storing all
your files (eg. linked data graphs) locally and sharing them / accessing other
peoples' only through a browser interface. Drawbacks obviously include your
files only being available when you're online.
I read a handful of good things about the G3 protocol and API used in
FilesWire, by Dreamsoft Technologies. I got really excited for a little
while, before reluctantly giving up because the whole project seems to be old
and very dead. I did track down the developer on LinkedIn and facebook, but
I haven't decided yet if it's worth pursuing at all.
I investigated Freenet, and their freesites.
The Freenet network is all about anonymity; any data you upload is broken up
and stored on many nodes in the network. You can't identify where it
originally came from, or where you're pulling something from when you retrieve
it. There were some messages from 2000/2001 on a mailing list about RDF on
Freenet, but nothing seems to have come of that. Requires a software download
of course. Again, I don't know enough about the technicalities yet to judge
if something like this would be a viable approach for a decentralised linked
data sharing network.
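In that spirit, content-addressed chunk storage can be sketched like this. It's a toy of the general idea behind Freenet-style keys, not Freenet's actual CHK scheme:

```python
import hashlib

CHUNK = 4  # absurdly small chunk size, just for illustration

def store(data, network):
    """Split data into chunks keyed by their own hashes. A node holding
    one chunk learns nothing about the file or where it came from."""
    keys = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        key = hashlib.sha256(chunk).hexdigest()
        network[key] = chunk  # in Freenet, scattered across many nodes
        keys.append(key)
    return keys  # the manifest needed to reassemble the file

def fetch(keys, network):
    return b"".join(network[k] for k in keys)

network = {}
keys = store(b"hello freenet", network)
```

Retrieval only needs the manifest of keys, which is why possession of a chunk implies nothing about the whole.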
I also came across (the now deceased) Opera
Unite, which is a webserver running
inside Opera that lets people share files and serve webpages without the
hassle of paying for or setting up their own server.
28th January - 3rd February
I read Annotation of Multimedia using OntoMedia by K. Faith Lawrence et al., and we discussed it during Ontologies with a View. OntoMedia might well prove useful in that describing the content of digital media can improve searching, sorting and sharing.
I started reading a couple of other papers by Faith, but haven't finished them
yet, so expect summaries in the near future.
I brainstormed about decentralised networks, thinking of ways for individuals
to share linked data about themselves and their projects without surrendering
all that data to a server.
4th - 10th February
When helping to organise an event, it's impressive how much time sending emails, updating websites and spreadsheets, having meetings and generally coordinating things can take up. So that's a lot of what I've done this week. Not PhD, but related enough to be excusable... sort of... (Smart Data Hack and Open Data Day: Joined Up Edinburgh)
I have finally booked plane tickets to Serbia for
Resonate new media festival, at which I'm hoping to
make useful contacts with independent digital media producers, learn some
stuff about big data visualisation at the workshops (my places at the ones
I've applied to haven't been confirmed yet though), and generally learn more
about the digital art scene (and figure out ways SW technologies can benefit
people who are part of it). And also to enjoy Belgrade, as I'll have a spare
couple of days either side of the festival. Because the flights were cheaper
then! Honest! I actually lost an entire afternoon hunting for cheapest
flights on websites that weren't obviously scams (Cheap-o-Airlines? Really??
Ultimately booking through JAT directly was the best option), then negotiating
cheapest trains to London in order to catch flights, then sending CouchSurfing
requests, because why spend money on a hostel when you can meet wonderful new
people and get tourist advice for free?
I also applied for a place at the Semantic Web Summer
School in Spain in July. It looks fantastic and
educational and stuff. And I applied for funding to help cover the 900EUR
entry cost (a whole week, accommodation and meals included), and
SICSA are providing £500, yay!
Two thirds of Ontologies with a
View met this week, and we discussed BBC use of Linked Data. I only
managed to skim the paper, but knew the general principles from articles I'd
read about their work before... I will read it properly at some point, though
I think a more technical discussion of what they did might be useful.
I was fortunate enough to attend the Computer Mediated Social Sense-
Making workshop, conveniently situated
on the ground floor of the building I work in, on the 14th of February.
Whilst more technical than the Digital
Methods
conference I went to in December, the talks and panel sessions served to build
upon things I started to think about then. Namely, beginning to situate my
research interests amongst many concepts from the currently quite alien fields
of sociology and anthropology.
The talks were varied, and key themes that emerged were the collection/use of
data for social improvement (health and wellbeing, teaching and learning,
disaster recovery), and the importance of context in making collected data
genuinely useful. A notable challenge is that one piece of data might have a
thousand different contexts from the perspectives of a thousand different
human beings. So how to communicate these variations to software that
processes this data, and perhaps makes decisions using it?
Perhaps not to worry too much about that at all. Process things locally
instead of globally, using local contexts and understandings, but make sure
everything is annotated such that information can still be exchanged across
the whole network, and differences in understanding can be accounted for or
reasoned out if a need occurs.
For the record, I'm looking at how Semantic Web technologies could be used to
better connect human and machine in the context of amateur digital content
creation (movies, comics, music, art), including how semantically annotating
creative (often collaborative) processes as well as the end products of
these processes and the engagement of an audience with these products, could
improve the overall experience of creating content (along a number of
dimensions). A massive part of this will be creating tools that actually
collect the necessary data from users. Ultimately, these tools will need to
be invisible, ie. easily integrated into existing online routines, with no
effort required to use them for the non-technically minded so that a network
effect can take place.
Incentives for crowdsourcing came up during CMSSM, and someone pointed out
that by gamifying data collection for research projects, incentives become the
same as ones offered by gambling companies; something competitive and
potentially addictive. I think things like global systems of reputation and
trust are useful on a network where people are to share data about their own
work (or opinions of the work of others) and may be nurturing a desire for
popularity or exposure on the network (a network where the people are
central, because the data could not exist without them, but where the users
and the data are simultaneously co-dependent).
Dropped the ball on my PhD for months to wrangle sponsors, set up the website, recruit participants and generally get shit together. It went well, with feedback from students like "easily the most useful and fun week I've had since I started uni".
One of the first useful things I crocheted was this pair of extremely masculine
handwarmers.
The extremely soft and fuzzy wool was from Age Scotland I believe.
I used no pattern and had never made handwarmers before, so they came out far
too big. Not to worry, at least you can fit a Tigo in them!
I just repeated dc ch1 around, dc-ing in the gap left by the chain each
time. I reduced a bit around the wrist area to shape it, but it wasn't
particularly effective. I chained about four, instead of carrying on, to make
the gap for the thumb, then continued as normal. One of them is shorter than
the other, because I didn't count anything.
They serve their purpose though, and the lucky owner has even gone so far as
to wear them in public in the chilly heights of Appleton Tower.
Notes about Annotating Multimedia with OntoMedia
Lawrence, K.F., schraefel, m.c.: Bringing communities to the semantic web and the semantic web to communities. In: Proceedings of WWW2006. (2006)
Michael O. Jewell, K. Faith Lawrence, Adam Prugel-Bennett, and m. c. schraefel (200?) Annotation of Multimedia Using OntoMedia
OntoMedia for representing "diverse range of media".
Others for media:
CIDOC Conceptual Reference Model (museums)
ABC Ontology (multimedia in libraries and digital archives)
Functional Requirements for Bibliographic Records (attributes and relationships for tasks performed when consulting bibliographic records)
FictionFinder - FRBR to Online Computer Library Centre
WorldCat db - Metadata about characters and fictional places
Describe contents of films & comics etc., for tracking down half-remembered things to share?
None quite did what was needed.
So created to map to current models but specifically describe media content.
Hierarchical approach.
1. Overview
Entity / Event system
Entity: object, concept
Event: interaction between one or more entities
0 or more Entities are modified OR new Entity created
Entities not destroyed, but may have not-exists attribute
Expression (primarily elements and subclasses; Entity, Event)
Media (binding between media and Expression objects)
Space (extension of Signage Location Ontology, buildings, and regions of structures)
Extensions (more detailed subclasses to Core)
Being (people)
Trait (attributes of Entities)
Events (extends Core->Event)
Action
Gain
Loss
Travel
& properties thereof
Fiction
Character (on Being)
spoiler info. and accuracy
Media
More detailed than the one in Core; includes audio, image, photo, text and video subclasses
Misc - classes used by any or all of other classes, eg. colour, geometry
Specified in OWL
Developed in Protege and SWOOP.
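The Entity/Event core described above can be sketched in plain Python. This is my own illustration of the semantics, not the ontology's actual OWL terms; all class and attribute names are approximations:

```python
# Sketch of the OntoMedia core Entity/Event model (names are my own
# approximations, not the ontology's actual OWL vocabulary).

class Entity:
    """An object or concept. Entities are never destroyed; they can
    only be flagged with a not-exists attribute."""
    def __init__(self, name):
        self.name = name
        self.exists = True
        self.attributes = {}

class Event:
    """An interaction between one or more entities: zero or more
    entities are modified, or a new entity is created."""
    def __init__(self, name, participants):
        self.name = name
        self.participants = participants

    def modify(self, entity, key, value):
        entity.attributes[key] = value

    def destroy(self, entity):
        # No deletion: the entity survives with a not-exists flag.
        entity.exists = False

hero = Entity("Quaid")
bomb = Event("explosion", [hero])
bomb.destroy(hero)
assert hero.exists is False  # still present in the model, just flagged
```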
2. Case Study
Scene from Total Recall annotated. Represent script and characters, and
characters from related book, and links between two forms.
Screenplay annotation - SiX - Screenplays in XML.
Wraps around existing content.
Transition (cuts, fades, blackouts), location, dialogue and direction (action
taking place in the script)
SiX allows for DC, for creators, date, descr, title.
Custom XSL to conform marked up scripts to Oscar requirements for readability.
Script Item extends Media Item to link script representation to OntoMedia.
Use has-expression to tie to OntoMedia:Expression.
Describing places and access etc, like lift, like IF.
- For describing character continuity, eg. 'can character really see x' etc.
Must describe events that don't occur. Characters want to occur, etc.
Multiple timelines, dreams.
Declare events.
Create timeline and add occurrences.
- events can be reused, and coincide.
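The declare-once, occur-many pattern for events and timelines can be sketched like this (event names and the timeline structure are my own invention, just to show the idea of reuse and coincidence):

```python
# Sketch of OntoMedia-style timelines: events are declared once, then
# placed on timelines as occurrences, so one event can be reused on
# several timelines (e.g. a 'real' and a 'dream' timeline) and two
# occurrences can coincide at the same position. Names are illustrative.

events = {"e1": "Quaid boards the shuttle", "e2": "Quaid wakes up"}

timelines = {"real": [], "dream": []}

def add_occurrence(timeline, event_id, position):
    timelines[timeline].append((position, event_id))

add_occurrence("real", "e2", 0)
add_occurrence("dream", "e1", 0)   # the same event can sit on several timelines
add_occurrence("dream", "e2", 1)   # reused: e2 occurs on both timelines
add_occurrence("dream", "e1", 1)   # coincides with e2 at position 1
```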
3. Testing / querying
Imported into Sesame triplestore.
RDQL queries (subset of SPARQL, simpler, only ever has 1 graph pattern,
doesn't use RDF data typing)
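A query with a single graph pattern boils down to matching one set of triple patterns against the store. A toy version (not Sesame's or RDQL's actual API; repeated variables aren't checked for consistency, so this is only the flavour of it):

```python
# Toy triple store with single-pattern matching, in the spirit of the
# RDQL queries described above (not Sesame's actual API).

triples = [
    ("Quaid", "type", "Character"),
    ("Quaid", "appears_in", "TotalRecall_script"),
    ("Melina", "type", "Character"),
]

def match(pattern, store):
    """Match one triple pattern; strings starting with '?' are variables.
    (A toy: repeated variables aren't checked for consistency.)"""
    results = []
    for triple in store:
        binding = {}
        for p, t in zip(pattern, triple):
            if p.startswith("?"):
                binding[p] = t
            elif p != t:
                break
        else:
            results.append(binding)
    return results

characters = match(("?who", "type", "Character"), triples)
# e.g. [{'?who': 'Quaid'}, {'?who': 'Melina'}]
```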
4. Conclusion
No examples of pictures - how to annotate comics?
Combining OM with other apps.
Stuff integrated into Mediate.
Lawrence, K.F., schraefel, m.c.: Bringing communities to the semantic web and the semantic web to communities. In: Proceedings of WWW2006. (2006)
Research into SW communities:
Communities of practice
Social networks, eg. FOAF
Compare with other definitions of communities outside of SW. Concept: Internet Based Community Network, has properties of COP and SN. Case study: Amateur Fiction Online.
Early community definitions, Howard Rheingold: "..webs of personal
relationships in cyberspace."
1996 CSCW Conference defined prototypical attributes of communities
(Whittaker):
Shared goal / interest / need
Repeated active participation, emotional ties and shared activities.
Shared resources and access policies
Information, support and services reciprocated between members ( overlap with ^ ?)
Shared context (culture, language)
Can be applied to virtual and offline communities.
-> More attributes = clearer example of community.
Preece:
Social interaction
Shared purpose
Common set of expected behaviours
Computer system that facilitates and mediates communication
^^ Things in common. Whittaker's is more inclusive/broad.
So for a SW SN:
Accessible via browser
Explicit links between users
System supports creation of these links
Links are visible and browseable
COP or SN may describe a community, not necessarily. IBCN will do, and could
be a COP or SN too.
Problem of Amateur Fic. is fluctuation of archive. Personal sites go down
etc. How to find a story you remember a bit of?
IBCN is also combination of WBSN and virtual community.
Lack of incentive to use FOAF (eg. on LiveJournal etc) (Plus ignorance).
Doesn't offer anything they don't already have.
They don't use much metadata, just tons of human-readable stuff.
FOP extension to FOAF for anonymous identities.
('Fan Online Persona' - why not just 'Online Persona'?)
Consistency likely in community-based system because of advantages of
reputation etc. Identity cost.
Shared set of behaviour values, or risk losing rep.
Reputation gained by taking part. (definitive part of community).
Additionally by creating works.
foaf:document and foaf:groups allow users to give details about their own
creations and review work of others.
OntoMedia to describe content complements FOP.
Options in FOP gathered from study of metadata of works in mailing lists,
websites and groups.
Recommender system -> notification system.
Allow SNS of writers to be studied at friend level and collaboration level.
Application to allow users to create FOP under development...
J. Hendler, T. Berners-Lee, From the Semantic Web to social machines: A research challenge for AI on the World Wide Web, Artificial Intelligence (2009), doi:10.1016/j.artint.2009.11.010
Powerful human interactions enabled by futuristic high-speed infrastructure.
Empower Web of people via coupling of AI, social computing and new
technologies. "humanity in the loop".
Social machine: "...processes in which people do the creative work and the
machine does the administration." (Weaving the Web, p172).
Struggling with social mechanisms to control predatory behaviour and threats
to privacy.
-> Tech must be developed that allows user communities to construct / share / adapt social machines, so successful models evolve through trial, use and refinement.
Claims a new generation of Web Technologies needed to overcome barriers to
this; cross-disciplinary approach needed.
creating tools
creating principles and guidelines
extending Web infrastructure re: information sharing and address privacy and user expectations of data use.
"...a revolutionarily more powerful platform for the individual, enabled by
realizing that the individual is also a member of a community" (/ies)
"architecture of the future Web must be designed to allow the virtually
unlimited interaction of the Web of people" (vs. documents now)
Giant Global Graph - dig.csail.mit.edu/breadcrumbs/node/215
SW deployment:
[3] T. Berners-Lee, J. Hendler, O. Lassila, The semantic web, Scientific American (May 2001) 28–37.
[10] J. Hendler, Web 3.0 emerging, IEEE Computer 42 (1) (January 2009).
"disruptive potential" of SWt, "important paradigm shift"
"little work in understanding the impact of their new capability"
"the smaller we can make the individual steps of this transformation, the
easier it will be to find humans who can be incentivized to perform those
steps."
"need to develop mechanisms to enable [connections between people]"
Lack structure for formally computing qualities like:
trustworthiness
reliability
expectations about use of information
privacy
copyright
(etc)
"requires data structures... to treat social expectations and legal rules as
first-class objects" ("declarative rule-based infrastructure that is
appropriate for the Web").
"open and distributed nature of the Web requires that rule sets be linked
together."Cross-context use, sometimes unanticipated.Inconsistency sure to
arise. No logics that control contradiction have been shown to scale well.
New approaches to problem of specifying contexts (need).
SMs must be able to apply different policies based on context.
Work in ontologies must extend to allow user communities to identify bias and
share different interpretations.
Current security models / mechanisms insufficient.
[1] - formal models for privacy: L. Backstrom, C. Dwork, J. Kleinberg,
Wherefore art thou r3579x?: Anonymized social networks, hidden patterns, and
structural steganography, in: Proceedings of the 16th International World Wide
Web Conference, Banff, 2007, pp. 181–190.
Provenance important in determining trustworthiness.
[20] information accountability, legal and public policy: D. Weitzner, H.
Abelson, T. Berners-Lee, J. Feigenbaum, J. Hendler, G. Sussman, Information
accountability, Communications of the ACM (June 2008).
policy-rule-based languages. Reasoners that can interpret policy and determine
which uses of data are policy-compliant. -> How to tackle scaling?
[4] Lit review: T. Berners-Lee, W. Hall, J. Hendler, K. O’Hara, N. Shadbolt,
D. Weitzner, A framework for web science, Foundations and Trends in Web
Science 1 (1) (2006).
I also translated all of my written meeting notes into Evernote, which
promptly glitched out and doubled the amount of typing I had to do. (I
considered switching back to Google Docs, but I need labels). I sure love
technology.
I refamiliarised myself with the structure of OWL. Awesome diagrams
here.
I did some more thinking about how I need to work with amateur content
creators to make an ontology that fits their workflow. I should have finished
the planning stage of this ages ago, but.. I blame the ILWhack.
I keep wondering about the best way to have a system of consistent URIs across
a network where the information can move from server to server on the whim of
a user. During this wonderment I discovered that purl.org's login system is
broken. I joined the mailing list, and people complain about it and have it
re-fixed fairly regularly, so I'll just wait..
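The idea behind purl.org-style persistence is a layer of indirection: the stable URI never changes, while an updatable mapping redirects to wherever the data currently lives. A minimal sketch, with entirely hypothetical URIs:

```python
# Minimal sketch of persistent-URI indirection, purl.org-style: the
# stable identifier stays fixed while its target can be updated as the
# data moves between servers. All URIs here are hypothetical.

registry = {"http://purl.example/people/amy": "http://server-a.example/amy"}

def resolve(persistent_uri):
    """Return the current location, as an HTTP 302 redirect would."""
    return registry.get(persistent_uri)

def move(persistent_uri, new_location):
    """The user moved their data; only the mapping changes."""
    registry[persistent_uri] = new_location

move("http://purl.example/people/amy", "http://server-b.example/amy")
assert resolve("http://purl.example/people/amy").startswith("http://server-b")
```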
I'm hacking around with the ultimate goal of creating an interface that allows
people to generate a GeoRSS feed for a particular area. Ally of
GreenerLeith suggested this, so that they can
use this to feed into their own apps. A stage beyond that is to smoosh the
whole lot into a Wordpress plugin, to make it accessible to anyone (who uses
WP).
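GeoRSS (Simple) is just an RSS feed whose items each carry a `georss:point` with a latitude/longitude pair. A sketch of generating one with the standard library (the items are invented; a real version would first filter a data source by the requested area's bounding box):

```python
import xml.etree.ElementTree as ET

# Sketch of generating a GeoRSS (Simple) feed: plain RSS 2.0 where each
# item carries a georss:point. Item data here is invented.
GEORSS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS)

def build_feed(title, items):
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    for name, lat, lon in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = name
        # GeoRSS Simple: "lat lon" separated by a space
        point = ET.SubElement(item, f"{{{GEORSS}}}point")
        point.text = f"{lat} {lon}"
    return ET.tostring(rss, encoding="unicode")

xml = build_feed("Leith events", [("Community meetup", 55.975, -3.17)])
```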
Lovingly crocheted over a couple of days. No pattern, all approximated, with
scraps of wool of the appropriate colours I just happened to have.
I started with a chain the length of the width (shortest side) of my Nexus 7,
and single crocheted in each stitch down both sides, and carried on round; so
I did the very bottom first and worked my way up in a flat rectangular spiral.
I dunno if that makes sense, but it immediately begins to take shape.
When I got to the right height, I carried on across the back, then just
chained across the front, and continued as before, to make the opening.
Though it may look deliberate that the 'tail' sticks out a bit at each side, I
actually made it too wide. So I had to decide whether to carry on making it
too big and pad it on the inside, or tuck the ends in and make it tighter with
the green. I chose the latter and it worked far better than I anticipated.
Most of it is just double crochet for no particular reason, but his fuzzy lil'
chest is a pattern of alternating front-post triple crochet and back-post
triple crochet (triple to compensate so it stays level with the 'normal'
double crochet around the triangle, as front- and back-post double crochet are
slightly shorter than regular double crochet).
His feet and wings are just chains; his eyes made just the way you'd make
small circles, but skewed a bit to be ovals (I can't remember, I probably
chained three, then [sc, hdc, hdc, tc, hdc, hdc, sc]x2, then sc in each stitch
around. Don't hold me to that though). The black centres are literally just
threaded through and through until they looked right, and the loose ends used
to attach. I shaped the beak by periodically crocheting two together. I'm
surprised it's as remotely symmetrical as it is, because I didn't count
anything.
I spent a good deal of time during January and February helping to organise a
couple of Open Data oriented events. At least, that's the excuse I'm sticking
to for not having done much of my PhD in that time.
The Smart Data Hack, also known as ILWhack (Innovative Learning Week* hack)
came first, between the 18th and 22nd of February.
Innovative Learning Week at the University of Edinburgh is an annual week of off-timetable activities for students, designed to enhance their learning experience. Arguably every week in higher education should be innovative and striving to provide the best possible education... And 'Innovative Learning' implies the student should be making the special effort, and I don't think many would be happy with the idea of paying so their tutors can have a week off, so maybe Innovative Teaching Week would be more... better. But that aside.
The hack was targeted at first and second year undergraduates in Informatics
on the basis that third and fourth years would be busy with final projects.
This was by no means a restriction however, and we harboured hopes of enticing
along design students and data buffs from other departments to mix up the
skill set a bit as well.
I knocked up a website with two primary
functions.
Students could pre-register, add some info about themselves and start to form teams online.
Anyone interested in getting involved who wasn't a student could figure out where they might fit in and get in touch. This included people who could sponsor prizes, present real-world challenges to solve, offer data to be wrangled, or provide technical support to participants.
We anticipated about 50 students, and invited them to form teams of up to 5.
In parallel with gathering sponsorship, we came up with five prize categories
of equal merit:
Best for travel
Best for health and wellbeing
Best for communities
Best visualisation or UI
(First year prize for) Best data mashup
We hoped to encourage students to make whatever they wanted, using whatever
technologies they wanted, with use of open (or specially provided) data being
favourably looked upon.
Skyscanner were the first main sponsor on board,
pledging prizes for two categories and some massive datasets that aren't
usually public and access to internal APIs, as well as engineers to mentor.
We partnered with ALISS to encourage use of their local
health and wellbeing data API; ALISS also sponsored in part a prize category.
The City of Edinburgh Council were on board with
some never-before-seen downloadable datasets (still
online!), a bunch of
pre-approved API keys and refreshingly open minds and supportive attitudes.
CompSoc heroically sponsored an entire prize category
and promoted the event to its members.
Greener Leith proposed a challenge and
sponsored a special Mosque Kitchen lunch for everyone after the mid-point
presentations on Wednesday.
We were able to hold some terrific practical
workshops, thanks to:
Tom Armitage and Stuart MacDonald, for handling geolocated resources.
Philip Roberts, for data visualisation with d3.js.
Oli Kingshott, for an introduction to version control, and HTML5 for beginners.
We also recruited mentors from UG4 and PhD students, as well as industry
professionals, who were consistently present in the hacking space all week or
available by Twitter, email and IRC.
We marketed the event in the couple of weeks prior (though we were organising
up to the very last minute) through shout outs in lectures, posters around the
Informatics department, emails to many university mailing lists and word of
mouth.
As a result, we overshot our expected numbers, with well over 100 sign-ups by
the start of the week. This was good news and bad news at the time, as we had
to scramble around to make sure we had enough sponsorship to feed everyone and
whatnot.
By the end of the week, there were around 80 students still actively
participating, across about 25 teams. Pretty good! Most of them were
Informatics undergraduates as expected, but we had a handful of postgraduates
and students from the ECA as well.
And the outcome?
Some amazing projects and really positive feedback from participants and
supporters alike.
Naturally only a couple of days passed before somebody noticed that I hadn't
sanitized input fields on the website for HTML and CSS input, so they made the
projects page spin and play the Harlem Shake before I sorted that out, having
been alerted at around midnight. /grumble. Should have seen that coming, of
course.
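The fix amounts to escaping user-supplied text on output, so any markup arrives as inert text rather than being interpreted by the browser. A Python sketch of the idea (in PHP, `htmlspecialchars()` does the same job):

```python
import html

# The prank worked because user-supplied text was echoed into the page
# unescaped, so injected <style>/<script> tags took effect. Escaping on
# output neutralises them. (The team name here is an invented example.)

def render_profile(team_name):
    return f"<li>{html.escape(team_name)}</li>"

evil = '<style>body{animation:spin 1s infinite}</style>'
safe = render_profile(evil)
assert "<style>" not in safe  # tags arrive as inert text
# safe == '<li>&lt;style&gt;body{animation:spin 1s infinite}&lt;/style&gt;</li>'
```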
In the end we gave away £1500 in Amazon vouchers, five Nexus 7s and ten Kindle
Fires. Skyscanner even upped their sponsorship to three prizes because they
were so spoilt for choice.
It was a really exciting and inspiring week for everyone involved. Many of
the students are taking their projects further (which is probably the most
important outcome) and are in discussions with relevant parties to do so.
4th - 10th March I played with the Tell Me Scotland SPARQL endpoint to put Scottish public notices on an OpenStreetMap: https://rhiaro.co.uk/publicnotices/map.php. It works intermittently, as the TMS endpoint is a 'proof of concept'. Okay, that's not PhD related, but it's still interesting.
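The general pattern behind the map is: send a SPARQL query to the endpoint's HTTP interface, then turn each row of the JSON results into a map marker. A sketch of that pipeline — the query, endpoint URL and property names are illustrative, not the actual Tell Me Scotland schema, and no network request is made here:

```python
import json
import urllib.parse

# Sketch: build a SPARQL-over-HTTP request URL and convert SPARQL JSON
# results into (title, lat, lon) marker tuples. Query and vocabulary
# are invented for illustration.

QUERY = """
SELECT ?title ?lat ?long WHERE {
  ?notice <http://example.org/title> ?title ;
          <http://example.org/lat> ?lat ;
          <http://example.org/long> ?long .
}"""

def request_url(endpoint):
    params = urllib.parse.urlencode({"query": QUERY, "format": "json"})
    return f"{endpoint}?{params}"

def to_markers(results_json):
    """SPARQL JSON results -> (title, lat, lon) tuples for the map."""
    rows = json.loads(results_json)["results"]["bindings"]
    return [(r["title"]["value"],
             float(r["lat"]["value"]),
             float(r["long"]["value"])) for r in rows]

sample = json.dumps({"results": {"bindings": [
    {"title": {"value": "Road closure"},
     "lat": {"value": "55.95"}, "long": {"value": "-3.19"}}]}})
markers = to_markers(sample)
```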
I went to the 5th OKFN Edinburgh meetup at Napier. Look, evidence, I'm in
the picture!
Dave Robertson
asked me hard questions about how I'll turn my vague research interests and
intentions to make something useful into a PhD with genuine contributions to
knowledge and I floundered a bit, but he agreed to be my second supervisor
nonetheless. I hadn't properly thought about it in those terms, and I'm still
figuring it out with like, words. (As opposed to a gut feeling). I'd always
planned to plough ahead enthusiastically and hope for the best. That's how I
approach everything, actually.
I started reading Community-Based Annotation for the Semantic Web by Matthew
Rowe (2007); I think it was a first-year PhD review, and I was primarily
reference-harvesting. More on that next week.
I attended a Social Informatics Cluster meeting, and heard about the Smart
Society project; very much in its figuring
out stages, but worth following as it appears to have many diverse goals, and
various crowd-computation related outcomes might be relevant to what I'm
doing.
I went to an ESALA lunchtime talk entitled 'Data Objects', by Ian Gwilt of
Sheffield Hallam. I have to admit, I was expecting something more technical
and internet-of-things-y. Actually data objects are primarily tangible
visualisations; 3D printed, carved out of wood or sculpted from bronze. The
research was about how people reacted to and understood data differently when
it was presented in different forms. Interesting, and they got some very
pretty artefacts out of it, but not directly relevant so I don't have room in
my head to store it unfortunately.
I attended the new Social Computing interest group kick-off meeting,
coordinated by Dave Murray-Rust. Everyone introduced themselves, and we
arranged a time slot for future meetings. It looks like I'll be presenting
there at some point in the probably not-so-distant future (it was decided that
everyone should). There were a very diverse bunch of people there, and
'social computing' hasn't been defined officially for the group yet. Quite a
few people are attending to essentially see what all the fuss is about. I'm
hopeful that there will be a hardcore technical leaning, because that's where
most of the gaps in my knowledge are. Well, I'm more gaps than knowledge
about everything, but that's where I feel particularly vulnerable.
The 23rd of February was International Open Data
Day. Since this fell at the end of the
ILWhack week, it
seemed like a good opportunity to take advantage of the momentum and engage
the stirring Edinburgh Open Data scene.
Many ODD were organised around the world; most of them were hacks. In
Edinburgh we went for a different approach, thanks largely to input from local
community activists like Freda,
Leah, Andy and
Ally. Some might call it 'social hacking'.
Our aim was to gather together people with little knowledge of Open Data,
people with data that may or may not be open, and developers and technical
types.
With Open Data stuff exploding worldwide, and developers going nuts creating
cool apps and services that make use of data being released, it's important,
you see, for {local, small-scale, voluntary, grassroots} groups and
individuals to get involved early on. Such parties are arguably likely to
benefit the most from empowerment by data, and if they're not part of the
discussion early on we might well see a lot of services developed that meet
needs imagined by a not-quite-connected but well-meaning techie.
So on that note, we want to spread the word about Open Data to those who might
normally be left behind. When these groups know what the possibilities are
(we can show them successful projects, locally (eg. ILWhack) and worldwide),
what is available, and what could and should be available, we empower them to
take action that will benefit them. More specifically, people can find out
that the English government has released data about such-and-such-of-interest,
and politely demand that the Scottish Parliament or local councils do
similarly. They can interface early on with developers who are keen to start
making, and make sure their real problems get solved (or at least
prioritised over potential imaginary ones). They can get involved with things
like ILWhack, and have a better idea of what it's all about.
Between 10am and 2pm, we gathered around 35 people in the Informatics Forum
and, fuelled by tea, coffee and biscuits, began the discussion.
We started with an hour of ten minute talks, about a variety of topics:
Sally Kerr told us about Open Data at the City of Edinburgh Council; the
progress they've made so far and where they hope to go in the future; NESTA's
local government Make It Local programme helped Edinburgh Council to move
forward with Open Data. She gave a nod to the ILWhack projects that made use
of Council data in the week prior.
Alex Stobart of MyDex talked about big, open data, and
the challenges this presents to citizens and politicians.
Iain Henderson explained the Standard Label; an
easy to read specification for data holders to present to their users how the
user data will be used. Like nutritional advice, but for data.. Other ODD
events were centred around hacking with it as we spoke!
Bob Kerr talked about OpenStreetMap and GeoRSS. I love the obsessive
hyperlocal detail in some places, like where the animals live in Edinburgh
Zoo. On a serious note, OSM has really empowered local governments and NGOs
in developing countries.
Andy Hyde discussed asset mapping for voluntary groups; how ALISS collate
dispersed health and wellbeing information into a central, open repository,
ripe for manual and programmatic access.
From Lizzie Brotherston we heard about the Post-16 Learner Journey Project;
helping the Scottish Government understand the learning landscape. They're
holding a hack in April.
Next it was unconference time!
We had a short while of whole-room discussion, before identifying three key
areas:
Standardising visualisation (headed by Bob Kerr)
Small scale voluntary organisations (headed by Leah Lockhart)
Sustainability of data projects (headed by Ewan Klein)
Everybody picked a group and we broke apart for the next couple of hours.
The final part of the day was a return to the main room, and further room-wide
discussion of the breakout debates.
The standardising visualisations conversation focussed around bringing
people into conversations about data using visuals. Someone pointed out that
if news readers used Open Data visualisations, the general public would be a
lot more interested in Open Data. It's interesting to imagine a future where
data visualisations are embedded into the world, into the landscape. To be
able to interact with data meaningfully, you've got to know what it is - to
recognise it. A standard - think periodic table - would help people to know
exactly what you're talking about straight away. This goes beyond graphs and
charts, into a world of layered visualisations that allow layered public
contributions of interpretations.
Those interested in small scale voluntary organisations discussed data
holding and data access issues, including strategies for persuading big
organisations to open data (eg. by showing success stories, and proving a
certain return on investment). It was agreed that interfacing with developers
is important to get things done that organisations really need; but
organisations might not know what they need. It was discovered that there's a
lot of crossover between groups represented by people who were in this
discussion; common needs but gaps in talent.
Finally, with regards to sustainability of data projects it was agreed that
strategies are needed for keeping things going beyond short hack events; how
to sustain that burst of energy for a longer term usefulness? How to keep
track of everything that's going on, and link communities with events (see
OpenTechCalendar!). Some kind of
coordination body might be useful, or working groups / task forces.
We wrapped up, collected everyone's details for sharing (to ensure
sustainability of the outcomes of the day by making sure everyone can keep in
touch!) and people began to drift away.
There was an enormous positive energy throughout the day. Discussions were
lively and passionate, and we had an excellent mix of people, exactly as hoped
for.
NB. It looks like Joined Up Edinburgh will come under the umbrella of the
Scotland branch of the Open Knowledge Foundation, so
http://scot.okfn.org will be a good place to keep an
eye on now. And to keep in the loop, join the Joined Up Edinburgh mailing
list.
Other people have blogged about this too. Check out these by Leah
Lockhart and Dave
Meikle (more links welcome).
Notes about Meervisage - A Community Based Annotation Tool (for the Semantic Web)
Rowe, M. (2007) Meervisage - A Community Based Annotation Tool. ‘Towards a Social Science of Web 2.0’ Conference at the University of York 5-6th September, 2007.
How SW can benefit from incorporation with existing 'Social Web'.
"...collaborative generation of metadata... using social networks as a user
base..."
Uses fb groups created for sharing and organisation of research. Suggests
posting links to useful resources is comparable to annotating the resource.
Comments are more metadata.
Points out usual stuff of actually generating semantic data being a problem
for SW.
System requirements:
Annotations must be shared in a community.
Annotations can be reviewed and edited (/audited) (by group)
Collaborative
Central repo.
Annotations contain semantic metadata.
Content of resource annotated, not URL.
Communication layer that doesn't interrupt annotation (uses external services).
Review of existing systems:
Annotea [9] [13]
J. Kahan, M.R. Koivunen, E. Prud'hommeaux, R.R. Swick. Annotea: an open RDF infrastructure for shared Web annotations. Computer Networks. 2002.
M. Koivunen. Annotea and Semantic Web Supported Collaboration. Proc. of the ESWC2005 Conference, 2005.
No communication layer (but has discussion threads, wat?).
Can only be edited by author, but can be reviewed by others.
Can be local, private or shared. RDF.
Piggy Bank [10]
D Huynh, S Mazzocchi, D Karger. Piggy Bank: Experience the Semantic Web Inside Your Web Browser. Springer-Verlag GmbH. 2005.
RDF.
Auto and manual. Bundled with scrapers; if they fail, manual. Only of one type.
Share group or global, or save to local 'semantic bank' <-- find out what this is
Reviewed by all, edited by author.
Community of users, but no SNS integration.
KIM [14]
A Kiryakov, B Popov, D Ognyanoff, D Manov, A Kirilov, M Goranov. Semantic Annotation, Indexing and Retrieval. Journal of Web Semantics, Springer. 2004.
Automatic named entity recognition.
Links to knowledgebase with ontology.
Creates new URIs for new entities or links with entities it already knows about.
Global sharing.
Can be deleted but not edited.
No social involvement.
Magpie [11]
J Domingue, M Dzbor and E Motta. Semantic Layering with Magpie. Handbook on Ontologies. 2004.
Auto annotate webpage.
Similar to KIM, but does not hyperlink to knowledgebase; instead each item gets context menu (right click) with services depending on entity.
'Multi-dimensional approach'. Uses ontology to trigger other services depending on concept.
Plugin for IE.
Simply looks for entities that are in ontology (Dzbor 2004).
[1] Using existing information to derive semantics from folksonomies
(delicious):
X Wu, L Zhang, Y Yu. Exploring social annotations for the Semantic Web. Proceedings of the 15th international conference on the World Wide Web, 2006.
[15] Social bookmarking tools and how semantic info aids resource discovery.
Probabilistic model of how resources are annotated:
A. Plangprasopchok, K. Lerman. Exploiting Social Annotation for Automatic Resource Discovery. Eprint arXiv, 2007.
[16] Distributed nature of folksonomies. Improve search mechanisms. Tags not
great:
S. Choy, A. Lui. Web Information Retrieval in Collaborative Tagging Systems. Proceedings of the International Conference on Web Intelligence, 2006.
(vs.)
[17] Rigid taxonomies not great:
C. Shirky. Ontology is Overrated: Categories, Links, and Tags. Clay Shirky's Writings About the Internet, 2005.
[18] Methodology for easier browsing of large scale social annotations:
Z. Xu, Y. Fu, J. Mao, D. Su. Towards the Semantic Web: Collaborative Tag Suggestions. Collaborative Web Tagging Workshop at WWW2006, 2006.
All use one annotation per resource, not annotation of content within, so only
one lot of metadata about a page.
Meervisage
"To aid the process of collaborative annotation of web documents"
Allows sharing of annotations between subset of SNS users (eg. fb group).
Management of users and groups offloaded to third party.
Stored in central annotation store.
Annotations contain author, SNS, folksonomies and date. Made from content
within.
Meerkat is "responsible for generating semantic metadata by annotating
external web resources." Meervisage for management via social network.
Meerkat allows a user to edit another user's annotations if they are members
of the same group on facebook.
Popularity rating of resources rises with fb discussion.
Meerkat informs browser users if they come across a resource that has been
heavily discussed on fb, and by which group etc.
Meervisage also provides RSS feed.
Evaluate by comparing precision and recall metrics of annotations by one user
in an allotted time, and those by a group.
-> Don't know how this helps to assess quality of annotations; maybe I'm dumb? Find out.
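As a reminder to myself, precision and recall over a set of annotations would be something like the following (the function and variable names are mine, not the paper's):

```python
def precision_recall(produced, gold):
    """Precision: fraction of produced annotations that are correct.
    Recall: fraction of the gold-standard annotations that were produced.
    Illustrative only; not necessarily the paper's exact formulation."""
    produced, gold = set(produced), set(gold)
    correct = produced & gold
    precision = len(correct) / len(produced) if produced else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall
```

So comparing one user's annotations in an allotted time against a group's presumably measures coverage and correctness against some gold standard, rather than subjective quality.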
Limited to private sharing, presented as though that's a good thing :s
Oh, because public access would be "laborious and resource intensive".
Annotations rated on usefulness and weighted.
[20] Attempt to describe folksonomies as part of formal ontology. Meervisage
doesn't; limited to users' viewpoint:
S. Angeletou, M. Sabou, L. Specia, E. Motta. Bridging the Gap Between Folksonomies and the Semantic Web: An Experience Report. Workshop: Bridging the Gap between Semantic Web and Web 2.0, European Semantic Web Conference, 2007.
[9] + [13] are most similar. Have groups, but groups aren't already
established networks.
Future work
Annotating multimedia.
Matching assigned tags with ontology terms mined from Web.
[19] Desktop app for annotating text with ontology:
A. Chakravarthy, F. Ciravegna, V. Lanfranchi. AKTiveMedia: Cross-media Document Annotation and Enrichment. Poster Proceedings of the Fifteenth International Semantic Web Conference, 2006.
11th - 17th March Had a sensible conversation with Ewan about when I should stop reading and start actually producing something. The answer is sooner, rather than later. I'm aiming for a literature review that is "neither a first nor final" draft by the end of April. I started outlining topics that should be in it, and began faffing about with LaTeX (and Markdown and Pandoc).
Then started panicking and promptly quadrupled the amount of things on my to-
read list. Oops. I blame Matthew Rowe.
I finished reading his first year review (2007) as well as his paper about
Meervisage: Community-based Annotation for the Semantic
Web.
I compiled a massive list of more things to read from his lit. review, and
then looked up what he's been doing since then. A lot. Mostly relevant. At
the very least, interesting. Damn. Don't worry, I'll prioritise that list,
and don't aim to finish it before I start writing.
Going to Serbia on Tuesday, for Resonate.io. What should I bring back?
Information about creative processes for digital artworks.
Anything to do with open data or decentralized social networking movements.
Potential case studies or people to work with.
... to be discovered?
Had a chat with Russell of Makur Media about
video metadata and ontologies for describing film/TV production processes.
We're doing some of the same research and keeping in touch is likely to be
mutually beneficial. Note to self: Summerhall Cafe is nice and the food is
cheap; go there for lunch one day.
I'm at the Resonate new media festival in Belgrade.
Studying: YES
I started a PhD in the Centre for Intelligent Systems and their Applications
(CISA) in Informatics at the University of
Edinburgh in October 2012. I'm looking at how Semantic Web tools might be able
to augment digital media creation by providing collaborative opportunities for
content creation and improving connectivity between creative projects. I'm
interested in decentralized social networks, and retaining control of and
making good use of personal data.
Writing: Sometimes
Gearing up for Camp Nanowrimo this April.
Working: YES and no
Tutoring and doing ongoing freelance work, but not taking on any new projects.
Week in review: Belgrade
18th - 24th March I went to Belgrade for Resonate.io, and wrote about it here.
Resonate, held between the 21st and 23rd of March in
Belgrade, Serbia, is "...a platform for networking, information, knowledge
sharing and education. It brings together distinguished, world class artists
with an opportunity of participating in a forward-looking debate on the
position of technology in art and culture." (from the website).
Before I left, I suggested I might return with the following:
Information about creative processes for digital artworks.
Anything to do with open data or decentralized social networking movements.
Potential case studies or people to work with.
I largely failed on all three counts.
I was thrown from the outset by the apparent poor organisation of the event.
Not to mention a complete lack of free food. But the main problem was that
well over one thousand people had tickets, but on the first day the main
lecture room could hold a few hundred at best. Seating consisted of a handful
of sofas and armchairs and valuable floor space was occupied by altogether too
many stylish coffee tables. For everyone not lucky enough to be among the
first ten in the room it was aching backs and/or pins and needles all round.
This situation improved slightly after the first day, when two more tracks
opened in slightly bigger rooms, but there was still nowhere near enough
space. People were bursting out of all the doors, so switching tracks was never an
option. There were also several long delays or postponements. A few were
weather related, but too many (ie more than none) were organisational; lack of
projector in main room, etc.
That aside, I was aware that an event labelled 'festival' wasn't going to be
right at the conference end of the party<->conference scale, but I was
surprised at just how much party it was. A party with thousands of people,
where everybody knew someone else but me. This made it particularly difficult
to interact. You might expect the opposite. Indeed, I suspect that for most
people this was the perfect environment to make new friends, start
collaborations etc. I'm (usually) great at networking. I'm never great at
social situations involving large crowds, a bar and loud music. I tried. But I
couldn't catch anyone's eye, there was never a moment to start a conversation.
The most interaction I had over three days was being elbowed out of the way by
people who felt more entitled to see what was going on than me.
I might have fared better if...
...I had succeeded in getting a place at one of the workshops. Places were
very limited, but it was explained that all workshops were open for anyone to
listen in on even if you couldn't participate directly. Had I taken part in
one, it would have been a lot easier to talk to some specific people. I went
along to attempt to listen in, however, to find all of the workshops (twenty
or so) consisted of people grouped around tables, together in the same giant
hall. The actual participants were craning their necks, straining to hear
their workshop instructor over the general clamour of the event, so it was
impossible for bystanders to be involved at all. Plus, only a couple of the
workshops had (handwritten) signs indicating which they were, so there was
also no way of tracking down the ones I was particularly interested in.
...I had been to any of the performances or night club tours that started
about 9pm each day and ran until the early hours of the morning. The
performances, as far as I could tell, were electronic music sets, held in
night clubs or similar venues. I don't do night clubs, and I was knackered
by 7pm anyway, so that was a no go. Having said that, it probably wouldn't
have been easier to meet new people over very loud music in a place where
everyone was getting drunk, so maybe I didn't miss out.
Now I've explained that, I will write a bit about the talks I did manage to
get in to, which were generally interesting and of good quality. (The
itinerary I sketched out for myself beforehand differed greatly from what I
actually achieved because of crowd/small room issues mentioned previously).
These aren't the only things I went to, but the only ones I took notes or
tweeted about.
**Marcin Ignac**, talking about Data Art, showed some really cool things he's done with Plask and WebGL, including 3D data visualisations, hacking with fonts, and realtime installations like a 3D visualisation of global energy market transactions. Plask and WebGL are capable of a lot, just in the browser. He also mentioned basil.js, which is "a library that brings scripting and automation into layout and makes computational and generative design possible from within InDesign" (cite) which looks useful for artists wanting to get into coding.
Mike Tucker ("Unity as a Tool for Non-Games") suggested that Unity fills the creative gap recently vacated by Flash. He started out as a Flash guy, but isn't sad or bitter about Flash's demise, and understands that it's time to move on. His current WIP is an app to explore an abstract visual and audio landscape using the device's gyroscope. The audio is 'physically' located in a virtual 3D world, and changes as you navigate around by moving the device in space.
Julia Laub told us about her Generative Design book, that she worked on as part of her thesis project. She defined (with a diagram) generative design as creating choices, then making choices, rather than controlling a visual output. She created a visualisation of Wikipedia pages that presents as a self-optimising network - as you interact with the diagram to expand the information you want to see, it rearranges itself for optimal viewing. Her book looks amazing, and getting my hands on a copy is on my things-to-do list.
Dmitry Morozov ("An Autonomous Synthesis") showed some great circuit bent installations and sound projects; check out http://www.vtol.tk/.
Signal | Noise (oops, I didn't take down the names of the actual guys) ("Datatainment") talked about gamification of data collection. People like "digital navel gazing"; they derive satisfaction from their own data, and comparing themselves to others. They mentioned a "top secret" client project for which they're aiming to "quantify everything people do"... intriguing...
Lucas Werthin ("Design, Tech and Architecture for Large Scale Projection Mapping") showed us the ins and outs of an incredible project he'd worked on. Described here (with videos).
The onedotzero screening was a
compilation of digital animation work from a number of artists. It was weird
and awesome, with some inspiring visuals and music I need to listen to more
(inspiring for writing fiction, not for the PhD unfortunately). Notes I wrote
during that suggested I need to listen to the music in Warsnare, and the one
with the giant Catzilla in.
Markus Heckmann and Barry Threw ("Building by Doing - Visually guided design in TouchDesigner") described another easy bridge for artists who want to code. I wrote down "TouchDesigner" in my notes during this talk, but I can't remember why now. Find out more here.
My favourite talk was by Ivan Poupyrev ("Computing Reality"). I tweeted
loads about it, but none of them got sent because the wifi and my phone
weren't playing nice or something. Fortunately I also made a ton of notes.
Ivan describes himself as an 'inventor'; he worked for Sony, and now works for
Walt Disney, and he is inventing the future. He has a great ethic and vision
for the world; all about "giving people tools to make the world the way they
want it to be." He envisions a decentralisation of production; large
corporations only want to make their part of the world interactive, not the
whole world. So ordinary people must have the technology to use, develop,
spread, build on.
In 1999, his team created an augmented reality toolkit, before its time. In
2001, they developed a flexible display which is interacted with by bending it
and sliding fingers around the back of the screen. A huge amount of
interactions are possible just by bending and flexing in different ways. In
2004, Sony said "users will never accept a device with no buttons", and all
early touchscreen devices also had buttons because of this. He says the
iPhone was the "fall" of the button, proving everyone wrong. Last year (2012)
the Sony PS Vita added touchback interaction, and Samsung released a
flexible display this year (2013), but "nobody cares".
Now, he says, everything has been invented already, the market is saturated
with new gadgets. He sees the future of the technology curve as embedded in
people and surroundings: "no question... that it's coming to your body ...
going to seep into the environment, disappear into the environment ...
seamlessly, invisibly, efficiently" and describes a reality that computes
itself, where "the computer doesn't have to exist at all."
Ivan was very expressive about not being any kind of "tree-hugger", but is
convinced that we don't need to "make more junk". So many resources have been
used, and the earth can't support another industrial revolution. Instead, he
wants to turn everything that already exists into interactive objects,
including humans, animals and plants. That may sound weird / scary / far-
futuristic but guess what... they've already done it.
Flipping interaction on its head, they're all about not changing the
environment, but changing you, or your perception of the environment.
Touché is used for 'virtual tactile perception'... they can create a charged
field around the human hand so that you feel things differently. The objects
themselves are passive, simple, unchanged. The person just has to be in
contact with the device that creates the field, which can be embedded in an
object you're already touching like clothing, a shoe or an umbrella. Then,
with no wires or weird contraptions, the person can touch some object (like a
teapot) and as the settings of the field are changed, so the texture of the
object appears changed.
With this technology they can also tell who is touching something, or which
part of your own body you are touching, because everything has a different
electronic resistance. An example they produced was a touchscreen drawing
application that changed the pen colour depending on who was drawing, with no
additional information than sensing the fingers on the screen.
Disney has the Botanicus Interacticus - an interactive plant. Electrodes
in the soil transform any plant into a multi-touch controller! Gestures
around the plant (à la theremin) or touching the plant in different ways, can
be mapped to things like sound. It's possible to have very high precision.
All plants are different, too! So the same tech applied to different plants
will cause different outcomes. Video.
There's an open source version of Touché for Arduino.
Ivan also played with 3D printing, and sees this as something that will become
hugely accessible to the extent that people will start to manufacture most
things themselves; or at least, pop down to their local corner shop to get
something printed from an existing design.
They've done some experiments with 3D printing transparent objects, and 'light
pipes' which direct colours and sizes of light precisely. They can create
interactive displays by projecting light from below or behind objects and
piping images onto them.
It's possible now to 3D print a broad variety of sensors.
These things result in interactive objects that respond to you, but all of the
electronics are outside of the object, so you can switch one object for
another one and have it work the same very easily.
He concluded with:
1. Digitizing what we already have, not making more junk.
2. Sustainability requires augmenting humans and growing your electronics.
3. Distributed manufacturing vs. mass production.
Resulting thoughts and ideas
I got a general feeling of disparity between 'art' and 'real-life', with strong suggestions that it doesn't matter if interactive, technology-powered art installations break, so long as people are compelled to play with them. That's something I absolutely loved and absolutely hated simultaneously during my MSc in the ECA last year, and it still causes internal conflict. (I.e. I understand the value of play and experimentation, but I'm passionate about things being useful and empowering; it's possible to do both, and it bothers me when people take the easy way out, slap the 'experimental art' label on it and move on to their next solid-outcome-less project.)
Despite not actually talking to anyone about what I am doing and how it might
in some way link to what they do, I suspect digital artists like these kinds
of people might be good use cases for what I'm trying to make. They
collaborate, have varied processes. And are more likely than amateur
YouTubers to be interested in engaging with a new experimental technology.
They could, for example, be incentivised to record their processes and actions
over the course of a project, and be rewarded with visualisations of their
data, and comparisons with the data of others. (Actually making the
visualisations is out of my remit, but there will be someone who can..).
A thing I should do is analyse blogs, articles, reports, etc about creative
digital projects for the vocabulary about their processes. I thought about
this as one of the speakers was just describing step by step the process for
one project... but I wasn't listening properly; I only realised in time to
have this thought, too late to write it all down. But there will be loads of
documentation already out there that can be harvested.
I always think of my project as something that helps a lot toward connecting
with others for collaborating, but a large part can be finding other art/media
projects for inspiration. That kind of pitch would sell it to this kind of
audience, at least.
Change of plan. Trying to run the whole thing with Blogger templates did not
result in a happy time. So the main site pulls and manipulates the Blogger
RSS feed, with SimplePie making that easy.
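The gist of the approach (sketched here in Python with a stand-in feed, rather than my actual PHP/SimplePie code) is just: grab the feed, pull out the items, and re-render them however the main site wants.

```python
import xml.etree.ElementTree as ET

# Stand-in for the Blogger RSS feed the real site fetches over HTTP.
FEED = """<rss version="2.0"><channel>
<item><title>Week in review: Belgrade</title><link>http://example.com/belgrade</link></item>
<item><title>Restructuring the site</title><link>http://example.com/restructure</link></item>
</channel></rss>"""

def feed_items(feed_xml):
    """Extract (title, link) pairs from an RSS 2.0 feed string."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

# The main site can then render these however it likes, e.g.:
for title, link in feed_items(FEED):
    print('<a href="{}">{}</a>'.format(link, title))
```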
I then simplified the actual Blogger template and made it match and stuff.
The general intention now is that individual posts and comments will be
viewed, if necessary, via Blogger, in a kind of archive-y way, but the content
organised as I see fit exists on the main site as the primary port of call.
But my goodness does Blogger automatically add some crap into its templates.
I delete stuff and it comes right back. Pointless classes, and divs, and
general unnecessary bulky markup. I give up with trying to strip all that
out.
In 2008 I was in my first year of university, and the second ever Lovelace
Colloquium was held in Leeds. I was encouraged to attend by Professor Cornelia
Boldyreff and then-PhD student, now-Dr, Beth Massey. Doing so may have changed
my life.
At my first Lovelace, I was introduced to the very concepts of conferences,
mentors and (importantly) networking. The event was, and has been ever since,
a forum for thought-provoking technical talks, inspiring motivational speeches
and stimulating discussions about technology-related disciplines, careers, and
women's role within this world. To attend Lovelace is to be surrounded by
extraordinary and excited minds; undergraduates at the top of their game, and
successful academics and industry professionals to advise and mentor. Having
now been along as an attendee, a poster competition entrant and for the past
two years as a judge, the conference has provided perfect annual milestones to
mark my own academic progression and personal development. I have met so many
wonderful people and made so many important connections thanks to this event
that I genuinely think I would be in a different place today, perhaps as a
different person, had I never been. I can trace back directly or indirectly to
one or other Lovelace Colloquium many of the opportunities I have had to
develop academically (poster presenting, inspiring conversations),
professionally (networked my way to a Google internship) and personally
(overcoming low self-confidence, understanding imposter syndrome and
conquering public speaking).
This year's, hosted by the School of Computer Science at Nottingham
University, has been no different.
For the first time ever I arrived with time to spare before registration, and
got to know some of the other helpers and attendees. I was put in charge of
organising posters, directed towards a room containing lots of large fuzzy
blue boards, divided up the space based on the number expected in each
category (First Year, Second Year, Final Year, and taught Masters) and
cheerfully handed out drawing pins to entrants as they arrived.
At 10 the crowd who had gathered in a lecture theatre were welcomed by the
superhuman Dr Hannah Dee, and the first round of talks began.
Instantly relevant (to me), Natasha Alechina discussed work on logic in
ontologies. The use of logic can help with debugging when creating new
ontologies by detecting inconsistencies (eg. fallacies, contradictions) or
incoherence (eg. empty sets). The method they use is to compute a minimal set
from a big graph in which nodes are statements, and they keep track of where
all the statements are derived from. It was "surprisingly fast" when tested
with 1600 large random ontologies, compared to state of the art methods to
compute minimal sets.
Logic is also useful in ontology matching, for example Ordnance Survey
vocabularies versus Open Street Map. Logic helps the process by finding what
might need to be changed or removed, but human intervention is needed to make
the final call.
Next up, Jemma Chambers turned out to be a brilliant speaker and surely
inspired everyone in the room by telling us how she'd made the most of a
career in technology over the past decade. She was in her last week as a CISCO
business development manager, about to move to a similar role at Virgin Media.
She started with some statistics:
51% of gamers are girls, but only 6% of those who make games are female.
21% of jobs in technology overall are held by women.
Companies with women in their management report a 34% return on investment over companies with only males.
20% of C-level (CEO, CTO, CIO, etc) leaders worldwide are female.
(Disclaimer: I may have botched the context of those stats slightly, my notes
aren't very clear. But you get the idea. Also she didn't say where these stats
are from).
Jemma did a year-in-industry during her degree, programming for Oracle. She
was bored out of her mind coding (I'm sure some people in the audience
sympathised, but probably a minority) and thus learnt what job she didn't
want to do when she graduated. Instead, she joined an accounts management
graduate program at CISCO, had some doubts but stuck it out, rocked hard in
sales and climbed the ladder through hard work and force of will, despite
various sexist or ageist behaviour directed her way. A key point here is
whatever you end up doing, do it well; being successful wherever you end up
opens doors to what you really want to do, if you're not already there.
Especially in the big tech companies like CISCO, where moving between jobs
internally is facilitated and even encouraged.
On a related note, Jemma talked a bit about the flexibility of CISCO (and
other similar companies). Working hours, for example, are yours to choose so
long as you get the job done. Similarly she's had no problem negotiating
maternity leave, and eighteen months after the birth of her son she's working
three days a week (and still feels guilty about dropping him with the
babysitter).
Naturally she mentioned a few (legitimate) generalisations about women in the
workplace (nothing I haven't heard before, but this is my fifth Lovelace)
and followed them up with some solid advice. Women seem to attribute success
to outside forces like luck, or kindness of others, where men attribute
success to themselves. It's much easier to move forward if you remind yourself
that you worked hard for this and deserve it.
Successful men are more likely to be perceived as likeable than successful
women, who are often construed as bitches. Ignore what other people think, and
don't let yourself get walked on to try and make friends. At the same time,
don't let this stereotype go to your head; remember to support other women in
the workplace rather than being competitive.
Women and men have different leadership styles (generally) as well as other
strengths and weaknesses of their own, and it's a combination of the two that
really make a successful team, not more of one than the other.
Jemma recommends reading Lean In by Sheryl Sandberg.
She also discussed the various merits of networking (of which I am happy to
attest there are many!) and how to source mentors in the tech community.
This talk was a fantastic one to start the day with, especially to prompt any
in the audience who might otherwise have not done so, to talk to everybody.
Jemma's enthusiastic speaking style will have kept everyone engaged, too, even
those still waking up.
Dr Julie Greensmith filled us in on her journey from a pharmacy undergraduate through to her current work on artificial immune systems. These are algorithms inspired by human immune systems; robust, decentralized, adaptive and tolerant. They work by knowing what is normal instead of what isn't, which is particularly useful if you don't know what attack is going to come next. Their early work, though excellent, was based on a rudimentary computer scientist understanding of how immune systems work; these days they have a more interdisciplinary team with biologists to improve things even further.
Gillian Arnold, who is exceptionally well known and officially recognised as An Inspiring Woman, was filling in for a speaker who couldn't make it. She talked through the best career moments of various people she knew, which ranged from getting software into the hands of the public to promotions and financial incentives. She also talked through a few of the stereotypical problems women have in a male-dominated workplace, but most of what she could have said had been covered by Jemma. A pro tip for getting attention at meetings if you're being talked over is to bang the table.
Dr Hannah Dee gave us a technical talk about her current research, as well as a little background on how she got where she is. She is much happier as a lecturer as opposed to a post doc, as she gets to direct her own research areas, and isn't constrained within fields she's not totally comfortable (like surveillance). So now she's interested in time and change in nature, doing things like laser scanning and time lapsing plants to find out new things that are particularly hard to find out. Some really interesting stuff about camera hacking with the Canon development kit, which lets you write programs in Lua or BASIC, and provides the sorts of menu options you'd usually only find on a really expensive camera.
Milena Nikolic is an engineer at Google London who has worked on Google's mobile sites, integrating results from mobile app stores into search results and the Android Market / Play Store. She says she has undergone a "journey of scale", and loves shipping projects that make a real difference and are used by real people. She answered lots of questions about working at Google. As with Jemma's experience at CISCO, hours are flexible at Google, and there are no strict iterative phases for development, but projects have their own cycles. She doesn't spend as much time coding as she'd like, but this varies depending on the stage a project is in, too.
Then someone asked "why are girls scared of coding?" and a lively discussion
ensued. For some reason I didn't take notes, but things I can remember that
were suggested include:
Girls are more hesitant about diving in, or scared of breaking things. To progress with programming, you've gotta just keep trying and failing.
Girls are more emotionally affected if their code does fail. Guys just shrug it off and try something out. (I personally have never felt like this).
Girls are less likely to be exposed to programming or programming-like activities at an early age, so by the time they come across computer science they may see it as boring, too mathsy or not creative. I suspect that had I not got interested in making websites aged ten, it might have passed me by during high school, and I would have ended up doing chemistry or French at university.
There were more; I'll add them if I remember.
In between these fantastic talks were coffee, lunch and networking breaks and
of course, poster judging. I teamed up with Milena Radenkovic to assess the
second years, and after three quarters of an hour of lunch, plus a 'last
minute' extra half-hour before the decision had to be made (thus I missed the
panel discussion), we had narrowed it down to five... It was hard.
Seriously. We discussed the poster content, presentation, practicality of the
ideas, whether the student was showing a project they were personally involved
with or intending to do (this holds weight with me) and how well the student
explained their ideas in person. They were all brilliant on all counts. We
negotiated splitting the second place prize in two, but still had to choose
three out of our final five.
Eventually we settled on Carys Williams (quantum cryptography; University of
Bath) for the first prize, and Heidi Howard (routers that pay their way;
University of Cambridge) and Jo Dowdall (smart tickets; University of Dundee)
for joint second place.
I only wish I'd had time to look at the rest of the posters!
I finished off the day by joining other attendees for dinner, which was all
round brilliant, and resulted in a late night.
Inspiring and empowering: The Lovelace Colloquium, Nottingham 2013
career in technology over the past decade. She was in her last week as a CISCO
business development manager, about to move to a similar role at Virgin Media.
She started with some statistics:
51% of gamers are girls, but only 6% of those who make games are female.
21% of jobs in technology overall are held by women.
Companies with women in their management report a 34% return on investment over companies with only males.
20% of C-level (CEO, CTO, CIO, etc) leaders worldswide are female.
(Disclaimer: I may have botched the context of those stats slightly, my notes
aren't very clear. But you get the idea. Also she didn't say where these stats
are from).
Jemma did a year-in-industry during her degree, programming for Oracle. She
was bored out of her mind coding (I'm sure some people in the audience
sympathised, but probably a minority) and thus learnt what job she didn't
want to do when she graduated. Instead, she joined an accounts management
graduate program at CISCO, had some doubts but stuck it out, rocked hard in
sales and climbed the ladder through hard work and force of will, despite
various sexist or ageist behaviour directed her way. A key point here is
whatever you end up doing, do it well; being successful wherever you end up
opens doors to what you really want to do, if you're not already there.
Especially in the big tech companies like CISCO, where moving between jobs
internally is facilitated and even encouraged.
On a related note, Jemma talked a bit about the flexibility of CISCO (and
other similar companies). Working hours, for example, are yours to choose so
long as you get the job done. Similarly she's had no problem negotiating
maternity leave, and eighteen months after the birth of her son she's working
three days a week (and still feels guilty about dropping him with the
babysitter).
Naturally she mentioned a few (legitimate) generalisations about women in the
workplace (nothing I haven't heard before, but this is my fifth Lovelace)
and followed them up with some solid advice. Women seem to attribute success
to outside forces like luck, or kindness of others, where men attribute
success to themselves. It's much easier to move forward if you remind yourself
that you worked hard for this and deserve it.
Successful men are more likely to be percieved as likeable than successful
women, who are often construed as bitches. Ignore what other people think, and
don't let yourself get walked on to try and make friends. At the same time,
don't let this stereotype go to your head; remember to support other women in
the workplace rather than being competitive.
Women and men have different leadership styles (generally) as well as other
strengths and weaknesses of their own, and it's a combination of the two that
really make a successful team, not more of one than the other.
Jemma recommends reading Lean In by Sheryl Sandberg.
She also discussed the various merits of networking (of which I am happy to
attest there are many!) and how to source mentors in the tech community.
This talk was a fantastic one to start the day with, especially to prompt any
in the audience who might otherwise have not done so, to talk to everybody.
Jemma's enthusiastic speaking style will have kept everyone engaged, too, even
those still waking up.
Dr Julie Greensmith filled us in on her journey from a pharmacy undergraduate through to her current work on artificial immune systems. These are algorithms inspired by human immune systems; robust, decentralized, adaptive and tolerant. They work by knowing what is normal instead of what isn't, which is particularly useful if you don't know what attack is going to come next. Their early work, though excellent, was based on a rudimentary computer scientist understanding of how immune systems work; these days they have a more interdisciplinary team with biologists to improve things even further.
Gillian Arnold, who is exceptionally well known and officially recognised as An Inspiring Woman, was filling in for a speaker who couldn't make it. She talked through the best career moments of various people she knew, which ranged from getting software into the hands of the public to promotions and financial incentives. She also talked through a few of the stereotypical problems women have in the a male-dominated workplace, but most of what she could have said had been covered by Jemma. A pro tip for getting attention at meetings if you're being talked over is to bang the table.
Dr Hannah Dee gave us a technical talk about her current research, as well as a little background on how she got where she is. She is much happier as a lecturer as opposed to a post doc, as she gets to direct her own research areas, and isn't constrained within fields she's not totally comfortable (like surveillance). So now she's interested in time and change in nature, doing things like laser scanning and time lapsing plants to find out new things that are particularly hard to find out. Some really interesting stuff about camera hacking with the Canon development kit, which lets you write programs in Lua or BASIC, and provides the sorts of menu options you'd usually only find on a really expensive camera.
Milena Nikolic is an engineer at Google London who has worked on Google's mobile sites, integrating results from mobile app stores into search results and the Android Market / Play Store. She says she has undergone a "journey of scale", and loves shipping projects that make a real difference and are used by real people. She answered lots of questions about working at Google. As with Jemma's experience at CISCO, hours are flexible at Google, and there are no strict iterative phases for development, but projects have their own cycles. She doesn't spend as much time coding as she'd like, but this varies depending on the stage a project is in, too.
Then someone asked "why are girls scared of coding?" and a lively discussion
ensued. For some reason I didn't take notes, but things I can remember that
were suggested include:
Girls are more hesitant about diving in, or scared of breaking things. To progress with programming, you've gotta just keep trying and failing.
Girls are more emotionally affected if their code does fail. Guys just shrug it off and try something out. (I personally have never felt like this).
Girls are less likely to be exposed to programming or programming-like activities at an early age, so by the time they come across computer science they may see it as boring, too mathsy or not creative. I suspect that had I not got interested in making websites aged ten, it might have passed me by during high school, and I would have ended up doing chemistry or French at university.
There were more; I'll add them if I remember.
In between these fantastic talks were coffee, lunch and networking breaks and
of course, poster judging. I teamed up with Milena Radenkovic to assess the
second years, and after three quarters of an hour of lunch, plus a 'last
minute' extra half-hour before the decision had to be made (thus I missed the
panel discussion), we had narrowed it down to five... It was hard.
Seriously. We discussed the poster content, presentation, practicality of the
ideas, whether the student was showing a project they were personally involved
with or intending to do (this holds weight with me) and how well the student
explained their ideas in person. They were all brilliant on all counts. We
negotiated splitting the second place prize in two, but still had to choose
three out of our final five.
Eventually we settled on Carys Williams (quantum cryptography; University of
Bath) for the first prize, and Heidi Howard (routers that pay their way;
University of Cambridge) and Jo Dowdall (smart tickets; University of Dundee)
for joint second place.
I only wish I'd had time to look at the rest of the posters!
I finished off the day by joining other attendees for dinner, which was all
round brilliant, and resulted in a late night.
1st - 7th April On Monday and Tuesday I'd blocked out 'PhD' time on my calendar, but what was I doing? I do make a point of updating my calendar with hindsight so it doesn't contain lies. I must have done something.
On Wednesday I jetted off to exciting Newark, then on Thursday I was in
Nottingham for the Lovelace Colloquium (and had a fabulous
time). Friday to Sunday saw me running around in circles for
GemuCon.
GemuCon, a first-time gaming convention, wasn't normally the sort of event I'd
go to. Especially not with the £35 ticket price tag.
But it was being organised by one of my friends from my undergraduate, so I
agreed to do the website (violating my no-more-freelance-work policy), and
having botched together a custom registration system (scope creep) I was
drafted in as 'Registrations Officer' on the committee, too. Since I was in
Nottingham on the 4th for the Lovelace Colloquium anyway, I had no excuse not
to go.
It was a good job I did, as the checking-people-off-who-arrive system was web
based, and the hotel wifi was not playing ball from the outset. We'd thought
of that of course, and brought backup wifi dongles. Neither of which could
get signal. So half an hour before registration opened I was writing a script
to export the database into a nicely formatted spreadsheet (sounds simple;
wasn't; ask if you're curious) so that we had more than one machine (I had the
database locally on my laptop) we could register over 700 people with. Then
it was literally non-stop.
The other reason I was there was to morally support my good friend
TomSka, who was attending as a guest because he is
Internet Famous.
So my time was split between hanging out in the Operations Room (mostly) to
help confused con-goers with things like registration, lost property, picking
up merchandise, finding the stairs, getting free cupcakes; making myself
useful by running up and down ten flights of stairs on errands (until I
discovered the service elevator; two 6-man lifts between 800 people hadn't
been so accessible); and hanging out with Tom and Matt.
On Saturday I helped him on his merch stall (we sold everything but all of the
wristbands and all of the keyrings and earrings).
During the quiet times when there were other big events on, and thus no
customers, I had to make my own fun.
On Sunday I live-tweeted Tom and Matt's panel "How to YouTube".
This generated a small amount of controversy, as people who have never had to
live off advertising revenue often hate people who live off advertising
revenue even if it means they have found a way to survive by doing something
they love, and can provide what they create to the world for free.
Frankly I'm just excited that we do live in a world where young creatives
can be their own boss, make a living from doing what they love, and where the
only hoops they have to jump through to do so are getting better at their
craft. Whilst the advertising-centric revenue model may be outdated and may
be despised by a good number of people, it's working for YouTubers at the
moment and I haven't seen a better alternative present itself. Not everyone,
particularly consumers of amateur media, can afford to pay for content they
consume; accessing content for free empowers consumers too because their
entertainment choices are not controlled by the same person who controls their
finances (and thus probably most other aspects of their lives). I'm also
fairly convinced that if the advertising revenue model falls flat in the
future, amateur content creators will be much faster to recover and adapt than
traditional media industries would.
The other great thing about this business model - for YouTubers at least - is
that many/most don't start out with financial motivations. After a while they
realise thier hobby is giving pleasure to increasing numbers of people, so
they carry on, and suddenly a side-effect is that they're making money as
well, at no cost to their audience. (This may change as YouTubing is
acknowledged as a career choice).
Maybe naive or overly idealistic, but I don't believe anyone should be stuck
doing a job they hate. It's a very, very long term goal for society, but the
ultimate utopia is a world in which everybody is motivated and empowered to
develop skills they enjoy or knowledge they're passionate about, and to put
their abilities to some use that can sustain an acceptable standard of living
for themselves and their family. Technology plays a crucial role in this (for
a start, all the jobs nobody wants to do will be automated).
Anyway, Tom and Matt's panel was full of sound advice for digital creatives
just starting out, though the current landscape is a very different one from
when _they _began their YouTube journeys (for example: no YouTube).
Though I didn't experience much of it myself, GemuCon had all sorts going on.
There were a few rooms packed full of video game consoles (for people to
entertain themselves at leisure, as well as scheduled tournaments with cash
prizes), a room of tabletop games, merch dealers and artists galore, various
panels with the various guests, a talent show, a cosplay masquerade, and
parties all night every night. Now, I don't like parties, but even I couldn't
resist hanging around at a rave for a bit when the music was Pokemon theme
remixes, or the Zelda soundtrack. Et cetera.
The most impressive thing about this wee convention (and 700-odd people is
wee, compared with similar more established cons) is the air of friendliness
and solidarity that seemed to be ever-present. Granted not everyone could
have been happy at every moment, and there was definitely douchebaggery from
time to time, but in general there was a unification of nerds; an unspoken
understanding between the stereotypically socially awkward that allowed people
to come out of their shells and enjoy themselves in a way that they might
normally suffer abuse for, thanks the the common background provided by video
game and Internet culture. This is somewhat tongue in cheek, but... hopefully
you know what I mean.
I made a few new friends, too.
If you're thusly inclined, check out 'official' photos (here
and here) and videos (here) from Team Neko.
The UK Ontology Networks Workshop took place over one day in the Informatics
Forum.
There was a mix of people there; some talks were way over my head and very
technical, and some talks were by people who confessed they had had to look up
"ontology" that morning. And things in between.
Lazy writeup, but following are notes as I scribbled them:
John Callahan
US navy research.
Focused information integration.
Human intervention to keep predictive part on track. Tweaking.
Alan Bundy
Interaction of representation and reasoning.
Changing world so agents must evolve. How to automate? What would trigger a
need for change:
Inconsistency
Incompleteness
Inefficiency
how to diagnose which?
Interested in language and perception change.
Unsorted first order logic algorithm called Reformation. Based on standard
unification algorithm.
Allows blocking and unblocking unification.
Phil Barker
Schema.org
Cetis (JISC funded)
learning resource metadata initiative.
Big names behind schema.org.
= ontology + syntax
Big and growing ontology.
Dumbed down for people.
LRMI adds to it. W3C go through it. It's creeping, how much do the big names
actually care about stuff that's added?
don't know how Google uses it.
People should consider using it for more sophisticated search and
disambiguation.
Gill Hamilton
Doing more with library metadata. Learnt from OKFN. Had to convince people in
charge.
Dublin core, didn't like; not specific enough. Instead RDF > OWL. "We know
best how to structure our data"
Hardest was convincing marketing people that there was no commercial value.
Metadata is advert to actual resource.
Enrico Motta
Traditionally top down approach. So now so many people interacting with
semantic structures, so should involve users.
Recognise there isn't a unique or best way of doing things.
Initial study included modeling task with binary relations.
Patterns that are more or less intuitive. 4D least, 3D+1 most.
N-ary most widely used by experts.
Relationship between reasoning power and intuitiveness of writing? More
creativity needed for simpler ones. (Not really sure what he's saying)
Email him for copy of study.
Chris Mellish
Ontology authoring is hard. Better ways to do it.
Controlled language input (mature tech); responsive reasoning (also mature,
information as you're editing); understanding the process (beginning to
understand more).
Hypotheses:
users don't know what they're doing. What if questions. Many answers, what is
relevant? Depends on context.
Authoring as dialogue.
Todo list.
Useable in the same ways as protégé.
Peter Winstanley
UN classification schemes.
Various vocabularies.
Allow development of cross mapping between government administrations.
Mostly internal currently. Moves to bring externalizing data into the 21st
century.
Peter Murray-Rust
Fight for your Ontologies.
Ontologies in physical sciences. Chemists don't want ontologies. They'll sue
you.
Crystallography uses 'dictionary'. Written in CIF. 20 years to build CIF.
Compare physical sciences to government.
Every program author writes dictionaries that work for them. When different
parties agree, promote to communal dictionary. Provide conventions to help
disagreements.
Show a company can do it as opposed to a rabbiting academic ..
Jeff Pan
Tractable ontological stream reasoning.
Need to be more efficient, scaleable, as things change. Inputs from web.
Dealing with complexities: approximate owl2.
Dealing with frequent updates: to-add stream and to-do delete stream. Truth
maintenance. Evaluation criteria.
Trowl.EU can use with protégé, also supports jena.
Edoardo Pignotti
Semantic web tech to support Interdisciplinary research.
ourSpaces VRE
Provenance crucial.
OPM prov ontology.
Deployed since 2009, 180 users. Comprehensive ontologies but people unwilling
to provide metadata.
paper! Edwards et al. ourSpaces.
Tom Grahame (BBC) @tfgrahame
Content arrangement on BBC sport by tagging, automatic to free up editors to
write.
LD API so systems don't need to know about each other.
Growing from simple rdfxml to more complex ontology.
Can ask much more general and much more detailed questions about sport.
Mapping incoming data is outsourced.
Lots of errors, sometimes system alerts, sometimes manual.
Working on opening the data. Maybe a dump, but licensing issues.
Ewan Klein
Mining old texts for commodities, adding place and time and putting in
structured database.
Transcriptions of customs import records.
Skos for synonyms.
Dbp concepts.
Why? Want to query.
Visualisations.
Tools? Python script.
Janice Watson
Harnessing clinical terminologies and classifications for healthcare
improvements.
Bob Barr
Geographical addressing.
Addressing and address geocoding is important and broad. Not always postal,
but this not addressed (punlol) in ontologies.
Different contexts change meaning of address (for delivering, you only care
about postbox; property sale whole building).
Loads of things to address. Loads of reasons why.
Work held up as national address file is owned by royal mail and might be
sold!
Fiona McNeill
Run time extraction of data. Failure driven. Looking at extraction of specific
information.
Emergency response. Lots of data, timely sharing of data required.
From domestic level to humanitarian disasters.
How can it be automated?
Multilayered incompatibility.
Format
Terminology
Structure
...
Richard Gunn
Towards an intelligent information industry.
Elena Simperl (Soton, sociam)
Crowdsourcing ontology engineering.
CSrc: Brabham 2008.
Distribute task into smaller atomic units.
Humans validating results that are automatically detected as not accurate.
What are the costs? What resources?
Games with a purpose. Like quizzes.
Micropayments or vouchers.
MTurk. CrowdFlower.
Paper about useage of microtask crowdsourcing. ISWC 2012.
Claudia Paglieri
Ontologies in ehealth.
Enrico Motta \\\\- Rexplore
Klink algorithm mines relations between research topics.
Use this! Nope, it's not public. Uees MS Academic research.
Peter Murray-Rust
Content mining expands regular text mining.
Focus on academic stuff.
Chemical Tagger. Takes chemistry jargon and annotated it, knows actions,
conditions, molecules etc.. NLP. Uses ontologies and contributes to
ontologies.
In chemistry, no need to put everything in rdf because there are already lots
of formalisms.
Proper cool PDF to sensible format conversion. Amy the kangaroo. Looking for
collaborators.
Yuan Ren
Ontology authoring in whatif project.
Reasoning with protégé and trowl .
Tractable reasoning. Trowl v fast.
Notes from conversations / breakout discussions:
BBC use owlm triplestore .
Store all their datasets in svn. But they have reads and writes to the live
triplestore all the time.
Lots of people saying minimise owl use because of unpredictable output.
Versioning ontologies (available in owl2) in case third parties change stuff
you use. You're dependent on their software engineering practices. Only good
if they're ahead of the game.
IRIs, Arabic characters in ontologies!
Semantic heavy, maybe make a decision to abstract away to ids and make heavier
use of labels.
Difference between importing and using someone else's.
There's no (practically useful) software that lets you reason over stuff you
haven't imported? (over HTTP?)
Build ontology from reality (data), don't start with no data.
Lode.
Problems with dbpedia URIs changing or disappearing.
Hard to visualize massive graphs. Relational, tabular much easier to
understand.
Amy, wonderful blog! I was just discussing how well you've done with Duncan
Rowland whom I bumped into last week in Lincoln.
Last modified:
My First Bread
Could have been better.
I based it on this bread recipe, but amended according to my mother's
advice.
Thus, I made a batter with 1/3 of the flour (strong white bread flour; 6oz),
all of the yeast (one 7g packet) and all of the water (415ml) and the oil
(30ml), and left that covered in the airing cupboard for an hour where it
bubbled away merrily.
Next I mixed in the rest of the flour (12oz) and some salt and a handful of
sunflour seeds, poppyseeds and pumpkin seeds. Contrary to what my mother
promised, it was very sticky. I stirred vigorously and kept adding flour
until I could cope with it, and kneading began. I kneaded for fifteen
minutes, having discovered that sometimes punching instead of regular
kneading is much more enjoyable.
I turned it into six balls and one long bit, and arranged them on a tray and
in an ad-hoc loaf tin, and they went back in the airing cupboard for about two
and a half hours (I went out). I covered them, as instructed, and that was
where things started to go wrong..
The bits that didn't get stuck to the tinfoil rose great. But the bits that
stuck pretty much didn't get to rise. They rose outwards loads, but upwards
was constrained. Rubbish.
I baked them anyway (30 minutes, 220 degress celcius).
Tastes great, textures is okay, appearance is disappointing. Next time I
won't cover them quite so securely while they're rising. Far more holey than
I was anticipating too; I'm not sure if under- or over-kneading causes this.
For breakfast I had Tesco's own brand bran flakes, with soya milk.
HAHA that's a lie. I had a cupcake. Orange sponge with a square of dark
chocolate in the middle.
Homemade, so it doesn't count.
Lunch was Linda McCartney sausages in my Mum's 'miracle bread' rolls that I
brought back from home a couple of days ago. They have soya flour, and
linseeds, and all sorts of stuff. Also in there was 'fresh' (actually week-
old) spinach, ketchup, tabasco sauce, hummus and a pinch of magic BBQ spice
mix. And bancha hojicha (a Japanese green tea).
Over the course of the afternoon I drank a cold bottled concoction calling
itself jasmine tea from the Chinese supermarket (because whenever I go into a
Chinese supermarket I have to try a new weird drink).
Dinner, then, was a sort of chilli bean stew, because I thought that would go
well with the weird bread...
[Soak and boil dried kidney beans; skip this part for canned ones, and chuck
them in whenever].
Boileth a quarter of a swede and large chunks of new potatoes with the kidney
beans, until everything is soft.
Meanwhile:
Half an onion, two cloves of garlic, a courgette, a red pepper, broad beans
(frozen), a handful of tomatos and a handful of mushrooms into the wok to fry
until they looked mostly cooked. Plus pinches of: Vegeta (which is a
particularly good vegetable stock you'll find in an Eastern European
supermarket); cajun spice mix; chilli powder; piri piri spice mix; garam
masala and mixed herbs, and let sizzle for a while longer. Then Quorn mince
and a tin of chopped tomatoes, a good mix, and simmer for an indeterminate
amount of time (or until the other pan of boiling things is done).
Combine two pans of boiling/simmering things (ideally into largest pan).
I'm now officially a volunteer moderator for the Edinburgh Freegle group.
I love Freegle (/Freecycle) and have used it for years. I've got rid of stuff
I didn't need and given a loving home to many unwanted things. Things I love
about Freegle are:
Saving items from landfill. The environment says thanks!
Getting stuff for free. My bank balance says thanks!
Giving stuff away. Other peoples' bank balances say thanks!
Knowing there's a network of like-minded, environmentally conscious people out there, and every now and again getting to briefly meet them.
Seeing more of the city! Unless it's outrageously far or outrageously raining, I always walk to pick something up. I've seen a dozen parts of Edinburgh I would never have explored otherwise, and discovered a multitude of cute little independent shops that I would have remained disappointingly ignorant of.
Possessions I have gained thanks to Freelge include:
An under counter freezer.
A futon mattress.
Clothes.
An electric citrus juicer.
Ten empty tins for transporting cakes and stuff.
A yoga mat.
Duvet covers and throws.
A kneeling chair.
When I met the other moderators a couple of weeks ago to become officially
trained in the process, I had no idea how much goes on behind the scenes.
Well, I had some idea, but actually there's loads more. Edward
Hibbert is
behind it all.
At it's core, Freegle still runs on Yahoo Groups. This means it's possible to
interact with it through email alone. Thus, moderation takes place through
the Yahoo Groups interface, but hugely enhanced thanks to Edward's moderator
Firefox plugin. (It means I have to moderate using Firefox or suffer the
consequences, but it's not so bad. One day maybe we/I'll port it to a Chrome
extension).
However, there's also a super high tech front-end at
Freegle.in/Edinburgh. Check it out. The web
interface helps people get their subject lines correct and keeps track of
various user stats. It detects where people are posting from from the subject
line and plots it on a handy map; it categorises posts by picking out keywords
from the subject line too, and usually gets it right.
There's definitely more to write about here, but I'm going to save that for
another time.
Meanwhile, if you're not part of your local Freegle or Freecycle group, change
that! Check out ilovefreegle.com to find yours.
It's an easy and amazingly beneficial way to help the environment and interact
with your local community.
I think over rising on your second rise was to blame - 2 and half hours is a
bit much!
I find it best to cover loosely with oiled cling film, but some people just
use a damp tea towel.
Last modified:
Week in review: upcatching
8th - 14th April Wednesday: returned from brief sojourn at my mum's; TechMeetup.
Thursday: 2nd UK Ontology Networks Workshop (link to notes to follow).
Friday: missed Social Computing seminar as (I found out later) they changed
rooms after I got there (8 minutes late, still not the latest).
The rest: rewrote my literature review outline (still not happy, but need to
stop agonising over it as it still doesn't contain anything substantial).
Read a largely irrelevant paper about cultural computing (notes may or may not
follow). Discovered a book I
want and have asked
the University Library to get. Only the first page of the chapter I
want is available to preview (I've made
enquiries to the author of the chapter, too). Started reading Socially
Mediated Publicness: An Introduction by danah
boyd; notes
will follow.
Just notes from a three-hour workshop about how to write an Informatics
thesis, on the 16th of April.
State contributions (to knowledge) explicitly. Intro, conclusions; each
chapter should have some (probably not all) contributions discussed. Be
obvious; use headings.
Clear openings for future work. Be clear where they are.
Make it reproduceable.
Short / concise. Examiners like short theses.
Introduce what's interesting and important.
When outline thesis, look at structure of main argument, not of document.
Background material must have point. Only include as much detail as you need
to make point.
Points, eg:
Explain method you use.
Novelty of your approach. Similarities with existing work.
Justify choices (evaluate other work).
Don't tear down others' work. 'Build on'.
Cite examiners, they've probably published something relevant.. (but not for the sake of it).
Then we had five minutes to write down what our PhDs are about and what we
have already found out. I wrote:
How do the futures of the Semantic Web and amateur digital content creation
fit together? Can Semantic Web tools and technologies be used to enhance collaborative
creative partnerships and encourage fruitful outputs?
There are knowledge sharing systems and collaborative tools for scientific
fields and in education, but nothing for creative artsy things.
Attitudes towards data sharing and privacy amongst content creators are in
flux. There are lots of projects and energy around open data and
decentralised social networks that allow data to become portable and not tied
to one platform. One of TBL's visions for the Semantic Web is the dissolution
of data silos and 'walled' applications that disadvantage the user, and as
such the promotion of the 'ownership' of a user's data by the user themselves,
rather than the software or organisation that uses the data.
There are lots of reasons people make content. There are lots of reasons
people don't make content (who could / would like to).
[Notes resume]
Use backreferences; don't repeat yourself.
Info / advice
...homepages.../sgwater/resources.html
..homepages.../imurray2/teaching/writing
Style: Toward Clarity & Grace (book)
The Craft of Research (book)
When to start writing thesis?
Do you already have papers? Slot them into a thesis template asap.
Maybe a year beforehand. Slower pace is better.
Don't assume appendices will be read. More for extra info if needed by people
trying to reproduce your work (not your examiners).
Too many direct quotes look like you don't understand and are avoiding
explaining yourself.
Keep copies of web resources and cite access dates in case they change /
disappear.
Figures might be copyright if you just copy them from papers, even if you cite
them. Remake them, and put 'adapted from' as citation.
Examiners?
Depends on your supervisor. Discuss. Student might be able to suggest someone to examine.
Maybe a balance between internal and external knowledge.
Won't be someone junior, even if they're considered an expert in the field.
Helpful if supervisor knows how that person will behave in viva. Might be a good reason to avoid someone you think would be perfect from their background.
Conflict of interest regulations. You can know them personally though. External can't have been affiliated with UoE in the last three years, or substantially involved in your research (like co-authoring a paper). No ex-supervisors, from any university.
No grading system (ie no different levels of passed PhD). Might be external
prizes if you want extra recognition.
Tuesday: meeting with Ewan, thesis workshop.
Friday: Social Informatics Cluster (non-adoption of technology solutions in
healthcare); Ontologies with a View (nothing of import discussed, delicious
soup in bread bowls from Union of Genius soup cafe).
Workshop by Dr Mimo Caenepeel on Monday 22nd April.
'Critical' does not mean you have to pass judgement, or say why it's good or
bad.
Not taking things at face value.
Started with freewriting about what has particularly influenced / inspired our
own research. Five minutes, not allowed to stop or edit, don't worry about
quality of writing, not for anyone else to read. A good way to get ideas out
of your head and start to organise your thoughts without censoring or
constraining yourself.
How many pages will a review usually take up in a thesis? My policy is to
write what needs to be written and stop when you're done. But apparently 20
to 30, sometimes more, is normal in sciences.
There's no consistent / right answer to 'how many publications to review'.
For some people it's in the tens, for some the hundreds.
Think about how to integrate literature review into the thesis. You're
unlikely to have a chapter that is just 'literature review' and no mention of
the background reading elsewhere.
Good qualities for a lit review?
- Coherence (avoid fragmentation)
- Structure, clarity.
- Proof of novelty - purposeful.
A review can often be considered as an indicator of the quality of the rest of
the research - demonstrating scholarship.
A good place to start:
1. Write your research question, formulated as a question.
2. Write up to five research areas that are relevant to your research
question.
3. Note some related issues/areas that will not be considered in your review.
Think about balance of content.
1. Three studies influential in your field (I couldn't answer this; I clearly
need to read more).
2. Two significant older contributions.
3. Five recent sources.
4. Two sources that have strongly influenced your thinking.
You don't need to consider all papers in the same level of detail. Decide
which papers are more important / useful than others.
For some papers (important ones) you should work through these questions in
the same way every time you read something (this is 'SQ3R'):
1. Survey: What is the gist of the article? Skim the title, abstract,
introduction, conclusion and section headings. What stands out?
2. Question: Which aspects of the research are particularly relevant for your
review? Articulate some relevant questions the article might address.
3. Read: Read through the text more slowly and in more detail and highlight
key points / key words. Identify connections with other material you have
read.
4. Recall: Divide the text into manageable chunks and summarise each chunk in
a sentence.
5. Review: To what extent has the text answered the questions you formulated
earlier?
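The SQ3R stages above can be sketched as a per-paper note-taking template. This is purely my own illustration (not from the workshop), with a made-up example paper; the point is that a paper's notes are "done" only when all five stages are filled in.

```python
# A minimal sketch (my own illustration, not from the workshop) of the
# SQ3R stages as a note-taking template for each important paper.
from dataclasses import dataclass, field


@dataclass
class SQ3RNotes:
    title: str
    survey: str = ""                                # gist from skimming
    questions: list = field(default_factory=list)   # what might it answer?
    key_points: list = field(default_factory=list)  # highlights from close reading
    recall: list = field(default_factory=list)      # one sentence per chunk
    review: str = ""                                # did it answer the questions?

    def is_complete(self):
        """True once all five stages have been filled in."""
        return all([self.survey, self.questions, self.key_points,
                    self.recall, self.review])


# Hypothetical paper, for illustration only.
notes = SQ3RNotes(title="Example paper")
notes.survey = "Proposes an ontology for annotating amateur video."
notes.questions = ["Does it model collaboration roles?"]
notes.key_points = ["Extends MPEG-7 descriptors."]
notes.recall = ["Section 2: prior ontologies compared."]
notes.review = "Partially: roles are out of scope."
print(notes.is_complete())  # → True
```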
Critical reading (these seem like really useful questions to work through
whilst reading papers):
1. What is the author's central argument or main point, ie. what does the
author want you, the reader, to accept?
2. What conclusions does the author reach?
3. What evidence does the author put forward in support of his or her
conclusions?
4. Do you think the evidence is strong enough to support the arguments and
conclusions, ie. is the evidence relevant and far-reaching enough?
5. Does the author make any unstated assumptions about shared beliefs with
readers?
6. Can these assumptions be challenged?
7. Could the text's scientific, cultural or historical context have an effect
on the author's assumptions, the content and the way it has been presented?
See Ridley, D. The Literature Review: A step-by-step guide for students.
Sage Study Skills Series. Sage Publications, 2011 (2008).
I had an amazing evening at the Starting up in IT panel discussion, followed
by Innis & Gunn beer tasting on Thursday evening. It was held in the shiny
MMS Quartermile One offices. (When I'm rich, I want a flat on
Quartermile. A turret-y one, not a glass one. Or
maybe both).
I felt chronically under-dressed when I arrived - a majority were suited - but
everyone was really friendly and forthcoming with advice.
Anyway, speaking of being rich. There were lots of interesting business-wise
people to talk to at this event, including CEO of Skyscanner Gareth Williams,
and Craig Anderson of Pentech Ventures. Plus lawyers specialising in things
like IP, employment, company formation, from MMS. The panel discussion was
enlightening; I'll go through some highlights from my raw notes...
Funding
Skyscanner: 2 million from Scottish Equity Partners in 2007.
Getting funding isn't a goal or validation.
Best way to get funding is not to need it.
Scottish Enterprise: match funding.
Give as much as you get. Confide in investor.
Getting wise
Don't pitch too early. Build traction first.
Prove potential marketshare one way or another.
Preparing a business plan is productive. Convert a vision to a plan when you get funding.
Subscribe to investment bloggers.
Networkiiiing. Find someone to champion you to an investor.
Gareth: As many people are delusional as have a key insight. How to know which you are yourself?
Employees
Do you need employees or contractors? Casual employees in between.
Consultant / contractors own IP for work they do. Unless contract says otherwise. Employees don't, employer owns it.
I heard about some really interesting ventures, too, like Identity
Artworks which looks like they're making
a huge difference to young people, and have really inspiring stories to
tell. Plus ShareIn, soon launching an equity
crowdfunding platform. Veeerrry interesting...
The panel was followed by beer tasting hosted by Innis & Gunn. I don't drink,
but I would have sipped along to be sociable. However, it turned out the beer
wasn't vegetarian (filtered through
isinglass). This, at least, meant
more for everyone else on my table. MMS had come up with a written seating
plan, by the way, that separated people who had arrived together. Forced
networking! Excellent.
This served as a great chance for Steve and me to independently practise our
GeoLit elevator pitch, and I think we'd got it down to perfection by the
end of the evening. Extremely encouragingly, we were consistently met with
enthusiasm and responses like "that's an amazing idea!". We left pretty
buzzing.
Got hold of Amateur Media book (2012) in the NLS, and started reading it. Got
in trouble for having a pen instead of a pencil to take notes. Oops. N00b
mistake.
Did more GeoLit related stuff than PhD stuff. Must try harder. (Got a
funding application in and went to a tech start-up panel / networking
event that was great).
Fiddled with my lit review outline a bit, but didn't really accomplish much.
Officially behind schedule for finishing by the end of the month (was I ever
on schedule?). MUST TRY HARDER.
Discovering lots of things to write about semantically annotating multimedia
content. I decided there are three main ways to do this:
Technical / objective / statistical data: eg. media type; shutter speed; framerate; duration; resolution; date created; number of times viewed at a particular source; number of times shared ...
Bibliographic: creators and contributors and their roles; methods/locations of publication; methods/locations of creation ...
Content*: fictional characters; locations; camera movements; scene transitions; colours ...
These categories overlap somewhat really, and when I get round to it I'll type
my Venn diagram up.
Technical is easy, and a lot of that is automatically captured by hardware
or software used to produce and edit works. It's also relatively easy to
extract automatically. Standards like MPEG-7 and MPEG-21 take care of
formalising it, and Jane Hunter turned these standards into semantic
ontologies in 2002.
Bibliographic can largely - but not entirely - be covered by vocabularies
that have been around forever like Dublin Core, FOAF and various
library-originated things. Things that might be missing (or I just haven't found them
yet) are associating roles with tasks involved in digital media production,
since pieces are often a collaborative effort. has some idea of
participants and roles, but the purpose of is digital rights management
stuff, so it's more concerned with the distribution chain, I think, than
granular production of content. I haven't read much about it yet.
Content is more interesting, and potentially more useful for ordinary human
beings. Imagine querying IMDB for "that film where John Goodman arrests an
animated talking moose on a US highway" instead of scouring John Goodman's filmography or googling for pictures
of animated meese until you see the right one. Annotating characters, objects
and events, and stringing them onto a timeline is possible with OntoMedia. It's very focussed around narratives, which is great, but doesn't
link back to technical so much. So if you did find the answer to that
query, it wouldn't be able to serve up the timestamp of that particular scene.
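The kind of content-level query I'm imagining can be sketched with plain subject-predicate-object triples. This is a toy of my own (not OntoMedia's actual model, and all the identifiers are made up), just to show how annotating events and their participants makes the "film where John Goodman arrests a talking moose" question answerable:

```python
# A toy sketch (my own, not OntoMedia's actual model) of content-level
# annotation as subject-predicate-object triples. All IDs are hypothetical.
triples = [
    ("film:42", "title", "Example Film"),
    ("film:42", "features", "actor:JohnGoodman"),
    ("film:42", "depictsEvent", "event:1"),
    ("event:1", "action", "arrest"),
    ("event:1", "involves", "character:TalkingMoose"),
    ("event:1", "location", "US highway"),
]


def films_with(action, character):
    """Find films depicting an event with the given action and character."""
    events = {s for s, p, o in triples if p == "action" and o == action}
    events &= {s for s, p, o in triples if p == "involves" and o == character}
    return [s for s, p, o in triples
            if p == "depictsEvent" and o in events]


print(films_with("arrest", "character:TalkingMoose"))  # → ['film:42']
```

Linking the event back to technical metadata (a timestamp, say) would just be another triple on `event:1`, which is exactly the connection I'm complaining is missing.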
On top of what I've looked at already, I still have this list to
(re)investigate:
A thing I want to do is annotate some amateur content with OntoMedia and with
ABC to see how they compare. Maybe I'll do
asdfmovie, because it has
associated comics, and multiple people participating in production. Then I'll
do something live action as well, because I can't base all my research on
non-sequitur lolrandom stick figure cartoons.
Now, back to work..
I want a better name for this, since I'm referring to everything as 'content' anyway. So some better way of saying 'content of content'.
Thursday 16th of May was the 6th Open Knowledge Foundation meetup in
Edinburgh. We had a great room in Techcube, more speakers than usual and
loads of attendees. Here's an account.
Bill Roberts
Bill, founder of Swirrl ("the linked data company"), talked about the tools and user interfaces they are developing to make handling data easier for communities, particularly the less technical. They've encountered a spectrum of users with different levels of technical ability and different needs, so they have to account for this in the tools they build. The technical complexity of accessing data ranges from SPARQL endpoints and JSON APIs to downloadable spreadsheets, visualisations, maps and charts.
They're focussing on providing the data in accessible ways rather than
building visualisations though. They'd struggle to meet or even understand
everyone's needs; instead, it's important to concentrate on empowering the
communities to use the data themselves.
Kim Taylor
An undergraduate Informatics student and a participant in the Smart Data Hack, Kim showed us placED, the project her team worked on at the hack and are continuing to develop.
This is a place finder for people who are looking to maintain or improve their
personal wellbeing. They used datasets from the City of Edinburgh Council
(and presumably ALISS?) to create an Android app. They stored their data in a
Google AppEngine datastore, but I'm not sure if it has a web frontend as well.
Some of the problems they encountered include copyright issues with Ordnance
Survey place data and Royal Mail postcode data, which made up part of the
Council's data but weren't available for anyone to use due to licensing
restrictions. They worked around this by recomputing location data from the
parts of addresses they did have access to, using Google's geocoding API.
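The workaround, as I understand it, amounts to geocoding from whatever address parts are openly licensed and skipping the restricted postcode. A minimal sketch, assuming the street and city are the usable parts (the endpoint is Google's real Geocoding API; the address and key are made up):

```python
# A minimal sketch of geocoding from a partial address: build a request to
# Google's Geocoding API using only the openly licensed parts (street and
# city), skipping the restricted postcode. Address and key are hypothetical.
from urllib.parse import urlencode


def geocode_url(street, city, api_key):
    """Request URL for geocoding a partial address (no postcode needed)."""
    params = {"address": f"{street}, {city}", "key": api_key}
    return ("https://maps.googleapis.com/maps/api/geocode/json?"
            + urlencode(params))


url = geocode_url("1 High Street", "Edinburgh", "YOUR_KEY")
print("address=1+High+Street%2C+Edinburgh" in url)  # → True
```

The JSON response would then contain the latitude/longitude to store in place of the missing licensed data.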
When they open their database up for user input, which they inevitably will if
they want their app to stay current and useful, they'll have to think about
how to maintain the content.
Gavin Crosby
Gavin works for the Council, with a title I've forgotten, but it's to do with youth work. Youth work has a very specific definition: people aged between 11 and 25, meeting in organised groups with a volunteering adult present. There's loads of this going on in Edinburgh, organised by Scouts/Guides, schools, churches, or maybe even self-organised. There's no central database of what is going on where, which is one of the Council's biggest issues in this area. Word of mouth is usually how this kind of information spreads amongst young people, and Gavin suggested that a lot of youths may be unwilling to attend something they'd heard about without a direct invitation from someone they know.
In an attempt to rein some of this information in, they've created the Youth
Work Map.
It's not an ideal system, as they have to update it manually when a youth
group or activity organiser decides to inform the Council that they exist. Not
everybody opts in, so there is data missing. Manual updating also means the
map is not 'live'; things might go out of date and not be removed straight
away.
Gavin said it is the constraints of the Council's web system that have caused
a lot of the problems, and pointed out that they haven't considered accessibility
issues (for example, access for people with vision problems), and it's not
interactive. He'd love to see the ability for kids to chat to each other
through the map, or leave reviews for particular events. There are issues
with child protection here, of course.
He would also like to see better tagging and organisation of the content on
the map, links to other data repositories (there are parallel similar
projects), and the ability to connect events to areas or routes rather than
single points.
Gavin pointed out that a lot of the audience for this map is likely to be
adults looking for youth projects, rather than young people themselves.
Leah Lockhart
Leah made a quick announcement about the new Local Government Open Data Working Group. They're organising open data surgeries (similar to her social media surgeries, which you've definitely heard of by now if you're floating around the OD scene in Edinburgh). They're also hoping to fill in the OKFN Open Data Census for Scotland, and to meet regularly in the pub.
Fiona
Fiona works in Informatics at the University of Edinburgh, and she told us
about Open Data in climate change science, or the lack thereof. A team she
has put together have got some funding to carry out a small investigation
about Open Data use in climate change science, and to try to build a network
around this. They'll be looking at trends and patterns of the past decade to
see if research has been any more successful when existing datasets were used,
or if papers are more well-cited when they make their data open at the end of
it (for example).
She thinks the lack of Open Data in this area could be due to the expensive
nature of making data good enough quality to share, and of course the fact
that when people have worked hard to gather data they feel that they own it;
why should they share?
They're hoping that their report might go some way to persuading funding
bodies to make sharing of data a criterion for applications.
Contact Fiona if
you're interested in this kind of thing.
John Kellas
John said, brilliantly, that talking about information visualisation usually
means graphs. But normal people "don't think in graphs".
He works in community education, and a couple of years ago he started working
in "volumetric and comparative" visualisations, which can be much more useful
and empowering to people. He showed us a visualisation of one trillion
dollars (which I can't find a link to, so let me know if anyone has one).
He's not had much support with creating tools and visualisations, because he's
not interested in making money from it, so it's hard to attract funding. What
he's doing looks really useful though, so hopefully we'll see more!
Ben Jeffery
Ben is another undergraduate Informatics student who took part in the Smart Data Hack and whose team is still working on the project they started there. They're re-imagining the University's student information portal by pulling in lots of different data sources and presenting the information more sensibly. They've been doing a fantastic job, but of course they're all busy with exams and general learning, so they haven't been able to spend as much time on this as they'd like.
They're also struggling to get raw data out of the University, and point to
(my alma mater) the University of Lincoln's open data
portal as an example of what could and should be
done about this. So they're turning their project into a pilot to demonstrate
what they could do if they had the data they wanted. They're also conscious
of similar-but-different projects, like projects.ed.ac.uk, and don't want to
duplicate effort.
Ben said they've found that a lot of the University of Edinburgh's data is
held by middleware vendors, so it's particularly hard to access. But this is
information that is funded by students, so it should be available to them! He
said the "University should be a breeding ground for knowledge" so data
shouldn't be silo'd up.
He also said that there are a lot of politics in the way with this sort of
thing. They, like any level-headed software developers, just want to build
stuff. They're still in various talks though, so this is a space to watch...
Susan Pettie and Marc Horne
These guys are from So Say Scotland, and aim to change culture to make Scotland better. Open Data is important for democratic movements, so they told us about some of their events. They're building a network of activists and campaigners, and hold large-scale assemblies themed around 'thinking together', a kind of en masse guided brainstorming. They're trying to spark a movement, and are aiming for 25,000 people. They're investigating ways to make their assemblies more efficient, as currently collating all of the ideas that are generated is a manual process. This would be nigh on impossible when they reach their participation goal.
There will be a report about their progress on the 27th of May.
Devon Walshe
Devon was our Techcube host, and he told us about Sync's Geeks in Residence. This is a programme funded by Creative Scotland that puts the technologically minded into arts organisations. Previous efforts by arts organisations to employ 'geeks' to solve a technical problem or produce a digital solution for something have been problematic due to the 'black box' approach: the developers produce an outcome, get paid and leave, often aiming to do the minimum amount of work. Geeks in Residence promotes developers and organisations working together more closely, to allow for sustainable solutions.
Part of the project is to analyse the relationships of people who know about
technology, and those who don't, with each other.
In my notes I've scribbled "convert fear into technology", and I can't
remember what that originated from, but it sounds awesome.
Devon did some work with Stills photography centre.
Nobody knew what they needed, so after some collaboration they developed an
interactive floor plan (because the Stills building is way confusing) and some
kind of interactive timeline because Stills has an interesting history.
He also plugged the Culture Hack Scotland in Glasgow in July (12th-
14th), which I'm terribly disappointed I won't be in the country for.
Next OKFN meetup
Will be on the 22nd of August, in Informatics. Here's a link to the Meetup so you can RSVP. I'll sure be there, if I'm not somewhere else.. (depends if/when/where my Mum books an obligatory family holiday).
20th - 26th May
Continued to work on literature review. Nothing much to report.
Went to MCM Expo in London and managed to find time (around non-stop merch
selling for TomSka and Eddsworld) to ask between 30 and 40 content creators -
a wide variety of ages, experience, types of content - about their process and
collaborative practices. The thing they all had in common (I randomly picked
people as they were waiting in the two hour long queue to get autographs from
Tom) was that they all do what they do because they love it, want to
entertain people, and if they could earn a living from it too that would be
amazing; but that's not why they do it. For many it's the dream, but not one
they expect realistically to achieve.
That is why this is important to me. Because everybody should be able to
make a living from doing what they love*, and the technology exists to allow
it. How exciting.
Unless they're really bad at it. There's only so much technology can do. But they should definitely have the chance to get good before caving in to a ninetofive that they're not totally passionate about.
I'm lucky enough to have some quests to complete, or achievements to unlock,
during my month in Australia.
1. Find the resting place of my Uncle David. My Mum and her family moved to Australia for a few years when she was nine. They moved around a lot, and eventually came back to the UK. Her older brother returned to Melbourne when he was 17, and died there (very suddenly I think) at the age of 28, of a brain tumour. He only visited the UK once while he lived in Oz, and though he stayed in touch with his family, information about his life is sketchy. Until recently I thought I was going to have to trawl through various records to find out where he might be buried, and I'm not even sure what year he died in, let alone where he lived. I talked to my Grandma the other day though, and she knew more than we thought - that he was cremated, and scattered in the "Herb Garden, South Yarra" and that he has a headstone there. I'm not sure how accurate this is; the Web isn't being very forthcoming with useful information right now, but it's more to go on than I had before.
2. Forge ties between Edinburgh and Melbourne Open Knowledge Foundations. Since there's an OKFN hack just days after I arrive in Melbourne, and not even far from where I'm staying, I've registered to take part. Should be fun.
And some less questy, more general things...
Check out some of the places my Mum and her family lived. I have a pretty long list of addresses across Victoria and South Australia, to see which still exist and what they're like.
Maybe attempt to bump into relatives or family friends. This will be tricky though, as my UK family don't seem to know where anyone on the upside down side of the world is any more, how to contact them, or whether they'd be interested in meeting. The ones we do know are in Perth, which I'm not going to be able to get to. There are quite a lot of them around in Australia and New Zealand though, so hopefully more data will emerge.
See Ayers Rock. Duh.
See the main cities on the east coast, like Brisbane, Sydney and Canberra. Expensive and time-consuming, but doable. I also want to go to Tasmania.
And things that need doing that aren't to do with being in Australia:
Two freelance projects, due on 5th June and 1st July.
I have been on no fewer than thirteen planes in the last four weeks or so.
I've already made my apologies to the environment. If it's any consolation,
it makes my ears really hurt every time. It's only four days after that
until I get on a flight from Edinburgh to Madrid. It's becoming as normal as
going for a walk. (A walk with earache and slight deafness).
Presumably, if using electronic equipment during takeoff and landing, and
mobile phones at any time, was actually dangerous there would be some sort of
machine to detect if any passengers did have things turned on, and electric
shock them (or something). And there would be a degree of consistency in how
these things were handled between airports. In quite a few airports in
Australia, we were asked to turn our phones off altogether before stepping
onto the tarmac to approach the plane. I saw a couple of people who were
walking and tapping their screens get pulled aside and refused boarding until
their devices were safely off and pocketed. No planes were crashing in the
background. And nobody checked that out-of-sight gadgets were switched off.
So you only pose a risk if you get caught? And in Zurich, the crew were happy
with people making phonecalls right up until the plane started trundling.
Security is inconsistent too. And a massive hassle, so in an endeavour to
minimise unpacking and re-packing time at the gates, I gradually reduced the
number of electronics and liquids I removed from my hand luggage.
A timid and inexperienced flyer before this year, I used to think it was
standard procedure to remove everything with a glimmer of metal or a drop of
liquid for an x-ray in its own separate tray. So out came the chargers,
headphones, coins, cards and soaps, along with tablets, e-readers, laptop,
phone and clear bags of 100ml containers of liquids.
Before the start of this trip I'd figured out that was excessive, and silently
pitied those who still unloaded all of these things.
Time to see what else the security guys don't give two shits about.
Kindle. That was fine. Nobody questioned a Kindle left in my bag.
Tablet was the next to stay. My Nexus 7 caused zero concern.
At this point I'm just unloading a laptop and liquids.
So I left the toothpaste in. That was fine. Then I found out one of my
travel buddies had been inadvertently passing through security gates with a
250ml bottle of suncream in the side pocket of his rucksack. Apparently not
an issue.
At some point on the trip I acquired small scissors; so weak they could barely
cut thread. They made it through security in Sydney the first time just fine,
but in Alice Springs my bag got pulled back. I thought it was the toothpaste,
but no. Once they had confirmed that these scissors were such that they'd
buckle before piercing skin, they let me have them back. Same again in
Cairns, and Gold Coast. By Sydney round three, the scissors had broken into
two parts. A very concerned looking lady asked if I minded if she threw
them away rather than let them on the plane. I didn't.
By this point I was sending 50ml nasal spray and deodorant through, because
what the heck? They were also cool with our massive jar of pasta sauce, and
whatever we happened to be drinking at the time.
So within Australia, I was just removing my laptop at security (and phone from
my pocket).
In Singapore, signs actually advised that we keep phones, tablets and
e-readers packed, so only the laptop came out there too (my liquids were in my
hold baggage by this point).
Zurich, however, are nuts.
No liquids still, and I got out my Nexus and Kindle straight away because
someone ahead of me was getting yelled at about an iPad.
Apparently that wasn't enough, and eager staff sent my bag back through three
times, unloading more on each round. Out came chargers, headphones, bags
within bags, chocolate. I was almost ashamed of how much crap I was carting
around. For the final round they dug deep to extract a sealed bar of soap I
bought someone as a souvenir from Cairns. That seemed to do the trick.
So then I had to hold up the line re-packing.
How does x-raying soap on its own make it less of a threat? Or easier to
detect the threat? Serious question. Anyone know?
I'm just pleased I left my Vegemite and teas in my hold luggage.
So I got behind with the blogging. As in, I didn't do any. I will rectify
that over upcoming days! But for now, I will reflect on my Pre-Oz quests.
1. Find the resting place of my Uncle David. Achieved, debatably. I found the Herb Garden, South Yarra, which was part of Melbourne's amazing Botanic Gardens (and conveniently pretty much across the road from where Jamie lived). However, it has never been legal to scatter ashes there, and there definitely aren't any headstones. It doesn't mean he wasn't scattered there though. So I took lots of pictures. That's the best I could do. It is beautiful.
2. Forge ties between Edinburgh and Melbourne Open Knowledge Foundations. Well, I went to #govHack and had an amazing time. Pete and I discovered we'd already been to one of Flanders' hacks in the UK, and another hack that had derived from it (Dev8D and DevXS). So we were well received. Also because we'd travelled 11,000 miles to be there. And I got to know some things about Australian data. Stay posted for the official blog.
Check out some of the places my Mum and her family lived. I found a caravan park in Palm Cove where she lived 38 years ago, and the creek where she learned to swim! The change in the town from how she described it was immense, but walking the same beach she walked as a child was a really surreal experience. I also dangled my feet in Mossman Gorge in the Daintree rainforest before I even knew that she used to swim there.
I saw a sign to Toowoomba near Brisbane, where she lived too.
Maybe attempt to bump into relatives or family friends. Wasn't expecting to achieve this. But I saw my cousin in Perth! That was awesome. And it was very good of him to pick us up from the airport, give us a super fast tour of Perth at night and let us sleep on his floor for a few hours, before returning us to the airport for 5am. Thanks Luke!
It almost made up for the ..ahem.. additional expenses incurred.
See Ayers Rock. Check! It wasn't all that.
But the road tripping was amazing.
See the main cities on the east coast, like Brisbane, Sydney and Canberra. Check! But only just. Brisbane and Canberra need more visits. I wrung quite a lot out of Sydney, made new friends (one of whom was feathered), ate some really tasty food, and learned to love the Domestic Airport.
Two freelance projects, due on 5th June and 1st July. Succeeded at the first (with some hiccups) and failed at the second (pending..).
Finish my PhD literature review... Haha. That was ambitious. It didn't get a look-in. I feel bad. There was no wifi though! Anywhere!
Make a poster for the Semantic Web Summer School. I didn't do that whilst travelling, but it is done now.
Meanwhile, whilst you wait for my full blog posts, here are our tweets, nicely
collated:
Returned to Edinburgh on Wednesday. Suffering from post-travel depression.
That's a thing, I looked it up. It's a thing nobody has any sympathy for.
Life is hard.
I've never been to Spain, and for some reason the Spanish language baffles me.
It's the least guessable European language, in my humble opinion, and I can't
get my head (/tongue) around the pronunciation. I've never studied Spanish,
but I can get by just fine with French and German, and when I've been in
Switzerland, Holland and even Italy I could make a go of communicating to a
reasonable degree, so I thought I'd pick things up.
But apparently my brain is resistant to Spanish.
That aside, locals are very friendly and even when they don't speak English
they don't seem to give you disparaging looks.
We arrived on Saturday evening (having successfully got our poster tube past
various levels of EasyJet staff who would have been within their rights, if
unreasonable, to tell us it was too big for hand luggage). We were met at the
airport by new friends, who we later accompanied into the centre of the city
to watch the Gay Pride Parade.
The streets were packed, the heat was stifling, and the costumes varied and
outrageous. The party atmosphere filled the air with a tangible excitement.
The parade itself was slow to start, but eventually lasted for several hours.
We explored a little, taking in this version of the city as the empty streets
of the Sunday morning to follow would have a very different feel.
But mostly we sat on the grass, chilling with our hosts and their friends, who
were mixing cheap wine and lemonade and bobbing to the waves of techno, trance
or cheesy pop that came by with every float.
The night was hot, but not unbearable; about 37 degrees, yet there was a
slight breeze and no humidity which made all the difference.
On Sunday we braved the sun to do the tourist circuit recommended to us to
take in as much of the city as possible in the time we had.
We deviated somewhat to explore various gardens (beautiful, although I was too
distracted by the heat to really appreciate them) and pop into some impressive
looking cathedrals (a nice break from the heat, but I find Christian art,
architecture and interior decoration disturbingly morbid).
Never having thought about visiting Madrid before, and thus never having
planned out what I might like to see when I got here, none of the monuments,
buildings or squares stood out to me. Architecture is pale and old-looking,
and mostly very ornate. Similarly pale sculptures, statues and fountains are
in abundance.
Distinguishing vegetarian options in restaurants and cafes seemed more
challenging than I had the energy for; not to mention, most didn't start
serving food until at least one-thirty which wouldn't have allowed us time to
catch our train, so I'm ashamed to say we grabbed snacks from a supermarket
instead.
We (eventually) caught a train from Atocha station in the centre to Cercedilla
(EUR 5.30). And thus began the Summer School adventure... Continued in
another post.
Attendee, Summer School for Ontology Engineering and the Semantic Web
Presented poster "Active Digital Content Creators and the Semantic Web". Learnt lots, slept little. Worked on a mini group project about measuring serendipity. Won the best video prize, with "Summer School for Semantic Wizards", featuring a wizard who turns water into wine using SPARQL.
Semantic Web & Web of data = a more manageable mission.
Metaweb movie - got bought by Google and incorporated into Knowledge Graph.
SW Principles:
1. Give everything a name (entities).
2. Relations form graph between things.
3. Names are addresses on the Web (so we inherit properties of Web like AAA).
This becomes Giant Global Graph. (Maybe SW should be called Giant Global
Graph?)
4. Add semantics.
Types of things, relationships.
Hierarchy, constraints:
Inferences. Bounding shared beliefs by sharing ontological information. Space for confusion gets smaller and we begin to agree on interpretation of information.
Semantics = predictable inference.
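A minimal sketch of those principles in plain Python (no triplestore; the URIs and the 'broader' hierarchy below are invented for illustration): everything gets a name, relations form a graph of (subject, predicate, object) triples, and adding semantics - here just a transitive hierarchy - makes inference predictable.

```python
# Everything gets a name (a URI string); relations form a graph of triples;
# declaring 'broader' transitive makes the inference predictable.
EX = "http://example.org/"

triples = {
    (EX + "history", EX + "broader", EX + "arts-and-humanities"),
    (EX + "arts-and-humanities", EX + "broader", EX + "academic-subject"),
}

def infer_transitive(triples, predicate):
    """Compute the transitive closure of one predicate over the graph."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for (s1, p1, o1) in list(inferred):
            for (s2, p2, o2) in list(inferred):
                if p1 == p2 == predicate and o1 == s2:
                    new = (s1, predicate, o2)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred

closed = infer_transitive(triples, EX + "broader")
# 'history' is now inferred to fall under 'academic-subject' as well:
assert (EX + "history", EX + "broader", EX + "academic-subject") in closed
```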
Google: from just links to results, to information boxes (last May). Can't
directly address Google Knowledge Graph.
NXP (microprocessors): 26,000 products. Integrated all databases into
triplestore. Exposing subset of triplestore to customers.
BBC: 125 million triples. Many data sources. APIs to website. Own
ontologies.
All have the same triple-layer architecture:
Raw data
SW layer
Output / API / UI etc
DataGov: eg. air quality in cities, campaign money, if policies work.
Companies don't care about SW, but are using these technologies for their own
IRL purposes.
These are all different types of use cases of SW technologies:
search;
data integration;
content re-use;
SEO;
data publishing.
It's important that the SW graph is so big.
More questions to ask.
Good that we no longer know how big, or how fast it is growing... Tens of billions of facts.
How many are really permanent?
Some are stable, some will disappear - just like the 'regular' Web.
"...it being a mess is the only reason why it scales."
We need to get used to the idea of SW being a mess - aka "a system so large
you can no longer enforce central control" (complex system).
The LD cloud is still poorly interconnected, but good graph properties.
SameAs.org
Heterogeneity is unavoidable.
Socio-economic, first to market - why certain systems/ontologies get used, eg.
schema.org, dbpedia.
Self-organisation.
LD cloud grew, nobody designed it.
Knowledge follows power curve. This has an impact on mapping and reasoning,
storage and indexing.
Distribution.
Web not geared for distributed SPARQL queries. Everyone pulls in all data and
queries local copy. Not very 'webby', disadvantageous. So subgraphs? Query
planning? Caching? Payload priority?
Provenance.
Representation, (re)construction. Metametadata (knowledge about knowledge;
uncertainty; problems with vocabs for this).
How to get from provenance to trust.
Dynamics (change).
Cool Web in 60 seconds graphic.
SW not changing this fast, but soon..
Errors and noise.
Sometimes we disagree.
Deal with by: avoid, repair or contain. Or just deal with it - allow
argumentation.
Fuzzy, rough semantics - almost, maybe.
Lots of research questions. But not ones we could ask 10 years ago.
Information universe - "algorithms exist without us looking at them".
We should ask if things work in theory.
Scientists vs. engineers.
Discovering vs. building.
Is this incidental or universal?
OWL is our microscope.
We can see structure well in some domains, but not so well in others. Maybe
it's our tool that distorts, rather than a property of the domain.
Says we should change our mindset from building stuff to hypothesising and
falsifying.
The only problem we had during the workshop was disagreement about how to read
the 'broader' and 'narrower' relations between courses. They instinctively
read contrary to what (my) common sense suggested (eg. that 'arts and
humanities' is broader than 'history', which some people disagreed with). A
quick reference to the ontology documentation resolved that.
Streams: Any time dependant data / changes over time.
Has done a paper about P2P stuff.
Data silo - "natural enemy of SW scientists"
Massive exponential growth of global data.
Still have to integrate dynamic data with static data.
Multiway joins are the dominating operator. Need to be efficient.
Everything/body is a sensor.
Various research challenges:
Query framework.
Efficient evaluation algorithm.
Optimise queries.
Organisation of data.
CoAP ~= http for sensors.
Stuff about sensor networks and context - useful for Michael.
Common abstraction levels for understanding.
SSN-XG ontology
Application: SPITFIRE
You can buy a sensor off the shelf that runs a binary RDF store and can be queried. So possible to use SW tech with resource constrained devices.
RESTful sensor interfaces stuff being standardised - CoRE, CoAP.
Linked Stream Model
CQELS-QL (extension to SPARQL 1.1; already legacy)
Rewrite query to spit out static and dynamic - lots of overhead.
But need to optimise between these.
Neither existing stream processing systems nor existing databases could be
efficient enough.
So they built their own LD stream processing system. (Optimised and adopted
existing database stuff).
HyperWave - didn't succeed. Didn't listen to customers and wasn't open source
(license fees).
But better than hypertext was back in the day. Performance important for success/uptake.
Just putting it on cloud infrastructure doesn't mean it scales.
Need to parallelize algorithm.
Took it to a point where adding more hardware did help.
Problems! Inconsistent results, engines don't support all query patterns.. very early, don't fully understand yet.
Long way to go. How to prove what is a correct result?
Needs to be easy to use - dumb it down.
Linked Stream Middleware (available):
Flights, live trains - SPARQL endpoint!, traffic cams.
SuperStreamCollider.org
Current Tomcat problem with twitter streams.
To do?
Scalability
Stream reasoning (only processing, pattern matching, so far. Want to infer conclusions).
World is:
... uncertain, fuzzy, contradictory.
So combine statistics and logics.
Hard to scale logical reasoning, so use statistics to shoot in the right
direction.
Privacy?
Build systems! Can't do thought experiments about the Web.
Arriving by train into Cercedilla, north of Madrid, we immediately encountered
other confused looking folk with poster tubes. So we shared taxis (EUR 10)
from Cercedilla station to the summer school residence further north, in the
forest.
After getting keys for our pleasant, single, en-suite rooms, arrivals
congregated in the shade by the building to introduce ourselves.. Again, and
again, and again, as new people continuously arrived over the space of a few
hours.
A really broad mix of people are here in terms of nationalities and places
and levels of study, but I still haven't quite got used to the fact that
answering 'Semantic Web stuff' is not specific enough in this crowd, when
someone asks you what your research is about. Nobody needs convincing that
these technologies are useful!
Later we received schedules, maps, ill-fitting t-shirts* and very helpful name
badges, and headed for dinner at the bar down the road.
As is traditional when I write about my experiences in new places, I will
describe the food every day. It has become apparent, at this residence at
least, that variety of ingredients is not ordinary, so in this respect meals
are simple. Dinner that first night started with a salad (lettuce, olives,
tomato, onion, shredded beetroot and a single slice of hard boiled egg; no
dressing), followed by - for the majority - slices of meat (beef? Pork? I
dunno..) and fries. Mine was a plate of mushy green vegetables with a little
seasoning, that was pretty tasty. Dessert was a single pear, delivered with
ceremony, but otherwise unadorned. Healthy, at least.
Yet all of us (those I sat with, at least) were left feeling a little
unsatisfied.
I shared a table with a French, Spanish, Italian and Irish guy. Conforming
appropriately to stereotypes, and setting up reputations for the rest of the
week, the French and the Italian shared the bottle of wine on the table; the
rest of us went without.
I returned to bed after a couple of hours of socialising and enjoying the cool
air in and around the bar.
For next year, they could ask for t-shirt sizes when they ask for dietary preferences?
Monday
The day started early, and with no hot water or wifi for anyone. Breakfast
was combinations of sweet pastries, coffee, tea, juice and bread.
Punctuated variously by coffee breaks, the learning began in earnest.
During the introduction by Mathieu D'Aquin, I found out that I am one of 53
students selected out of 96 applicants to attend this year's Summer School of
the Semantic Web! I had no idea it was that selective, or that there had been
that much competition.
The first keynote was by Frank van Harmelen, about all the Semantic Web
questions we couldn't ask ten years ago.
Frank started by saying that the early Semantic Web vision has morphed into
the more manageable vision of a Web of Data, or a Giant Global Graph, and
outlined the principles of the Semantic Web as they appear to stand at
present:
1. Give everything a name (entities).
2. Relations form graph between things.
3. Names are addresses on the Web (so we inherit properties of Web like AAA).
4. Add semantics.
Frank pointed out the advantages of the fact that the Linked Data cloud, grown
naturally and not designed, is now so big we don't know how many triples it
contains, nor how fast it is growing. Companies and organisations (like
Google, NXP, BBC, DataGov) are using Semantic Web technologies to achieve
their own ends, for a variety of different use cases, without caring much
about the Semantic Web, and this is contributing to the growth.
This growth has given rise to a number of research areas that it was
impossible to realistically ask questions about ten years ago, including
self-organisation, distribution of data, provenance, dynamics and change, and
errors and noise (how to deal with disagreements).
Frank asserted that rules and structures, algorithms and patterns in data,
exist whether we are looking at them or not. He used the analogy that OWL is
our microscope, and it may be the tool that distorts our vision of the
information universe rather than properties of what we are looking at (for
example, structures in data presenting themselves well in some domains but not
others).
He went on to promote the role of the Informatician as being to test theories,
to hypothesise and falsify, as scientists rather than engineers. To discover,
rather than build.
I struggle with this view of the world, and feel instinctively that theory and
practice are intrinsically linked; one can't exist without the other, not just
in the grand scheme of things, but in day to day work and research. This is
one of the main points of contention with my own PhD, and I've no doubt there
will be many more blog posts about this issue in the near future as I
reconcile my need to create something immediately useful with the necessity of
producing a contribution to knowledge at large.
We had an Introduction to Linked Data by Mathieu D'Aquin (raw notes
here), followed by a workshop. We wrote SPARQL queries to populate a pre-
written web page with information about Open University courses, sub-courses
and locations thereof.
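For a flavour of what those workshop queries did, here's a toy basic-graph-pattern matcher in plain Python, sketching what a SPARQL query like `SELECT ?course ?title WHERE { ?course ex:title ?title }` does over a triple graph. The course data below is invented; the real exercise ran SPARQL against Open University datasets.

```python
# A toy basic-graph-pattern matcher over a list of triples.
# Strings starting with '?' are treated as variables, as in SPARQL.
EX = "http://example.org/"

graph = [
    (EX + "course/AA100", EX + "title", "The Arts Past and Present"),
    (EX + "course/TU100", EX + "title", "My Digital Life"),
]

def match(graph, pattern):
    """Yield variable bindings for each triple matching one triple pattern."""
    for triple in graph:
        binding = {}
        for p, t in zip(pattern, triple):
            if p.startswith("?"):
                binding[p] = t       # bind the variable to this term
            elif p != t:
                binding = None       # constant mismatch; skip this triple
                break
        if binding is not None:
            yield binding

titles = sorted(b["?title"] for b in match(graph, ("?course", EX + "title", "?title")))
print(titles)  # both course titles, alphabetically
```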
Lunch, similar to the previous night's dinner, was a starter salad, an entire
half chicken (or something) plus fries for the carnivores and the most
unappealing risotto of my life for me (not that I'm ungrateful, but I have
never been unable to finish a meal due to boredom before). I went for a walk with
some others to grab some fresh air before the afternoon's work, and missed out
on watermelon.
Manfred Hauswirth presented some really exciting stuff about annotating and
using streams of data. Particularly challenging is how to integrate this
with static data and make inferences over the lot. Streams include sensor
data, as well as ever-flowing social media streams for example; anything that
changes over time.
They've built some systems to process this kind of data, and one of them is
available as middleware.
In the afternoon we had a poster session, where all participants pinned up
posters about their work, and discussed at length with anyone who was
interested. Here's evidence that I participated.
And here's Paolo's:
I wrote a few notes about things from other peoples' posters that I need to look
up.
The main feedback I received was about making sure I focus, narrow down my
topic, and concentrate on some evaluatable deliverables that are PhD-worthy.
Questions like (paraphrasing) "why should we care about digital creatives?"
threw me, because the obvious answer - that they are people too, Web users,
technology users, contributors to culture and an ecosystem of digital content
and data - was apparently not enough from an academic standpoint.
I was simultaneously told to focus more, and to explain why the problem I'm
trying to solve is applicable to all domains, not just digital creatives. But
some of the problems I'm looking at have been (or are being) solved in other
domains (like e-health, biological research, education) and the reason what
I'm doing is interesting is because none of these solutions quite work for
digital creatives, and I want to find solutions that do, and try to figure out
why.
I'm still stuck in some sort of struggle between theory and practice; thinking
and doing. And the long-standing problem of how to decide which doing
actually worked.
I've started scribbling notes about the narrowing down problem. I'll need to
have this figured out before my first year review in August anyway, so stay
tuned for another post all about it.
Then I sneaked off for a nap.
Dinner at the bar again; the usual salad, plus some eggy fish thing for most.
I got a plate of artichoke. Artichoke is great, I love it, and I'm all for
simple meals. But I remain unconvinced that a plate of only artichoke
constitutes an acceptable level of effort on the part of caterers. And the
sheer quantity made it start to taste a bit funny after a while. But not to
worry; we rounded off with a solitary peach apiece.
Further socialising, and appreciation of the night sky, before returning to
bed write blog posts.
I'm super excited and inspired by the talks, work I've heard about so far, and
the atmosphere of the place. I'm excited to learn a helluva lot, and remind
myself that I'm not facing impossible problems, and am not facing many
problems alone. I remember that I am instinctively passionate about the Web
and the possibilities it holds (and indeed has already realised) for the
empowerment of individuals. I remember how lucky I am to be able to sustain
myself through studying something I love so much, and to have the potential to
make a change, and through my work maybe even facilitate others to be able to
make a living doing what they love, as well.
Filmic: continuity like camera movements, framing, direction of speaker, lighting, sound - rules that film directors know.
Statement encoding (eg. summary what the interviewee said):
subject - modifier - object statements.
Thesauri for terms.
Can make a statement graph, finding which statements contradict and which agree.
(He encoded this stuff by hand - automated techniques aren't good enough).
Argumentation model - claims, concessions, contradictions, support.
Automatically generated coherent story.
Are we more forgiving watching video? (Than reading these statements as text). Peoples' own interpretations strongly affect understanding of the message.
Vox Populi has (not for human consumption) GUI for querying annotated video
content.
User can determine subject and bias of presentation.
Documentary maker can just add in new videos and new annotations to easily
generate new sequence options.
User information needs - Ana Carina Palumbo
Linked TV. Enhancing experience of watching TV. What users need to make
decisions / inform opinions.
Expert interviews (governance, broadcast).
User interviews - what people thought they need (215 ppts).
User experiments - what people actually need.
Experiment - oil worth the risk?
eg. people wanted factual information from independent sources; what the benefits are; community scale information.
Published at EuroITV.
Conclusions
We can give useful annotations to media access, useful at different stages of interactive access (not just search).
Clarify the intended message. Make it explicit with annotations.
Manual or automatic.
Media content and annotations can be passed among systems.
No community agreement on how to do this.
How to store?
Questions
Hand annotations are error prone - how to validate?
Media stuff - there can be uncertainty, people don't always care.
Motivating researchers to annotate...
Make a game.
Store whole video or segments?
W3C fragment identification standards - timestamps via URLs.
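Those W3C Media Fragments let a URL address a time segment of a video directly, e.g. `video.mp4#t=10,20` for seconds 10 to 20. A minimal stdlib-only parser for the temporal (`t=`) form (the URL below is made up for illustration):

```python
from urllib.parse import urlparse

def parse_temporal_fragment(url):
    """Return (start, end) seconds from a '#t=start,end' media fragment.
    Returns None if there is no temporal fragment; end is None if omitted."""
    frag = urlparse(url).fragment
    if not frag.startswith("t="):
        return None
    start, _, end = frag[2:].partition(",")
    return (float(start) if start else 0.0,
            float(end) if end else None)

print(parse_temporal_fragment("http://example.org/video.mp4#t=10,20"))
# (10.0, 20.0)
```

Note this only handles the plain seconds form, not the `npt:` or clock-time variants the full spec allows.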
#SSSW2013: Collaborative ontology engineering and team formation
We were introduced to the various mini-projects on Tuesday morning, and
encouraged to form teams with people who weren't from the same university. I
quickly shortlisted the five that sounded most interesting to me, but was
disappointed that there weren't any about multimedia. Because how to evaluate
a very subjective system is a potential problem for me, the project proposed
by Valentina Presutti was my first choice:
"Serendipity can be defined as the combination of relevance and
unexpectedness: an information is considered serendipitous if it is at the
same time very relevant and unexpected for a given user and in the context of
a given task. In other words, a user would learn new relevant knowledge. To
evaluate the performance of a tool (e.g., an exploratory search tool, a
recommending system) in terms of its ability to provide users with
serendipitous knowledge is a hard task because both relevance and
unexpectedness are highly subjective. This miniproject focuses on two main
research questions: what is the correct way of designing a user-study for
evaluating an exploratory search tool performance in terms of serendipity? Is
it possible to build a reusable set of resources (a benchmark) for evaluating
ability to produce serendipity, allowing easier evaluation experiments and
comparison among different tools?"
Nobody else seemed to be interested though, so I resigned myself to not being
able to do it... until I explained the project and why it was interesting, to
the best of my ability, to Andy, Oscar and Josef, and they were sold enough to
mark it as our first choice. Thus Team Anaconda Disappointed (a name of
significant and mysterious origins) was born, and Project Cusack (because of
the movie Serendipity, which nobody got) was underway.
Our first lecture today was from Lynda Hardman, about telling stories with
multimedia objects. It was super relevant to what I'm doing, to the point
where I'm surprised I hadn't come across her work already. My notes are here.
Lynda has done, for example, work with annotation of personal media objects
like holiday photos in order to combine them into a media presentation. She
has considered similar things to me, in particular noting that there are many
many aspects of data about multimedia - I had assembled my take on this into a
Venn diagram for my poster..
One I hadn't considered is annotating an explicit message of a piece of
media, intended by the creator. This isn't always relevant - sometimes the
consumer's interpretation of the media is more important - and this in itself
might be an interesting annotation problem. Competing perspectives -
something an ontology should be able to represent.
I need to check out COMM - Core Ontology for Multimedia.
She gave an overview of the canonical processes into which they have
consolidated the production of digital content, and of how annotation can be
formed around these.
Lynda also told us about Vox Populi and LinkedTV; practical applications
of annotating multimedia.
Natasha Noy gave us some insights from the biomedical world with regards to
ontology development, particularly in relation to the International
Classification of Diseases which, when last revised in the 80s, consisted of a
lot of paper and a whoever-shouts-the-loudest algorithm for inclusion of
terms. But the next version, currently under creation, is being developed
with a version of Web Protege, customised to be friendly for those who don't
know or care about ontologies, and is a truly collaborative process (for those
allowed to take part) with accountability for all changes. It's open too
though, so even those without modification rights can view and comment on the
developments. My notes are here.
Lunch was for the first time outside, under the shadows of the forest, and for
me was a tray of tomatoey vegetables that were delicious but few. A striking
contrast to Monday's lunch. Everyone else had some meat-potato combination,
preceded by a salad with tuna, and followed by a peach.
The hands-on session followed on from Natasha's talk. We teamed up
(temporarily Anaconda Hopeful) and played with Web Protégé. There were two
magazines and two newspapers, each with four departments. Anaconda Hopeful
were randomly designated the Advertising Department of Iberia Travel (a food
and travel magazine). We got stuck in, on paper first to identify some
classes and relations that were relevant to us, and then with Web Protégé,
along with the other departments of Iberia Travel. We didn't come into any
conflicts, but ended up creating a few classes that we needed, but should
really have been the remit of another department (I guess we just got there
first).
Then it was announced that Iberia Travel had bought the other magazine (and
one of the newspapers had bought the other), and we had to work together to
merge ontologies with the other department. It became apparent that the other
magazine had never had an Advertising Department (no wonder they went under!)
so we had no-one to attempt to merge ontologies with. We attempted to sell
our expertise to the Advertising Departments of the newspapers, but there were
already too many people involved in the heated debate that came out of the
ontology merging there, so we couldn't really get involved.
Later we got cracking with our mini-projects. Valentina showed us
aemoo, and the experiments her team had come up with to
try to evaluate it. We sat down by ourselves to brainstorm, describing a lot
of concepts for ourselves, breaking down the notion of serendipity, figuring
out what might be wrong with existing experiments to 'measure' serendipity,
and collating literature in the area. (Turns out there is a lot, and it's a
very interdisciplinary issue; lots to read about from social sciences,
anthropology etc, as well as philosophy of science. In computing, it seems to
be primarily discussed within the realms of recommender systems and
exploratory search).
Serendipity seems to be mainly described as a combination of unexpectedness
and relevance. Problems include the sheer subjectivity of it. Some people
are going to get excited by all facts they find out, whether they're useful or
not. Some people are going to have hidden, inexplicit or subconscious goals
that affect how 'relevant' something is to them. People describe their
different areas of expertise in different ways; some are more humble than
others and would not call themselves an expert in a topic, for example. So
whether or not an event can be considered a serendipitous one is a complex
question, which must take into account the person's background, goals and
existing knowledge, the task they are trying to achieve (or lack thereof, as
serendipity is particularly important - in my opinion - in undirected,
loosely-motivated activities), the way they are able or encouraged to interact
with a system, what they are doing before and after... all these things make
up a context for someone's activities, and none of them seem to be
particularly measurable.
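To make the 'relevance + unexpectedness' definition concrete, here's a deliberately naive scoring sketch. The multiplicative combination is my own illustrative assumption, not an established metric - exactly the kind of modelling choice a benchmark would have to justify:

```python
def serendipity(relevance, unexpectedness):
    """Both inputs in [0, 1]. Multiplying (rather than adding) encodes the
    idea that an item is serendipitous only when it is simultaneously
    relevant AND unexpected; either factor at zero kills the score."""
    assert 0.0 <= relevance <= 1.0 and 0.0 <= unexpectedness <= 1.0
    return relevance * unexpectedness

# A highly relevant but fully expected result scores zero:
print(serendipity(0.9, 0.0))  # 0.0
# Only jointly high values score high:
print(serendipity(0.9, 0.8))
```

Of course, this just pushes the subjectivity problem into how you measure the two inputs per user and per task, which is precisely the hard part the mini-project is about.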
Dinner was a vegetable and potato (yay!) starter, followed by spaghetti in
tomato sauce (fish for everyone else, although Andy got a custom omelette,
lah-de-dah). Also an apple. We learnt the hard way not to sit at a table
directly underneath a light, as the bugs just raiiiin down.
After dinner we crowded around Enrico who had offered to provide advice about
PhD-ing. From this session, I have a signed diagram of the life of a PhD,
because he borrowed my notebook to make it. I tuned in and out of the
discussion, and noticed some irregularities between my PhD and what seemed to
be 'normal'. For instance, most people didn't seem to have as much control
over their topic, or what they were doing at any given moment in their first
year. I am really, really enjoying my freedom, but in order to justify that I
deserve it I need to sort out my lack of direction and focus. I need to
believe in what I'm doing - not be told by someone else - which is one of the
main reasons I am doing this particular PhD. Perhaps I need to ask for more
guidance to more quickly reach the necessary conclusions for myself. (And, of
course, perhaps I also need to stop taking big chunks of time out periodically
for different reasons; that might
speed up the process as well).
Later, the overriding sentiment was that the job of a PhD student was to
answer a question, to produce a theory. Not to create a system or solve a
large problem; certainly not to worry about practical, real-world
applications of theories. Well, I've already explained that this is
something I can't accept, and I still am not convinced that that is going to
impact on my ability to do a PhD. Theories develop during practice. Coding
and designing, like writing, are part of my thought processes, and I reach
realisations or find new questions to ask through hacking and playing and
making. And why would I be hacking and playing and making, if not to try to
produce something of real-world value? If my motivation in making a system is
explicitly to come up with new theories, then my approach and outcomes and
realisations will be entirely different. In trying to make something that
works for real people, not for researchers in a restricted domain or specific
context (a clean and sterile laboratory), I figure out different things,
things that matter.
There was another discussion that I came into a bit late, but it sounded like
a very harsh discussion about problems with research in industry (rather than
academia) that seemed to be very overstated compared to what I have read and
experienced myself.
By the end of the day, it felt like I'd been at Summer School for weeks, and
had known everyone forever.
Social Media Analytics with a Pinch of Semantics
Using semantics to solve problems (not solving problems of semantics).
SM for businesses:
Analytics.
How to measure success?
SM silos impeding progress.
In-house social platforms increasing, so even more so.
SIOC to integrate online community information.
SIOC + FOAF + SKOS.
FB Graph.
People are likeaholics. Their 'likes' become meaningless, so you need to take
this into account when making recommendations.
Browse your data and understand user actions.
Behaviour analysis.
Bottom-up analysis.
Can handle unexpected or emerging behaviours.
Community members classified into roles.
Identify unknown roles.
Cope with role changes over time.
Clustering to identify emerging roles.
eg. focussed novice; mixed novice; distributed expert; ...
Spectrum across users you can or can't do without.
Extending an ontology built on SIOC.
Encoding rules in ontologies with SPIN.
Three categories of features:
Social features (people you follow, people follow you, ...)
Content features (what you're posting, keywords, ...)
Topical/semantic features
Which behaviour categories you need to cater for more than others? How roles
impact activity in online community.
Consistently see that you need some sort of stable mixture of behaviours for
activities in forums to increase.
==> Don't know what's causing which.
What is a healthy community?
Use behaviour analysis to guess what's going to happen to community. Eg.
Churn rate.
User count.
Seeds/non-seeds prop (how many / if people reply to you).
Clustering.
Unexpected: the fewer focused experts in the community, the more posts
received a reply.
(But quality of answers?)
Community types (Little work in this space)
Muller, M. (CHI 2012) community types in IBM Connections:
Communities of Practice
Teams
Technical support
..
.. (see slides..)
Need an ontology and inference engine of community types.
Wants an automated process to tell you what type of community it is - it might
be something it wasn't set up for.
Then you could determine what sort of patterns you would expect to find.
No one has done this yet.
Measurements of value and satisfaction
Answers different across communities. They ran it on IBM Connections -
corporate community.
Most of this work is for managers of communities - see what's happening and
help to predict what might be coming next.
Can classify users based on Maslow's Hierarchy of Needs?
Mapping the hierarchy to social media communities.
~90% users happily staying at the lower levels of the 'needs hierarchy'.
Behaviour evolution patterns
What paths they follow over time.
eg. people who become moderators eventually.
Engagement analysis
What's the best way to write a tweet so that people care about it?
Which posts are likely to generate more attention?
Getting bored of people finding patterns in individual datasets. What can be
generalised to other communities?
So experimented with 7 datasets and looked at how results differed across:
community types.
randomness (vs. topicality) of datasets.
related experiments.
And people use different features.
Semantic sentiment analysis in social media
Too much research going on, especially on twitter.
Extract semantic concepts from tweets; likely sentiment for a concept.
Tweetnator.
Semantics increases accuracy by 6.5% for negative sentiment; 4.8% for positive
sentiment.
OUSocial.
Students don't use in-house networks because they already use facebook groups
etc. Want to analyse what's happening on them.
Upcoming
Reel Lives (inc. Ed.)
Fragmented digital selves.
Want to automate compilations of media (photos, messages) posted online.
Changing energy consumption behaviour.
Providing information is not enough.
Harith Alani talked about using semantics to solve problems around evaluating
the success of social media use in business. The SIOC ontology is widely used
to describe online community information. It's not as simple as measuring
someone's engagement with a brand's online presence - people are 'likeaholics'
on Facebook, so you have to look at someone's whole behaviour profile to judge
whether their like means anything or not. It's no good just aggregating your
data and spewing out numbers - you have to browse the data and try to
understand where it came from.
He mentioned how little work has been done in classifying community types.
Most of the work that has been done seems to be with social networks internal
to an organisation. A bottom-up approach to community analysis can handle
emergent behaviours and cope with role changes over time. Looking at
behaviour categories and roles can help an organisation to decide who to
concentrate on supporting and how in order to sustain the community. The
results they have seen so far suggest that a stable mix of the different types
of behaviours is needed to increase activity in forums - but they don't
know what causes what. They're reaching a point where they can use their
behaviour analysis to guess what's going to happen to a community: how long it
will last, how fast it will grow, how many replies a certain type of post is
likely to get, etc.
Next they want to be able to classify community types, and be able to look at
activities within a community over a period of time and automatically discover
what kind of community it is; it might be something different than what it was
set up for.
They created an alternative Maslow's Hierarchy of Needs to correspond with
activities seen on forums, and found that most people are happy to stay at the
lower levels of the hierarchy. For example, join a community, lurk for a bit,
ask one question and leave. Not everyone wants or needs to be a power user.
Papers are being written that find patterns in individual datasets for a
particular community in a particular context. Harith and his team are getting
tired of this; they want to generalise across communities. So they took seven
datasets and looked at how the analysis features differed, as well as
comparing the results across community types, randomness (vs. topicality) of
datasets, and related experiments.
Upcoming work includes the Reel Lives project, in which UoE is involved.
They're taking media fragments - photos, videos, audio clips, text recorded as
audio - and creating automated compilations to tell a story.
From Tommaso Di Noia's talk, I learnt that recommender systems have a lot of
maths behind them, especially for evaluating things, and reinforced something
I already knew: I don't maths good enough to be taken seriously by most of the
Informatics world. I think I understand the principles behind the maths, but
when something is described in just maths, I have no idea what it relates to.
I'll work on this.
Real world recommender systems use a variety of approaches, including
collaborative (based on similar users' profiles); knowledge-based (domain
knowledge, no user history); item-based (similarities between items); content-
based (combination of item descriptions and profile of user interests).
Linked Open Data is used to mitigate a lack of information about entities, and
helps with recommending across multiple domains. You do have to filter the LD
you use before feeding it to your recommender system though, to avoid noise.
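To make the item-based approach concrete, here's a minimal sketch (the ratings data and item names are made up for illustration) of computing similarity between two items from the columns of a toy user-item ratings matrix, using cosine similarity:

```python
from math import sqrt

# Toy user x item ratings (missing key = unrated). All names are hypothetical.
ratings = {
    "alice": {"film_a": 5, "film_b": 3, "film_c": 4},
    "bob":   {"film_a": 4, "film_b": 5},
    "carol": {"film_b": 4, "film_c": 5},
}

def item_vector(item):
    # Column of the ratings matrix: each user's rating for this item.
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(a, b):
    # Cosine similarity between two sparse rating vectors.
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

sim = cosine(item_vector("film_a"), item_vector("film_b"))
```

A real system would then recommend items most similar to ones the user already rated highly; where Linked Open Data comes in is enriching those item descriptions when the ratings alone are too sparse.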
Notes here.
Tommaso's talk was followed up by a hands-on
session, where we got to poke about with some of
the tools he mentioned, including FRED
(transforms natural language to RDF/OWL); Tipalo (gets entity types from natural language text); and
using DBpedia to feed a recommender system.
Then we worked on our mini-projects for the afternoon. We made some progress
towards breaking down the concept of serendipity and working out what
properties we might need to represent as linked data, and how we could
observe a user and work out if/when/how they were having serendipitous
experiences without intruding too much.
In the evening we took a coach to 'nearby' historical town Segovia.
Apparently an extremely motion-sickness-inducing two and a half hour coach
journey around twisty mountain paths is 'nearby'. Fortunately I was
distracted from this horrible journey by a conversation with Lynda Hardman,
which I wish I had recorded. Lynda challenged various aspects of my PhD until
I could explain/justify them reasonably, including:
Why digital creatives? (I'm used to that one now).
What is the outcome?
Why Semantic Web for this?
She also recommended a number of resources, including theses of her recent
former students to help me with a structure for my own, and advice on
maintaining a healthy balance between thinking and doing.
Plus she used to live in Edinburgh, more or less across the road from where I
live now. Cool. Thanks Lynda! You haven't heard the last of me :)
#travel
Once we got to Segovia, we had a guided tour of the ancient Roman
architecture, interesting building façades and local legends. It was a very
good tour, but too hot to really focus. Then they took us to a restaurant for
a local speciality. I was all set to write a whole individual blog post
surveying the barbaric nature of human beings, but I didn't do it straight
away and now the passion has faded slightly, so I'll leave it at a paragraph.
Some people watched the local 'ceremony' out of morbid curiosity I imagine,
but it was the fact that so many people took so much pleasure in the idea of
violently hacking up bodies of three-week-old piglets that really bothered me.
Fortunately the surging standing crowd allowed me (and only one other) to
inconspicuously sit it out. The veggie option was tasty, but it was difficult
to really enjoy the rest of the evening whilst wondering vaguely about the
states of mind of most of the people I was sharing a table with.
River belongs to citizens, not authorities.
Physical sensors (hard layer) are expensive and brittle.
So use people instead (soft layer, social).
Give people small sensors. Phones.
Then you just need software for information management.
capture.
integrate and correlate data.
share.
Can't rely on phones.
Old people in Doncaster.
Give them easy sensors instead.
camera
humidity
position GPS
water depth, velocity
rainfall via accelerometer
cloud coverage via luminosity
Costs about EUR 80.
Open Source & hackable.
Not expected to substitute professional sensors, but a way to crowdsource
information you would never get.
In Delft
Give people flood preparation advice and record who ticks things off, to build
a picture of who/how/when preparations take place.
The Floow Ltd
"Commercialises data solution for telematic insurance."
World divided into 10x10m squares, sense things everywhere.
Traffic risks.
Sensors tell you people are going somewhere, not why.
That's what social media can tell you.
Monitoring development of a house fire via Twitter.
Seeing events through the eyes of the community.
Social streams:
High volume
Duplicated, incomplete, imprecise, incorrect
Time sensitive / short term
Informal
Only 140 characters
Spam
Large music festival. Monitor geolocated messages, trends, topics and
relations.
Most 'critical' events were management issues.
Developing system to warn you automatically about things to pay attention to.
Look/listen for event within 72 hours. 10 minutes to find out what it was.
- Simulation of station bombing.
Minute by minute description of event.
1.5 billion messages.
Linguistic issues
Alternative language
Negatives
Conditional statements
Hope/prayer statements
Irony/sarcasm
Ambiguity
Unreliable capitalisation
Data sparsity
Four things when monitoring:
What
Identify, classify, cluster
Events and sub-events
Involved entities
Who
Human or not?
Bots can be benign, but many are a serious risk.
Bots that pretend to be humans.
When
Where
Big problem - people tweet crap!
People don't realise when people nearby are in danger.
Deception on social media
False crowdsourcing political support on social networks.
Smear campaigns using bots.
Bots to foster / prevent social unrest.
Identifying bots
23 behavioural features.
Feature set is open.
Recognise 90% of bots - more than humans can do.
A very small proportion of tweets are geolocated, so that's useless.
Have to use the text.
Timestamp is not necessarily correct.
Issues in events
No infrastructure (eg. at music festivals).
Phone signal issues, phone charging issues.
Most tweets from outside event.
Conclusions
Need to convince citizens that authorities are not spying on them.
Need to convince authorities that citizens are not all criminals.
Privacy and legality issues.
Creating a company on this research would be unethical.
Need to pass the right message. Full disclosure. Non-intrusive use of tweet
content.
What happens when authorities demand this technology for privacy-invading
stuff.
Have to be careful with what you publish.
Always assume the bad guys have thought of what you thought of.
Always be in a situation where you can destroy your data at short notice.
Big legal barrage behind them. Know what they are/aren't allowed to do, and
what they do/don't have to do.
Start leading a blameless life.
We started work on the serendipity project before breakfast today, although I
didn't make it down as early as some of my teammates.
To start the day, Fabio Ciravenga talked about some really exciting practical
applications of monitoring and analysing social media streams. It's
particularly interesting during emergencies, or large events where problems
might occur. The people on the ground make the perfect sensors if you can
work out the differences between people who are saying something useful and
who aren't; people who are really there, and people who are speculating or
asking about the situation. A main problem has been that people tweet crap.
They were trying to monitor a house fire, but so many people were tweeting
lyrics from Adele's various singles, which all apparently contain references
to fire, that it was almost impossible.
They also put (or tapped into existing) sensors in people's cars to monitor
driving patterns with the aim of more fairly charging for car insurance. I
told my Mum about this the other day, and she was pretty alarmed by the idea.
Which made me wonder how they'll get mass adoption, if it's going to go
anywhere.
Fabio did have some interesting things to say about using all this data
ethically though, and never working for someone who is going to take that away
from you. But in case the 'bad guys' do find out about all this data you have
about people, keep a magnet handy.
This was followed by a hands-on session where we got to mess with a mini
version of the twitter topic monitoring system that Fabio's team use at large
events, to try to answer questions about the Tour de France only by
manipulating the incoming social media streams and following only links which
came through that.
Spanish omelette sandwiches made for an amazing leisurely outdoor lunch. We
headed to the pool down the road and chilled out there for a couple of hours.
Us tough British folk found the water pleasantly tepid, whilst all those wimpy
Europeans and Latin Americans shivered on the grass. They'd made such a fuss
in advance about how cold the pool was going to be.
We regrouped that afternoon to work on Project Cusack, creating a slide deck
of pictures from Serendipity. I don't like slides with too much to read on
them, so I enforced that. The imagery from the movie will be lost on most people,
but we have at least managed to choose pictures of John Cusack with
appropriate expressions for each part of the presentation. We worked outside
in the forest, because Oscar's 3G was faster than the residence wifi.
We also brainstormed for the required short film, which we only just
discovered doesn't have to be about our project.
We returned to the residence to find everyone eating ham and cheese, and
attempted to get some shots for our film, but other people were unwilling to
participate.
That evening we ate tasty vegetable soup, weird (in a bad way) pasta in a
creamy onion sauce, and chocolatey ice cream cake. The tutors spontaneously
organised a game where students had to arrange the tutors by age, which was
funny. Someone suggested the tutors ought to play it with the students.
Obviously there were too many students, but they elected to find the youngest
student, and that turned out to be me.
Week(s) in review: #SSSW2013, figuring stuff out and annotating YouTube
8th - 14th July Semantic Web Summer School, much heat, much fun, much learning... Here's an index of my posts.
15th - 21st July Friends visited. Progress included writing notes to myself to figure out just what my PhD outcomes really are, and why. Came up with:
1. Recommending how to usefully describe diverse amateur creative digital
content (ACDC) using an ontology.
a) What are the parts of ACDC that need to be represented? Identify and categorise properties. How do these differentiate it from other similar content?
b) What existing ontologies can be used to do this, and how do they need to be extended?
2. Building an initial set of linked data about ACDC, and providing means for
its growth and use.
a) Manual annotation of ACDC, and refinement (to test ontology).
b) Tools for automatic annotation of the parts of ACDC that it is possible to automatically annotate.
c) Tools for manual annotation by the community of content creators and consumers for the parts of ACDC that cannot be automatically annotated.
d) Tools to expose the linked data for use by third-party applications.
3. Create and test an example service which uses the linked data to benefit
content creators and/or consumers.
eg. Unobtrusive recommendations for collaborative partners (most likely); content recommendation; content consumption analysis (like tracking viral content); community building / knowledge sharing in this domain; ... .
22nd - 28th July Brainstormed with Ewan about stage 3 (above), and came up with the idea of an interface that allows content creators to allocate varying degrees of credit for roles played by different people when collaborating on a project. This would serve to gather collaborative bibliographic data, learn things about how different segments of the community allocate credit, and provide a potentially useful tool for content creators. With the future value that, if we can learn enough to estimate role inputs from different people, it could be used for things like automatic revenue sharing.
Then spent the rest of the week in London, frolicking amongst the YouTubers
(including attending a meeting at Google about secret YouTube-y stuff), and
annotated some ACDC. Write-up coming soon.
Vague thoughts about content creators and the Semantic Web
I had two meetings with Dave Robertson, my second supervisor, about what on
earth I'm doing, and here is a vague summary of my thoughts afterwards.
I came to the realisation between meetings that I need to scrap the term
Amateur Creative Digital Content, because amateur doesn't really apply by its
true definition and creative is too subjective anyway.
Focus on content creators, not content (so previous point doesn't matter so
much anyway; maybe just need to look at existing ways people are describing
types of users to make it clear who I'm concentrating on).
In terms of emphasis of the thesis, I need to make a choice between taking a
cognitive science/sociology perspective and a tecchie/engineering perspective
(I choose tech because that's where I'm most comfortable, but the sociology
side of things is still important).
(Therefore) I need to think concretely now about technology architecture.
Not to get too hung up on the Semantic Web; the technologies are a vehicle for
testing theories, rather than an end in themselves (though I still think
facilitating a big linked data set of this sort of data is useful in the long
run for research and practical applications, I didn't labour that point).
Social machines, and how Dave's process modelling language fits in, which I
think I get in theory but not practice (I'd probably have to look at a working
application and code to understand really). Some of the principles may be
useful further down the line, but probably not the language itself or
anything.
Technology-wise, I'm not thinking about anything novel or new, but more new
ways for how various Web and SW technologies are combined and applied to this
domain. (?)
So maybe the novelty is in marking up various things about content creators
and using this to infer information about the processes they're involved in
(or want to be involved in) in order to then facilitate these processes,
without (necessarily) ever explicitly representing these processes (because
from the content creators' perspective, they're certainly not thinking in
terms of formal representations of processes, and in many cases won't know
what they're trying to make until it's done, for example).
How to represent the inferences made might be novel and exciting, but I don't
know.
Hmm, I still don't think I've figured out how to evaluate .. anything. Beyond
comparing activities of users with magical-new-system vs without magical-new-
system. And maybe, going back to the this-big-dataset-is-useful idea, by
finding questions we can now ask about these kinds of communities that we
couldn't before because they were so fragmented.
Co-organiser and mentor, Edinburgh Centre, Young Rewired State
Helped to organise the local centre, recruited participants in the run up, and helped with mentoring during the week. Also helped to supervise a coachload of kids for the trip to the Festival of Code.
Rhia Ro was the name I chose (when I was around 9 or 10 years old I think) and I decided any budding author worth their salt needed a penname (I always wanted to be a writer of fiction; I still do). I scribbled Rhia Ro at the bottom of all my short stories for years, and when I started to converge on a consistent online identity, that's what I went with (before then, you might've known me as gerbilsbhs, theboynextdoor, TheRingleader, mysticalmoonflower or kirilya).
Buying rhiaro.co.uk at the end of 2008 was a real formative moment for Rhia and myself.
It often gets autocorrected to rhino.
Most people instinctively pronounce it rih-haro, ree-haro or ree-aro, and that's okay, but really it's reah-ro.
Young Rewired State is a week-long hack event for under 19s. There are
centres all over the UK, and the week finishes with a giant sleepover in the
Custard Factory in Birmingham, presentations and prizes.
I was helping out with running the Edinburgh centre this year, between the 5th
and 11th of August. We had 15 young people taking part, and a few parents
popping in and out as well. Not to mention several fantastic mentors.
Every day we gathered in one of the University of Edinburgh Informatics
computer labs. On the first day we did some brainstorming, introduced the
young people to Open Data, and they sorted themselves into teams.
We had a diverse range of projects by the end of the week.
The Weatherproof app was written in Scala with a Web frontend, and as well
as telling you the weather forecast, gives you practical advice on what to
wear and what to take with you.
_Stuff Index_ was a Python Web app that lets people photograph and upload
stuff they've left out on the street that they want to get rid of, so anyone
browsing the site can opt to take it away if they fancy it. Helping to keep
stuff out of landfill, and without the dreaded social interactions that come
with Freegle.
Tag is a game by a one-man team, with a Python game server and a JavaScript
front end that lets you chase your friends around the real world, and
automatically tags them when you're in range.
_PokeGame_ is a real-world Pokemon simulator that lets you roam IRL and
capture virtual Pokemon.
Great stuff!
On Friday we crammed into a coach along with the participants from Aberdeen,
Dundee and Glasgow, and set off on a seven hour road trip to Birmingham for
the finale.
The Edinburgh teams didn't win anything, but the presentations were fantastic
and everyone had an amazing time. The young people made new friends, learnt
tons of new stuff, and hopefully remain enthused about coding.
Next year we're going to do more to walk through the creation process of some
example apps to get them started off, and maybe do a better job of introducing
Open Data and the possibilities it holds.
We're also thinking about starting a regular under 19s code club in Edinburgh
weekly or bi-weekly - so stay tuned for more info about that. (And if you
want to help or participate, get in touch!)
Freegle is a national network for promoting reuse and reducing waste by enabling people to give away stuff they don't need anymore, and other people to pick second-hand things for free, in their local area. I'm a sub-par mailing list moderator, and I feel bad about being so inattentive.
I have a thing where I can't not finish reading something. There's a very
short list of books I never finished, and they all date from when I was about
8 to 13, and are because I was too young to follow them or too young to bear
them, and just haven't got around to picking them up again. They haunt my
subconscious.
These days I feel I have to get at least half way to have given it enough of a
chance, and once I'm past half way I feel I might as well finish it.
By the time I had wrenched my way through the first half of Angel of Death,
I had started to come round to it. By the end I guess I'd enjoyed it in some
ways.
What I struggled with through the first half was the erratic jumping between
persons and, worst of all, tenses. You experience one character's perspective
in first person present tense (something I dislike anyway), plus second person
directed at another character. And sometimes past tense. Other perspectives
were usually third person past tense. I guess I got used to it, but if
someone had told me before I started that it was written in this way, I
probably wouldn't have picked it up.
I'm starting to figure out that I prefer character-driven narratives. This is
the pattern with things I've enjoyed lately, anyway. The premise of Angel of
Death was kinda interesting. But the characters were utterly flat and often
behaved unbelievably or in a very contrived manner, given how they'd been set
up. It was all tell and no show. THIS IS A BAD EVENT IN HER TROUBLED PAST,
OH BOY, NOW YOU KNOW HER MOTIVATIONS. Totally wasn't enough.
The twists and turns throughout are, I suppose, well done. The reader is
convinced of the state of the world and, just as you're absolutely certain
that that's the way it is, you're being convinced of the opposite. This
happens not quite enough to feel like an indecisive cop-out, but isn't far
off.
There's gory horror, but it feels appropriate and not over-done. Kudos.
If there were deeper levels of meaning or metaphor intended, which I suspect
they might have been, I missed them.
Conclusion: meh, don't bother. But if you've got nothing else to read, you
could do worse. If you want my copy you can have it, get in touch.
The SHRUB is an awesome cooperative dedicated to promoting reuse, recycling, repairing and upcycling. I helped with early sorting-out of the new shop premises, made some shelves.. spent a few hours per week working in the shop in exchange for swap tokens with which I have procured all manner of exciting things, and helped with the big university halls cleanup last year. I regret that I've only managed to make it in a couple of times in 2015.
I haven't blogged about yoga yet, but now seems like as good a time as any to
start.
I started yoga-ing towards the end of 2012, with on-and-off classes at the
Commonwealth Pool, then joined two beginner classes (with very different
teaching styles) in January that ran for a semester as part of Edinburgh
Council's Adult Education Programme. I was
hooked, and since the start of this semester I've been doing four classes a
week:
Monday is Vinyasa 'power' yoga, one of the classes held by the University's
Yoga Society. It's fast,
sweaty and intense, and I'd never have been able to handle it - or enjoy it -
as a complete beginner.
Tuesday is a really relaxed beginner class running this term through the
University Chaplaincy, mainly for relaxation. Great for the final hour of the
24 hour recovery from Monday's class.
Wednesday and Thursday are post-beginner Adult Education Programme classes, in
Cameron House Centre and Nelson Hall respectively, with one of the teachers
from my first semester of regular classes before the summer.
But what I really wanted to say, is that today I got myself into a full
backbend unassisted (the last two weeks I've had help) and got substantially
closer to reaching my toes with a straight back than I ever have before.
Go me.
(I definitely have shorter hamstrings than is normal, and my main aim with
yoga is improving on that).
I'm using Michel Fortin's PHP Markdown to grab each md file in the posts directory and convert the contents to HTML.
I'll add some custom stuff to parse labels and categories that I plan to use to organise my content.
I also need to think about templating, and when the to-triples part is going to happen.
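The actual engine is PHP (Michel Fortin's PHP Markdown), but the gist of the loop is language-agnostic; here's a rough sketch in Python, with a deliberately tiny stand-in converter (real Markdown conversion would be done by a proper library):

```python
import os

def md_to_html(text):
    # Tiny stand-in for a real Markdown converter (PHP Markdown in my case):
    # handles only '#' headings and plain paragraphs, for illustration.
    html = []
    for block in text.strip().split("\n\n"):
        if block.startswith("# "):
            html.append("<h1>%s</h1>" % block[2:].strip())
        else:
            html.append("<p>%s</p>" % block.strip())
    return "\n".join(html)

def render_posts(posts_dir):
    # Convert every .md file in the posts directory to HTML, keyed by filename.
    pages = {}
    for name in sorted(os.listdir(posts_dir)):
        if name.endswith(".md"):
            with open(os.path.join(posts_dir, name)) as f:
                pages[name] = md_to_html(f.read())
    return pages
```

Templating would then slot each rendered post into a page layout, which is where Smarty or Twig would come in.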
Updates (16:48)
PHP Markdown is already able to work with Smarty, but the Smarty docs are awful and the site hurts my brain.
The site and docs for Twig seem much nicer, and it claims to be more modern and faster than Smarty, so that's leading the templating engine race at the moment.
Do I need to use a templating engine at all? Isn't this kind of what I'm building..?
It's easy to see why not many people do. I say not many, because I'm sure there are people using the various RDF Drupal or Wordpress plugins. But I don't really know if they count, because that's really just people annotating their content for SEO purposes, and maybe a bit of interlinking with DBpedia to let visitors find out a bit more about topics of interest.
But what I'm really talking about is the whole blog running from a triplestore, with a publicly accessible SPARQL endpoint.
All the publicly accessible SPARQL endpoints I can find are big 'useful' datasets that people might want to mash up with other things. Archives of data. Not live or frequently updated stuff. Certainly not the contents and metadata of a personal blog.
So what's the point?
There's not a lot else 'on' the Semantic Web to tap into. There's no hidden Semantic Blogosphere that would yield great worth if I could only tap into it. There are no smart agents traversing the Semantic Web and aggregating interesting blog content for intelligent readers (are there?).
And there never will be, if nobody publishes their content this way.
What I want to create is a solution to allow the Every(wo)man to publish their blog (/whatever) 'on' the Semantic Web.
The Every(wo)man uses a cheap, perhaps archaically run, shared hosting environment, and has little to no control over what is installed there. The most accessible server-side language is PHP. The most likely default database is MySQL. Files are uploaded using FTP (typically through a client with a GUI) or a web interface like cPanel. Maybe they'd like to change to a new host who would allow them to, say, experiment with Python and CouchDB, but they just renewed their two-year contract, and besides the customer service with their current account is really good, and they have so much content knocking about on that server space now, and they definitely can't afford to be paying for duplicate hosting at the moment. And they also can't be bothered with the faff of remembering how to work with Heroku or something every time they sit down to tinker, not to mention the worry that their limited understanding of such a system will suddenly start incurring costs.
Disclaimer: Previous paragraph may be based on actual events.
The solution for running a triplestore from shared hosting is the PHP ARC2 library, which can, wonderfully, drag a MySQL database kicking and screaming into the SPARQL world.
But I didn't really want to just use MySQL. It is a terribly practical, but impure solution. I want a graph database, damnit, with a SPARQL endpoint, and I want somewhere for it to live that I don't have to pay extra for.
So what else is there?
I want to experiment with CouchDB anyway, so I checked that out. There are hosted instances - like Iris Couch that look pretty easy to deal with. Nobody seems to have optimised it as a triplestore though, so setting up a SPARQL endpoint and proper inferencing and stuff may be a bit beyond me right now.
There are lots of hosted triplestores I guess, but they all seem to be about 'Big Data'. I came across Dydra ages ago and signed up for their Beta and forgot about it. Then they called me at my office to find out what I wanted to use the Beta for. I bumbled that conversation because I'd forgotten what it even was (I think they introduced themselves with a different name, too), and I was generally dealing with the fact they'd actively Googled me (I asked) to get my office phone number (the University helpfully posts these on their website) which I had not given to them. And if I had figured out what was going on in time, I'm positive they would not have given me a Beta account for "maybe tinkering a bit to see what it does when I have time at some indeterminable point in the future".
UPDATE: The day after writing this I got an email with an invite code for the Dydra beta, and no personal contact from them. Crazy coincidence. I'll give it a shot.
Anyway, I'm sure I've come across hosted 4store services that didn't try to step on your toes about what you used it for, but I can't find any now.
Talis looks mega promising, aside from having an outdated site with broken links, but they're still allocating stores on a request basis. Grumble.
In the interests of moving forwards, I'm going to go with ARC2 and MySQL. But I'll make sure it's all modularised and stuff so it's easy to switch out the store for something else in the future.
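What "modularised" means here, roughly: the rest of the code talks to a small store interface rather than to ARC2 directly. A minimal sketch of the idea (in Python for illustration; the real thing would be PHP, and the in-memory store is just a toy stand-in for the ARC2/MySQL-backed one):

```python
class TripleStore:
    # The interface the rest of the blog code programs against, so the
    # backing store (ARC2+MySQL now, something else later) can be swapped.
    def add(self, s, p, o):
        raise NotImplementedError
    def query(self, s=None, p=None, o=None):
        raise NotImplementedError

class InMemoryStore(TripleStore):
    # Toy in-memory implementation; a production one would proxy to a
    # real triplestore and its SPARQL endpoint.
    def __init__(self):
        self.triples = []
    def add(self, s, p, o):
        self.triples.append((s, p, o))
    def query(self, s=None, p=None, o=None):
        # None acts as a wildcard, like a variable in a SPARQL triple pattern.
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

store = InMemoryStore()
store.add("post:1", "dc:title", "Blog to triples")
```

Swapping the store for Dydra, 4store or whatever else then only means writing one new class.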
Or perhaps more generically, Status to Linked Data. But Twitter is the only thing I update with statuses, so I'll start with that.
I want to be able to:
Parse the whole backlog of my tweets and turn them into Linked Data along the same schema as my blog posts will be.
Tag all tweets with the same tagset that my blog posts use, automatically from hashtags or keywords, followed by a manual second pass. The mapping of hashtags and keywords to tags for the 'automatic' tagging will be done manually, I should think. This, then, needs a nice interface.
Shove all that into the same triplestore my blog uses.
Integrate status updates into my kanban-style website, alongside blogposts.
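The hashtag/keyword-to-tag step might look something like this sketch (the mapping table and tag names are hypothetical; in practice I'd curate the table by hand, as the manual second pass):

```python
import re

# Hypothetical hand-curated mapping from hashtags/keywords to my tag set.
TAG_MAP = {
    "semweb": "semantic-web",
    "linkeddata": "linked-data",
    "yoga": "yoga",
}

def auto_tags(tweet):
    # First automatic pass: map known hashtags and bare keywords to tags.
    # A manual second pass would then correct or extend the result.
    words = re.findall(r"#?(\w+)", tweet.lower())
    return sorted({TAG_MAP[w] for w in words if w in TAG_MAP})

tags = auto_tags("Fun with #LinkedData and #SemWeb at the summer school")
```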
Right-ho, it parses whatever arbitrary metadata I put in the comments of the markdown files providing they start with "label:" and end with a newline ("\n").
And lumps it into displayable html along with the post it belongs to, mostly for easier testing.
The next step is to engage some ontologies and turn the metadata and posts into a graph. It's ARC2 time!
Update: Actually the metadata it accepts is not all that arbitrary. Will remedy.
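Roughly, the idea is something like this sketch (not the actual Slog'd code; the exact comment format here is an assumption):

```python
import re

# Assumed convention: "label: value" lines inside comments in the
# markdown source, one pair per line.
META_RE = re.compile(r'^(\w+):\s*(.+)$', re.MULTILINE)

def parse_metadata(comment_block):
    """Pull every 'label: value' pair out of a comment block."""
    return dict(META_RE.findall(comment_block))
```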
I touched my toes in yoga today. It happened in the heat of power yoga, and I
didn't think I'd be able to do it cold. But I can! That's my goal for the
end of the year met, then.
(This may sound trivial, but I have short hamstrings, and anything that
involves bending in the middle and straightening my legs at the same time I
find extremely difficult. This is the main thing I'm aiming to overcome with
yoga).
Another first from today was binding without help in a spinal twist. I
managed to do this again at home an hour later, too.
And I've noticed that going into Chaturanga between sets of postures has
become a reprieve, a chance to catch my breath and rest for a second.
Chaturanga is essentially holding yourself in the middle of a press-up, and
when I started this class that was not something I could do, or ever thought
it might be a good idea to do; with the fast pace of the class, it was easy to
collapse down onto my chest and skip over it. But over the past few weeks, as
I've got used to how the class progresses, I've slipped into doing the
Chaturanga properly - or as properly as I can without having time to stop and
think about it - and not had trouble at all. I just tested that theory at
home, and held myself in it for a good ten seconds.
I haven't seen such fast progress in any of the other yoga I've done. This
class is exciting me, and filling me with hope.
It's taking a toll on my wrists though. By the end of the semester, they'll
be strong :)
Several weeks of debating and planning following Young Rewired
State finally came to fruition on the 16th of
October, with our first Prewired event.
Thirty-eight young people arrived between 9:30 and 10 on that
Wednesday morning (it was half term week in Scotland, so we weren't pulling
them out of school), grabbed some kindly donated Google swag, made name badges
with stickers and felt-tipped pens, and sat down for two and a half hours of
lightly guided learning.
They were between the ages of three and eighteen, although the three- to
six-year-olds were mostly there tagging along with older siblings or
University staff. It's obviously impossible to divide attendees up by age and
decide what to work with them on, as older definitely does not mean more
experienced. We had decided on no lower age limit, and no lower
bound for experience either, figuring that the only real requirement is
enthusiasm about programming. There was a huge mix of interests and abilities,
and we let them decide for themselves which topics would be worth listening
to.
We also had about fifteen students, University staff or industry professionals
along as mentors.
After a few minutes of welcomes, where most of the room were willing to
introduce themselves and tell us what they wanted to learn ("Python",
"Scratch" and "more about programming in general" were popular ones) we kicked
off with three five minute introductions: to HTML and CSS (beginner), to HTML5
Geolocation (intermediate) and to Python's Natural Language Toolkit
(advanced). They then had the chance to spend 40 minutes in a hands-on session
for whichever of these they chose. The groups were very evenly spread, and
despite a few hiccups with Python installations on Windows and Chrome not
playing nice with geolocation (worked through thanks largely to the mentors)
most people got some code up and running and appropriately hacked about with
by the end.
We took a break for juice, crisps, chocolate and fruit, plus a bit of hardware
tinkering. We'd borrowed a Nodecopter, but hadn't managed to get it charged in
time so it wasn't in the air, but there were still plenty of people interested
in looking at the code to control it. We also had a demo of a robot arm, which
could be controlled by an Android app connected to a Python server, which had
been written over the summer by one of our mentors.
Next up were three more lightning talks: introduction to Scratch (beginner),
doing cool things with Redstone Circuits in Minecraft (intermediate) and
introduction to PyGame (intermediate-advanced). The following hands-on session
for Scratch was under-attended, possibly ousted by the allure of Minecraft,
but the PyGame session had over a third of the group and made some great
progress, which was awesome.
We finished a little late, but still managed to have time for a quick demo of
a football-playing robot from the nearby robotics lab, and there were a few
attendees who took their time dragging themselves away from their screens.
I'm told that overall it was a success. I was concerned because I was
generally called upon when something was going wrong, so my perspective was
weighted towards the negative. But it wasn't too chaotic, none of the attendees played up, and as far as we could tell they were
doing something in some way productive at all times.
A lot of them had had little to no programming experience before that morning,
and I really hope they were able to take away something positive and, most
importantly, feel encouraged to try things out by themselves at home. Plenty,
too, had enough experience that they were calling out to correct the speakers,
and helping their peers to get things working. It's a huge challenge to find
enough activities to engage so many different levels of experience and
interest, and I don't think we did a bad job.
Our next Prewired event will be on the 30th of October, and we're running them
bi-weekly on Wednesday evenings from now on. They will be henceforth less
structured. Our primary aim is to help young people to realise that with
programming (and related areas) they can create anything, express
themselves, and change the world. We don't wish to enforce a curriculum, but
encourage them to explore areas they are interested in, learn how to teach
themselves and figure out how to make what they want, and most of all to
persuade them not to be afraid to experiment - to hack - and to just keep
trying if it doesn't work first time. To get them excited before they become
jaded and before this society's stereotypes have a chance to impact on them.
You can find out more about Prewired at prewired.org,
and join the mailing list there too.
Photos and feedback
Here are some of the photos from the day:
If you took some that you'd like us to add, then please send them to
hello@prewired.org!
Similarly, send any feedback you have about the event to us that way, as well.
Resources
I'll update this post (as well as the website) with resources from the
speakers and mentors as I get hold of them.
I magicked Autobiography to my Kindle, but I haven't started reading it yet because I'm worried I'll cry from beginning to end due to the sheer beauty of the prose.
I'm looking forward to a spectacularly written yarn, designed explicitly to incite scandal and speculation.
I'll read it as fiction and fantasy, inspired by, but not based upon, real events.
I might relate too much, or regress back to my teenage years. I think I'd enjoy/despise it best shut away from the world until it's over; always the case with the first listening of a new album. I don't know when I'll have time or understanding of peers for that. I'll try to schedule a few days. Maybe I'll run away to Kerrara for a weekend on my own.
Thinking about the options for hosting the Linked Data for Slog'd implementations, and also thinking about hosting personal and shared Linked Data in general, and Linked Data authored by an Agent who isn't the subject, because this is likely to be appropriate in a lot of cases.
For Slog'd it's a bit easier, because people setting up a Slog'd implementation would have a bit of knowledge of / interest in the Semantic Web and data ownership, in theory. Such people will probably already have a unique URI (like a WebID) or the ability to set one up. The data is also most likely to be authored by the subject.
I'm thinking about it in broader terms because (PhD-related) I want content creators (who don't know or care about the Semantic Web) to publish Linked Data about their involvement in the creation of different digital media and online creative works.
Where does this data live, to give them absolute and definitive control over it? If they can author data about anyone and anything (which of course, they can) how do we verify what they say when there are no conflicts, or deal with conflicts that arise? Is content attribution data stored in a giant graph that anyone can update?
How do we handle IDs? Is there a central service for generating URIs for people who can't use their own domains (like the MyProfile demo)? How do we link - or not - multiple identities of the same person, according to what they want other people to know?
MyProfile: unified user profile service which you sign into with WebID and stores data about you with FOAF. There's a hosted demo, but it's intended for people to host their own instance I think. Code on Github
RWW.IO: a personal Linked Data store; looks like RemoteStorage but for LD. Code on Github
Didn't have much success talking to the Dydra SPARQL endpoint yesterday. I was briefly worried as there are no docs describing how to write back to the SPARQL endpoint, so I thought that was a write-off at once, but then I found a blog post from 2011 about how that has been introduced. Just not documented yet, apparently.
But to start with, I imported some test triples using the Web interface, into dydra.com/rhiaro/about-me and tried to read them back.
But all I got back was an empty array. I tried with the DBPedia endpoint, which fell over a couple of times, but I got results... except... they were different from the results I got when I queried the endpoint directly through their interface. They seemed sort of metadata-y, rather than actual triples from the store. But it's hard to tell.
So I had a go with Python's RDFLib to try to figure out who had the problem.
import rdflib
rdflib.plugin.register('sparql', rdflib.query.Processor, 'rdfextras.sparql.processor', 'Processor')
rdflib.plugin.register('sparql', rdflib.query.Result, 'rdfextras.sparql.query', 'SPARQLQueryResult')
g = rdflib.Graph()
query = """
SELECT *
FROM <http://dydra.com/rhiaro/about-me/sparql>
WHERE {
    ?s ?p ?o .
} LIMIT 10
"""
for row in g.query(query):
    print row
And with that I got some triples... but not from the triplestore. It parsed, I presume, whatever semantic markup it could find in the page itself, the page you see when you visit dydra.com/rhiaro/about-me/sparql. Eg.
Do I have to send an accept header? Surely RDFLib is supposed to take care of that for me... Whatever.
If that's how you're going to play it, I'll just make the request directly myself. (I used Python's Requests because the Web says it's nicer than urllib2):
import requests
import rdflib
q = "select * where {?s ?p ?o}"
url = "http://dydra.com/rhiaro/about-me/sparql"
p = {'query': q}
h = {'Accept': 'application/json'}
r = requests.get(url, params=p, headers=h)
print r.text
Boom! Triples! Better yet... the ones in the triplestore! By default (with no Accept header set) they come through as RDF/XML, and it won't give me Turtle, so JSON seems to be the nicest looking option. That doesn't really matter though, as nobody really needs to look at it.
I guess I'll try CURL with PHP for Slog'd, and just parse it with ARC2. It seems a shame that ARC2's remote endpoint querying didn't Just Work with Dydra, but I don't have the time or energy to try to figure out why right now.
Then I need to figure out if I can write to it or not. If I can't... In the name of progressing, I'll have to ditch it and use ARC2's built in MySQL-based triplestore.
Update: Parsing the results with RDFLib
Because I want to understand exactly what Dydra is giving back to me, I wanted to quickly parse the results and use them like I should be able to use a graph.
The XML that Dydra is returning is not straightforward RDF/XML that RDFLib can just understand. It's a 'SPARQL Result'. It looks like this:
So later I either have to work out how to make RDFLib understand this, or make RDFLib understand the JSON alternative. I really don't want to have to write a custom parser to deal with it.
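For reference, that JSON alternative is the standard SPARQL Query Results JSON format, and it needs nothing more than stdlib json to unpack. A sketch, with a made-up sample document shaped like what a SELECT query returns:

```python
import json

def parse_sparql_json(text):
    """Unpack the SPARQL Query Results JSON format into a list of dicts
    mapping variable names to plain values."""
    data = json.loads(text)
    variables = data['head']['vars']
    rows = []
    for binding in data['results']['bindings']:
        rows.append({v: binding[v]['value'] for v in variables if v in binding})
    return rows

# Example results document (my own made-up triples):
sample = """{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/me"},
     "p": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/name"},
     "o": {"type": "literal", "value": "Amy"}}
  ]}
}"""
```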
Update: Solved
Turns out it's as simple as using CONSTRUCT instead of SELECT in the query. Rookie mistake? I don't know. I feel like RDFLib ought to be able to handle the SPARQL results format somehow though.
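For anyone else hitting the same wall, the difference boils down to the response format each query form produces (the queries here are generic sketches):

```sparql
# SELECT returns a result set - the SPARQL Results XML (or JSON) format -
# which a plain RDF parser doesn't understand:
SELECT * WHERE { ?s ?p ?o } LIMIT 10

# CONSTRUCT returns an actual graph, serialised as ordinary RDF/XML,
# which RDFLib's parser handles directly:
CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 10
```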
I think I'll use Chris Gutteridge's Graphite layer on top of ARC2. I've been putting off thinking about yet more libraries, but I think I'll end up implementing bits of Graphite in an effort to make ARC2 more friendly anyway, and I'm sure he's done a better job than I ever could.
£50 isn't a large amount of money to a massive organisation like Sky. But for
me, it's two to three weeks food, or a train ticket to visit my mother, or
about half of my Christmas shopping (yeah alright, I'm not very extravagant
with that sort of thing).
Following is an account of how a relatively small mistake on their part, which
I thought was resolved well and quickly, then led to an agonising four month
back-and-forth of:
a) Lies
b) Broken promises
c) Incompetencies
d) All of the above.
May: Attempted cancellation
I messaged through Help & Support to cancel my broadband and phone line,
because I was going away a lot over the summer and it wasn't worth keeping it
on. I was told it was sorted. I went to Australia for a month.
June: Apparently failed cancellation
I got back to Edinburgh and discovered I'd been billed. I turned on my router
to see if I'd managed to mess up the cancellation myself, and saw no sign of
broadband. So the services were cancelled, just not the billing.
18 July: Good customer service
I finally got round to contacting Sky about my bill; by this point I'd been
billed for July as well.
I spoke to a couple of people on the phone who made me feel like it was my
fault, then finally to the lovely Rachael who dug deeper, discovered a human
error had occurred as a result of (we think) the two ways of writing the first
line of my address: 2F2 35 or 35/5. As a result, I had two account numbers... one
had the services I guess, and one had the bills. It was very confusing, but
once we found the two numbers Rachael properly cancelled my mysterious second
account, and very kindly applied £50 credit to make up for the overbilling. I
was told to contact again in a few weeks to initiate the refund process.
16-26 Sept: Being blown off
I tried several times to contact customer services via help tickets to ask for
the refund but got told (by a human, not an automated error message) there was
a 'system error' with my bill, or something totally unhelpful (telling me
something I didn't ask about) in response.
29 Sept: Actually, no
I'm told the £50 credit can only be spent on Sky services, not refunded
directly to a bank account.
29 Sept: I damn well disagree!
I protested the unfairness of this, since I have moved somewhere with an
existing broadband provider and actually probably wouldn't be going back to
Sky after all this even if I had a choice...
I was told that actually they can refund it after all, and I should expect it
in 48 hours. I pointed out that I'd heard this before. I was told not to
worry, it'll be fine! I relaxed.
48 hours later... no refund.
Social media
Throughout September and October I started tweeting about the problems. I
liked to do things like compare Sky's customer service to that of Virgin
Mobile (who have done wonderful things for me from time to time). I hoped
'public shaming' might speed up a resolution. I got encouraging responses from
the social media team at Sky, who actually seemed to care (which is their job,
I suppose) but really I only ended up opening more tickets and talking to
other advisors in the end.
21 Oct: A change of tune
Following another help ticket chase up, I'm told credit was incorrectly
applied in the first place, and has been removed. I'm not exaggerating (just
paraphrasing) when I say this was done "for reasons".
21 Oct: You what?!
Oh, hey guys, how about... no? I didn't just suffer through months of torment
for you to try to tell me the only competent member of staff I have spoken to
was actually incompetent after all!
So I had a very long live chat to William (I think; I've named him, because
he didn't leave a follow-up note like he was supposed to) who "carefully
reviewed my account" and agreed with the verdict that the credit was
incorrectly applied. I protested. He "carefully reviewed my account" some
more, and then said he could see that an explicit reason was left for the
application of the credit to my account in the first place... so it should
have been there after all! Yay.
He said it would be refunded to my bank account in 78 hours. I told him I'd
heard this before. He said not to worry, it'll definitely be fine this time.
Guess what.
31 October: It wasn't.
So far, no refund, no re-application of credit to my bill, and not even an
update to my help ticket about the conversation. It was like it never
happened. I stupidly didn't think to copy the chat as evidence, although I
assume they have a transcript of it somewhere.
1 November: Hope
I DMed the social media team a bit, and scheduled a time to live chat with
one of them, directly. While I waited, I typed out a timeline of
everything that happened so far so I could paste it straight to them.
Three quarters of an hour later, I have hope once more following a chat with
the most human member of Sky customer services I have spoken to so far.
He found actual reasons for things that had been done for "reasons". For
example, my initial refund was never successfully issued because goodwill
credit must be issued as a cheque, not a bank account transfer, so Finance
just rejected it. A silly rule, but that's the way of it. (In my case the
response was just to not issue it though, so Finance can't even follow their
own silly rules).
I'm still not entirely sure why the credit was totally removed though, or why
on my bill its removal shows up as a charge for Sky TV.
He treated me like a person by not making vague promises, or holding back
particularities of how the organisation works. He told me he believed I
should be receiving a refund based on what he knew so far, but he'd have to
talk to his manager. He talked to his manager, who agreed. But he didn't then
just tell me everything would be fine. He told me it still might get rejected
by Finance (for "reasons", I presume).
What he is doing is speaking directly to someone in Finance to get a cheque
sent out. He's manually changing my address to make sure the cheque goes to
the right place, and he's going to get in touch with me again on Wednesday
night to let me know what the progress has been.
I asked him what the next step is if Finance reject the refund request, and he
implied threats of violence. (NOTE to Sky managers who might read this: I
don't believe he meant he would really commit violence on the Finance team. He
was doing his job well and using humour to relieve me whilst promising he
would make an effort to follow up).
He also gave me permission to verbally abuse him if he doesn't get in touch on
Wednesday night. I appreciate this sentiment, though it's unlikely I'll get
into capslock territory with this guy any time soon. I would tweet a gentle
reminder, of course.
6 November: The Wednesday follow-up
I got a Twitter DM with a link to a live chat... After just under an hour of
waiting in a 'queue' for the live chat I had to leave, and DMed @SkyHelpTeam
back to ask if they'd let me know when there was someone there, so I wasn't
wasting my time refreshing a page. They responded and sent the details to my
MySky help tickets instead. And the result?
A cheque is in the post!
Please allow 28 days for it to arrive.
I sure hope that's true. And that it's coming to the right address. I might
send a letter to my old flat, just in case. So I guess I'll update in 28 days
whether or not it actually arrived. I'm hopeful, but they've promised me that
money is on its way in x amount of time before...
Conclusions
If your problems aren't fixed immediately, pester the social media team
(@SkyHelpTeam). Get everything in writing,
record every conversation, keep track of dates, names of customer service
people, and what you were promised. Don't give up. For every semi-competent
and sympathetic customer service person, there are four or five
lazy/useless/uncaring or possibly even malicious ones. Just keep trying, and
you'll get through eventually...
Recommendations for Sky
Following my unfortunately extensive experience with Sky help ticketing, I'd
like to make a few suggestions for its improvement.
Tickets should be marked as resolved by the customer. I have so many tickets that I don't consider to be resolved, sitting in my 'resolved' tickets column.
I should be able to reply to tickets. I post a request, I get a response that is marked as resolved that I don't agree with. I then have to follow up by opening a new ticket, which inevitably goes to a different person, and I end up going around in circles.
If I'm taking the time to type out messages to you, it probably means I don't want to talk to you on the phone. It doesn't matter why. Take the time to write messages back. (Related: telling me it's free to call customer service on a Sky line is really unhelpful when I'm trying to contact you about a recently cancelled Sky line. The fact that I never physically had a landline phone, line or no, is irrelevant here).
I've committed to Nanowrimo for the seventh year. I almost didn't. It's
distressing and frustrating and sucks at my self-confidence like nothing else.
It makes me feel like a failure in a way that nothing else can.
But it makes me write.
If I don't commit to it, I don't write a lot of fiction. Maybe a burst every
six months.
But every November for the last few years, I've written literally thousands of
words. I've brought vague, lingering ideas to life; I've fleshed out
characters; I've explored worlds.
Every November for the last few years, I've bashed out incoherent paragraphs
figuring I'll fit them in properly later. I've exhausted ideas that I now
never want to hear of again. I've killed my love for characters, and tired of
worlds.
I've doubted my writing abilities, my imagination, my creativity, my
storytelling. I've convinced myself that I'm incapable of finishing anything.
The one year I hit 50,000 words? (50,299 to be precise). I was maybe a
chapter away from finishing the actual story. The third quarter needed
totally replacing and didn't really fit with the main story. Four or five
years later, I still haven't written that final chapter, even though I know
what the outcomes are to be. I haven't even typed it all up, let alone re-
written part three. I didn't fall out of love with the characters or the
world, and I think about it a lot, and it breaks my heart.
Every November I've made a few new friends, and reconnected with old ones.
Bonding with someone over Nanowrimo is an experience that stands alone. I've
had one more conversation-starter than usual. I've discovered some new cafes
and new writing software. My productivity has increased as a result of using
The Work I'm Supposed To Be Doing as a distraction from writing.
Every year I tell myself I'm doing it to make myself write. The 50,000 is
irrelevant. I just need to write some words. More than none. Then I'm a
winner. But not hitting 1,667 per day still feels like a gut-wrenching
failure. Finding out someone else is further on than me brings me down a
notch. Even with my inner-editor firmly silenced (she crawls into the
cupboard of her own accord on the 1st of November these days) the inability to
just sit down and churn out words right off the bat is crushing.
But it does make me write.
Writing fiction is my first love. What I wanted to be when I grew up was
"author". It was a complicated word I knew when I was quite little. Along
with "aspidistra", but that's another story.
Imagine if I'd gone on to study it? If I was writing because someone told me
to write? If I had to write to move forward in life? I'd probably have burnt
out well before now.
I guess it hurts so much because it means so much.
And that's why I have to get over myself and just get on with it. If...
when?... if I succeed, where success is writing a story I'm happy with,
regardless of length, the boost will be indescribable. I'll get a new lease
on life. I'll be sure I can do anything.
I'm going to the Edinburgh NanoBeans launch party tomorrow. It's at 2pm in
Forest Cafe. I'm going to add loads of new people (People Who Understand) on
various social networks to increase the chances of being asked how it's going.
Mostly I'll just have to shrug and say slowly, and feel guilty about that
movie I watched or that extra batch of brownies I baked. But maybe... just
maybe there will be a time this year when I can say "it's going great! I'm
ahead of target."
And just for some encouragement, here are some pictures from 2008:
I'm trying to automatically find connections between accounts on different networks - social networks, content hosting sites, other? - that are held by the same Agent. I'm starting with YouTube, because that's a good source of content creators.
Who?
I haven't figured out a way to reliably pick channels at random (and have since decided that wouldn't be a good way of doing it anyway, due to the long tail of people who don't upload anything at all, let alone count as 'active' content creators), so I'm starting with the 'standard feeds'. These used to, more sensibly, be called Charts. They're RSS (or Atom or JSON) feeds of statistics about content or channels. They no longer appear on the frontend of the site, but are available if you know where to look. They are mentioned in some of the API documentation, which is referred to in the YouTube Help about generating your own RSS feeds from YouTube content. The standard feeds are:
Most of these are useful in finding popular videos, which means there's a good chance the uploader has a wide network of connections within YouTube (which I can follow to get more information). Many, though, will be one-hit wonders. I've picked Top favorites as a list that intuition says will be more likely populated by videos from channels to which viewers have some kind of loyalty. These days everything you do on YouTube shows up in your friends' feeds, so people may favourite a video as part of building their own identity on the service, as well as to support the content creators they love. It demonstrates an active, positive reaction to the video. It's the content creators who produce content that is received in this way that I'm ultimately aiming to support. Most viewed, discussed, linked and responded could simply be controversial. Recently featured is some YouTube-inner-circle conspiracy, no doubt. This is all my opinion; if anyone has any better insights on these charts, please do let me know.
I'm also limiting the charts to 'this week', to get a fairly - but not too - rapid turnover of data. 'Today' might give me too many less-established one-hit wonder, viral of the moment types; longer term establishes some sort of consistent enjoyment of the video by the masses. 'All time' is a fairly unchanging list, and would mean all my research is based around Charlie Bit Me. (Although this in itself might be an interesting study of content creator evolution; the original video was aimed at close family and friends, went viral by chance, and since the parents and children involved have built a many-$ content creation empire, with sequels and merchandise and all sorts. They've easily made enough from ad revenue to put both kids through college. But that's another discussion).
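Putting the chart and time window together, the feed URLs can be built up like this (the URL pattern and parameter names are as I remember them from the gdata docs, so treat the exact path and parameters as assumptions):

```python
# Base of the v2 gdata standard feeds (assumed URL structure).
BASE = "http://gdata.youtube.com/feeds/api/standardfeeds"

def standard_feed_url(chart, time="this_week", alt="json"):
    """Build a standard feed URL for a given chart and time window."""
    return "%s/%s?time=%s&alt=%s" % (BASE, chart, time, alt)
```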
Why though?
I'd like to know which other networks are most commonly linked to by active content creators. This might indicate what kinds of interactions are meaningful to them. Social networks for interacting with fans? Other content host sites for different versions of their content, or different media types? Independently run websites and portfolios? Online merchandise stores? Other peoples' content they want to share with their viewers (friends and collaborators)?
It might also be interesting to try to find out how often people reuse the same username across sites. And do people link to profiles on other sites that aren't their own? Either profiles they share with collaborators or friends, or just other peoples' profiles entirely? How can I reliably differentiate?
YouTube's provisions for external account linking
YouTube allows people to put links on their channel. They can choose up to four 'social' links to display icons for over their channel banner, plus one 'custom' link. They can also input as many custom links as they like which show up in a list in the About section of their channel.
The predefined list of 'social' links from YouTube is:
Google+
Facebook
Twitter
Myspace
Tumblr
Blogger
deviantArt
WordPress
SoundCloud
Orkut
Flickr
Google Play
iTunes
Pinterest
Instagram
Zazzle
CafePress
Spreadshirt
LinkedIn
There are crucial things missing from this list, I'm sure - Bandcamp, Newgrounds, off the top of my head - but if this is what YouTube thinks its users want to connect to, then it seems like as good a place to start as any. And of course, if a chosen profile doesn't appear on this list, they can add it (labelled however they want) in the custom links section. The custom links section is also often used for listing secondary (or tertiary or group) YouTube channels, which are fairly commonly found amongst active YouTubers.
Getting these links programmatically
The YouTube API (v3) is balls when it comes to giving me information that is useful in this regard.
Currently all of these links, regardless of banner, social, or custom, conveniently reside in <li>s with a class of custom-links-item. I BeautifulSouped them out. (Why I can't get this information through the API, I don't know).
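For illustration, here's a stdlib-only stand-in for that scraping step (Python 3; the real thing used BeautifulSoup, and the markup in the test is a simplified guess at YouTube's, not a real page):

```python
from html.parser import HTMLParser  # stdlib stand-in for BeautifulSoup

class ChannelLinkParser(HTMLParser):
    """Collect hrefs from <a> tags inside <li class="custom-links-item">."""
    def __init__(self):
        super().__init__()
        self.in_item = False
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'li' and 'custom-links-item' in (attrs.get('class') or ''):
            self.in_item = True
        elif tag == 'a' and self.in_item and attrs.get('href'):
            self.links.append(attrs['href'])

    def handle_endtag(self, tag):
        if tag == 'li':
            self.in_item = False

def channel_links(page):
    """Return all custom-links-item hrefs found in a channel page."""
    parser = ChannelLinkParser()
    parser.feed(page)
    return parser.links
```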
Linked Data-ing things
So I'll use FOAF's OnlineAccount to hook all the accounts together as Linked Data, which in theory is a perfect fit. SIOC's UserAccount is also an option, but I'll keep it simple for now.
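As a sketch of how that hooks together (the subject URI and account names here are my own placeholder assumptions, not settled decisions), the Turtle might look like:

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex: <http://rhiaro.co.uk/cc/onlinepersona/> .

# Hypothetical example: one agent holding two accounts.
ex:someone a foaf:Agent ;
    foaf:account [
        a foaf:OnlineAccount ;
        foaf:accountServiceHomepage <http://www.youtube.com/> ;
        foaf:accountName "UCgibberishChannelId"
    ] , [
        a foaf:OnlineAccount ;
        foaf:accountServiceHomepage <http://twitter.com/> ;
        foaf:accountName "rhiaro"
    ] .
```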
In related news, YouTube is phasing out usernames. New YouTube channels are now created directly through Google+, with a Google+ ID as the unique identifier. It's trying (to the outrage of YouTubers with any kind of branding or well-known identity) to encourage people to hook up their channels to their G+ profiles, and lose their old username. Once done, this cannot be undone. I'd still expect to be able to find out someone's username if they have one though, given the unique channel ID. The API doesn't return this. You get a channel 'title', which is just a display name. For some people (those with established branding) this will be their ye olde username, but for many - most, I suspect - it's their G+ (supposedly real) name.
It just means that for YouTube channels I have to use the gibberish long unique ID instead of a nice human readable username for the foaf:accountName. This goes against what I feel accountName means, but is compliant with the spec, so I guess I'll leave it there.
Everything else at that point is straightforward:
Once the links are got, broken down into their constituent parts with urlparse, I can use rdflib to turn them into, eg:
And store them somewhere ... to be continued.
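The urlparse step might look something like this sketch (Python 3 imports; the username-in-first-path-segment assumption won't hold for every service):

```python
from urllib.parse import urlparse  # the urlparse module in Python 2

def account_parts(link):
    """Break a profile link into the bits needed for a foaf:OnlineAccount:
    the service homepage and a guess at the account name, assuming the
    common username-in-first-path-segment pattern."""
    parts = urlparse(link)
    service = "%s://%s/" % (parts.scheme, parts.netloc)
    segments = [s for s in parts.path.split('/') if s]
    name = segments[0] if segments else None
    return service, name
```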
OnlinePersonae
I'll probably subclass Agent with OnlinePersona (inspired by K. Faith Lawrence's FanOnlinePersona) and have the accounts belonging to that. Eventually OnlinePersona will have more properties which it won't necessarily share with all Agents.
Note: SIOC doesn't have a notion of this type. SIOC has UserAccount which subclasses foaf:OnlineAccount, and thus defers back to a foaf:Agent as the account holder.
Sooo... what do I use as URIs for my OnlinePersonas?
This merits a tangent in the discussion, so I'll make another post about URI issues.
URI locations
Months ago (probably) I thought it would be a good idea to make a PURL for all of my content creation ontology related stuff. I couldn't find any existing sensibly named domains that are public at purl.org... things like '/ontology' are selfishly private. So I created '/content-creation' as a (public!) top-level domain. It's still 'pending approval'. Which means I can't do anything with it. Is purl.org even looked after any more? Grumble.
(Andrei Sambra suggested I use prefix.cc to give my ontology a pretty name. Which looked briefly promising, before I realised it doesn't redirect automatically to an ontology... it's good for humans searching for vocab prefixes, but not for machines by any stretch. Mo validated my feeling that ontology URIs ought to resolve to machine- and human-readable descriptions).
I had been going to use data.inf.ed.ac.uk as the base, but the server that pointed to melted down last month. I dunno when it'll be back. So I'll stick to something I, personally, control. At some point I might buy a more suitable domain specifically for it, but I should discuss the options with some people who know what they're doing before making a decision by myself. Available candidates right now though include: creativecontent.info, webcontentdb.com/info, internetcontentdb.com/info.
Oh, I just found out that purl.org isn't unfailingly reliable. In that case, forget it.
So for now I'll use:
rhiaro.co.uk/vocab/oocc# for the ontology spec for any terms of my own (when I write it)
rhiaro.co.uk/cc/onlinepersona/ for OnlinePersonas
rhiaro.co.uk/cc/content/ for content, when I get that far.
Next
Follow the links to find more connections and/or verify ones I've already found. For common social and content sites, I can manually scrape useful information or use their APIs. For independent websites or things I haven't come across before, I shall devise some means to not ignore them altogether...
Grab other stuff from the YouTube profile and handle it in the same way. Featured channels may link to other channels the content creator is involved with. Subscriptions and mutual friends may be a good place to go for building up the network.
Put more into the graph than just the FOAF OnlineAccounts. Start on content...
In a centralised system, I could generate my own unique IDs by whatever means, assign them, and be done with it.
I thought briefly about trying to generate human-readable unique IDs, but this article made me decide that that will all end in tears.
Maybe for now I assume that people won't need to remember their OnlinePersona URI... Dangerous? Maybe. Maybe not. Maybe it's more likely that someone will be searched for by all the properties of their OnlinePersona, but the OnlinePersona itself doesn't matter directly. We shall see.
So on that note, Python's UUID will do. They're long and horrible. But I'll get over it.
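Minting a persona URI this way is one line with the standard library's uuid module (the base URI follows the scheme above; the helper name is mine):

```python
import uuid

def mint_persona_uri(base="http://rhiaro.co.uk/cc/onlinepersona/"):
    """Mint a persona URI with a random (version 4) UUID as the opaque ID."""
    return base + str(uuid.uuid4())

uri = mint_persona_uri()
```

Long and horrible indeed: the UUID part is 36 characters of hex and hyphens, but collisions are vanishingly unlikely without any central coordination.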
Power to the people
How do I persist creator and content URIs in a non-centralised, user-owned network?
People would need the option to change their URI to whatever they felt represented themselves, like their personal 'about me' page. Trying to enforce content negotiation and a Document != Person mentality here might be difficult.
Ultimately it doesn't really matter what their URI is as long as it resolves and persists, right? And if it doesn't resolve, or even disappears entirely, it's kind of rubbish, but not Web-breaking. Kind of the reason the Web still holds up, and the reason the Semantic Web is an extension of that.
Assuming a distributed, Diaspora*-esque 'Pod' structure for this network, if a user moves to another Pod and as a result must change their URI, the protocols involved essentially need to require leaving a 'forwarding address' to their new URI. Maybe, in this scenario, URIs are handled differently altogether. Separately from the Pods. You can get a URI from the Pod you just joined, or you can use your own or generate one from a provider.
How do you authenticate changing of a URI? Someone could essentially steal someone else's identity by switching out their URI... so... that can't happen.
Maybe I'm thinking too much. I might need to talk to someone smarter about this.
Our third Prewired event went smoothly, with 20 young people (about 4 new) and
7 or so parents attending, plus 8 mentors. So lower signups than usual (a few
cancellations due to school commitments), but we decided not to do a big
publicity push and see how it ran with a smaller group. I didn't notice much
difference, since they organise themselves into smaller groups anyway to work
on different things. I think next time we'll try to reach our capacity of 40.
We had a big group working on a variety of Python projects (games, basics,
algorithms, I'm not sure what else..), a small group doing front-end web, and
quite a few doing amazing things with Scratch.
Every week I discover new things these super-talented young people are doing
with their time, and it won't be long before many of them are spending a lot
of time mentoring their peers as well as working on their own projects.
Nantas came by to talk about what he does with the
University's Robotics lab, including the challenges of making humanoid robots
play football, and the state of the art, two-million-pounds, full sized
humanoid robot that is moving to Edinburgh in the near future. Definitely
stuff to get young people excited about learning to code.
We've been trying to encourage them to code between Prewired sessions, too,
and about half of them said they had. I hope by the next time all of them
have, and I'm really excited to see what they're capable of making in a few
months' time!
But... some of the young people attending are disadvantaged by not being able to bring their own laptop, or by having only really old laptops which can't support modern browsers and therefore struggle even to execute the JavaScript they're writing (true story).
We'd love to be able to pay for a set of simple but up-to-date laptops that we
could lend to the attendees who don't have their own during sessions. This at
least will put them on a level playing field with the others during the
sessions, and I suspect that many of them have adequate desktop machines or
family laptops at home.
Prewired runs on a budget of volunteer blood, sweat and tears, and zero
pounds. We're lucky enough to be able to use space in the University
Informatics building for free, and there is no shortage of keen mentors and
helpers willing to chip in their time (and in some cases cash for snacks).
So if you work for a company who might be able to support the purchase of
resources for our young coders, or know someone who does, then please get in
touch!
I ventured to Glasgow for the second Open Knowledge Foundation meetup on
Monday 18th. It was well attended, and there were six short talks:
Lorna Campbell from Cetis talked about Open Scotland. I understood this to be
a collaboration between Cetis, the SQA, JISC and ALT Scotland, to do with
the opening up of education, and influencing policy and practice in this area.
Here's a blog.
Grianne Hamilton from JISC talked about Mozilla's Open Badges. You can use
them to reward learning, skills and achievements in all sorts of areas, and
any organisation can create and issue badge packs to people who have earned
them. Receivers can then show them off anywhere they can put HTML.
Graeme Arnott talked about a collaboration between Glasgow Women's Library and
Wikimedia, which resulted in the Scottish Women on Wikipedia event. This was a
group of Scottish women getting together to edit Wikipedia articles about
Scottish Women, and there was very positive feedback. They have more events
planned. Graeme also reminded us about Wikimania, which is taking place next
August in London.
Jennifer Jones told us about the Digital Common Wealth project. She pointed
out that with media-saturated global events like the Olympics, the official
story is already decided before the event even starts. An alternative to
relying on what is broadcast by the mainstream media is to turn the camera on
the crowd, and get the 'real' version of what is going on. The Digital Common
Wealth project will encourage citizen journalists to work together to craft
the story of the Glasgow Commonwealth Games from their perspective. Jennifer
also raised the point that although free tools like YouTube, AudioBoo and
Twitter are great for spreading stories, the data is still held by third
parties - what happens if they disappear? How should initiatives like this
safely archive their stories, and keep them in context?
Pippa Gardner talked about Glasgow's Future Cities project, which has £24
million in funding behind it. It's about "people and data", but she was here
to talk about data. There's the Data Innovation Engagement (which apparently
needs a better acronym) and Glasgow's data
portal which has already launched. Not all of
the data on there is 'properly' open, but it's more open than it was before.
There's a maps portal coming soon. Follow
@openglasgow to keep up to date. Someone
asked how they can avoid inadvertently widening the digital divide by making
all this data available - as it will only improve things for people who
already have understanding and access. Pippa said there's a dedicated group in
the Council working on widening digital participation, so they're involved.
Duncan Bain, an MPhil student at the University of Edinburgh, talked about
Open Architecture. He says it's hard to define 'knowledge' and 'data' in
architecture; architects create drawings/representations, not buildings. There
are efforts towards opening certain aspects of this, like wikihouse.cc and the
Open Architecture Network, but the culture of the architecture world, and
where the money is, seems to be preventing things from going in the same
direction as software development any time soon.