+ http://www.jwz.org/blog/2015/04/youtube-has-finally-destroyed-their-rss-feeds/#comment-161869
Amy added http://www.jwz.org/blog/2015/04/youtube-has-finally-destroyed-their-rss-feeds/#comment-161869 to https://rhiaro.co.uk/bookmarks/
I'm trying to automatically find connections between accounts on different networks - social networks, content hosting sites, other? - that are held by the same person (Agent). I'm starting with YouTube, because that's a good source of content creators.
I haven't figured out a way to reliably pick channels at random (and have since decided that wouldn't be a good way of doing it anyway due to the long tail of people who don't upload anything at all, let alone are 'active' content creators), so I'm starting with the 'standard feeds'. These used to, more sensibly, be called Charts. They're RSS (or Atom or JSON) feeds of statistics about content or channels. They no longer appear on the frontend of the site, but are available if you know where to look. They are mentioned in some of the API documentation, which is referred to in the YouTube Help about generating your own RSS feeds from YouTube content. The standard feeds are:
Most of these are useful in finding popular videos, which means there's a good chance the uploader has a wide network of connections within YouTube (which I can follow to get more information). Many, though, will be one-hit wonders. I've picked Top favorites as a list that intuition says will be more likely populated by videos from channels to which viewers have some kind of loyalty. These days everything you do on YouTube shows up in your friends' feeds, so people may favourite a video as part of building their own identity on the service, as well as to support the content creators they love. It demonstrates an active, positive reaction to the video. It's the content creators who produce content that is received in this way that I'm ultimately aiming to support. Most viewed, discussed, linked and responded could simply be controversial. Recently featured is some YouTube-inner-circle conspiracy, no doubt. This is all my opinion; if anyone has any better insights on these charts, please do let me know.
I'm also limiting the charts to 'this week', to get a fairly - but not too - rapid turnover of data. 'Today' might give me too many less-established, one-hit-wonder, viral-of-the-moment types; a longer term establishes some sort of consistent enjoyment of the video by the masses. 'All time' is a fairly unchanging list, and would mean all my research is based around Charlie Bit My Finger. (Although this in itself might be an interesting study of content creator evolution; the original video was aimed at close family and friends, went viral by chance, and since then the parents and children involved have built a many-$ content creation empire, with sequels and merchandise and all sorts. They've easily made enough from ad revenue to put both kids through college. But that's another discussion.)
I'd like to know which other networks are most commonly linked to by active content creators. This might indicate what kinds of interactions are meaningful to them. Social networks for interacting with fans? Other content host sites for different versions of their content, or different media types? Independently run websites and portfolios? Online merchandise stores? Other people's content they want to share with their viewers (friends and collaborators)?
It might also be interesting to try to find out how often people reuse the same username across sites. And do people link to profiles on other sites that aren't their own? Either profiles they share with collaborators or friends, or just other people's profiles entirely? How can I reliably differentiate?
YouTube allows people to put links on their channel. They can choose up to four 'social' links to display icons for over their channel banner, plus one 'custom' link. They can also input as many custom links as they like which show up in a list in the About section of their channel.
The predefined list of 'social' links from YouTube is:
There are crucial things missing from this list, I'm sure - Bandcamp, Newgrounds, off the top of my head - but if this is what YouTube thinks its users want to connect to, then it seems like as good a place to start as any. And of course, if a chosen profile doesn't appear on this list, they can add it (labelled however they want) in the custom links section. The custom links section is also often used for listing secondary (or tertiary or group) YouTube channels, which are fairly commonly found amongst active YouTubers.
The YouTube API (v3) is balls when it comes to giving me information that is useful in this regard.
Scraping time!
Currently all of these links, regardless of banner, social, or custom, conveniently reside in <li>s with a class of custom-links-item. I BeautifulSouped them out. (Why I can't get this information through the API, I don't know.)
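The scraping step can be sketched roughly as below. This is a minimal illustration, not the script I actually ran: it swaps BeautifulSoup for the standard library's html.parser so that it has no dependencies, and the names CustomLinkParser, extract_channel_links and the sample markup are all made up for the example. The only thing taken from the real page is the custom-links-item class.

```python
from html.parser import HTMLParser


class CustomLinkParser(HTMLParser):
    """Collect hrefs of <a> tags found inside <li class="custom-links-item">."""

    def __init__(self):
        super().__init__()
        self._depth = 0   # greater than 0 while inside a matching <li>
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            classes = (attrs.get("class") or "").split()
            # Count nested <li>s too, so the matching close tag is tracked.
            if self._depth or "custom-links-item" in classes:
                self._depth += 1
        elif tag == "a" and self._depth and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "li" and self._depth:
            self._depth -= 1


def extract_channel_links(html):
    """Return every link target listed in a channel's custom links."""
    parser = CustomLinkParser()
    parser.feed(html)
    return parser.links


# Hypothetical markup standing in for a real channel page.
sample = """
<ul>
  <li class="custom-links-item"><a href="https://twitter.com/example">Twitter</a></li>
  <li class="custom-links-item"><a href="http://example.com/">Site</a></li>
  <li class="other"><a href="/ignored">Nope</a></li>
</ul>
"""
print(extract_channel_links(sample))
```

With BeautifulSoup the same selection would be a one-liner over the matching list items; the parser class above just spells out what that selection does.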
So I'll use FOAF's OnlineAccount to hook all the accounts together as Linked Data, which in theory is a perfect fit. SIOC's UserAccount is also an option, but I'll keep it simple for now.
In related news, YouTube is phasing out usernames. New YouTube channels are now created directly through Google+, with a Google+ ID as the unique identifier. It's trying (to the outrage of YouTubers with any kind of branding or well-known identity) to encourage people to hook up their channels to their G+ profiles, and lose their old username. Once done, this cannot be undone. I'd still expect to be able to find out someone's username if they have one though, given the unique channel ID. The API doesn't return this. You get a channel 'title', which is just a display name. For some people (those with established branding) this will be their ye olde username, but for many - most, I suspect - it's their G+ (supposedly real) name.
It just means that for YouTube channels I have to use the gibberish long unique ID instead of a nice human readable username for the foaf:accountName. This goes against what I feel accountName means, but is compliant with the spec, so I guess I'll leave it there.
Everything else at that point is straightforward:
Once the links are got, broken down into their constituent parts with urlparse, I can use rdflib to turn them into, eg:
And store them somewhere ... to be continued.
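The breaking-down step might look something like this. It is a rough sketch under a few assumptions: it uses Python 3's urllib.parse (where urlparse now lives), it represents triples as plain tuples of strings rather than going through rdflib, and it guesses that the last path segment of a profile URL is the account name. The function name account_triples and the example.org persona URI are hypothetical. The FOAF terms (account, OnlineAccount, accountServiceHomepage, accountName) are the real ones from the spec.

```python
from urllib.parse import urlparse

FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def account_triples(persona_uri, account_url):
    """Describe one linked profile as a foaf:OnlineAccount held by persona_uri."""
    parts = urlparse(account_url)
    homepage = f"{parts.scheme}://{parts.netloc}/"
    # Assumption: the last non-empty path segment identifies the account.
    # For YouTube channels that is the long unique channel ID.
    segments = [s for s in parts.path.split("/") if s]
    name = segments[-1] if segments else parts.netloc
    account = account_url  # the profile URL doubles as the account's URI here
    return [
        (persona_uri, FOAF + "account", account),
        (account, RDF_TYPE, FOAF + "OnlineAccount"),
        (account, FOAF + "accountServiceHomepage", homepage),
        # In real RDF this object would be a literal, not a URI.
        (account, FOAF + "accountName", name),
    ]


for triple in account_triples(
    "http://example.org/persona/1",  # hypothetical persona URI
    "https://www.youtube.com/channel/UCabc123",  # made-up channel ID
):
    print(triple)
```

Feeding these tuples into an rdflib graph would then only be a matter of wrapping each string in the appropriate URI or literal type before adding it.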
I'll probably subclass Agent with OnlinePersona (inspired by K. Faith Lawrence's FanOnlinePersona) and have the accounts belonging to that. Eventually OnlinePersona will have more properties which it won't necessarily share with all Agents.
Note: SIOC doesn't have a notion of this type. SIOC has UserAccount which subclasses foaf:OnlineAccount, and thus defers back to a foaf:Agent as the account holder.
Sooo... what do I use as URIs for my OnlinePersonas?
This merits a tangent in the discussion, so I'll make another post about URI issues.
Months ago (probably) I thought it would be a good idea to make a PURL for all of my content creation ontology related stuff. I couldn't find any existing sensibly named domains that are public at purl.org... things like '/ontology' are selfishly private. So I created '/content-creation' as a (public!) top-level domain. It's still 'pending approval'. Which means I can't do anything with it. Is purl.org even looked after any more? Grumble.
(Andrei Sambra suggested I use prefix.cc to give my ontology a pretty name. Which looked briefly promising, before I realised it doesn't redirect automatically to an ontology... it's good for humans searching for vocab prefixes, but not for machines by any stretch. Mo validated my feeling that ontology URIs ought to resolve to machine- and human-readable descriptions.)
I had been going to use data.inf.ed.ac.uk as the base, but the server that pointed to melted down last month. I dunno when it'll be back. So I'll stick to something I, personally, control. At some point I might buy a more suitable domain specifically for it, but I should discuss the options with some people who know what I'm doing before making a decision by myself. Available candidates right now though include: creativecontent.info, webcontentdb.com/info, internetcontentdb.com/info.
Oh, I just found out that purl.org isn't unfailingly reliable. In that case, forget it.
So for now I'll use:
OnlinePersonas
Follow the links to find more connections and/or verify ones I've already found. For common social and content sites, I can manually scrape useful information or use their APIs. For independent websites or things I haven't come across before, I shall devise some means to not ignore them altogether...
Grab other stuff from the YouTube profile and handle it in the same way. Featured channels may link to other channels the content creator is involved with. Subscriptions and mutual friends may be a good place to go for building up the network.
Last modified:
8th - 14th July
Semantic Web Summer School, much heat, much fun, much learning... Here's an index of my posts.
15th - 21st July
Friends visited. Progress included writing notes to myself to figure out just what my PhD outcomes really are, and why. Came up with:
1. Recommending how to usefully describe diverse amateur creative digital content (ACDC) using an ontology.
a) What are the parts of ACDC that need to be represented? Identify and categorise properties. How do these differentiate it from other similar content?
b) What existing ontologies can be used to do this, and how do they need to be extended?
2. Building an initial set of linked data about ACDC, and providing means for its growth and use.
a) Manual annotation of ACDC, and refinement (to test ontology).
b) Tools for automatic annotation of the parts of ACDC that it is possible to automatically annotate.
c) Tools for manual annotation by the community of content creators and consumers for the parts of ACDC that cannot be automatically annotated.
d) Tools to expose the linked data for use by third-party applications.
3. Create and test an example service which uses the linked data to benefit content creators and/or consumers.
eg. Unobtrusive recommendations for collaborative partners (most likely); content recommendation; content consumption analysis (like tracking viral content); community building / knowledge sharing in this domain; ...
22nd - 28th July
Brainstormed with Ewan about stage 3 (above), and came up with the idea of an interface that allows content creators to allocate varying degrees of credit for roles played by different people when collaborating on a project. This would serve to gather collaborative bibliographic data, teach us how different segments of the community allocate credit, and provide a potentially useful tool for content creators. It also has future value: if we can learn enough to estimate role inputs from different people, it could be used for things like automatic revenue sharing.
Then spent the rest of the week in London, frolicking amongst the YouTubers (including attending a meeting at Google about secret YouTube-y stuff), and annotated some ACDC. Write-up coming soon.
GemuCon, a first-time gaming convention, wasn't normally the sort of event I'd go to. Especially not with the £35 ticket price tag.
But it was being organised by one of my friends from my undergraduate, so I agreed to do the website (violating my no-more-freelance-work policy), and having botched together a custom registration system (scope creep) I was drafted in as 'Registrations Officer' on the committee, too. Since I was in Nottingham on the 4th for the Lovelace Colloquium anyway, I had no excuse not to go.
It was a good job I did, as the checking-people-off-who-arrive system was web based, and the hotel wifi was not playing ball from the outset. We'd thought of that of course, and brought backup wifi dongles. Neither of which could get signal. So half an hour before registration opened I was writing a script to export the database into a nicely formatted spreadsheet (sounds simple; wasn't; ask if you're curious) so that we had more than one machine (I had the database locally on my laptop) we could register over 700 people with. Then it was literally non-stop.
The other reason I was there was to morally support my good friend TomSka, who was attending as a guest because he is Internet Famous.
So my time was split between hanging out in the Operations Room (mostly) to help confused con-goers with things like registration, lost property, picking up merchandise, finding the stairs, getting free cupcakes; making myself useful by running up and down ten flights of stairs on errands (until I discovered the service elevator; two 6-man lifts between 800 people hadn't been so accessible); and hanging out with Tom and Matt.
On Saturday I helped Tom on his merch stall (we sold everything but all of the wristbands and all of the keyrings and earrings).
During the quiet times when there were other big events on, and thus no customers, I had to make my own fun.
On Sunday I live-tweeted Tom and Matt's panel "How to YouTube".
[View the story "TomSka's YouTube panel at GemuCon" on Storify]
This generated a small amount of controversy, as people who have never had to live off advertising revenue often hate people who live off advertising revenue even if it means they have found a way to survive by doing something they love, and can provide what they create to the world for free.
Frankly I'm just excited that we do live in a world where young creatives can be their own boss, make a living from doing what they love, and where the only hoops they have to jump through to do so are getting better at their craft. Whilst the advertising-centric revenue model may be outdated and may be despised by a good number of people, it's working for YouTubers at the moment and I haven't seen a better alternative present itself. Not everyone, particularly consumers of amateur media, can afford to pay for content they consume; accessing content for free empowers consumers too because their entertainment choices are not controlled by the same person who controls their finances (and thus probably most other aspects of their lives). I'm also fairly convinced that if the advertising revenue model falls flat in the future, amateur content creators will be much faster to recover and adapt than traditional media industries would.
The other great thing about this business model - for YouTubers at least - is that many/most don't start out with financial motivations. After a while they realise their hobby is giving pleasure to increasing numbers of people, so they carry on, and suddenly a side-effect is that they're making money as well, at no cost to their audience. (This may change as YouTubing is acknowledged as a career choice.)
Maybe naive or overly idealistic, but I don't believe anyone should be stuck doing a job they hate. It's a very, very long term goal for society, but the ultimate utopia is a world in which everybody is motivated and empowered to develop skills they enjoy or knowledge they're passionate about, and to put their abilities to some use that can sustain an acceptable standard of living for themselves and their family. Technology plays a crucial role in this (for a start, all the jobs nobody wants to do will be automated).
Anyway, Tom and Matt's panel was full of sound advice for digital creatives just starting out, though the current landscape is a very different one from when _they_ began their YouTube journeys (for example: no YouTube).
Though I didn't experience much of it myself, GemuCon had all sorts going on. There were a few rooms packed full of video game consoles (for people to entertain themselves at leisure, as well as scheduled tournaments with cash prizes), a room of tabletop games, merch dealers and artists galore, various panels with the various guests, a talent show, a cosplay masquerade, and parties all night every night. Now, I don't like parties, but even I couldn't resist hanging around at a rave for a bit when the music was Pokemon theme remixes, or the Zelda soundtrack. Et cetera.
The most impressive thing about this wee convention (and 700-odd people is wee, compared with similar more established cons) is the air of friendliness and solidarity that seemed to be ever-present. Granted not everyone could have been happy at every moment, and there was definitely douchebaggery from time to time, but in general there was a unification of nerds; an unspoken understanding between the stereotypically socially awkward that allowed people to come out of their shells and enjoy themselves in a way that they might normally suffer abuse for, thanks to the common background provided by video game and Internet culture. This is somewhat tongue in cheek, but... hopefully you know what I mean.
I made a few new friends, too.
If you're thusly inclined, check out 'official' photos (here and here) and videos (here) from Team Neko.
And if you visit the new GemuCon holding page, don't forget to Konami Code.