<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Hackdiary &#187; metadata</title>
	<atom:link href="http://www.hackdiary.com/category/metadata/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.hackdiary.com</link>
	<description></description>
	<lastBuildDate>Wed, 10 Feb 2010 13:34:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>My first week with Thinglink</title>
		<link>http://www.hackdiary.com/2006/07/23/my-first-week-with-thinglink/</link>
		<comments>http://www.hackdiary.com/2006/07/23/my-first-week-with-thinglink/#comments</comments>
		<pubDate>Sun, 23 Jul 2006 23:32:13 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[metadata]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=92</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>The first <a href="http://www.thinglink.org/">Thinglink</a> technical workshop took place in Amsterdam this month. <a href="http://www.hobbyprincess.com/">Ulla-Maaria Mutanen</a> and I spent an incredibly productive week at the <a href="http://www.mediamatic.net/">Mediamatic</a> offices thrashing out ideas that will form a Thinglink technical whitepaper over the next few months.</p>
<p>I&#8217;d like to highlight two things in particular that came out of this activity.</p>
<p><span id="more-92"></span><br />
Firstly, I wrote up some ideas on the Thinglink blog entitled <a href="http://ullamaaria.typepad.com/thinglink/2006/07/design_patterns.html">Design patterns for building with web APIs</a>. This came out of a discussion on the different kinds of integration model that occur in a Web 2.0 world. Although the patterns are described using stories from a Thinglink perspective, I think they&#8217;re generally applicable. I&#8217;d be interested in hearing anyone&#8217;s thoughts on how they can be refined, and whether they seem to fit the kinds of systems you&#8217;re building.</p>
<p>Secondly, go take a look at the <a href="http://thinglink.hackdiary.com/thingtagging/">Thingtagging</a> site. It&#8217;s an aggregator for photos on flickr that have been <a href="http://ullamaaria.typepad.com/thinglink/2006/07/introducing_thi.html">thingtagged</a>. We think it&#8217;s a lot of fun. We also think that you should start thinglinking and thingtagging your unique and interesting objects right now.</p>
<p>I&#8217;d like to thank <a href="http://www.plasticbag.org/">Tom Coates</a> not only for the gorgeous design of the site, but also for pointing out that &#8216;mashup&#8217; is really quite a tired word these days. We think that an exquisite combination of two great services deserves better, and so we name this a Flickr/Thinglink <i>intimacy</i>.</p>
<p><img src="http://www.hackdiary.com/images/thingtagging_screenie.png" /></p>
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2006/07/23/my-first-week-with-thinglink/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Last.fm isn&#8217;t just for humans</title>
		<link>http://www.hackdiary.com/2006/04/26/lastfm-isnt-just-for-humans/</link>
		<comments>http://www.hackdiary.com/2006/04/26/lastfm-isnt-just-for-humans/#comments</comments>
		<pubDate>Wed, 26 Apr 2006 16:53:55 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[metadata]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=83</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>While I&#8217;m talking BBC, here&#8217;s a story from a little while ago. I&#8217;m a big fan of <a href="http://last.fm">last.fm</a>, and I&#8217;ve been using it for a few years. Because I used to play my MP3s on a headless linux box in my flat, I wrote a <a href="http://www.hackdiary.com/archives/000052.html">commandline python mp3 player that could ping last.fm</a>. My <a href="http://last.fm/user/biddulph">profile</a> is a pretty good picture of my listening habits.</p>
<p>At BBC Radio, the radio stations are moving steadily from traditional analogue studios to fully digital systems that play nearly all their music from hard disk. As a member of the Architecture Team there, I had access to experimental data feeds from these systems. One day at work I asked myself a question: what happens when you plug behavioural data generated by an automatic process into social software designed for humans?</p>
<p><span id="more-83"></span><br />
Half an hour later, I&#8217;d rigged my last.fm plugin into the feed system and switched it on. Over a year later, when I left the BBC, <a href="http://www.last.fm/user/sekrit">sekrit</a> had accumulated a record of more than 50,000 tracks played on <a href="http://www.bbc.co.uk/6music/">BBC 6Music</a>.</p>
<p>Bear in mind when looking at this data that only the most mainstream and automatable parts of this admirably diverse radio station are visible in the feed. Every dusty Ska 7&#8243; played by Phill Jupitus on the Breakfast Show is invisible here. Even with this proviso, I think the dataset is fascinating.</p>
<p>Ever wondered what a radio station&#8217;s best friends would look like? <a href="http://www.last.fm/user/sekrit/neighbours/">Here&#8217;s your answer</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2006/04/26/lastfm-isnt-just-for-humans/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using Wikipedia and the Yahoo API to give structure to flat lists</title>
		<link>http://www.hackdiary.com/2005/09/02/using-wikipedia-and-the-yahoo-api-to-give-structure-to-flat-lists/</link>
		<comments>http://www.hackdiary.com/2005/09/02/using-wikipedia-and-the-yahoo-api-to-give-structure-to-flat-lists/#comments</comments>
		<pubDate>Fri, 02 Sep 2005 22:49:10 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[metadata]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=71</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>Some of my recent (and <a href="http://www.hackdiary.com/archives/000068.html">final</a>) work at the BBC has involved breathing life into old rolodex-style flat databases of content. With my colleague <a href="http://www.plasticbag.org">Tom Coates</a>, I&#8217;ve been puzzling over how to take a list of text strings like this:</p>
<p><code>"AGNEW, Spiro", "ATTLEE, Clement", "BARBER, Anthony", "BEVAN, Aneurin", "BLAIR, Tony", "CALLAGHAN, James", "CHAMBERLAIN, Neville", "CHURCHILL, Winston", "COULTHARD, David", "DYALL, Valentine", "EDEN, Anthony", "FOOT, Michael", "GAITSKELL, Hugh", "HAGUE, William", "HEATH, Edward", "HESELTINE, Michael", "JENKINS, Roy", "KINNOCK, Neil", "MACLEOD, Iain", "MACMILLAN, Harold", "MARSHALL, David", "MILLIGAN, Spike", "NIXON, Richard", "REDWOOD, John", "THATCHER, Margaret", "WILSON, Harold"</code></p>
<p>and turn it into a network of directed links <a href="http://www.hackdiary.com/misc/people.png">like this</a>. Hopefully anyone who has a passing knowledge of the history of the British government will agree that it&#8217;s a convincing little map, easily usable as a basis for navigation around the concepts attached to the text strings.</p>
<p>We found a pretty neat automated solution, entirely based on public internet resources, that requires no input at our end apart from the text strings above.</p>
<p><span id="more-71"></span><br />
The first step is to turn our flat text strings into references to internet resources that we can use to gather judgements about them. For the politics domain (and for a vast variety of others), <a href="http://www.wikipedia.org">Wikipedia</a> is an obvious choice. How do we turn &#8220;BLAIR, Tony&#8221; into <a href="http://en.wikipedia.org/wiki/Tony_Blair">http://en.wikipedia.org/wiki/Tony_Blair</a>? The simplest approach, transforming the former string into the latter by regular expression, doesn&#8217;t work for all cases, even in this small data set. Note the URL for <a href="http://en.wikipedia.org/wiki/Lord_Barber">Anthony Barber</a>, for example. Thankfully, the solution is still simple: we use a search engine. Yahoo&#8217;s REST API is far preferable to Google&#8217;s SOAP, so we use it to formulate <a href="http://api.search.yahoo.com/WebSearchService/V1/webSearch?appid=YahooDemo&#038;query=%22margaret%20thatcher%22&#038;site=wikipedia.org">a search for &#8220;margaret thatcher&#8221; restricted to wikipedia.org</a> and take the top result.</p>
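<p>As a rough sketch of this step in Python: the helper names below (<code>display_name</code>, <code>naive_wikipedia_url</code>, <code>search_url</code>) are made up for illustration, and the endpoint and <code>YahooDemo</code> application id are simply copied from the example link above. The code only builds URLs and sends no request, since the search service may not be available to you.</p>

```python
from urllib.parse import urlencode

# Assumption: endpoint copied from the example link in the post.
SEARCH_ENDPOINT = "http://api.search.yahoo.com/WebSearchService/V1/webSearch"


def display_name(entry):
    """Turn a flat entry such as 'BLAIR, Tony' into 'Tony Blair'."""
    surname, forename = [part.strip() for part in entry.split(",", 1)]
    return forename + " " + surname.title()


def naive_wikipedia_url(entry):
    """Guess the article URL by string transformation alone.

    Right for Tony Blair, wrong for Anthony Barber (filed under Lord_Barber).
    """
    return "http://en.wikipedia.org/wiki/" + display_name(entry).replace(" ", "_")


def search_url(entry, appid="YahooDemo", site="wikipedia.org"):
    """Build a quoted phrase search restricted to a single site."""
    phrase = '"' + display_name(entry).lower() + '"'
    params = {"appid": appid, "query": phrase, "site": site}
    return SEARCH_ENDPOINT + "?" + urlencode(params)


print(naive_wikipedia_url("BLAIR, Tony"))
print(search_url("THATCHER, Margaret"))
```

<p>The top result of the second URL would then stand in as the identifier for that list entry. Note that <code>urlencode</code> writes the space in the phrase as <code>+</code> rather than <code>%20</code>, which is an equivalent encoding in a query string.</p>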
<p>Straight away we&#8217;ve added value to our database. With a plausibly-canonical URL for Margaret Thatcher (certainly an <a href="http://www.w3.org/TR/owl-ref/#InverseFunctionalProperty-def">inverse functional property</a> in ontology terms), we can use a service like Bloglines Citation Search to gather <a href="http://www.bloglines.com/citations?url=http://en.wikipedia.org/wiki/Margaret_Thatcher">web zeitgeist around that particular politician</a>.</p>
<p>Wikipedia gives us much more than just an identifier, however. There&#8217;s great human-readable information in that resource. Luckily, the Yahoo API has another service that can help machine-agents make sense of human prose. Its <a href="http://developer.yahoo.net/search/content/V1/termExtraction.html">Term Extraction service</a> will take a chunk of content and return a short list of &#8216;significant words or phrases&#8217; from it. Incidentally, you can play with some great visualisations based on this service at <a href="http://www.tagcloud.com/">tagcloud.com</a>.</p>
<p>If we run the HTML from Thatcher&#8217;s Wikipedia page through an HTML-to-text process (perhaps <code>lynx -dump</code>) and then hand the text to the Yahoo service, we get the following:</p>
<ul>
<li>margaret thatcher</li>
<li>baroness thatcher</li>
<li>woman</li>
<li>political philosophy</li>
<li>james callaghan</li>
<li>figurehead</li>
<li>government spending</li>
<li>margaret hilda thatcher</li>
<li>conservative party</li>
<li>tony blair</li>
<li>wikipedia</li>
<li>wikimedia</li>
<li>free encyclopedia</li>
<li>thatcherism</li>
<li>order of the garter</li>
</ul>
<p>What&#8217;s particularly interesting about this list? It contains some text strings we can relate directly back to members of our original list. So we judge that Margaret Thatcher links to James Callaghan and Tony Blair. Repeat this extraction and correlation process for each member of the list and you get <a href="http://www.hackdiary.com/misc/people.png">the map we are looking for</a>. While a political journalist might complain about its gaps, it&#8217;s an impressive result that comes at zero cost.</p>
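<p>The correlation step can be sketched in a few lines of Python. This is an illustration rather than the code we actually ran: <code>build_links</code> is a hypothetical name, and the terms are assumed to have been fetched already and stored in a dictionary keyed by the original list entry.</p>

```python
def build_links(people, terms_by_person):
    """Return directed (source, target) links between members of the list."""
    # Map a lowercase 'forename surname' key back to the original list entry.
    lookup = {}
    for entry in people:
        surname, forename = [part.strip() for part in entry.split(",", 1)]
        lookup[(forename + " " + surname).lower()] = entry

    links = []
    for source, terms in terms_by_person.items():
        for term in terms:
            target = lookup.get(term.lower())
            # Ignore terms that are not on the list, and self-references.
            if target is not None and target != source:
                links.append((source, target))
    return links


people = ["BLAIR, Tony", "CALLAGHAN, James", "THATCHER, Margaret"]
terms = {
    "THATCHER, Margaret": [
        "margaret thatcher", "baroness thatcher", "james callaghan",
        "conservative party", "tony blair", "thatcherism",
    ],
}
print(build_links(people, terms))
```

<p>For the sample data this yields two links, from Thatcher to Callaghan and from Thatcher to Blair. Exact string matching is the simplest possible choice here, so it will miss variants such as &#8220;baroness thatcher&#8221;.</p>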
<p>What are the implications of this method? Firstly, it shows yet again the value of using <a href="http://www.hackdiary.com/slides/xtech2005/">existing well-designed URLs</a> as globally-unique and resolvable identifiers for concepts. Simply using a consensus URL as a proxy for a concept increases the chances of correlating your information with that of others on the web. As the method is applicable to any text strings, it could be used with a list of tags attached to a URL or photo on <a href="http://www.flickr.com/">flickr</a> or <a href="http://del.icio.us/">del.icio.us</a>. While it certainly wouldn&#8217;t give results for every single tag, any result at all is better than none.</p>
<p>Secondly, it&#8217;s a great method for adding value to your own data by using external information. It fits well with other emerging thinking on ad-hoc inferencing such as Tom&#8217;s <a href="http://www.plasticbag.org/archives/2005/09/how_to_build_on_bubbleup_folksonomies.shtml">How To Build On Bubbleup Folksonomies</a>. Now that tagging and URL-linking are firmly established as viable and accessible tools for distributed correlation and consensus-forming, second-generation techniques such as these can start to take advantage of the resulting network effect. We can aggregate opinion and categorisation up the chain, and across the network, along any axis that makes sense for our problem domain. By using external resources, we can start somewhere in our data, hop up onto the web, take a few steps, and come back down in a different, yet relevant, part of our own database.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2005/09/02/using-wikipedia-and-the-yahoo-api-to-give-structure-to-flat-lists/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
