<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Hackdiary &#187; web</title>
	<atom:link href="http://www.hackdiary.com/category/web/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.hackdiary.com</link>
	<description></description>
	<lastBuildDate>Mon, 05 Dec 2011 17:15:30 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>Algorithmic recruitment with GitHub</title>
		<link>http://www.hackdiary.com/2010/02/10/algorithmic-recruitment-with-github/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=algorithmic-recruitment-with-github</link>
		<comments>http://www.hackdiary.com/2010/02/10/algorithmic-recruitment-with-github/#comments</comments>
		<pubDate>Wed, 10 Feb 2010 13:34:58 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=209</guid>
		<description><![CDATA[In my new job in Berlin I&#8217;ve been asked to hire some people to help prototype new, secret projects. Berlin has a superb tech scene but as I&#8217;m new in town it&#8217;s taking me a little time to get to know everyone. While that&#8217;s going on, I wrote some code to help me explore Berlin&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>In my <a href="http://www.nokia.com/press/press-releases/showpressrelease?newsid=1344044">new job in Berlin</a> I&#8217;ve been asked to hire some people to help prototype new, secret projects. Berlin has a superb tech scene but as I&#8217;m new in town it&#8217;s taking me a little time to get to know everyone. While that&#8217;s going on, I wrote some code to help me explore Berlin&#8217;s developer community.</p>
<p>When I&#8217;m hiring, one of the things I always want to see is evidence of personal projects. Over the last two years, <a href="http://github.com">GitHub</a> has become an amazing treasure trove of code, with the best social infrastructure I&#8217;ve ever seen on a developer site. GitHub profiles let users set their location, so I started with a few <a href="http://www.google.com/search?hl=en&#038;q=site:github.com+location+berlin+%22profile+-+github%22">web searches</a> for Berlin developers. This finds hundreds of interesting people, but how do I prioritise them?</p>
<p>Another thing that I look for when building a good team is someone&#8217;s personal network. I&#8217;ve always believed strongly in spending lots of time at conferences meeting passionate people who are smarter than me. A good developer can make themselves even more productive by knowing who to email, IM or DM to answer a question when they&#8217;re stuck.</p>
<p><a href="http://en.wikipedia.org/wiki/Centrality"><img style="float:right; margin-left: 20px;" src="http://upload.wikimedia.org/wikipedia/commons/thumb/6/60/Graph_betweenness.svg/200px-Graph_betweenness.svg.png" width="200" height="200" /></a> A recent <a href="http://www.stoweboyd.com/message/its-betweenness-that-matters-not-your-eigenvalue-the-dark-ma.html">article by Stowe Boyd on centrality and influence in social networks</a> reminded me of some of the network analysis we use behind the scenes calculating recommendations for the <a href="http://www.dopplr.com/socialatlas">Dopplr Social Atlas</a>. So I wrote some code to query the <a href="http://develop.github.com/">GitHub API</a> and analyse the social graph of the Berlin subset of their users.</p>
<p>The JRuby code uses Yahoo BOSS to do the web search. After querying the GitHub API for <a href="http://develop.github.com/p/users.html">each user&#8217;s followers</a> it builds an in-memory graph using the <a href="http://jung.sourceforge.net/">Java Universal Network/Graph Framework</a>. Then it ranks each user node in the graph using the <a href="http://jung.sourceforge.net/doc/api/edu/uci/ics/jung/algorithms/importance/BetweennessCentrality.html">Betweenness Centrality algorithm</a>, which scores a node by how many of the shortest paths between other nodes pass through it. You can see the simple <a href="http://github.com/mattb/flotsam/tree/master/github-recruitment/">source code on my GitHub</a>.</p>
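<p>For readers who want to see the ranking step in miniature, here is a rough Python sketch of betweenness centrality using Brandes&#8217; algorithm. It is not the JRuby/JUNG code linked above: the <code>betweenness</code> function, the tiny <code>followers</code> graph and the user names are all made up for illustration, and the graph is treated as unweighted and undirected, which is a simplification of real follower relationships.</p>

```python
from collections import deque


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph given as
    a dict mapping each node to the set of its neighbours."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        # Breadth-first search that counts shortest paths from source.
        order = []
        preds = {v: [] for v in graph}
        paths = dict.fromkeys(graph, 0)
        dist = dict.fromkeys(graph, -1)
        paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] == -1:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        dep = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += paths[v] / paths[w] * (1 + dep[w])
            if w != source:
                score[w] += dep[w]
    # Every undirected path was counted once from each end.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical four-person network in which bob connects everyone else.
followers = {
    "alice": {"bob"},
    "bob": {"alice", "carol", "dave"},
    "carol": {"bob"},
    "dave": {"bob"},
}
ranks = betweenness(followers)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, ranks[name])
```

<p>In this toy graph every shortest path between two of the other users runs through bob, so bob gets the only non-zero score.</p>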
<p>To sanity-check the results I ran it for a couple of cities I already know well: London and San Francisco. Here are the top 5 for each, which seem quite plausible to me:</p>
<h3><a href="http://github.com/mattb/flotsam/blob/master/github-recruitment/sf.txt">San Francisco</a></h3>
<ol>
<li><a href="http://github.com/defunkt">Chris Wanstrath, GitHub</a></li>
<li><a href="http://github.com/miyagawa">Tatsuhiko Miyagawa, Six Apart</a></li>
<li><a href="http://github.com/leah">Leah Culver, Six Apart</a></li>
<li><a href="http://github.com/square">Square Inc</a></li>
<li><a href="http://github.com/tmm1">Aman Gupta, ruby eventmachine maintainer</a></li>
</ol>
<h3><a href="http://github.com/mattb/flotsam/blob/master/github-recruitment/london.txt">London</a></h3>
<ol>
<li><a href="http://github.com/james">James Darling</a></li>
<li><a href="http://github.com/lrug">London Ruby User Group</a></li>
<li><a href="http://github.com/norm">Mark Norman Francis</a></li>
<li><a href="http://github.com/danwrong">Dan Webb (recently moved to Twitter in SF)</a></li>
<li><a href="http://github.com/cv">Carlos Villela, Thoughtworks</a></li>
</ol>
<p>My choice of metric biases these lists towards connectedness and influence &#8212; it can&#8217;t measure ability. It&#8217;s only measuring GitHub users, and they are biased towards <a href="http://github.com/languages">Ruby, Perl and Javascript</a>. But seeing names there that I trust gives me confidence that it&#8217;ll help me find interesting people in Berlin.</p>
<p>Hopefully some of those people are reading this blog post right now. Others outside Berlin might be interested to know that Nokia does a superb job of relocating people, with everything taken care of by shipping companies and local agents. If you love the web, Javascript, mobile, user experience, social networks, location, enormous datasets and currywurst, you should <a href="mailto:mb@hackdiary.com">get in touch</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2010/02/10/algorithmic-recruitment-with-github/feed/</wfw:commentRss>
		<slash:comments>20</slash:comments>
		</item>
		<item>
		<title>Alas, Second Life! Web 2.0 in a virtual world</title>
		<link>http://www.hackdiary.com/2006/05/29/alas-second-life-web-20-in-a-virtual-world/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=alas-second-life-web-20-in-a-virtual-world</link>
		<comments>http://www.hackdiary.com/2006/05/29/alas-second-life-web-20-in-a-virtual-world/#comments</comments>
		<pubDate>Mon, 29 May 2006 23:37:00 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=86</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.secondlife.com/">Second Life</a> has been my new hacking obsession ever since I bought a laptop fast enough to run it. I don&#8217;t spend a lot of time socialising in the gameworld, but I am fascinated by the possibilities for makers of new user interfaces, useful virtual objects and playful toys. With every object being scriptable, aware and active, it&#8217;s a proving ground for <a href="http://www.alistapart.com/articles/everyware">Everyware</a>.</p>
<p>Version 1.10 was released last week, and hidden among the exciting new visual modeling possibilities of shiny rendering and flexible objects was Second Life&#8217;s own XMLHTTPRequest: <a href="http://secondlife.com/badgeo/wakka.php?wakka=llHTTPRequest">llHTTPRequest</a>. Using asynchronous callbacks, it gives the platform an important new capability: communication with the web on demand. A lot of what we are learning about AJAX makes sense here, in this world of Asynchronous Lindenscriptinglanguage And Some-sort-of-data (ALAS!)</p>
<p>I&#8217;ve spent a few hours hacking on some toy objects with this new capability, starting with the mashup de rigueur: <a href="http://flickr.com/">Flickr</a> integration. My home in SL now sports a simple picture frame. Touch it and it looks up your avatar name to see what your favourite Flickr tag is, picks a random picture with that tag from Flickr and displays it on its surface. If it hasn&#8217;t met you before, it asks you to tell it what tag to use.</p>
<p>Because Second Life is so wonderfully visual, here&#8217;s a little demo movie that I recorded with <a href="http://plasticbag.org/">Tom Coates</a>:</p>
<p><object type="application/x-shockwave-flash" width="640" height="480" wmode="transparent" data="http://www.hackdiary.com/misc/sl_flickrscreen/flvplayer.swf?file=http://www.hackdiary.com/misc/sl_flickrscreen/movie.flv"><param name="movie" value="http://www.hackdiary.com/misc/sl_flickrscreen/flvplayer.swf?file=http://www.hackdiary.com/misc/sl_flickrscreen/movie.flv" /><param name="wmode" value="transparent" /></object></p>
<p><span id="more-86"></span></p>
<h2>how it works</h2>
<p><i>UPDATE: <a href="http://www.hackdiary.com/archives/000087.html">source code is available</a></i></p>
<p>LSL is a rather limited C-like scripting language with a couple of interesting features (like its native support for state machines to respond to events). As far as I&#8217;ve been able to work out, it lacks the ability to parse XML or JSON, and has no native associative array type. Objects can store some state, but it is a little fragile in the face of resets. I decided not to try accessing the Flickr API directly, or using local storage. Instead, I created a stateful web companion for my new object with a bit of server-side Rails.</p>
<p>Strings and lists are the most useful datatypes available, and there&#8217;s a function similar to Perl&#8217;s &#8216;split&#8217; for turning strings into lists. This makes pipe-delimited text a reasonable format for simple data. I created the following web resources:</p>
<p><code>/seen?user=SomeAvatar</code>: Records that the object has sensed the presence of SomeAvatar</p>
<p><code>/touched?user=SomeAvatar</code>: In response to a &#8216;touch&#8217; event, consults the database for the user&#8217;s tag and asks the Flickr API for a random photo with that tag. Returns a string such as <code>http://flickr.com/some/photo.jpg|SomeAvatar|</code>, or <code>UNKNOWN</code>.</p>
<p><code>/set_tag?user=SomeAvatar&#038;tag=sausages</code>: Records SomeAvatar&#8217;s favourite tag</p>
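<p>To make the pipe-delimited format concrete, here is a small Python sketch of both ends of the <code>/touched</code> exchange. The real server was Rails and the real client was LSL, so this is only an illustration: the function names <code>format_touched_response</code> and <code>parse_touched_response</code> are invented, and using <code>None</code> to mean &#8220;no tag on record&#8221; is an assumption.</p>

```python
def format_touched_response(photo_url, avatar):
    """Build the pipe-delimited body, or UNKNOWN when the avatar has
    no tag on record (signalled here by photo_url being None)."""
    if photo_url is None:
        return "UNKNOWN"
    return photo_url + "|" + avatar + "|"


def parse_touched_response(body):
    """Split the body the way a script with only strings and lists
    might, returning (photo_url, avatar) or None for UNKNOWN."""
    if body == "UNKNOWN":
        return None
    fields = body.split("|")
    return fields[0], fields[1]


body = format_touched_response("http://flickr.com/some/photo.jpg", "SomeAvatar")
print(body)
print(parse_touched_response(body))
print(parse_touched_response(format_touched_response(None, "SomeAvatar")))
```

<p>The trailing pipe produces an empty final field after splitting, which the parser simply ignores.</p>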
<p>The HTTP system is nicely responsive: using the web as my object&#8217;s outboard brain added only a tiny bit of latency to the mix. The asynchronous model allowed other processing to continue while waiting for a response. With these URLs ready to respond, I wired up the appropriate Second Life events using <a href="http://secondlife.com/badgeo/wakka.php?wakka=llSensorRepeat">llSensorRepeat</a> and <a href="http://secondlife.com/badgeo/wakka.php?wakka=sensor">sensor</a> for presence, <a href="http://secondlife.com/badgeo/wakka.php?wakka=llListen">llListen</a>  and <a href="http://secondlife.com/badgeo/wakka.php?wakka=listen">listen</a> to respond to spoken commands, and <a href="http://secondlife.com/badgeo/wakka.php?wakka=touch_start">touch_start</a> for the physical interface. The <a href="http://secondlife.com/badgeo/wakka.php?wakka=llParcelMediaCommandList">llParcelMediaCommandList</a> features are confusing (and only work on land you own, with movie streaming enabled in the client), but I found the <a href="http://www.simteach.com/wiki/index.php?title=SL_FreeView">source code for Freeview</a> to be a useful reference.</p>
<p>If you&#8217;re interested in this code, or would like to see a demo, you can find me from time to time in SL as Matt Basiat.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2006/05/29/alas-second-life-web-20-in-a-virtual-world/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Stemming tags, and one website to the tune of another</title>
		<link>http://www.hackdiary.com/2005/01/30/stemming-tags-and-one-website-to-the-tune-of-another/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=stemming-tags-and-one-website-to-the-tune-of-another</link>
		<comments>http://www.hackdiary.com/2005/01/30/stemming-tags-and-one-website-to-the-tune-of-another/#comments</comments>
		<pubDate>Sun, 30 Jan 2005 23:42:14 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=68</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>del.icio.us is still giving me food for thought. Here are two toys I&#8217;ve made recently: a <a href="http://www.hackdiary.com/stemtags/">tag stemming tool</a> that helps you tidy up your tagging using the Porter algorithm, and a (Flash) screen-recorded demo of <a href="http://www.hackdiary.com/misc/firefox-delicious-demo.html">del.icio.us seamlessly embedded in the BBC Radio 3 website</a>.</p>
<p>(Maximize your browser window! Apologies for the slow playback speed of the movie; although you&#8217;re welcome to <a href='http://www.hackdiary.com/src/pips.user.js'>browse the javascript</a>, it&#8217;s something of a pain to get it running on your own browser. I&#8217;m looking at how I can turn it into a reusable and configurable Firefox extension, but for now it&#8217;s just a demo built with <a href='http://greasemonkey.mozdev.org/'>Greasemonkey</a>.)</p>
<p><i>UPDATE: I had to demo this to a mixed audience at the BBC this afternoon, so I put together <a href="http://www.hackdiary.com/slides/pips-deli/">some quick slides</a> to help me explain the step-by-step process that goes on behind the scenes. Perhaps someone else will find them useful too.</i></p>
<p><span id="more-68"></span><br />
So, what are you seeing in this movie? It&#8217;s nothing more than a bit of DHTML trickery that imports a subset of del.icio.us functionality into an existing website. I chose BBC Radio 3 because it has a wealth of content with plenty of potential for horizontal navigation, and because it has a clearly-defined <a href='http://www.plasticbag.org/archives/2004/06/developing_a_url_structure_for_broadcast_radio_sites.shtml'>canonical URL per programme</a> and thereby gains the maximum benefit from being tagged. By creating a symbiotic relationship between the two sites in your browser, you gain an overlaid cross-site navigation that doesn&#8217;t exist in the site as it currently stands, and del.icio.us users see <a href='http://del.icio.us/sekrit/'>your tagging of Radio 3 pages</a> in the <a href='http://del.icio.us/tag/bach'>wider context</a>.</p>
<p>There are several things that I enjoy in this demo. In no particular order:</p>
<ul>
<li>I like the immediate feedback that you can get from adding a tag to a programme. Decide that &#8216;cello&#8217; is relevant, and within seconds you see a bunch of other cello programmes. It&#8217;s common for content management systems to demand &#8216;metadata&#8217; or &#8216;keywords&#8217; of you when you file content, but rare that there&#8217;s an easy way to get a feel for what value you&#8217;ve added by doing so.</li>
<li>This was my first real attempt to wrangle the XMLHTTPRequest system, and it was a satisfying one. I did learn one or two things, including some problems with <a href="http://jpspan.sourceforge.net/wiki/doku.php?id=tutorials:asynchronouscalls">asynchronous and synchronous modes of operation</a>.</li>
<li>Looking beyond the specific application (tagging) used here, notice the two-way benefit that came from the mashup of one site&#8217;s service with another&#8217;s content. I like the idea that domain-specific use on Radio 3 leads to general usefulness on del.icio.us.</li>
</ul>
<p>There are many more possibilities to explore. The demo uses a single user on del.icio.us for all tagging. Imagine instead being able to select between different tag sets to overlay &#8211; one to guide newcomers to classical music, another designed for experts and old hands, a third to explore the history of a particular instrument or musical movement.</p>
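<p>As a footnote on the tag stemming tool mentioned at the top of this post, the basic idea of grouping tags by stem can be sketched in a few lines of Python. Note that <code>crude_stem</code> below is not the Porter algorithm the tool uses: it is a deliberately naive suffix stripper, made up here only to show how stemming surfaces near-duplicate tags. The function names and sample tags are illustrative.</p>

```python
from collections import defaultdict


def crude_stem(tag):
    # NOT the Porter algorithm: this only strips a few common English
    # suffixes so that near-duplicate tags land in the same group.
    for suffix in ("ing", "es", "s"):
        if tag.endswith(suffix) and len(tag) - len(suffix) >= 3:
            return tag[:-len(suffix)]
    return tag


def group_by_stem(tags):
    groups = defaultdict(list)
    for tag in tags:
        groups[crude_stem(tag)].append(tag)
    # Only stems shared by two or more tags suggest a possible tidy-up.
    return {stem: found for stem, found in groups.items() if len(found) >= 2}


duplicates = group_by_stem(["hack", "hacks", "hacking", "tools", "python"])
print(duplicates)
```

<p>A real stemmer handles far more cases, but even this crude version flags hack, hacks and hacking as candidates for merging.</p>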
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2005/01/30/stemming-tags-and-one-website-to-the-tune-of-another/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>del.icio.us experiments redux</title>
		<link>http://www.hackdiary.com/2004/10/06/delicious-experiments-redux/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=delicious-experiments-redux</link>
		<comments>http://www.hackdiary.com/2004/10/06/delicious-experiments-redux/#comments</comments>
		<pubDate>Wed, 06 Oct 2004 22:26:08 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=63</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>About a month ago, I posted about some <a href="http://www.hackdiary.com/archives/000060.html">del.icio.us experiments</a> I&#8217;d been doing, and published the python wrapper I&#8217;d been using. Of course, the post itself was another del.icio.us experiment.</p>
<p><span id="more-63"></span><br />
Guessing that the post would be perfect linkfodder, I was curious to see how the del.icio.us community would take the bait. It went down pretty well. To date, it&#8217;s been posted by 153 people.</p>
<p>I knocked up a bit of code to list the tags that the item got, something like this:</p>
<pre class="codeblock">
# count how many times each tag was applied to the post
tags = {}
for post in d.Href("http://www.hackdiary.com/archives/000060.html"):
    for tag in post.tags:
        tags[tag.name] = tags.get(tag.name, 0) + 1
</pre>
<p>And what was the emergent categorisation of my piece? It was a pretty broad list, demonstrating several different styles of tagging strategy:</p>
<ul>
<li><i>97 times</i> python</li>
<li><i>64 times</i> del.icio.us</li>
<li><i>37 times</i> delicious</li>
<li><i>22 times</i> programming</li>
<li><i>10 times</i> hacks</li>
<li><i>7 times</i> tools</li>
<li><i>6 times</i> web</li>
<li><i>4 times</i> software</li>
<li><i>4 times</i> dev</li>
<li><i>3 times</i> rdf</li>
<li><i>3 times</i> bookmarks</li>
<li><i>2 times</i> unread</li>
<li><i>2 times</i> social</li>
<li><i>2 times</i> semweb</li>
<li><i>2 times</i> library</li>
<li><i>2 times</i> hacking</li>
<li><i>2 times</i> hack</li>
<li><i>2 times</i> development</li>
<li><i>2 times</i> del</li>
<li><i>2 times</i> api</li>
<li>***</li>
<li>*checkout</li>
<li>*touch</li>
<li>blog</li>
<li>blogentry</li>
<li>blognew</li>
<li>bookmark</li>
<li>booty</li>
<li>checkout</li>
<li>clipz</li>
<li>code</li>
<li>collaboration</li>
<li>collaborativefiltering</li>
<li>computer</li>
<li>cool</li>
<li>del.icio.us social</li>
<li>delic</li>
<li>delicios</li>
<li>devel</li>
<li>extensions</li>
<li>geek_stuff</li>
<li>generalreference</li>
<li>greatminds</li>
<li>icio.us</li>
<li>images</li>
<li>metabookmark</li>
<li>metadelicious</li>
<li>namespaces</li>
<li>newnet</li>
<li>news/articles</li>
<li>read</li>
<li>readlater</li>
<li>rss</li>
<li>ruby</li>
<li>semanticweb</li>
<li>socialsoftware</li>
<li>todo</li>
<li>viapopular</li>
<li>webdev</li>
<li>webservices/delicious</li>
<li>webtech</li>
<li>xml</li>
<li>yummy</li>
<li>z:active</li>
<li>z:del.icio.us_experiments</li>
</ul>
<p>I was also curious about what the pattern of linking would be over time. Mostly I expected the standard power-law curve, reflecting an initial flurry of attention followed by a swift drop-off. I put together a quick chart in OpenOffice:</p>
<p><center><img src="http://www.hackdiary.com/misc/delposts.jpg" /></center></p>
<p>My hypothesis appears to be largely borne out by the results, but there are some ripples in the tail of the curve where the post regains momentary popularity then tails off again. I&#8217;m guessing this happens when the link is discovered by a new del.icio.us sub-community (Python programmers, etc.) whose members read each other&#8217;s del.icio.us streams via RSS or inbox.</p>
<p>A few more numbers: of 153 posts, only 19 used a title other than the page title I used in the post. Only 44 posts gave an extended description. Most users (64) assigned just two tags, with the rest distributed like this:</p>
<p><center><img src="http://www.hackdiary.com/misc/deltags.jpg" /></center></p>
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2004/10/06/delicious-experiments-redux/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>del.icio.us experiments</title>
		<link>http://www.hackdiary.com/2004/09/09/delicious-experiments/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=delicious-experiments</link>
		<comments>http://www.hackdiary.com/2004/09/09/delicious-experiments/#comments</comments>
		<pubDate>Thu, 09 Sep 2004 22:45:22 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=62</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>Maybe you&#8217;re a python programmer. Maybe you think <a href="http://del.icio.us">del.icio.us</a> is kinda cool. Maybe you&#8217;d like to be able to do this:</p>
<pre class="codeblock">&gt;&gt;&gt; print len(delicious.Href("http://www.vim.org/").posts())
8</pre>
<p>or:</p>
<pre class="codeblock">&gt;&gt;&gt; for post in delicious.users.mattb(delicious.tags.hackney):
...   print post.description,post.href
&nbsp;
Phil Gyford selling his flat http://www.gyford.com/strandbuilding/
POGO http://www.pogocafe.co.uk/
hackneycentral http://www.hackneycentral.org/
Armadillo http://www.armadillorestaurant.co.uk/
</pre>
<p>or perhaps do something much, much cooler. <a href="http://www.hackdiary.com/src/delicious.py">delicious.py</a> is for you.</p>
<p><i>UPDATE:</i> Since I published this code, Joshua Schachter has made the <a href="http://lists.burri.to/pipermail/delicious-dev/2004-September/000058.html">rules around use of del.icio.us APIs</a> clearer. So that you can stay within these limits, you should be aware that no call to a method in delicious.py will cause more than one HTTP request to del.icio.us. This means that it&#8217;s left up to you to time your requests appropriately and politely, but at least you know that the code won&#8217;t spam del.icio.us of its own accord.</p>
<p><span id="more-62"></span><br />
As with <a href="http://www.hackdiary.com/archives/000052.html">audioscrobbler a few months back</a>, <a href="http://del.icio.us">del.icio.us</a> has been around for a while but last week really started to show up on <a href="http://bloglines.com/public/mbiddulph">the radar</a>. I&#8217;ve been poking it with bits of code in the hope of achieving enlightenment.</p>
<p>I started by posting a few things: doing some <a href="http://del.icio.us/mattb">linklogging</a>, scripting an extraction of my items from <a href="http://pants.heddley.com">The Daily Chump</a> into <a href="http://del.icio.us/mattb/dailychump">a tag of their own</a>, and extracting and posting pictures from <a href="http://www.picdiary.com">my photo site</a> under <a href="http://del.icio.us/picdiary">a username of their own</a> with simple tags derived from the foaf and wordnet RDF annotations in that site&#8217;s <a href="http://www.picdiary.com/rss/www2004.rss">RSS</a>. I <a href="http://rdfig.xmlhack.com/2004/09/08/2004-09-08.html#1094679301.837777">pointed</a> to the picdiary experiment on the Semantic Web Interest Group IRC channel, and got some <a href="http://www.ilrt.bris.ac.uk/discovery/chatlogs/rdfig/2004-09-08.html#T21-35-01">interesting discussion</a> out of it.</p>
<p>The python wrapper that I used for most of this is fairly simple in structure. It uses a combination of <a href="http://www.redland.opensource.ac.uk">Redland</a>&#8217;s RDF parsing (for the pleasantly fulsome <a href="http://del.icio.us/rss/mattb">RSS</a> bits) and <a href="http://www.xmlsoft.org">libxml2</a>&#8217;s HTML parser for the bits I needed to screenscrape. Href, User and Tag objects are used throughout, and can be used in combination. You can ask a User for all the posts from the user, or restrict it by passing one or more Tags into its post() method, and the converse for the Tags. It&#8217;s got some nice python automagic, creating Tag and User objects when you use syntax like &#8220;<code>delicious.tags.foo</code>&#8221; or &#8220;<code>delicious.users.bar</code>&#8221; and providing <code>__iter__</code> methods that allow the &#8220;<code>for post in delicious.tags.foo</code>&#8221; usage.</p>
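<p>The automagic described above can be approximated with <code>__getattr__</code> and <code>__iter__</code>. The following is a guess at the general shape rather than an excerpt from delicious.py: the <code>TagFactory</code> class and the <code>FAKE_POSTS</code> table are invented for this sketch, and a real wrapper would fetch posts over HTTP instead of reading a hard-coded dict.</p>

```python
# Made-up stand-in for data that would really come from del.icio.us.
FAKE_POSTS = {"foo": ["http://example.org/a", "http://example.org/b"]}


class Tag:
    def __init__(self, name):
        self.name = name

    def __iter__(self):
        # Makes a loop over a Tag yield that tag's posts; this sketch
        # reads them from the hard-coded table above.
        return iter(FAKE_POSTS.get(self.name, []))


class TagFactory:
    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so tags.foo
        # builds Tag("foo") on demand without declaring it anywhere.
        return Tag(name)


tags = TagFactory()
for post in tags.foo:
    print(post)
```

<p>The same trick with a second factory would give a <code>users</code> namespace that creates User objects on first mention.</p>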
<p>There&#8217;s a bit of an example in the module&#8217;s <code>__main__</code> code, which takes a username, finds every other user who has also posted two or more of that user&#8217;s URLs, and tells you what tags others used for those URLs.</p>
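<p>The logic of that example can be sketched without touching the network. This is an approximation under stated assumptions, not the module&#8217;s actual <code>__main__</code> code: the <code>co_posters</code> function, the tuple layout and the sample data are all hypothetical, and each user is assumed to post a given URL at most once.</p>

```python
from collections import Counter


def co_posters(posts, username, minimum=2):
    """posts is a list of (user, url, tags) tuples, one per bookmark.

    Returns the users who share at least minimum URLs with username,
    plus a count of the tags other people gave to those URLs.
    """
    mine = {url for user, url, tags in posts if user == username}
    shared = Counter()
    their_tags = Counter()
    for user, url, tags in posts:
        if user != username and url in mine:
            shared[user] += 1
            their_tags.update(tags)
    matches = sorted(u for u, n in shared.items() if n >= minimum)
    return matches, their_tags


# Made-up sample data, not real del.icio.us posts.
sample = [
    ("mattb", "http://example.org/1", ["python"]),
    ("mattb", "http://example.org/2", ["rdf"]),
    ("ann", "http://example.org/1", ["code"]),
    ("ann", "http://example.org/2", ["semweb"]),
    ("bob", "http://example.org/1", ["python"]),
    ("bob", "http://example.org/9", ["misc"]),
]
matches, their_tags = co_posters(sample, "mattb")
print(matches)
print(sorted(their_tags.items()))
```

<p>Here ann shares two URLs with mattb and so qualifies, while bob shares only one; bob&#8217;s tag on the shared URL is still counted, but his tag on the unrelated URL is not.</p>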
<p>This code has evolved out of my own needs, rather than having a goal in particular. If you find it useful, I&#8217;d love to <a href="mailto:mb@hackdiary.com">hear from you</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2004/09/09/delicious-experiments/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Adventures in XHTML and CSS</title>
		<link>http://www.hackdiary.com/2004/08/01/adventures-in-xhtml-and-css/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=adventures-in-xhtml-and-css</link>
		<comments>http://www.hackdiary.com/2004/08/01/adventures-in-xhtml-and-css/#comments</comments>
		<pubDate>Sun, 01 Aug 2004 19:34:08 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=59</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>For my dad&#8217;s 60th birthday, my family and I produced a print book of memories and photographs from his life. I typeset it using OpenOffice 1.1 and sent it to the printers as PDF, which worked just fine. Today I&#8217;ve been creating an <a href="http://www.mikebiddulph.org/selectiveperception/">online version</a> from the original document.</p>
<p><span id="more-59"></span><br />
Although the layout was quite simple (two columns of text and images), the automatic HTML export was pretty bad &#8211; full of ugly FONT tags and with the images in all the wrong places. I ended up starting again from scratch, hand-producing XHTML and CSS with nothing but a text editor and pasting the text in from the original.</p>
<p>The resulting HTML is <a href="http://validator.w3.org/check?uri=http%3A%2F%2Fwww.mikebiddulph.org%2Fselectiveperception%2Fjobs.html">valid</a> XHTML 1.1 strict, and degrades nicely in lynx. I&#8217;ve not yet been able to check it in many other browsers, but it seems to look OK in IE6 running under <a href="http://www.netraverse.com">win4lin</a> (I don&#8217;t have any machines natively running Windows).</p>
<p>I used to shy away from frontend web work, finding it awfully fiddly and dull. Over the last year or two I&#8217;ve discovered that making simple semantic HTML and styling it with CSS can be quite a satisfying activity.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2004/08/01/adventures-in-xhtml-and-css/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Crawling the Semantic Web</title>
		<link>http://www.hackdiary.com/2004/02/12/crawling-the-semantic-web/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=crawling-the-semantic-web</link>
		<comments>http://www.hackdiary.com/2004/02/12/crawling-the-semantic-web/#comments</comments>
		<pubDate>Thu, 12 Feb 2004 19:52:47 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[rdf]]></category>
		<category><![CDATA[talks]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=48</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve had a proposal for a paper accepted for <a href="http://www.xmleurope.com/2004/">XML Europe 2004</a>. Yay! Looking forward to meeting lots of old friends and making new ones in Amsterdam in April. Let me know if you&#8217;re going to be there. Here&#8217;s what I submitted:</p>
<p><span id="more-48"></span><br />
This presentation examines the problem of semantic web crawling &#8211; following links from document to document and gathering the results for searching. Unlike centralised web search facilities, semantic web agents will be distributed, personalised and often highly domain-specific. How can we hold the entire world inside our laptops?</p>
<p>The W3C vision for the Semantic Web is &#8220;an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation&#8221;. It is &#8220;the representation of data on the World Wide Web&#8221;, expressed using the Resource Description Framework (RDF). Just as the web grew in usefulness as it was traversed, indexed and searched by systems such as Lycos, Altavista and Google, the semantic web requires technologies that can crawl, aggregate and query the RDF data.</p>
<p>This talk presents a modular semantic web crawler designed to explore the provision of services to applications. It highlights differences from and similarities to existing web search systems that gather their source data from the public web.</p>
<p>Rather than have web crawling and aggregation built into every semantic web application, agents will be able to call on aggregation services via webservices, be notified of new resources by publish-and-subscribe mechanisms, or simply receive a stream of RDF statements as they are found. A number of different RDF storage mechanisms are tested, including traditional relational databases and RDF toolkits such as Redland.</p>
<p>Applications will be discussed in the areas of social networks (using the Friend Of A Friend vocabulary) and personal publishing. Models for providing centralised services to lighter-weight agents (such as mobile applications) are explored, and important issues such as trust and attribution of information are covered.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2004/02/12/crawling-the-semantic-web/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Hackdiary redesign in progress</title>
		<link>http://www.hackdiary.com/2004/02/01/hackdiary-redesign-in-progress/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=hackdiary-redesign-in-progress</link>
		<comments>http://www.hackdiary.com/2004/02/01/hackdiary-redesign-in-progress/#comments</comments>
		<pubDate>Sun, 01 Feb 2004 18:21:29 +0000</pubDate>
		<dc:creator>Matt Biddulph</dc:creator>
				<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.hackdiary.com/?p=47</guid>
		<description><![CDATA[
]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s time to rework the hackdiary site and get rid of the nasty design that reeks of a lazy movabletype user. The HTML could do with a tidy-up too. It&#8217;ll probably look a bit broken for now.</p>
<p>It&#8217;s now about 80% of the way there, with a colour scheme inspired by a <a href="http://www.picdiary.com/new/shortestday/3">photo of a warning sign</a> that I took somewhere in East London last year.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.hackdiary.com/2004/02/01/hackdiary-redesign-in-progress/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

