Algorithmic recruitment with GitHub

February 10th, 2010  |  Published in web

In my new job in Berlin I’ve been asked to hire some people to help prototype new, secret projects. Berlin has a superb tech scene but as I’m new in town it’s taking me a little time to get to know everyone. While that’s going on, I wrote some code to help me explore Berlin’s developer community.

When I’m hiring, one of the things I always want to see is evidence of personal projects. Over the last two years, GitHub has become an amazing treasure trove of code, with the best social infrastructure I’ve ever seen on a developer site. GitHub profiles let the user set their location, so I started with a few web searches for Berlin developers. This finds hundreds of interesting people, but how do I prioritise them?

Another thing that I look for when building a good team is someone’s personal network. I’ve always believed strongly in spending lots of time at conferences meeting passionate people who are smarter than me. A good developer can make themselves even more productive by knowing who to email, IM or DM to answer a question when they’re stuck.

A recent article by Stowe Boyd on centrality and influence in social networks reminded me of some of the network analysis we use behind the scenes calculating recommendations for the Dopplr Social Atlas. So I wrote some code to query the GitHub API and analyse the social graph of the Berlin subset of their users.

The JRuby code uses Yahoo BOSS to do the web search. After querying the GitHub API for each user’s followers, it builds an in-memory graph using the Java Universal Network/Graph Framework. Then it ranks each user node in the graph using the Betweenness Centrality algorithm. You can see the simple source code on my GitHub.
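For anyone who wants to see the ranking step without JRuby or the Java graph library, here is a minimal sketch in plain Python. It is not the code described above: it swaps in a hand-written version of Brandes' algorithm for betweenness centrality, treats each follow as an undirected link (an assumption), and uses made-up usernames in place of real GitHub API results.

```python
from collections import deque


def build_graph(follows):
    """Turn (follower, followed) pairs into an undirected adjacency map."""
    graph = {}
    for follower, followed in follows:
        graph.setdefault(follower, set()).add(followed)
        graph.setdefault(followed, set()).add(follower)
    return graph


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph."""
    scores = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        predecessors = {node: [] for node in graph}
        path_count = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            current = queue.popleft()
            order.append(current)
            for neighbour in graph[current]:
                if distance[neighbour] < 0:
                    distance[neighbour] = distance[current] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[current] + 1:
                    path_count[neighbour] += path_count[current]
                    predecessors[neighbour].append(current)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1.0 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    # Every unordered pair was counted from both ends, so halve the totals.
    return {node: score / 2.0 for node, score in scores.items()}


# Hypothetical follower data standing in for the GitHub API results.
follows = [
    ("anna", "bert"),
    ("bert", "clara"),
    ("clara", "dirk"),
    ("clara", "emil"),
]
scores = betweenness_centrality(build_graph(follows))
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[:2])
```

In this toy graph, clara sits on the most shortest paths between other people, so she ranks first, with bert second.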

To sanity-check the results I ran it for a couple of cities I already know well: London and San Francisco. Here are the top 5 for each, which seem quite plausible to me:

San Francisco

  1. Chris Wanstrath, GitHub
  2. Tatsuhiko Miyagawa, Six Apart
  3. Leah Culver, Six Apart
  4. Square Inc
  5. Aman Gupta, Ruby EventMachine maintainer

London

  1. James Darling
  2. London Ruby User Group
  3. Mark Norman Francis
  4. Dan Webb (recently moved to Twitter in SF)
  5. Carlos Villela, Thoughtworks

My choice of metric biases these lists towards connectedness and influence; it can’t measure ability. It also only measures GitHub users, who skew towards Ruby, Perl and JavaScript. But seeing names there that I trust gives me confidence that it’ll help me find interesting people in Berlin.

Hopefully some of those people are reading this blog post right now. Others outside Berlin might be interested to know that Nokia does a superb job of relocating people, with everything taken care of by shipping companies and local agents. If you love the web, JavaScript, mobile, user experience, social networks, location, enormous datasets and currywurst, you should get in touch.

Alas, Second Life! Web 2.0 in a virtual world

May 29th, 2006  |  Published in web

Second Life has been my new hacking obsession ever since I bought a laptop fast enough to run it. I don’t spend a lot of time socialising in the gameworld, but I am fascinated by the possibilities for makers of new user interfaces, useful virtual objects and playful toys. With every object being scriptable, aware and active, it’s a proving ground for Everyware.

Version 1.10 was released last week, and hidden among the exciting new visual modeling possibilities of shiny rendering and flexible objects was Second Life’s own XMLHttpRequest: llHTTPRequest. Using asynchronous callbacks, it gives the platform an important new capability: communication with the web on demand. A lot of what we are learning about AJAX makes sense here, in this world of Asynchronous Lindenscriptinglanguage And Some-sort-of-data (ALAS!).

I’ve spent a few hours hacking on some toy objects with this new capability, starting with the mashup de rigueur: Flickr integration. My home in SL now sports a simple picture frame. Touch it and it looks up your avatar name to see what your favourite Flickr tag is, picks a random picture with that tag from Flickr and displays it on its surface. If it hasn’t met you before, it asks you to tell it what tag to use.

Because Second Life is so wonderfully visual, here’s a little demo movie that I recorded with Tom Coates:


Stemming tags, and one website to the tune of another

January 30th, 2005  |  Published in web

del.icio.us is still giving me food for thought. Here are two toys I’ve made recently: a tag stemming tool that helps you tidy up your tagging using the Porter algorithm, and a (Flash) screen-recorded demo of del.icio.us seamlessly embedded in the BBC Radio 3 website.

(Maximize your browser window! Apologies for the slow playback speed of the movie. You’re welcome to browse the JavaScript, but it’s something of a pain to get it running in your own browser. I’m looking at how I can turn it into a reusable and configurable Firefox extension, but for now it’s just a demo built with Greasemonkey.)

UPDATE: I had to demo this to a mixed audience at the BBC this afternoon, so I put together some quick slides to help me explain the step-by-step process that goes on behind the scenes. Perhaps someone else will find them useful too.
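To give a flavour of what the tag stemming tool does, here is a rough Python sketch that groups tags sharing a stem, so near-duplicates such as "blog" and "blogs" show up together. The real tool uses the Porter algorithm; to keep the example short this one swaps in a much cruder suffix stripper, so it will get plenty of English words wrong. The function names and sample tags are made up for illustration.

```python
from collections import defaultdict

# Checked in this order; "ies" must come before the bare "s".
SUFFIXES = ("ing", "ed", "ies", "s")


def crude_stem(tag):
    """Very rough stand-in for a Porter stemmer (NOT the full algorithm)."""
    word = tag.lower()
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if not word.endswith(suffix) or len(stem) < 3:
            continue
        if suffix == "s" and stem.endswith("s"):
            continue  # leave words like "class" alone
        if suffix == "ies":
            return stem + "y"  # "libraries" -> "library"
        if suffix in ("ing", "ed") and stem[-1] == stem[-2] and stem[-1] not in "lsz":
            return stem[:-1]  # "blogging" -> "blogg" -> "blog"
        return stem
    return word


def group_by_stem(tags):
    """Return only the stems that more than one tag collapses to."""
    groups = defaultdict(list)
    for tag in tags:
        groups[crude_stem(tag)].append(tag)
    return {stem: members for stem, members in groups.items() if len(members) > 1}


# Hypothetical tag list, not taken from a real account.
tags = ["blog", "blogs", "blogging", "libraries", "library", "python", "rdf"]
for stem, members in group_by_stem(tags).items():
    print(stem, members)
```

Each group it prints is a candidate for tidying up: pick one spelling and retag the rest.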


del.icio.us experiments redux

October 6th, 2004  |  Published in web

About a month ago, I posted about some del.icio.us experiments I’d been doing, and published the Python wrapper I’d been using. Of course, the post itself was another del.icio.us experiment.


del.icio.us experiments

September 9th, 2004  |  Published in web

Maybe you’re a Python programmer. Maybe you think del.icio.us is kinda cool. Maybe you’d like to be able to do this:

>>> print len(delicious.Href("https://www.vim.org/").posts())


>>> for post in delicious.users.mattb(delicious.tags.hackney):
...   print post.description,post.href
Phil Gyford selling his flat https://www.gyford.com/strandbuilding/
POGO https://www.pogocafe.co.uk/
hackneycentral https://www.hackneycentral.org/
Armadillo https://www.armadillorestaurant.co.uk/

or perhaps do something much, much cooler. delicious.py is for you.

UPDATE: Since I published this code, Joshua Schachter has made the rules around use of del.icio.us APIs clearer. So that you can stay within these limits, you should be aware that no call to a method in delicious.py will cause more than one HTTP request to del.icio.us. This means that it’s left up to you to time your requests appropriately and politely, but at least you know that the code won’t spam del.icio.us of its own accord.
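Since the timing is left to the caller, one way to handle it is a small throttle that enforces a minimum gap between requests. This is only a sketch and not part of delicious.py: `Throttle` and `fetch_politely` are hypothetical names, `fetch` stands in for whatever single delicious.py call you make, and the interval you pick should come from the published rules rather than from this example.

```python
import time


class Throttle:
    """Enforce a minimum gap (in seconds) between successive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def wait(self):
        """Block until min_interval has passed since the previous wait()."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


def fetch_politely(items, fetch, throttle):
    """Call fetch(item) for each item, pausing between calls as needed."""
    results = []
    for item in items:
        throttle.wait()
        results.append(fetch(item))  # assumed to make one HTTP request
    return results
```

The clock and sleep functions are injectable so the spacing logic can be tested without actually waiting.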


Adventures in XHTML and CSS

August 1st, 2004  |  Published in web

For my dad’s 60th birthday, my family and I produced a print book of memories and photographs from his life. I typeset it using OpenOffice 1.1 and sent it to the printers as PDF, which worked just fine. Today I’ve been creating an online version from the original document.


Crawling the Semantic Web

February 12th, 2004  |  Published in rdf, talks, web

I’ve had a proposal for a paper accepted for XML Europe 2004. Yay! Looking forward to meeting lots of old friends and making new ones in Amsterdam in April. Let me know if you’re going to be there. Here’s what I submitted:


Hackdiary redesign in progress

February 1st, 2004  |  Published in web

It’s time to rework the hackdiary site and get rid of the nasty design that reeks of a lazy Movable Type user. The HTML could do with a tidy-up too. It’ll probably look a bit broken for now.

It’s now about 80% of the way there, with a colour scheme inspired by a photo of a warning sign that I took somewhere in East London last year.