Moyles-proof code

April 11th, 2004  |  Published in perl, python  |  12 Comments

While the rest of the UK was enjoying a Good Friday lie-in, I dragged myself into Yalding House (home of BBC Radio 1) at 7.30am. I was there to see our new text message system get its first live broadcast use on the Chris Moyles show in a preview of the Ten Hour Takeover.


This is a long piece… skip to the end if you want to see how it went on the day.

The project

I work for the beeb at Radio and Music Interactive, in the software architecture team. We’re responsible for the behind-the-scenes plumbing, running the systems behind things like LiveText, and providing toolkits for the guys on the apps team who build public services. When Radio 1 came up with the takeover day (ten hours of radio with music selected entirely by the listeners via text message and phone), they asked us to build them something to deal with the expected deluge of messages.

As you probably know, text messaging is huge in the UK and it’s been taken up enthusiastically by the BBC’s radio networks. Radio 1’s incoming SMS provider have a web console that’s used by the broadcast assistants and DJs in the studio. It works rather like an email inbox, and like most email clients the user interface treats the incoming messages as a chronological stream of text rather than a database to be mined for information. This works for running a regular radio show, but the takeover day needs special treatment.

Since the web console had been enough for the networks so far, this was our department’s first project based on machine-processing of SMS. Knowing how important text is to the radio networks, we wanted to build a foundation of components that would support future applications. As every software engineer knows, it’s hard to “build one to throw away” after the users have got their hands on it.

The build

In our team the language of choice is Python, whereas the apps team prefer Perl. Contrary to what some might expect, this mixed economy hasn’t yet given us any interop problems. We’re very keen on using abstractions such as asynchronous messaging, RESTful web services and relational databases to act as language-agnostic intermediaries for our data flows.

My immediate thought about the architecture was that it should be built from a series of loosely-coupled layers. Even solving the simple problem of pulling in a high-volume external XML feed of text messages and redistributing it to applications would pay off later, as long as it was simple to hook new code into. Further layers could process the data, with a web user interface layered on top of that. Working from a specification of the XML format used by the provider, Paul Clifford quickly built a gateway that collected messages and rebroadcast them over a message bus. Using asynchronous messaging as a transport gives us a good level of resilience and clusterability for free, while keeping the logic very simple. Building on this, he then started work on code to push messages sent to Radio 1 into a database and analyse their contents.
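To make the shape of that gateway layer concrete, here is a minimal sketch in Python. It is not Paul's code. The XML element and attribute names, the topic name and the in-process MessageBus class are all invented for illustration, since the provider's real format and our real bus aren't described here.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class MessageBus:
    """Tiny in-process stand-in for a real asynchronous message bus."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher never needs to know who is listening.
        for handler in self._subscribers[topic]:
            handler(message)


def gateway(xml_feed, bus, topic="sms.radio1"):
    """Parse one batch of the feed and rebroadcast each message on the bus."""
    root = ET.fromstring(xml_feed)
    count = 0
    for node in root.iter("message"):  # hypothetical element name
        bus.publish(topic, {
            "sender": node.get("from"),        # hypothetical attribute
            "received": node.get("received"),  # hypothetical attribute
            "text": (node.text or "").strip(),
        })
        count += 1
    return count


# Made-up sample data in the assumed format.
SAMPLE_FEED = """<messages>
  <message from="447700900001" received="2004-04-09T08:45:10">Weezer - Buddy Holly - Sam - for my mum</message>
  <message from="447700900002" received="2004-04-09T08:45:12">wheezer budy holly</message>
</messages>"""

bus = MessageBus()
stored = []  # stands in for the database layer
bus.subscribe("sms.radio1", stored.append)
published = gateway(SAMPLE_FEED, bus)
```

The point of the design is that the gateway only knows about the feed and the bus. The database loader, and anything else that comes later, is just another subscriber.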

We wanted to provide simple, useful features to the users such as freetext search, but we thought we could do more given that the incoming texts would have a certain amount of implicit structure. The programme was going to ask people to text in “Artist – Track – Name – Dedication”, but we assumed that there would be a lot of variation in the accuracy of the texts and a lack of consistent punctuation to act as delimiters. We planned a system of ‘fuzzy matching’ to group together texts using stopwords, sounds-alike phonetic matching and statistical analysis of text prefixes. If you see enough text strings beginning with words that sound like “Bob Dylan” then you can start to guess that Bob Dylan might be an artist, rather than a track called Dylan by a band named Bob. Paul did some great work on refining these techniques, and an on-air test a few weeks before the real broadcast showed that we could pretty accurately infer that “Weezer”, “Wheezer”, “Weazer” and “Wheeser” were all the creators of tracks called “Buddy Holly”, “Budy Holly” and “Buddy Holy”.
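As a rough illustration of the idea, the sketch below drops stopwords, reduces each remaining word to a sounds-alike key and counts how many texts begin with the same run of keys. It uses classic Soundex as the phonetic key, which is my choice for the example and not necessarily what Paul used. The stopword list and sample texts are made up too.

```python
import re
from collections import Counter

# Classic Soundex consonant groups: bfpv=1, cgjkqsxz=2, dt=3, l=4, mn=5, r=6.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)

STOPWORDS = {"the", "a", "by", "please", "play", "can", "you", "for"}  # illustrative


def soundex(word):
    """Return the four-character Soundex key for a word."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not break a run of equal codes
        code = CODES.get(c, "")
        if code and code != previous:
            key += code
        previous = code  # a vowel resets the run
    return (key + "000")[:4]


def phonetic_prefix(text, length):
    """Phonetic keys of the first `length` non-stopwords, or None if too short."""
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    if len(words) < length:
        return None
    return tuple(soundex(w) for w in words[:length])


def prefix_counts(texts, length):
    """Count how many texts start with each phonetic prefix."""
    counts = Counter()
    for text in texts:
        prefix = phonetic_prefix(text, length)
        if prefix is not None:
            counts[prefix] += 1
    return counts


TEXTS = [
    "Weezer - Buddy Holly - Sam - for mum",
    "wheezer budy holly",
    "Weazer buddy holy pls",
    "Wheeser, Buddy Holly",
    "Bob Dylan - Hurricane",
]

artist_guesses = prefix_counts(TEXTS, 1)
track_guesses = prefix_counts(TEXTS, 3)
```

All four spellings of Weezer collapse to the same key, so a prefix seen often enough at length one is a candidate artist, and a longer prefix that is nearly as common is a candidate artist plus track. A real system would need thresholds and better handling of multi-word artist names than this.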

While Paul worked on the analysis, I created a Perl wrapper around the database so that the apps team could get to work on the user interface. At the same time, I passed on a few lines of sample client code to Matt Webb to see what he could do with it. When working on a toolkit or API, I always like to have more than one client using it, to make sure that the requirements of the main project haven’t kept the interface from being generic and useful in other contexts. He built a nice rolling graph of texts received per minute, tapping in at the message bus layer. As with any messaging system, adding a new message recipient affected neither the code providing the messages nor any other clients of the bus. Neil Slater and Conal Jones in the apps team did great work building a web interface with guidance from the Radio 1 team, and within a few weeks we were ready to go live.
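A second client like that rolling graph can be very small. The sketch below is a guess at the general shape, not Matt's code. It assumes each message arrives from the bus as a dict with an ISO-format "received" timestamp, which is an invented detail, and it draws the graph as rows of text.

```python
from collections import Counter
from datetime import datetime, timedelta


class PerMinuteCounter:
    """Bus subscriber that keeps a rolling count of texts per minute."""

    def __init__(self, window_minutes=10):
        self.window = timedelta(minutes=window_minutes)
        self.counts = Counter()

    def __call__(self, message):
        # Hypothetical message shape: {"received": "2004-04-09T08:45:10", ...}
        received = datetime.fromisoformat(message["received"])
        minute = received.replace(second=0, microsecond=0)
        self.counts[minute] += 1
        # Forget minutes that have fallen out of the rolling window.
        cutoff = minute - self.window
        for old in [m for m in self.counts if m <= cutoff]:
            del self.counts[old]

    def series(self):
        """Return (minute, count) pairs in chronological order."""
        return sorted(self.counts.items())

    def render(self):
        """Draw one row of hashes per minute."""
        return "\n".join(f"{m:%H:%M} {'#' * n} {n}" for m, n in self.series())


counter = PerMinuteCounter()
for stamp in [
    "2004-04-09T08:45:10",
    "2004-04-09T08:45:40",
    "2004-04-09T08:46:05",
    "2004-04-09T08:46:20",
    "2004-04-09T08:46:59",
]:
    counter({"received": stamp})  # on a real bus this would be a callback

graph = counter.render()
```

Because the counter is just a callable that takes a message, it can be subscribed to a bus without the publisher or any other subscriber changing at all.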

On air

Chris Moyles likes to break things. He once managed to get listeners to send 14,000 texts at once. When I arrived, Aled the BA asked me how many messages the new system could take, and told me that Chris was going to do his best to overload it. At 8.45am, Chris started trailing the feature, and the text messages started trickling in. Then flooding. I was impressed: on a UK Bank Holiday, thousands of people were not just listening but involved enough to get on their phones and interact with the programme.

Chris and his team did a great job of understanding our system. They caught onto the freetext search right away and used it to find interesting tracks, and to pick out messages to read on air from the people who had requested them. We put a ‘quick stats’ box on every screen of the app, which they used to goad on the listeners: “3000 text messages so far. That is simply not enough. I want that quadrupled!” Once there were enough messages in the system for the pattern matching to kick in, they used it to navigate through the data and see which tracks by which artists were getting the most attention. As the listeners realised that the playlist had gone out the window and they could request anything, we got some great stuff coming in. Our system even got a little mention on air as Chris teased us for matching an Elvis Costello track to a bunch of requests for Elvis.

As the programme started I was sitting nervously next door to the studio tailing logfiles and watching process tables, but as it went on and the code coped with everything they could throw at it, I relaxed and enjoyed the show. I don’t normally listen to the Radio 1 Breakfast Show, but listening to this gave me new respect for the skills of a daytime DJ, sitting in a tiny room in a basement working a crowd of millions that they never see.

If you want to hear the show, there’s a Listen Again stream available from the BBC until April 16th. The request section is in the last hour. Tune into Radio 1 on Monday April 12th from 10am for the ten hour version.

Responses

  1. cityofsound says:

    April 12th, 2004 at 2:44 am (#)

    Takeover Radio

    If you’re reading this between 10am-8pm GMT on Easter Monday 2004, right now there’s a system we’ve built at BBC Radio & Music Interactive which is driving Radio 1. During this time, those of you in the UK can text

  2. Raw says:

    April 12th, 2004 at 10:57 am (#)

    In the BBC Trenches

    Matt Biddulph on producing hackdiary: Moyles-proof code – high volume SMS messaging to a national radio show (Chris Moyles is…

  3. plasticbag.org says:

    April 12th, 2004 at 11:21 pm (#)

    More fun from Radio and Music Interactive…

    Those of you in the UK today may have stumbled upon Radio 1’s Ten Hour Takeover – in which members of the listening public got to choose what got played – only using the power of text messaging. What you…

  4. Karinski says:

    April 13th, 2004 at 1:46 am (#)

    The Nation’s Favourite

    It’s no big secret that I’m a great fan of BBC Radio 1, and listen to and enjoy a great deal of their output. Today, however, saw Radio 1 take a break from the norm and run without a playlist…

  5. christopher-hill.com says:

    April 13th, 2004 at 1:45 pm (#)

    Ten Hour Takeover

  6. Listen to Musak says:

    April 13th, 2004 at 1:51 pm (#)

    It’s A Takeover

    As each day passes I have more and more respect for the talents of Chris Moyles.

  7. Tom Taylor says:

    April 13th, 2004 at 4:32 pm (#)

    Well done, it sounds like a pretty impressive and robust system. Presumably all of this information will be stored and used the next time the system runs?

  8. Martin's Linkdumps says:

    April 13th, 2004 at 10:27 pm (#)

    Moyles-proof code

    http://www.hackdiary.com/archives/000051.html

  9. Reprocessed says:

    May 15th, 2004 at 6:42 pm (#)

    Back again

    After a brief period of unpublicised writing activity a few months ago, followed by a period of frenetic back-end new-site coding that, as yet, hasn’t produced a usable new site…

  10. Reprocessed says:

    July 15th, 2004 at 2:07 pm (#)

    MMS plumbing at the BBC

    When Matt Biddulph (R&Mi T&D Architecture Team Gruppenfuhrer) wrote about Moyles-proof code he was talking about processing incoming SMS. The core plumbing of that system was a way of turning…

  11. dandr.org » Blog Archive » Wasting time? says:

    November 11th, 2008 at 12:08 am (#)

    […] randomly surfing from twitter to blogs, I enjoyed reading one of Matt B’s posts about designing a system to deal with lots of text messages at the BBC.  Linking different software components and sources together like he described is […]

  12. Alex Muller says:

    November 13th, 2008 at 2:38 pm (#)

    This is really interesting – I had no idea that you wrote the code to handle SMS messages that's used on the Chris Moyles show, even though it's used (and mentioned on air) at least a couple of times a week.

    Very cool :)