XMLEurope 2003 Talk Slides
May 7th, 2003 | Published in photos, rdf, rss, talks, wordnet | 2 Comments
This morning I did my talk (A Semantic Web Shoebox – Annotating Photos with RSS and RDF) at XMLEurope 2003. The slides are now available.
April 21st, 2003 | Published in java, rdf | 8 Comments
I wrote an RDF crawler (aka scutter) in Java using the Jena RDF toolkit. It spiders the web, gathering up Semantic Web data and storing it in any of Jena’s backend stores (in-memory, Berkeley DB, MySQL, etc.). Download it here.
April 13th, 2003 | Published in java, xml | 10 Comments
Often I need to pull a bit of information out of a web page to reuse inside some code. I’ve always done this from the command line using a combination of wget, HTML Tidy and xsltproc. Recently I’ve been doing the same thing in program code using some very handy tools written in Java.
Note: the example code below has been updated.
April 9th, 2003 | Published in java | 6 Comments
If you’re going to download a resource over HTTP from a URL more than once, there are a couple of features of HTTP you should make sure you’re using. By giving the server some metadata about what you saw when you last downloaded the resource, it can give you a status code indicating that the resource hasn’t changed and you should continue to use the version you already have.
This issue has been highlighted recently by the bandwidth load caused by the growth in popularity of RSS readers, which repeatedly download RSS files looking for changes. There’s a good writeup of the details at The Fishbowl. I didn’t find any sample Java source when I went looking recently, so here’s some code.
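As a rough illustration of the idea described above (a minimal sketch, not the code from the original post), a conditional GET can be done with the standard java.net.HttpURLConnection class. The client remembers the ETag and Last-Modified headers from the last full download and echoes them back as If-None-Match and If-Modified-Since. A 304 (Not Modified) response then means the cached copy is still good. The class name ConditionalGet and its methods are hypothetical names chosen for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not the original post's code.
class ConditionalGet {
    private String etag;          // ETag header from the last 200 response
    private String lastModified;  // Last-Modified header from the last 200 response
    private byte[] cachedBody;    // body from the last 200 response

    // Builds the conditional request headers from whatever validators we kept.
    static Map<String, String> validatorHeaders(String etag, String lastModified) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (etag != null) {
            headers.put("If-None-Match", etag);
        }
        if (lastModified != null) {
            // Echo the server's own date string back verbatim.
            headers.put("If-Modified-Since", lastModified);
        }
        return headers;
    }

    // Returns the resource body, reusing the cached copy on 304 Not Modified.
    byte[] fetch(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        for (Map.Entry<String, String> h : validatorHeaders(etag, lastModified).entrySet()) {
            conn.setRequestProperty(h.getKey(), h.getValue());
        }

        int status = conn.getResponseCode();
        if (status == HttpURLConnection.HTTP_NOT_MODIFIED) {
            // Nothing changed since last time: no body was sent.
            conn.disconnect();
            return cachedBody;
        }
        if (status != HttpURLConnection.HTTP_OK) {
            conn.disconnect();
            throw new IOException("Unexpected HTTP status " + status + " for " + url);
        }

        // Full download: read the body, then remember the new validators.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = conn.getInputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        etag = conn.getHeaderField("ETag");
        lastModified = conn.getHeaderField("Last-Modified");
        cachedBody = out.toByteArray();
        return cachedBody;
    }
}
```

An RSS reader would keep one ConditionalGet instance per feed URL and call fetch() on each polling cycle. A real reader would probably also persist the validators and the cached body between runs, which this sketch leaves out.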
April 1st, 2003 | Published in perl, rss | 3 Comments
Noticing how much the popularity of the different topics covered on this site varies, I added a “Most Popular Entries” sidebar to keep track of what people are reading. This is done with a simple application of Apache::ParseLog, XML::RSS, Movable Type’s XML-RPC interface and a Movable Type plugin.
March 14th, 2003 | Published in perl, rdf, talks | 4 Comments
Last night I gave a lightning talk at the london.pm techmeet that attempted to explain as simply as possible what RDF and the Semantic Web are, and how you can start playing with them in Perl.
March 9th, 2003 | Published in hardware, linux | 5 Comments
UPDATE: more notes written recently
My new laptop arrived this week – a Dell Latitude x200. And it’s marvellous. Wonderfully lightweight, good battery life for such a small box, good keyboard and a really clear bright screen. After a quick look at Windows XP, which I’d never seen properly before, I set about installing Linux on it. The Linux on Laptops Dell page has links to some useful bootstrapping information, but there were a few things I found pretty hard to work out. Here are my notes on those things.
February 26th, 2003 | Published in misc
Just when I’d got into a regular rhythm of posting new stuff to this site at least once a week, my laptop died. For the last year or so I’ve stopped using desktop machines, partly prompted by the arrival of cheap wireless networking. A good laptop and a number of built-for-purpose servers (mp3 jukebox, network gateway and webserver, etc) have suited me very well.
I’ve blatted my savings and ordered a shiny new replacement, but until that arrives I won’t be able to properly finish any of the code I’ve been working on. I’m looking forward to posting a new RDF scutter based on an updated version of the foaftool code posted here a few weeks ago.
February 6th, 2003 | Published in java | 4 Comments
Because the Exchange mail server at work is frustratingly slow and doesn’t have a flexible cross-folder search option, I wanted an indexing spider for IMAP. After a bit of struggling with the JavaMail API and almost no work at all plugging the messages into Lucene (which is impressively clean, flexible and powerful), I had some working code that starts at a folder and works down through its subfolders, indexing messages as it goes.
February 3rd, 2003 | Published in foaf, java, rdf | 1 Comment
To normalise and aggregate FOAF metadata related to photographs, I needed some new code to:
So I wrote foaftool, a Java class that uses Jena. The tarball also contains a couple of servlets that can be used to transform existing content on the web.