Taking automated webpage screenshots with embedded Mozilla

June 13th, 2004  |  Published in python  |  1 Comment

The other day I discovered Hotlinks, a rather nice link aggregator. It collects links from sites (including those of a couple of my respected colleagues) and combines them into a good-looking summary page. I particularly like its automatic webpage thumbnails, which are created using khtml2png. I couldn’t get khtml2png to compile on my machine, though. After finding that there are now Python wrappers for GtkMozEmbed, I made my own screenshotter-and-thumbnailer by embedding the Mozilla browser component in a little Python script.

UPDATE: Ross Burton picked up the script and made a couple of enhancements. Miguel de Icaza posted a C# version.


To run the script, you’ll need PyGtkMoz, GTK and the Python Imaging Library. Because it’s a GTK app, it needs an X server. To run it on a machine without a display, point it at a virtual X server such as Xnest or VNC. VNC is working well for me.

The way it works is very simple:

Starting with the example.py shipped with PyGtkMoz, I created a stripped-down browser window app that loads a URL given on the command line. By connecting the net_stop signal to a method, you can tell when network activity has finished. I couldn’t find a way to be notified when rendering has finished (images decompressed, etc.), so I put in a 3-second pause after net_stop fires before taking the screenshot.
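Here is a rough sketch of that wiring, not the actual script. It assumes the legacy PyGTK 2 stack (the gtk, gobject and gtkmozembed modules). The names make_net_stop_handler, run and RENDER_DELAY_MS are made up for illustration, and the 1024x768 window size is an arbitrary assumption.

```python
RENDER_DELAY_MS = 3000  # the 3 second pause described above


def make_net_stop_handler(timeout_add, capture, delay_ms=RENDER_DELAY_MS):
    """Build a net_stop callback that schedules capture() after a delay.

    timeout_add is expected to behave like gobject.timeout_add(ms, callback).
    """
    def on_net_stop(widget, *args):
        # Network activity is done; give the renderer time to finish painting.
        timeout_add(delay_ms, capture)
    return on_net_stop


def run(url, capture):
    """Open url in an embedded Mozilla widget, then call capture(widget)."""
    # Legacy PyGTK 2 modules, imported lazily so the helper above can be
    # used (and tested) on machines without them.
    import gtk
    import gobject
    import gtkmozembed

    window = gtk.Window()
    window.set_default_size(1024, 768)  # assumed size, pick your own
    widget = gtkmozembed.MozEmbed()
    window.add(widget)

    def capture_and_quit():
        capture(widget)
        gtk.main_quit()
        return False  # returning False stops the timeout from repeating

    handler = make_net_stop_handler(gobject.timeout_add, capture_and_quit)
    widget.connect("net_stop", handler)
    widget.load_url(url)
    window.show_all()
    gtk.main()
```

Passing the timer function in as an argument is only a convenience for this sketch: it lets the delay logic be exercised without an X server. A real script could just call gobject.timeout_add directly inside the signal handler.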

Following clues in a GTK mailing list post, I wrote these lines to save a PNG of my widget’s window:

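# Grab the widget's underlying GDK window and read its size
window = self.widget.window
(x, y, width, height, depth) = window.get_geometry()
# Copy the window contents into an 8-bit RGB pixbuf with no alpha channel
pixbuf = gtk.gdk.Pixbuf(gtk.gdk.COLORSPACE_RGB, False, 8, width, height)
pixbuf.get_from_drawable(window, self.widget.get_colormap(), 0, 0, 0, 0, width, height)
# Write the pixbuf to disk as a PNG
pixbuf.save("screenshot.png", "png")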

After that, producing a thumbnail just takes a few more lines of PIL code to open the PNG, shrink it and save it under a new name.
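As a minimal sketch of that step (not the original listing), assuming the Pillow fork of PIL: the helper name make_thumbnail, the file names and the default 160x120 bounding box are all illustrative choices, not values from the script.

```python
from PIL import Image


def make_thumbnail(src_path, dest_path, max_size=(160, 120)):
    """Open src_path, shrink it to fit inside max_size, save it as a PNG.

    Returns the (width, height) of the saved thumbnail.
    """
    image = Image.open(src_path)
    # thumbnail() resizes in place, keeps the aspect ratio and never enlarges.
    image.thumbnail(max_size)
    image.save(dest_path, "PNG")
    return image.size
```

With the screenshot saved as above, a call such as make_thumbnail("screenshot.png", "thumbnail.png") would write the smaller copy next to it.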

Responses

  1. Software Documentation Weblog says:

    June 15th, 2004 at 9:44 am (#)

    Taking automated screenshots of webpages

    Matt Biddulph has released a small program that will take screenshots of webpages as they are displayed by the Mozilla webbrowser.