<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Screenscraping HTML with TagSoup and XPath</title>
	<atom:link href="http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/</link>
	<description></description>
	<lastBuildDate>Fri, 19 Feb 2010 13:50:20 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
	<item>
		<title>By: darkerhorse</title>
		<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/comment-page-1/#comment-208</link>
		<dc:creator>darkerhorse</dc:creator>
		<pubDate>Wed, 31 Dec 2008 03:54:47 +0000</pubDate>
		<guid isPermaLink="false">http://www.hackdiary.com/?p=32#comment-208</guid>
		<description>Doing the code in Java takes a little less work, I believe. I have used TagSoup, and it works great.</description>
		<content:encoded><![CDATA[<p>Doing the code in Java takes a little less work, I believe. I have used TagSoup, and it works great.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ip address</title>
		<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/comment-page-1/#comment-88</link>
		<dc:creator>ip address</dc:creator>
		<pubDate>Fri, 13 Feb 2004 12:03:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.hackdiary.com/?p=32#comment-88</guid>
		<description>Nice summary. Thank you for posting it.
</description>
		<content:encoded><![CDATA[<p>Nice summary. Thank you for posting it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: stuck-on-mobile-e-com ;-)</title>
		<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/comment-page-1/#comment-91</link>
		<dc:creator>stuck-on-mobile-e-com ;-)</dc:creator>
		<pubDate>Sun, 04 Jan 2004 18:33:17 +0000</pubDate>
		<guid isPermaLink="false">http://www.hackdiary.com/?p=32#comment-91</guid>
		<description>&lt;strong&gt;html scraping&lt;/strong&gt;

by hackdiary: Screenscraping HTML with TagSoup and XPath. Looking forward to experimenting with those tools ......
</description>
		<content:encoded><![CDATA[<p><strong>html scraping</strong></p>
<p>by hackdiary: Screenscraping HTML with TagSoup and XPath. Looking forward to experimenting with those tools &#8230;&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Unicast</title>
		<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/comment-page-1/#comment-90</link>
		<dc:creator>Unicast</dc:creator>
		<pubDate>Sat, 27 Dec 2003 11:11:41 +0000</pubDate>
		<guid isPermaLink="false">http://www.hackdiary.com/?p=32#comment-90</guid>
		<description>&lt;strong&gt;The Danish media RSS project&lt;/strong&gt;

I&#039;ve started recoding the RSS feeds for the major Danish newspapers that I once had. Those feeds were done with...
</description>
		<content:encoded><![CDATA[<p><strong>The Danish media RSS project</strong></p>
<p>I&#8217;ve started recoding the RSS feeds for the major Danish newspapers that I once had. Those feeds were done with&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Eric</title>
		<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/comment-page-1/#comment-87</link>
		<dc:creator>Eric</dc:creator>
		<pubDate>Mon, 15 Dec 2003 20:04:11 +0000</pubDate>
		<guid isPermaLink="false">http://www.hackdiary.com/?p=32#comment-87</guid>
		<description>Hi, sorry if this is a stupid question, but I&#039;m trying to use TagSoup with JDOM, and it seems JDOM sets an internal namespace-prefixes feature that won&#039;t turn off even if I try to setFeature(...,false). Just wondering whether TagSoup will support this feature soon, or do I need to bug JDOM about it :) Thanks! -Eric
</description>
		<content:encoded><![CDATA[<p>Hi, sorry if this is a stupid question, but I&#8217;m trying to use TagSoup with JDOM, and it seems JDOM sets an internal namespace-prefixes feature that won&#8217;t turn off even if I try to setFeature(&#8230;,false). Just wondering whether TagSoup will support this feature soon, or do I need to bug JDOM about it :) Thanks! -Eric</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Max V</title>
		<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/comment-page-1/#comment-86</link>
		<dc:creator>Max V</dc:creator>
		<pubDate>Wed, 10 Dec 2003 14:37:34 +0000</pubDate>
		<guid isPermaLink="false">http://www.hackdiary.com/?p=32#comment-86</guid>
		<description>The result of my diploma thesis makes the TagSoup approach simpler by giving the user a way to visually select elements in a browser. In the background, an XSLT is created. Release scheduled for Xmas 2003.

&lt;a href=&quot;http://www.xam.de&quot; rel=&quot;nofollow&quot;&gt;http://www.xam.de&lt;/a&gt; &#124; &lt;a href=&quot;http://wal.sf.net&quot; rel=&quot;nofollow&quot;&gt;http://wal.sf.net&lt;/a&gt;
</description>
		<content:encoded><![CDATA[<p>The result of my diploma thesis makes the TagSoup approach simpler by giving the user a way to visually select elements in a browser. In the background, an XSLT is created. Release scheduled for Xmas 2003.</p>
<p><a href="http://www.xam.de" rel="nofollow">http://www.xam.de</a> | <a href="http://wal.sf.net" rel="nofollow">http://wal.sf.net</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Leigh Dodds</title>
		<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/comment-page-1/#comment-85</link>
		<dc:creator>Leigh Dodds</dc:creator>
		<pubDate>Wed, 30 Apr 2003 20:46:39 +0000</pubDate>
		<guid isPermaLink="false">http://www.hackdiary.com/?p=32#comment-85</guid>
		<description>Have you tried tweaking the TagSoup schema?

I&#039;ve not looked too closely myself, but it looks like I&#039;m going to have to do that because, not surprisingly, when I ran TagSoup over an HTML page containing Trackback markup, it discarded the markup.

There&#039;s probably another tutorial in there somewhere...
</description>
		<content:encoded><![CDATA[<p>Have you tried tweaking the TagSoup schema?</p>
<p>I&#8217;ve not looked too closely myself, but it looks like I&#8217;m going to have to do that because, not surprisingly, when I ran TagSoup over an HTML page containing Trackback markup, it discarded the markup.</p>
<p>There&#8217;s probably another tutorial in there somewhere&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Development Notebook</title>
		<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/comment-page-1/#comment-89</link>
		<dc:creator>Development Notebook</dc:creator>
		<pubDate>Tue, 15 Apr 2003 09:59:53 +0000</pubDate>
		<guid isPermaLink="false">http://www.hackdiary.com/?p=32#comment-89</guid>
		<description>&lt;strong&gt;Screenscraping with TagSoup&lt;/strong&gt;

Hmm, I wonder: TagSoup is an HTML cleaner, SAX-style hackdiary: Screenscraping HTML with TagSoup and XPath Screenscraping HTML with
</description>
		<content:encoded><![CDATA[<p><strong>Screenscraping with TagSoup</strong></p>
<p>Hmm, I wonder: TagSoup is an HTML cleaner, SAX-style hackdiary: Screenscraping HTML with TagSoup and XPath Screenscraping HTML with</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Biddulph</title>
		<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/comment-page-1/#comment-84</link>
		<dc:creator>Matt Biddulph</dc:creator>
		<pubDate>Mon, 14 Apr 2003 13:56:09 +0000</pubDate>
		<guid isPermaLink="false">http://www.hackdiary.com/?p=32#comment-84</guid>
		<description>Zaid of &lt;a href=&quot;http://www.altmobile.com&quot; rel=&quot;nofollow&quot;&gt;http://www.altmobile.com&lt;/a&gt; suggests his Mobile Internet Studio product for point-and-click HTML scraping. It&#039;s commercial software so I&#039;ve removed his marketing-oriented comment from this page (no offense intended; just personal preference), but do check out his site if you&#039;re interested.
</description>
		<content:encoded><![CDATA[<p>Zaid of <a href="http://www.altmobile.com" rel="nofollow">http://www.altmobile.com</a> suggests his Mobile Internet Studio product for point-and-click HTML scraping. It&#8217;s commercial software so I&#8217;ve removed his marketing-oriented comment from this page (no offense intended; just personal preference), but do check out his site if you&#8217;re interested.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jason</title>
		<link>http://www.hackdiary.com/2003/04/13/screenscraping-html-with-tagsoup-and-xpath/comment-page-1/#comment-83</link>
		<dc:creator>Jason</dc:creator>
		<pubDate>Mon, 14 Apr 2003 02:06:56 +0000</pubDate>
		<guid isPermaLink="false">http://www.hackdiary.com/?p=32#comment-83</guid>
		<description>I&#039;ve used HttpUnit for &quot;web-scraping&quot;. It worked for what I needed it to do.
</description>
		<content:encoded><![CDATA[<p>I&#8217;ve used HttpUnit for &#8220;web-scraping&#8221;. It worked for what I needed it to do.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
