<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Creating a Search Engine</title>
	<atom:link href="http://www.darrenherman.com/2007/05/28/creating-a-search-engine/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.darrenherman.com/2007/05/28/creating-a-search-engine/</link>
	<description>Marketing, Media, and Technology Conversations</description>
	<lastBuildDate>Wed,  8 Feb 2012 19:55:30 -0500</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: jim cary</title>
		<link>http://www.darrenherman.com/2007/05/28/creating-a-search-engine/comment-page-1/#comment-121903</link>
		<dc:creator>jim cary</dc:creator>
		<pubDate>Fri, 27 May 2011 05:58:57 +0000</pubDate>
		<guid isPermaLink="false">http://www.darrenherman.com/2007/05/28/creating-a-search-engine/#comment-121903</guid>
		<description>It&#039;s a good guideline for new people, and I think it&#039;s great.</description>
		<content:encoded><![CDATA[<p>It&#8217;s a good guideline for new people, and I think it&#8217;s great.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: edward smith</title>
		<link>http://www.darrenherman.com/2007/05/28/creating-a-search-engine/comment-page-1/#comment-121452</link>
		<dc:creator>edward smith</dc:creator>
		<pubDate>Mon, 23 May 2011 06:18:49 +0000</pubDate>
		<guid isPermaLink="false">http://www.darrenherman.com/2007/05/28/creating-a-search-engine/#comment-121452</guid>
		<description>Hey, this is a good guideline for me. Thanks a lot.</description>
		<content:encoded><![CDATA[<p>Hey, this is a good guideline for me. Thanks a lot.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hot Water Systems</title>
		<link>http://www.darrenherman.com/2007/05/28/creating-a-search-engine/comment-page-1/#comment-116544</link>
		<dc:creator>Hot Water Systems</dc:creator>
		<pubDate>Fri, 08 Apr 2011 08:17:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.darrenherman.com/2007/05/28/creating-a-search-engine/#comment-116544</guid>
		<description>I find this article very informative; this post really increased my knowledge.</description>
		<content:encoded><![CDATA[<p>I find this article very informative; this post really increased my knowledge.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: amit</title>
		<link>http://www.darrenherman.com/2007/05/28/creating-a-search-engine/comment-page-1/#comment-11303</link>
		<dc:creator>amit</dc:creator>
		<pubDate>Fri, 15 Jun 2007 17:17:53 +0000</pubDate>
		<guid isPermaLink="false">http://www.darrenherman.com/2007/05/28/creating-a-search-engine/#comment-11303</guid>
		<description>This is a wonderful idea. I have had experience in building vertical search engines... can we catch up some time to discuss? Looking forward to hearing from you.

Cheers</description>
		<content:encoded><![CDATA[<p>This is a wonderful idea. I have had experience in building vertical search engines&#8230; can we catch up some time to discuss? Looking forward to hearing from you.</p>
<p>Cheers</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: e.p.c.</title>
		<link>http://www.darrenherman.com/2007/05/28/creating-a-search-engine/comment-page-1/#comment-6996</link>
		<dc:creator>e.p.c.</dc:creator>
		<pubDate>Wed, 30 May 2007 15:47:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.darrenherman.com/2007/05/28/creating-a-search-engine/#comment-6996</guid>
		<description>I have been using Yahoo!&#039;s web services (http://developer.yahoo.com/search/) to provide the raw data for a vertical search engine I was working on.  Assuming you are doing a basic crawl and index of much of the web, there is little value in building your own search engine: you get stuck with a lot of capital costs and bandwidth costs for slurping all the data down.  Now, if you are focussing on a small set of sites (small being a relative term, let&#039;s say a specific industry vertical resulting in maybe 10,000 sites) then there might be value in truly building your own engine.  But if you&#039;re just screwing around, go with something like Yahoo&#039;s service or the Amazon A9 search service (which is fed by Microsoft&#039;s MSN/Live index).  Google had a search service as well (using SOAP) but have killed it in favour of an AJAX-y JavaScript-based service.  That is technically easier to implement, but you lose control over the UI and all interaction is with Google.</description>
		<content:encoded><![CDATA[<p>I have been using Yahoo!&#8217;s web services (<a href="http://developer.yahoo.com/search/" rel="nofollow">http://developer.yahoo.com/search/</a>) to provide the raw data for a vertical search engine I was working on.  Assuming you are doing a basic crawl and index of much of the web, there is little value in building your own search engine: you get stuck with a lot of capital costs and bandwidth costs for slurping all the data down.  Now, if you are focussing on a small set of sites (small being a relative term, let&#8217;s say a specific industry vertical resulting in maybe 10,000 sites) then there might be value in truly building your own engine.  But if you&#8217;re just screwing around, go with something like Yahoo&#8217;s service or the Amazon A9 search service (which is fed by Microsoft&#8217;s MSN/Live index).  Google had a search service as well (using SOAP) but have killed it in favour of an AJAX-y JavaScript-based service.  That is technically easier to implement, but you lose control over the UI and all interaction is with Google.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adam Phillips</title>
		<link>http://www.darrenherman.com/2007/05/28/creating-a-search-engine/comment-page-1/#comment-6864</link>
		<dc:creator>Adam Phillips</dc:creator>
		<pubDate>Wed, 30 May 2007 09:11:02 +0000</pubDate>
		<guid isPermaLink="false">http://www.darrenherman.com/2007/05/28/creating-a-search-engine/#comment-6864</guid>
		<description>Darren,

I wrote my own search engine a couple of years ago, being entirely unimpressed with any offering which claimed to be a search engine but also ran on a Windows platform - in short, there were none. You can see it in action at &lt;a href=&quot;http://www.thewebstoobig.com/pr&quot; rel=&quot;nofollow&quot;&gt;The Web&#039;s Too Big : PR&lt;/a&gt;. I wouldn&#039;t recommend attempting to write your own. I mean, it&#039;s entirely possible, as I have proved, but don&#039;t try it if you want to sleep for the next year or so.

The engine itself was written specifically to solve a problem for a vertical search engine. In its current usage it crawls and indexes the content of all PR companies in the UK, although this could easily be changed for whatever sites a user wishes.

I also tried Nutch, and at the time it wasn&#039;t very good, although to be fair that was two years ago. I have a colleague who is using it now and apparently the documentation has got much better. Nutch is Java-based and sits on Unix/Linux.

Anyway, drop me an email if you are really thinking about writing your own search engine. I have some handy tips I could share with you to spare you some pain.

Adam</description>
		<content:encoded><![CDATA[<p>Darren,</p>
<p>I wrote my own search engine a couple of years ago, being entirely unimpressed with any offering which claimed to be a search engine but also ran on a Windows platform &#8211; in short, there were none. You can see it in action at <a href="http://www.thewebstoobig.com/pr" rel="nofollow">The Web&#8217;s Too Big : PR</a>. I wouldn&#8217;t recommend attempting to write your own. I mean, it&#8217;s entirely possible, as I have proved, but don&#8217;t try it if you want to sleep for the next year or so.</p>
<p>The engine itself was written specifically to solve a problem for a vertical search engine. In its current usage it crawls and indexes the content of all PR companies in the UK, although this could easily be changed for whatever sites a user wishes.</p>
<p>I also tried Nutch, and at the time it wasn&#8217;t very good, although to be fair that was two years ago. I have a colleague who is using it now and apparently the documentation has got much better. Nutch is Java-based and sits on Unix/Linux.</p>
<p>Anyway, drop me an email if you are really thinking about writing your own search engine. I have some handy tips I could share with you to spare you some pain.</p>
<p>Adam</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Andrew</title>
		<link>http://www.darrenherman.com/2007/05/28/creating-a-search-engine/comment-page-1/#comment-6808</link>
		<dc:creator>Andrew</dc:creator>
		<pubDate>Tue, 29 May 2007 14:49:21 +0000</pubDate>
		<guid isPermaLink="false">http://www.darrenherman.com/2007/05/28/creating-a-search-engine/#comment-6808</guid>
		<description>Toss Lucene or Heritrix on a Rackspace server ($250/mo a pop) and you&#039;re good to go.  Almost every vertical search engine I&#039;ve met has bootstrapped off Lucene or Heritrix.  Some roll their own eventually (and a very small minority roll their own from the start), but both open source projects are robust, fast, and free, so I&#039;d hire a consultant that has experience with either of these two projects.</description>
		<content:encoded><![CDATA[<p>Toss Lucene or Heritrix on a Rackspace server ($250/mo a pop) and you&#8217;re good to go.  Almost every vertical search engine I&#8217;ve met has bootstrapped off Lucene or Heritrix.  Some roll their own eventually (and a very small minority roll their own from the start), but both open source projects are robust, fast, and free, so I&#8217;d hire a consultant that has experience with either of these two projects.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

