|User-Agent Name||DotBot 1.1|
|Category||Robot, Spider, Crawler|
|Last Visit||Aug 11, 2011 02:44 PDT|
|#||User-Agent String||Visit Frequency||Last Visit|
|1||Mozilla/5.0 (compatible; DotBot/1.1; http://www.dotnetdotcom.org/, email@example.com)||126542||Aug 11, 2011 02:44 PDT|
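To spot this crawler in server logs, the user-agent string listed above can be matched on its `DotBot/<version>` token. The snippet below is a minimal sketch: the helper name `dotbot_version` and the regular expression are illustrative assumptions, not something published by dotnetdotcom.org.

```python
import re

# Hypothetical pattern: matches the "DotBot/<version>" product token
# anywhere in a user-agent string, ignoring case.
DOTBOT_RE = re.compile(r"\bDotBot/(\d+(?:\.\d+)*)", re.IGNORECASE)


def dotbot_version(user_agent):
    """Return the DotBot version found in user_agent, or None if absent."""
    match = DOTBOT_RE.search(user_agent)
    return match.group(1) if match else None


# The user-agent string from the table above.
SAMPLE_UA = (
    "Mozilla/5.0 (compatible; DotBot/1.1; "
    "http://www.dotnetdotcom.org/, email@example.com)"
)

if __name__ == "__main__":
    print(dotbot_version(SAMPLE_UA))
```

A user-agent string is supplied by the client and can be forged, so a match like this is only a hint about who is making the request.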
DotBot is the web crawler operated by dotnetdotcom.org.
To keep DotBot off a site, add the following rule to robots.txt:
User-agent: dotbot
Disallow: /
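The effect of the robots.txt rule for dotbot can be checked locally with Python's standard `urllib.robotparser` module. This is a minimal sketch: the helper `is_allowed` and the `example.com` URL are assumptions made for illustration, and the check only shows how a compliant parser reads the rule, not how DotBot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The rule from the text above, written one directive per line.
ROBOTS_TXT = """\
User-agent: dotbot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/page.html"  # hypothetical URL
    print("DotBot allowed:", is_allowed(ROBOTS_TXT, "DotBot", url))
    print("Googlebot allowed:", is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Pass the short product token (`DotBot`) rather than the full user-agent string: `RobotFileParser` compares against the text before the first `/`, which for the full string would be `Mozilla`.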
Our purpose is rather simple. We want to make the internet as open as possible. Currently only a select few corporations have a complete and useful index of the web. Our goal is to change that by crawling the web and releasing as much information about its structure and content as possible. We plan to do this in a way that covers our costs (by selling our index) while still releasing it for free for the benefit of all webmasters. Obviously, this goal has many potential legal, financial, ethical and technical problems. So while we can't promise specific results, we can promise to work hard, share our results, and help make the internet a better and more open space.
Our crawling system is written in a mixture of C and Python. We elected to store our index in custom flat files on disk rather than in a traditional database management system. We would like to thank everyone who was involved in the many open source tools we used. These include GCC, GDB, Ubuntu Linux, Valgrind, Python and libcurl. Additionally, we want to thank the many webmasters who have taken the time to give us feedback and support our cause.