Happy Fun Bot

Has your web site been visited by Happy Fun Bot?

Happy Fun Bot is a document analysis tool used to index and classify the World Wide Web. This kind of program is also known as a web crawler. The goal of Happy Fun Bot is to collect as many documents as possible for the happyfunsearch.com search engine. This page answers the most frequently asked questions about Happy Fun Bot.

Questions

  1. At what speed does Happy Fun Bot retrieve my site's pages?
  2. How do I ask Happy Fun Bot not to retrieve certain pages of my site?
  3. Why does Happy Fun Bot keep on asking for a robots.txt file on my site?
  4. Why does Happy Fun Bot try to retrieve nonexistent pages on my site? Or on a nonexistent domain?
  5. Why is Happy Fun Bot retrieving pages on our private site?
  6. Why does Happy Fun Bot not obey my robots.txt rules?
  7. Why can I see different host names with the same Happy Fun Bot signature on my site?
  8. What kind of links does Happy Fun Bot follow when visiting a site?
  9. I can't find an answer to my question on this page. What can I do?

Answers

At what speed does Happy Fun Bot retrieve my site's pages?

To avoid disturbing the availability of the sites it visits, Happy Fun Bot is configured to wait between 1 and 50 seconds between requests for pages of the same site. Given the nature of the Internet, the unavailability of part of the network may further reduce the frequency of visits. If you believe Happy Fun Bot is affecting your web site significantly, you can ask us to slow down the crawl by using the form at the end of this page. Note, however, that we already use the time your server takes to answer our requests as an indicator of its load.
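
As a rough illustration of using response time as a load indicator, the sketch below picks a wait time between 1 and 50 seconds that grows with how long the server took to answer. The function name next_delay, the scaling factor of 10, and the clamping policy are assumptions made for this example; they do not describe the actual Happy Fun Bot algorithm.

```python
MIN_DELAY = 1.0   # seconds, lower bound stated in this FAQ
MAX_DELAY = 50.0  # seconds, upper bound stated in this FAQ


def next_delay(response_time: float, factor: float = 10.0) -> float:
    """Return a wait time in seconds before the next request to the same site.

    Hypothetical policy: wait `factor` times as long as the server took to
    answer the previous request, clamped to the 1-50 second range.
    """
    return max(MIN_DELAY, min(MAX_DELAY, response_time * factor))
```

Under this assumed policy, a server that answers in 2 seconds would be revisited after 20 seconds, while a very fast server would still be given at least 1 second between requests.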

How do I ask Happy Fun Bot not to retrieve certain pages of my site?

The robots.txt file is a standard document that specifies whether Happy Fun Bot may visit all, part, or none of your web site. Its syntax is defined by the Robot Exclusion Standard. If you wish to treat Happy Fun Bot differently from other robots, define rules under a User-agent: line starting with "HappyFunBot". If no such rules are defined, Happy Fun Bot will obey the User-agent: * directives.

robots.txt example: the following rules exclude all robots from the /stats/, /cgi-bin/ and /img/ directories.

User-agent: *
Disallow: /stats/
Disallow: /cgi-bin/
Disallow: /img/

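Another example, this time defining rules specific to Happy Fun Bot, which is excluded from the /stats/, /cgi-bin/, /img/ and /tmp/ directories, while all other robots are excluded only from /stats/, /cgi-bin/ and /img/.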

User-agent: happyfunbot
Disallow: /stats/
Disallow: /cgi-bin/
Disallow: /img/
Disallow: /tmp/

User-agent: *
Disallow: /stats/
Disallow: /cgi-bin/
Disallow: /img/
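
If you want to check how rules like these are interpreted before publishing them, one option is the robots.txt parser in the Python standard library. The sketch below is only an illustration: it uses Python's parser, not Happy Fun Bot's own, so edge cases may differ, and the "OtherBot" name and the helper function allowed are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The second example from this FAQ, written out as a string.
ROBOTS_TXT = """\
User-agent: happyfunbot
Disallow: /stats/
Disallow: /cgi-bin/
Disallow: /img/
Disallow: /tmp/

User-agent: *
Disallow: /stats/
Disallow: /cgi-bin/
Disallow: /img/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    # www.mysite.com is the placeholder host used elsewhere in this FAQ.
    return parser.can_fetch(agent, "http://www.mysite.com" + path)


# /tmp/ is closed to Happy Fun Bot only; /img/ is closed to every robot.
print(allowed("HappyFunBot", "/tmp/file.html"))
print(allowed("OtherBot", "/tmp/file.html"))
print(allowed("OtherBot", "/img/logo.png"))
```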

Why does Happy Fun Bot keep on asking for a robots.txt file on my site?

robots.txt is a standard document that allows or disallows robots to retrieve pages of a web site. Happy Fun Bot requests it at the start of each visit so that it can obey your directives. If you want to learn how to write your own robots.txt file, please see the Robot Exclusion Standard. If you just want to avoid errors in your log files regarding this file, you can place an empty file named robots.txt at the root of your site.

Why does Happy Fun Bot try to retrieve nonexistent pages on my site? Or on a nonexistent domain?

The World Wide Web contains many "broken" links, as well as links to sites that no longer exist. When another site contains an incorrect link to your site, visitors who follow it cannot reach the document. In the same way, Happy Fun Bot may try to access a document through an old or incorrect link. Such failed requests usually appear as 404 errors in your server logs.

Why is Happy Fun Bot retrieving pages on our private site?

It is sometimes impossible to keep a site "secret", even if you do not publish a link to it. There are several reasons for this:
- As soon as a visitor to the "secret" site follows a link from it to another site, the visitor's web browser sends the address of the "secret" site as the referer, and it is recorded in the logs of the visited site. These logs are sometimes made public through statistics pages.
- Some domain lists are public, such as lists of newly registered domains (depending on the registrar). Similarly, web hosting companies sometimes maintain a public list of the sites they host.
- You may not know of all the links pointing to your site. You can use the link: syntax on happyfunsearch.com to find them, for example link:www.free.fr or link:www.mysite.com/page.html (although not all pages are necessarily indexed, even if their links are followed).

Apache users should consider using .htaccess to protect their data from access by unauthorized users and robots; see http://apache-server.com/tutorials/ATusing-htaccess.html

Why does Happy Fun Bot not obey my robots.txt rules?

Each time Happy Fun Bot visits your site, it first retrieves the robots.txt file in order to obey your directives. This means that changes to the robots.txt file are not taken into account until Happy Fun Bot's next visit to your web site. Please check the syntax of your robots.txt file against http://www.robotstxt.org/wc/exclusion.html#robotstxt. Most problems come from the file being in the wrong place: it must be at the root of your web site and has no effect in any subdirectory. If you have followed all these rules and still experience problems, you can contact us by using the form at the end of this page.
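
To make the placement rule concrete, the short sketch below computes where a crawler would look for robots.txt given the address of any page on a site. The function name robots_url is hypothetical and the code only illustrates the rule; it is not taken from Happy Fun Bot itself.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt location that applies to `page_url`."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path, query and fragment are
    # discarded because robots.txt is only looked up at the site root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.mysite.com/blog/2004/page.html"))
# prints http://www.mysite.com/robots.txt
```

A file placed at http://www.mysite.com/blog/robots.txt would therefore never be consulted.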

Why can I see different host names with the same Happy Fun Bot signature on my site?

Happy Fun Bot collects millions of pages, which requires many computers. These computers have been assigned different IP (Internet) addresses, so they may appear under different host names in your logs. This means that more than one robot can crawl your web site at a time.

What kind of links does Happy Fun Bot follow when visiting a site?

Our crawler follows all the links found in the HREF attributes of a document, regardless of the file extension (.html, .htm, .php, .asp and so on).
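
For readers curious about what HREF-based link discovery can look like, here is a minimal sketch using the HTML parser from the Python standard library. The names LinkCollector and extract_links are hypothetical. Collecting HREF values from every tag and resolving relative links against the page address are assumptions of this example, not a description of Happy Fun Bot's internals.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the value of every href attribute found in a document."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names, so HREF and href both match.
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the address of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> list[str]:
    """Return the absolute URLs of all href links in `html`."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


page = '<a HREF="page.php">one</a> <a href="/docs/index.asp">two</a>'
print(extract_links(page, "http://www.mysite.com/dir/"))
```

Note that the extension plays no role here: links to .php and .asp pages are collected in the same way as links to .html pages.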

I can't find an answer to my question on this page. What can I do?

You can send us your questions about Happy Fun Bot by filling in the following form:

Your email:
