
Crawl Errors: Crawler Blocked by Server, Network, or robots.txt Settings

Modified on: Fri, 24 Apr, 2026 at 9:45 PM

Issue / Question

You are receiving one or more of the following errors when Siteimprove attempts to crawl your site:

  • 403 Forbidden
  • Proxy error
  • Connection error
  • Max tries reached
  • Index URL blocked by robots.txt

Environment (If Applicable)

The issue occurs under the following conditions:

  • Product version: Siteimprove (all supported versions)
  • Platform / OS / Browser: Platform-independent (the issue occurs at the site configuration and crawl level, not in a specific browser)
  • User role / permissions: Account Owners
  • Preconditions: N/A

Solutions

If you are seeing a 403 Forbidden error

  • Ask your IT department or hosting provider to allow the Siteimprove crawler; one way to confirm the block is shown below. This error must be resolved outside the platform.
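
A common cause of a 403 is the server (or a firewall in front of it) filtering requests by user agent. One way to check is to compare responses for a browser-like and a crawler-like request. Below is a minimal Python sketch using the third-party requests library; the URL and the crawler-style User-Agent string are illustrative placeholders, not Siteimprove's exact values:

    # Compare how the server answers a browser-like request and a
    # crawler-like request. Requires: pip install requests
    import requests

    URL = "https://www.example.com/"  # placeholder: use your own site

    BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/120.0 Safari/537.36")
    # Illustrative only; Siteimprove's crawler identifies itself as
    # SiteimproveBot, but the exact header string may differ.
    CRAWLER_UA = "Mozilla/5.0 (compatible; SiteimproveBot)"

    for label, ua in [("browser-like", BROWSER_UA), ("crawler-like", CRAWLER_UA)]:
        resp = requests.get(URL, headers={"User-Agent": ua}, timeout=10)
        print(f"{label} request -> HTTP {resp.status_code}")

    # HTTP 200 for the browser-like request but 403 for the crawler-like
    # one indicates the server or a firewall is filtering by User-Agent.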

If you are seeing a Proxy error

  • Adjust your proxy settings or contact your IT department; a quick way to test the proxy yourself is shown below. If the issue persists, contact Customer Support.
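
If you suspect a proxy between the crawler and your site is rejecting requests, you can reproduce the failure by routing a test request through that proxy yourself. A minimal Python sketch, assuming a hypothetical internal proxy address:

    # Send one request through the proxy and report what happens.
    # Requires: pip install requests
    import requests

    URL = "https://www.example.com/"      # placeholder: use your own site
    PROXY = "http://proxy.internal:8080"  # hypothetical proxy address

    try:
        resp = requests.get(URL,
                            proxies={"http": PROXY, "https": PROXY},
                            timeout=10)
        print("Through proxy: HTTP", resp.status_code)
    except requests.exceptions.ProxyError as exc:
        # The proxy itself refused or mangled the request.
        print("Proxy rejected the request:", exc)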

If you are seeing a Connection error or Max tries reached

  • Verify that your site is online and reachable from outside your own network; the sketch below shows one way to check. If the error continues, contact Customer Support.
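
To check reachability the way a crawler does, retry the request a few times before giving up. A minimal Python sketch with placeholder URL and retry settings:

    # Retry a request a few times; report if the site never answers.
    # Requires: pip install requests
    import time
    import requests

    URL = "https://www.example.com/"  # placeholder: use your own site
    MAX_TRIES = 3

    for attempt in range(1, MAX_TRIES + 1):
        try:
            resp = requests.get(URL, timeout=10)
            print(f"Attempt {attempt}: HTTP {resp.status_code} (site is up)")
            break
        except requests.exceptions.ConnectionError as exc:
            print(f"Attempt {attempt}: connection failed ({exc})")
            time.sleep(2)  # brief pause before the next try
    else:
        # The for loop never hit "break": every attempt failed.
        print("Max tries reached: check DNS, firewall rules, and that "
              "the web server is running.")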

If you are seeing Index URL blocked by robots.txt

  • This error must be fixed outside the platform by updating your robots.txt file. Contact your IT department, web agency, or hosting provider if needed. 
  • Example: Your robots.txt may restrict all bots from crawling the site:

      User-agent: *
      Disallow: /

    To allow Siteimprove but block other bots, you could use:

      User-agent: *
      Disallow: /

      User-agent: SiteimproveBot
      Disallow:

    Crawlers obey the most specific User-agent group that matches them, so SiteimproveBot follows its own group (where the empty Disallow permits everything) and ignores the wildcard rules.
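
Before deploying a change, you can verify what a robots.txt file actually permits with Python's standard-library urllib.robotparser, which applies the same most-specific-group rule that well-behaved crawlers follow. The URLs below are placeholders:

    # Parse a live robots.txt and test it against different user agents.
    from urllib.robotparser import RobotFileParser

    ROBOTS_URL = "https://www.example.com/robots.txt"  # placeholder
    PAGE_URL = "https://www.example.com/some-page"     # placeholder

    rp = RobotFileParser()
    rp.set_url(ROBOTS_URL)
    rp.read()  # fetch and parse the file

    # With the example file above, the first line prints True and the
    # second prints False.
    print("SiteimproveBot allowed:", rp.can_fetch("SiteimproveBot", PAGE_URL))
    print("Other bots allowed:", rp.can_fetch("SomeOtherBot", PAGE_URL))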


Cause

All of these errors share the same root cause: something outside Siteimprove, such as the web server, a firewall or proxy, or the robots.txt file, is blocking the crawler. They must therefore be resolved in your site's setup rather than inside the platform.

Additional Information

  • Related errors or messages: N/A
  • Known limitations: N/A
  • Configuration notes: N/A
