Typical Reasons for Crawl Problems

Modified on: Thu, 19 Dec, 2024 at 4:49 PM

We crawl thousands of websites every day to present our latest findings on your website in our platform. Once in a while, we run into a challenge gathering the latest data from your website.

The reason is typically that something changed on your website. It can be close to impossible for us to pinpoint exactly what happened. However, our system will always do its best to mitigate the issue and continue crawling your website.

In extreme cases, we will still not be able to crawl your website. When that happens, it means all known automated solutions have been unsuccessful.

That typically leaves the following cases where we need some help from you.

We’re finding zero pages:

Finding zero pages is typically the result of one of the following.

  • The website no longer exists. If your website is no longer online, we suggest you delete the corresponding site from your account. This will reduce your stored pages and free up space in your subscription for other websites.
  • Temporary server maintenance or issues. If your website is down for maintenance, or you are experiencing intermittent server or hosting issues, our crawler may not find any pages when requesting your website. Your website will be crawled again in 5 days, at which point you should see a full set of updated pages.
  • You have excluded the main URL we are trying to crawl. Please check your Site Content Settings. If a rule in "exclude content" or "remove links" matches the site's index URL, remove that rule. Once it is removed, we will be able to crawl your website.
  • Firewall, robots.txt or IP restrictions. If none of the above items match your situation, it could be that the hosting service you use for your website does not allow our crawlers to access it. If that is the case, you can read more about the IP address and other identifiers of our system here, and contact your hosting provider to have these blocks lifted.
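
If you suspect a robots.txt restriction, you can check your own rules before contacting your hosting provider. The following is a minimal sketch using Python's standard library. The user agent name `ExampleCrawler` and the robots.txt content are placeholders made up for illustration, not our crawler's actual identifiers; substitute the identifiers listed in our documentation and the content of your own robots.txt file.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks one crawler and allows everyone else.
EXAMPLE_ROBOTS = """\
User-agent: ExampleCrawler
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    # "ExampleCrawler" is a placeholder name, not a real crawler identifier.
    for agent in ("ExampleCrawler", "SomeOtherBot"):
        allowed = is_allowed(EXAMPLE_ROBOTS, agent, "https://www.example.com/")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

A check like this only covers robots.txt rules. Firewall or IP-based blocks are enforced by the server or hosting provider and will not show up here.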

We’re only finding the front page of your website:

  1. Login. If your website content is protected behind a login, we may only be able to see the first page of your website. To enable crawls behind a login, please contact us to find out more.
  2. Our current crawler capabilities do not match your website. We continuously work to improve our systems to deal with most challenges. 
