The Siteimprove scan process
A scan runs from the moment it is triggered until results are updated and can be seen in the Siteimprove platform. Information on your site's status and scan history can be found at Settings > Crawler Management.
Scan status conditions
The scan process includes the following status conditions:
- Queuing - A signal is sent that a site should be crawled. This signal is triggered either when the “recrawl website” button is clicked in the Siteimprove products or as part of a recurring crawl schedule. The site cannot always start crawling right away. Depending on a few factors, the crawl job may have to wait in a queue until a free slot is available. Read more about how the queue works in the article “Crawler queuing”.
- Crawling - Once a free slot is available, we start crawling your site. A spider (bot) identifies all links on the website. Internal links are followed, and the data from your pages is stored so that the Siteimprove checks can be processed.
- Processing - The data found and stored during the crawling stage is processed. Siteimprove product checks are carried out.
- Completed - The scan is completed once the data has been processed and results regarding your site have been updated. Your site will remain in this state until it is scheduled to be crawled again and the process repeats.
- No status available - This status appears when no crawl timestamp is recorded for your site. The usual causes are:
  - The site is set to not crawl (e.g. crawling may have been disabled by support at your request).
  - The site was recently added to your account and has not been added to the queue yet.
  - A scan previously completed on the site using the old crawler. The site has since been migrated to the new crawler and has not yet completed a scan with it. A scan status will appear when the next crawl starts queuing (see "next crawl date" in the site overview table).
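The status conditions above can be read as a simple cycle. The following is a minimal sketch of that cycle in Python, for illustration only: the names `ScanStatus`, `NEXT_STATUS`, `advance`, and `scan_cycle` are hypothetical and are not part of any Siteimprove API, and the sketch assumes a site with no status enters the cycle at Queuing.

```python
from enum import Enum


class ScanStatus(Enum):
    """Status names as listed in this article (hypothetical enum)."""
    NO_STATUS = "No status available"
    QUEUING = "Queuing"
    CRAWLING = "Crawling"
    PROCESSING = "Processing"
    COMPLETED = "Completed"


# Each status and the status that follows it. A completed site returns to
# Queuing when its next crawl is triggered or scheduled.
NEXT_STATUS = {
    ScanStatus.NO_STATUS: ScanStatus.QUEUING,  # assumption: first crawl starts by queuing
    ScanStatus.QUEUING: ScanStatus.CRAWLING,
    ScanStatus.CRAWLING: ScanStatus.PROCESSING,
    ScanStatus.PROCESSING: ScanStatus.COMPLETED,
    ScanStatus.COMPLETED: ScanStatus.QUEUING,
}


def advance(status):
    """Return the status that follows the given one."""
    return NEXT_STATUS[status]


def scan_cycle(start):
    """List the statuses a site passes through from `start` to the next Completed."""
    statuses = [start]
    # Always take at least one step so that starting at Completed yields a full cycle.
    while len(statuses) == 1 or statuses[-1] is not ScanStatus.COMPLETED:
        statuses.append(advance(statuses[-1]))
    return statuses


if __name__ == "__main__":
    for status in scan_cycle(ScanStatus.COMPLETED):
        print(status.value)
```

Starting from Completed, the sketch walks through Queuing, Crawling, and Processing before returning to Completed, which mirrors the repeat described for a recurring crawl schedule.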
Factors affecting website scan time
The amount of time it takes to scan your website depends on:
- The amount of time your site is waiting in a queue to be crawled (see "Crawler queuing")
- The speed at which we crawl your website (see "How do requests affect my website crawl speed?")
- The number of URLs/links we find on your website.
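As a rough way to reason about these three factors together, the sketch below adds the queue wait to the time needed to request every URL at a steady rate. This is a back-of-the-envelope model under stated assumptions, not Siteimprove's actual calculation: the function name `estimate_scan_seconds` and its parameters are hypothetical, the crawl rate is assumed to be constant, and processing time is not included.

```python
def estimate_scan_seconds(queue_wait_seconds, url_count, urls_per_second):
    """Rough scan-time estimate: queue wait plus crawl time at a constant rate.

    All parameters are illustrative assumptions, not Siteimprove settings.
    """
    if urls_per_second <= 0:
        raise ValueError("urls_per_second must be positive")
    if queue_wait_seconds < 0 or url_count < 0:
        raise ValueError("queue_wait_seconds and url_count must not be negative")
    crawl_seconds = url_count / urls_per_second
    return queue_wait_seconds + crawl_seconds


if __name__ == "__main__":
    # Example: 10 minutes in the queue, 12,000 URLs, 5 URLs crawled per second.
    seconds = estimate_scan_seconds(600, 12_000, 5)
    print(f"Estimated scan time: {seconds / 60:.0f} minutes")
```

In this model, halving the crawl speed doubles the crawl portion of the estimate but leaves the queue wait unchanged, which is why a small site can still take a while to scan when the queue is busy.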
For further insight into how long your sites take to scan, check the scan history at Settings > Crawler Management > Scan History.
If you have any questions regarding your scan, feel free to contact Siteimprove Technical Support.