Search Results for: *

Crawling Password Protected Websites

Crawl websites that require a login via web forms authentication, using our inbuilt Chrome browser.

How Accurate Are Website Traffic Estimators?

If you’ve worked at an agency for any significant amount of time, and particularly if you’ve been involved in forecasting, proposals or client pitches, you’ve likely been asked at least one of (or a combination or amalgamation of) the following...

Continue Reading Screaming Frog Blog

Exclude

Configuration > Exclude

The exclude configuration allows you to exclude URLs from a crawl by using partial regex matching. A URL that matches an exclude is not crawled at all (it’s not just ‘hidden’ in the interface). This will mean...

Continue Reading SEO Spider Guide
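As a rough illustration of what partial regex matching means for an exclude, the sketch below filters a list of URLs in Python. The patterns, URLs, and the `is_excluded` helper are hypothetical and are not taken from the SEO Spider itself, which uses its own regex engine; the point is only that a pattern can match anywhere in the URL, and that matching is case sensitive.

```python
import re

# Hypothetical exclude patterns, for illustration only.
EXCLUDE_PATTERNS = [r"/private/", r"\?price="]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if any pattern matches anywhere in the URL (partial match)."""
    return any(re.search(pattern, url) for pattern in patterns)


urls = [
    "https://example.com/",
    "https://example.com/private/report.html",
    "https://example.com/shop?price=low",
    "https://example.com/Private/report.html",  # different case, so not matched
]

# URLs that match an exclude are dropped entirely rather than hidden.
to_crawl = [url for url in urls if not is_excluded(url)]
print(to_crawl)
```

In this sketch the URL containing `/Private/` is still crawled, because the pattern `/private/` only matches the lower-case form.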

URL rewriting

Configuration > URL Rewriting

The URL rewriting feature allows you to rewrite URLs on the fly. For the majority of cases, the ‘remove parameters’ and common options (under ‘options’) will suffice. However, we do also offer an advanced regex replace...

Continue Reading SEO Spider Guide
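To make the idea of removing parameters concrete, here is a minimal Python sketch of stripping named query parameters from a URL. The `remove_parameters` function and the `sessionid` parameter are assumptions made for illustration; this is not how the SEO Spider implements the feature.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def remove_parameters(url, names):
    """Return the URL with the query parameters listed in `names` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in names
    ]
    # Rebuild the URL with only the parameters that were kept.
    return urlunsplit(parts._replace(query=urlencode(kept)))


rewritten = remove_parameters(
    "https://example.com/page?sessionid=abc&page=2", {"sessionid"}
)
print(rewritten)  # https://example.com/page?page=2
```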

Robots.txt

The Screaming Frog SEO Spider is robots.txt compliant. It obeys robots.txt in the same way as Google. It will check the robots.txt of the subdomain(s) and follow (allow/disallow) directives specifically for the Screaming Frog SEO Spider user-agent, if not Googlebot...

Continue Reading SEO Spider Guide

Crawling

The Screaming Frog SEO Spider is free to download and use for crawling up to 500 URLs at a time. For £199 a year you can buy a licence, which removes the 500 URL crawl limit. A licence also provides...

Continue Reading SEO Spider Guide

How do I extract multiple matches of a regex?

If you want all the H1s from the following HTML:

<html>
<head>
<title>2 h1s</title>
</head>
<body>
<h1>h1-1</h1>
<h1>h1-2</h1>
</body>
</html>

Then we can use: <h1>(.*?)</h1>

Continue Reading SEO Spider FAQ
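The regex from this answer can be tried outside the tool. The sketch below uses Python's `re.findall`, which is an assumption for illustration (the SEO Spider uses its own regex engine); it returns the capture group for every match rather than only the first.

```python
import re

html = """<html>
<head>
<title>2 h1s</title>
</head>
<body>
<h1>h1-1</h1>
<h1>h1-2</h1>
</body>
</html>"""

# findall returns the text of the capture group for each match in the input.
h1s = re.findall(r"<h1>(.*?)</h1>", html)
print(h1s)  # ['h1-1', 'h1-2']
```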

Why is my regex extracting more than expected?

If you are using a regex like .* that contains a greedy quantifier, you may end up matching more than you want. The solution to this is to use a lazy (non-greedy) quantifier instead, such as .*?. For example if you are trying to...

Continue Reading SEO Spider FAQ
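The difference between a greedy and a lazy quantifier is easiest to see side by side. The input string below is a made-up example, not the one from the FAQ, and Python's `re` module stands in for the tool's regex engine.

```python
import re

# Two headings on a single line, so that .* can run across both of them.
line = "<h1>first</h1> <h1>second</h1>"

# Greedy: .* grabs as much as possible, up to the LAST closing tag.
greedy = re.findall(r"<h1>(.*)</h1>", line)

# Lazy: .*? grabs as little as possible, stopping at the FIRST closing tag.
lazy = re.findall(r"<h1>(.*?)</h1>", line)

print(greedy)  # ['first</h1> <h1>second']
print(lazy)    # ['first', 'second']
```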

How does the Spider treat robots.txt?

The SEO Spider is robots.txt compliant. It checks robots.txt in the same way as Google. It will check the robots.txt of the (sub)domain and follow any directives specifically for Googlebot, or for all user-agents. You are able to adjust the...

Continue Reading SEO Spider FAQ
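The precedence between a user-agent-specific group and the catch-all group can be sketched with Python's standard `urllib.robotparser`. This is only an approximation for illustration: the robots.txt content and URLs are invented, and Python's parser is neither Google's parser nor the SEO Spider's, so edge cases may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a catch-all group and a Googlebot group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot has its own group, so only that group's rules apply to it.
googlebot_private = parser.can_fetch("Googlebot", "https://example.com/private/page")
googlebot_tmp = parser.can_fetch("Googlebot", "https://example.com/tmp/page")

# A crawler without its own group falls back to the rules for all user-agents.
other_private = parser.can_fetch("OtherBot", "https://example.com/private/page")

print(googlebot_private, googlebot_tmp, other_private)  # True False False
```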

Why isn’t my Include/Exclude function working?

The Include and Exclude functions are case sensitive, so any function needs to match the URL exactly as it appears, including its case. Please read both guides for more information. Functions will be applied to URLs that have not yet been discovered by the...

Continue Reading SEO Spider FAQ