How To Perform A Parity Audit
This tutorial explains how to utilise the Screaming Frog SEO Spider to perform parity audits to help uncover differences and potential SEO issues.
First, let’s quickly summarise what we mean by a parity audit.
What Is A Parity Audit?
In SEO, a parity audit checks whether two versions of a website or page are the same, or whether there are differences that could impact organic visibility.
Websites can be set up in contrasting ways, and pages themselves can respond differently to different types of requests.
In SEO, the two most common types of parity considerations and audits are –
- Mobile Vs Desktop
- JavaScript Vs Raw HTML
But this type of audit could also include ensuring parity between live and staging environments, for example.
It can be useful to know if there are differences to the website or pages, and to key SEO elements, that might impact crawling, indexing and ranking. These can include –
- Page & Asset Counts
- HTTP Responses
- Internal Links
- Page Titles
- Headings
- Copy
- Directives
- Canonicals
- Hreflang
- Structured Data
And more! If there are differences between some of these elements, it doesn’t immediately mean they will be problematic for ranking.
However, you need to know if there are differences, and what they are – so you can act if required.
Let’s consider both JavaScript Vs Raw HTML and Mobile Vs Desktop parity in more detail.
How To Perform A JavaScript Parity Audit
While Google renders virtually every web page today, they still crawl the raw HTML, as well as the rendered HTML after JavaScript execution.
It’s useful to know what JavaScript dependencies there are on a website, as there can be both user and SEO complications relying on client-side rendering.
To perform a JavaScript parity audit, it’s as simple as switching to JavaScript rendering mode and reviewing the JavaScript tab for reported differences.
1) Configure JavaScript Rendering
Open up the SEO Spider, click ‘Config > Spider > Rendering’ and switch ‘Rendering’ to ‘JavaScript’.
In JavaScript rendering mode, the SEO Spider will crawl both the raw and rendered HTML and identify differences.
2) Adjust User-Agent & Window Size
The default viewport for rendering is set to Googlebot Smartphone, as Google primarily crawls and indexes pages with their smartphone agent for mobile-first indexing.
This will mean you’ll see a mobile-sized screenshot in the lower ‘rendered page’ tab.
These shouldn’t need adjusting, but both user-agent and window size can be configured to your preferences.
3) Enable Resources & External Links
Ensure resources such as images, CSS and JS are selected under ‘Configuration > Spider’.
If resources are on a different subdomain, or a separate root domain, then ‘Check External Links’ will need to be enabled, otherwise they won’t be used in rendering.
This is the default configuration in the SEO Spider, so you can simply click ‘File > Default Config > Clear Default Configuration’ to revert to this set-up.
4) Crawl The Site
Now crawl the website by inputting the homepage into the ‘Enter URL to spider’ box and clicking ‘Start’.
The SEO Spider will then start crawling the website, and rendering the pages using headless Chrome.
5) Analyse JavaScript Tab & Filters
The JavaScript tab contains 12 filters that make you aware of JavaScript dependencies, differences to the raw HTML, and potential issues.
These help you uncover pages which have JavaScript content, links, or differences in meta tags, canonicals or on-page elements such as page titles, meta descriptions and headings.
You can quickly identify pages which have JavaScript content and the percentage of content which is only in the rendered HTML.
If the ‘JavaScript Content’ filter has triggered for a page and you’re storing the HTML, you can click on the lower ‘View Source’ tab and switch the ‘HTML’ filter to ‘Visible Content’ to highlight page text which is only in the rendered HTML.
You can also find pages with links that are only in the rendered HTML after JavaScript has run.
To see which links are only in the rendered HTML, click the lower ‘Outlinks’ tab and select the ‘Rendered HTML’ link origin filter.
This can be helpful when identifying which links are loaded using JavaScript, such as products on category pages. You can bulk export all links that rely on JavaScript via ‘Bulk Export > JavaScript > Contains JavaScript Links’.
You can also discover pages which use JavaScript to update page titles, meta descriptions or headings.
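If you want a feel for what this comparison involves outside of the SEO Spider, the sketch below is a minimal, single-URL approximation: it fetches the raw HTML with requests and the rendered HTML with Playwright’s headless Chromium, then diffs the title, visible word count and outlink hrefs. The URL is a placeholder, and this is a simplified illustration rather than how the tool itself performs the comparison.

```python
# A minimal sketch of a raw vs rendered parity check for a single URL.
# Assumes 'requests', 'beautifulsoup4' and 'playwright' are installed
# (plus 'playwright install chromium'). The URL below is a placeholder.
import requests
from bs4 import BeautifulSoup
from playwright.sync_api import sync_playwright

URL = "https://www.example.com/"  # hypothetical page to check

# Raw HTML, as served before any JavaScript runs
raw_html = requests.get(URL, timeout=30).text

# Rendered HTML, after JavaScript execution in headless Chromium
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto(URL, wait_until="networkidle")
    rendered_html = page.content()
    browser.close()

def extract(html):
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else None
    links = {a.get("href") for a in soup.find_all("a", href=True)}
    words = len(soup.get_text(" ", strip=True).split())
    return title, links, words

raw_title, raw_links, raw_words = extract(raw_html)
ren_title, ren_links, ren_words = extract(rendered_html)

print("Title changed by JavaScript:", raw_title != ren_title)
print("Word count raw vs rendered:", raw_words, "vs", ren_words)
print("Links only in rendered HTML:", sorted(ren_links - raw_links))
```

This only checks one page at a time, whereas the SEO Spider performs the raw vs rendered comparison across every page and filter during a crawl.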
The following filters in particular are helpful when performing a JavaScript parity audit –
- Contains JavaScript Links – Pages that contain hyperlinks that are only discovered in the rendered HTML after JavaScript execution. These hyperlinks are not in the raw HTML. While Google is able to render pages and see client-side only links, consider including important links server side in the raw HTML.
- Contains JavaScript Content – Pages that contain body text that’s only discovered in the rendered HTML after JavaScript execution. While Google is able to render pages and see client-side only content, consider including important content server side in the raw HTML.
- Noindex Only in Original HTML – Pages that contain a noindex in the raw HTML, and not in the rendered HTML. When Googlebot encounters a noindex tag, it skips rendering and JavaScript execution. Because Googlebot skips JavaScript execution, using JavaScript to remove the ‘noindex’ in the rendered HTML won’t work. Carefully review that pages with a noindex in the raw HTML are expected not to be indexed. Remove the ‘noindex’ if the pages should be indexed.
- Nofollow Only in Original HTML – Pages that contain a nofollow in the raw HTML, and not in the rendered HTML. This means any hyperlinks in the raw HTML prior to JavaScript execution will not be followed. Carefully review that pages with a nofollow in the raw HTML are expected not to be followed. Remove the ‘nofollow’ if links should be followed, crawled and indexed.
- Canonical Only in Rendered HTML – Pages that contain a canonical only in the rendered HTML after JavaScript execution. Google has said they only process canonicals in the raw HTML, although industry tests have suggested Google can process them in the rendered HTML. Include a canonical link in the raw HTML (or HTTP header) to ensure Google can see it, and avoid relying on the canonical in the rendered HTML only.
- Canonical Mismatch – Pages that contain a different canonical link in the raw HTML to the rendered HTML after JavaScript execution. Google say they only process canonicals in the raw HTML, although industry tests have suggested they do process them in the rendered HTML. However, this can cause conflicting signals and may lead to unwanted behaviour. Ensure the correct canonical is in the raw HTML and rendered HTML to avoid conflicting signals to search engines.
- Page Title Only in Rendered HTML – Pages that contain a page title only in the rendered HTML after JavaScript execution. This means a search engine must render the page to see it. While Google is able to render pages and see client-side only content, consider including important content server side in the raw HTML.
- Page Title Updated by JavaScript – Pages that have page titles that are modified by JavaScript. This means the page title in the raw HTML is different to the page title in the rendered HTML. While Google is able to render pages and see client-side only content, consider including important content server side in the raw HTML.
- Meta Description Only in Rendered HTML – Pages that contain a meta description only in the rendered HTML after JavaScript execution. This means a search engine must render the page to see it. While Google is able to render pages and see client-side only content, consider including important content server side in the raw HTML.
- Meta Description Updated by JavaScript – Pages that have meta descriptions that are modified by JavaScript. This means the meta description in the raw HTML is different to the meta description in the rendered HTML. While Google is able to render pages and see client-side only content, consider including important content server side in the raw HTML.
- H1 Only in Rendered HTML – Pages that contain an h1 only in the rendered HTML after JavaScript execution. This means a search engine must render the page to see it. While Google is able to render pages and see client-side only content, consider including important content server side in the raw HTML.
- H1 Updated by JavaScript – Pages that have h1s that are modified by JavaScript. This means the h1 in the raw HTML is different to the h1 in the rendered HTML. While Google is able to render pages and see client-side only content, consider including important content server side in the raw HTML.
Check out our How To Crawl JavaScript Websites tutorial for more.
How To Perform A Mobile Vs Desktop Parity Audit
Google first started experimenting with a mobile-first index in 2016, before enabling mobile-first indexing for all websites in September 2020.
Mobile-first indexing means Google predominantly uses the mobile version of the content for indexing and ranking. Historically, their index used the desktop version when evaluating the relevance of a page to a user’s query.
This means if you have a website which shows different content to Google’s mobile (smartphone) user-agent than desktop, you may have noticed differences in ranking.
Most websites today are responsive, where the HTML is the same for all devices and the appearance changes based upon screen size – meaning no impact from mobile-first indexing. However, adaptive and mobile-specific websites can serve different HTML and content, which is where parity audits are helpful.
You don’t have to perform a parity audit between mobile and desktop – you can just focus on the mobile version. But often a parity audit can help uncover gaps that can be easily missed, or not fully considered.
To perform a Mobile Vs Desktop parity audit, you’ll need to ensure you’re in database storage mode, perform two separate crawls with mobile and desktop user-agents, and then use crawl comparison and change detection.
1) Crawl The Mobile Site
First, crawl the mobile site using a mobile user-agent. Click ‘Config > User-agent’ to switch to Googlebot Smartphone.
If you have any JavaScript dependencies, use JavaScript rendering mode, select any elements you wish to crawl and compare, such as structured data – and crawl the mobile site.
Many hosts and CDNs block spoofing of Googlebot user-agent strings (and serve a 403 response), so you can alternatively choose Chrome for Android, or another mobile UA.
While we recommend crawling the whole site for completeness, if the website is extremely large and the structure is much the same, you can just check the parity of specific page templates from across the website. Pick one page for each template, and upload them in list mode (‘Mode > List’).
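Before running the two full crawls, it can also be worth a quick manual spot check on a couple of template URLs to confirm the server actually varies its response by user-agent. The sketch below is a rough illustration using generic Chrome mobile and desktop user-agent strings and a placeholder URL – not the exact Googlebot strings the SEO Spider uses.

```python
# A quick spot check: does the server return different HTML for a
# mobile vs desktop user-agent? The UA strings and URL are illustrative.
import hashlib
import requests
from bs4 import BeautifulSoup

URL = "https://www.example.com/category/"  # hypothetical template URL

USER_AGENTS = {
    "mobile": "Mozilla/5.0 (Linux; Android 10; Pixel 4) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/120.0 Mobile Safari/537.36",
    "desktop": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
               "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
}

for device, ua in USER_AGENTS.items():
    resp = requests.get(URL, headers={"User-Agent": ua}, timeout=30)
    soup = BeautifulSoup(resp.text, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else None
    canonical = soup.find("link", rel="canonical")
    print(device, resp.status_code,
          "hash:", hashlib.md5(resp.content).hexdigest()[:8],
          "| title:", title,
          "| canonical:", canonical.get("href") if canonical else None)
```

Different response hashes alone don’t prove an SEO problem – but different status codes, titles or canonicals are exactly the kind of gap the full parity audit will surface.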
2) Crawl The Desktop Site
Next, crawl with a desktop user-agent. Go to ‘Config > User-agent’ and choose Googlebot Desktop, or another desktop user-agent.
Either crawl the whole website, or the chosen page templates in a similar way to the mobile version.
3) Select Crawls To Compare
Go to ‘File > Crawls’, select the mobile and desktop crawls, and then click ‘Select to Compare’.
This will switch you to ‘Compare’ mode.
4) Configure Change Detection
Now click the cog icon at the top of the screen or ‘Config > Compare’. The Change Detection configuration will then appear, which allows you to identify whether specific elements are different, such as page titles, descriptions, headings, word count, internal linking, structured data and more.
Select all the items by clicking ‘Select All’ at the top.
5) Configure URL Mapping
If the mobile version uses separate URLs, rather than dynamically serving different content on the same URL by user-agent, you’ll need to set up URL Mapping so you can compare mobile and desktop equivalents.
In the same configuration (‘Config > Compare’), select ‘URL Mapping’. This will enable you to compare mobile URLs to their desktop equivalents by mapping the previous crawl (the mobile version if you performed that first), against the current crawl (the desktop version).
Now click ‘OK’ and the ‘Compare’ button, where the crawl comparison and change detection analysis will run.
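URL Mapping uses a regex find and replace to translate URLs from one crawl into their equivalents in the other. As a rough illustration, assuming a hypothetical m-dot subdomain for the mobile site, the mapping would look something like the sketch below – the actual pattern you enter depends entirely on your own URL structure.

```python
# Illustration of the kind of regex behind URL Mapping, assuming a
# hypothetical m-dot subdomain for the mobile site.
import re

mobile_urls = [
    "https://m.example.com/",
    "https://m.example.com/category/widgets/",
    "https://m.example.com/product/blue-widget/",
]

# Map each mobile URL to its assumed desktop equivalent
for url in mobile_urls:
    desktop_equivalent = re.sub(r"^https://m\.example\.com",
                                "https://www.example.com", url)
    print(url, "->", desktop_equivalent)
```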
6) Analyse Crawl Overview & Change Detection Tabs
When the analysis is complete, the right-hand overview window will appear with data, highlighting changes for tabs and filters between the mobile and desktop crawls.
Ideally the mobile website would contain the same content as the desktop site, with the same on-page targeting and alignment with descriptive page titles, descriptions, and headings.
There are four columns (and filters on the master window view) that help segment URLs that have changed in tabs and filters, illustrated in the sketch after this list.
- Added – URLs that were in the previous crawl, which have moved into the filter in the current crawl.
- New – New URLs that weren’t in the previous crawl at all, which are in the filter in the current crawl.
- Removed – URLs that were in the filter in the previous crawl, but are not in the filter in the current crawl.
- Missing – URLs that were in the filter in the previous crawl, but were not found at all in the current crawl.
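To make those four definitions concrete, here’s a small sketch deriving each bucket from two hypothetical crawls, where each crawl is the set of URLs found, plus the subset of those URLs that sit in a given filter.

```python
# How the four comparison buckets relate, using hypothetical URL sets.
previous_crawl = {"/a", "/b", "/c", "/d"}   # all URLs in the previous crawl
current_crawl = {"/a", "/b", "/c", "/e"}    # all URLs in the current crawl
previous_filter = {"/a", "/b", "/d"}        # URLs in the filter previously
current_filter = {"/a", "/c", "/e"}         # URLs in the filter now

added = (current_filter - previous_filter) & previous_crawl   # existed before, moved into the filter
new = current_filter - previous_crawl                         # brand new URLs now in the filter
removed = (previous_filter - current_filter) & current_crawl  # still crawled, but left the filter
missing = previous_filter - current_crawl                     # in the filter before, not found now

print("Added:", added)      # {'/c'}
print("New:", new)          # {'/e'}
print("Removed:", removed)  # {'/b'}
print("Missing:", missing)  # {'/d'}
```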
Explore this data to identify differences and gaps in the mobile Vs desktop crawls, and whether they might be problematic. This might include anything from missing pages or images, to differences in site structure, internal linking and indexability.
Scroll down in the right-hand Overview tab to analyse the ‘Change Detection’ tab for matched URLs between the mobile and desktop crawls. This tab will alert you to differences in the actual content of elements – which is not covered by the usual tabs and filters.
You can click on the filters to see where like-for-like URLs have differences, such as page titles, word count or crawl depth. The current crawl in the examples below is mobile, while the previous is desktop.
These types of differences in the mobile version of the website could be extremely problematic for ranking.
Google provide a useful list of common errors that can stop sites from being enabled for mobile-first indexing, or could cause a drop in ranking after a site is enabled for mobile-first indexing.
Check out our How To Compare Crawls tutorial for more.
How To Perform A Live Vs Staging Parity Audit
It’s a time saver to be able to compare live vs staging websites to uncover differences and problems before changes are published to the live site.
This process is similar to the mobile vs desktop parity audit, where you crawl both websites separately, before using URL Mapping, which enables two different URL structures to be compared against their equivalents.
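As a rough illustration of the same idea outside the tool, the sketch below maps staging URLs onto their assumed live equivalents and diffs page titles between two exported crawls. The file names, hostnames and column names (‘Address’, ‘Title 1’) are assumptions based on typical SEO Spider CSV exports and may differ depending on your version and configuration.

```python
# A rough sketch: diff page titles between live and staging crawl exports.
# File names, hostnames and column names are assumptions and may differ.
import pandas as pd

live = pd.read_csv("internal_html_live.csv")        # hypothetical export of the live crawl
staging = pd.read_csv("internal_html_staging.csv")  # hypothetical export of the staging crawl

# Map staging URLs onto their assumed live equivalents
staging["Mapped Address"] = staging["Address"].str.replace(
    "https://staging.example.com", "https://www.example.com", regex=False
)

merged = live.merge(
    staging, left_on="Address", right_on="Mapped Address",
    suffixes=(" (live)", " (staging)"), how="outer", indicator=True
)

# URLs present on one side only, and titles that differ for matched URLs
print(merged[merged["_merge"] != "both"][["Address (live)", "Mapped Address"]])
mismatch = merged[(merged["_merge"] == "both") &
                  (merged["Title 1 (live)"] != merged["Title 1 (staging)"])]
print(mismatch[["Address (live)", "Title 1 (live)", "Title 1 (staging)"]])
```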
Check out our How To Use The SEO Spider In A Site Migration tutorial and the Compare Live Vs Staging With URL Mapping section for more on this process.
Summary
This tutorial should help you better understand how to use the SEO Spider and work more efficiently when performing parity audits.
Please read our Screaming Frog SEO Spider FAQs and full user guide for more information on the tool.
If you have any queries or feedback on how to improve the SEO Spider then get in touch with our team via support.