If I ask you why you built a page on your website, what would your response be?
- For generating revenue…
- for educating the customers…
- for building a brand…
- or for all of them maybe, right?
But what if I tell you that there may be some pages on your website that are doing none of those things?
In fact, they may not even be getting more than a handful of visitors, only consuming your server’s bandwidth.
Such pages are often orphan pages, meaning no other page on your site links to them, and in this article we’re pulling back the curtain to reveal how to find orphan pages using Google Analytics.
Frequently Asked Questions
Can Google Find Orphan Pages?
If you list them in your sitemap, then yes.
Otherwise Googlebot — the crawler of Google search — may struggle to find orphan pages on your site.
Eventually, however, Google can discover almost all pages that are published on a website.
Do Orphan Pages Hurt SEO Rankings?
This depends upon which page’s rankings you’re talking about.
While orphan pages do not affect the ranking of any of your existing web pages, they themselves rank much lower than they deserve.
What is the Best Tool for Finding Orphan Pages on a Website?
There are many tools available in the market that claim to find orphan pages, but none of them is fully reliable because they work like an ordinary web crawler, discovering pages by following links.
If Google’s crawler can’t find a page, other crawlers will also have limited success finding it.
The only real solution that can be used to discover such pages is Google Analytics.
What You’ll Need to Find Orphan Pages
You’ll need 3 things to find the orphan pages on your website:
- Google Analytics installed and working properly on your website
- Screaming Frog SEO Spider
- Excel/Google Sheets
How to Find Orphan Pages: Step-by-Step Instructions
1. Identify Your Crawlable Pages
Firstly you’ll have to prepare a list of all your pages that can be accessed by Google’s crawler.
To do this, you’ll need your own crawler.
You can use Screaming Frog for this purpose.
Download it from its official website and install it on your computer.
Once installed, launch it and enter your domain name to start crawling.
But make sure that the crawler’s configuration allows it to crawl indexable pages only.
It should not crawl noindexed pages or pages that are blocked for search engines in your robots.txt file.
After crawling is complete, you’ll have a list of all easily accessible and indexable pages on your website.
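If you prefer to filter the crawl export outside the crawler, the idea can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the export is a CSV with `Address` and `Indexability` columns, which may not match your file, and the sample data and the `indexable_urls` helper are made up for illustration.

```python
import csv
import io


def indexable_urls(crawl_csv_text):
    """Return the URLs that a crawl export marks as indexable.

    Assumption: the export has 'Address' and 'Indexability' columns.
    Rename these keys to match the headers in your own file.
    """
    reader = csv.DictReader(io.StringIO(crawl_csv_text))
    return [
        row["Address"].strip()
        for row in reader
        if row["Indexability"].strip().lower() == "indexable"
    ]


# Tiny made-up sample standing in for a real crawl export.
sample_crawl = """Address,Indexability
https://example.com/,Indexable
https://example.com/blog/,Indexable
https://example.com/private/,Non-Indexable
"""

crawlable = indexable_urls(sample_crawl)
print(crawlable)
```

With a real export, you would read the file's text from disk and pass it to the same function.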
2. Get a List of URLs from Google Analytics
Next, it’s time to get a list of all URLs on your website.
This can be done best by Google Analytics, as it tracks the traffic and pageviews of almost every URL on your website.
It doesn’t care whether the page has been visited once or thousands of times — if it has been visited ever, its record remains in Google Analytics.
In the left sidebar of your Google Analytics dashboard, navigate to Behavior >> Site content >> All pages.
This should bring you to a page that provides a list of all URLs on your website.
Since orphan pages are difficult to find, they will have far fewer pageviews than the indexable pages on your website.
It’s quite possible that some of them have been visited only once, so in order to find them we need to play a little with the dates.
Change the date range in the top right corner and set the starting date close to the date when Google Analytics was first installed on the site.
Now sort your pages by the number of pageviews they have received, in ascending order, and you’ll have a near-ready list of candidate orphan URLs on your website.
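The same ascending sort can be reproduced on exported data. The sketch below assumes each row is a (path, pageviews) pair, and the sample numbers and the `lowest_traffic_pages` helper are invented for illustration. Low pageviews only flag candidates; a page still has to be checked against the crawl list before you call it an orphan.

```python
def lowest_traffic_pages(rows, limit=10):
    """Sort (path, pageviews) pairs by pageviews, lowest first,
    and keep the first `limit` entries."""
    return sorted(rows, key=lambda row: row[1])[:limit]


# Made-up sample rows; real data would come from your analytics export.
analytics_rows = [
    ("/", 5400),
    ("/blog/", 1200),
    ("/old-landing-page/", 1),
    ("/pricing/", 800),
    ("/test-draft/", 3),
]

candidates = lowest_traffic_pages(analytics_rows, limit=2)
print(candidates)
```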
3. Export All URLs
Now we shall compare this list of all URLs extracted by Google Analytics to our list of crawlable and indexable URLs that we got in the first step using Screaming Frog.
To do so, first change the number of rows shown in Google Analytics by clicking the ‘Show rows’ dropdown menu and selecting 5,000.
This will load data of all URLs in up to 5,000 rows.
If you have more links than that, you may have to repeat the process.
Wait for a while until Google Analytics provides you with the data.
Once it’s done, click on the Export option in the top right corner and export the URLs to a Google Sheets or Excel file.
4. Prepare the URLs for Comparison
Paste the indexable URLs from Screaming Frog into a sheet, then paste the URLs exported from Google Analytics into the same sheet.
The two columns of URLs should be adjacent to each other.
The URLs exported from Google Analytics are not full URLs; they are paths relative to your domain, such as /blog/.
We need to fix this before they can be of any use.
Insert a new column between both columns and paste the address of your homepage in that column, without a trailing slash, since the exported paths already begin with one.
Then select the cell in which you pasted the home address and drag its fill handle down to copy it into as many rows as you have Google Analytics URLs.
Once you have the home address in as many rows as exported URLs, apply the concat() formula in a new column to merge the content of both columns in a row, for example =CONCAT(B2,C2) if the home address is in column B and the exported path is in column C.
See the screenshot below for a better understanding:
Once it has been done for the first row, you can drag the crosshair again to do it for all the rows.
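Now copy the merged column and paste it back as values only, so the URLs no longer depend on the formula.
Then delete the two columns between your indexable URLs and these final URLs that you just created by merging the content of the two columns.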
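The URL-building steps above can also be sketched in Python. This is a minimal sketch, not the only way to do it: the `to_full_urls` helper and the example.com addresses are hypothetical, and it assumes the exported values are paths that should sit directly under your home address.

```python
def to_full_urls(home_url, paths):
    """Join a home address with exported paths.

    Slashes are trimmed on both sides of the join so the result
    never contains a doubled or missing slash.
    """
    base = home_url.rstrip("/")
    return [base + "/" + path.lstrip("/") for path in paths]


# Made-up paths standing in for a Google Analytics export.
exported_paths = ["/", "/blog/", "/old-landing-page/"]

full_urls = to_full_urls("https://example.com/", exported_paths)
print(full_urls)
```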
5. Compare and Identify Orphan URLs
Finally, apply the following formula in a new column:
=match(D2,$A$2:$A$11,0)
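The formula looks up the URL in D2 in the list of crawlable URLs (A2:A11 here) and returns its position in that list, or an error if the URL is not found. Adjust D2 to the first cell of your merged URL column and extend the range to cover your whole crawlable list. The result looks like the one shown below: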
Then you need to select the cell in which you applied the formula and drag down to apply it to all the rows.
The output for all web pages that are not in the list of crawlable URLs will be #N/A.
These are your orphan URLs and pages.
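For larger sites, the same comparison can be sketched in Python with a set lookup in place of the spreadsheet formula. This is a sketch under stated assumptions: the `normalize` and `find_orphans` helpers and the sample URLs are hypothetical, and ignoring trailing slashes is an extra normalization step that the spreadsheet formula does not perform.

```python
def normalize(url):
    """Trim whitespace and any trailing slash so that
    '/blog' and '/blog/' count as the same page (an assumption)."""
    return url.strip().rstrip("/")


def find_orphans(analytics_urls, crawlable_urls):
    """Return analytics URLs that are missing from the crawl list."""
    crawlable = {normalize(url) for url in crawlable_urls}
    return [url for url in analytics_urls if normalize(url) not in crawlable]


# Made-up sample lists standing in for the two columns in the sheet.
crawlable_urls = ["https://example.com/", "https://example.com/blog/"]
analytics_urls = [
    "https://example.com/",
    "https://example.com/blog",
    "https://example.com/old-landing-page/",
]

orphans = find_orphans(analytics_urls, crawlable_urls)
print(orphans)
```

Every URL this returns corresponds to a row that would show #N/A in the sheet, apart from the trailing-slash cases that the normalization step merges.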
Similar Tutorials to Check Out
- How to Rank on Google: This tutorial explains how you can get your web pages to rank in Google’s SERPs for your desired keywords. After identifying your orphan pages, you can follow the steps outlined in this tutorial to make them rank.
- How to Find Cached Pages: This tutorial explains how to find web pages on your site that have been cached by Google and saved on its servers for easier distribution to searchers.
- How to Check Which Pages Are Indexed by Google: This tutorial can help you identify the pages that are indexed and the ones that are not indexed by Google. It can help you identify issues that may be blocking your indexable pages from being indexed.
Wrapping Up
Orphan pages do no good either to your visitors or to search engine crawlers.
They also don’t do any good to your server bandwidth.
So it’s better to find them and either eliminate them (if they are unnecessary) or turn them into normal website pages that are linked internally.
We hope this tutorial did a fine job of explaining how you can find such pages on your website.
Now if you liked it, do share it on your social media profiles.
Also, if you have any questions or thoughts on your mind, share them in the comments.