How to Use Google Search Console for Enterprise & Ecommerce SEO
Google Search Console (GSC) is a powerful tool for supporting technical, on-page and off-page SEO. This reference guide focuses exclusively on using Google Search Console to optimize established enterprise-level and ecommerce websites. If you do not already have an active Google Search Console account, visit Google Search Central: Getting started for beginners for a tutorial on setting up a Google Search Console account and verifying your domain name.
Below we’ll review and explain all relevant Google Search Console reports and identify what to look for when analyzing site performance.
Overview & URL Inspection
Security & Manual Actions
Legacy tools and reports
Links and settings
The overview page provides a summary of high-level search performance in Google. If your website employs AMP pages or has other search enhancements, performance information relating to these features will appear here.
URL Inspection > Products
Search Results > Search Type
Search Results > Date Range – Compare
Search Results > Date Range – Filter
Search Results > Queries
Search Results > Pages
Search Results > Countries
Search Results > Devices
Search Results > Dates
Search Results > New
Coverage – Error
Coverage – Server error (5xx)
Coverage – Server error (5xx) URL Inspection
The Coverage “Valid” report shows the number of web pages Google has successfully indexed. As your site grows, you should see the number of valid indexed pages reported gradually increase. Sudden drops or spikes in indexation should be investigated.
There are two types of indexed URLs that fall within the “Valid” status: “Submitted and indexed” and “Indexed, not submitted in sitemap”.
1. Submitted and indexed URLs have been submitted via an XML sitemap and indexed by Google.
2. Indexed, not submitted in sitemap URLs have been found and indexed by Google but were not discovered in XML sitemap files.
Action required: If “Indexed, not submitted in sitemap” URLs should be indexed, then add them to your sitemap files. If these URLs should not be indexed, implement a meta robots “noindex” directive or exclude them from crawling by disallowing them in your robots.txt if crawl budget is a concern. (Note: For the meta noindex directive to work, the URL should not be blocked from crawling via the robots.txt file. For a page to be noindexed, it must first be crawled by Google.)
If all “Valid” URLs in the coverage report are listed as Indexed, not submitted in sitemap then (1) you do not have a sitemap, (2) your sitemap cannot be found or crawled by Google, or (3) important URLs are missing from the sitemap. Make sure you have a valid XML sitemap free from errors and submit it directly to Google via the GSC sitemap submission tool. Alternatively, make sure your XML sitemap is correctly referenced in your robots.txt file and Google should find it the next time it crawls your site.
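One way to find indexed URLs that are missing from a sitemap is to compare a list of indexed URLs against the `<loc>` entries in the sitemap file. The following is a minimal sketch, assuming you have already exported the indexed URLs from GSC into a Python list and loaded the sitemap XML as a string. The function names and the example sitemap are illustrative, not part of any Google tool.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content used for illustration only.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.yoursite.com/</loc></url>
  <url><loc>https://www.yoursite.com/products/</loc></url>
</urlset>"""


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> values found in a urlset sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def missing_from_sitemap(indexed_urls, sitemap_xml):
    """Return indexed URLs that do not appear in the sitemap, sorted."""
    return sorted(set(indexed_urls) - sitemap_urls(sitemap_xml))


if __name__ == "__main__":
    indexed = ["https://www.yoursite.com/", "https://www.yoursite.com/sale/"]
    print(missing_from_sitemap(indexed, EXAMPLE_SITEMAP))
```

This sketch handles a single urlset file only. A sitemap index that points to several child sitemaps would need each child file parsed in turn.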
Coverage “Valid with warning”
The “Valid with warning” report shows URLs that Google has indexed but that have problems. Left unchecked, these problems may lead to indexing issues.
There are two types of “Valid with warning” status within GSC index coverage reports: “Indexed, though blocked by robots.txt” and “Indexed without content”.
1. Indexed, though blocked by robots.txt are URLs that have been indexed by Google even though they are blocked from crawling by the robots.txt file (e.g. yoursite.com/robots.txt). Typically, Google would not index URLs that are blocked from crawling. When it does, it’s usually because it found internal or external links to these URLs.
Action required: If these URLs are not intended to be indexed, add the meta robots noindex directive (<meta name="robots" content="noindex">) to the <head> of each page to be noindexed, and remove the robots.txt disallow for those URLs so that Google can crawl the pages and see the directive.
2. Indexed without content are URLs that have been indexed, but Google is unable to find any content on these pages. Possible reasons for this status include cloaking, Google being unable to fully render the page, page content in a format Google doesn’t index, or the page actually being empty.
Action required: First make sure that the page actually has content. If the page is empty, and is left empty, Google will likely apply a “soft 404” to the URL and the page will eventually be deindexed. If the page has content, use the Google Search Console URL Inspection Tool to determine what Google is seeing when it crawls the URLs. If the URL inspection tool indicates that the URL is in good standing and indexable, simply click the “Request Indexing” button on the details page.
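For the “Indexed, though blocked by robots.txt” status, it can help to confirm which URLs your robots.txt rules actually block before adding a noindex directive, since Google cannot see a noindex on a page it is not allowed to crawl. Below is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content and URLs are made up for illustration, and the standard-library parser does not replicate every detail of Google's own matching (for example, wildcard patterns), so treat the result as a first check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used for illustration only.
EXAMPLE_ROBOTS = """User-agent: *
Disallow: /checkout/
"""


def is_crawl_blocked(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt rules disallow crawling the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://www.yoursite.com/checkout/cart",
        "https://www.yoursite.com/products/",
    ):
        print(url, "blocked" if is_crawl_blocked(EXAMPLE_ROBOTS, url) else "crawlable")
```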
When a URL is excluded it means the affected page will not be indexed by Google. Consequently, the page will not appear in Google search results. The Coverage “Excluded” status report provides key data for performing technical SEO audits and identifying configuration issues.
Some of the most common SEO uses of the Coverage “Excluded” report include:
- Identify URLs that have crawling and indexing issues not identified by SEO crawling software
- Prioritize technical optimization efforts by identifying which issues are affecting the most pages
- Identify pages being attributed a Soft 404 by Google due to content quality issues
- Check whether the content duplication scenarios addressed by canonical tags are being overlooked by Google
The “Excluded” status within Google Search Console contains the following types:
- Alternate page with proper canonical tag
- Blocked by page removal tool
- Blocked by robots.txt
- Blocked due to access forbidden (403)
- Blocked due to other 4xx issue
- Blocked due to unauthorized request (401)
- Crawled – currently not indexed
- Discovered – currently not indexed
- Duplicate without user-selected canonical
- Duplicate, Google chose different canonical than user
- Duplicate, submitted URL not selected as canonical
- Excluded by ‘noindex’ tag
- Not found (404)
- Page removed because of legal complaint
- Page with redirect
- Soft 404
Alternate page with proper canonical tag
These are pages that Google considers duplicates of other pages, and are correctly canonicalized to the canonical (preferred) version of the page.
Action required: If pages that are canonicalized should not be canonicalized, change the canonical tag on each page to make it self-referencing. For example, the page https://www.yoursite.com would have the canonical tag <link rel="canonical" href="https://www.yoursite.com">.
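To audit canonical tags in bulk, you can extract the canonical URL from each page's HTML and compare it with the page's own URL. The sketch below uses Python's standard `html.parser`. The class and function names are hypothetical, and the comparison only ignores a trailing slash, so a real audit would likely need fuller URL normalization.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_of(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_self_referencing(page_url, html):
    """Return True if the page's canonical tag points back to the page itself."""
    canonical = canonical_of(html)
    return canonical is not None and canonical.rstrip("/") == page_url.rstrip("/")


if __name__ == "__main__":
    html = '<html><head><link rel="canonical" href="https://www.yoursite.com/"></head></html>'
    print(is_self_referencing("https://www.yoursite.com", html))
    print(is_self_referencing("https://www.yoursite.com/shoes?color=red", html))
```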
Check the Coverage “Excluded” status report regularly. If the report is showing a large growth in URLs with the “Alternate page with proper canonical tag” status while your website hasn’t increased in indexable pages, this could be a sign that Google is not correctly canonicalizing pages due to poor internal link structure.
If the canonicalization logic for pages in this report appears correct, review URL patterns. You may find problematic AMP pages, page variants and URLs with UTM tags. Typically, these page types and code elements are fine, but occasionally they can cause issues.
A large site, such as an eCommerce site, that generates URLs for a variety of product attributes (color, size, etc.) can create massive numbers of URLs per product, leading to page variant overload. You don’t want to overload Google with too many page variants, and you may consider making some of these pages inaccessible for indexing in Google.
Sometimes when UTM tags are added to internal links, things can go sideways. UTM tags can dilute the transfer of page authority and skew Google Analytics data. Have UTM tags been implemented on your site? If so, you may want to audit your UTM implementation to ensure it isn’t problematic for SEO.
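A UTM audit can start by listing internal links that carry `utm_` parameters and working out the clean URL each one should point to instead. The following is a small sketch using Python's standard `urllib.parse`. The host name, example links and function names are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def has_utm(url):
    """Return True if the URL's query string contains any utm_* parameter."""
    query = urlsplit(url).query
    return any(
        name.lower().startswith("utm_")
        for name, _ in parse_qsl(query, keep_blank_values=True)
    )


def strip_utm(url):
    """Return the URL with all utm_* query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.lower().startswith("utm_")
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def internal_links_with_utm(links, site_host):
    """Return the links on site_host that carry UTM parameters."""
    return [u for u in links if urlsplit(u).hostname == site_host and has_utm(u)]


if __name__ == "__main__":
    links = [
        "https://www.yoursite.com/shoes?utm_source=home&color=red",
        "https://www.yoursite.com/shoes?color=red",
        "https://partner.example/?utm_source=yoursite",
    ]
    for link in internal_links_with_utm(links, "www.yoursite.com"):
        print(link, "->", strip_utm(link))
```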
If you have a website with a few thousand indexable pages, but you’re seeing hundreds of thousands of pages with the “Alternate page with proper canonical tag” status, then you should consider implementing disallow rules in your robots.txt file to prevent Google from crawling these URLs. For larger websites, in excess of 10,000 pages, blocking Google from unnecessarily crawling URLs that do not provide value can save crawl budget.
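Before writing disallow rules, it can be useful to see which query parameters generate the most URL variants in the report. The sketch below counts how many URLs use each parameter name, assuming you have exported the affected URLs into a Python list. The example URLs and the function name are hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def parameter_counts(urls):
    """Count how many URLs contain each query parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # Use a set so a parameter repeated within one URL is counted once.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


if __name__ == "__main__":
    exported = [
        "https://www.yoursite.com/shoes?color=red&size=9",
        "https://www.yoursite.com/shoes?color=blue",
        "https://www.yoursite.com/shoes",
    ]
    for name, total in parameter_counts(exported).most_common():
        print(name, total)
```

Parameters that appear on a very large share of excluded URLs are the likeliest candidates for a robots.txt disallow rule. Whether to block them remains a judgment call about the value of those pages.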
Blocked by page removal tool
This status indicates these URLs are not appearing in Google search results because a URL removal request has been submitted through Google Search Console. These URLs will be hidden for up to 90 days. After 90 days, these URLs may reappear in search results.
Action required: if removed pages should not reappear in Google’s index, they should be blocked permanently by adding the meta robots “noindex” tag to each page. You’ll need to make sure these pages are recrawled before the 90-day removal period expires to ensure they do not reappear in Google’s index.
Blocked by robots.txt
URLs that are disallowed within the robots.txt file are not crawled by Google and consequently may not be indexed. If these URLs have been indexed, they would be listed under “Indexed, though blocked by robots.txt” within the Coverage report.
Action required: make sure there aren’t any URLs being blocked by the robots.txt file that should be indexed. Note: disallowing crawling of a URL in the robots.txt file does not guarantee that the URL will not be indexed. Use the meta robots noindex tag for URLs that you don’t want indexed by Google.
Blocked due to access forbidden (403)
Googlebot was denied access to these URLs and received a 403 HTTP response code.
Action required: make sure that Google has access to all web pages on your site that you want crawled and indexed. If you’re unsure why a 403 access denied response code is being generated when Google attempts to access certain web pages, you may need to whitelist Google’s IP addresses on your server.
Blocked due to other 4xx issue
This status indicates that Google was unable to access these URLs due to a 4xx response code other than 401, 403 or 404.
Action required: attempt to debug the page using the URL inspection tool to see if you can identify what is causing the behavior. Once the issue has been resolved, if you want these URLs indexed, make sure they are included in your XML sitemap.
Blocked due to unauthorized request (401)
These URLs generated a 401 HTTP (not authorized) response when Google attempted to crawl them.
Action required: If the URLs in question are legitimate consumer-facing pages that should be indexed, either remove any authorization requirements for accessing the pages, or allow Googlebot to access these pages after verifying its identity. Note: The 401 HTTP response code can be verified by visiting the page in incognito mode.
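When triaging crawl logs or crawler exports, it can help to sort HTTP response codes into the same groups the Coverage report uses. The function below is a simplified sketch based only on the status descriptions in this guide. Google’s own handling of individual codes (for example 410 or 429) may differ, and the function name is hypothetical.

```python
def coverage_bucket(status_code):
    """Map an HTTP status code to the Coverage status it most likely falls under."""
    if status_code == 401:
        return "Blocked due to unauthorized request (401)"
    if status_code == 403:
        return "Blocked due to access forbidden (403)"
    if status_code == 404:
        return "Not found (404)"
    if 400 <= status_code < 500:
        # Any remaining 4xx code, per the description in this guide.
        return "Blocked due to other 4xx issue"
    if 500 <= status_code < 600:
        return "Server error (5xx)"
    return "Not an error status"


if __name__ == "__main__":
    for code in (200, 401, 403, 404, 405, 503):
        print(code, coverage_bucket(code))
```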
Crawled – currently not indexed
These URLs have been discovered and crawled by Google, but have not yet been indexed. The URLs may have been recently crawled, and are due to be indexed by Google. Alternatively, Google is aware of the URLs but has decided they do not merit being indexed — often due to a lack of quality. For instance, the URLs may contain thin content, duplicate content or have few internal links pointing to them.
Action required: if any of these URLs are important and should be indexed, take steps to determine how to get Google to pay attention to them. This may include improving internal linking, removing duplicate content or adding high-quality content to these pages.
Discovered – currently not indexed
Google has found these URLs, but they have not yet been crawled or indexed. This typically means the URLs are queued to be crawled at a later date for indexing consideration.
Action required: you’ll want to monitor the number of URLs appearing as “Discovered – currently not indexed”. If the number continues to grow, you may have a crawl budget issue where your site is requiring more crawling than Google is willing to provide. If your site has low domain authority, is slow, or at times unavailable, this can also lead to crawl issues.
Note: sometimes this report lags actual results. Verify that URLs are not already indexed using the URL inspection tool. Also, Google may get bogged down crawling low-value pages (“crawl traps”) at the expense of crawling higher-quality, more important pages. Make sure that your site is structurally sound and Google isn’t getting stuck crawling a large number of irrelevant URLs (e.g. auto-generated URLs, filters, or other features that can produce an infinite number of URLs).
Duplicate without user-selected canonical
These are duplicate pages that are not canonicalized to a preferred version. Neither is considered the preferred version by Google and they have consequently been removed from Google’s index.
Action required: If in fact these URLs should not be indexed, then simply add the meta robots “noindex” tag to the header of each page. Alternatively, duplicate URLs should be consolidated by canonicalizing to a preferred version of the page by adding the rel="canonical" link tag. (Note: use the URL inspection tool in Google Search Console to see if Google shows the canonical version for the duplicate URL.)
If there are a large number of URLs with the “Duplicate without user-selected canonical” status, this may be indicative of a sitewide issue, such as broken canonical tag logic, a broken header, or canonical tags being inadvertently modified.
Duplicate, Google chose different canonical than user
Google is selecting a different canonical than the one indicated in the canonical tag. In short, Google believes your selection for the canonical version is incorrect and has chosen to ignore it. This is a somewhat common occurrence on multi-language sites with similar content and on eCommerce sites where featured product variants are essentially identical yet each has a self-referencing canonical tag.
Action required: If Google isn’t respecting your indexing preferences, you need to understand why and solve the problem. Use the URL inspection tool in Google Search Console to inspect these pages, learn which URL Google has selected as the preferred version and see if it makes sense. If Google’s selection does make sense, canonical logic needs to be modified in line with Google’s selection. If it does not, page content and internal linking need to be improved to convince Google otherwise.
Duplicate, submitted URL not selected as canonical
Indicates that URLs submitted in XML sitemap files do not have a canonical URL set. In essence, Google views these URLs as duplicates of other URLs and is canonicalizing them to the Google-preferred version. This status is similar to the “Duplicate, Google chose different canonical than user” status discussed above. However, these URLs have (1) been submitted via XML sitemaps to Google for indexing and (2) do not have defined canonical URLs.
Action required: add a canonical tag to each “Duplicate, submitted URL not selected as canonical” URL pointing to the canonical chosen by Google, as reported by the URL inspection tool. If you believe Google is mistaken and that these URLs are the preferred version to index, add a self-referencing canonical to each URL, improve page content and point more internal links to the page.
Excluded by ‘noindex’ tag
The URLs are being blocked from being indexed by Google because of the noindex directive included as a meta tag or header in the HTTP response.
Action required: if URLs should be indexed by Google, then remove the noindex directive. Then use the URL inspection tool in Google Search Console to request URL indexing. (Note: the best way to make pages inaccessible to Google is to implement HTTP authentication.)
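Because a noindex directive can arrive either as a meta tag or as an HTTP response header, a quick check needs to look in both places. The sketch below takes a page's HTML and its response headers (as a plain dict) and reports whether either carries noindex. It is deliberately simplified: it does not handle user-agent-prefixed header values such as `googlebot: noindex` or the `none` directive, and the names used are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if tag == "meta" and name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html, headers):
    """Return True if the meta tags or the X-Robots-Tag header contain noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    header_value = ""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_value = value
    header_directives = [d.strip().lower() for d in header_value.split(",")]
    return "noindex" in finder.directives or "noindex" in header_directives


if __name__ == "__main__":
    print(has_noindex('<meta name="robots" content="noindex, follow">', {}))
    print(has_noindex("<p>Hello</p>", {"X-Robots-Tag": "noindex"}))
    print(has_noindex('<meta name="robots" content="index, follow">', {}))
```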
Not found (404)
The “Not found (404)” status indicates the URL wasn’t included in an XML sitemap, but Google somehow found the page and it returned an HTTP 404 status code, indicating that the URL was not found on the server. This typically occurs when URLs that existed in the past are still being linked to from other websites.
Action required: make sure all 404’d URLs in this report in fact should not be in Google’s index. If you find important URLs in this list, restore the page or 301 redirect the URL to the most relevant page on your website. (Note: 301 redirecting important URLs will pass incoming link signals and “link juice” to relevant website pages. When an important URL is simply 404’d, any external links to your website through the 404’d page are lost, as is the SEO benefit of these links.)
Page removed because of legal complaint
Due to a valid legal complaint or motion, these URLs were removed from Google’s index.
Action required: make sure that the status is correct for each URL in this report.
Page with redirect
These URLs are being redirected to another page, and are therefore not being indexed by Google.
Soft 404
The Soft 404 status indicates that the page is not returning an HTTP 404 status code but Google nonetheless considers the page to be a 404 page. The Soft 404 status is often assigned to pages that (1) show a “Page can’t be found” message or (2) are considered very low quality. For example, an empty category page on an eCommerce website may be assessed as a soft 404 by Google.
Action required: if there are important pages in the Soft 404 report, restore their contents, improve overall page quality, or 301 redirect the page to the most relevant alternative URL.
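To flag likely soft 404 candidates before Google reports them, you can look for pages that return a 200 status code but either show a “not found” style message or contain very little text. The heuristic below is only a rough sketch. The phrases and the 50-word threshold are arbitrary assumptions chosen for illustration and are not how Google itself detects soft 404s.

```python
import re

# Hypothetical phrases that often appear on "not found" style pages.
NOT_FOUND_PHRASES = ("page can't be found", "page not found", "no products found")


def looks_like_soft_404(status_code, body_text, min_words=50):
    """Return True if a 200 response looks like an error page or a near-empty page."""
    if status_code != 200:
        # A real 404 or other error code is not a *soft* 404.
        return False
    text = body_text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    if any(phrase in text for phrase in NOT_FOUND_PHRASES):
        return True
    return len(re.findall(r"\w+", text)) < min_words


if __name__ == "__main__":
    print(looks_like_soft_404(200, "Sorry, this page can't be found."))
    print(looks_like_soft_404(404, "Sorry, this page can't be found."))
    print(looks_like_soft_404(200, "word " * 200))
```

Pages flagged this way still need manual review, since a short page is not necessarily a low-quality one.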
Sitemaps > /sitemap.xml
Removals New Request
Core Web Vitals