Faceted Navigation Explained & Handling Facets
What is faceted navigation
Here I'll explain what faceted navigation is, the issues it causes and how to handle facets so that your SEO efforts aren't impeded.
Faceted navigation, sometimes referred to as filtered navigation, is most often found on eCommerce websites that showcase a large range of products across multiple category and sub-category pages.
The faceted navigation displays the various options, or attributes, available for the product range on the page. It usually sits on the left-hand side of a category or sub-category page, though it can appear on any page.
Common attributes include price, colour and brand, along with any other attribute relevant to the products. Alongside eCommerce websites, job boards and flight booking websites usually provide faceted navigation too. Wherever there are varied options for a product or service, faceted navigation helps the visitor drill down quickly to the one they wish to use or purchase.
Handling faceted navigation
Faceted navigation allows us to filter the listings on a website’s category or sub-category page by the listed products’ most common attributes.
What the page does when a filter is clicked varies, but common handling consists of the following actions:
- When a new filter is clicked a new page loads up in the browser
The above four faceted navigation examples are the most common, but how you choose to handle faceted navigation may be dictated by the platform used to build the website.
If you have the option to choose how facets are applied, it’s worth noting that if the product range offers a lot of choice, several filters may be clicked at once. If this is the case, it’s worth looking at the third option. Options 1 and 2 follow a similar pattern when executed, though option 4 generates a whole new URL.
URL handling is key to successful SEO in this sphere of eCommerce optimisation. When the above facets are applied, the URL handling may vary, but one of the following will usually occur:
- Nothing happens and the listings update with no change to the URL
- A hash appends to the URL - #color=red
- Parameters are appended to the end of URL - ?color=red&brand=iphone
- A new URL is created: if you go from /jumpers/ and hit blue in the faceted navigation, you land on /jumpers/blue
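To make the four behaviours above easier to spot on a real site, here is a minimal Python sketch that compares a category URL with the URL you land on after clicking a facet. The function name `classify_facet_url` and the example.com URLs are my own assumptions for illustration, not part of any platform.

```python
from urllib.parse import urlsplit


def classify_facet_url(category_url: str, facet_url: str) -> str:
    """Return which of the four URL behaviours a facet click produced."""
    base = urlsplit(category_url)
    new = urlsplit(facet_url)
    if new.path != base.path:
        return "new-path"      # e.g. /jumpers/ -> /jumpers/blue
    if new.query != base.query:
        return "parameters"    # e.g. ?color=red&brand=iphone
    if new.fragment != base.fragment:
        return "hash"          # e.g. #color=red
    return "unchanged"         # listings updated with no URL change


if __name__ == "__main__":
    category = "https://www.example.com/jumpers/"
    for clicked in (
        "https://www.example.com/jumpers/",
        "https://www.example.com/jumpers/#color=red",
        "https://www.example.com/jumpers/?color=red&brand=iphone",
        "https://www.example.com/jumpers/blue",
    ):
        print(clicked, "->", classify_facet_url(category, clicked))
```

The hash case is worth separating out because the fragment is never sent to the server, so it behaves very differently from parameters or a new path.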
Facets & the SEO problem
The common problem with faceted navigation stems from the sheer number of URLs it generates. If these generated URLs are not handled properly, they can lead to duplicate content issues, index bloat or crawling issues.
Let’s look at these issues in more detail below:
Duplicate content will arise if boilerplate content blocks (similar or identical content areas) are left on the pages displayed after an attribute has been selected in the faceted navigation. Say, for example, you are looking for a printer on a main category page and then select a particular brand. If a content block exists on the main category page, you don't want it duplicated on the page created when the attribute is selected, as the content should reflect the selection made.
Issues arise when multiple selections can be made from the facets and each results in a unique URL. The various selections can create multiple URL versions, and if the same content blocks are output on each new URL, you will have a lot of pages with duplicated content.
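To give a feel for how quickly those URL versions multiply, here is a rough Python sketch that enumerates every parameterised URL for a small, made-up set of facets. The facet names, values and example.com URL are assumptions for illustration, and the sketch only allows one value per facet, so a site with multi-select filters or inconsistent parameter ordering would generate even more.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical facets for a single category page.
FACETS = {
    "color": ["red", "blue", "green"],
    "brand": ["acme", "globex"],
    "size": ["s", "m", "l"],
}


def facet_urls(base_url: str, facets: dict[str, list[str]]) -> list[str]:
    """List every URL produced by choosing at most one value per facet."""
    # Each facet is either left unset (None) or set to one of its values.
    choices = [[None] + values for values in facets.values()]
    urls = []
    for combo in product(*choices):
        params = [(name, value) for name, value in zip(facets, combo) if value is not None]
        if params:  # skip the combination where no facet is selected
            urls.append(base_url + "?" + urlencode(params))
    return urls


if __name__ == "__main__":
    urls = facet_urls("https://www.example.com/dresses/", FACETS)
    print(len(urls), "facet URLs from", len(FACETS), "facets")
```

Three small facets already produce 47 extra URLs for one category page, each potentially carrying the same content block.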
The canonical approach, detailed further below, is the ideal way of handling this issue, but for it to be fully effective the pages must be near-perfect duplicates. If not, Google may ignore the canonical, leading to cannibalisation issues, and the pages will struggle to perform in the SERPs.
Index bloat relates to low-value pages being allowed into search engines' indexes. Low-quality pages indexed by Google aren’t going to give search engines a good view of the overall quality of the website. This is bad for overall website performance, so you can see the need to handle the low-quality pages that faceted URLs often produce.
Resolving this is best done by blocking the generated URLs from being crawled, which is best achieved with the robots.txt file method detailed further below.
Crawling issues are nothing to worry about unless your website has well over a million or so unique pages, though they shouldn’t be ignored if you have a website of around 20,000+ pages whose content is updated or changed often.
In relation to crawling, issues occur when the URLs generated by faceted navigation are crawlable. The multitude of crawlable URLs that can be generated by multiple attributes being selected needs consideration and most definitely handling. If not, crawl budget will be spent on faceted URLs rather than on your important pages, which may lead to future indexing issues.
To ensure a page is not indexed, apply a noindex tag to it; again, this is detailed further below.
Dealing with Faceted Navigation Issues
How we deal with the issues that arise from facets depends on the issue they are causing.
The Canonical Approach
If your website has fewer than a thousand pages but there are indexation issues caused by multiple URLs being generated with identical content, then a canonical tag will help resolve the issue. I see this a lot on Shopify websites that create multiple versions of the same product or category page; a canonical tag pointing to the preferred, often shorter, URL works perfectly well. The canonical works just as well with URLs generated when facet options are clicked. It's best shown with an example:
Here's your category page URL: www.yoursite.com/dresses/
When attributes on the facet menu are clicked, the facets applied create a parameterised version of the category page URL.
The generated facet URL would simply have a canonical tag pointing back to the category page URL: <link rel="canonical" href="https://www.yoursite.com/dresses/" />
The canonical tag consolidates link signals into the specified URL, so search engines will understand which page is to be given priority in the SERPs.
To ensure the canonical isn’t ignored by Google, check that internal links point to the canonicalised version of the page; too many internal and external links to the faceted URL can lead Google to ignore the canonical. The canonical may also be ignored if it points to a page that Google deems not to be a duplicate, so make sure you use it appropriately.
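As a rough illustration of the canonical approach, the Python sketch below strips the query string from a faceted URL and builds the tag that would point back to the category page. It assumes every query parameter on the URL is a facet, which won't be true on every site, and the `color` and `size` parameters are made up for the example.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_tag(url: str) -> str:
    """Build a canonical tag pointing at the URL without its query string or hash.

    Assumption: all query parameters are facets, so dropping them
    leaves the category page URL.
    """
    parts = urlsplit(url)
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(clean, quote=True)}" />'


if __name__ == "__main__":
    # Hypothetical facet URL for the dresses category.
    print(canonical_tag("https://www.yoursite.com/dresses/?color=red&size=10"))
```

If some parameters do change the content in a way you want indexed, you'd keep those and strip only the facet parameters instead.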
The Robots.txt File
If crawl budgets are off the scale due to faceted selections generating thousands of URLs, then a robots.txt directive that blocks the faceted URLs is the answer.
Implementing a disallow directive that blocks a particular attribute selection will often lead to search engines not indexing the generated page. However, Google makes the final decision on this, so it can be unreliable, though it often works fine. If the URLs you are looking to keep out of the index have links from external and internal sources, they may be judged valuable, and Google may go ahead and index them anyway.
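Here is a small Python sketch of what such disallow rules might look like and how you could sanity-check them before deploying. The two rules and the `color` and `brand` parameters are hypothetical. The matcher is my own simplified take on the `*` and `$` wildcards that Google supports in robots.txt, and it ignores user-agent groups, Allow rules and rule precedence, so treat it as a quick check rather than a full parser.

```python
import re

# Hypothetical rules: block any URL whose query string sets color= or brand=.
ROBOTS_TXT = """\
User-agent: *
Disallow: /*?*color=
Disallow: /*?*brand=
"""


def disallow_patterns(robots_txt: str) -> list[str]:
    """Collect the Disallow values (ignores user-agent groups and Allow rules)."""
    patterns = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            patterns.append(value.strip())
    return patterns


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate wildcards: '*' is any run of characters, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path_and_query: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """True if any Disallow pattern matches from the start of the path."""
    # re.match anchors at the start, mirroring how robots rules match as prefixes.
    return any(pattern_to_regex(p).match(path_and_query) for p in disallow_patterns(robots_txt))


if __name__ == "__main__":
    for path in ("/dresses/", "/dresses/?color=red", "/dresses/?size=10&brand=acme"):
        print(path, "blocked" if is_blocked(path) else "crawlable")
```

Checking a handful of real category and facet URLs this way helps avoid accidentally blocking the category pages themselves.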
You can find out more about the Robots.txt file directive by clicking here, where you’ll find examples of URLs being blocked from indexation/crawls by using the Robots txt file.
The Noindex Tag
Placing the noindex tag on a web page is the most reliable way of ensuring it will not be added to Google’s index.
When you noindex a web page Google will not, wait for it, index the page. However, be aware that the links on a noindexed page will eventually stop being followed, so ensure the page is not passing any good equity to existing pages via internal links.
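A minimal Python sketch of the noindex approach might look like the following: it returns the robots meta tag only when a URL carries one of the facet parameters. The `FACET_PARAMS` set and the `robots_meta` helper are assumptions for illustration, and in practice this logic would live in your platform's page template.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical facet parameters; swap in the ones your platform generates.
FACET_PARAMS = {"color", "brand", "size"}


def robots_meta(url: str) -> str:
    """Return a noindex meta tag for faceted URLs, or an empty string otherwise."""
    query = parse_qs(urlsplit(url).query)
    if FACET_PARAMS.intersection(query):
        return '<meta name="robots" content="noindex">'
    return ""


if __name__ == "__main__":
    print(repr(robots_meta("https://www.example.com/dresses/?color=red")))
    print(repr(robots_meta("https://www.example.com/dresses/")))
```

Bear in mind that Google has to crawl a page to see its noindex tag, so a URL that is also blocked in robots.txt will never have the tag read.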
The Nofollow approach
This approach is often used alongside the robots.txt process above and simply involves placing a rel="nofollow" attribute on the faceted link that generates the URL when the facet is clicked.
Google treats nofollow as a hint rather than a directive, so be sure you absolutely do not want the URL to be indexed. Faceted URLs can have their uses, but that depends entirely on the nature of the website, as in its product range and overall structure.
This doesn’t mean you should rule this approach out, as Google will understand the hint and act accordingly, as confirmed by John Mueller himself:
“We will continue to use these internal nofollow links as a sign that you’re telling us:
- These pages are not as interesting.
- Google doesn’t need to crawl them.
- They don’t need to be used for ranking, for indexing.”
This approach doesn’t correct the dilution of PageRank. PageRank is still distributed between all links on the page, even those with the Nofollow attribute. If you want to fix that, you’ll need to implement the canonical tag.
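For reference, the nofollow approach described in this section could be sketched in Python like this: a small helper that renders a facet link with the attribute in place. The `facet_link` name and the `color` and `brand` parameters are made up for the example, and your platform's templating will differ.

```python
from html import escape
from urllib.parse import urlencode


def facet_link(base_url: str, label: str, **selected: str) -> str:
    """Render a facet link carrying rel="nofollow"."""
    href = base_url + "?" + urlencode(selected)
    return f'<a href="{escape(href, quote=True)}" rel="nofollow">{escape(label)}</a>'


if __name__ == "__main__":
    # Hypothetical facet selections on the dresses category.
    print(facet_link("/dresses/", "Red", color="red"))
    print(facet_link("/dresses/", "Red Acme", color="red", brand="acme"))
```

The attribute goes on the link itself, so every template that outputs facet links needs the same treatment.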
Signs of Faceted Navigation Issues
To judge whether faceted navigation is causing issues, you can analyse the page source of pages with facets, or head to Google Search Console (GSC) and check the Page Indexing report. In GSC, look at the number of indexed URLs and make sure it is similar to what you’d expect.
Another strong indicator is the ‘Indexed, not submitted in sitemap’ type. If faceted navigation has been implemented recently and you are seeing a high number of ‘Indexed, not submitted in sitemap’ URLs, double-check that these aren’t URLs generated by the faceted options.
With the recent updates to GSC, you will find excluded URLs in the Pages section, with useful labels that give a strong indication of the hows and whys of any unindexed URLs. The ‘Crawled - currently not indexed’ type will also give you a good indication of what Google considers low-value pages; faceted URLs will usually fall within this category too.
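If you have copied a list of affected URLs out of GSC, a quick Python sketch like the one below can show how many of them carry query parameters and which parameters dominate. It assumes the URLs are plain strings in a list; the example URLs and parameter names are invented.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def facet_param_counts(urls: list[str]) -> Counter:
    """Count how many URLs contain each query parameter."""
    counts = Counter()
    for url in urls:
        for name in parse_qs(urlsplit(url).query):
            counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample of URLs reported by GSC.
    reported = [
        "https://www.example.com/dresses/",
        "https://www.example.com/dresses/?color=red",
        "https://www.example.com/dresses/?color=blue&size=10",
    ]
    for name, total in facet_param_counts(reported).most_common():
        print(f"{name}: {total} URLs")
```

A parameter that appears on a large share of the reported URLs is a good first candidate for one of the handling methods above.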
Mind your PageRank
PageRank and its distribution is something SEOs are usually extremely mindful of, and it’s a major consideration when it comes to faceted navigation. Why? PageRank is a credibility factor in the eyes of search engines: the higher a page’s PageRank score, the more likely it is to compete in the SERPs (Search Engine Results Pages) for strong, high-volume keyword search terms within competitive niches.
With the above being the case, and given that PageRank can be distributed throughout your website through internal linking, it’s essential we target our champion, category and other important pages with strong PageRank. The issue with facets occurs when PageRank is distributed across all the URLs generated when a faceted attribute is clicked. If the links to these faceted URLs are left unhandled, PageRank drains away from the page hosting the facets. Because the PageRank a page passes is split across the links on that page, we need to ensure we don’t leak it out to the potentially thousands of URLs generated by the facets.
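To put a rough number on that leak, here is a back-of-the-envelope Python sketch using the simplified model from the original PageRank paper, where a page passes its score, scaled by a damping factor of 0.85, evenly across its outgoing links. This is not how Google ranks pages today, and the link counts are invented for illustration.

```python
def share_per_link(page_rank: float, outgoing_links: int, damping: float = 0.85) -> float:
    """PageRank passed through each link in the simplified classic model."""
    return damping * page_rank / outgoing_links


if __name__ == "__main__":
    category_links = 50   # hypothetical: links on the category page before facets
    facet_links = 200     # hypothetical: extra links added by the facet menu

    before = share_per_link(1.0, category_links)
    after = share_per_link(1.0, category_links + facet_links)
    print(f"per link before facets: {before:.4f}")
    print(f"per link with facets:   {after:.4f}")
```

With these made-up numbers, each link receives a fifth of what it did before the facet links were added, which is the dilution this section warns about.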
Faceted navigation will need to be handled to ensure your website operates to its maximum potential. It’s clearly a technical SEO process and one where you may need the help of a dev to implement, depending on the site’s ownership or set-up.
I’ve worked across a variety of eCommerce platforms which sell a wide range of products and have come across countless issues generated by faceted navigation. If you are running an eCommerce website and need any help with the above, then please do not hesitate to get in touch.