- Faceted search is an in-page ecommerce navigation system that allows users to filter products to fit their preferences.
- Google encourages faceted navigation for usability, but warns against the SEO issues it can cause.
- Faceted navigation creates a new URL for every filtered search, which can create a massive volume of duplicate content and hurt the search engine visibility of your priority pages.
- Avoid faceted search SEO issues by setting canonical tags, configuring URLs Google can easily understand and ensuring Google crawls only priority pages.
If you manage ecommerce SEO for a large enterprise site, you probably work with some sort of faceted search system. These nifty sort and filter features are a huge benefit from a user standpoint – but they’re notorious in the SEO world for creating a quagmire of issues. A faceted navigation system left to its own devices will generate potentially millions of pages of duplicate content, like a magical photocopier that won’t turn off. Here’s how to keep your website’s faceted navigation in check without sacrificing any of its benefits.
WHAT IS FACETED SEARCH?
Let’s start with a simple faceted search definition:
Commonly called faceted navigation, faceted search is an in-page navigation system used for ecommerce sites, listing sites and other websites that deal with a large list of results. Faceting simplifies site search by presenting shoppers with a smart, logical search interface.
Facets vs filters
What’s the difference between facets and filters? People often confuse facets and filters because they both help searchers narrow down a large list of items. But here’s the difference:
Filters are applied globally, and the set of filtering options remains the same regardless of previous selections.
Faceted navigation is different. Each selection returns a subcategory of new choices (facets) that may change depending upon the previous selection. More importantly, facets are not applied globally. So if you search for “red shirts,” you could apply a “mens clothing” filter, or a “long sleeve” facet.
When it’s used for ecommerce, faceted navigation gives the user many filtering options to sort through various product attributes and drill their shopping down to the exact type of product they need. This is exceptionally helpful for sites with large product catalogs, apparel stores and any instance when searchers may sort by product attributes that don’t warrant their own category.
For example, a user shopping for shoes could filter by many possible combinations, including color, material, size, style and price range. So if they’re looking for green leather sandals in a size 7 for under $200, faceted navigation will let them arrive at a product list that fits those criteria. “Green sandals” or “size 7 sandals” don’t warrant their own category or subcategory page, let alone “size 7 green sandals.”
So faceted navigation is an elegant way to address a complex set of user preferences. It’s far more elegant than, say, building a nearly-infinite number of landing pages and creating a way for the user to navigate through them.
Faceted navigation examples
Faceted search examples can be found on almost any major ecommerce site (think Amazon). Let’s look at Target’s user interface. If you’re shopping for men’s t-shirts and you land on Target’s home page, you can take the following path without ever leaving the main navigation system:
Home > Men > Men’s Clothing > Shirts > T-shirts > Basic Tees.
The primary navigational journey ends there, rather than overwhelming the user with choices or forcing them to narrow the field down further before they are ready.
On the Basic Tees page, a clean faceted navigation design helps the user filter the list of basic tees by the following qualities:
That’s nine additional attributes – ten including out-of-stock products. All the different combinations of these attributes result in thousands and thousands of different versions of this single page.
To put that another way, if you filter by your list of preferred attributes and get here:
…That’s one of the thousands of possible results. All within a single sub-category of a sub-category.
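To get a feel for how quickly those combinations multiply, here is a rough sketch in Python. The facet names and option counts are made up for illustration and are not taken from Target’s page. It also assumes each facet is single-select, so a shopper either leaves it unset or picks exactly one option.

```python
from math import prod

# Hypothetical facets and how many options each one offers.
# These counts are illustrative assumptions, not real Target data.
facet_options = {
    "size": 8,
    "color": 12,
    "brand": 6,
    "price_range": 5,
    "sleeve_length": 3,
}


def count_filtered_views(options):
    """Count distinct filtered views when each facet is unset or set to one option."""
    # Each facet contributes (options + 1) states: one per option, plus "not applied".
    total_states = prod(count + 1 for count in options.values())
    # Subtract 1 to exclude the unfiltered category page itself.
    return total_states - 1


print(count_filtered_views(facet_options))
```

With just these five assumed facets, the count already runs to 19,655 filtered views of one category page. Multi-select facets would push the number higher still.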
WHAT DOES GOOGLE THINK?
That staggering number is the reason faceted navigation is the most user-friendly option for customers. It’s also the most practical one for ecommerce teams. It simply doesn’t make sense to build page after page, forcing the user through click after click on an arduous journey with no end in sight.
And here’s where we hit an anomaly in the SEO world, one of the rare instances of good user experience potentially conflicting with good SEO. Google still falls on the side of user experience and faceted navigation. But the possible issues created by faceted navigation are formidable enough that Google issued a detailed warning about them.
COMMON FACETED NAVIGATION SEO PROBLEMS
When we used the faceted navigation on Target’s Basic Tees page and landed on a final list of filtered results, our end URL looked like this:
Faceted navigation systems operate by creating a new URL for every filtered search. They’ll either dynamically generate the URL, creating something like the one used in our example, or append parameters that specify how the category URL is behaving (more on this later). That means even if you’re not creating new landing pages for every possible permutation of attributes, your faceted navigation is creating them as they happen.
Left unchecked, faceted search can leave your site running amok with duplicate content. It can rob high-priority pages of crawl budget and link equity. This is easier to conceptualize if you can think of “crawl” as a resource rather than an action.
The biggest faceted navigation SEO mistakes happen when you fail to anticipate and fix dynamically generated unique URLs. Broadly, they can be broken into two major issues:
How your URLs look and behave
A language is encoded into URLs that tells crawlers more about where the page is in your website architecture. It also tells spiders like Googlebot how to interpret what’s happening with the page. URLs that set the wrong file paths or aren’t encoded with the right parameters can confuse the search engines. As a result, you could waste crawl budget. Or worse, the duplicate page can be indexed. We’ll dig into this more when we look at the best practices for URLs.
How you handle duplicate content
When faceted navigation creates duplicate pages, how do you ensure search engines don’t crawl and index them? How can you make sure the process happens automatically with the creation of each duplicate?
Fortunately, there are several reliable ways to keep your faceted navigation system from causing problems with SEO.
FACETED NAVIGATION BEST PRACTICES FOR SEO
These fixes are so reliable, in fact, that if you’re using an out-of-the-box faceted search integration or ecommerce platform, the best practices are probably built into the system.
But what if you deploy custom development to adjust your current system or build a new one from scratch? We’ll walk you through the best way to do faceted navigation to avoid the dreaded duplicate content monster.
1. Run a crawl
Sometimes when you create duplicate content, search engines crawl it, identify it as duplicate content and refuse to index it. That creates bloat across your site and pulls crawl without helping your organic presence in any way, ultimately diminishing the authority for the pages that should be crawled. The first step to fixing this issue in ecommerce sites is finding it in the first place.
A tell-tale sign that you might have duplicate content issues is a wonky indexing ratio in Google Search Console. If the number of pages indexed significantly outweighs the number of pages you submitted for your site, there’s a problem somewhere and duplicate content is the likely culprit. If that’s the case, then faceted search should be one of your first suspects.
The only way to understand the full scope of the issue, though, is to run a site crawl using Screaming Frog, DeepCrawl or your preferred crawling tool. A full crawl will return a list of every URL on your site and identify duplicates. It will also find canonical errors and let you set URL parameters that can help you zero in on your faceted search category pages specifically.
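Once you have a URL list exported from your crawler, a small script can help surface the category pages that faceting has multiplied. The sketch below is one possible approach, not a feature of any particular crawling tool. It assumes facets show up as query strings on an otherwise identical path, and the shop.example.com URLs are placeholders.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_path(urls):
    """Group URLs that share a host and path but differ in query string."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        groups[(parts.netloc, parts.path)].append(url)
    return groups


def likely_facet_duplicates(urls, threshold=2):
    """Return (host, path) groups with at least `threshold` URL variants, largest first."""
    groups = group_by_path(urls)
    flagged = {key: variants for key, variants in groups.items() if len(variants) >= threshold}
    return sorted(flagged.items(), key=lambda item: len(item[1]), reverse=True)


# Placeholder crawl export; replace with the URL column from your own crawl.
crawled = [
    "https://shop.example.com/shoes/sandals",
    "https://shop.example.com/shoes/sandals?color=green",
    "https://shop.example.com/shoes/sandals?color=green&size=7",
    "https://shop.example.com/shoes/boots",
]

for (host, path), variants in likely_facet_duplicates(crawled):
    print(host, path, len(variants))
```

Paths at the top of that list are the ones to inspect first for canonical and indexing problems.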
2. See if the pages are indexable
We just pointed out why it’s a problem when Google crawls but doesn’t index your content. It’s a whole other problem when Google crawls and then indexes the duplicate pages. This creates a poor search experience for the user and ultimately impacts your site authority.
Check for this by running a site: search for any of your category pages. If the search results return a long list of indexed pages, there’s a problem.
You can also just grab a handful of the URLs generated by your faceted search and Google them. If they show up in the search results, they’re indexed. You should instead see something like this:
3. Set canonical tags
Once you’ve spotted the problem, move through a series of best practices to build your fix. First and foremost, check the website’s canonical tags. Every URL created by faceted search in an ecommerce website should canonicalize to the preferred version of the page. In this case, the preferred version will be the ecommerce category page where that search started.
So in our Target example, our filtered search brought us to this URL:
That URL is not causing issues for Target because a quick peek at their source code reveals that they’re canonicalizing to the Basic Tees page:
You don’t need to take any further action.
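If you’d rather check canonical tags in bulk than peek at source code one page at a time, something like the following sketch could work. It uses only Python’s standard library HTML parser. The sample HTML and the shop.example.com URL are invented for illustration and are not Target’s actual markup.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonical(html):
    """Return the first canonical URL in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# Invented sample of a filtered page that canonicalizes to its category page.
sample_html = """
<html><head>
<title>Basic Tees</title>
<link rel="canonical" href="https://shop.example.com/tees/basic">
</head><body></body></html>
"""

print(find_canonical(sample_html))
```

Run it against the HTML of a handful of filtered URLs. Each one should report the category page where the search started.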
But what about pagination? It’s important to note that Google no longer supports rel="next" and rel="prev". However, Google still recommends using pagination if it would improve user experience.
4. Configure URLs
A filtered search will usually create a dynamically generated URL. But using the sort feature will generate a URL that tells a story. URLs like this often include file paths to directories, indicating the page’s position in the site architecture. They can be encoded with a language that helps search engines interpret what’s happening on the new page. Here’s how to follow best practices:
Stick to standard encoding
Mark all key=value pairs with an = sign, not a comma. Append multiple parameters with an ampersand. Don’t use brackets or other non-standard characters. When we head back to Target’s Basic Tees page and sort by price, our URL looks like this:
That URL communicates that we’ve moved into the Basic Tees category and we’re now sorting by price from low to high.
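As a rough illustration of standard encoding, the sketch below builds a faceted URL with Python’s urllib.parse, which joins key=value pairs with ampersands and escapes non-standard characters. The domain and the parameter names (sortBy, color) are hypothetical. Sorting the keys is an extra design choice, not something the guidance above requires.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_facet_url(category_url, params):
    """Append params to a category URL as standard key=value pairs joined by &."""
    parts = urlsplit(category_url)
    # Sort keys so the same selection always yields the same URL.
    query = urlencode(sorted(params.items()))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


# Hypothetical category URL and parameter names, for illustration only.
url = build_facet_url(
    "https://shop.example.com/tees/basic",
    {"sortBy": "price_low", "color": "pink"},
)
print(url)
```

Keeping the parameter order stable means that picking the same filters in a different order doesn’t mint a second URL for the same results.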
Don’t put variables like session id in the file path
A URL’s file path or directory is a bit like a breadcrumb for search engines. And for people too, actually. If a user has moved far into a subcategory and they want to get back to the higher category, they’ll often manually adjust the URL by deleting the URL parameters. For example, I can get back to the higher category by deleting everything past the question mark in the example below:
So if you’ve got a user creating a new session id and you’re changing the file path to include that session id, you’re going to end up with a huge problem on your hands that can result in infinitely more URLs. Target is following best practices here so they didn’t do that. But if they did, it might look something like this:
In this example, the session id is s1489. And it’s up there in the file path instead of appended as a parameter.
Session ids aren’t the only variables to consider, either. Anything that does not change page content – like tracking ids, referrer ids and timestamps – doesn’t belong in the file path.
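One practical payoff of keeping these variables in the query string is that they’re easy to strip when you need the content-defining version of a URL, for example when deduplicating a crawl. The sketch below shows one way to do that. The parameter names in NON_CONTENT_PARAMS are assumptions, so swap in whatever your own platform uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names for parameters that never change what the page shows.
NON_CONTENT_PARAMS = {"sessionid", "trackingid", "ref", "ts"}


def strip_non_content_params(url):
    """Drop parameters that do not change page content, keeping everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in NON_CONTENT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Placeholder URL reusing the s1489 session id from the example above.
print(strip_non_content_params(
    "https://shop.example.com/tees/basic?sortBy=price_low&sessionid=s1489&ref=email"
))
```

If the session id lived in the file path instead, there would be no reliable way to tell it apart from a real directory.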
5. Disallow via robots meta tag or X-Robots-Tag
If faceted search creates certain URL parameters that you don’t want search engines or web crawlers to index, you can block them with a robots meta tag. Just add the following noindex tag to the <head> section of your page:
<meta name="robots" content="noindex">
This will prevent search engines from indexing those pages. You can even customize the tag to only allow certain crawlers, like Googlebot. However, it won’t do anything to free up crawl budget or preserve link equity. If you want to do that, you’ll need to also add a nofollow, like this:
<meta name="robots" content="noindex, nofollow">
While that’s a great solution for single URLs, it’s not very scalable. If you have hundreds (or thousands) of ecommerce product pages, you’ll need to use the X-Robots-Tag HTTP header instead.
Let’s say, for example, your faceted search results always appear after the directory /filter/ or /sort/ in your URL. All you’d need to do in that instance is send an X-Robots-Tag response header with a noindex value for every URL containing /sort/ or /filter/. The header is typically set in your web server configuration, where rules can usually be matched with regular expressions (regex), so a single rule can exclude multiple parameters or folders.
Previously, webmasters used a robots.txt file in this way. However, it’s important to note that as of September 1, 2019, Google no longer supports the noindex directive in robots.txt files.
BE WARNED: your URLs must be pristine and consistent for this to work. Otherwise, you might unintentionally block important pages. Or you might fail to catch all instances of duplication. And it doesn’t necessarily guarantee that your page won’t be indexed.
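To make the pattern-matching idea concrete, here is a minimal sketch of the decision logic in Python. In practice this rule usually lives in your web server or application framework configuration. The function name and the /filter/ and /sort/ convention are assumptions carried over from the example above.

```python
import re

# Assumed convention: faceted results always live under /filter/ or /sort/.
FACET_PATH = re.compile(r"/(filter|sort)/")


def robots_headers(path):
    """Return the extra response headers to send for a request path."""
    if FACET_PATH.search(path):
        return {"X-Robots-Tag": "noindex, nofollow"}
    return {}


print(robots_headers("/tees/basic/filter/color-pink"))
print(robots_headers("/tees/basic"))
print(robots_headers("/filters-guide"))
```

The last call shows why consistency matters. A page such as /filters-guide is left alone only because the pattern requires the full /filter/ segment, and a looser pattern would have blocked it.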
6. NOFOLLOW internal links
Let’s say you’re Target and you decide to write a blog post about how men can show their support during Breast Cancer Awareness Month. Instead of highlighting individual products, you link to a complete list of pink t-shirts generated by your handy dandy filter:
Whenever you use a filter in this way for any internal link on the site, nofollow that link. This will prevent crawlers from discovering unnecessary URLs that pull precious crawl bandwidth and link equity away from the rest of your site.
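If your templates build internal links through a helper, that helper can apply the nofollow automatically. The sketch below is one hypothetical way to do it. It assumes any query string on an internal link signals a filtered or sorted view, which may not hold for every site.

```python
from html import escape
from urllib.parse import urlsplit


def internal_link(href, text):
    """Render an anchor tag, adding rel="nofollow" when the URL carries filter parameters."""
    # Assumption: any query string on an internal link means a filtered or sorted view.
    is_filtered = bool(urlsplit(href).query)
    rel = ' rel="nofollow"' if is_filtered else ""
    return f'<a href="{escape(href, quote=True)}"{rel}>{escape(text)}</a>'


# Hypothetical internal URLs, for illustration only.
print(internal_link("/tees/basic?color=pink", "pink t-shirts"))
print(internal_link("/tees/basic", "basic tees"))
```

Links to plain category pages stay followed, so priority pages keep receiving internal link equity.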
Google also recently introduced two new link attributes – rel="sponsored" and rel="ugc". These can help Google differentiate between your priority pages and other content like comments and sponsored posts. Using one or more of these tags can help guide Google to the pages that really matter.
7. Consider the users
Last but certainly not least: always, always, always think like a user. What is the easiest solution for them? In your frenzy to follow strict best practice, have you accidentally excluded pages people would actually search for? This won’t be the case with attributes like size. But for filters like brand or style, it very well could be.
Learn how to do keyword research like a boss so you’re aware of search volumes for queries related to different product attributes.
For example, Target no doubt discovered a fairly high search volume for “mens graphic t shirts.” So instead of including it as a style attribute in one of their other T-shirt categories, they pulled it out and turned graphic tees into a different category. That allows them to capture the search traffic for graphic tees and create a better experience for the user.
When you encounter a situation where one of your faceted attributes has a high search volume or it can be filtered further (graphic tees can be filtered by trend), consider whether it belongs on a static landing page. Make your adjustments accordingly.
Also, set up your faceted navigation to make the user’s life easier. Add breadcrumbs to each page so the user can quickly return to where they were before. Double check for speed and functionality. Check for mobile functionality and appearance. And don’t give users the option of choosing filters that won’t actually return any products.
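For that last point, one simple approach is to count how many of the currently listed products carry each facet value and only show values with at least one match. The sketch below assumes products are plain dictionaries. The field names and sample data are invented for illustration.

```python
def facet_counts(products, facet):
    """Count how many products carry each value of the given facet."""
    counts = {}
    for product in products:
        value = product.get(facet)
        if value is not None:
            counts[value] = counts.get(value, 0) + 1
    return counts


def selectable_values(products, facet, all_values):
    """Keep only facet values that would return at least one product."""
    counts = facet_counts(products, facet)
    return [value for value in all_values if counts.get(value, 0) > 0]


# Invented sample: the products left after the shopper's current selections.
current_results = [
    {"name": "Crew tee", "color": "pink", "size": "M"},
    {"name": "V-neck tee", "color": "pink", "size": "L"},
    {"name": "Pocket tee", "color": "gray", "size": "M"},
]

print(selectable_values(current_results, "color", ["pink", "gray", "green"]))
```

Here “green” is dropped from the color options because choosing it would lead to an empty results page.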
Continuously check your faceted navigation for issues and prioritize fixes. You’ll satisfy users and search engines without sacrificing one for the other. Build these processes into your system and you’ll be anticipating and correcting issues faster than you can hunt down a new T-shirt.