Finding opportunities to create new e-commerce categories by clustering product inventory and automatically matching SKUs to search demand is a great place to start.
E-commerce sites can use niche category pages to align with organic search demand while also helping consumers make purchases.
If a website sells a range of products and there is strong search demand for them, building a dedicated landing page is a simple way to capture that demand.
But where can SEO professionals find these opportunities?
Sure, you could guess, but you’d likely miss out on a lot of them.
This dilemma prompted me to write a Python script, which I’m releasing today as a simple Streamlit application. (There’s no need to know how to code!)
Using just two crawl exports, the app linked above generated the following output automatically!
Have you noticed how the suggested categories are all automatically linked to the parent category?
The app even displays the number of products available to fill the category.
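To make that structure concrete, here is a small Python sketch of how suggestions might be grouped under their parent categories along with product counts. The category names and counts below are invented for illustration, not real app output:

```python
from collections import defaultdict

# Illustrative suggestions: (parent category, suggested subcategory, product count)
# These values are invented examples, not real app output.
suggestions = [
    ("sofas", "corner sofas", 24),
    ("sofas", "sofa beds", 11),
    ("beds", "bunk beds", 7),
]

# Group each suggested subcategory under its parent category
grouped = defaultdict(list)
for parent, subcategory, product_count in suggestions:
    grouped[parent].append((subcategory, product_count))

for parent, subs in grouped.items():
    print(parent)
    for subcategory, product_count in subs:
        print(f"  {subcategory} ({product_count} products)")
```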
Benefits and Uses
- Create new landing pages to improve relevancy for high-demand, competitive queries.
- Increase the likelihood that relevant sitelinks will appear beneath the parent category.
- Increased relevancy reduces CPCs on the landing page.
- It can help inform merchandising decisions. (If there is strong search demand but a low product count, the range can be expanded.)
With relatively little effort, creating the proposed subcategories for the parent sofa category would expose the site to an extra 3,500 monthly queries.
Features
- Automatically generates subcategory recommendations.
- Links subcategories back to the parent category (this eliminates a lot of guesswork).
- Requires at least X matching products before recommending a category.
- Fuzzy-matches each suggested category against existing ones (X percent similarity) before recommending it.
- Applies a minimum search volume/cost-per-click (CPC) cut-off to category recommendations.
- Supports search volume and CPC data from a variety of countries.
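As a rough sketch of how cut-offs like these might be applied in Python (the threshold values, function names, and the direction of the fuzzy-match check are my own illustrations, not the app’s actual code):

```python
from difflib import SequenceMatcher

# Illustrative defaults - the real app lets you set its own cut-offs
MIN_PRODUCTS = 3       # minimum matching products per suggestion
MIN_VOLUME = 10        # minimum monthly search volume
MAX_SIMILARITY = 0.85  # fuzzy-match cut-off against existing categories

def similarity(a: str, b: str) -> float:
    """Fuzzy-match ratio between two category names (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def qualifies(candidate: str, product_count: int, volume: int,
              existing: list[str]) -> bool:
    """Apply the product-count, volume, and fuzzy-duplicate checks."""
    if product_count < MIN_PRODUCTS or volume < MIN_VOLUME:
        return False
    # Reject candidates that are near-duplicates of an existing category
    return all(similarity(candidate, cat) < MAX_SIMILARITY for cat in existing)

print(qualifies("corner sofas", 12, 880, ["sofas", "sofa beds"]))  # True
print(qualifies("sofa", 12, 880, ["sofas", "sofa beds"]))          # False
```

The fuzzy check is what keeps the tool from recommending a page that already exists under a slightly different name.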
Getting Started/Prepping the Files
You’ll need a few things to use this app.
- The Streamlit application.
- A copy of Screaming Frog.
- A way to determine keyword search volume. The Streamlit app natively supports the Keywords Everywhere API; alternatively, you can check search volume manually after the script has finished.
At a high level, the goal is to crawl the target website using two custom extractions.
Both the internal html.csv report and an inlinks.csv export are then exported.
These exports are uploaded to the Streamlit app, which processes them to find the opportunities.
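As a sketch of the kind of processing the app performs on these exports, here is a minimal example that counts how many product URLs each page links to. The column names are simplified assumptions; a real Screaming Frog inlinks export contains more columns:

```python
import csv
from collections import Counter

def count_product_links(inlinks_path: str) -> Counter:
    """Count how many product URLs each source page links to.

    Assumes a simplified inlinks.csv with 'Source' and 'Destination'
    columns; a real Screaming Frog inlinks export has additional
    columns, but these two are the ones that matter here.
    """
    counts: Counter = Counter()
    with open(inlinks_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            counts[row["Source"]] += 1
    return counts
```

A page that links out to many product URLs is a good candidate parent category, and the counts feed a minimum-product cut-off like the one described earlier.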
Crawl and Extraction Setup
When crawling the site, you’ll need to set up two custom extractions in Screaming Frog: one to identify product pages and another to identify category pages.
The Streamlit app relies on the difference between the two page types when recommending new pages.
The key is to identify a distinct element for each page type.
(For a product page this is normally the price or the returns policy; for a category page it’s usually a filter/sort element.)
Extracting the Unique Page Elements
Screaming Frog supports custom extractions of text or code when crawling a web page.
If you’re not experienced with custom extractions, this section may seem intimidating, but it’s critical for getting the right data into the Streamlit app.
The goal is to create something similar to the image below.
(With no overlap, a unique extraction for product and category pages.)
The steps below demonstrate how to manually extract the price element for a product page.
Afterwards, repeat the process for a category page.
The official documentation is worth your time if you’re lost or want to learn more about the web scraper function in Screaming Frog.
Manually Extracting Page Elements
Let’s begin by extracting a unique element that can only be found on a product page (usually the price).
Select the price element on the page with the mouse, right-click, and choose Inspect.
This will bring up the elements pane, which will already have the right HTML line selected.
Right-click the pre-selected line and choose Copy > Copy selector. That’s all there is to it.
Copy the selector and paste it into the custom extraction area of Screaming Frog (Configuration > Custom > Extraction).
Choose Extract Text from the CSSPath drop down and name the extractor “product.”
To extract a unique element from a category page, repeat the process. Once completed, both the product and category extractions should look like this.
Finally, begin crawling.
When examining the Custom Extraction tab, the crawl should appear like this.
Notice how each page type’s extractions are different? Perfect.
The extractor is used by the script to determine the page type.
The extractor will be converted to tags by the app internally.
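Internally, that tagging step might look something like the sketch below. It assumes the extractors were named “product” and “category” as in the setup above, and that the export columns are named “product 1” and “category 1” (check the headers in your own export; these names are an assumption):

```python
def tag_page_type(row: dict) -> str:
    """Tag a crawled URL as 'product', 'category', or 'other'.

    Assumes the custom-extraction columns are named 'product 1' and
    'category 1' (verify against your own export headers). A non-empty
    product extraction wins; otherwise a non-empty category extraction.
    """
    if row.get("product 1", "").strip():
        return "product"
    if row.get("category 1", "").strip():
        return "category"
    return "other"

print(tag_page_type({"product 1": "£499.00", "category 1": ""}))  # product
print(tag_page_type({"product 1": "", "category 1": "Sort by"}))  # category
```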
Exporting the Files
After the crawl is finished, the final step is to export two different types of CSV files.
The first export captures the links to product pages (inlinks.csv).
In Screaming Frog, go to the Custom Extraction tab and highlight any URLs that have a product extraction.
(To group the column, you’ll need to sort it.)
Finally, right-click the product URLs and choose Export, then Inlinks from the drop-down menu.
You should now have a file named inlinks.csv on your computer.
Finally, the internal html.csv file must be exported.
Click the Internal tab, select HTML from the adjacent dropdown menu, and click the Export button.
Save the file in .csv format.
Congratulations! The Streamlit app is now ready to use!
Using the Streamlit App
The Streamlit app is straightforward to use.
The cut-offs for the various choices are set to appropriate defaults, but you are free to change them to better suit your needs.
I would strongly advise using a Keywords Everywhere API key (it is not strictly necessary; search volume can be checked manually later with an existing tool if preferred).
By checking search volume, the script pre-qualifies each opportunity, so the final output will contain more irrelevant terms if the key is absent.
If you want to use a key, this is the section on the left to pay attention to.
Once you’ve entered the API key and adjusted the cut-offs, upload the inlinks.csv crawl file.
When that’s finished, a new prompt will appear next to it, asking you to upload the internal html.csv crawl file.
Finally, a new prompt will ask you to select the product and category column names from the uploaded crawl file so they can be mapped correctly.
The script will run once you submit the form. When it’s finished, you’ll see the screen below, where you can download a handy .csv export.