Using Machine Learning for SEO Competitor Research

Learn how to use machine learning for more precise, statistically relevant, and scalable SEO competitor research (with tools, code & more).

[Chart: ranking factors ordered by importance]

In this particular case, the most important factor was “title_keyword_dist” which measures the string distance between the title tag and the target keyword. Think of this as the title tag’s relevance to the keyword.

This is no surprise to the SEO practitioner. The value here, however, is in providing empirical evidence to a non-expert business audience that doesn’t understand the need to optimize title tags.
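As an illustration of what a feature like title_keyword_dist might compute, here is a minimal sketch. The article does not say which string distance was used, so Levenshtein edit distance, the lowercasing, and the normalization by the longer string are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (free when characters match)
            ))
        previous = current
    return previous[-1]


def title_keyword_dist(title: str, keyword: str) -> float:
    """Assumed definition: edit distance scaled to [0, 1]; 0 means identical."""
    title, keyword = title.lower().strip(), keyword.lower().strip()
    longest = max(len(title), len(keyword))
    if longest == 0:
        return 0.0
    return levenshtein(title, keyword) / longest


# Hypothetical title and keyword, invented for illustration.
distance = title_keyword_dist("Running Shoes for Trails", "running shoes")
```

Under this assumed definition, a smaller value means the title tag sits closer to the target keyword.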

Other factors of note in this industry are:

  • no_cookies: The number of cookies.
  • dom_ready_time_ms: A measure of page speed.
  • no_template_words: Counts the number of words outside the main body content section.
  • link_root_domains_links: Count of links to root domains.
  • no_scaled_images: Count of images that the browser must scale in order to render.

Every market or industry is different, so the above is not a general result for the whole of SEO!
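To show how factors could be ordered by strength for one market, here is a deliberately simplified sketch. It does not reproduce a model-based importance score: as a stand-in, it ranks factors by the absolute Pearson correlation between each factor and Google rank. The rows are hypothetical values invented for the example, and only the column names come from the list above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def rank_factors(rows, target="rank"):
    """Order factor names by absolute correlation with the target, strongest first."""
    factors = [name for name in rows[0] if name != target]
    ranks = [row[target] for row in rows]
    scores = {
        name: pearson([row[name] for row in rows], ranks)
        for name in factors
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical SERP rows, invented purely for illustration.
rows = [
    {"rank": 1, "title_keyword_dist": 0.10, "no_cookies": 4, "dom_ready_time_ms": 900},
    {"rank": 2, "title_keyword_dist": 0.20, "no_cookies": 9, "dom_ready_time_ms": 700},
    {"rank": 3, "title_keyword_dist": 0.30, "no_cookies": 3, "dom_ready_time_ms": 1500},
    {"rank": 4, "title_keyword_dist": 0.40, "no_cookies": 8, "dom_ready_time_ms": 1100},
    {"rank": 5, "title_keyword_dist": 0.50, "no_cookies": 5, "dom_ready_time_ms": 1300},
]
importance = rank_factors(rows)
```

Running the same function on rows from a different market would generally give a different ordering, which is the point made above.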

How Much Rank a Ranking Factor Is Worth

In another market, we can also see how much rank change each factor delivers.

[Chart: rank change per positive unit change in each factor]

In the chart above, we have a list of factors and the rank change for every positive unit change in that factor.

For example, for every 1-character increase in meta description length, there is a corresponding decrease in Google rank of 0.1.

Taken out of context, this sounds ridiculous. However, given that most meta descriptions are populated, it means that a unit change away from the average meta description length would lead to a decrease in Google Search ranking.
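The "rank change per unit" figure reads like a regression coefficient, so a minimal sketch of how such a number can arise is a least-squares slope. This is a single-factor fit on hypothetical data and does not control for the other ranking factors. It also assumes that a "decrease in Google rank" means a worse, numerically larger, position.

```python
def ols_slope_intercept(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations, invented for illustration.
meta_description_length = [120, 130, 140, 150, 160]  # characters
google_rank = [3.0, 4.0, 5.0, 6.0, 7.0]              # position, lower is better

slope, intercept = ols_slope_intercept(meta_description_length, google_rank)
# slope is the change in rank position per additional character
```

With this made-up data the slope works out to 0.1 positions per character, which mirrors the kind of statement made above without implying the same data or model.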

The Winning Benchmark for a Ranking Factor

Below is a graph plotting the average title tag length for a different industry to the one above, which also includes a line of best fit:

[Graph: average title tag length with a line of best fit]

Despite the best practice SEO recommendation of using up to 70 characters for title tag length, the data plotted above shows the actual optimum length in this industry to be 60 characters.

Thanks to machine learning, we can not only surface the most important factors but also, on a deeper dive, see the winning benchmark for each.
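One way to locate such a benchmark is to fit a curve to rank against the factor and read off where the curve bottoms out. The sketch below assumes a quadratic fit and that a lower rank number is better. The article only mentions a line of best fit, so the curve shape and the data are assumptions for illustration.

```python
def solve_linear_system(matrix, vector):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [value] for row, value in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; need at least 3 distinct x values")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def winning_benchmark(values, ranks):
    """Fit rank = a*u^2 + b*u + c with u = value - mean; return the minimizing value."""
    mean_x = sum(values) / len(values)
    us = [x - mean_x for x in values]  # centering keeps the sums well scaled
    s = [sum(u ** k for u in us) for k in range(5)]
    t = [sum((u ** k) * y for u, y in zip(us, ranks)) for k in range(3)]
    a, b, _c = solve_linear_system(
        [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]],
        [t[2], t[1], t[0]],
    )
    if a <= 0:
        return None  # curve has no minimum, so no single best value
    return mean_x - b / (2 * a)


# Hypothetical title lengths and average ranks, invented for illustration.
title_lengths = [40, 50, 60, 70, 80]
average_ranks = [0.01 * (x - 60) ** 2 + 3 for x in title_lengths]
best_length = winning_benchmark(title_lengths, average_ranks)
```

Returning None when the fitted curve has no minimum is a design choice for the sketch: in that case the data suggests "more is better" or "less is better" rather than a single optimum.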

Automating Your SEO Competitor Analysis with Machine Learning

The above application of machine learning is great for generating ideas to split A/B test and for improving the SEO program with evidence-driven change requests.

It’s also important to recognize that this analysis is made all the more powerful when it is ongoing.

Why?

Because the ML analysis is just a snapshot of the SERPs for a single point in time.

Having a continuous stream of data collection and analysis means you get a truer picture of what is really happening with the SERPs for your industry.

This is where purpose-built SEO data warehouse and dashboard systems come in handy, and these products are available today.

What these systems do is:

  • Ingest your data from your favorite SEO tools daily.
  • Combine the data.
  • Use ML to surface insights like those above in a front end of your choice, such as Google Data Studio.

To build your own automated system, you would deploy what is called ETL (extract, transform, and load) into a cloud infrastructure such as Amazon Web Services (AWS) or Google Cloud Platform (GCP).

To explain:

  • Extract – Daily calling of your SEO tool APIs.
  • Transform – The cleaning and analysis of your data using ML as described above.
  • Load – Depositing the finished result in your data warehouse.

Thus your data collection, analysis, and visualization are automated in one place.
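The three ETL stages described above can be sketched as a small daily job. Everything here is a stand-in: extract returns hard-coded hypothetical rows instead of calling a real SEO tool API, transform only cleans the data (the ML analysis is omitted), and an in-memory SQLite database plays the role of the data warehouse. The table and column names are invented for the example.

```python
import sqlite3
from datetime import date


def extract(snapshot_date):
    """Stand-in for daily SEO tool API calls; returns hypothetical raw rows."""
    return [
        {"url": "https://example.com/a", "keyword": "running shoes",
         "rank": "1", "title": "Running Shoes "},
        {"url": "https://example.com/b", "keyword": "running shoes",
         "rank": "2", "title": " Trail Shoes"},
        {"url": "https://example.com/c", "keyword": "running shoes",
         "rank": None, "title": "Row with no rank"},
    ]


def transform(raw_rows, snapshot_date):
    """Drop rows without a rank, cast rank to int, and derive title length."""
    cleaned = []
    for row in raw_rows:
        if row["rank"] is None:
            continue
        title = row["title"].strip()
        cleaned.append((snapshot_date.isoformat(), row["url"], row["keyword"],
                        int(row["rank"]), len(title)))
    return cleaned


def load(connection, rows):
    """Append the cleaned rows to the snapshot table, creating it if needed."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS serp_snapshots ("
        "snapshot_date TEXT, url TEXT, keyword TEXT, "
        "rank INTEGER, title_length INTEGER)"
    )
    connection.executemany(
        "INSERT INTO serp_snapshots VALUES (?, ?, ?, ?, ?)", rows
    )
    connection.commit()


def run_daily_job(connection, snapshot_date):
    """Run one extract, transform, load cycle for a single day."""
    load(connection, transform(extract(snapshot_date), snapshot_date))


connection = sqlite3.connect(":memory:")
run_daily_job(connection, date(2024, 1, 1))
run_daily_job(connection, date(2024, 1, 2))
row_count = connection.execute(
    "SELECT COUNT(*) FROM serp_snapshots"
).fetchone()[0]
```

In a cloud deployment, run_daily_job would typically be triggered by a scheduler, and the SQLite connection would be swapped for the warehouse client of your choice. Keeping one row per URL per day is what makes the ongoing, multi-snapshot analysis possible.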

TL;DR?

Competitor research and analysis in SEO is difficult because there are so many ranking factors to control for.

Spreadsheet tools are not up to the task because of the amount of data involved, and they lack the statistical capabilities that data science languages like Python offer.

When conducting SEO competitor analysis using machine learning, it’s important to understand that this is a regression problem, that the target variable is Google rank, and that the hypotheses are the ranking factors.

Using ML on your competitors can tell you what the key drivers are, identify winning benchmarks among them, and inform just how much lift in rank your optimizations can potentially deliver.

The analysis is a snapshot only, so to stay on top of the competitors, automate this process using Extract, Transform, Load (ETL).
