Majestic

Find opportunities for embedding machine learning into your processes

Lazarina Stoy

Lazarina believes that too many SEOs are manually doing tasks that could be automated, and that machine learning could be the answer.

@lazarinastoy  
From the SEO in 2022 series.

Lazarina says: "There are three main components of embedding machine learning into your processes. Firstly, the 'What' - which is to increase your efficiency exponentially. Secondly, the 'Why' - which is to focus on building systems that skyrocket performance. And thirdly, the 'How' - which is finding opportunities to utilise machine learning.

Increasing your efficiency is something that every SEO needs to do, especially as the industry becomes more competitive and search engines continuously improve their offering and algorithms. There's a lot of hunger for good SEOs because businesses and individuals are becoming a lot more aware of the importance of developing their online brands and digital presence. They need allies to help them do this efficiently, and it makes sense to think about how to increase your efficiency as an SEO because it can save you a lot of time - which is a very valuable commodity. You need to build systems internally that can free up your time to focus on strategy, and develop scalable systems for your clients.

The most exciting way to become more efficient is to seek opportunities for embedding machine learning into your processes, and there are a few steps you can take to do this. First of all, you need to become familiar with the different types of models, scripts, and tools that are available. Most of them don't even require any coding experience; you just have to get started and get your hands dirty. Then, when you encounter a new task or project, think about how you can break it down into more manageable pieces. Finally, assess the characteristics of the task, and identify which machine learning models, libraries, or scripts can become your allies in completing it. This will give you more time to focus on more scalable initiatives."

What specific SEO tasks are currently not particularly efficient, and can be aided with machine learning?

"Every task that requires you to pull exports from different systems and tools. Most of the time, we get these large chunks of data that need to be audited. The process of data science and analysis is the number one area where you can get a machine learning model or script to become your ally in identifying opportunities. Furthermore, you can easily measure the time you save after implementing a particular script or model.

For things like internal linking or technical audits, you can create scripts that identify the top opportunities using machine learning libraries, or even cluster content based on its similarity. Obviously, it will be a lot easier for a Natural Language Processing (NLP) library, or model, to go through the content of your website and cluster it than for you to read the articles and try to make sense of them. These are great opportunities to rapidly scale your auditing processes."
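As a rough illustration of the similarity idea described above, the sketch below scores pages against each other with TF-IDF weights and cosine similarity, then suggests the closest page as an internal-link candidate. It uses only the Python standard library as a stand-in for a full NLP library. The URLs, page text, and function names are hypothetical, and a real audit would read the text from a crawl export.

```python
import math
import re
from collections import Counter

# Hypothetical pages: URL -> body text (in practice, taken from a crawl export).
PAGES = {
    "/running-shoes": "best running shoes for marathon training and road running",
    "/marathon-plan": "marathon training plan for beginners with weekly running schedule",
    "/sourdough": "how to bake sourdough bread with a simple starter recipe",
    "/bread-flour": "choosing flour for sourdough bread and other bread recipes",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return {url: {term: tf-idf weight}} for a dict of url -> text."""
    tokens = {url: tokenize(text) for url, text in docs.items()}
    n_docs = len(tokens)
    # Number of documents each term appears in.
    doc_freq = Counter(term for words in tokens.values() for term in set(words))
    vectors = {}
    for url, words in tokens.items():
        counts = Counter(words)
        vectors[url] = {
            term: (count / len(words)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(docs):
    """For each page, return the other page with the highest similarity."""
    vectors = tfidf_vectors(docs)
    result = {}
    for url, vec in vectors.items():
        scored = [
            (cosine(vec, other_vec), other)
            for other, other_vec in vectors.items()
            if other != url
        ]
        result[url] = max(scored)[1]
    return result


suggestions = most_similar(PAGES)
for url, target in suggestions.items():
    print(f"{url} -> consider linking to {target}")
```

On a real site the same pairwise scores could feed a clustering step, but even this nearest-page view is usually enough to surface obvious internal-linking gaps for a human to review.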

Is this something that existing SEO auditing tools can offer or does this have to be set up manually without a pre-existing platform?

"It's a mixture of both, and it depends on the access to the tools you have. For instance, if you are working in-house, you might have access to very advanced, expensive tooling, which allows you to get more insights than you'd normally get from doing simple analysis yourself with basic tooling. There are some great tools out there that provide very insightful comments. If you want to do a very in-depth analysis - especially on larger websites - it's always good for you to know how the tools created the insights, so you can replicate it in your processes as well."

Can you recommend any processes and machine learning tools?

"It's quite easy to find all of the Python stuff from the SEO community on Twitter - so I'd definitely recommend following them to get a lot of amazing resources.

In terms of the processes, you can very quickly look into automating things like keyword clustering, extraction of the main keywords for a particular topic, labelling search intent based on the content of the article or based on the title, and looking into how to cluster different content using topic modelling algorithms. If you get a massive website audit and have a huge export from a tool like Screaming Frog, exploring this data in Python is a great starting point, before incorporating different models based on what the analysis shows you.

A couple of very quick libraries to get started with are pandas and NumPy. For visualisation, you can incorporate things like Matplotlib, and for natural language processing, things like NLTK's fuzzy matching techniques. Also, there are different clustering algorithms, but k-nearest neighbors (KNN) is the one that works well for clustering different texts.

When you have a particular task, the main thing is to break down what data you are trying to analyse. Is it numerical or is it text? Then, label the task you're trying to do. For instance, if you're analysing text data, are you trying to generate new text, cluster it, or maybe label or classify it? Once you have these two things, you can start searching for algorithms, libraries, or scripts that can help you achieve this task."
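To make the "text data, clustering task" case concrete, here is a minimal sketch of keyword grouping by fuzzy matching. It assumes the standard library's difflib in place of the NLTK techniques mentioned above, and the keyword list, threshold, and function names are hypothetical. For larger sets, a clustering algorithm such as k-means over text vectors is a common next step (k-nearest neighbours is, strictly speaking, a supervised classifier that needs labelled examples).

```python
from difflib import SequenceMatcher

# Hypothetical keyword list, e.g. pulled from a rank-tracking export.
KEYWORDS = [
    "running shoes",
    "running shoe",
    "best running shoes",
    "sourdough recipe",
    "sourdough recipes",
    "marathon training plan",
]


def similarity(a, b):
    """Character-level similarity ratio between 0 and 1."""
    return SequenceMatcher(None, a, b).ratio()


def group_keywords(keywords, threshold=0.8):
    """Greedily place each keyword in the first group whose seed
    (its first member) is at least `threshold` similar; otherwise
    start a new group. The threshold is an assumed starting point."""
    groups = []
    for keyword in keywords:
        for group in groups:
            if similarity(keyword, group[0]) >= threshold:
                group.append(keyword)
                break
        else:
            # No existing group was close enough.
            groups.append([keyword])
    return groups


groups = group_keywords(KEYWORDS)
for group in groups:
    print(group)
```

Greedy grouping depends on input order and on the threshold, so the output is best treated as a first pass to be checked by eye rather than a final taxonomy.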

Does this mean you could use machine learning to assist you with identifying content opportunities, and determining what you should be writing about next?

"Yes - but this is the second step of topic clustering. The first step is analysing your website and content using machine learning libraries to provide embeddings for all the words in the text. They work by considering the interchangeability of words and topics. For instance, if you have a lot of keywords in your content, you might imagine this is a topic that you are trying to target as well. This is the same assumption most of the tools like Semrush make when they provide you with the list of parent, seed, and 'broad match' keywords.

This analysis during the first step will show you the definitive clusters of content, and where the similarities between these clusters lie. This can give you a lot of opportunities to find out which clusters you can link together, and which are the main keywords for each cluster - so you can guide your users to discover new content.

After this, you can seek out where your topical authority is - based on the content you have. For instance, which topic is the most represented on your website? Does this align with your business proposition? If it doesn't, then you know where to expand, and invest in more content development."
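Once pages carry cluster labels, the topical-authority check described above reduces to counting. The sketch below assumes the labels already exist as the output of an earlier clustering step. The URLs, topic names, priority list, and the 20% cut-off are all hypothetical.

```python
from collections import Counter

# Hypothetical output of a clustering step: URL -> cluster label.
PAGE_CLUSTERS = {
    "/shoes-guide": "running",
    "/marathon-plan": "running",
    "/5k-tips": "running",
    "/trail-basics": "running",
    "/race-day-meals": "nutrition",
    "/foam-rolling": "recovery",
}

# Hypothetical list of topics the business wants to be known for.
PRIORITY_TOPICS = ["running", "nutrition", "injury prevention"]


def topic_share(page_clusters):
    """Return {topic: share of pages}, most represented topic first."""
    counts = Counter(page_clusters.values())
    total = sum(counts.values())
    return {topic: count / total for topic, count in counts.most_common()}


def underrepresented(page_clusters, priority_topics, min_share=0.2):
    """List priority topics whose share of pages is below `min_share`."""
    shares = topic_share(page_clusters)
    return [t for t in priority_topics if shares.get(t, 0.0) < min_share]


shares = topic_share(PAGE_CLUSTERS)
gaps = underrepresented(PAGE_CLUSTERS, PRIORITY_TOPICS)
print(shares)
print("Invest in more content for:", gaps)
```

Page counts are only a proxy for authority, so it may be worth weighting by traffic or word count before deciding where to expand.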

Should SEOs be concerned that quality may deteriorate when so many different tasks are automated?

"This is definitely something I consider when trying to implement any sort of machine learning algorithm. People should consider these scripts, tools, and libraries as allies rather than replacements for a particular process. Imagine this is like someone on their first day in SEO, because these algorithms are not typically designed for our work. There should definitely be a quality-checking step after any implementation.

Normally, you will be implementing pre-trained models. If you choose to fine-tune them on your own data, that will take additional time. If you're just looking for a simple output, such as meta descriptions generated in bulk, then you'd need to quality check the output after using a machine learning tool. Similarly, you'd need to sense-check things like automated image alt text generation and captions."
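A quality-checking step like the one described above can start with simple automated checks that route suspect output to a human. The sketch below flags bulk-generated meta descriptions that are empty, too short, too long, or duplicated. The length bounds, URLs, sample text, and function name are assumptions for illustration, not fixed rules.

```python
# Assumed length bounds in characters; adjust to your own guidelines.
MIN_LENGTH = 50
MAX_LENGTH = 160

# Hypothetical output of a bulk generation tool: URL -> meta description.
GENERATED = {
    "/a": "Compare lightweight running shoes for road and trail, "
          "with sizing tips and care advice.",
    "/b": "Running shoes.",
    "/c": "Compare lightweight running shoes for road and trail, "
          "with sizing tips and care advice.",
    "/d": "",
    "/e": "Shop our range. " * 12,
}


def check_descriptions(descriptions, min_length=MIN_LENGTH, max_length=MAX_LENGTH):
    """Return {url: [issues]} for descriptions that need human review."""
    first_seen = {}  # normalised text -> first URL that used it
    issues = {}
    for url, text in descriptions.items():
        problems = []
        cleaned = text.strip()
        if not cleaned:
            problems.append("empty")
        elif len(cleaned) < min_length:
            problems.append("too short")
        elif len(cleaned) > max_length:
            problems.append("too long")
        if cleaned:
            key = cleaned.lower()
            if key in first_seen:
                problems.append(f"duplicate of {first_seen[key]}")
            else:
                first_seen[key] = url
        if problems:
            issues[url] = problems
    return issues


review_queue = check_descriptions(GENERATED)
for url, problems in review_queue.items():
    print(url, problems)
```

Checks like these only catch mechanical problems. Accuracy, tone, and relevance still need the manual review described above.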

Can you think of anything that shouldn't be automated?

"At this point, I really don't think we should be automating content generation. The creators of GPT-3 and other models have trained them on historical data that is not updated in real-time. Of course, this will probably change in the future. The other issue is that these models have a lot of biases and make a lot of assumptions. Furthermore, when you are generating text, there is no authoritativeness. There is no trust in the text, because it's not fact-checked and doesn't provide references.

As SEOs, we know that search engines and users want authoritativeness, trust, and expertise, and we cannot safely say that our automated models can provide this yet. Until they do, I'm not sure how they can be used as the main driver of a content strategy. Things like content and user experience really need the human touch."

How far away are we from a significant percentage of the content on a website being generated automatically?

"I've seen a lot of SEOs currently running experiments with their sites, and they never put the name of the site there. You never know, it might already be the case! I really don't know - but it's an interesting future to think about. I'm sure that Google still states in their guidelines that automatically generated content is not something that is aligned with best practice. Automated content requires editing, fact-checking, and sense-checking. Hopefully, we are far away from this - but you never know."

What's one thing SEOs should stop doing to spend more time investigating the potential opportunities of machine learning?

"If you imagine an on-page optimisation project, you'll have many different mini-workstreams - such as optimising the meta descriptions, optimising the titles, and writing image captions. Break it down into chunks and try to test out all of the different scripts and models that already exist for things like generating titles, meta descriptions, H1 headings, and alt text. See how much easier it is to work with this output. Then, just edit and sense-check it, as opposed to trying to generate it yourself. You can then use the time you save to create better strategies for scaling sites."

You can find Lazarina Stoy over at LazarinaStoy.com.
