If you are an SEO beginner familiarizing yourself with SEO terms like anchor text, meta description and keywords, chances are you have heard about short-tail keywords and long-tail keywords.
Keywords are divided into these two broad categories to make them easier for SEO beginners to understand.
As you deepen your knowledge of keywords, you’ll learn that a keyword can be classified in several other ways as well, based on certain criteria.
But, for starters, there is a third category of keywords that you must be aware of besides the long tail and the short tail.
These are known as LSI keywords, and this chapter is dedicated to shedding some light on the topic.
What is LSI?
Latent Semantic Indexing is a natural language processing technology developed in the 1980s.
LSI technology is meant to solve the problem of understanding the context of a search query.
LSI helps search engines understand polysemous words and phrases, that is, words or phrases with multiple meanings.
For example, “to get” can mean to acquire and can be used in the following way: I’m going to get the book.
Another meaning of “to get” is to understand and can be used in this way: She is beginning to get it right.
You must be thinking, what does context have to do with keywords anyway? We’ll come to that later in this post.
Before that, let’s understand what LSI keywords are and how they function.
What are LSI Keywords?
LSI keywords are phrases that are semantically related to a web page’s target keyword.
For example, if you have a page on “jacket styles for winter” and your target keyword is jackets, then the LSI keywords could be leather jackets, bomber jackets, men’s jacket styles, denim jackets, etc.
As you can see above, the LSI keywords are very closely related to the head term/target keyword.
How Does LSI Work?
LSI helps search engines to understand the relationship between different words and phrases in a document by using complex mathematical formulas.
For example, the word “fall” has two different meanings. One refers to the autumn season, while the other refers to the act of falling down.
So, how do search engines determine which of these two meanings you intend?
When LSI is applied to a set of documents about seasons, it finds that “fall” appears alongside season-related terms, so the search engine can quickly figure out that a search for “fall” in that context denotes the season.
Fall is semantically related to words like winter, summer, season, etc.
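The idea that "fall" sits close to "winter" and "season" can be illustrated with a small sketch. Note that this is not full LSI: real LSI applies a singular value decomposition to a term-document matrix, and that step is left out here. The sketch only compares how often two terms occur in the same documents, using cosine similarity. The mini-corpus and the function names are made up for illustration.

```python
import math
from collections import Counter

# Hypothetical mini-corpus; each string stands in for one document.
DOCS = [
    "fall is the season between summer and winter",
    "leaves change color every fall season",
    "winter follows fall and summer follows spring",
    "the crane lifted steel beams at the construction site",
    "a tower crane is a machine that lifts heavy loads",
]

def term_vector(term, docs):
    """Return how often `term` occurs in each document."""
    return [Counter(doc.lower().split())[term] for doc in docs]

def cosine(u, v):
    """Cosine similarity of two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def relatedness(term_a, term_b, docs=DOCS):
    """Score how strongly two terms co-occur across the documents."""
    return cosine(term_vector(term_a, docs), term_vector(term_b, docs))

if __name__ == "__main__":
    for other in ("winter", "season", "crane"):
        print(other, round(relatedness("fall", other), 3))
```

In this toy corpus, "fall" scores well above zero against "winter" and "season" and exactly zero against "crane", because "fall" and "crane" never share a document.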
What Makes LSI Keywords Important?
In earlier days, Google would figure out a page’s topic based on keywords used in it.
So, if you repeated a keyword multiple times on your page, Google would conclude that your page was about it.
This is why keyword density was of such importance back in the day.
However, Google has become smarter over the years.
Google now tries to understand a page’s overall topic by relying on LSI keywords used throughout the content.
This helps the search engine to display the most accurate results on SERP for any search term.
Otherwise, imagine the plight of millions of users searching for cranes (the machines used to lift objects) only to be fed endless information about crane birds.
How does Google do it?
Google will take into consideration the title tag, URL, headlines, content, image alt text, primary and secondary keywords along with the LSI keywords to properly determine the exact topic of your page.
LSI keywords should not be confused with synonyms, which are simply different words with the same meaning.
For example, a synonym of the keyphrase “cars for sale” could be “automobiles for sale”, but the LSI keywords would be used cars for sale, luxury cars for sale, second-hand cars for sale, etc.
While some LSI keywords can be synonyms of the target keyword, not all synonyms are LSI keywords.
Tools to Find LSI Keywords
Now that you are familiar with LSI keywords and their importance, the next step is to learn how to find LSI keywords and use them in your content.
We’ll list below ten different ways to find LSI keywords for any web page.
Google Autocomplete
Google Autocomplete is one of the easiest and fastest ways to find LSI keyword opportunities for your content.
To find LSI keywords, just type the primary keyword in Google search and Google will try autocompleting it with a list of suggestions.
These are your LSI keywords because these are the terms that people usually look for when they search for something related to your main keyword.
You can also use SEO tools like UberSuggest and Keyword Tool to get LSI keyword suggestions.
People Also Ask
Google’s “People Also Ask” (PAA) section can also give you a fair idea of the LSI keywords to target in your content.
The PAA box can appear at different positions on SERP.
The PAA questions are closely related to your main query and can be utilized to get LSI keyword ideas.
LSIGraphs and LSIKeywords.com
LSIGraphs and LSIKeywords.com are two dedicated tools that can be used to find LSI keywords for any given search term.
They work in a similar manner. Visit either tool and type in the keyword you want to rank for.
You will get a list of LSI keywords to include in your content.
Searches Related to
This feature works in a similar manner to Google Autocomplete.
Instead of suggesting LSI keywords in the search bar, it gives you a list of related keyword suggestions at the bottom of the SERP.
Choose the terms that best suit your content and use them.
Bold Terms in Google Snippet Descriptions
Another way to find LSI keywords on Google is to look at the bold terms in result snippets.
You might have noticed that Google bolds terms in result snippets that match your keyword.
You will also notice that Google bolds not only exact matches of what you searched for but also phrases that are similar.
These similar words and phrases are LSI keywords.
Google Keyword Planner
Google Keyword Planner is one of the most effective keyword research tools.
When you put a search term/phrase in the Planner, you’ll get a list of keyword ideas.
Some of them will be variations of your main keyword, which can be used as LSI keywords.
Google Image Tags
Enter your target keyword in Google Images and Google will return a set of related terms above the image results.
You can utilize some of these as LSI keywords.
SERPStat
To find LSI keywords in SERPStat, enter your target keyword and then click on the “Related keyword” section of the tool.
This will give you a list of LSI keywords to target.
Answer the Public
Answer the Public is a free tool that allows you to find LSI keyword ideas in the “Related” section of the tool.
It analyzes data from autocomplete searches from the web and shows you search results related to the primary keyword.
All you need to do is enter the target keyword in the search bar and the tool will return a list of questions based on the keyword entered.
Also Asked
Also Asked is another useful tool that uses data from the “People Also Ask” section and shows more long-tail results for a given keyword.
The tool will take your keyword and show you what other questions people also ask related to it.
What is TF-IDF?
TF-IDF stands for term frequency-inverse document frequency.
It is a statistical measure of how relevant a word is to a document within a set of documents. It is calculated by multiplying two different metrics:
- How many times a word appears in a particular document, and
- The inverse document frequency of the word across a set of documents
TF-IDF is widely used for scoring words in NLP (natural language processing) and machine learning.
It was mainly developed for document search, information retrieval and text mining.
TF-IDF increases in proportion to the number of times a word appears in a document and decreases as the number of documents containing the word grows.
Therefore, certain words that appear repeatedly in most documents like “what”, “if”, “this”, and so on, rank low since they don’t add much value to any document in particular.
However, when a word appears many times in a document without appearing many times in the other documents in the set, it usually means that the word is very relevant to that document.
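This behavior can be seen in a minimal sketch over a tiny document set. The three "pages" below are invented for illustration, the function name tf_idf is an assumption, and the base-10 logarithm is one common choice rather than the only one.

```python
import math

# Hypothetical pages; the text is invented purely for illustration.
DOCS = {
    "page_a": "this guide explains what seo keywords are",
    "page_b": "this recipe explains what to bake if guests arrive",
    "page_c": "this seo checklist covers seo audits and seo reports",
}

def tf_idf(term, doc_id, docs):
    """TF-IDF of `term` in one document, using length-normalized TF."""
    words = docs[doc_id].split()
    tf = words.count(term) / len(words)
    # Number of documents that contain the term at least once.
    containing = sum(1 for text in docs.values() if term in text.split())
    if containing == 0:
        return 0.0
    idf = math.log10(len(docs) / containing)
    return tf * idf

if __name__ == "__main__":
    for term in ("this", "seo", "checklist"):
        print(term, round(tf_idf(term, "page_c", DOCS), 4))
```

Because "this" occurs in every page, its inverse document frequency is log(1) = 0 and its score is 0 everywhere. "seo" occurs in only two of the three pages and three times in page_c, so it scores higher there than in page_a.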
How is TF-IDF Calculated?
The term frequency (TF) can be calculated in several ways. The simplest method is to count the number of times a word appears in a particular document.
The frequency can then be adjusted by dividing it by the length of the document or by the raw frequency of the most commonly used word in the document.
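The three term-frequency variants mentioned above can be sketched side by side. The function name and the dictionary keys are assumptions chosen for this example, and splitting on whitespace is a simplification of real tokenization.

```python
from collections import Counter

def term_frequencies(text, term):
    """Return three common term-frequency variants for `term` in `text`."""
    words = text.lower().split()
    counts = Counter(words)
    raw = counts[term]
    return {
        # Plain number of occurrences.
        "raw_count": raw,
        # Occurrences divided by the document length in words.
        "length_normalized": raw / len(words) if words else 0.0,
        # Occurrences divided by the count of the most frequent word.
        "max_normalized": raw / max(counts.values()) if counts else 0.0,
    }

if __name__ == "__main__":
    sample = "seo tips and seo tools and more and more"
    print(term_frequencies(sample, "seo"))
```

In the sample sentence, "seo" appears 2 times out of 9 words, and the most frequent word ("and") appears 3 times, so the three variants come out as 2, 2/9 and 2/3.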
The inverse document frequency measures how common or rare a word is across a set of documents. The closer the measure is to 0, the more common the word is across documents; the rarer the word, the higher the measure.
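To put TF-IDF in a mathematical equation, the TF-IDF score for the word t in document d from the document set D is calculated as follows:
$$\mathrm{tfidf}(t, d, D) = \mathrm{tf}(t, d) \times \mathrm{idf}(t, D)$$
where tf(t, d) is the term frequency of t in d, and idf(t, D) = log(N / n_t), with N the number of documents in D and n_t the number of documents that contain t.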
Example: Consider a document containing 100 words wherein the word seo appears 3 times. The term frequency (i.e., TF) for seo is then (3 / 100) = 0.03.
Now, assume we have 10 million documents, and the word seo appears in one thousand of these.
Then, the inverse document frequency (i.e., IDF) is calculated with a base-10 logarithm as log(10,000,000 / 1,000) = 4.
Thus, the TF-IDF weight is the product of these quantities: 0.03 * 4 = 0.12.
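The arithmetic in this example can be checked with a few lines of code. The helper names tf and idf are chosen here for illustration, and the base-10 logarithm is assumed because it is the base that yields an IDF of 4 for these numbers.

```python
import math

def tf(term_count, total_words):
    """Length-normalized term frequency."""
    return term_count / total_words

def idf(total_docs, docs_with_term):
    """Inverse document frequency with a base-10 logarithm."""
    return math.log10(total_docs / docs_with_term)

tf_seo = tf(3, 100)                # "seo" appears 3 times in 100 words
idf_seo = idf(10_000_000, 1_000)   # 1,000 of 10 million documents contain "seo"
score = tf_seo * idf_seo

print(f"TF = {tf_seo:.2f}, IDF = {idf_seo:.2f}, TF-IDF = {score:.2f}")
# prints: TF = 0.03, IDF = 4.00, TF-IDF = 0.12
```

With a natural logarithm instead, the IDF would be about 9.21 and the final score about 0.28, so scores are only comparable when the same base is used throughout.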
TF-IDF & LSI Keywords
Search engines can use TF-IDF to measure how important an LSI keyword is to a specific document.
It gives us a way to associate each word in a document with a number that measures the relevance of each word to that document.
Tools like Website Auditor by SEO PowerSuite can help you with TF-IDF analysis for top sites.
By doing this, you can easily analyze which LSI keywords are being targeted by your competitors.
If you find an LSI keyword in the list for which your TF-IDF score is 0, your page most likely does not use that term at all, so consider updating your existing content to cover it.
TF-IDF is also useful for extracting keywords from text.
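This kind of gap check can be sketched in code, assuming you already have TF-IDF scores for your competitors' pages and for your own page as plain dictionaries. The function name, the data layout and the numbers are all hypothetical; this is not the export format of Website Auditor or any other tool.

```python
def keyword_gaps(competitor_scores, own_scores):
    """List terms competitors score above 0 for while your page scores 0.

    Both arguments map a term to its TF-IDF score. Terms missing from
    `own_scores` are treated as having a score of 0.
    """
    gaps = [
        (term, score)
        for term, score in competitor_scores.items()
        if score > 0 and own_scores.get(term, 0.0) == 0.0
    ]
    # Strongest competitor terms first.
    return sorted(gaps, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # Made-up scores for illustration only.
    competitors = {"lsi keywords": 0.42, "semantic search": 0.31, "tf-idf": 0.18}
    mine = {"lsi keywords": 0.35, "semantic search": 0.0}
    for term, score in keyword_gaps(competitors, mine):
        print(term, score)
```

With these sample values, "semantic search" and "tf-idf" are reported as gaps, in that order, while "lsi keywords" is not, since the page already covers it.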
The highest scoring words in a document are the most relevant to that document and are therefore considered to be the most ideal choice of keywords for that particular document.
It is recommended that you do LSI keyword research beforehand for better optimization, instead of trying to add them later in your content.
In the image above, the Website Auditor tool has scraped the top ten results for the term “LSI Keywords”.
It has generated a huge list of multi-word and single-word keywords.
The tool scores each of these terms using TF-IDF, and it also displays the TF-IDF of your page so that you can semantically optimize it.