Googlebot User Agents and Strings – Complete 2024 List
By: Dileep Thekkethil | Updated On: May 3, 2024
You may wonder how Google can find trillions of web pages. The truth is that Google depends heavily on the spiders, or bots, it has built to scan the whole web and identify pages.
How, you may ask? The crawlers, also called Googlebot user agents (yet another name for the spiders), follow the trails of links that connect one page to another.
The most common Google crawler that visits your website is Googlebot; however, Google uses multiple other user agents for different products and features.
This blog post will explore the typical Googlebot user agents you may encounter in your referrer logs. Knowing these bots will help you to specify them in robots.txt, the robots meta tags, and the X-Robots-Tag HTTP rules.
As mentioned, Google uses multiple user agents to understand the type of page so that content can surface in the other products, features, and services the search engine giant offers.
By identifying the user agent token, you can specify rules in the User-agent line of your website's robots.txt. Some Google crawlers use more than one token; however, a rule applies as long as it matches one of the crawler's tokens.
Here is an exhaustive list of Google crawlers and the corresponding full user agent strings you might come across in your site's log files.
Caution: The user agent string can be spoofed. Learn how to verify if a visitor is a Google crawler.
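Google's recommended check is a two-step DNS verification: reverse-resolve the visiting IP address, confirm the hostname ends in googlebot.com or google.com, then forward-resolve that hostname and confirm it maps back to the same IP. A minimal Python sketch of that check (the function name is illustrative):

```python
import socket

def is_verified_google_crawler(ip: str) -> bool:
    """Return True only if the IP reverse-resolves to a Google hostname
    and that hostname forward-resolves back to the same IP."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse (PTR) lookup
    except OSError:
        return False
    # Genuine Google crawlers resolve to these domains per Google's docs.
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # Forward-confirm: the hostname must resolve back to the same IP.
        return ip in socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return False
```

Spoofed requests fail at the first step, because the attacker does not control the PTR records for the IP addresses they send from.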
List of Googlebot User Agents and Associated Full User Agent Strings
Common Googlebot User Agents
These common Googlebot User Agents play a crucial role in indexing web content and ensuring that the search engine provides relevant, up-to-date results. These user agents are designed to respect the guidelines laid out in the robots.txt file, which tells search engines which parts of a website should not be crawled or indexed.
1. Googlebot Image
This crawler is specifically designed to index images on the web.
Full user agent string: Googlebot-Image/1.0
2. Googlebot News
This crawler indexes news content for Google News.
Full user agent string: The Googlebot-News user agent uses the various Googlebot user agent strings.
3. Googlebot Video
This crawler is specifically designed to index video content on the web.
Full user agent string: Googlebot-Video/1.0
4. Googlebot Desktop
This crawler indexes desktop web pages for Google Search.
Full user agent strings:
- Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36
- Googlebot/2.1 (+http://www.google.com/bot.html)
5. Googlebot Smartphone
This crawler indexes smartphone web pages for Google Search.
Full user agent string: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
6. Google Favicon
This crawler fetches favicons (website icons) associated with web pages.
Full user agent string:
- Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36 Google Favicon
7. Google StoreBot
This crawler is responsible for indexing and analyzing web pages related to Google’s online store.
Full user agent strings:
- Desktop agent: Mozilla/5.0 (X11; Linux x86_64; Storebot-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36
- Mobile agent: Mozilla/5.0 (Linux; Android 8.0; Pixel 2 Build/OPD3.170816.012; Storebot-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Mobile Safari/537.36
8. GoogleOther
A generic crawler used by various Google product teams for fetching publicly accessible content from sites for internal research and development.
Full user agent string: GoogleOther
9. Google-InspectionTool
Google-InspectionTool is the newest addition to Google’s list of user agents. Google has made it official by adding this information to the Google crawler help document.
Google says, “Google-InspectionTool is the crawler used by Search testing tools such as the Rich Result Test and URL inspection in Search Console. Apart from the user agent and user agent token, it mimics Googlebot.”
The user agent token for this crawler is either Googlebot or Google-InspectionTool.
It also comes with two full user agent strings, one for mobile and another one for desktop.
- Mobile
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0)
- Desktop
Mozilla/5.0 (compatible; Google-InspectionTool/1.0)
John Mueller, Senior Search Analyst / Search Relations team lead at Google, confirmed, “The update is now complete.”
If you check the bot activity and crawling activity in your log files, you may spot Google-InspectionTool, particularly if you use the Rich Result Test and URL inspection in Google Search Console.
If you run into issues with these tools, chances are you are blocking the Google-InspectionTool user agent's access to your website. Make sure you allow it.
10. Google-Extended
Google-Extended is a product token that lets you control whether your content is used to improve Bard AI and Vertex AI, Google’s machine learning platform for developing generative AI products.
Google says Google-Extended is a “standalone product token that web publishers can use to manage whether their sites help improve Bard and Vertex AI generative APIs, including future generations of models that power those products.”
With Google-Extended, you can tell Google not to use your website content or parts of it for Google’s AI projects. So, how do you do that?
To restrict Bard and Vertex AI from accessing your content, add a rule for the user agent Google-Extended in your robots.txt.
This way, you can still let Google crawl, index, and rank your website while disallowing Bard, Vertex AI, and future Google AI products from using your content.
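For example, a robots.txt that opts the whole site out of Bard and Vertex AI training while leaving normal Search crawling open might look like this:

```text
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /
```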
Special-case Googlebot User Agents
Special-case crawlers, as the name suggests, are designed for specific purposes and may not adhere to the general robots.txt rules like common crawlers do. These crawlers are typically used when there is an agreement or understanding between the website owner and the product that utilizes the crawler.
1. APIs-Google
This user agent is used by Google APIs to deliver push notification messages to webmasters and developers.
Full user agent string: APIs-Google (+https://developers.google.com/webmasters/APIs-Google.html)
2. AdsBot Mobile Web Android
This crawler checks the ad quality on Android mobile web pages.
Full user agent string: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)
3. AdsBot Mobile Web
This crawler checks the ad quality on iPhone mobile web pages.
Full user agent string: Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)
4. AdsBot
This crawler checks the ad quality on desktop web pages.
Full user agent string: AdsBot-Google (+http://www.google.com/adsbot.html)
5. AdSense
This crawler is responsible for analyzing web pages to determine the relevance of content for displaying targeted ads.
Full user agent string: Mediapartners-Google
6. Mobile AdSense
This crawler is responsible for analyzing mobile web pages to determine the relevance of content for displaying targeted ads on mobile devices.
Full user agent string: (Various mobile device types) (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html)
User-triggered Fetchers
User-triggered fetchers are designed to perform specific functions upon user request, and are not a part of the regular crawling process. Since these fetchers are initiated by users, they generally do not follow the robots.txt rules, as the user’s intent is to access specific information or perform a certain action.
1. Google Site Verifier
This fetcher is used to verify website ownership within Google Search Console.
Full user agent string: Mozilla/5.0 (compatible; Google-Site-Verification/1.0)
2. Feedfetcher
This fetches RSS and Atom feeds for Google services such as Google News and Google Alerts.
Full user agent string: FeedFetcher-Google; (+http://www.google.com/feedfetcher.html)
3. Google Read Aloud
This fetches web pages to generate audio versions of the content, which can be played back using text-to-speech technology.
Full user agent strings:
- Desktop agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36 (compatible; Google-Read-Aloud; +https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers)
- Mobile agent: Mozilla/5.0 (Linux; Android 7.0; SM-G930V Build/NRD90M) AppleWebKit/537.36 (KHTML)
4. Google Publisher Center
This fetches and processes feeds supplied by publishers through the Google Publisher Center for Google News landing pages.
Full user agent string: GoogleProducer; (+http://goo.gl/7y4SX)
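As a quick way to see which of the crawlers and fetchers above are visiting your site, you can scan your access log for their user agent tokens. A minimal Python sketch (the token list is drawn from the sections above; more specific tokens are checked first so that, e.g., Googlebot-Image is not counted as Googlebot):

```python
from collections import Counter

# Tokens from the lists above; order matters: specific before generic.
GOOGLE_TOKENS = [
    "Googlebot-Image", "Googlebot-News", "Googlebot-Video",
    "Storebot-Google", "GoogleOther", "Google-InspectionTool",
    "AdsBot-Google-Mobile", "AdsBot-Google", "Mediapartners-Google",
    "APIs-Google", "Google-Site-Verification", "FeedFetcher-Google",
    "Google-Read-Aloud", "GoogleProducer", "Google Favicon",
    "Googlebot",  # generic token last
]

def count_google_crawlers(log_lines):
    """Count hits per Google crawler token; the first matching token wins."""
    hits = Counter()
    for line in log_lines:
        for token in GOOGLE_TOKENS:
            if token in line:
                hits[token] += 1
                break
    return hits
```

You could feed it lines from your access log, e.g. `count_google_crawlers(open("access.log"))`. Remember that user agent strings can be spoofed, so pair this with the DNS verification described earlier for anything security-sensitive.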
Where and How to Use User Agents on Your Website
User Agents in Robots.txt
You can allow or block specific Google crawlers from accessing your content by adding them as a User-agent inside the robots.txt file. For example:
User-agent: Googlebot
Disallow: /private/
User-agent: Googlebot-Image
Allow: /public/images/
Disallow: /private/images/
User-agent: *
Disallow: /private/
Directives inside robots.txt that Google respects
- User-agent: The User-agent directive specifies the web crawler or user agent that the following rules apply to. It can target a specific crawler (e.g., User-agent: Googlebot) or use the wildcard * to target all crawlers (e.g., User-agent: *).
- Disallow: The Disallow directive tells the web crawler not to access or crawl the specified URLs or URL patterns. For example, Disallow: /private/ would prevent the crawler from accessing any content within the /private/ directory.
- Allow: The Allow directive is used to grant permission for a web crawler to access specific URLs or URL patterns, even if they are within a disallowed directory. This directive is not part of the original robots.txt standard, but it is widely supported by major search engines like Google. For example, Allow: /private/public-content/ would permit a crawler to access the /private/public-content/ directory, even if the /private/ directory is disallowed.
- Sitemap: The Sitemap directive is used to specify the location of your XML sitemap, which helps search engines discover your site’s content more efficiently. For example, Sitemap: https://www.example.com/sitemap.xml.
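Putting these directives together, a complete robots.txt might look like this (the paths and sitemap URL are placeholders):

```text
User-agent: Googlebot
Disallow: /private/
Allow: /private/public-content/

User-agent: *
Disallow: /private/

Sitemap: https://www.example.com/sitemap.xml
```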
User Agents in Robots Meta Tags
Some pages use multiple robots meta tags to specify rules for different crawlers. When rules conflict, Google applies the sum of the negative rules; in the example below, Googlebot would follow both the noindex and nofollow rules. For example:
<!-- Example: Allowing Googlebot to index, but not follow links -->
<meta name="googlebot" content="index, nofollow">
<!-- Example: Disallowing all crawlers from indexing and following links -->
<meta name="robots" content="noindex, nofollow">
Directives Inside Robots Meta Tags
- Meta name: The meta name attribute is set to "robots" to specify that this meta tag provides instructions for web crawlers. For example: <meta name="robots" content="…">.
- Content: The content attribute contains the instructions for the web crawlers in the form of directives. Multiple directives are separated by commas. Some common directives include:
- index or noindex: Tells the crawler whether the page should be indexed. For example: <meta name="robots" content="noindex"> prevents the page from being indexed.
- follow or nofollow: Instructs the crawler whether or not to follow the links on the page. For example: <meta name="robots" content="nofollow"> tells the crawler not to follow any links on the page. Note, however, that Google treats nofollow as a hint rather than a directive, and may decide whether to honor it depending on the circumstances.
- archive or noarchive: Determines whether the search engine may display a cached version of the page. For example: <meta name="robots" content="noarchive"> prevents the search engine from displaying a cached version.
- Specific user agent: Instead of targeting all crawlers using the "robots" meta name, you can target specific user agents by replacing "robots" with the name of the user agent. For example: <meta name="googlebot" content="…"> targets only the Googlebot crawler.
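The same directives can also be sent in an X-Robots-Tag HTTP header, which is useful for non-HTML files such as PDFs that cannot carry a meta tag. A sketch for an Apache server, assuming mod_headers is enabled:

```text
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, noarchive"
</FilesMatch>
```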