Googlebot Link Crawler: Gary Illyes Explains How It Works
- Aug 9, 2024
Imagine you’ve been operating under a belief for years: Googlebot, the sophisticated crawler at the heart of the world’s most powerful search engine, navigates websites just like a visitor would—clicking through links and moving from one page to another in real-time.
This belief has shaped my SEO strategies, influencing how I design a website’s architecture, manage internal linking, and prioritize crawl budgets.
But what if this belief was only partially true?
In a recent episode of Google's Search Off The Record podcast, Google Analyst Gary Illyes offered a revelation that could change how we SEOs think about Googlebot's behavior.
The insight shared was simple yet profound: Googlebot doesn’t follow links in real-time, as we might have imagined.
Instead, it operates in a way that’s more methodical, more strategic, and perhaps more complex than we’ve given it credit for.
The Reality Check: Googlebot Gathers, Then Crawls
Let’s rethink the idea of Googlebot “following” links.
According to Illyes, the reality is that Googlebot collects links during its initial crawling of your site.
Gary Illyes says, “we keep saying Googlebot is following links, but no, it’s not following links. It’s collecting links, and then it goes back to those links.”
Picture it as a meticulous librarian gathering a list of all the books (or, in this case, URLs) it wants to check out later. Only after compiling this list does it return to those links to perform the actual crawling. (If you look at Google’s patents closely, most of them are based on document retrieval.)
This two-step process—collect first, crawl later—differs from the traditional view that Googlebot continuously navigates your site in real time. It’s not hopping from link to link instantly but planning its route, ensuring it captures every possible path before starting its journey.
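To make the distinction concrete, here is a minimal sketch of a collect-then-crawl loop in Python. This is purely illustrative of the two-phase idea, not Google's actual implementation: the `fetch` callback, the site structure, and the breadth-first queue are all assumptions for the example.

```python
from html.parser import HTMLParser
from collections import deque

class LinkCollector(HTMLParser):
    """Gathers every href from a page's anchor tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def two_phase_crawl(start_url, fetch):
    """Collect links from each page into a queue; only later
    pull queued URLs back out for their own crawl."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()       # crawl a previously collected URL
        order.append(url)
        parser = LinkCollector()
        parser.feed(fetch(url))     # gather every link on the page...
        for link in parser.links:   # ...before visiting any of them
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

# A toy four-page site standing in for real HTTP fetches
site = {
    "/": '<a href="/a">A</a><a href="/b">B</a>',
    "/a": '<a href="/deep">Deep</a>',
    "/b": "",
    "/deep": "",
}
print(two_phase_crawl("/", lambda url: site[url]))
# → ['/', '/a', '/b', '/deep']
```

Notice that `/deep` is discovered while reading `/a` but is not fetched until every earlier-collected URL has been processed; the crawler plans its route first rather than chasing each link the moment it sees it.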
What This Means for Your SEO Strategy
Now, you might be wondering: what does this mean for how I optimize my site?
Should I be doing things differently?
The short answer: yes. This new perspective on Googlebot's behavior invites a reevaluation of how you manage your site's structure, internal linking, and crawl budget.
Crawl Budget: Beyond the Basics
Think about how you’ve been managing your crawl budget. If Googlebot collects links first and then decides when and what to crawl, the initial phase may be less resource-intensive than the actual crawling. This could give you more flexibility in prioritizing certain pages, ensuring that Googlebot spends its crawl budget where it matters most.
Site Architecture: Revisiting the Depth
You’ve probably heard countless times that a shallow site structure is best for SEO, making it easier for Googlebot to find your important pages.
But with this new understanding, it’s clear that Googlebot isn’t getting “lost” in deep pages—it’s collecting all the links first.
This doesn’t mean you should abandon a logical site structure, but it does suggest that the fear of hiding valuable content too deep might be less of an issue than previously thought.
After all, crawl budget shouldn't be a worry for websites with only a few hundred pages. If your website has 10,000+ pages, though, crawl budget optimization becomes worth serious attention.
Crawl Frequency: Decoding the Patterns
Have you ever noticed that some pages on your site get crawled more often than others, even if they aren’t at the top of your hierarchy?
This could be because, after collecting the links, Googlebot prioritizes certain URLs based on their perceived importance—something that goes beyond just their position in your site’s architecture.
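One way to picture that prioritization is a scored queue: collected URLs are crawled in order of importance, not in the order they were found. The scores below are hypothetical stand-ins (Google's real signals aren't public), and the max-heap is just one simple way to model "most important first."

```python
import heapq

def crawl_order(scores):
    """Return URLs sorted by descending importance score,
    using a max-heap (negated scores, since heapq is a min-heap)."""
    heap = [(-score, url) for url, score in scores.items()]
    heapq.heapify(heap)
    order = []
    while heap:
        neg_score, url = heapq.heappop(heap)
        order.append(url)
    return order

# Hypothetical importance scores, e.g. derived from internal linking
scores = {
    "/old-archive/page-7": 0.1,
    "/": 1.0,
    "/popular-post": 0.9,
}
print(crawl_order(scores))
# → ['/', '/popular-post', '/old-archive/page-7']
```

Under a model like this, a deeply nested but heavily linked page can still jump the queue ahead of a shallow page nobody links to, which matches the crawl-frequency patterns described above.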
A New Way of Thinking About Crawling
This shift in perspective doesn’t render your existing strategies obsolete, but it suggests that we might have oversimplified Googlebot’s behavior.
Knowing that Googlebot is more like a collector than a follower changes how we should approach SEO.
From now on, when you think about optimizing your site, consider how Googlebot is gathering its data. Ensure that your important pages are well-linked, not just to make them easier to find, but to ensure they’re prioritized when Googlebot goes back to crawl.
The Takeaway
In the world of SEO, staying ahead of the curve often means challenging what we think we know. This revelation about Googlebot’s true crawling process reminds us that even the most fundamental aspects of search engine behavior can be more complex than they appear. As we continue to learn more about how Google operates, we must adapt and refine our strategies accordingly.
So, next time you plan your SEO strategy, remember: Googlebot isn’t just following links—it’s gathering them, strategizing its crawl, and ensuring it captures the full scope of your site before it digs deeper. And that small shift in perspective might just make all the difference.
