On July 22, 2025, Google’s John Mueller made an interesting recommendation: you might want to “noindex” your [llms.txt file](https://www.stanventures.com/news/google-dismisses-llms-txt-as-ineffective-and-unused-by-ai-bots-2479/). 

The reason? It can show up in search results when linked from other sites, confusing users who stumble upon it.

Mueller wrote on Bluesky, “That said, using noindex for it could make sense, as sites might link to it and it could otherwise become indexed, which would be weird for users.”

So what exactly is going on here?

## What Exactly Did Google Say?

On July 22, 2025, John Mueller posted on [Bluesky](https://bsky.app/profile/johnmu.com/post/3luhnpb4wdk2f), saying:

“Using noindex for it could make sense, as sites might link to it and it could otherwise become indexed, which would be weird for users.”

![John Mueller post on Bluesky](https://www.stanventures.com/news/wp-content/uploads/2025/07/John-Mueller-post-on-Bluesky.png)

This came in response to a direct question from a user who asked whether Google might view LLMs.txt as duplicate content, and whether it should be noindexed to avoid confusion.

To that, John clarified:

“It would only be duplicate content if the content were the same as an HTML page, which wouldn’t make sense (assuming the file itself were useful). That said, using noindex for it could make sense…”

So, we are looking at two things here:

- It is not duplicate content, but…
- It might still end up indexed in Google and that is not ideal for users.

## What Is LLMs.txt?

If you have been working with [robots.txt](https://www.stanventures.com/blog/robots-txt-guide/) for years, think of llms.txt as a similar root-level text file that is aimed not at search engine crawlers but at large language models (LLMs) like OpenAI’s GPT, Anthropic’s Claude, or Google’s own Gemini.

![What Is LLMs.txt](https://www.stanventures.com/news/wp-content/uploads/2025/07/What-Is-LLMs.txt.png)

First proposed in late 2024 and picked up by some sites in early 2025, LLMs.txt is a voluntary convention: a plain-text (Markdown) file at the root of a site that gives AI models a concise, curated overview of the site and links to its most useful content. Unlike robots.txt, it is not an access-control file and does not tell crawlers what to avoid.

Think of it like saying:

“Hey AI, here is a short guide to this site and the pages most worth reading.”

No major AI company has formally endorsed or universally adopted this format yet but publishers are proactively adding it anyway.

## Why Is Noindex Even a Thing for a Non-HTML File?

Great question.

In theory, .txt files are not really meant to be indexed. But Google’s crawler does not discriminate strictly by file type. If someone links to your llms.txt file (which is likely when bloggers or tech analysts write about your llms.txt setup), that file can get crawled, indexed, and shown in search results.

That is where things get messy.

Imagine a user stumbles upon your llms.txt page via a Google search result and expects rich content, only to see a bare plain-text file written for machines.

That’s confusing, irrelevant and frankly a bad user experience.

Mueller is essentially saying:

“Look, even if it is not duplicate content, why let it appear in search results at all? Just tell Google to keep it out of the index. It still exists and is still available to LLMs, but it does not clutter your SERPs.”

## Will This Impact How AI Models Use Your Content?

Not directly. Let us be very clear that no AI system has officially adopted llms.txt yet. John himself confirmed:

“As far as I know no AI system currently uses llms.txt.”

So, whether your llms.txt is indexed or noindexed doesn’t change how OpenAI or Anthropic scrapes your site. 

They have not yet decided whether to honor this new file format at all.

However, that has not stopped publishers from deploying llms.txt as a form of public signaling. Some of the internet’s biggest media players (like The New York Times, CNN, and Reuters) have already deployed it. 

Even if it is not technically enforced, the file serves as a statement of intent, much like robots.txt did in its early days.

## Could There Be SEO Risks in Letting It Get Indexed?

Let us think about SEO for a moment. Would having your llms.txt file indexed harm your rankings?

Unlikely. But there is zero SEO benefit in having it indexed either.

In fact, it could:

- Waste your crawl budget (especially for massive sites).
- Appear in site:yourdomain.com searches and dilute brand perception.
- Show up in Google Search Console coverage reports as an indexed file with no real SEO value.

In short, this is clutter and SEO is all about clarity.

Adding a noindex header ensures the file does its job (being seen by AI agents) without leaking into search results where it does not belong.

## How to Noindex Your LLMs.txt File — The Right Way

If you are convinced or just curious, here is how you can implement this.

![How to Noindex Your LLMs.txt File](https://www.stanventures.com/news/wp-content/uploads/2025/07/How-to-Noindex-Your-LLMs.txt-File.png)

Let’s say your LLMs.txt file is hosted at:

`https://yourdomain.com/llms.txt`

Add the following X-Robots-Tag HTTP header:

`X-Robots-Tag: noindex`

This tells search engines, including Googlebot, not to index the file, even if they can access and crawl it.

It is clean and non-intrusive, and it does not affect the file’s discoverability for AI crawlers, which may not even honor noindex directives.
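In practice this header is usually set in the web server or CDN configuration. As a rough illustration of the idea only, here is a minimal sketch using Python’s standard-library `http.server` (which is not intended for production use). The names `NOINDEX_PATHS`, `extra_headers`, `NoindexHandler`, and `make_server` are hypothetical, and the sketch assumes the file lives at `/llms.txt` in the served directory.

```python
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import urlsplit

# Assumption: llms.txt sits at the site root. Adjust if yours lives elsewhere.
NOINDEX_PATHS = {"/llms.txt"}


def extra_headers(request_path):
    """Return extra response headers for a request path (query string ignored)."""
    path = urlsplit(request_path).path
    if path in NOINDEX_PATHS:
        return [("X-Robots-Tag", "noindex")]
    return []


class NoindexHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds X-Robots-Tag: noindex for llms.txt only."""

    def end_headers(self):
        # Headers are buffered until end_headers(), so they can be appended here.
        for name, value in extra_headers(self.path):
            self.send_header(name, value)
        super().end_headers()


def make_server(directory=".", host="127.0.0.1", port=8000):
    """Build a server that serves `directory` with the noindex rule applied."""
    handler = partial(NoindexHandler, directory=directory)
    return ThreadingHTTPServer((host, port), handler)
```

Calling `make_server().serve_forever()` from the directory that contains `llms.txt` would serve the file with the extra header, while every other path is served unchanged.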

## So… Should You Noindex Yours?

If your llms.txt file is public and linked to (even unintentionally) and you don’t want random users landing on a raw text file in search results… then yes, absolutely.

You have nothing to lose and a little clarity to gain.

On the flip side, if you are using llms.txt as a kind of brand statement or want it visible for PR purposes (say, showing journalists you’re AI-conscious), you may want to keep it indexable, at least for now.

But remember, Google’s stance is pragmatic:

- It is not duplicate content.
- It does not help your rankings.
- But it may confuse users.

So, unless you have a strong reason to showcase your llms.txt file to the world, just add the noindex and be done with it.
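To confirm that the header is actually being returned, a small check along these lines may help. This is a simplified sketch, not a full implementation of how search engines parse the header: `blocks_indexing` and `fetch_x_robots_tag` are hypothetical helper names, the default `googlebot` user agent is an assumption, and the parsing only covers the common forms (`noindex`, `none`, and an optional crawler-name prefix such as `googlebot: noindex`).

```python
import urllib.request

# Directives written as "name: value"; a colon after these is not a crawler prefix.
VALUE_DIRECTIVES = {
    "unavailable_after",
    "max-snippet",
    "max-image-preview",
    "max-video-preview",
}


def blocks_indexing(header_values, user_agent="googlebot"):
    """Return True if any X-Robots-Tag value asks `user_agent` not to index."""
    for raw in header_values:
        value = raw.strip().lower()
        head, sep, tail = value.partition(":")
        head = head.strip()
        # A leading "botname:" scopes the directives to that crawler only.
        if sep and "," not in head and head not in VALUE_DIRECTIVES:
            if head != user_agent:
                continue
            value = tail
        directives = {part.strip() for part in value.split(",")}
        # "none" is treated as shorthand for "noindex, nofollow".
        if "noindex" in directives or "none" in directives:
            return True
    return False


def fetch_x_robots_tag(url, timeout=10):
    """Send a HEAD request and return all X-Robots-Tag header values as a list."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get_all("X-Robots-Tag") or []
```

For example, `blocks_indexing(fetch_x_robots_tag("https://yourdomain.com/llms.txt"))` should return `True` once the header is in place. Keep in mind that some servers respond to HEAD requests differently than to GET, so a result of `False` is worth double-checking in a browser’s network panel.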

## This Is Not About Hiding But About Hygiene

The idea of noindexing llms.txt is not about obscuring your policies from AI. It’s about keeping your user-facing search experience clean.

You would not want your sitemap or robots.txt ranking on page one, right? The same logic applies here.

In the evolving world of SEO meets AI, this is yet another reminder that technical clarity and user experience go hand-in-hand.

Let’s see how the rest of the industry reacts.

 