**In a new round of multilingual SEO tests, major AI search platforms including ChatGPT, Perplexity, Claude, Gemini, and Copilot showed inconsistent performance when identifying the correct language URLs.**

While Google and Bing continue to surface accurate localized versions, AI models often defaulted to US-English results even when users searched in French, Italian, or Spanish. 

The findings from a [recent GSQI blog](https://www.gsqi.com/marketing-blog/ai-search-hreflang-multilingual-queries/) raise concerns for publishers relying on translated content and hreflang signals for global visibility.

Are AI search systems truly ready for multilingual precision? That question set off a series of tests across languages, tools, and platforms, and the results paint a clear picture of the current state of AI Search.

## Why Does Multilingual Content Still Confuse AI Search Platforms?

This question kept echoing in my mind as I worked through site after site, query after query. Despite handling millions of multilingual pages, AI models still seem to stumble where Google and Bing excel.

Let’s break down what the latest testing reveals.

## What Triggered This Round of Multilingual Testing?

It started with a recurring pattern: clients with globally distributed content kept asking whether ChatGPT, Perplexity, Claude, or Gemini were correctly pulling URLs in their users’ native languages.

The setups in question ranged from multilingual subdirectories to regional domains to same-language, different-country targeting. And with [AI Search](https://www.stanventures.com/news/ai-search-is-changing-seo-faster-than-expected-5874/) becoming a discovery channel, the stakes are higher.

According to Glenn Gabe's report, the tests covered these scenarios:

- Content in different languages
- Content translated but centralized on one domain
- Hreflang-supported pages
- Regional variants of the same pages

The goal? See which URLs each platform returns when the query is made in a language other than English, including French, Italian, and Spanish.

## How Did Google and Bing Perform Compared to AI Search Platforms?

Google and Bing crushed the AI platforms in accuracy. Decades of experience handling multilingual queries clearly paid off. Here’s the breakdown.

### Example 1: Google’s Search Documentation – Did AI Tools Return the Correct Language?

![Google’s Search documentation. A great place to start.](https://www.gsqi.com/images/ai-search-hreflang-sitemap-10-blue-links.jpg)

**Query: Comment creer un sitemap XML** (Language: French)

| Platform | Result Returned | Correct Language? | Notes |
| --- | --- | --- | --- |
| Google – 10 Blue Links | Correct French URL | Yes | Reliable multilingual handling. |
| AI Overview (France) | No AIO generated | N/A | AI Overview did not trigger. |
| AI Overview (US) | English URL | No | Returned US-English despite French query. |
| AI Mode | Correct French URL | Yes | Returned correct localized result with scroll-to-text. |
| Bing | Correct French URL | Yes | Accurate in every test. |
| ChatGPT | French answer, English URL | No | Response language correct, links incorrect. |
| Perplexity | French answer, English URL | No | Same mismatch as ChatGPT. |
| Claude | No links at first; English URL after request | No | Requires prompting, still returns wrong version. |
| Copilot | Correct French URL | Yes | Performs well due to Bing’s multilingual systems. |
| Gemini | No links initially; correct French URL when asked | Partial | Provides correct result only after explicit request. |

**Net-net?** Google, Bing, Copilot, and AI Mode did well. ChatGPT, Perplexity, and Claude consistently failed to return the correct localized URLs.

### Example 2: Google Documentation in Italian — Did Tools Catch the Right Version?

![example from Google’s Search documentation:](https://www.gsqi.com/images/ai-search-hreflang-http-ai-mode.jpg)

**Query: In che modo i codici di stato HTTP sulla Ricerca Google**

| Platform | Result Returned | Correct Language? | Notes |
| --- | --- | --- | --- |
| Google – 10 Blue Links | Italian URL | Yes | Consistently accurate. |
| AI Overview (AIO) | Italian URL | Yes | Correct interpretation of language intent. |
| AI Mode | Italian URL | Yes | Returned correct localized version. |
| Bing | Italian URL | Yes | Accurate across all tests. |
| ChatGPT | English URL | No | Returned wrong language version. |
| Perplexity | Italian URL | Yes | Correct this time, though rare overall. |
| Claude | English URL | No | Returned English version after asking for sources. |
| Copilot | Italian URL | Yes | Performs well due to Bing integration. |
| Gemini | No sources initially; Italian URL when asked | Partial | Needs prompting to show correct links. |

**Summary:** Again, traditional search > AI search.

### Example 3: Cloudflare Blog – How Did AI Handle Translated Pages?

![ Cloudflare blog posts](https://www.gsqi.com/images/ai-search-hreflang-cloudflare-10-blue-links.jpg)

**Query: Interrupción de Cloudflare del 18 de noviembre de 2025** (Language: Spanish)

| Platform | Result Returned | Correct Language? | Notes |
| --- | --- | --- | --- |
| Google – 10 Blue Links | Spanish URL | Yes | Correct localized version ranked. |
| AI Overview (AIO) | No AIO generated | N/A | AI Overview did not appear for this query. |
| AI Mode | Spanish URL and English URL | Partial | Displayed both versions; dual detection. |
| Bing | Spanish URL | Yes | Consistently accurate across languages. |
| ChatGPT | Spanish URL and English URL | Partial | Mixed results; included both versions. |
| Perplexity | English URL | No | Failed to detect Spanish version. |
| Claude | English URL | No | Returned wrong language version. |
| Copilot | Spanish URL | Yes | Performs strongly due to Bing backend. |
| Gemini | No links initially; English URL when asked | No | Inconsistent; correct Spanish URL only in mobile app testing. |

In the mobile app (with location set to Spain), however, Gemini returned the Spanish version. Consistency remains an issue.
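To see the three examples side by side, the outcomes can be tallied per platform. The sketch below is illustrative only: the `RESULTS` dictionary and `tally` helper are names introduced here, the values are transcribed from the French, Italian, and Spanish examples in that order, and AI Overviews are left out because they did not trigger in two of the three queries. Three queries are a small sample, so treat the counts as a summary of these examples rather than a benchmark.

```python
from collections import Counter

# Outcomes transcribed from the three examples (French, Italian, Spanish).
RESULTS = {
    "Google (10 blue links)": ["Yes", "Yes", "Yes"],
    "Bing": ["Yes", "Yes", "Yes"],
    "Copilot": ["Yes", "Yes", "Yes"],
    "AI Mode": ["Yes", "Yes", "Partial"],
    "Gemini": ["Partial", "Partial", "No"],
    "ChatGPT": ["No", "No", "Partial"],
    "Perplexity": ["No", "Yes", "No"],
    "Claude": ["No", "No", "No"],
}


def tally(results):
    """Count Yes / Partial / No outcomes for each platform."""
    return {platform: Counter(outcomes) for platform, outcomes in results.items()}


for platform, counts in tally(RESULTS).items():
    print(f"{platform}: {counts['Yes']} yes, {counts['Partial']} partial, {counts['No']} no")
```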

## So What Do These Results Really Mean?

After testing financial sites, media portals, press releases, and global blogs, the pattern became clear: AI Search platforms do not reliably understand multilingual intent. [Hreflang support](https://www.stanventures.com/news/googles-take-on-hreflang-for-international-sites-449/) appears weak or absent across ChatGPT, Perplexity, and Claude.

Most important? Google and Bing remain unmatched in multilingual accuracy.

And then there are Copilot and Gemini: Copilot rides on Bing’s strengths and is consistent.

Gemini mirrors Google, though its failure to automatically return sources is a major usability drawback.

If AI Search becomes a primary discovery medium, multilingual sites risk misrepresentation and, worse, traffic loss when AI returns the wrong version of the content.

## Why Are AI Search Platforms Struggling With Multilingual Queries?

This question kept surfacing during the analysis. A few possibilities:

### 1. Limited or no use of hreflang signals

Hreflang is the multilingual backbone for Google/Bing. AI search engines seem blind to it.
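For readers less familiar with the signal: hreflang annotations are `<link rel="alternate" hreflang="...">` tags (or equivalent sitemap or HTTP header entries) in which each language version of a page lists all of its siblings. A minimal sketch of generating such tags follows. The `hreflang_tags` helper and the `example.com` URLs are assumptions made for illustration, not part of the report.

```python
from html import escape


def hreflang_tags(alternates, x_default=None):
    """Build <link rel="alternate"> tags for every language version of a page.

    `alternates` maps an hreflang code (e.g. "fr", "es-es") to its URL.
    Each language version should carry the same full set, itself included.
    """
    tags = [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in sorted(alternates.items())
    ]
    if x_default:
        # Optional fallback for users whose language matches none of the versions.
        tags.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(x_default)}" />'
        )
    return tags


# Hypothetical URLs, for illustration only.
pages = {
    "en": "https://example.com/en/sitemaps/",
    "fr": "https://example.com/fr/sitemaps/",
    "it": "https://example.com/it/sitemaps/",
}
for tag in hreflang_tags(pages, x_default=pages["en"]):
    print(tag)
```

A search engine that reads these tags can swap in the French URL for a French query even when the English page is the stronger match, which is the behavior the AI platforms in these tests often failed to reproduce.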

### 2. Heavy dependence on US-English training data

Models default to English URLs even when responding in another language.

### 3. Weak geographical cues

Even after setting preferred languages, models still fall back to English.

### 4. Lack of structured multilingual indexing

AI search is still evolving, and indexing mechanisms differ from search engines.

The result? A fragmented experience for global users.

## What Should Site Owners Do Now?

What actions truly matter if your content is multilingual? According to Glenn Gabe's report, these are the key steps.

### 1. Audit Your Hreflang Setup Thoroughly

Even if AI tools are struggling, Google and Bing still get it right, and they remain the largest traffic providers.
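One part of such an audit can be scripted. The sketch below is a starting point rather than a complete audit: it uses Python's standard `html.parser` to read hreflang `<link>` tags and flags pages that lack a self-reference or whose alternates do not link back. The names `HreflangParser`, `extract_hreflang`, and `audit_hreflang` and the `example.com` URLs are hypothetical, fetching the pages is left out, and hreflang declared in sitemaps or HTTP headers is not covered.

```python
from html.parser import HTMLParser


class HreflangParser(HTMLParser):
    """Collect hreflang -> href pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.alternates = {}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        if tag == "link" and "alternate" in rel and a.get("hreflang") and a.get("href"):
            self.alternates[a["hreflang"].lower()] = a["href"]


def extract_hreflang(html):
    parser = HreflangParser()
    parser.feed(html)
    return parser.alternates


def audit_hreflang(pages):
    """`pages` maps each URL to its HTML. Returns a list of readable issues."""
    issues = []
    annotations = {url: extract_hreflang(html) for url, html in pages.items()}
    for url, alts in annotations.items():
        if not alts:
            issues.append(f"{url}: no hreflang annotations found")
            continue
        if url not in alts.values():
            issues.append(f"{url}: missing self-referencing hreflang")
        for code, target in alts.items():
            if target == url or target not in annotations:
                continue  # self-reference, or a page outside this sample
            if url not in annotations[target].values():
                issues.append(f"{url}: {target} ({code}) does not link back")
    return issues


# Hypothetical two-page site where the French page forgets the English one.
en, fr = "https://example.com/en/", "https://example.com/fr/"
sample = {
    en: f'<link rel="alternate" hreflang="en" href="{en}">'
        f'<link rel="alternate" hreflang="fr" href="{fr}">',
    fr: f'<link rel="alternate" hreflang="fr" href="{fr}">',
}
for issue in audit_hreflang(sample):
    print(issue)
```

URLs are compared as exact strings here, so trailing slashes, protocol differences, and redirects would need normalizing before running this against a real site.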

### 2. Test Your Multilingual Visibility Across AI Platforms

Search your multilingual queries across ChatGPT, Perplexity, Gemini, Claude, and Copilot. What users see in AI tools increasingly shapes discovery.
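If you log the URLs each platform cites, grading them can be semi-automated. The sketch below guesses a URL's language from an `hl` query parameter or a leading path segment such as `/es-es/`, then labels a set of citations as Yes, Partial (mixed languages), or No. Both URL conventions are assumptions about how a site encodes language, the `url_language` and `grade_citations` helpers are hypothetical, and the heuristic will misread sites whose first path segment happens to be a two-letter word.

```python
import re
from urllib.parse import parse_qs, urlparse

# Matches path segments such as "fr", "es-es", or "pt_BR".
LANG_SEGMENT = re.compile(r"^[a-z]{2}(?:[-_][a-z]{2})?$", re.IGNORECASE)


def url_language(url, default="en"):
    """Guess a URL's language from `?hl=` or its first path segment.

    Heuristic only: URLs with no language marker are assumed to be `default`.
    """
    parsed = urlparse(url)
    hl = parse_qs(parsed.query).get("hl")
    if hl:
        return hl[0][:2].lower()
    segments = [s for s in parsed.path.split("/") if s]
    if segments and LANG_SEGMENT.match(segments[0]):
        return segments[0][:2].lower()
    return default


def grade_citations(urls, expected):
    """Label a platform's cited URLs against the expected language code."""
    langs = {url_language(u) for u in urls}
    if not langs:
        return "No links"
    if langs == {expected}:
        return "Yes"
    if expected in langs:
        return "Partial"
    return "No"


# Hypothetical citations from one AI answer to a Spanish query.
cited = [
    "https://example.com/es-es/outage-report/",
    "https://example.com/outage-report/",
]
print(grade_citations(cited, expected="es"))  # prints: Partial
```

Running the same queries on a schedule and storing the grades makes it easier to notice when a platform starts, or stops, returning the localized version.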

### 3. Strengthen Your Core Search Visibility

As the data shows, traditional search still dominates and performs consistently better.

### 4. Keep an Eye on AI Search Evolution

These platforms will improve, but right now this inconsistency is creating risk.

## Is AI Search Ready for Multilingual Accuracy?

According to the report, after dozens of tests, hours of comparisons, and repeated cross-checking, the answer is clear: no, not yet.

Google and Bing remain unmatched in multilingual detection, URL selection, and hreflang interpretation.

But ChatGPT, Perplexity, Claude, and even Gemini still struggle to identify the correct regional pages, often defaulting to English despite user intent.

For brands, publishers, and international businesses, this gap matters. It affects visibility. It affects reach. And ultimately, it affects trust.

The solution? Gabe's advice is to stay vigilant, test across platforms, perfect your hreflang, and monitor AI Search developments closely.

AI search will evolve, but understanding its limitations today is the key to staying ahead tomorrow.