**Google Research has announced Titans and MIRAS, a new architecture-and-framework duo designed to help AI models learn, update, and retain information while they are running, enabling them to handle massive context windows far beyond what today’s transformers can process.**

In a year when AI models continue to hit limits around memory, speed, and context length, Google’s latest research signals a major shift in how machine learning systems can adapt in real time.

This news lands at a time when transformer bottlenecks are becoming painfully obvious. 

The world is generating longer documents, more complex sequences, larger genomes, and richer multimodal data streams. 

And today’s architectures struggle to keep up. [Titans and MIRAS](https://research.google/blog/titans-miras-helping-ai-have-long-term-memory/) aim to fix that, not with incremental tweaks, but with a fundamental rethinking of how AI memory should work.

![Google Research Announced Titans + MIRAS](https://www.stanventures.com/news/wp-content/uploads/2025/12/Google-Research-Announced-Titans-MIRAS.png)

## Why Were Titans and MIRAS Needed in the First Place?

The success of the Transformer architecture came from its attention mechanism. It is a way for models to “look back” at previous inputs. Yet attention becomes extremely expensive as sequences grow longer. 

The computational cost scales quadratically with sequence length. At hundreds of thousands or millions of tokens, traditional Transformers simply break down.
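To make the quadratic growth concrete, here is a rough back-of-the-envelope sketch. It only counts query-key scores for a single attention head and compares them with one state update per token for a recurrent model; the function names are illustrative, and real costs also depend on heads, layers, and hidden size.

```python
def attention_score_count(seq_len: int) -> int:
    """Query-key scores that full self-attention computes for one head."""
    return seq_len * seq_len


def recurrent_step_count(seq_len: int) -> int:
    """State updates a linear-time recurrent model performs (one per token)."""
    return seq_len


for n in (1_000, 100_000, 2_000_000):
    scores = attention_score_count(n)
    ratio = scores // recurrent_step_count(n)
    print(f"{n:>9} tokens: {scores:.2e} attention scores, {ratio}x the recurrent steps")
```

Doubling the sequence length quadruples the number of attention scores, while the recurrent step count only doubles.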

Researchers have tried alternative solutions: efficient RNNs, linear recurrent architectures, and state space models (SSMs) like Mamba-2.

These models scale linearly and allow fast processing, but at the cost of compressing all context into a fixed-size memory. That compression means crucial details often get lost.

This is where Titans and MIRAS step in. Instead of compressing everything into a static memory vector, they allow a model to update its memory on the fly, preserving the richness of long sequences without slowing down.

## What Exactly Are Titans and MIRAS?

Google describes the two systems as complementary but distinct:

- **Titans**: The architecture itself, a fast, expressive, deep-memory model that merges RNN-like efficiency with Transformer-like precision.

![About Titans](https://www.stanventures.com/news/wp-content/uploads/2025/12/About-Titans.png)

- **MIRAS**: The theoretical foundation that generalizes how memory, attention, forgetting and online optimization should work in any sequence model.


Together, they bring a new ability to the forefront of [AI research](https://www.stanventures.com/news/study-finds-users-still-visit-websites-after-ai-search-6076/): test-time memorization, meaning an AI system can rewrite its own long-term memory while actively processing data, without requiring retraining.

This shifts AI away from the static paradigm, where the model only “knows what it was trained on,” toward a dynamic paradigm where the model learns continuously as it reads.

## How Titans Reinvents Long-Term Memory in AI

Unlike traditional RNNs or SSMs, which rely on small vectors or matrices to store memory, Titans introduces a deep neural network (a multi-layer perceptron) as its long-term memory module. 

This allows it to express relationships, concepts, and structures far more richly than previous approaches.
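The difference between a matrix-valued memory and a deep memory can be sketched in a few lines. This is a toy illustration, not the Titans implementation: the weights, the two-layer depth, and the `tanh` nonlinearity are all assumptions chosen for readability.

```python
import math


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def linear_memory(weights, key):
    """Matrix-valued memory: retrieval is a single linear map."""
    return matvec(weights, key)


def mlp_memory(layers, key):
    """Deep memory: stacked linear maps with a nonlinearity in between."""
    hidden = key
    for index, weights in enumerate(layers):
        hidden = matvec(weights, hidden)
        if index < len(layers) - 1:  # no activation after the last layer
            hidden = [math.tanh(h) for h in hidden]
    return hidden


# Hypothetical weights, fixed by hand so the example is deterministic.
layers = [[[0.5, -0.2], [0.1, 0.8]],
          [[1.0, 0.3], [-0.4, 0.6]]]
key = [1.0, 2.0]

print(linear_memory(layers[0], key))
print(mlp_memory(layers, key))
```

The linear memory can only scale and mix its input, so doubling the key doubles the output. The MLP memory does not have that restriction, which is one intuition for why a deeper memory can represent richer relationships.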

![The Power Of Deep Memory](https://www.stanventures.com/news/wp-content/uploads/2025/12/The-power-of-deep-memory.png)

The architecture does not just store information; it interprets it. As new tokens stream in, Titans evaluates whether each piece of data is relevant, surprising, or redundant.

### The “Surprise Metric”: What Should Be Remembered?

Borrowing inspiration from human psychology, Titans uses an internal “surprise metric,” which compares what the memory expects with what the new input provides. 

A large difference signals something important.

- **Low surprise** means the input matches expectations and can be ignored.
- **High surprise** means the input breaks expectations and must be stored.

In everyday terms, if you are reading financial data and suddenly encounter a “banana peel,” that anomaly will stick in your memory. Titans behaves similarly.
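One way to picture the surprise signal is as the size of the gradient of a recall loss with respect to the memory. The sketch below assumes a plain matrix memory and a squared-error loss purely for illustration; the names `predict` and `surprise` are hypothetical, and the actual Titans memory is a deeper network.

```python
import math


def predict(memory, key):
    """What the memory currently recalls for this key."""
    return [sum(m * k for m, k in zip(row, key)) for row in memory]


def surprise(memory, key, value):
    """Frobenius norm of the gradient of ||memory @ key - value||^2 w.r.t. memory."""
    error = [p - v for p, v in zip(predict(memory, key), value)]
    grad = [[2.0 * e * k for k in key] for e in error]
    return math.sqrt(sum(g * g for row in grad for g in row))


memory = [[1.0, 0.0], [0.0, 1.0]]  # currently maps every key to itself
expected = surprise(memory, [1.0, 0.0], [1.0, 0.0])    # input matches expectation
unexpected = surprise(memory, [1.0, 0.0], [0.0, 5.0])  # the "banana peel"
print(expected, unexpected)
```

When the input matches what the memory already predicts, the gradient vanishes and nothing needs to be written. The mismatched input produces a large gradient, which is the cue to update the memory.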

### Momentum and Forgetting

Titans adds two refinements:

- **Momentum**, which ensures that information surrounding surprising tokens is also captured. This is useful when meaning spans several words or sentences.
- **Adaptive forgetting (weight decay)**, which allows Titans to erase outdated or irrelevant information to make room for new memories during extremely long tasks.

This combination produces a model that learns continuously without drowning in its own accumulated history.
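A minimal sketch of how momentum and forgetting might combine in a single update step follows. It assumes a matrix memory, a squared-error loss, and fixed values for the learning rate `lr`, momentum coefficient `eta`, and forgetting rate `alpha`; in the real model these quantities may be learned and data-dependent, so treat the constants as placeholders.

```python
def grad_loss(memory, key, value):
    """Gradient of ||memory @ key - value||^2 with respect to memory."""
    pred = [sum(m * k for m, k in zip(row, key)) for row in memory]
    return [[2.0 * (p - v) * k for k in key] for p, v in zip(pred, value)]


def update(memory, momentum, key, value, lr=0.1, eta=0.9, alpha=0.05):
    """One write: momentum carries past surprise, alpha decays old memory."""
    grad = grad_loss(memory, key, value)
    momentum = [[eta * s - lr * g for s, g in zip(s_row, g_row)]
                for s_row, g_row in zip(momentum, grad)]
    memory = [[(1.0 - alpha) * m + s for m, s in zip(m_row, s_row)]
              for m_row, s_row in zip(memory, momentum)]
    return memory, momentum


memory = [[0.0, 0.0], [0.0, 0.0]]
momentum = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(200):
    memory, momentum = update(memory, momentum, key=[1.0, 0.0], value=[2.0, -1.0])
print(memory)
```

With these placeholder constants, the stored values settle slightly below the target because the decay term keeps pulling the memory toward zero. That is the trade-off forgetting introduces: a little recall accuracy in exchange for capacity to absorb new information.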

## How MIRAS Rewrites the Theory of Sequence Modeling

MIRAS provides a unified lens through which different architectures, such as Transformers, RNNs, and SSMs, can be viewed as variations of the same concept: **associative memory systems**.

It breaks down every sequence model into four components:

- **Memory architecture**: how information is stored
- **Attentional bias**: what the model prioritizes
- **Retention gate**: how forgetting or regularization works
- **Memory algorithm**: how updates occur over time

This allows researchers to explore new memory rules, attention rules, and retention mechanisms beyond the traditional mean squared error (MSE) paradigm.
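The four-part breakdown can be pictured as a recipe with swappable parts. The sketch below is a deliberately small toy, not the MIRAS formulation: the `MemoryRecipe` class, the `write` function, and the choice of a plain vector as the memory architecture are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MemoryRecipe:
    # Attentional bias: gradient of the per-slot objective w.r.t. the error.
    attentional_bias: Callable[[float], float]
    # Retention gate: how much of an old memory entry survives a step.
    retention_gate: Callable[[float], float]
    # Memory algorithm: here, a single gradient step of this size per input.
    step_size: float


def write(memory: List[float], target: List[float], recipe: MemoryRecipe) -> List[float]:
    """Memory architecture in this toy: a vector storing one value per slot."""
    return [recipe.retention_gate(m) - recipe.step_size * recipe.attentional_bias(m - t)
            for m, t in zip(memory, target)]


mse_recipe = MemoryRecipe(
    attentional_bias=lambda e: 2.0 * e,            # gradient of e**2
    retention_gate=lambda m: 0.95 * m,             # mild weight decay
    step_size=0.25,
)
clipped_recipe = MemoryRecipe(
    attentional_bias=lambda e: max(-1.0, min(1.0, e)),  # Huber-style clipped gradient
    retention_gate=lambda m: 0.95 * m,
    step_size=0.25,
)

print(write([0.0, 0.0], [1.0, -2.0], mse_recipe))
print(write([0.0, 0.0], [1.0, -2.0], clipped_recipe))
```

Swapping only the attentional bias changes how strongly the same input is written, while the rest of the recipe stays untouched. That is the kind of design space the framework is meant to open up.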

### Moving Beyond MSE

Most sequence models today use MSE or dot-product similarity as the basis for attention and memory updates. MIRAS argues that this limits expressiveness and creates sensitivity to outliers.

Applying this idea, Google introduced three MIRAS-based models, YAAD, MONETA, and MEMORA, each using non-Euclidean objectives to create more robust or more stable memory updates.

- **YAAD** uses Huber loss to reduce outlier sensitivity.
- **MONETA** uses generalized norms for strict, stable memory handling.
- **MEMORA** enforces probabilistic consistency for clean, controlled updates.

These variants suggest that stepping outside the standard paradigm can provide significant accuracy and robustness improvements.
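The outlier argument is easiest to see by comparing gradients. The sketch below uses the textbook Huber loss with an assumed threshold of `delta = 1.0`; it illustrates the general property only and says nothing about how YAAD actually configures its objective.

```python
def mse_grad(error: float) -> float:
    """Gradient of 0.5 * error**2: grows without bound as the error grows."""
    return error


def huber_grad(error: float, delta: float = 1.0) -> float:
    """Gradient of the Huber loss: linear near zero, clipped at +/-delta."""
    if abs(error) <= delta:
        return error
    return delta if error > 0 else -delta


for error in (0.5, 2.0, 50.0):
    print(error, mse_grad(error), huber_grad(error))
```

For small errors the two gradients agree. For a single extreme error, the squared-error gradient scales with the outlier, while the Huber gradient stays capped, so one bad input cannot dominate the memory update.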

## What Did Experiments Reveal About Titans’ Performance?

Google tested Titans extensively across multiple categories. The results consistently showed an advantage over leading models such as Transformer++, Mamba-2, and Gated DeltaNet.

### Language Modeling and Reasoning

Across C4, WikiText, HellaSwag, and PIQA, Titans achieved:

- Lower perplexity
- Higher accuracy
- Faster inference

The MIRAS-based models also outperformed comparable baselines, suggesting that better memory rules lead to better results.
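For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood a model assigns to the correct tokens, so lower is better. A minimal sketch with made-up probabilities:

```python
import math


def perplexity(token_probs):
    """exp of the mean negative log-probability assigned to the correct tokens."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))


# Hypothetical probabilities a model assigned to each correct next token.
uncertain = perplexity([0.25, 0.25, 0.25, 0.25])
confident = perplexity([0.9, 0.8, 0.95, 0.85])
print(uncertain, confident)
```

A model that assigns probability 0.25 to every correct token has a perplexity of about 4, as if it were choosing uniformly among four options at each step.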

### Genomic and Time-Series Applications

Because these tasks often involve extremely long sequences measured in millions of elements, traditional architectures struggle. Titans, however, generalized effectively, demonstrating its potential far beyond text.

### Long-Context Performance

Perhaps the most remarkable results came from **BABILong**, a benchmark requiring models to reason over facts spread across extremely long documents. Titans:

- Outperformed all baselines
- Surpassed even massive models like GPT-4
- Scaled reliably to context windows beyond 2 million tokens

This showcases the true strength of Titans: reliable memory over extraordinary lengths.

## Why Is Deep Memory So Important?

Ablation studies showed that the depth of the memory network had a bigger impact than its size. Deeper memory modules consistently delivered lower perplexity and maintained performance even as sequences grew dramatically longer.

This suggests that future models may not need massive attention heads or huge parameter counts; they need deeper, smarter memory.

## Key Takeaways

- Titans enables real-time long-term memory updates without retraining.
- MIRAS provides a unified framework that treats sequence models as associative memories.
- Deep neural memory modules outperform traditional fixed-size RNN states.
- Surprise-based learning lets the model store only novel or important information.
- Titans achieves state-of-the-art long-context performance, surpassing GPT-4 in some tests.
- Models scale to 2M+ tokens while maintaining accuracy and efficiency.
