
    Unlimiformer: A Breakthrough in Long-Range Transformers with Unlimited Length Input

    New approach offloads attention computation to a single k-nearest-neighbor index, enabling extremely long input sequences

    • Unlimiformer can wrap any existing pretrained encoder-decoder transformer, allowing it to handle input of unlimited length.
    • The approach is demonstrated to be effective on several long-document and multi-document summarization benchmarks, without any input truncation at test time.
    • Unlimiformer improves pretrained models like BART and Longformer without additional learned weights or code modification.

    Researchers have proposed a groundbreaking approach called Unlimiformer, which enables transformer-based models to handle unlimited-length input. Traditional transformer-based models have a predefined bound on their input length because they must attend to every token in the input. Unlimiformer overcomes this limitation by offloading the attention computation across all layers to a single k-nearest-neighbor index, which can be kept in either GPU or CPU memory and queried in sub-linear time.
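
    To make the idea concrete, here is a minimal sketch of that core mechanism, not the authors' implementation: encoder hidden states for a long input are stored in a nearest-neighbor index (faiss is used here purely for illustration), and a decoder-side query retrieves only its closest input tokens rather than attending to all of them. The hidden size, input length, and k are assumed values.

    import numpy as np
    import faiss

    d = 1024                      # assumed encoder hidden size (e.g. BART-large)
    n_tokens = 100_000            # an input far beyond the usual ~1k-token limit

    # Stand-in for the encoder's hidden states, one vector per input token.
    encoder_states = np.random.randn(n_tokens, d).astype("float32")

    # Build a flat inner-product index; it can live in CPU RAM or on the GPU.
    index = faiss.IndexFlatIP(d)
    index.add(encoder_states)

    # At decoding time, a single attention query retrieves only its top-k
    # most similar input tokens instead of all n_tokens of them.
    query = np.random.randn(1, d).astype("float32")
    scores, token_ids = index.search(query, 16)
    print(token_ids[0])           # indices of the 16 retrieved input positions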

    The approach allows extremely long input sequences to be indexed while enabling every attention head in every decoder layer to retrieve its top-k keys instead of attending to every key. Unlimiformer has been demonstrated to be effective on several long-document and multi-document summarization benchmarks, even summarizing inputs of up to 350k tokens from the BookSum dataset without any input truncation at test time.
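
    The following toy PyTorch snippet illustrates what "attending only to the top-k retrieved keys" means for a single head at a single decoding step; shapes and the brute-force retrieval step are assumptions standing in for the actual kNN lookup, not the paper's code.

    import torch
    import torch.nn.functional as F

    d_head, n_keys, k = 64, 200_000, 16

    keys = torch.randn(n_keys, d_head)     # indexed encoder keys for one head
    values = torch.randn(n_keys, d_head)   # the matching values
    query = torch.randn(d_head)            # one decoder query at one step

    # Retrieval step: in Unlimiformer this is a kNN index lookup; here a
    # brute-force dot product stands in for it.
    scores = keys @ query                  # (n_keys,)
    top_scores, top_idx = scores.topk(k)   # keep only the k best-matching keys

    # Attention is then computed over just those k keys and values.
    attn = F.softmax(top_scores / d_head ** 0.5, dim=-1)
    context = attn @ values[top_idx]       # (d_head,) attention output for this step
    print(context.shape)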

    Unlimiformer can improve pretrained models, such as BART and Longformer, by extending them to unlimited inputs without additional learned weights and without modifying their code. The researchers have made their code and models publicly available to encourage further development and improvements.
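
    As a hedged usage sketch, the snippet below loads a standard pretrained BART with Hugging Face Transformers (a real API) and marks where the released Unlimiformer code would wrap it; `convert_to_unlimiformer` is a hypothetical placeholder name, so consult the authors' repository for the actual entry point.

    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

    # Hypothetical wrapper: no new weights are learned and the model's own code
    # is left untouched; only its cross-attention lookups are redirected to a
    # kNN index over the full, untruncated input.
    # model = convert_to_unlimiformer(model, datastore_device="cpu")

    long_document = "..."  # an input far longer than BART's usual 1,024 tokens
    inputs = tokenizer(long_document, return_tensors="pt", truncation=False)
    summary_ids = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))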

    Despite its remarkable capabilities, Unlimiformer does have some limitations. The approach has only been tested on English-language datasets, and its effectiveness in other languages may depend on the quality of the indexed embeddings and availability of strong pretrained models. Interpretability is also a concern, as the retrieved embeddings at each step are difficult to interpret, warranting further research.

    Unlimiformer’s input length is, in practice, bounded only by the memory of the machine used. While the researchers were able to keep a GPU datastore for inputs exceeding 500k tokens, memory could become a constraint on smaller GPUs or with even longer inputs. Future work may explore methods to further improve the performance of retrieval-augmented LLMs on challenging downstream tasks.

    Paper

    GitHub
