Continuous Autoregressive Language Models

Best AI papers explained - A podcast by Enoch H. Kang

This paper introduces **Continuous Autoregressive Language Models (CALM)**, a new paradigm designed to overcome the efficiency limitations of conventional, token-by-token generation in Large Language Models (LLMs). CALM achieves significant computational savings by employing a robust **autoencoder** to compress a chunk of $K$ discrete tokens into a single, high-fidelity continuous vector, thereby reducing the number of sequential generation steps by a factor of $K$. This shift necessitates a comprehensive **likelihood-free framework**, including an **energy loss** for generative modeling and a new evaluation metric, **BrierLM**, which offers a reliable alternative to perplexity for implicit models. The paper also details a provably exact but computationally expensive **likelihood-free temperature sampling algorithm**, along with a highly efficient batch approximation that exhibits the same trade-off between accuracy and diversity as traditional LLMs. The empirical results confirm that increasing the **semantic bandwidth** $K$ provides a powerful new axis for achieving a superior performance-compute balance in language modeling.
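To make the chunk-compression idea concrete, here is a minimal, hypothetical sketch of the two components the summary describes: an autoencoder that maps a chunk of $K$ tokens to a single continuous vector, and an autoregressive model that predicts the next chunk vector from the previous ones. All module names, dimensions, and the GRU backbone are illustrative assumptions, not the paper's actual architecture or training objective (which uses an energy loss rather than the cross-entropy reconstruction implied here).

```python
# Illustrative sketch of the CALM idea: compress K discrete tokens into one
# continuous vector, then autoregress over chunk vectors instead of tokens.
# Hypothetical toy code; not the paper's implementation.
import torch
import torch.nn as nn

K = 4           # chunk size ("semantic bandwidth")
VOCAB = 1000    # toy vocabulary size
D = 64          # dimensionality of the continuous chunk vector

class ChunkAutoencoder(nn.Module):
    """Encodes K tokens into one vector and reconstructs K token logits."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D)
        self.encode = nn.Linear(K * D, D)        # K token embeddings -> 1 vector
        self.decode = nn.Linear(D, K * VOCAB)    # 1 vector -> K sets of logits

    def forward(self, tokens):                   # tokens: (num_chunks, K)
        z = self.encode(self.embed(tokens).flatten(1))       # (num_chunks, D)
        logits = self.decode(z).view(-1, K, VOCAB)            # (num_chunks, K, VOCAB)
        return z, logits

class ContinuousARModel(nn.Module):
    """Predicts the next chunk vector from the sequence of previous vectors."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(D, D, batch_first=True)
        self.head = nn.Linear(D, D)

    def forward(self, z_seq):                    # z_seq: (batch, T, D)
        h, _ = self.rnn(z_seq)
        return self.head(h)                      # predicted next-chunk vectors

# Toy usage: 8 chunks of K tokens become 8 sequential generation steps
# instead of 8 * K token-level steps.
tokens = torch.randint(0, VOCAB, (1, 8 * K))
ae, ar = ChunkAutoencoder(), ContinuousARModel()
z, _ = ae(tokens.view(-1, K))                    # (8, D): one vector per chunk
pred = ar(z.unsqueeze(0))                        # autoregression over 8 vectors
print(pred.shape)                                # torch.Size([1, 8, D])
```

In this toy setup the sequential loop runs over chunk vectors, so generation length shrinks by a factor of $K$; the paper's actual model replaces the reconstruction logits and any explicit likelihood with the energy-based, likelihood-free training and the BrierLM evaluation described above.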
