Large language models (LLMs) are celebrated for their multilingual capabilities, but how do they actually process non-English languages? A recent study called “Do Multilingual LLMs Think In English?” by Lisa Schut, Yarin Gal, and Sebastian Farquhar from the University of Oxford and Google DeepMind suggests that LLMs may be more English-centric than previously thought. Their findings reveal that, regardless of the input or output language, these models tend to reason in an internal representation space closest to English before translating their thoughts into the target language.
An English-centric thought process

LLMs are trained on vast amounts of multilingual data, yet the dominant language in their training corpus often dictates how they structure information internally. The study analyzed multiple open-source models, including Llama-3.1-70B, Mixtral-8x22B, Gemma-2-27B, and Aya-23-35B, to investigate whether these systems process meaning in a language-agnostic way or default to an English-centric representation space.
Using a technique called the logit lens, researchers decoded the latent representations of these models and discovered a striking pattern: when generating text in non-English languages, LLMs first map semantically significant words (such as nouns and verbs) to their English equivalents before converting them into the target language. This phenomenon was observed across multiple languages, including French, German, Dutch, and Mandarin.
For example, when the model was given the French sentence “Le bateau naviguait en douceur sur l’eau” (“The boat sailed smoothly on the water”), the internal representations showed that words like water and boat were first mapped to their English meanings before being translated back into French. However, grammatical elements such as prepositions and determiners remained in the original language, suggesting that only semantically loaded words undergo this English-centric processing.
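The logit-lens idea itself is simple: project an intermediate hidden state through the model's unembedding matrix (the LM head) and see which vocabulary token it already resembles, before the final layers have finished. The sketch below is a minimal toy illustration with a made-up 4-dimensional hidden space and 3-word vocabulary, not the study's actual setup; in a real model the unembedding matrix and hidden states would come from a transformer checkpoint.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def logit_lens(hidden_state, unembedding, vocab):
    """Project an intermediate hidden state through the unembedding
    matrix to see which token it is already closest to."""
    logits = hidden_state @ unembedding.T   # one logit per vocab entry
    probs = softmax(logits)
    return vocab[int(np.argmax(probs))]

# Toy vocabulary: the English word, its French equivalent, and a distractor.
vocab = ["water", "eau", "boat"]
unembedding = np.array([
    [1.0, 0.0, 0.0, 0.0],   # "water"
    [0.0, 1.0, 0.0, 0.0],   # "eau"
    [0.0, 0.0, 1.0, 0.0],   # "boat"
])

# A hypothetical mid-layer state while generating French text: it decodes
# to the English "water" even though the output language is French.
mid_layer_state = np.array([0.9, 0.3, 0.1, 0.0])
print(logit_lens(mid_layer_state, unembedding, vocab))  # -> water
```

In the study, this decoding is applied layer by layer: intermediate layers tend to decode to English tokens for content words, with the target-language token only winning near the final layers.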
The steering vector experiment

Another key experiment in the study involved activation steering, a technique used to manipulate LLM responses by nudging them toward specific concepts. The researchers found that steering vectors—mathematical representations that guide the model's decision-making—were significantly more effective when computed in English than in the input or output language. This further supports the idea that the model's core reasoning occurs in an English-aligned space.
For instance, when an LLM was prompted to write a sentence about animals in German, the model responded more consistently when the steering vector was derived from the English word animal rather than its German counterpart Tier. This suggests that even when models produce fluent non-English text, their underlying logic remains tied to English representations.
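A common way to build such a steering vector is the difference-of-means construction: average the model's activations on prompts containing the concept, subtract the average on matched neutral prompts, then add the resulting direction to the hidden state during generation. The toy numpy sketch below uses hypothetical 4-dimensional activations to show the mechanics; the study's actual vectors are computed from real model activations.

```python
import numpy as np

def steering_vector(concept_acts, baseline_acts):
    """Difference-of-means steering vector: mean activation on prompts
    containing the concept minus the mean on neutral prompts."""
    return concept_acts.mean(axis=0) - baseline_acts.mean(axis=0)

def apply_steering(hidden_state, vector, alpha=1.0):
    """Nudge a hidden state along the concept direction; alpha scales
    how strongly generation is pushed toward the concept."""
    return hidden_state + alpha * vector

# Hypothetical activations. In the paper's setting, concept_acts would
# come from English prompts mentioning "animal" (rather than the German
# "Tier"), and baseline_acts from matched prompts without the concept.
concept_acts = np.array([[1.0, 0.2, 0.0, 0.0],
                         [0.8, 0.0, 0.2, 0.0]])
baseline_acts = np.array([[0.0, 0.2, 0.0, 0.1],
                          [0.0, 0.0, 0.2, 0.1]])

v = steering_vector(concept_acts, baseline_acts)
h = np.zeros(4)
print(apply_steering(h, v, alpha=2.0))  # state pushed toward the concept
```

The study's finding is that the vector works better across output languages when `concept_acts` come from English prompts—consistent with the concept living in an English-aligned region of the representation space.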
The English-centric nature of LLMs has both advantages and drawbacks. On one hand, it allows these models to perform well across many languages despite being trained predominantly on English data. On the other hand, it introduces biases and limitations, since reasoning routed through English may not capture concepts or framings specific to other languages.
Interestingly, not all models exhibited the same degree of English-centric processing. Aya-23-35B, a model trained on 23 languages, showed the least English routing, whereas Gemma-2-27B, trained primarily on English, showed the most. This suggests that broader multilingual training coverage reduces a model's reliance on English representations.
Additionally, smaller models exhibited a greater tendency to default to English, likely because they have less capacity to represent multiple languages. Larger models, with more parameters and training data, appear to have a somewhat better grasp of multilingual semantics, though an English bias persists.
Can LLMs truly think multilingually?

The study's findings challenge the assumption that LLMs operate in a truly language-agnostic way. Instead, they suggest that multilingual AI is still fundamentally shaped by the dominant language in its training corpus—a result that raises important questions for AI developers and researchers.
Addressing the English-centric bias in LLMs will be crucial for developing truly multilingual, culturally aware systems. The results point toward possible improvements, such as training on a more balanced mix of languages—an approach whose promise is suggested by Aya-23's reduced English routing.
For now, one thing is clear: while multilingual AI has made impressive strides, the way it “thinks” is still deeply tied to English. Understanding this bias is the first step toward creating fairer, more effective AI systems for global users.