Accepted to LoResLM @ COLING 2025
Authors: Sundesh Donthi, Maximilian Spencer, Om Patel, Joon Doh, Eid Rodan
Translating idiomatic expressions remains a challenge for large language models (LLMs), which often produce literal, semantically incorrect translations. We propose two methods: Semantic Idiom Alignment (SIA), which uses a SentenceTransformers model to compute cosine similarity scores between the meanings of source-language and target-language idioms, and LLM-based Idiom Alignment (LIA), which uses an LLM to find a corresponding idiom in the target language. Human evaluations on English-Chinese, Chinese-English, English-Urdu, and Hindi-English show that SIA outperformed the other methods in all GPT-4o translations.
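The similarity-based matching behind SIA can be illustrated with a minimal sketch. This is not the authors' implementation: the `embed` function below is a toy bag-of-words stand-in for a SentenceTransformers encoder, and the function names, candidate idioms, and English glosses are assumptions chosen purely for illustration. Only the cosine-similarity ranking step reflects what the abstract describes.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder (assumption): bag-of-words counts.

    A real pipeline would use dense embeddings from a SentenceTransformers model.
    """
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def align_idiom(source_meaning, candidates):
    """Return the (idiom, score) pair whose meaning is closest to source_meaning.

    `candidates` maps each target-language idiom to a gloss of its meaning.
    """
    query = embed(source_meaning)
    scored = [
        (idiom, cosine_similarity(query, embed(meaning)))
        for idiom, meaning in candidates.items()
    ]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical candidate idioms with illustrative English glosses.
candidates = {
    "画蛇添足": "to ruin something by adding unnecessary extras",
    "一石二鸟": "to achieve two goals with one action",
}

best_idiom, best_score = align_idiom(
    "to achieve two aims with a single action", candidates
)
print(best_idiom, round(best_score, 3))
```

With a real sentence encoder, paraphrased meanings that share few words would still score highly, which is the reason for using semantic embeddings rather than the word-overlap proxy shown here.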

