
Comparative Study of Multilingual Idioms and Similes in Large Language Models

Table of Contents

- Introduction
- Contact
- Citation

Introduction

This repository contains the dataset and code for the paper Comparative Study of Multilingual Idioms and Similes in Large Language Models.

Abstract: This study addresses the gap in the literature concerning the comparative performance of LLMs in interpreting different types of figurative language across multiple languages. By evaluating LLMs using two multilingual datasets on simile and idiom interpretation, we explore the effectiveness of various prompt engineering strategies, including chain-of-thought, few-shot, and English translation prompts. We also extend these datasets to Persian by building two new evaluation sets. Our comprehensive assessment involves both closed-source models (GPT-3.5, GPT-4o mini, Gemini 1.5) and open-source models (Llama 3.1, Qwen2), highlighting significant differences in performance across languages and figurative types. Our findings reveal that while prompt engineering methods are generally effective, their success varies by figurative type, language, and model. We also observe that open-source models struggle particularly with low-resource languages in simile interpretation. Additionally, idiom interpretation is nearing saturation for many languages, necessitating more challenging evaluations.
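
As a rough illustration of the prompt strategies named above, the following is a minimal sketch of how such prompts could be assembled for a single multiple-choice simile/idiom item. The function name `build_prompt`, the item fields, and the prompt templates are illustrative assumptions for this sketch, not the repository's actual code or the paper's exact prompts.

```python
# Minimal sketch (illustrative only) of the prompt strategies described
# in the abstract: zero-shot, chain-of-thought, few-shot, and
# English-translation prompting for one multiple-choice item.
from typing import Dict, List, Optional


def build_prompt(item: Dict, strategy: str = "zero-shot",
                 few_shot_examples: Optional[List[Dict]] = None) -> str:
    """Build an evaluation prompt for one figurative-language item.

    `item` is assumed to contain: 'sentence' (text with the idiom or simile),
    'options' (candidate interpretations), and optionally
    'english_translation' for the translation-based strategy.
    """
    options = "\n".join(f"{chr(65 + i)}. {o}" for i, o in enumerate(item["options"]))
    question = (
        f"Sentence: {item['sentence']}\n"
        f"Which option best expresses the figurative meaning?\n{options}\n"
    )

    if strategy == "zero-shot":
        return question + "Answer with a single letter."

    if strategy == "chain-of-thought":
        # Ask the model to reason before committing to an answer.
        return question + "Think step by step, then give the letter of your answer."

    if strategy == "few-shot":
        # Prepend a few solved demonstrations before the target question.
        demos = "\n\n".join(
            f"Sentence: {ex['sentence']}\nAnswer: {ex['answer']}"
            for ex in (few_shot_examples or [])
        )
        return demos + "\n\n" + question + "Answer with a single letter."

    if strategy == "english-translation":
        # Use an English rendering of the non-English sentence instead.
        return (
            f"Sentence (translated to English): {item['english_translation']}\n"
            f"Which option best expresses the figurative meaning?\n{options}\n"
            "Answer with a single letter."
        )

    raise ValueError(f"Unknown strategy: {strategy}")


if __name__ == "__main__":
    example = {
        "sentence": "He was as busy as a bee all weekend.",
        "options": ["He was very lazy.", "He was very active.", "He kept bees."],
        "english_translation": "He was as busy as a bee all weekend.",
    }
    print(build_prompt(example, "chain-of-thought"))
```

Running the script prints the chain-of-thought variant for a toy example; the actual prompts, languages, and data format used in the paper may differ.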

Contact

The authors' email addresses are {paria.khoshtab, namazifard, mostafa.masoudi, ali.akhgary, samin.mehdizadeh, y.yaghoobzadeh}@ut.ac.ir.

Please contact [] first if you have any questions.

Citation

Note: the proceedings version of the citation will be added after the conference.

@misc{khoshtab2024comparativestudymultilingualidioms,
      title={Comparative Study of Multilingual Idioms and Similes in Large Language Models}, 
      author={Paria Khoshtab and Danial Namazifard and Mostafa Masoudi and Ali Akhgary and Samin Mahdizadeh Sani and Yadollah Yaghoobzadeh},
      year={2024},
      eprint={2410.16461},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.16461}, 
}
