🔍 A comprehensive dataset of emojis and their descriptions in Arabic, for developers and language enthusiasts

a-ibrahimi/Arabic-Emojipedia

Arabic Emojipedia

Welcome to the Arabic Emojis Dataset Repository! 🌟

The aim of this repository is to serve as a linguistic resource for NLP researchers focusing on the Arabic language and its dialects. The included CSV file and associated package provide an easy way to substitute emojis with their corresponding Arabic descriptions, thereby enhancing interpretability and ensuring consistent representation in Arabic language and dialect datasets, including Moroccan and Tunisian Darija.

The dataset was created by processing the emoji dataset available at https://github.com/datasets/emojis. We wholeheartedly encourage contributions to expand and enrich this resource further.

Package Integration

New Addition: Package for Emoji Descriptions

A Python package now accompanies this dataset. It provides a dedicated function for accessing emoji descriptions programmatically.

Installation

You can install our package via pip:

pip install arabic-emojipedia

Usage Example

from arabic_emojipedia.emoji_description import get_emoji_description

emoji = "😊"
description = get_emoji_description(emoji)
print(f"Description for {emoji}: {description}")
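To substitute emojis inside a whole sentence rather than look up one at a time, something like the following could work. This is a minimal sketch, not part of the package: `replace_emojis` and `SAMPLE_DESCRIPTIONS` are hypothetical names, and the mapping is a stand-in copied from the sample rows in the CSV File Structure section. A plain dictionary is used here because this README does not say what `get_emoji_description` returns for an emoji it does not know.

```python
import re

# Stand-in mapping (assumption): a few sample rows from this README.
# In practice the full mapping would come from the CSV file.
SAMPLE_DESCRIPTIONS = {
    "🎄": "شجرة عيد الميلاد",
    "🎈": "بالون",
    "✨": "بريق",
}


def replace_emojis(text, descriptions):
    """Replace each emoji that appears in `descriptions` with its description."""
    if not descriptions:
        return text
    # Try longer keys first so multi-code-point emojis are matched whole.
    keys = sorted(descriptions, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: descriptions[match.group(0)], text)


print(replace_emojis("Happy holidays 🎄", SAMPLE_DESCRIPTIONS))
```

Emojis that are missing from the mapping are left untouched, which keeps the function safe to run over mixed text.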

CSV File Structure

The CSV file follows a structured format as shown below:

| Emoji | Name |
| --- | --- |
| 🎃 | جاك فانوس |
| 🎄 | شجرة عيد الميلاد |
| 🎆 | العاب ناريه |
| 🎇 | الماسة |
| 🧨 | مفرقعة نارية |
| ✨ | بريق |
| 🎈 | بالون |
| 🎉 | بوبر الحزب |

The file contains a total of 4,733 emojis.
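If you prefer to work with the CSV directly, it can be read into a dictionary with the standard library. This is a sketch under assumptions: it takes the column headers to be `Emoji` and `Name`, as in the table above, and `load_emoji_table` is a hypothetical helper name. The example reads from an in-memory sample so that it runs without the real file.

```python
import csv
import io


def load_emoji_table(csv_file):
    """Read rows with `Emoji` and `Name` columns (assumed headers) into a dict."""
    reader = csv.DictReader(csv_file)
    return {row["Emoji"]: row["Name"] for row in reader}


# In-memory sample standing in for the real CSV file.
sample = io.StringIO("Emoji,Name\n🎈,بالون\n✨,بريق\n")
table = load_emoji_table(sample)
print(table["🎈"])
```

For the real file, pass a handle opened with `encoding="utf-8"` and `newline=""` so the Arabic text and emoji are decoded correctly.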

🚀 Feel free to explore and leverage this dataset to enhance your NLP research in Arabic. We look forward to your contributions to make this resource even more comprehensive and valuable.

License

This project is licensed under the terms of the MIT License. See the LICENSE file for details.
