
๐ŸŒ The Indigenous Language Translator Engine (ILTE) is a AI-powered, smart, efficient, and scalable translation tool designed to bridge the linguistic gap between Indonesian (ID), English (EN), and Dayak Kenyah (DYK). Whether for language preservation, cultural research, education, or seamless communication with others. โœจ


Protostarship/Dyk-Knyh-engine


๐ŸŒ Indigenous Language Translator Engine (ILTE) ๐ŸŒฟ

📌 Developed by XI TJKT 2 | 2024/2025 | ❗ Any commercial use or unauthorized exploitation is prohibited

Release:


 ██╗  ██╗     ████████╗ ███████╗
 ██║  ██║     ╚══██╔══╝ ██╔════╝
 ██║  ██║        ██║    ███████╗
 ██║  ██║        ██║    ██╔════╝
 ██║  ███████╗   ██║    ███████╗
 ╚═╝  ╚══════╝   ╚═╝    ╚══════╝
 -------------------------------
 ILTE - Indigenous Language Translator Engine

📌 Overview

The Indigenous Language Translator Engine (ILTE) now offers four distinct versions, each tailored to different translation needs:

  • 🌱 ILTE-ALT (Optimized for Speed): A lightweight, dictionary-based translator optimized for fast, low-resource translations.
  • 🧠 ILTE-ZS (Hybrid, Multi-Processing): Combines dictionary-based rules, RBMT, FST, semantic matching, and zero-shot translation while efficiently handling large text files.
  • 🧐 ILTE-ADV (AI-Powered, Context-Aware): An advanced, AI-driven translation engine that integrates context awareness, semantic similarity, and zero-shot learning.
  • 🔮 ILTE-ATI (Advanced Attention & Iterative Processing): The most sophisticated version, with hierarchical normalization, iterative refinement, attention-based translation, and multi-level candidate selection.

✨ Key Features

ILTE-ALT - Simple, Fast & Efficient

  • ✅ Dictionary-Based Lookup for direct translations.
  • ✅ Basic Stemming for Indonesian (ID) & English (EN).
  • ✅ Levenshtein Distance Matching for closest word lookup.
  • ✅ Automated Confidence Scoring for accuracy estimation.
  • ✅ Structured DOCX Report Generation.
  • ✅ Low Memory Usage: optimized for lower-end machines.
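
The lookup, fuzzy matching, and confidence scoring listed above could be combined roughly as in the sketch below. This is an illustration only, not the engine's actual code: the function names, the `max_distance` cutoff, the confidence formula, and the toy Indonesian-to-English dictionary are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def lookup(word: str, dictionary: Dict[str, str],
           max_distance: int = 2) -> Tuple[Optional[str], float]:
    """Return (translation, confidence); an exact hit scores 1.0."""
    key = word.lower()
    if key in dictionary:
        return dictionary[key], 1.0
    if not dictionary:
        return None, 0.0
    # Fall back to the closest dictionary entry by edit distance.
    distances = {entry: levenshtein(key, entry) for entry in dictionary}
    best = min(distances, key=distances.get)
    if distances[best] > max_distance:
        return None, 0.0
    # Assumed scoring rule: fewer edits relative to word length = higher score.
    confidence = 1.0 - distances[best] / max(len(key), len(best))
    return dictionary[best], confidence


# Toy Indonesian -> English entries, for illustration only.
toy_dictionary = {"makan": "eat", "minum": "drink", "rumah": "house"}
exact = lookup("makan", toy_dictionary)   # exact dictionary hit
fuzzy = lookup("makn", toy_dictionary)    # one edit away from "makan"
```

A cutoff such as `max_distance` keeps very short words from matching unrelated entries; the right value would depend on the real dictionary.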

ILTE-ZS - Hybrid, Large-Scale Processing & Efficient

  • ⚡ Dictionary + RBMT + FST + Semantic Matching + Zero-Shot Translation.
  • ⚖️ Handles Large Files Efficiently via chunking & batch multi-processing.
  • 🛠️ Optimized Resource Management: frees memory and GPU resources after processing.
  • 🔄 Auto-Parallelized Translation Pipeline.
  • ⏳ Faster Preprocessing, No Unnecessary Computation.

ILTE-ADV - AI-Powered, Context-Aware & Smarter

  • 🧠 Contextual Translation using IndoBERT & Sentence Transformers.
  • 🔍 Zero-Shot Learning for Handling Unknown Words.
  • 📚 Pattern-Based Learning & Semantic Matching.
  • 🛠️ Enhanced Translation Confidence Metrics.
  • ⚡ Leverages GPU Acceleration for Faster Processing.
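
Semantic matching of this kind usually comes down to comparing embedding vectors by cosine similarity. The sketch below shows only that comparison step, under stated assumptions: the function names and the `threshold` value are hypothetical, and the two-dimensional vectors are hand-made placeholders standing in for embeddings that a sentence-embedding model would produce in practice.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_semantic_match(query_vec: Sequence[float],
                        entry_vecs: Dict[str, Sequence[float]],
                        threshold: float = 0.7) -> Tuple[Optional[str], float]:
    """Return the entry most similar to the query, or None below threshold."""
    if not entry_vecs:
        return None, 0.0
    scores = {entry: cosine_similarity(query_vec, vec)
              for entry, vec in entry_vecs.items()}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]


# Placeholder vectors, not real model output.
toy_vectors = {"house": [1.0, 0.0], "river": [0.0, 1.0]}
match, score = best_semantic_match([0.9, 0.1], toy_vectors)
```

The similarity score can double as a confidence value, which is one plausible way to feed the confidence metrics listed above.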

ILTE-ATI - Attention-Based, Iterative & Highly Adaptive

  • ✨ Hierarchical Normalization for Better Preprocessing.
  • 🔄 Iterative Translation for Context Awareness.
  • 📚 Attention-Based Translation for Multi-Level Candidate Generation.
  • ⚖️ Refined Confidence Scoring & Adaptive Refinement.
  • ✅ Full Formatting Preservation in DOCX Reports.
  • 🚀 Optimized for Dynamic, Multi-Stage Translation Processes.
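
One way to read "hierarchical normalization" combined with multi-level candidates is to try progressively more aggressive normal forms of a token and lower the confidence at each level. The sketch below follows that reading; it is an assumption, not the ATI engine's actual pipeline. The names, the `penalty` value, and the very small suffix list (a few common Indonesian clitics and particles) are illustrative, and real stemming would need far more care.

```python
import re
from typing import Dict, List, Optional, Tuple

# A few common Indonesian clitics/particles; deliberately incomplete.
SUFFIXES = ("nya", "lah", "kah", "ku", "mu")


def normalize_levels(token: str) -> List[str]:
    """Return candidate forms of a token, least aggressive first."""
    levels = [token]
    lowered = token.lower()
    levels.append(lowered)
    cleaned = re.sub(r"[^\w\s-]", "", lowered).strip()  # drop punctuation
    levels.append(cleaned)
    for suffix in SUFFIXES:
        # Only strip when a reasonably long stem would remain.
        if cleaned.endswith(suffix) and len(cleaned) > len(suffix) + 2:
            levels.append(cleaned[: -len(suffix)])
            break
    return list(dict.fromkeys(levels))  # remove duplicates, keep order


def translate_token(token: str, dictionary: Dict[str, str],
                    penalty: float = 0.15) -> Tuple[Optional[str], float]:
    """Try each normalization level in turn; deeper levels score lower."""
    for depth, form in enumerate(normalize_levels(token)):
        if form in dictionary:
            return dictionary[form], max(0.0, 1.0 - penalty * depth)
    return None, 0.0


# Toy Indonesian -> English entry, for illustration only.
toy_dictionary = {"rumah": "house"}
result = translate_token("Rumahnya!", toy_dictionary)
```

Here `"Rumahnya!"` only matches after lowercasing, punctuation removal, and suffix stripping, so it is returned with a reduced confidence compared with an exact hit on `"rumah"`.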

⚛ Models Used in Each Version

🌱 ILTE-ALT (Dictionary-Based)

| Feature | Model Used |
| --- | --- |
| Translation (ID-EN, EN-ID) | Helsinki-NLP/opus-mt-id-en, Helsinki-NLP/opus-mt-en-id |
| Stemming | Sastrawi (Indonesian), SnowballStemmer (English) |
| Fuzzy Matching | Levenshtein Distance |

🧠 ILTE-ZS (Hybrid Processing)

| Feature | Model Used |
| --- | --- |
| Dictionary-Based Lookup | JSON-based dictionary |
| Rule-Based Translation (RBMT, FST) | Custom FST Rules |
| Semantic Similarity | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 |
| Zero-Shot Translation | facebook/mbart-large-50-many-to-many-mmt |

๐Ÿง ILTE-ADV (AI-Powered)

Feature Model Used
Contextual Embeddings cahya/bert-base-indonesian-1.5G
Semantic Matching sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
Zero-Shot Classification typeform/distilbert-base-uncased-mnli
Translation (ID-EN, EN-ID) Helsinki-NLP/opus-mt-id-en, Helsinki-NLP/opus-mt-en-id

🔮 ILTE-ATI v3-Alpha.3 (Attention-Based & Iterative Processing)

| Feature | Model Used |
| --- | --- |
| Hierarchical Normalization | Regex + Dynamic Stemming |
| Contextual Translation | sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 |
| Iterative Processing | Multi-Level Candidate Refinement |
| Translation (ID-DYK, DYK-ID) | Enhanced Dictionary Lookup + Semantic Matching |

📊 Comparison Table

| Feature | ILTE-ALT | ILTE-ZS | ILTE-ADV | ILTE-ATI |
| --- | --- | --- | --- | --- |
| Translation Approach | Dictionary | Hybrid | AI-Based | Attention-Based + Iterative |
| Processing Speed | Fast | Moderate | Slower | Balanced |
| Handling Large Files | Struggles | Efficient Chunking | Slower | Optimized Processing |
| Memory Usage | Low | Moderate | High | Optimized |
| Context Awareness | None | Partial | Strong | 🔮 Very Strong |
| Idiomatic Expressions | Limited | Rule-Based | AI-Based | AI + Attention |
| Parallelization | Minimal | Yes | DataLoader | Thread + Process Pool |
| Zero-Shot Capability | No | Yes | Yes | Yes |
| Best Use Case | Fast translation | Large text processing | Context-Aware | High-Accuracy, AI-Powered |

📚 How to Use

Running ILTE-ALT (Simple Mode)

python engine_ALT.py

Running ILTE-ZS (Hybrid & Efficient Mode)

python engine_ZS.py

Running ILTE-ADV (AI-Powered Mode)

python engine_ADV.py

Running ILTE-ATI (Advanced Iterative Attention Engine)

python engine_ATI.py
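
If switching between versions often, a small wrapper can map a version name to the matching script. This is a hypothetical convenience helper, not part of the repository: `build_command` and `run_engine` are made-up names, and the sketch assumes the four scripts listed above sit in the current working directory.

```python
import subprocess
import sys
from typing import List

# Script names as listed in the usage section above.
ENGINE_SCRIPTS = {
    "ALT": "engine_ALT.py",
    "ZS": "engine_ZS.py",
    "ADV": "engine_ADV.py",
    "ATI": "engine_ATI.py",
}


def build_command(version: str) -> List[str]:
    """Return the command line for a given ILTE version name."""
    try:
        script = ENGINE_SCRIPTS[version.upper()]
    except KeyError:
        raise ValueError(
            f"Unknown version {version!r}; expected one of {sorted(ENGINE_SCRIPTS)}"
        ) from None
    # sys.executable reuses the interpreter that is running this helper.
    return [sys.executable, script]


def run_engine(version: str) -> None:
    """Launch the selected engine and raise if it exits with an error."""
    subprocess.run(build_command(version), check=True)
```

For example, `run_engine("zs")` would launch `engine_ZS.py` with the current Python interpreter.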

🎯 Conclusion

Choose the version that best suits your needs and contribute to indigenous language preservation. 🚀

  • ✅ ALT: For lightweight, dictionary-based translations.
  • ✅ ZS: For handling large files efficiently with hybrid translation techniques.
  • ✅ ADV: For AI-powered, context-aware translations.
  • ✅ ATI: For attention-based, iterative translation that attends closely to content and context.

🔗 Developed for Indigenous Language Preservation 🌍💡

📚 Licensed under GPL v3. Any commercial use is strictly prohibited.
