Normalization
iLib supports character normalization. However, it had a bug in Hangul normalization that went unnoticed until UCD 9.0 was used for the normalization unit tests. The bug was fixed so that the normalization unit tests pass with UCD 9.0 (see glyphString.js), and this article describes it.
The Unicode Standard contains both a large set of precomposed modern Hangul syllables and a set of conjoining Hangul jamo. Precomposed Hangul syllables can be decomposed into sequences of conjoining Hangul jamo.
Chapter 3 of the Unicode Standard contains Definition #131:
- `LVT_Syllable`: A character with `Hangul_Syllable_Type` property value `LVT_Syllable`. Abbreviated as `LVT`.
  - An `LVT_Syllable` has a canonical decomposition to a sequence of the form `<LV, T>`.
For example, the Hangul syllable '각' has a canonical decomposition to a sequence of the form <가, ᆨ>.
Likewise, the Hangul syllable '가' has a canonical decomposition to a Hangul jamo sequence of the form <ᄀ, ᅡ>.
Here is Definition #130.
- `LV_Syllable`: A character with `Hangul_Syllable_Type` property value `LV_Syllable`. Abbreviated as `LV`.
  - An `LV_Syllable` has a canonical decomposition to a sequence of the form `<L, V>`.
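The two definitions above can be sketched with the Hangul syllable arithmetic from the Unicode Standard. This is a minimal illustration, not iLib's actual code: the helper name `decomposeHangulStep` is hypothetical, and the constants are the standard Hangul base and count values.

```javascript
// Standard Hangul syllable arithmetic constants (Unicode Standard, Chapter 3).
const S_BASE = 0xAC00, L_BASE = 0x1100, V_BASE = 0x1161, T_BASE = 0x11A7;
const V_COUNT = 21, T_COUNT = 28;
const N_COUNT = V_COUNT * T_COUNT; // 588 syllables per leading consonant
const S_COUNT = 19 * N_COUNT;      // 11172 precomposed syllables

// Hypothetical helper: applies ONE canonical decomposition step.
// LVT -> [LV, T], LV -> [L, V], anything else -> [ch] unchanged.
function decomposeHangulStep(ch) {
  const sIndex = ch.codePointAt(0) - S_BASE;
  if (sIndex < 0 || sIndex >= S_COUNT) {
    return [ch]; // not a precomposed Hangul syllable
  }
  const tIndex = sIndex % T_COUNT;
  if (tIndex !== 0) {
    // LVT_Syllable: split off the trailing consonant.
    return [
      String.fromCodePoint(S_BASE + sIndex - tIndex),
      String.fromCodePoint(T_BASE + tIndex),
    ];
  }
  // LV_Syllable: split into leading consonant and vowel.
  const lIndex = Math.floor(sIndex / N_COUNT);
  const vIndex = Math.floor((sIndex % N_COUNT) / T_COUNT);
  return [
    String.fromCodePoint(L_BASE + lIndex),
    String.fromCodePoint(V_BASE + vIndex),
  ];
}

// '각' (U+AC01) -> <가 (U+AC00), ᆨ (U+11A8)>
console.log(decomposeHangulStep("\uAC01"));
// '가' (U+AC00) -> <ᄀ (U+1100), ᅡ (U+1161)>
console.log(decomposeHangulStep("\uAC00"));
```

Applying the step to '각' and then again to the resulting '가' yields the full jamo sequence.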
The definitions above describe how Hangul is decomposed. Composition applies the same steps in reverse order.
But iLib tried to compose a sequence of the form `<LVT, T>` (this has since been fixed, see https://github.com/enyojs/iLib/pull/85/files#diff-0492a848b42aff4c7126162dd133208b).
Such a sequence cannot form an `LVT_Syllable`, because an `LVT_Syllable` consists of `LV` and `T`, not `LVT` and `T`.
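The difference between the two behaviors can be sketched as follows. This is an illustration only and not the code from glyphString.js: the helper `composeSyllableWithTrailing` and its `requireLV` flag are hypothetical, with `requireLV = false` imitating the buggy behavior described above.

```javascript
// Standard Hangul syllable arithmetic constants (Unicode Standard, Chapter 3).
const S_BASE = 0xAC00, T_BASE = 0x11A7;
const T_COUNT = 28, S_COUNT = 11172;

// Hypothetical helper: tries to compose <syllable, trailing consonant>.
// Returns the composed syllable, or null when the pair must not compose.
function composeSyllableWithTrailing(syllable, trailing, requireLV = true) {
  const sIndex = syllable.codePointAt(0) - S_BASE;
  const tIndex = trailing.codePointAt(0) - T_BASE;
  const isSyllable = sIndex >= 0 && sIndex < S_COUNT;
  const isTrailing = tIndex > 0 && tIndex < T_COUNT;
  if (!isSyllable || !isTrailing) {
    return null;
  }
  // Only an LV syllable (no trailing consonant yet) may take a T.
  // Skipping this check reproduces the <LVT, T> mistake.
  if (requireLV && sIndex % T_COUNT !== 0) {
    return null;
  }
  return String.fromCodePoint(S_BASE + sIndex + tIndex);
}

// <가, ᆨ> is <LV, T>, so it composes to '각' (U+AC01).
console.log(composeSyllableWithTrailing("\uAC00", "\u11A8"));
// <각, ᆨ> is <LVT, T>, so it must stay uncomposed (null).
console.log(composeSyllableWithTrailing("\uAC01", "\u11A8"));
// Without the LV check, the same pair wrongly becomes '갂' (U+AC02).
console.log(composeSyllableWithTrailing("\uAC01", "\u11A8", false));
```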
That is why the unit test using UCD 9.0 failed.
Here is the failing test case.
Failed test case for NFC: <ᄀ각ᆨ>
Expected: <ᄀ각ᆨ>
Actual: <ᄀ갂> (iLib tried to compose `LVT` and `T`)
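The expected result can be cross-checked against the JavaScript engine's built-in `String.prototype.normalize`. This checks the engine's own normalizer, not iLib's, and the `toCodePoints` helper is only there to make the output readable.

```javascript
// The failing input: ᄀ (U+1100), 각 (U+AC01), ᆨ (U+11A8).
const input = "\u1100\uAC01\u11A8";

// Illustrative helper: formats a string as a list of code points.
const toCodePoints = (s) =>
  Array.from(s, (c) =>
    "U+" + c.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  ).join(" ");

const nfc = input.normalize("NFC");

console.log(toCodePoints(nfc)); // U+1100 U+AC01 U+11A8
console.log(nfc === input);     // true: <LVT, T> is left uncomposed
```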
For more information, see Section 3.12 of the Unicode Standard 9.0 (Chapter 3).