Khmer
[This document is for discussion, and should not yet be used as a reference. It closely follows pages 96-98 of the ALA-LC Romanization Tables, with annotations and typographic corrections by Doug Cooper, January 2006. An X entry represents a character from the original table that has not yet been entered here.]
Consonants
Full Form | Subscript | Romanization | | Full Form | Subscript | Romanization |
ក | ្ក | k | | ទ | ្ទ | d |
ខ | ្ខ | kh | | ធ | ្ធ | dh |
គ | ្គ | g | | ន | ្ន | n |
ឃ | ្ឃ | gh | | ប | ្ប | p |
ង | ្ង | ṅ | | ផ | ្ផ | ph |
ច | ្ច | c | | ព | ្ព | b |
ឆ | ្ឆ | ch | | ភ | ្ភ | bh |
ជ | ្ជ | j | | ម | ្ម | m |
ឈ | ្ឈ | jh | | យ | ្យ | y |
ញ | ្ញ or X | ñ | | រ | ្រ | r |
ដ | ្ដ | ṭ | | ល | ្ល | l |
ឋ | ្ឋ | ṭh | | វ | ្វ | v |
ឌ | ្ឌ | ḍ | | ស | ្ស | s |
ឍ | ្ឍ | ḍh | | ហ | ្ហ | h |
ណ | ្ណ | ṇ | | ឡ | - | ḷ |
ត | ្ត | t | | អ | ្អ | q |
ថ | ្ថ | th | | | | |
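The consonant table and the subscript rule in note 1 can be read as a simple lookup. The following Python fragment is a minimal sketch, not part of the ALA-LC table: the names `COENG`, `CONSONANTS` and `romanize_cluster` are hypothetical, and it assumes the text is encoded in Unicode, where a subscript is stored as U+17D2 followed by the consonant. Only the rows ក through វ are included.

```python
COENG = "\u17D2"  # KHMER SIGN COENG: the next consonant is written as a subscript

# Romanizations for ក (U+1780) through វ (U+179C), in Unicode order.
# The last four rows of the table (ស, ហ, ឡ, អ) are left out of this sketch.
_ROMAN = ("k kh g gh ṅ c ch j jh ñ ṭ ṭh ḍ ḍh ṇ "
          "t th d dh n p ph b bh m y r l v").split()
CONSONANTS = {chr(0x1780 + i): r for i, r in enumerate(_ROMAN)}


def romanize_cluster(cluster: str) -> str:
    """Romanize a full-form consonant and its subscripts, with no vowel between them."""
    parts = []
    for ch in cluster:
        if ch == COENG:
            continue  # the subscript marker itself is not romanized
        if ch not in CONSONANTS:
            raise KeyError(f"U+{ord(ch):04X} is not in this partial table")
        parts.append(CONSONANTS[ch])
    return "".join(parts)
```

For example, `romanize_cluster("\u1780\u17D2\u179A")` (ក្រ) returns `kr`. Vowels and diacritics are deliberately out of scope here.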
[Independent] Vowels
Independent | Romanization |
ឥ | i |
ឦ | ī |
ឧ | u |
ឪ or ឩ | ū |
ឯ | e |
ឰ | ai |
ឲ or ឱ | o |
ឳ or X | au |
ឫ | r̥ (not ṛ ṝ ḷ ḹ) |
ឬ | r̥̄ |
ឭ | l̥ |
ឮ | l̥̄ |
[Dependent] Vowels
Dependent | Romanization |
- | -a |
- - | -a- |
- ់ | -á- |
័ - | -ă- |
ា | -ā |
ា់ | -â- |
ិ | -i |
ី | -ī |
ឹ | -ẏ |
ឺ | -ȳ |
ុ | -u |
ូ | -ū |
ួ | -ua |
ើ | -oe |
ឿ | -ẏa |
ៀ | -ia |
េ | -e |
ែ | -ae |
ៃ | -ai |
ោ | -o |
ៅ | -au |
ំ | -aṃ |
ះ | -aḥ |
ៈ | -à |
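As an illustration of how the dependent vowel table might be applied, here is a minimal Python sketch. The names `DEPENDENT_VOWELS` and `attach_vowel` are hypothetical, only a few single-code-point vowel signs are listed, and the fallback to the inherent -a is a simplification: it ignores final consonants (note 4) and the diacritic use of ុ (note 6).

```python
# A few vowel signs from the table, keyed by Unicode code point.
DEPENDENT_VOWELS = {
    "\u17B6": "ā",   # ា
    "\u17B7": "i",   # ិ
    "\u17B8": "ī",   # ី
    "\u17BB": "u",   # ុ
    "\u17BC": "ū",   # ូ
    "\u17C1": "e",   # េ
    "\u17C2": "ae",  # ែ
    "\u17C3": "ai",  # ៃ
    "\u17C4": "o",   # ោ
    "\u17C5": "au",  # ៅ
}


def attach_vowel(consonant_roman: str, vowel_sign: str = "") -> str:
    """Append the romanized vowel; with no vowel sign, use the inherent -a."""
    if vowel_sign == "":
        return consonant_roman + "a"
    return consonant_roman + DEPENDENT_VOWELS[vowel_sign]
```

For example, `attach_vowel("s", "\u17C1")` returns `se`, and `attach_vowel("k")` returns `ka`.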
Diacritical Marks
Vernacular | Alternative | Romanization |
៉ | ុ | ˝ (hard sign) |
៊ | ុ | ʹ (soft sign (prime)) |
៌ | | r- |
៍ | | - ̊ (circle above) [See note 7] |
៎ | | -ʼ (alif) |
៏ | | -ʻ (ayn) |
៱ X | | -˙ (dot above) [See note 7] |
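The alternative form ុ in the table is ambiguous between a diacritic and the vowel u; note 6 gives the rule for telling them apart. The following Python sketch encodes that rule as written. The function and constant names are hypothetical, and it assumes the hard sign and soft sign are encoded as U+02DD and U+02B9.

```python
HARD_SIGN = "\u02DD"  # ˝ (assumed code point for the hard sign)
SOFT_SIGN = "\u02B9"  # ʹ (assumed code point for the soft sign)

SUPERSCRIPT_VOWELS = {"\u17B7", "\u17B8", "\u17B9", "\u17BA"}  # ិ ី ឹ ឺ

# Consonants listed in note 6 for each reading of ុ.
HARD_CONSONANTS = {"\u1784", "\u1789", "\u1798", "\u1799",
                   "\u179A", "\u179C", "\u1794"}  # ង ញ ម យ រ វ ប
SOFT_CONSONANTS = {"\u179F", "\u17A0", "\u17A2"}  # ស ហ អ


def romanize_lower_u(consonant: str, other_vowel: str = "") -> str:
    """Return the romanization of ុ given its consonant and any co-occurring vowel sign."""
    if other_vowel in SUPERSCRIPT_VOWELS:
        if consonant in HARD_CONSONANTS:
            return HARD_SIGN
        if consonant in SOFT_CONSONANTS:
            return SOFT_SIGN
    return "u"  # otherwise ុ is the plain vowel u
```

With ប and ី the function returns the hard sign, with ស and ី the soft sign, and for ម with no superscript vowel it returns `u`, matching the three examples in note 6.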
Notes
1. In the consonant portion of this romanization table, the special character – shows the position of a Khmer script character below which a subscript character is written. A subscript character is always romanized after a full form character, without an intervening vowel, as in ក្រខ្វាក់ (krakhvâk).
2. When ញ (ñ) occurs with a subscript character, the lower element is omitted, as in ញ្ច (ñc). When ញ occurs as its own subscript, it takes the full form ញ, as in កញ្ញា (kaññā). Otherwise, the subscript has the form of the lower element alone, as in ខ្ញ (khñ).
3. The consonant ប (p), followed by the vowel ា (ā), takes the special form បា.
4. In the vowel columns, - shows the position of the consonant relative to the vowel. This applies to both the Khmer vernacular and to the romanization columns. It should be noted that – in the Khmer vernacular column can also represent a final consonant with no vowel following, in which case it is simply romanized as - , as in ទ័ព (dăb).
5. The consonants ំ (ṃ) and ះ (ḥ) are always preceded by a vowel, but, being finals, never themselves bear a vowel. Vowels other than a may precede them, as in ដុំ (ṭuṃ), សេះ (seḥ).
6. The diacritics ៉ and ៊ are romanized by ˝ and ʹ respectively, immediately following the consonant they modify. They take the alternative form ុ when they co-occur with one of the superscript vowels ិ, ី, ឹ, and ឺ. When ុ co-occurs with one of the superscript vowels and with one of the consonants ង, ញ, ម, យ, រ, វ, or ប, it is romanized as ˝, as in ប៉ី (p˝ī). When ុ co-occurs with one of the superscript vowels and with one of the consonants ស, ហ, or អ, it is romanized as ʹ, as in ស៊ី (sʹī). Otherwise, ុ represents the vowel u, as in មុន (mun).
7. The diacritics -˚, -ʼ, -ʻ, and -˙ in the romanization column are placed after the last letter of the word in which they occur, as in ក្សត្រីយ៍ (ksatrīy˚); ច៎ាះ (cāḥʼ); ដ៏ (ṭaʻ); អាត្មន (qātman˙).
[Ed: Note that this calls for ring above (U+02DA) and dot above (U+02D9), not combining ring above (U+030A) and combining dot above (U+0307) as specified by the MARC code for Character Modifiers, below. Ring above and dot above should be added to the Special Characters table.]
8. Conventional signs are: ៗ, romanized by repeating the preceding word or phrase; ។ល។, romanized as .l.; ។ប។, romanized as .p.; X (underscore?), romanized by means of a hyphen (-); ៖, romanized by means of a colon (:); and ។ and ៕, romanized by means of a period (.). The signs ៙ and ៚ are omitted in romanization.
9. The numerals are ០ (0), ១ (1), ២ (2), ៣ (3), ៤ (4), ៥ (5), ៦ (6), ៧ (7), ៨ (8), and ៩ (9).
10. Khmer words are not written separately, and spacing occurs only after longer phrases. When romanizing, the shortest written form which can stand alone as a word is treated as such. This applies also to Pali and Sanskrit loan-words. Other loan-words are divided as in the original language.
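The numerals in note 9 and the single-character signs in note 8 map one-to-one, so they can be handled with a translation table. This is a minimal Python sketch under that assumption; the names `SIGN_TABLE` and `romanize_digits_and_signs` are hypothetical, and the repetition sign ៗ and the abbreviations ។ល។ and ។ប។ are not handled because they depend on the surrounding text.

```python
# Khmer digits ០-៩ occupy U+17E0 through U+17E9, in order.
_DIGITS = {chr(0x17E0 + n): str(n) for n in range(10)}

_SIGNS = {
    "\u17D6": ":",   # ៖ romanized as a colon
    "\u17D4": ".",   # ។ romanized as a period
    "\u17D5": ".",   # ៕ romanized as a period
    "\u17D9": None,  # ៙ omitted in romanization
    "\u17DA": None,  # ៚ omitted in romanization
}

SIGN_TABLE = str.maketrans({**_DIGITS, **_SIGNS})


def romanize_digits_and_signs(text: str) -> str:
    """Replace Khmer digits and simple signs; all other characters pass through unchanged."""
    return text.translate(SIGN_TABLE)
```

For example, `romanize_digits_and_signs("\u17E2\u17E0\u17E0\u17E6")` (២០០៦) returns `2006`.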
Special Characters and Character Modifiers used in Romanization
(See the MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media / CHARACTER SETS: Part 3 / Code Tables / January 2000, updated September 2004)
http://www.loc.gov/marc/specifications/specchartables.html
http://lcweb2.loc.gov/cocoon/codetables/45.html (Extended Latin)
http://www.atla.com/tsig/LatinCharacters/Latin%20Characters%20in%20Unicode%20and%20MARC-8.pdf (full set of examples)