Southeast Asian Languages Library Cataloging Tools
Library cataloging of Southeast Asian texts follows two distinct paths, relying on local orthographies within Southeast Asia, and on romanization elsewhere. The SEAcat tools serve three purposes:

 -  assist in generating accurate and consistent Library of Congress romanization from local orthography;
 -  allow searches of romanized records using queries in local orthography, and provide more sophisticated tools for searching SEA records in general;
 -  assess the feasibility of converting existing romanized records back to their original SEA orthographies.

About Library of Congress Romanization
The most widely used (and in many instances, required) approach to romanization is the system approved by the US Library of Congress and American Library Association, and published as the ALA-LC Romanization Tables: Transliteration Schemes for Non-Roman Scripts (1997).  Scans of most pages are available at www.loc.gov/catdir/cpso/roman.html, and copies of mainland SEA sections are linked on the left.

    Unfortunately, as defined for the mainland SEA writing systems, the ALA-LC system has serious shortcomings.  First, although all four writing systems are based on the same underlying Indic script design, the ALA-LC romanization is not.   Although there is some overlap, different symbols are used for each SEA orthography; e.g. vs ng, or œ vs oe.  This has undermined the development of regional tools and expertise.

    Second, by design, the romanization is neither strictly phonetic nor strictly orthographic, so it is extremely difficult to read as ordinary text (for example, the Lao and Thai systems do not indicate tone).  This is not a problem as long as the ALA-LC romanization is used for its intended purpose as a convenient notation for cataloging.

    However, complex implementation rules undermine the system.  Thai has more than a dozen pages of rules that determine when and where words should be divided, as though the cataloging information were going to be read like ordinary text.  These rules do little to make the text more understandable; rather, they create countless opportunities for inconsistent cataloging.  In the end, the system is neither readable nor reliable.

    The best solution will be to convert catalog entries back to the original Southeast Asian orthographies.  This cannot be done automatically, because ALA-LC romanization is a lossy, many-to-one system: several distinct local spellings can share one romanized form, so the romanized form alone does not identify the original.  On the other hand, we can develop tools that greatly improve the productivity of human back-translation.
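
    The many-to-one problem can be illustrated with a minimal Python sketch.  The mapping below is a small illustrative subset of Thai initial consonants chosen for this example; it is an assumption for demonstration, not a copy of the ALA-LC tables, and the function names are hypothetical rather than part of the SEAcat tools.

```python
from collections import defaultdict

# Illustrative subset only (an assumption for this sketch, not the ALA-LC
# tables): several Thai consonants collapse to one romanized form.
TO_ROMAN = {
    "ส": "s", "ศ": "s", "ษ": "s", "ซ": "s",
    "ท": "th", "ธ": "th", "ถ": "th",
    "ม": "m",
}


def invert(mapping):
    """Build the reverse table: romanized form -> list of possible originals."""
    inverse = defaultdict(list)
    for local, roman in mapping.items():
        inverse[roman].append(local)
    return dict(inverse)


FROM_ROMAN = invert(TO_ROMAN)


def candidates(roman):
    """Return every local-script letter that could have produced `roman`."""
    return FROM_ROMAN.get(roman, [])


def is_ambiguous(roman):
    """True when the romanized form cannot be reversed without human help."""
    return len(candidates(roman)) > 1


if __name__ == "__main__":
    for roman in ("s", "th", "m"):
        print(roman, candidates(roman), "ambiguous" if is_ambiguous(roman) else "unique")
```

    In this toy table only "m" can be reversed mechanically; "s" and "th" each leave several candidates, which is the kind of choice a back-translation tool could present to a human cataloger instead of guessing.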
             
The SEAlang SEAcat Tools have been thoroughly tested with Firefox 1.5 (Windows XP and Linux 2.6.14), Netscape 7.2 (Windows XP), and Safari 2.0 (Apple OSX 10.4).  Browsers that do not comply with W3C standards (in particular, Microsoft Internet Explorer) are not supported, and the Library resources will not display properly in them.