SEAlang Library -- SEAcat Tool Help

SEAcat Tools Help

Fixed search: search for the query exactly as entered, without resegmenting it.

Robust search: automatically generate all alternative segmentations before searching the title/author corpus.
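As an illustration of what "all alternative segmentations" can mean, here is a minimal Python sketch. It is not SEAcat's actual code: it assumes the query has already been split into syllables, and the function name alternative_segmentations is hypothetical. Each gap between syllables is either a word break or not, so n syllables give 2^(n-1) candidate segmentations.

```python
from itertools import product


def alternative_segmentations(syllables):
    """Yield every way of grouping a syllable sequence into words.

    Assumption: the input is a list of syllables; how SEAcat itself
    splits a query is not documented here.
    """
    if not syllables:
        return
    gaps = len(syllables) - 1
    # Each gap between two syllables is either a word break (True) or not.
    for breaks in product((False, True), repeat=gaps):
        words = []
        current = syllables[0]
        for syllable, is_break in zip(syllables[1:], breaks):
            if is_break:
                words.append(current)
                current = syllable
            else:
                current += syllable
        words.append(current)
        yield words


if __name__ == "__main__":
    for candidate in alternative_segmentations(["phra", "ya", "nakhon"]):
        print(" ".join(candidate))
```

A robust search would then look up each candidate in turn, so a query typed with different word breaks than the catalogue record can still match.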

Robust & rough search: like robust search, it tries every combination of segmentations. In addition, it ignores MARC diacritics (which usually mark vowel length), so it can find records that were not catalogued following ALA-LC guidelines:
- long vowels (e.g. ā) match short ones (a);
- ư matches u;
- o̜ matches o;
- ʻ is ignored;
- æ matches ae;
- œ matches oe.
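The matching rules above can be approximated by reducing both the query and the record to a folded key before comparing them. The Python sketch below is one possible way to do this, not SEAcat's implementation. The name rough_key and both lookup tables are hypothetical, and ʻ is assumed to be U+02BB. Note that the sketch strips every combining mark, which is broader than the rules listed.

```python
import unicodedata

# Hypothetical table for letters that have no Unicode decomposition.
LIGATURES = {"æ": "ae", "Æ": "AE", "œ": "oe", "Œ": "OE"}

# Assumption: the ignored mark is U+02BB (modifier letter turned comma).
IGNORED = {"\u02bb"}


def rough_key(text):
    """Fold a romanized string into a diacritic-free comparison key."""
    folded = []
    # NFD splits precomposed letters into base letter + combining marks,
    # e.g. "ā" becomes "a" + U+0304 and "ư" becomes "u" + U+031B.
    for ch in unicodedata.normalize("NFD", text):
        if ch in IGNORED:
            continue
        if unicodedata.combining(ch):
            # Drop macrons, horns, half rings and other combining marks.
            continue
        folded.append(LIGATURES.get(ch, ch))
    return "".join(folded)


def rough_match(query, record):
    """True when two strings are equal after folding."""
    return rough_key(query) == rough_key(record)
```

With this folding, rough_match("phum", "phūm") holds, because the macron is removed from the record before the comparison.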

Return links / html: "Links" is a very compact form, while "html" prints the complete record.

Romanize for interchange: use character components (e.g. diacritics like dots and macrons) rather than precomposed characters. This follows the MARC21 interchange standard, and makes it easier to cut-and-paste SEAcat output.

Romanize for looks: use precomposed characters whenever possible. This greatly improves text appearance in most fonts. However, a font that is designed to handle diacritics (like Doulos SIL) will usually render both component sequences and precomposed characters correctly.
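The difference between the two romanization options resembles the difference between Unicode's decomposed (NFD) and composed (NFC) normalization forms. The Python sketch below shows that analogy only; whether SEAcat uses Unicode normalization internally is an assumption, and both function names are hypothetical.

```python
import unicodedata


def romanize_for_interchange(text):
    """Return text as base letters followed by combining marks (NFD)."""
    return unicodedata.normalize("NFD", text)


def romanize_for_looks(text):
    """Return text with precomposed characters where they exist (NFC)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    word = "ph\u016bm"  # "phūm" with a precomposed u-macron
    decomposed = romanize_for_interchange(word)
    # The decomposed form is one code point longer: "u" + U+0304.
    print(len(word), len(decomposed))
    # Composing it again restores the precomposed string.
    print(romanize_for_looks(decomposed) == word)
```

Some combinations have no precomposed form in Unicode. For example, o followed by U+031C stays as two code points even after composition, which is why the option says "whenever possible".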

Check word breaks: automatically generate all alternative segmentations before searching the title/author corpus. Return counts only.

Check pronunciation: if possible, check native orthography against an authoritative reference. This is particularly helpful for SEA words of Indic origin (these are not always nativized to the same extent).