Vietnamese Bitext Corpus

This site is open for test purposes only.  Not all functions work yet, and bitext content will change.
About the SEAlang Library Bitext Corpus 
A bitext corpus shows words, phrases, and sentences in translation.  Insofar as possible, translated texts are aligned sentence-by-sentence.  Bitext corpora have many applications:
 - in education   bitexts can markedly increase student reading and comprehension in a second language.  Because the raw volume of text they read jumps so dramatically, students are exposed to a much wider vocabulary.  Moreover, when text is easier to read, students can begin to understand large-scale features of style and grammar. Bitexts have long been a mainstay of second-language education for European languages, and are equally valuable for students of English and Southeast Asian languages.
   Bitext search tools are a cornerstone of data-driven learning. Calling up a dozen examples of a word, phrase, or construction helps students understand and retain subtle distinctions of meaning and usage. It is even more helpful in teaching writing than reading, because bitext searches let real-world experts - writers and translators - provide on-the-spot advice and examples.
 - in research   bitexts are an essential part of research in translation, word-sense disambiguation, and lexicography. Because they let us leverage tools and techniques from other languages, particularly English, they are extremely important for learning how to build search engines, summarize documents, align texts, and so on for SEA languages.
These resources are primarily based on William Peter Hyde's A New Vietnamese-English Dictionary (2008, Dunwoody Press, 928 pages; ISBN 978-1-931546-43-0). Additional materials are provided by the Free Vietnamese Dictionary Project and Baamboo Tra Tu, a Wiki-style dictionary managed by VC Corp, which kindly provided the raw data.
Look for continuing development of SEAlang Library Vietnamese resources.