About the SEAlang Munda Etymological Dictionary
The Munda Languages Project's primary resources are this Etymological Dictionary,
built to support work in comparative and historical linguistics, and a companion
Languages Database devoted
to preservation and sharing of language and lexical resources.
Please read the
MKED Tutorial and Cookbook
if you're a first-time user.
Organization
Data are obtained from a range of sources, including:
etymological dictionaries
that include proposed proto-language reconstructions,
comparative dictionaries that group citations into
etymological or semantic sets, but do not propose reconstructions,
"linguist's lexicons" that provide careful phonemic
renderings and brief glosses, and
ordinary dictionaries that may or may not include
phonemic rendering.
Project technology, including the underlying software architecture and documentation, is
based on the results of the
Mon-Khmer Languages Project (CRCL / NEH 2007-2011).
We are pleased to collaborate with Paul Sidwell and the ANU on the
Proto-Austroasiatic Lexicon Project (Sidwell / ARC 2012-2016). This
will result in complete reconstructions for proto-Austroasiatic and
its branches and sub-branches (Sidwell), and an AA languages website
with gold-standard comparative datasets (CRCL).
Capabilities
The Etymological Dictionary provides four basic functions:
searching data, based on phonemic, orthographic, or semantic queries,
organizing results into comparative or historical sets,
restricting searches, based on language and/or source,
naming datasets and individual items for citation and reuse.
Developing reliable mechanisms for on-line
collaboration is a central project goal.
New datasets of citations, reconstructions, relations, and
comments are welcome, and are readily added to the database.
However, all datasets are individually identified: every
user can easily decide which sets to include or exclude from searches.
Data entry and indexing
As noted above, data sources are inconsistently organized. We make every effort to:
expose data for searching, e.g. by expanding bracketed
reconstructions. For example, a head that is originally listed
as *b[h]raap may be searched as
braap or bhraap.
extend queries in a manner that meets the user's intention.
For example, unvoiced consonant variants are automatically included
in searches, as are breathy, creaky, diphthongized, or long vowel variations.
This behavior can be overridden.
preserve implicit information from original sources.
For example, dialect identifiers, glosses, and derivational relations
may be inferred, or phonemic equivalents may be added.
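The bracket expansion described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the project's actual indexing code: the function name expand_bracketed is hypothetical, and the sketch assumes that every bracketed segment is simply optional and that brackets are never nested.

```python
import re
from itertools import product


def expand_bracketed(head: str) -> list[str]:
    """Return every searchable form of a head with optional [bracketed] segments.

    Assumption (not from the source): brackets mark optional material
    and are never nested.
    """
    form = head.lstrip("*")  # drop the reconstruction asterisk
    # Splitting on a capturing group keeps the bracketed text at odd indexes.
    parts = re.split(r"\[([^\]]*)\]", form)
    choices = [
        ("", part) if i % 2 else (part,)  # optional segment: absent or present
        for i, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in product(*choices)]


print(expand_bracketed("*b[h]raap"))  # ['braap', 'bhraap']
```

Indexing a head under every expanded form lets a plain query such as braap or bhraap find it without the user needing to know how the source bracketed the reconstruction.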
Experimental features
The Munda Etymological Dictionary is an experimental laboratory as well as a working resource. Our concerns include:
community development
Discussing dataset content and analysis is critical
to the linguistics community.
We are seeking to discover and define the middle ground
between the overly restrictive methods of the past (passing
manuscripts from hand to hand), and the unconstrained Wiki-style
publication seen today.
query specification
Historical language change and inconsistent data quality
can make formulating useful phonemic queries extremely difficult.
Our work on IPA query builders, and on both phonemic and notational
approximation, is intended to help account for
language drift and variations in research practice.
database design
The underlying design of the Munda database is extraordinarily
simple: it contains only citations, reconstructions, comments, and
relations.
We wish to see if this experimental approach will continue to allow us
to manipulate and extend the database, while preserving the
rich web of relations that characterize comparative language data.
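To make the four-record design concrete, here is a minimal sketch of how such a store might look in Python. It is not the Munda database schema: the Record class, its field names, the dataset labels, and the select helper are all hypothetical, and the record contents are placeholders apart from the *b[h]raap head quoted earlier.

```python
from dataclasses import dataclass

# The four record types named in the text.
KINDS = {"citation", "reconstruction", "comment", "relation"}


@dataclass(frozen=True)
class Record:
    record_id: str                 # hypothetical stable name, usable for citation
    dataset: str                   # hypothetical identifier of the owning dataset
    kind: str                      # one of KINDS
    content: str                   # form, gloss, comment text, or relation label
    targets: tuple[str, ...] = ()  # record_ids a relation or comment points at

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown record kind: {self.kind}")


def select(records, excluded_datasets=frozenset()):
    """Keep only records whose dataset the user has not excluded."""
    return [r for r in records if r.dataset not in excluded_datasets]


records = [
    Record("r1", "dataset-a", "reconstruction", "*b[h]raap"),
    Record("c1", "dataset-b", "citation", "placeholder form"),
    Record("x1", "dataset-b", "relation", "placeholder relation", ("c1", "r1")),
]

print([r.record_id for r in select(records, {"dataset-b"})])  # ['r1']
```

Because every record carries its dataset identifier, including or excluding a contributed dataset is a simple filter, and relations stay ordinary records that refer to other records by name.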