Automatic Generation of Parallel Treebanks
Details
The need for syntactically annotated data for use in natural language processing has increased dramatically in recent years. This is true especially for parallel treebanks, of which very few exist. The ones that exist are mainly hand-crafted and too small for reliable use in data-oriented applications. This work is targeted at the developers and users of Machine Translation technology. It introduces a novel open-source platform for the fast and robust automatic generation of parallel treebanks through sub-tree alignment, using a limited amount of external resources. The intrinsic and extrinsic evaluations that were undertaken demonstrate that this system is a feasible alternative to the manual annotation of parallel treebanks. Therefore, the presented platform is expected to help boost research in the field of syntax-augmented machine translation and lead to advancements in other fields where parallel treebanks can be employed.
About the Author
Originally from Kyustendil, Bulgaria, Dr. Zhechev had an early interest in computing and a talent for languages. This brought him to the Computational Linguistics B.A. program at the University of Tübingen, which he completed in 2005. He now works on Machine Translation at Dublin City University, where he successfully defended his PhD in 2009.
Further Information
- General Information
- Language English
- Title Automatic Generation of Parallel Treebanks
- Publication date 14.09.2010
- ISBN 3838327950
- Format Paperback
- EAN 9783838327952
- Year 2010
- Dimensions H220mm x W150mm x D9mm
- Author Ventsislav Zhechev
- Subtitle An Efficient Unsupervised System
- Weight 238g
- Genre Linguistics and Literary Studies
- Number of pages 148
- Publisher LAP LAMBERT Academic Publishing
- GTIN 09783838327952