Resources [WIP]#
The awesome digital Tibetan list: a collection of pointers to projects that provide information and resources on all topics related to digital Tibetan.
Sites:#
Lotsawahouse: Translations of Tibetan Buddhist Texts. A vast collection of Tibetan texts with translations into English, Chinese, French, German, and other languages, spanning many different genres.
BUDA: The Buddhist Digital Archives by the Buddhist Digital Resource Center. A cooperative platform for expanding access to Buddhist literature. Explore the millions of pages of texts contributed by BDRC and its many partners.
University of Vienna: Resources for Kanjur & Tanjur Studies. E-texts and PDFs with a focus on the Kangyur and Tengyur, good search, and an accessible URL API.
GitHub Esukhia: Collection of tools (phonetics, spelling, Python), corpora, and parallel corpora.
GitHub Tibetan NLP: Meta-lists of Tibetan NLP projects, linguistic NLP datasets, and annotated corpora.
Development: Tibetan tools and software#
botok: Tibetan tokenizer in Python by Esukhia.
pybo: Tibetan tokenizer in Python by Esukhia.
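To give a feel for what tokenizing Tibetan involves, here is a minimal sketch of the simplest possible step: splitting text into syllables at the tsheg (U+0F0B) and shad (U+0F0D) marks. It uses only the Python standard library and does not call botok or pybo; the function name `split_syllables` is hypothetical and chosen for illustration. Word-level tokenizers have to go further, since Tibetan does not mark word boundaries with spaces.

```python
import re

TSHEG = "\u0f0b"  # intersyllabic mark
SHAD = "\u0f0d"   # clause/sentence-ending mark

# Split on any run of tsheg, shad, or whitespace.
_SEPARATORS = re.compile(f"[{TSHEG}{SHAD}\\s]+")


def split_syllables(text: str) -> list[str]:
    """Naively split Tibetan text into syllables (illustrative helper only)."""
    return [part for part in _SEPARATORS.split(text) if part]


if __name__ == "__main__":
    # "bkra shis bde legs" followed by a shad, written with escapes for portability.
    sample = (
        "\u0f56\u0f40\u0fb2\u0f0b\u0f64\u0f72\u0f66\u0f0b"
        "\u0f56\u0f51\u0f7a\u0f0b\u0f63\u0f7a\u0f42\u0f66\u0f0d"
    )
    for syllable in split_syllables(sample):
        print(syllable)
```

This ignores other punctuation and edge cases, so treat it only as a starting point for experiments with the corpora listed on this page.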
Repositories of open tools and data#
Corpora#
Tibetan corpora#
The 2013 UVA-SOAS eKangyur. An ongoing Esukhia-Barom proofreading project of the Kangyur. An older version of this project is found here. (The repository locations are still changing.)
The Digital Derge Tengyur. Working repository for the digital Derge Tengyur prepared by Esukhia and Barom Theksum Choling.
Diverse collection of Tibetan corpora by Esukhia.
Meta: Awesome Tibetan canon, a collection of Tibetan corpora at the Tibetan NLP project.
Parallel corpora#
Structured#
Unstructured#
Databases and Wikis#
Dictionaries#
https://github.com/Esukhia/sympound-python
Tibetisches Wörterbuch: first scientific Tibetan dictionary, Tibetan-German, WIP
Translator’s resources#
https://sites.google.com/view/tibvocab/home
NLP#
Meta: Awesome Tibetan NLP, a collection of NLP projects at the Tibetan NLP project.