Online database of Pamir languages

About pamiri.online

Pamiri.online is a project dedicated to the study of Pamir languages. We develop digital language resources for these languages and use them in our research on grammar and phonetics. Currently, we offer online dictionaries for several Pamir languages, including Shughni, Khufi, Rushani, Bartangi, Wakhi, and Sarikoli. The platform also provides an online morphological analyzer and a corpus for the Shughni language. Please note that these resources are based on sources written in Russian, so using them requires some knowledge of Russian or the help of an online translator. We also host a research seminar on Iranian languages, where professional linguists and linguistics students present their research on Iranian languages and related fields.

Citing the project

Please cite this publication when referring to the project or any tool developed by us:

Yury Makarov, Maksim Melenchenko, and Dmitry Novokshanov. (2022). Digital Resources for the Shughni Language. Proceedings of The Workshop on Resources and Technologies for Indigenous, Endangered and Lesser-Resourced Languages in Eurasia within the 13th Language Resources and Evaluation Conference, 61–64. https://aclanthology.org/2022.eurali-1.9

Acknowledgements

The online dictionary of Shughni was created as part of the projects Computational and Linguistic Resources for the Shughni Language (2020–2021) and Computational and Corpus Instruments for Iranian Studies (2021–2022), which were supported by the Faculty of Humanities, HSE University. In 2023, we continued to digitize dictionaries of Pamir languages with the support of the Linguistic Convergence Laboratory.

Thanks to Umed Kalandarov for helping to cover the costs of website hosting in 2022–2023.

We are grateful to the University of Central Asia for supporting our fieldwork in Khorugh, Tajikistan.

Team

Elena Armand

Artyom Badeev

Sofya Glavatskikh

Undergraduate student at the School of Linguistics, HSE University.

Interests: morphology, grammatical semantics, grammaticalization, sociolinguistics, typology of grammatical categories, linguistic typology.

Yury Makarov

Junior researcher at the Section of Typology at the Institute of Linguistics, RAS. Research fellow at the Vinogradov Russian Language Institute, RAS. Yury administers and develops pamiri.online. He also runs the research seminar on Iranian languages.

Yury is the managing editor of the "Indo-Iranian Languages" journal.

Areas of interest: phonetics, phonological typology, online lexicography, Iranian languages, Pamir languages, digital language resources.

Maks Melenchenko

Developer of the morphological analyzer, the orthography converter, and keyboard layouts for the Shughni alphabet. Assistant researcher at the International Linguistic Convergence Laboratory at HSE University.

Dmitry Novokshanov

MA student at HSE University, Moscow. Shughni corpus developer.

Areas of interest: grammar of Pamir languages, corpus linguistics, Iranian languages and literature.

Polina Padalka

Research assistant at the International Linguistic Convergence Laboratory. Polina glosses Shughni oral texts and uploads them to the Shughni corpus. She also studies the expression of case relations in Pamir languages.

Vladimir Plungian

Deputy Director and Head of the Department of Corpus Linguistics and Linguistic Poetics at the Vinogradov Russian Language Institute (Russian Academy of Sciences);
Head of the Section of Typology at the Institute of Linguistics (Russian Academy of Sciences);
Professor at the Department of Theoretical and Applied Linguistics, Lomonosov Moscow State University.

Chief Editor of the journal "Voprosy Jazykoznanija" ("Topics in the Study of Language").

Main research interests: general morphology, typology of grammatical categories, corpus linguistics, field linguistics; verse theory and metrics.

Ekaterina Rakhilina

Professor and Head of the School of Linguistics at HSE University, Senior Research Fellow at the Department of Culture of Russian Speech at the V. V. Vinogradov Russian Language Institute, member of the editorial board of the journal "Voprosy Jazykoznanija", member of the expert council of the Higher Attestation Commission under the Ministry of Education and Science of the Russian Federation in Philology and Art Studies.

Areas of interest: semantics, lexicology, corpus linguistics, cognitive linguistics, construction grammar, lexical typology, history of the Russian language.

Daria Ryzhova

Sofia Sedunova

Undergraduate student at the School of Linguistics, HSE University. Sofia glosses Shughni oral texts for upload to the Shughni corpus.

Areas of interest: phonetics, phonological typology, Iranian languages, Pamir languages.

Alexander Sergienko

PhD student at Université Paris Cité / Masaryk University / HSE University.

Research interests: negation, Pamir languages, formal morphology, ergativity.

Daria Chistiakova

Head of the student project on Shughni vocabulary at HSE University.

Areas of interest: morphology, syntax, linguistic typology, lexical typology, Iranian languages, Pamir languages.

Anastasia Shavrina

Intern researcher at the Linguistic Laboratory of Corpus Studies.

Projects: Crowdsourcing for Error Correction in L2 Writing.

Research interests: oriental languages, artificial intelligence, corpus linguistics, syntax.

Boris Yakubson

Student at the School of Linguistics at HSE University. Assistant researcher at the International Linguistic Convergence Laboratory at HSE University.

Areas of interest: linguistic typology, general morphology, typology of grammatical categories, deixis.

Vladimir Butolin, Elizaveta Vostokova, Faina Daniel, Valeria Grebneva, Violetta Ivanova, Stepan Mikhailov, Roman Ronko, Ivan Sarkisov, Victoria Timofeyeva and others.

Languages

Shughni

The Shughni language belongs to the Eastern Iranian group within the Indo-European language family. It is spoken in the east of Tajikistan and the northeast of Afghanistan. It is the language of the Shughni people and, according to some researchers, also the lingua franca of the Pamir peoples. Prof. D. I. Edelman estimates that there are around 100,000 speakers of Shughni. On this website, users can search the digital versions of the Shughni-Russian dictionaries by D. Karamshoev and by I. I. Zarubin, as well as explore entries created by members of our research project.

Rushani

Rushani is represented by the digitized version of V. S. Sokolova's 'Rushani and Khufi texts and dictionary.'

Khufi

Khufi is represented by the digitized version of V. S. Sokolova's 'Rushani and Khufi texts and dictionary.'

Tools available

Online dictionary

The dictionary enables users to search through a collection of dictionaries featuring various Pamir languages, currently represented by Shughni, Rushani, and Khufi (with new languages to be added soon). This compilation includes digitized dictionaries (as mentioned above) and entries created by our project.

Morphological analyzer

The morphological analyzer splits words into morphemes and glosses them, that is, assigns labels for their grammatical and lexical meaning. This allows one to perform automatic grammatical analysis of large text corpora.
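To make the idea of splitting and glossing concrete, here is a minimal sketch in Python. It is not the project's analyzer: the names `STEMS`, `SUFFIXES` and `analyze` are hypothetical, and the toy lexicon uses English forms purely for illustration. It assumes a word consists of one stem followed by at most one suffix.

```python
# Toy lexicon (English, purely illustrative; NOT the project's data).
STEMS = {"walk": "walk", "talk": "talk"}
SUFFIXES = {"s": "3SG", "ed": "PST", "ing": "PROG"}


def analyze(word):
    """Return every (segmentation, gloss) pair licensed by the toy lexicon."""
    analyses = []
    # Try every split point: the left part must be a known stem,
    # the right part must be empty or a known suffix.
    for i in range(1, len(word) + 1):
        stem, suffix = word[:i], word[i:]
        if stem not in STEMS:
            continue
        if suffix == "":
            analyses.append((stem, STEMS[stem]))
        elif suffix in SUFFIXES:
            analyses.append((f"{stem}-{suffix}", f"{STEMS[stem]}-{SUFFIXES[suffix]}"))
    return analyses


print(analyze("walked"))  # [('walk-ed', 'walk-PST')]
print(analyze("talks"))   # [('talk-s', 'talk-3SG')]
```

An analyzer for a real language would need a full lexicon, rules for sound alternations, and a way to rank competing analyses, none of which this sketch attempts.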

Corpus

A corpus is a collection of annotated texts that users can search by grammatical features, morphemes, words and their combinations, translations, etc. In addition to written texts, the corpus contains transcribed oral stories aligned with audio files. The corpus runs on the Tsakorpus platform developed by Timofey Arkhangelsky.

Orthography converter

The orthography converter converts Shughni text from the most common writing systems into the Latin orthography adopted by the project. This simplifies the study of written texts, whose orthographies vary widely.
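One common way to build such a converter is a rule table applied with longest-match-first lookup, so that multi-letter sequences are converted as a unit. The sketch below illustrates that approach only. The `RULES` table and the `convert` function are hypothetical, the mappings are placeholders rather than the project's actual correspondences, and the sample strings are arbitrary letter sequences, not Shughni words.

```python
# Hypothetical Cyrillic-to-Latin rules; NOT the project's actual table.
RULES = {"дз": "ʒ", "ш": "š", "ч": "č", "а": "a", "б": "b", "д": "d", "з": "z"}

# Try longer source sequences first so digraphs win over single letters.
KEYS = sorted(RULES, key=len, reverse=True)


def convert(text):
    """Convert text rule by rule, leaving unknown characters unchanged."""
    out = []
    i = 0
    while i < len(text):
        for key in KEYS:
            if text.startswith(key, i):
                out.append(RULES[key])
                i += len(key)
                break
        else:
            # No rule matched at this position: copy the character as-is.
            out.append(text[i])
            i += 1
    return "".join(out)


print(convert("дза"))   # ʒa
print(convert("ша б"))  # ša b
```

This sketch handles lowercase input only. A converter meant for real texts would also need to deal with capitalization and with several competing source orthographies.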

Other

Russian-English-Shughni phrasebook

In April 2021, the first version of the Russian-English-Shughni phrasebook by A. A. Sergienko was published. The PDF version is available on this page.