Indo-Aryan languages

South Asia
Linguistic classification: Indo-European
ISO 639-2 / 5: inc
Linguasphere: 59= (phylozone)
1978 map showing geographical distribution of the major Indo-Aryan languages. (Urdu is included under Hindi. Romani, Domari, and Lomavren are outside the scope of the map.) Dotted/striped areas indicate where multilingualism is common.

The Indo-Aryan or Indic languages are the dominant language family of the Indian subcontinent. They constitute a branch of the Indo-Iranian languages, itself a branch of the Indo-European language family. In the early 21st century, Indo-Aryan languages were spoken by more than 800 million people, primarily in India, Bangladesh, Nepal, Pakistan, and Sri Lanka.[2] There are 219 Indo-Aryan languages.[3]

The largest by number of speakers are Hindustani (Hindi-Urdu, about 329 million),[4] Bengali (242 million),[5] and Punjabi (about 100 million).[6] A 2005 estimate placed the total number of native speakers of Indo-Aryan languages at nearly 900 million.[7]



Proto-Indo-Aryan, sometimes called Proto-Indic, is the reconstructed proto-language of the Indo-Aryan languages. It is intended to reconstruct the language of the pre-Vedic Indo-Aryans. Proto-Indo-Aryan is meant to be the predecessor of Old Indo-Aryan (1500–300 BCE), which is directly attested as Vedic and Mitanni-Aryan. Although Vedic is highly archaic, the other Indo-Aryan languages preserve a small number of archaic features that Vedic has lost.

Indian subcontinent

Old Indo-Aryan

The earliest evidence of the group comes from Mitanni Indo-Aryan, which is attested only in a few proper names and specialized loanwords.[8]

Rigvedic Indo-Aryan is the language of the religious hymns preserved in the Rigveda, the earliest Vedic literature.

From the Rigvedic language, "Sanskrit" (literally "put together", meaning perfected or elaborated) developed as the prestige language of culture, science and religion, as well as the court, theatre, etc. Sanskrit is, by convention, referred to by modern scholars as 'Classical Sanskrit' in contradistinction to the so-called 'Rigvedic Sanskrit', which is largely intelligible to Sanskrit speakers.[citation needed]

Middle Indo-Aryan (Prakrits)

Mitanni inscriptions show some Middle Indo-Aryan characteristics alongside Old Indo-Aryan ones; for example, Old Indo-Aryan sapta appears as satta, with the cluster 'pt' assimilated to Middle Indo-Aryan 'tt'. According to S.S. Misra, this language may be similar to Buddhist Hybrid Sanskrit, which might in fact not be a mixed language but an early form of Middle Indo-Aryan occurring well before the Prakrits.[9][10]

Outside the learned sphere of Sanskrit, vernacular dialects (Prakrits) continued to evolve. The oldest attested Prakrits are the Buddhist and Jain canonical languages Pali and Ardhamagadhi Prakrit, respectively. By medieval times, the Prakrits had diversified into various Middle Indo-Aryan languages. Apabhraṃśa is the conventional cover term for transitional dialects connecting late Middle Indo-Aryan with early Modern Indo-Aryan, spanning roughly the 6th to 13th centuries. Some of these dialects showed considerable literary production; the Śravakacāra of Devasena (dated to the 930s) is now considered to be the first Hindi book.

The next major milestone occurred with the Muslim conquests in the Indian subcontinent in the 13th–16th centuries. Under the flourishing Turco-Mongol Mughal Empire, Persian became very influential as the prestige language of the Islamic courts, owing to its adoption by the Mughal emperors. However, Persian was soon displaced by Hindustani. This Indo-Aryan language combines Persian, Arabic, and Turkic elements in its vocabulary with the grammar of the local dialects.

The two largest languages that formed from Apabhraṃśa were Bengali and Hindustani; others include Sindhi, Gujarati, Odia, Marathi, and Punjabi.

New Indo-Aryan

Dialect continuum

The Indo-Aryan languages of North India and Pakistan form a dialect continuum. What is called "Hindi" in India is frequently Standard Hindi, the Sanskritized version of the colloquial Hindustani spoken in the Delhi area since the Mughals. However, the term Hindi is also used for most of the central Indic dialects from Bihar to Rajasthan. The spoken New Indo-Aryan dialects from Assam in the east to the borders of Afghanistan in the west form a linguistic continuum across the plains of North India, Pakistan and Bangladesh.

Medieval Hindustani

In the Hindi-speaking areas of the Central Zone, the prestige dialect was for a long time Braj Bhasha, but this was replaced in the 19th century by the Khariboli-based Hindustani. Hindustani was strongly influenced by Sanskrit and Persian, and these influences led to the emergence of Modern Standard Hindi and Modern Standard Urdu as registers of the Hindustani language.[11][12] This state of affairs continued until the division of the British Indian Empire in 1947, when Hindi became the official language in India and Urdu became official in Pakistan. Despite the different scripts, the fundamental grammar remains identical; the difference is more sociolinguistic than purely linguistic.[13][14][15] Today Hindustani is widely understood or spoken as a second or third language throughout South Asia[16] and is one of the most widely known languages in the world by number of speakers.


Some theonyms, proper names and other terminology of the Mitanni exhibit an Indo-Aryan superstrate, suggesting that an Indo-Aryan elite imposed itself over the Hurrians in the course of the Indo-Aryan expansion. In a treaty between the Hittites and the Mitanni, the deities Mitra, Varuna, Indra, and the Ashvins (Nasatya) are invoked. Kikkuli's horse training text includes technical terms such as aika (eka, one), tera (tri, three), panza (pancha, five), satta (sapta, seven), na (nava, nine), vartana (vartana, turn, round in the horse race). The numeral aika "one" is of particular importance because it places the superstrate in the vicinity of Indo-Aryan proper as opposed to Indo-Iranian or early Iranian (which has "aiva") in general.[17]

Another text has babru (babhru, brown), parita (palita, grey), and pinkara (pingala, red). Their chief festival was the celebration of the solstice (vishuva) which was common in most cultures in the ancient world. The Mitanni warriors were called marya, the term for warrior in Sanskrit as well; note mišta-nnu (= miẓḍha, ≈ Sanskrit mīḍha) "payment (for catching a fugitive)" (M. Mayrhofer, Etymologisches Wörterbuch des Altindoarischen, Heidelberg, 1986–2000; Vol. II:358).

Sanskritic interpretations of Mitanni royal names render Artashumara (artaššumara) as Ṛtasmara "who thinks of Ṛta" (Mayrhofer II 780), Biridashva (biridašṷa, biriiašṷa) as Prītāśva "whose horse is dear" (Mayrhofer II 182), Priyamazda (priiamazda) as Priyamedha "whose wisdom is dear" (Mayrhofer II 189, II 378), Citrarata as Citraratha "whose chariot is shining" (Mayrhofer I 553), Indaruda/Endaruta as Indrota "helped by Indra" (Mayrhofer I 134), Shativaza (šattiṷaza) as Sātivāja "winning the race prize" (Mayrhofer II 540, 696), Šubandhu as Subandhu "having good relatives" (a name in Palestine, Mayrhofer II 209, 735), Tushratta (tṷišeratta, tušratta, etc.) as *tṷaiašaratha, Vedic Tvastar "whose chariot is vehement" (Mayrhofer, Etym. Wb., I 686, I 736).

Romani, Lomavren, and Domari languages


Domari is an Indo-Aryan language spoken by older Dom people scattered across the Middle East and North Africa. The language is reported to be spoken as far north as Azerbaijan and as far south as central Sudan, in Turkey, Iran, Afghanistan, Pakistan, India, Iraq, Palestine, Israel, Jordan, Egypt, Sudan, Libya, Tunisia, Algeria, Morocco, Syria and Lebanon.[18] The systematicity of the sound changes shows with a fair degree of certainty that the names Domari and Romani derive from the Indo-Aryan word ḍom.[19]


Lomavren is a nearly extinct mixed language, spoken by the Lom people, that arose from language contact between a language related to Romani and Domari[20] and the Armenian language.


The Romani language is usually included in the Western Indo-Aryan languages.[21] Romani — spoken mainly in various parts of Europe — is conservative in maintaining almost intact the Middle Indo-Aryan present-tense person concord markers, and in maintaining consonantal endings for nominal case – both features that have been eroded in most other modern languages of Central India. It shares an innovative pattern of past-tense person concord with the languages of the Northwest, such as Kashmiri and Shina. This is believed to be further proof that Romani originated in the Central region, then migrated to the Northwest.

There are no known historical documents about the early phases of the Romani language.

Linguistic evaluation carried out in the nineteenth century by Pott (1845) and Miklosich (1882–1888) showed that the Romani language is to be classed as a New Indo-Aryan language (NIA), not Middle Indo-Aryan (MIA), establishing that the ancestors of the Romani could not have left India significantly earlier than AD 1000.

The principal argument favouring a migration during or after the transition period to NIA is the loss of the old system of nominal case and its reduction to a two-way case system, nominative vs. oblique. A secondary argument concerns the system of gender differentiation. Romani has only two genders (masculine and feminine). MIA languages generally had three genders (masculine, feminine and neuter), and some modern Indo-Aryan languages retain this old system even today.

It is argued that the loss of the neuter gender did not occur until the transition to NIA. Most neuter nouns became masculine, while a few became feminine: for example, the neuter अग्नि (agni) of Prakrit became the feminine आग (āg) in Hindi and jag in Romani. The parallels in grammatical gender evolution between Romani and other NIA languages have been cited as evidence that the forerunner of Romani remained on the Indian subcontinent until a later period, perhaps even as late as the tenth century.
