Sora (pronounced [ˈsoːra] or [soʔoːˈra]) is a South Munda language belonging to the Austroasiatic family, spoken by the Sora people, an ethnic group of eastern India, mainly in the states of Odisha and Andhra Pradesh. Sora has very little formal literature but an abundance of folk tales and traditions, and most knowledge is passed from generation to generation orally. Like many languages in eastern India, Sora is listed as ‘vulnerable to extinction’ by UNESCO.[2] The language is also classed as endangered by the International Mother Language Institute (IMLI).[3]
Distribution
Speakers are concentrated mainly in Ganjam District, Gajapati District (including the central Gumma Hills region, Gumma Block),[4] and Rayagada District, and are also found in adjacent areas such as Koraput and Phulbani districts; other communities exist in northern Andhra Pradesh (Vizianagaram District, Parvatipuram Manyam District and Srikakulam District).
History and sociolinguistics
History
The Sora call themselves Soranji (human-NSFX-PL, cf. Santali hɔɽ “man”, Mundari hoɽo “human”, Korku koro “man”, proto-Munda **kOrOˀ “person, human”) and refer to their language as soralaŋən, “Sora tongue.” Like the languages of the vast majority of tribal and low-caste communities in India, Sora was transmitted through oral tradition and remained unwritten until the colonial period. A small number of Sora lexical items were recorded by Colonel Dalton (1872) and Pendercast (1881). The British inspector of Madras, Fred Fawcett (1853–1926), later included a short Sora wordlist in his 1887 ethnographic description of the Sora tribe. The first systematic descriptive grammar of Sora appeared in 1931, authored by the Telugu scholar G. V. Ramamurti. The 1960s saw a boom in Austroasiatic linguistics, and Sora attracted the attention of several American Austroasiaticists and typologists, such as Norman and Arlene Zide, David Stampe, Patricia Donegan, and Stanley Starosta.[5]
By the 1990s and early 2000s, linguistic investigation had slowed considerably because of security concerns in tribal regions affected by the Naxalite insurgency, which limited sustained fieldwork. More recently, Sora was featured in the Ironbound Films documentary The Linguists (2008), shown on PBS, which followed two linguists, G. D. S. Anderson and K. David Harrison, on a field trip to southern Odisha, where they visited a Sora village and documented the language.[6]
Ethnographer Piers Vitebsky (1993) states that the Sora once had a wider distribution than at present, before their territory was taken over by later Indo-Aryan and Dravidian expansions; the evidence is that village names of Sora origin extend along the coastal plains from the Mahanadi river to northern Andhra, far beyond the current Sora communities.[7] Relatedly, Lord Jagannath of the Puri Temple in Odisha is claimed to have originated with the Sora tribe, and the term aɳasara, which denotes the two-week period when Jagannath suffers from fever after the ritual bath on the full-moon day of the Hindu month Jyēṣṭha (mid-May to mid-June), might be related to the Sora root -asar- (“to wipe, to dry”).[8] Pliny the Elder (77 CE) and Claudius Ptolemy (c. 130 CE) also identified Kalinga as the land of the Sora (Suari, Sabaræ).[7]
According to Bihari tradition, around 500 CE a group of Sora/Savara tribals invaded the region and deposed the Cheros, a Hinduized Munda tribe,[9] and presumably replaced them as local rulers.[10] Today, the Sabars of West Bengal claim descent from the ancient Savaras. Partly because data are scant, the classification remains disputed: Gordon (2005) proposes that Sabar/Lodha is related to Sora, whereas Anderson (2008) suggests that Sabar/Lodha may be an Indo-Aryan language.[11] An example sentence in West Bengal Sabar is given in the Sample texts section.
The number of Sora speakers has followed a wavelike pattern: it climbed steadily for decades before dropping sharply. The number of speakers went from 157 thousand in 1901 to 166 thousand in 1911.[12] In 1921 it rose marginally to 168 thousand, and it kept climbing.[12] In 1931 speaker numbers jumped to 194 thousand, and by 1951 they had grown rapidly to 256 thousand.[12] In 1961 numbers peaked at 265 thousand speakers before falling back to 221 thousand in 1971.[12]
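The size of each rise and fall can be made concrete with a short calculation. The sketch below is only illustrative: it takes the census figures quoted in this section (in thousands) and computes the percentage change between consecutive counts. The function name `percent_changes` is an assumption introduced for this example, not something taken from the cited census source.

```python
# Census figures for Sora speakers, in thousands, as quoted in this section.
SPEAKERS = {1901: 157, 1911: 166, 1921: 168, 1931: 194,
            1951: 256, 1961: 265, 1971: 221}


def percent_changes(series):
    """Return (start_year, end_year, percent change) for consecutive counts."""
    years = sorted(series)
    return [
        (a, b, round(100 * (series[b] - series[a]) / series[a], 1))
        for a, b in zip(years, years[1:])
    ]


if __name__ == "__main__":
    for start, end, change in percent_changes(SPEAKERS):
        print(f"{start}-{end}: {change:+.1f}%")
```

On these figures the largest increase falls in the 1931 to 1951 interval (which spans twenty years rather than ten), and the only decrease is the one recorded between 1961 and 1971.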
Dialects
Ota & Patel (2021) identify several Sora dialects, including Lanjia-Sora, Imani, Kansid, Kampu, Tenkala, Sarda, and Juray-Sora. Zide (1982) characterizes Juray as a divergent variety of Sora,[13] while later studies (e.g. Gomango & Anderson 2017) emphasize both sociolinguistic and structural differences between Juray and Sora. A recent study by Ota & Patel (2021) has focused on the geographic distribution and interrelations of Sora dialects, as well as on the status of Juray as a separate language.[14]
Culture
Sora is spoken by the Sora people, who are among the Adivasi, or tribal peoples, of India, making Sora an Adivasi language.[15] Sora is spoken in close proximity to Odia- and Telugu-speaking peoples, so many Sora people are bilingual.[15] Sora has little literature apart from a few songs and folk tales, which are usually transmitted orally.[15]
Phonology
Our understanding of Sora phonology is limited, but some generalizations can be made. Most syllables have the shape consonant-vowel-consonant (CVC), and morphemes usually contain one to three syllables.[16] There are 18 identifiable consonants, which are spread across most places of articulation. Five consonants are articulated at the palate while only one is glottal. Although vowels may be pronounced differently, only six vowels are distinguished in Sora. There are no diacritics, and aspiration varies depending on the speaker. It is likely that the influence of English, Odia, and Telugu has also affected vowel pronunciation over time.[17] Pronunciation also differs between prevocalic (occurring before a vowel) and non-prevocalic environments.
Consonants
| | | Bilabial | Dental/Alveolar | Retroflex | Palatal | Velar | Glottal |
|---|---|---|---|---|---|---|---|
| Stop | voiceless | p | t | | | k | ʔ |
| | voiced | b | d | | dʒ | ɡ | |
| Fricative | voiceless | | s | | | | |
| | voiced | | z | | | | |
| Nasal | | m | n | | ɲ | ŋ | |
| Liquid | | | l r | ɽ | | | |
| Approximant | | (w) | | | j | | |
- In Sora, the plosives /t/ and /d/ are asymmetric: /t/ is a voiceless dental [t̪], whereas voiced alveolar /d/ sometimes varies between voiced alveolar [d] and dental [d̪]. Their realization appears to be conditioned by neighboring segments. Accordingly, dental /t/ may surface as alveolar [t] in an alveolar articulatory environment.
- An earlier account by Ramamurti describes the j-consonant as a voiced palatal fricative /ʝ/ with the allophone [z]. Arlene R. Zide (1982) reports that /ɟ/ is often realized as the palatal affricate [ɟ͡ʝ] or the voiced alveolar fricative [z]. Horo & Sarmah (2015) analyze it as a voiced postalveolar affricate /d͡ʒ/ in free variation with [z], particularly in Odisha dialects.
- The trill /r/ and lateral /l/ fluctuate between dental and alveolar articulation, depending on whether adjacent consonants are dental (e.g., /t/) or alveolar (e.g., /d/). Both /r/ and /l/ occur freely in initial, medial, and final positions, but the retroflex tap /ɽ/ is restricted to word-medial position.
- The alveolar nasal /n/ may assimilate to dental place in the vicinity of a dental consonant such as [t̪].
Vowels
Except for schwa, all Sora vowels have long counterparts. They may be stressed or unstressed. According to Ramamurti (1986), vowels can be short, half-long, or long. Vowel length may mark expressive formations for certain stems, e.g. sura (‘big’) and suːra (‘really big’), but this requires further study.
| | Front | Central | Back |
|---|---|---|---|
| Close | i | ɨ | u |
| Near-close | | | ʊ |
| Mid | e | ə | o |
| Open-mid | ɛ | | ɔ |
| Open | | a | |
Horo & Sarmah (2015) reported /a, e, i, u, o, ə/ for the Sora Assam dialect’s vowel inventory.[19]
Morphophonology
Sora consonants and vowels undergo sound alternations at the prosodic level, conditioned by stress shifts and by morphosyntax. In these alternations, consonants and vowels assimilate to the sounds of a preceding or following stem, and a word-final nasal assimilates to the initial obstruent of the following word. As a result, some suffixes fuse phonotactically with their verb, and in casual or rapid speech a word can have several allomorphs, depending on the morpheme-initial segments and codas involved.[20]
Morphology
Overview
Sora is polysynthetic and noun-incorporating.[21][22] A single Sora word can convey the meaning of a whole sentence. However, while researchers consider Sora sentence-words to be single individual words, native Sora speakers perceive them as phrases and break them into sequences of iambic words with a rising contour.
For example:
ǝd-
NEG–
mǝl-
DES–
tij
give
-dar
-rice
-iŋ
–1.UND
-da
–AUX:TAM
-e
–3.ACT
‘(he) does not want to give me rice’
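The way a single sentence-word is built from glossed morphemes can be sketched in a few lines of code. This is only an illustration under stated assumptions: the morphemes and glosses are copied from the example above, the helper names `align_gloss` and `join_morphemes` are hypothetical, and plain concatenation ignores the morphophonological alternations that apply in real speech.

```python
# Morphemes and glosses copied from the example above.
MORPHEMES = ["ǝd-", "mǝl-", "tij", "-dar", "-iŋ", "-da", "-e"]
GLOSSES = ["NEG", "DES", "give", "rice", "1.UND", "AUX:TAM", "3.ACT"]


def align_gloss(morphemes, glosses):
    """Return two text lines with each morpheme padded above its gloss."""
    if len(morphemes) != len(glosses):
        raise ValueError("each morpheme needs exactly one gloss")
    widths = [max(len(m), len(g)) for m, g in zip(morphemes, glosses)]
    top = "  ".join(m.ljust(w) for m, w in zip(morphemes, widths)).rstrip()
    bottom = "  ".join(g.ljust(w) for g, w in zip(glosses, widths)).rstrip()
    return top, bottom


def join_morphemes(morphemes):
    """Concatenate morphemes into one string, dropping boundary hyphens.

    Simplification: no sound alternations are applied at the boundaries.
    """
    return "".join(m.strip("-") for m in morphemes)


if __name__ == "__main__":
    for line in align_gloss(MORPHEMES, GLOSSES):
        print(line)
    print(join_morphemes(MORPHEMES))
```

The two aligned lines make it easy to see that seven morphemes, including the incorporated noun glossed ‘rice’, sit inside one verbal word.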
The grammatically correct form in Sora, however, requires an overt subject:
anin
he
ǝd-
NEG–
mǝl-
DES–
tij
give
-dar
-rice
-iŋ
–1.UND
-da
–AUX:TAM
-e
–3.ACT
‘he does not want to give me rice’
A full sentence in Sora:
Ňen
1SG
ǝd-
NEG–
mǝl-
DES–
jom
eat
-jɛl
-meat
-yɔ
-fish
-aj
-all
-t
–NPST
-en
–INTR
-ay
–1.SS
‘I don’t want to eat all the fish.’
Sora uses grammatical devices including subject and object agreement, word order, and noun compounding to show case. It is regarded as a predominantly nominative-accusative language, and it differs from most other languages in lacking a passive construction.[23] The absence of a passive does not mean that grammatical case is missing; rather, Sora has a fairly complex set of cases (see Nominal morphology).[23]
In addition, Sora, like many other Munda languages, uses relator nouns to link nouns with the other parts of the sentence and so give a more specific meaning, a process called compounding.[17] These monosyllabic, meaning-enhancing nouns are called semantic relator nouns and are used widely in Sora.[23]
Nominal morphology
Number
The plural of nouns and verbs is marked by the enclitic/suffix -ji, which is placed immediately after the noun suffix -ən. Animate nouns generally, though not obligatorily, take the plural suffix, whereas inanimate nouns often do not.[24]
ətɛŋ
many
kəndʊd-ən-ji
frog-NSFX–PL
‘many frogs’
si-leŋ
hand-1PL
‘our hand(s)’
The plural suffix is not attached after countable numerals and finite numbers, although these may still trigger plural agreement on the verb. Numerals can form compounds with nouns; in those cases they are nominalized and carry the plural marking themselves.[25]
bagu-mər-an-ji
two-person-NSFX–PL
‘two people’
Pronouns
| | singular | plural |
|---|---|---|
| 1st person | nen | anlen |
| 2nd person | amən | ambeŋ |
| 3rd person | anɪn | anɪnji |
Third person pronouns can be used as definite markers in noun phrases.[26] A reflexive pronoun can be formed with the reflexive enclitic =dəm. For example, anɪnji + =dəm → anɪnji=dəm ‘themselves’.
Demonstratives
The following demonstratives in Sora are listed by Starosta (1967).[27]
| | Proximal | Distal |
|---|---|---|
| ‘this/that’ | kəni/kun | -ənt/kun |
| ‘like this/that’ | ɛʔne | ɛʔte |
| ‘like this way/that way’ | ɛʔnegɔj | ɛʔtegɔj |
| ‘this/that little’ | dəkiyne | dəkiyte |
| ‘this/that big’ | dəkəʔne | dəkəʔte |
| ‘here/there’ | teʔne | teʔte |
| ‘around here’ | arɛʔne | – |
| ‘at that time’ | – | səlɛʔte |
Cases
Case marking in Sora is not clear-cut. There is no role marker that shows the syntactic alignment of subjects and objects, whether nominative-accusative or ergative-absolutive. A number of grammatical relations may or may not be expressed morphologically on an animate primary object argument of the verb; for example, the oblique-dative marker –dɔ[ʔɔ]ŋ– can also appear as a standalone morpheme adɔŋ.[28]
- Absolutive?–Nominal -n
- Dative–Oblique -dɔ[ʔɔ]ŋ-
- Adessive maŋ-
- Possessive a- and -a
- Allative -ban
- Locative–Inessive–Illative -leŋ-ən and -leŋ
anɪnji-a-siː
3PL–POSS.3-hand
‘their hand(s)’
Gender & Class
Grammatical gender or class is not deeply encoded in Sora morphosyntax. To signal that something is masculine or feminine, Sora speakers use the indigenous compound endings =mar (male, person) and =boj (female) and also, albeit rarely, gender suffixes borrowed from Indo-Aryan such as -a and -i.[29]
Adjectives
Whether adjectives constitute an independent lexical category in Sora is still understudied. Many adjectival elements are found in compounds with the nominal elements that they modify, e.g. sənna-dud-ən “small-frog-NSFX”. Other adjectives precede the nominal constituents of the phrase. They may also be used in predicative sentences, with the copula (past) or without it (nonpast).[30] It appears that many adjectival lexemes in Sora cannot function as external modifiers in certain constructions, possibly because speakers perceive them as indistinguishable from nouns (e.g. “green” meaning “green one”). As a result, such uses may sound awkward to native speakers and are often rejected as ungrammatical.[31]
Tester: Gregory Anderson. Participant: Oranchu Gomango, 50, native Sora speaker.
ɲen
I
suɽa
big
ɲam-jaʔt-tɪ-n-ay
catch-snake-NPST–INTR/MID–1
‘I am big snake catching.’ (Accepted) ✓
ɲen
I
suɽa
big
ɲam-jaʔt-mar
catch-snake-man
‘I am a big snake catcher.’ ✓
iɲen
I
kuluʔ
green
jaʔr-an
snake-NSFX
nɛm-t-ay
catch-NPST–1
‘I am catching the green snake.’ ✓
ɲen
I
kuluʔ
green
ɲam-jaʔt-tɪ-n-ay
catch-snake-NPST–INTR/MID–1
‘I am green snake catching.’ (Rejected) ✘
Another means of attributive modification in Sora is the use of morphologically derived modifiers. Unlike the lexical juxtaposition found in North Munda and Kharia, Sora attaches a dependent prefix ə- to the nominal, which links the modifier with the modified head noun.[32] Consider the following example:
lo-mo-mo-te-n
rapid-swallow-RDPL–NPST.PTCP–NMLZ.NSFX
ə-manra
DEP-man
‘the ravenous man’
Adpositions
Adverbs
Adverbs are uninflected.[33] Some Sora adverbs are: tiki ‘after’, tikki ‘afterwards’, mailen ‘together’, dɔ ‘so’, əntəpsɛlɛ ‘therefore’, biɲdɔ ‘but’, bɔibɔi ‘very’, annəŋ ‘during’, nam ‘now’, aŋaːn aŋaːn ‘sometimes’, moyed ‘previous’, moyed moyəd ‘recently’, etc.
Derivation
In Sora, nouns frequently alternate between free forms and combining forms. The combining form (CF) is a compact, monosyllabic, phonologically dependent nominal root that appears mostly in compounding and noun incorporation. Since there is no attested example in which a clitic interrupts the augmented disyllabic noun, the CF is not a phonological word.[34] The free form (or full form) is a prosodically unified phonological word: an independent lexical item derived from the CF through morphological operations such as prefixation, reduplication, suffixation, or root augmentation.[17]
The alternation is governed by a bimoraic constraint on independent lexical words. Free nouns must satisfy a minimal prosodic requirement of two morae, which leads to phonological expansion of otherwise monosyllabic roots. CFs need not meet this constraint because they occur within larger prosodic domains, particularly inside compounds, adjectival constructions, and incorporated constructions. The two shapes are therefore not distinct lexical entries but prosodically conditioned realizations of the same nominal base. The CF functions as the semantic nominal base in verb-noun complexes, compounds, and adjectival constructions, whereas the free form appears in syntactically independent nominal positions.[35]
The CF allows the noun to be attached to a verb root to create a more semantically complex word, similar to compounding in other languages.[34] The full form is used when the noun stands alone rather than in combination with other parts of speech.[34] Some templates for combining nouns and verbs in Sora are as follows:[34]
Verb + Combining Form
Verb + Combining Form + Combining Form
Full Form + Combining Form
Full Form + Combining Form + Combining Form
For example, mənra, the full form of “man”, is shortened to the CF -mər-. The combining form cannot occur in isolation: it is subminimal and prosodically dependent, surfacing only within compounds or incorporated structures where the larger host provides bimoraic licensing.[34] Although the evidence is by no means conclusive, one general guideline is that the shape of the CF depends on where it combines with the verb or other element.[34] If the CF is attached to an infix, its resulting form differs from the form it takes when combined as a prefix. The material that distinguishes full forms from CFs is difficult to map to meanings schematically: sometimes a specific affix occurs with a specific, quasi-definable semantic group of words, such as animal names, and that affix may be a remnant of a noun classifier. Many other free forms, however, do not follow such patterns. Some examples of full-form nouns and their CFs are as follows:[36]
(1) a prefix ə-
| Full form | Combining form | Meaning |
|---|---|---|
| əsu | -su- | “illness” |
| əbaj | -baj- | “seed” |
| ədaŋ | -daŋ- | “bee hive” |
| əsoŋ | -soŋ- | “dung” |
(2) prefix kVN-
| Full form | Combining form | Meaning |
|---|---|---|
| kɨnsod | -sod- | “dog” |
| kəmbud | -bud- | “bear” |
| kəndud | -dud- | “frog” |
| kənjeŋ | -jeŋ- | “hedgehog” |
| kəntuj | -tuj- | “owl” |
| kɨmmed | -med- | “goat” |
| kimbuŋ | -buŋ- | “belly” |
(3) Switched
| Full form | Combining form | Meaning |
|---|---|---|
| gundij | -gun- | “squirrel” |
| kumbul | -kum- | “rat” |
(4) a prefix Vn-/V-
| Full form | Combining form | Meaning |
|---|---|---|
| enjum | -jum- | “axe” |
| on[d]ri | -ri- | “pestle” |
| uab | -ab- | “vegetable” |
| umud | -mud- | “smoke” |
| ontid | -tid- | “bird” |
| arel | -rel- | “ice” |
(5). Noun-noun compounds
| Full form | Combining form | Meaning |
|---|---|---|
| bomaŋ | -maŋ- | “chameleon” |
| gorzaŋ | -zaŋ- | “village” |
(6). Reduplication
| Full form | Combining form | Meaning |
|---|---|---|
| saŋsaŋ | -saŋ- | “turmeric” |
| pupu | -pu- | “cake” |
| tujtuj | -tuj- | “star” |
| tittin | -tin- | “tamarind” |
(7). an infix -ʔ-
| Full form | Combining form | Meaning |
|---|---|---|
| daʔa | -da- | “water” |
| raʔa | -ra- | “elephant” |
| suʔuŋ | -suŋ- | “house” |
| oʔon | -on- | “child” |
| siʔi | -si- | “hand” |
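The pattern in group (7) is regular enough to state as a string rule: in each listed pair, the full form contains a vowel, a glottal stop, and a copy of the same vowel, and the combining form keeps only one vowel. The sketch below checks that rule against the five pairs in the table. It is an illustration of this table only, not a claim about Sora morphology in general, and the function name `strip_glottal_infix` is an assumption made for the example.

```python
import re

# Full form -> combining form pairs copied from table (7) above.
PAIRS = {
    "daʔa": "da",    # water
    "raʔa": "ra",    # elephant
    "suʔuŋ": "suŋ",  # house
    "oʔon": "on",    # child
    "siʔi": "si",    # hand
}


def strip_glottal_infix(full_form):
    """Collapse the first 'X ʔ X' sequence (same segment twice) into 'X'."""
    return re.sub(r"(.)ʔ\1", r"\1", full_form, count=1)


if __name__ == "__main__":
    for full, combining in PAIRS.items():
        derived = strip_glottal_infix(full)
        status = "ok" if derived == combining else "mismatch"
        print(f"{full} -> {derived} ({status})")
```

A form without the vowel-ʔ-vowel sequence is returned unchanged, so the rule cannot be applied blindly to the other groups, whose full forms are built in different ways.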
(8). an infix -n-
| Full form | Combining form | Meaning |
|---|---|---|
| kənuŋ | -kuŋ- | “razor” |
| sənaŋ | -saŋ- | “door” |
| pənad | -pad- | “latch” |
| ənʔeb | -neb- | “tree” |
| ərəneŋ | -reŋ- | “wind” |
(9). Miscellaneous
| Full form | Combining form | Meaning |
|---|---|---|
| kədib | -kib- | “sword” |
| arsi | -ar- | “monkey” |
| pander | -pan- | “hare” |
| kina | -kid- | “tiger” |
| bisiŋ | -biŋ- | “district chief” |
| boda | -bol- | “fox” |
| ənselo | -boj- | “woman” |
| sora | -sor- | “Sora person” |
| bandradʒ | -ban- | “flour” |
| darej | -dar- | “cooked rice” |
| kappara | -kab- | “duck” |
| ali | -sal- | “liquor” |
| rogo | -san- | “red gram” |
| bati | -pud- | “mushroom” |
| toʔod | -tam- | “mouth” |
Verbal morphology
Sora verbal morphology makes use of prefixes, infixes, and suffixes, depending on the grammatical category. As is typical of Munda synthetic structure, the Sora clause is head-final, with subject-object-verb (SOV) order. However, Sora has developed an elaborate and productive noun incorporation system, which appears to have originated in an early offshoot of proto-Munda. Noun incorporation clearly distinguishes the free forms and combining forms (CFs) of lexical nouns. Within polysynthetic morphosyntax, Sora verb phrases display a strict head-first SVO order like that typically seen in non-Munda Austroasiatic languages. The most intriguing aspect of Sora noun incorporation is transitive subject incorporation: a transitive verb can incorporate (absorb) its transitive subject/agent while the verb stem remains transitive and object indexing stays active after incorporation.[37]
sa:-bud-t-am
mangle-bear-NPST–2.OBJ
‘Bear will mangle you’
Verb serialization and clause chaining can be expressed by forming a compound of the shape verb1-verb2-nonpast. The same pattern also works with pairs of incorporated nouns.
mal-jum-pu-daː-tam-t-əm
wish-eat-cake-AUX-mouth-NPST–2
poʔŋ?
Q
‘Do you wish to eat cake?’
Person indexation
| Argument | Subject | Object |
|---|---|---|
| 1SG | Σ-ay | Σ-iɲ |
| 1DU | Σ-aj | |
| 1PL | a/aʔ-Σ | Σ-lɛn |
| 1PL.INCL | Σ-be/biy | |
| 1PL.EXCL | ə-Σ-aj | |
| 2SG | Σ-Ø/e(y)/am | Σ-əm |
| 1>2 | Σ-am | |
| 2PL | ə-Σ-ɛ | Σ-bɛn |
| 3SG | Σ-e(y) | Σ-e |
| 3PL | Σ-ji | Σ-ji |
Object indexing is not obligatory in Sora. In some cases, the object may be omitted or suppressed.[38]
iɛr:-ai-ɛn-a
go/come-1/3SG/CLOC–NMLZ–GEN
tiki
after
aninji
they
gudeŋ-le
call-PST
‘After he came, he called them.’
The possessor of an incorporated direct object is marked by pronominal object markers; Sora incorporation is therefore not an entirely valency-reducing process, as it is in many languages.
lem-jeŋ-te-bɛn-ji
bow-foot-NPST–2–3
‘They bow to your legs.’
Orientation/Directionality
In Sora, there are two types of deictic marking on the verb: itive (translocative, motion away from the argument/speaker) suffixes -a/-ə/-e/-Ø and ventive (cislocative, motion towards the argument/speaker) suffixes -ai/-ay.[39] The cislocative markers sometimes also serve as first-person subject indexes.[40]
ʔamin
You
bazar-ən
market-NSFX
yer-e
go-2.NPST.TLOC
‘You go to the market.’
əntannəŋ
then
kuni
DEF
anin
he
suɽa-dʊd-ən
big-frog-NSFX
iɛr-ai-ted
come-CLOC–PST.3
‘Then that one, him, the big-frog, came back.’
ɲem
I
ɲam-kit-te-n-ai
catch-tiger-NPST–INTR/MID–1.CLOC
‘I will seize the tiger.’
Syntax
In Sora, the basic clausal constituent order is SOV.
anlen
We
aman
you
daʔa-n
water-NSFX
aʔ-tiy-t-am
1PL.SUBJ-give-NPST–1>2SG.OBJ
‘We give you water.’
Vocabulary
Compared with other Munda languages such as Kharia, whose vocabulary is reported to be 40 percent borrowed from Indo-Aryan, Sora has very few foreign loanwords, and it has no foreign phonemes.[41] Sora does borrow some words from surrounding languages such as Telugu and Odia.[34] An example of a word borrowed from Odia is kɘ’ra’ñja’, a tree name,[34] and mu’nu’ (“black gram”) is borrowed from Telugu.[34] Ramamurti (1931) identified three Sora words that were apparently borrowed from Prakrit in ancient times: siŋger (green ginger, cf. Pali singi vera and Sanskrit sr͎inga veram), kaːrella (bitter melon, cf. Sanskrit kāravella), and keda (fragrant screwpine, cf. Sanskrit keta-ki).
Within the Munda family itself, many words are recognizably similar across languages, differing only slightly in pronunciation. Kharia and Korku, two other Munda languages, share such words with Sora.[23] For example, the number 11 is ghol moŋ in Kharia, gel ḑo miya in Korku, and gelmuy in Sora;[23] the three forms look and sound remarkably alike. This is not confined to numerals: a great deal of vocabulary is shared among the Munda languages.
Comparison within the Austroasiatic family sheds further light on Sora vocabulary. The Mon-Khmer languages, spoken primarily in Southeast Asia, have lexical cognates with the Munda family.[17] Some words found in Sora are therefore of direct proto-Austroasiatic origin and resemble words in other Austroasiatic branches.[17] Words relating to the body, family, home, and field, as well as pronouns, demonstratives, and numerals, have the most cognates.[17]
Numerals
The Sora numeral system uses base 12, which only a few other languages in the world do; other unusual bases are also attested, as in Ekari, which uses a base-60 system.[42] Multiples of twenty are used for higher numbers: for example, 39 in Sora arithmetic would be thought of as (1 × 20) + 12 + 7. The first twelve numerals in Sora are:[42]
| English | Sora |
|---|---|
| one | aboy |
| two | bago |
| three | yagi |
| four | unji |
| five | monloy |
| six | tudru |
| seven | gulji |
| eight | thamji |
| nine | tinji |
| ten | gelji |
| eleven | gelmuy |
| twelve | migel |
Just as English adds the suffix -teen (from ten) to numerals after twelve (thirteen, fourteen, etc.), Sora adds a unit to migel (“twelve”) for numerals above 12 and below 20: thirteen is expressed as migelboy (12+1), fourteen as migelbagu (12+2), and so on.[42] For numerals between 20 and 99, Sora adds the suffix kuri to the first constituent of the numeral. For example, 31 is expressed as bokuri gelmuy and 90 as unjikuri gelji.[42]
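The arithmetic behind these numerals can be sketched as a decomposition into twenties, an optional twelve, and a remainder. The code below is a minimal illustration that assumes the pattern described in this section holds for every number from 1 to 99; it does not generate Sora words, and the function name `decompose` is introduced only for this example.

```python
def decompose(n):
    """Split n (1-99) into (twenties, twelves, units), as in 39 = 1*20 + 12 + 7.

    Assumption: the remainder after removing twenties is expressed with
    'twelve' plus a unit once it reaches 12, and as a plain numeral below 12.
    """
    if not 1 <= n <= 99:
        raise ValueError("this sketch only covers 1 to 99")
    twenties, rest = divmod(n, 20)
    twelves, units = (1, rest - 12) if rest >= 12 else (0, rest)
    return twenties, twelves, units


if __name__ == "__main__":
    for number in (13, 31, 39, 90):
        twenties, twelves, units = decompose(number)
        print(f"{number} = {twenties}*20 + {twelves}*12 + {units}")
```

The results line up with the forms cited in the text: 13 is twelve plus one (migelboy), 31 is one twenty plus eleven (bokuri gelmuy), and 90 is four twenties plus ten (unjikuri gelji).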
The Sora number system was featured in a puzzle by Lera Boroditsky, found in the More Resources section associated with her “TED talk”.
Writing systems
The Sora language is written in several scripts. The Sora Sompeng script was developed in 1936 by Mangei Gomango as a native writing system for the language.
Sora is also written in the Latin script, in the Odia alphabet in Odisha, and in the Telugu script in Andhra Pradesh.
Sample texts
Text 1: Article 1 of UN Human Rights Declaration
The following text is Article 1 of the Universal Declaration of Human Rights, written in Sora:[43]
Sora Sompeng Script
𑃦𑃨𑃙𑃑𑃣𑃙𑃢𑃐𑃢 𑃒𑃢𑃙𑃐𑃤𑃖𑃢𑃙𑃥𑃐𑃣𑃙𑃠𑃤 𑃖𑃢𑃚𑃢𑃖𑃢𑃚𑃢𑃜𑃒𑃣𑃙 𑃘𑃦𑃘𑃘𑃣𑃙𑃙𑃢𑃦 𑃣𑃑𑃑𑃣𑃘𑃢 𑃑𑃣𑃑𑃑𑃣 𑃦𑃨𑃑𑃑𑃢𑃤 𑃕𑃙𑃢𑃦𑃨𑃝𑃟𑃢𑃔𑃨𑃢𑃙𑃑𑃣𑃙𑃢 𑃖𑃢𑃙𑃐𑃤𑃖𑃢𑃙𑃥𑃐𑃣𑃙𑃠𑃤. 𑃦𑃨𑃙𑃑𑃣𑃙𑃠𑃤𑃑𑃣 𑃒𑃥𑃔𑃨𑃔𑃨𑃠𑃤 𑃣𑃑𑃑𑃣𑃘𑃢 𑃒𑃤𑃒𑃣𑃟 𑃦𑃨𑃝𑃤𑃔𑃨𑃢 𑃖𑃣𑃙𑃙𑃣. 𑃦𑃨𑃙𑃑𑃣𑃙𑃢𑃐𑃢 𑃦𑃨𑃙𑃑𑃣𑃙𑃠𑃤 𑃣𑃟𑃥𑃢𑃙 𑃒𑃤𑃒𑃤 𑃣𑃑𑃑𑃣𑃘𑃢 𑃖𑃙𑃑 𑃖𑃙𑃑 𑃖𑃣𑃙𑃙𑃣.
Odia Script
ଅନ୍ତେନ୍ଆସା ବାନ୍ସିମାନୁସେନ୍ଜି ମାୱାମାୱାୟବେନ୍ ଲୋଲ୍ଲେନ୍ନାଓ ଏତ୍ତେଲା ତେତ୍ତେ ଅତ୍ତାଇ ଗନାଅରକାଦାନ୍ତେନ୍ଆ ମାନ୍ସିମାନୁସେନ୍ଜି। ଅନ୍ତେନ୍ଜିତେ ବୁଦ୍ଧି ଏତ୍ତେଲା ବିବେକ ଅରିଦା ମେନ୍ନେ। ଅନ୍ତେନ୍ଆସା ଅନ୍ତେନ୍ଜି ଏକୁଆନ୍ ବଇବଇ ଏତ୍ତେଲା ମନ୍ତ ମନ୍ତ ମେନ୍ନେ।
Romanisation
Antenāsa bānsimānusengi māwāmāwāyaben lollennāo ettelā tett e attāi ganārakādāntenā mānsimānusengi. Antenjit e buddhi ettelā viveka aridā men ne. Antenāsa antenji ekuān baibai ettelā mant mant men ne.
English
All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.
Text 2: Psalm 23, Sora Bible
The following text is from Psalm 23 of the KJV-derived Sora Bible.
Gəmaŋtuŋən-na
rich.god-EMPH
gupa(ː)mar-ɲen
shepherd:HUM–1SG
ɲen-ate
1SG-?
er-asu-ige.
NEG-be.ill-AUX:TAM
Anin
3SG
doʔoŋ-ɲen
OBL–1SG
ledeŋgab-leŋ-ən
tender.green.grass-LOC–NSFX
ab-tabmu-t-in;
CAUS-lay.down-NPST–1SG.OBJ
Anin
3SG
lagadnana
peaceful
daʔa-n
water-NSFX
ademadem-ban
along-ALL
doʔoŋ-nen
OBL–1SG
uruŋ-t-iɲ.
convey-NPST–1SG.OBJ
‘The LORD is my shepherd; I shall not want (be ill). He maketh me to lie down in green pastures; He leadeth me beside the still waters.’
Anin
3SG
puraːrda-ɲen
life-1.POSS
ab-samag-te;
CAUS-restore-NPST
Anin
he
aɲim-damən
name-PURP
əpsele
PURP
roi̯taˀdgoˀd-leŋ-ən
righteousness.road-LOC–NSFX
doʔoŋ-ɲen
OBL–1SG
uruŋ-t-iŋ
convey-NPST–1SG.OBJ
‘He restoreth my soul; He guideth me in paths of righteousness for his name’s sake.’
Text 3: West Bengal Sabar/Lodha, conversation
təb=na
then=FOC
a-undʒi-n-len
DEP-four-NSFX–1PL
daku-li
COP–PST
kudubə-mər-an
total-people-NSFX
bagu
two
ɲen
I
daŋgaɽi
youth.FEM
oʔon-ɲen
child-1SG.POSS
dadigti=gi
that’s.all=TOP
‘We were surely then four people in total, only two of us four (remain now), me and my daughter, that’s all (of us now).’
Media coverage
Sora was one of the subjects of Ironbound Films' 2008 American documentary film The Linguists, in which two linguists attempted to document several moribund languages.
Further reading
- Hammarström, Harald; Forkel, Robert; Haspelmath, Martin; Bank, Sebastian, eds. (2016). “Sora”. Glottolog 2.7. Jena: Max Planck Institute for the Science of Human History.
- Ramamurti, R. S. (1931). A Manual of the Sora (Savara) Language. Delhi: Mittal Publication.
- Veṅkaṭarāmamūrti, G. (1986). Sora–English dictionary. Delhi: Mittal Publication.
- Anderson, Gregory D.S (ed). 2008. The Munda languages. Routledge Language Family Series 3.New York: Routledge. ISBN 0-415-32890-X.
- Anderson, Gregory D. S.; Harrison, K. David (2008). “Sora”. The Munda Languages. New York: Routledge. pp. 299–380. ISBN 0-415-32890-X.
- Anderson, Gregory D. S. (2007). The Munda verb: typological perspectives. Trends in linguistics. Vol. 174. Berlin: Mouton de Gruyter. ISBN 978-3-11-018965-0.
References
1. “Statement 1: Abstract of speakers’ strength of languages and mother tongues – 2011”. www.censusindia.gov.in. Office of the Registrar General & Census Commissioner, India. Retrieved 2018-07-07.
2. “Sora”. UNESCO Atlas of the World’s Languages in Danger. UNESCO. Retrieved 2018-03-18.
3. দেশোয়ারা, মিন্টু [Deshwara, Mintu] (21 February 2022). “হারিয়ে যাচ্ছে সৌরা ভাষা” [The Sora language is disappearing]. The Daily Star Bangla. Retrieved 21 February 2022.
4. Anderson, Gregory D. S., ed. (2008). The Munda Languages. Routledge Language Family Series 3. New York: Routledge. ISBN 0-415-32890-X.
5. Anderson (2008:5)
6. Anderson (2008:xviii)
7. Rau, Felix; Sidwell, Paul (2019). “The Munda Maritime Hypothesis”. Journal of the Southeast Asian Linguistics Society. 12 (2). hdl:10524/52454. ISSN 1836-6821.
8. Bradley, David; Mohanty, Panchanan (2024). “Sociolinguistics of South Asia: Tibeto-Burman, Austroasiatic and other languages”. In Ball, Martin J.; Mesthrie, Rajend; Meluzzi, Chiara (eds.). The Routledge Handbook of Sociolinguistics Around the World. Routledge. pp. 184–196. ISBN 978-1-003-19834-5.
9. Parkin, Robert (1991). A Guide to Austroasiatic Speakers and Their Languages. University of Hawaiʻi Press. p. 9.
10. Parkin, Robert (1991). A Guide to Austroasiatic Speakers and Their Languages. University of Hawaiʻi Press. p. 28.
11. Anderson & Harrison (2008:299)
12. Mahapatra, B.P. (1991). “Munda Languages in Census”. Bulletin of the Deccan College Research Institute. 51/52: 329–336. JSTOR 42930411.
13. Anderson & Harrison (2008:299)
14. Bapuji, Mendem; Gamango, Opino; Krishna, P. Phani (2025). “Juray: An Endangered Variety of the Sora Group of Lects”. In Dash, Niladri Sekhar; Arulmozi, S.; Ramesh, N. (eds.). Handbook on Endangered South Asian and Southeast Asian Languages. Springer Nature. pp. 173–209. doi:10.1007/978-3-031-80752-7_9. ISBN 978-303-180-751-0.
15. Chatterji, Suniti Kumar (1971). “‘Adivasi’ Literatures of India: The Uncultivated ‘Adivasi’ Languages”. Indian Literature. 14 (3): 5–42. JSTOR 23329913.
16. Stampe, David L. (1965). “Recent Work in Munda Linguistics I”. International Journal of American Linguistics. 31 (4): 332–341. doi:10.1086/464864. JSTOR 1264042. S2CID 224807949.
17. Donegan, Patricia; Stampe, David (2002). South-East Asian Features in the Munda Languages: Evidence for the Analytic-to-Synthetic Drift of Munda. Proceedings of the Twenty-Eighth Annual Meeting of the Berkeley Linguistics Society: Special Session on Tibeto-Burman and Southeast Asian Linguistics. pp. 111–120.
18. Horo, Luke; Sarmah, Priyankoo (2021). “Phonetic Comparison of Orissa Sora and Assam Sora”. In Mohan, Shailendra (ed.). Advances in Munda Linguistics. Cambridge Scholars Publishing. pp. 199–216. ISBN 1527570479.
19. Horo, Luke; Sarmah, Priyankoo (2015). “Acoustic analysis of vowels in Assam Sora”. North East Indian Linguistics. 7: 69–88.
20. Anderson, Gregory D. S.; Harrison, K. David (2008). “Sora”. The Munda Languages. New York: Routledge. pp. 299–380. ISBN 0-415-32890-X.
21. Horo, Luke; Anderson, Gregory D. S.; Singha, Aman; Sonowal, Ria Borah; Gomango, Opino (2023). “Acoustic phonetic properties of p-words and g-words in Sora”. Proceedings of (Formal) Approaches to South Asian Linguistics 12: 144–167.
22. Horo, Luke; Anderson, Gregory D. S. (2021). “Prosody and Morphosyntax in Sora: A Preliminary Study”. Living Tongues Institute for Endangered Languages: 51–55. doi:10.21437/TAI.2021-11.
23. Starosta, Stanley (1976). “Case Forms and Case Relations in Sora”. In Jenner, Philip N.; Thompson, Laurence C.; Starosta, Stanley (eds.). Austroasiatic Studies, Part 2. Oceanic Linguistics Special Publications. University Press of Hawaii. pp. 1069–1107. ISBN 978-0-8248-0280-6. JSTOR 20019195. OCLC 6015240755.
24. Anderson & Harrison (2008:308)
25. Anderson & Harrison (2008:319)
26. Anderson & Harrison (2008:316)
27. Anderson & Harrison (2008:317)
28. Anderson & Harrison (2008:309–311)
29. Anderson & Harrison (2008:315)
30. Anderson & Harrison (2008:325)
31. Anderson & Harrison (2008:354)
32. Anderson, Gregory D. S. (2023). “The Many Ways to Invisible in Sora”. In Alves, Mark; Sidwell, Paul (eds.). Papers from the Ninth and Tenth International Conference on Austroasiatic Linguistics. JSEALS Special Publication No. 12. University of Hawai’i Press. pp. 19–33.
33. Anderson & Harrison (2008:326)
34. Zide, Arlene R. K. (1976). Nominal Combining Forms in Sora and Gorum. Oceanic Linguistics Special Publications. University of Hawai’i Press. pp. 1259–1294. JSTOR 20019202.
35. Anderson & Harrison (2008:321–322)
36. Anderson & Harrison (2008:322–324)
37. Anderson, Gregory D. S. (2017). “Polysynthesis in Sora (Munda) with Special Reference to Noun Incorporation”. In Fortescue, Michael; Mithun, Marianne; Evans, Nicholas (eds.). The Oxford Handbook of Polysynthesis. Oxford University Press. pp. 930–947. doi:10.1093/oxfordhb/9780199683208.013.50.
38. Anderson & Harrison (2008:330)
39. Anderson & Harrison (2008:341)
40. Anderson & Harrison (2008:342)
41. Donegan, Patricia; Stampe, David (2004). “Rhythm and the Synthetic Drift of Munda”. In Singh, Rajendra (ed.). The Yearbook of South Asian Languages and Linguistics. De Gruyter Mouton. pp. 3–36.
42. Mohan, Shailendra (2012). “Numeral Expressions in Kharia, Korku, and Sora: A Comparative Account”. Bulletin of the Deccan College Post-Graduate and Research Institute. 72/73: 367–374. JSTOR 43610713.
43. “Sora Alphabet and Language”. Omniglot.com. Retrieved 16 February 2026.
Further reading
- Nayak, Abhilas (1995). A morpho syntactic study of the saora language spoken in the Koraput district of Orissa (PhD). Sambalpur University. hdl:10603/187215.
External links
- Media related to Sora language at Wikimedia Commons
- Austroasiatic Languages: Munda and Mon–Khmer