1302. IrvingSnodgrass - Jan. 27, 1999 - 6:43 PM PT
PE:
Proto-Indo-European (according to my source, in this case the Encyclopaedia Britannica) had 8 cases (N, A, G, D, L, Ab, I, V), with four declensions. Modern languages have retained from 0 to all 7 of these cases (though all eight are found in various languages). Some modern languages have *added* declension categories, as Latin once did.
Proto-IE also had three numbers (singular, plural and dual). As far as I can tell, the dual is only retained in some dialects of Lithuanian.
Proto-IE had three genders (masculine, feminine and neuter). Gender exists in most modern IE languages in one form or another (with English being a notable exception).
Wrt cases, I have prepared the following summary of what I know, based on the discussions here and a little additional research I've done:
Case in Modern IE languages
I. Armenian (yes - 6-7 cases)
II. Indo-Iranian (see below)
III. Albanian (yes, 3-4)
IV. Greek (yes, 4)
V. Romance (two (N,A) in Rumanian, none in other languages)
VI. Celtic (unclear)
VII. Germanic (no)
VIII. Baltic (yes, 7)
IX. Slavic (yes, 6-7)
Indo-Iranian
A. Romany (unknown)
B. Sinhalese-Maldivian (unknown)
C. Dardic (unknown - Kashmiri is the main language)
D. Western (unknown, until Marj provides info on Marathi and Konkani)
E. Central (3 for Hindi/Urdu. Panjabi may have more - see note below)
F. East-Central (unknown -- Nepali and Bhojpuri are the main languages)
G. Eastern (4)
As for Gender, three genders are retained in Gujarati, Marathi, Konkani and Sinhalese. All other languages have two genders, except the Eastern languages (Bengali, Oriya, Assamese), which have lost grammatical gender.
[continued]
1303. IrvingSnodgrass - Jan. 27, 1999 - 6:43 PM PT
Wrt case, the following quotes from EB are informative:
"Over a large area of New Indo-Aryan the noun has only two cases -- direct and oblique. A lack of distinction between direct and oblique cases in the plural is typical of several languages, including forms in Hindi, Gujarati, Marathi, and Bhojpuri.
"Though the nominal (noun) system of Punjabi is very close to that of Hindi, it has separate ablative... and locative... forms in the singular and plural.
"Kashmiri [a Dardic language] has nominative, dative, ablative and agentive cases. Not all such case forms are inherited from Middle Indo-Aryan."
The final quote is of particular note, as it's the first evidence I've seen of innovation (as opposed to reduction) in modern IE case systems.
1304. IrvingSnodgrass - Jan. 27, 1999 - 6:53 PM PT
It is of interest to note that there are 144 modern Indo-European languages, as follows:
A. Armenian: 1 (with many, often divergent, dialects)
B. Indo-Iranian: 93
...1. Indic: 48
...2. Nuristani: 5
...3. Iranian: 40
C. Albanian: 1 (with two rather distinct dialects)
D. Greek: 2
E. Romance: 16
F. Celtic: 4
G. Germanic: 12
H. Baltic: 2
I. Slavic: 13
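The branch counts in the list above can be tallied with a short sketch. The figures are copied from the list as posted; the variable names are only illustrative.

```python
# Language counts per branch, copied from the list above.
branches = {
    "Armenian": 1,
    "Indo-Iranian": 93,
    "Albanian": 1,
    "Greek": 2,
    "Romance": 16,
    "Celtic": 4,
    "Germanic": 12,
    "Baltic": 2,
    "Slavic": 13,
}

# Indo-Iranian sub-counts, also from the list.
indo_iranian = {"Indic": 48, "Nuristani": 5, "Iranian": 40}

total = sum(branches.values())
indo_iranian_total = sum(indo_iranian.values())

print(total)               # 144
print(indo_iranian_total)  # 93
```

Both sums agree with the totals quoted in the post (144 languages overall, 93 of them Indo-Iranian).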
I forgot to put case information on the remaining Indo-Iranian languages in my info above.
Nuristani (unknown)
Iranian
1. Eastern (Pashto is similar to modern Indo-Aryan systems)
2. Western (unknown - Kurdish is an example)
3. Southwest (Persian has lost all inflection in number, gender and case)
1305. pseudoerasmus - Jan. 27, 1999 - 7:41 PM PT
Snirv:
Why on earth do you say no cases for Germanic? German has got four of them, though used primarily with articles and adjectives. (Nouns are only slightly declined.)
Punjabi and Pashto, for which I've already cited references, have 4 cases and 2 cases, respectively. The former has nominative, accusative, "oblique 1" (serving dative, locative & genitive functions) and "oblique 2" (serving ablative & instrumental functions). The latter has nominative and oblique.
"A lack of distinction between direct and oblique cases in the plural is typical of several languages, including forms in Hindi..."
Well this is certainly false for Urdu and, I assume, for Hindi as well.
kamre (rooms) | kamrõ me (in the rooms)
meze (tables) | mezõ par (on the tables)
larkiyã (girls) | larkiyõ se (from the girls)
[The tilde is meant to convey nasalisation of the vowel.] There may be nouns which in the plural do not differ between the nominative and the oblique, but I can't think of any.
"Persian has lost all inflection in number, gender and case."
Untrue. Persian has lost all inflection in gender and case, but not in number. Somewhat like Pashto, nouns in Persian are pluralised by adding "-an" at the end of animate nouns but "ha" at the end of inanimate nouns. Thus, baradar/baradaran (brother/brothers) and gol/golha (rose/roses). Nouns ending in a short E (medial dot) are pluralised by dropping the E and replacing it with "egan". Thus, parande/parandegan (bird/birds). And there are countless words in both Persian (and Pashto) which take the "Arabic plural": manzel/manazel (house/houses).
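The pluralisation rules described above can be sketched as a small function. This is only a rough illustration of the rules as stated in the post, working on the romanised forms used here; the function name `pluralize` and the lookup table `ARABIC_PLURALS` are hypothetical, and the order in which the rules are tried is an assumption.

```python
# Lexical ("Arabic") plurals have to be listed word by word; this table
# holds only the single example given in the post.
ARABIC_PLURALS = {"manzel": "manazel"}


def pluralize(noun: str, animate: bool) -> str:
    """Pluralise a romanised Persian noun following the rules in the post."""
    # Words taking the "Arabic plural" are looked up, not derived.
    if noun in ARABIC_PLURALS:
        return ARABIC_PLURALS[noun]
    # Nouns ending in short e: drop the e and add "egan".
    if noun.endswith("e"):
        return noun[:-1] + "egan"
    # Animate nouns take "-an", inanimate nouns take "ha".
    if animate:
        return noun + "an"
    return noun + "ha"


print(pluralize("baradar", animate=True))   # baradaran
print(pluralize("gol", animate=False))      # golha
print(pluralize("parande", animate=True))   # parandegan
print(pluralize("manzel", animate=False))   # manazel
```

Real usage is certainly messier than this; the sketch only shows that number is still marked on the noun, which is the point at issue.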
1306. IrvingSnodgrass - Jan. 27, 1999 - 9:02 PM PT
[Writing this for the second time. I've got to remember to save all posts, given the Fray's current idiosyncrasies. What kind of moron runs this place, anyway? Uh, wait a minute... don't answer that.]
PE:
(1) My error. I meant to write "None, except for German." The other Germanic languages have lost their case systems, but German's remains alive and well.
(2) The citation I provided indicates that Panjabi may have a more extensive case system than previously indicated. I have no idea if this is true or not.
Pashto's system is similar to the Indo-Aryan systems, though (as we've seen) there is a great deal of variation in Indo-Aryan systems.
(3) I would not be at all surprised if the EB citation on Hindi is inaccurate. It is certainly different from your information, which was confirmed by the information I provided from Comrie. If I had to choose, I'd choose Comrie and your source over EB, no question.
(4) Again, my error. I read the following in Comrie (p. 530):
"[Persian] has almost completely lost the inherited synthetic nominal and verbal inflection and their inflectional classes, and thus the inflectional distinction of case, number and gender as well as of tense, mood, aspect and verbal gender."
If I'd bothered to read a few lines further on, I would have found:
"Person and number are, however, distinguished, so is human and non-human gender."
Thanks for the corrections.
1307. IrvingSnodgrass - Jan. 27, 1999 - 11:06 PM PT
For all of you who have expressed interest (=Alistair), I am continuing my look at the AUSTRONESIAN language family. As you may recall, I have already looked at the languages of the Philippines, Celebes (Sulawesi), and Borneo.
The next branch in the family is by far the most important politically and the largest numerically: the 46 languages of the SUNDIC branch, spoken by approximately 334 million speakers (according to Ethnologue's numbers) in 8 nations (although 95% of the speakers are in Indonesia).
This chart shows the relationships of the languages of the Sundic group. The exact classification within the Sundic group is still unclear and needs more work.
At the highest level, there are 4 unclassified languages. Based on my own experience, I would group Sundanese and Javanese together, along with Balinese and possibly Madurese. Balinese has no business being classified with the distantly-related Sasak and Sumbawa languages: it split off from Javanese in historical times and is very similar to Javanese (and Sundanese). Compare the forms for thank you in the three languages:
S hatur nuhun
J matur nuwun
B matur suksme (suksme is a loan word from Sanskrit)
The relationships are already transparent. So I would propose a branch with Sundanese, Javanese, Balinese, and maybe Madurese, and leave Sasak and Sumbawa where they are.
Someday I'd like to have the time to do the research to settle all these questions, but until then someone else can do it.
[continued]
1308. IrvingSnodgrass - Jan. 27, 1999 - 11:09 PM PT
Here's a little exercise for you. This chart presents the numbers from 1-10 in 8 Austronesian languages. Can you tell which are *not* Sundic languages? Five are Sundic (from three different sub-branches of Sundic) and three are from completely different sub-families of Austronesian. Two are easy. The other is nearly impossible.
Hint: Ignore the form sedoso/dase, which is a loan word from Sanskrit.
This exercise shows just how difficult classification is. Perhaps the entire Austronesian classification needs rethinking.
Even within the fairly closely-related Sundic languages, basic words can differ greatly. Look at these words for black:
Sundanese - hideung
Javanese - item
Balinese - selem
No relations there, but strangely enough, the Javanese word is cognate to Malay hitam (possibly through borrowing).
The Sundic branch has the following sub-branches, along with languages of a million or more speakers (all figures from Ethnologue):
0. Unclassified
...Javanese 75.5 million
...Sundanese 27 million
1. Lampung (South Sumatra)
...Lampung 4.2 million
2. Bali-Sasak (the islands of Bali, Lombok and Sumbawa)
...Balinese 3.8 million
...Sasak 2.1 million
3. Sumatra (North Sumatra)
...Batak (all groups) 5.83 million
4. Malayic (West/Central/North Sumatra, Borneo, Malaysia, Singapore, Brunei, China, Vietnam, Cambodia, Thailand)
...Minangkabau 6.5 million
...Iban 1 million
...Madurese 10 million
...Malay 18.6 million
...Betawi 2.7 million
...Indonesian 170 million
...Rejang 1 million
...Pattani Malay (Thailand) 2.4 million
...Acehnese 3 million
The Malayic sub-branch alone has about 210 million speakers.
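The speaker figures listed above can be added up per sub-branch with a short sketch. The numbers are copied from the list (in millions); the dictionary layout and names are only illustrative. Only languages with a million or more speakers are listed, so the sums are lower bounds for each sub-branch.

```python
# Speaker counts in millions, copied from the list above.
speakers_millions = {
    "Unclassified": {"Javanese": 75.5, "Sundanese": 27},
    "Lampung": {"Lampung": 4.2},
    "Bali-Sasak": {"Balinese": 3.8, "Sasak": 2.1},
    "Sumatra": {"Batak (all groups)": 5.83},
    "Malayic": {
        "Minangkabau": 6.5,
        "Iban": 1,
        "Madurese": 10,
        "Malay": 18.6,
        "Betawi": 2.7,
        "Indonesian": 170,
        "Rejang": 1,
        "Pattani Malay": 2.4,
        "Acehnese": 3,
    },
}

# Sum each sub-branch, rounding away floating-point noise.
subtotals = {
    branch: round(sum(langs.values()), 2)
    for branch, langs in speakers_millions.items()
}
grand_total = round(sum(subtotals.values()), 2)

for branch, millions in subtotals.items():
    print(f"{branch}: {millions} million")
print(f"All listed languages: {grand_total} million")
```

As listed, the Malayic entries sum to 215.2 million, a little above the "about 210 million" quoted, and all listed languages together come to roughly 333.6 million, close to the 334 million given for the whole Sundic branch in Message #1307. How the rounder figures were arrived at is not stated in the posts.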
[continued]
1309. IrvingSnodgrass - Jan. 27, 1999 - 11:09 PM PT
One of the strangest features of the chart is the Acehnese-Chamic group under the Malayic sub-branch.
This links Acehnese, a language spoken by 3 million people on the northern tip of Sumatra, with the 10 Chamic languages spoken in Mainland Indochina (Vietnam and Cambodia) and on Hainan Island in China (with about 1 million speakers, total). The Chams use an Indic (Brahmi-derived) script and follow the Muslim religion (even in Hainan). Other than Malays in Malaysia and Thailand, the Chams are the only mainland-based Austronesian speakers.
Another interesting group is the Urak Lawoik, in the Para-Malay sub-branch. The other speakers in this group are found in West Sumatra, but the Urak Lawoik are found on the islands off the west coast of Thailand (Phuket and environs).
Next: A look at the Central/Eastern sub-family of Austronesian.
1310. IrvingSnodgrass - Jan. 27, 1999 - 11:22 PM PT
The links in my posts above are not working (I drafted them in a WP program and forgot to fix the quotes).
The correct link for Message #1307 is: Sundic chart
The correct link for Message #1308 is: Numbers Comparison Exercise
1311. IrvingSnodgrass - Jan. 27, 1999 - 11:25 PM PT
Sheesh. I'm not doing too well here.
The correct link for Message #1307 is Sundic Chart
1312. stostosto - Jan. 28, 1999 - 4:22 AM PT
Irv
"That's our "Surfing Rabbi," the mascot of our Deli. You see, in the American consciousness Deli=Jewish, and Bali is a surfer's heaven. It's a bit of American humor. Hmmm... Lesson 1 for the course?"
Oh, *that's* American humour? I see. I see. I see.
In that case you'd better stick to your Dane Law...
I'll drop you a line when I get the time.
1313. IrvingSnodgrass - Jan. 28, 1999 - 4:37 AM PT
Sto:
Noooo, the Brits have "humour," as in Mr. Bean, Benny Hill, etc. Americans have "humor," a much more refined thing.
1314. PsychProf - Jan. 28, 1999 - 4:45 AM PT
Someone decided Benny was not PC, and he vanished from TV here...
1315. DanDillon - Jan. 28, 1999 - 5:01 AM PT
Benny Hill, never mind his unPCness, was not funny.
Irv,
To avoid losing your posts (it's happened to me a time or two as well), simply copy what you've typed before clicking anything and if/when it appears not, just paste it back and try, try again. Does the trick.
1316. RyckNelson - Jan. 28, 1999 - 5:01 AM PT
Some have a cause or is it that they are a cause?
Bakun is not one of my 'direct' concerns but it does affect the same group of oppressed peoples I've been trying to report.
The oppression in the link I've given is again the loss of customary lands held by peoples for thousands of years. To be taken and destroyed. The loss of biodiversity is the WORLD'S concern, but the patornizing government imposing its loyalty tests upon these Orang Ulu peoples is utterly dishonorable!
There are links which allow people to write to the government, ministry and the construction company.
PLEASE. Please write to them.
1317. RyckNelson - Jan. 28, 1999 - 5:01 AM PT
patronizing
1318. IrvingSnodgrass - Jan. 28, 1999 - 5:02 AM PT
PP:
Actually, a much higher power cancelled Benny's show. He died in 1992.
1319. IrvingSnodgrass - Jan. 28, 1999 - 5:07 AM PT
Dan:
I usually draft my longer posts elsewhere (where they are saved). Now I'm drafting everything first.
Well, almost everything... I figure something like this post I can recreate easily enough.
1320. RyckNelson - Jan. 28, 1999 - 5:25 AM PT
Here is a book of my "cause".
1321. DanDillon - Jan. 28, 1999 - 5:44 AM PT
Ryck,
Are you in the right thread?
1322. RyckNelson - Jan. 28, 1999 - 5:58 AM PT
For a one timer, yup.
See above, my cause or cause question.
1323. Ronski - Jan. 28, 1999 - 6:02 AM PT
Benny Hill was a riot.
1324. PsychProf - Jan. 28, 1999 - 6:29 AM PT
Uh Irv...Benny was shown on tape here...like reruns...
1325. stostosto - Jan. 28, 1999 - 6:48 AM PT
Irv.
"Sto:
Noooo, the Brits have "humour," as in Mr. Bean, Benny Hill, etc. Americans have "humor," a much more refined thing."
I once read something about the difference between British and American humour/humor. The guy concluded something like yourself. Needless to say he was American. Mike Twian or something.
It's funny because many people have been fooled into thinking of Americans as being generally less refined than their "old countries" (which can of course be put down to mere snobbery, esp. in the case of the Brits).
---
For the record: Benny Hill was so embarrassingly unfunny that he was very very funny.
1326. IrvingSnodgrass - Jan. 28, 1999 - 6:50 AM PT
Returning to the languages of the world has inspired me to go out and do some first-hand linguistic research. I was wondering if the 5000+ languages in Ruhlen's "A Guide to the World's Languages" were accurate, as well as the 6000+ languages listed in Ethnologue (which is the better source, imho).
So I tracked down an acquaintance of mine, a security guard who originally comes from the island of Alor. If you look on a good map of Indonesia, you'll see Alor: a small island near the islands of Flores and Timor.
Ethnologue lists 9 languages (a number confirmed by Ruhlen's lists), with an approximate population for all languages of 176,000 speakers. 8 of the languages (covering 36 dialects and 116,000 speakers) belong to the Trans-New Guinea Branch of the Indo-Pacific family (a non-Austronesian family). The other language, with 60,000 speakers, is an Austronesian language in the Central-Eastern sub-family.
So, what did I find? The first thing I discovered was that my informant had no idea what I meant by "language." He has always considered what he spoke back in Alor to be somehow inferior, and not deserving of the status of language. Once I showed him the list of languages in my book, he became very excited and talked on and on about the linguistic situation on his home island.
It turns out he is fluent in three of the languages of Alor, and can recognize most of the others, as well as being able to identify dialects (which he described as "turned-around languages" before he remembered the Indonesian term "logat"). His linguistic intuitions were impressive, especially for someone who, until today, had never heard of linguistics and didn't realize that the languages of Alor had any status whatsoever in the real world.
[continued]
1327. IrvingSnodgrass - Jan. 28, 1999 - 6:53 AM PT
He confirmed the lists in Ruhlen and Ethnologue as valid, with one startling exception. To my informant, many of the "dialects" listed for each of the languages are separate languages. He based his descriptions on mutual intelligibility. He was also able to confirm that languages were either "closely related" or "distantly related" as in Ruhlen's classifications (based on the work of Wurm and Hattori). All of the languages and dialects he was familiar with were Indo-Pacific, and he was not aware of the Austronesian language on the island.
My informant is a native speaker of the Kalamana dialect (listed in Ethnologue (E) as "Langkuru-Kalomano," although my informant says these are separate but closely-related dialects). This dialect is listed in E as a dialect of "Woisika," which my informant informed me is properly "Waisika." He also said it is not mutually intelligible with his dialect, and he regards it as a separate language. In addition to Kalamana and Waisika, he also speaks Tanglapui, a distantly-related language (which is still "unclassified," according to Ruhlen).
1328. pseudoerasmus - Jan. 28, 1999 - 6:55 AM PT
Message #1326
You've just started Indonesia on the road to being a Yugoslavia....
1329. PsychProf - Jan. 28, 1999 - 6:55 AM PT
Dan...although I did not find Benny particularly funny, I must say that my friends in Scotland certainly did. I spent a year at U Aberdeen, and watching British humour with colleagues was a treat. Culture and humor interact in interesting ways.
1330. pseudoerasmus - Jan. 28, 1999 - 6:57 AM PT
A paper is in order, Snirv, to be published in "The Proceedings of the Indo-Pacific Linguistics Conference".
1331. IrvingSnodgrass - Jan. 28, 1999 - 7:05 AM PT
PP:
My comment to sto was mostly in jest. There is much I enjoy of British humor, the best of which is at least equal to anything America has produced.
PE:
I'm starting to feel rather inspired, actually. These languages of Indonesia need proper classifying. What I've found in my brief look at the situation over the past few days is appalling.
1332. marjoribanks - Jan. 28, 1999 - 7:29 AM PT
Gentlemen,
I'm still looking for appropriate reference material for Marathi and Konkani. In the meanwhile, please note that there is no language called "Panjabi." It's Punjabi.
1333. Ronski - Jan. 28, 1999 - 7:29 AM PT
I think there is British humor and British schoolboy humor. I watched Benny Hill for many years with my late partner, who was a Franco-Englishman. He, believe it or not, was extremely sophisticated, but liked schoolboy humor nevertheless, as do I. I say believe it or not since he was with me.
1334. PsychProf - Jan. 28, 1999 - 7:32 AM PT
Ronski...sounds like a wonderful touch in the relationship.
1335. pseudoerasmus - Jan. 28, 1999 - 7:32 AM PT
The language of Punjab is spelt either Panjabi or Punjabi, indifferently. "Panjabi" would be more accurate, linguistically.
1336. marjoribanks - Jan. 28, 1999 - 7:34 AM PT
Benny Hill, though ridiculously over the top, had the ability to make one laugh by just looking at the camera with a leer on his face. This is quite rare; others who could do this included that fruity guy from the Carry On series, Chevy Chase and the SNL fellow Will Ferrell.
1337. pseudoerasmus - Jan. 28, 1999 - 7:34 AM PT
Excuse me, pUnjabi would be more accurate linguistically.
1338. marjoribanks - Jan. 28, 1999 - 7:36 AM PT
Why, pray tell is "Panjabi" more accurate linguistically? The word itself is pronounced pun-jahbi.
1339. marjoribanks - Jan. 28, 1999 - 7:37 AM PT
Oh okay.
1340. pseudoerasmus - Jan. 28, 1999 - 7:37 AM PT
But there is a distinction between "Punjabi" and "Panjabi", aside from the purely orthographic difference. "Punjabi" invariably refers to Western Punjabi, whereas "Panjabi" sometimes refers to Mirpur Panjabi of Kashmir, a dialect which is sometimes considered a different language. My Punjabi grammar, written by a Lahori, spells it "Panjabi" throughout the book.
1341. marjoribanks - Jan. 28, 1999 - 7:41 AM PT
Well, Brit linguists often used "Panjabi" but then so what? They're also responsible for "Cawnpore" and "Poona".
BTW, Pseuder, have you checked out Chowk yet? They've got Urdu resources among other things.
1342. IrvingSnodgrass - Jan. 28, 1999 - 7:42 AM PT
My primary reference for the posts I have been making (Comrie) lists the language as "Panjabi," which is more accurate etymologically because the schwa sound in the first syllable descends from a short "a" sound, not a "u" sound.
1343. IrvingSnodgrass - Jan. 28, 1999 - 7:45 AM PT
Are there no takers on my language exercise from Message #1308 (with the correct chart in Message #1310)?
1344. pseudoerasmus - Jan. 28, 1999 - 7:46 AM PT
The reason "Panjabi" is more accurate etymologically because "panj" with a strong A sound, means "five" in Persian, from which the name of the language is derived.
"Punjab" = "panj" + "abesh" = "five" + "river" in Persian
1345. pseudoerasmus - Jan. 28, 1999 - 7:47 AM PT
Message #1344
substitute "is that" for "because".
1346. pseudoerasmus - Jan. 28, 1999 - 7:48 AM PT
What exactly is a schwa? I've always wondered about this linguistic term.
1347. IrvingSnodgrass - Jan. 28, 1999 - 7:58 AM PT
PE:
A schwa is the sound made in the English pronunciation of the first syllable of Punjabi. It's a mid-central vowel, linguistically, and is denoted by an upside-down "e."
Schwas are commonly found in English in unstressed syllables. A stressed schwa is represented by a "caret" (an upside down "v"), as in the English word "but." The short "a" in Indo-Aryan (and Dravidian) languages is commonly sounded as a schwa sound.
1348. pseudoerasmus - Jan. 28, 1999 - 8:06 AM PT
I assume when Marzipranks transcribes the name "punjabi", he means for the first syllable to rhyme (more or less) with the English word "pun". At least that is what I mean.
In the Perso-Arabic script of Urdu, the word "Punjabi" is spelt with the long A vowel mark (alif), represented by something which looks like a T at the beginning of the word but by a vertical line in the middle of words. This is exactly like in Punjabi itself, at least the Punjabi of Pakistan.
1349. Jgeffert1 - Jan. 28, 1999 - 8:08 AM PT
Irv: Why? and Who cares? re: 1346. Boy this is fun being a new 'paid' subscriber and feeling much more friskie about putting in my two cents, 230 reals, 1240 riats?
1350. Jgeffert1 - Jan. 28, 1999 - 8:10 AM PT
Oops, that was 1343, Irv.
1351. IrvingSnodgrass - Jan. 28, 1999 - 8:17 AM PT
PE:
Yes, a schwa is the sound in "pun" (although the vowel in "pun" is more properly a "caret," since it is stressed).
Jgeffert:
Gee thanks. You can't imagine how much I appreciate that, after putting a couple of hours into the language exercise. I hope some of the participants here find it of interest.
1352. Ronski - Jan. 28, 1999 - 8:19 AM PT
Irv,
I love your work in this thread, and I love this thread.
1353. PsychProf - Jan. 28, 1999 - 8:21 AM PT
Irv...we do indeed.
1354. IrvingSnodgrass - Jan. 28, 1999 - 8:21 AM PT
The final sub-family in the Austronesian family is the massive (in numbers of languages, *not* numbers of speakers) Central-Eastern sub-family (see chart). The group spans 571 languages ranging from Eastern Indonesia through New Guinea and all the way across the Pacific Ocean to Easter Island and Hawaii.
The Western sub-family had 374 languages in 13 nations, with about 400 million speakers. The Central-Eastern sub-family has 571 languages in 20 nations and territories, with about 6 million speakers. Quite a contrast.
The first branch, Central, covers 89 languages spoken in eastern Indonesia in Maluku (the Moluccas) and on and around the islands of Timor, Flores, Sumba and Sumbawa.
The Eastern branch has 482 languages, in two sub-branches. 56 languages are spoken in South Halmahera (Maluku) and NW New Guinea. The remaining 426 languages make up the Oceanic sub-branch.
The Oceanic sub-branch has 17 Groups, 5 of which are spoken on New Guinea, and another 11 on islands near New Guinea. Of the 270 languages in these 16 groups, none is well-known or has a significant number of speakers.
The remaining group, Remote Oceanic, is where things get interesting. The first sub-group (Micronesian) includes 3 national languages (Nauruan, Gilbertese, Marshallese) among its 9 languages. The next two sub-groups include 116 languages spoken in the Solomon Islands and the New Hebrides.
The final sub-group, Central Pacific, is of interest. It includes languages with official status such as Fijian (a fascinating language -- I wish I still had my Fijian grammar to describe it), Niuean, Tongan, Samoan, Tuvalu, Rapanui (the language of Easter Island), Tahitian, Maori, Marquesan and Hawaiian.
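The language counts quoted in this overview can be cross-checked with a little arithmetic. The sketch below uses only the figures stated in the post; the last two values are implied by subtraction and are not stated anywhere in the post, so treat them as assumptions (in particular, they assume the four Remote Oceanic sub-groups mentioned are the only ones).

```python
# Language counts quoted in the overview above.
central_eastern = 571
central = 89
eastern = 482
south_halmahera_nw_new_guinea = 56
oceanic = 426
new_guinea_area_groups = 270   # the 16 groups on or near New Guinea
micronesian = 9
solomons_new_hebrides = 116    # "the next two sub-groups"

# Consistency checks on the stated figures.
assert central + eastern == central_eastern
assert south_halmahera_nw_new_guinea + oceanic == eastern

# Implied by subtraction only; neither figure is stated in the post.
remote_oceanic = oceanic - new_guinea_area_groups
central_pacific = remote_oceanic - micronesian - solomons_new_hebrides

print(remote_oceanic)    # 156
print(central_pacific)   # 31
```

The stated totals are internally consistent: 89 + 482 = 571 and 56 + 426 = 482.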
[continued]
1355. pseudoerasmus - Jan. 28, 1999 - 8:21 AM PT
Snirv: But you said that a schwa is a short A in Indo-Aryan languages. "Punjabi" is spelt with a long A in Urdu as well as in Punjabi.
Marzipranks: How is it spelt in the Devanagari script? Long or short A?
1356. IrvingSnodgrass - Jan. 28, 1999 - 8:23 AM PT
Note that Maori and Hawaiian are in the same sub-branch (Central) of the same branch (Eastern Polynesian) of the same sub-group (Polynesian) of the same group (Remote Oceanic) of the same sub-branch (Oceanic) of the same branch (Eastern) of the same primary branch (Central-Eastern) of the same sub-family (Malayo-Polynesian) of Austronesian, which itself is a member of the Austro-Tai Sub-phylum of the Austric phylum.
So, Alistair, there is your Maori, deeply embedded in the world's largest (in terms of number of languages) language family. Maori, btw, has the most speakers of any language in the entire Central-Eastern branch, with 70,000-100,000 speakers.
And that concludes my overview of Austronesian.
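The classification chain given above for Maori and Hawaiian can be written down as a simple list, which makes the depth of the nesting easier to see. This is only a sketch of the chain as stated in this message; the names `lineage`, `languages` and `path` are illustrative, and labelling Austronesian as "family" is taken from the wording of the post.

```python
# (rank, name) pairs from most specific to most general, as given in the post.
lineage = [
    ("sub-branch", "Central"),
    ("branch", "Eastern Polynesian"),
    ("sub-group", "Polynesian"),
    ("group", "Remote Oceanic"),
    ("sub-branch", "Oceanic"),
    ("branch", "Eastern"),
    ("primary branch", "Central-Eastern"),
    ("sub-family", "Malayo-Polynesian"),
    ("family", "Austronesian"),
    ("sub-phylum", "Austro-Tai"),
    ("phylum", "Austric"),
]

# Per the post, both languages share the whole chain.
languages = {"Maori": lineage, "Hawaiian": lineage}


def path(language: str) -> str:
    """Render the chain from the phylum down to the language itself."""
    names = [name for _rank, name in reversed(languages[language])]
    return " > ".join(names + [language])


print(path("Maori"))
print(len(lineage))  # 11 levels above the language itself
```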
1357. DanDillon - Jan. 28, 1999 - 8:27 AM PT
The romanization of just about any foreign term is problematic by design, if not by intent. While doing so isn't entirely arbitrary, the phonetic diversity among languages, especially languages of different families, renders the task very difficult, proven by the discussion on Punjabi... I mean Panjabi. Anyway, I have encountered supreme difficulty in transliterating the laryngeals in Arabic. After all, how to standardize such a thing? And in such a way so that all languages can agree? A weighty task indeed.
1358. Jgeffert1 - Jan. 28, 1999 - 8:28 AM PT
Irv: I am just so intimidated by true genious. I am properly ashamed of myself and I will go bother somebody else in some other thread where I don't have to be a smart-aleck to be wanted. (sob,sob)
1359. IrvingSnodgrass - Jan. 28, 1999 - 8:29 AM PT
PE:
A long "a" sound in Indo-Aryan languages is different from a schwa. It is comparable to the first vowel in "father," and the linguistic symbol is a simple "a." I find it convenient to use "aa" for long vowels and "a" for short vowels here in the Fray.
If Panjabi is actually "Paanjabi," then the "u" of Punjabi has no business at all in there.
1360. pseudoerasmus - Jan. 28, 1999 - 8:32 AM PT
Well, in Urdu and Punjabi itself, I would say that the first vowel in "Punjabi" is pronounced somewhere in-between the vowels of "but" and "father".
1361. IrvingSnodgrass - Jan. 28, 1999 - 8:32 AM PT
True "genious," indeed.
Dan:
Transliterating a language such as Arabic with a phonetic keyboard is one thing. Trying to do it here in the Fray with a limited number of symbols available is even harder.
1362. marjoribanks - Jan. 28, 1999 - 8:36 AM PT
It's a short a if I understand the difference at all, which may not be the case.
Devanagari offers one the chance to say (about P) in my rough transliteration:
Puh
Pa
Pi
Piy
Pu
Pooh
Pei
Paiy
Po
Pou
Puh
Pha.
When the sound is split with the particular N as in Punjabi, the word is spelt with the character for Puh.
1363. IrvingSnodgrass - Jan. 28, 1999 - 8:36 AM PT
PE:
Yes, a short "a" in Indic is actually in between a schwa and an "a." I think to confirm whether "Punjabi" is an acceptable spelling, we would need to know if it is a long or a short vowel.
1364. IrvingSnodgrass - Jan. 28, 1999 - 8:41 AM PT
Marj:
The vowel sounds in Indic are generally long and short i, e, a, o and u (plus diphthongs) - 12 sounds, as in your list. I'm not sure how your list (with two different "puh" sounds) works, though.
1365. hashke - Jan. 28, 1999 - 8:41 AM PT
Let ^=schwa. The schwa can be heard in the word 'apple', i.e., 'ap^l'
Dan: The laryngeal in Arabic is easily represented by `, as in *`arab*, *`id il fiTr*, etc.
1366. DanDillon - Jan. 28, 1999 - 8:43 AM PT
The shwa/schwa is a neutral vowel. I wouldn't liken it to a 'short A' or anything else. In fact, it carries a unique status in that it is often called the "indefinite vowel." You can hear it yourself if you say "amazing" aloud. Say it. It's the very first sound you made. Isolate it, and you've something quite different from a short A. (I never liked the terms "short" and "long" to describe vowels anyway.) As Irv pointed out, it is a mid-central vowel in English, the only other being the 'u' in "but" /^/. But they are rather unalike. The schwa can range from sounding like the first sound in "amazing" to the 'i' in "bodily," while the vowel sound in "but" does not vary. The /^/ sounds consistently like somebody in conversation who's searching for the right word... uhh.
1367. marjoribanks - Jan. 28, 1999 - 8:46 AM PT
Sorry, that second puh should have been 'punh'.
1368. pseudoerasmus - Jan. 28, 1999 - 8:50 AM PT
IrvingSnodgrass (Message #1363)
"I think to confirm whether 'Punjabi' is an acceptable spelling, we would need to know if it is a long or a short vowel."
But I've already told you. In the Perso-Arabic script of Urdu and Punjabi, the first vowel of the name of the language of Punjab is written with the long A -- alif.
I've also received e-mail confirmation from an Indian I know here. He says that in the Devanagari script of Hindi, "Punjabi" is spelt with a long A, which is consistent with what I said about Urdu and Punjabi. But he did also say there are regional variations in the Hindi pronunciation of "Punjabi".
1369. DanDillon - Jan. 28, 1999 - 8:50 AM PT
hashke,
Nice to see you.
"Let ^=schwa."
The phonetic symbol /^/ actually stands for the 'u' sound in "but," the 'short A' as some are calling it. The schwa is always denoted by the upside down 'e', as Irv mentioned.
"The laryngeal in Arabic is easily represented by `, as in *`arab*, *`id il fiTr*, etc."
Yes, but how to get everyone to agree on that?
1370. pseudoerasmus - Jan. 28, 1999 - 8:51 AM PT
I take it that what Marzipranks means by the H in "punh" is the nasalised N.
1371. IrvingSnodgrass - Jan. 28, 1999 - 8:52 AM PT
Here's a Hindi wordlist to show some of the vowels (initial vowels are what we're looking at):
i = inaam ('gift')
ii = ghii ('ghee')
e = ek ('one')
a = ab ('now')
aa = aaj ('today')
o = or ('side')
u = uttar ('north')
uu = uunt ('camel')
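The doubled-vowel convention used in this wordlist ("aa", "ii", "uu" for long vowels) can be converted mechanically to the other common convention of a line over the vowel. The sketch below is a minimal illustration; the function name `to_macrons` is hypothetical, and it assumes a doubled vowel letter always marks one long vowel and never two adjacent short vowels.

```python
# Doubled-letter spellings mapped to macron spellings.
LONG_VOWELS = {"aa": "ā", "ii": "ī", "uu": "ū"}


def to_macrons(word: str) -> str:
    """Rewrite doubled long vowels with macrons; short vowels are untouched."""
    for doubled, marked in LONG_VOWELS.items():
        word = word.replace(doubled, marked)
    return word


for word in ["inaam", "ghii", "ek", "ab", "aaj", "or", "uttar", "uunt"]:
    print(word, "->", to_macrons(word))
```

Words with only short vowels ("ek", "ab", "or", "uttar") come out unchanged, while "aaj" becomes "āj" and "uunt" becomes "ūnt".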
1372. pseudoerasmus - Jan. 28, 1999 - 8:52 AM PT
But the Indian in question is a Tamil, so I wouldn't know whether that's reliable.
1373. IrvingSnodgrass - Jan. 28, 1999 - 8:57 AM PT
Dan & Hashké:
Sounds good to me. Let's use ^ to denote the short a sound in Indic languages.
PE:
If Panjabi is spelt with a long a in Hindi/Urdu, then the transliteration should be Panjabi, Paanjabi, or P(a with a line over it)jabi.
1374. pseudoerasmus - Jan. 28, 1999 - 9:01 AM PT
Well, the subcon acquaintance I mentioned in Message #1368 just e-mailed me again and said that I shouldn't take his word as gospel since his Hindi is not great.
Anyway, all I know is that in Urdu, "Punjabi" is written with an alif.
1375. IrvingSnodgrass - Jan. 28, 1999 - 9:03 AM PT
Wrt my Message #1373, Yamuna Kachru, in her article on Hindi-Urdu in Comrie, consistently uses a schwa, not a caret, to denote the short a in Hindi and Urdu. She also confirms PE's observation that in Urdu orthography, an alif following a consonant is always pronounced as a long vowel.
1376. marjoribanks - Jan. 28, 1999 - 9:06 AM PT
It's a short A, without a doubt.
1377. hashke - Jan. 28, 1999 - 9:10 AM PT
Dan:
Correction: read ` as *pharyngeal*. Laryngeal is a stop denotable by ', or a fricative *h*, as distinguished from fricative pharyngeal *H*.
1378. IrvingSnodgrass - Jan. 28, 1999 - 9:16 AM PT
Marj, PE:
Even if it's a short a, I think Panjabi is a better transliteration, for phonemic reasons. There is no phonemic schwa or caret in Indic languages.
"Punjabi" is fine for English speakers, but it smacks of the same colonial language which gave us Ooty and Pondicherry (and, as was mentioned earlier, "Poona" for Pune).
1379. hashke - Jan. 28, 1999 - 9:17 AM PT
Dan:
I know that the upside down 'e' is used to represent the schwa, but since we can't use it here I was suggesting an alternative.
Well, you and I could agree on an Arabic transliteration system were we to have much discussion about it, eh? Choose your weapons, amigo.
1380. IrvingSnodgrass - Jan. 28, 1999 - 9:19 AM PT
Isn't anyone going to give my language quiz in Message #1308 a try? I'll post the answers tonight.
1381. IrvingSnodgrass - Jan. 28, 1999 - 9:23 AM PT
Hashké:
I hope you and Dan can come up with a workable system of transliteration for Arabic. This is certainly the place to do it. Go for it.
1382. DanDillon - Jan. 28, 1999 - 9:34 AM PT
Well, hashke, we have Irv's blessing. (And maybe he'll help us along if we become misguided somewhere.) I figure such a task would be most easily undertaken if we just go down the Arabic alphabet and assign letters or symbols as appropriate. (Alif, Ba, Noon, etc.) The letter/sound correspondences in Arabic are fairly constant, thankfully. (Or, at least they are when you hold it up against a crazy language like, oh, I dunno, say, English?)
Would you like to begin? (I need to get through Rhys' *Wide Sargasso Sea* as I contribute here today, so I'll pop in sporadically.)
1383. IrvingSnodgrass - Jan. 28, 1999 - 9:40 AM PT
Hahaha. Don't expect me to help. My knowledge of Arabic is limited to Indonesian loanwords and Islamic ritual phrases ("Minal aidin wal faizin"). Oh yes, I know how to write my name.
1384. DanDillon - Jan. 28, 1999 - 10:18 AM PT
Oh.
One of the few, certainly, that you haven't ascertained, my good man?
1385. hashke - Jan. 28, 1999 - 11:15 AM PT
Dan:
I did a suggested transliteration, hit 'post this message' and got only the word 'slate'. So, it was all lost. Some other time!
1386. pseudoerasmus - Jan. 28, 1999 - 11:23 AM PT
According to S. Sangaji's giant "Urdu-English Dictionary", the word distribution in Urdu by origin is as follows:
30% Arabic
30% Persian
25% Hindi
5-8% Turkish
2-5% Greek
I find this very hard to believe, because spoken Hindi is more comprehensible to an Urdu speaker than this lexical distribution implies. However, it is possible that:
(1) just as the most common words used in English are Germanic in origin, the most common words used in Urdu, especially spoken Urdu, are Indic and not Iranic or Semitic;
or (2) the colloquial lexicon -- without the Perso-Arabic literary lexicon (which this dictionary includes) -- reflects a more Indic etymology;
or (3) Hindi itself borrowed a lot of Persian and Arabic words so that many Urdu words deemed Arabic or Persian in origin reflect an overlap with Hindi.
But I have no idea whether (3) is true.
Also, except for "marmar" (marble), I can't think of any Greek words in Urdu. I was rather surprised that Greek loanwords were numerous enough in Urdu as to be counted.
1387. DanDillon - Jan. 28, 1999 - 12:04 PM PT
hashke,
That's happened to many of us lately. Dunno what the mushkeel is. Pain in the ass. Another time.
1388. IrvingSnodgrass - Jan. 28, 1999 - 6:30 PM PT
PE:
Wrt your Message #1386, I think all three of your factors are part of the true story, plus (and this is just a guess) some good old-fashioned sub-con exaggeration thrown in.
Your (3) is very true. There is a very large component of Arabic and Persian loan words in Hindi, such as "uunt" on my brief list above.
I also don't know of any major influx of Greek words into Hindi/Urdu, and a figure of 2-5% seems absurd. The 5-8% for Turkish is also very suspect. In fact *all* the numbers are suspect. I would guess it's more like this:
60% descended from Indo-Aryan roots
20% Arabic
10% Persian
8% English (There are more words from English in Hindi-Urdu than a casual glance might indicate)
2% All Other (Turkish, Greek, Dravidian, etc.)
One final note: your mention of "marmar" in Hindi/Urdu reminds me of the interesting paths words take to get into languages. Indonesian has "marmer," but borrowed it from Dutch (which, of course, borrowed it from Greek). I wonder if there was an intermediary language in Hindi/Urdu's case (Persian or Arabic?).
1389. IrvingSnodgrass - Jan. 28, 1999 - 6:52 PM PT
Some common loan words in Hindi/Urdu (taken from the opening chapters of my "Hindi for Idiots" textbook):
Arabic
vakt 'time'
kitaab 'book'
madarsaa 'school'
akhbaar 'newspaper'
aadhiiraat 'midnight'
subah 'early morning'
almaarii 'cupboard'
maalik 'employer'
vakiil 'lawyer'
Persian
takht 'throne'
anguur 'grape'
English
klass 'class'
sekand 'second (time)'
minat 'minute'
kap 'cup'
tamaatar 'tomato'
kaard 'card'
skuul 'school'
kaalej 'college'
prinsipal 'principal'
hedmaastar 'headmaster'
Portuguese
kamiij 'shirt'
anaanaas 'pineapple'
Note: This was done entirely unscientifically, and depends solely on the words I happened to notice (meaning I had to already know them in the source languages, most of which I don't speak).
Btw, my textbook indicates pañch ('five') has a short "a."
1390. pseudoerasmus - Jan. 28, 1999 - 7:13 PM PT
Message #1388
No, after looking more closely through the dictionary, I disbelieve your distribution and believe Sangaji's -- except the Greek part, for I only saw about 5 words Greek in origin. On what basis do you come up with your distribution anyway? Out of the blue?
Sangaji labels the origin of almost every word in the dictionary. Noun after noun after noun was borrowed from Persian, whereas verb after verb after verb is from Arabic and to a lesser extent Turkish. Sometimes whole pages go by without any words of Indic origin at all. For instance, under the letter khe, I can barely find words of Indic origin. Which is sort of strange, since so many common words which I'm sure must also be used in Hindi are listed as derived from Persian or Arabic: khareednaa (to buy), khabar (news), khoosh (happy), kharaab (bad). There are of course many duplicates, some rare, others literary, others simply "alternate".
All the same, I might agree with your distribution if it were limited to the _colloquial_ language.
By the way, did you know that the word "Urdu" itself is a Turkish word? It means "army camp".
1391. pseudoerasmus - Jan. 28, 1999 - 7:20 PM PT
Message #1389
Why do you keep referring to a "Hindi/Urdu"? The term is appropriate grammatically, but probably not lexically.
To wit, two of those words you list aren't even right for Urdu. A cup is "saak" and class is "dafa".
1392. IrvingSnodgrass - Jan. 28, 1999 - 7:23 PM PT
PE:
My distribution is just a guess, based on what I know of borrowing. I would guess that many of the Arabic/Persian words Sangaji lists are rarely used, if at all, and the vast majority of Arabic and Persian words are restricted to the literary language.
The reason the words in the "kh" section are all borrowed is the phoneme "kh" does not occur naturally in Hindi/Urdu... it's a borrowed sound.
Are you saying that you agree with Sangaji's 0% for English? Or does he merely fail to list any English words in his dictionary? There are many in Hindi/Urdu (car parts, for one small example).
1393. IrvingSnodgrass - Jan. 28, 1999 - 7:27 PM PT
HPE:
Hindi and Urdu are two varieties of a single mutually intelligible language. The common spoken tongue is referred to as "Hindustani."
1394. IrvingSnodgrass - Jan. 28, 1999 - 7:30 PM PT
That should be addressed to "PE"... I have no idea where the H came from.
1395. pseudoerasmus - Jan. 28, 1999 - 7:32 PM PT
Message #1392
But there are so many Urdu words of Arabic and Persian origin which are common, in everyday use. I don't know whether the same words are used in Hindi, but many of them must be. So, while I agree that Arabic and Persian words are strongly represented in the literary language, I do not at all agree that the "vast majority of Arabic and Persian words are restricted to the literary language". The claim is ludicrous prima facie.
"Are you saying that you agree with Sangaji's 0% for English? Or does he merely fail to list any English words in his dictionary?"
Sangaji certainly does label English loanwords, but for some reason does not include them in the frequency distribution table.
1396. pseudoerasmus - Jan. 28, 1999 - 7:35 PM PT
Message #1393
Sigh. When you make assertions about the lexicon, as opposed to the grammar, it is inappropriate to talk about Urdu as though it were the same as Hindi. So you don't have to keep repeating the linguist's mantra about Hindustani as though it were some kind of revelation.
1397. IrvingSnodgrass - Jan. 28, 1999 - 7:45 PM PT
PE:
Hindi has a very large number of loan words from Arabic and Persian as well. But a "very large" number of loan words still wouldn't begin to approach Sangaji's figures for Urdu.
If Sangaji claims that 65-68% of Urdu vocabulary comes from Arabic, Persian and Turkish, then the "vast majority" of them indeed must be found in the literary language, since the spoken varieties of Hindi and Urdu are mutually intelligible, and no more than 20% of Hindi vocabulary is of Arabic/Persian origin, and a minimum of 80% intelligibility is required for languages to be mutually intelligible.
I repeated the linguist's mantra because you asked why I "keep referring to a "Hindi/Urdu"."
1398. IrvingSnodgrass - Jan. 28, 1999 - 7:55 PM PT
Grammatically, phonologically and morphologically, Hindi-Urdu is one language. The spoken language developed at one time and place (during the period of Muslim invasions and establishment of Muslim rule in India between the 8th and 10th centuries), and the vocabulary, from the start, had a strong component of Arabic and Persian loan words.
As literary languages, they diverge greatly, and lose their mutual intelligibility. Urdu has always looked to Arabic and Persian as sources for linguistic borrowings, while Hindi has drawn from Sanskrit, Prakrits and Apabhramsas. I am sure that literary dictionaries of the languages are significantly different.
But the common, spoken language remains mutually intelligible.
The above information is condensed from Kachru's article in Comrie. One interesting statement from the article:
"The development of prose, however, begins only in the eighteenth century under the influence of English, which marks the emergence of Hindi and Urdu as fully-fledged literary languages."
1399. pseudoerasmus - Jan. 28, 1999 - 8:21 PM PT
Double sigh. Some arithmetic.
Hindi/Urdu spoken lexical overlap: 80%
spoken Hindi lexicon: 20% Perso-Arabic
spoken Urdu lexicon: 60% Perso-Arabic
Now, to ease computation, suppose that spoken Hindi and spoken Urdu have 100 words each in their lexicons. It would be reasonable to assume that all Perso-Arabic loanwords which exist in spoken Hindi also exist in spoken Urdu. Thus, in order to maintain the 80% lexical overlap, Hindi can have 80 Indic words, 20 Perso-Arabic; Urdu then can have 60 Indic words and 40 Perso-Arabic words. That's 40% frequency of Perso-Arabic in spoken Urdu.
But now comes Mr. Sangaji claiming that the total Urdu lexicon, colloquial and literary, is 60% Perso-Arabic. If this is true, then the following must also be true. In addition to the 100 words in the Urdu colloquial lexicon, there must be another 50 words in the literary lexicon. That's about 33% of the total. 33% literary out of 60% Perso-Arabic is a majority, but hardly a "vast majority".
Now, if the Perso-Arabic component of Hindi is 25% rather than 20%, the calculation, redone to include Turkish loanwords in Urdu, would still produce more or less the same results.
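For anyone who wants to check or vary the figures, the arithmetic above can be sketched in a few lines of Python. This is only a toy model under the same assumptions as the post: 100-word spoken lexicons, every Perso-Arabic word of spoken Hindi also present in spoken Urdu, the non-shared spoken Urdu words all Perso-Arabic, and every additional literary word Perso-Arabic. The function name and parameter names are hypothetical, made up for illustration.

```python
from fractions import Fraction


def urdu_lexicon_model(spoken_size=100,
                       overlap=Fraction(80, 100),
                       hindi_pa_share=Fraction(20, 100),
                       total_pa_share=Fraction(60, 100)):
    """Toy model of the lexical arithmetic (all assumptions, not data)."""
    # Words common to spoken Hindi and spoken Urdu.
    shared = overlap * spoken_size
    # Perso-Arabic words in spoken Hindi, assumed to exist in Urdu too.
    hindi_pa = hindi_pa_share * spoken_size
    # Assumption: the non-shared spoken Urdu words are all Perso-Arabic.
    urdu_spoken_pa = hindi_pa + (spoken_size - shared)
    # Solve (urdu_spoken_pa + x) / (spoken_size + x) = total_pa_share for x,
    # where x is the number of extra, purely Perso-Arabic literary words.
    literary = ((total_pa_share * spoken_size - urdu_spoken_pa)
                / (1 - total_pa_share))
    total = spoken_size + literary
    return {
        "urdu_spoken_pa": urdu_spoken_pa,                 # 40 with defaults
        "literary_words": literary,                       # 50 with defaults
        "literary_share_of_total": literary / total,      # 1/3
        "literary_share_of_pa": literary / (urdu_spoken_pa + literary),  # 5/9
    }


if __name__ == "__main__":
    for name, value in urdu_lexicon_model().items():
        print(name, value, f"({float(value):.3f})")
```

With the default figures this gives 40 Perso-Arabic words in spoken Urdu and 50 extra literary words, so the literary words are a third of the total lexicon and 5/9 (about 56%) of the Perso-Arabic words. Changing `hindi_pa_share` to `Fraction(25, 100)` shows how sensitive the result is to that one guess.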
1400. pseudoerasmus - Jan. 28, 1999 - 8:23 PM PT
errata (Message #1399)
Delete the line containing the parameter, "spoken Urdu lexicon: 60% Perso-Arabic".