Fantasy Languages

Tolkien was the god of world-builders. He knew the legends and lore of every culture in Middle-earth, and he created fourteen different languages, complete with their own unique grammatical structures. We mere mortals cannot hope to compete with him, but we can at least do our humble best to create realistic languages for our fantasy cultures, where such is appropriate.

The first thing to keep in mind is that a fantasy language is not a conlang. It is, but it isn’t. Yes, it’s constructed by a human—you—but it’s supposed to resemble a real, living language, not the ideal to which language ought to strive.

In your world, this language is spoken by living, breathing people. They use it in their everyday life. Therefore, it should share features with actual languages in ways that are based on how languages have evolved over time.

One of the first things that comes to mind is irregular verbs. Every inflected language has irregular verbs. These tend to be the most used verbs; infrequent usage causes verbs to conform to more regular conjugations, which can be seen in English with the old irregular past tense holp giving way to the regular helped. In contrast, the verb to be, which is the most frequently used verb in the English lexicon, is also the most irregular. Similarly, to have, to go, to do, and to say are irregular verbs that see frequent use.

The reason behind this is that when we don’t have a firm memory of an irregular past tense or participial form to block the creation of a regularized past tense, we simply apply the rule we’ve learned, adding -ed to the end of the verb. But when a verb is in frequent use, the irregular forms are passed along from generation to generation, and remain in the lexicon.
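Here is a minimal sketch of that "memory blocks the rule" idea in Python. The names past_tense and IRREGULAR_PAST are my own inventions for illustration, and the spelling rule is deliberately crude (it ignores cases like stop/stopped).

```python
# A tiny "memory" of irregular past tenses; frequent verbs live here.
IRREGULAR_PAST = {"be": "was", "have": "had", "go": "went", "do": "did", "say": "said"}

def past_tense(verb, memory=IRREGULAR_PAST):
    """Use a remembered irregular form if there is one, else apply the -ed rule."""
    if verb in memory:
        return memory[verb]      # the remembered form blocks the rule
    if verb.endswith("e"):
        return verb + "d"        # rake -> raked
    return verb + "ed"           # mend -> mended

print(past_tense("go"))              # remembered irregular
print(past_tense("mend"))            # regular rule
print(past_tense("go", memory={}))   # a forgotten irregular gets regularized
```

Passing an empty memory shows what happens to a verb nobody remembers the irregular form of: the rule takes over.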

The frequency of irregular verbs in your language will depend upon that language’s history. A language like English, which borrows a great deal from foreign languages and freely turns other parts of speech into verbs, will have fewer irregulars, because newly adopted verbs almost always take the regular conjugation.

But how do they get there in the first place? In the case of to be, originally there were three versions of the word, with different shades of meaning and distributions of use, similar to Spanish ser and estar. Gradually the three were conflated into a single verb, and different forms took on different uses. With go, the past tense went was originally attached to the verb to wend. Other irregularities arise due to habits of pronunciation. It’s easy to imagine haves being mumbled and interpreted as has, and haved becoming had.

Linguistic drift is caused primarily by two forces: laziness on the part of the speakers, and a desire for clarity. People don’t want to put forth more effort than they have to when speaking, so sounds have a tendency to absorb and elide into neighboring syllables. For instance, the reason the plural of foot is feet is that back in the day, the plural was formed by adding -i, so foot became footi, but the -i at the end pulled the initial vowel towards it to make it easier to say, resulting in feeti. Then the final syllable fell off, but the vowel change was preserved as a marker of the plural. Similarly, slept is the past tense of sleep because it’s difficult to go from a voiced to an unvoiced consonant or vice versa, and adding a consonant to the end of the syllable lengthens it, causing the vowel to shorten in accommodation.
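The foot-to-feet story can be sketched as a chain of small changes applied one after another. This is only a toy in Python that works on my simplified spellings (footi, feeti), not on real Old English forms, and the function names are made up for the example.

```python
def add_plural_suffix(word):
    """Stage 1: the plural is formed by adding -i."""
    return word + "i"

def front_vowel(word):
    """Stage 2: a final -i pulls the stem vowel toward it (oo -> ee)."""
    if word.endswith("i"):
        return word.replace("oo", "ee")
    return word

def drop_final_vowel(word):
    """Stage 3: the final syllable falls off, leaving the vowel change behind."""
    return word[:-1] if word.endswith("i") else word

stages = ["foot"]
for change in (add_plural_suffix, front_vowel, drop_final_vowel):
    stages.append(change(stages[-1]))

print(" > ".join(stages))
```

Writing the changes as an ordered list makes it easy to experiment: swap the order of the last two steps and the vowel change never happens, which is a handy way to see why the order of sound changes matters in your own language's history.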

This phenomenon can also be seen in the regular -ed ending for past tense verbs. There are three different ways to pronounce it, based upon what comes before: d, t, or ǝd. If I raked the leaves, the final consonant is unvoiced; if I hailed a friend, it’s voiced. And then of course we have words like mended, where the vowel is pronounced. This rule is invisible on paper but quite audible when spoken, yet most native speakers aren’t cognizant of its existence. Knowing what rules of pronunciation underlie your made-up language can help make it that much richer and more realistic.
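The three-way -ed rule is simple enough to write down as a function. The sketch below is in Python and takes the final sound of the stem rather than its spelling; the sound labels ("sh", "ch", "th") are ad hoc stand-ins I chose for the example, not proper phonetic notation, and the VOICELESS set is not meant to be exhaustive.

```python
# Rough set of voiceless final sounds (ad hoc labels, not IPA).
VOICELESS = {"p", "k", "f", "s", "sh", "ch", "th"}

def past_tense_ending(final_sound):
    """Return how -ed is pronounced after a stem ending in final_sound."""
    if final_sound in {"t", "d"}:
        return "ǝd"   # mended: a vowel is inserted to keep the ending audible
    if final_sound in VOICELESS:
        return "t"    # raked: the ending matches the unvoiced consonant
    return "d"        # hailed: voiced consonants and vowels take d

for verb, final in [("rake", "k"), ("hail", "l"), ("mend", "d")]:
    print(verb, "->", past_tense_ending(final))
```

A made-up language could use the same shape of rule with different sound classes, for example an ending that changes after nasals or after long vowels.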

Irregular verbs often fall into patterns: ring, rang, rung; sing, sang, sung. These words can pull regular verbs into their orbit. One could imagine a new verb, pling, with a past tense of plang and a participle of plung. Why, then, is it mended rather than ment, when we have the example of rend and rent?

Though I have no experimental evidence to support this observation, it seems that when there’s an association with an already-existing word, irregular past tenses are avoided. For instance, ment would get confused with meant, and therefore is less likely to be produced. This is due to the second cause of linguistic drift: the desire to avoid ambiguity.

In any language, ambiguity will exist. Homophones are a thing. Puns are a humorous exploitation of linguistic ambiguity. But language is a vessel of communication, and it cannot serve its purpose if an utterance can be interpreted in many different ways. In most cases context will fill in the missing pieces. “Either this, or on this,” the Spartan mother says, as she hands her son his shield. The context of him going off to war supplies the remainder of the meaning: He must return either carrying the shield (implied by the accusative case of the first pronoun), or carried upon it.

Case is quite common in other languages, but only vestigial in English. We see it exclusively in pronouns. I write a blog post; someone reads my entry; they chastise me for my overly technical terms. In this example, I is in the nominative (subject) case, my is in the genitive (possessive) case, and me is in the accusative (objective) case. Proto-Indo-European, from which English descends, had eight cases, including dative (indirect object), vocative (object of address), locative (locational), and ablative (object of a preposition).

Declension of nouns and conjugation of verbs come from the remnants of old markers for person and tense. In English, the -ed ending likely originated in the use of the verb do to mark the past tense. We see similar things in will and go as future-tense markers, and have to mark the perfect tense. In Farsi, the verb to be is used with the participle to indicate perfect tense, and the verb to have indicates continuous action in either the present or the past, depending upon how it’s conjugated. For example, a sentence that looks like it says I have walk would translate to I am walking.

This use of common verbs as tense-markers is widespread. The most common verbs are also the most common markers of tense or intention: have, be, do, go, etc. Over time these verbs can get absorbed into the words that precede them and become inflected endings.

It’s good to keep this in mind when you’re coming up with the conjugations for your made-up verbs. What process led to those endings meaning that particular tense, person, and number? A fantasy example: the verb says going through a natural change from says-me to sez-ǝ-me and eventually becoming sezem. Says-you might become sez-ǝ-you and eventually sezay. Says-he is pronounced sez-hí and lands on sezi. Or if you’re using the stem, say-me becomes say-ǝ-me, later say-um, and finally saym; say-you changes to sayu; say-he is difficult to pronounce and in order to distinguish that middle consonant it changes to say-khí and then sayki.
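Once you've worked out how the pronouns wore down, the result is just a stem plus a small table of endings. Here is a minimal Python sketch of the two made-up paradigms above; the dictionary and function names are mine, and the endings are simply the end points of the erosion described in the paragraph.

```python
# Endings left behind after the pronouns eroded onto the verb.
ENDINGS_FROM_SAYS = {"me": "em", "you": "ay", "he": "i"}   # sez- paradigm
ENDINGS_FROM_STEM = {"me": "m", "you": "u", "he": "ki"}    # say- paradigm

def conjugate(stem, endings):
    """Attach each person's eroded ending to the verb stem."""
    return {person: stem + ending for person, ending in endings.items()}

print(conjugate("sez", ENDINGS_FROM_SAYS))
print(conjugate("say", ENDINGS_FROM_STEM))
```

Keeping the endings in a table like this also makes it easy to check that no two persons have collapsed into the same form, unless you want them to.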

This phenomenon can be observed in Farsi, which inflects prepositions. Beh man (به من), meaning to me, becomes behem (بهم). Beh to (به تو), or to you, becomes behet (بهت). Beh ou (به او), which means to him or to her, becomes behesh (بهش). These same endings can be used to indicate possession, since in Farsi possession is indicated by an ezafe (trailing -e) on the thing possessed followed by the possessor; ketab-e man (کتاب من), or my book, becomes ketabam (کتابم). Same with ketab-e to (ketabet) and ketab-e ou (ketabesh).

Declensions form in a similar manner. Speakers add markers to denote what the word is doing in the sentence. For instance, in Farsi, the direct object is often marked by the particle ra (را). For the first person pronoun, this changes from man ra (من را) to mera (مرا). In a way, mera has become the accusative form of man.

English marks the possessive with ’s and the plural with s, but this isn’t necessary. As mentioned before, Farsi marks possession with an ezafe. This grammatical construction also marks the relationship between a noun and the adjective that modifies it: ketab-e sefid (کتاب سفید), a white book. Since short vowels aren’t written, the ezafe doesn’t show up in text, and must be inferred. Farsi does have a plural marker, but it’s not used when a number is specified. Ketabha (کتابها) means books, but if there are two of them, it’s do ketab (دو کتاب), literally two book. Malay uses reduplication to indicate plural: orang means person, orang-orang means people. Other languages don’t have a plural marker, and instead require a counting word to specify when it’s plural: a book, many book. English does this with some words such as deer and fish, which take the same form in the singular as the plural. In these cases plurality can often be determined from context, and if not, must be specified.
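The plural strategies above can be compared side by side in a few lines of Python. This is a sketch with function names I made up; the Farsi and Malay words are the ones from the paragraph, romanized the same way.

```python
def plural_suffix(noun, count=None, marker="ha"):
    """Farsi-style: add a plural marker, but drop it when a number is given."""
    if count is not None:
        return f"{count} {noun}"     # do ketab, literally "two book"
    return noun + marker             # ketabha

def plural_reduplication(noun):
    """Malay-style: say the noun twice."""
    return f"{noun}-{noun}"          # orang-orang

def plural_counting_word(noun, quantifier="many"):
    """No plural marker at all: a separate word carries the plurality."""
    return f"{quantifier} {noun}"    # many book

print(plural_suffix("ketab"))
print(plural_suffix("ketab", count="do"))
print(plural_reduplication("orang"))
print(plural_counting_word("book"))
```

Picking one of these (or mixing them, the way English mixes -s with deer and fish) is a quick way to make a made-up language feel less like English in disguise.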

In English, we’ve retained rudimentary declension in our pronouns. She likes to read her book; I spoke to him. But that isn’t necessary. Even the possessive ’s can be dropped while maintaining context. Centuries ago there was a pronoun ou which could refer to any gender. If I tell you that ou read ou book, there’s no ambiguity about what’s going on.

Marking the subject, object, and other nouns makes the meaning of the sentence more clear, but it needn’t be done with word endings. Grammatical function could be tacked on to the beginning of the word, or even change the pronunciation of a vowel or consonant in the middle. Or it could elicit no change at all. English marks grammatical function with word order. The subject comes first, then the verb, then the object. Linguists refer to this as subject-verb-object, or SVO, word order. Irish uses VSO construction; Farsi utilizes SOV. Some languages prefer to put the topic of the sentence first, with markers for the actor and actee. In the English sentence “The boy kicked the ball,” there’s no indication of which part is most important. In Latin, however, case endings allow us to put the ball first if we want to emphasize what was being kicked, or the verb first if the action of kicking is what we’d like to highlight. Most Latin sentences take SOV word order like Farsi, but their structure is more flexible than that of English.
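If you want to hear how a different word order feels, a tiny Python helper can shuffle the same three pieces around. The function name arrange and its order parameter are assumptions of mine for the sake of the example.

```python
def arrange(subject, verb, obj, order="SVO"):
    """Put subject, verb, and object in the given order, e.g. 'SOV' or 'VSO'."""
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError("order must use each of S, V, and O exactly once")
    parts = {"S": subject, "V": verb, "O": obj}
    return " ".join(parts[slot] for slot in order)

for order in ("SVO", "VSO", "SOV"):
    print(order, "->", arrange("the boy", "kicked", "the ball", order))
```

Note that without case endings the SOV and VSO versions are ambiguous about who kicked what, which is exactly why languages with freer word order tend to mark their nouns.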

Even in fully inflected languages, there can be some confusion between forms. An ending of -is in a second declension Latin noun might indicate either dative (indirect object) or ablative (object of a preposition) plural. In these cases, meaning is determined by context. Just like in English we can tell that a dove that landed on a branch is a different word from a swimmer who dove into a pool, I know that if I give the pueris a gift it’s dative but if the gift was given by the pueris it’s ablative.

This can also happen with helping words. In Farsi, the perfect tense (rendered in English as has done) is formed with the participle followed by the copular verb (conjugated form of to be). We see this in French as well: I have arrived is rendered je suis arrivée, or I am arrived. But Farsi also uses this same construction when describing states of being. I am sitting becomes neshasteh hastam, because hastam means am and the participle neshasteh describes the state of being rendered in English as sitting. (When I learned this in Farsi class, I looked at the teacher and asked, “So it’s just the participle used as an adjective followed by the copular verb?” To which one of my classmates responded, “Do you even have any idea what you just said?”) In English, there’s similar confusion. “I am sitting” could either mean that I’m in a state of being with my butt in a chair, or that I’m currently in the process of lowering myself into said chair.

Another thing to keep in mind is the natural human tendency to assign non-arbitrary meaning to sound. Most utterances are arbitrary in meaning: a rose by any other name would smell just as sweet. Onomatopoeias are less arbitrary, though a pig doesn’t actually say oink nor a cat meow. (Interestingly, cats in different countries have different accents, which may in part explain the differences in cat sounds between different languages.)

There are, however, certain universals in sound. In English, vowels pronounced higher and further front tend to be used to indicate the stems of verbs and the singular forms of nouns, as well as anything that indicates selfness or presentness. Vowels that are lower and toward the back are used for marked forms such as past tense and plural. This can be seen in the distinction between sing, sang, and sung, or between here and there, this and that, or me and you. In Turkish, I is ben and you is sen, which don’t have the vowel distinctions but do share consonant distinctions with some Indo-European languages.

Vocabulary is also only semi-arbitrary. Related words often share similarities: mother, father, sister, brother. Others are formed from related words, such as grandmother. Some relationships are harder to pierce than others. For instance, there’s no sense of standing in the verb to understand; yet it still maintains the irregular conjugation of its formative verb. English wink becomes Farsi cheshmak, which is just the word for eye, cheshm, followed by a suffix that ends in a voiceless stop. It’s no coincidence that a closure of the eye is described with a closure of the throat.

Prefixes and suffixes often come from actual words, such as -able. If something is legible, it’s able to be read (from the Latin legere, meaning to read). These forms may also change along with their accompanying word. For instance, something is insoluble or indispensable, yet impossible to discount. In Arabic, the prefix al- (ال) is used to denote the definite article, but changes pronunciation based on the following word. Mullah Nasr-al-din (نصرالدین) becomes Nasruddin in common parlance. Knowing how your language absorbs sounds to make things easier to pronounce—and how foreigners might mess this up—can add nuance to your story and worldbuilding.
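The in-/im- alternation is a nice small example of a prefix bending to fit what follows. Here is a rough Python sketch working on spelling only; it covers just the change before p, b, and m, and the function name is my own.

```python
LIP_SOUNDS = ("p", "b", "m")   # sounds made with the lips pull n toward m

def negate(word):
    """Attach the prefix in-, letting it become im- before p, b, or m."""
    if word.startswith(LIP_SOUNDS):
        return "im" + word
    return "in" + word

for word in ("soluble", "dispensable", "possible"):
    print(negate(word))
```

For a made-up language, you'd swap in your own prefix and your own list of sounds that trigger the change. A foreigner who hasn't absorbed the rule might apply the plain prefix everywhere, which is one way to signal an accent on the page.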

Prepositions and adverbs can be subsumed into a verb to change its meaning. A program puts out a result, or outputs it. Your language might render this verb as putouts instead. Something that caused a hangup in the past might have hungup the process or hangupped it, depending on how your language forms new verbs.

Then there’s the fact that no language is a homogeneous edifice, unless it’s only spoken by a single tribe. Each new generation changes the language in their own unique way. This gives rise to regional dialectal differences. For instance, the dialect of the Southern United States contains a near future tense formed by the helping verb “fixing,” as in “I’m fixing to go to the store. While there, I might could buy me some milk.” That me serves the same function as the middle voice in Greek, which indicates an action done on one’s own behalf; such nuance in meaning is impossible to convey in the standard American English dialect. For all that people dismiss Southerners as ignorant, uneducated hicks, their dialect, far from being ungrammatical, contains grammatical nuance that can’t be found in the “standard” version of the language.

Pronunciation, vocabulary, and even grammar can vary from region to region. The differences between dialects show the first steps in two languages splitting off from a shared root. Communication between the populations will slow the linguistic drift, and there might even be borrowing of forms back and forth, keeping the dialects mutually intelligible; but no matter how many times someone with an English degree insists that might could is ungrammatical and I shouldn’t use it in my writing, I know that it’s a perfectly acceptable expression. And I might know that withdrawal doesn’t have an r in the final syllable, but that doesn’t stop me from pronouncing it like it does. Other differences include the use of reckon for think, as in “I reckon it might rain later today,” and pronouncing naked like nekkid. (If you’re now imagining me speaking with a thick Texas drawl, don’t; I only sound like that when I’m around my family. Most of the time I have a very neutral accent, with the exception of a few words.)

All these are things to keep in mind when constructing your language. The language of the court might not be exactly the same as the language spoken by the working class, which might itself be different from that of rural farmers. None of these dialects are grammatically superior to any other. A Brit saying “the government are” is neither more nor less correct than an American saying “the government is,” the same way that it’s just as accurate for a Frenchman to say “je suis arrivé” as it is for an English speaker to say “I have arrived.”

A final consideration: What sounds does your language use? Sound clusters? Russian is perfectly happy to produce strings of consonants as in v Moskvye. It can even switch between voiced and unvoiced consonants in the middle of a string. English, on the other hand, doesn’t like changing between voiced and unvoiced consonants, and avoids longer strings. Farsi avoids consonant clusters even more; Maz Jobrani has a great bit about how, when Persians say the word “gangster,” it becomes more like four syllables. Ski becomes eski, student becomes estudent. In Greek, the Persian Akhashuersha is rendered as Xerxes (Kserkses). In English we pronounce it Zerkseez, since we can’t start a word with the “ks” sound. What your characters find difficult to pronounce will probably differ from what we as English speakers find difficult to pronounce, but there should be some rationale to what sound clusters do and do not appear, and how sounds change to accommodate the surrounding vocalizations.
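The ski-to-eski pattern can be written as a one-rule repair. The Python sketch below only handles a word-initial s followed by a consonant, works on spelling rather than sound, and uses a function name I invented; clusters in the middle of a word (as in gangster) would need further rules.

```python
VOWELS = set("aeiou")

def break_initial_cluster(word, helper="e"):
    """Prefix a helper vowel when a word starts with s plus a consonant."""
    if len(word) >= 2 and word[0] == "s" and word[1] not in VOWELS:
        return helper + word
    return word

for word in ("ski", "student", "sit"):
    print(word, "->", break_initial_cluster(word))
```

Your own language's speakers might repair different clusters, or insert the vowel in the middle instead of at the front. What matters is that the repair is consistent.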

If you have any other questions about creating your own language, please don’t hesitate to hit me up. Language is my jam. I love to write about it, read about it, and talk about it. If there’s anything I didn’t cover in this post that you’re curious about, let me know.
Published on December 16, 2018 14:53 Tags: conlang, fantasy, grammar, language