Meaning, sound and vision, a defence of the non-arbitrariness of language  

Kunstformen der Natur (1904) by Ernst Haeckel
"Meaning, sound and vision, a defence of the non-arbitrariness of language" (2017) is a paper on sound symbolism by J. W. Geerinck. It builds on The Natural Origin of Language (2012), a book by Robin Allott on the origin of language.

Full text

‘Die Möwen sehen alle aus, als ob sie Emma hiessen.’ – "Das Möwenlied" by Christian Morgenstern

At the beginning of the 20th century the Swiss linguist Ferdinand de Saussure (1857 – 1913) stated in his canonical Course in General Linguistics (1916) that:

The bond between the signifier and the signified is arbitrary. Since I mean by sign the whole that results from the associating of the signifier with the signified, I can simply say: the linguistic sign is arbitrary (Saussure, 1959).

The assertion that the linguistic sign is arbitrary implies that, in the evolution of language, the naming of objects happened by chance and that there is no relation between the sounds of words and the things they represent. At first sight this does not seem illogical at all. There is, for example, nothing red about the word red, and the word big is itself rather small (Simner, Cuskley, & Kirby, 2015).

However, certain forms of spoken and written language indicate that there is a relationship between signifier and signified. It is, for example, hard to deny that the onomatopoeic word woof, when pronounced and written in English, mimics and refers to the sound of a dog barking. Likewise, when someone who does not speak English hears the interjection huh?, they will not fail to notice that it expresses a question of some sort.

Yet denying such a relationship is precisely what de Saussure does when he argues for the arbitrariness of the bond between signifier and signified, even in the case of interjections and onomatopoeia. Of onomatopoeic words he says that they are ‘never organic elements of a linguistic system’, that ‘their number is much smaller than is generally supposed’ and that even a ‘suggestive sonority’ in a word such as the French fouet (whip) is actually a ‘fortuitous result of phonetic evolution’, because fouet does not mimic the swishing sound of a whip but is derived from the Latin fagus (beech tree). De Saussure admits that there are some ‘authentic onomatopoeic words (e.g. glug-glug, tick-tock, etc.),’ but again he minimizes their importance: they are, firstly, ‘limited in number’; secondly, ‘chosen arbitrarily […], for they are only […] conventional imitations of certain sounds’ (de Saussure compares English bow-bow and French ouaoua for barking sounds); and thirdly, subject to phonetic and morphological evolution, taking on the character ‘of the linguistic sign in general, which is unmotivated’.

Next de Saussure treats interjections and states that they are ‘closely related to onomatopoeia’ but do not refute his thesis that the linguistic sign is arbitrary, because 1) ‘we need only compare two languages on this point to see how much such expressions differ from one language to the next (e.g. the English equivalent of French aïe! is ouch!)’ and 2) ‘many interjections were once words with specific meanings.’ As an example he gives the French mordieu!, which evolved from mort Dieu (‘God’s death’).

De Saussure concludes that ‘onomatopoeic formations and interjections are of secondary importance, and their symbolic origin is in part open to dispute’.

For several years, not a single scholar – linguist or otherwise – seemed to dispute Saussure’s findings. The world seemed to have forgotten that Wilhelm von Humboldt had in 1836 tried to demonstrate that sound and meaning were closely interrelated. He keenly remarked that the German consonant w, when placed at the beginning of a word, evoked a ‘vacillating, wavering motion’. He found this motion in the following series of words: wehen [to blow], Wind [wind], Wolke [cloud], wirren [to move chaotically] and Wunsch [wish]. To his credit, Humboldt was modest enough to concede that his generalization was perhaps just a coincidence or a Hineininterpretierung (a reading of meaning into the words), and he warned that the practice of associating sound and meaning is a ‘slippery path’. It is indeed not hard to come up with other German words which do not start with w- but nevertheless conjure a ‘vacillating motion’. Conversely, many words which start with w- do not evoke any such motion.

The world also seemed, rightly, to have discarded the pseudo-linguistic notions found in Plato’s dialogue Cratylus, where Cratylus, Socrates and Hermogenes discuss whether language is a system of arbitrary signs or whether words have an intrinsic relation to the things they signify. Hermogenes defends the position that language is arbitrary and vocabularies are a matter of mere convention. Cratylus posits that a name is either a natural name or not a name at all: a word is either the perfect expression of a thing or a mere meaningless sound. Socrates has the final say and unites their views.

Cratylus and Hermogenes illuminate the two positions: on the one hand the naturalists, i.e. believers in what is nowadays known as sound symbolism; on the other the conventionalists, who reject that theory. Taken to its extreme, the conventionalists’ argument goes something like this: if sound symbolism were true, if ‘the sound of a word affects its meaning, then you should be able to tell what a word means just by hearing it. There should be only one language.’ (Magnus, 2013). Because this is manifestly not the case, naturalists are believed to be wrong. The naturalists’ evidence has been scarcer, and it is the poets who have come to their aid, with evidence that is convincing but always anecdotal. As Anatoly Liberman (born 1937) notes: ‘Consider the English words “glow, gleam, glimmer, glare, glisten, glitter, glacier, and glide.” They suggest that in English the combination gl- conveys the idea of sheen and smoothness. Against this background, glory, glee and glib emanate brightness by their very form, glance and glimpse reinforce our conclusion (because eyesight is inseparable from light), and glib has no other choice than to denote specious luster, and, indeed, in the sixteenth century, when it became known in English, it meant “smooth and slippery.”’

But before Liberman there was John Wallis (1616 – 1703), who observed, in a whole list of initial consonant groups, that ‘wr shows obliquity or twisting: wry, wrong, wreck, and wrist, “which twists itself and everything else in all directions”’; ‘br points to a breach, violent and generally loud splitting apart: break, breach, brook’; and ‘thr indicates a violent movement: to throw, through.’

The problem with these two pieces of ‘evidence’ is that they are anecdotal on the one hand and valid only for English on the other. However, in 1909 the American linguist Leonard Bloomfield (1887 – 1949) moved the discussion out of the speculative arena and into that of science. He wrote his doctoral thesis on ablaut relationships in words such as sing, sang, sung. The introduction of his thesis still stays within the sphere of anecdotal speculation (Bloomfield, 1909):

‘If a word designating some sound or noise contains a high pitched vowel like i it strikes us as implying a high pitch in the sound or noise spoken of; a word with a low vowel like u implies low pitch in what it stands for. […] We need only think of bim! : bam! : bum! […]. Who would apply Bim! to the roar of a cannon. Bum! to the tinkling of a bell? And Bam! would better fit the bang of a fist on the Biertisch than either of the above noises.’

The particular importance of Bloomfield lies not in the above (which in a way covers the same ground as Wallis and Liberman) but in the fact that, in his study of ablaut, he itemized all the major roots in Germanic from the complete corpus and so cannot be accused of cherry-picking. ‘It therefore can't be said that he picked out certain words or phoneme combinations that supported his case and conveniently left out the others. He thereby made it possible for the first time to quantify the correlation, and this is the first step toward broadening the discussion from philosophy and speculation to real science.’ (Magnus, 2013)

Conventionalist critics might respond that what is true for the Germanic languages is not necessarily true for other language families. I’ll come back to that later. First I want to broaden the discussion a bit by quoting Bloomfield again, where he observes further consequences of his findings with regard to meaning and connotation.

‘[Sound symbolism’s] far-reaching effects on our vocabulary are surprising. It has affected not only words descriptive of sound [like bim, bam, bum] [...] but also their more remote connotative effects. A high tone implies not only shrillness, but also fineness, sharpness, keenness; a low tone not only rumbling noise, but also bluntness, dullness, clumsiness; a full open sound like a not only loudness, but also largeness, openness, fullness.’ (Bloomfield, 1909)

And lastly, Bloomfield points to another significant factor in the relationship between sound and meaning when analysing the places of articulation of different sounds.

‘Nor must the subjective importance of the various mouth positions that create the different vowel sounds be forgotten: the narrow contraction of i, the wide opening of a, the back-in-the-mouth tongue position of u are as important as the effect of these vowels on the ear of the hearer.’ (Bloomfield, 1909)

So what about the conventionalists’ criticism that Wallis, Bloomfield and Liberman have only provided examples from Germanic languages? Touché. I could provide examples from Romance languages and other families (except tonal languages such as Chinese, which rely on pitch to convey extra meaning). But this would be a tiresome activity, as there are hundreds of language families and, depending on the definition, five to eight thousand languages. No, to prove our point we need to step, with one foot, out of the realm of language.

In 1913 a certain Wolfgang Köhler (1887 – 1967) was invited by the Prussian Academy of Sciences to lead the first anthropoid station on Tenerife. His mission consisted of conducting experiments on animal cognition. He would later be associated with the Gestalt school of psychology, but his current claim to fame is a small experiment that he set up while on Tenerife, an island he and his family would not be able to leave for seven years due to the outbreak of the war.

Nowadays the experiment is known as the bouba/kiki effect, and its initial description is no more than a proverbial footnote in Köhler’s 1929 book Gestalt Psychology, in which Köhler tries to correlate ‘words’ […] to subjective experiences. Köhler introduces his experiment by appropriately mentioning a poet first: ‘The German poet Morgenstern once said of seagulls, that they “all look as though their name were Emma.”’ Köhler agrees. ‘The sound of “Emma”’, he says, ‘as a name and the visual appearance of the bird appear to me similar.’ We are then given the details of his experiment.

‘Another example is of my own construction: when asked to match the takete and maluma with the two patterns shown as Figs. 18 and 19, most people answer without any hesitation.’

What is that answer, you may wonder? Tellingly, Köhler fails to provide it, thinking it so obvious and inevitable that giving it is unnecessary. But as you may have guessed, the majority of respondents assign the nonsense word maluma to the rounded, amoeba-like, curvilinear, bulbous shape of figure 18 [left] and takete to the spiky, jagged, star-like shape of figure 19 [right].

Indeed, subsequent tests (there have been many; see Lockwood and Dingemanse, 2015) across languages, and even with toddlers as young as two-and-a-half years old – of course too young to read – have consistently yielded the same results. This suggests that the human brain is somehow able to extract abstract properties from shapes and sounds. The effect has also been shown to emerge when the words to be paired are actual first names, suggesting that some familiarity with the linguistic stimuli does not eliminate the effect. A recent study (Sidhu, Pexman and Penny, 2015) showed that individuals will pair names such as ‘Molly’ with round silhouettes, and names such as ‘Kate’ with sharp silhouettes. Moreover, individuals will associate different personality traits with either group of names (e.g., easygoingness with ‘round names’; determination with ‘sharp names’).

This is consistent with my own findings. Informal research that I have conducted indicates that when informants (high school students; I’m a teacher) are asked which of the two shapes is bouba and which is kiki, they give the usual responses. When I ask the additional question, ‘which shape is the “smart” shape and which is the “dumb” one’, kiki is usually designated as the smart one (remember, sharp in English also means intelligent) and bouba as the dumb one (likewise, dull means ‘not intelligent’). This suggests that shapes can be connected both to sounds and, connotatively, to affects, and it may hint at a role of abstract concepts in the effect.

There is one caveat. Individuals who have autism do not show as strong a preference. Compared to neurotypical individuals, individuals with autism agree only 56% of the time with the standard results. It has of course been previously noted that individuals with autism have a hard time interpreting figurative language such as double entendres and metaphors.

That this mapping of shapes, sounds and meanings is non-arbitrary speaks for itself. It is connected to the realm of metaphor as well. Ramachandran and Hubbard (2001) mention Lakoff and Johnson (1980), who ‘have systematically documented the non-arbitrary way in which metaphors are structured’ and how ‘a large number of metaphors refer to the body and many more are inter-sensory (or synaesthetic).’ Whorf (1956) observed universals of content: ‘in the psychological experiments, human subjects seem to associate the experiences of bright, cold, sharp, hard, high, light (in weight), quick, high-pitched, narrow, and so on in a long series, with each other; and conversely, the experiences of dark, warm, yielding, soft, blunt, low, heavy, slow, low-pitched, wide, etc. in another long series.’

Another observation – similar to the one Bloomfield offered above – can be made about the place and manner of articulation involved in the bouba/kiki effect. The rounded shape may most commonly be named bouba because the mouth makes a more rounded shape to produce that sound, while a more taut, angular mouth shape is needed to make the sound kiki. Alternatively, the distinction may be between coronal or dorsal consonants like /k/ and labial consonants like /b/.

Unfortunately, all of these observations are instances of covariation, no more than symptoms of a phenomenon, and what we are looking for now is an explanation of the phenomena observed. Why is it that we find covariations of shapes, sounds and meaning? How are these interrelated? From here on I will rely on a line of thought developed by the independent scholar Robin Allott (born 1926) in his ground-breaking The Natural Origin of Language (2012). Allott’s theory, which ‘brings together evidence […] drawing on linguistics, the physiology of perception, neurology, psychology and the philosophy of language and perception’ (Allott, 2012:7), is in many ways a forerunner of the enactivists who first entered the philosophical scene in the 1990s and became popular in the 2010s. The Natural Origin of Language takes a naturalist stance in the naturalist-conventionalist debate and argues against the arbitrariness of language professed by de Saussure.

It builds its arguments from two sources: science and philosophy. On the science side, the disciplines are linguistics (Noam Chomsky, Roger Brown, Edward Sapir), physiology of perception (Richard Gregory), neurology (Karl Lashley, Eric Lenneberg, Torsten Wiesel and David Hubel) and psychology (Konrad Lorenz). Drawing on the interdisciplinary and intertwined findings of these scientists, Allott does an admirable job of proving the naturalist hypothesis that language is intimately connected to action and perception.

He breaks his hypothesis down into eight propositions (Allott, 2012:63-5):

1) Bodily actions (acquiring food, escaping from enemies, fighting, reproducing) are central to the life of the animal and the human.
2) Perception develops in the service of bodily action, to increase the effectiveness of that action.
3) Perception must be fed to the action-organization: there must be direct and intimate contact between the structures controlling the organization of action and the structures of perception.
4) Perception is an active process; the integration of perception and bodily action has to be extraordinarily complete.
5) Language is a further stage in the development of effective bodily action: first a means for collaboration within the group, a pooling of perceptions, allowing them to travel across space, whether physical space or a space in time, to another individual or other individuals.
6) Since perception comes before language and is the more vital activity, the structure and contents of language must be derived from and dependent on the structure and contents of perception.
7) Perception is extended to perception of the individual's own bodily states and activities, making the structure and contents of language similar to external perception.
8) Since visual perception is a very important part of total perception, language is primarily structured by visual perception.

Allott summarizes as follows: ‘the vital needs of the organization of action have determined the character of perception; the role of language as a supplement to perception has meant that language has had its character determined by the characteristics of perception. Action structured perception and perception structured language.’ (Allott, 2012:66)

Surprisingly, the book does not mention the bouba/kiki experiment, but it does give considerable attention to sound symbolism, which is central to its hypothesis that the linguistic sign is not arbitrary and that language is interrelated with action and perception. It also concedes (Allott, 2012:302) that it has not entirely succeeded in refuting the conventionalist criticism that if naturalism were true, we would all speak the same language. However, it gives six solid arguments for why we use different words for different percepts in different languages (Allott, 2012:304).

Apart from the scientific explanation for the non-arbitrariness of language, Allott offers a philosophical explanation in a separate final chapter dedicated to philosophical issues. Earlier in the book Allott has already discussed the Cratylus dialogue. The empiricist John Locke (1632 – 1704) also features: in An Essay Concerning Human Understanding he denies the possibility of innate ideas and thus the possibility of sound symbolism, rejecting ‘any natural connection […] between particular articulate Sounds and certain Ideas, for then there would be but one Language amongst all Men’ and stating that ‘sounds […] have all their significance from the arbitrary imposition of men [...] a man may use what words he pleases to signify his own ideas to himself’.

Throughout the book the presence is felt of Ludwig Wittgenstein (1889 – 1951) who went from a position in which he was in complete agreement with the ‘naturalists’ to being a full-blown ‘conventionalist’.

Importantly, since perception is central to Allott’s theories, they bear on the philosophy of perception. This brings Allott to the two ‘destroyers of metaphysics’ (Allott, 2012:345), Hume and Kant, who have landed us in a ‘philosophical impasse’: Hume because he relegated beliefs about the outside world to mental habits and customs, Kant because he bifurcated the phenomenal and the noumenal. How does Allott lead us out of this impasse? By appealing to three thinkers, two philosophers and an ethologist: Alfred North Whitehead (1861 – 1947), Maurice Merleau-Ponty (1908 – 1961) and Konrad Lorenz (1903 – 1989).

In Whitehead’s theory of perception one finds the notion of prehension (the Latin term prehendere means ‘to grasp’). Whitehead introduced the term in Science and the Modern World (1925), defining it as ‘uncognitive apprehension’. It is easy to see why this term appeals to Allott: it represents an embodied manner of perceiving the world. In the words of Allott: ‘perception does not present us with isolated sensations; perception is experience from within nature of the system of events which make up nature’ (Allott, 2012:357). That same connection between external world and subject can be found in the work of the French philosopher Merleau-Ponty, whose views Allott describes as ‘strikingly similar’ (ibid.) to those of Whitehead. Perception is key to Merleau-Ponty’s philosophy, and his concept of the body-subject (French corps-sujet) runs parallel to Whitehead’s prehension. By foregrounding the human body, Merleau-Ponty stands at the beginning of embodied philosophy. A third exit from the impasse is proposed by Konrad Lorenz. ‘For him, both the perceiving subject and the object of perception are equally real and all knowledge derives from the interaction between them.’ (Allott, 2012:357)

For all three, there is an intimate connection between perceiver and the perceived.

But I’ve strayed beyond my point, which was to show that language is not arbitrary.

I give the final word to Robin Allott:

‘WE SPEAK AS WE SEE and WE SPEAK AS WE ACT.’ (Allott, 2012:57)


