A Statistical Investigation into the Cross-Linguistic Distribution of Mass and Count Nouns: Morphosyntactic and Semantic Perspectives
Keywords: mass count, entropy, variance, distribution
Abstract: We collected a database of how 1,434 nouns are used with respect to the mass/count distinction in six languages; additional informants characterized the semantics of the underlying concepts. Results indicate only weak correlations between semantics and syntactic usage. In five out of the six languages, roughly half the nouns in the database are used as pure count nouns in all respects; the other half differ from pure counts over distinct syntactic properties, with fewer nouns differing on more properties, and typically very few at the pure mass end of the spectrum. Such a graded distribution is similar across languages, but syntactic classes do not map onto each other, nor do they reflect, beyond weak correlations, semantic attributes of the concepts. Considerable variability is seen even among speakers of the same language. These findings are in line with the hypothesis that much of the mass/count syntax emerges from language- and even speaker-specific grammaticalization.
License: Authors who submit to and publish with BIOLINGUISTICS agree to the following terms:
- The author(s) retain(s) copyright and grant(s) the journal the right of first publication with the work simultaneously licensed under a Creative Commons CC-BY License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in BIOLINGUISTICS.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., archiving a format-free manuscript in institutional repositories, on their personal website, or a preprint server such as LingBuzz, PsyArXiv, or similar) prior to and during the submission process, because we believe that this behaviour can lead to productive exchanges, as well as earlier and greater citation of published work (see The Effect of Open Access).