Book Chapter
Details
Citation
Wiegand V (2026) Representativeness and Corpus Sampling. In: International Encyclopedia of Language and Linguistics. 3rd ed. Elsevier, pp. 283–289. https://doi.org/10.1016/b978-0-323-95504-1.01324-7
Abstract
Corpus compilation involves a range of decisions to ensure that the resulting corpus is as suitable as possible to address the intended research question(s). Accordingly, the corpus should contain language use that is typical of the language variety, registers, or discourses that it is meant to represent. The article introduces the concepts of ‘representativeness’ and ‘corpus sampling’ and outlines how different types of corpora raise distinct considerations. It also argues that representativeness and corpus sampling are particularly relevant principles that should inform the development of future large language models.
| Status | Published |
|---|---|
| Publication date | 31/12/2026 |
| Publication date online | 30/06/2026 |
| Publisher | Elsevier |
| ISBN | 9780443157851 |
People (1)
Lecturer in Education (TESOL), Education