Validating the Logistic Model of Article Usage Preceding Multi-word Organization Names with the Aid of Computer Corpora

Grace Y.W. Tse

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

This paper aims to show that model validation is essential to ensuring the predictive accuracy of a statistical model. By extending the use of logistic regression analysis, it further demonstrates the value of logistic modelling of non-discrete linguistic categories in language performance. The technique is illustrated with a corpus-based study of the theory of the grammatical factors governing the oscillation between use and omission of the definite article before multi-word organization names (e.g. the Foreign Office, Mansfield College) in English. Validating the preliminary model on fresh corpora allows the final logistic model to capture more precisely the gradience in the grammatical factors that affect article usage before such names. Because the logistic model is a model of language in use rather than a purely statistical model, the paper also translates the regression coefficients into probability statements about how strongly a name favours the use of the definite article.
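The step from regression coefficients to probability statements can be illustrated with the standard logistic transform. The sketch below is only an illustration of that general technique, not the paper's model: the function name, the predictor names (`has_premodifier`, `head_is_common_noun`), the intercept, and every coefficient value are hypothetical and do not come from the article.

```python
import math


def probability_from_logit(intercept, coefficients, features):
    """Convert a linear predictor (log-odds) into a probability.

    intercept, coefficients, and features are hypothetical inputs;
    none of them are taken from the paper.
    """
    # Linear predictor: intercept plus the sum of coefficient * feature value.
    log_odds = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    # Standard logistic function maps log-odds to the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients for two made-up binary grammatical predictors.
example_coefficients = {"has_premodifier": 1.2, "head_is_common_noun": 0.8}
example_intercept = -0.5

# Estimated probability that a name with both properties takes "the".
p_article = probability_from_logit(
    example_intercept,
    example_coefficients,
    {"has_premodifier": 1, "head_is_common_noun": 1},
)
print(f"P(definite article) = {p_article:.3f}")
```

Under these invented values the log-odds are 1.5, so the sketch reports a probability a little above 0.8; a log-odds value of 0 would correspond to a probability of exactly 0.5, meaning neither use nor omission of the article is favoured.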

Original language: English
Pages (from-to): 287-313
Number of pages: 27
Journal: Literary and Linguistic Computing
Volume: 18
Issue number: 3
Publication status: Published - Sept 2003
