Abstract
This study examines twenty majority-language and minority-language versions of Wikipedia (English, French, German, Spanish, Italian, Danish, Icelandic, Welsh, Gaelic, Scots Gaelic, Manx, Cornish, West Frisian, North Frisian, Saterland Frisian, Breton, Catalan, Galician and Sardinian). The results are used to compare the majority-language versions with the minority languages that compete with them. The statistics are gathered by a custom-designed software program that analyses several content metrics for each article (the number of words, the number of images, and the quantity and location of links). This allows a more accurate measurement of the contents of a language edition of Wikipedia than the raw article count that is usually given.
While studying Wikipedia in isolation is interesting in its own right, it is not difficult to see that the minority language editions of Wikipedia are quite small in their range of articles, as Wikipedia's own statistics amply demonstrate. The goal is to put forward a potential model for estimating and quantifying minority language material on the web.
Outlined in this study are: a definition of 'web presence' as a two-dimensional concept combining both breadth of coverage and depth of content; a formula to measure the presence of a particular language edition of Wikipedia and, by extension, the web; a formula to compare the presence of one language with that of another; a "language constellation" system that measures languages in meaningful groupings based on a real-world competition model; and a "tiered classification" model that uses presence values to predict and describe where a language's presence stands relative to other languages.
|Date of Award|2011|
|Supervisor|Daniel Cunliffe (Supervisor)|
- Linguistic minorities
- World Wide Web
- Minority language