

English Department ARCHER 3


A Representative Corpus of Historical English Registers

Marianne Hundt joined the ARCHER consortium in 2000, when teams at Freiburg, Helsinki and Uppsala decided to expand the original ARCHER corpus (compiled by Douglas Biber and Edward Finegan), initially with the aim of making it more widely available outside the US. In 2005, Marianne Hundt moved to Zurich.

Gerold Schneider from the University of Zurich joined the ARCHER consortium in 2013. Marianne Hundt left the consortium in 2013.

Corpus Design

ARCHER covers the years 1600-1990 and currently contains a little over 3 million words. It is subdivided into 50-year periods. The original design was to sample 10 texts of around 2,000 words per register and period. The corpus includes both British and American English and covers both written and speech-based registers. The written registers represented in the corpus are newspaper reportage, journals/diaries, letters, fiction prose, legal opinion, medical writing, (other) science writing, and advertisements; the speech-based registers are drama, fictional conversation, and sermons/homilies.
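The sampling frame described above can be illustrated with a small sketch. It only restates the figures given in this section (1600-1990, 50-year periods, 10 texts of around 2,000 words per register and period). The function and constant names are hypothetical, and the exact period boundaries (periods starting in 1600, with a shorter final period ending in 1990) are an assumption made for illustration, not a statement of how the corpus files are actually organised.

```python
# Illustrative sketch only: names are hypothetical, and the period
# boundaries are an assumption (aligned on 1600, last period truncated).
START_YEAR, END_YEAR = 1600, 1990   # time span stated in the corpus design
PERIOD_LENGTH = 50                  # years per period
TEXTS_PER_CELL = 10                 # texts per register and period (original design)
WORDS_PER_TEXT = 2000               # approximate words per text


def periods(start=START_YEAR, end=END_YEAR, length=PERIOD_LENGTH):
    """Return (first_year, last_year) pairs covering start..end inclusive."""
    return [(y, min(y + length - 1, end)) for y in range(start, end + 1, length)]


def period_of(year):
    """Return the period that a given publication year falls into."""
    for first, last in periods():
        if first <= year <= last:
            return (first, last)
    raise ValueError(f"{year} lies outside {START_YEAR}-{END_YEAR}")


# Nominal target size of one register/period cell under the original design.
target_words_per_cell = TEXTS_PER_CELL * WORDS_PER_TEXT

if __name__ == "__main__":
    for first, last in periods():
        print(f"{first}-{last}")
    print("Nominal words per register and period:", target_words_per_cell)
```

Under these assumptions, a text dated 1776 would fall into the 1750-1799 period, and each register/period cell would nominally hold about 20,000 words. Actual cell sizes in the corpus may differ from this nominal target.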

Aims of the Project

The principal aim of stage 3 of the ARCHER project has been to enhance the representativeness of the corpus by complementing it with suitable new texts. Other important aims have been to provide detailed documentation of the texts included in the corpus and to add syntactic annotation. The corpus has been tagged and parsed in Zürich.

ARCHER 3.1 was coordinated from Heidelberg by Nadja Nesselhauf and Marianne Hundt (2004-2008). Since 2008, it has been coordinated from Manchester.


The following universities are currently members of the consortium. (Years in brackets indicate when a university joined the consortium.)
Lancaster University (since 2006; Paul Rayson)
Northern Arizona University (Douglas Biber)
University of Bamberg (since 2006; Manfred Krug [formerly Freiburg])
University of Trier (since 2009; Sebastian Hoffmann [formerly Zürich and Lancaster])
University of Freiburg (since 2000; Christian Mair, Bernd Kortmann)
University of Heidelberg (since 2004; Nadja Nesselhauf)
University of Helsinki (since 2000; Matti Rissanen, Arja Nurmi)
University of Manchester (since 2005; David Denison, Nuria Yáñez-Bouza [now also University of Vigo])
University of Michigan (since 2004; Anne Curzan [also formerly †Richard Bailey and Chris Palmer])
University of Salford (since 2008; Nick Smith [formerly Lancaster])
University of Santiago de Compostela (since 2008; Teresa Fanego, María José López-Couso, Belén Méndez-Naya)
University of Southern California (Edward Finegan)
University of Zürich (since 2005; Marianne Hundt [formerly Freiburg and Heidelberg], Gerold Schneider)
Uppsala University (since 2000; Merja Kytö)

For further information on ARCHER

Biber, Douglas, Edward Finegan and Dwight Atkinson. 1994. ARCHER and its challenges: Compiling and exploring A Representative Corpus of Historical English Registers. In U. Fries, P. Schneider and G. Tottie (eds.). Creating and using English language corpora. Papers from the 14th International Conference on English Language Research on Computerized Corpora, Zurich 1993. Amsterdam: Rodopi. pp. 1-13.
Yáñez-Bouza, Nuria. 2011. ARCHER past and present (1990-2010). ICAME Journal 35, 205-236.

Tagging and parsing ARCHER

Improving POS Tagging on historical corpora
