NB: This section focusses on the features available online. The corpora themselves (e.g. Bank of English, British National Corpus, Brown Corpus) are briefly described in the English corpora section.
- Search in books by word or phrase, and then browse relevant books online.
The archives listed below offer a variety of texts and smaller corpora for download. To search them with corpus analysis methods, you will normally need an offline text/corpus analysis tool, i.e. a concordancer. Alternatively, you may be able to carry out some simple analyses with online text analysis tools.
More than 5000 full-text, audio and (streaming) video versions of public speeches, sermons, legal proceedings, lectures, debates, interviews and other recorded media events.
A digital library of Internet sites and other cultural artifacts in digital form (text, audio, video).
Free online search (concordances and a range of interesting features).
Free access to texts in different formats (meta search in a number of archives).
Free download as well as online search (concordances), wide variety of languages.
Free download (e.g. complete works of Shakespeare).
All State of the Union addresses, provided by (transcripts, and since 1989 video clips as well).
Approx. 2,000 literary texts in html format.
This section lists a selection of simple text analysis tools that can be used online, i.e. without installation. These tools allow you to create, for example, concordances, word lists and text profiles from your own texts or from web pages of your choice.
- KWIC concordance for each word in the text.
- See also 'phrase extractor' section to build concordance with word clusters.
- Compares the text against well-known word lists (1000/2000 most frequent English words and others).
- Highlights words of different frequency bands in different colours.
- See also 'Unique Words Text Profiler' (finds all words which occur only once in a text).
- Returns a variety of word lists.
- KWIC concordance for all words in the text/web page.
- Frequency lists and other features.
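To make the terms used above more concrete, the following is a minimal Python sketch of what a KWIC (keyword in context) concordance and a frequency list are. It is not taken from any of the tools listed here; the function names `tokenize`, `kwic` and `frequency_list`, the simple tokenisation rule and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, keyword, width=3):
    """Return one line per occurrence of keyword, with up to `width` words of context on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            # Right-align the left context so that the keyword forms a column.
            lines.append(f"{left:>25}  [{tok}]  {right}")
    return lines


def frequency_list(tokens):
    """Return (word, count) pairs, most frequent first."""
    return Counter(tokens).most_common()


# Hypothetical sample text, used only to demonstrate the functions.
sample = "The cat sat on the mat. The dog sat on the cat."
tokens = tokenize(sample)

for line in kwic(tokens, "sat"):
    print(line)

print(frequency_list(tokens)[:3])
```

Real concordancers usually add sorting by left or right context, case handling and wildcard searches, which this sketch leaves out.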
This section lists software packages that are commonly referred to as concordancers. They provide a more comprehensive range of functions than the online analysis tools listed above (usually the creation of concordances, alphabetical and frequency word lists, comparison of word lists and other statistical functions). Most packages can be freely downloaded but require installation.
- For Windows and Linux.
- Reads text, html, and xml files.
- Main functions: concordances, citation of search term in its co-text, collocates, word clusters, frequency lists, text profiling through keyword lists.
- For Windows.
- Main functions: concordances, collocate search, frequency lists.
- For Windows.
- Creates a complete concordance for each word in a corpus and supports its publication as a web concordance.
- Other functions: individual concordances, citation of search term in its co-text, frequency lists, text profiling through keyword lists, and a range of other statistical functions.
- For Windows.
- Different from the other packages in that it focusses on the analysis of web pages.
- For Windows.
- Very comprehensive package.
- For Windows and Mac.
- Main functions: concordances, citation of search term in context, frequency lists.
- For Windows, Linux and Mac.
- Reads text, html, Word and Open Office files.
- Web spider facility for corpus creation directly from Internet sources.
- Main functions: concordances, citation of search term in context, frequency lists.
- For Windows.
- Very comprehensive package.
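Two of the functions mentioned for the packages above, collocates and word clusters, can be illustrated with a short Python sketch. It does not reproduce the method of any listed package: the function names `collocates` and `clusters`, the window size `span`, and the sample text are assumptions, and raw counts are used here where a real concordancer would typically also offer statistical association measures.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(tokens, node, span=2):
    """Count the words found within `span` tokens to the left or right of each occurrence of node."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            window = tokens[max(0, i - span):i] + tokens[i + 1:i + 1 + span]
            counts.update(window)
    return counts


def clusters(tokens, n=2):
    """Count every sequence of n consecutive tokens (word clusters, also called n-grams)."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical sample text, used only to demonstrate the functions.
text = "strong tea and strong coffee but powerful engines and strong tea"
tokens = tokenize(text)

print(collocates(tokens, "strong").most_common(3))
print(clusters(tokens, 2).most_common(2))
```

On a corpus of realistic size, function words such as "and" or "the" dominate raw collocate counts, which is one reason why packages of this kind usually filter or weight the results.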
This section focusses on corpus-related resources for the learning and teaching context.
Module on Using concordance programs in the modern foreign languages classroom
by Marie-Noëlle Lamy and Hans Jørgen Klarskov Mortensen.
Module on Corpus linguistics by Tony McEnery and Andrew Wilson.
English and German online dictionary based on newspaper corpus, with frequency of occurrence, explanation, grammatical information and more
A corpus-based pedagogical grammar of English
The following websites include resources and link collections generally related to corpus linguistics.
S.Braun (at) surrey.ac.uk
updated 03/06/06