Corpus Usage Guidelines

Access Policies

To ensure compliance with the licenses for the various corpora we have installed, we have instituted the following policies.

  1. CompLing laboratory members are granted access to corpora solely for coursework and research projects in the context of their affiliation with the UW.
  2. Corpora may not be copied from the servers, nor used in commercial applications, unless permitted by the corpus license agreement.
  3. Many of the corpora have additional licensing conditions (see the CompLing Database). Before you access any particular corpus, you are responsible for reading and understanding its license. For LDC corpora, you should also read the general membership agreement.
  4. For some of the corpora, we must maintain a list of individuals granted access and/or have each user sign an individual license agreement. This is indicated in the "Restriction" column in the database. To access these corpora, you'll need to contact the lab director to obtain read permissions on the relevant directories.
  5. Whenever you use a corpus for coursework or for a paper, you should cite the corpus among your references. The proper citation information can be found in the license or README file of the corpus.
  6. Failure to follow these policies could result in loss of access to the corpora, or to the lab/servers in general.
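
As a small illustration of policy 4, the sketch below checks whether your account currently has read permission on a corpus directory before you try to use it. It is only a sketch under stated assumptions: the function name can_read_corpus and the path /path/to/corpus are hypothetical placeholders, not actual lab tools or locations. Look up the real directory for a given corpus in the database.

```python
import os

def can_read_corpus(path):
    """Return True if the current user can list and traverse the directory at `path`."""
    # A directory needs the read bit (to list entries) and the
    # execute bit (to enter it and open files inside).
    return os.path.isdir(path) and os.access(path, os.R_OK | os.X_OK)

# Hypothetical path, for illustration only.
corpus_dir = "/path/to/corpus"

if can_read_corpus(corpus_dir):
    print("readable:", corpus_dir)
else:
    print("not readable:", corpus_dir)
```

If the check reports that a restricted corpus is not readable, that most likely means you have not yet been granted access, and the next step is to contact the lab director as described in policy 4.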

Available Corpora

For a list of currently available corpora, along with their licensing and access information, see the CompLing Database.

(If your browser prompts you with a certificate warning, you need to install the UW root certificate.)

Requesting Additional Corpora

Lab members who would like access to a corpus listed as "Available" in the database should email linghelp@u to request that it be installed.

Lab members who would like access to a corpus not listed in the database should add it to CorpusWishList and send an email to Emily (ebender at u) with the request.

