Differences between revisions 1 and 2
Revision 1 as of 2007-08-29 15:37:10
Size: 935
Editor: lab1
Comment:
Revision 2 as of 2007-08-29 15:46:23
Size: 1278
Editor: lab1
Comment:

== Gigaword ==

 * Chinese

== Parsed Switchboard ==
You must be a member of the `pswbd` Unix group to access this corpus.
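
A quick way to check group membership before trying to read the corpus is sketched below. This is only an illustration using Python's standard `grp`, `pwd`, and `os` modules; the helper name `is_member` is hypothetical and not part of any lab tooling, and the check reflects the groups of the current login session (a newly granted group usually requires logging in again).

```python
import grp
import os
import pwd


def is_member(group_name, user=None):
    """Return True if `user` (default: the current process) is in `group_name`."""
    try:
        group = grp.getgrnam(group_name)
    except KeyError:
        # The group does not exist on this machine.
        return False
    if user is None:
        # Current process: effective GID plus supplementary groups.
        return group.gr_gid == os.getegid() or group.gr_gid in os.getgroups()
    try:
        account = pwd.getpwnam(user)
    except KeyError:
        # Unknown user name.
        return False
    # All groups the named user belongs to, including the primary group.
    return group.gr_gid in os.getgrouplist(user, account.pw_gid)


if __name__ == "__main__":
    # `pswbd` is the group named above; the result depends on the machine.
    print("pswbd access:", is_member("pswbd"))
```

If the check prints `False`, membership has to be granted by whoever administers the `pswbd` group.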

== TGrep2able ==
Corpora that have been processed to make them usable with the TGrep2 tool. See [wiki:/HlpLab/CorpusTools/ Corpus Tools] for more info on TGrep2.

== TIGER Corpora ==

== Tiger2 Corpus ==

Corpora

Treebanks

||Title||File||LDC Catalog number/Original name||Language||#word||#sentence||#story||Original format||
||Arabic Treebank Part 1 V3||ATB1_V3/||LDC2005T02||Arabic||145386|| ||734|| ||
||Arabic Treebank Part 2 V2||ATB2_V2/||LDC2004T02||Arabic||144199|| ||501|| ||
||Arabic Treebank Part 3 V1||ATB3_V1/||LDC2004T11||Arabic||340281|| ||600|| ||
||Chinese Treebank V5.1||ChineseTreebank5.1/||LDC2005T01U01||Chinese||507222||18782|| || ||
||Prague Dependency Treebank 2.0||pdt_2/||LDC2006T01||Czech||2000000|| || || ||
||Danish Dependency Treebank V1.0||ddt1.0/||ddt-1.0.tar||Danish|| ||5540|| || ||
||NEGRA corpus V2.0||Negra2.0/||negra-corpus.tar.gz||German|| ||20602|| ||export/Penn Treebank||
||Merge files||mrg/|| || || || || || ||

Corpora (last edited 2018-06-07 17:57:11 by dhcp-10-5-21-163)
