Revision 1 as of 2007-08-29 15:37:10
Corpora
Gigaword
- Chinese
Parsed Switchboard
You must be a member of the pswbd Unix group to access this corpus.
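To check whether your account already has the required access, you can test group membership from a script. The following is a minimal sketch, assuming a Unix host with Python 3 available; the helper name `current_user_in_group` is hypothetical and not part of any lab tooling. Only the group name `pswbd` comes from this page.

```python
import grp
import os


def current_user_in_group(group_name: str) -> bool:
    """Return True if the current process belongs to the named Unix group."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this machine.
        return False
    # os.getgroups() lists supplementary groups; os.getgid() is the primary group.
    return gid == os.getgid() or gid in os.getgroups()


if __name__ == "__main__":
    if current_user_in_group("pswbd"):
        print("You are in pswbd and should be able to read the corpus.")
    else:
        print("You are not in pswbd; ask an administrator to add you.")
```

Group changes usually take effect only in a new login session, so a freshly added member may need to log out and back in before this check passes.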
TGrep2able
Corpora that have been processed to make them usable with the TGrep2 tool. See [wiki:/HlpLab/CorpusTools/ Corpus Tools] for more info on TGrep2.
TIGER Corpora
Tiger2 Corpus
Treebanks
| Title | File | LDC Catalog number/Original name | Language | #word | #sentence | #story | Original format |
| Arabic Treebank Part 1 V3 | ATB1_V3/ | LDC2005T02 | Arabic | 145386 | | 734 | |
| Arabic Treebank Part 2 V2 | ATB2_V2/ | LDC2004T02 | Arabic | 144199 | | 501 | |
| Arabic Treebank Part 3 V1 | ATB3_V1/ | LDC2004T11 | Arabic | 340281 | | 600 | |
| Chinese Treebank V5.1 | ChineseTreebank5.1/ | LDC2005T01U01 | Chinese | 507222 | | 18782 | |
| Prague Dependency Treebank 2.0 | pdt_2/ | LDC2006T01 | Czech | 2000000 | | | |
| Danish Dependency Treebank V1.0 | ddt1.0/ | ddt-1.0.tar | Danish | | | 5540 | |
| NEGRA corpus V2.0 | Negra2.0/ | negra-corpus.tar.gz | German | | | 20602 | export/Penn Treebank |
| Merge files | mrg/ | | | | | | |