####################################################
# UNIX, TGREP2, & TDT - THE MOST IMPORTANT COMMANDS
# -- created by judith degen on 07/06/2009
####################################################

************************************************
* PART I: navigating the directory structure
************************************************

# Log onto the LSA server:
# ssh USERNAME@174.129.205.212
ssh lsa1@174.129.205.212

# Change your password!
passwd

# Show the contents of your current directory. If you just logged on, this will be your home directory.
ls

# Show the contents of the directory /corpora/TDTlite.
ls /corpora/TDTlite

# Show the contents of your current directory in long format (permissions, owner, size, modification date).
ls -l

# Move to the directory /corpora/TDTlite
cd /corpora/TDTlite

# Show the contents of the directory /corpora/TDTlite/sample_project
ls sample_project

# Move back to your home directory.
cd

# Move one directory up.
cd ..

# Check where you are.
pwd

# Figure out how to use a command.
# man COMMANDNAME
man cp

# Copy the sample_project directory to your home directory (~) and rename it. Using ~ makes this work no matter which directory you are currently in.
cp -r /corpora/TDTlite/sample_project ~
mv ~/sample_project ~/myproject

# Create a directory.
mkdir mydir

# Remove the project directory from your home directory. BE VERY CAREFUL WITH THE rm COMMAND! It deletes immediately, and there is no trash can to restore from.
rm -r ~/myproject
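
# Practice run: a minimal sketch of the copy / rename / remove steps above, done inside a
# throwaway directory so that nothing real can be deleted. The directory names "corpora" and
# "home" and the file README.txt are made-up stand-ins, not the real server layout.

```shell
# Work inside a fresh temporary directory.
sandbox=$(mktemp -d)
cd "$sandbox"

# Build a stand-in for /corpora/TDTlite/sample_project (hypothetical contents).
mkdir -p corpora/sample_project
echo "placeholder" > corpora/sample_project/README.txt

# Create a directory that plays the role of your home directory.
mkdir home

# Copy the project into it, then rename the copy.
cp -r corpora/sample_project home/
mv home/sample_project home/myproject
ls home/myproject

# Remove the copy again; the original is untouched.
rm -r home/myproject
ls corpora/sample_project
```

# Once the sequence feels familiar, repeat it with the real paths from above.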

# To copy the file myfile.txt from your home directory on the server to the current directory on your own computer, run this on your computer, not on the server (Mac people: use the Terminal). Replace lsa1 with your own username:
scp lsa1@174.129.205.212:./myfile.txt .

# To copy the whole directory myproject from your home directory on the server to the current directory on your own computer (again run on your computer, with your own username):
scp -r lsa1@174.129.205.212:./myproject .

************************************************
* PART II: the basics of tgrep2
************************************************

# Run tgrep2 on the default corpus (the one named in the TGREP2_CORPUS environment variable). Search for all ADJPs
tgrep2 "ADJP"

# Run tgrep2 on the Wall Street Journal. Search for all ADJPs
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz "ADJP"

# The same, but print only terminals
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -t "ADJP"

# Print the entire sentence
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -tw "ADJP"

# Save the output to a file
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -tw "ADJP" > adjp.txt

# View the contents of adjp.txt
less adjp.txt

# Search inside adjp.txt for "awesome": while less is open, type the following and press Enter (n jumps to the next hit, q quits less)
/awesome

# Count the lines in adjp.txt
wc -l adjp.txt
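
# A minimal sketch of counting matches without opening less. tgrep2 itself is not run here:
# the three lines written to adjp.txt are made-up stand-ins for real output, so the commands
# can be tried on any machine.

```shell
# Work in a temporary directory so no existing adjp.txt is overwritten.
cd "$(mktemp -d)"

# Stand-in for saved tgrep2 output: one match per line (invented examples).
printf '%s\n' 'very awesome' 'quite large' 'awesome' > adjp.txt

# Count all lines in the file; reading from standard input prints only the number.
wc -l < adjp.txt

# Count only the lines that contain "awesome".
grep -c 'awesome' adjp.txt
```

# grep -c counts matching lines, not individual occurrences, so a line containing
# "awesome" twice is still counted once.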

# Print the match ID in front of the match itself. In the -m format string, \t inserts a tab and \n inserts a newline.
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -m '%xm\t%tm\n' "ADJP"
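
# Tab-separated output like this is easy to process with standard UNIX tools. The sketch below
# assumes the output was saved to a file called matches.txt (a made-up name); the IDs and
# phrases in it are invented stand-ins, not real corpus output.

```shell
# Work in a temporary directory.
cd "$(mktemp -d)"

# Stand-in for saved output: match ID, a tab, then the terminals of the match.
printf '1:5\tvery awesome\n2:3\tquite large\n4:7\tvery awesome\n' > matches.txt

# Show only the first column (the match IDs).
cut -f1 matches.txt

# Take the second column, then count how often each phrase occurs, most frequent first.
cut -f2 matches.txt | sort | uniq -c | sort -rn > counts.txt
cat counts.txt
```

# cut splits on tabs by default, which is why a tab makes a convenient separator in the -m format.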

# Always use the -af options. They make sure that all your matches are found, for example when there are multiple matches within one sentence.

# Find two ADJPs that are sisters; print the first one, a tab, then the second one
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -m '%t=a1= \t %t=a2=\n' "ADJP=a1 $ ADJP=a2"

# Create a MACRO file: start the vi editor
vi

# In vi, there are a number of commands you can use:
# :w FILENAME - save file as FILENAME
# :q - quit
# :wq - save and quit
# 0 - move to start of line
# $ - move to end of line
# 1G - move to first line
# G - move to last line
# i - insert text before cursor, until <Esc> is hit
# a - insert text after cursor, until <Esc> is hit
# r - replace single character under cursor
# R - replace characters until <Esc> is hit
# x - delete character under cursor
# dd - delete entire current line
# yy - copy the current line
# p - paste the line(s) in the buffer into the text after the current line

# Create a macro @AA that contains the ADJP pattern from above (type the following inside vi):
i
@ AA ADJP=a1 $ ADJP=a2;
<Esc>
:w MACRO.ptn
:q

************************************************
* PART III: regular expressions
************************************************

# Regular expressions in tgrep2 belong between //
# Probably the most useful one will be /^NODENAME_START/ - which finds all nodes that begin with NODENAME_START
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz "/^ADJP/"

# Special characters:
# ^ - start of string
# $ - end of string
# . - any single character
# * - zero or more of the preceding character (outside of //, a bare * stands for any node)
# | - any of the strings separated by |

# /^AD/ matches "ADJP", "ADJP-PRD", "ADVP", "ADVP-LOC", "ADVP-MNR"...
# /VP$/ matches "ADVP", "VP", "WHADVP"
# /AD.P/ matches "ADVP", "ADJP", "WHADVP"
# /ADVP|ADJP/ matches "ADVP", "ADJP" (and longer labels that contain them, such as "ADVP-LOC")
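
# The patterns above can be tried out without a corpus. This sketch uses grep -E as a stand-in
# for tgrep2's // matching, on the assumption that both treat these simple patterns the same
# way; labels.txt is a made-up file with one node label per line.

```shell
# Work in a temporary directory.
cd "$(mktemp -d)"

# One node label per line (a small hand-picked sample).
printf '%s\n' ADJP ADJP-PRD ADVP ADVP-LOC WHADVP VP NP > labels.txt

# Labels that start with AD.
grep -E '^AD' labels.txt

# Labels that end in VP.
grep -E 'VP$' labels.txt

# Labels that contain AD, any single character, then P.
grep -E 'AD.P' labels.txt

# Labels that contain ADVP or ADJP anywhere.
grep -E 'ADVP|ADJP' labels.txt

# Labels that are exactly ADVP or exactly ADJP: anchor both ends.
grep -E '^(ADVP|ADJP)$' labels.txt
```

# Without ^ and $ a pattern may match anywhere inside the label, which is why /AD.P/ also
# finds "WHADVP". Add the anchors whenever you need the whole label to match.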