Attachment 'cheatsheet.txt'

####################################################
# UNIX, TGREP2, & TDT - THE MOST IMPORTANT COMMANDS
# -- created by judith degen on 07/06/2009
####################################################

************************************************
* PART I: navigating the directory structure
************************************************

# Log onto the LSA server:
# ssh USERNAME@174.129.205.212
ssh lsa1@174.129.205.212

# Change your password!
passwd

# Show the contents of your current directory. If you just logged on, this will be your home directory.
ls

# Show the contents of the directory /corpora/TDTlite.
ls /corpora/TDTlite

# Show the contents of your current directory in long format (permissions, owner, size, modification date).
ls -l

# Move to the directory /corpora/TDTlite
cd /corpora/TDTlite

# Show the contents of the directory /corpora/TDTlite/sample_project
ls sample_project

# Move back to your home directory.
cd

# Move one directory up.
cd ..

# Check where you are.
pwd
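
# A minimal sketch for practicing the navigation commands above without touching the server. The sandbox directory and the names inside it are made up for illustration; mktemp and mkdir -p are assumed to be available on your system.

```shell
# Build a throwaway directory tree (hypothetical names, not the real /corpora).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/TDTlite/sample_project"

cd "$sandbox/TDTlite"      # move into a directory
listing=$(ls)              # capture what ls shows here
echo "$listing"            # prints: sample_project
cd ..                      # move one directory up, back to the sandbox root
here=$(pwd)                # record where we are now
echo "$here"
```

# The sandbox lives in a temporary location, so it can be deleted or simply left for the system to clean up.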

# Figure out how to use a command.
# man COMMANDNAME
man cp

# Copy the sample_project directory to your home directory (~) and rename it.
cp -r /corpora/TDTlite/sample_project ~
mv ~/sample_project ~/myproject

# Create a directory.
mkdir mydir

# Remove the project directory from your home directory. BE VERY CAREFUL WITH THE rm COMMAND! Deleted files cannot be recovered.
rm -r ~/myproject
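
# A minimal sketch of the copy, rename, and remove steps above, run inside a throwaway sandbox so that nothing important can be deleted by accident. The directories "corpora" and "home" and the file notes.txt are stand-ins invented for this example.

```shell
# Throwaway sandbox; "corpora" and "home" are stand-ins, not the real directories.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/corpora/sample_project" "$sandbox/home"
echo "demo" > "$sandbox/corpora/sample_project/notes.txt"

cp -r "$sandbox/corpora/sample_project" "$sandbox/home"        # copy the directory and its contents
mv "$sandbox/home/sample_project" "$sandbox/home/myproject"    # rename the copy
ls "$sandbox/home/myproject"                                   # check the copy before deleting anything

rm -r "$sandbox/home/myproject"                                # delete only the copy
ls "$sandbox/corpora/sample_project"                           # the original is still there
```

# Listing a directory with ls right before running rm -r on it is a cheap way to confirm you are about to delete the right thing.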

# To copy the file myfile.txt from your home directory on the server to the current directory on your own computer, run this in a terminal on your computer, not on the server (Mac users; replace lsa1 with your own username):
scp lsa1@174.129.205.212:./myfile.txt .

# To copy the directory myproject from your home directory on the server to the current directory on your own computer (Mac users):
scp -r lsa1@174.129.205.212:./myproject .

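************************************************
* PART II: the basics of tgrep2
************************************************

# Run tgrep2 and search for all ADJPs. Without the -c option, tgrep2 uses the corpus named in the TGREP2_CORPUS environment variable.
tgrep2 "ADJP"
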
# Run tgrep2 on the Wall Street Journal corpus. Search for all ADJPs
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz "ADJP"

# The same, but print only terminals (the words)
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -t "ADJP"

# Print the entire sentence that contains each match
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -tw "ADJP"

# Save the output to a file
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -tw "ADJP" > adjp.txt

# View the contents of adjp.txt
less adjp.txt

# While viewing adjp.txt in less, search for "awesome" (press n for the next hit, q to quit):
/awesome

# Count the lines in adjp.txt
wc -l adjp.txt
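
# A minimal sketch of counting and searching without opening less. It uses grep, which is not covered in this cheatsheet, and a made-up three-line file in place of the real adjp.txt.

```shell
# Made-up stand-in for adjp.txt: one match per line.
demo=$(mktemp)
printf '%s\n' 'very awesome' 'quite happy' 'too awesome to ignore' > "$demo"

total=$(wc -l < "$demo" | tr -d ' ')   # number of lines; tr strips padding some systems add
hits=$(grep -c 'awesome' "$demo")      # number of lines that contain "awesome"
echo "$total lines, $hits containing awesome"
```

# Reading the file through < makes wc print only the number, without the file name after it.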

# Output the match ID in front of the match itself. \t is a special character that inserts a tab; similarly, \n inserts a newline
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -m '%xm\t%tm\n' "ADJP"
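
# Because the format above puts a tab between the ID and the match, the saved output can be split into columns afterwards. A minimal sketch using cut (not covered in this cheatsheet); the two lines of sample output, including the IDs, are invented placeholders and do not show what real tgrep2 IDs look like.

```shell
# Made-up stand-in for real tgrep2 output: one "ID<TAB>match" pair per line.
out=$(mktemp)
printf 'id-1\tvery nice\nid-2\tquite happy\n' > "$out"

ids=$(cut -f1 "$out")        # first tab-separated field: the IDs
matches=$(cut -f2 "$out")    # second tab-separated field: the matched words
echo "$ids"
echo "$matches"
```

# cut splits on tabs by default, which is one reason a tab is a convenient separator in the -m format.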

# Always use the -af options: they make sure all your matches are found, for example when there are multiple matches within one sentence

# Two ADJPs that are sisters: print the first one, a tab, then the second one. The pattern is in single quotes so that the shell leaves the $ alone.
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -m '%t=a1= \t %t=a2=\n' 'ADJP=a1 $ ADJP=a2'

# Create a MACRO file
vi

# In vi, there are a number of commands you can use:
# :w FILENAME - save file as FILENAME
# :q - quit
# :wq - save and quit
# 0 - move to start of line
# $ - move to end of line
# 1G - move to first line
# G - move to last line
# i - insert text before cursor, until <Esc> is hit
# a - insert text after cursor, until <Esc> is hit
# r - replace single character under cursor
# R - replace characters until <Esc> is hit
# x - delete character under cursor
# dd - delete entire current line
# yy - copy the current line
# p - paste the line(s) in the buffer into the text after the current line

# Create a macro @AA that contains the ADJP pattern from above:
i
@ AA	ADJP=a1 $ ADJP=a2;
<Esc>
:w MACRO.ptn
:q
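
# If you prefer not to use vi, the same one-line macro file can be written straight from the shell. A minimal sketch: the temporary working directory is an assumption made for this example, and the macro line is the one typed in vi above.

```shell
# Work in a throwaway directory so no existing MACRO.ptn is overwritten.
workdir=$(mktemp -d)

# Single quotes keep the shell from touching the $; \t becomes a tab, \n a newline.
printf '@ AA\tADJP=a1 $ ADJP=a2;\n' > "$workdir/MACRO.ptn"

cat "$workdir/MACRO.ptn"     # show what was written
```

# Note that > replaces the file's contents; use >> instead to append further macros to an existing file.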

************************************************
* PART III: regular expressions
************************************************

# Regular expressions in tgrep2 belong between //
# Probably the most useful one will be /^NODENAME_START/, which finds all nodes whose label begins with NODENAME_START
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz "/^ADJP/"

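# Special characters:
# ^ - start of string
# $ - end of string
# . - any character
# * - zero or more of the preceding character (outside of //, a bare * used as a node name matches any node)
# | - any of the strings separated by |

# A regular expression matches anywhere in the node label unless it is anchored with ^ or $:
# /^AD/ matches "ADJP", "ADJP-PRD", "ADVP", "ADVP-LOC", "ADVP-MNR"...
# /VP$/ matches "ADVP", "VP", "WHADVP"...
# /AD.P/ matches "ADVP", "ADJP", "WHADVP", "ADJP-PRD"...
# /ADVP|ADJP/ matches "ADVP", "ADJP", "WHADVP", "ADJP-PRD"...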
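
# The patterns above can be tried out on a list of node labels before running them on a corpus. A minimal sketch using grep -E as a stand-in; it assumes that tgrep2 treats these simple patterns the same way grep does, and the label list is a small hand-picked sample, not the full tag set.

```shell
# Hand-picked sample of node labels, one per line.
labels='ADJP
ADJP-PRD
ADVP
ADVP-LOC
VP
WHADVP
NP'

echo "$labels" | grep -E '^AD'         # ADJP, ADJP-PRD, ADVP, ADVP-LOC
echo "$labels" | grep -E 'VP$'         # ADVP, VP, WHADVP
echo "$labels" | grep -E 'AD.P'        # every label except VP and NP
echo "$labels" | grep -E 'ADVP|ADJP'   # every label except VP and NP
```

# Anchoring makes a difference: /^ADVP$/ would match only the label "ADVP" itself.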
