####################################################
# UNIX - THE BASIC COMMANDS
# -- created by judith degen on 10/17/2009
####################################################

************************************************
* PART I: navigating the directory structure
************************************************

# Log on to slate:
# ssh USERNAME@slate.hlp.rochester.edu
ssh jdegen@slate.hlp.rochester.edu

# Change your password!
passwd

# Show the contents of your current directory. If you just logged on, this will be your home directory.
ls

# Show more information about each file (size, type, permissions).
ls -l

# Show the contents of the directory /p/hlp/tools/TDTlite.
ls /p/hlp/tools/TDTlite

# Move to the directory /corpora/TDTlite.
cd /corpora/TDTlite

# Show the contents of the directory /p/hlp/tools/TDTlite/sample_project.
ls /p/hlp/tools/TDTlite/sample_project

# Move back to your home directory.
cd

# Move one directory up.
cd ..

# Check where you are.
pwd

# Figure out how to use a command.
# man COMMANDNAME
man cp

# Copy the sample_project directory to your current directory with cp (the . stands for the current directory) and rename it with mv.
cp -r /p/hlp/tools/TDTlite/sample_project .
mv sample_project myproject

# Create a directory.
mkdir mydir

# Remove the directory you just created. BE VERY CAREFUL WITH THE rm COMMAND!
rm -r mydir

# To copy the file myfile.txt from your home directory on the server to your current directory on your computer (run this on your computer, not on the server):
scp jdegen@slate.hlp.rochester.edu:./myfile.txt .

# To copy the directory myproject from your home directory on the server to your current directory on your computer:
scp -r jdegen@slate.hlp.rochester.edu:./myproject .

************************************************
* PART II: the basics of tgrep2
************************************************

# Run tgrep2. Search for all ADJPs.
tgrep2 "ADJP"

# Run tgrep2 on the Wall Street Journal.
# Search for all ADJPs.
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz "ADJP"

# The same, but print only terminals.
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -t "ADJP"

# Print the entire sentence.
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -tw "ADJP"

# Save the output to a file.
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -tw "ADJP" > adjp.txt

# View the contents of adjp.txt.
less adjp.txt

# Search inside adjp.txt for "awesome" (type this inside less; press q to quit less):
/awesome

# Count the lines in adjp.txt.
wc -l adjp.txt

# Output the match ID in front of the match itself. In the format string, \t inserts a tab and \n inserts a newline.
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -m '%xm\t%tm\n' "ADJP"

# Always use the -af options. They make sure all your matches are found, for example when there are multiple matches within one sentence.

# Two ADJPs that are sisters: print the first one, a tab, then the second one.
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz -m '%t=a1= \t %t=a2=\n' "ADJP=a1 $ ADJP=a2"

# Create a MACRO file.
vi

# In vi, there are a number of commands you can use:
# :w FILENAME - save file as FILENAME
# :q - quit
# :wq - save and quit
# 0 - move to start of line
# $ - move to end of line
# 1G - move to first line
# G - move to last line
# i - insert text before cursor, until <Esc> is hit
# a - insert text after cursor, until <Esc> is hit
# r - replace single character under cursor
# R - replace characters until <Esc> is hit
# x - delete character under cursor
# dd - delete entire current line
# yy - copy the current line
# p - paste the line(s) in the buffer into the text after the current line

# Create a macro @AA that contains the ADJP pattern from above. Type i to start inserting, type the macro definition, hit <Esc>, then save and quit:
i
@ AA ADJP=a1 $ ADJP=a2;
<Esc>
:w MACRO.ptn
:q

************************************************
* PART III: regular expressions
************************************************

# Regular expressions in tgrep2 belong between //
# Probably the most useful one will be /^NODENAME_START/, which finds all nodes whose label begins with NODENAME_START.
tgrep2 -c /corpora/TGrep2able/wsj_mrg.t2c.gz "/^ADJP/"

# Special characters:
# ^ - start of string
# $ - end of string
# . - any character
# * - zero or more of the preceding character (outside //, a bare * used as a node name matches any node)
# | - any of the strings separated by |

# A pattern without ^ or $ can match anywhere inside a node label.
# /^AD/ matches "ADJP", "ADJP-PRD", "ADVP", "ADVP-LOC", "ADVP-MNR"...
# /VP$/ matches "ADVP", "VP", "WHADVP"...
# /AD.P/ matches "ADVP", "ADJP", "WHADVP"...
# /ADVP|ADJP/ matches "ADVP", "ADJP", and also longer labels that contain them, such as "WHADVP" or "ADJP-PRD"
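
# A quick way to check which labels a pattern matches, without access to a corpus.
# This is only a sketch: it uses grep -E (POSIX extended regular expressions) instead of
# tgrep2, on the assumption that both treat these simple patterns the same way. The label
# list and the helpers list_labels and match_labels are made up for illustration and are
# not part of tgrep2.

```shell
# Hypothetical label list (illustrative only, not read from any corpus).
list_labels() {
    printf '%s\n' ADJP ADJP-PRD ADVP ADVP-LOC ADVP-MNR VP WHADVP NP
}

# Hypothetical helper: print the labels matching the extended regex in $1,
# joined on one line with spaces.
match_labels() {
    list_labels | grep -E -- "$1" | paste -sd ' ' -
}

match_labels '^AD'            # labels that start with AD
match_labels 'VP$'            # labels that end in VP
match_labels 'AD.P'           # AD, any one character, P, anywhere in the label
match_labels 'ADVP|ADJP'      # labels that contain ADVP or ADJP
match_labels '^(ADVP|ADJP)$'  # anchored on both sides: exactly ADVP or ADJP
```

# The last call shows that anchoring with both ^ and $ restricts the match to the whole label.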