Wednesday, October 7, 2015

UNIX Basic Commands

LINUX ESSENTIALS



Index

  1. INTRODUCTION
  2. BASICS
  3. UNIX HELP
  4. FINDING THINGS
  5. PERMISSIONS & OWNERSHIP
  6. USEFUL COMMANDS
  7. JOB/PROCESS MANAGEMENT
  8. TEXT VIEWING
  9. TEXT EDITORS
  10. THE UNIX SHELL
  11. SIMPLE SHELL ONE-LINER SCRIPTS
  12. SIMPLE PERL ONE-LINER SCRIPTS
  13. REMOTE COPY
  14. ARCHIVING AND COMPRESSING
  15. SIMPLE INSTALLS
  16. DEVICES
  17. ENVIRONMENT VARIABLES
  18. EXERCISES


  • INTRODUCTION

  • Why UNIX?
    • Multitasking
    • Remote tasking ("real networking")
    • Multiuser
    • Access to shell, programming languages, databases, open-source projects
    • Better performance, less expensive (free), more up-to-date
    • Many more reasons

    How to get access

    UNIX variants
    • UNIX: Solaris, IRIX, HP-UX, Tru64-UNIX, Free's, LINUX, ...
    LINUX distributions
    • RedHat, Debian, Mandrake, Caldera, Slackware, SuSE, ...

    2. BASICS
      Syntax for this manual
        Remember the UNIX/LINUX command line is case sensitive!
        "$" indicates start of command
        "#" indicates end of command and start of comment
        The text in green or red monospace font represents the actual command. The "$" and "#" symbols are not part of it. The commands in red emphasize essential information for beginners.
        "<...>" or "my_..." refers to variables and file names that need to be specified by the user. The angle brackets "<...>" must be left out when typing the command, because the shell interprets "<" and ">" as redirection operators!

      Login from Windows:

      Login from Mac OS-X or LINUX
      • open terminal and type:

      • $ ssh <user_name>@<host_name>
        $ user name: ...
        $ password: ...

      Changing password:
        $ passwd # follow instructions

      Orientation
        $ pwd # present working directory
        $ ls # content of pwd
        $ ll # like ls, but provides additional info on files and directories (on many systems an alias for 'ls -l')
        $ ll -a # includes hidden files (.name) as well
        $ ll -R # lists subdirectories recursively
        $ ll -t # lists files sorted by modification time, newest first
        $ stat <my_file> # provides all attributes of a file
        $ whoami # shows which user you are logged in as
        $ hostname # shows which machine you are on

      Files and directories
        $ mkdir <dir_name> # creates specified directory
        $ cd <dir_name> # switches into specified directory
        $ cd .. # moves one directory up
        $ cd ../../ # moves two directories up (and so on)
        $ cd # brings you to the top level of your home directory
        $ rmdir <dir_name> # removes empty directory
        $ rm <file_name> # removes specified file
        $ rm -r <dir_name> # removes directory including its content; asks for confirmation only if 'rm' is set up to do so (e.g. aliased to 'rm -i') or files are write-protected, the 'f' argument turns confirmation off
        $ mv <name1> <name2> # renames directories or files
        $ mv <name> <path> # moves file/directory as specified in path
        $ cp <name> <path> # copies file/directory as specified in path (-r to include content in directories)
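The commands above can be tried safely in a scratch directory. This is only a minimal sketch: the names demo_dir, notes.txt, copy.txt, renamed.txt and demo_backup are made up for illustration, and it assumes that mktemp and touch are available.

```shell
# Work inside a throwaway directory so nothing important is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

mkdir demo_dir                  # create a directory
touch demo_dir/notes.txt        # create an empty file inside it
cp demo_dir/notes.txt copy.txt  # copy the file into the current directory
mv copy.txt renamed.txt         # rename the copy
cp -r demo_dir demo_backup      # copy a directory including its content
rm renamed.txt                  # remove a single file
rm -r demo_dir                  # remove a directory and everything in it
ls                              # only demo_backup is left
```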

      Copy and paste
      • Depends on local environment. Usually one of the following methods works:

      • Copy: Ctrl&Shift&c or right/middle mouse click
        Paste: Ctrl&Shift&v or right/middle mouse click

      Handy shortcuts
        $ . # refers to pwd
        $ ~/ # refers to user's home directory
        $ history # shows all commands you have used recently
        $ !<number> # reruns an old command by providing its ID number from 'history'
        $ up(down)_key # scrolls through command history
        $ TAB # completes path/file_name
        $ SHIFT&TAB # completes command
        $ Ctrl a # cursor to beginning of command line
        $ Ctrl e # cursor to end of command line
        $ Ctrl d # delete character under cursor
        $ Ctrl k # delete line from cursor, content goes into kill buffer
        $ Ctrl y # paste content from Ctrl k

    3. UNIX HELP
      • $ man <program_name> # general help
        $ man wc # manual on program 'word count' wc
        $ wc --help # short help on wc
        $ info wc # more detailed information system (GNU)
        $ apropos wc # retrieves pages where wc appears
        Online help: SuperMan Pages, Linux Documentation Project (LDP)

    4. FINDING THINGS
      Finding files, directories and applications
        $ find -name "*pattern*" # searches for *pattern* in and below current directory
        $ find /usr/local -name "*blast*" # finds file names *blast* in specified directory
        $ find /usr/local -iname "*blast*" # same as above, but case insensitive
        additional useful arguments: -user <user_name>, -group <group_name>, -ctime <number_of_days>
        $ find ~ -type f -mtime -2 # finds all files you have modified in the last two days
        $ locate <pattern> # finds files and dirs that are recorded in the locate database (refreshed by updatedb)
        $ which <application_name> # location of application
        $ whereis <application_name> # searches for executables in set of directories
        $ dpkg -l | grep mypattern # find Debian packages and refine search with grep pattern

      Finding things in files
        $ grep pattern file # provides lines in 'file' where 'pattern' appears; if the pattern contains shell special characters use single quotes: '>'
        $ grep -H pattern file # -H prints out file name in front of each matching line
        $ grep 'pattern' file | wc # pipes lines with pattern into word count wc (see chapter 8); wc arguments: -c: show only bytes, -w: show only words, -l: show only lines; help on regular expressions: $ man 7 regex or man perlre
        $ find /home/my_dir -name '*.txt' | xargs grep -c '^.*' # counts the lines of many files and reports each count along with the individual file name; find and xargs are used to circumvent the limit on the length of the argument list when applying this function to thousands of files.
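A small, self-contained sketch of the find/xargs/grep combination described above. The files a.txt and sub/b.txt and their content are invented for the example, and it assumes file names without whitespace (xargs splits on whitespace by default).

```shell
# Build a small directory tree with two text files (made-up names).
scratch=$(mktemp -d)
mkdir "$scratch/sub"
printf 'alpha\nbeta\ngamma\n' > "$scratch/a.txt"
printf 'alpha\ndelta\n' > "$scratch/sub/b.txt"

# Find all *.txt files in and below the directory.
find "$scratch" -name '*.txt'

# Count the lines containing 'alpha' in each file found;
# xargs hands the file names to grep in batches.
find "$scratch" -name '*.txt' | xargs grep -c 'alpha'

# Total number of matching lines across all files.
total=$(find "$scratch" -name '*.txt' | xargs cat | grep -c 'alpha')
echo "$total"
```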

    5. PERMISSIONS & OWNERSHIP
      How does it work
        $ ls -al # shows something like this for each file/dir: drwxrwxrwx
          d: directory
          rwx: read write execute
          first triplet: user permissions (u)
          second triplet: group permissions (g)
          third triplet: world permissions (o)

      To assign read and execute permissions to user and group:
        $ chmod ug+rx my_file

      To remove all permissions from all three user groups:
        $ chmod ugo-rwx my_file
          '+' causes the permissions selected to be added
          '-' causes them to be removed
          '=' causes them to be the only permissions that the file has.
          Example for number system (4 = read, 2 = write, 1 = execute, summed for each triplet):
        $ chmod 755 public_html/ # same as $ chmod u=rwx,go=rx public_html/
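To see how the number system and the letter notation relate, one can set the same permissions both ways and compare the result in the ls -l listing. A minimal sketch on a scratch file; my_file is created only for this test.

```shell
scratch=$(mktemp -d)
touch "$scratch/my_file"

# Numeric form: 7 = rwx for user, 5 = r-x for group, 5 = r-x for others.
chmod 755 "$scratch/my_file"
numeric=$(ls -l "$scratch/my_file" | cut -c1-10)

# Symbolic form that sets the same bits explicitly.
chmod u=rwx,go=rx "$scratch/my_file"
symbolic=$(ls -l "$scratch/my_file" | cut -c1-10)

echo "$numeric"    # -rwxr-xr-x
echo "$symbolic"   # -rwxr-xr-x

# Remove all permissions for group and others.
chmod go-rwx "$scratch/my_file"
private=$(ls -l "$scratch/my_file" | cut -c1-10)
echo "$private"    # -rwx------
```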

      Change ownership
        $ chown <user> <file> # changes user ownership
        $ chgrp <group> <file> # changes group ownership
        $ chown <user>:<group> <file> # changes user & group ownership

    6. USEFUL UNIX COMMANDS
      • $ df # disk space
        $ free # memory info
        $ uname -a # shows tech info about machine
        $ bc # command-line calculator (to exit type 'quit')
        $ wget ftp://ftp.ncbi.nih.... # file download from web
        $ /sbin/ifconfig # give IP and other network info
        $ ln -s original_filename new_filename # creates symbolic link to file or directory
        $ du -sh # displays disk space usage of current directory
        $ du -sh * # displays disk space usage of individual files/directories
        $ du -s * | sort -nr # shows disk space used by different directories/files sorted by size

    7. JOB/PROCESS MANAGEMENT
      • $ who # shows who is logged into system
        $ w # shows which users are logged into system and what they are doing
        $ ps # shows processes running by user
        $ ps -e # shows all processes on system; try also '-a' and '-x' arguments
        $ ps aux | grep <user_name> # shows all processes of one user
        $ top # view top consumers of memory and CPU
        $ mtop # displays multicomputer/CPU processes
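        $ Ctrl z # suspends a process; then type 'bg' to continue it in the background or 'fg' to bring it back into the foreground
        $ Ctrl c # stops an initiated process
        $ kill <process_ID> # kills specified process; if this doesn't do it, add -9 as argument. Jobs of the current shell can also be addressed by their job number, e.g. %1.
        $ renice -n <priority_value> -p <process_ID> # changes priority (nice) value of a running process; values range from -20 to 19, the higher the value the lower the priority; new processes start at 0 and regular users can normally only raise the value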
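The typical life cycle of a background process can be sketched as follows. The example uses 'sleep 60' as a harmless stand-in for a long-running application and the shell variable $! (process ID of the last background job). Checking the process with 'kill -0' instead of ps is a choice made here so the sketch does not depend on a particular ps version.

```shell
# Start a long-running command in the background.
sleep 60 &
pid=$!                                   # process ID of the last background job
echo "started sleep with PID $pid"

# Signal 0 sends nothing; it only tests whether the process exists.
kill -0 "$pid" && echo "process $pid is running"

kill "$pid"                              # send the default TERM signal

# Collect the terminated job; wait reports a non-zero status for a killed process.
status=0
wait "$pid" 2>/dev/null || status=$?
echo "wait returned status $status"
```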

    8. TEXT VIEWING
      • $ less <my_file> # more versatile text viewer than 'more', 'G' moves to end of text, 'g' to beginning, '/' find forward, '?' find backwards
        $ more <my_file> # views text, use space bar to browse, hit 'q' to exit
        $ cat <my_file> # concatenates files and prints content to standard output

    9. TEXT EDITORS
      VI and VIM
        Non-graphical (terminal-based) editor. Vi is available on virtually every UNIX system. Vim is the improved version of vi.

      EMACS
        Window-based editor. You still need to know keystroke commands to use it. Available on most Linux distributions and on most other Unix systems.

      XEMACS
        More sophisticated version of emacs, but usually not installed by default. All common commands are available from menus. Very powerful editor, with built-in syntax checking, Web-browsing, news-reading, manual-page browsing, etc.

      PICO
        Simple terminal-based editor available on most versions of Unix. Uses keystroke commands, but they are listed in logical fashion at bottom of screen.

      VIM MANUAL (essentials marked in red)
      BASICS
        $ vim my_file_name # open/create file with vim
        $ i # INSERT MODE
        $ ESC # NORMAL (NON-EDITING) MODE
        $ : # commands start with ':'
        $ :w # save command; if you are in editing mode you have to hit ESC first!!
        $ :q # quits file; refuses if there are unsaved changes
        $ :q! # exits WITHOUT saving any changes you have made
        $ :wq # save and quit
        $ R # replace MODE
        $ r # replace only one character under cursor
        $ q: # history of commands (from NORMAL MODE!), to reexecute one of them, select and hit enter!
        $ :w new_filename # saves into new file
        $ :#,#w new_filename # saves specific lines (#,#) to new file
        $ :# # goes to specified line number (# = line number)

      HELP
        $ Useful list of vim commands: Vim Commands Cheat Sheet, VimCard, Vim Basics
        $ vimtutor # open vim tutorial from shell
        $ :help # opens help within vim, hit :q to get back to your file
        $ :help <topic> # opens help on specified topic
        $ |help_topic| CTRL-] # when you are in help this command opens help topic specified between |...|, CTRL-t brings you back to last topic
        $ :help <key_word> CTRL-D # gives list of help topics that contain key word
        $ : <up_key> # like in shell you get recent commands!!!!

      MOVING AROUND IN FILE
        $ $ # moves cursor to end of line
        $ A # same as $, but switches to insert mode
        $ 0 (zero) # moves cursor to beginning of line
        $ CTRL-g # shows at status line filename and the line you are on
        $ SHIFT-G # brings you to bottom of file
        $ <line_number> SHIFT-G # type line number (isn't displayed), then SHIFT-G brings you to specified line

      DISPLAY
        WRAPPING AND LINE NUMBERS
        $ :set nowrap # no word wrapping, :set wrap # back to wrapping
        $ :set number # shows line numbers, :set nonumber # back to no-number mode

      WORKING WITH MANY FILES & SPLITTING WINDOWS
        $ vim *.txt # opens many files at once; ':n' switches between files
        $ :wall or :qall # write or quit all open files
        $ vim -o *.txt # opens many files at once and displays them with horizontal split, '-O' does vertical split
        $ :args *.txt # places all the relevant files in the argument list
        $ :all # splits all files in the argument list (buffer) horizontally
        $ CTRL-w w # switches between windows
        $ :split # shows same file in two windows
        $ :split <my_file2> # opens second file in new window
        $ :vsplit # splits windows vertically, very useful for tables, ":set scrollbind" lets you scroll all open windows simultaneously
        $ :close # closes current window
        $ :only # closes all windows except current one

      SPELL CHECKING & Dictionary
        $ aspell -c <my_file> # shell command, interactive spell check of file
        $ aspell -l # shell command
        $ :! dict <word> # meaning of word
        $ :! wn 'word' -over # synonyms of word

      PRINTING FILE
        $ :ha # prints entire file
        $ :#,#ha # prints specified lines: #,#

      MERGING/INSERTING FILES
        $ :r <my_file> # inserts content of specified file below the cursor line

      UNDO/REDO
        $ u # undo last command
        $ U # undo all changes on current line
        $ CTRL-R # redo one change which was undone

      DELETION/CUT (switch to NORMAL mode)
        $ x # deletes what is under cursor
        $ dw # deletes from cursor to end of word including the space
        $ de # deletes from cursor to end of word NOT including the space
        $ cw # deletes rest of word and lets you then insert, hit ESC to continue with NORMAL mode
        $ c$ # deletes rest of line and lets you then insert, hit ESC to continue with NORMAL mode
        $ d$ # deletes from cursor to the end of the line
        $ dd # deletes entire line
        $ 2dd # deletes next two lines, continues: 3dd, 4dd and so on.

      PUT (PASTE)
        $ p # uses what was deleted/cut and pastes it behind cursor

      COPY & PASTE
        $ yy # copies line, for copying several lines do 2yy, 3yy and so on
        $ p # pastes clipboard behind cursor

      SEARCH IN FILE (most regular expressions work)
        $ /my_pattern # searches for my_pattern downwards, type n for next match
        $ ?my_pattern # searches for my_pattern upwards, type n for next match
        $ :set ic # switches to ignore case search (case insensitive)
        $ :set hls # switches to highlight search (highlights search hits)

      REPLACE WITH REGULAR EXPRESSIONS (great intro: A Tao of Regular Expressions)
        $ :s/old_pat/new_pat/ # replaces first occurrence in a line
        $ :s/old_pat/new_pat/g # replaces all occurrences in a line
        $ :s/old_pat/new_pat/gc # add 'c' to ask for confirmation
        $ :#,#s/old_pat/new_pat/g # replaces all occurrences between line numbers: #,#
        $ :%s/old_pat/new_pat/g # replaces all occurrences in file
        $ :%s/\(pattern1\)\(pattern2\)/\1test\2/g # regular expression to insert, you need here '\' in front of parentheses (unlike Perl)
        $ :%s/\(pattern.*\)/\1 my_tag/g # appends something to line containing pattern (unlike Perl, '+' needs a backslash in Vim: Perl's .+ is .\+ in Vim)
        $ :%s/\(pattern\)\(.*\)/\1/g # removes everything in lines after pattern
        $ :%s/\(At\dg\d\d\d\d\d\.\d\)\(.*\)/\1\t\2/g # inserts tabs between At1g12345.1 and Description
        $ :%s/\n/new_pattern/g #Replaces return signs
        $ :%s/pattern/\r/g #Replace pattern with return signs!!
        $ :%s/\(\n\)/\1\1/g # insert additional return signs
        $ :%s/\(^At\dg\d\d\d\d\d.\d\t.\{-}\t.\{-}\t.\{-}\t.\{-}\t\).\{-}\t/\1/g # replaces content between 5th and 6th tab (5th column), '{-}' turns off 'greedy' behavior
        $ :#,#s/\( \{-} \|\.\|\n\)/\1/g # performs simple word count in specified range of text
        $ :%s/\(E\{6,\}\)/\1<\/font>/g # highlight pattern in html colors, here highlighting of >= 6 occurences of Es
        $ :%s/\([A-Z]\)/\l\1/g # change uppercase to lowercase, '%s/\([A-Z]\)/\u\1/g' does the opposite
        $ :g/my_pattern/ s/\([A-Z]\)/\l\1/g | copy $ # uses 'global' command to apply replace function only on those lines that match a certain pattern. The 'copy $' command after the pipe '|' prints all matching lines at the end of the file.
        $ :args *.txt | all | argdo %s/old_pat/new_pat/ge | update # Command 'args' places all relevant files in the argument list (buffer); 'all' displays each file in a separate split window; command 'argdo' applies the replacement to all files in the argument list (buffer); flag 'e' is necessary to avoid stopping at error messages for files with no matches; command 'update' saves all changes to files that were updated.

      MATCHING PARENTHESES SEARCH
        - place cursor on (, [ or { and type % # cursor moves to matching parenthesis

      HTML EDITING
        -Convert text file to html format:
        $ :runtime! syntax/2html.vim # run this command with open file in Vim

      SHELL COMMAND IN VIM
        $ :! <command> # executes any shell command, hit <enter> to return
        $ :sh # switches window to shell, 'exit' switches back to vim

      USING VIM AS TABLE EDITOR
        $ v # starts visual mode for selecting characters
        $ V # starts visual mode for selecting lines
        $ CTRL-V # starts visual mode for selecting blocks (use CTRL-q in gVim under Windows). This allows column-wise selections and operations like inserting and deleting columns. To restrict substitute commands to a column, one can select it and switch to the command-line by typing ':'. After this the substitution syntax for a selected block looks like this: '<,'>s///.
        $ :set scrollbind # starts simultaneous scrolling of vertically split files. To set to horizontal binding of files, use command ':set scrollopt=hor' (after first one). Run all these commands before the ':split' command.
        $ :AlignCtrl I= \t then :%Align # Aligns tables by column separators (here '\t') when Charles Campbell's Align utility is installed.
        To sort table rows by selected lines or block, perform the visual select and then hit the F3 key. The rest is interactive. To enable this function one has to include the Vim sort script from Gerald Lai in the .vimrc file.

      MODIFY VIM SETTINGS (in file .vimrc)
        - see last chapter of vimtutor (start from shell)
        - useful .vimrc sample
        - when vim starts to respond very slowly, one may need to delete the .viminf* files in the home directory

    10. THE UNIX SHELL
      When you log into UNIX/LINUX the system starts a program called SHELL. It provides you with a working environment and interface to the operating system. Usually there are many different shell programs installed.
        $ finger <user_name> # shows which shell you are using
        $ chsh -l # gives list of shell programs available on your system (does not work on all UNIX variants)
        $ <my_shell> # switches to different shell

      STDIN, STDOUT, STDERR, REDIRECTORS, OPERATORS & WILDCARDS (more on this @ LINUX HOWTOs)
        By default, many UNIX commands read from standard input (STDIN) and send their output to standard out (STDOUT). You can redirect them by using the following commands:
        $ file* # * is wildcard to specify many files
        $ ls > file # prints ls output into specified file
        $ command < my_file # uses file after '<' as STDIN
        $ command >> my_file # appends output of one command to file
        $ command | tee my_file # writes STDOUT to file and prints it to screen; alternative way to do this:
        $ command > my_file; cat my_file
        $ command > /dev/null # turns off progress info of applications by redirecting their output to /dev/null
        $ grep my_pattern my_file | wc # Pipes (|) output of 'grep' into 'wc'
        $ grep my_pattern my_non_existing_file 2> my_stderr # prints STDERR to file; there must be no space between '2' and '>'
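The redirection operators listed above can be tried together in a scratch directory. This is a sketch under simple assumptions: the file names my_file, matches.txt, my_stdout and my_stderr are placeholders, and 'ls' on a missing file is used only to provoke an error message.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

printf 'one\ntwo\nthree\n' > my_file        # '>' creates or overwrites
printf 'four\n' >> my_file                  # '>>' appends

lines=$(wc -l < my_file)                    # '<' feeds the file to STDIN
echo "$lines"

# tee writes STDOUT to a file and still shows it on the screen.
grep 't' my_file | tee matches.txt

# STDERR (file descriptor 2) goes to its own file; note: no space in '2>'.
ls my_non_existing_file > my_stdout 2> my_stderr || true

# Send both STDOUT and STDERR to /dev/null to silence a command completely.
ls my_non_existing_file > /dev/null 2>&1 || true
```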

      Useful shell commands
        $ cat my_file1 my_file2 > cat.out # concatenates files in output file 'cat.out'
        $ paste my_file1 my_file2 > my_out # merges lines of files and separates them by tabs (useful for tables)
        $ cmp <file1> <file2> # tells you whether two files are identical
        $ diff <file1> <file2> # finds differences between two files
        $ head -<number> <file> # prints first lines of a file
        $ tail -<number> <file> # prints last lines of a file
        $ split -l <number> <file> # splits lines of file into many smaller ones
        $ csplit -f out fasta_batch "%^>%" "/^>/" "{*}" # splits fasta batch file into many files at '>'
        $ sort # sorts single file, many files and can merge (-m) them, -b ignores leading white space, ...
        $ sort -k 2,2 -k 3,3n input_file > output_file # sorts in table column 2 alphabetically and column 3 numerically, '-k' for column, '-n' for numeric
        $ sort input_file | uniq > output_file # uniq command removes duplicates and creates file/table with unique lines/fields
        $ join -1 1 -2 1 <table1> <table2> # joins two tables based on specified column numbers (-1 1: column 1 of file 1; -2 1: column 1 of file 2). It assumes that join fields are sorted. If that is not the case, use the next command:
        $ sort table1 > table1a; sort table2 > table2a; join -a 1 -t "`echo -e '\t'`" table1a table2a > table3 # '-a <table_number>' prints all lines of specified table! Default prints only the lines the two tables have in common. '-t "`echo -e '\t'`"' forces join to use tabs as field separator in its output. Default is space(s)!!!
        $ cat my_table | cut -d , -f1-3 # cut command prints only specified sections of a table, -d specifies here comma as column separator (tab is default), -f specifies column numbers.
        $ grep and egrep # see chapter 4
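A worked example of the sort, uniq, join and cut commands from the list above. The two tables and their content (IDs, annotations, values) are invented for illustration; printf is used to obtain a literal tab character.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
tab=$(printf '\t')

# Two small tab-separated tables keyed on the first column (made-up data).
printf 'id3\tkinase\nid1\ttransporter\nid2\tprotease\nid1\ttransporter\n' > table1
printf 'id2\t0.5\nid1\t2.0\nid4\t1.1\n' > table2

# join expects sorted input; uniq drops the duplicated id1 line.
sort table1 | uniq > table1a
sort table2 > table2a

# Join on column 1 of both tables, with tab as field separator.
# Only IDs present in both tables (id1, id2) appear in the result.
join -t "$tab" -1 1 -2 1 table1a table2a > table3
cat table3

# Keep only the first and third column of the joined table.
cut -f1,3 table3 > ids_and_values
cat ids_and_values
```

Adding '-a 1' to the join command would also keep the id3 line of table1, which has no partner in table2.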

    11. SIMPLE SHELL ONE-LINER SCRIPTS
      Useful One-Liners (script download)
        $ for i in *.input; do mv $i ${i/name\.old/name\.new}; done # renames file name.old to name.new
        - To test things first, insert 'echo' between 'do' and 'mv' (above).
        $ for i in *.input; do ./application $i; done # runs application in loops on many input files
        $ for i in *.input; do fastacmd -d /data/../database_name -i $i > $i.out; done # runs fastacmd in loops on many *.input files and creates *.out files
        $ for i in *.pep; do target99 -db /usr/../database_name -seed $i -out $i; done # runs SAM's target99 on many input files
        $ for j in 0 1 2 3 4 5 6 7 8 9; do grep -iH <my_pattern> *$j.seq; done # searches in > 10,000 files for pattern and prints occurrences together with file names.
        $ for i in *.pep; do echo -e "$i\n\n17\n33\n\n\n" | ./tmpred $i > $i.out; done # example of how to run an interactive application (tmpred) that asks for file name input/output
        $ for i in *.fasta1; do blast2 -p blastp -i $i -j ${i/_*fasta1/_*fasta2} >> my_out_file; done # runs BLAST2 for all *.fasta1/*.fasta2 file pairs in the order specified by file names and writes results into one file. This example uses two variables in a for loop. The content of the second variable gets specified in each loop by a replace function.
        $ for i in *.fasta; do for j in *.fasta; do blast2 -p blastp -F F -i $i -j $j >> my_out_file; done; done; # runs BLAST2 in all-against-all mode and writes results into one file; '-F F' turns low-complexity filter off
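The loop pattern used in these one-liners can be tested on empty dummy files before applying it to real data. The sketch below is a variant, not the exact commands from above: it uses the suffix removal ${i%.old}, which also works in plain sh, instead of the bash replace function ${i/old/new}, and 'wc -c' stands in for a real application. The file names sample1.old and sample2.old are made up.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch sample1.old sample2.old

# Dry run: 'echo' prints each mv command instead of executing it.
for i in *.old; do echo mv "$i" "${i%.old}.new"; done

# Real run: ${i%.old} strips the '.old' suffix before '.new' is appended.
for i in *.old; do mv "$i" "${i%.old}.new"; done

# Run a command on every renamed file and write one output file per input.
for i in *.new; do wc -c "$i" > "$i.out"; done
ls
```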

      How to write a script
        - create file which contains in first line:
        #!/bin/bash
        - place shell commands in file
        - run 'chmod +x my_shell_script' to make it executable
        - run shell script like this: ./my_shell_script
        - when you place it into /usr/local/bin you only need to type its name, from any user account
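The steps above can be sketched in one go. The script content is just an illustration, and '#!/bin/sh' is used here instead of '#!/bin/bash' so that the example also runs on systems where bash is not installed; a here-document (<<'EOF') writes the file without opening an editor.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Write a two-line script; the quoted 'EOF' keeps $0 and $1 from being expanded now.
cat > my_shell_script <<'EOF'
#!/bin/sh
echo "Hello from $0, first argument: $1"
EOF

chmod +x my_shell_script          # make it executable
output=$(./my_shell_script test)  # run it from the current directory
echo "$output"
```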

    12. SIMPLE PERL ONE-LINER SCRIPTS
      Useful One-Liners
        $ perl -p -i -w -e 's/pattern1/pattern2/g' input_file # replace something (e.g. return signs) in file using regular expressions; use $1 to backreference to pattern placed in parentheses
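        '-p' makes perl loop over the input lines and print each line after the code has been applied to it; '-i.bak' edits the file in place and creates backup file *.bak, '-i' alone makes no backup; '-w' turns on warnings; '-e' executable code follows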
        $ perl -ne 'print if (/my_pattern1/ ? ($c=1) : (--$c > 0)) ; print if (/my_pattern2/ ? ($d = 1) : (--$d > 0))' my_infile > my_outfile # parses lines that contain pattern1 and pattern2
        following lines after pattern can be specified in '$c=1' and '$d=1'; for OR function use this syntax: '/(pattern1|pattern2)/'

    13. REMOTE COPY: WGET, SCP and NCFTP
      WGET (file download from the www)
        $ wget ftp://ftp.ncbi.nih.... # file download from www; add option '-r' to download entire directories

      SCP (secure copy between machines)
        General syntax
        $ scp source target # Use form 'userid@machine_name' if your local and remote user ids are different. If they are the same you can use only 'machine_name'.

        Examples
        Copy file from Server to Local Machine (type from local machine prompt):
        $ scp user@remote_host:file.name . # '.' copies to pwd, you can specify here any directory, use wildcards to copy many files at once.
        Copy file from Local Machine to Server:
        $ scp file.name user@remote_host:~/dir/newfile.name
        Copy entire directory from Server to Local Machine (type from local machine prompt):
        $ scp -r user@remote_host:directory/ ~/dir
        Copy entire directory from Local Machine to Server (type from local machine prompt):
        $ scp -r directory/ user@remote_host:directory/
        Copy between two remote hosts (e.g. from bioinfo to cache):
        similar to above, just be logged into one of the remote hosts:
        $ scp -r directory/ user@remote_host:directory/

      NICE FTP
        $ open ncftp
        $ ncftp> open ftp.ncbi.nih.gov
        $ ncftp> cd /blast/executables
        $ ncftp> get blast.linux.tar.Z (skip extension: @)
        $ ncftp> bye

    14. ARCHIVING AND COMPRESSING
      Archiving and compressing
        $ tar -cvf my_file.tar mydir/ # Builds tar archive of files or directories. For directories, execute command in parent directory. Don't use absolute path.
        $ tar -czvf my_file.tgz mydir/ # Builds tar archive with compression of files or directories. For directories, execute command in parent directory. Don't use absolute path.

      Viewing Archives
        $ tar -tvf my_file.tar
        $ tar -tzvf my_file.tgz

      Extracting
        $ tar -xvf my_file.tar
        $ tar -xzvf my_file.tgz
        $ gunzip my_file.tar.gz # or unzip my_file.zip, uncompress my_file.Z, or bunzip2 for file.tar.bz2
        $ find -name '*.zip' | xargs -n 1 unzip # this command usually works for unzipping many files that were compressed under Windows
        try also:
          $ tar zxf blast.linux.tar.Z
          $ tar xvzf file.tgz
        options:
          f: use archive file
          p: preserve permissions
          v: list files processed
          x: extract files from archive
          z: filter the archive through gzip
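A complete round trip (build, view, extract) on a scratch directory may help to see how the options combine. The names mydir, data.txt and restore are placeholders; '-C' (extract into the given directory) is not mentioned above and is used here only to keep the extracted copy separate from the original.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
mkdir mydir
echo "some data" > mydir/data.txt

tar -czf my_file.tgz mydir/        # c: create, z: gzip, f: archive file name
tar -tzf my_file.tgz               # t: list the archive content

mkdir restore
tar -xzf my_file.tgz -C restore    # x: extract; -C: extract into this directory
cat restore/mydir/data.txt
```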

    15. SIMPLE INSTALLS
      Systems-wide installations
        Installations for systems-wide usage are the responsibility of system administrator
        To find out if an application is installed, type:
        $ which
        $ whereis <application_name> # searches for executables in set of directories, doesn't depend on your path
        Most applications are installed in /usr/local/bin or /usr/bin. You need root permissions to write to these directories.
        Perl scripts go into /usr/local/bin, Perl modules (*.pm) into /usr/local/share/perl/5.8.0/. To copy executables in one batch, use command: cp `find -perm -111 -type f` /usr/local/bin

      Applications in user accounts
        Create a new directory, download application into this directory, unpack it (see chapter 14) and follow package-specific installation instructions.
        Usually you can then already run this application when you specify its location e.g.: /home/user/my_app/blastall.
        If you want you can add this directory to your PATH by typing from this directory:
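        $ PATH=.:$PATH; export PATH # this allows you to run application by providing only its name; when you do echo $PATH you will see .: added to PATH. Note that '.' always refers to the directory you are currently in; to make the application available from anywhere, add its absolute path instead, e.g. PATH=/home/user/my_app:$PATH.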
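A sketch of how a directory added to PATH makes an application callable by name from any location. The directory my_app and the tiny script my_tool are hypothetical stand-ins for a downloaded application; the absolute path is used so that the lookup does not depend on the current directory.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/my_app"

# A tiny stand-in for a downloaded application (hypothetical name).
printf '#!/bin/sh\necho "my_tool is running"\n' > "$scratch/my_app/my_tool"
chmod +x "$scratch/my_app/my_tool"

# Prepend the absolute directory so the lookup works from any location.
PATH="$scratch/my_app:$PATH"
export PATH

cd / || exit 1
found=$(command -v my_tool)   # where the shell finds the command
result=$(my_tool)             # run it by name only
echo "$found"
echo "$result"
```

The change lasts only for the current shell session; to keep it, the PATH line has to go into a startup file such as .bash_profile.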

      Installation of RPMs
        $ rpm -i application_name.rpm
        To check which version of RPM package is installed, type:
        $ rpm --query <package_name>
        Help and upgrade files for RPMs can be found at http://rpmfind.net/.

      Installation of Debian packages
        Check whether your application is available at: http://www.debian.org/intro/about, then you type (no download):
        $ apt-cache search phylip #searches for application "phylip" from command line
        $ apt-cache show phylip #provides description of program
        $ apt-get install phylip # example for phylip install, manuals can be found in /usr/doc/phylip/, use zless or lynx to read documentation (don't unzip).
        $ apt-get update # refreshes the package lists; do this before upgrading (e.g. once a month)
        $ apt-get upgrade -u # upgrades installed packages after the update from above
        $ dpkg -i <package_file.deb> # installs package from local package file (e.g. after download)
        $ aptitude # Debian package managing interface (Ctrl-t opens menus)
        $ aptitude search vim # searches for packages on system and in Debian repositories

    16. DEVICES
      Mount/unmount usb/floppy/cdrom
        $ mount /media/usb
        $ umount /media/usb
        $ mount /media/cdrom
        $ eject /media/cdrom
        $ mount /media/floppy

    17. ENVIRONMENT VARIABLES
      • $ xhost user@host # adds X permissions for user on server.
        $ echo $DISPLAY # shows current display settings
        $ export DISPLAY=:0 # changes environment variable (bash); in csh/tcsh use: setenv DISPLAY :0
        $ unset DISPLAY # removes display variable (bash); in csh/tcsh use: unsetenv DISPLAY
        $ printenv # prints all environment variables
        $ echo $PATH # list of directories that the shell will search when you type a command
        - You can edit your default DISPLAY setting for your account by adding it to file .bash_profile.
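How exporting and removing a variable behaves can be checked with a harmless test variable instead of DISPLAY. MY_SETTING is a made-up name, and the syntax shown is for sh/bash; a child shell (sh -c) is used to show what other programs would see.

```shell
# Set and export a variable so child processes inherit it (sh/bash syntax).
MY_SETTING="hello"          # MY_SETTING is a made-up variable name
export MY_SETTING

echo "$MY_SETTING"                      # value in the current shell
child=$(sh -c 'echo "$MY_SETTING"')     # value seen by a child process
printenv MY_SETTING                     # printenv lists exported variables

# Remove the variable again (csh/tcsh would use 'unsetenv').
unset MY_SETTING
after=$(sh -c 'echo "${MY_SETTING:-unset}"')
echo "$after"
```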

    18. EXERCISES
      Exercise 1
      1. Download proteome of Halobacterium spec. from ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/Halobacterium_sp/AE004437.faa (use wget or web browser for download)
      2. How many predicted proteins are there?
        • $ grep '>' AE004437.faa | wc
      3. How many proteins contain the pattern "WxHxxH[1-2]"?
        • $ egrep 'W.H..H{1,2}' AE004437.faa | wc
      4. Use the find function (/) in 'less' to fish out the proteins containing this pattern or more elegantly do it with awk:
        • $ awk --posix -v RS='>' '/W.H..(H){1,2}/ { print ">" $0;}' AE004437.faa | less
      5. Create a BLASTable database with formatdb
        • $ formatdb -i AE004437.faa -p T -o T
          '-p F' for nucleotide and '-p T' for protein databases
      6. Generate list of sequence IDs for above pattern match result and retrieve its sequences with fastacmd from formatted database
        • $ fastacmd -d AE004437.faa -i my_IDs > seq
      7. Generate several lists of sequence IDs from various pattern match results and retrieve their sequences in one step using the fastacmd in for loop
        • $ for i in *.my_ids; do fastacmd -d AE004437.faa -i $i > $i.out; done
      8. Run blastall with a few proteins against newly created database or against Halobacterium or UniProt database (/data/UNIPROT/blast/uniprot)
        • $ blastall -p blastp -i input.file -d AE004437.faa -o blastp.out -e 1e-6 -v 10 -b 10 &
      9. Parse blastall output into Excel spread sheet:
        • a) using biocore parser
          $ blastParse -c -i -o
          b) using BioPerl parser
          $ bioblastParse.pl blast.out
      10. Run HMMPFAM search with above proteins against Pfam database
        • $ hmmpfam -E 0.1 --acc -A0 /data/PFAM/Pfam_ls input.file > output.pfam
          Parse result with BioPerl parser
          $ hmmSummary output.pfam > hmm.summary
      Exercise 2
      1. Split sample fasta batch file with csplit (use sequence file from exercise 1).
      2. Concatenate single fasta files from (1) to one batch file.
      3. BLAST two related sequences, retrieve the result in table format and use join to identify common hit IDs in the two tables.
      Exercise 3
      1. Write a shell script that executes several BLAST searches at once:
        • #!/bin/sh
          blastall -p blastp -d /.../my_database -i /.../my_input1 -o my_out1 -e 1e-6 -v 10 -b 10 &
          blastall -p blastp -d /.../my_database -i /.../my_input2 -o my_out2 -e 1e-6 -v 10 -b 10 &
      Exercise 4
      1. Create multiple alignment with ClustalW (e.g. use sequences with 'W.H..HH' pattern)
        • $ clustalw my_fasta_batch
      Exercise 5
      1. Reformat alignment into PHYLIP format using 'seqret' from EMBOSS
        • $ seqret clustal::my_align.aln phylip::my_align.phylip
      Exercise 6
      1. Create neighbor-joining tree with PHYLIP
        • $ cp my_align.phylip infile
          $ phylip protdist # creates distance matrix
          $ cp outfile infile
          $ phylip neighbor # use default settings
          $ cp outtree intree
          $ phylip retree # displays tree and can use midpoint method for defining root of tree, my typical command sequence is: 'N' 'Y' 'M' 'W' 'R' 'R' 'X'
          $ cp outtree my_tree.dnd
          View your tree in TreeBrowse or open it in TreeView
