Monday, September 3, 2012

Grep And Sed Substitution Commands To The Rescue

Since a bunch of freshers are joining my office as sysadmins, I think it is an appropriate time to write this post about two very basic yet powerful Linux utilities.

Grep: It searches the given text for a string (or pattern). Along with the pipe character ('|'), I think it is one of the most used commands on Linux, maybe second only to "ls". By default, it displays the lines that contain the specified string. Let us see how the options can be varied to obtain the desired output.

General Syntax:
grep <string> <filename>
You can also do:
cat <filename> | grep <string>
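
As a quick illustration of the two forms above, here is a minimal sketch. The file fruits.txt and its contents are made up for this example; it lives in a throwaway directory so nothing real is touched.

```shell
# Work in a throwaway directory (fruits.txt is a made-up sample file).
workdir=$(mktemp -d)
printf 'apple pie\nbanana split\napple tart\n' > "$workdir/fruits.txt"

# Form 1: grep opens the file itself.
direct=$(grep 'apple' "$workdir/fruits.txt")

# Form 2: grep reads whatever the pipe feeds it.
piped=$(cat "$workdir/fruits.txt" | grep 'apple')

echo "$direct"
echo "$piped"
rm -r "$workdir"
```

Both forms should print the same two lines. The first form saves you one process, while the second is handy when the text comes from another command rather than a file.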

Some helpful flags and options:
  • Grep to search recursively:
    grep -r 'search this' /abc
    This flag makes grep go through all the files under the directory /abc, including its subdirectories, and look for the string "search this".
  • Grep to ignore case:
    grep -i "search this" abc.txt
    This flag makes grep match "search this" without regard to case, so it will match "Search THis", "seaRCh thIs", "search this" and so on.
  • Match whole word or whole line:
    grep -w "search this" abc.txt
    grep -x "search this whole line" abc.txt

    The first command matches only whole words, so it will not match "search thistoo" or "whysearch this". Similarly, the second one matches only lines that consist entirely of the given string, not lines that merely contain it.
  • Invert match, i.e. print the lines not matching the given string:
    grep -v "search this" abc.txt
    If you want all the lines which do not contain "search this", use the command above.
  • Print line number with matches:
    grep -n "search this" abc.txt
  • Print the name of the files containing or not containing the matches:
    grep -l "search this" /abc/*
    grep -L "search this" /abc/*

    The first command prints the names of all the files which contain the string "search this", while the second one prints the names of all the files in which "search this" does not appear.
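
To see the flags above side by side, here is a small sketch you can paste into a shell. All the file names and contents (a.txt, b.txt, sub/c.txt) are made up for the demo and created in a throwaway directory; the flags themselves are the ones described in the list.

```shell
# Throwaway directory with sample files (all names and contents are made up).
workdir=$(mktemp -d)
mkdir "$workdir/sub"
printf 'search this\nSearch THis now\nwhysearch this\nnothing here\n' > "$workdir/a.txt"
printf 'unrelated line\n' > "$workdir/b.txt"
printf 'deep line\n' > "$workdir/sub/c.txt"

# -r: descend into sub/ as well; each hit is prefixed with its file path.
recursive=$(grep -r 'deep' "$workdir")

# -i: case-insensitive, so the first three lines of a.txt all match.
ignore_case=$(grep -i 'search this' "$workdir/a.txt")

# -w: whole words only, so "whysearch this" is skipped.
whole_word=$(grep -w 'search this' "$workdir/a.txt")

# -x: the entire line must equal the string.
whole_line=$(grep -x 'search this' "$workdir/a.txt")

# -v: lines that do NOT contain the (case-sensitive) string.
inverted=$(grep -v 'search this' "$workdir/a.txt")

# -n: prefix each matching line with its line number.
numbered=$(grep -n 'search this' "$workdir/a.txt")

# -l lists files with a match, -L lists files without one.
with_match=$(grep -l 'search this' "$workdir/a.txt" "$workdir/b.txt")
without_match=$(grep -L 'search this' "$workdir/a.txt" "$workdir/b.txt")

printf '%s\n' "$recursive" "$ignore_case" "$whole_word" "$whole_line" \
  "$inverted" "$numbered" "$with_match" "$without_match"
rm -r "$workdir"
```

Note how -v is still case-sensitive here: "Search THis now" shows up in the inverted output because it does not contain the lowercase string. Combine -v with -i if that is not what you want.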

Sed: Also known as Stream Editor. It is a very powerful editor that applies editing commands to text as it streams through, which makes it handy for changing many lines, or many files, in one go. Let us learn some basics of sed.

General Syntax:
sed <options> '<command>' <input>

Some helpful flags and options:
  • sed 's/old/new/' <original.txt
    In the above command, s stands for substitute. The word "old" will be replaced by the word "new", but keep in mind that the result won't be written to original.txt. It is only displayed on stdout.
  • sed doesn't mandate the use of '/' as a delimiter. Practically you can use almost anything as the delimiter. For example:
    $ echo "aditya's blog" | sed 's|aditya|aditya patawari|'
    $ echo "aditya's blog" | sed 'sXadityaXaditya patawariX'

    will give the same output which is:
    aditya patawari's blog
  • The commands above replace only the first occurrence of the word in a line. To substitute every occurrence, pass the 'g' flag like below:
    $ cat test.txt | sed 's/aditya/aditya patawari/g'
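  • Now if you want to write the result of the substitution to a file, you can use the 'w' flag along with a file name like below. Note that only the lines on which a substitution was made are written, and the file should not be the one you are reading from (use with caution):
    $ cat test.txt | sed 's/aditya/aditya patawari/gw changed.txt'
    You can also use redirection '>', but not onto the file being read: the shell truncates that file as soon as the command starts, usually before it has been read, so you typically end up with an empty file.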
  • If you want to make changes to the file in-place, you can use the -i option (as written below, this is GNU sed syntax):
    $ sed -i 's/aditya/aditya patawari/g' test.txt
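  • If you have more than one change to make, you can chain sed commands using the -e option. They are applied in order, so the second command below also acts on the "patawari" inserted by the first: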
    $ cat test.txt | sed -e 's/aditya/aditya patawari/g' -e 's/patawari/linux/g'
  • If you have a lot of changes to make, specifying everything on the command line gets messy. Use a sed script in that case:
    $ cat script.sed
    s/e/3/g
    s/E/3/g
    s/o/0/g
    s/O/0/g
    s/l/1/g
    s/L/1/g
    $ cat test.txt | sed -f script.sed

    The above will change e, o, l (in either case) to 3, 0, 1 respectively.
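
Here is a rough sketch that runs the sed variations from the list end to end, so you can check the behaviour yourself. The sample text, the replacement "adi" and the throwaway directory are made up for the demo. The .bak suffix on -i is my own assumption for portability: it asks sed to keep a backup copy, and that form is accepted by both GNU and BSD sed.

```shell
# Throwaway directory and sample file (names and contents are made up).
workdir=$(mktemp -d)
printf 'aditya likes aditya\n' > "$workdir/test.txt"

# Without g: only the first match on each line is replaced.
first_only=$(sed 's/aditya/adi/' "$workdir/test.txt")

# With g: every match on the line is replaced.
all=$(sed 's/aditya/adi/g' "$workdir/test.txt")

# Any delimiter works, which is handy when the text itself contains slashes.
path=$(echo '/usr/bin' | sed 's|/usr|/opt|')

# Two -e expressions run in order; the second sees the first one's output.
chained=$(echo 'aditya' | sed -e 's/aditya/aditya patawari/' -e 's/patawari/linux/')

# None of the commands above touched the file itself.
unchanged=$(cat "$workdir/test.txt")

# A script file holds one command per line and is passed with -f.
printf 's/e/3/g\ns/o/0/g\ns/l/1/g\n' > "$workdir/script.sed"
leet=$(echo 'hello' | sed -f "$workdir/script.sed")

# -i edits the file in place; the .bak suffix (an assumption of this sketch)
# keeps the original as test.txt.bak.
sed -i.bak 's/aditya/adi/g' "$workdir/test.txt"
in_place=$(cat "$workdir/test.txt")
backup=$(cat "$workdir/test.txt.bak")

printf '%s\n' "$first_only" "$all" "$path" "$chained" "$unchanged" \
  "$leet" "$in_place" "$backup"
rm -r "$workdir"
```

Keeping a backup with -i is a cheap safety net while you are still learning: if the substitution goes wrong, the original is sitting right next to the edited file.
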
This barely covers the basics. Grep and Sed can do much more. You don't have to learn everything now, but you should keep in mind that they are very powerful yet simple tools at your disposal.