Grep, Find and AWK : In Day to Day DevOps

Linux - explored in depth : Part 2

Continuing our Linux - Explored in-depth series, let's talk about the grep, find and awk commands and their uses in the day-to-day tasks of DevOps.

Grep

Grep stands for Global Regular Expression Print.

The grep command is primarily used to search for particular text or patterns in a file.

grep [options] pattern [file...]
  • options: These are optional flags that modify the behaviour of the grep command. They are discussed below.

  • pattern: This is the regular expression pattern that you want to search for in the given files or input.

  • file...: These are the input files you want to search in. If not specified, grep will read from the standard input (usually the output of a previous command via a pipe).

Discussing the options:

  • -i: Ignore the case when searching for the pattern.

  • -v: Invert the match, i.e., show lines that do not contain the pattern.

  • -r or -R: Recursively search for the pattern in directories and subdirectories.

  • -l: Print only the names of files containing the pattern, not the matched lines.

  • -n: Print the line numbers along with the matched lines.

  • -w: Match whole words only, not partial matches.

  • -A num: Print num lines of trailing context after matching lines.

  • -B num: Print num lines of leading context before matching lines.

  • -C num: Print num lines of context before and after matching lines.
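
Before moving to the examples, here is a quick sketch of how a few of these options combine. The file name app.log, the directory conf/ and the search words are only assumptions for illustration:

#show matches with their line numbers, ignoring case
grep -in "timeout" app.log
#show each match plus 2 lines of context before and after it
grep -C 2 "timeout" app.log
#list only the files under conf/ that contain the whole word "port"
grep -rlw "port" conf/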

Let's consider a file fruits.txt containing the names of fruits: banana, apple, pineapple, guava and cheeks.

#search for the word apple in the file fruits.txt
grep "apple" fruits.txt
#search for banana in all files in the current directory, ignoring upper and lower case
grep -i "banana" *
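
Assuming each fruit name sits on its own line in fruits.txt, the first command above matches every line containing the substring apple, so both apple and pineapple are printed; adding -w restricts the match to the whole word:

#matches apple and pineapple (substring match)
grep "apple" fruits.txt
#matches only the line containing the whole word apple
grep -w "apple" fruits.txt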

Imagine we have a log file.

#search recursively for all the error lines in the logs/ directory
grep -r "error" logs/
#filter the output of ls -l for lines containing "success"
ls -l | grep "success"
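
As a rough sketch, assume logs/app.log contains lines like the ones below (the file name and log format are only assumptions). Combining the options above lets you pull out the error lines with some context:

2023-08-01 10:15:32 error Database connection failed
2023-08-01 10:15:40 info Retrying connection
2023-08-01 10:15:45 success Connection established

#print matching lines with their line numbers and one line of context around each
grep -n -C 1 "error" logs/app.log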

Find

The find command is used to search for files and directories within a directory hierarchy. We can search by name, size, modification time and various other criteria.

# basic format
find [path] [expression]
  • path: The starting directory for the search. If not specified, find will start from the current directory.

  • expression: This is a set of options and tests that define the search criteria.

The expression can include tests such as -name pattern, -iname pattern (case-insensitive name match), -type type, -size and -mtime.

Let's see this with a few examples-

#search for files that end with .txt in the current directory and its subdirectories and print their names
find . -name "*.txt"
#search for directories under /path/to/start/directory that were modified in the last 7 days
# -mtime -7 means a modification time within the last 7 days
# -type d restricts the search to directories
find /path/to/start/directory -type d -mtime -7
#search for regular files under /path/to/start/directory that are larger than 10 MB and delete them
# -type f restricts the search to regular files
find /path/to/start/directory -type f -size +10M -delete
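
Since -iname was mentioned above but not demonstrated, here is a small sketch; the paths and patterns are just assumptions:

#case-insensitive search for files named readme.md, README.md, Readme.MD, etc.
find . -iname "readme.md"
#combine tests: regular files ending in .log that are larger than 1 MB
find /var/log -type f -name "*.log" -size +1M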

Awk

  • Awk is a powerful text-processing tool and programming language used on Unix-based operating systems.

  • It is used for analyzing structured and unstructured data.

  • It reads its input line by line and performs operations like searching for patterns, extracting information, and performing calculations.
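
As a minimal sketch of that, assume a hypothetical access.log whose 10th field is a response size in bytes; awk can match a pattern and run a calculation in one pass:

#sum the 10th field of every line containing "GET" and print the total at the end
awk '/GET/ {total += $10} END {print total}' access.log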

But here I will just focus on how we can use the awk command to print specific columns of a file.

Let's take an example of a log file.

First, using the grep command, I extracted all the WARNING logs and stored them in a new log file, Errors.log.

Then I printed the third column of the Errors.log file.

I also printed the 3rd and the 5th columns.
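
Since the screenshots are not reproduced here, those steps look roughly like this; the source file name app.log is only an assumption:

#extract all the WARNING lines into a new file
grep "WARNING" app.log > Errors.log
#print the third column of every line
awk '{print $3}' Errors.log
#print the 3rd and the 5th columns
awk '{print $3,$5}' Errors.log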

Also, imagine we have to print a particular column of the output of ls -l.

#so to code it
ls -l | awk '{print $5,$9}'
#this will print the 5th and 9th columns (file size and file name) of the listing

Now imagine you are working on a project and, while performing some task, you find that a few commands run successfully and a few fail. With the grep and awk commands combined, you can first filter all the failures into a new log file and then print specific columns from it. This is one of the common applications in day-to-day DevOps tasks.
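
A quick sketch of that workflow; the file name deploy.log, the keyword FAILED and the column numbers are assumptions for illustration:

#collect all the failed entries into their own file
grep "FAILED" deploy.log > failures.log
#print, for example, the 1st and 4th columns of each failure line
awk '{print $1,$4}' failures.log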