In the first part of this series on shell scripting we covered the basics: what shell scripting is and how we can use it, like other programming languages, to automate our work. Now we are going to cover some more advanced techniques, such as arrays, functions and networking, which make shell scripts much more than just a bunch of commands.
A re-look at shell scripting part-1
In the last article we saw some basics of bash variables: how to create them, assign values to them and retrieve those values. Bash also supports a number of operations on these variables. Here are some of the operations that can be performed on them.
var="I am test string. You can run many operations on me"
- String Length
$ echo ${#var}
Output: 51
Substring Operations
- From given index
$ echo ${var:4}
Output: test string. You can run many operations on me
- From given index of given length
$ echo ${var:4:5}
Output: test
- Remove Suffix Pattern
$ echo ${var%%.*}
Output: I am test string
- Remove Prefix Pattern
$ echo ${var##*.}
Output: You can run many operations on me
- Substitution
$ echo ${var/on me/on this string}
Output: I am test string. You can run many operations on this string
Case Modification
We can specify a pattern after the operator (^^, ,, or ~) to apply the operation only to the characters that match the pattern.
- To Uppercase
$ echo ${var^^}
Output: I AM TEST STRING. YOU CAN RUN MANY OPERATIONS ON ME
- To Lowercase
$ echo ${var,,}
Output: i am test string. you can run many operations on me
- To Title Case
$ echo ${var~}
Output: i Am Test String. you Can Run Many Operations On Me
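As a small sketch of the pattern form mentioned above (assuming bash 4 or later; the variable names are only for illustration), the pattern is tested against each character and only the matching characters are converted:

```shell
#!/bin/bash
# Illustrative only: case modification restricted by a pattern (bash 4+).
var="I am test string"

# Uppercase only the characters that match the pattern [aeiou]
upper_vowels=${var^^[aeiou]}

# Lowercase only the characters that match the pattern [A-Z]
all_lower=${var,,[A-Z]}

# A single ^ uppercases the first character only
word="bash"
capitalized=${word^}

echo "$upper_vowels"   # I Am tEst strIng
echo "$all_lower"      # i am test string
echo "$capitalized"    # Bash
```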
Parsing Command Line Options
We have seen how to pass command line arguments to our scripts. Almost all the tools available in Linux follow the POSIX style for command line options: the character '-' followed by one or more option characters, which can be clubbed together or passed separately.
e.g.
$ ls -la /home
$ ls -l -a /home
Both of the above commands produce the same result: they tell the ls program to use the long listing format (-l) and to include all files in the output (-a).
Bash provides a tool to parse these POSIX style command line options: the built-in command getopts.
It is used as follows.
getopts OPTSTRING NAME [ARGS...]
- OPTSTRING : This is a list of all the options your script supports. If an option requires a value, it is followed by a colon, e.g. "vhf:r". Here getopts will parse the options -v, -h and -r, and the option -f will need a value.
- NAME : The name of the variable that stores the option that is found.
- ARGS : If not given, "$@" is parsed; otherwise we can specify custom arguments to be parsed.
Here is a list of variables used by getopts for parsing
- OPTIND : This is the index of the next argument to be processed. We can set it to 1 to start parsing again.
- OPTARG : It stores the value of the option, e.g. for -f file.txt, file.txt will be saved in OPTARG.
- OPTERR : It can have two values. If its value is 0, error messages will not be displayed. If its value is 1, error messages will be shown.
Error Reporting
If the first character of OPTSTRING is a colon, getopts uses silent error reporting; otherwise it uses verbose error reporting.
- Silent Mode : In this mode, no error messages are printed. If an invalid option is seen, getopts places '?' into NAME and the option character found into OPTARG. If a required argument is not found, getopts places a ':' into NAME and sets OPTARG to the option character found.
- Verbose Mode : If an invalid option is seen, getopts places '?' into NAME, unsets OPTARG and prints a diagnostic message. If a required argument is not found, a '?' is placed in NAME, OPTARG is unset, and a diagnostic message is printed.
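#!/bin/bash
# cmd.sh -v -r -f <text>
while getopts ":vf:r" opt; do
    case $opt in
        v) echo "Option v found" ;;
        f) echo "Option f found with value $OPTARG" ;;
        r) echo "Option r found" ;;
        # Silent mode: a missing value is reported as ':' with the option in OPTARG
        :) echo "Option -$OPTARG requires an argument." >&2 ;;
        # Silent mode: an unknown option is reported as '?' with the option in OPTARG
        \?) echo "Invalid option: -$OPTARG" >&2 ;;
    esac
done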
One limitation of getopts is that it cannot handle long option names such as -abc (meant as one option) or --option. To handle long option names there is another tool, getopt (note the missing 's'), which is an external program.
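The following is a minimal sketch of how getopts is often combined with shift so that the arguments left after the options can be used as well. The function name parse_args and the variables verbose, recursive, file and rest are made up for this illustration; they are not part of getopts.

```shell
#!/bin/bash
# Hypothetical helper: parse -v, -r and -f <value>, then collect what is left.
parse_args() {
    local opt
    local OPTIND=1          # start parsing from the first argument on every call
    verbose=0
    recursive=0
    file=""
    while getopts ":vf:r" opt; do
        case $opt in
            v) verbose=1 ;;
            r) recursive=1 ;;
            f) file=$OPTARG ;;
            :) echo "Option -$OPTARG requires an argument." >&2; return 1 ;;
            \?) echo "Invalid option: -$OPTARG" >&2; return 1 ;;
        esac
    done
    # OPTIND now points at the first non-option argument
    shift $((OPTIND - 1))
    rest=("$@")
}

parse_args -v -f notes.txt one two
echo "verbose=$verbose recursive=$recursive file=$file rest=${rest[*]}"
# verbose=1 recursive=0 file=notes.txt rest=one two
```

Inside a function getopts parses the function's own arguments, so a script would call it as parse_args "$@".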
Arrays
Arrays are variables which hold multiple values. Bash supports one-dimensional arrays whose elements may hold the same or different kinds of values. There is no limit on the length of an array, and the indexes do not need to be contiguous: there can be missing indexes, and the array does not need to start at index 0.
Creating An Array
There is more than one way to create an array in the bash shell.
Creating Array beforehand
We can create an array using (). The elements are separated by spaces, not commas; to make a value that contains spaces a single element, put it in quotes.
month_names=("None" "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec")
The first element is placed at index 0 and the rest follow. There are no holes in the array. Notice that we put None at position 0 to shift all the other values right, so that each month sits at its own number.
Creating array on the run
We can directly assign a value to any index in the array. This way we can put any value at any index, and we can skip any index if we want to. Note that there must be no space around the = sign.
month_names[1]="Jan"; month_names[2]="Feb"; month_names[3]="Mar"
month_names[4]="Apr"; month_names[5]="May"; month_names[6]="Jun"
month_names[7]="Jul"; month_names[8]="Aug"; month_names[9]="Sep"
month_names[10]="Oct"; month_names[11]="Nov"; month_names[12]="Dec"
Notice we missed the index 0.
Declaring Array
We can also declare an empty array using the keyword declare.
declare -a months_array
This creates an empty array with no values in it.
Accessing Array Elements
To access an array element use ${array[index]}
- Print Whole array
$ echo ${month_names[@]}
- Print one value
$ echo ${month_names[1]}
- Length of Array
$ echo ${#month_names[@]}
Traversing an array
- Traverse each value
for i in ${month_names[@]}; do
    echo $i
done
With this form, a value that has a space in it is split and treated as two separate strings; quoting the expansion as "${month_names[@]}" keeps each value intact.
- Traverse each index
for ((i=0; i<${#month_names[@]}; i++)); do
    echo ${month_names[$i]}
done
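The counting loop above assumes the indexes run from 0 without holes. For an array with missing indexes, one option is to ask bash for the indexes that actually exist with ${!array[@]}. A small sketch, with made-up values:

```shell
#!/bin/bash
# Sparse array: index 0 and most other indexes are missing.
declare -a month_names
month_names[1]="Jan"
month_names[2]="Feb"
month_names[12]="Dec"
month_names[20]="Not a month"   # illustrative value that contains spaces

# ${!array[*]} expands to the indexes that are set
indexes="${!month_names[*]}"
echo "indexes: $indexes"        # indexes: 1 2 12 20

# Quoted [@] forms keep each index and each value as one word
for i in "${!month_names[@]}"; do
    echo "$i => ${month_names[$i]}"
done

count=0
for name in "${month_names[@]}"; do
    count=$((count + 1))
done
echo "$count values"            # 4 values
```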
Functions
In higher level programming languages, functions or subroutines are named sets of statements; the name can be used in the code instead of the original statements. They help in reducing redundancy in code and also make the code maintainable. In shell scripting we can think of functions as other shell scripts which run in the same shell. They can access environment variables, which are persistent, and can create new variables; overall they behave like the internal shell commands.
Create Functions
As scripts are not compiled but interpreted, a function must be defined before it is used. We can use functions in a script as well as on the command line. In bash we can create a function by using the keyword function.
function print() {
    echo -ne "$*"
}
This function prints its arguments without appending a line break (the -n option) and interprets backslash escapes such as \n (the -e option). You can see commonly used functions defined in .bashrc or .bash_functions.
Using Functions
Once we have created these functions in our script we can use them in the same way as any other command.
$ print 1 2 3 4
$ print "Test"
Command Line Arguments
Creating functions helps, but they would not be very useful if we could not pass arguments to them. As with other commands, there is no limit imposed on the number of arguments passed to a function; we can pass as many as we like. Inside a function we handle these arguments the same way as we handle command line arguments in a script, using $1, $2, $3 and so on, with one difference: $0 remains the command you used to call your script and not the name of your function.
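A short sketch of argument handling inside a function follows; describe_args is a hypothetical name used only for this example.

```shell
#!/bin/bash
# Hypothetical function that reports the arguments it received.
describe_args() {
    # $# is the number of arguments, $1 the first one
    echo "got $# arguments, first is '$1'"
    local arg
    # "$@" expands to all arguments, each kept as a single word
    for arg in "$@"; do
        echo "  - $arg"
    done
}

describe_args one "two words" three
summary=$(describe_args a b)
```

Quoting "$@" is what keeps "two words" together as one argument.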
Returning From Function
Functions operate on the given input and return an output. There are two methods by which functions in bash can return a value to their caller.
Return Statement
Like the return statement of other programming languages, you can use the return statement to return a value. The return statement sets the exit status of the function; as this can only be a numerical value, there is the limitation that it can only be an integer in the range 0 to 255. The returned value can be accessed through the special variable $?.
Echo or Printf
There is another way to return a value to the caller, the same one a script uses: we can print the value using echo or printf and then retrieve it in the calling function using the $() construct or back-ticks.
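Both methods can be sketched side by side. The helper names is_even and add are invented for this example.

```shell
#!/bin/bash
# Hypothetical helpers showing both ways of returning a value.

# Method 1: the exit status. 0 (success) when the number is even, 1 otherwise.
is_even() {
    return $(( $1 % 2 ))
}

# Method 2: print the result and let the caller capture it.
add() {
    echo $(( $1 + $2 ))
}

is_even 4
status=$?          # read $? immediately, the next command overwrites it
sum=$(add 2 3)

echo "status=$status sum=$sum"   # status=0 sum=5

# The exit status can also be tested directly
if is_even 7; then
    echo "7 is even"
else
    echo "7 is odd"
fi
```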
Stack Trace
If $0 is the name of the script, then how do we get the name of the function we are in? The built-in Bash array FUNCNAME is the stack of functions called. The value at index 0 is the name of the current function. We can even get the depth of the stack we are in by using ${#FUNCNAME[@]}.
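A minimal sketch of reading FUNCNAME, using two made-up functions inner and outer:

```shell
#!/bin/bash
inner() {
    # Index 0 is the current function, index 1 is its caller
    echo "in ${FUNCNAME[0]}, called from ${FUNCNAME[1]}"
}

outer() {
    inner
}

report=$(outer)
echo "$report"    # in inner, called from outer
```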
Let's implement a stack using functions and an array; we can add these functions to .bashrc.
# Bash Stack Implementation using functions and array
declare -a __STACK__
__STACK_TOP__=0

function push() {
    __STACK__[$__STACK_TOP__]="$*"
    __STACK_TOP__=$((__STACK_TOP__+1))
    return 0
}

function pop() {
    if [ $__STACK_TOP__ -gt 0 ];then
        __STACK_TOP__=$((__STACK_TOP__-1))
        echo "${__STACK__[$__STACK_TOP__]}"
        unset __STACK__[$__STACK_TOP__]
        return 0
    fi
    return 1
}

function peek() {
    if [ $__STACK_TOP__ -gt 0 ];then
        echo "${__STACK__[$((__STACK_TOP__-1))]}"
        return 0
    fi
    return 1
}
We can enter these functions in our .bashrc file and then use them as regular commands. Apart from performing the push, pop and peek operations, these functions also return a success or failure status, which can be accessed through $?.
Local Variables
Unlike in other programming languages, all the variables you create inside a function can be accessed outside of its boundaries. By default all variables are available to the whole script. But if we need a function-local variable which is not accessible outside of the function, we can use the local keyword.
local var
This statement inside a function will create a function local variable.
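The difference can be seen in a small sketch; the names counter, leaked and bump are chosen only for illustration.

```shell
#!/bin/bash
counter=10

bump() {
    local counter=0           # shadows the global only inside bump
    counter=$((counter + 1))
    leaked="set inside bump"  # no 'local', so the whole script can see it
}

bump
echo "counter=$counter"   # counter=10
echo "leaked=$leaked"     # leaked=set inside bump
```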
Exceptions or Traps
Most of the time our scripts handle open files which need to be flushed before the script exits, or temporary files which need to be deleted; overall, scripts sometimes need to do some cleanup work before they exit. To handle these cases, the trap command creates a signal handler which will be invoked when a particular signal is received.
$ trap command list of signals
The command can be a function, a script or a simple command.
#!/bin/bash
# trap example
function runme {
    echo "Signal received"
    exit 0
}

trap runme SIGINT SIGTERM

while true; do
    sleep 1
done
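For the temporary file case mentioned above, bash also accepts the pseudo-signal EXIT, which runs the handler when the shell exits. The sketch below is one possible way to use it; it sets the trap inside a subshell only so that the effect can be observed from the same script.

```shell
#!/bin/bash
tmpfile=$(mktemp)

(
    # The handler runs when this subshell exits
    trap 'rm -f "$tmpfile"' EXIT
    echo "scratch data" > "$tmpfile"
    echo "working with $tmpfile"
)

if [ ! -e "$tmpfile" ]; then
    echo "temporary file was removed by the handler"
fi
```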
Input Output
We have used read, echo and printf for input and output in our bash scripts. We have also used redirection to modify the default input and output streams. Let's see some more advanced input/output methods provided by the bash shell. Bash provides ways to open new file descriptors, redirect one into another and write to a file opened on a given file descriptor. By default three file descriptors are opened for any program in Linux.
- File descriptor 0 is stdin or standard input
- File descriptor 1 is stdout which is standard output
- File descriptor 2 is stderr which is standard error device
By default all of these are connected to the terminal for terminal programs. When we use redirection, the shell opens the file and redirects the input/output by duplicating the file descriptors. In the same way we can open other files and manually redirect output, input and/or error to them.
Open and Close File descriptors
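The command exec is used to open and close file descriptors.
$ exec 5<>file.txt
This command will open the file file.txt for reading and writing and assign it the file descriptor 5. There must be no space between the descriptor number and the redirection operator.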
$ exec 5>&-
This command will close the file descriptor.
Redirection
To handle redirection we can use the following formats
- m>&n : Redirect the output of file descriptor m (m defaults to 1) to file descriptor n
- m<&n : Redirect the input of file descriptor m (m defaults to 0) from file descriptor n
Redirecting Standard Output to file descriptor
$ date >&5
This command will redirect the output of date command to file descriptor 5.
Redirecting Standard Input from file descriptor
$ read line <&5
This command will read the first line from file descriptor 5.
Redirecting Standard Error
As the standard error is just an output stream with file descriptor 2, we can easily redirect it to some other file descriptor.
$ ls file.txt 2>&5
This command will redirect the standard error of ls to file descriptor 5.
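Putting these pieces together, here is a sketch that writes to a file through descriptor 5 and then reads it back. It uses a temporary file from mktemp instead of file.txt, and it opens the descriptor separately for writing and for reading so that the read starts from the beginning of the file; both choices are assumptions made for this example.

```shell
#!/bin/bash
tmp=$(mktemp)

# Open descriptor 5 for writing and send two lines to it
exec 5>"$tmp"
echo "first line" >&5
echo "second line" >&5
exec 5>&-                 # close it

# Open descriptor 5 again, this time for reading
exec 5<"$tmp"
read -r line1 <&5
read -r line2 <&5
exec 5<&-                 # close it

echo "$line1 / $line2"    # first line / second line
rm -f "$tmp"
```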
Read Command Revisited
We have used the read command for taking input from the terminal or reading a file using redirection; let's see some more things that can be done with this command.
Read from file descriptor
The read command can read from a given file descriptor with the -u option. Options must come before the variable names.
$ read -u 5 line
Reading Multiple Values
We can use read command to read multiple variables at once
$ read a b c
Read Array From stdin
Read command can also be used to read an array from stdin and assign the values to successive indices in an array
$ read -a array
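These forms of read can be tried without typing any input by feeding read from a here-string. The sample strings and variable names below are made up, and the IFS line is an extra illustration of how the separator used for splitting can be changed for a single read.

```shell
#!/bin/bash
# Multiple variables: the last variable receives the rest of the line
read -r first second rest <<< "alpha beta gamma delta"
echo "$first | $second | $rest"   # alpha | beta | gamma delta

# Array: each word goes to the next index, starting at 0
read -r -a words <<< "one two three"
echo "${#words[@]} words, last is ${words[2]}"   # 3 words, last is three

# A different separator, set only for this one read
IFS=: read -r user login_shell <<< "root:/bin/bash"
echo "$user uses $login_shell"   # root uses /bin/bash
```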
Inter Process Communication(IPC) with Named Pipe or FIFO
IPC, or inter process communication, is essential when we want to distribute jobs among various processes. We have already seen the use of a pipe, which is used to communicate between a parent and a child process. But if the processes are not related, we can still communicate by using a named pipe, or FIFO. Pipes and FIFOs are nothing but redirection of data streams.
We can create a named pipe or FIFO using the command mknod or mkfifo.
$ mkfifo /tmp/test
This command will create a fifo file in /tmp directory.
We can see the file by using command
$ ls -la /tmp/test
prw-r--r-- 1 sherill hadmin 0 Dec 10 16:24 /tmp/test|
The initial p character shows it is a FIFO file.
Now open two terminals. In one of them type
$ read line < /tmp/test
This read will block until it reads something from the FIFO file.
In the other terminal enter the command
$ echo "Fifo Example" > /tmp/test
This command will write text into the FIFO, and the read command in the previous terminal will complete. Now if you type echo $line in the previous terminal you will see that the data has been read successfully.
If you run the same read command again, it will wait for data, as the FIFO is empty after the last read.
We can use a timeout with read to stop it from hanging indefinitely on the input.
For example
exec 7<> /tmp/test
while true; do
    # Wait up to one second for a line (a timeout of 0 would only
    # test for input without reading it)
    read -t 1 -u 7 line
    if [ $? -eq 0 ]; then
        echo "$line is read"
        # Do something
    else
        echo "No Data. Sleep"
        # Do something else
    fi
done
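The two-terminal experiment can also be reproduced in a single script by putting the writer in the background. This is only a sketch; the FIFO is created inside a temporary directory (a path chosen for this example) rather than at /tmp/test.

```shell
#!/bin/bash
dir=$(mktemp -d)
fifo="$dir/demo.fifo"
mkfifo "$fifo"

# Writer: plays the role of the second terminal
echo "Fifo Example" > "$fifo" &

# Reader: blocks until the writer has opened the FIFO and sent a line
read -r line < "$fifo"
wait                      # let the background writer finish

echo "received: $line"    # received: Fifo Example
rm -rf "$dir"
```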
Use of fifo
We have seen how we can use redirection to send input to, or take output from, a program using a pipe. But this works only one way: either you redirect input to the command or you redirect output from it. If we need to feed input to a program and have its output flow back to the program that produced that input, we can do this using a FIFO.
For example, here we want to redirect the output of netcat to a shell and then feed the output of this shell back into netcat.
So we create a FIFO and redirect the output of the netcat command to it. Then we read this FIFO using cat and redirect it to the shell, which processes the data; the shell's output is then redirected to netcat again.
$ mkfifo /tmp/myfifo
$ cat /tmp/myfifo | /bin/sh | nc -l 1356 > /tmp/myfifo
This simple command will produce a netcat server running a shell, which can be accessed using the command
$ netcat localhost 1356
If you want to access the shell from a remote machine just change the localhost to the IP address of the machine and you get yourself a remote shell.
Indirect Reference
We can easily reference a variable's value with the $variable syntax, but sometimes we need the value of a value, i.e. an indirect reference. Bash supports indirect references with the help of the eval command and the \$$variable syntax, e.g.
x=10
y=x
echo $(eval echo \$$y)
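Bash also has a built-in indirect expansion, ${!name}, which gives the same result without eval. A short sketch comparing the two (the variables via_eval and via_expansion are made up for the comparison):

```shell
#!/bin/bash
x=10
y=x

via_eval=$(eval echo \$$y)   # the eval form shown above
via_expansion=${!y}          # bash indirect expansion, no eval needed

echo "$via_eval $via_expansion"   # 10 10
```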
We can use indirect references to create a look up table.
Let's create a program which will print the frequency of each word.
#!/bin/bash
# word_freq

function usage {
    echo "usage: $0 <file>"
}

# Check argument is given and it is a file
if [ -z "$1" -o ! -f "$1" ]; then
    usage
    exit 1
fi

# Declare array
declare -a map

# Open File
exec 7<> "$1"

index=0
while read -a array -u 7; do
    count=${#array[@]}
    # Traverse each word
    for ((i=0; i<count; i++)); do
        elem=${array[$i]}
        # Each word stores its frequency in a variable named after the word,
        # so it can not be a special character or reserved word.
        # Even if it matches vars that are used in this script it will not work
        # e.g. if we use map in the input it will corrupt the output.
        # Add all checks here.

        # Check word is not space
        if [ ! -z "$elem" ]; then
            ecount=$(eval "echo \$$elem")
            if [ -z "$ecount" ]; then
                # Add the word in the map
                map[$index]=$elem
                index=$((index+1))
            fi
            # Frequency of that word
            eval "$elem=$((ecount+1))"
        fi
    done
    # Unset array
    unset array
done

# Close File
exec 7>&-

echo "Total $index words Found"

count=$index
# Print Frequency
for ((i=0; i<$count; i++)); do
    echo -n "${map[$i]}:"
    elem=${map[$i]}
    # Indirect reference to word's frequency
    echo $(eval "echo \$$elem")
done
I ran this script with the following input.
$ cat input.txt
This is an input to word frequency script
This script will occurrence the frequency of each word and report back
This script uses bash indirect references to create a Map and uses it to save occurrence
$ ./word_freq input.txt
Total 25 words Found
This:3
is:1
an:1
input:1
to:3
word:2
frequency:2
script:3
will:1
occurrence:2
the:1
of:1
each:1
and:2
report:1
back:1
uses:2
bash:1
indirect:1
references:1
create:1
a:1
Map:1
it:1
save:1
This script is just an example of indirect references and is very simple; it will not handle every word. For example, if any of the variable names ecount, map, count etc. appear in the input, it will not work correctly, and it is not able to handle special characters. But with a little effort it can be made to handle all these cases; this can be a good learning exercise.
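As a different technique from the eval approach of the script above, bash 4 and later offer associative arrays (declare -A), which use the word itself as the key and therefore do not clash with the script's own variable names. A minimal sketch, with a made-up sample text in place of a file:

```shell
#!/bin/bash
# Associative array: the word is the key, the count is the value (bash 4+).
declare -A freq

text="to be or not to be"   # sample input for illustration

for word in $text; do
    # ${freq[$word]:-0} is 0 the first time a word is seen
    freq[$word]=$(( ${freq[$word]:-0} + 1 ))
done

echo "Total ${#freq[@]} words Found"
# The order in which the keys come back is not specified
for word in "${!freq[@]}"; do
    echo "$word:${freq[$word]}"
done
```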
Networking
There are two ways to use networking in your scripts.
Bash internal socket handling
As we know, all devices in Linux are files; we can open them using the open system call and do basic input and output. Bash lets us do the same for sockets through the special file names /dev/tcp and /dev/udp, which it handles itself when they are used in redirections.
From the bash manpage:
- /dev/tcp/host/port : If host is a valid host name or Internet address, and port is an integer port number or service name, bash attempts to open a TCP connection to the corresponding socket.
- /dev/udp/host/port : If host is a valid host name or Internet address, and port is an integer port number or service name, bash attempts to open a UDP connection to the corresponding socket.
For example, to get the home page of a website in your script without using any other tool, we can use the following commands.
$ exec 7<>/dev/tcp/mylinuxbook.com/80
$ echo -en "GET / HTTP/1.0\r\nHost: mylinuxbook.com\r\nConnection: close\r\n\r\n" >&7
$ cat <&7
$ exec 7>&-
In the first line we open a connection to the server mylinuxbook.com at port 80 on file descriptor 7; then we send a GET request for the home page using the HTTP protocol. Then we read the server's response using the cat command. At last, when it is done, we close the connection using the exec command.
Using External Tools
Depending on the need, there are numerous tools available on Linux to choose from. One of them is netcat, also called the "Swiss-army knife" of networking, which can be used to create TCP as well as UDP connections.
Let's create a simple time server using netcat.
At the server (172.31.100.7) enter the command
$ while true; do date | nc -u -l 13 ; done
This command will create a UDP server listening at port 13. When any machine connects, it will return the output of the date command and then exit. The while loop is there to start the server again.
Now to get the time at any machine enter the command
$ nc -u 172.31.100.7 13
This command will connect to the server 172.31.100.7 at port 13 and print the data received from the server.
To know more about it, read about netcat. Apart from netcat, wget and curl are also great tools which are used inside scripts for handling networking.
Single Instance Programs (.lock File)
.lock files are used to constrain our script to run in single instance mode, especially in cases where multiple instances can corrupt the system state, like aptd installing or updating software. If multiple aptd instances were allowed to run in parallel they could corrupt the database. Usually lock files are present in the /var/lock or /tmp directory.
Let's create a script which allows only a single instance to run.
#!/bin/bash
# single_instance [-n]
# If the -n option is given, kill the running instance and take its place

APP_NAME=single_instance
DIR=/tmp
LOCK_FILE=${DIR}/${APP_NAME}.lock
PID_FILE=${DIR}/${APP_NAME}.pid

# An array for temporary files
declare -a temp_files

# Add a temporary file to the global array
function add_file() {
    if [ -n "$1" ] && [ -f "$1" ]; then
        # Append at the next free index
        temp_files[${#temp_files[@]}]="$1"
    fi
}

# Perform the cleanup and exit
function cleanup_and_exit() {
    local i count=${#temp_files[@]}
    for ((i=0; i<count; i++)); do
        rm -f "${temp_files[$i]}" >/dev/null 2>&1
    done
    exit 0
}

# Return 0 if an instance of this script is already running
function check_single_instance() {
    # Check lock file
    if [ -f "${LOCK_FILE}" ]; then
        return 0
    fi
    return 1
}

# Add signal handler for SIGTERM
trap cleanup_and_exit SIGTERM

# Check if an instance of this script is running
if check_single_instance; then
    echo "Instance is running"
    if [ "$1" = "-n" ]; then
        # Kill the running process
        read pid < "${PID_FILE}"
        if [ -n "$pid" ]; then
            # Send SIGTERM
            kill "$pid"
            while kill -0 "$pid" 2>/dev/null; do
                echo "Waiting for $pid to exit"
                sleep 1
            done
        fi
    else
        exit 0
    fi
fi

# Create lock file
touch "${LOCK_FILE}"
# Create PID file
echo $$ > "${PID_FILE}"

# Add files for clean up
add_file "${LOCK_FILE}"
add_file "${PID_FILE}"

# Add your script functionality here
while true; do
    echo "This is an example of single Instance Script"
    sleep 1
done
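Checking for the lock file and creating it are two separate steps in the script above, so two instances started at the same moment could both pass the check. One commonly used alternative, sketched here under the assumption that the lock only needs to hold the PID, is to create the file with the noclobber option set, so that creation fails if the file already exists. The function name acquire_lock and the temporary lock path are made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: create the lock file, fail if it already exists.
acquire_lock() {
    # With noclobber set, '>' refuses to overwrite an existing file.
    # The subshell keeps the option from leaking into the rest of the script.
    ( set -o noclobber; echo "$$" > "$1" ) 2>/dev/null
}

lock_dir=$(mktemp -d)
lock_file="$lock_dir/single_instance.lock"   # illustrative path

if acquire_lock "$lock_file"; then first="acquired"; else first="busy"; fi
if acquire_lock "$lock_file"; then second="acquired"; else second="busy"; fi

echo "first=$first second=$second"   # first=acquired second=busy

# Release the lock and tidy up
rm -f "$lock_file"
rmdir "$lock_dir"
```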
In the next part of this tutorial series we will use these techniques and create some working programs.
The post Bash shell scripting – Part II appeared first on MyLinuxBook.