How to Automate Common Tasks with Shell Scripts
We’re Earthly.dev. We make building software simpler and therefore faster using containerization. This article is about how to automate common tasks with shell scripts. If you’re interested in a different approach to building and packaging software then check us out.
While performing various software development practices, it is common to repeat specific tasks. This repetition can lead to human error, which introduces the task to bugs and other security vulnerabilities. Also, doing repetitive tasks manually can be time-consuming and inefficient in software development. However, automation of tasks can help solve these problems.
In software development, automation involves using tools and scripts to execute repetitive tasks, including testing, monitoring, and analyzing logs that do not require human intervention. With automation, you don’t have to worry about human error or running multiple commands several times. Instead, a single line of code can execute a script to run the entire task for you.
Shell scripts are an excellent way to automate repetitive tasks. Shell scripts are programs written in a shell language such as bash
, csh
, or sh
that can be executed from the command line. As a result of their flexibility and power, shell scripts allow developers to automate tasks according to their needs. Implementing changes to an existing script is also very easy, making it a fast and more effective tool for software development.
In this article, I will walk you through automating everyday tasks using bash scripts. You will learn various fundamental software development techniques, such as loops, variables, regular expressions, etc., and how they are incorporated into shell scripts.
The repository for the code used in this article can be found on GitHub.
Prerequisites
To fully utilize this article, you should have basic knowledge of software development and the command line interface. In particular, you should understand the following:
- Bash commands while using the command line interface, such as navigating directories, creating, and editing files, etc.
- Bash scripting basics, such as writing and executing scripts
Task Selection
Developers can automate many typical software development tasks with shell scripts. These tasks range from simple file backups to complicated data processing and system maintenance. This article focuses on five tasks developers often perform and can be automated using shell scripts.
Common software development tasks include:
File backup: File backups preserve data and prevent data loss. In this task, you will learn how to automate backups with shell scripts and understand concepts such as variables, conditional statements, and loops.
Data processing: Software development requires data processing. It can take a lot of time and energy to process data manually. This task will teach you how to automate data processing using shell scripts. In addition, you will be introduced to a core programming concept such as functions.
Log Analysis: Log analysis is a crucial component of software development, and when dealing with large amounts of data, it can be challenging to do manually. This task demonstrates how shell scripts can automate log analysis and introduce concepts like regular expressions and command-line tools.
System maintenance: Performing backups, cleaning up disk space, and monitoring system resources are all time-consuming and tedious system maintenance duties. This task introduces concepts such as user input, error handling, and how to automate system maintenance tasks with shell scripts.
Local Application Deployment with Docker: Deploying software to production environments can be complex and error-prone when done manually. This task introduces concepts such as version control, environment variables, and containerization to automate deployment using shell scripts.
The tasks above were chosen because they are common procedures in software development and can be performed using shell scripts. By automating these tasks, developers can save time, reduce errors, and increase productivity. The following section is a step-by-step guide on automating each task using shell scripts.
Task 1: Automating File Backup
File backups prevent data loss due to unexpected incidents. Performing a backup manually can expose data to the possibility of human errors, such as the omission of important files, incorrect selection, inconsistent backup schedules, or failure to configure the backup system correctly. This can also be time-consuming, especially when you regularly back up large data volumes. Automating file backup can simplify this task.
Concepts Covered
In this task, you will learn and utilize the following concepts to achieve an automated file backup:
Variables: Variables refer to storage locations where you store data. You can reference and manipulate this data anywhere in the shell script.
Conditional statements: Conditional statements allow you to execute code blocks based on a specific condition.
Loops: With loops, you can execute a block of code several times while the condition remains true.
Step-by-Step Guide
The automated backup script will perform the following tasks:
Define two variables called source and destination. The location of the source and destination directories will be stored in these variables.
Check if a destination directory exists using a conditional statement.
Copy the file from the source to the destination directory.
Additionally, the script includes error handling that actively detects and promptly displays any issues encountered during the backup process. This approach ensures that errors are not overlooked, providing the user with an opportunity to address and resolve them. This error-handling mechanism increases the chances of successful backup execution, as users can take appropriate corrective actions before reattempting the backup.
To automate the backup of a file using a bash script, follow the steps below:
Create a file with the
.sh
extension in your terminal and open it using any text editor such as Vim or Nano. For example, you can call this filebackup.sh
:vim backup.sh
Define the source and destination variables:
#!/bin/bash SRC_DIR=/path/to/source/directory DST_DIR=/path/to/backup/directory
The code assigns values to two variables
SRC_DIR
andDST_DIR
. These variables are used to store the source directory and backup directory paths, respectively.To ensure that the script works correctly, make sure to replace the placeholder paths
/path/to/source/directory
and/path/to/backup/directory
with the appropriate paths to your source and backup directories.Note: For your script, add either a relative or full path for the variables
SRC_DIR
andDST_DIR
. If the script is in the same directory as the source and destination directory, use a relative path; otherwise, use a full path.Check if the destination directory exists using a conditional statement. If the directory doesn’t exist, create one using the
mkdir
command:if [ ! -d "$DST_DIR" ]; then mkdir -p "$DST_DIR" fi
The code block above is a conditional statement. The explanation of each component is as follows:
- The square brackets
[ ]
denote the start and end of a conditional expression. - The exclamation mark
!
is the logical NOT operator. It negates the result of the expression that follows it. - The
-d
flag is used to check if a directory exists. - The dollar sign
$
before theDST_DIR
variable is used to expand the variable and retrieve its value. - The variable name is enclosed in double quotes
""
to handle cases where the variable value contains spaces or special characters. The quotes ensure that the variable is treated as a single entity. - The semicolon
;
terminates the conditional statement. - The
fi
closes the conditional statement. It is the reverse of if and indicates the end of the conditional block.
- The square brackets
Add error handling to the script:
if [ ! -d "$SRC_DIR" ]; then echo "Error: Source directory does not exist" exit 1 fi
Copy the files from the source directory to the destination directory using a loop:
for file in "$SRC_DIR"/*; do # Loops over each file in the SRC_DIR. cp "$file" "$DST_DIR" # Copy the files to the DST_DIR. done
The following is an explanation of the code block above:
- The asterisk
*
afterSRC_DIR
is a wildcard character that matches any file within the directory. - The
for file in "$SRC_DIR"/*; do
line initiates a loop that iterates over each file in the directory specified by theSRC_DIR
variable. The$SRC_DIR"/*
is a [glob pattern](https://www.malikbrowne.com/blog/a-beginners-guide-glob-patterns/] that expands to all files in theSRC_DIR
directory. - The
do
keyword indicates the start of the loop’s body, which contains the code to be executed for each iteration. - The
cp "$file" "$DST_DIR":
line inside the loop copies each file to the destination directory specified by theDST_DIR
variable. The$file
represents the current file in each iteration of the loop, and$DST_DIR
specifies the destination directory to which the file will be copied. - The
done
keyword marks the end of the loop.
- The asterisk
Save and close the file.
Update the script permissions using
chmod
. The commandchmod u+x backup.sh
grants the user of thebackup.sh
file execute permission. It allows the user to execute the script as a program. Theu
stands for user, and+x
adds the execute permission to the file for the user:chmod u+x backup.sh
Run the script to backup files:
./backup.sh
When you’re done, your script should look like this:
#!/bin/bash SRC_DIR=/path/to/source/directory DST_DIR=/path/to/backup/directory if [ ! -d "$DST_DIR" ]; then mkdir -p "$DST_DIR" fi for file in "$SRC_DIR"/*; do cp "$file" "$DST_DIR" done if [ ! -d "$SRC_DIR" ]; then echo "Error: Source directory does not exist" exit 1 fi
You can confirm that the backup has occurred by checking the content of the backup destination.
With this procedure, you can back up files faster and more efficiently.
Task 2: Automating Data Processing
Processing large amounts of data can be time-consuming and error-prone if done manually. However, shell scripts enable automated data processing, reducing mistakes and saving time.
Concepts Covered
For this task, you will learn the following key concepts related to creating an automated data processing script:
- Functions: These are blocks of code that perform a particular task. They can accept inputs (arguments) and return values.
Step-by-Step Guide
To create an automated data processing script, follow the steps below:
Write a function to process the data. This function should take the input data as an argument and return the processed data:
#!/bin/bash # Define the processing function process_data() { # Take the input file as an argument input_file=$1 # Process the data using command-line tools output_file=$(cut -f 1,3 $input_file | grep 'foo' | sort -n) # Return the processed data echo $output_file }
The code sample above demonstrates a simple processing pipeline. The
process_data
function performs data processing by extracting specific columns, filtering lines based on a pattern, sorting the data, and returning the processed output.Below is an explanation of the processing steps.
cut -f 1,3 $input_file
: This uses thecut
command to extract specific fields from each line of the input file. In this case,-f 1,3
specifies that we want to extract the first and third fields from each line. The result of this operation is a subset of the original data.grep 'foo'
: This uses thegrep
command to search for thefoo
is the pattern. Only the lines that contain the word foo will be retained in the output. In this code block,foo
is a placeholder which represents a specific word or string that you want to search for in each line of the input file.sort -n
: This uses thesort
command to sort the data in ascending order. The-n
option specifies that the sorting should be done numerically, ensuring that numeric values are sorted correctly.
To define a function in shell scripting, you use the following syntax:
function_name() { ... }
. In the provided code, theprocess_data
function is defined using this syntax.The function expects an input data file as an argument. Within the function, the input file is accessed through the
$1
variable. In shell scripting, arguments passed to a function are accessed using positional parameters, where$1
represents the first argument,$2
represents the second argument, and so on.Here’s an example of how you would call the
process_data
function and pass an input file as an argument:input_file="path/to/input/file.txt" process_data "$input_file"
In this example, the variable
input_file
is set to the path of the input file. Then, theprocess_data
function is invoked with$input_file
as the argument. Inside the function, the value of$input_file
can be accessed as$1
.By using this approach, you can pass different input files to the
process_data
function, allowing you to process various data sets within your script.Write a loop to process multiple files. This loop should call the processing function on each file:
# Loop through each file and call the processing function for file in /path/to/files/*.txt; do # Call the processing function on the current file processed_data=$(process_data $file) # Write the processed data to a new file echo $processed_data > "${file}_processed.txt" done
Here’s an explanation of the code block above:
for file in /path/to/files/*.txt; do
: A loop that iterates over each file in the/path/to/files/
directory with a.txt
extension. The loop assigns each file’s path to the variable file.processed_data=$(process_data $file)
: This line calls theprocess_data
function and passes the current file as an argument. The output of the function is stored in theprocessed_data
variable. It uses the$()
syntax for command substitution, which captures the output of a command.echo $processed_data > "${file}_processed.txt"
: This line writes the value of processed_data to a new file with a name derived from the original file name. The${file}_processed.txt
notation appends_processed.txt
to the original file name.
When you’re done, your script should look like this:
#!/bin/bash
# Define the processing function
process_data() {
# Take the input file as an argument
input_file=$1
# Process the data using command-line tools
output_file=$(cut -f 1,3 $input_file | grep 'foo' | sort -n)
# Return the processed data
echo $output_file
}
# Loop through each file and call the processing function
for file in /path/to/files/*.txt; do
# Call the processing function on the current file
processed_data=$(process_data $file)
# Write the processed data to a new file
echo $processed_data > "${file}_processed.txt"
done
This task demonstrates how shell scripts can efficiently process data. It introduces core bash scripting concepts, such as functions, and the use of command-line tools such as grep
, cut
, and sort
.
Check out this article to learn more about text-processing commands in Linux.
Task 3: Automating Log Analysis
Any software system needs log files to keep track of significant events and diagnose problems. In this task, you will automate log file analysis using shell scripts.
Concepts Covered
To accomplish this task, you will need to understand the following:
Regular expressions: These are characters you can use to define a search pattern. You can use regular expressions for data validation, searching and manipulating text, and extracting specific information from files.
Command line tools: These are programs or libraries that accomplish a specific task. You can use command line tools to perform text processing and file manipulation.
Step-by-Step Guide
The script to perform log analysis automatically will perform the following tasks:
- Search for errors in the syslog file.
- Remove irrelevant information.
- Count the number of errors found.
- Saves errors to a newly created file.
The following steps will guide you in creating an automated log analysis script:
Define the log file to be analyzed using a variable:
#!/bin/bash # Define the log file LOG_FILE="/var/log/syslog"
Use the
[grep](https://man7.org/linux/man-pages/man1/grep.1.html)
command to search for specific keywords or patterns in the log file:# Search for errors in the log file ERRORS=$(grep "error" "$LOG_FILE")
The code above searches for the word error in the log file specified by
$LOG_FILE
and assigns the output to theERRORS
variable. Below is a detailed explanation:ERRORS=
: This assigns the result of the command to the variableERRORS
. It prepares the variable to store the output of the grep command.grep "error" "$LOG_FILE"
: This command searches for the word error within the contents of the file stored in the variable$LOG_FILE
.grep
: This is a command-line tool used for searching patterns within files."error"
: This is the pattern or keyword we are searching for. In this case, it is the word error."$LOG_FILE"
: This is the variable that contains the path to the log file we want to search in. The double quotes ensure that the variable is expanded to its value.
Use the
[sed](https://earthly.dev/blog/sed-find-replace/)
command to modify or filter log file data. The code below aims to remove any leading text before the string error and any trailing text after the string at:# Filter out irrelevant data ERRORS=$(echo "$ERRORS" | sed -e "s/.*error: //" -e "s/ at .*$//")
Note: From the code above, the irrelevant data refers to the leading and trailing text that is not part of the actual error message.
Use the
wc
command to count the number of errors:# Count the number of errors ERROR_COUNT=$(echo "$ERRORS" | wc -l)
The
wc
command is used in conjunction with the-l
option to count the number of lines in the input.Here’s a breakdown of the syntax:
echo "$ERRORS"
: Theecho
command is used to print the content of the$ERRORS
variable.|
: The pipe symbol (|
) is used to redirect the output of the previous command (echo “$ERRORS”) as input to the next command (wc -l
).wc -l
: The wc command is used to count lines in the input. The-l
option tellswc
to count only the lines and not other elements like words or characters.
Write the log analysis output to the file or display it in the terminal:
# Display the results echo "Found $ERROR_COUNT errors:" echo "$ERRORS" > errors.txt
In the end, your script should look like this:
#!/bin/bash # Define the log file LOG_FILE="/var/log/syslog" # Search for errors in the log file ERRORS=$(grep "error" "$LOG_FILE") # Filter out irrelevant data ERRORS=$(echo "$ERRORS" | sed -e "s/.*error: //" -e "s/ at .*$//") # Count the number of errors ERROR_COUNT=$(echo "$ERRORS" | wc -l) # Display the results echo "Found $ERROR_COUNT errors:" echo "$ERRORS" > errors.txt
This task introduces regular expressions and command-line tools like
grep
,sed
, andawk
, commonly used in shell scripting.Using sed for find and replace, is a great article to properly understand text processing.
Task 4: Automating System Maintenance
Among the most critical aspects of software development is system maintenance. System maintenance involves a set of tasks and processes that are performed regularly to keep the system running smoothly and efficiently.
Automating system maintenance processes involves using tools, scripts, and technologies to streamline and simplify the execution of routine maintenance tasks. Instead of manually performing each task, automation helps to get rid of repetitive and time-consuming tasks, reducing human error and increasing efficiency. In this task, you’ll learn how to automate system maintenance processes.
Concepts Covered
This task covers the following concepts:
User input: The user provides information to a program while it runs.
Error handling: This is detecting and resolving errors when running a program.
Step-by-Step Guide
The script to perform an automatic system maintenance task will include the following:
- Prompt the user for confirmation before proceeding. If the user fails to respond, an error code is displayed. -Create a log file if it doesn’t exist -Check for system errors and act -Send the log file via email to the system admin
- Restart the Apache web server: After performing the necessary system maintenance tasks, the script includes the command
service apache2 restart
to restart the Apache web server. This step ensures that any changes or updates made during the maintenance process take effect, ensuring the proper functioning of the web server.
You can use the following steps to create an automated system maintenance script:
Get user confirmation from the user:
#!/bin/bash # Get user confirmation before proceeding read -p "This script will perform system maintenance tasks. Are you sure you want to proceed? (y/n) " confirm # Check if user confirmation is not "y" or "Y" if [[ $confirm != [yY] ]]; then echo "Aborting script." # Print a message indicating script abortion exit 1 # Exit the script fi
Here’s an explanation for the code block above:
read -p
: Prompts the user with the message in quotes and stores their input in theconfirm
variable.if [[ $confirm != [yY] ]]; then
: Starts an if statement and checks if the value of the confirm variable is not equal to eithery
orY
.echo "Aborting script."
: Prints the message Aborting script. to the console.exit 1
: Exits the script with an exit code of 1, indicating an error or abnormal termination.
Define the
log_file
location in a variable:# Define variables log_file="/var/log/system_maintenance.log"
Create a log file if it doesn’t exist:
# Create a log file if it doesn't exist touch $log_file
Check the system disk usage and append the result to the log file:
df -h >> $log_file
Here’s a breakdown of the command:
df
: Thedf command
stands for “disk free” and is used to display information about file system disk space usage.-h
: The-h
option is used to display the disk space in a human-readable format, making it easier to understand the sizes in a more familiar unit (e.g., GB, MB).>> $log_file
: The>>
operator is used for output redirection and appends the output of thedf -h
command to the specified log file$log_file
variable. This allows you to capture the disk usage information in the log file for later reference or analysis.
Remove the old files:
# Remove old log files find /var/log/ -type f -name "*.log" -mtime +30 -delete >> $log_file
The code block above searches for log files
*.log
in the/var/log/
directory that are older than 30 days and deletes them. Here is an explanation for the commands above:find /var/log/
: This command searches for files under the/var/log/
directory.-type f
: This option specifies that only regular files should be considered, excluding directories or other types of files.-name "*.log"
: This specifies that the search should only include files with names ending in.log
.-mtime +30
: This condition specifies that the files should have a modification timestamp older than 30 days.-delete
: This action deletes the matching files.
Restart the Apache webserver service:
service apache2 restart >> $log_file
This code block above restarts the Apache webserver service and logs the output of the command to the log file.
service apache2 restart
: This command restarts the Apache webserver service.>> $log_file
: This appends the output of the command to the log file specified by the$log_file
variable.
Send an email to the system admin:
# Send email to the sysadmin with the log file attached mailx -a $log_file -s "System maintenance report" \ sysadmin@example.com
mailx
: This command is used to send emails from the command line.-a $log_file
: This option specifies the attachment (log file) to include in the email.-s "System maintenance report"
: This option sets the subject of the email to “System maintenance report”.sysadmin@example.com
: This is a placeholder email address.
The complete script looks like the following:
#!/bin/bash # Get user confirmation before proceeding read -p "This script will perform system maintenance tasks. Are you sure you want to proceed? (y/n) " confirm if [[ $confirm != [yY] ]]; then echo "Aborting script." exit 1 fi # Define variables log_file="/var/log/system_maintenance.log" # Create a log file if it doesn't exist touch $log_file # Check system disk space usage and append the result to the log file df -h >> $log_file # Remove old log files find /var/log/ -type f -name "*.log" -mtime +30 -delete >> $log_file service apache2 restart >> $log_file # Send an email to the sysadmin with the log file attached mailx -a $log_file -s "System maintenance report" \ sysadmin@example.com
Automating system maintenance tasks using shell scripts can save time and reduce errors. Furthermore, introducing user input and error handling can create robust and reliable shell scripts that simplify our system’s maintenance.
Task 5: Automating Local Application Deployment with Docker
Software deployment involves moving an application from development to production. It can be complex and time-consuming, but it is necessary for software development. Automation can streamline deployment and reduce errors.
Concepts Covered
In this task, you will assume your application is stored in a version control system like Git. You will use a containerization tool like Docker to build and deploy the application.
The following concepts are covered in the task:
Version control: Version control allows developers to keep track of different code versions and collaborate effectively on software development projects.
Environment variables: These are variables defined in the operating system’s environment and can be accessed by programs running on the system.
Containerization: This is creating and managing lightweight, portable software containers that run software applications.
Step-by-Step Guide
The steps below can be used to create an automated deployment script:
Clone the Git repository containing your application code:
#!/bin/bash git clone <repository-url>
Set environment variables in your shell script to store sensitive information such as API keys and passwords:
export DB_PASSWORD=<password> export API_KEY=<api-key>
Build a Docker image of your application using a Dockerfile:
docker build -t <image-name> .
Run the Docker container with the following command:
docker run -p 8080:8080 -e DB_PASSWORD=$DB_PASSWORD -e API_KEY=$API_KEY <image-name>
From your browser, you can access your application by navigating to
http://localhost:8080
.When you’re done, your script should look like this:
#!/bin/bash git clone <repository-url> export DB_PASSWORD=<password> export API_KEY=<api-key> docker build -t <image-name> . docker run -p 8080:8080 -e DB_PASSWORD=$DB_PASSWORD -e API_KEY=$API_KEY <image-name>
This task provides a guide for creating an automated deployment script using containerization, which is a modern and widely-used approach to app deployment. Although the example demonstrates running the container locally, the principles can be applied to a production environment to ensure consistency and reliability.
Conclusion
Shell scripts offer a powerful solution to automate various tasks, such as file backups, data processing, and system maintenance. This article has provided a hands-on understanding of scripting features like variables, loops, system calls, and more. Shell scripting can decrease errors, boost productivity, and enhance system stability, becoming a highly sought skill in the DevOps domain.
While it’s essential to select the right tool, shell scripting isn’t a cure-all. If you’ve enjoyed learning about automation with shell scripts, you might want to explore Earthly for container-based build scripting and automation. It’s super cool stuff and a great next step in your journey towards mastering automation tools.
Earthly makes CI/CD super simple
Fast, repeatable CI/CD with an instantly familiar syntax – like Dockerfile and Makefile had a baby.Writers at Earthly work closely with our talented editors to help them create high quality tutorials. This article was edited by:Get notified about new articles!We won't send you spam. Unsubscribe at any time.