Linux File Integrity Bash Script

We’ve built a quick and dirty Linux file integrity bash script to help you check your file integrity recursively in Linux. We call it (un-originally) our Linux File Integrity Bash Script. This bash script lets you generate file hashes and later compare them against your files. By comparing the known good file hash with the current copy of the file, you can have confidence that your data has not been altered by corruption or other malicious intent.

The Windows 10 PowerShell version of the script is here.

This version differs from the PowerShell version in that it uses the weaker MD5 hashing algorithm rather than SHA-256. MD5 is much faster, however, and for the purpose of verifying file integrity against corruption this is not a concern.
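If you would rather trade some speed for collision resistance, the coreutils sha-family tools are drop-in replacements for md5sum. A quick sketch (file names here are illustrative):

```shell
# sha256sum shares md5sum's interface, including the -c (check) mode
# used later in this script, so swapping algorithms is a one-word change
echo "important data" > sample.txt
sha256sum sample.txt > sample.sha256
sha256sum -c sample.sha256     # prints "sample.txt: OK"
```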

DISCLAIMER: If you choose to use this, it’s done so at your own risk.

Step 1: Copy/paste this code snippet into a file (for example, filecheck.sh — the name is up to you) and store that file in your home directory.

# Date: 2022-01-12
# File:
# This script has two functions:
# 1) Generate a list of file hashes for a given directory (recursively)
# 2) Verify that list against the current hashes for the files in that subdirectory

# The hash catalogue should be stored separately from the machine you are running the script on
# (e.g. on a USB flash drive).  This ensures that the hash catalogue maintains integrity (hasn't
# been modified).  If the hash catalogue is modified you will not be able to trust it.

# Catalogue generation should occur whenever contents change or files are added/removed, to ensure
# that they are included in the catalogue for comparison.

if [ $# -lt 4 ]; then
    echo "Use this script to create a file catalogue of MD5 hashes that you can compare"
    echo "against to determine if file corruption or unintentional file modifications"
    echo "have occurred."
    echo ""
    echo "Four arguments required"
    echo "Usage: $0 <path to check> <catalogue file> <mode> <results file>"
    echo ""
    echo "Example usage: $0 . files.md5 -c output.txt"
    echo " This example creates a new md5 hash catalogue (files.md5) with the files from"
    echo " the current working directory (recursively)"
    echo ""
    echo "Example usage: $0 . files.md5 -v output.txt"
    echo " This example verifies the current hashes of the files in the"
    echo " current working directory against the list contained in files.md5"
    echo ""
    echo "Path to check: the file path to compute the file hashes on"
    echo "Catalogue file: resulting file containing the file hashes"
    echo "Mode: one of:"
    echo "      -c create new hash catalogue"
    echo "      -v verify files against known hash catalogue"
    echo "Results file: file containing the results of the hashing verification process"
    echo ""
    echo "MD5 hashes are fast and good at determining whether files have changed; however,"
    echo "it is rare but possible to get the same md5 hash from two different files."
    exit 1
fi

# Timestamp used in the log messages below
now=$(date)

if [ "$3" == "-c" ]; then
  if [ -f "$2" ]; then
    echo "Removing old hash catalogue"
    rm "$2"
  fi

  echo "[$now] Computing MD5 hashes and storing in $2"
  find "$1" -type f ! -path '*.cache*' ! -path '*.Trash*' -exec md5sum {} \; >> "$2"
  echo "[$now] Completed. MD5 Hash Catalogue: $2"
fi

if [ "$3" == "-v" ] && [ -f "$2" ]; then
  echo "[$now] Verifying MD5 hashes from $2 and storing results in $4"
  md5sum --ignore-missing --quiet -c "$2" > "$4"
  cat "$4"
  echo "[$now] Completed MD5 Hash verification."
fi

Step 2: Set the execute bit on the script

$ chmod 700 filecheck.sh

Setting the execute bit on the script allows you to run the script.

Step 3: Run the script to generate the hash catalogue

$ ./filecheck.sh . files.md5 -c results.txt

Step 4: Periodically run the script to verify your files against your hash catalogue at your discretion.

$ ./filecheck.sh . files.md5 -v results.txt

This command will verify the hashes in the catalogue (stored in files.md5) against the current hashes for those files and place the results in results.txt.
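The catalogue is simply standard md5sum output, one "hash, two spaces, path" line per file, and verification relies on md5sum's check mode. A minimal sketch of what a pass and a failure look like (file names here are illustrative):

```shell
# Record a known-good hash for a sample file
echo "hello" > demo.txt
md5sum demo.txt > demo.md5

# An unmodified file verifies cleanly
md5sum -c demo.md5            # prints "demo.txt: OK"

# Change the file and verify again: md5sum flags the
# mismatch on stdout and exits non-zero
echo "tampered" > demo.txt
md5sum -c demo.md5 || true    # prints "demo.txt: FAILED"
```

With the --quiet flag used in the script, the per-file "OK" lines are suppressed, so the results file contains only the failures.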

NOTE: we do not advise disabling antivirus software, and it is unnecessary to do so to use this script. However, the script runs significantly faster if your antivirus software is disabled while it runs.

This can take a while to run, especially against large or numerous files, and during that time you may notice some file access error messages. You can safely ignore these; they are commonly due to permissions or to a file being in use at the time.

Step 5: Check the results.txt file for discrepancies. Once you’ve confirmed the legitimacy of the changes, re-catalogue the files as per Step 3 above. An example entry is below.

./.local/share/gvfs-metadata/home: FAILED

A great way to use this script is with automation. You can schedule the verify function to run during the evening and review the results in the morning. Changes listed should not be a surprise to you. If files do change, you can re-catalogue them, perhaps once a week. Just know that until you re-catalogue, the verification process will continue to flag files it has already flagged previously.
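One way to schedule the evening run (assuming the script was saved as filecheck.sh and the catalogue and results files live in your home directory; the paths and times here are examples) is a cron entry:

```
# Added with: crontab -e
# Verify every night at 01:30 and write the results for morning review
30 1 * * * $HOME/filecheck.sh $HOME $HOME/files.md5 -v $HOME/results.txt
```

Cron runs the entry through /bin/sh, so $HOME expands to the crontab owner's home directory.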

Check out more of our How-To’s for additional great tips like this one.
